The 2024 CrateDB architecture guide covering all key concepts is out.

Download now
Skip to content
Features

Fault Tolerance

Fault tolerance in CrateDB is managed through a robust and efficient system designed to ensure uninterrupted operation, even amid hardware or software failures. This is achieved through a distributed architecture that provides scalability and resilience to node failures. Key features such as automatic sharding and data replication across nodes ensure high availability and redundancy.

The master-node architecture of CrateDB allows for efficient cluster management and quick recovery by electing a new master node in case of a failure. Automatic recovery mechanisms redistribute tasks and data to healthy nodes, facilitating continuous operation. Data replication at the shard level across different nodes further ensures data availability and fault tolerance. Write operations are considered successful when data is written to the primary shard and replicated to the necessary number of replica shards, as per the defined replication factor. This fault-tolerant design contributes to CrateDB's ability to handle large datasets while minimizing the impact of potential failures.

Product documentation

Sharding

Replication

Resiliency

Clustering

Additional resources

Interested in learning more?