High Availability
One of the key benefits of a distributed database like CrateDB is its ability to provide high availability for always-on applications, thanks to its shared-nothing architecture. This architecture is designed to deliver high performance and zero downtime with minimal operational effort. In contrast to a primary-secondary architecture, every node can perform every operation, and all nodes are configured in the same way.
- CrateDB goes beyond just allowing multi-node setups; nodes can be distributed across multiple availability zones or data centers to further enhance availability.
- Rolling software updates keep data accessible without interruption during maintenance operations.
- CrateDB natively provides automatic replication of data to a configurable number of nodes in the cluster. CrateDB clusters exhibit self-healing characteristics, where nodes re-joining a cluster after a failover automatically synchronize with the latest data.
Achieving high availability with CrateDB requires a minimum of three nodes, so that a quorum can be maintained when electing the master node, which holds the cluster state. The number of nodes is determined by the availability Service Level Agreement (SLA), which specifies how many nodes can fail before the cluster can no longer accept reads and writes. At least one replica is recommended; depending on the availability SLA, two or more replicas significantly increase failure tolerance.
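The three-node minimum follows from simple majority arithmetic. The sketch below assumes that a quorum means a strict majority of the master-eligible nodes; the function names are illustrative and not part of any CrateDB API.

```python
def quorum_size(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes (assumed quorum rule)."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_master_failures(master_eligible_nodes: int) -> int:
    """Number of master-eligible nodes that can fail while a majority remains."""
    return master_eligible_nodes - quorum_size(master_eligible_nodes)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures side by side.
    for n in (1, 2, 3, 5):
        print(n, quorum_size(n), tolerated_master_failures(n))
```

Under this assumption, one- and two-node clusters tolerate no master-eligible node failure, while three nodes tolerate one and five nodes tolerate two.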
CrateDB lets users decide, on a per-table level, how many replicas of the data are created. This choice determines how many nodes hold a copy of each of the table's shards, providing fine-grained control over data redundancy.
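As a rough illustration of per-table replication, the following sketch builds SQL statements that set the `number_of_replicas` table parameter. The table name, column list, and helper functions are hypothetical, and the statements are assembled with plain string formatting for readability only; check the exact syntax against the CrateDB version in use.

```python
def create_table_sql(table: str, columns: str, shards: int, replicas: int) -> str:
    """Build a CREATE TABLE statement with an explicit replica count."""
    return (
        f"CREATE TABLE {table} ({columns}) "
        f"CLUSTERED INTO {shards} SHARDS "
        f"WITH (number_of_replicas = {replicas})"
    )


def set_replicas_sql(table: str, replicas: int) -> str:
    """Build an ALTER TABLE statement that changes the replica count later."""
    return f"ALTER TABLE {table} SET (number_of_replicas = {replicas})"


def copies_per_shard(replicas: int) -> int:
    """Each shard exists once as a primary plus once per replica."""
    return 1 + replicas


if __name__ == "__main__":
    # Hypothetical table used only for illustration.
    print(create_table_sql("sensor_readings", "ts TIMESTAMP, value DOUBLE", 6, 1))
    print(set_replicas_sql("sensor_readings", 2))
    print(copies_per_shard(2))
```

With two replicas, every shard exists in three copies, so the table's data should stay readable after the loss of up to two of the nodes holding a given shard.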
Securing high availability in a shared-nothing architecture
1. A node leaves the cluster, for example due to a hardware failure or a rolling upgrade.
2. CrateDB performs an automatic failover, where all the data remain available on the other nodes.
3. As the new node joins, automatic data synchronization and rebalancing take place.
4. Once the new node is part of the system and data are fully replicated again, the self-healing process is completed.
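The four steps above can be mimicked with a toy model. This is a deliberately simplified sketch, not CrateDB's actual allocation logic: the `ToyCluster` class, its round-robin placement, and the least-loaded healing rule are all assumptions made for illustration.

```python
class ToyCluster:
    """Toy model: each shard maps to a list of nodes; the first entry is the primary."""

    def __init__(self, nodes: list[str], shards: int, replicas: int) -> None:
        self.nodes = list(nodes)
        self.replicas = replicas
        # Place the primary and its replicas on distinct nodes, round-robin.
        self.copies: dict[int, list[str]] = {
            s: [self.nodes[(s + i) % len(self.nodes)] for i in range(replicas + 1)]
            for s in range(shards)
        }

    def load(self, node: str) -> int:
        """Number of shard copies currently held by a node."""
        return sum(node in holders for holders in self.copies.values())

    def node_leaves(self, node: str) -> None:
        """Steps 1 and 2: drop the node; a surviving replica becomes the primary."""
        self.nodes.remove(node)
        for holders in self.copies.values():
            if node in holders:
                holders.remove(node)

    def node_joins(self, node: str) -> None:
        """Step 3: add the node, then re-create missing copies on least-loaded nodes."""
        self.nodes.append(node)
        for holders in self.copies.values():
            while len(holders) < self.replicas + 1:
                candidates = [n for n in self.nodes if n not in holders]
                if not candidates:
                    break
                holders.append(min(candidates, key=self.load))

    def all_shards_available(self) -> bool:
        """True if every shard still has at least one copy."""
        return all(len(holders) >= 1 for holders in self.copies.values())

    def fully_replicated(self) -> bool:
        """Step 4: True once every shard has its primary plus all replicas."""
        return all(len(h) == self.replicas + 1 for h in self.copies.values())


if __name__ == "__main__":
    cluster = ToyCluster(["n1", "n2", "n3"], shards=3, replicas=1)
    cluster.node_leaves("n2")
    print(cluster.all_shards_available(), cluster.fully_replicated())
    cluster.node_joins("n4")
    print(cluster.fully_replicated())
```

In this model, losing one node with one replica configured leaves every shard with at least one copy, and full replication is restored only after a node joins, which mirrors the sequence described in the list.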
CrateDB at Big Data Conference Europe 2022
Not all Time-Series are Equal – Challenges of Storing and Analyzing Industrial Data.
Timestamp: 12:22 – 18:53
CrateDB Architecture Guide
This comprehensive guide covers all the key concepts you need to know about CrateDB's architecture. It will help you gain a deeper understanding of what makes it performant, scalable, flexible and easy to use. Armed with this knowledge, you will be better equipped to make informed decisions about when to leverage CrateDB for your data projects.