CrateDB uses Lucene as its storage layer and distributes tabular data evenly across the cluster in shards made of append-only segments. Lucene also brings full-text and geospatial search to SQL queries, and this storage design makes it easy to scale out and to work with dynamic schemas.
In CrateDB, every table is sharded: its rows are divided into shards that are distributed across the cluster nodes. Each shard is a Lucene index made up of segments, which are physically stored in a directory accessible to the node that manages the shard. Because segments are append-only, data that has been written to disk is never modified in place, which simplifies data replication, data recovery, and shard synchronization.
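The two ideas above, routing rows to shards and storing each shard as a series of immutable segments, can be illustrated with a small toy model. This is a minimal sketch and not CrateDB's or Lucene's actual implementation: the names `Segment`, `Shard`, `shard_for`, and `insert` are hypothetical, and the CRC32-modulo routing is an assumption chosen only because it is deterministic and easy to follow.

```python
import zlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """Toy stand-in for a segment: a batch of rows that is never modified."""
    rows: tuple


@dataclass
class Shard:
    """Toy stand-in for a shard: an ordered list of immutable segments."""
    segments: list = field(default_factory=list)

    def flush(self, rows):
        # New data always lands in a new segment; existing segments stay untouched.
        self.segments.append(Segment(tuple(rows)))

    def scan(self):
        # Reading a shard means reading all of its segments in order.
        for segment in self.segments:
            yield from segment.rows


def shard_for(routing_key: str, num_shards: int) -> int:
    # Assumed routing rule: a deterministic hash modulo the shard count,
    # so the same key always maps to the same shard.
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def insert(shards, rows, key):
    # Group rows by target shard, then write one new segment per shard.
    batches = {}
    for row in rows:
        shard_id = shard_for(str(row[key]), len(shards))
        batches.setdefault(shard_id, []).append(row)
    for shard_id, batch in batches.items():
        shards[shard_id].flush(batch)
```

Calling `insert` twice on the same list of shards adds new segments without touching the earlier ones. In this toy model, that is why copying a shard to another node only requires shipping the segments the other side does not yet have.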