Lucene Engine
CrateDB's fully distributed query engine is built on top of Apache Lucene®. The Lucene engine underpins CrateDB's core infrastructure for storage and indexing.
CrateDB uses Lucene to distribute tabular data evenly across the cluster in shards made up of append-only segments. Lucene also brings full-text search and geospatial search to SQL queries, and supports easy scaling and dynamic schemas.
In CrateDB, every table is sharded: its data is divided into shards that are distributed across the cluster nodes. Each shard is a Lucene index broken down into segments, which are physically stored in a directory accessible to the node that manages the shard. Because segments are append-only, data on disk is immutable once written, which simplifies tasks like data replication, data recovery, and shard synchronization.
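The sharding and append-only behavior described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not CrateDB or Lucene code: the `Shard` and `ShardedTable` classes, the CRC32-modulo routing rule, and the explicit `flush` step are all assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy stand-in for a shard: immutable segments plus a write buffer."""
    segments: list = field(default_factory=list)  # each segment is a tuple of rows
    buffer: list = field(default_factory=list)    # rows not yet written to a segment

    def flush(self):
        # Write buffered rows as a NEW segment; existing segments are never modified.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []


class ShardedTable:
    """Hypothetical table that spreads rows over a fixed number of shards."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def shard_for(self, routing_key):
        # Assumed routing rule: a stable hash of the key, modulo the shard count.
        digest = zlib.crc32(str(routing_key).encode("utf-8"))
        return digest % len(self.shards)

    def insert(self, routing_key, row):
        self.shards[self.shard_for(routing_key)].buffer.append(row)

    def flush(self):
        for shard in self.shards:
            shard.flush()


table = ShardedTable(num_shards=4)

# First batch of writes becomes one segment per shard that received rows.
for i in range(100):
    table.insert(i, {"id": i})
table.flush()

# A later batch is appended as additional segments; earlier ones stay untouched.
for i in range(100, 120):
    table.insert(i, {"id": i})
table.flush()

rows_per_shard = [sum(len(seg) for seg in shard.segments) for shard in table.shards]
print(rows_per_shard)
```

Because a segment never changes after it is written, a copy of a shard can be brought up to date by transferring only the segments it is missing, which is the intuition behind the simpler replication and recovery mentioned above.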
Additional resources
White Paper
CrateDB: Architecture Guide
The unique architecture of CrateDB allows it to prioritize scalability, performance and cost-efficiency at the same time, giving the industry the ability to access the power of their data.
Blog
Indexing and Storage in CrateDB
Blog
Guide to write operations in CrateDB
On-demand Workshop 2023
Introduction to CrateDB and its Architecture