Query Performance
CrateDB delivers query response times in milliseconds, so data can be processed and analyzed efficiently. Three distributed capabilities contribute to this performance: the query engine, writes, and reads.
- Distributed Query Engine: CrateDB’s query engine is architected to optimize data throughput and query performance, particularly as the number of concurrent operations grows. Distributed query processing, advanced indexing techniques, real-time data ingestion, and real-time querying work together to deliver a seamless, high-performance user experience.
- Distributed Writes: CrateDB employs a sharding mechanism to distribute data across multiple nodes in a cluster. This sharding strategy enables parallel write operations, allowing independent and concurrent writing to each shard on different nodes. This distribution prevents any single node from becoming a bottleneck, improving write throughput and scalability.
- Distributed Reads: Because CrateDB is designed around distribution, the query planner splits operations across shards and their replicas. This accelerates aggregations by selecting only the necessary data partitions, utilizing available hardware on individual nodes, distributing queries across all nodes, and pushing down aggregations to multiple nodes to reduce pressure during the merge step of query execution.
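The two ideas above, routing writes to shards and pushing aggregations down before the merge step, can be illustrated with a small, self-contained Python sketch. This is a toy model and not CrateDB's actual implementation: the shard count, the CRC32-based routing, the `sensor_id` routing key, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_SHARDS = 4  # hypothetical shard count, chosen for illustration


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing key to a shard with a stable hash (illustrative only)."""
    return crc32(routing_key.encode("utf-8")) % num_shards


def distribute_writes(rows, num_shards: int = NUM_SHARDS):
    """Group rows by target shard so each shard can be written independently."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row["sensor_id"], num_shards)].append(row)
    return shards


def partial_aggregate(shard_rows):
    """Shard-local step: return a small (sum, count) pair instead of raw rows."""
    values = [row["value"] for row in shard_rows]
    return sum(values), len(values)


def distributed_avg(shards):
    """Aggregate each shard in parallel, then merge only the small partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_aggregate, shards.values()))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Hypothetical sensor readings spread across ten sensors.
rows = [{"sensor_id": f"sensor-{i % 10}", "value": float(i)} for i in range(1000)]
shards = distribute_writes(rows)
average = distributed_avg(shards)
```

Because each shard returns only a `(sum, count)` pair, the merge step handles a handful of small values rather than every row, which is the intuition behind reducing pressure during the merge step.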
CrateDB's fully distributed query engine and columnar storage bring the following benefits:
- Data immediately available for query upon ingestion
- Ad-hoc queries across billions of records in a few milliseconds
- Hyper-fast aggregations using columnar storage
- No need to downsample or pre-aggregate data
- In-memory SQL query performance thanks to parallel query processing and distributed columnar field caches
- Fully distributed query engine leveraging the power of Apache Lucene®
1M inserts per second with a CrateDB cluster of only 5 standard nodes
Learn more about the latest benchmark on CrateDB performance
"Working with CrateDB brings positive outcomes. The ingestion and throughput have very good performance, with 1 million values/sec, the horizontal scalability where we can add as many nodes as we need and the automatic query distribution across the whole cluster"
Marko Sommarberg
Lead, Digital Strategy and Business Development at ABB
With CrateDB, ABB could achieve:
- Data ingestion: 1 million values/sec
- Event retrieval: 30k to 120k events/sec
Additional resources
- CrateDB at Berlin Buzzwords 2023: When ms matter: Maximizing query performance in CrateDB
- On-demand workshop: Query Planning and Optimizations
- Customer interview: Redefining warehouse intelligence with CrateDB
"Our distribution centers produce a lot of sensor data and we enable our customers to take data driven decisions. CrateDB allows us to operate on any Cloud and on-prem/Edge with simplicity and stellar performance, and significant cost advantages."
Alexander Mann
Owner Connected Warehouse Architecture
TGW Logistics Group