
Distributed Database 

CrateDB is a distributed database: data is stored on multiple nodes in a network (see also shared-nothing architecture). In a CrateDB cluster, data is distributed evenly through automatic rebalancing, and the distributed SQL query engine performs aggregations, JOINs, sub-selects, and ad-hoc queries at in-memory speed. CrateDB also integrates native full-text search, so you can store and query structured and unstructured data together. As a result, you no longer need separate SQL and search databases to manage tabular and non-tabular data.

Benefits of a distributed database

Distributed SQL queries

CrateDB uses ANSI SQL to query and manipulate data. This reduces the learning curve and lets users focus on query logic rather than on the details of a distributed system or a proprietary query language. Users can also write user-defined functions to manipulate data.

SQL statements are translated into a series of processing steps, optimized for efficiency. CrateDB's execution involves logical and physical plans that guide data retrieval from distributed nodes. The execution layer distributes these plans across nodes for parallel processing. This approach ensures effective and scalable query execution in a distributed database environment.
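The idea of distributing a plan across nodes and combining the results can be illustrated with a small sketch. The Python below is a conceptual model only, not CrateDB's implementation: the `NODES` data, the function names, and the use of threads to stand in for cluster nodes are all assumptions made for illustration. It mimics what a query such as `SELECT sensor_id, AVG(temperature) ... GROUP BY sensor_id` might look like when each node computes a partial result over its local rows and a coordinating node merges them.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: rows of (sensor_id, temperature) spread across three "nodes".
NODES = {
    "node-1": [("a", 20.0), ("b", 30.0)],
    "node-2": [("a", 22.0), ("c", 10.0)],
    "node-3": [("b", 34.0), ("a", 24.0)],
}


def partial_aggregate(rows):
    """Per-node step: compute (sum, count) per group from local rows only."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {key: tuple(state) for key, state in acc.items()}


def merge_partials(partials):
    """Coordinator step: combine partial states, then finalize the averages."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {key: total / count for key, (total, count) in merged.items()}


def distributed_avg(nodes):
    """Run the per-node step in parallel (threads simulate nodes), then merge."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(partial_aggregate, nodes.values()))
    return merge_partials(partials)


if __name__ == "__main__":
    print(distributed_avg(NODES))
```

Each node returns a `(sum, count)` pair rather than a finished average because averages of averages are not correct in general when groups have different sizes; carrying the intermediate state lets the merge step produce the exact result.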

Product documentation


SQL Syntax

Additional resources

On-demand Workshop 2023

Introduction to CrateDB and its Architecture 

Timestamp: 14:01–16:40

CrateDB at Berlin Buzzwords 2023

When milliseconds matter: maximizing query performance in CrateDB.

Timestamp: 1:00–1:28


Distributed query execution in CrateDB: What you need to know

Learn how CrateDB generates execution plans and how optimizations influence the order of operators.


Want to know more about the distributed architecture of CrateDB?