
Node Architecture

Every node counts. Every node performs.

CrateDB’s node architecture is designed for symmetry, resilience, and performance. Unlike traditional primary–secondary systems, every node in a CrateDB cluster can perform every operation: ingesting, querying, storing, and coordinating data. This shared-nothing, uniform design ensures no single point of failure, predictable performance, and seamless scalability. It is the foundation of CrateDB’s real-time distributed capabilities.

A fully symmetric cluster

CrateDB doesn’t rely on dedicated master or replica nodes for handling data. Every node is equally capable, is configured in the same way, and can:

  • Accept SQL queries or writes
  • Act as a query coordinator (the handler node)
  • Store and retrieve data from its local shards
  • Communicate with other nodes through the transport layer

This symmetry allows CrateDB to balance load automatically, recover gracefully from node failures, and maintain performance as data and workloads grow.
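
Because any node can accept a request, a client can spread statements across the cluster and simply move on to another node when one is unreachable. The following is a minimal sketch of that idea, not a real driver: the node addresses, the `send` callable, and the `NodeUnavailable` exception are hypothetical names introduced for illustration.

```python
import itertools
from typing import Callable, List


class NodeUnavailable(Exception):
    """Raised by a sender when a node cannot be reached (hypothetical)."""


def make_round_robin_client(nodes: List[str], send: Callable[[str, str], object]):
    """Return a function that sends each statement to the next node in turn.

    `send(node, stmt)` is an assumed callable that talks to a single node.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    rotation = itertools.cycle(nodes)

    def execute(stmt: str):
        # Try every node at most once per statement, continuing the rotation.
        for _ in range(len(nodes)):
            node = next(rotation)
            try:
                return send(node, stmt)
            except NodeUnavailable:
                continue  # any other node can serve the same request
        raise NodeUnavailable("no node in the cluster responded")

    return execute


def fake_send(node: str, stmt: str):
    """Stand-in for a network call; pretends that node2 is down."""
    if node == "node2:4200":
        raise NodeUnavailable(node)
    return (node, stmt)


execute = make_round_robin_client(
    ["node1:4200", "node2:4200", "node3:4200"], fake_send
)
served_by = [execute("SELECT 1")[0] for _ in range(4)]
print(served_by)
```

In this simulation the statements skip the unreachable node and alternate between the two remaining ones, which is the behavior a symmetric cluster makes possible.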


Node communication and ports

Every CrateDB node exposes three main interfaces for interaction and coordination:

| Interface | Function | Description |
|---|---|---|
| HTTP / REST | SQL and admin requests | Accepts SQL queries and management commands via HTTP. |
| PostgreSQL Wire Protocol | Application connectivity | Allows direct integration with PostgreSQL-compatible tools and drivers. |
| Transport Protocol | Cluster communication | Used internally for node discovery, data exchange, and job coordination. |

This multi-port design ensures high compatibility with external tools while maintaining efficient internal coordination across the cluster.
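
As an illustration of the HTTP interface, the sketch below builds (but does not send) a request to a node's `_sql` endpoint using only the Python standard library. The port numbers are the defaults as I understand them (4200 for HTTP, 5432 for the PostgreSQL wire protocol, 4300 for transport) and should be checked against your own deployment; the host name is an assumption.

```python
import json
import urllib.request

# Assumed default ports; a deployment may configure different values.
DEFAULT_PORTS = {"http": 4200, "postgres": 5432, "transport": 4300}


def build_sql_request(
    host: str, stmt: str, port: int = DEFAULT_PORTS["http"]
) -> urllib.request.Request:
    """Build a POST request that carries one SQL statement as JSON."""
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/_sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Ask a (hypothetical) local node which nodes it knows about.
request = build_sql_request("localhost", "SELECT name FROM sys.nodes")
print(request.full_url)

# To run it against a reachable node, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["rows"])
```

Applications that already speak PostgreSQL would normally use the wire protocol with an existing driver instead; the transport port is meant for node-to-node traffic only.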


Inside a CrateDB node

Each node in the cluster contains four key components that work together to process and manage data efficiently:

SQL Handler: The SQL Handler is the node’s entry point for all incoming queries. It parses SQL statements, validates syntax, and generates an optimized execution plan. For distributed queries, it decides how to parallelize tasks across the cluster.

Job Execution Service: Once an execution plan is ready, it’s sent to the Job Execution Service, which manages and distributes the execution as one or more “jobs.” Each job may involve multiple tasks (filters, aggregations, joins) executed locally or remotely. These jobs are coordinated using the Transport Protocol, enabling real-time parallel execution across all nodes.

Cluster State Service: At the heart of every CrateDB cluster is the Cluster State Service, responsible for maintaining global state and coordination. It manages:

  • Node discovery and membership
  • Master node election (for cluster metadata only)
  • Shard allocation and rebalancing
  • Cluster health monitoring

This service ensures that the cluster remains consistent, resilient, and self-healing, even as nodes join, leave, or recover.
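
To make shard allocation and rebalancing more concrete, here is a toy model of what happens to shard assignments when a node leaves. It is an illustrative sketch only and not CrateDB's allocator: it tracks which node owns which shard, ignores replicas and data recovery, and all function and node names are hypothetical.

```python
from typing import Dict, List


def allocate(shards: List[int], nodes: List[str]) -> Dict[str, List[int]]:
    """Spread shards over nodes round-robin (illustrative only)."""
    if not nodes:
        raise ValueError("cannot allocate shards without nodes")
    allocation: Dict[str, List[int]] = {node: [] for node in nodes}
    for position, shard in enumerate(sorted(shards)):
        allocation[nodes[position % len(nodes)]].append(shard)
    return allocation


def handle_node_left(
    allocation: Dict[str, List[int]], departed: str
) -> Dict[str, List[int]]:
    """Reassign the departed node's shards to the least-loaded survivors."""
    survivors = {n: list(s) for n, s in allocation.items() if n != departed}
    if not survivors:
        raise ValueError("no surviving nodes to take over the shards")
    for shard in allocation.get(departed, []):
        # Pick the node with the fewest shards; break ties by node name.
        target = min(survivors, key=lambda n: (len(survivors[n]), n))
        survivors[target].append(shard)
    return survivors


cluster = allocate(list(range(6)), ["node1", "node2", "node3"])
after_failure = handle_node_left(cluster, "node2")
print(cluster)
print(after_failure)
```

The point of the sketch is that no shard is lost from the assignment and the load stays even across the remaining nodes.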

Data Storage Component: Each node’s storage engine manages local data and indexing. CrateDB stores data in sharded tables, with each shard represented as a Lucene index on disk. This allows fast access, advanced full-text search, and fine-grained indexing without external components. Each node reads and writes data locally, which keeps operations fast and limits network traffic. Both are key to CrateDB’s real-time query performance.
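
One common way to decide which shard a row belongs to is hash-modulo routing: hash a routing value and take the remainder by the number of shards. The sketch below assumes that scheme for illustration. `zlib.crc32` is a stand-in chosen because it is deterministic and in the standard library; it is not the hash function CrateDB uses internally, and the shard count and device names are made up.

```python
import zlib
from collections import Counter


def shard_for(routing_value: str, number_of_shards: int) -> int:
    """Map a routing value to a shard id using hash-modulo routing.

    crc32 is an assumed stand-in hash, not CrateDB's internal function.
    """
    if number_of_shards <= 0:
        raise ValueError("number_of_shards must be positive")
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


NUMBER_OF_SHARDS = 6  # hypothetical table setting
device_ids = [f"sensor-{i}" for i in range(1000)]

# Count how many routing values land on each shard.
distribution = Counter(shard_for(d, NUMBER_OF_SHARDS) for d in device_ids)
for shard_id in sorted(distribution):
    print(f"shard {shard_id}: {distribution[shard_id]} rows")
```

Because the mapping is deterministic, the same routing value always resolves to the same shard, so a node can tell which shard to read or write without a lookup table.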

 

Figure: Node architecture in CrateDB

 


How it all comes together

When a query is received:

  1. The handler node (the one receiving the query) parses it and creates an execution plan.
  2. It distributes the relevant operations to the nodes where the required shards reside.
  3. Each of those nodes executes its part of the plan locally and returns partial results.
  4. The handler node, acting as coordinator, aggregates these results and returns the final dataset to the client.

All of this happens automatically, in milliseconds, without user intervention or manual configuration.
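
The four steps above follow a scatter-gather pattern, which can be simulated in a few lines. This is a sketch under stated assumptions rather than CrateDB's execution engine: the node names, the in-memory `SHARDS_BY_NODE` data, and both functions are hypothetical, and threads stand in for network calls to other nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical layout: node name -> temperature readings held in local shards.
SHARDS_BY_NODE: Dict[str, List[float]] = {
    "node1": [20.0, 22.0],
    "node2": [18.0],
    "node3": [25.0, 21.0, 23.0],
}


def execute_locally(node: str) -> Tuple[float, int]:
    """Step 3: a node computes a partial result (sum, count) over local rows."""
    rows = SHARDS_BY_NODE[node]
    return sum(rows), len(rows)


def handle_avg_query(nodes: List[str]) -> float:
    """Steps 2 and 4: fan the work out, then merge the partial results."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Step 2: send the local operation to every node that holds shards.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(execute_locally, nodes))
    # Step 4: an average cannot be merged directly, so merge sums and counts.
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    if count == 0:
        raise ValueError("no rows to aggregate")
    return total / count


average = handle_avg_query(list(SHARDS_BY_NODE))
print(average)
```

Returning a sum and a count from each node, rather than a local average, is what lets the handler combine partial results correctly when nodes hold different numbers of rows.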

CrateDB architecture guide

The CrateDB architecture guide covers the key concepts you need to know about CrateDB's architecture. It will help you gain a deeper understanding of what makes it performant, scalable, flexible, and easy to use. Armed with this knowledge, you will be better equipped to make informed decisions about when to use CrateDB for your data projects.
