Architecture

Seamless SQL Access

We chose SQL as the data access language to make CrateDB easy for mainstream developers to adopt. SQL is widely known, powerful, and easy to integrate. CrateDB is compatible with most SQL tools and is accessible through the PostgreSQL wire protocol and an HTTP interface. A distributed SQL engine is hard to build: it took a few years to reach a viable level of native SQL compatibility, with support for joins, aggregations, indexes, subqueries, user-defined functions, and so on. We also extended our SQL with features more commonly found in NoSQL systems, such as full-text search, geospatial queries, and JSON object columns.
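As a rough illustration of the HTTP interface, the sketch below builds a request for CrateDB's `/_sql` endpoint using only the Python standard library. The host and port (`localhost:4200`, CrateDB's default HTTP port) and the helper names `build_sql_request` and `run_sql` are assumptions made for this example; adjust them to your own deployment.

```python
import json
import urllib.request

# Assumed local endpoint; 4200 is CrateDB's default HTTP port.
CRATE_SQL_URL = "http://localhost:4200/_sql"


def build_sql_request(stmt, args=None, url=CRATE_SQL_URL):
    """Build (but do not send) a POST request for the HTTP SQL endpoint."""
    payload = {"stmt": stmt}
    if args is not None:
        # Positional parameters are bound to the `?` placeholders in stmt.
        payload["args"] = list(args)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, url=CRATE_SQL_URL):
    """Send the statement to a running node and return the decoded JSON reply."""
    with urllib.request.urlopen(build_sql_request(stmt, args, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running locally, `run_sql("SELECT name FROM sys.cluster")` would send the query over HTTP. Tools that speak the PostgreSQL wire protocol can connect with a regular PostgreSQL driver instead.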

Objects and Dynamic Schemas

The real beauty of CrateDB is how it combines the familiarity of SQL with the scalability and data flexibility of NoSQL databases. We accomplished this by building our distributed SQL engine on a foundation of our own and other open source NoSQL technologies instead of traditional relational DBMS techniques.

Another benefit of CrateDB’s SQL-NoSQL architecture is schema flexibility. Unlike traditional relational databases, CrateDB supports dynamic data type mappings with automatic schema updates. Each relational record is stored as a JSON document, allowing for structure changes on the fly.
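To make the idea of schema changes on the fly more concrete, here is a minimal sketch of the SQL involved, held in Python strings together with the request bodies one might send to the HTTP endpoint. The table name `sensor_readings`, its columns, and the helper `insert_request` are hypothetical. The sketch assumes an object column declared with the dynamic column policy, so that a previously unseen key such as `humidity` is added to the schema when it first appears.

```python
# Hypothetical table: `payload` is an object column with the dynamic policy,
# so new keys inside it extend the schema automatically on insert.
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS sensor_readings ("
    " id INTEGER PRIMARY KEY,"
    " payload OBJECT(DYNAMIC)"
    ")"
)

INSERT_ROW = "INSERT INTO sensor_readings (id, payload) VALUES (?, ?)"

# Sub-columns of an object are addressed with subscript notation.
SELECT_HUMIDITY = (
    "SELECT id, payload['humidity'] FROM sensor_readings"
    " WHERE payload['humidity'] IS NOT NULL"
)


def insert_request(row_id, document):
    """Return a request body that inserts one JSON document as a row."""
    return {"stmt": INSERT_ROW, "args": [row_id, document]}


inserts = [
    insert_request(1, {"temperature": 21.5}),
    # The second document introduces a key the first one did not have.
    insert_request(2, {"temperature": 22.1, "humidity": 40}),
]
```

No `ALTER TABLE` is issued between the two inserts; under the stated assumption, the second document alone is what introduces `payload['humidity']`.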

High-Velocity Data Insertion

CrateDB provides an eventually consistent, non-blocking data insertion model. This allows for the insertion of tens of thousands of data points per second per node while querying the data at the same time.
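High insert rates are usually reached by batching rows rather than sending one statement per data point. The sketch below prepares batched request bodies using the `bulk_args` field of the HTTP endpoint, which applies one statement to many parameter lists. The table `metrics`, its columns, and the default batch size of 1000 are assumptions for illustration, not tuned recommendations.

```python
import json

# Hypothetical target table and columns.
INSERT_STMT = "INSERT INTO metrics (ts, sensor_id, value) VALUES (?, ?, ?)"


def bulk_payloads(rows, batch_size=1000):
    """Yield one JSON request body per batch of rows.

    `rows` is a list of parameter lists, one per row to insert.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # bulk_args executes the same statement once per parameter list.
        yield json.dumps({"stmt": INSERT_STMT, "bulk_args": batch})
```

Each yielded body can be sent as a single POST to the `/_sql` endpoint, so 2,500 rows with the default batch size result in three requests instead of 2,500.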

Data durability and consistency are also important, and we took steps to address those with as little impact on performance as possible. To ensure data durability, we implemented write-ahead logging. For consistency, CrateDB includes record versioning, optimistic concurrency control, and a table-level refresh frequency setting, which forces CrateDB data to become consistent on a periodic basis (every n milliseconds).
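The interplay of record versioning, optimistic concurrency control, and periodic refresh can be illustrated with a toy in-memory model. This is a conceptual sketch only, not CrateDB's implementation: the class `ToyTable`, its methods, and the exception `VersionConflict` are invented for this example, and letting key lookups see the latest write immediately is a modelling choice made here.

```python
class VersionConflict(Exception):
    """Raised when a write expected a version that is no longer current."""


class ToyTable:
    def __init__(self):
        self._rows = {}     # key -> (version, value): latest written state
        self._visible = {}  # snapshot seen by searches, updated by refresh()

    def get(self, key):
        """Key lookup: returns the latest (version, value) for the key."""
        return self._rows[key]

    def write(self, key, value, expected_version=None):
        """Write a value; if expected_version is given, enforce it."""
        current_version = self._rows.get(key, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._rows[key] = (new_version, value)
        return new_version

    def refresh(self):
        """Make all writes so far visible to searches (the periodic refresh)."""
        self._visible = dict(self._rows)

    def search(self, predicate):
        """Search only sees the state as of the last refresh."""
        return [value for _, value in self._visible.values() if predicate(value)]
```

A writer reads a row's version, computes a new value, and writes with `expected_version` set. If another writer got there first, the write fails with `VersionConflict` instead of silently overwriting, and the caller can re-read and retry. Searches lag behind writes until the next `refresh()`, which mirrors the role of the table-level refresh setting described above.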

Real-Time Queries

Real-time databases usually require all the data to fit in main memory, which limits how much data you can manage. CrateDB addresses this by using in-memory Lucene segments that help achieve near real-time performance.

Distributed query processing also contributes to fast performance, as does a query planner that decides which nodes are best suited to perform different aggregations and joins.
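The general idea behind distributed aggregation can be sketched as a scatter-gather computation: each node reduces its local rows to a small partial result, and a coordinating node merges the partials. The code below is a conceptual illustration of that pattern for a grouped average. The function names are invented for this example, and it does not describe CrateDB's actual planner or execution engine.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Run on each node: reduce local (group, value) rows to (sum, count)."""
    partials = defaultdict(lambda: [0.0, 0])
    for group, value in rows:
        partials[group][0] += value
        partials[group][1] += 1
    return {group: (total, count) for group, (total, count) in partials.items()}


def merge_partials(per_node_partials):
    """Run on the coordinator: combine partials into a final average per group."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in per_node_partials:
        for group, (total, count) in partial.items():
            totals[group][0] += total
            totals[group][1] += count
    return {group: total / count for group, (total, count) in totals.items()}
```

Only one `(sum, count)` pair per group crosses the network from each node, rather than every row. Averages are merged from sums and counts because averaging the per-node averages would give the wrong answer when nodes hold different numbers of rows.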

Flexible Deployment Options

CrateDB can be deployed at the edge, on-premises, or in the cloud to meet your organization's needs, and it supports hybrid scenarios out of the box. Running close to where data is produced is especially useful when network latency is intolerable or when data needs to be aggregated before being transferred to a central cloud instance for wider-scale processing.

CrateDB: Technical Overview

A detailed description of CrateDB's architecture. Find out all about how CrateDB is built!

Download White Paper

6 Things Enterprises Need to Consider When Choosing A Database

To help you make the right decision, here are six things to consider when choosing a database for your next project.

Download White Paper

CrateDB and Docker

Scale your database like your application

Watch Video