The 2024 CrateDB architecture guide covering all key concepts is out.

Download now
Skip to content
Solutions

Multi-model Database

CrateDB is an open source, multi-model and distributed database that offers high performance, scalability and flexibility. You can easily query data with SQL. CrateDB supports structured, semi-structured, and unstructured data types: tables, JSON, time series, vectors, geospatial data, and binary objects, simplifying data management and reducing the need for multiple databases and complex synchronizations.

Open-source multi-model database with SQL

Hyper-fast. Queries in milliseconds.

        

/* Table definition to store truck reports on delivery routes */
CREATE TABLE truck_report (
   truck_id TEXT,
   ts TIMESTAMP,
   status SMALLINT,
   details TEXT INDEX using fulltext with (analyzer = 'english'),
   location GEO_POINT,
   extra OBJECT AS (
      product_ids ARRAY(TEXT)
   ),
   PRIMARY KEY (truck_id, ts)
);

/* Table definition to store product details */
CREATE TABLE product (
    product_id TEXT PRIMARY KEY,                   
    name TEXT,                               
    description TEXT INDEX using fulltext with (analyzer = 'english'),                                        
    dimensions OBJECT AS (
      length DOUBLE PRECISION,
      width DOUBLE PRECISION
    ),             
    images_urls ARRAY(TEXT),
    tags ARRAY(TEXT),               
    image_features FLOAT_VECTOR(1024),            
    additional_info OBJECT(IGNORED)
);
        

+--------------+---------------+--------+---------------------------------------+-----------------------------------------+-------------------------------------------+
| truck_id     |            ts | status | details                               | location                                | extra                                     |
+--------------+---------------+--------+---------------------------------------+-----------------------------------------+-------------------------------------------+
| truck11A28   | 1701608750000 |     12 | unknown issues detected - retry later | [13.467379929497838, 52.50462996773422] | {"product_ids": ["product1"] }            |
| truck11A28   | 1701608700000 |      1 |                                       | [13.520951000102300, 52.52981911123839] | {}                                        |
| truck102GG   | 1701608700000 |     43 | several unknown issues identified     | [12.700487001203748, 52.80146022102753] | {}                                        |
| truck3C20Z   | 1701605100000 |      2 |                                       | [12.467379929497838, 51.50462996773422] | {"product_ids": ["product7", "product1"]} |
| truck001ZB   | 1701608700000 |      2 |                                       | [13.568370029497838, 52.50462996773422] | {}                                        |
+--------------+---------------+--------+---------------------------------------+-----------------------------------------+-------------------------------------------+

+------------+------------+-------------------------+--------------------------------+-----------------------------------------------------------------------------------+-----------------------------+------------------------------------------------------------+-----------------+
| product_id | name       | description             | dimensions                     | images_urls                                                                       | tags                        | image_features                                             | additional_info |
+------------+------------+-------------------------+--------------------------------+-----------------------------------------------------------------------------------+-----------------------------+------------------------------------------------------------+-----------------+
| product1   | red desk   | study desk for children | {"height": 72, "width": 67.2}  | ["cdn.images.com/photo/desk-up-1538412", "cdn.freeimages.com/photo/desk-1538412"] | ["red", "soft", "children"] | [-0.3, 0.555555, 0.1, 0.14, 1.0, ... , 0.0, -0.33333, 0.0] | NULL            |
| product7   | teddy bear | green fluffy teddy bear | {"height": 32, "width": 14.55} | ["cdn.images.com/photo/tb-up22-1538412", "cdn.freeimages.com/photo/tb-1538412"]   | ["green", "fluffy", "toy"]  | [0.2, 0.21, -0.3333, -0.7, 0.0, ... , 0.25, 0.33333, 0.92] | NULL            |
+------------+------------+-------------------------+--------------------------------+-----------------------------------------------------------------------------------+-----------------------------+------------------------------------------------------------+-----------------+
        

/* Based on image features vector, retrieve the most similar products to a given item */
SELECT name, _score 
FROM product
WHERE KNN_MATCH(image_features, [0.14, 1, 0, ..., 0.555, 1], 2) 
/* embedding for a pink dinosaur */
ORDER BY _score DESC;
        

+-----------------+-------------+
| name            |      _score |
+-----------------+-------------+
| purple dinosaur | 0.813950193 |
| green dinosaur  | 0.785511027 |
+-----------------+-------------+
        

/* Get the name of the products transported by trucks that reported their location
whithin a region */
SELECT distinct truck_id, product_id, name
FROM truck_report
JOIN product ON product_id = ANY(extra['product_ids'])
WHERE WITHIN(
        location, 
        'POLYGON(
          (13.376072 52.541463, 
           13.443165 52.540162, 
           13.445887 52.504546,
           13.380545 52.505848,
           13.376072 52.541463)
         )'
      )   
  AND ts > (CURRENT_TIMESTAMP - INTERVAL '15 minutes');
        

+-------------+------------+------------+
| truck_id    | product_id | name       |
+-------------+------------+------------+
| truck11A28  | product1   |  red desk  |
| truck3C20Z  | product1   |  red desk  |
| truck3C20Z  | product7   | teddy bear |
+-------------+------------+------------+
        

/* Based on the truck reports, retrieve the most recent reported issues
for each truck_id */
SELECT truck_id, MAX(ts), MAX_BY(details, ts) AS issue_details
FROM truck_report
WHERE MATCH(details, 'issue')
GROUP BY truck_id
LIMIT 3;
        

+--------------+---------------+---------------------------------------+
| truck_id     | max(ts)       | issue_details                         |
+--------------+---------------+---------------------------------------+
| truck11A28   | 1701608750000 | unknown issues detected - retry later |
| truck102GG   | 1701608700000 | several unknown issues identified     |
| truck2220Z   | 1701608520000 | delayed delivery - route issue        |
+--------------+---------------+---------------------------------------+
        

/* Based on delivery truck reports, this query returns the number of reports per day
that were sent within Berlin's city center region */
SELECT date_bin('24 hours'::interval, ts, 0) AS day, 
       COUNT(*) AS num_reports
FROM truck_report
WHERE WITHIN(
        location, 
        'POLYGON(
          (13.362382 52.540331,
           13.365300 52.498027,
           13.453191 52.498549,
           13.448727 52.539913,
           13.362382 52.540331)
         )'
      ) 
    AND ts > (CURRENT_TIMESTAMP - INTERVAL '15 days') 
GROUP BY 1
ORDER BY 1 DESC;
        

+---------------+-------------+
|           day | num_reports |
+---------------+-------------+
| 1701648000000 |          25 |
| 1701561600000 |         280 |
| 1701475200000 |        1060 |
| 1701388800000 |        2970 |
| 1701302400000 |        2844 |
| 1701216000000 |        2462 |
| 1701129600000 |        2331 |
| 1701043200000 |        2573 |
| 1700956800000 |         582 |
| 1700870400000 |        1404 |
| 1700784000000 |        2791 |
| 1700697600000 |        1953 |
| 1700611200000 |        2046 |
| 1700524800000 |        3177 |
| 1700438400000 |        2446 |
+---------------+-------------+

Multiple data models

Handle multiple data models seamlessly within the same database, including structured (table), unstructured (vector, text), semi-structured (JSON) and binary (BLOB). It can also handle multiple formats, such as geospatial, time series, and log data. All of this is natively supported by a storage engine built on top of Apache Lucene

CrateDB's flexible data modeling allows for the collection and storage of a wide range of data types. It can accept any type of data format without needing specific development and collect data from various sources, such as enterprise application data (CRM, ERP), analytics data, streams data, sensor data, API data, and data from data lakes or other databases.

cr-quote-image

Dynamic schema

Modify and adapt data structures seamlessly with dynamic schema. Its adaptability allows data models to evolve without constraints, ensuring a seamless transition as data needs change over time.

CrateDB's versatile data modeling also enables the storage of complex objects and nested objects without human intervention. New data types and formats can be added on the fly, adapting to changing needs and requirements and eliminating the need for multiple databases to be configured for synchronization. See Dynamic Database Schemas >

cr-quote-image

SQL query language

Interact uniformly with diverse data models using native SQL in CrateDB. Its unified query language simplifies data manipulation across various models, enabling easy query and seamless data manipulation.
cr-quote-image

Scalability

Scale effortlessly to meet growing demands with CrateDB's horizontal scaling capabilities. Whether handling increased data volumes or user traffic, CrateDB ensures scalability by adding more nodes or enhancing individual node resources.

cr-quote-image

Multiversion concurrency control

Experience eventually consistent and efficient data access with multi-version concurrency control. Atomic operations and durability guarantees on record level ensure that writes to documents either succeed entirely or have no effect, queries by primary key always return the latest results.

cr-quote-image

Indexing and query optimization

Optimize performance with efficient indexing in CrateDB. Its robust indexing and query optimization mechanisms, combined with automatic indexing, columnar storage, and highly efficient tree structures, enhance performance across various data models and use cases.

cr-quote-image

Data replication and distribution

Ensure fault tolerance and high availability through CrateDB's support for data replication and distribution across multiple nodes. It ensures data reliability, disaster recovery, and uninterrupted operations in distributed systems.
cr-quote-image

Easy and seamless integrations

Integrate with various programming languages, frameworks, and tools with native SQL and the PostgreSQL Wire Protocol. Drivers and libraries are available for many programming languages, as well as a REST API.

View a sample list of integrations >

cr-quote-image

Community and support

Benefit from a supportive developer community and versatile support plans. CrateDB’s open source licensing model leverages an active community's power and lowers your licensing costs. Whether you want complete peace of mind with the SaaS model or deploy the product yourself, we have the right option if you choose a fast and scalable open source multi-model database.
cr-quote-image

Modeling data in CrateDB

Learn from Karyn Azevedo, Solution Engineer at CrateDB​, how to choose the right data type for your use case.

Timestamp: 5:31 - 18:37

"Thanks to CrateDB's great indexing, dedicated data types, and subsequent great performance, we could execute an event and data-driven architecture, with the performance and scalability necessary for storing time-series data over time. The SQL query syntax capability of CrateDB also played a part in achieving this great outcome, as it made it easy for the team to write good performing queries using existing know-how. CrateDB is now an integral part of our big data streaming architecture and it is delivering as promised."
Kristoffer Axelsson Principal Solution Architect Thomas Concrete Group Learn more
cr-quote-img-white
"With CrateDB it was extremely easy to have a single place that we could query through our entire system within milliseconds at any moment in time, and this was impossible before."
Dmytro Boguslavskyy CTO & Co-Founder kooky Learn more
cr-quote-img-white
"CrateDB was a better solution for our needs than any other SQL or NoSQL database we tried. It was easy to migrate code off of our legacy SQL database and onto CrateDB to immediately benefit from its data flexibility and scalable performance."
Sheriff Mohamed Director of Architecture GolfNow Learn more
cr-quote-img-white
"Postgres couldn't keep up with the data we have; Datastax Enterprise had ingest scaling issues with spatial data; Cassandra didn't have spatial query operations. CrateDB was the only database we found that could smoothly process data for our users and for our data science team. We fell in love with it immediately."
Kartik Venkatesh CTO Spatially Health Learn more
cr-quote-img-white

Other resources on multi-model databases

Webinar

Key Concepts, Architecture, and Live Demo

Documentation

Documentation