Download the CrateDB Architecture Guide

Download Now
Skip to content
Data models

JSON and Document Data in CrateDB

Store semi-structured data without predefined schemas. Query nested fields, filter dynamic attributes, and run aggregations. All in standard SQL.

The typical workaround for semi-structured data is either forcing it into a rigid schema — losing flexibility every time the payload changes — or using a document database that can't join, aggregate, or run analytics efficiently. CrateDB treats JSON as a first-class data type. You get document flexibility and SQL analytical power in the same engine, without a separate database or a synchronization pipeline between them.

JSON query examples

Note the bracket notation for nested field access.

        

/* Insert a JSON document with nested objects and arrays directly 
   into CrateDB using standard SQL. No schema definition required 
   for the dynamic fields — CrateDB adapts automatically. */

INSERT INTO community_areas(areanumber, name, details) VALUES (
   35,
   'DOUGLAS',
   '{
      "description": "The Douglas community area stretches...",
      "population": 20291,
      "zipcodes": [ "60609", "60616", "60653" ],
      "notableresident": {
         "name": "Ida B. Wells",
         "bio": "An American investigative journalist, educator and..."
      }
   }'::OBJECT
)
        
        
        
        

/* Access nested keys using bracket notation. 
   Slice arrays, filter on dynamic attributes, and project any field 
   including deeply nested ones without unpacking the document first.
   The query runs in milliseconds even across millions of records. */

/* Retrieve fields from objects and slice arrays. */
SELECT name, 
       array_slice(details['zipcodes'], 1, 1) as first_zip, 
       details['population'] AS population, 
       details['notableresident']['name'] AS famous_resident 
FROM community_areas 
WHERE areanumber = 35;

        

+---------+-----------+------------+-----------------+
| name    | first_zip | population | famous_resident |
+---------+-----------+------------+-----------------+
| DOUGLAS | ["60609"] |      20291 | Ida B. Wells    |
+---------+-----------+------------+-----------------+

 

        

/* Two documents with different structures inserted in the same statement.
   CrateDB's dynamic schema handles both without a migration, 
   a downtime window, or a schema redesign. */

/* Insert documents for two Chicago community areas.  Dynamic 
   object schemas allow these two documents to have different
   structures. */
INSERT INTO community_areas(areanumber, name, details) VALUES 
  (
    35,
    'DOUGLAS',
    '{
      "description": "The Douglas community area stretches...",
      "population": 20291,
      "zipcodes": [ "60609", "60616", "60653" ],
      "notableresident": {
        "name": "Ida B. Wells",
        "bio": "An American investigative journalist, educator and..."
      }
    }'::OBJECT
  ),
  (
    41,
    'HYDE PARK',
    '{
      "description": "This community area is home to the Museum of Science...",
      "population": 29456,
      "notableresident": {
        "name": "Muhammad Ali",
        "born": 1942
      }
    }'::OBJECT    
  );
        
        
        

See JSON ingestion and querying in practice

See JSON ingestion and querying in practice

Watch how to ingest nested JSON documents and query them with SQL in CrateDB, including dynamic schema flexibility and multi-model queries across JSON, geospatial, and full-text data in a single session.

Flexible schema handling

JSON payloads rarely stay the same. Device firmware updates add fields, API versions change structure, new event types appear. CrateDB's dynamic schema adapts as data evolves; new fields are added automatically as documents are ingested, with no DDL changes, no downtime, and no migration scripts. You can model only the fields you need as typed columns and keep the rest flexible, with full SQL access to everything.
cr-quote-image

SQL on nested JSON

Query JSON fields directly using standard SQL bracket notation. Access nested keys at any depth, filter on dynamic attributes, and run aggregations without custom query languages or intermediate processing pipelines.

Example:
SELECT payload['device']['temperature']
FROM readings
WHERE payload['status'] = 'active';


Joins between JSON documents and relational tables work the same way; no special syntax, no query planner hints.

cr-quote-image

Automatic indexing

CrateDB's indexing engine, built on Apache Lucene, automatically indexes every JSON attribute on ingestion, including nested fields and dynamic ones that didn't exist when the table was created. No manual setup, no index design decisions, no tuning as the schema evolves. Every property is instantly available for filtering, search, and aggregation from the moment the document lands.

cr-quote-image

Real time query readiness

Incoming JSON documents are queryable within milliseconds of ingestion. Dashboards, alerting systems, and automation pipelines see fresh data immediately. No batch delay, no indexing lag, no separate refresh cycle.

Distributed and resilient

CrateDB distributes JSON documents across nodes for parallel processing and high availability. Query performance stays consistent as the dataset grows; the same query that runs in milliseconds on ten million documents runs in milliseconds on ten billion.
cr-quote-image

CrateDB handles JSON alongside time-series, vector, geospatial, full-text, and relational data in the same engine. No separate pipelines.

Shoott is a photography company that offers affordable, professional photoshoots through a seamless booking process and personalized customer experience. They faced a critical challenge when their analytics engine, powered by Rockset, was discontinued.

"CrateDB has enabled us to maintain seamless analytics despite the transition from Rockset. With its flexible schema and powerful indexing, we’ve been able to handle complex data efficiently and reduce manual overhead. This has allowed our teams to focus on delivering key insights while ensuring our systems remain scalable for future growth."

Dmitry Feoktistov
Development Team Lead
Shoott

Shoott_photo-1
DriveNow helps travelers find and easily compare the best online rates for car and campervan rentals in real time. They use CrateDB to store clickstream data, comprising logs of pages users visit, links they click, search filters they select and site-generated emails they interact with. Real-time queries are made to analyze how promotional campaigns, user interface design changes and A/B tests affect the user experience.
"CrateDB is ideal because it's capable of writing data at a high rate, and delivering fast queries to our business team at the same time. We couldn't have done that using a traditional SQL database without a lot of difficulty."
Drivenow

Go deeper on JSON data in CrateDB

Tutorials and workshops

Related pages

FAQ

A JSON value can be one of several types: an object, array, number, string, Boolean (true or false), or null. Objects within JSON are collections of key-value pairs, and these keys can contain values of any type, including nested objects or arrays. In CrateDB, JSON objects are treated as container data types, allowing for flexible and efficient data storage.

Storing document/JSON data efficiently requires a flexible database that can handle various attributes and nesting levels. Databases typically define objects as strict, dynamic, or ignored, depending on whether you want to enforce schemas strictly or allow for dynamic schema updates. This flexibility is crucial for managing complex data structures that involve arrays of objects and different nesting levels. In CrateDB, JSON objects can be stored in OBJECT columns, providing the versatility needed for multi-model data storage while maintaining efficient handling and query performance.

When choosing a database for document/JSON data, consider its ability to efficiently store and query JSON documents, indexing capabilities, and support for dynamic schemas. Databases like CrateDB, MongoDB, Couchbase, and ArangoDB excel in these areas, providing flexibility and efficient querying. 

CrateDB's advantage in JSON workloads is not document storage alone, it's the combination of document flexibility with analytical SQL. You can store evolving JSON payloads without predefined schemas and then run joins, aggregations, window functions, and full-text search across them using standard SQL. That combination — document flexibility plus analytical SQL plus horizontal scalability — is what makes CrateDB different from dedicated document databases.

Document/JSON data is highly versatile and used across various industries for web applications, APIs, configuration files, data storage, and logging. This format supports flexible, schema-less structures, making it ideal for rapidly changing datasets and complex data models. In CrateDB, JSON data is efficiently stored, indexed, and queried, providing seamless integration with PostgreSQL clients and robust search capabilities.

Start querying in minutes

Free forever on CrateDB Cloud. No credit card required.