JSON and Document Data in CrateDB
The typical workaround for semi-structured data is either forcing it into a rigid schema — losing flexibility every time the payload changes — or using a document database that can't join, aggregate, or run analytics efficiently. CrateDB treats JSON as a first-class data type. You get document flexibility and SQL analytical power in the same engine, without a separate database or a synchronization pipeline between them.
JSON query examples
Note the bracket notation for nested field access.
/* Insert a JSON document with nested objects and arrays directly
into CrateDB using standard SQL. No schema definition required
for the dynamic fields — CrateDB adapts automatically. */
INSERT INTO community_areas(areanumber, name, details) VALUES (
35,
'DOUGLAS',
'{
"description": "The Douglas community area stretches...",
"population": 20291,
"zipcodes": [ "60609", "60616", "60653" ],
"notableresident": {
"name": "Ida B. Wells",
"bio": "An American investigative journalist, educator and..."
}
}'::OBJECT
)
/* Access nested keys using bracket notation.
Slice arrays, filter on dynamic attributes, and project any field
including deeply nested ones without unpacking the document first.
The query runs in milliseconds even across millions of records. */
/* Retrieve fields from objects and slice arrays. */
SELECT name,
array_slice(details['zipcodes'], 1, 1) as first_zip,
details['population'] AS population,
details['notableresident']['name'] AS famous_resident
FROM community_areas
WHERE areanumber = 35;
+---------+-----------+------------+-----------------+
| name | first_zip | population | famous_resident |
+---------+-----------+------------+-----------------+
| DOUGLAS | ["60609"] | 20291 | Ida B. Wells |
+---------+-----------+------------+-----------------+
/* Two documents with different structures inserted in the same statement.
CrateDB's dynamic schema handles both without a migration,
a downtime window, or a schema redesign. */
/* Insert documents for two Chicago community areas. Dynamic
object schemas allow these two documents to have different
structures. */
INSERT INTO community_areas(areanumber, name, details) VALUES
(
35,
'DOUGLAS',
'{
"description": "The Douglas community area stretches...",
"population": 20291,
"zipcodes": [ "60609", "60616", "60653" ],
"notableresident": {
"name": "Ida B. Wells",
"bio": "An American investigative journalist, educator and..."
}
}'::OBJECT
),
(
41,
'HYDE PARK',
'{
"description": "This community area is home to the Museum of Science...",
"population": 29456,
"notableresident": {
"name": "Muhammad Ali",
"born": 1942
}
}'::OBJECT
);
See JSON ingestion and querying in practice
See JSON ingestion and querying in practice
Flexible schema handling
SQL on nested JSON
Query JSON fields directly using standard SQL bracket notation. Access nested keys at any depth, filter on dynamic attributes, and run aggregations without custom query languages or intermediate processing pipelines.
Example:SELECT payload['device']['temperature']FROM readingsWHERE payload['status'] = 'active';
Joins between JSON documents and relational tables work the same way; no special syntax, no query planner hints.
Automatic indexing
CrateDB's indexing engine, built on Apache Lucene, automatically indexes every JSON attribute on ingestion, including nested fields and dynamic ones that didn't exist when the table was created. No manual setup, no index design decisions, no tuning as the schema evolves. Every property is instantly available for filtering, search, and aggregation from the moment the document lands.
Real time query readiness
Distributed and resilient
Shoott is a photography company that offers affordable, professional photoshoots through a seamless booking process and personalized customer experience. They faced a critical challenge when their analytics engine, powered by Rockset, was discontinued.
"CrateDB has enabled us to maintain seamless analytics despite the transition from Rockset. With its flexible schema and powerful indexing, we’ve been able to handle complex data efficiently and reduce manual overhead. This has allowed our teams to focus on delivering key insights while ensuring our systems remain scalable for future growth."
Dmitry Feoktistov
Development Team Lead
Shoott
Go deeper on JSON data in CrateDB
Documentation
Tutorials and workshops
Related pages
FAQ
A JSON value can be one of several types: an object, array, number, string, Boolean (true or false), or null. Objects within JSON are collections of key-value pairs, and these keys can contain values of any type, including nested objects or arrays. In CrateDB, JSON objects are treated as container data types, allowing for flexible and efficient data storage.
Storing document/JSON data efficiently requires a flexible database that can handle various attributes and nesting levels. Databases typically define objects as strict, dynamic, or ignored, depending on whether you want to enforce schemas strictly or allow for dynamic schema updates. This flexibility is crucial for managing complex data structures that involve arrays of objects and different nesting levels. In CrateDB, JSON objects can be stored in OBJECT columns, providing the versatility needed for multi-model data storage while maintaining efficient handling and query performance.
When choosing a database for document/JSON data, consider its ability to efficiently store and query JSON documents, indexing capabilities, and support for dynamic schemas. Databases like CrateDB, MongoDB, Couchbase, and ArangoDB excel in these areas, providing flexibility and efficient querying.
CrateDB's advantage in JSON workloads is not document storage alone, it's the combination of document flexibility with analytical SQL. You can store evolving JSON payloads without predefined schemas and then run joins, aggregations, window functions, and full-text search across them using standard SQL. That combination — document flexibility plus analytical SQL plus horizontal scalability — is what makes CrateDB different from dedicated document databases.
Document/JSON data is highly versatile and used across various industries for web applications, APIs, configuration files, data storage, and logging. This format supports flexible, schema-less structures, making it ideal for rapidly changing datasets and complex data models. In CrateDB, JSON data is efficiently stored, indexed, and queried, providing seamless integration with PostgreSQL clients and robust search capabilities.