
AI/ML Database

CrateDB is an open source, multi-model, distributed database that offers high performance, scalability and flexibility. It is designed to scale with your machine learning and AI workloads. CrateDB supports a wide range of use cases across industries while reducing development time and total cost of ownership through a single, scalable platform with native SQL support.

Open source AI/ML database, all with SQL

Hyper-fast. Queries in milliseconds.

        

SELECT text, _score
FROM word_embeddings
WHERE knn_match(embedding,[0.3, 0.6, 0.0, 0.9], 2)
ORDER BY _score DESC; 
        

|------------------------|--------|
|         text           | _score |
|------------------------|--------|
|Discovering galaxies    |0.917431|
|Discovering moon        |0.909090|
|Exploring the cosmos    |0.909090|
|Sending the mission     |0.270270|
|------------------------|--------|
        

SELECT text, _score
FROM word_embeddings
WHERE knn_match(embedding, (SELECT embedding FROM word_embeddings WHERE text ='Discovering galaxies'), 2)
ORDER BY _score DESC;
        

|------------------------|--------|
|         text           | _score |
|------------------------|--------|
|Discovering galaxies    |1       |
|Discovering moon        |0.952381|
|Exploring the cosmos    |0.840336|
|Sending the mission     |0.250626|
|------------------------|--------|
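
The `_score` values in both result tables are consistent with a similarity of 1 / (1 + d²), where d is the Euclidean distance between the query vector and the stored embedding. The following is a minimal Python sketch of that calculation, not CrateDB's implementation. The page does not show the rows of `word_embeddings`, so the four vectors below are assumed values chosen to be consistent with the scores listed; `euclidean_score` and `rank` are hypothetical helper names.

```python
# Assumed example rows: the page does not list the contents of word_embeddings.
ROWS = {
    "Exploring the cosmos": [0.1, 0.5, -0.2, 0.8],
    "Discovering moon": [0.2, 0.4, 0.1, 0.7],
    "Discovering galaxies": [0.2, 0.4, 0.2, 0.9],
    "Sending the mission": [0.5, 0.9, -0.1, -0.7],
}


def euclidean_score(a, b):
    """Map the squared Euclidean distance between a and b to a score in (0, 1]."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)


def rank(query):
    """Return (text, score) pairs sorted by descending score, like ORDER BY _score DESC."""
    scored = [(text, euclidean_score(vector, query)) for text, vector in ROWS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # First query: a literal vector.
    for text, score in rank([0.3, 0.6, 0.0, 0.9]):
        print(f"{text:<24}{score:.6f}")
    # Second query: the embedding of an existing row, as in the subselect.
    for text, score in rank(ROWS["Discovering galaxies"]):
        print(f"{text:<24}{score:.6f}")
```

Under these assumptions the computed scores agree with the tables up to rounding in the last digit, and a row compared with itself scores exactly 1.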

Vector storage

With vector storage, you can easily store and retrieve embeddings generated by ML models, seamlessly integrating vectorized data with your existing datasets. It allows you to enrich your existing data with semantics, providing context that aligns with your data and enhancing explainability.


Advanced search capabilities

CrateDB offers advanced search capabilities through similarity search and flexible filtering, and it can combine full-text and vector search. Similarity search finds related items across any data represented as vectors, while combining it with full-text search improves precision by matching on both keywords and semantic similarity. These features support better recommendations, anomaly detection, and other AI/ML use cases.
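
The page does not describe how full-text and vector results are merged. One common client-side approach is reciprocal rank fusion (RRF), which is swapped in here purely for illustration: each result list contributes 1 / (k + rank) per item, and items are ordered by the summed contribution. The function name, the constant `k=60`, and both hit lists are assumptions, not CrateDB output.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each list adds 1 / (k + position) to an item's fused score, so items
    that rank well in more than one list rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    # Sort by descending fused score; break ties by id for a stable order.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


# Hypothetical result lists, best match first.
keyword_hits = ["Discovering galaxies", "Discovering moon"]
vector_hits = ["Discovering moon", "Exploring the cosmos", "Discovering galaxies"]

if __name__ == "__main__":
    for doc_id in reciprocal_rank_fusion([keyword_hits, vector_hits]):
        print(doc_id)
```

In this sketch an item that appears near the top of both lists outranks one that tops a single list, which is the intuition behind combining keyword and semantic matching.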


Ingestion

CrateDB can ingest, process and analyze multi-structured data and dynamic schemas: relational tables, complex objects and arrays, time-series, geospatial, and textual data. It handles millions of events per second, arriving in real-time streams or in batches, without sacrificing data durability. Data can be ingested from various sources, such as Amazon S3 and Kafka.

Native SQL support

CrateDB is an SQL database that implements the PostgreSQL Wire Protocol. With CrateDB, you can query even complex and dynamic schemas in familiar SQL, without the need to learn custom languages. Massively parallel query execution ensures fast response times, making it well suited to ad-hoc queries across large datasets, including those commonly encountered in AI/ML applications.


Ecosystem

CrateDB integrates with your AI and analytics stack through its support for the PostgreSQL Wire Protocol. Take advantage of CrateDB's native SQL support for complex data analytics to accelerate integration with AI models and optimize your AI projects.



Reduced TCO

CrateDB offers a low Total Cost of Ownership (TCO) by eliminating the need to manage multiple systems. It seamlessly integrates your data, keeping your (meta-)data and vector representations aligned without the complexity of data synchronization processes. With its use of native SQL, CrateDB simplifies development and ensures compatibility with existing systems.


CrateDB at AI & Big Data Expo

CrateDB's VP Product shares his vision for the future with multi-model SQL databases and Large Language Models. 

How to Build AI-driven Knowledge Assistants with a Vector Store, LLMs and RAG Pipelines

CrateDB is an open source distributed database designed for AI/ML use cases. It efficiently manages diverse data types and ensures real-time data accessibility for continuous model training and prediction. With vector storage and similarity search features, CrateDB unlocks new dimensions of efficiency in complex data analytics, pattern recognition, and AI. All of this is built on a scalable architecture that supports native SQL, facilitating streamlined querying and reducing system complexity. Whether in the cloud, on-premises, or at the Edge, CrateDB offers the flexibility and efficiency needed for all AI and ML operations.

"Working with CrateDB brings positive outcomes. The ingestion and throughput have very good performance, with 1 million values/sec, the horizontal scalability where we can add as many nodes as we need and the automatic query distribution across the whole cluster."
Marko Sommarberg, Lead, Digital Strategy and Business Development, ABB

"CrateDB is the only database that gives us the speed, scalability and ease of use to collect and aggregate measurements from hundreds of thousands of industrial sensors for real-time visibility into power, temperature, pressure, speed and torque."
Jürgen Sutterlütti, Vice President, Energy Segment and Marketing, Gantner Instruments

"CrateDB gives us ease of SQL combined with easy scaling, and real-time querying of full-text data."
Qualtrics

Other resources on AI/ML databases