TimescaleDB or CrateDB? The answer depends on your cardinality.

TimescaleDB is a well-designed PostgreSQL extension for time-series workloads at moderate scale. CrateDB is a distributed database built from the ground up for high-cardinality, multi-dimensional analytics on live operational data. Here is an honest comparison.

TimescaleDB is now marketed as TigerData following a 2025 rebrand. This page addresses both the TimescaleDB product and the TigerData platform.

Built as an extension vs. built as a database

TimescaleDB is a PostgreSQL extension. It adds time-series capabilities (automatic partitioning, compression, continuous aggregates) on top of a standard PostgreSQL instance. That makes it immediately familiar to PostgreSQL teams and compatible with every PostgreSQL tool in use. It also means it inherits PostgreSQL's single-node model. Horizontal scaling for self-hosted TimescaleDB requires TimescaleDB Distributed, which adds significant operational complexity.

CrateDB is not a PostgreSQL extension. It is a distributed database that uses the PostgreSQL wire protocol, so your SQL skills, BI tools, and most PostgreSQL-compatible libraries connect without modification. The distributed architecture is the foundation, not a feature that can be added. Every node in a CrateDB cluster is equal. The cluster self-balances as data grows. You scale by adding nodes; CrateDB handles the rest.

Do you need continuous aggregates?

TimescaleDB recommends Continuous Aggregates for high-query-volume analytical workloads. A Continuous Aggregate is an incrementally refreshed materialized view: it pre-computes a rollup query in the background so your dashboards do not run that aggregation on raw data every time. This works well when you know your query patterns in advance.
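To make the idea of an incrementally refreshed rollup concrete, here is a minimal Python sketch. It is a toy model of the concept only, not TimescaleDB's implementation or API: the class name `HourlyRollup`, the hourly bucket size, and the `(device, timestamp, value)` row layout are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # hourly buckets, chosen only for illustration


def bucket_start(ts: datetime) -> datetime:
    """Truncate a timezone-aware timestamp to the start of its bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


class HourlyRollup:
    """Toy stand-in for a rollup of avg(value) per device per hour (hypothetical)."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)
        self._refreshed_upto = 0  # number of raw rows already folded in

    def refresh(self, raw_rows):
        """Fold in only the rows appended since the previous refresh."""
        for device, ts, value in raw_rows[self._refreshed_upto:]:
            key = (device, bucket_start(ts))
            self._sum[key] += value
            self._count[key] += 1
        self._refreshed_upto = len(raw_rows)

    def average(self, device, hour):
        """Answer the pre-defined query from the rollup, without touching raw rows."""
        key = (device, hour)
        if key not in self._count:
            raise KeyError(key)
        return self._sum[key] / self._count[key]
```

The dashboard query reads `average(...)` from the small pre-computed structure, and each `refresh` only processes new rows. The trade-off is visible in the key: this rollup can only answer questions grouped by device and hour, because that grouping was fixed when it was defined.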

The cost: you must define your rollup queries before your data arrives. When requirements change (new dimensions, new filters, new combinations), the aggregate must be redefined and refreshed. At high cardinality (more than a few hundred thousand unique series), the number of aggregates required to cover ad-hoc query patterns grows faster than most teams can manage.
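A rough way to see why the number of aggregates grows quickly is to count the possible group-by combinations. The sketch below assumes one aggregate per distinct combination of dimensions, which is a simplifying upper bound: a fine-grained aggregate of additive metrics can often be re-aggregated to serve coarser queries. The dimension names are hypothetical.

```python
from itertools import combinations

# Hypothetical dimension names, not taken from any real schema.
DIMENSIONS = ["site", "line", "device", "sensor_type", "firmware", "customer"]


def groupby_combinations(dimensions):
    """Return every non-empty subset of dimensions a query could group by."""
    return [
        combo
        for size in range(1, len(dimensions) + 1)
        for combo in combinations(dimensions, size)
    ]


def rollup_count(n_dimensions):
    """Closed form for the same count: 2**n - 1 non-empty subsets."""
    return 2 ** n_dimensions - 1
```

Under this assumption, six dimensions give 63 possible groupings and ten give 1,023, before accounting for different bucket widths or filters.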

CrateDB does not require pre-aggregation. Ad-hoc queries return sub-second results on billions of raw records, across 900,000 unique series, without defining rollups in advance. The index is built automatically at ingestion. The query runs on the data that is actually there.

Criterion | CrateDB | TimescaleDB
Ad-hoc queries at high cardinality | Sub-second on raw data | Recommended to pre-define via Continuous Aggregates
Define query patterns in advance? | Not required | Yes, for best performance
New dimensions after deployment | Query immediately | May require new aggregates
900K+ unique series | Designed for this | Performance degrades without aggregation strategy

Feature comparison

Feature | CrateDB | TimescaleDB (TigerData)
Architecture | Distributed-native, shared-nothing | PostgreSQL extension
Horizontal scaling | Native: add nodes, cluster self-balances | Requires TimescaleDB Distributed or cloud
Time-series | Yes, native | Yes, core product
JSON / document | Yes, first-class, native | Yes, via PostgreSQL JSONB
Vector search | Yes, native, co-located with time-series | Yes, via pgvector extension
Full-text search | Yes, native, distributed, Lucene-based | Yes, via PostgreSQL tsvector / extensions
Geospatial | Yes, native | Yes, via PostGIS extension
Query language | Standard SQL (PostgreSQL wire protocol) | Standard SQL (PostgreSQL native)
Pre-aggregation required | No | Recommended for analytical query scale
Auto-indexing on ingestion | Yes, every field, milliseconds | Hypertable partitioning; additional indexes manual
Compression | Columnar compression | Up to 95% (delta, dictionary, RLE)
Time-series functions | Standard SQL window functions | 200+ Hyperfunctions (interpolation, time-weighted avg, etc.)
Edge deployment | Yes | Limited
Open source edition | CrateDB OSS (Apache 2.0) | TimescaleDB OSS (Timescale License / Apache 2.0)
Managed cloud | CrateDB Cloud (AWS, Azure, GCP) | Tiger Cloud
On-premises | Self-hosted. Support and Advanced features with CrateDB Enterprise | Self-hosted
PostgreSQL ecosystem compatibility | Most tools (wire protocol) | Full (it is PostgreSQL)

One database or five extensions?

A team running TimescaleDB for time-series typically also runs:

  • PostgreSQL JSONB for document storage
  • pgvector for vector embeddings
  • PostGIS for geospatial queries
  • PostgreSQL tsvector or an external search engine for full-text

Each extension has its own upgrade path, tuning requirements, and failure mode. Keeping data synchronized across them is a recurring engineering cost.

CrateDB stores time-series, JSON, vector, full-text, and spatial data in one engine. One ingestion pipeline. One query language. One operational surface. A single SQL query can join a time-series table with a JSON document column, apply a spatial filter, score against a vector embedding, and rank by full-text relevance. All at sub-second latency.

Guidance on choosing TimescaleDB or CrateDB

Choose TimescaleDB when:

  • Your team already runs PostgreSQL and wants to add time-series capability with zero migration
  • Your cardinality is moderate (under 100,000 unique series) and you can define your key query patterns
  • You need purpose-built Hyperfunctions: time-weighted averages, interpolation across gaps, percentile approximations
  • Single-node or simple managed cloud is sufficient for your current scale
  • Maximum PostgreSQL ecosystem compatibility is a hard requirement

Choose CrateDB when:

  • Your cardinality is high: 100,000+ unique devices, assets, or users, and pre-defining all query rollups is not feasible
  • You need to combine time-series data with JSON payloads, vector embeddings, full-text search, or spatial queries in a single system
  • Horizontal scaling is a current or near-term requirement
  • You are building an AI feature pipeline that needs real-time vectors alongside live operational data
  • You want to eliminate multiple specialized databases and the synchronization cost between them
  • You need sub-second ad-hoc query results, including on queries you have not written yet
  • Edge deployment with intermittent connectivity is required
