Data Historian vs. Time Series Database: Which Belongs in Your Industrial Stack

Most industrial teams evaluating their data stack eventually hit the same question: do we need a data historian or a time series database? If we already have a historian, do we replace it? If we are evaluating a time series database, does it make the historian redundant?

The question makes sense. Both systems store time-stamped operational data. Both appear in industrial IoT architecture diagrams. But they are designed for fundamentally different jobs. Framing the decision as a binary choice between a data historian and a time series database is the reason most teams end up solving the wrong problem.

This post compares the two across the dimensions that matter for industrial teams: ingestion protocol, query language, analytics capability, and deployment model. The goal is not to declare one winner. The goal is to help you understand what each system is actually built to do, and how to stack them correctly.

What a data historian is built to do

A data historian collects time-stamped values from operational technology (OT) sources: sensors, PLCs, DCS systems, and SCADA platforms. It speaks the languages of the plant floor natively: OPC-UA, OPC-DA, Modbus, and proprietary vendor protocols. This is its defining strength.

Data is organized around tags. Each tag represents a single measurement point: a temperature reading at a specific location, a pressure value from a specific sensor, a flow rate on a specific line. A historian tracks thousands to millions of these tags, storing the value, timestamp, and quality code for every reading. The quality code is important: historians certified for production use carry data integrity guarantees that time series databases do not replicate.
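To make the tag model concrete, here is a minimal Python sketch of a reading that carries a value, a timestamp, and a quality code. The tag names, the three quality labels, and the `good_values` helper are assumptions made up for illustration; real historians define their own quality vocabularies and retrieval APIs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical quality labels; each historian defines its own codes.
GOOD, UNCERTAIN, BAD = "good", "uncertain", "bad"


@dataclass(frozen=True)
class TagReading:
    tag: str        # measurement point name (made-up naming scheme)
    ts: datetime    # timestamp of the reading
    value: float    # measured value
    quality: str    # quality code attached to the reading


def good_values(readings, tag):
    """Return (ts, value) pairs for one tag, keeping only good-quality readings."""
    return [(r.ts, r.value) for r in readings if r.tag == tag and r.quality == GOOD]


readings = [
    TagReading("Line1.Oven.Temp", datetime(2024, 1, 2, 14, 32, tzinfo=timezone.utc), 181.5, GOOD),
    TagReading("Line1.Oven.Temp", datetime(2024, 1, 2, 14, 33, tzinfo=timezone.utc), 0.0, BAD),
    TagReading("Line1.Feed.Flow", datetime(2024, 1, 2, 14, 32, tzinfo=timezone.utc), 12.4, GOOD),
]

oven = good_values(readings, "Line1.Oven.Temp")
```

The point of the sketch is that the quality code travels with every reading, so a consumer can discard a bad sample (the 0.0 above) instead of treating it as a real temperature.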

Historians are deployed on the OT network, often in air-gapped environments where production data must not leave the facility. Systems like AVEVA PI (formerly OSIsoft PI), Honeywell Uniformance, AVEVA Historian (formerly Wonderware), and Siemens SIMATIC Historian dominate this category. They have decades of industrial deployment behind them. Plant operations teams know them. Regulatory auditors accept their certified data records.

What historians are not built to do: cross-asset analytics, cross-plant queries, joins with ERP or MES context, or OEE calculations that span multiple production lines. Tag retrieval is their model. Analytics is not.

What a time series database is built to do

A time series database stores and queries time-stamped numerical and categorical data. It is designed for the IT side of the stack: ingestion via HTTP APIs, message brokers like Kafka and MQTT, or collection agents like Telegraf.

Where historians organize data around tags, time series databases organize data around tables, measurements, or series, depending on the system. They support analytical queries: aggregations, time-windowed summaries, percentile calculations, and, in systems that support SQL, cross-series joins. They expose standard interfaces. Grafana, Superset, and most BI tools connect to SQL-capable time series databases without custom connectors.

The trade-off is OT connectivity. Most time series databases were not designed for native OPC-UA or OPC-DA ingestion. Getting data from a historian or PLC into a time series database requires a bridge: Telegraf, a custom ingestion agent, or a historian replication layer. The OT connectivity requirement does not disappear. It shifts from "does the database speak OPC-UA" to "how does data get from the plant floor to the database."
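As a rough illustration of what such a bridge has to do, the Python sketch below maps tag-style readings onto flat rows and groups them into batches for bulk writes. Everything here is hypothetical: the "plant.line.sensor" tag naming, the column names, and the batch size are assumptions, and the code stops short of the actual network write, which depends on the target database.

```python
from itertools import islice


def to_row(tag, ts_iso, value, quality):
    """Map one historian-style reading to a flat row for a table-oriented store."""
    # Assumes tags follow a "plant.line.sensor" naming scheme (hypothetical).
    plant, line, sensor = tag.split(".", 2)
    return {
        "plant_id": plant,
        "line_id": line,
        "sensor": sensor,
        "ts": ts_iso,
        "value": value,
        "quality": quality,
    }


def batches(rows, size):
    """Yield lists of at most `size` rows so writes can be sent in bulk."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Made-up readings standing in for whatever the historian or agent exports.
source = [
    ("P01.L3.temp", "2024-01-02T14:32:00Z", 181.5, "good"),
    ("P01.L3.pressure", "2024-01-02T14:32:00Z", 2.4, "good"),
    ("P02.L1.temp", "2024-01-02T14:32:00Z", 176.0, "uncertain"),
]

rows = [to_row(*reading) for reading in source]
chunks = list(batches(rows, 2))
```

Splitting the tag into plant, line, and sensor columns at this stage is one possible design choice: it is what later lets the analytics side group and join by asset rather than by opaque tag name.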

That distinction matters when you are evaluating a data historian vs. a time series database for an industrial IoT database role. The historian already solved the OT connectivity problem. A time series database asks you to solve it again.

Direct comparison: what each system handles

| Dimension | Data historian | Time series database |
| --- | --- | --- |
| Primary role | OT data collection and storage | Analytics and querying |
| Ingestion protocol | OPC-UA, OPC-DA, Modbus, proprietary | HTTP API, Kafka, MQTT, Telegraf |
| Query language | Proprietary (PI SQL, PI Web API, REST) | SQL or proprietary DSL |
| Cross-asset joins | Not supported | Supported (SQL systems) |
| Analytics depth | Tag retrieval, simple aggregations | CTEs, window functions, complex aggregations |
| Cardinality handling | Tag-based indexing | Varies by database |
| Deployment model | OT network, often air-gapped | IT network, cloud or on-premises |
| Certified data integrity | Built-in | Limited or not built-in |
| BI tool integration | Limited | Native (Grafana, Tableau, Superset) |

The distinction is not performance. It is purpose. A historian is an OT data collection system with retrieval capability. A time series database is an analytics engine that handles time-indexed data. They are both correct. They are solving different problems.

Why replacing the historian is usually the wrong conclusion

Teams evaluating time series databases often do so because the historian's analytics are too thin. Queries that span multiple asset types or multiple production sites are slow or impossible. Building an OEE dashboard in the historian's native interface requires workarounds. Joining sensor readings with ERP maintenance records is not supported. These are real limitations.

But they are limitations of the historian's analytics layer, not its data collection capability. The historian is doing its job: collecting, storing, and certifying plant-floor data from OT sources. No time series database replicates its OPC-UA connectivity, its data integrity certification, or its decades of industrial deployment history.

Replacing the historian means rebuilding OT connectivity from scratch, re-certifying data integrity for regulatory purposes, and convincing plant operations teams to trust a new system with production data. That is a large project. The analytics problem does not require it.

The correct reframe: what are you actually trying to fix? If the answer is "I cannot run cross-plant OEE queries," "my shift dashboards are four hours stale," or "joining sensor data with maintenance records requires a scheduled export job," those are analytics problems. The historian is not the right tool for analytics. Adding a time series database alongside the historian solves the analytics problem without touching the historian.
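The sensor-to-maintenance join mentioned above can be sketched as a single SQL statement once both datasets sit in a SQL engine. The example below uses Python's built-in sqlite3 purely as a local stand-in so it runs without a server; the table names, column names, and sample rows are all invented for illustration, and a production time series database may differ in types and syntax.

```python
import sqlite3

# In-memory SQLite as a stand-in for a SQL-capable analytics database.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sensor_readings (asset_id TEXT, ts TEXT, vibration REAL);
CREATE TABLE maintenance_orders (asset_id TEXT, opened_at TEXT, closed_at TEXT);

INSERT INTO sensor_readings VALUES
  ('pump-1', '2024-01-02T10:00:00', 2.1),
  ('pump-1', '2024-01-02T11:00:00', 7.9),
  ('pump-2', '2024-01-02T10:00:00', 1.8);

INSERT INTO maintenance_orders VALUES
  ('pump-1', '2024-01-02T10:30:00', '2024-01-02T12:00:00');
""")

# Readings taken while a maintenance order was open on the same asset.
# ISO 8601 strings of equal length compare correctly as text.
rows = con.execute("""
SELECT r.asset_id, r.ts, r.vibration
FROM sensor_readings AS r
JOIN maintenance_orders AS m
  ON m.asset_id = r.asset_id
 AND r.ts BETWEEN m.opened_at AND m.closed_at
ORDER BY r.ts
""").fetchall()

con.close()
```

Only the 11:00 reading from pump-1 falls inside the open maintenance window, so it is the only row returned. The same question asked of a tag-retrieval interface would typically need an export step first.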

The architecture that works: historian for ingestion, CrateDB for analytics

The architecture that resolves this cleanly keeps the historian in place and adds a time series database as the analytics layer. Data flows from the historian to the time series database via replication, a shared OT source feeding both systems in parallel, or a forwarding agent. The OT network stays unchanged. The historian continues collecting certified production data. The analytics layer handles dashboards, OEE calculations, cross-site queries, and integration with IT systems.

CrateDB fits this analytics layer role for industrial workloads because of how it handles the data structure industrial sensors actually produce: high cardinality, dynamic schemas, and the need to join across assets, plants, and operational context in a single SQL query.

ALPLA, a global packaging manufacturer running 181 production facilities, stores 900 sensor types in a single CrateDB table. Dashboard queries that previously took 3 to 5 minutes now return in milliseconds. The historian handles plant-floor ingestion and data integrity. CrateDB handles the cross-plant analytics the historian was never designed to run.

For a detailed view of what a modern data historian covers and where it stops, the Modern Data Historian page covers the full scope of historian architecture and how CrateDB fits in the OT/IT stack.

What CrateDB adds as the analytics layer

When CrateDB sits alongside the historian, queries that were impossible or too slow become standard SQL. The historian answers "what was sensor B7's temperature at 14:32 on Tuesday." CrateDB answers "what is the OEE across all lines in all plants this week, broken down by shift."

A cross-plant OEE query across all facilities:

-- OEE summary by facility and shift, last 7 days
SELECT
    plant_id,
    shift_id,
    DATE_TRUNC('day', ts)                       AS day,
    AVG(availability)                           AS avg_availability,
    AVG(performance)                            AS avg_performance,
    AVG(quality)                                AS avg_quality,
    AVG(availability * performance * quality)   AS oee
FROM production_metrics
WHERE ts >= NOW() - INTERVAL '7 days'
GROUP BY plant_id, shift_id, day
ORDER BY day DESC, plant_id, shift_id;

This query runs against data that arrived seconds ago. No export job. No nightly batch. No ETL pipeline to a reporting database.
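One detail of the query is worth checking by hand: the `oee` column averages the per-row product of availability, performance, and quality, which is generally not the same number as multiplying the three averaged columns together. The Python sketch below works through that arithmetic on two made-up rows; the sample values are assumptions chosen only to make the difference visible.

```python
from statistics import mean

# Two hypothetical rows of (availability, performance, quality).
samples = [
    (0.90, 0.80, 1.00),
    (0.50, 1.00, 0.90),
]

# What AVG(availability * performance * quality) computes: mean of per-row OEE.
per_row_oee = [a * p * q for a, p, q in samples]
avg_of_products = mean(per_row_oee)

# What multiplying the three averaged columns would give instead.
avg_a = mean(a for a, _, _ in samples)
avg_p = mean(p for _, p, _ in samples)
avg_q = mean(q for _, _, q in samples)
product_of_avgs = avg_a * avg_p * avg_q

print(round(avg_of_products, 4), round(product_of_avgs, 4))
```

Here the per-row values are 0.72 and 0.45, so the averaged OEE is 0.585, while the product of the averages is 0.70 × 0.90 × 0.95 = 0.5985. Which definition is appropriate depends on how the organization defines OEE; the sketch only shows that the two are not interchangeable.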

CrateDB also handles the schema flexibility that industrial sensor data requires. New sensor types are absorbed at ingest via dynamic columns without a schema migration. As facilities add equipment and measurement points, the table structure accommodates them without pipeline downtime.

For manufacturing analytics teams building this live operations view for the first time, OEE Analytics on Live Data: How to Move from Nightly Exports to Real-Time Dashboards walks through the full transition: what the export-job architecture looks like, where it breaks down for shift supervisors, and what a live analytics path produces.

For the broader industrial IoT database architecture, including how Telegraf connects OT sources to CrateDB and what a full ingestion-to-dashboard pipeline looks like, Real-Time IoT Analytics at Scale covers the complete architecture.

Which belongs in your industrial stack

A data historian is the right tool for collecting and certifying operational data from the plant floor. A time series database is the right tool for querying that data at scale, joining it with IT context, and powering live manufacturing analytics dashboards. For most industrial teams, both belong in the stack. The architect's job is to connect them in the right sequence.

The historian stays at the OT layer. CrateDB sits at the analytics layer. The practitioner gets cross-plant OEE queries in milliseconds instead of a nightly export.

To run the analytics layer against live sensor data before committing to an architecture, the Weather Monitoring Guided Path runs in one Docker command. A live Grafana OEE dashboard in under 30 minutes. No account required.