Skip to content
Blog

Predictive Maintenance Database Architecture: From Sensor Data to Maintenance Trigger in SQL

A motor on a production line fails at 04:20 on a Tuesday. The vibration data showed it trending toward failure for six days. The query that would have caught it was too slow to run continuously against 2 million daily sensor readings, so nobody ran it.

This is not a data collection failure. It is a query architecture problem.

The three signals that precede equipment failure

Predictive maintenance depends on three recurring data patterns. Every piece of equipment that fails telegraphs its failure in at least one of them.

Threshold crossing: A reading exceeds a safe operating limit. Vibration above 85 m/s², temperature above 92°C, pressure outside a configured band. The reading itself is the alert.

Rate of change: The reading is rising faster than normal, even if it has not crossed the threshold yet. An asset at 65 m/s² and climbing 20 units over six hours is more urgent than one holding steady at 80 m/s². The threshold query misses the second asset entirely.

Cross-asset correlation: When one machine shows early failure symptoms, which other machines in the same facility are showing the same pattern? Sister assets running the same equipment under similar load conditions fail in clusters. Inspecting one and ignoring the others is incomplete.

The challenge is not that these signals are hard to define. The challenge is querying them fast enough, continuously, against the full sensor stream.

Why the database layer determines how early you can act

Most predictive maintenance architectures work in batch. Sensors write to a historian. The historian exports to CSV or Parquet files on a schedule. A pipeline loads those files into a feature store or ML model. The model outputs a risk score. An alert fires.

The lag from sensor reading to maintenance alert is measured in hours. In some installations, days.

The database is the bottleneck. If data has to leave the ingestion system before the query engine can reach it, the prediction is already behind. A bearing failure that develops over six days is not caught six days early. It is caught the morning the export job finally includes enough data for the model to fire.

CrateDB sits between the sensor stream and the prediction logic. Data arrives directly via Telegraf, Kafka, or MQTT and is available for SQL the moment it lands. Threshold triggers, trend detection, and cross-asset correlation run against live data. No export step between ingestion and query. The full ingestion architecture is covered in the IoT Analytics Architecture Guide.

Threshold trigger: querying for sustained limit crossings

A single high reading is often noise. A threshold query that fires on one spike generates too many false positives to be actionable for a maintenance team managing hundreds of assets. The useful trigger is sustained exceedance: the same asset crossing the limit on multiple readings within a defined time window.

-- Maintenance trigger: assets with sustained vibration above safe operating limit
-- Fires only when an asset exceeds 85 m/s² on 5 or more readings in the last hour
-- Suppresses single-spike false positives without a separate deduplication step
SELECT
    asset_id,
    plant_id,
    COUNT(*) AS threshold_exceedances,
    ROUND(MAX(vibration_ms2), 2) AS peak_vibration_ms2,
    MIN(ts) AS first_exceeded_at
FROM sensor_readings
WHERE ts > NOW() - INTERVAL '1 hour'
  AND vibration_ms2 > 85.0
GROUP BY asset_id, plant_id
HAVING COUNT(*) >= 5
ORDER BY threshold_exceedances DESC;

This query runs against live data. A Grafana alert panel executes it on a configurable interval: every 30 seconds, every minute. No export step, no intermediate file, no batch pipeline. The HAVING COUNT(*) >= 5 clause is the entire false-positive filter. Adjust the count and the window to match each asset type's sensor frequency.

Trend detection: rate of change, not just absolute value

An asset at 80 m/s² holding steady requires monitoring. An asset at 65 m/s² rising 20 units over the last six hours requires immediate attention. Threshold queries treat both identically. Trend detection separates them.

The query below compares the last six hours of readings against the prior six hours for every asset in the dataset:

-- Trend detection: assets with accelerating vibration over 12 hours
-- Compares the recent 6-hour window against the prior 6-hour window
-- A delta above 15.0 triggers an inspection recommendation
SELECT
    asset_id,
    plant_id,
    ROUND(
        AVG(CASE WHEN ts > NOW() - INTERVAL '6 hours' THEN vibration_ms2 END),
        2
    ) AS avg_recent_6h,
    ROUND(
        AVG(CASE WHEN ts BETWEEN NOW() - INTERVAL '12 hours' AND NOW() - INTERVAL '6 hours' THEN vibration_ms2 END),
        2
    ) AS avg_prior_6h,
    ROUND(
        AVG(CASE WHEN ts > NOW() - INTERVAL '6 hours' THEN vibration_ms2 END)
        - AVG(CASE WHEN ts BETWEEN NOW() - INTERVAL '12 hours' AND NOW() - INTERVAL '6 hours' THEN vibration_ms2 END),
        2
    ) AS vibration_delta
FROM sensor_readings
WHERE ts > NOW() - INTERVAL '12 hours'
GROUP BY asset_id, plant_id
HAVING (
    AVG(CASE WHEN ts > NOW() - INTERVAL '6 hours' THEN vibration_ms2 END)
    - AVG(CASE WHEN ts BETWEEN NOW() - INTERVAL '12 hours' AND NOW() - INTERVAL '6 hours' THEN vibration_ms2 END)
) > 15.0
ORDER BY vibration_delta DESC;

Both queries use standard SQL aggregate expressions. CrateDB's distributed query engine executes them across the full sensor dataset without a pre-aggregation step and without proprietary query syntax. Every SQL client that connects via the PostgreSQL wire protocol (Grafana, Tableau, Power BI) runs these queries directly against live data.

Cross-asset correlation: prioritizing what to inspect next

A confirmed bearing failure on one conveyor line is an early indicator for sister assets running the same equipment under similar load. The pattern is the same in the data: a rising vibration curve that accelerates before threshold crossing. The query below finds all other assets in the same facility showing a matching upward trend over the same 72-hour window, ranked by the steepness of their rise.

-- Cross-asset correlation: find assets with rising vibration matching a known failing unit
-- Use to prioritize inspection order before a cascade failure
WITH candidate_assets AS (
    SELECT
        asset_id,
        ROUND(MAX(vibration_ms2) - MIN(vibration_ms2), 2) AS rise_72h,
        ROUND(MAX(vibration_ms2), 2) AS current_peak,
        ROUND(AVG(vibration_ms2), 2) AS avg_vibration
    FROM sensor_readings
    WHERE plant_id = 'PLANT_007'
      AND asset_type = 'conveyor_bearing'
      AND ts > NOW() - INTERVAL '72 hours'
      AND asset_id != 'BEARING-PLANT7-C3-04'
    GROUP BY asset_id
)
SELECT asset_id, rise_72h, current_peak, avg_vibration
FROM candidate_assets
WHERE rise_72h > 20.0
ORDER BY rise_72h DESC;

This query outputs a ranked inspection list: the conveyor bearings in the same facility showing the steepest vibration rise over the past three days. The maintenance team acts in order. For teams running inspections across multiple facilities, the distributed query pattern from the cross-plant analytics guide extends this to the full site footprint without additional architecture.

CrateDB as the data layer, not the ML model

Predictive maintenance architectures that include machine learning put CrateDB at the data layer. The ML model — whether a gradient-boosted failure classifier or a neural network trained on vibration signatures — needs fast, current feature data on demand. CrateDB serves that role.

The model queries CrateDB for the last N readings from an asset, aggregated by time bucket, as part of every inference request. CrateDB returns sub-second results across the full sensor history because every field is auto-indexed on ingestion. The model receives fresh data for every prediction without a separate feature store between the sensor stream and the inference pipeline.

For teams running real-time inference, CrateDB also stores model outputs alongside the sensor readings: risk scores, anomaly flags, predicted time-to-failure windows. A single SQL query retrieves both the raw readings and the model scores for a given asset in the same response. The same database that handles ingest at a million values per second handles the model's feature queries at sub-second latency.

CrateDB is not the ML model. It is what makes the ML model fast enough to be useful for operational decisions rather than retrospective analysis.

At production scale: ABB Ability Genix

ABB's Ability Genix platform delivers industrial AI to manufacturing, energy, and process industries globally. The platform processes sensor data from factory equipment at 1 million values ingested per second, with 30,000 to 120,000 events retrieved per second for active analytics and AI feature computation.

At that ingest and retrieval rate, the database layer is an active participant in the prediction loop, not a passive store. CrateDB's shared-nothing architecture distributes ingest and query load across cluster nodes. Adding nodes scales both ingestion throughput and query parallelism without changes to the pipeline or schema.

"Working with CrateDB brings positive outcomes. The ingestion and throughput have very good performance, with 1 million values/sec, the horizontal scalability where we can add as many nodes as we need and the automatic query distribution across the whole cluster." — Marko Sommarberg, Lead, Digital Strategy and Business Development, ABB

Full story: cratedb.com/stories/abb

Building the predictive maintenance data layer

The three SQL patterns above (threshold trigger, trend detection, and cross-asset correlation) are the query surface for predictive maintenance. ML models consume these as features. Grafana panels run them as alert conditions. Maintenance scheduling systems call them as API endpoints.

What determines the gap between a failing sensor signal and a maintenance action is how fast the database answers these queries against live data. A database that serves them without an export step, without pre-aggregation, and without a specialized time-series function library removes that gap. For manufacturing teams with sensor-dense assets and tight maintenance windows, the difference between a scheduled inspection and an unplanned failure is often measured in the hours the prediction arrived too late.

The full ingestion architecture (Telegraf configuration, Kafka connector patterns, and SQL query structure for industrial deployments) is in the IoT Analytics Architecture Guide.

Query your sensor stream in real time. Try CrateDB Live: one Docker command, under 30 minutes to a working sensor dashboard.