Ad-hoc queries

Support highly dynamic ad-hoc querying even on large-scale, real-time datasets.

Introduction

Whether you’re a developer, data analyst, or operator, CrateDB lets you ask new questions on the fly without waiting for data pipelines, ETL, or pre-aggregated views. Thanks to its distributed SQL engine, flexible data modeling, and support for semi-structured data, CrateDB is ideal for interactive exploration and live analytics.

CrateDB is your go-to database for fast, flexible, and reliable ad-hoc querying. Whether you’re debugging systems, answering tough questions, or uncovering hidden insights, CrateDB empowers you to work at the speed of thought on real-time data, at any scale.

What are ad-hoc queries?

Ad-hoc queries are spontaneous, often one-off SQL queries used for:

  • Troubleshooting

  • Exploratory analysis

  • Debugging systems

  • Investigating anomalies

  • Supporting customer questions

  • Generating quick reports

They are unpredictable by nature—and CrateDB is designed to handle exactly that.


Benefits of using CrateDB for ad-hoc queries

Feature

Benefit

Distributed SQL engine

Fast performance, even on complex queries

Real-time ingestion

Query new data moments after it arrives

Flexible schemas

Combine structured, JSON, text, and geospatial data

Full SQL support

Use familiar SQL for joins, filters, sorting, and more

Easy integrations

Query from CLI, notebooks, BI tools, or HTTP API


When to use CrateDB for ad-hoc queries

  • Explore new patterns in operational or business data

  • Run troubleshooting queries across complex systems

  • Query fresh data instantly, without waiting for batch jobs

  • Combine structured + JSON + full-text + spatial + vector data

  • Avoid maintaining rigid ETL pipelines or OLAP cubes

Common Query Patterns

Quick Filters

SELECT timestamp, message, host
FROM logs
WHERE service = 'auth' AND log_level = 'error'
ORDER BY timestamp DESC
LIMIT 50;

Explore Nested JSON

SELECT
  payload['device']['os'],
  COUNT(*) AS count
FROM events
GROUP BY payload['device']['os'];

Geospatial Debugging

SELECT id, latitude, longitude
FROM vehicles
WHERE within(
  location,
  'POLYGON ((-73.97 40.78, -73.95 40.78, -73.95 40.76, -73.97 40.76, -73.97 40.78))'
);

Time-bound Query

SELECT timestamp, device_id, value
FROM sensor_data
WHERE timestamp > now() - INTERVAL '15 minutes';

Join Across Tables

SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '7 days';

Real-World Examples

DevOps & Observability: Investigate production issues by filtering logs or telemetry on the fly.

SELECT message
FROM logs
WHERE log_level = 'warn' AND host = 'api-01'
ORDER BY timestamp DESC;

Data Exploration: Test hypotheses by slicing data in new ways without waiting for prebuilt reports.

SELECT city, AVG(duration)
FROM rides
GROUP BY city
HAVING AVG(duration) > 10;

Product Analytics: Check how often a new product was bought after launch.

SELECT COUNT(*)
FROM orders
WHERE product_id = 'xyz123'
AND order_date >= '2025-07-01';

Tools & Interfaces

CrateDB offers several interfaces for ad-hoc queries:

  • Admin UI Console – Web-based SQL editor with result viewer

  • PostgreSQL Clients – psql, DBeaver, DataGrip, etc.

  • HTTP Client – Send ad-hoc queries over HTTP

  • CrateDB Python Client – Ideal for notebooks and automation

  • Grafana / Superset – Query builder UI and live dashboards

Example via HTTP:

curl -sSf -u USERNAME:PASSWORD -X POST https://your.cratedb.cloud:4200/_sql \
  -H "Content-Type: application/json" \
  -d '{"stmt": "SELECT timestamp, message, host FROM logs WHERE log_level = ? LIMIT 10", "args": ["error"]}'

See also