Ad-hoc queries¶
Support highly dynamic ad-hoc querying even on large-scale, real-time datasets.
Introduction
Whether you’re a developer, data analyst, or operator, CrateDB lets you ask new questions on the fly without waiting for data pipelines, ETL, or pre-aggregated views. Thanks to its distributed SQL engine, flexible data modeling, and support for semi-structured data, CrateDB is ideal for interactive exploration and live analytics.
CrateDB is your go-to database for fast, flexible, and reliable ad-hoc querying. Whether you’re debugging systems, answering tough questions, or uncovering hidden insights, CrateDB empowers you to work at the speed of thought on real-time data, at any scale.
What are ad-hoc queries?
Ad-hoc queries are spontaneous, often one-off SQL queries used for:
Troubleshooting
Exploratory analysis
Debugging systems
Investigating anomalies
Supporting customer questions
Generating quick reports
They are unpredictable by nature—and CrateDB is designed to handle exactly that.
Benefits of using CrateDB for ad-hoc queries
Feature |
Benefit |
|---|---|
Distributed SQL engine |
Fast performance, even on complex queries |
Real-time ingestion |
Query new data moments after it arrives |
Flexible schemas |
Combine structured, JSON, text, and geospatial data |
Full SQL support |
Use familiar SQL for joins, filters, sorting, and more |
Easy integrations |
Query from CLI, notebooks, BI tools, or HTTP API |
When to use CrateDB for ad-hoc queries
Explore new patterns in operational or business data
Run troubleshooting queries across complex systems
Query fresh data instantly, without waiting for batch jobs
Combine structured + JSON + full-text + spatial + vector data
Avoid maintaining rigid ETL pipelines or OLAP cubes
Common Query Patterns¶
Quick Filters
SELECT timestamp, message, host
FROM logs
WHERE service = 'auth' AND log_level = 'error'
ORDER BY timestamp DESC
LIMIT 50;
Explore Nested JSON
SELECT
payload['device']['os'],
COUNT(*) AS count
FROM events
GROUP BY payload['device']['os'];
Geospatial Debugging
SELECT id, latitude, longitude
FROM vehicles
WHERE within(
location,
'POLYGON ((-73.97 40.78, -73.95 40.78, -73.95 40.76, -73.97 40.76, -73.97 40.78))'
);
Time-bound Query
SELECT timestamp, device_id, value
FROM sensor_data
WHERE timestamp > now() - INTERVAL '15 minutes';
Join Across Tables
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '7 days';
Real-World Examples¶
DevOps & Observability: Investigate production issues by filtering logs or telemetry on the fly.
SELECT message
FROM logs
WHERE log_level = 'warn' AND host = 'api-01'
ORDER BY timestamp DESC;
Data Exploration: Test hypotheses by slicing data in new ways without waiting for prebuilt reports.
SELECT city, AVG(duration)
FROM rides
GROUP BY city
HAVING AVG(duration) > 10;
Product Analytics: Check how often a new product was bought after launch.
SELECT COUNT(*)
FROM orders
WHERE product_id = 'xyz123'
AND order_date >= '2025-07-01';
Tools & Interfaces¶
CrateDB offers several interfaces for ad-hoc queries:
Admin UI Console – Web-based SQL editor with result viewer
PostgreSQL Clients – psql, DBeaver, DataGrip, etc.
HTTP Client – Send ad-hoc queries over HTTP
CrateDB Python Client – Ideal for notebooks and automation
Grafana / Superset – Query builder UI and live dashboards
Example via HTTP:
curl -sSf -u USERNAME:PASSWORD -X POST https://your.cratedb.cloud:4200/_sql \
-H "Content-Type: application/json" \
-d '{"stmt": "SELECT timestamp, message, host FROM logs WHERE log_level = ? LIMIT 10", "args": ["error"]}'