dbt
About
dbt is a tool for transforming data in data warehouses using Python and SQL.
It is an SQL‑first transformation workflow platform that lets teams quickly and collaboratively deploy analytics code following software engineering best practices such as modularity, portability, CI/CD, and documentation.
Introduction
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
With dbt, anyone on your data team can safely contribute to production-grade data pipelines.
The idea is that data engineers make source data available to an environment where dbt projects run, for example with Debezium or with Airflow / Astronomer. Afterwards, data analysts can run their dbt projects against this data to produce models (tables and views) that can be used with a number of Business Intelligence applications.
Features
The data abstraction layer provided by dbt-core allows the decoupling of the models on which reports and dashboards rely from the source data. When business rules or source systems change, you can still maintain the same models as a stable interface.
Some of the things that dbt can do include:
Import reference data from CSV files.
Track changes in source data with different strategies so that downstream models do not need to be built every time from scratch.
Run tests on data, to confirm assumptions remain valid, and to validate any changes made to the models’ logic.
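The testing idea in the last item can be illustrated without dbt itself. A dbt data test is essentially a query that selects the rows violating an assumption, and the test passes when that query returns no rows. The following minimal sketch mimics a not-null check and a uniqueness check with Python's standard sqlite3 module. The table `customers`, its columns, and the helper `failing_rows` are hypothetical names chosen for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3


def failing_rows(conn, query):
    """Run a test query and return the rows that violate the assumption."""
    return conn.execute(query).fetchall()


# In-memory database with a small, deliberately flawed sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)

# Each test query selects the offending rows; zero rows means "pass".
NOT_NULL_EMAIL = "SELECT id, email FROM customers WHERE email IS NULL"
UNIQUE_ID = (
    "SELECT id, COUNT(*) FROM customers GROUP BY id HAVING COUNT(*) > 1"
)

not_null_failures = failing_rows(conn, NOT_NULL_EMAIL)
unique_failures = failing_rows(conn, UNIQUE_ID)

print("not_null passed:", not not_null_failures)
print("unique passed:", not unique_failures)
```

Both checks fail on this sample data because one row has no email and the id 2 appears twice. In a dbt project, checks like these are normally declared on the model rather than written by hand.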
dbt and CrateDB
Due to its unique capabilities, CrateDB is an excellent warehouse choice for data transformation projects. It offers automatic indexing, fast aggregations, easy partitioning, and the ability to scale horizontally.
Managed dbt
With dbt Cloud, you can skip time-consuming setup and the struggles of scaling your data production. dbt Cloud is a full-suite service that is built for scale.
Start building data products quickly using the dbt Cloud IDE with integrated security and governance controls.
Schedule, deploy, and monitor your data products using the scalable and reliable dbt Cloud Scheduler.
Help your data teams discover and reuse data products using hosted docs or integrations with the powerful Discovery API.
Extend your workflow beyond dbt Cloud with 30+ seamless integrations covering a range of use cases across the Modern Data Stack, from observability and data quality to visualization, reverse ETL, and much more.
Ship more high-quality data and scale your development like the thousands of companies that use dbt Cloud. They have used its collaboration-friendly interface to remove the bottlenecks that limit growth.
Setup
Install the most recent version of the dbt-cratedb2 Python package.
pip install --upgrade 'dbt-cratedb2'
Configure
Because CrateDB is compatible with PostgreSQL, the same connectivity options apply, as outlined in the dbt Postgres Setup documentation.
The dbt connection profile settings for CrateDB, stored in profiles.yml, are the same as for PostgreSQL, except that type is set to cratedb.
cratedb_analytics:
  target: dev
  outputs:
    dev:
      type: cratedb
      host: [clustername].aks1.westeurope.azure.cratedb.net
      port: 5432
      user: [username]
      pass: [password]
      dbname: crate      # CrateDB's only catalog is `crate`.
      schema: doc        # Define schema. `doc` is the default.
      search_path: doc   # Use the same value as `schema` by default.
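Because the settings follow PostgreSQL conventions, the same values can be turned into a PostgreSQL-style connection URI, which may help when checking credentials with a PostgreSQL client before running dbt. The following is a minimal sketch, not part of dbt or dbt-cratedb2: the helper `postgres_uri` is hypothetical, and the host, user, and password are placeholder values rather than a real cluster.

```python
from urllib.parse import quote


def postgres_uri(output):
    """Build a PostgreSQL-style connection URI from one profile output."""
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    user = quote(output["user"], safe="")
    password = quote(output["pass"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{output['host']}:{output['port']}/{output['dbname']}"
    )


# Placeholder values mirroring the keys of the `dev` output above.
dev = {
    "host": "localhost",
    "port": 5432,
    "user": "crate",
    "pass": "p@ss/word",
    "dbname": "crate",
}

uri = postgres_uri(dev)
print(uri)
```

Percent-encoding the credentials matters here because a raw `@` or `/` in a password would otherwise be read as part of the URI structure.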
Learn
Learn how to use CrateDB with dbt by exploring a full tutorial and a few other examples.
Guides
Usage Guidelines
Example Projects
Webinars
Introduction to dbt
Learn how to get started using dbt by following along with an easy step-by-step tutorial.
In this video, you will learn how to install dbt, initialize a new project and then publish your project to a GitHub repository.
Webinar Fundamentals
Notes
Please also refer to the CrateDB setup page on the dbt Developer Hub.
These dbt features have been tested successfully:
Models with view, table, and ephemeral materializations
Incremental materializations (with incremental_strategy='delete+insert' and without involving OBJECT columns)
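To make the delete+insert strategy concrete: rows in the target table whose key also appears in the new batch are deleted first, then the whole batch is inserted. The sketch below reproduces that idea with Python's standard sqlite3 module. It is an illustration of the general technique, not the SQL that dbt or dbt-cratedb2 actually generates, and the tables `target` and `new_batch` and the key column `id` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, status TEXT)")
conn.execute("CREATE TABLE new_batch (id INTEGER, status TEXT)")

# Existing state of the model, plus a batch of changed and new rows.
conn.executemany(
    "INSERT INTO target VALUES (?, ?)", [(1, "open"), (2, "open")]
)
conn.executemany(
    "INSERT INTO new_batch VALUES (?, ?)", [(2, "closed"), (3, "open")]
)

# Step 1: delete target rows that are superseded by the new batch.
conn.execute("DELETE FROM target WHERE id IN (SELECT id FROM new_batch)")
# Step 2: insert every row of the new batch.
conn.execute("INSERT INTO target SELECT id, status FROM new_batch")

rows = conn.execute("SELECT id, status FROM target ORDER BY id").fetchall()
print(rows)
```

Row 1 is untouched, row 2 is replaced by its newer version, and row 3 is added, so only the changed keys are rewritten instead of the whole table.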
We hope you find this useful. CrateDB is continuously adding new features and we will be very happy to hear about your experience using CrateDB with dbt.


