
CrateDB + Apache Airflow

Simplify your orchestration workflows with seamless automation.

CrateDB is a hyper-fast, open-source distributed database that combines the best of the SQL and NoSQL worlds. It supports the PostgreSQL wire protocol for easy integration with many data engineering tools.
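Because CrateDB speaks the PostgreSQL wire protocol, any PostgreSQL client library can talk to it. A minimal sketch using psycopg2 is shown below; the host, port, and user are placeholders for your own deployment (CrateDB's PostgreSQL interface listens on port 5432 by default).

```python
def cratedb_dsn(host="localhost", port=5432, user="crate"):
    """Build a libpq-style connection string for CrateDB (placeholder values)."""
    return f"host={host} port={port} user={user}"

def fetch_version(dsn):
    """Open a connection over the PostgreSQL wire protocol and read the server version."""
    # Imported lazily; requires the psycopg2 package and a running CrateDB node.
    import psycopg2
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
```

The same connection string works with other PostgreSQL-compatible tooling, which is what makes integrations like the Airflow one below straightforward.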

Apache Airflow is a popular tool for creating, scheduling, and monitoring data pipelines. With over 16 million downloads every month, it is the de facto standard for expressing data flows as Python code.
The combination of CrateDB and Apache Airflow is a powerful solution for businesses and organizations that need to manage and analyze large amounts of data. By leveraging both tools, you can easily automate orchestration workflows and build efficient pipelines that can process data quickly and accurately, leading to valuable insights.
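To make the combination concrete, here is a hedged sketch of a daily Airflow DAG that runs an aggregation query against CrateDB. The connection id `cratedb_default` and the `metrics` / `metrics_summary` tables are illustrative, not part of any official setup; the operator shown is Airflow's generic `SQLExecuteQueryOperator`, which works with any PostgreSQL-compatible connection.

```python
from datetime import datetime

# Illustrative aggregation: roll up yesterday's raw metrics into a summary table.
SUMMARY_SQL = """
INSERT INTO metrics_summary (day, avg_value)
SELECT DATE_TRUNC('day', ts), AVG(value)
FROM metrics
WHERE ts >= NOW() - INTERVAL '1 day'
GROUP BY 1;
"""

def build_dag():
    """Assemble the DAG. Airflow is imported lazily so this module can be
    read and tested without Airflow installed."""
    from airflow import DAG
    from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

    with DAG(
        dag_id="cratedb_daily_summary",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        SQLExecuteQueryOperator(
            task_id="aggregate_metrics",
            conn_id="cratedb_default",  # assumed, pre-configured Airflow connection
            sql=SUMMARY_SQL,
        )
    return dag
```

Airflow handles scheduling, retries, and monitoring, while CrateDB does the heavy lifting of the aggregation itself.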

Learn how to get started using Apache Airflow and CrateDB

Astronomer powers Apache Airflow, the go-to for expressing data flows as code. For trusted data, Astronomer provides Astro, the modern data orchestration platform, powered by Airflow, for building, running, and observing pipelines-as-code. Learn more here.

More resources on the combined value of Apache Airflow and CrateDB


Implementing a data retention policy in CrateDB using Apache Airflow

A data retention policy describes the practice of storing and managing data for a designated period of time. Once a data set completes its retention period, it should be deleted or archived, depending on requirements.
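The core of the deletion step can be sketched as a function that builds the cleanup statement for a given table and retention window. The table and column names are illustrative, and the full tutorial works with partitioned tables; a row-level DELETE like the one generated here is the simplest variant.

```python
def retention_delete_sql(table, ts_column, retention_days):
    """Return a CrateDB DELETE statement removing rows past their retention period.

    table, ts_column: illustrative identifiers supplied by the caller.
    retention_days: size of the retention window in days.
    """
    return (
        f"DELETE FROM {table} "
        f"WHERE {ts_column} < NOW() - INTERVAL '{retention_days} days'"
    )
```

In an Airflow deployment, a statement like this would typically be executed on a schedule by a SQL operator, so expired data is cleaned up without manual intervention.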


Automating the import of Parquet files with Apache Airflow

The workflow presented in this tutorial is a simple way to import Parquet files into CrateDB by first transforming them into CSV files. There are other approaches as well; we encourage you to try them out.
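The transformation step from the tutorial can be sketched as below. This is not the tutorial's exact code: the file paths are illustrative, and reading Parquet requires pandas with a Parquet engine such as pyarrow installed.

```python
from pathlib import Path

def csv_target(parquet_path):
    """Map an input Parquet path to its CSV counterpart (same name, .csv suffix)."""
    return Path(parquet_path).with_suffix(".csv")

def parquet_to_csv(parquet_path):
    """Convert one Parquet file to CSV, ready for a COPY FROM import into CrateDB."""
    import pandas as pd  # imported lazily; non-stdlib dependency (plus pyarrow)
    df = pd.read_parquet(parquet_path)
    out = csv_target(parquet_path)
    df.to_csv(out, index=False)
    return out
```

Each resulting CSV file can then be loaded into CrateDB, with Airflow scheduling the download, conversion, and import tasks as one pipeline.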