Dynamic workflow orchestration with Apache Airflow and CrateDB

With the rise of complex data solutions, automating and orchestrating database processes is becoming increasingly important: more and more use cases involve a database change triggering a chain of operations, where each step depends on the outcome of the previous ones. Developing and adapting such orchestrations quickly requires tools that are scalable and easy to monitor and manage.

This talk will illustrate how easy it is to automate orchestration workflows with Apache Airflow and CrateDB. Apache Airflow is one of the most popular platforms for programmatically creating, scheduling, and monitoring workflows. Workflows are defined as directed acyclic graphs (DAGs), where each node represents an execution task. Initially, Airflow was designed so that each task ran independently; with Airflow 2.3, dynamic task mapping was introduced, making Airflow a great fit for building dynamic workflows.

CrateDB, in turn, is an open-source, distributed database that makes storing and analyzing massive amounts of data simple and efficient. CrateDB offers a high degree of scalability, flexibility, and availability. One of CrateDB's key strengths is its compatibility with many data engineering tools, including Apache Airflow.

In this talk, you will learn how to set up a new orchestration project, and how to use Airflow with CrateDB to orchestrate complex tasks.

Speaker: Marija Selakovic, Developer Advocate at