Flink

Apache Flink is a programming framework and distributed processing engine for stateful computations over unbounded and bounded data streams, written in Java. It is a battle-hardened stream processor widely used for demanding real-time applications.

Connect

Flink’s JdbcSink is a streaming connector that writes data to a database over JDBC. CrateDB speaks the PostgreSQL wire protocol, so the PostgreSQL JDBC Driver also works with CrateDB. When configuring the data sink, use:

url: jdbc:postgresql://localhost:5432/crate
driver: org.postgresql.Driver
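
When the host, port, or database differ from the values above, the URL can be assembled from its parts. The following is a minimal sketch; the helper name `cratedb_jdbc_url` and the host `cratedb.example.org` are hypothetical and not part of Flink or CrateDB.

```python
# Driver class name as given in the settings above.
JDBC_DRIVER = "org.postgresql.Driver"


def cratedb_jdbc_url(host: str = "localhost", port: int = 5432, database: str = "crate") -> str:
    """Compose a PostgreSQL-style JDBC URL that points at CrateDB (hypothetical helper)."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"jdbc:postgresql://{host}:{port}/{database}"


print(cratedb_jdbc_url())                            # jdbc:postgresql://localhost:5432/crate
print(cratedb_jdbc_url(host="cratedb.example.org"))  # jdbc:postgresql://cratedb.example.org:5432/crate
```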

Synopsis

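from pyflink.common import Types
from pyflink.datastream.connectors.jdbc import JdbcConnectionOptions, JdbcExecutionOptions, JdbcSink

# Build a sink that writes (location, current) rows into CrateDB over JDBC.
# JdbcExecutionOptions (batch size, flush interval, retries) can be passed
# as an optional fourth argument.
sink = JdbcSink.sink(
    # One `?` placeholder per field of the row type below.
    "INSERT INTO doc.weather_flink_sink (location, current) VALUES (?, ?)",
    Types.ROW_NAMED(["location", "current"], [Types.STRING(), Types.STRING()]),
    JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
    .with_url("jdbc:postgresql://localhost:5432/crate")
    .with_driver_name("org.postgresql.Driver")
    .with_user_name("crate")
    .with_password("")
    .build(),
)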
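
The synopsis only constructs the sink. As a rough sketch of how it might be attached to a data stream, the example below feeds a few rows into it and adds batching options. It assumes that the table `doc.weather_flink_sink` already exists with matching columns and that the JDBC connector and PostgreSQL driver JAR files are available. The JAR paths, the sample rows, the job name, and the batching values are illustrative assumptions, not settings taken from the CrateDB examples.

```python
# Values taken from the synopsis above.
INSERT_SQL = "INSERT INTO doc.weather_flink_sink (location, current) VALUES (?, ?)"
FIELD_NAMES = ["location", "current"]
JDBC_URL = "jdbc:postgresql://localhost:5432/crate"
JDBC_DRIVER = "org.postgresql.Driver"

# Hypothetical locations of the connector and driver JAR files; adjust to your setup.
JAR_URLS = [
    "file:///opt/flink/lib/flink-connector-jdbc.jar",
    "file:///opt/flink/lib/postgresql.jar",
]

# Made-up sample rows, one value per field in FIELD_NAMES.
SAMPLE_ROWS = [("Berlin", "cloudy"), ("Vienna", "sunny")]


def run_job():
    """Write SAMPLE_ROWS to CrateDB through a batched JdbcSink."""
    # Imported here so the constants above can be inspected without PyFlink installed.
    from pyflink.common import Types
    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.datastream.connectors.jdbc import (
        JdbcConnectionOptions,
        JdbcExecutionOptions,
        JdbcSink,
    )

    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_jars(*JAR_URLS)

    row_type = Types.ROW_NAMED(FIELD_NAMES, [Types.STRING(), Types.STRING()])
    stream = env.from_collection(SAMPLE_ROWS, type_info=row_type)

    stream.add_sink(
        JdbcSink.sink(
            INSERT_SQL,
            row_type,
            JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
            .with_url(JDBC_URL)
            .with_driver_name(JDBC_DRIVER)
            .with_user_name("crate")
            .with_password("")
            .build(),
            # Flush after 200 rows or after one second, whichever comes first,
            # and retry a failed batch up to 5 times (illustrative values).
            JdbcExecutionOptions.builder()
            .with_batch_size(200)
            .with_batch_interval_ms(1000)
            .with_max_retries(5)
            .build(),
        )
    )
    env.execute("cratedb-jdbc-sink-sketch")
```

Calling `run_job()` submits the job. In a real pipeline, the `from_collection` source would typically be replaced by a streaming source such as Kafka.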

Learn

Guides

Build a data ingestion pipeline

Learn how to build a data ingestion pipeline using three open-source tools: Apache Kafka, Flink, and CrateDB.

Example: Kafka receives telemetry messages from IoT sensors and devices. Flink consumes the data stream and stores it in CrateDB. All three tools are distributed systems that provide elastic scaling, fault tolerance, high throughput, and low latency through parallel processing.

https://dev.to/crate/build-a-data-ingestion-pipeline-using-kafka-flink-and-cratedb-1h5o

Source: Executable Stack (Java)

An executable stack with Apache Kafka, Apache Flink, and CrateDB. Uses Java.

https://github.com/crate/cratedb-examples/tree/main/framework/flink/kafka-jdbcsink-java

Source: Executable Stack (Python)

An executable stack with Apache Kafka, Apache Flink, and CrateDB. Uses Python.

https://github.com/crate/cratedb-examples/tree/main/framework/flink/kafka-jdbcsink-python

Webinars

Apache Flink 101

Why Flink is interesting for building real-time streaming applications, and how it works.

Flink’s performance and robustness result from a handful of core design principles, including a shared-nothing architecture with local state, event-time processing, and state snapshots for recovery. This course introduces these core concepts.

Webinar Fundamentals

CrateDB Community Day: Maximizing your data potential with CrateDB integrations

Flink connects messaging systems, file systems, databases, and key/value stores. For data integration, it can serve as a flexible data hub between systems, and it also supports other use cases such as event-driven applications.

The webinar includes a live demo of Apache Flink with CrateDB as source or sink.

Webinar Integrations