Apache Flink

Apache Flink is a programming framework and distributed processing engine, written in Java, for stateful computations over unbounded and bounded data streams. It is a battle-hardened stream processor, widely used for demanding real-time applications.

Flink has been designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. It received the 2023 SIGMOD Systems Award.

Apache Flink greatly expanded the use of stream data processing. See also: Data Pipelines Done Right.

Managed Flink

A few companies specialize in offering managed Flink services.

  • Aiven offers its managed Aiven for Apache Flink solution.

  • Immerok Cloud’s offering is being converged into Flink managed by Confluent; see Confluent Streaming Data Pipelines.

Connect

Flink’s JdbcSink is a streaming connector that writes data to a JDBC database, for example through the PostgreSQL JDBC Driver, which also works with CrateDB. When configuring the data sink, use:

  • url: jdbc:postgresql://localhost:5432/crate

  • driver: org.postgresql.Driver
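
With these settings, Flink addresses CrateDB like any PostgreSQL server. As a quick sanity check before wiring up Flink, you can connect over the same wire protocol from Python. This is a minimal sketch, assuming a local single-node CrateDB, the default crate superuser with an empty password, and the psycopg2 driver installed; adjust host, port, and credentials to your setup.

import psycopg2

# Connect to CrateDB over the PostgreSQL wire protocol, using the same
# host, port, and credentials that the Flink JDBC sink will use.
conn = psycopg2.connect(
    host="localhost", port=5432, user="crate", password="", dbname="doc"
)
with conn.cursor() as cur:
    # Query CrateDB's sys.nodes table to verify connectivity.
    cur.execute("SELECT name, version['number'] FROM sys.nodes")
    print(cur.fetchall())
conn.close()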

Synopsis

from pyflink.common import Types
from pyflink.datastream.connectors.jdbc import JdbcConnectionOptions, JdbcSink

# Build a JDBC sink that writes two-column rows into the CrateDB table
# doc.weather_flink_sink through the PostgreSQL wire protocol.
JdbcSink.sink(
    "INSERT INTO doc.weather_flink_sink (location, current) VALUES (?, ?)",
    Types.ROW_NAMED(["location", "current"], [Types.STRING(), Types.STRING()]),
    JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
    .with_url("jdbc:postgresql://localhost:5432/crate")
    .with_driver_name("org.postgresql.Driver")
    .with_user_name("crate")
    .with_password("")
    .build())
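
The synopsis above only constructs the sink. The following is a minimal, hypothetical sketch of attaching it to a stream in a runnable PyFlink job; the bounded example data, the JAR paths, and the job name are assumptions, and the Flink JDBC connector plus the PostgreSQL driver JARs must be made available to the job.

from pyflink.common import Row, Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.jdbc import JdbcConnectionOptions, JdbcSink

env = StreamExecutionEnvironment.get_execution_environment()

# The JDBC connector runs on the JVM side; make the connector and driver
# JARs available to the job (paths are placeholders).
env.add_jars(
    "file:///path/to/flink-connector-jdbc.jar",
    "file:///path/to/postgresql.jar",
)

row_type = Types.ROW_NAMED(["location", "current"], [Types.STRING(), Types.STRING()])

# A tiny bounded stream standing in for a real source such as Kafka.
stream = env.from_collection(
    [Row("Vienna", '{"temperature": 21.4}'), Row("Berlin", '{"temperature": 18.9}')],
    type_info=row_type,
)

# Attach the CrateDB JDBC sink from the synopsis to the stream.
stream.add_sink(
    JdbcSink.sink(
        "INSERT INTO doc.weather_flink_sink (location, current) VALUES (?, ?)",
        row_type,
        JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
        .with_url("jdbc:postgresql://localhost:5432/crate")
        .with_driver_name("org.postgresql.Driver")
        .with_user_name("crate")
        .with_password("")
        .build(),
    )
)

env.execute("cratedb-jdbc-sink-example")
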
More Examples

Java
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;

// Build a JDBC sink that writes Book records into CrateDB.
JdbcSink.sink(
    "INSERT INTO my_schema.books (id, title, authors, year) VALUES (?, ?, ?, ?)",
    // Bind one Book object to the prepared statement parameters.
    (statement, book) -> {
        statement.setLong(1, book.id);
        statement.setString(2, book.title);
        statement.setString(3, book.authors);
        statement.setInt(4, book.year);
    },
    // Batching and retry behaviour of the sink.
    JdbcExecutionOptions.builder()
        .withBatchSize(1000)
        .withBatchIntervalMs(200)
        .withMaxRetries(5)
        .build(),
    // Connect to CrateDB through its PostgreSQL wire protocol.
    new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
        .withUrl("jdbc:postgresql://localhost:5432/crate")
        .withDriverName("org.postgresql.Driver")
        .withUsername("crate")
        .withPassword("")
        .build()
);

Python

from pyflink.common import Types
from pyflink.datastream.connectors.jdbc import JdbcConnectionOptions, JdbcExecutionOptions, JdbcSink

# Same sink as in the synopsis, with explicit batching and retry behaviour.
JdbcSink.sink(
    "INSERT INTO doc.weather_flink_sink (location, current) VALUES (?, ?)",
    Types.ROW_NAMED(["location", "current"], [Types.STRING(), Types.STRING()]),
    JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
    .with_url("jdbc:postgresql://localhost:5432/crate")
    .with_driver_name("org.postgresql.Driver")
    .with_user_name("crate")
    .with_password("")
    .build(),
    JdbcExecutionOptions.builder()
    .with_batch_interval_ms(1000)
    .with_batch_size(200)
    .with_max_retries(5)
    .build()
)
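
Both variants assume that the target table already exists in CrateDB. A possible schema, inferred from the INSERT statement above, could be created as sketched below; the column types (plain TEXT) are an assumption, and the actual tutorial schema may differ.

import psycopg2

# Create the table targeted by the weather sink examples. Column names come
# from the INSERT statement; the TEXT types are an assumption.
conn = psycopg2.connect(
    host="localhost", port=5432, user="crate", password="", dbname="doc"
)
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS doc.weather_flink_sink (
            "location" TEXT,
            "current" TEXT
        )
        """
    )
conn.close()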

Learn

Tutorials

  • Build a data ingestion pipeline using Kafka, Flink, and CrateDB

Development

  • Executable stack with Apache Kafka, Apache Flink, and CrateDB (Java)

  • Streaming data with Apache Kafka, Apache Flink and CrateDB (Python)

Webinars

Apache Flink 101

Why Flink is interesting for building real-time streaming applications, and how it works.

Flink’s performance and robustness are the result of a handful of core design principles, including a shared-nothing architecture with local state, event-time processing, and state snapshots (for recovery). This course introduces you to these core concepts.
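
As a small illustration of the event-time idea mentioned above, the following hypothetical PyFlink fragment assigns event-time timestamps and bounded-out-of-orderness watermarks to a stream; the field index used for the timestamp and the five-second delay are assumptions.

from pyflink.common import Duration
from pyflink.common.watermark_strategy import TimestampAssigner, WatermarkStrategy


class EventTimeAssigner(TimestampAssigner):
    def extract_timestamp(self, value, record_timestamp):
        # Assumption: each record carries its event time as epoch milliseconds
        # in the first field.
        return value[0]


# Tolerate events arriving up to five seconds out of order.
watermarks = (
    WatermarkStrategy
    .for_bounded_out_of_orderness(Duration.of_seconds(5))
    .with_timestamp_assigner(EventTimeAssigner())
)

# Applied to an existing DataStream `ds` (not shown here):
# ds = ds.assign_timestamps_and_watermarks(watermarks)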

 

Webinar: Fundamentals

CrateDB Community Day: Maximizing your data potential with CrateDB integrations

Flink connects different messaging systems, file systems, databases, and key/value stores for a variety of purposes. For data integration, it can serve as a data hub between systems, and it is flexible enough to also power event-driven applications and much more.

The webinar includes a live demo of Apache Flink with CrateDB as a source or sink.

  • CrateDB Community Day 2nd Edition: Summary and Highlights

  • Community Day: Stream processing with Apache Flink and CrateDB

 

Webinar: Integrations
