pandas

About

pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language. It offers data structures and operations for manipulating numerical tables and time series.

pandas is built around two data structures, Series and DataFrame. Data for these structures can be imported from various formats and sources, such as comma-separated values (CSV), JSON, Parquet, SQL database tables or queries, and Microsoft Excel. A Series is a one-dimensional labeled array built on top of NumPy's array; a DataFrame is a two-dimensional table whose columns are Series.
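To make the two structures concrete, here is a minimal sketch that builds a Series and a DataFrame by hand. The column names and values (`temperature`, `humidity`) are illustrative assumptions, not part of the CrateDB example.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array backed by a NumPy array.
temperatures = pd.Series([21.5, 22.1, 19.8], name="temperature")

# A DataFrame is a two-dimensional table; each column is a Series.
# The values and column names here are made up for illustration.
readings = pd.DataFrame(
    {
        "temperature": [21.5, 22.1, 19.8],
        "humidity": [40, 42, 55],
    },
    index=pd.date_range("2000-01-01", periods=3, freq="D"),
)

# Selecting a single column returns a Series.
column = readings["temperature"]

print(type(column).__name__)
print(temperatures.to_numpy().dtype)
print(readings.shape)
```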

Install

pip install pandas sqlalchemy-cratedb

Synopsis

Write a pandas DataFrame to CrateDB.

example.py

import sqlalchemy as sa
from sqlalchemy_cratedb import insert_bulk

CRATEDB_URI = "crate://crate:crate@localhost:4200"
TABLE_NAME = "example"

# makeTimeDataFrame() is not part of pandas; see the quickstart example for its definition.
df = makeTimeDataFrame(rows=500_000, freq="s")
engine = sa.create_engine(CRATEDB_URI)
df.to_sql(
    name=TABLE_NAME,
    con=engine,
    if_exists="replace",
    index=False,
    chunksize=20_000,
    method=insert_bulk,
)
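The `to_sql()` arguments used above can be tried without a running database. The following is only a sketch under stated assumptions: it swaps CrateDB for an in-memory SQLite database, uses a much smaller frame, and omits `method=insert_bulk`, so it says nothing about CrateDB's bulk insert performance. It only shows how `if_exists`, `index`, and `chunksize` behave.

```python
import sqlite3

import numpy as np
import pandas as pd

# Small stand-in for the time-series frame used in the synopsis.
df = pd.DataFrame(
    np.random.default_rng(2).standard_normal((1_000, 4)),
    columns=list("ABCD"),
)

# Assumption: SQLite replaces CrateDB so the sketch runs without a server.
con = sqlite3.connect(":memory:")

df.to_sql(
    name="example",
    con=con,
    if_exists="replace",  # drop and recreate the table if it already exists
    index=False,          # do not store the DataFrame index as a column
    chunksize=250,        # write rows in batches of 250
)

# Read the row count back to confirm that all batches arrived.
count = pd.read_sql("SELECT COUNT(*) AS n FROM example", con)["n"].iloc[0]
print(count)

con.close()
```

Against CrateDB, the same call takes the SQLAlchemy engine as `con`, as in the synopsis.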

Quickstart example

Create the file example.py from the synopsis code shared above. Complete it by adding the following makeTimeDataFrame() function below the imports, so that it is defined before it is called.

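def makeTimeDataFrame(rows=5_000, freq="B"):
    """Return a DataFrame of random floats in columns A-D with a datetime index."""
    import numpy as np
    import pandas as pd
    return pd.DataFrame(
        # Seeded generator, so repeated runs produce the same data.
        np.random.default_rng(2).standard_normal((rows, 4)),
        columns=pd.Index(list("ABCD"), dtype=object),
        index=pd.date_range("2000-01-01", periods=rows, freq=freq),
    )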

Start CrateDB using Docker or Podman, then invoke the example program.

docker run --rm --publish=4200:4200 --publish=5432:5432 docker.io/crate '-Cdiscovery.type=single-node'
pip install pandas sqlalchemy-cratedb
python example.py

Full example

Connect to CrateDB and CrateDB Cloud using pandas.

https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
