SQLAlchemy: Getting started

This section of the documentation shows how to connect to CrateDB using its SQLAlchemy dialect, and how to run basic DDL statements based on an SQLAlchemy ORM schema definition.

Subsequent sections of the documentation will cover:

Table of Contents


Import the relevant symbols:

>>> import sqlalchemy as sa
>>> from sqlalchemy.orm import sessionmaker
>>> try:
...     from sqlalchemy.orm import declarative_base
... except ImportError:
...     from sqlalchemy.ext.declarative import declarative_base

Establish a connection to the database, see also Engine Configuration and Connect to CrateDB:

>>> engine = sa.create_engine(f"crate://{crate_host}")
>>> connection = engine.connect()

Create an SQLAlchemy Session:

>>> session = sessionmaker(bind=engine)()
>>> Base = declarative_base()


In SQLAlchemy, a connection is established using the create_engine function. This function takes a connection string, actually an URL, that varies from database to database.

In order to connect to a CrateDB cluster, the following connection strings are valid:

>>> sa.create_engine('crate://')

This will connect to the default server (‘’). In order to connect to a different server the following syntax can be used:

>>> sa.create_engine('crate://otherserver:4200')

Multiple Hosts

Because CrateDB is a clustered database running on multiple servers, it is recommended to connect to all of them. This enables the DB-API layer to use round-robin to distribute the load and skip a server if it becomes unavailable. In order to make the driver aware of multiple servers, use the connect_args parameter like so:

>>> sa.create_engine('crate://', connect_args={
...     'servers': ['host1:4200', 'host2:4200']
... })

TLS Options

As defined in HTTPS connection support, the client validates SSL server certificates by default. To configure this further, use e.g. the ca_cert attribute within the connect_args, like:

>>> ssl_engine = sa.create_engine(
...     'crate://',
...     connect_args={
...         'servers': ['https://host1:4200'],
...         'ca_cert': '/path/to/cacert.pem',
...     })

In order to disable SSL verification, use verify_ssl_cert = False, like:

>>> ssl_engine = sa.create_engine(
...     'crate://',
...     connect_args={
...         'servers': ['https://host1:4200'],
...         'verify_ssl_cert': False,
...     })

Timeout Options

In order to configure TCP timeout options, use the timeout parameter within connect_args,

>>> timeout_engine = sa.create_engine('crate://localhost/', connect_args={'timeout': 42.42})
>>> timeout_engine.raw_connection().driver_connection.client._pool_kw["timeout"]

or use the timeout URL parameter within the database connection URL.

>>> timeout_engine = sa.create_engine('crate://localhost/?timeout=42.42')
>>> timeout_engine.raw_connection().driver_connection.client._pool_kw["timeout"]

Pool Size

In order to configure the database connection pool size, use the pool_size parameter within connect_args,

>>> timeout_engine = sa.create_engine('crate://localhost/', connect_args={'pool_size': 20})
>>> timeout_engine.raw_connection().driver_connection.client._pool_kw["maxsize"]

or use the pool_size URL parameter within the database connection URL.

>>> timeout_engine = sa.create_engine('crate://localhost/?pool_size=20')
>>> timeout_engine.raw_connection().driver_connection.client._pool_kw["maxsize"]

Basic DDL operations


CrateDB currently does not know about different “databases”. Instead, tables can be created in different schemas. Schemas are created implicitly on table creation and cannot be created explicitly. If a schema does not exist yet, it will be created.

The default CrateDB schema is doc, and if you do not specify a schema, this is what will be used.

See also Schema selection and Schemas.

Create tables

First the table definition as class, using SQLAlchemy’s Declarative Mapping:

>>> class Department(Base):
...     __tablename__ = 'departments'
...     __table_args__ = {
...         'crate_number_of_replicas': '0'
...     }
...     id = sa.Column(sa.String, primary_key=True)
...     name = sa.Column(sa.String)
...     code = sa.Column(sa.Integer)

As seen below, the table doesn’t exist yet:

>>> engine.dialect.has_table(connection, table_name='departments')

In order to create all missing tables, the create_all method can be used:

>>> Base.metadata.create_all(bind=engine)

With that, the table has been created:

>>> engine.dialect.has_table(connection, table_name='departments')

Let’s also verify that by inquiring the information_schema.columns table:

>>> stmt = ("select table_name, column_name, ordinal_position, data_type "
...         "from information_schema.columns "
...         "where table_name = 'departments' "
...         "order by column_name")
>>> pprint([str(r) for r in connection.execute(sa.text(stmt))])
["('departments', 'code', 3, 'integer')",
 "('departments', 'id', 1, 'text')",
 "('departments', 'name', 2, 'text')"]

Drop tables

In order to delete all tables reference within the ORM schema, invoke Base.metadata.drop_all(). To delete a single table, use drop(...), as shown below:

>>> Base.metadata.tables['departments'].drop(engine)
>>> engine.dialect.has_table(connection, table_name='departments')