Database client¶
crate.client.connect
is the primary method to connect to CrateDB using
Python. This section of the documentation outlines different methods to connect
to the database cluster, as well as how to run basic inquiries to the database,
and closing the connection again.
Table of Contents
Connect to a database¶
Before we can start we have to import the client:
>>> from crate import client
The client provides a connect()
function which is used to establish a
connection, the first argument is the url of the server to connect to:
>>> connection = client.connect(crate_host)
>>> connection.close()
CrateDB is a clustered database providing high availability through replication. In order for clients to make use of this property it is recommended to specify all hosts of the cluster. This way if a server does not respond, the request is automatically routed to the next server:
>>> invalid_host = 'http://not_responding_host:4200'
>>> connection = client.connect([invalid_host, crate_host])
>>> connection.close()
If no servers
are given, the default one http://127.0.0.1:4200
is used:
>>> connection = client.connect()
>>> connection.client._active_servers
['http://127.0.0.1:4200']
>>> connection.close()
If the option error_trace
is set to True
, the client will print a whole
traceback if a server error occurs:
>>> connection = client.connect([crate_host], error_trace=True)
>>> connection.close()
Network Timeouts¶
It’s possible to define a default timeout value in seconds for all servers
using the optional parameter timeout
. In this case, it will serve as a
total timeout (connect and read):
>>> connection = client.connect([crate_host, invalid_host], timeout=5)
>>> connection.close()
If you want to adjust the connect- vs. read-timeout values individually,
please use the urllib3.Timeout
object like:
>>> import urllib3
>>> connection = client.connect(
... [crate_host, invalid_host],
... timeout=urllib3.Timeout(connect=5, read=None))
>>> connection.close()
Authentication¶
Users that are trusted as by definition of the auth.host_based.config
setting do not need a password, but only require the username
argument to
connect:
>>> connection = client.connect([crate_host],
... username='trusted_me')
>>> connection.client.username
'trusted_me'
>>> connection.client.password
>>> connection.close()
The username for trusted users can also be provided in the URL:
>>> connection = client.connect(['http://trusted_me@' + crate_host])
>>> connection.client.username
'trusted_me'
>>> connection.client.password
>>> connection.close()
To connect to CrateDB with as a user that requires password authentication, you
also need to provide password
as argument for the connect()
call:
>>> connection = client.connect([crate_host],
... username='me',
... password='my_secret_pw')
>>> connection.client.username
'me'
>>> connection.client.password
'my_secret_pw'
>>> connection.close()
The authentication credentials can also be provided in the URL:
>>> connection = client.connect(['http://me:my_secret_pw@' + crate_host])
>>> connection.client.username
'me'
>>> connection.client.password
'my_secret_pw'
>>> connection.close()
Default Schema¶
To connect to CrateDB and use a different default schema than doc
, you can
provide the schema
keyword argument in the connect()
method, like so:
>>> connection = client.connect([crate_host],
... schema='custom_schema')
>>> connection.close()
Inserting Data¶
Use user “crate” for rest of the tests:
>>> connection = client.connect([crate_host])
Before executing any statement, a cursor has to be opened to perform database operations:
>>> cursor = connection.cursor()
>>> cursor.execute("""INSERT INTO locations
... (name, date, kind, position) VALUES (?, ?, ?, ?)""",
... ('Einstein Cross', '2007-03-11', 'Quasar', 7))
To bulk insert data you can use the executemany
function:
>>> cursor.executemany("""INSERT INTO locations
... (name, date, kind, position) VALUES (?, ?, ?, ?)""",
... [('Cloverleaf', '2007-03-11', 'Quasar', 7),
... ('Old Faithful', '2007-03-11', 'Quasar', 7)])
[{'rowcount': 1}, {'rowcount': 1}]
executemany
returns a list of results for every parameter. Each result
contains a rowcount. If an error occurs, the rowcount is -2
and the result
may contain an error_message
depending on the error.
Refresh locations:
>>> cursor.execute("REFRESH TABLE locations")
Updating Data¶
Values for TIMESTAMP
columns can be obtained as a string literal, date
,
or datetime
object. If it contains timezone information, it is converted to
UTC, and the timezone information is discarded.
>>> import datetime as dt
>>> timestamp_full = "2023-06-26T09:24:00.123+02:00"
>>> timestamp_date = "2023-06-26"
>>> datetime_aware = dt.datetime.fromisoformat("2023-06-26T09:24:00.123+02:00")
>>> datetime_naive = dt.datetime.fromisoformat("2023-06-26T09:24:00.123")
>>> datetime_date = dt.date.fromisoformat("2023-06-26")
>>> cursor.execute("UPDATE locations SET date=? WHERE name='Cloverleaf'", (timestamp_full, ))
>>> cursor.execute("UPDATE locations SET date=? WHERE name='Cloverleaf'", (timestamp_date, ))
>>> cursor.execute("UPDATE locations SET date=? WHERE name='Cloverleaf'", (datetime_aware, ))
>>> cursor.execute("UPDATE locations SET date=? WHERE name='Cloverleaf'", (datetime_naive, ))
>>> cursor.execute("UPDATE locations SET date=? WHERE name='Cloverleaf'", (datetime_date, ))
Selecting Data¶
To perform the select operation simply execute the statement on the open cursor:
>>> cursor.execute("SELECT name FROM locations where name = ?", ('Algol',))
To retrieve a row we can use one of the cursor’s fetch functions (described below).
fetchone()¶
fetchone()
with each call returns the next row from the results:
>>> result = cursor.fetchone()
>>> pprint(result)
['Algol']
If no more data is available, an empty result is returned:
>>> while cursor.fetchone():
... pass
>>> cursor.fetchone()
fetchmany()¶
fetch_many()
returns a list of all remaining rows, containing no more than
the specified size of rows:
>>> cursor.execute("SELECT name FROM locations order by name")
>>> result = cursor.fetchmany(2)
>>> pprint(result)
[['Aldebaran'], ['Algol']]
If a size is not given, the cursor’s arraysize, which defaults to ‘1’, determines the number of rows to be fetched:
>>> cursor.fetchmany()
[['Allosimanius Syneca']]
It’s also possible to change the cursors arraysize to an other value:
>>> cursor.arraysize = 3
>>> cursor.fetchmany()
[['Alpha Centauri'], ['Altair'], ['Argabuthon']]
fetchall()¶
fetchall()
returns a list of all remaining rows:
>>> cursor.execute("SELECT name FROM locations order by name")
>>> result = cursor.fetchall()
>>> pprint(result)
[['Aldebaran'],
['Algol'],
['Allosimanius Syneca'],
['Alpha Centauri'],
['Altair'],
['Argabuthon'],
['Arkintoofle Minor'],
['Bartledan'],
['Cloverleaf'],
['Einstein Cross'],
['Folfanga'],
['Galactic Sector QQ7 Active J Gamma'],
['Galaxy'],
['North West Ripple'],
['Old Faithful'],
['Outer Eastern Rim']]
Cursor Description¶
The description
property of the cursor returns a sequence of 7-item
sequences containing the column name as first parameter. Just the name field is
supported, all other fields are ‘None’:
>>> cursor.execute("SELECT * FROM locations order by name")
>>> result = cursor.fetchone()
>>> pprint(result)
['Aldebaran',
1658167836758,
1658167836758,
1658167836758,
None,
None,
'Star System',
None,
1,
'Max Quordlepleen claims that the only thing left after the end of the '
'Universe will be the sweets trolley and a fine selection of Aldebaran '
'liqueurs.',
None]
>>> result = cursor.description
>>> pprint(result)
(('name', None, None, None, None, None, None),
('date', None, None, None, None, None, None),
('datetime_tz', None, None, None, None, None, None),
('datetime_notz', None, None, None, None, None, None),
('nullable_datetime', None, None, None, None, None, None),
('nullable_date', None, None, None, None, None, None),
('kind', None, None, None, None, None, None),
('flag', None, None, None, None, None, None),
('position', None, None, None, None, None, None),
('description', None, None, None, None, None, None),
('details', None, None, None, None, None, None))
Closing the Cursor¶
The following command closes the cursor:
>>> cursor.close()
If a cursor is closed, it will be unusable from this point forward.
If any operation is attempted to a closed cursor an ProgrammingError
will
be raised.
>>> cursor.execute("SELECT * FROM locations")
Traceback (most recent call last):
...
crate.client.exceptions.ProgrammingError: Cursor closed
Closing the Connection¶
The following command closes the connection:
>>> connection.close()
If a connection is closed, it will be unusable from this point forward. If any
operation using the connection is attempted to a closed connection an
ProgrammingError
will be raised:
>>> cursor.execute("SELECT * FROM locations")
Traceback (most recent call last):
...
crate.client.exceptions.ProgrammingError: Connection closed
>>> cursor = connection.cursor()
Traceback (most recent call last):
...
crate.client.exceptions.ProgrammingError: Connection closed