HTTP endpoint¶
CrateDB provides a HTTP Endpoint that can be used to submit SQL queries. The
endpoint is accessible under <servername:port>/_sql
.
SQL statements are sent to the _sql
endpoint in json
format, whereby
the statement is sent as value associated to the key stmt
.
See also
A simple SELECT
statement can be submitted like this:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' \
... -d '{"stmt":"select name, position from locations order by id limit 2"}'
{
"cols": [
"name",
"position"
],
"rows": [
[
"North West Ripple",
1
],
[
"Outer Eastern Rim",
2
]
],
"rowcount": 2,
"duration": ...
}
Note
We’re using a simple command line invocation of curl
here so you can
see how to run this by hand in the terminal. For the rest of the examples
in this document, we use here documents (i.e. EOF
) for multi line
readability.
Table of contents
Parameter substitution¶
In addition to the stmt
key the request body may also contain an args
key which can be used for SQL parameter substitution.
The SQL statement has to be changed to use placeholders where the values should
be inserted. Placeholders can either be numbered (in the form of $1
,
$2
, etc.) or unnumbered using a question mark ?
.
The placeholders will then be substituted with values from an array that is
expected under the args
key:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' -d@- <<- EOF
... {
... "stmt":
... "select date,position from locations
... where date <= \$1 and position < \$2 order by position",
... "args": ["1979-10-12", 3]
... }
... EOF
{
"cols": [
"date",
"position"
],
"rows": [
[
308534400000,
1
],
[
308534400000,
2
]
],
"rowcount": 2,
"duration": ...
}
Note
In this example the placeholders start with an backslash due to shell escaping.
Warning
Parameter substitution must not be used within subscript notation.
For example, column[?]
is not allowed.
The same query using question marks as placeholders looks like this:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' -d@- <<- EOF
... {
... "stmt":
... "select date,position from locations
... where date <= ? and position < ? order by position",
... "args": ["1979-10-12", 3]
... }
... EOF
{
"cols": [
"date",
"position"
],
"rows": [
[
308534400000,
1
],
[
308534400000,
2
]
],
"rowcount": 2,
"duration": ...
}
Note
With some queries the row count is not ascertainable. In this cases
rowcount
is -1
.
Default schema¶
It is possible to set a default schema while querying the CrateDB cluster via
_sql
end point. In such case the HTTP request should contain the
Default-Schema
header with the specified schema name:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' \
... -H 'Default-Schema: doc' -d@- <<- EOF
... {
... "stmt":"select name, position from locations order by id limit 2"
... }
... EOF
{
"cols": [
"name",
"position"
],
"rows": [
[
"North West Ripple",
1
],
[
"Outer Eastern Rim",
2
]
],
"rowcount": 2,
"duration": ...
}
If the schema name is not specified in the header, the default doc
schema
will be used instead.
Column types¶
CrateDB can respond a list col_types
with the data type ID of every
responded column. This way one can know what exact data type a column is
holding.
In order to get the list of column data types, a types
query parameter must
be passed to the request:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql?types' -d@- <<- EOF
... {
... "stmt":
... "select date, position from locations
... where date <= \$1 and position < \$2 order by position",
... "args": ["1979-10-12", 3]
... }
... EOF
{
"cols": [
"date",
"position"
],
"col_types": [
11,
9
],
"rows": [
[
308534400000,
1
],
[
308534400000,
2
]
],
"rowcount": 2,
"duration": ...
}
The Array
collection data type is displayed as a list where the first value
is the collection type and the second is the inner type. The inner type could
also be a collection.
Example of JSON representation of a column list of (String, Integer[]):
"column_types": [ 4, [ 100, 9 ] ]
Available data types¶
IDs of all currently available data types:
ID |
Data type |
---|---|
0 |
|
1 |
Not supported |
2 |
|
3 |
|
4 |
|
5 |
|
6 |
|
7 |
|
8 |
|
9 |
|
10 |
|
11 |
|
12 |
|
13 |
|
14 |
|
15 |
|
16 |
Unchecked object |
17 |
|
19 |
|
20 |
|
21 |
|
22 |
|
23 |
|
24 |
|
25 |
|
26 |
|
27 |
|
28 |
|
100 |
Error handling¶
Queries that are invalid or cannot be satisfied will result in an error response. The response will contain an error code, an error message and in some cases additional arguments that are specific to the error code.
Client libraries should use the error code to translate the error into an appropriate exception:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' -d@- <<- EOF
... {
... "stmt":"select name, position from foo.locations"
... }
... EOF
{
"error": {
"message": "SchemaUnknownException[Schema 'foo' unknown]",
"code": 4045
}
}
To get more insight into what exactly went wrong an additional error_trace
GET
parameter can be specified to return the stack trace:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql?error_trace=true' -d@- <<- EOF
... {
... "stmt":"select name, position from foo.locations"
... }
... EOF
{
"error": {
"message": "SchemaUnknownException[Schema 'foo' unknown]",
"code": 4045
},
"error_trace": "..."
}
Note
This parameter is intended for CrateDB developers or for users requesting support for CrateDB. Client libraries shouldn’t make use of this option and not include the stack trace.
Error codes¶
Code |
Error |
---|---|
4000 |
The statement contains an invalid syntax or unsupported SQL statement |
4001 |
The statement contains an invalid analyzer definition. |
4002 |
The name of the relation is invalid. |
4003 |
Field type validation failed |
4004 |
Possible feature not supported (yet) |
4005 |
Alter table using a table alias is not supported. |
4006 |
The used column alias is ambiguous. |
4007 |
The operation is not supported on this relation, as it is not accessible. |
4008 |
The name of the column is invalid. |
4009 |
CrateDB License is expired. (Deprecated.) |
4010 |
User is not authorized to perform the SQL statement. |
4011 |
Missing privilege for user. |
4031 |
Only read operations are allowed on this node. |
4041 |
Unknown relation. |
4042 |
Unknown analyzer. |
4043 |
Unknown column. |
4044 |
Unknown type. |
4045 |
Unknown schema. |
4046 |
Unknown Partition. |
4047 |
Unknown Repository. |
4048 |
Unknown Snapshot. |
4049 |
Unknown user-defined function. |
40410 |
Unknown user. |
4091 |
A document with the same primary key exists already. |
4092 |
A VersionConflict. Might be thrown if an attempt was made to update the same document concurrently. |
4093 |
A relation with the same name exists already. |
4094 |
The used table alias contains tables with different schema. |
4095 |
A repository with the same name exists already. |
4096 |
A snapshot with the same name already exists in the repository. |
4097 |
A partition for the same values already exists in this table. |
4098 |
A user-defined function with the same signature already exists. |
4099 |
A user with the same name already exists. |
4100 |
An object with the same name already exists. |
5000 |
Unhandled server error. |
5001 |
The execution of one or more tasks failed. |
5002 |
One or more shards are not available. |
5003 |
The query failed on one or more shards |
5004 |
Creating a snapshot failed |
5030 |
The query was killed by a |
Bulk operations¶
The HTTP endpoint supports executing a single SQL statement many times with different parameters.
Instead of the args
(Parameter substitution) key, use the key
bulk_args
. This allows to specify a list of lists, containing all the
parameters which shall be processed. The inner lists need to match the specified
columns.
The bulk response contains a results
array, with a row count for each bulk
operation. Those results are in the same order as the issued operations of the
bulk operation.
Here an example that inserts three records at once:
sh$ curl -sS -H 'Content-Type: application/json' \
... -X POST '127.0.0.1:4200/_sql' -d@- <<- EOF
... {
... "stmt": "INSERT INTO locations (id, name, kind, description)
... VALUES (?, ?, ?, ?)",
... "bulk_args": [
... [1337, "Earth", "Planet", "An awesome place to spend some time on."],
... [1338, "Sun", "Star", "An extraordinarily hot place."],
... [1339, "Titan", "Moon", "Titan, where it rains fossil fuels."]
... ]
... }
... EOF
{
"cols": [],
"duration": ...,
"results": [
{
"rowcount": 1
},
{
"rowcount": 1
},
{
"rowcount": 1
}
]
}
Statements with a result set cannot be executed in bulk. The supported bulk SQL statements are:
Insert
Update
Delete
Bulk errors¶
There are two kinds of error behaviors for bulk requests:
Analysis error: Occurs if the statement is invalid, either due to syntax errors or semantic errors identified during the analysis phase before the execution starts. In this case the whole operation fails and you’ll get a single error:
{ "error": { "code": 4043, "message": "ColumnUnknownException[Column y unknown]" } }
Runtime error: For errors happening after the analysis phase succeeded during execution. For example on duplicate primary key errors or check constraint failures. In this case CrateDB continues processing the other bulk arguments and reports the results via a
rowcount
where-2
indicates an error:{ "cols": [], "duration": 2.195417, "results": [ { "rowcount": 1 }, { "rowcount": -2 } ] }
Note
The error_trace
option does not work with bulk operations.