Version 6.4.0 - Unreleased¶
Note
In development. 6.4.0 isn’t released yet. These are the release notes for the upcoming release.
Note
If you are upgrading a cluster, you must be running CrateDB 5.0.0 or higher before you upgrade to 6.4.0.
We recommend that you upgrade to the latest 6.3 release before moving to 6.4.0.
A rolling upgrade from >= 6.3.0 to 6.4.0 is supported. Before upgrading, you should back up your data.
Warning
Tables that were created before CrateDB 5.x will not function with 6.x and must be recreated before moving to 6.x.x.
You can recreate tables using COPY TO and COPY FROM or by
inserting the data into a new table.
Table of contents
Breaking Changes¶
None
Deprecations¶
None
Changes¶
SQL Statements¶
Added support for ALTER REPOSITORY.
None
SQL Standard and PostgreSQL Compatibility¶
Added a information_schema.collations table.
Added a
catalog_namecolumn to information_schema.schemata.Dhruv Patel added the
pg_catalog.pg_usertable for PostgreSQL compatibility, resolving issues with tools like DbVisualizer that query this table during schema introspection.
Data Types¶
Added support for
geo_pointvalues in JSON array string format, for example"[14.988999953493476, 51.10299998894334]". This allows CSV files containinggeo_pointvalues in this format to be imported without transforming the source data first.
Scalar and Aggregation Functions¶
Added support for adding or subtracting intervals to values of type date.
Added support for
THandthtemplate patterns to the to_char scalar function which can be used to add a ordinal suffix for numbers.Added support for marking literal characters with double quotes in the to_char template pattern. Characters enclosed in
"..."are treated as literal text and not interpreted as formatting tokens. Backslash escaping within double-quoted sections is also supported (e.g.,\"for a literal",\\for a literal\).Robert Palmer added the blake3 scalar function for computing BLAKE3 checksums of strings.
Bing O’Dowd added the largest triangle three buckets aggregation function for downsampling data using the Largest Triangle Three Buckets algorithm
Performance and Resilience Improvements¶
Improved the performance of
SELECT strKeyColumn, count(*) FROM <tbl> GROUP BY 1queries if they don’t contain anyWHEREclause. In some of our benchmarks by up to 140%.Improved the performance of queries on the
sys.shardstable. In some of our benchmarks by up to ~20% if the sequence number of translog stats were selected.Added column pruning for scalar subqueries. This can help improve performance for sub-queries which intermediately selected columns not used in the results.
Improved the performance of the :ref:
to_char <scalar-to_char>scalar. In some of our benchmarks by up to ~25%.Improved the performance of bulk insert operations against tables that use user defined functions in a generated column which also has check constraints.
Increased the default queue size of the
writesearch pool but lowered the amount of internal retries forINSERT INTO <table> VALUESstatements in exchange. This should lower the load for bulk inserts into tables with many shards at the expensive of potentially higher memory load before requests get rejected if clients make single inserts with high concurrency/many connections.Improved the performance of the min() and max() aggregations for columns of type NUMERIC.
Improved query planning for lookup joins under
LIMITby deferring collection of columns that are not needed during join execution until the final fetch phase.
Administration and Operations¶
Added
session_idcolumn to the sys.jobs and sys.jobs_log tables.Added a
statecolumn to the sys.cluster table.Added a
remove-corrupted-datacommand to the crate-node CLI tool for disaster recovery of a corrupted shard which cannot be automatically recovered from a good copy (replica) or restored from a backup.
Client interfaces¶
Changed the HTTP interface to set parameter type hints which are guessed based on the
argspayload. This can help avoid having to add explicit casts within a SQL statement for parameter placeholders.For example, instead of
SELECT percentile(height, ?::double precision) ...it is possible to useSELECT percentile(height, ?)ifargscontains a numeric value.A side effect of this change is that cast semantics can become stricter. This could surface latent issues if the context where the parameter is used requires a cast to a different type. Consider for example an
INSERT INTO tbl (xs) VALUES (?), (?)wherexsis of typetext[]but the payload forxsis an object array.Previously this used a cast from
undefinedtotext[]and succeeded, now it can lead to errors like:The types of the columns within VALUES lists must match. Found `text_array` and `object_array` at position: 1