This guide demonstrates how to create reusable, parameterized queries using CrateDB's views and execute them via the CrateDB HTTP Endpoint. This method streamlines query management, making it easier for former Rockset users to transition to CrateDB.
Step 1: Create a CrateDB Cloud environment
(if you already have one, you can skip this step)
Step 2: Create a table
After you have set up a CrateDB Cloud instance, you can proceed by creating a table. This could be done using the HTTP endpoint, but in this example, we use the CrateDB Cloud Console.
Use this code, paste it into the console, and execute.
CREATE TABLE test.weather_data ( timestamp TIMESTAMP, location VARCHAR, temperature DOUBLE, humidity DOUBLE, wind_speed DOUBLE );
Step 3: Insert sample data
Now that the table is created, it is time to insert some rows. In this example, we use a publicly available weather dataset. Copy the code into the Console and execute it to load the 70k rows.
COPY test.weather_data FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/data_weather.csv.gz' WITH (format='csv', compression='gzip', empty_string_as_null=true);
Step 4: Query the data
Let's assume a query where NULL
readings are interpolated based on the previous and the next value.
WITH OrderedData AS ( SELECT timestamp, location, temperature, LAG(temperature, 1) IGNORE NULLS OVER w AS prev_temp, LEAD(temperature, 1) IGNORE NULLS OVER w AS next_temp FROM test.weather_data WINDOW w AS (PARTITION BY location ORDER BY timestamp) ) SELECT timestamp, location, temperature, CASE WHEN temperature IS NOT NULL THEN temperature WHEN prev_temp IS NOT NULL AND next_temp IS NOT NULL THEN (prev_temp + next_temp) / 2 ELSE NULL END AS interpolated_temperature FROM OrderedData WHERE temperature is NULL ORDER BY location, timestamp LIMIT 50;
This could also be executed using the HTTP endpoint.
In a shell, set the following environment variables.
export your_username=<USERNAME> export your_password=<PASSWORD> export cluster_url=<Host of the CrateDB Cluster - can be found on the overview page of your cluster>
>
curl -sS -u ${your_username}:${your_password} -H 'Content-Type: application/json' \ -X POST https://${cluster_url}:4200/_sql -d@- <<- EOF { "stmt": " WITH OrderedData AS ( SELECT timestamp, location, temperature, LAG(temperature, 1) IGNORE NULLS OVER w AS prev_temp, LEAD(temperature, 1) IGNORE NULLS OVER w AS next_temp FROM test.weather_data WINDOW w AS (PARTITION BY location ORDER BY timestamp) ) SELECT timestamp, location, temperature, CASE WHEN temperature IS NOT NULL THEN temperature WHEN prev_temp IS NOT NULL AND next_temp IS NOT NULL THEN (prev_temp + next_temp) / 2 ELSE NULL END AS interpolated_temperature FROM OrderedData WHERE temperature is NULL ORDER BY location, timestamp LIMIT 50; " } EOF
Even though this works fine, you end up with a significant POST command. The best way to simplify things is by creating views.
Step 5: Create a View
Instead of including large SQL statements in the post of your command, you could simply create a view using the code.
CREATE VIEW v_interpolated_temp AS WITH OrderedData AS ( SELECT timestamp, location, temperature, LAG(temperature, 1) IGNORE NULLS OVER w AS prev_temp, LEAD(temperature, 1) IGNORE NULLS OVER w AS next_temp FROM test.weather_data WINDOW w AS (PARTITION BY location ORDER BY timestamp) ) SELECT timestamp, location, temperature, CASE WHEN temperature IS NOT NULL THEN temperature WHEN prev_temp IS NOT NULL AND next_temp IS NOT NULL THEN (prev_temp + next_temp) / 2 ELSE NULL END AS interpolated_temperature FROM OrderedData WHERE temperature is NULL ORDER BY location, timestamp LIMIT 50;
Copy the code and execute it in the Console like this example:
With the view in place, the HTTP endpoint command looks like this.
curl -sS -u ${your_username}:${your_password} -H 'Content-Type: application/json' \ -X POST https://${cluster_url}:4200/_sql -d@- <<- EOF { "stmt": " select * from v_interpolated_temp " } EOF
Using views still gives you the power to use filters, for example. Let's assume you want to filter on location='Berlin' and interpolate_temperature > 10. You can still use the view.
curl -sS -u ${your_username}:${your_password} -H 'Content-Type: application/json' \ -X POST https://${cluster_url}:4200/_sql -d@- <<- EOF { "stmt": " select location, interpolated_temperature from v_interpolated_temp WHERE location = \$1 AND interpolated_temperature > \$2 ", "args": ["Berlin", 10] } EOF
Conclusion
Using views combined with the CrateDB HTTP endpoint, you can efficiently manage and execute complex queries, providing an accessible pathway for Rockset users to transition to CrateDB.