Skip to content
Explore

Try CrateDB Live

Scenario: Real-time IoT Analytics

B. Geospatial Queries

B1. How to extract latitude & longitude from a GEO_POINT

You can use the latitude and longitude functions to return the co-ordinates in a GEO_POINT datatype. This example will return the bounding box for all loaded data:

SELECT
  min(longitude(geo_location)) AS min_longitude,
  max(longitude(geo_location)) AS max_longitude,
  min(latitude(geo_location)) AS min_latitude,
  max(latitude(geo_location)) AS max_latitude
FROM
  demo.climate_data;

Let’s use the result of this to work out how many data points there are in a subset of the data, take this query and fill in some correct values based on the result of the above query.

SELECT COUNT(*) AS points
FROM demo.climate_data
WHERE data['latitude'] BETWEEN 47.40 AND 54.29
  AND data['longitude'] BETWEEN 6.08 AND 14.99;

If you get it right, this should return a non-zero count. If it’s zero, check your query and try again. More information on CrateDB’s geospatial capabilities is detailed here.

A more powerful way of doing this is to use the WITHIN function. This allows us to find points that lie within a polygon, which gives us the flexibility to use shapes other than a rectangle. The SQL below returns data from Bavaria:

SELECT count(*) AS points
FROM demo.climate_data
WHERE within(
    geo_location,
    'POLYGON ((
        9.056 50.0702, 9.5206 50.071, 9.5638 50.1864, 9.777 50.2783,
        9.7968 50.3737, 9.9948 50.4007, 10.1159 50.5115, 10.2984 50.4563,
        10.5575 50.2984, 10.5938 50.1934, 10.7155 50.1686, 10.8788 50.2253,
        10.8739 50.2925, 10.7868 50.3313, 10.8425 50.3517, 11.2737 50.2397,
        11.2986 50.4449, 11.359 50.4788, 11.517 50.3407, 11.9222 50.377,
        12.0602 50.2787, 12.0461 50.2217, 12.2216 50.0148, 12.4745 49.8851,
        12.3463 49.7313, 12.615 49.4023, 12.7646 49.2961, 12.9338 49.2853,
        13.4098 48.9252, 13.703 48.8334, 13.7649 48.6244, 13.6933 48.5712,
        13.4382 48.6092, 13.3806 48.4074, 12.7071 48.1371, 12.7205 48.0496,
        12.9363 47.8456, 12.8593 47.7013, 13.0354 47.6453, 12.9911 47.5098,
        12.8306 47.5768, 12.7739 47.7046, 12.503 47.6681, 12.2357 47.7708,
        12.1432 47.7268, 12.1583 47.6434, 11.6115 47.628, 11.2493 47.4427,
        11.0023 47.4294, 10.8717 47.5676, 10.42 47.6147, 10.3748 47.5758,
        10.4087 47.4352, 10.2334 47.3164, 10.2333 47.4035, 10.1148 47.4067,
        10.0439 47.5206, 9.8246 47.6133, 10.1642 47.6629, 10.1783 48.1013,
        10.0407 48.3835, 10.3382 48.4979, 10.3723 48.6257, 10.5248 48.6731,
        10.4971 48.9515, 10.2001 49.2197, 10.1581 49.5362, 10.0825 49.5832,
        9.9731 49.5189, 9.8133 49.7557, 9.6897 49.7575, 9.6528 49.8225,
        9.3325 49.8219, 9.2733 49.7665, 9.3332 49.6803, 9.1404 49.6131,
        9.1879 49.7502, 9.0658 49.9172, 9.056 50.0702
    ))'
);
Bavaria region in Germany

The list of points is coma-separated, in the form:

longitude1 latitude1, longitude2 latitude2, longitude3 latitude3 ...

Which effectively defines the shape of the Bavaria region.

B2. Distance calculations:

CrateDB also supports distance calculations, the following query gives the nearest points to Berlin, Germany (approximate location is 52.5N, 13.4E)

SELECT
  geo_location,
  distance(geo_location, 'POINT(13.4 52.5)') AS metres,
  min(data['temperature']) AS min_tmpr,
  max(data['temperature']) AS max_tmpr
FROM demo.climate_data
GROUP BY geo_location, metres
ORDER BY metres ASC
LIMIT 50;

 


 

C. Queries – Correlation

C1. Is pressure and temperate related?

Is there a relationship between temperature and atmospheric pressure (at ground level)?

Run this query and decide for yourself:

SELECT
  geo_location,
  avg(data['temperature']) AS avg_temp,
  avg(data['pressure']) AS avg_pressure
FROM demo.climate_data
GROUP BY 1
ORDER BY 2 DESC;

Exercise: Is there a way to improve this query? Maybe a combination of temperature variance from average and air pressure from 100,000 (1 atmosphere)?

C2. Is pressure and wind speed related?

Is there a relationship between temperature and wind speed?

SELECT
  geo_location,
  avg(data['temperature']) AS avg_temp,
  avg(data['pressure']) AS avg_pressure,
  avg(sqrt(power(data['u10'], 2) + power(data['v10'], 2))) AS wind_speed
FROM demo.climate_data
GROUP BY 1
ORDER BY 3 DESC;

Exercise: Try altering the ordering, either ASCending or DESCending, or using a different column (or combination).

 


 

D. Queries – JSON object model

D1. Insert a new value into the dataset:

INSERT INTO demo.climate_data (measurement_time, geo_location, data) VALUES (123, [8.78831111111111, 54.903], {"longitude" = 8.78831111111111, "latitude" = 54.903, "temperature" = 16.868310546875023, "u10" = 4.472952365875244, "v10" = -1.3958832025527954, "pressure" = 102426.1015625});

This is using the existing schema. What if we want to add a new 'humidity' field to our dataset, do we need to recreate the table?

In CrateDB we can define object columns (i.e. JSON) as one of three different types:

  • OBJECT(STRICT) – The schema for the object is fixed

  • OBJECT(DYNAMIC) – The schema is dynamic, and new attributes can be added

  • OBJECT(IGNORED) – The JSON is stored ignoring the datatypes of the attributes and not indexing the field

See this page for more detailed information.

When you created this table, you set the data column to be DYNAMIC, so we can add additional attributes easily, and these attributes will be indexed.

Add a new key in the JSON object without the need to modify the schema.

INSERT INTO demo.climate_data (measurement_time, geo_location, data) VALUES (123, [8.78831111111111, 54.903], {"humidity" = 43.4, "longitude" = 8.78831111111111, "latitude" = 54.903, "temperature" = 16.868310546875023, "u10" = 4.472952365875244, "v10" = -1.3958832025527954, "pressure" = 102426.1015625});

D2. Now try to query the new key:

SELECT
  data['humidity']
FROM
  demo.climate_data
WHERE
  data['humidity'] IS NOT NULL;

This makes it very easy to extend the data model, ideal for IoT applications where additional sensor information may become available.

And of course, with CrateDB all of this additional data will be indexed for great query performance.