CrateDB excels at storing, analysing, and tracking the location of people and objects in real time. Whether you're managing a fleet of vehicles, offering customers information about services near their location, or analysing historical traffic patterns, CrateDB's geospatial data types and functions have you covered.
We'll begin by learning how to store different types of geospatial data in CrateDB and move on to explore different types of geosearch query. The GEO_POINT data type is used to store latitude and longitude coordinates. In common with other data types in CrateDB, GEO_POINTs can be used to define a column in a table or the field in an object type column.
Here we're looking at an excerpt of the CREATE TABLE command that creates the table for the 311 calls in our Chicago data set. Inside the locationdetails object, the location column stores the latitude and longitude from which the issue described in the call was reported. This is a fragment of an INSERT statement to add a new 311 call record. When specifying the location details, we can either provide an array containing the longitude and latitude values separated by a comma as shown here, or we can provide these details in the WKT or Well Known Text format.
This is what a GEO_POINT field looks like when returned from a SELECT query. Now we've stored some GEO_POINT data, let's study some basic queries using this data type.
The distance function returns the distance in meters between two GEO_POINTs. Here we have a query that calculates the distance between the location of a specific street light that has been reported as not working and the location of the maintenance depot responsible for fixing it. We can see that CrateDB has calculated that these two points are about 4.3 kilometers apart and we can see that mapping software agrees with this assessment. We can of course use the distance function as a filter when querying data.
This query finds all open street light out reports within 2 kilometers of the depot returning the service request number and address.
The second geospatial data type supported by CrateDB is the GEO_SHAPE. A GEO_SHAPE column can store 2 dimensional shapes represented as either GeoJSON objects or in WKT format. CrateDB's GEO_SHAPE supports a range of shapes from a simple point or a line string with complex polygons and even multi polygons for non contiguous areas, and even a geometry collection, a set of shapes containing any of the other types.
Here I'm using an online tool to visualise the GeoJSON representation of Kenwood, one of the Chicago community areas in our data set. Each community area is represented as a MultiPolygon. This format enables us to model complex shapes such as the lakeside edge of this community area. We'll see how to perform queries over this sort of data later in the video.
When defining a GEO_SHAPE column, you can choose how CrateDB indexes the shapes. This allows you to control the size of the index needed for your GEO_SHAPEs and the accuracy that they're stored with.
In this example, taken from our Chicago Community Areas data set, we're using the geohash index with one meter accuracy. Other options for indexing include a Quad-tree or a BKD- tree index. Refer to the CrateDB GEO_SHAPE documentation to understand the benefits and trade-offs of each of the available indexing options.
GEO_SHAPE values are inserted using either the GeoJSON geometry object representation as shown here, or using the WKT representation here. Let's look at two ways to use GEO_SHAPE data in a query. The within function takes two GEO_SHAPEs as its parameters and returns true if the first shape is contained within the second.
This query tells us which community area the point representing a cafe at 807 W Maxwell St. is part of. We'll use WKT for this query. CrateDB tells us that this location is part of the Near West Side community. We can also ask CrateDB which communities intersect with an area described by a given GEO_SHAPE.
This query returns the names of all communities that intersect with the polygon shown on the map. In this case, we're using GeoJSON when specifying the points of the polygon in our query. When we run the query, we see that the area drawn on the map intersects with 17 of Chicago's community areas. Thanks to CrateDB's PostgreSQL Wire Protocol compatibility, it's easy to connect it to popular dashboard and visualization tools.
Here I'm running a simple query from Grafana to generate a map showing the location of all the open street light issues contained in our 311 call data set.
CrateDB has excellent support for geospatial data types. The use of standard representations such as GeoJSON and WKT make it easy to import, query, analyse and visualise geospatial data. The sample data set contains geospatial data in the community areas, 311 calls and taxi rights tables. Take some time to explore these in your CrateDB cluster and try out some geospatial queries of your own.