Skip to content
Product > Data models

Geospatial Data

CrateDB stores, analyzes and tracks the location of people and objects in real-time, all with SQL.
 

Geospatial data queries with SQL

Hyper-fast. Results in milliseconds.

        

/* Based on Chicago taxi trips, this query lists the longest 5 trips 
   starting within 1km of the Shedd Aquarium and tells us the name
   of the community area that the trip ended within. */
SELECT tripid, 
       miles,
       c.name as dropoff_area
FROM taxi_rides t, community_areas c
WHERE distance('POINT(-87.6118581 41.8672843)'::GEO_POINT, pickupcentroidlocation) < 1000
      AND dropoffcommunityarea = c.areanumber
ORDER BY miles DESC
LIMIT 5; 
        

+------------------------------------------+-------+--------------+
| tripid                                   | miles | dropoff_area |
+------------------------------------------+-------+--------------+
| e6d27fb09f7b17695a71666654c40fd56b79f472 | 46.5  | LOOP         |
| eb8dbd2d81faa21f8f6d80117748a8630c55d55c | 37.27 | OHARE        |
| 6aaa0d2aed0423f1321ff5df949afef0e3583d58 | 37.16 | OHARE        |
| e9ab078e5e6ed4d29cfbd310ee75d47236d3afbb | 32.03 | OHARE        |
| 329b4e6980070cab4c2af19ddbc7b73c1497a669 | 31.67 | OHARE        |
+------------------------------------------+-------+--------------+
        

/* Based on Chicago taxi trips, this query retuern the number 
   of trips per day that started at O'Hare airport and were
   over 20 miles long. */
SELECT date_bin('24 hour'::interval, starttime, 0) AS day,
       count(*) as trips
FROM taxi_rides
WHERE within(
        pickupcentroidlocation, 
        'POLYGON (
          (-87.92151405039768 42.008992106324456, 
           -87.93962405297475 42.00147632076016, 
           -87.94067626623455 41.95592273798644, 
           -87.89918645000783 41.9491763875842, 
           -87.88992003878232 41.95716729007498, 
           -87.8821237584836 41.95716462062882, 
           -87.87074008603852 41.967968085787476, 
           -87.87727540517479 41.99772302560805, 
           -87.89707897269737 42.009775581434354, 
           -87.92151405039768 42.008992106324456)
        )'
      )
      AND miles > 20
GROUP BY day
ORDER BY day DESC;
        

+---------------+-------+
|           day | trips |
+---------------+-------+
| 1714521600000 |    24 |
| 1714435200000 |   110 |
| 1714348800000 |   109 |
| 1714262400000 |    77 |
| 1714176000000 |    52 |
| 1714089600000 |    98 |
| 1714003200000 |   113 |
| 1713916800000 |    82 |
| 1713830400000 |    59 |
| 1713744000000 |    91 |
| 1713657600000 |    69 |
| 1713571200000 |    48 |
| 1713484800000 |    82 |
| 1713398400000 |    75 |
| 1713312000000 |    80 |
| 1713225600000 |    98 |
| 1713139200000 |   104 |
| 1713052800000 |    63 |
| 1712966400000 |    75 |
| 1712880000000 |   142 |
| 1712793600000 |   169 |
| 1712707200000 |   103 |
| 1712620800000 |   154 |
| 1712534400000 |   203 |
| 1712448000000 |    98 |
| 1712361600000 |    71 |
| 1712275200000 |    80 |
| 1712188800000 |    79 |
| 1712102400000 |    72 |
| 1712016000000 |    91 |
| 1711929600000 |    39 |
+---------------+-------+

        

/* Which communities of more than 30,000 people does a flight from O'Hare to 
   Midway airport pass over? */
SELECT name,
       details['population'] AS population
FROM community_areas
WHERE
  intersects (
    'LINESTRING (-87.90753846894022 41.9807787186139, -87.72939445102593 41.97508268795004, -87.55763552335945 41.97982941555023, -87.47846040428642 41.92427055787562, -87.51102436455034 41.837749425166244, -87.59225264192574 41.80057450627759, -87.64801197528328 41.770600684095974, -87.69988112259261 41.7819633786835, -87.7530469985847 41.78583107072223)',
    boundaries
  )
  AND details['population'] > 30000
ORDER BY population DESC;
        

+------------------------+------------+
| name                   | population |
+------------------------+------------+
| UPTOWN                 |      57182 |
| EDGEWATER              |      56296 |
| CHICAGO LAWN           |      55931 |
| ALBANY PARK            |      48396 |
| LINCOLN SQUARE         |      40494 |
| NORWOOD PARK           |      38303 |
| GARFIELD RIDGE         |      35439 |
| WEST LAWN              |      33662 |
| GREATER GRAND CROSSING |      31471 |
+------------------------+------------+

Geospatial functions

With CrateDB geospatial functions, you can address these use cases easily:

  • Spatial count: find how many records are located within a specific area, thanks to the within function.
  • Spatial distance: find the closest record, thanks to the distance function.
  • Spatial overlap: find whether two areas overlap, thanks to the within and intersects functions.

Geospatial data formats

Geopoints

CrateDB offers versatile support for geospatial data types. For storing geopoint data, there are two primary formats you can use in the database:

  • Array Format: Geopoints can be represented as an array where the coordinates are in the format [longitude, latitude]. This array format provides a simple and straightforward way to store point locations.
  • WKT Format: CrateDB also supports the Well-Known Text (WKT) representation for geopoints. You can store geopoint data using the format: POINT(longitude latitude). This notation provides a standardized way to represent geospatial data.

Geoshapes

For storing more complex geospatial shapes into the database, CrateDB offers the following formats:

  • GeoJSON Format: This is a popular format for encoding a variety of geographical data structures. It’s versatile and can represent simple points, lines, and polygons as well as more complex shapes.
  • WKT Formats: Apart from the POINT format, CrateDB supports a variety of Well-Known Text formats for geoshapes. Whether you’re storing polygons, lines, or multi-points, WKT offers a standard representation for each of these geospatial structures.
Geospatial-Webinar-Play

Working with geospatial data

Working with geospatial data involves meeting challenges of variety and scale. This webinar recording will take you through everything you need to know about putting CrateDB to work as a geospatial database.

Curious to learn more?

Geospatial data has a lot of points and generates a lot of data, that’s why you need great scalability. With CrateDB, you get scalable SQL support for geospatial data types and functions for geospatial applications such as fleet tracking, mapping, location analytics.

Location is important for many machine data analysis. For this reason, CrateDB can store and query geographical information using the geo_point and geo_shape data types. You can control geographic index precision and resolution for faster query results, and also run exact queries with scalar functions like intersects, within, and distance.

"CrateDB was the only geospatial database we found that could smoothly process data for our users and for our data science team. We fell in love with it immediately."
Kartik Venkatesh CTO Spatially Health Learn more
cr-quote-img-white
"Most of the cars in our customers' fleets are in use almost 24 hours a day, and we need to store and analyse the massive amounts of data they generate in real time. We tried a few different SQL and NoSQL databases, and CrateDB offered the best combination of high performance, scalability and ease of use."
Mark Sutheran Founder Clickdrive.io Learn more
cr-quote-img-white

Other resources on geospatial data

Documentation

Geo data types

GEO_POINT, GEO_SHAPE

Documentation

Geo search

MATCH, intersects, within, distance

FAQ

Geospatial data represents specific geographical locations using latitude and longitude coordinates or text fields naming geographical areas like countries or states. This data is crucial for applications that require tracking and analyzing the location of people and objects. CrateDB supports the storage, analysis, and real-time tracking of geospatial data - all using SQL.

Spatial data encompasses various formats used to represent geographic locations and shapes. Common examples include arrays for simple point data and GeoJSON or WKT formats for more complex shapes. CrateDB supports these geospatial data types, offering flexibility and precision in storing and querying geospatial information.

Storing geospatial data efficiently requires a database with strong scalability due to the large volume of data points it generates. It should allow for the control of geographic index precision and resolution to enable faster query results and support exact queries with functions like intersects, within, and distance. CrateDB excels in this area by offering scalable SQL support for geospatial data types, including geo_point and geo_shape, making it ideal for geospatial applications.

The best databases for spatial data should offer advanced geospatial functions to handle spatial count, distance and overlap efficiently. Examples of databases optimized for spatial data include CrateDB, Amazon Aurora, Esri ArcGIS, and MariaDB. CrateDB supports robust geospatial capabilities, allowing for seamless integration and analysis of spatial data.