Generate time series data using Python
This tutorial will show you how to generate mock time series data about the International Space Station (ISS) using Python.
Prerequisites
CrateDB must be installed and running.
Make sure you’re running an up-to-date version of Python (we recommend 3.7 or higher).
Then, use pip to install the requests and CrateDB Python Client libraries:
sh$ pip install requests crate
The rest of this tutorial is designed for Python’s interactive mode so that you can experiment with the commands as you see fit. The standard Python interpreter works fine for this, but we recommend IPython for a more user-friendly experience.
You can install IPython with pip:
sh$ pip install ipython
Once installed, you can start an interactive IPython session like this:
sh$ ipython
Get the current position of the ISS
Open Notify is a third-party service that provides an API to consume data about the current position, or ground point, of the ISS.
The endpoint for this API is http://api.open-notify.org/iss-now.json.
Start an interactive Python session (as above).
Next, import the requests library:
>>> import requests
Then, read the current position of the ISS with an HTTP GET request to the Open Notify API endpoint, like this:
>>> response = requests.get("http://api.open-notify.org/iss-now.json")
>>> response.json()
{'message': 'success',
'timestamp': 1582730500,
'iss_position': {'latitude': '33.3581', 'longitude': '-57.3929'}}
As shown, the endpoint returns a JSON payload, which contains an iss_position object with latitude and longitude data.
You can encapsulate this operation with a function that returns longitude and latitude as a WKT string:
>>> def position():
...     response = requests.get("http://api.open-notify.org/iss-now.json")
...     position = response.json()["iss_position"]
...     return f'POINT ({position["longitude"]} {position["latitude"]})'
When you run this function, it should return your point string:
>>> position()
'POINT (-30.9188 42.8036)'
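The coordinate order matters here: WKT points are written longitude first, and Open Notify returns both values as strings. As a minimal sketch, the formatting step can be separated from the HTTP request so that it can be checked without network access. The helper name wkt_from_payload is hypothetical and is not used elsewhere in this tutorial; it assumes the payload has the shape shown earlier.

```python
def wkt_from_payload(payload):
    """Build a WKT point string from an Open Notify style payload.

    Hypothetical helper for illustration; assumes the payload has the
    shape shown earlier in this tutorial.
    """
    if payload.get("message") != "success":
        raise ValueError(f"unexpected payload: {payload!r}")
    iss = payload["iss_position"]
    # The API returns coordinates as strings, so convert them to floats.
    longitude = float(iss["longitude"])
    latitude = float(iss["latitude"])
    # WKT expects longitude (x) first, then latitude (y).
    return f"POINT ({longitude} {latitude})"


sample = {
    "message": "success",
    "timestamp": 1582730500,
    "iss_position": {"latitude": "33.3581", "longitude": "-57.3929"},
}
print(wkt_from_payload(sample))
```

Converting to float also rejects a malformed coordinate early, before it reaches the database.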
Set up CrateDB
First, import the CrateDB Python Client:
>>> from crate import client
Then, connect to CrateDB:
>>> connection = client.connect("localhost:4200")
Note
You can omit the function argument if CrateDB is running on localhost:4200. We have included it here for the sake of clarity. Modify the argument if you wish to connect to a CrateDB node on a different host or port number.
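If the address differs between environments, one option is to read it from an environment variable and fall back to the default used in this tutorial. This is only a sketch: the variable name CRATEDB_ADDRESS and the helper cratedb_address are assumptions made for illustration, not something the CrateDB client reads on its own.

```python
import os


def cratedb_address(environ=os.environ):
    """Return the CrateDB address, falling back to the tutorial default.

    CRATEDB_ADDRESS is a hypothetical variable name chosen for this sketch.
    """
    return environ.get("CRATEDB_ADDRESS", "localhost:4200")


# Plain dictionaries stand in for the real environment here.
print(cratedb_address({}))
print(cratedb_address({"CRATEDB_ADDRESS": "crate.example.com:4200"}))
```

The returned string can then be passed to client.connect in place of the literal address.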
Get a cursor:
>>> cursor = connection.cursor()
Finally, create a table suitable for writing ISS position coordinates.
>>> cursor.execute(
... """CREATE TABLE iss (
... timestamp TIMESTAMP GENERATED ALWAYS AS CURRENT_TIMESTAMP,
... position GEO_POINT)"""
... )
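Running the statement above a second time fails because the table already exists. A possible variation, sketched here, adds IF NOT EXISTS so that the setup step can be repeated. The function name ensure_iss_table and the RecordingCursor stand-in are hypothetical; the stand-in only lets the example run without a CrateDB node.

```python
CREATE_ISS_TABLE = """CREATE TABLE IF NOT EXISTS iss (
    timestamp TIMESTAMP GENERATED ALWAYS AS CURRENT_TIMESTAMP,
    position GEO_POINT)"""


def ensure_iss_table(cursor):
    """Create the iss table unless it already exists."""
    cursor.execute(CREATE_ISS_TABLE)


class RecordingCursor:
    """Stand-in for a real cursor so the sketch runs without CrateDB."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, parameters=None):
        # Record the statement instead of sending it to a database.
        self.statements.append(sql)


demo_cursor = RecordingCursor()
ensure_iss_table(demo_cursor)
ensure_iss_table(demo_cursor)  # with a real cursor, this call should change nothing
print(len(demo_cursor.statements))
```

With a live connection, you would pass the cursor obtained from connection.cursor() instead of the stand-in.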
In the CrateDB Admin UI, you should see the new table when you navigate to the Tables screen using the left-hand navigation menu.
Record the ISS position
With the table in place, you can start recording the position of the ISS.
The following command calls your position function and inserts the result into the iss table:
>>> cursor.execute("INSERT INTO iss (position) VALUES (?)", [position()])
Press the up arrow on your keyboard and hit Enter to run the same command a few more times.
When you’re done, you can select that data back out of CrateDB.
>>> cursor.execute('SELECT * FROM iss ORDER BY timestamp DESC')
Then, fetch all the result rows at once:
>>> cursor.fetchall()
[[1582295967721, [-8.0689, 25.8967]],
[1582295966383, [-8.1371, 25.967]],
[1582295926523, [-9.9662, 27.8032]]]
Here you have recorded three sets of ISS position coordinates.
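The first value in each row is the generated timestamp in milliseconds since the Unix epoch, and the second is the stored point. As a rough sketch, the rows can be converted into something easier to read. The helper name readable_rows is hypothetical, and the code assumes the point array is ordered [longitude, latitude], which matches the order used when inserting.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def readable_rows(rows):
    """Convert [timestamp_ms, [longitude, latitude]] rows into dictionaries."""
    result = []
    for timestamp_ms, (longitude, latitude) in rows:
        result.append(
            {
                # Integer millisecond arithmetic avoids float rounding.
                "time": EPOCH + timedelta(milliseconds=timestamp_ms),
                "longitude": longitude,
                "latitude": latitude,
            }
        )
    return result


# The rows shown above, copied from the fetchall() output.
rows = [
    [1582295967721, [-8.0689, 25.8967]],
    [1582295966383, [-8.1371, 25.967]],
    [1582295926523, [-9.9662, 27.8032]],
]
for row in readable_rows(rows):
    print(row["time"].isoformat(), row["longitude"], row["latitude"])
```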
Automate the process
Now that you have the key components, you can automate the data collection.
Create a new file called iss-position.py, like this:
import time

import requests
from crate import client


def position():
    response = requests.get("http://api.open-notify.org/iss-now.json")
    position = response.json()["iss_position"]
    return f'POINT ({position["longitude"]} {position["latitude"]})'


def insert():
    # New connection each time
    try:
        connection = client.connect("localhost:4200")
        print("CONNECT OK")
    except Exception as err:
        print("CONNECT ERROR: %s" % err)
        return
    cursor = connection.cursor()
    try:
        cursor.execute(
            "INSERT INTO iss (position) VALUES (?)", [position()],
        )
        print("INSERT OK")
    except Exception as err:
        print("INSERT ERROR: %s" % err)
        return


# Loop indefinitely
while True:
    insert()
    print("Sleeping for 10 seconds...")
    time.sleep(10)
Here, the script sleeps for 10 seconds after each sample, so the time series data will have a resolution of roughly 10 seconds (plus the time each request and insert takes). To sample at a different rate, change the value passed to time.sleep.
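One way to make the sampling rate adjustable is to move the loop into a function that takes the interval as a parameter. The following is a minimal sketch, not part of the script above: the name collect and its parameters are hypothetical, and the sleep function is injectable only so that a trial run finishes immediately.

```python
import time


def collect(sample, interval=10.0, iterations=None, sleep=time.sleep):
    """Call sample() repeatedly, pausing interval seconds between calls.

    iterations=None loops forever, like the script above; a number stops
    after that many samples, which is handy for a trial run.
    """
    count = 0
    while iterations is None or count < iterations:
        sample()
        count += 1
        sleep(interval)
    return count


# Trial run with stand-ins so that nothing touches the network or CrateDB.
calls = []
pauses = []
collect(lambda: calls.append("sample"), interval=30, iterations=3, sleep=pauses.append)
print(len(calls), pauses)
```

In the script, a call such as collect(insert, interval=30) would take the place of the while loop.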
Run the script from the command line, like so:
sh$ python iss-position.py
CONNECT OK
INSERT OK
Sleeping for 10 seconds...
CONNECT OK
INSERT OK
Sleeping for 10 seconds...
CONNECT OK
INSERT OK
Sleeping for 10 seconds...
As the script runs, you should see the table filling up in the CrateDB Admin UI.
Lots of freshly generated time series data, ready for use.
And, for bonus points, if you select the arrow next to the location data, it will open up a map view showing the current position of the ISS.
Tip
The ISS passes over large bodies of water. If the map looks empty, try zooming out.