Command Line (CLI) Clients

Command-line tools (client console programs) are one of the entry points to the SpaceKnow platform.

All programs described below can be called with the --help (-h) option to show all of their supported arguments and usages.

Prerequisites

Note

These optional steps make working with CLI tools more convenient. They need to be done only once for each user and machine.

Set-up PATH

User-facing programs reside in the scripts/ folder of the backend git repository. These scripts are designed to be called from any working directory.

To save some typing, add the folder to the PATH environment variable. Assuming that your backend repository is ~/spaceknow/backend, add the following to your ~/.bashrc:

export PATH="$HOME/spaceknow/backend/scripts:$PATH"

Now you can type ragnar-download.py -o ... instead of ~/spaceknow/backend/scripts/ragnar-download.py -o ... no matter where you are.

Ragnar Clients (Formerly Batchloader)

Ragnar clients download raw satellite imagery (.ski files) from the SpaceKnow platform to the local filesystem.

The process is performed in the following steps:

  1. Search for available imagery (scene-tiles).
  2. (Optionally) filter scene-tile records.
  3. Allocate area (high-resolution imagery only).
  4. Download available scene-tiles as .ski files.

All examples in the following sections assume that:

  • path to source GeoJSON is ~/spaceknow/storage/spaceknow-sources/20160102_foo/20160102_foo.geojson
  • desired location of output files is ~/spaceknow/storage/spaceknow-sources/20160102_foo/.

1. Search for Available Imagery

# switch to desired working directory if not already there
cd ~/spaceknow/storage/spaceknow-sources/20160102_foo

ragnar-search.py -j 10 -i 20160102_foo.geojson -p ab -d pleiades -o available-tiles.jsons

The command above reads the 20160102_foo.geojson GeoJSON file and requests imagery from the ab provider and its pleiades dataset using 10 parallel jobs. It saves available scene-tile records to the available-tiles.jsons file.

Note

See Available Satellite Imagery Providers for a list of providers and their datasets.

ragnar-search.py supports additional options to set the tile size and start/end dates. See its --help output.

On successful completion, ragnar-search.py outputs statistics of found tiles, including the total number of found scene-tiles. You can use ragnar-filter.py stats to show statistics on an existing file.

A human-readable list of scene-tiles is printed by ragnar-filter.py list.

To view found records as a table, the following jq expression can be used:

(echo 'set,row,col,datetime,cloud cover,satellite,resolution m/pix'
 jq -r '[.setNumber, .tilePosition.row, .tilePosition.col, .image.datetime,
 .image.cloudCover, .image.satellite, .image.bands[0].gsd] |
 @csv' available-tiles.jsons) > available-tiles.csv

You can also use the jq tool to extract found scene footprints along with other metadata into a GeoJSON file:

jq -s '{type: "FeatureCollection", features: [.[] | {type: "Feature",
    geometry: .image.footprint, properties: ((.image | del(.footprint)) +
    {gsd: .image.bands[0].gsd, setNumber: .setNumber, row: .tilePosition.row,
    col: .tilePosition.col})}]}' available-tiles.jsons
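Where jq is not available, the same conversion can be sketched in Python with only the standard library. This is a minimal sketch for ad-hoc inspection, not code from the backend repository: the function name records_to_feature_collection is hypothetical, and the field names are the ones used in the jq expression above.

```python
import json


def records_to_feature_collection(lines):
    """Build a GeoJSON FeatureCollection from scene-tile record lines.

    Hypothetical helper for quick inspection; field names follow the
    jq expression in this section.
    """
    features = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        image = record["image"]
        # All image metadata except the footprint becomes feature properties.
        properties = {k: v for k, v in image.items() if k != "footprint"}
        properties.update({
            "gsd": image["bands"][0]["gsd"],
            "setNumber": record["setNumber"],
            "row": record["tilePosition"]["row"],
            "col": record["tilePosition"]["col"],
        })
        features.append({
            "type": "Feature",
            "geometry": image["footprint"],
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}
```

To use it, pass an open file object for available-tiles.jsons as lines and write the result with json.dump. For anything beyond inspection, the SceneTileRecord class remains the recommended way to read records.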

If you want to create a CSV table that breaks down imagery availability by year/month/week/day, you can use ragnar-availability-report.py.

To summarize, ragnar-search.py transforms the source GeoJSON file into an available scene-tile file:

20160102_foo.geojson → ragnar-search.py → available-tiles.jsons

Scene-tile Records (Technical)

Each scene-tile record (a line in available-tiles.jsons and a future .ski file) is a condensed JSON object with attributes that describe the tile and an available imagery scene.

A notable example is the extent attribute, which contains a GeoJSON with the tile and its intersection with the whole polygon. The extent is usually a GeoJSON Feature whose properties are taken from the GeoJSON Feature from which ragnar-search.py created the record.

Warning

The records should not be parsed manually. Please use the sk.cli.record.SceneTileRecord Python class in the backend repository instead.

2. (Optionally) Filter Scene-tile Records

The ragnar-filter.py command can be used to filter scene-tile records.

ragnar-filter.py date --from 2016-01-01 -i available-tiles.jsons -o available-tiles-2016-and-on.jsons

The above command reads scene-tile records from available-tiles.jsons, keeps only those captured from 2016-01-01 onward, and writes the matching records to available-tiles-2016-and-on.jsons.

ragnar-filter.py provides multiple filtering sub-commands with various options. See its --help text for the complete reference.

Statistics of the filtered scene-tiles are output after each run. The special stats subcommand only computes statistics and suppresses record writing. The list subcommand prints the scene-tiles being processed in a concise, readable form.

available-tiles.jsons → ragnar-filter.py ... → available-tiles-filtered.jsons

It might be useful to select only scenes that contain particular bands. The following command selects scenes that have a band whose name starts with Total_Ozone:

jq -c 'select(any(.image.bands[]; .names[0] | startswith("Total_Ozone")))' \
    available-tiles.jsons
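The same band selection can be sketched in Python. This is an illustrative sketch only: has_band_prefix and filter_by_band_prefix are hypothetical names, and the record layout (image.bands[].names[0]) is assumed from the jq filter above.

```python
import json


def has_band_prefix(record, prefix):
    """Return True if any band's first name starts with the prefix."""
    return any(
        band["names"][0].startswith(prefix)
        for band in record["image"]["bands"]
        if band.get("names")  # ignore bands without names
    )


def filter_by_band_prefix(lines, prefix):
    """Yield the original record lines that have a matching band."""
    for line in lines:
        if line.strip() and has_band_prefix(json.loads(line), prefix):
            yield line
```

Each matching record is emitted exactly once, even when several of its bands match the prefix.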

Filter Chaining, Logical Expressions

Input and output of ragnar-filter.py default to standard input and standard output, respectively. This allows for easy filter chaining:

ragnar-filter.py date ... -i all.jsons | ragnar-filter.py satellite ... -o filtered.jsons

The example above filters by capture date first and then by satellite, which forms a logical AND of the two filters. Note that the filter order matters because filters may be stateful (e.g. latest).
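Why order matters can be illustrated with a small Python model of chained filters. This is a sketch under stated assumptions, not the ragnar-filter.py implementation: the three functions are hypothetical, latest is assumed to keep only the single newest record, and timestamps are assumed to be ISO 8601 strings in one format, so plain string comparison orders them chronologically.

```python
def by_date(records, date_from):
    """Stateless filter: keep records captured on or after date_from."""
    for record in records:
        if record["image"]["datetime"] >= date_from:
            yield record


def by_satellite(records, satellite):
    """Stateless filter: keep records from the given satellite."""
    for record in records:
        if record["image"]["satellite"] == satellite:
            yield record


def latest(records):
    """Stateful filter: must see all input before emitting the newest record."""
    newest = None
    for record in records:
        if newest is None or record["image"]["datetime"] > newest["image"]["datetime"]:
            newest = record
    if newest is not None:
        yield newest


# Made-up records for illustration only.
records = [
    {"image": {"datetime": "2015-06-01T10:00:00", "satellite": "sat-a"}},
    {"image": {"datetime": "2016-03-01T10:00:00", "satellite": "sat-b"}},
    {"image": {"datetime": "2017-01-15T10:00:00", "satellite": "sat-a"}},
]

# Chaining two stateless filters is a logical AND; their order is irrelevant.
and_result = list(by_satellite(by_date(records, "2016-01-01"), "sat-a"))

# With a stateful filter, the order changes the result.
filter_then_latest = list(latest(by_satellite(records, "sat-b")))  # the 2016 record
latest_then_filter = list(by_satellite(latest(records), "sat-b"))  # empty
```

Filtering by satellite first yields the newest sat-b record, while applying latest first picks the newest record overall (from sat-a), which the satellite filter then discards.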

It is also possible to construct a logical OR, with the help of some temporary files:

ragnar-filter.py date ... -i all.jsons -o filtered-by-date.jsons
ragnar-filter.py satellite ... -i all.jsons -o filtered-by-satellite.jsons

cat filtered-by-date.jsons filtered-by-satellite.jsons | awk '!x[$0]++' > filtered-by-date-or-satellite.jsons

Note

The OR expression above can produce duplicate scene-tiles, so awk '!x[$0]++' is used to deduplicate the output. Unlike the usual sort | uniq, it preserves the original order of the records.
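For readers unfamiliar with the awk idiom, the following Python sketch shows the same order-preserving deduplication. The function name unique_lines is hypothetical.

```python
def unique_lines(lines):
    """Yield each distinct line once, in order of first appearance.

    Mirrors awk '!x[$0]++': a line is emitted only the first time it is seen.
    """
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line
```

As with awk, lines are compared as exact text, so two records that describe the same scene-tile but differ in whitespace or key order are not treated as duplicates.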

3. Allocate Area (High-Resolution Only)

If the imagery is high-resolution (below 1 meter), downloading and analyzing it is limited to allocated areas and time ranges. See Credits API (Billing, Payments and Credits) for more information on the topic.

The ragnar-allocate.py program allocates the areas and time ranges given by a set of scene-tile records.

ragnar-allocate.py -i available-tiles.jsons

4. Download Available Scene-tiles

cd ~/spaceknow/storage/spaceknow-sources/20160102_foo

ragnar-download.py -j 10 -i available-tiles.jsons -o . -r 0.5

The above command reads a list of scene-tile records from available-tiles.jsons and downloads .ski files scaled to 50 cm resolution to a structure under . (current directory), employing 10 parallel jobs:

available-tiles.jsons → ragnar-download.py → set-00001/...

Non-existent output directories are created (including the top-level output directory itself), and completely downloaded records are not downloaded again, so you can safely rerun the command after an interruption. Record numbers in the progress output correspond to line numbers in the input file, which helps when you need to remove problematic scene-tiles.

ragnar-download.py can optionally request resolution scaling to be done by Ragnar Get Image API. See the --help text.

More Examples

  • Search and download latest images:
ragnar-search.py -j 10 -i prague.geojson -p ab -d pleiades | ragnar-filter.py latest | ragnar-download.py -j 10 -o .

Taking the above shortcut is, however, discouraged for all but small GeoJSONs because it makes the process more fragile: an interruption during ragnar-download.py, for example, would force you to repeat the search.

  • Search multiple providers in separate steps, then download once:
ragnar-search.py -j 10 -i prague.geojson -p ab -d pleiades -o pleiades.jsons
ragnar-search.py -j 10 -i prague.geojson -p gbdx -d idaho -o idaho.jsons
cat pleiades.jsons idaho.jsons > all.jsons
ragnar-download.py -j 10 -i all.jsons -o .

Warning

Do not combine the results of multiple different input GeoJSON files into one available scene-tiles file. Doing so would produce confusing results when downloaded. Append more features to the original GeoJSON and rerun the search instead.

  • Recreate available-tiles.jsons from downloaded dataset:
cat set-*/r*c*/*_record.json > available-tiles-recreated.jsons
  • Fetch fresh metadata of a scene:
ragnar-get-scene.py <scene-id-here>
ragnar-get-scene.py <scene-id-here> | jq .footprint | geojsonio
  • Generate grid tiles used by ragnar-search.py without running the search:
ragnar-grid.py -i italy.geojson --utm 32633 -t 24000 -o grid.geojson

Order scenes to GBDX IDAHO

A search in the gbdx preview dataset returns all imagery available in the GBDX platform, but not all of that imagery is available in the idaho dataset. It is possible to “order” scenes so that they later appear in idaho.

ragnar-search.py -j 10 -i prague.geojson -o available-tiles-preview.jsons -p gbdx -d preview
ragnar-order.py -j 10 -i available-tiles-preview.jsons

Kraken Client

Kraken client kraken-download.py allows you to download Kraken API (Imagery and Analyses) map tiles. It can, in parallel:

  • Download and assemble together PNG tiles for all eligible Kraken map types.
  • Download and assemble together GeoJSON and area.json vector results.

kraken-download.py accepts the same available-tiles.jsons file format as produced by ragnar-search.py and accepted by ragnar-download.py.

However, in the case of kraken-download.py, tiling should not be requested during search (e.g. ragnar-search.py --tile-size 0 ...), or the tiles should be much larger than the default 500 x 500 m. (Kraken maps can be up to 81 km² as of version 87.1.)

Usual Procedure

1. Search for Available Imagery without tiling or with a large tile size.
2. Allocate Kraken tiles (applies only to high-resolution imagery):
kraken-allocate.py -i tiles.jsons
3. Download Kraken tiles:
kraken-download.py -i tiles.jsons -j 10 -p 'set-{set_number:05d}/{date:%Y%m%dT%H%M%S}_{scene_id_hash}/{map_type}_' -t change,cars+geotiff,imagery
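The value passed to -p looks like a Python format string. Assuming it is expanded with str.format (an assumption, not confirmed here), the following sketch shows what the template above would produce. The field values are made up for illustration.

```python
from datetime import datetime

# Template copied from the kraken-download.py example above.
template = "set-{set_number:05d}/{date:%Y%m%dT%H%M%S}_{scene_id_hash}/{map_type}_"

# Hypothetical values; real ones would come from the scene-tile record.
prefix = template.format(
    set_number=1,
    date=datetime(2016, 1, 2, 10, 30, 0),
    scene_id_hash="abc123",
    map_type="change",
)

print(prefix)  # set-00001/20160102T103000_abc123/change_
```

The :05d spec zero-pads the set number to five digits, and the strftime-style spec after date: formats the capture time, so output paths sort chronologically within each set.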

Pairwise algorithm mosaicking

The following steps can be used to run a pairwise algorithm (e.g. change) between successive weeks on an AOI that is not covered by a single scene:

ragnar-search.py -j 10 -i prague.geojson -o available-tiles.jsons -p pl -d PSOrthoTile
ragnar-filter.py -i available-tiles.jsons amweekly --aggregate-until 0 --clip -o filtered-tiles.jsons
kraken-pairwise-records.py -i filtered-tiles.jsons -p amweekly -t change -o kraken-records.jsons
kraken-download.py -k kraken-records.jsons -p 'set-{set_number:05d}/{date:%Y%m%dT%H%M%S}/{map_type}_{prev_scene_id_hash}_{scene_id_hash}'

SAR Pairing of images

Pairing SAR images (e.g. to run sar-change) requires a special script and a different sequence of steps:

ragnar-search.py -j 10 -i prague.geojson -o available-tiles.jsons -p ee -d COPERNICUS/S2
ragnar-filter.py -i available-tiles.jsons --bands VV VH -o filtered-tiles.jsons
kraken-sar-pairwise-records.py -i filtered-tiles.jsons -t sar-change -o kraken-records.jsons --only-top-orbit-number
kraken-download.py -k kraken-records.jsons -p 'set-{set_number:05d}/{date:%Y%m%dT%H%M%S}/{map_type}_{prev_scene_id_hash}_{scene_id_hash}'

Authentication

CLIs communicating with SpaceKnow APIs authorize using tokens provided by SpaceKnow Authentication Daemon. CLIs start the daemon automatically if it is not already available. SpaceKnow Authentication Daemon handles authentication flows and makes sure that multiple CLI executions can share a single long-running authentication session.

Note

For security reasons, sessions are not persistent, i.e. SpaceKnow Authentication Daemon keeps no sessions across restarts.

SpaceKnow Authentication Daemon supports multiple simultaneous authentication sessions. Sessions are identified by the email of the authenticated user. CLIs read the SPACEKNOW_EMAIL environment variable and ensure that the appropriate session is used. When SPACEKNOW_EMAIL is not set, the currently available session is used, or a new session is started if none is available. Hence, SPACEKNOW_EMAIL must be set when more than one session is available.
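The session selection rules can be summarized in a short Python sketch. It is an illustration of the rules as described, not the daemon's or the CLIs' actual code: select_session is a hypothetical function, and starting a new session when the requested email has no session yet is an assumption.

```python
def select_session(available_emails, requested_email=None):
    """Decide which session to use.

    available_emails: emails of sessions currently held by the daemon.
    requested_email: value of SPACEKNOW_EMAIL, or None when it is not set.
    Returns ("use", email) or ("start", email_or_None).
    """
    available = list(available_emails)
    if requested_email is not None:
        # Assumption: a missing session for the requested email is started.
        action = "use" if requested_email in available else "start"
        return action, requested_email
    if not available:
        return "start", None  # no session yet: start a new one
    if len(available) == 1:
        return "use", available[0]  # the only session is unambiguous
    raise ValueError(
        "SPACEKNOW_EMAIL must be set when more than one session is available"
    )
```

A caller would pass os.environ.get("SPACEKNOW_EMAIL") as requested_email. With two sessions and no email, the sketch raises an error, which mirrors the requirement that SPACEKNOW_EMAIL be set in that case.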

Warning

SpaceKnow Authentication Daemon creates a Unix socket on the local file system. Any program with access to the socket file can obtain the ID token of any running session.

For example, an account with e-mail info@example.com can be used from a kraken-download.py execution:

SPACEKNOW_EMAIL=info@example.com kraken-download.py ...

Alternatively, SPACEKNOW_EMAIL can be set permanently with:

export SPACEKNOW_EMAIL=info@example.com

SpaceKnow Authentication Daemon supports two authentication flows: Proof Key for Code Exchange (PKCE) [1], where the user authenticates in their browser, and Resource Owner Password Credentials Grant (ROPC) [2], where the user stores their credentials in a set of environment variables.

To utilize ROPC authentication flow, the following environment variables must be set: SPACEKNOW_CLIENT_ID, SPACEKNOW_CLIENT_SECRET, SPACEKNOW_EMAIL, SPACEKNOW_PASSWORD. Otherwise the PKCE flow is used.

Where:

  • SPACEKNOW_CLIENT_ID – ID of the application registered on Auth0 (e.g. ID representing CLIs)
  • SPACEKNOW_CLIENT_SECRET – secret used to authorize the application
  • SPACEKNOW_EMAIL – email of the user to be authenticated
  • SPACEKNOW_PASSWORD – password of the user to be authenticated
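The choice between the two flows can be expressed as a small Python sketch. It only illustrates the rule stated above and is not the daemon's code: choose_auth_flow is a hypothetical function, and treating an empty value as unset is an assumption.

```python
ROPC_VARIABLES = (
    "SPACEKNOW_CLIENT_ID",
    "SPACEKNOW_CLIENT_SECRET",
    "SPACEKNOW_EMAIL",
    "SPACEKNOW_PASSWORD",
)


def choose_auth_flow(environ):
    """Return "ROPC" only when all four variables are set, else "PKCE"."""
    # Assumption: an empty string counts as "not set".
    if all(environ.get(name) for name in ROPC_VARIABLES):
        return "ROPC"
    return "PKCE"
```

Calling choose_auth_flow(os.environ) would report which flow applies to the current shell. Setting only SPACEKNOW_EMAIL, as in the earlier examples, still results in the PKCE flow.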

Warning

Using SPACEKNOW_CLIENT_SECRET and SPACEKNOW_PASSWORD environment variables may pose additional security risks. Their usage is discouraged outside isolated environments or when PKCE authentication flow can be used instead.

One may utilize Kubernetes Secrets to securely set environment variables in a Kubernetes Pod.

Note

It is possible to run SpaceKnow Authentication Daemon locally and forward its Unix socket to the remote machine over SSH. However, CLIs can no longer authenticate once the SSH connection is closed. Use a variant of the following command to local-forward the SpaceKnow Authentication Daemon Unix socket: ssh -L ~/.spaceknow/auth.sock:/home/<user>/.spaceknow/auth.sock <remote>

Footnotes

[1] https://tools.ietf.org/html/rfc7636, https://auth0.com/docs/api-auth/grant/authorization-code-pkce
[2] https://tools.ietf.org/html/rfc6749#section-4.3, https://auth0.com/docs/api-auth/grant/password