Datacube API
Datacube API is a system for storing, querying and filtering data-point time series.
Concept
Datacube serves a “data lake” of data-points, an ever-increasing storage of versioned, multi-dimensional time-series data. New data from various sources, including but not limited to SpaceKnow algorithms applied to satellite imagery, are constantly streamed to Datacube.
A data-point is a scalar (for example, the result of a specific algorithm, such as a mean NDVI value or the number of detected cars) that can be computed on a given (multi)polygon AoI and one source data sample (usually one satellite image shot or an imagery mosaic). Datacube is a queryable, multi-dimensional data-point time-series storage and API.
Data-points have the following properties (i.e. dimensions of the datacube):
version (str) – version of the data-point. For example, data can be re-computed when the source algorithm changes or when new interpolation could be computed after new satellite images become available. Once a version of a data-point is uploaded it remains forever available. Alphabetically greater versions are considered newer. For example, version ab is considered newer than version aa.
startDatetime, endDatetime (datetime) – a time-range (possibly 0 seconds wide) of the source data acquisition (for example a time when underlying satellite image was taken).
algorithm (str) – a string identification of the underlying algorithm(s) which produced the particular data-point. For example, ndmi_mean.
project (str) – For example, aluminium or coal (but not china-coal).
aoi (GeoJSON) – point AoI, this information is not retrievable with the API but data-points can be filtered to a region.
aoiId (str) – a unique identifier of the data-point AoI.
source (str) – underlying data source identification, for example landsat_8, viirs_nl_monthly.
firstSeen (datetime) – date of the initial availability of the data to us, or date when we’ve first seen the data when the former is not available. This value is optional.
ingestDatetime (datetime) – time at which the data-point was inserted into Datacube.
cloudCover (float) – expected cloudiness in range [0, 1], 0 being cloud-free. Satellite-derived data-points may have diminished quality due to high cloud cover. This value is optional.
intersectionRatio (float) – a value in range [0, 1]. Ratio of site’s area that has been analyzed in given sample and total site’s area. This value is optional.
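The version ordering can be illustrated with a short Python sketch. It assumes that "alphabetically greater" means ordinary lexicographic string comparison; the helper name newest_version is hypothetical and not part of the API.

```python
def newest_version(versions):
    """Return the version that sorts last lexicographically (illustrative helper)."""
    return max(versions)


print(newest_version(["aa", "ab"]))    # ab
print(newest_version(["160", "162"]))  # 162
# String comparison is not numeric: "9" sorts after "10".
print(newest_version(["9", "10"]))     # 9
```

Under this assumption, version strings of different lengths do not order numerically, so fixed-width version strings are the safer choice.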
Get Data
This API endpoint returns a link to a CSV on Google Cloud Storage with a column for each Datacube dimension and another column with values.
Note
By default only data with the newest version are returned. This behaviour can be overridden by setting keepAllVersions to true or by filtering by version.
Filters
Data can be filtered with so-called filters. A filter on project and algorithm is mandatory for each query.
Each filter is a JSON object in the following form.
type (str) – type of filter, see list of types below.
field (str) – field / axis on which the filter should be applied. See available fields in Concept.
params (object) – object with filter parameters which is specific for each filter type.
The following filter-types are available.
time-range – keeps only data-points inside a date-range. Filter bounds can be half-open if one of the bounding parameters is left out. Parameters:
from – a date-time in the form YYYY-mm-dd HH:MM:SS, inclusive.
to – a date-time in the form YYYY-mm-dd HH:MM:SS, inclusive.
value-range – keeps only data-points inside a value-range. Filter bounds can be half-open if one of the bounding parameters is left out. Missing / null values are filtered out by this filter. Parameters:
min – a minimum allowed value, inclusive.
max – a maximum allowed value, inclusive.
value-list – keeps only data-points which have the given field set to one of the provided values. Needs to be used for project and algorithm. Other possible fields to filter are version, aoiId, source and tag. The behaviour of the tag filter is a little different from the rest: all provided values must be matched by a given data-point to satisfy the filter. Parameters:
values (list) – list of allowed string values.
geo-intersects – keeps only data-points with AoI within a given area. Parameters:
geometryLabel – label of the filtering geometry. For countries, the lower-case ISO identifier is used. Examples: cz, cn.
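The filter objects described above can be assembled with a few small helpers. This is a minimal Python sketch; the helper names are hypothetical, while the type, field and params keys follow the description above. Using aoi as the field of geo-intersects is an assumption taken from the example request in this document.

```python
import json


def value_list(field, values):
    """Build a value-list filter for the given field."""
    return {"type": "value-list", "field": field, "params": {"values": list(values)}}


def time_range(field, start=None, end=None):
    """Build a time-range filter; omit a bound to leave that side open."""
    params = {}
    if start is not None:
        params["from"] = start
    if end is not None:
        params["to"] = end
    return {"type": "time-range", "field": field, "params": params}


def value_range(field, minimum=None, maximum=None):
    """Build a value-range filter; omit a bound to leave that side open."""
    params = {}
    if minimum is not None:
        params["min"] = minimum
    if maximum is not None:
        params["max"] = maximum
    return {"type": "value-range", "field": field, "params": params}


def geo_intersects(geometry_label):
    """Build a geo-intersects filter (field name "aoi" is an assumption)."""
    return {
        "type": "geo-intersects",
        "field": "aoi",
        "params": {"geometryLabel": geometry_label},
    }


filters = [
    value_list("project", ["industrial_production"]),  # mandatory
    value_list("algorithm", ["ndmi_mean"]),            # mandatory
    time_range("startDatetime", start="2018-01-01 00:10:00"),
    value_range("cloudCover", maximum=0.05),
    geo_intersects("cz"),
]
print(json.dumps({"filters": filters}, indent=2))
```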
Warning
If no filtering geometry is chosen, or multiple filtering geometries are chosen, and the resulting AoIs have different newest versions, only those with the overall newest version are returned. This behaviour can be overridden by setting keepAllVersions to true or by filtering by version.
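The effect of this version post-filter can be illustrated on a handful of rows. The Python sketch below only mimics the documented behaviour on client-side data and is not the server implementation; the row layout and the aoiId values are made up for illustration.

```python
def apply_version_post_filter(rows, keep_all_versions=False):
    """Keep only rows with the overall newest version unless told otherwise."""
    if keep_all_versions or not rows:
        return list(rows)
    newest = max(row["version"] for row in rows)  # lexicographic comparison
    return [row for row in rows if row["version"] == newest]


rows = [
    {"aoiId": "aoi-1", "version": "160", "value": 0.31},
    {"aoiId": "aoi-1", "version": "162", "value": 0.33},
    {"aoiId": "aoi-2", "version": "160", "value": 0.52},
]

# aoi-2 has no data in version 162, so it disappears from the default result.
print(apply_version_post_filter(rows))
print(len(apply_version_post_filter(rows, keep_all_versions=True)))  # 3
```

In this example the AoI whose newest version is older than the overall newest version drops out completely, which is the situation the warning describes.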
Permission Packages
Each user can be given one or more permission packages. Each permission package allows the user to access a subset of data-points. A permission package is a list of mandatory filters (see above). The user is allowed to perform a query if it includes all filters from at least one of their permission packages. In other words, the query's filters must be at least as restrictive as those of one of the user's permission packages.
Note
Users need the regular datacube.get permission to have access to the API endpoint. Permission packages work on top of that.
All users are automatically granted a permission package with a value-list filter on the field project set to values ['test']. Project test contains test values. This implicit “free” permission package serves development purposes.
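The permission rule can be sketched in Python as a check of the query's filters against the user's permission packages. This is an illustration under simplifying assumptions, not the actual server logic: a value-list filter (other than on tag) counts as "at least as restrictive" when its values are a subset of the package's values, and every other filter must match exactly. The function names are hypothetical.

```python
def covers(query_filter, package_filter):
    """True if query_filter is at least as restrictive as package_filter."""
    same_kind = (query_filter["type"], query_filter["field"]) == (
        package_filter["type"],
        package_filter["field"],
    )
    if not same_kind:
        return False
    if package_filter["type"] == "value-list" and package_filter["field"] != "tag":
        allowed = set(package_filter["params"]["values"])
        return set(query_filter["params"]["values"]) <= allowed
    # Simplification: all other filters must be identical.
    return query_filter["params"] == package_filter["params"]


def is_allowed(query_filters, packages):
    """True if some package has every one of its filters covered by the query."""
    return any(
        all(any(covers(q, p) for q in query_filters) for p in package)
        for package in packages
    )


free_package = [
    {"type": "value-list", "field": "project", "params": {"values": ["test"]}}
]
query = [
    {"type": "value-list", "field": "project", "params": {"values": ["test"]}},
    {"type": "value-list", "field": "algorithm", "params": {"values": ["ndmi_mean"]}},
]
print(is_allowed(query, [free_package]))  # True
```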
Aggregation
It is possible to retrieve periodic aggregates of the data. In such a case, the original data-points are replaced by average values computed over multiple data-points falling into the same time-period. The aggregation preserves version, algorithm, project, aoiId and source, i.e. data-points with different values in one of the mentioned fields are treated as separate.
When aggregated data are requested, original start and end date-times are replaced with formatted dates of the time periods. The allowed aggregations are daily (%Y-%m-%d), weekly (%G-W%V, i.e. ISO 8601 week), monthly (%Y-%m) and yearly (%Y).
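The period labels and the averaging can be sketched in Python as follows. This is a client-side illustration, not the server implementation; it assumes that the period is derived from startDatetime, which this section does not state explicitly, and the row layout is made up for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

FORMATS = {"daily": "%Y-%m-%d", "monthly": "%Y-%m", "yearly": "%Y"}


def period_label(dt, aggregate):
    """Format a datetime as the label of its aggregation period."""
    if aggregate == "weekly":
        # Same result as the %G-W%V format, written with isocalendar()
        # so that it does not depend on platform strftime support.
        iso_year, iso_week, _ = dt.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    return dt.strftime(FORMATS[aggregate])


def aggregate_rows(rows, aggregate):
    """Average values per period, keeping the preserved fields separate."""
    groups = defaultdict(list)
    for row in rows:
        key = (
            row["version"], row["algorithm"], row["project"],
            row["aoiId"], row["source"],
            period_label(row["startDatetime"], aggregate),
        )
        groups[key].append(row["value"])
    return {key: mean(values) for key, values in groups.items()}


base = {"version": "160", "algorithm": "ndmi_mean", "project": "coal",
        "aoiId": "aoi-1", "source": "landsat_8"}
rows = [
    dict(base, startDatetime=datetime(2019, 1, 5), value=1.0),
    dict(base, startDatetime=datetime(2019, 1, 20), value=3.0),
]
print(aggregate_rows(rows, "monthly"))
# 3 January 2021 belongs to the last ISO week of 2020.
print(period_label(datetime(2021, 1, 3), "weekly"))  # 2020-W53
```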
Deduplication
Datacube output deduplicates data by default. Rows with the same project, algorithm, start_timestamp, end_timestamp, aoi_id, source and version are treated as duplicates.
Duplicate entries are ordered by first_seen (descending) and by ingest_timestamp (descending), and the first data-point is kept.
Warning
Be aware that a row with a higher first_seen value takes precedence over a row with a higher ingest_timestamp.
For aggregation this ensures that average values are calculated from unique values only. Setting the keepDuplicates attribute to true overrides this behaviour and returns all the rows.
Warning
Be aware that keeping duplicates, especially in aggregated queries, can significantly skew the results.
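The deduplication rule can be sketched in Python as follows. This is an illustration of the ordering described above, not the server implementation. It assumes that every row carries a first_seen value (the field is optional in Datacube) and that timestamps are strings in the YYYY-mm-dd HH:MM:SS form, which sort chronologically.

```python
KEY_FIELDS = ("project", "algorithm", "start_timestamp", "end_timestamp",
              "aoi_id", "source", "version")


def deduplicate(rows):
    """Keep one row per duplicate group, preferring the highest first_seen."""
    # Order by first_seen (descending), then ingest_timestamp (descending).
    ordered = sorted(
        rows,
        key=lambda row: (row["first_seen"], row["ingest_timestamp"]),
        reverse=True,
    )
    kept = {}
    for row in ordered:
        key = tuple(row[field] for field in KEY_FIELDS)
        kept.setdefault(key, row)  # the first row of each group wins
    return list(kept.values())


base = {"project": "coal", "algorithm": "ndmi_mean",
        "start_timestamp": "2019-01-01 00:00:00",
        "end_timestamp": "2019-01-01 00:00:00",
        "aoi_id": "aoi-1", "source": "landsat_8", "version": "160"}
rows = [
    dict(base, first_seen="2019-02-01 00:00:00",
         ingest_timestamp="2019-02-10 00:00:00", value=0.1),
    dict(base, first_seen="2019-02-02 00:00:00",
         ingest_timestamp="2019-02-03 00:00:00", value=0.2),
]
# The row with the higher first_seen is kept even though it was ingested earlier.
print(deduplicate(rows)[0]["value"])  # 0.2
```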
Postprocessing
Datacube API can provide a simple view on top of the query result. This option is available for the asynchronous endpoint only.
Postprocessing can be done by adding 'postprocess':'postprocess_name' to the request JSON.
The list of supported post-processing operations follows:
pivot – groups values by datetime, transposes aoi_ids into columns, and averages the values that were grouped.
columns – filters the resulting view to the given columns. The requested columns are provided via the columns property.
seasonal_decompose – adjusts the time series for seasonal effects. New columns are added to the resulting CSV: valueSeasonallyAdjusted and seasonalDecomposition. A resampling must be specified via the resample property (‘W’, ‘M’ or ‘Y’).
sd_hpfilter – adjusts the time series for seasonal effects and applies the Hodrick–Prescott filter. New columns are added to the result: all columns listed for seasonal_decompose plus trend and gap. A resampling must be specified via the resample property (‘M’ or ‘Y’).
abs_diff – replaces each value by its absolute difference from the previous value; the first row is lost in the process.
diff_log – replaces each value by the difference between its logarithm and the logarithm of the previous value; the first row is lost in the process. Can lead to an error if the time series contains zero or negative values.
unique_datetimes – removes data-points with a duplicate start or end datetime for each AoI, preferring data-points with a shorter time span. This modification is needed for SAR change data-points, where a problem arises when scenes are ingested out of order in Earth Engine.
yoy_percent_diff – replaces each value by its percentage difference from the value a year ago; the first year is lost in the process. Works only on continuous monthly series.
Warning
Important metadata will be lost when using postprocessing.
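Two of the operations above are simple enough to sketch in Python. This is an illustration of the arithmetic only, under stated assumptions: diff_log is assumed to use the natural logarithm, and yoy_percent_diff is assumed to compute (current - previous) / previous * 100 against the value 12 months earlier. The exact formulas used by Datacube are not given in this section.

```python
import math


def diff_log(values):
    """log(v[i]) - log(v[i-1]); raises ValueError on zero or negative values."""
    return [math.log(cur) - math.log(prev) for prev, cur in zip(values, values[1:])]


def yoy_percent_diff(monthly_values):
    """Percentage difference from the value 12 months earlier (monthly series)."""
    return [
        (cur - prev) / prev * 100.0
        for prev, cur in zip(monthly_values, monthly_values[12:])
    ]


print(diff_log([1.0, math.e]))             # one value, close to 1.0
print(yoy_percent_diff([100.0] * 12 + [110.0]))  # one value, close to 10.0
```

Both functions return a shorter series than they receive, which matches the note that the first row (or the first year) is lost.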
Needed Permissions: datacube.get
- POST /datacube/datapoints/get
- Request JSON Object:
filters (list) – filter on single project and single algorithm is required; list of filters of data. Only data-points matching all filters are returned.
aggregate (str) – optional; see Aggregation.
keepDuplicates (bool) – optional; see Deduplication.
showAoiPositions (bool) – optional; set to true to include AoI midpoint positions in the resulting CSV. Requires the datacube.get-aoi-positions permission.
showAoiTags (bool) – optional; set to true to include AoI tags, i.e. industry, type, sub_type, in the resulting CSV. Requires the datacube.get-aoi-tags permission.
keepAllVersions (bool) – optional; by default, Datacube results are filtered to contain only the latest version that was matched by the query. Setting keepAllVersions to true removes this post-filter.
Example request:
{
  "filters": [
    {
      "type": "time-range",
      "field": "startDatetime",
      "params": { "from": "2018-01-01 00:10:00" }
    },
    {
      "type": "value-range",
      "field": "cloudCover",
      "params": { "max": 0.05 }
    },
    {
      "type": "value-list",
      "field": "project",
      "params": { "values": ["industrial_production"] }
    },
    {
      "type": "value-list",
      "field": "algorithm",
      "params": { "values": ["ndmi_mean"] }
    },
    {
      "type": "value-list",
      "field": "tag",
      "params": { "values": ["mining", "mine", "ore"] }
    },
    {
      "type": "geo-intersects",
      "field": "aoi",
      "params": { "geometryLabel": "cz" }
    }
  ]
}
Example response:
{
  "csvLink": "https://storage.googleapis.com/spaceknow-devel-datacube.appspot.com/cache/4a405b16348ae606.csv",
  "totalRows": 2648
}
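A request to this endpoint could be issued from Python roughly as follows, using only the standard library. The base URL and the Authorization header are placeholders: neither the API host nor the authentication scheme is specified in this section, so both are assumptions that must be replaced with real values.

```python
import json
import urllib.request

# Hypothetical placeholder; the real API host is not given in this section.
BASE_URL = "https://api.example.com"


def build_request(path, payload, token=None):
    """Create a POST request with a JSON body for a Datacube endpoint."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if token is not None:
        # Assumed authorization scheme, shown only for illustration.
        request.add_header("Authorization", "Bearer " + token)
    return request


def get_datapoints_csv(filters, token=None):
    """Run the query, then download the CSV that csvLink points to."""
    request = build_request("/datacube/datapoints/get", {"filters": filters}, token)
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    with urllib.request.urlopen(result["csvLink"]) as csv_response:
        return result["totalRows"], csv_response.read().decode("utf-8")


example = build_request(
    "/datacube/datapoints/get",
    {"filters": [{"type": "value-list", "field": "project",
                  "params": {"values": ["test"]}}]},
)
print(example.get_method(), example.full_url)
```

The sketch only constructs a request at import time; get_datapoints_csv performs the network calls when invoked.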
Get Large Amount of Data-points
Some queries to Datacube API can match more data-points than can be queried and returned within the time span of one HTTP request. Pass the same payload as to /datacube/datapoints/get to the asynchronous endpoint to initiate the query.
- POST /datacube/datapoints/get/initiate
Example request:
See the synchronous endpoint.
Example response:
See pipeline initiation.
Needed Permissions: datacube.get
The Tasking API should be used to check whether the result is ready.
Once the result is ready, retrieve it from the retrieve endpoint.
- POST /datacube/datapoints/get/retrieve
Example request:
See pipeline retrieve.
Example response:
{
  "csvLink": "https://storage.googleapis.com/spaceknow-devel-datacube.appspot.com/cache/4a405b16348ae606.csv",
  "totalRows": 2648
}
Needed Permissions: datacube.get
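The initiate, poll and retrieve sequence can be sketched in Python as a small driver. Because the pipeline responses and the Tasking API are documented elsewhere, the sketch does not assume any field names: the three steps are passed in as caller-supplied functions, and all names below are hypothetical.

```python
import time


def run_async_query(initiate, is_ready, retrieve,
                    poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Initiate a query, poll until it is ready, then retrieve the result.

    initiate() -> handle      wraps POST /datacube/datapoints/get/initiate
    is_ready(handle) -> bool  wraps the Tasking API status check
    retrieve(handle) -> dict  wraps POST /datacube/datapoints/get/retrieve
    """
    handle = initiate()
    waited = 0.0
    while not is_ready(handle):
        if waited >= timeout:
            raise TimeoutError("Datacube result was not ready in time")
        sleep(poll_interval)
        waited += poll_interval
    return retrieve(handle)


# Usage with stand-in functions instead of real HTTP calls.
answers = iter([False, False, True])
result = run_async_query(
    initiate=lambda: "handle-1",
    is_ready=lambda handle: next(answers),
    retrieve=lambda handle: {"csvLink": "...", "totalRows": 0},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result["totalRows"])  # 0
```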
List Available AoIs
This endpoint lists all AoIs available for filtering of data-points in Datacube. You may be limited to a subset of these AoIs; please contact SpaceKnow representatives to get access to more.
Needed Permissions: datacube.get
- POST /datacube/aois/get
Example request:
{}
Example response:
[
  { "label": "cz", "name": "Czech Republic" },
  { "label": "cn", "name": "China" }
]
List Catalogue of Available Data-points
This endpoint lists the catalogue of data-points available in Datacube. Different algorithms are available for different regions and date ranges, and their inputs come from different sources.
Needed Permissions: datacube.get
- POST /datacube/catalogue/get
- Request JSON Object:
version (str) – optional; search for specific version.
project (str) – optional; search for specific project.
source (str) – optional; search for specific source.
algorithm (str) – optional; search for specific algorithm.
aoiName (str) – optional; search for specific aoi name.
aoiLabel (str) – optional; search for specific aoi label.
Example request:
{}
Example response:
{
  "ndmi_mean": [
    {
      "version": "160",
      "aoi_label": "cz",
      "aoi_name": "Czech Republic",
      "source": "LANDSAT/LC08/C01/T1",
      "project": "coal",
      "last_update": "2019-02-27 17:39:56",
      "date_ranges": [{ "from": "2010-01-01", "to": "2019-02-28" }],
      "metadata": null
    },
    {
      "version": "160",
      "aoi_label": "cz",
      "aoi_name": "Czech Republic",
      "source": "COPERNICUS/S2",
      "project": "steel",
      "last_update": "2019-02-27 17:39:56",
      "date_ranges": [
        { "from": "2010-01-01", "to": "2016-10-12" },
        { "from": "2017-05-01", "to": "2019-02-28" }
      ],
      "metadata": null
    }
  ],
  "radiance_mean": [
    {
      "version": "162",
      "aoi_label": "de",
      "aoi_name": "Germany",
      "source": "VIIRS/Monthly",
      "project": "night_lights",
      "last_update": "2019-02-03 12:19:01",
      "date_ranges": [{ "from": "2018-01-01", "to": "2019-02-28" }],
      "metadata": { "dataFrequency": "monthly" }
    }
  ]
}
Example request:
{ "aoiLabel": "cz", "project": "steel" }
Example response:
{
  "ndmi_mean": [
    {
      "version": "160",
      "aoi_label": "cz",
      "aoi_name": "Czech Republic",
      "source": "COPERNICUS/S2",
      "project": "steel",
      "last_update": "2019-02-27 17:39:56",
      "date_ranges": [
        { "from": "2010-01-01", "to": "2016-10-12" },
        { "from": "2017-05-01", "to": "2019-02-28" }
      ],
      "metadata": null
    }
  ]
}
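A catalogue response can be used to check whether data exist for a given day before querying. The Python sketch below works on the response structure shown above; the helper name is hypothetical, and treating the "to" bound as inclusive is an assumption.

```python
from datetime import date


def is_covered(entry, day):
    """True if day falls into one of the entry's date ranges (bounds inclusive)."""
    return any(
        date.fromisoformat(r["from"]) <= day <= date.fromisoformat(r["to"])
        for r in entry["date_ranges"]
    )


catalogue = {
    "ndmi_mean": [
        {
            "version": "160",
            "aoi_label": "cz",
            "source": "COPERNICUS/S2",
            "project": "steel",
            "date_ranges": [
                {"from": "2010-01-01", "to": "2016-10-12"},
                {"from": "2017-05-01", "to": "2019-02-28"},
            ],
        }
    ]
}

entry = catalogue["ndmi_mean"][0]
print(is_covered(entry, date(2018, 1, 1)))  # True
print(is_covered(entry, date(2017, 1, 1)))  # False, falls into the gap
```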
Get Product
This API endpoint returns a CSV containing the requested SpaceKnow Datacube product.
- POST /datacube/product/get
- Request JSON Object:
productId (str) – Product ID of the requested product.
pitDt (str) – optional; point-in-time date in the format ‘YYYY-MM-DD’. The endpoint returns the most recent product data calculated with a datetime less than or equal to the point-in-time date.
Example request:
{ "productId": "bai_monthly_jp", "pitDt": "2020-10-10" }
Example response:
value_dt,delivery_dt,col A,col B
2020-09-17 08:41:17 UTC,2020-09-18 08:41:44 UTC,0.3,0.2
2020-09-17 08:43:24 UTC,2020-09-18 08:43:25 UTC,,0.3
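The product CSV can be parsed with the Python standard library. The sketch below is based only on the example response above: it assumes that value_dt and delivery_dt always use the "YYYY-MM-DD HH:MM:SS UTC" form, that all other columns are numeric, and that an empty cell means a missing value.

```python
import csv
import io
from datetime import datetime, timezone

CSV_TEXT = """value_dt,delivery_dt,col A,col B
2020-09-17 08:41:17 UTC,2020-09-18 08:41:44 UTC,0.3,0.2
2020-09-17 08:43:24 UTC,2020-09-18 08:43:25 UTC,,0.3
"""

DATETIME_COLUMNS = ("value_dt", "delivery_dt")


def parse_product_csv(text):
    """Parse a product CSV into dicts with datetimes and optional floats."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for column, cell in raw.items():
            if column in DATETIME_COLUMNS:
                parsed = datetime.strptime(cell, "%Y-%m-%d %H:%M:%S UTC")
                row[column] = parsed.replace(tzinfo=timezone.utc)
            else:
                # An empty cell is treated as a missing value.
                row[column] = float(cell) if cell else None
        rows.append(row)
    return rows


rows = parse_product_csv(CSV_TEXT)
print(rows[1]["col A"], rows[1]["col B"])  # None 0.3
```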
Product Permission Packages
Each user can be given one or more product permission packages. A product permission package is a list of product IDs which the user is allowed to retrieve. When a user has multiple product permission packages covering the same product ID, the data is trimmed using the user’s earliest date cut-off.
Note
Users need the regular datacube.get permission to have access to the API endpoint. Product permission packages work on top of that.
Get Product Catalogue
This endpoint lists SpaceKnow products available in Datacube or gets a particular product by its ID. The downloadable flag signals whether the user is able to download the listed product.
Needed Permissions: datacube.get
- POST /datacube/product-catalogue/get
Example requests:
{}
{"productId": "SK ABAI AR"}
Example response:
[
  {
    "productId": "SK ABAI AR",
    "description": "Activity detected on reflected radiation over industrial locations. Seasonally adjusted.",
    "metadata": {
      "country": "Argentina",
      "frequency": "monthly",
      "imagery": "low-res optical imagery"
    },
    "downloadable": true,
    "active": true,
    "salesInfo": {
      "column_name_request1": {
        "frequency": "monthly",
        "region": "argentina",
        "rolling": "1d",
        "topic": "Transport",
        "typeOfIndex": "ABAI",
        "units": "%",
        "valueType": "activity"
      },
      "column_name_request2": {
        "frequency": "monthly",
        "region": "argentina",
        "rolling": "1d",
        "topic": "Transport",
        "typeOfIndex": "ABAI",
        "units": "%",
        "valueType": "activity"
      }
    }
  }
]
Get Product Sales Info
This endpoint transforms the given product ID (more specifically, its request definition IDs) into a CSV with ID abbreviations translated into human-readable column names. It serves as a detailed data dictionary for our products.
Needed Permissions: datacube.get
- POST /datacube/product/sales-info/get
- Request JSON Object:
productId (str) – Product ID of the requested product.
Example request:
{"productId": "SK_TRA_DIS_CON_LOG_ASI_US_D"}
Example response:
column_id,topic,type_of_index,region,frequency,rolling,value_type,units
SK_TRA_DIS_CON_LOG_ASI_US_D_30d_normal_activity_level,Transport,ASI - normal,United States,weekly,30d,activity,%
SK_TRA_DIS_CON_LOG_ASI_US_D_30d_high_activity_level,Transport,ASI - high,United States,weekly,30d,activity,%
SK_TRA_DIS_CON_LOG_ASI_US_D_30d_low_activity_level,Transport,ASI - low,United States,weekly,30d,activity,%