.. Copyright (C) 2016-2024 SpaceKnow, Inc.

.. _ski:

SpaceKnow Image (SKI)
=====================

SKI is SpaceKnow's proprietary binary file format used for aerial and satellite
images. Most visual data inside the SK platform is handled in this file format.

SKI supports multiple bands (i.e. channels). For example, each band can
represent a single part of the electromagnetic spectrum (e.g. near-infrared). A
band is a 2D matrix of values. Each band in a single SKI can have a different
resolution (number of rows and columns) and bit depth. 8, 16, 32 and 64 bits
per pixel per band are supported.

.. image:: /attachments/ski/2_bands_different_bit_depth.svg

Python API
----------

Pythonic API for SKI access is available in the :mod:`sk.tools.ski.handle`
module, which replaces a deprecated predecessor. Its main classes are:

SkiHandle
    Low-level SKI reader/writer. Mostly an implementation detail, externally
    used only in special cases.

Ski
    Abstract base class for geo-referenced SKIs. Utility functions for
    operations like scaling or reprojecting work with this type.

ImagerySki(GeoReferencedSki)
    Representation of a geo-referenced imagery SKI with rich metadata like
    satellite name or cloud cover.

AnalysisSki(GeoReferencedSki)
    Representation of an SKI with less metadata, limited to what is crucial
    for geo-referencing. Typically used for analysis results.

MaskedBand
    One band (channel) of an SKI with mask and convenience methods.

MaskedBandWithMeta(MaskedBand)
    Extension of MaskedBand with geo-referencing metadata, with optional rich
    metadata.

SKI Handle Classes
^^^^^^^^^^^^^^^^^^

Both :class:`SkiHandle` and :class:`Ski` (subclasses) are essentially wrappers
around their :attr:`band_map` (dict) attribute. :attr:`band_map` is a
:obj:`band_id` (str) => :class:`MaskedBand` or :class:`MaskedBandWithMeta`
mapping. Manipulating bands is therefore easy:

.. code-block:: python

    >>> from sk.tools.ski.handle import SkiHandle, MaskedBand
    >>> import numpy as np
    >>> ski_handle = SkiHandle.load('doc/files/old.ski')
    >>> # get IDs of all bands in ski:
    >>> list(ski_handle.band_map.keys())
    ['blue', 'green', 'near-infrared', 'red']
    >>> # add new "near-ir2" band created out of NumPy arrays:
    >>> band_np = np.zeros(ski_handle.band_map['blue'].data.shape)
    >>> mask_np = np.zeros(ski_handle.band_map['blue'].data.shape, dtype=np.uint8)
    >>> ski_handle.band_map['near-ir2'] = MaskedBand(band_np, mask_np)
    >>> list(ski_handle.band_map.keys())
    ['blue', 'green', 'near-infrared', 'red', 'near-ir2']
    >>> # remove "near-ir2" band:
    >>> del ski_handle.band_map['near-ir2']
    >>> list(ski_handle.band_map.keys())
    ['blue', 'green', 'near-infrared', 'red']
    >>> # duplicate "red" band as "blue" (referencing same data):
    >>> ski_handle.band_map['blue'] = ski_handle.band_map['red']

Loading from and saving to a path or file-like object is done using
:func:`load` and :func:`save`. Legacy SKIs with non-conforming metadata can be
loaded by breaking the load into multiple steps:

.. code-block:: python

    >>> from sk.tools.ski.handle import SkiHandle, ImagerySki
    >>> # load SKI which doesn't have valid CRS EPSG filled in:
    >>> raw_ski = SkiHandle.load('doc/files/old.ski')
    >>> raw_ski.meta['crsEpsg'] = 12345
    >>> ski = ImagerySki.from_ski_handle(raw_ski)

:class:`Ski` provides the convenience multiple-band access methods
:func:`get_pil_like_data` and :func:`get_mask_intersection`.

:class:`AnalysisSki` provides the convenience method
:func:`from_reference_band`, which is useful for creating analysis results out
of reference imagery or another analysis result SKI/band.

Band Classes
^^^^^^^^^^^^

Band classes :class:`MaskedBand` and :class:`MaskedBandWithMeta` are containers
for NumPy band data and mask arrays. The mask is a bit field with multiple
possible values for each pixel.
The classes, however, provide convenience computed boolean mask properties
:attr:`valid_mask` and :attr:`requested_mask`, with setters (modifying these
computed arrays in place has no effect).

.. code-block:: python

    >>> import numpy as np
    >>> from sk.tools.ski.handle import MaskedBand
    >>> data = np.array([[1, 2], [3, 4]])
    >>> valid = np.array([[True, True], [False, False]])
    >>> requested = np.array([[True, False], [False, True]])
    >>> masked_band = MaskedBand.from_data_valid_requested(data, valid, requested)
    >>> masked_band.data
    array([[1, 2],
           [3, 4]])
    >>> masked_band.mask
    array([[3, 1],
           [0, 2]], dtype=uint8)
    >>> masked_band.valid_mask
    array([[ True,  True],
           [False, False]])
    >>> masked_band.requested_mask
    array([[ True, False],
           [False,  True]])
    >>> # data and mask can be modified in place:
    >>> masked_band.data[0, 0] = 5
    >>> masked_band.data
    array([[5, 2],
           [3, 4]])
    >>> masked_band.mask[0, 0] = 2
    >>> masked_band.mask
    array([[2, 1],
           [0, 2]], dtype=uint8)
    >>> # Beware, changing computed boolean properties in place has no effect
    >>> masked_band.valid_mask[0, 0] = True  # this is an error, no effect
    >>> masked_band.valid_mask[0, 0]
    False

.. _ski.scaling:

Scaling
^^^^^^^

Rescale an SKI, keeping geo-referencing metadata correct:

.. code-block:: python

    from sk.tools.image.scaling import scale_ski, scale_ski_to_shape

    # create a new georeferenced SKI such that each band has a resolution of
    # approximately 3 meters; resulting band shapes might differ
    new_ski = scale_ski(old_ski, target_resolution=3)

    # or specify the desired shape of each band after scaling
    new_ski = scale_ski_to_shape(old_ski, target_shape=(512, 512))

.. _ski.reprojecting:

Reprojecting
^^^^^^^^^^^^

Reproject an SKI to a different projection or change its origin, correctly
updating geo-referencing metadata along the way:

.. code-block:: python

    from sk.tools.image.reprojection import reproject_ski

    # reproject SKI to Web Mercator projection
    new_ski = reproject_ski(ski, 3857, target_origin, target_pixel_size,
                            (512, 512), pad_width)

Note that this function can be used to crop an SKI whose bands do not have
equal shape.

.. _ski.cropping:

Cropping
^^^^^^^^

The following example illustrates how to crop an SKI given the row, column,
height and width of the desired cropped SKI:

.. code-block:: python

    from sk.tools.image.cropping import crop_ski

    # crop a (256, 256) region starting at row 512 and column 512
    new_ski = crop_ski(ski, 512, 512, 256, 256)

Note that the input SKI needs to have bands with the same shape, see
:ref:`scaling <ski.scaling>`.

Binary On-Disk Representation
-----------------------------

An SKI is a gzipped tar archive containing band files and info and meta JSON
files.

Bands
^^^^^

Bands are stored in separate files named ``xxxxx.skb``, where ``xxxxx`` is the
index of the band (e.g. ``00000.skb``).

The first two bytes of the file are an unsigned integer whose value maps to a
data type according to the following table:

+-------+-----------------+
| value | dtype           |
+=======+=================+
| 2     | binarized       |
+-------+-----------------+
| 8     | uint8           |
+-------+-----------------+
| 9     | int8            |
+-------+-----------------+
| 16    | uint16          |
+-------+-----------------+
| 17    | int16           |
+-------+-----------------+
| 32    | uint32          |
+-------+-----------------+
| 33    | int32           |
+-------+-----------------+
| 34    | float32         |
+-------+-----------------+
| 64    | uint64          |
+-------+-----------------+
| 65    | int64           |
+-------+-----------------+
| 66    | float64         |
+-------+-----------------+
| 67    | stretched float |
+-------+-----------------+

The ``binarized`` dtype represents ``uint8 np.dtype`` where all values are
either 0 or 1. ``stretched float`` is saved as ``uint16 np.dtype``, but when
loaded it is represented by ``float32 np.dtype``. The ``float64`` dtype is now
deprecated, but legacy SKIs that use it can still be loaded.

The next eight bytes are two little-endian 32-bit floats.
They represent the ``value range``, which is used only for ``stretched float``;
for all other dtypes it is set to ``(0, 0)``. For ``stretched float``, the
``value range`` is the interval that the stored uint16 values are stretched
into. For most current use cases it is set to ``(0, 1)``.

The next eight bytes hold the number of columns followed by the number of rows,
each as a four-byte little-endian integer.

The rest of the file is row-major ordered data of the band. Pixel values can be
reconstructed with the following formula:
:math:`p_{r, c} = (p_{r - 1, c} + b_{r, c}) \mod 2^k`, where :math:`p_{r, c}`
is the pixel value at the rth row and cth column, :math:`b_{r, c}` is the value
from the stored matrix at the rth row and cth column and :math:`k` is the bit
depth of the band (so :math:`2^k` is the maximum value + 1). The row with index
-1 is defined as a row full of 0s; therefore, the encoded values of the first
row are equal to the real pixel values. When the binary is constructed, the
inverse formula is used:
:math:`b_{r, c} = (p_{r, c} - p_{r - 1, c}) \mod 2^k`.

The above method of storing pixel values is used only for bands whose
underlying data type is an integer, including stretched float; the only
exception is the binarized dtype. For bands with a floating point underlying
data type and for the binarized dtype, pixel values are stored directly.

The following is a hexadecimal example of a band containing ``uint8`` pixels,
with value range (0, 0), one column and two rows. The value in the first row is
250 and the value in the second row is 200. The first row is stored as is
(``FA``); the second row is stored as the difference
:math:`(200 - 250) \mod 2^8 = 206` (``CE``).

``08 00 00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 FA CE``

Info
^^^^

Every SKI contains an ``info.json`` file, which is UTF-8 encoded, JSON
serialized data with information about the SKI. It contains a list of band
names (a band can have multiple names), the SKI version and the SKI type
(imagery, analysis).

The info file has this format:

.. code-block:: javascript

    {
        "bands": [
            {
                "names": ["{band-name}"]
            }
        ],
        "version": "{SKI version}",
        "skiType": "{type of SKI}"
    }

Example of an RGB image:

.. code-block:: javascript

    {
        "bands": [
            {
                "names": ["red"]
            },
            {
                "names": ["green"]
            },
            {
                "names": ["blue"]
            }
        ],
        "version": "200",
        "skiType": "imagery"
    }

.. _ski.meta:

Meta
^^^^

An optional file ``meta.json`` may be present in the SKI archive. This file
contains arbitrary JSON data set by the creator of the SKI. The file is UTF-8
encoded, JSON serialized data.

Meta always contains :ref:`scene metadata ` in SKIs produced by Ragnar API.
The geo-referencing subset of the metadata (i.e. EPSG code, pixel size and CRS
origin) is usually present in algorithm output (i.e. mask) SKIs.

Auxiliary Files
^^^^^^^^^^^^^^^

An arbitrary set of auxiliary files may be present in the ``aux/`` sub-folder
of an SKI archive. This is a universal way to transfer additional data from SKI
producers to consumers. No assumptions are made about the content of these
files; they can be, for example, JSON, XML or binary files.

.. _ski.masks:

SKI Band Masks
--------------

Each band in an SKI has a corresponding data mask stored in a separate file
named ``__MASK__{band_name}__`` (e.g. ``__MASK__red__``).

The first two bytes of the file are an unsigned integer of value 3, which
states that the file is a mask.

The next eight bytes are two little-endian 32-bit floats. They represent the
``value range``, which is set to ``(0, 0)``.

The next eight bytes hold the number of columns followed by the number of rows,
each as a four-byte little-endian integer, as in band files.

The rest of the file is row-major ordered 8-bit unsigned integers (uint8) where
each bit has a different meaning.

Bit 0b00000001
^^^^^^^^^^^^^^

Value “1” corresponds to a valid pixel; value “0” indicates a blackfill, lost,
suspect or corrupt pixel. In the context of algorithmic results, value “0” is
set for pixels where the algorithm couldn't perform adequately.

Bit 0b00000010
^^^^^^^^^^^^^^

Value “1” corresponds to a pixel inside any area-based geometry from the
requested extent.

Bit 0b00000100
^^^^^^^^^^^^^^

Value “1” corresponds to a lost, suspect or otherwise corrupt pixel.
Value “1” implies pixel invalidity (least significant bit must be “0”). This bit is available only in Planet imagery and is always set to “0” for other providers.
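
The three mask bits described above can be separated with plain NumPy bitwise
operations. The following is only a minimal sketch: the constant names and the
``decode_mask`` helper are hypothetical and are not part of ``sk.tools``. It
assumes the mask is already available as a ``uint8`` array, such as the
:attr:`mask` attribute of a :class:`MaskedBand`.

.. code-block:: python

    import numpy as np

    # Bit values as documented above; the names are illustrative assumptions.
    VALID_BIT = 0b00000001      # pixel is valid
    REQUESTED_BIT = 0b00000010  # pixel lies inside the requested extent
    CORRUPT_BIT = 0b00000100    # lost, suspect or corrupt pixel


    def decode_mask(mask):
        """Split a uint8 mask array into one boolean array per documented bit."""
        mask = np.asarray(mask, dtype=np.uint8)
        return {
            "valid": (mask & VALID_BIT) != 0,
            "requested": (mask & REQUESTED_BIT) != 0,
            "corrupt": (mask & CORRUPT_BIT) != 0,
        }


    # Same mask values as in the MaskedBand example in the Band Classes section.
    decoded = decode_mask([[3, 1], [0, 2]])
    print(decoded["valid"].tolist())      # [[True, True], [False, False]]
    print(decoded["requested"].tolist())  # [[True, False], [False, True]]

The boolean arrays obtained this way match the :attr:`valid_mask` and
:attr:`requested_mask` values shown for the same mask in the Band Classes
example.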