eumap.datasets.eo.utils.STACGenerator

class STACGenerator(gsheet, url_date_format='%Y.%m.%d', cog_level=7, thumb_overwrite=False, asset_id_delim='_', asset_id_fields=[1, 3, 5], catalogs=None, verbose=False)

Bases: object

Generator that reads raster layer metadata (e.g. name, description, cloud-optimized GeoTIFF (COG) URL) from a remote Google Spreadsheet [1] and produces multiple SpatioTemporal Asset Catalog (STAC) instances in a local folder and/or a remote S3 bucket [2,3].

The COG files need to be publicly accessible over HTTP and compatible with the Geo-harmonizer file naming convention [4]. A thumbnail is produced for every COG according to the color scheme defined by the columns thumb_cmap, thumb_vmin and thumb_vmax.
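As a rough illustration of how a value range such as thumb_vmin / thumb_vmax is commonly applied before a colormap lookup, the sketch below linearly rescales a pixel value to the interval [0, 1] and clips anything outside the range. This mirrors the usual vmin/vmax convention of plotting libraries; it is an assumption about the behaviour, not the code used by STACGenerator, and the function name `normalize_pixel` is hypothetical.

```python
def normalize_pixel(value, vmin, vmax):
    """Rescale `value` linearly from [vmin, vmax] to [0, 1], clipping outliers.

    Hypothetical helper: illustrates the common vmin/vmax convention only.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    scaled = (value - vmin) / (vmax - vmin)
    # Values below vmin map to 0.0 and values above vmax map to 1.0.
    return min(1.0, max(0.0, scaled))


# A colormap (thumb_cmap) would then translate the 0..1 position into a color.
position = normalize_pixel(50, vmin=0, vmax=100)
```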

Parameters
  • gsheet (GoogleSheet) – Object representation of a Google Spreadsheet containing the metadata.

  • url_date_format – Date format expected in the COG URL (strftime).

  • cog_level – COG overview level used to generate the thumbnail.

  • thumb_overwrite – Overwrite the thumbnail files if they already exist.

  • asset_id_delim – Field delimiter used to split the COG filename [4].

  • asset_id_fields – Fields retrieved from COG filename used to compose the asset id.

  • catalogs – Dictionary (catalog_id as key and pystac.catalog.Catalog as value) of pre-existing catalogs to be updated.

  • verbose – Use True to print the progress of all steps.
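The interplay of url_date_format, asset_id_delim and asset_id_fields can be sketched with plain string handling. The example below is a minimal sketch under stated assumptions, not the library's implementation: the helper names, the example URL, the zero-based field indexing and the `..` separator between the start and end dates are all hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def compose_asset_id(cog_url, delim='_', fields=(1, 3, 5)):
    """Split the COG filename on `delim` and join the selected fields.

    Assumption: `fields` holds zero-based positions in the split filename.
    """
    stem = PurePosixPath(urlparse(cog_url).path).stem  # drops the '.tif' suffix
    parts = stem.split(delim)
    return delim.join(parts[i] for i in fields)


def parse_date_range(field, date_format='%Y.%m.%d', separator='..'):
    """Parse a 'start..end' filename field into two datetime objects.

    Assumption: both dates are joined by `separator` inside one field.
    """
    start, end = field.split(separator)
    return datetime.strptime(start, date_format), datetime.strptime(end, date_format)


# Hypothetical URL, loosely shaped like a '_'-delimited layer filename.
url = ('https://example.com/layers/'
       'lcv_ndvi_landsat_p50_30m_0..0cm_2019.01.01..2019.12.31_eumap_epsg3035_v1.0.tif')

asset_id = compose_asset_id(url)
start, end = parse_date_range('2019.01.01..2019.12.31')
```

Selecting more fields, for example `fields=(1, 2, 3, 5)`, yields a longer and more specific asset id, which helps when several layers share the same variable name.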

References

[1] ODSE Raster layer metadata example

[2] ODSE STAC Catalog

[3] ODSE STAC Browser

[4] Geo-harmonizer file naming convention

Methods

save_all

Save the STAC instance to a local folder.

save_and_publish_all

Save the STAC instance to a local folder and upload all the files to an S3 bucket.

save_all(output_dir='stac', catalog_type=CatalogType.SELF_CONTAINED, thumb_base_url=None)

Save the STAC instance to a local folder.

Parameters
  • output_dir (str) – Destination folder.

  • catalog_type – Normalization strategy defined by pystac.CatalogType.

  • thumb_base_url – Base URL for the thumbnail files. Useful when the thumbnail files are hosted in a different location (e.g. an S3 bucket) than the STAC files.

Examples

>>> from eumap.misc import GoogleSheet
>>> from eumap.datasets.eo import STACGenerator
>>> 
>>> # Generate your key following the instructions in https://docs.gspread.org/en/latest/oauth2.html
>>> key_file = '<GDRIVE_KEY>'
>>> # Publicly accessible Google Spreadsheet (Anyone on the internet with this link can view)
>>> url = 'https://docs.google.com/spreadsheets/d/10tAhEpZ7TYPD0UWhrI0LHcuIzGZNt5AgSjx2Bu-FciU'
>>> 
>>> gsheet = GoogleSheet(key_file, url, verbose=True)
>>> stac_generator = STACGenerator(gsheet, asset_id_fields=[1,2,3,5], verbose=True)
>>> stac_generator.save_all(output_dir='stac_odse', thumb_base_url='https://s3.eu-central-1.wasabisys.com/stac')

save_and_publish_all(s3_host, s3_access_key, s3_access_secret, s3_bucket_name, s3_prefix='', output_dir='stac', catalog_type=CatalogType.SELF_CONTAINED)

Save the STAC instance to a local folder and upload all the files to an S3 bucket.

Parameters
  • s3_host (str) – Hostname of an S3 service.

  • s3_access_key (str) – Access key (aka user ID) of the S3 service.

  • s3_access_secret (str) – Secret key (aka password) of the S3 service.

  • s3_bucket_name (str) – Name of the bucket.

  • s3_prefix (str) – Object name prefix (URL part) in the bucket.

  • output_dir (str) – Destination folder.

  • catalog_type – Normalization strategy defined by pystac.CatalogType.

Examples

>>> from eumap.misc import GoogleSheet
>>> from eumap.datasets.eo import STACGenerator
>>> 
>>> s3_host = "<S3_HOST>"
>>> s3_access_key = "<s3_access_key>"
>>> s3_access_secret = "<s3_access_secret>"
>>> s3_bucket_name = 'stac'
>>>
>>> # Generate your key following the instructions in https://docs.gspread.org/en/latest/oauth2.html
>>> key_file = '<GDRIVE_KEY>'
>>> # Publicly accessible Google Spreadsheet (Anyone on the internet with this link can view)
>>> url = 'https://docs.google.com/spreadsheets/d/10tAhEpZ7TYPD0UWhrI0LHcuIzGZNt5AgSjx2Bu-FciU'
>>> 
>>> gsheet = GoogleSheet(key_file, url)
>>> stac_generator = STACGenerator(gsheet, verbose=True)
>>> stac_generator.save_and_publish_all(s3_host, s3_access_key, s3_access_secret, s3_bucket_name)
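
To keep the S3 credentials out of scripts and notebooks, they can be read from environment variables and passed on as keyword arguments. The sketch below is one possible approach, not part of eumap: the helper `s3_settings_from_env` and the environment variable names are hypothetical, while the returned keys match the parameter names of save_and_publish_all documented above.

```python
import os

# Hypothetical environment variable names mapped to the documented parameters.
ENV_TO_PARAM = {
    'STAC_S3_HOST': 's3_host',
    'STAC_S3_ACCESS_KEY': 's3_access_key',
    'STAC_S3_ACCESS_SECRET': 's3_access_secret',
    'STAC_S3_BUCKET_NAME': 's3_bucket_name',
}


def s3_settings_from_env(environ=None):
    """Collect the S3 settings from the environment, failing early if any is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_PARAM if not environ.get(name)]
    if missing:
        raise KeyError('Missing environment variables: ' + ', '.join(missing))
    return {param: environ[name] for name, param in ENV_TO_PARAM.items()}


# Usage, assuming `stac_generator` was created as in the example above:
# settings = s3_settings_from_env()
# stac_generator.save_and_publish_all(**settings)
```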