eumap.datasets.eo.utils.STACGenerator
- class STACGenerator(gsheet, url_date_format='%Y.%m.%d', cog_level=7, thumb_overwrite=False, asset_id_delim='_', asset_id_fields=[1, 3, 5], catalogs=None, verbose=False)
Bases: object
Generator that reads a remote Google Spreadsheet [1] containing metadata for several raster layers (e.g. name, description, cloud-optimized GeoTIFF URL) and produces multiple SpatioTemporal Asset Catalog (STAC) instances in a local folder and/or a remote S3 bucket [2,3].
The COG files need to be publicly accessible over HTTP and compatible with the Geo-harmonizer file naming convention [4]. A thumbnail is produced for every COG according to the color scheme defined by the columns thumb_cmap, thumb_vmin and thumb_vmax.
- Parameters
gsheet (GoogleSheet) – Object representation of a Google Spreadsheet containing the metadata.
url_date_format – Date format expected in the COG URL (strftime).
cog_level – COG overview level used to generate the thumbnail.
thumb_overwrite – Overwrite the thumbnail files if they already exist.
asset_id_delim – Field delimiter used to split the COG filename [4].
asset_id_fields – Fields retrieved from COG filename used to compose the asset id.
catalogs – Dictionary (catalog_id as key and pystac.catalog.Catalog as value) used to update pre-existing catalogs.
verbose – Use True to print the progress of all steps.
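The asset_id_delim, asset_id_fields and url_date_format parameters all describe how information is read from a COG filename. The following is a minimal sketch of that idea, not the library's implementation. It assumes that asset_id_fields holds zero-based positions in the split filename and that a date range is written as two dates separated by "..". The helper names and the example URL are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def compose_asset_id(cog_url, delim="_", fields=(1, 3, 5)):
    """Build an asset id from selected fields of the COG filename.

    Assumption: `fields` holds zero-based positions in the split filename.
    """
    # Filename without the directory part and without the final extension
    stem = PurePosixPath(urlparse(cog_url).path).stem
    parts = stem.split(delim)
    return delim.join(parts[i] for i in fields)


def parse_date_range(token, date_format="%Y.%m.%d"):
    """Parse a 'start..end' token (assumed layout) into datetime objects."""
    return [datetime.strptime(d, date_format) for d in token.split("..")]


# Hypothetical COG URL, used only for illustration
url = (
    "https://example.org/cogs/"
    "lcv_landcover.hcl_lucas.corine.rf_p_30m_0..0cm_"
    "2019.01.01..2019.12.31_eumap_epsg3035_v0.1.tif"
)

asset_id = compose_asset_id(url)  # 'landcover.hcl_p_0..0cm'
start, end = parse_date_range("2019.01.01..2019.12.31")
```

Passing a different fields tuple, such as (1, 2, 3, 5), would simply include more parts of the filename in the id.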
References
[1] ODSE Raster layer metadata example
[4] Geo-harmonizer file naming convention
Methods
save_all – Save the STAC instance to a local folder.
save_and_publish_all – Save the STAC instance to a local folder and upload all the files to an S3 bucket.
- save_all(output_dir='stac', catalog_type=CatalogType.SELF_CONTAINED, thumb_base_url=None)
Save the STAC instance to a local folder.
- Parameters
output_dir (str) – Destination folder.
catalog_type – Normalization strategy defined by pystac.CatalogType.
thumb_base_url – Base URL for the thumbnail files. Useful when the COG files are hosted in a different location (S3 bucket) than the STAC files.
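To illustrate what a base URL for thumbnails implies, the sketch below joins a base URL with a thumbnail path and falls back to the relative path when no base URL is given. This is an assumption about the general idea only. The function name and the thumbnail path layout are hypothetical and are not taken from eumap.

```python
def thumbnail_href(thumb_base_url, relative_path):
    """Return the thumbnail link to store in the catalog.

    Hypothetical helper: with no base URL the relative path is kept,
    otherwise the two parts are joined with exactly one slash.
    """
    if thumb_base_url is None:
        return relative_path
    return f"{thumb_base_url.rstrip('/')}/{relative_path.lstrip('/')}"


# Illustrative path only; the real thumbnail layout may differ
href = thumbnail_href(
    "https://s3.eu-central-1.wasabisys.com/stac", "odse/thumb.png"
)
```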
Examples
>>> from eumap.misc import GoogleSheet
>>> from eumap.datasets.eo import STACGenerator
>>>
>>> # Generate your key following the instructions at https://docs.gspread.org/en/latest/oauth2.html
>>> key_file = '<GDRIVE_KEY>'
>>> # Publicly accessible Google Spreadsheet (Anyone on the internet with this link can view)
>>> url = 'https://docs.google.com/spreadsheets/d/10tAhEpZ7TYPD0UWhrI0LHcuIzGZNt5AgSjx2Bu-FciU'
>>>
>>> gsheet = GoogleSheet(key_file, url, verbose=True)
>>> catalogs = None  # or a dict of pre-existing catalogs to update
>>> stac_generator = STACGenerator(gsheet, asset_id_fields=[1,2,3,5], catalogs=catalogs, verbose=True)
>>> stac_generator.save_all(output_dir='stac_odse', thumb_base_url='https://s3.eu-central-1.wasabisys.com/stac')
- save_and_publish_all(s3_host, s3_access_key, s3_access_secret, s3_bucket_name, s3_prefix='', output_dir='stac', catalog_type=CatalogType.SELF_CONTAINED)
Save the STAC instance to a local folder and upload all the files to an S3 bucket.
- Parameters
s3_host (str) – Hostname of an S3 service.
s3_access_key (str) – Access key (aka user ID) of the S3 service.
s3_access_secret (str) – Secret key (aka password) of the S3 service.
s3_bucket_name (str) – Name of the bucket.
s3_prefix (str) – Object name prefix (URL part) in the bucket.
output_dir (str) – Destination folder.
catalog_type – Normalization strategy defined by pystac.CatalogType.
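The relationship between output_dir and s3_prefix can be pictured as a mapping from local files to object names in the bucket. The sketch below shows one plausible mapping using only the standard library. It is an assumption for illustration, not the upload logic used by eumap, and the function name, folder contents and the "v1/" prefix are hypothetical.

```python
import tempfile
from pathlib import Path


def object_names(output_dir, s3_prefix=""):
    """Map each file under output_dir to a hypothetical S3 object name."""
    root = Path(output_dir)
    prefix = s3_prefix.strip("/")
    names = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Object names always use forward slashes
            relative = path.relative_to(root).as_posix()
            names[relative] = f"{prefix}/{relative}" if prefix else relative
    return names


# Demo on a throwaway folder standing in for output_dir
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "odse").mkdir()
    (Path(tmp) / "catalog.json").write_text("{}")
    (Path(tmp) / "odse" / "collection.json").write_text("{}")
    names = object_names(tmp, s3_prefix="v1/")
```

With an empty prefix, the object names would equal the paths relative to output_dir.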
Examples
>>> from eumap.misc import GoogleSheet
>>> from eumap.datasets.eo import STACGenerator
>>>
>>> s3_host = "<S3_HOST>"
>>> s3_access_key = "<s3_access_key>"
>>> s3_access_secret = "<s3_access_secret>"
>>> s3_bucket_name = 'stac'
>>>
>>> # Generate your key following the instructions at https://docs.gspread.org/en/latest/oauth2.html
>>> key_file = '<GDRIVE_KEY>'
>>> # Publicly accessible Google Spreadsheet (Anyone on the internet with this link can view)
>>> url = 'https://docs.google.com/spreadsheets/d/10tAhEpZ7TYPD0UWhrI0LHcuIzGZNt5AgSjx2Bu-FciU'
>>>
>>> gsheet = GoogleSheet(key_file, url)
>>> stac_generator = STACGenerator(gsheet, verbose=True)
>>> stac_generator.save_and_publish_all(s3_host, s3_access_key, s3_access_secret, s3_bucket_name)