eumap.parallel.blocks.RasterBlockReader
- class RasterBlockReader(reference_file=None)
Bases: object
Thread-parallel reader for large rasters.
If reference_file is not None, builds an R-tree index [1] of the block geometries read from the reference_file on initialization. All rasters read with the initialized reader are assumed to have identical geotransforms and block structures to the reference.
For full usage examples please refer to the block processing tutorial notebook [2].
References
[1] pygeos STRTree
[2] Raster block processing tutorial
Examples
>>> from eumap.parallel.blocks import RasterBlockReader
>>> from eumap.misc import ttprint
>>>
>>> fp = 'https://s3.eu-central-1.wasabisys.com/eumap/lcv/lcv_landcover.hcl_lucas.corine.rf_p_30m_0..0cm_2019_eumap_epsg3035_v0.1.tif'
>>>
>>> ttprint('initializing reader')
>>> reader = RasterBlockReader(fp)
>>> ttprint('reader initialized')
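To illustrate what an index of block geometries makes possible, the following minimal sketch enumerates the blocks of a raster grid and keeps only those whose extent overlaps a query bounding box. It is not eumap's implementation: a linear scan stands in for the R-tree, everything is expressed in pixel coordinates rather than through a geotransform, and the names Block, block_grid and intersecting_blocks are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Block(NamedTuple):
    """Pixel-space extent of one raster block (hypothetical helper type)."""
    col_off: int
    row_off: int
    width: int
    height: int


def block_grid(raster_w: int, raster_h: int, block_w: int, block_h: int) -> Iterator[Block]:
    """Yield the blocks of a raster in row-major order, clipping edge blocks."""
    for row_off in range(0, raster_h, block_h):
        for col_off in range(0, raster_w, block_w):
            yield Block(
                col_off,
                row_off,
                min(block_w, raster_w - col_off),
                min(block_h, raster_h - row_off),
            )


def intersecting_blocks(blocks: Iterable[Block], bbox: Tuple[int, int, int, int]) -> List[Block]:
    """Return blocks overlapping bbox = (min_col, min_row, max_col, max_row).

    Assumption: a linear scan replaces the R-tree query; an R-tree answers
    the same question without testing every block.
    """
    min_col, min_row, max_col, max_row = bbox
    return [
        b for b in blocks
        if b.col_off < max_col and b.col_off + b.width > min_col
        and b.row_off < max_row and b.row_off + b.height > min_row
    ]


blocks = list(block_grid(raster_w=1000, raster_h=600, block_w=256, block_h=256))
hits = intersecting_blocks(blocks, bbox=(200, 100, 300, 300))
```

Because every raster is assumed to share the reference file's block structure, such a lookup only needs to be built once and can be reused for each file read afterwards.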
Methods
read_overlay – Thread-parallel reading of large rasters within a bounding geometry.
- read_overlay(src_path, geometry, band=1, geometry_mask=True, max_workers=2, optimize_threadcount=True)
Thread-parallel reading of large rasters within a bounding geometry.
Only blocks that intersect with geometry are read. Returns a generator yielding (data, mask, window) tuples for each block, where data are the stacked pixel values of all rasters at mask==True, mask is the reduced (via bitwise and) block data mask for all rasters, and window is the rasterio.windows.Window [1] for the block within the transform of the reference_file. All rasters read with the initialized reader are assumed to have identical geotransforms and block structures to the reference_file used for initialization. If the reader was initialized with reference_file==None, the first file in src_path is used as the reference and the block R-tree is built before yielding data from the first block.
- Parameters
src_path (Union[str, Iterable[str]]) – Path(s) (or URLs) of the raster file(s) to read.
geometry (dict) – The bounding geometry within which to read raster blocks, given as a dictionary (with the GeoJSON geometry schema).
band (int) – Index of band to read from all rasters.
geometry_mask (bool) – Indicates whether or not to use the geometry as a data mask. If False, the block data will be returned in its entirety, regardless of whether some of it falls outside of the geometry.
max_workers (int) – Maximum number of worker threads to use, defaults to multiprocessing.cpu_count().
optimize_threadcount (bool) – Whether or not to optimize the number of workers. If True, the number of worker threads will be iteratively increased until the average read time per block stops decreasing or max_workers is reached. If False, max_workers will be used as the number of threads.
- Returns
Generator yielding (data, mask, window) tuples for each block.
- Return type
Iterator[Tuple[np.ndarray, np.ndarray, rasterio.windows.Window]]
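The optimize_threadcount behaviour described in the parameter list can be pictured with the sketch below. It is an illustration under stated assumptions, not eumap's code: choose_thread_count and its avg_block_time callback are hypothetical, and the timings come from a made-up function rather than from real raster reads.

```python
from typing import Callable


def choose_thread_count(avg_block_time: Callable[[int], float], max_workers: int) -> int:
    """Increase the worker count until the average time per block stops decreasing.

    avg_block_time(n) is a hypothetical callback returning the measured
    average read time per block when n worker threads are used.
    """
    n_workers = 1
    best_time = avg_block_time(n_workers)
    while n_workers < max_workers:
        candidate_time = avg_block_time(n_workers + 1)
        if candidate_time >= best_time:
            break  # adding another thread no longer helps
        n_workers += 1
        best_time = candidate_time
    return n_workers


def simulated_time(n: int) -> float:
    # Made-up timings: parallel speed-up plus a per-thread overhead.
    return 1.0 / n + 0.1 * n


chosen = choose_thread_count(simulated_time, max_workers=16)  # stops before the cap
capped = choose_thread_count(simulated_time, max_workers=2)   # limited by max_workers
```

With the simulated timings the search stops once a fourth thread would be slower than three; with a lower cap it simply returns max_workers.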
For full usage examples please refer to the block processing tutorial notebook [2].
References
[1] Rasterio Window
[2] Raster block processing tutorial
Examples
>>> geom = {
...     'type': 'Polygon',
...     'coordinates': [[
...         [4765389, 2441103],
...         [4764441, 2439352],
...         [4767369, 2438696],
...         [4761659, 2441949],
...         [4765389, 2441103],
...     ]],
... }
>>> block_data_gen = reader.read_overlay(fp, geom)
>>> data, mask, window = next(block_data_gen)
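As a rough picture of thread-parallel block reading behind a generator interface, the sketch below maps a read function over block windows with a thread pool and yields one result per block. Everything in it is illustrative: read_block is a hypothetical stand-in that fabricates values instead of reading a raster, windows are plain tuples rather than rasterio.windows.Window objects, no mask is produced, and eumap's internals may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List, Tuple

# (col_off, row_off, width, height); a stand-in for rasterio.windows.Window
Window = Tuple[int, int, int, int]


def read_block(window: Window) -> Tuple[List[int], Window]:
    """Hypothetical reader: returns fake pixel values for one block."""
    col_off, row_off, width, height = window
    data = [row_off + col_off] * (width * height)
    return data, window


def read_blocks_parallel(
    windows: Iterable[Window], max_workers: int = 2
) -> Iterator[Tuple[List[int], Window]]:
    """Yield (data, window) per block, reading blocks in worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order while the reads overlap in time.
        yield from pool.map(read_block, windows)


windows = [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 1)]
block_gen = read_blocks_parallel(windows, max_workers=2)
first_data, first_window = next(block_gen)
remaining = list(block_gen)
```

Threads suit this kind of workload because block reads are typically dominated by I/O waits, especially for rasters fetched over HTTP.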