eumap.gapfiller.SSA
- class SSA(fn_files=None, data=None, season_size=4, max_gap_pct=0.8, ltm_resolution=5, window_size=4, ngroups=4, reconstruct_ngroups=2, outlier_remover=None, std_win=3, std_env=2, perc_env=[2, 98], n_jobs_io=4, verbose=True)
Bases: eumap.gapfiller.ImageGapfill
Approach that uses Singular Spectrum Analysis (SSA [1]) to gapfill the missing values and smooth all the raster data. The missing values are first gapfilled using a long-term median strategy derived from the values of other days/months/seasons. SSA is then used to decompose each time series into multiple components (ngroups), and only some of them (reconstruct_ngroups) are used to reconstruct the output time series.
- Parameters
fn_files (Optional[List]) – Raster file paths to be read and gapfilled.
data (Optional[array]) – 3D array where the last dimension is the time.
season_size (int) – Season size of a year used to calculate the long-term median (for a monthly time series it is equal to 12).
max_gap_pct (int) – Maximum percentage of gaps allowed to run the approach. For pixels that exceed this limit, the result is np.nan for all dates.
ltm_resolution (int) – Number of years used to calculate the long-term median.
window_size (int) – Size of the sliding window (i.e. the size of each word). If float, it represents the percentage of the size of each time series and must be between 0 and 1. The window size will be computed as max(2, ceil(window_size * n_timestamps)) [1].
ngroups (int) – Number of components used to decompose the time series [1].
reconstruct_ngroups (int) – Number of components used to reconstruct the time series.
outlier_remover (Optional[OutlierRemover]) – Strategy to remove outliers.
std_win (int) – Temporal window size used to calculate a local median and std.
std_env (int) – Number of std used to define a local envelope around the median. Values outside of this envelope are removed.
perc_env (list) – A list containing the lower and upper percentiles used to define a global envelope for the time series. Values outside of this envelope are removed.
n_jobs_io – Number of parallel jobs to read/write raster files.
verbose – Use True to print the progress of the gapfilling.
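The long-term median step and the max_gap_pct check can be illustrated on a single pixel time series. This is a minimal sketch under stated assumptions, not eumap's implementation: the helper name ltm_fill is hypothetical, the median is taken over all available years (ltm_resolution is ignored), and max_gap_pct is treated as a fraction, in line with its default of 0.8.

```python
import math
from statistics import median


def ltm_fill(ts, season_size=4, max_gap_pct=0.8):
    """Fill NaN gaps with the median of the same season over all years.

    Hypothetical helper for illustration only; it is not part of eumap.
    """
    n = len(ts)
    n_gaps = sum(1 for v in ts if math.isnan(v))
    # Too many gaps: return NaN for every date.
    if n_gaps / n > max_gap_pct:
        return [math.nan] * n

    filled = list(ts)
    for season in range(season_size):
        # Positions that belong to the same season in every year.
        idx = range(season, n, season_size)
        valid = [ts[i] for i in idx if not math.isnan(ts[i])]
        if not valid:
            continue  # this season was never observed, leave the gaps
        season_median = median(valid)
        for i in idx:
            if math.isnan(ts[i]):
                filled[i] = season_median
    return filled


nan = math.nan
# Three years of a 4-season series with three gaps.
series = [1.0, 5.0, 9.0, 3.0,
          nan, 6.0, 9.0, nan,
          3.0, nan, 8.0, 5.0]
filled_series = ltm_fill(series, season_size=4)
print(filled_series)
# [1.0, 5.0, 9.0, 3.0, 2.0, 6.0, 9.0, 4.0, 3.0, 5.5, 8.0, 5.0]
```

In this sketch each gap takes the median of the valid values observed in the same season of the other years, so the series has no gaps left when it is passed to the decomposition step.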
References
[1] Pyts SingularSpectrumAnalysis
Examples
>>> from eumap import gapfiller
>>>
>>> # For a 4-season time series
>>> ssa = gapfiller.SSA(fn_files=fn_rasters, season_size=4)
>>> data_ssa = ssa.run()
>>>
>>> fn_rasters_ssa = ssa.save_rasters('./gapfilled_ssa', dtype='uint8', save_flag=False)
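To show what window_size controls, the sketch below resolves a fractional window with the formula max(2, ceil(window_size * n_timestamps)) and then performs the two bookkeeping steps of SSA on a single series: embedding it into a trajectory matrix and averaging the anti-diagonals back into a series. The function names are hypothetical and the code is plain Python for illustration. The singular value decomposition and component grouping that sit between these two steps are omitted here; reference [1] covers them.

```python
import math


def resolve_window_size(window_size, n_timestamps):
    """Return the window length in samples; a float is a fraction of the series."""
    if isinstance(window_size, float):
        return max(2, math.ceil(window_size * n_timestamps))
    return window_size


def embed(ts, window):
    """Build the trajectory matrix: row i holds ts[i:i + k], with k = n - window + 1."""
    k = len(ts) - window + 1
    return [ts[i:i + k] for i in range(window)]


def diagonal_average(matrix):
    """Average the anti-diagonals of a matrix back into a 1D series."""
    window, k = len(matrix), len(matrix[0])
    n = window + k - 1
    totals, counts = [0.0] * n, [0] * n
    for i in range(window):
        for j in range(k):
            # Element (i, j) of the trajectory matrix maps to timestamp i + j.
            totals[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [t / c for t, c in zip(totals, counts)]


ts = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
window = resolve_window_size(0.5, len(ts))  # ceil(0.5 * 8) = 4
trajectory = embed(ts, window)              # 4 rows of 5 values each
roundtrip = diagonal_average(trajectory)    # same values as ts
```

Because no component is dropped between embedding and averaging, the round trip returns the input unchanged. In SSA the smoothing comes from rebuilding the trajectory matrix from only part of its components before the averaging step.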
Methods
gapfill_ltm
run – Execute the gapfilling approach.
save_rasters – Save the result in raster files maintaining the same filenames of the read rasters.
- run()
Execute the gapfilling approach.
- save_rasters(out_dir, dtype=None, out_mantain_subdirs=True, root_dir_name='eumap_data', fn_files=None, nodata=None, spatial_win=None, save_flag=True)
Save the result in raster files maintaining the same filenames of the read rasters.
- Parameters
out_dir – Folder path to save the files.
dtype (Optional[str]) – Convert the rasters to the specified NumPy dtype before saving. This argument overrides the values retrieved from fn_files.
out_mantain_subdirs (bool) – Keep the full folder hierarchy of the read rasters in out_dir.
root_dir_name (str) – Keep the relative folder hierarchy of the read rasters in out_dir, considering the subfolders of root_dir_name.
fn_files (Optional[List]) – Raster file paths used to retrieve the filenames and folders. Use this parameter when the data parameter was passed to the class constructor. The pixel size, CRS, extent, image size and nodata for the gapfilled rasters are retrieved from the first valid raster of fn_files.
nodata – Nodata value used for the gapfilled rasters. This argument overrides the values retrieved from fn_files. It doesn’t affect the flag rasters (gapfill summary), which have nodata=0.
spatial_win (Optional[Window]) – Save the gapfilled rasters considering the specified spatial window.
save_flag – Save the flag rasters (gapfill summary).
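One possible reading of how out_mantain_subdirs and root_dir_name combine is sketched below. This is an assumption made for illustration, not eumap's code: the helper output_path and its keep_subdirs argument are hypothetical, and the real path handling in save_rasters may differ.

```python
from pathlib import PurePosixPath


def output_path(fn_file, out_dir, root_dir_name='eumap_data', keep_subdirs=True):
    """Map an input raster path to its location under out_dir.

    Hypothetical helper: it keeps the folders found below root_dir_name
    when keep_subdirs is True, otherwise only the filename.
    """
    src = PurePosixPath(fn_file)
    out = PurePosixPath(out_dir)
    if keep_subdirs and root_dir_name in src.parts:
        # Keep everything that follows the root folder, filename included.
        start = src.parts.index(root_dir_name) + 1
        return out.joinpath(*src.parts[start:])
    return out / src.name


fn = '/data/eumap_data/landsat/2019/ndvi.tif'
kept = output_path(fn, 'gapfilled_ssa')
flat = output_path(fn, 'gapfilled_ssa', keep_subdirs=False)
print(kept)  # gapfilled_ssa/landsat/2019/ndvi.tif
print(flat)  # gapfilled_ssa/ndvi.tif
```

Under this reading, the filename is always preserved and only the folder layout changes between the two modes.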