faninsar.datasets.HyP3S1#
- class faninsar.datasets.HyP3S1(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: np.dtype | None = None, nodata: float | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling: Resampling = Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, keep_common: bool = True)[source]#
Bases:
InterferogramDataset, Sentinel1
A dataset that manages the data of the HyP3 Sentinel-1 product.
HyP3 is a service for processing Synthetic Aperture Radar (SAR) imagery. This class is used to manage the data of the HyP3 product.
- __init__(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: np.dtype | None = None, nodata: float | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling: Resampling = Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, keep_common: bool = True) None#
Initialize a new InterferogramDataset instance.
- Parameters:
root_dir (str) – root directory where the dataset can be found.
paths_unw (list of str, optional) – list of unwrapped interferogram file paths to use instead of searching for files in root_dir. If None, files will be searched for in root_dir.
paths_coh (list of str, optional) – list of coherence file paths to use instead of searching for files in root_dir. If None, files will be searched for in root_dir.
crs (CRS, optional) – the output coordinate reference system (CRS) of the dataset. If None, the CRS of the first file found will be used.
res (float, optional) – resolution of the output dataset in units of CRS. If None, the resolution of the first file found will be used.
dtype (numpy.dtype, optional) – data type of the output dataset. If None, the data type of the first file found will be used.
nodata (float or int, optional) – no data value of the output dataset. If None, the no data value of the first file found will be used. This parameter is useful when the no data value is not stored in the file.
roi (BoundingBox, optional) – region of interest to load from the dataset. If None, the union of all files bounds in the dataset will be used.
bands_unw (list of str, optional) – names of bands to return (defaults to all bands) for unwrapped interferograms.
bands_coh (list of str, optional) – names of bands to return (defaults to all bands) for coherence.
cache (bool, optional) – if True, cache file handles to speed up repeated sampling.
resampling (Resampling, optional) – Resampling algorithm used when reading input files. Default: Resampling.nearest.
fill_nodata (bool, optional) – whether to fill holes in the queried data by interpolating them with the inverse distance weighting method provided by rasterio.fill.fillnodata(). Default: False.
Note
This parameter is only used when sampling data with bounding box or polygon queries, and has no effect on point queries.
verbose (bool, optional, default: True) – if True, print verbose output.
keep_common (bool, optional, default: True) – Only used when the number of interferograms and coherence files are not equal. If True, keep the common pairs of interferograms and coherence files and raise a warning. If False, raise an error.
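The keep_common behaviour described above can be sketched in plain Python. This is only an illustration under assumptions: the helper `match_pairs`, the tuple representation of a pair, and the exact warning and error types are hypothetical and are not taken from the library.

```python
import warnings


def match_pairs(unw_pairs, coh_pairs, keep_common=True):
    """Return the pairs found in both lists (hypothetical helper)."""
    unw_set, coh_set = set(unw_pairs), set(coh_pairs)
    if unw_set == coh_set:
        return sorted(unw_set)
    if not keep_common:
        # Mirrors "If False, raise an error"; the error type is an assumption.
        raise ValueError("unwrapped and coherence files cover different pairs")
    # Mirrors "keep the common pairs ... and raise a warning".
    warnings.warn("pair lists differ; keeping only the common pairs", stacklevel=2)
    return sorted(unw_set & coh_set)


unw = [("20200101", "20200113"), ("20200113", "20200125")]
coh = [("20200101", "20200113")]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo
    common = match_pairs(unw, coh)
print(common)
```

With keep_common=False the same inputs raise an error instead of silently dropping the unmatched interferogram.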
Methods
__init__([root_dir, paths_unw, paths_coh, ...]) – Initialize a new InterferogramDataset instance.
array2kml(arr, out_file[, bounds, ...]) – Write a numpy array into a kml file.
array2kmz(arr, out_file[, bounds, ...]) – Write a numpy array into a kmz file.
array2tiff(arr, filename[, bounds, bbox, ...]) – Save a numpy array to a tiff file using the geoinformation of dataset.
get_profile([bbox]) – Get profile information of dataset for the given bounding box type.
load_los_ratio([roi, angle_type]) – Load and convert los angle map to ratio map for given region of interest.
load_mask(mask_path[, bbox]) – Load a mask from a tiff mask file (.msk).
parse_baselines([pairs]) – Parse the baseline of the interferogram for given pairs.
parse_datetime(paths) – Parse the datetime of the interferogram to generate DatetimeIndex object.
parse_mask(percent[, bbox, seed]) – Parse the mask of the dataset.
parse_pairs(paths) – Parse the Pairs from the paths of the interferogram.
query(query[, pairs]) – Retrieve image values for given query.
reproject(new_crs[, resampling, nodata]) – Reproject the dataset to a new CRS.
resample(new_res[, resampling, nodata]) – Resample the dataset to a new resolution.
row_col(xy[, crs, bbox]) – Convert x, y coordinates to row, col in the dataset.
set_aps_dataset([aps_dataset]) – Set the aps dataset.
set_dem_dataset([dem_dataset]) – Set the dem dataset.
set_los_dataset([los_dataset]) – Set the los dataset.
set_mask_dataset([mask_dataset]) – Set the mask dataset.
show(arr, **kwargs) – Show the array using the dataset's geo information.
to_nan_count([pairs, roi]) – Calculate the number of nan values for given region of interest.
to_netcdf(filename[, roi, ref_points]) – Save the dataset to a netCDF file for given region of interest.
to_tiffs(out_dir[, roi, ref_points, pairs, ...]) – Save the dataset to files for given region of interest.
xy(row_col[, crs, bbox]) – Convert row, col in the dataset to x, y coordinates.
Attributes
Names of all available bands in the dataset
Return the aps (Atmospheric Phase Screen) dataset.
Bounds of the overall dataset.
Color map for the dataset, used for plotting
Return the coherence dataset.
value range of coherence.
Number of valid files in the dataset.
Coordinate reference system (CRS) of the dataset.
Date format string used to parse date from filename.
Return the datetime for each pair in the dataset.
Return the DEM dataset.
Data type of the dataset.
When separate_files is True, the following additional groups are searched for to find other files:
Return a list of all files in the dataset.
Get the frequency of the SAR mission.
Return the theta dataset.
Return the mask dataset.
No data value of the dataset.
Return Pairs parsed from filenames.
Glob expression used to search for files.
pattern used to find coherence files.
pattern used to find interferogram files.
Return the resolution of the dataset.
Names of RGB bands in the dataset, used for plotting
Return the region of interest of the dataset.
Whether all files in the dataset have the same CRS with the desired CRS.
Shape of the dataset.
Return a boolean array indicating which files are valid.
Get the wavelength of the SAR mission.
- classmethod parse_datetime(paths: list[Path]) DatetimeIndex[source]#
Parse the datetime of the interferogram to generate DatetimeIndex object.
- classmethod parse_pairs(paths: list[Path]) Pairs[source]#
Parse the Pairs from the paths of the interferogram.
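As a rough illustration of what parsing pairs and datetimes from file names involves, the sketch below extracts the two acquisition dates from a name and parses them with the documented date_format value '%Y%m%d'. The file-name layout, the regular expression, and the helper `parse_pair` are assumptions made for this example, not the library's implementation.

```python
import re
from datetime import datetime

DATE_FORMAT = "%Y%m%d"  # same value as the documented ``date_format``
# Assumed layout: two timestamps such as 20200101T000000 separated by "_".
DATE_PAIR = re.compile(r"_(\d{8})T\d{6}_(\d{8})T\d{6}_")


def parse_pair(name):
    """Return (primary, secondary) acquisition dates found in a file name."""
    match = DATE_PAIR.search(name)
    if match is None:
        raise ValueError(f"no date pair found in {name!r}")
    primary, secondary = (datetime.strptime(d, DATE_FORMAT) for d in match.groups())
    return primary, secondary


# Made-up file name that follows the assumed layout.
name = "S1AA_20200101T000000_20200113T000000_VVP012_INT80_G_ueF_0000_unw_phase.tif"
primary, secondary = parse_pair(name)
print(primary.date(), secondary.date(), (secondary - primary).days)
```

The difference between the two dates is the temporal baseline of the pair in days.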
- array2kml(arr: ndarray, out_file: str | Path, bounds: BoundingBox | None = None, img_kwargs: dict | None = None, cbar_kwargs: dict | None = None, verbose: bool = True) None#
Write a numpy array into a kml file.
- Parameters:
arr (numpy.ndarray) – the numpy array to be written into kml file.
out_file (str or Path) – the path of the kml file.
bounds (BoundingBox, optional) – the bounds of the arr. Default is None, which means the roi of the dataset will be used.
img_kwargs (dict) – the keyword arguments for the matplotlib.pyplot.imshow() function.
cbar_kwargs (dict) – the keyword arguments for the save_colorbar() function, except for the out_file and mappable arguments.
verbose (bool) – whether to print the information of the kml file. Default is True.
- array2kmz(arr: ndarray, out_file: str | Path, bounds: BoundingBox | None = None, img_kwargs: dict | None = None, cbar_kwargs: dict | None = None, keep_kml: bool = False, verbose: bool = True) None#
Write a numpy array into a kmz file.
- Parameters:
arr (numpy.ndarray) – the numpy array to be written into kmz file.
out_file (str or Path) – the path of the kmz file.
bounds (BoundingBox, optional) – the bounds of the arr. Default is None, which means the roi of the dataset will be used.
img_kwargs (dict) – the keyword arguments for the matplotlib.pyplot.imshow() function.
cbar_kwargs (dict) – the keyword arguments for the save_colorbar() function, except for the out_file and mappable arguments.
keep_kml (bool) – whether to keep the kml file. Default is False.
verbose (bool) – whether to print the information of the kmz file. Default is True.
- array2tiff(arr: np.ndarray, filename: str | Path, bounds: BoundingBox | None = None, bbox: BoundingBox | None = None, band_names: Sequence[str] | None = None, arr_type: Literal['data', 'mask'] = 'data', nodata: float | None = None, overwrite: bool = False) None#
Save a numpy array to a tiff file using the geoinformation of dataset.
- Parameters:
arr (numpy.ndarray) – numpy array to save. arr can be a 2D array or a 3D array. If arr is a 3D array, the first dimension should be the band dimension.
filename (str or Path) – path to the tiff file to save
bounds (BoundingBox, optional) – the bounds of the arr. Default is None, which means the roi of the dataset will be used.
bbox (BoundingBox, optional) – if specified, the input array will be saved to the given part/bbox of dataset. Default is None, which means the array will be saved to the entire dataset.
band_names (Sequence of str, optional) – names of bands to save. Default is None, which will use the band indexes.
arr_type (str, one of ['data', 'mask'], optional) – type of the array to save. Default is ‘data’.
nodata (float or int, optional) – no data value of the dataset. If None, a proper no data value for the array will be determined automatically.
overwrite (bool, optional) – if True, overwrite the existing file. Default is False, which means the array will be saved in append mode (r+ mode).
- get_profile(bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') Profile | None#
Get profile information of dataset for the given bounding box type.
- load_los_ratio(roi: BoundingBox | None = None, angle_type: Literal['incidence', 'look'] = 'look') ndarray#
Load and convert los angle map to ratio map for given region of interest.
The ratio map is used to convert differential atmospheric phase from vertical to line-of-sight (LOS) direction or convert LOS deformation phase to vertical.
- Parameters:
roi (BoundingBox, optional) – region of interest to load. If None, the roi of the dataset will be used.
angle_type (Literal['incidence', 'look'], optional) – angle type, one of [‘incidence’, ‘look’]. ‘incidence’ means incidence angle (relative to vertical) and ‘look’ means look angle (relative to horizontal). Default is ‘look’.
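The geometry behind such a ratio can be sketched as follows. This assumes the ratio is 1 / cos(incidence), which is the usual factor for mapping a vertical quantity onto the line of sight; the actual definition used by load_los_ratio() is not stated here, and the function `los_ratio` below is hypothetical. A look angle measured from the horizontal is treated as 90 degrees minus the incidence angle, following the parameter description above.

```python
import math


def los_ratio(angle_deg, angle_type="look"):
    """Return 1 / cos(incidence) for a single angle in degrees (assumed definition)."""
    if angle_type == "incidence":
        incidence = math.radians(angle_deg)  # measured from the vertical
    elif angle_type == "look":
        incidence = math.radians(90.0 - angle_deg)  # measured from the horizontal
    else:
        raise ValueError("angle_type must be 'incidence' or 'look'")
    return 1.0 / math.cos(incidence)


# A 60 degree incidence angle and a 30 degree look angle describe the same geometry.
ratio_from_incidence = los_ratio(60.0, "incidence")
ratio_from_look = los_ratio(30.0, "look")
print(ratio_from_incidence, ratio_from_look)
```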
- load_mask(mask_path: str | Path, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') ndarray#
Load a mask from a tiff mask file (.msk).
- parse_baselines(pairs: Pairs | None = None) Baselines[source]#
Parse the baseline of the interferogram for given pairs.
- parse_mask(percent: float, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi', seed: int = 0) ndarray#
Parse the mask of the dataset.
The mask is a boolean array where True indicates valid data and False indicates invalid data, which keeps in line with the GDAL/rasterio strategy.
- query(query: GeoQuery | Points | BoundingBox | Polygons, pairs: Pairs | None = None) QueryResult#
Retrieve image values for given query.
This method is a more flexible alternative to __getitem__(), because it can retrieve images for only the given pairs.
- Parameters:
query (GeoQuery | Points | BoundingBox | Polygons) – query to index the dataset. It can be a Points, BoundingBox, or Polygons object, or a composite GeoQuery object (recommended).
pairs (Pairs, optional) – pairs to use for the query. If None, all pairs will be used.
- Returns:
result – a QueryResult instance containing the results of the various queries.
- Return type:
QueryResult
- reproject(new_crs: CRS | str, resampling: Resampling = Resampling.nearest, nodata: float | None = None) Self#
Reproject the dataset to a new CRS.
- Parameters:
new_crs (CRS or str) – new coordinate reference system (CRS) of the dataset. It can be a CRS object or a string, which will be parsed to a CRS object. The string can be in any format supported by pyproj.crs.CRS.from_user_input().
resampling (Resampling, optional) – resampling method to use when reprojecting the dataset. Default is Resampling.nearest.
nodata (float or int, optional) – no data value of the dataset. If None, the no data value of the dataset will be used.
- resample(new_res: float | tuple[float, float], resampling: Resampling = Resampling.nearest, nodata: float | None = None) Self#
Resample the dataset to a new resolution.
- Parameters:
new_res (float or tuple of float) – new resolution of the dataset in units of CRS. If a single float is provided, it will be used for both x and y dimensions.
resampling (Resampling, optional) – resampling method to use when resampling the dataset. Default is Resampling.nearest.
nodata (float or int, optional) – no data value of the dataset. If None, the no data value of the dataset will be used.
- row_col(xy: Sequence, crs: CRS | str | None = None, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') np.ndarray#
Convert x, y coordinates to row, col in the dataset.
- Parameters:
xy (Sequence) – Pairs of x, y coordinates (floats)
crs (CRS or str, optional) – the CRS of the points. If None, the CRS of the dataset will be used. Allowed CRS formats are the same as those supported by rasterio.
bbox (str, one of ['bounds', 'roi'], optional) – the bounding box used to calculate the width, height and transform of the dataset for the profile. Default is ‘roi’.
- Returns:
row_col – row, col in the dataset for the given points(xy)
- Return type:
np.ndarray
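The coordinate conversion can be pictured with a north-up grid defined by its upper-left corner and pixel size. The sketch below is a simplified stand-in under that assumption: the functions take explicit `left`, `top`, `xres` and `yres` arguments (hypothetical names), ignore CRS handling, and return pixel centres for the inverse direction, which may differ from the dataset's own methods.

```python
import math


def row_col(x, y, left, top, xres, yres):
    """Map map coordinates to the (row, col) of the pixel that contains them."""
    col = math.floor((x - left) / xres)
    row = math.floor((top - y) / yres)  # rows grow downward from the top edge
    return row, col


def xy(row, col, left, top, xres, yres):
    """Map (row, col) to the map coordinates of the pixel centre."""
    x = left + (col + 0.5) * xres
    y = top - (row + 0.5) * yres
    return x, y


left, top, xres, yres = 500000.0, 4100000.0, 80.0, 80.0
rc = row_col(500200.0, 4099900.0, left, top, xres, yres)
center = xy(*rc, left, top, xres, yres)
print(rc, center)
```

Because the inverse returns the pixel centre, a round trip generally does not reproduce the original point exactly.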
- set_aps_dataset(aps_dataset: ApsPairs | None = None, **kwargs: dict) None#
Set the aps dataset.
If aps_dataset is None, a new ApsPairs object will be created using the kwargs.
- Parameters:
aps_dataset (ApsPairs, optional) – An ApsPairs object. ApsPairs is used to remove the atmospheric phase screen (APS) from the unwrapped interferograms. If None, no APS data is used.
**kwargs (dict, optional) – Keyword arguments used to create a new ApsPairs object if aps_dataset is None.
- set_dem_dataset(dem_dataset: RasterDataset | None = None, **kwargs: dict) None#
Set the dem dataset.
- Parameters:
dem_dataset (RasterDataset, optional) – A RasterDataset object containing the dem file.
**kwargs (dict, optional) – Keyword arguments used to create a new RasterDataset object if dem_dataset is None.
- set_los_dataset(los_dataset: RasterDataset | None = None, **kwargs: dict) None#
Set the los dataset.
los file could be incidence angle (relative to vertical) or look angle (relative to horizontal). This file is used to convert differential atmospheric phase from vertical to line-of-sight (LOS) direction or convert LOS deformation phase to vertical.
- Parameters:
los_dataset (RasterDataset, optional) – A RasterDataset object containing the los files.
**kwargs (dict, optional) – Keyword arguments used to create a new RasterDataset object if los_dataset is None.
- set_mask_dataset(mask_dataset: RasterDataset | None = None, **kwargs) None#
Set the mask dataset.
- show(arr: ndarray, **kwargs) Self#
Show the array using the dataset’s geo information.
- Parameters:
arr (np.ndarray) – The array with same shape as the dataset to show. The geo information of the dataset will be used to plot the array.
kwargs (key value pairs, optional) – Additional keyword arguments to pass to the rasterio.plot.show() function.
- to_nan_count(pairs: Pairs | None = None, roi: BoundingBox | None = None) np.ndarray#
Calculate the number of nan values for given region of interest.
- Parameters:
pairs (Pairs, optional) – pairs to calculate the number of nan values. If None, will calculate the number of nan values for all pairs.
roi (BoundingBox, optional) – region of interest in which to calculate the number of nan values. If None, the roi of the dataset will be used.
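One plausible reading of this method is a per-pixel count of NaN values across the selected pairs. The sketch below illustrates that reading with nested lists; it is an assumption about the semantics, and the helper `nan_count` is hypothetical.

```python
import math


def nan_count(stack):
    """Count NaN values per pixel across a list of equally shaped 2-D grids."""
    rows, cols = len(stack[0]), len(stack[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for grid in stack:
        for r in range(rows):
            for c in range(cols):
                if math.isnan(grid[r][c]):
                    counts[r][c] += 1
    return counts


nan = float("nan")
# Two toy "interferograms" of 2 x 2 pixels each.
stack = [
    [[0.1, nan], [0.3, 0.4]],
    [[nan, nan], [0.2, 0.5]],
]
counts = nan_count(stack)
print(counts)  # [[1, 2], [0, 0]]
```

Pixels with a high count are missing in many pairs and are natural candidates for masking.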
- to_netcdf(filename: str | Path, roi: BoundingBox | None = None, ref_points: Points | None = None) None#
Save the dataset to a netCDF file for given region of interest.
- Parameters:
filename (str) – path to the netCDF file to save
roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.
ref_points (Points, optional, default: None) – reference points to save. If None, will keep the original values.
- to_tiffs(out_dir: str | Path, roi: BoundingBox | None = None, ref_points: Points | None = None, pairs: Pairs | None = None, pdc: PhaseDeformationConverter | None = None, los_ratio: np.ndarray | None = None, names_unw: list[str] | None = None, names_coh: list[str] | None = None, overwrite: bool = True) None#
Save the dataset to files for given region of interest.
- Parameters:
out_dir (str) – path to the directory to save the files
roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.
ref_points (Points, optional, default: None) – reference points to save. If None, will keep the original values.
pairs (Pairs, optional) – pairs to save. If None, will save all pairs.
pdc (PhaseDeformationConverter, optional) – PhaseDeformationConverter object used to convert the phase to deformation. If None, will save the phase.
los_ratio (np.ndarray, optional) – los angle ratio map used to convert deformation from line-of-sight (LOS) direction to vertical. You can use the method load_los_ratio() to load the los angle ratio map. If None, will save the LOS deformation.
names_unw (list of str, optional) – names of the unwrapped interferogram files to save. If None, original names will be used. If pairs is not None, names should have the same length as pairs.
names_coh (list of str, optional) – names of the coherence files to save. If None, original names will be used. If pairs is not None, names should have the same length as pairs.
overwrite (bool, optional) – if True, overwrite the existing files. Default is True.
- xy(row_col: Sequence, crs: CRS | str | None = None, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') np.ndarray#
Convert row, col in the dataset to x, y coordinates.
- Parameters:
row_col (Sequence) – Pairs of row, col in the dataset (floats)
crs (CRS or str, optional) – The CRS of output points. If None, the CRS of the dataset will be used. Can be any of the formats supported by pyproj.CRS.from_user_input().
bbox (str, one of ['bounds', 'roi'], optional) – the bounding box used to calculate the width, height and transform of the dataset for the profile. Default is ‘roi’.
- Returns:
xy – x, y coordinates in the given CRS (default is the CRS of the dataset)
- Return type:
np.ndarray
- property aps_dataset: RasterDataset | None#
Return the aps (Atmospheric Phase Screen) dataset.
If None, no aps data is used.
- property bounds: BoundingBox#
Bounds of the overall dataset.
It is the union of all the files in the dataset.
- Returns:
bounds – (left, right, bottom, top) of the dataset
- Return type:
BoundingBox object
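Taking the union of file bounds amounts to keeping the smallest left and bottom edges and the largest right and top edges. The sketch below shows this with a hypothetical `Box` named tuple; it stands in for the library's BoundingBox only for illustration, and its field order is an assumption.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical stand-in for a bounding box; fields are accessed by name."""

    left: float
    bottom: float
    right: float
    top: float


def union(boxes):
    """Return the smallest box that covers every box in ``boxes``."""
    return Box(
        left=min(b.left for b in boxes),
        bottom=min(b.bottom for b in boxes),
        right=max(b.right for b in boxes),
        top=max(b.top for b in boxes),
    )


boxes = [Box(0.0, 0.0, 10.0, 10.0), Box(5.0, -2.0, 12.0, 8.0)]
overall = union(boxes)
print(overall)
```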
- cmap: ClassVar[dict[int, tuple[int, int, int, int]]] = {}#
Color map for the dataset, used for plotting
- property coh_dataset: CoherenceDataset#
Return the coherence dataset.
- property count: int#
Number of valid files in the dataset.
Note
This is different from the length of the dataset, len(GeoDataset), which is the total number of files in the dataset, including invalid files that cannot be read by rasterio.
- Returns:
count – number of valid files in the dataset
- Return type:
int
- property crs: CRS | None#
Coordinate reference system (CRS) of the dataset.
- Return type:
The coordinate reference system (CRS).
- date_format = '%Y%m%d'#
Date format string used to parse date from filename.
Not used if filename_regex does not contain a date group.
- property datetime: DatetimeIndex#
Return the datetime for each pair in the dataset.
- property dem_dataset: RasterDataset | None#
Return the DEM dataset. If None, no DEM data is used.
- property dtype: dtype | None#
Data type of the dataset.
- Returns:
dtype – data type of the dataset
- Return type:
numpy.dtype object or None
- filename_regex = '.*'#
When separate_files is True, the following additional groups are searched for to find other files:
band: replaced with requested band name
- property files: DataFrame#
Return a list of all files in the dataset.
- Return type:
list of all files in the dataset
- property los_dataset: RasterDataset | None#
Return the theta dataset. If None, no theta data is used.
- property mask_dataset: RasterDataset | None#
Return the mask dataset. If None, no mask data is used.
- pattern = '*'#
Glob expression used to search for files.
This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.
- pattern_coh = '*corr.tif'#
pattern used to find coherence files.
- pattern_unw = '*unw_phase.tif'#
pattern used to find interferogram files.
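To see what these two glob patterns select, the sketch below applies them to a few made-up file names with the standard-library fnmatch module. The names are invented for the example, and how the dataset actually walks root_dir is not shown here.

```python
from fnmatch import fnmatchcase

PATTERN_UNW = "*unw_phase.tif"  # documented value of pattern_unw
PATTERN_COH = "*corr.tif"  # documented value of pattern_coh

# Made-up file names used only for this demonstration.
names = [
    "S1AA_20200101T000000_20200113T000000_VVP012_INT80_G_ueF_0000_unw_phase.tif",
    "S1AA_20200101T000000_20200113T000000_VVP012_INT80_G_ueF_0000_corr.tif",
    "S1AA_20200101T000000_20200113T000000_VVP012_INT80_G_ueF_0000_dem.tif",
]
unw_files = [n for n in names if fnmatchcase(n, PATTERN_UNW)]
coh_files = [n for n in names if fnmatchcase(n, PATTERN_COH)]
print(len(unw_files), len(coh_files))
```

Files that match neither pattern, such as the DEM in this list, are ignored by both searches.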
- property res: tuple[float, float]#
Return the resolution of the dataset.
- Returns:
res – resolution of the dataset in x and y directions.
- Return type:
tuple of floats
- property roi: BoundingBox | None#
Return the region of interest of the dataset.
- Returns:
roi – region of interest of the dataset. If None, the bounds of entire dataset will be used.
- Return type:
BoundingBox object
- property shape: tuple[int, int]#
Shape of the dataset.
- Returns:
shape – shape of the dataset in (height, width) format
- Return type:
tuple of ints
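For a regular grid, the (height, width) shape follows from the extent and the resolution. The sketch below is a simplified calculation under that assumption; the helper `grid_shape` and its rounding rule are hypothetical and may not match how the dataset derives its shape.

```python
def grid_shape(left, bottom, right, top, xres, yres):
    """Return (height, width) of a grid covering the extent at the given resolution."""
    width = round((right - left) / xres)
    height = round((top - bottom) / yres)
    return height, width


# An 8000 m by 4000 m extent sampled at 80 m resolution.
shape = grid_shape(500000.0, 4000000.0, 508000.0, 4004000.0, 80.0, 80.0)
print(shape)  # (50, 100)
```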
- property valid: ndarray#
Return a boolean array indicating which files are valid.
- Returns:
valid – boolean array indicating which files are valid. True means the file is valid and can be read by rasterio, False means the file is invalid.
- Return type:
numpy.ndarray
- property wavelength: Wavelength#
Get the wavelength of the SAR mission.
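The wavelength matters because it sets the scale between unwrapped phase and line-of-sight displacement. The sketch below derives the wavelength from the Sentinel-1 C-band centre frequency of 5.405 GHz and converts a phase value to metres with the standard two-way relation d = phase * wavelength / (4 * pi). The sign convention is left out on purpose, and the function `phase_to_los` is hypothetical rather than part of the library.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s
S1_FREQUENCY = 5.405e9  # Hz, Sentinel-1 C-band centre frequency

wavelength = SPEED_OF_LIGHT / S1_FREQUENCY  # metres, about 5.55 cm


def phase_to_los(phase_rad, wavelength_m):
    """Convert unwrapped phase in radians to LOS displacement in metres."""
    # Two-way path: one full cycle (2*pi) equals half a wavelength of motion.
    return phase_rad * wavelength_m / (4.0 * math.pi)


one_fringe = phase_to_los(2.0 * math.pi, wavelength)
print(wavelength, one_fringe)
```

One full fringe therefore corresponds to roughly 2.8 cm of line-of-sight motion.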