Geospatial Datasets#

Geospatial datasets are used to process large numbers of geospatial data files that share common information. These files are combined to form a complete dataset.

The following datasets are available:

GeoDataset#

class faninsar.datasets.GeoDataset#

Bases: ABC

Abstract base class for all faninsar datasets. This class represents a geospatial dataset and provides methods to index the dataset and retrieve information about it, such as the CRS, resolution, data type, no data value, and bounds.

__init__()#
property crs: CRS | None#

Coordinate reference system (CRS) of the dataset.

Returns:

The coordinate reference system (CRS).

property same_crs: bool#

True if all files in the dataset have the same CRS as the desired CRS, False otherwise.

property res: tuple[float, float]#

Return the resolution of the dataset.

Returns:

res – resolution of the dataset in x and y directions.

Return type:

tuple of floats

property roi: BoundingBox | None#

Return the region of interest of the dataset.

Returns:

roi – region of interest of the dataset. If None, the bounds of the entire dataset will be used.

Return type:

BoundingBox object

property dtype: dtype | None#

Data type of the dataset.

Returns:

dtype – data type of the dataset

Return type:

numpy.dtype object or None

property nodata: float | int | Any | None#

No data value of the dataset.

Returns:

nodata – no data value of the dataset

Return type:

float or int

property valid: ndarray#

Return a boolean array indicating which files are valid.

Returns:

valid – boolean array indicating which files are valid. True means the file is valid and can be read by rasterio, False means the file is invalid.

Return type:

numpy.ndarray

property count: int#

Number of valid files in the dataset.

Note

This is different from the length of the dataset len(GeoDataset), which is the total number of files in the dataset, including invalid files that cannot be read by rasterio.

Returns:

count – number of valid files in the dataset

Return type:

int
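
The difference between count and len() can be illustrated with a minimal sketch in plain Python. The list `valid` below is a hypothetical stand-in for the boolean array returned by the valid property; it is not produced by faninsar.

```python
# Hypothetical validity flags for five files (stand-in for GeoDataset.valid).
valid = [True, True, False, True, False]

total_files = len(valid)  # analogous to len(dataset): every file, valid or not
valid_files = sum(valid)  # analogous to dataset.count: readable files only

print(total_files, valid_files)
```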

property bounds: BoundingBox#

Bounds of the overall dataset. It is the union of the bounds of all files in the dataset.

Returns:

bounds – (left, bottom, right, top) of the dataset

Return type:

BoundingBox object
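
As an illustration of what "union of the bounds" means, the sketch below merges per-file boxes into one overall box. The `Box` tuple and the `union` helper are hypothetical stand-ins written for this example, and the (left, bottom, right, top) field order is an assumption; faninsar's own BoundingBox may differ in detail.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical stand-in for BoundingBox, ordered (left, bottom, right, top)."""

    left: float
    bottom: float
    right: float
    top: float


def union(boxes: list[Box]) -> Box:
    """Return the smallest box that covers every box in the list."""
    return Box(
        left=min(b.left for b in boxes),
        bottom=min(b.bottom for b in boxes),
        right=max(b.right for b in boxes),
        top=max(b.top for b in boxes),
    )


# Two overlapping file footprints in arbitrary map units.
file_bounds = [Box(0, 0, 10, 10), Box(5, -5, 20, 8)]
overall = union(file_bounds)
print(overall)
```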

get_profile(bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') Profile | None#

Return the profile information of the dataset for the given bounding box type. The profile information includes the width, height, transform, count, data type, no data value, and CRS of the dataset.

Parameters:

bbox (BoundingBox | Literal["roi", "bounds"], optional) – the bounding box used to calculate the width, height and transform of the dataset for the profile. Default is ‘roi’.

Returns:

profile – profile of the dataset for the given bounding box type.

Return type:

Profile object or None
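
To make the relationship between a bounding box, the resolution, and the resulting width, height and transform concrete, here is a minimal sketch. The function `profile_from_bounds` and the coordinate values are hypothetical, and the rounding rule is an assumption; the transform follows the common north-up affine layout (x pixel size, 0, left, 0, negative y pixel size, top).

```python
def profile_from_bounds(left, bottom, right, top, res_x, res_y):
    """Derive grid size and a north-up affine transform from bounds and resolution."""
    width = round((right - left) / res_x)    # number of columns
    height = round((top - bottom) / res_y)   # number of rows
    # Affine coefficients (a, b, c, d, e, f): x = a*col + c, y = e*row + f.
    transform = (res_x, 0.0, left, 0.0, -res_y, top)
    return {"width": width, "height": height, "transform": transform}


# A 2400 m x 2040 m box sampled at 30 m gives an 80 x 68 pixel grid.
profile = profile_from_bounds(490000.0, 4283000.0, 492400.0, 4285040.0, 30.0, 30.0)
print(profile)
```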

RasterDataset#

class faninsar.datasets.RasterDataset(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '')#

Bases: GeoDataset

A base class for raster datasets.

Examples

>>> from pathlib import Path
>>> from faninsar.datasets import RasterDataset
>>> from faninsar.query import BoundingBox, GeoQuery, Points
>>> home_dir = Path("./work/data")
>>> files = list(home_dir.rglob("*unw_phase.tif"))

initialize a RasterDataset and GeoQuery object

>>> ds = RasterDataset(paths=files)
>>> points = Points(
...     [(490357, 4283413),
...     (491048, 4283411),
...     (490317, 4284829)]
...     )
>>> query = GeoQuery(points=points, boxes=[ds.bounds, ds.bounds])

use the GeoQuery object to index the RasterDataset

>>> sample = ds[query]

output the samples shapes:

>>> print('boxes result shape:', sample.boxes.data.shape)
boxes result shape: (2, 7, 68, 80)
>>> print('points result shape:', sample.points.data.shape)
points result shape: (7, 3)

You can also use a BoundingBox or Points object directly to index the RasterDataset. Both types are automatically converted to a GeoQuery object.

>>> sample = ds[points]
>>> sample
{'query': GeoQuery(
    boxes=None
    points=Points(count=3)
),
'boxes': None,
'points': array([...], dtype=float32)}
>>> sample = ds[ds.bounds]
>>> sample
{'query': GeoQuery(
    boxes=[1 BoundingBox]
    points=None
),
'boxes': array([...], dtype=float32),
'points': None}
pattern = '*'#

Glob expression used to search for files.

This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.
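
The effect of a more or less specific glob expression can be sketched with the standard-library `fnmatch` module. This only illustrates glob semantics; how faninsar performs the search internally is not stated here, and the file names and the two extension-free patterns below are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical file names found in one download directory.
names = [
    "S1AA_20190610T135156_20190622T135157_VVP012_INT80_G_ueF_F8BF_unw_phase.tif",
    "S1AA_20190610T135156_20190622T135157_VVP012_INT80_G_ueF_F8BF_corr.tif",
    "S1AA_20190610T135156_20190622T135157_VVP012_INT80_G_ueF_F8BF_dem.tif",
    "20190610.ztd.tif",
]

# Specific patterns without a file extension pick up one product each.
unw = [n for n in names if fnmatch(n, "*unw_phase*")]
coh = [n for n in names if fnmatch(n, "*corr*")]
# The default '*' matches every file, including those of other datasets.
everything = [n for n in names if fnmatch(n, "*")]

print(len(unw), len(coh), len(everything))
```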

filename_regex = '.*'#

When separate_files is True, the following additional groups are searched for to find other files:

  • band: replaced with requested band name

date_format = '%Y%m%d'#

Date format string used to parse date from filename.

Not used if filename_regex does not contain a date group.
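
A minimal sketch of how a date group and date_format can work together is shown below. The regular expression and the file name are hypothetical examples, not values taken from faninsar; only the '%Y%m%d' format string comes from the default above.

```python
import re
from datetime import datetime

# Hypothetical regex with a named "date" group (an assumption for this example).
filename_regex = r"^(?P<date>\d{8})\.ztd"
date_format = "%Y%m%d"

match = re.match(filename_regex, "20190610.ztd.tif")
# Parse the captured text only when the regex actually matched.
parsed = datetime.strptime(match.group("date"), date_format) if match else None

print(parsed)
```

If the regex had no date group, there would be nothing to pass to `strptime`, which is why date_format is unused in that case.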

all_bands: list[str] = []#

Names of all available bands in the dataset

rgb_bands: list[str] = []#

Names of RGB bands in the dataset, used for plotting

__init__(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '') None#

Initialize a new raster dataset instance.

Parameters:
  • root_dir (str or Path) – root directory where the dataset can be found.

  • paths (list of str, optional) – list of file paths to use instead of searching for files in root_dir. If None, files will be searched for in root_dir.

  • crs (CRS, optional) – the output coordinate reference system (CRS) of the dataset. If None, the CRS of the first file found will be used.

  • res (float, optional) – resolution of the output dataset in units of CRS. If None, the resolution of the first file found will be used.

  • dtype (numpy.dtype, optional) – data type of the output dataset. If None, the data type of the first file found will be used.

  • nodata (float or int, optional) – no data value of the dataset. If None, the no data value of the first file found will be used.

  • roi (BoundingBox, optional) – region of interest to load from the dataset. If None, the union of the bounds of all files in the dataset will be used.

  • bands (list of str, optional) – names of bands to return (defaults to all bands)

  • cache (bool, optional) – if True, cache file handle to speed up repeated sampling

  • resampling (Resampling, optional) – Resampling algorithm used when reading input files. Default: Resampling.nearest.

  • fill_nodata (bool, optional) –

    Whether to fill holes in the queried data by interpolating them with the inverse distance weighting method provided by rasterio.fill.fillnodata(). Default: False.

    Note

    This parameter is only used when sampling data using bounding boxes or polygons queries, and will not work for points queries.

  • verbose (bool, optional) – if True, print verbose output, default: True

  • ds_name (str, optional) – name of the dataset, used when printing verbose output. Default: “”

Raises:

FileNotFoundError – if no files are found in root_dir.

cmap: dict[int, tuple[int, int, int, int]] = {}#

Color map for the dataset, used for plotting

property files: DataFrame#

Return all files in the dataset as a DataFrame.

Returns:

DataFrame listing all files in the dataset

row_col(xy: Sequence, crs: CRS | str | None = None, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') ndarray#

Convert x, y coordinates to row, col in the dataset.

Parameters:
  • xy (Sequence) – Pairs of x, y coordinates (floats)

  • crs (CRS or str, optional) – The CRS of the points. If None, the CRS of the dataset will be used. Allowed CRS formats are the same as those supported by rasterio.

  • bbox (str, one of ['bounds', 'roi'], optional) – the bounding box used to calculate the width, height and transform of the dataset for the profile. Default is ‘roi’.

Returns:

row_col – row, col in the dataset for the given points(xy)

Return type:

np.ndarray

xy(row_col: Sequence, crs: CRS | str | None = None, bbox: BoundingBox | Literal['roi', 'bounds'] = 'roi') ndarray#

Convert row, col in the dataset to x, y coordinates.

Parameters:
  • row_col (Sequence) – Pairs of row, col in the dataset (floats)

  • crs (CRS or str, optional) – The CRS of the points. If None, the CRS of the dataset will be used. Allowed CRS formats are the same as those supported by rasterio.

  • bbox (str, one of ['bounds', 'roi'], optional) – the bounding box used to calculate the width, height and transform of the dataset for the profile. Default is ‘roi’.

Returns:

xy – x, y coordinates in the given CRS (default is the CRS of the dataset)

Return type:

np.ndarray
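
The two conversions above are inverse operations on a regular grid. The following sketch shows the arithmetic for a north-up raster without any CRS reprojection. The grid origin, the pixel size and both helper functions are hypothetical, and returning pixel centres from the inverse conversion is an assumption rather than documented faninsar behaviour.

```python
import math

# Hypothetical north-up grid: 30 m pixels, upper-left corner at (490000, 4285040).
LEFT, TOP, RES_X, RES_Y = 490000.0, 4285040.0, 30.0, 30.0


def xy_to_row_col(x, y):
    """Row and column of the pixel that contains the point (x, y)."""
    col = math.floor((x - LEFT) / RES_X)
    row = math.floor((TOP - y) / RES_Y)  # rows increase downwards
    return row, col


def row_col_to_xy(row, col):
    """Coordinates of the centre of pixel (row, col)."""
    x = LEFT + (col + 0.5) * RES_X
    y = TOP - (row + 0.5) * RES_Y
    return x, y


row, col = xy_to_row_col(490357.0, 4283413.0)
x, y = row_col_to_xy(row, col)
print((row, col), (x, y))
```

Note that the round trip returns the pixel centre, not the original point, so the recovered coordinates can differ from the input by up to half a pixel.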

to_tiffs(out_dir: str | Path, roi: BoundingBox | None = None)#

Save the dataset to a directory of tiff files for given region of interest.

Parameters:
  • out_dir (str or Path) – path to the directory to save the tiff files

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

to_netcdf(filename: str | Path, roi: BoundingBox | None = None) None#

Save the dataset to a netCDF file for given region of interest.

Parameters:
  • filename (str) – path to the netCDF file to save

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

save_arr_to_tiff(arr: ndarray, filename: str | Path, roi: BoundingBox | None = None, profile: Profile | None = None) None#

Save a numpy array to a tiff file using the geoinformation of the dataset.

Parameters:
  • arr (numpy.ndarray) – numpy array to save. arr can be a 2D array or a 3D array. If arr is a 3D array, the first dimension is the band dimension.

  • filename (str or Path) – path to the tiff file to save

  • roi (BoundingBox, optional) – Region of interest to save. only used if profile is None.

  • profile (Profile, optional) – profile of the tiff file. If None, the roi must be specified.

PairDataset#

class faninsar.datasets.PairDataset(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '')#

Bases: RasterDataset

A base class for pair datasets.

abstractmethod classmethod parse_pairs(paths: list[Path]) Pairs#

Used to parse pairs from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse pairs

Returns:

pairs – pairs parsed from filenames

Return type:

Pairs object

Example

for the HyP3 dataset, pairs are parsed from the filenames as follows:

>>> names = [f.name for f in paths]
>>> pair_names = ['_'.join(i.split("_")[1:3]) for i in names]

for the HyP3 dataset, the pair names are the second and third parts of the filename, separated by an underscore. After parsing the pair names, the Pairs object can be created by using the from_names method.

>>> pairs = Pairs.from_names(pair_names)
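
For readers who want to try the string handling without any data on disk, here is a self-contained sketch of the same parsing steps. The two file names are hypothetical and only mimic an underscore-separated layout; the final `Pairs.from_names` call is left out so that the snippet runs without faninsar installed.

```python
from pathlib import Path

# Hypothetical HyP3-style file names; only the underscore-separated layout matters.
paths = [
    Path("S1AA_20190610T135156_20190622T135157_VVP012_INT80_G_ueF_F8BF_unw_phase.tif"),
    Path("S1AA_20190622T135157_20190704T135158_VVP012_INT80_G_ueF_A1B2_unw_phase.tif"),
]

names = [p.name for p in paths]
# The second and third underscore-separated fields are the two acquisition stamps.
pair_names = ["_".join(n.split("_")[1:3]) for n in names]
# Keep only the date part (YYYYMMDD) of each acquisition stamp.
date_pairs = [tuple(part[:8] for part in name.split("_")) for name in pair_names]

print(pair_names)
print(date_pairs)
```
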
abstractmethod classmethod parse_datetime(paths: list[Path]) DatetimeIndex#

Used to parse datetime from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse datetime

Returns:

datetime – datetime parsed from filenames

Return type:

pd.DatetimeIndex

property pairs: Pairs#

Return Pairs parsed from filenames.

property datetime: DatetimeIndex#

Return the datetime for each pair in the dataset.

InterferogramDataset#

class faninsar.datasets.InterferogramDataset(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: Any | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose=True, keep_common: bool = True)#

Bases: PairDataset

A base class for interferogram datasets.

Note

1. Only pairs for which both the unwrapped interferogram and the coherence file are valid will be used.

2. The unwrapped interferograms are used to initialize this dataset. The coherence, dem, and mask files can be accessed as attributes coh_dataset, dem_dataset, and mask_dataset respectively.

pattern_unw = '*'#

pattern used to find interferogram files.

pattern_coh = '*'#

pattern used to find coherence files.

__init__(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: Any | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose=True, keep_common: bool = True) None#

Initialize a new InterferogramDataset instance.

Parameters:
  • root_dir (str) – root directory where the dataset can be found.

  • paths_unw (list of str, optional) – list of unwrapped interferogram file paths to use instead of searching for files in root_dir. If None, files will be searched for in root_dir.

  • paths_coh (list of str, optional) – list of coherence file paths to use instead of searching for files in root_dir. If None, files will be searched for in root_dir.

  • crs (CRS, optional) – the output coordinate reference system (CRS) of the dataset. If None, the CRS of the first file found will be used.

  • res (float, optional) – resolution of the output dataset in units of CRS. If None, the resolution of the first file found will be used.

  • dtype (numpy.dtype, optional) – data type of the output dataset. If None, the data type of the first file found will be used.

  • nodata (float or int, optional) – no data value of the output dataset. If None, the no data value of the first file found will be used.

  • roi (BoundingBox, optional) – region of interest to load from the dataset. If None, the union of the bounds of all files in the dataset will be used.

  • bands_unw (list of str, optional) – names of bands to return (defaults to all bands) for unwrapped interferograms.

  • bands_coh (list of str, optional) – names of bands to return (defaults to all bands) for coherence.

  • cache (bool, optional) – if True, cache file handle to speed up repeated sampling

  • resampling (Resampling, optional) – Resampling algorithm used when reading input files. Default: Resampling.nearest.

  • fill_nodata (bool, optional) –

    Whether to fill holes in the queried data by interpolating them with the inverse distance weighting method provided by rasterio.fill.fillnodata(). Default: False.

    Note

    This parameter is only used when sampling data using bounding boxes or polygons queries, and will not work for points queries.

  • verbose (bool, optional, default: True) – if True, print verbose output.

  • keep_common (bool, optional, default: True) – Only used when the number of interferograms and coherence files are not equal. If True, keep the common pairs of interferograms and coherence files and raise a warning. If False, raise an error.
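
The keep_common behaviour described above can be sketched as follows. The helper `match_pairs`, the pair names, the warning text and the choice of ValueError are all assumptions made for this illustration; the actual exception type and matching logic in faninsar are not stated here.

```python
import warnings


def match_pairs(unw_pairs, coh_pairs, keep_common=True):
    """Return the interferogram pairs that also have a coherence file."""
    if len(unw_pairs) == len(coh_pairs):
        return list(unw_pairs)  # counts agree, keep_common is not consulted
    if not keep_common:
        raise ValueError("numbers of interferogram and coherence files differ")
    coh_set = set(coh_pairs)
    common = [p for p in unw_pairs if p in coh_set]
    warnings.warn(f"file counts differ, keeping {len(common)} common pairs")
    return common


unw = ["20190610_20190622", "20190622_20190704", "20190704_20190716"]
coh = ["20190610_20190622", "20190704_20190716"]

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo
    kept = match_pairs(unw, coh)

print(kept)
```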

classmethod parse_pairs(paths: list[Path]) Pairs#

Used to parse pairs from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse pairs

Returns:

pairs – pairs parsed from filenames

Return type:

Pairs object

Example

for the HyP3 dataset, pairs are parsed from the filenames as follows:

>>> names = [f.name for f in paths]
>>> pair_names = ['_'.join(i.split("_")[1:3]) for i in names]

for the HyP3 dataset, the pair names are the second and third parts of the filename, separated by an underscore. After parsing the pair names, the Pairs object can be created by using the from_names method.

>>> pairs = Pairs.from_names(pair_names)

Note

  • The parse_pairs method must be implemented in subclass. If you are using InterferogramDataset directly, you must implement the parse_pairs method in your code.

  • The parse_pairs method must return a Pairs object.

Raises:

NotImplementedError – if the method is not implemented in a subclass, or if InterferogramDataset is used directly.

classmethod parse_datetime(paths: list[Path]) DatetimeIndex#

Used to parse datetime from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse datetime

Returns:

datetime – datetime parsed from filenames

Return type:

pd.DatetimeIndex

property coh_dataset: RasterDataset#

Return the coherence dataset.

property aps_dataset: RasterDataset | None#

Return the aps (Atmospheric Phase Screen) dataset. If None, no aps data is used.

property los_dataset: RasterDataset | None#

Return the LOS (line-of-sight) angle dataset. If None, no LOS angle data is used.

property dem_dataset: RasterDataset | None#

Return the DEM dataset. If None, no DEM data is used.

property mask_dataset: RasterDataset | None#

Return the mask dataset. If None, no Mask data is used.

set_aps_dataset(aps_dataset: ApsPairs | None = None, **kwargs: Any) None#

Set the aps dataset. If aps_dataset is None, a new ApsPairs object will be created using the kwargs.

Parameters:
  • aps_dataset (ApsPairs, optional) – An ApsPairs object. ApsPairs is used to remove the atmospheric phase screen (APS) from the unwrapped interferograms. If None, no APS data is used.

  • **kwargs (dict, optional) – Keyword arguments used to create a new ApsPairs object if aps_dataset is None.

set_los_dataset(los_dataset: RasterDataset | None = None, **kwargs: Any) None#

Set the LOS dataset. The LOS file can contain either the incidence angle (relative to vertical) or the look angle (relative to horizontal). This file is used to convert differential atmospheric phase from vertical to the line-of-sight (LOS) direction, or to convert LOS deformation phase to vertical.

Parameters:
  • los_dataset (RasterDataset, optional) – A RasterDataset object containing the los files.

  • **kwargs (dict, optional) – Keyword arguments used to create a new RasterDataset object if los_dataset is None.

set_dem_dataset(dem_dataset: RasterDataset | None = None, **kwargs: Any) None#

Set the dem dataset.

Parameters:
  • dem_dataset (RasterDataset, optional) – A RasterDataset object containing the dem file.

  • **kwargs (dict, optional) – Keyword arguments used to create a new RasterDataset object if dem_dataset is None.

set_mask_dataset(mask_dataset: RasterDataset | None = None, **kwargs) None#

Set the mask dataset.

load_los_ratio(roi: BoundingBox | None = None, angle_type: Literal['incidence', 'look'] = 'look') ndarray#

Load the LOS angle map and convert it to a ratio map for the given region of interest. The ratio map is used to convert differential atmospheric phase from vertical to the line-of-sight (LOS) direction, or to convert LOS deformation phase to vertical.

Parameters:
  • roi (BoundingBox, optional) – region of interest to load. If None, the roi of the dataset will be used.

  • angle_type (Literal['incidence', 'look'], optional) – angle type, one of [‘incidence’, ‘look’]. ‘incidence’ means incidence angle (relative to vertical) and ‘look’ means look angle (relative to horizontal). Default is ‘look’.
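
The exact formula behind the ratio map is not given here. As a hedged illustration only, the sketch below assumes the ratio is the cosine of the angle between the LOS and the vertical, using the two angle definitions listed above. The function `los_ratio` is hypothetical and works on a single angle in degrees rather than on a raster.

```python
import math


def los_ratio(angle_deg, angle_type="look"):
    """Cosine of the angle between the LOS and the vertical (assumed definition)."""
    angle = math.radians(angle_deg)
    if angle_type == "incidence":  # measured from the vertical
        return math.cos(angle)
    if angle_type == "look":       # measured from the horizontal, as defined above
        return math.sin(angle)
    raise ValueError(f"unknown angle_type: {angle_type!r}")


# 30 degrees from vertical and 60 degrees from horizontal describe the same geometry.
ratio_inc = los_ratio(30.0, "incidence")
ratio_look = los_ratio(60.0, "look")
print(ratio_inc, ratio_look)
```

Under this assumption, a vertical quantity would be multiplied by the ratio to project it onto the LOS, and an LOS quantity would be divided by it to recover the vertical component.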

to_netcdf(filename: str | Path, roi: BoundingBox | None = None, ref_points: Points | None = None) None#

Save the dataset to a netCDF file for given region of interest.

Parameters:
  • filename (str) – path to the netCDF file to save

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

  • ref_points (Points, optional, default: None) – reference points to save. If None, will keep the original values.

to_tiffs(out_dir: str | Path, roi: BoundingBox | None = None, ref_points: Points | None = None, pairs: Pairs | None = None, pdc: PhaseDeformationConverter | None = None, los_ratio: ndarray | None = None, names_unw: list[str] | None = None, names_coh: list[str] | None = None, overwrite: bool = True) None#

Save the dataset to files for given region of interest.

Parameters:
  • out_dir (str) – path to the directory to save the files

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

  • ref_points (Points, optional, default: None) – reference points to save. If None, will keep the original values.

  • pairs (Pairs, optional) – pairs to save. If None, will save all pairs.

  • pdc (PhaseDeformationConverter, optional) – PhaseDeformationConverter object used to convert the phase to deformation. If None, will save the phase.

  • los_ratio (np.ndarray, optional) – los angle ratio map used to convert deformation from line-of-sight (LOS) direction to vertical. You can use the method load_los_ratio() to load the los angle ratio map. If None, will save the LOS deformation.

  • names_unw (list of str, optional) – names of the unwrapped interferogram files to save. If None, original names will be used. If pairs is not None, names must have the same length as pairs.

  • names_coh (list of str, optional) – names of the coherence files to save. If None, original names will be used. If pairs is not None, names must have the same length as pairs.

  • overwrite (bool, optional) – if True, overwrite the existing files. Default is True.

HyP3S1#

class faninsar.datasets.HyP3S1(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: Any | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose=True, keep_common: bool = True)#

Bases: InterferogramDataset

A dataset that manages the data of the HyP3 Sentinel-1 product.

HyP3 is a service for processing Synthetic Aperture Radar (SAR) imagery. This class is used to manage the data of the HyP3 product.

pattern_unw = '*unw_phase.tif'#

pattern used to find interferogram files.

pattern_coh = '*corr.tif'#

pattern used to find coherence files.

classmethod parse_pairs(paths: list[Path]) Pairs#

Parse the primary and secondary date/acquisition of the interferogram to generate Pairs object.

classmethod parse_datetime(paths: list[Path]) DatetimeIndex#

Parse the datetime of the interferogram to generate DatetimeIndex object.

HyP3S1Burst#

class faninsar.datasets.HyP3S1Burst(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: Any | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose=True, keep_common: bool = True)#

Bases: InterferogramDataset

A dataset that manages the data of the HyP3 Sentinel-1 Burst product.

HyP3 is a service for processing Synthetic Aperture Radar (SAR) imagery. This class is used to manage the data of the HyP3 product.

pattern_unw = '*unw_phase.tif'#

pattern used to find interferogram files.

pattern_coh = '*corr.tif'#

pattern used to find coherence files.

classmethod parse_pairs(paths: list[Path]) Pairs#

Parse the primary and secondary date/acquisition of the interferogram to generate Pairs object.

classmethod parse_datetime(paths: list[Path]) DatetimeIndex#

Parse the datetime of the interferogram to generate DatetimeIndex object.

LiCSAR#

class faninsar.datasets.LiCSAR(root_dir: str = 'data', paths_unw: Sequence[str | Path] | None = None, paths_coh: Sequence[str | Path] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: Any | None = None, roi: BoundingBox | None = None, bands_unw: Sequence[str] | None = None, bands_coh: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose=True, keep_common: bool = True)#

Bases: InterferogramDataset

A dataset that manages the data of the LiCSAR product.

LiCSAR is an automated Sentinel-1 InSAR processor whose products can be downloaded from the COMET-LiCS portal.

pattern_unw = '*geo.unw.tif'#

pattern used to find interferogram files.

pattern_coh = '*geo.cc.tif'#

pattern used to find coherence files.

pattern_dem = '*geo.hgt.tif'#

pattern used to find dem file

pattern_E = '*geo.E.tif'#

pattern used to find E files

pattern_N = '*geo.N.tif'#

pattern used to find N files

pattern_U = '*geo.U.tif'#

pattern used to find U files

pattern_baselines = 'baselines'#

pattern used to find baselines file

pattern_polygon = '*-poly.txt'#

pattern used to find polygon file

property meta_files: Series#

Return the paths of the LiCSAR metadata files as a pandas Series. Metadata files include: DEM, U, E, N, baselines, polygon.

classmethod parse_pairs(paths: list[Path]) Pairs#

Parse the primary and secondary date/acquisition of the interferogram to generate Pairs object.

classmethod parse_datetime(paths: list[Path]) DatetimeIndex#

Parse the datetime of the interferogram to generate DatetimeIndex object.

ApsDataset#

class faninsar.datasets.ApsDataset(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '')#

Bases: RasterDataset

A base class for aps (atmospheric phase screen) datasets.

pattern = '*'#

This expression is used to find the APS files.

to_pair_files(out_dir: str | Path, pairs: Pairs, ref_points: Points, roi: BoundingBox | None = None, overwrite: bool = False, prefix: str = 'APS')#

Generate aps-pair files for given pairs and reference points.

Parameters:
  • out_dir (str or Path) – path to the directory to save the aps-pair files

  • pairs (Pairs) – pairs to generate aps-pair files

  • ref_points (Points) – reference points whose values are subtracted from all aps-pair files

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

  • overwrite (bool, optional) – if True, overwrite existing files, default: False

  • prefix (str, optional) – prefix of the aps-pair files, default: “APS”
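
Conceptually, an aps-pair file combines the APS maps of the two acquisition dates of a pair and removes the value at the reference point. The sketch below illustrates this on tiny flattened maps. The sign convention (secondary minus primary), the use of a single reference pixel, and all names and values are assumptions for illustration; the actual implementation may differ.

```python
# Hypothetical per-date delay maps, flattened to lists of pixel values.
aps = {
    "20190610": [1.0, 2.0, 3.0],
    "20190622": [1.5, 2.0, 4.0],
}
ref_index = 0  # hypothetical position of the reference pixel


def aps_pair(primary, secondary):
    """Secondary minus primary, shifted so that the reference pixel is zero."""
    diff = [s - p for p, s in zip(aps[primary], aps[secondary])]
    ref_value = diff[ref_index]
    return [d - ref_value for d in diff]


pair_map = aps_pair("20190610", "20190622")
print(pair_map)
```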

abstractmethod classmethod parse_dates(paths: Sequence[str] | None = None) DatetimeIndex#

Used to parse acquisition dates from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse datetime

Returns:

datetime – datetime parsed from filenames

Return type:

pd.DatetimeIndex

GACOS#

class faninsar.datasets.GACOS(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '')#

Bases: ApsDataset

A dataset that manages the data of the GACOS product.

GACOS (Generic Atmospheric Correction Online Service for InSAR) is an online service that provides zenith total delay maps for correcting atmospheric delays. This class is used to manage the data of the GACOS product.

Examples

>>> from pathlib import Path
>>> import numpy as np
>>> from faninsar.datasets import GACOS
>>> from faninsar.datasets import HyP3S1
>>> from faninsar.query import BoundingBox, Points
>>> hyp3_dir = Path("/Volumes/Data/Hyp3/descending_roi")
>>> home_dir = Path("/Volumes/Data/Hyp3/descending_gacos")
>>> out_dir = Path("/Volumes/Data/Hyp3/descending_gacos_pairs")

prepare reference points and roi (region of interest)

>>> ref_points_file = Path("/Volumes/Data/ARPs.geojson")
>>> ref_points = Points.from_shapefile(ref_points_file)
>>> roi = BoundingBox(98.57726618, 38.52546262, 99.41100273, 39.13802703, crs=4326)

initialize HyP3S1

>>> ds_hyp3 = HyP3S1(hyp3_dir)

using HyP3 crs and res as the output crs and res of GACOS dataset

>>> gacos = GACOS(home_dir, crs=ds_hyp3.crs, res=ds_hyp3.res, nodata=np.nan)

using reference points, roi and HyP3 pairs to generate gacos pair files

>>> gacos.to_pair_files(out_dir, ds_hyp3.pairs, ref_points, roi)
pattern = '*.ztd.tif'#

This expression is used to find the GACOS files.

classmethod parse_dates(paths: list[Path])#

Used to parse acquisition dates from filenames. Must be implemented in subclass.

Parameters:

paths (list of pathlib.Path) – list of file paths to parse datetime

Returns:

datetime – datetime parsed from filenames

Return type:

pd.DatetimeIndex

to_pair_files(out_dir: str | Path, pairs: Pairs, ref_points: Points, roi: BoundingBox | None = None, overwrite: bool = False, prefix: str = 'GACOS')#

Generate aps-pair files for given pairs and reference points.

Parameters:
  • out_dir (str or Path) – path to the directory to save the aps-pair files

  • pairs (Pairs) – pairs to generate aps pair files

  • ref_points (Points) – reference points whose values are subtracted from all aps-pair files

  • roi (BoundingBox, optional) – region of interest to save. If None, the roi of the dataset will be used.

  • overwrite (bool, optional) – if True, overwrite existing files, default: False

  • prefix (str, optional) – prefix of the aps-pair files, default: “GACOS”

GACOSPairs#

class faninsar.datasets.GACOSPairs(root_dir: str = 'data', paths: Sequence[str] | None = None, crs: CRS | None = None, res: float | tuple[float, float] | None = None, dtype: dtype | None = None, nodata: float | int | Any | None = None, roi: BoundingBox | None = None, bands: Sequence[str] | None = None, cache: bool = True, resampling=Resampling.nearest, fill_nodata: bool = False, verbose: bool = True, ds_name: str = '')#

Bases: ApsPairs

A dataset that manages the data of GACOS pairs.

pattern = '*.tif'#

This expression is used to find the GACOSPairs files.