PyHydroGeophysX.data_access package
Submodules
PyHydroGeophysX.data_access.accessors module
Accessor abstraction for hydro-model data.
Two concrete implementations and one abstract base:
- LocalHydroAccessor – reads directly from a local directory.
- HttpHydroAccessor – downloads files on demand from an HTTP base URL.
- BaseHydroAccessor – abstract base defining the interface.
Usage
>>> acc = LocalHydroAccessor("/path/to/data")
>>> ok, summary, errors = acc.validate()
>>> local_dir = acc.materialize(["Watercontent.npy", "top.txt"], "/tmp/work")
- class PyHydroGeophysX.data_access.accessors.BaseHydroAccessor
Bases: ABC
Abstract base for hydro-data access.
- abstract list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
- abstract materialize(required_files: List[str], target_dir: str) -> str
Ensure required_files are available in target_dir.
For local accessors this may be a no-op (return source dir). For HTTP accessors this downloads missing files.
- Return type:
str – local directory path containing the files.
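The two abstract methods above define the whole contract a subclass has to satisfy. The following is a minimal sketch of that contract, not the package's own code: `SketchBaseAccessor` is a hypothetical stand-in that only mirrors the documented signatures, and `CopyingAccessor` is an illustrative subclass that copies files instead of returning the source directory. The empty `timesteps` and `variables` lists are placeholders, since the source does not say how those are derived.

```python
import os
import shutil
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchBaseAccessor(ABC):
    """Hypothetical stand-in mirroring the two documented abstract methods."""

    @abstractmethod
    def list_available_items(self) -> Dict[str, Any]:
        ...

    @abstractmethod
    def materialize(self, required_files: List[str], target_dir: str) -> str:
        ...


class CopyingAccessor(SketchBaseAccessor):
    """Illustrative subclass: copies required files from a source directory."""

    def __init__(self, source_dir: str) -> None:
        self.source_dir = source_dir

    def list_available_items(self) -> Dict[str, Any]:
        # Only plain files are listed; timesteps/variables are left empty
        # because their derivation is not described in the documentation.
        files = sorted(
            name for name in os.listdir(self.source_dir)
            if os.path.isfile(os.path.join(self.source_dir, name))
        )
        return {"files": files, "timesteps": [], "variables": []}

    def materialize(self, required_files: List[str], target_dir: str) -> str:
        os.makedirs(target_dir, exist_ok=True)
        for name in required_files:
            src = os.path.join(self.source_dir, name)
            if not os.path.isfile(src):
                raise FileNotFoundError(f"required file not found: {name}")
            dst = os.path.join(target_dir, name)
            if not os.path.exists(dst):
                shutil.copy2(src, dst)  # copy only what is missing
        return target_dir
```

Because both methods are abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement both can be created.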
- class PyHydroGeophysX.data_access.accessors.HttpHydroAccessor(manifest_entry: Dict[str, Any], cache_dir: str | None = None)
Bases: BaseHydroAccessor
Download hydro data on demand from an HTTP base URL.
- Parameters:
manifest_entry (dict) – A single dataset entry from manifest.json.
cache_dir (str, optional) – Directory for caching downloaded files. Defaults to a temp directory.
- list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
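The download-on-demand behaviour described for this class (fetch from a base URL, cache locally, default to a temp directory) can be sketched as below. This is an illustration under stated assumptions, not the class's actual implementation: `download_missing`, `_http_get`, the injectable `fetch` parameter, and the `.part` temporary suffix are all hypothetical names introduced here.

```python
import os
import tempfile
import urllib.request
from typing import Callable, List, Optional


def _http_get(url: str) -> bytes:
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def download_missing(
    base_url: str,
    required_files: List[str],
    cache_dir: Optional[str] = None,
    fetch: Callable[[str], bytes] = _http_get,
) -> str:
    """Download only the files not already in cache_dir; return cache_dir."""
    if cache_dir is None:
        # Mirrors the documented default: fall back to a temp directory.
        cache_dir = tempfile.mkdtemp(prefix="hydro_cache_")
    os.makedirs(cache_dir, exist_ok=True)
    for name in required_files:
        dst = os.path.join(cache_dir, name)
        if os.path.exists(dst):
            continue  # already cached, skip the network round trip
        url = base_url.rstrip("/") + "/" + name
        data = fetch(url)
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        tmp = dst + ".part"
        with open(tmp, "wb") as fh:
            fh.write(data)
        os.replace(tmp, dst)
    return cache_dir
```

Passing `fetch` as a parameter is a design choice for this sketch: it lets the caching logic be exercised with a fake fetcher, without touching the network.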
- class PyHydroGeophysX.data_access.accessors.LocalHydroAccessor(root_path: str)
Bases: BaseHydroAccessor
Read hydro data from a local filesystem directory.
- list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
- PyHydroGeophysX.data_access.accessors.get_manifest_entry(dataset_id: str, manifest_path: str | None = None) -> Dict[str, Any] | None
Return the manifest entry whose id is dataset_id, or None if no entry has that id.
- PyHydroGeophysX.data_access.accessors.load_manifest(manifest_path: str | None = None) -> Dict[str, Any]
Load the dataset manifest JSON.
- Parameters:
manifest_path (str, optional) – Path to manifest.json. Defaults to datasets/manifest.json relative to the repository root (two levels up from this file).
- Returns:
Parsed manifest with a datasets key.
- Return type:
dict
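The manifest loading and lookup described above can be sketched as follows. The documentation only states that the parsed manifest has a `datasets` key and that entries are found by id, so this sketch assumes `datasets` is a list of dicts that each carry an `"id"` field; the real layout may differ. `load_manifest_sketch` and `get_entry_sketch` are hypothetical names, and unlike the documented functions they require an explicit path instead of defaulting to the repository's manifest.

```python
import json
from typing import Any, Dict, Optional


def load_manifest_sketch(manifest_path: str) -> Dict[str, Any]:
    """Parse a manifest JSON file and check that it has a 'datasets' key."""
    with open(manifest_path, "r", encoding="utf-8") as fh:
        manifest = json.load(fh)
    if "datasets" not in manifest:
        raise KeyError("manifest has no 'datasets' key")
    return manifest


def get_entry_sketch(dataset_id: str, manifest_path: str) -> Optional[Dict[str, Any]]:
    """Return the first entry whose 'id' matches dataset_id, or None."""
    # Assumption: "datasets" is a list of dicts, each with an "id" field.
    for entry in load_manifest_sketch(manifest_path)["datasets"]:
        if entry.get("id") == dataset_id:
            return entry
    return None
```

An entry found this way has the shape that `HttpHydroAccessor` expects for its `manifest_entry` parameter, which the documentation describes as a single dataset entry from manifest.json.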
Module contents
Data access abstraction for PyHydroGeophysX.
Provides accessor classes that unify local filesystem and HTTP-based data loading for the Hydro-to-Geophysics workflow.
- class PyHydroGeophysX.data_access.BaseHydroAccessor
Bases: ABC
Abstract base for hydro-data access.
- abstract list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
- abstract materialize(required_files: List[str], target_dir: str) -> str
Ensure required_files are available in target_dir.
For local accessors this may be a no-op (return source dir). For HTTP accessors this downloads missing files.
- Return type:
str – local directory path containing the files.
- class PyHydroGeophysX.data_access.HttpHydroAccessor(manifest_entry: Dict[str, Any], cache_dir: str | None = None)
Bases: BaseHydroAccessor
Download hydro data on demand from an HTTP base URL.
- Parameters:
manifest_entry (dict) – A single dataset entry from manifest.json.
cache_dir (str, optional) – Directory for caching downloaded files. Defaults to a temp directory.
- list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
- class PyHydroGeophysX.data_access.LocalHydroAccessor(root_path: str)
Bases: BaseHydroAccessor
Read hydro data from a local filesystem directory.
- list_available_items() -> Dict[str, Any]
Return available timesteps / variables / files.
- Return type:
dict with keys like files, timesteps, variables.
- PyHydroGeophysX.data_access.get_manifest_entry(dataset_id: str, manifest_path: str | None = None) -> Dict[str, Any] | None
Return the manifest entry whose id is dataset_id, or None if no entry has that id.
- PyHydroGeophysX.data_access.load_manifest(manifest_path: str | None = None) -> Dict[str, Any]
Load the dataset manifest JSON.
- Parameters:
manifest_path (str, optional) – Path to manifest.json. Defaults to datasets/manifest.json relative to the repository root (two levels up from this file).
- Returns:
Parsed manifest with a datasets key.
- Return type:
dict