Data Processing Module

The data_processing module provides tools for loading, quality-checking, and exporting geophysical field data.

ERT Data Agent

Overview

The ert_data_agent module provides a standardized interface for working with electrical resistivity tomography (ERT) field data. It integrates with RESIPY to support data loading from multiple commercial instruments and provides quality control visualization and export functionality.

Key Features:

  • Load ERT data from 14+ commercial instruments (E4D, Syscal, ABEM, Sting, ARES, etc.)

  • Automatic coordinate reference system handling (local, projected, geographic)

  • Quality control visualizations (histograms, pseudosections)

  • Export to pyGIMLi/BERT format for inversion

  • Support for time-lapse ERT surveys

Data Structures

LocalRef

class LocalRef

Named tuple for local coordinate reference system parameters.

Parameters:
  • origin_x (float) – X-coordinate of profile origin in world coordinates (default: 0.0)

  • origin_y (float) – Y-coordinate of profile origin in world coordinates (default: 0.0)

  • azimuth_deg (float) – Profile azimuth in degrees clockwise from north (default: 0.0)
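One way these three parameters could map a distance along the profile to world coordinates is sketched below. This is an illustration under stated assumptions, not the module's implementation: the `LocalRef` class here is a stand-in that mirrors the documented fields, `profile_to_world` is a hypothetical helper, and the transform assumes the profile's local x-axis points along the azimuth.

```python
import math
from typing import NamedTuple, Tuple


class LocalRef(NamedTuple):
    """Stand-in mirroring the documented fields (not imported from the package)."""
    origin_x: float = 0.0
    origin_y: float = 0.0
    azimuth_deg: float = 0.0


def profile_to_world(distance: float, ref: LocalRef) -> Tuple[float, float]:
    """Hypothetical helper: map a distance along the profile to world (x, y).

    The azimuth is measured clockwise from north, so the easting
    component uses sin and the northing component uses cos.
    """
    az = math.radians(ref.azimuth_deg)
    x = ref.origin_x + distance * math.sin(az)
    y = ref.origin_y + distance * math.cos(az)
    return x, y


# Illustrative origin; a profile heading due east (azimuth 90 degrees).
ref = LocalRef(origin_x=500000.0, origin_y=4000000.0, azimuth_deg=90.0)
x, y = profile_to_world(30.0, ref)
print(round(x, 3), round(y, 3))
```

With an azimuth of 90 degrees, distance along the profile adds to the easting only; with 0 degrees it adds to the northing only.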

Electrode

class Electrode

Dataclass representing a single electrode.

Parameters:
  • id (int) – Electrode identifier

  • x (float) – X-coordinate

  • y (float) – Y-coordinate (default: 0.0)

  • z (float) – Z-coordinate/elevation (default: 0.0)

Quadruplet

class Quadruplet

Dataclass representing a 4-electrode measurement configuration.

Parameters:
  • A (int) – Current injection electrode A

  • B (int) – Current injection electrode B

  • M (int) – Potential measurement electrode M

  • N (int) – Potential measurement electrode N
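To show how a quadruplet relates to the geometric factor K used later in this section, here is a small sketch using the standard formula for point electrodes on a flat half-space surface. The function name `geometric_factor` and the id-to-position dictionary are hypothetical, and the module (or RESIPY) may compute K differently, for example when topography is present.

```python
import math
from typing import Dict, Tuple


def geometric_factor(quad: Tuple[int, int, int, int],
                     x_by_id: Dict[int, float]) -> float:
    """Flat half-space geometric factor for a quadruplet (A, B, M, N).

    K = 2*pi / (1/AM - 1/BM - 1/AN + 1/BN), with distances taken
    along a straight surface profile.
    """
    a, b, m, n = (x_by_id[e] for e in quad)
    inv = (1.0 / abs(a - m) - 1.0 / abs(b - m)
           - 1.0 / abs(a - n) + 1.0 / abs(b - n))
    return 2.0 * math.pi / inv


# Four electrodes at 3 m spacing (illustrative), Wenner layout A M N B.
positions = {1: 0.0, 2: 3.0, 3: 6.0, 4: 9.0}
k = geometric_factor((1, 4, 2, 3), positions)  # A=1, B=4, M=2, N=3
print(round(k, 2))
```

For a Wenner array with spacing a the formula reduces to K = 2πa, which makes a convenient sanity check.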

Observation

class Observation

Dataclass representing a single ERT measurement.

Parameters:
  • quad (Quadruplet) – 4-electrode configuration

  • app_res (float | None) – Apparent resistivity in Ω·m (optional)

  • dV (float | None) – Measured potential difference in V (optional)

  • I (float | None) – Injected current in A (optional)

  • resist (float | None) – Measured resistance in Ω (optional)

  • K (float | None) – Geometric factor (optional)

  • err (float | None) – Measurement error/uncertainty (optional)

  • valid (bool | None) – Validity flag (optional)
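Because most fields are optional, a consumer often has to derive missing quantities from the ones that are present. The sketch below uses stand-in dataclasses that mirror the documented fields (the `None` defaults are an assumption) and a hypothetical helper, `fill_derived`, that applies R = dV / I and ρa = K · R. It does not reproduce the module's own conversion logic.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Quadruplet:
    A: int
    B: int
    M: int
    N: int


@dataclass
class Observation:
    quad: Quadruplet
    app_res: Optional[float] = None  # defaults of None are assumed here
    dV: Optional[float] = None
    I: Optional[float] = None
    resist: Optional[float] = None
    K: Optional[float] = None
    err: Optional[float] = None
    valid: Optional[bool] = None


def fill_derived(obs: Observation) -> Observation:
    """Hypothetical helper: return a copy with resist and app_res filled in
    wherever the inputs needed to compute them are available."""
    resist = obs.resist
    if resist is None and obs.dV is not None and obs.I:
        resist = obs.dV / obs.I      # Ohm's law: R = dV / I
    app_res = obs.app_res
    if app_res is None and resist is not None and obs.K is not None:
        app_res = obs.K * resist     # apparent resistivity: rho_a = K * R
    return replace(obs, resist=resist, app_res=app_res)


obs = Observation(quad=Quadruplet(1, 2, 3, 4), dV=0.5, I=0.1, K=18.85)
filled = fill_derived(obs)
print(filled.resist, filled.app_res)
```

Values that cannot be derived stay `None`, so downstream code can still tell a missing measurement from a computed one.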

ERTDataset

class ERTDataset

Dataclass representing a complete ERT survey dataset.

Parameters:
  • electrodes (List[Electrode]) – List of electrode positions

  • observations (List[Observation]) – List of measurements

  • crs (str) – Coordinate reference system (‘local’, ‘EPSG:XXXX’, or ‘WGS84’)

  • local_ref (LocalRef | None) – Local coordinate reference (optional)

  • epsg (int | None) – EPSG code for projected coordinates (optional)

  • metadata (Dict[str, Any]) – Additional survey metadata

Functions

load_ert_resipy

load_ert_resipy(project_dir: str, data_file: str, instrument: str, crs: str = 'local', local_ref: LocalRef | None = None, epsg: int | None = None) → ERTDataset

Load ERT field data using the RESIPY library, with support for multiple instruments.

Parameters:
  • project_dir (str) – Directory for RESIPY project (working directory)

  • data_file (str) – Path to ERT data file (relative or absolute)

  • instrument (str) – Instrument type (see Supported Instruments below)

  • crs (str) – Coordinate reference system (‘local’, ‘EPSG:XXXX’, or ‘WGS84’)

  • local_ref (LocalRef | None) – Local coordinate reference parameters (required if crs=‘local’)

  • epsg (int | None) – EPSG code for projected coordinates (required if crs starts with ‘EPSG:’)

Returns:

Complete ERT dataset with electrodes, measurements, and metadata

Return type:

ERTDataset

Raises:
  • ImportError – If RESIPY is not installed

  • FileNotFoundError – If data_file does not exist

  • ValueError – If instrument type is not supported or CRS parameters are invalid
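The CRS rules stated in the parameter list can be checked before calling the loader. The following is a minimal sketch of such a pre-flight check; `check_crs_args` is a hypothetical helper, and the extra test that the code in `crs` matches `epsg` is an assumption. How strictly `load_ert_resipy` itself enforces each rule is not stated here.

```python
from typing import Optional, Tuple


def check_crs_args(crs: str = "local",
                   local_ref: Optional[Tuple[float, float, float]] = None,
                   epsg: Optional[int] = None) -> None:
    """Hypothetical pre-flight check mirroring the documented CRS rules.

    Raises ValueError when the combination of arguments is inconsistent.
    """
    if crs == "local":
        if local_ref is None:
            raise ValueError("crs='local' requires local_ref")
    elif crs.startswith("EPSG:"):
        if epsg is None:
            raise ValueError(f"crs={crs!r} requires epsg")
        if crs != f"EPSG:{epsg}":
            # Assumed extra rule: the two ways of giving the code must agree.
            raise ValueError(f"crs={crs!r} does not match epsg={epsg}")
    elif crs != "WGS84":
        raise ValueError(f"unsupported crs: {crs!r}")


# These combinations pass silently.
check_crs_args("local", local_ref=(0.0, 0.0, 90.0))
check_crs_args("EPSG:32615", epsg=32615)
check_crs_args("WGS84")

# An inconsistent combination is reported.
try:
    check_crs_args("EPSG:32615", epsg=4326)
except ValueError as exc:
    print(f"rejected: {exc}")
```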

Supported Instruments:

  • Protocol DC - Iris Instruments Protocol DC systems

  • Syscal - Iris Instruments Syscal systems

  • Protocol IP - Iris Instruments Protocol IP systems

  • ResInv - ResInv format files

  • PRIME/RESIMGR - Prime/Resimgr format

  • Sting - AGI Sting systems

  • ABEM-Lund - ABEM/Lund systems

  • Lippmann - Lippmann systems

  • ARES - GF Instruments ARES systems

  • BERT - pyGIMLi/BERT format files

  • E4D - E4D format (common in watershed monitoring)

  • DAS-1 - DAS-1 systems

  • Electra - Electra systems

  • Custom - Custom data formats

  • Merged - Merged datasets

Example:

from PyHydroGeophysX.data_processing.ert_data_agent import (
    load_ert_resipy, LocalRef
)

# Load E4D data in local coordinates
ert = load_ert_resipy(
    project_dir="data/ERT/E4D",
    data_file="data/ERT/E4D/2021-10-08_1400.ohm",
    instrument="E4D",
    crs="local",
    local_ref=LocalRef(origin_x=0.0, origin_y=0.0, azimuth_deg=90.0)
)

# Load Syscal data in UTM coordinates
ert = load_ert_resipy(
    project_dir="data/ERT/Syscal",
    data_file="data/ERT/Syscal/survey.txt",
    instrument="Syscal",
    crs="EPSG:32615",  # UTM Zone 15N
    epsg=32615
)

Notes:

  • Function handles Windows/OneDrive permission issues automatically

  • Supports Unix-style paths on Windows

  • Flexible column name detection (app/rhoa/Rho, resError/magErr)

  • Automatically converts between resistance and apparent resistivity
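Flexible column-name detection of the kind mentioned in the notes can be pictured as an alias lookup. The sketch below uses the alias names listed above; the canonical keys (`app_res`, `err`), the helper `find_column`, and the case-insensitive matching are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Sequence

# Aliases come from the note above; the canonical keys are assumed names.
ALIASES: Dict[str, Sequence[str]] = {
    "app_res": ("app", "rhoa", "Rho"),
    "err": ("resError", "magErr"),
}


def find_column(columns: Iterable[str], aliases: Sequence[str]) -> Optional[str]:
    """Return the first column matching any alias, ignoring case.

    Aliases are tried in order, so earlier aliases take priority.
    """
    lookup = {name.lower(): name for name in columns}
    for alias in aliases:
        hit = lookup.get(alias.lower())
        if hit is not None:
            return hit
    return None


columns = ["a", "b", "m", "n", "resist", "Rho", "magErr"]
mapping = {key: find_column(columns, names) for key, names in ALIASES.items()}
print(mapping)  # {'app_res': 'Rho', 'err': 'magErr'}
```

Returning `None` for a missing column, rather than raising, leaves the caller free to fall back on a derived value such as K · R.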

qc_and_visualize

qc_and_visualize(ert: ERTDataset, outdir: str = 'results') → Dict[str, str]

Generate quality control plots and summary statistics for an ERT dataset.

Parameters:
  • ert (ERTDataset) – ERT dataset from load_ert_resipy

  • outdir (str) – Output directory for plots and reports

Returns:

Dictionary mapping artifact types to file paths

Return type:

Dict[str, str]

Generated Artifacts:

  • rhoa_hist.png: Histogram of log10 apparent resistivity values

  • pseudosection.png: Pseudosection plot (if supported by instrument)

  • data_summary.json: Statistical summary (count, mean, std, min, max, percentiles)
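The statistics listed for data_summary.json can be reproduced with the standard library alone. The sketch below is an assumption about what such a summary might contain: the key names, the choice of quartiles as the percentiles, and the helper `summarize_rhoa` are hypothetical and may not match the file the function actually writes.

```python
import json
import math
import statistics
from typing import Dict, List, Optional


def summarize_rhoa(rhoa: List[Optional[float]]) -> Dict[str, float]:
    """Hypothetical summary of the positive apparent resistivities."""
    # Non-positive values cannot be shown on a log10 histogram, so drop them.
    values = sorted(v for v in rhoa if v is not None and v > 0)
    if len(values) < 2:
        raise ValueError("need at least two positive values")
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "min": values[0],
        "max": values[-1],
        "p25": q25,
        "p50": q50,
        "p75": q75,
        "log10_mean": statistics.fmean([math.log10(v) for v in values]),
    }


# Illustrative values only; the negative reading is filtered out.
summary = summarize_rhoa([50.0, 10.0, 30.0, -5.0, 20.0, 40.0])
print(json.dumps(summary, indent=2))
```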

Example:

from PyHydroGeophysX.data_processing.ert_data_agent import (
    load_ert_resipy, qc_and_visualize
)

ert = load_ert_resipy(
    project_dir="data/ERT/E4D",
    data_file="data/ERT/E4D/2021-10-08_1400.ohm",
    instrument="E4D"
)

artifacts = qc_and_visualize(ert, outdir="results/qc")
print(f"Histogram: {artifacts['histogram']}")
print(f"Summary: {artifacts['summary']}")

export_for_inversion

export_for_inversion(ert: ERTDataset, outdir: str = 'results', fmt: str = 'pgimli', filename: str = 'bert_data.dat') → str

Export an ERT dataset to a format suitable for inversion codes.

Parameters:
  • ert (ERTDataset) – ERT dataset from load_ert_resipy

  • outdir (str) – Output directory

  • fmt (str) – Export format (‘pgimli’ or ‘bert’)

  • filename (str) – Output filename (default: ‘bert_data.dat’)

Returns:

Path to exported file

Return type:

str

Supported Formats:

  • pgimli/bert: Unified data format for pyGIMLi/BERT inversion codes

File Structure (pyGIMLi/BERT):

112                                    # Number of electrodes
# x y z                                # Electrode coordinate header
0.0    0.0    3213.46                  # Electrode 1 coordinates
3.0    0.0    3211.65                  # Electrode 2 coordinates
...
237.0  0.0    3134.49                  # Electrode 112 coordinates
3647                                   # Number of measurements
# a b m n err i ip iperr k r rhoa u valid
1  2  3  4  0.05  0.1  0  0  1.23  45.6  56.1  4.56  1
...

Columns in measurement data:

  • a: Current injection electrode A (1-indexed)

  • b: Current injection electrode B (1-indexed)

  • m: Potential measurement electrode M (1-indexed)

  • n: Potential measurement electrode N (1-indexed)

  • err: Relative error (default: 0.05 = 5%)

  • i: Injected current in A

  • ip: Induced polarization (0 for DC-only)

  • iperr: IP error (0 for DC-only)

  • k: Geometric factor

  • r: Measured resistance in Ω

  • rhoa: Apparent resistivity in Ω·m

  • u: Voltage/potential difference in V

  • valid: Validity flag (1=valid, 0=invalid)
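To make the layout concrete, the sketch below writes a tiny file with the same sections and column order as shown above. It is not the module's exporter: `write_unified` is a hypothetical helper, the electrode and measurement values are illustrative, and any further sections a given inversion code may expect are left out.

```python
import io
from typing import Sequence, TextIO, Tuple

COLUMNS = ("a", "b", "m", "n", "err", "i", "ip", "iperr",
           "k", "r", "rhoa", "u", "valid")


def write_unified(out: TextIO,
                  electrodes: Sequence[Tuple[float, float, float]],
                  rows: Sequence[Sequence[float]]) -> None:
    """Write electrode and measurement blocks in the layout shown above."""
    out.write(f"{len(electrodes)}\n")
    out.write("# x y z\n")
    for x, y, z in electrodes:
        out.write(f"{x} {y} {z}\n")
    out.write(f"{len(rows)}\n")
    out.write("# " + " ".join(COLUMNS) + "\n")
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got {len(row)}")
        out.write(" ".join(str(v) for v in row) + "\n")


# Illustrative data: four electrodes and one measurement (1-indexed ids).
electrodes = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
rows = [(1, 2, 3, 4, 0.05, 0.1, 0, 0, 1.23, 45.6, 56.1, 4.56, 1)]

buffer = io.StringIO()
write_unified(buffer, electrodes, rows)
print(buffer.getvalue())
```

Writing to a file-like object keeps the helper easy to test; passing an open file handle instead of `io.StringIO` writes to disk.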

Example:

from PyHydroGeophysX.data_processing.ert_data_agent import (
    load_ert_resipy, export_for_inversion
)

ert = load_ert_resipy(
    project_dir="data/ERT/E4D",
    data_file="data/ERT/E4D/2021-10-08_1400.ohm",
    instrument="E4D"
)

# Export to pyGIMLi format
bert_path = export_for_inversion(
    ert,
    outdir="results/inversion",
    fmt="pgimli",
    filename="survey_2021-10-08.dat"
)
print(f"Exported to: {bert_path}")

Workflow Example

Complete workflow from field data to inversion-ready format:

from PyHydroGeophysX.data_processing.ert_data_agent import (
    load_ert_resipy, qc_and_visualize, export_for_inversion, LocalRef
)

# 1. Load field data
ert = load_ert_resipy(
    project_dir="data/ERT/E4D",
    data_file="data/ERT/E4D/2021-10-08_1400.ohm",
    instrument="E4D",
    crs="local",
    local_ref=LocalRef(origin_x=0.0, origin_y=0.0, azimuth_deg=90.0)
)

# 2. Quality control
artifacts = qc_and_visualize(ert, outdir="results/qc")
print(f"Generated QC plots: {artifacts}")

# 3. Export for inversion
bert_path = export_for_inversion(
    ert,
    outdir="results/inversion",
    fmt="pgimli"
)
print(f"Ready for inversion: {bert_path}")

# 4. Inspect dataset
print(f"Survey has {len(ert.electrodes)} electrodes")
print(f"Survey has {len(ert.observations)} measurements")
print(f"CRS: {ert.crs}")

Time-Lapse Surveys

For time-lapse monitoring, process each timestep separately:

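from pathlib import Path
from datetime import datetime

from PyHydroGeophysX.data_processing.ert_data_agent import (
    load_ert_resipy, export_for_inversion, LocalRef
)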

# Time-lapse data files
data_files = [
    "2021-10-08_1400.ohm",
    "2021-10-09_1400.ohm",
    "2021-10-10_1400.ohm",
]

# Process all timesteps
bert_files = []
for data_file in data_files:
    # Extract timestamp from filename
    timestamp = datetime.strptime(
        Path(data_file).stem, "%Y-%m-%d_%H%M"
    )

    # Load and process
    ert = load_ert_resipy(
        project_dir="data/ERT/E4D",
        data_file=f"data/ERT/E4D/{data_file}",
        instrument="E4D",
        crs="local",
        local_ref=LocalRef(origin_x=0.0, origin_y=0.0, azimuth_deg=90.0)
    )

    # Export with timestamp
    bert_path = export_for_inversion(
        ert,
        outdir="results/time_lapse",
        fmt="pgimli",
        filename=f"survey_{timestamp.strftime('%Y%m%d_%H%M')}.dat"
    )
    bert_files.append(bert_path)

print(f"Processed {len(bert_files)} time-lapse surveys")

Acknowledgments

The ERT data processing module is built on RESIPY, an intuitive open-source software for complex geoelectrical inversion/modeling developed by Guillaume Blanchy, Jimmy Boyd, and contributors.

This module integrates with pyGIMLi, an open-source library for geophysical modeling and inversion developed by Carsten Rücker, Thomas Günther, Florian Wagner, and contributors.

Citations:

RESIPY:

Blanchy, G., Saneiyan, S., Boyd, J., McLachlan, P., & Binley, A. (2020). ResIPy, an intuitive open source software for complex geoelectrical inversion/modeling. Computers & Geosciences, 137, 104423. https://doi.org/10.1016/j.cageo.2020.104423

pyGIMLi:

Rücker, C., Günther, T., & Wagner, F. M. (2017). pyGIMLi: An open-source library for modelling and inversion in geophysics. Computers & Geosciences, 109, 106-123. https://doi.org/10.1016/j.cageo.2017.07.011