confusius.validation

validation

Data validation utilities for confusius.

Modules:

  • coordinates

    Coordinate validation utilities.

  • fusi

    Validation helpers for ConfUSIus-style fUSI DataArrays.

  • iq

    IQ data validation utilities.

  • mask

    Mask validation utilities.

  • time_series

    Time series validation utilities.

Functions:

validate_fusi_dataarray

validate_fusi_dataarray(
    data: DataArray,
    *,
    require_time: bool = False,
    allow_pose: bool = True,
    allow_extra_dims: bool = True,
    minimum_spatial_dims: int = 2,
    require_regular_spacing: bool = False,
    regular_spacing_tolerance: float = 0.01,
    regular_spacing_dims: RegularSpacingDims = "space",
    require_canonical_dim_order: bool = False,
    require_spatial_voxdim: bool = False,
    require_spatial_units: bool = False,
    require_time_units: bool = False,
) -> None

Validate that a DataArray follows ConfUSIus fUSI conventions.

A valid fUSI DataArray must:

  • Have dimension names from the set (time, pose, z, y, x), with optional extra dimensions if allow_extra_dims is True (e.g., region, component, mask, etc.).
  • Have matching 1D coordinates for all core dimensions (time, pose, z, y, x). Extra-dimension coordinates are optional.
  • Have numeric, finite, and strictly increasing core dimension coordinates (time, pose, z, y, x).

Additional requirements can be enforced using the function parameters.
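
The baseline coordinate rules above (numeric, finite, strictly increasing) can be illustrated with a small standalone sketch. This is not the ConfUSIus implementation: `check_core_coordinate` is a hypothetical helper that uses only the standard library and works on a plain sequence rather than a DataArray coordinate.

```python
import math
from numbers import Real


def check_core_coordinate(name, values):
    """Raise ValueError unless `values` is numeric, finite, and strictly increasing.

    Hypothetical helper for illustration; not part of ConfUSIus.
    """
    values = list(values)
    for value in values:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"Coordinate {name!r} must be numeric, got {value!r}.")
        if not math.isfinite(value):
            raise ValueError(f"Coordinate {name!r} contains a non-finite value.")
    # Compare each value with its successor; equal neighbours are rejected too.
    for previous, current in zip(values, values[1:]):
        if current <= previous:
            raise ValueError(f"Coordinate {name!r} must be strictly increasing.")


check_core_coordinate("z", [0.0, 0.1, 0.2])  # passes silently
```

A repeated value such as `[0.0, 1.0, 1.0]` fails the last check because the comparison is strict.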

Parameters:

  • data

    (DataArray) –

    DataArray to validate.

  • require_time

    (bool, default: False ) –

    Whether to require a time dimension.

  • allow_pose

    (bool, default: True ) –

    Whether to allow a pose dimension.

  • allow_extra_dims

    (bool, default: True ) –

    Whether dimensions outside the ConfUSIus core set (time, pose, z, y, x) are allowed.

  • minimum_spatial_dims

    (int, default: 2 ) –

    Minimum number of spatial dimensions from ("z", "y", "x") required in the DataArray.

  • require_regular_spacing

    (bool, default: False ) –

    Whether numeric dimension coordinates must have regular spacing.

  • regular_spacing_tolerance

    (float, default: 1e-2 ) –

    Relative tolerance used to assess coordinate regularity.

  • regular_spacing_dims

    (('space', 'core', 'all'), default: "space" ) –

    Dimensions that must satisfy regular-spacing checks when require_regular_spacing=True. Use "space" for present z, y, x dimensions, "core" for present core dimensions (time, pose, z, y, x), "all" for all present dimensions, a string for one explicit dimension name, or a sequence for multiple explicit dimension names. Non-numeric coordinates are ignored.

  • require_canonical_dim_order

    (bool, default: False ) –

    Whether the ConfUSIus core dimensions present in the DataArray must appear in canonical relative order (time, pose, z, y, x).

  • require_spatial_voxdim

    (bool, default: False ) –

    Whether present spatial coordinates must define a voxdim attribute.

  • require_spatial_units

    (bool, default: False ) –

    Whether present spatial coordinates must define a units attribute.

  • require_time_units

    (bool, default: False ) –

    Whether the time coordinate must define a units attribute when present.

Raises:

  • TypeError

    If data is not an xarray.DataArray.

  • ValueError

    If dimension names are invalid, required dimensions or coordinates are missing, there are too few spatial dimensions, core numeric coordinate constraints fail, optional stricter checks fail, or required metadata is missing.
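
Two of the stricter optional checks, regular spacing and canonical dimension order, can be sketched as follows. Both helpers are hypothetical and standard-library only. In particular, measuring the tolerance relative to the median step is an assumption: the text above says only that `regular_spacing_tolerance` is a relative tolerance.

```python
import statistics

CORE_DIMS = ("time", "pose", "z", "y", "x")


def is_regularly_spaced(values, rtol=1e-2):
    """Return True if every step is within `rtol` of the median step.

    Assumption: the tolerance is relative to the median step size.
    """
    steps = [b - a for a, b in zip(values, values[1:])]
    if len(steps) < 2:
        return True  # zero or one step is trivially regular
    reference = statistics.median(steps)
    return all(abs(step - reference) <= rtol * abs(reference) for step in steps)


def in_canonical_order(dims):
    """Return True if the core dims present in `dims` follow CORE_DIMS order.

    Extra dimensions are ignored, so only the relative order matters.
    """
    present = [dim for dim in dims if dim in CORE_DIMS]
    expected = [dim for dim in CORE_DIMS if dim in dims]
    return present == expected


# Steps deviate by about 0.5% from the median, within rtol=1e-2.
print(is_regularly_spaced([0.0, 1.0, 2.005, 3.0]))
# The last step is twice as large as the others.
print(is_regularly_spaced([0.0, 1.0, 2.0, 4.0]))
# An extra leading dimension does not affect the relative order.
print(in_canonical_order(("region", "time", "y", "x")))
print(in_canonical_order(("z", "time", "y", "x")))
```

Because the order check is relative, an array with dimensions (region, time, y, x) passes even though region is not a core dimension.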

validate_iq_dataarray

validate_iq_dataarray(
    iq: DataArray, require_attrs: bool = False
) -> None

Validate that a DataArray contains valid IQ data.

This function checks that an IQ DataArray meets the requirements for processing with ConfUSIus functions. Validation checks include:

  1. Dimensions: The IQ DataArray must have exactly 4 dimensions in the order: (time, z, y, x).
  2. Coordinates: All dimensions must have corresponding coordinates.
  3. Data type: The data must be complex-valued (complex64 or complex128).
  4. Attributes (optional): If require_attrs is True, the DataArray must have the following attributes needed for axial velocity computation:

     • transmit_frequency: Ultrasound probe central frequency in Hz.
     • beamforming_sound_velocity: Speed of sound assumed during beamforming in meters per second.

Parameters:

  • iq

    (DataArray) –

    Input DataArray to validate. Must have dimensions (time, z, y, x), a coordinate for each dimension, and a complex dtype.

  • require_attrs

    (bool, default: False ) –

    Whether to require that the attributes needed for axial velocity computation (transmit_frequency, beamforming_sound_velocity) are present on the DataArray.

Raises:

  • ValueError

    If the DataArray does not have dimensions (time, z, y, x), if required coordinates are missing, or if required attributes are missing when require_attrs=True.

  • TypeError

    If the IQ data is not complex-valued.

Examples:

Validate a properly formatted IQ DataArray:

>>> import numpy as np
>>> import xarray as xr
>>> iq = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={
...         "time": np.arange(10),
...         "z": np.arange(4),
...         "y": np.arange(6),
...         "x": np.arange(8),
...     },
...     attrs={
...         "transmit_frequency": 15e6,
...         "beamforming_sound_velocity": 1540.0,
...     },
... )
>>> validate_iq_dataarray(iq, require_attrs=True)

Skip attribute validation for intermediate processing:

>>> iq_no_attrs = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={
...         "time": np.arange(10),
...         "z": np.arange(4),
...         "y": np.arange(6),
...         "x": np.arange(8),
...     },
... )
>>> validate_iq_dataarray(iq_no_attrs, require_attrs=False)

validate_labels

validate_labels(
    labels: DataArray,
    data: DataArray,
    labels_name: str = "labels",
    rtol: float = 1e-05,
    atol: float = 1e-08,
) -> None

Validate that a label map matches data spatial dimensions and coordinates.

Parameters:

  • labels

    (DataArray) –

    Label map to validate. Must have integer dtype and coordinates must match data. Accepts two formats:

    • Flat label map: Spatial dims only, e.g. (z, y, x). Background voxels labeled 0; each unique non-zero integer identifies a distinct, non-overlapping region. The regions coordinate of the output holds the integer label values.
    • Stacked mask format: Has a leading mask dimension followed by spatial dims, e.g. (mask, z, y, x). Each layer has values in {0, region_id} and regions may overlap. The region coordinate of the output holds the mask coordinate values (e.g., region label).
  • data

    (DataArray) –

    Data array to validate labels against.

  • labels_name

    (str, default: "labels" ) –

    Name of the labels parameter (used in error messages).

  • rtol

    (float, default: 1e-5 ) –

    Relative tolerance for coordinate comparison.

  • atol

    (float, default: 1e-8 ) –

    Absolute tolerance for coordinate comparison.

Raises:

  • TypeError

    If labels is not a DataArray with an integer dtype.

  • ValueError

    If labels dimensions don't match data or if coordinates don't match.
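
The two accepted label formats can be related with a short NumPy sketch. It uses plain arrays on a (y, x) grid rather than DataArrays, and the conversion shown is illustrative only. It is not a ConfUSIus function.

```python
import numpy as np

# Flat label map on a (y, x) grid: 0 is background, 1 and 2 are regions.
flat = np.array([[0, 1, 1],
                 [2, 2, 0]])

# Stacked mask format: one layer per region id, each with values in {0, region_id}.
region_ids = np.unique(flat[flat != 0])
stacked = np.stack([np.where(flat == rid, rid, 0) for rid in region_ids])

print(region_ids)
print(stacked.shape)  # (2, 2, 3): leading mask axis, then (y, x)
```

The stacked format is the more general of the two: its layers may overlap, whereas a flat map assigns at most one region to each voxel.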

validate_mask

validate_mask(
    mask: DataArray,
    data: DataArray,
    mask_name: str = "mask",
    rtol: float = 1e-05,
    atol: float = 1e-08,
    require_exact_dims: bool = False,
) -> None

Validate that a mask matches data spatial dimensions and coordinates.

Parameters:

  • mask

    (DataArray) –

    Mask to validate. Must have boolean dtype, or integer dtype with exactly one non-zero value (0 = background, one region id = foreground). The latter format is produced by Atlas.get_masks. Coordinates must match data.

  • data

    (DataArray) –

    Data array to validate mask against.

  • mask_name

    (str, default: "mask" ) –

    Name of the mask parameter (used in error messages).

  • rtol

    (float, default: 1e-5 ) –

    Relative tolerance for coordinate comparison.

  • atol

    (float, default: 1e-8 ) –

    Absolute tolerance for coordinate comparison.

  • require_exact_dims

    (bool, default: False ) –

    Whether mask.dims must match all non-time dimensions of data in the same order.

Raises:

  • TypeError

    If mask is not a boolean or single-label integer DataArray.

  • ValueError

    If mask dimensions don't match data or if coordinates don't match.
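
The dtype rule for masks can be sketched with NumPy. `has_valid_mask_values` is a hypothetical helper, not the library's check. Rejecting an all-zero integer mask is an assumption based on a literal reading of "exactly one non-zero value".

```python
import numpy as np


def has_valid_mask_values(mask):
    """Return True for boolean masks or integer masks with one non-zero value.

    Hypothetical helper; an all-zero integer mask is rejected here, which is
    an assumption rather than documented ConfUSIus behavior.
    """
    mask = np.asarray(mask)
    if mask.dtype == np.bool_:
        return True
    if np.issubdtype(mask.dtype, np.integer):
        # Count the distinct non-zero values (the candidate region ids).
        return np.unique(mask[mask != 0]).size == 1
    return False


print(has_valid_mask_values(np.array([[True, False]])))  # boolean mask
print(has_valid_mask_values(np.array([[0, 7], [7, 0]])))  # single region id 7
print(has_valid_mask_values(np.array([[0, 1], [2, 0]])))  # two region ids
```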

validate_matching_coordinates

validate_matching_coordinates(
    left: DataArray,
    right: DataArray,
    coord_names: Hashable
    | Iterable[Hashable]
    | None = None,
    *,
    left_name: str = "left array",
    right_name: str = "right array",
    rtol: float = 1e-05,
    atol: float = 1e-08,
) -> None

Validate that selected coordinates match between two DataArrays.

Comparison is performed on coordinate values rather than the full coordinate DataArray, so unrelated attached coordinates do not cause false mismatches. Numeric coordinates are compared with tolerance to accommodate harmless floating-point drift (for example after serialization and reload). Non-numeric coordinates are compared exactly.

Parameters:

  • left

    (DataArray) –

    First array to compare.

  • right

    (DataArray) –

    Second array to compare.

  • coord_names

    (Hashable or Iterable[Hashable], default: None ) –

    Coordinate names to compare. If not provided, all shared dimension coordinates are checked.

  • left_name

    (str, default: "left array" ) –

    Label used for left in error messages. Override with a context-specific name (e.g. "run 0", "map 0") for more actionable errors.

  • right_name

    (str, default: "right array" ) –

    Label used for right in error messages.

  • rtol

    (float, default: 1e-5 ) –

    Relative tolerance used for numeric coordinate comparison.

  • atol

    (float, default: 1e-8 ) –

    Absolute tolerance used for numeric coordinate comparison.

Raises:

  • ValueError

    If a requested coordinate is missing or if coordinates do not match.
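
The comparison rule described above (tolerance for numeric values, exact equality otherwise) can be sketched with NumPy. `coords_match` is a hypothetical helper. It assumes a `numpy.allclose`-style test, which the matching default values of rtol and atol suggest but the text does not confirm.

```python
import numpy as np


def coords_match(left, right, rtol=1e-5, atol=1e-8):
    """Compare two coordinate value sequences.

    Numeric values use a tolerance (assumed numpy.allclose semantics);
    anything else must match exactly. Hypothetical helper for illustration.
    """
    left = np.asarray(left)
    right = np.asarray(right)
    if left.shape != right.shape:
        return False
    both_numeric = (np.issubdtype(left.dtype, np.number)
                    and np.issubdtype(right.dtype, np.number))
    if both_numeric:
        return bool(np.allclose(left, right, rtol=rtol, atol=atol))
    return bool(np.array_equal(left, right))


# Floating-point drift: 0.1 + 0.2 is not exactly 0.3 but is within tolerance.
print(coords_match([0.1, 0.2, 0.3], [0.1, 0.2, 0.1 + 0.2]))
# A 0.001 offset is far outside the default tolerances.
print(coords_match([0.0, 1.0], [0.0, 1.001]))
# String coordinates are compared exactly.
print(coords_match(["a", "b"], ["a", "c"]))
```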

validate_time_series

validate_time_series(
    time_series: DataArray,
    operation_name: str,
    check_time_chunks: bool = True,
) -> int

Validate a time series before a time series processing operation.

Performs common validation checks:

  1. The time series has a time dimension.
  2. The time dimension has more than one timepoint.
  3. The time dimension is not chunked when the data is a Dask array (optional).

Parameters:

  • time_series

    (DataArray) –

    Input time series to validate. Must have a time dimension.

  • operation_name

    (str) –

    Name of the operation (used in error/warning messages).

  • check_time_chunks

    (bool, default: True ) –

    Whether to raise an error when the time dimension is chunked in a Dask array. Set to False for operations that can handle chunked time (e.g., confusius.signal.standardize).

Returns:

  • int

    Axis number for the time dimension.

Raises:

  • ValueError

    If time_series has no time dimension, if the time dimension has only 1 timepoint, or if the time dimension is chunked in a Dask array (when check_time_chunks=True).
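
The three checks and the returned axis number can be sketched without xarray or Dask. `time_axis` is a hypothetical helper that takes the dimension names, the shape, and an optional Dask-style `chunks` tuple (one tuple of chunk sizes per axis). The real function takes a DataArray and an operation name instead.

```python
def time_axis(dims, shape, chunks=None):
    """Return the axis index of "time" after basic checks.

    `chunks` follows the Dask convention: one tuple of chunk sizes per axis,
    or None for an in-memory array. Hypothetical helper for illustration.
    """
    if "time" not in dims:
        raise ValueError("Input has no 'time' dimension.")
    axis = dims.index("time")
    if shape[axis] < 2:
        raise ValueError("The 'time' dimension needs more than one timepoint.")
    # More than one chunk along the time axis means time is split across chunks.
    if chunks is not None and len(chunks[axis]) > 1:
        raise ValueError("The 'time' dimension must be in a single chunk.")
    return axis


print(time_axis(("time", "z", "y", "x"), (10, 4, 6, 8)))
```

With xarray, the axis index is available from `DataArray.get_axis_num("time")`, and a chunked time dimension can usually be merged into one chunk with `DataArray.chunk({"time": -1})` before calling an operation that requires it.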