confusius.validation
Data validation utilities for confusius.
Modules:
- coordinates – Coordinate validation utilities.
- iq – IQ data validation utilities.
- mask – Mask validation utilities.
- time_series – Time series validation utilities.
Functions:
- validate_iq – Validate that a DataArray contains valid IQ data.
- validate_labels – Validate that a label map matches data spatial dimensions and coordinates.
- validate_mask – Validate that a mask matches data spatial dimensions and coordinates.
- validate_matching_coordinates – Validate that selected coordinates match between two DataArrays.
- validate_time_series – Validate time series for time series processing operations.
validate_iq
validate_iq(
iq: DataArray, require_attrs: bool = False
) -> None
Validate that a DataArray contains valid IQ data.
This function performs validation of an IQ DataArray to ensure it meets all requirements for processing with confusius functions. Validation checks include:
- Dimensions: The IQ DataArray must have exactly 4 dimensions in the order (time, z, y, x).
- Coordinates: All dimensions must have corresponding coordinates.
- Data type: The data must be complex-valued (complex64 or complex128).
- Attributes (optional): If require_attrs is True, the DataArray must have the following attributes needed for axial velocity computation:
  - transmit_frequency: Ultrasound probe central frequency in Hz.
  - beamforming_sound_velocity: Speed of sound assumed during beamforming in meters per second.
Parameters:
- iq (DataArray) – Input DataArray to validate. Must have dimensions (time, z, y, x) and the required structure and attributes.
- require_attrs (bool, default: False) – Whether to validate that all required attributes (transmit_frequency, beamforming_sound_velocity) are present in the DataArray attributes.
Raises:
- ValueError – If the DataArray does not have dimensions ("time", "z", "y", "x") or corresponding coordinates, or if required attributes are missing (when require_attrs=True).
- TypeError – If the IQ data is not complex-valued.
Examples:
Validate a properly formatted IQ DataArray:
>>> import numpy as np
>>> import xarray as xr
>>> from confusius.validation import validate_iq
>>> iq = xr.DataArray(
... np.ones((10, 4, 6, 8), dtype=np.complex64),
... dims=("time", "z", "y", "x"),
... coords={
... "time": np.arange(10),
... "z": np.arange(4),
... "y": np.arange(6),
... "x": np.arange(8),
... },
... attrs={
... "transmit_frequency": 15e6,
... "beamforming_sound_velocity": 1540.0,
... },
... )
>>> validate_iq(iq, require_attrs=True)
Skip attribute validation for intermediate processing:
>>> # DataArray missing attributes
>>> iq_no_attrs = xr.DataArray(
... np.ones((10, 4, 6, 8), dtype=np.complex64),
... dims=("time", "z", "y", "x"),
... coords={"time": np.arange(10), "z": np.arange(4),
... "y": np.arange(6), "x": np.arange(8)},
... )
>>> validate_iq(iq_no_attrs, require_attrs=False)
validate_labels
validate_labels(
labels: DataArray,
data: DataArray,
labels_name: str = "labels",
rtol: float = 1e-05,
atol: float = 1e-08,
) -> None
Validate that a label map matches data spatial dimensions and coordinates.
Parameters:
- labels (DataArray) – Label map to validate. Must have integer dtype and coordinates must match data. Accepts two formats:
  - Flat label map: Spatial dims only, e.g. (z, y, x). Background voxels labeled 0; each unique non-zero integer identifies a distinct, non-overlapping region. The regions coordinate of the output holds the integer label values.
  - Stacked mask format: Has a leading mask dimension followed by spatial dims, e.g. (mask, z, y, x). Each layer has values in {0, region_id} and regions may overlap. The region coordinate of the output holds the mask coordinate values (e.g., region label).
- data (DataArray) – Data array to validate labels against.
- labels_name (str, default: "labels") – Name of the labels parameter (used in error messages).
- rtol (float, default: 1e-5) – Relative tolerance for coordinate comparison.
- atol (float, default: 1e-8) – Absolute tolerance for coordinate comparison.
Raises:
- TypeError – If labels is not an integer dtype DataArray.
- ValueError – If labels dimensions don't match data or if coordinates don't match.
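To make the two accepted layouts concrete, the sketch below builds a flat label map and converts it to the stacked layout. It uses plain NumPy arrays rather than DataArrays, and flat_to_stacked is a hypothetical helper written for illustration only; it is not part of confusius.

```python
import numpy as np

# Flat label map with spatial dims (z, y, x): 0 is background and each
# non-zero integer marks one non-overlapping region.
flat = np.zeros((2, 3, 4), dtype=np.int64)
flat[0, :, :2] = 1  # region 1
flat[1, :, 2:] = 2  # region 2


def flat_to_stacked(flat_labels):
    """Hypothetical helper: one layer per region, values in {0, region_id}."""
    region_ids = np.unique(flat_labels[flat_labels != 0])
    layers = [np.where(flat_labels == rid, rid, 0) for rid in region_ids]
    return region_ids, np.stack(layers, axis=0)


region_ids, stacked = flat_to_stacked(flat)
print(region_ids.tolist())  # [1, 2]
print(stacked.shape)        # (2, 2, 3, 4), i.e. (mask, z, y, x)
```

The stacked layout costs one spatial volume per region, but unlike the flat layout it can represent regions that overlap.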
validate_mask
validate_mask(
mask: DataArray,
data: DataArray,
mask_name: str = "mask",
rtol: float = 1e-05,
atol: float = 1e-08,
) -> None
Validate that a mask matches data spatial dimensions and coordinates.
Parameters:
- mask (DataArray) – Mask to validate. Must have boolean dtype, or integer dtype with exactly one non-zero value (0 = background, one region id = foreground). The latter format is produced by Atlas.get_masks. Coordinates must match data.
- data (DataArray) – Data array to validate mask against.
- mask_name (str, default: "mask") – Name of the mask parameter (used in error messages).
- rtol (float, default: 1e-5) – Relative tolerance for coordinate comparison.
- atol (float, default: 1e-8) – Absolute tolerance for coordinate comparison.
Raises:
- TypeError – If mask is not a boolean or single-label integer DataArray.
- ValueError – If mask dimensions don't match data or if coordinates don't match.
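The dtype rule for masks can be illustrated with a small NumPy sketch. The function is_valid_mask_dtype is hypothetical and is not the library's implementation; it applies the documented rule literally (boolean, or integer with exactly one distinct non-zero value). How confusius treats edge cases such as an all-zero integer mask is not stated here, and this sketch rejects it.

```python
import numpy as np


def is_valid_mask_dtype(values):
    """Hypothetical check: boolean, or integer with one distinct non-zero value."""
    values = np.asarray(values)
    if values.dtype == np.bool_:
        return True
    if np.issubdtype(values.dtype, np.integer):
        nonzero = np.unique(values[values != 0])
        return nonzero.size == 1
    return False


bool_mask = np.array([[True, False], [False, True]])
single_label = np.array([[0, 7], [7, 0]])  # 0 = background, 7 = foreground
multi_label = np.array([[0, 1], [2, 0]])   # two region ids: a label map, not a mask
float_mask = np.array([[0.0, 1.0]])        # neither boolean nor integer

print(is_valid_mask_dtype(bool_mask))     # True
print(is_valid_mask_dtype(single_label))  # True
print(is_valid_mask_dtype(multi_label))   # False
print(is_valid_mask_dtype(float_mask))    # False
```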
validate_matching_coordinates
validate_matching_coordinates(
left: DataArray,
right: DataArray,
coord_names: Hashable
| Iterable[Hashable]
| None = None,
*,
left_name: str = "left array",
right_name: str = "right array",
rtol: float = 1e-05,
atol: float = 1e-08,
) -> None
Validate that selected coordinates match between two DataArrays.
Comparison is performed on coordinate values rather than the full coordinate
DataArray, so unrelated attached coordinates do not cause false mismatches.
Numeric coordinates are compared with tolerance to accommodate harmless
floating-point drift (for example after serialization and reload). Non-numeric
coordinates are compared exactly.
Parameters:
- left (DataArray) – First array to compare.
- right (DataArray) – Second array to compare.
- coord_names (Hashable | Iterable[Hashable] | None, default: None) – Coordinate names to compare. If not specified, all shared dimension coordinates are checked.
- left_name (str, default: "left array") – Label used for left in error messages. Override with a context-specific name (e.g. "run 0", "map 0") for more actionable errors.
- right_name (str, default: "right array") – Label used for right in error messages.
- rtol (float, default: 1e-5) – Relative tolerance used for numeric coordinate comparison.
- atol (float, default: 1e-8) – Absolute tolerance used for numeric coordinate comparison.
Raises:
- ValueError – If a requested coordinate is missing or if coordinates do not match.
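The effect of rtol and atol can be illustrated with NumPy alone. The sketch below assumes the numeric comparison follows numpy.allclose semantics, |a - b| <= atol + rtol * |b|. The default values match NumPy's, but this page does not confirm which routine confusius uses, so treat it as an illustration of tolerance-based comparison rather than the library's exact behavior.

```python
import numpy as np

rtol, atol = 1e-05, 1e-08

original = np.linspace(0.0, 1.0, 11)
# Simulate harmless drift, here a round trip through float32 storage.
reloaded = original.astype(np.float32).astype(np.float64)
# A genuine mismatch: every coordinate shifted by 0.01.
shifted = original + 0.01

exact_match = np.array_equal(original, reloaded)
tolerant_match = np.allclose(original, reloaded, rtol=rtol, atol=atol)
shifted_match = np.allclose(original, shifted, rtol=rtol, atol=atol)

print(exact_match)     # False: exact comparison flags the rounding drift
print(tolerant_match)  # True: drift is within tolerance
print(shifted_match)   # False: a real offset still fails
```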
validate_time_series
validate_time_series(
time_series: DataArray,
operation_name: str,
check_time_chunks: bool = True,
) -> int
Validate time series for time series processing operations.
Performs common validation checks:
- The time series has a time dimension.
- The time dimension has more than 1 timepoint.
- The time dimension is not chunked for Dask arrays (optional).
Parameters:
- time_series (DataArray) – Input time series to validate. Must have a time dimension.
- operation_name (str) – Name of the operation (used in error/warning messages).
- check_time_chunks (bool, default: True) – Whether to raise an error when time dimension is chunked in a Dask array. Set to False for operations that can handle chunked time (e.g., confusius.signal.standardize).
Returns:
- int – Axis number for the time dimension.
Raises:
- ValueError – If time_series has no time dimension, if the time dimension has only 1 timepoint, or if the time dimension is chunked in a Dask array (when check_time_chunks=True).
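The documented checks and return value can be sketched with xarray directly. The function sketch_validate_time_series below is a hypothetical re-implementation for illustration, not the confusius source; its error messages and the operation name "detrend" are made up.

```python
import numpy as np
import xarray as xr


def sketch_validate_time_series(time_series, operation_name, check_time_chunks=True):
    """Hypothetical re-implementation of the documented checks."""
    if "time" not in time_series.dims:
        raise ValueError(f"{operation_name} requires a 'time' dimension.")
    if time_series.sizes["time"] < 2:
        raise ValueError(f"{operation_name} requires more than 1 timepoint.")
    # For Dask-backed arrays, chunksizes maps each dimension to its chunk
    # lengths; the mapping is empty for NumPy-backed arrays.
    time_chunks = time_series.chunksizes.get("time", ())
    if check_time_chunks and len(time_chunks) > 1:
        raise ValueError(f"{operation_name} requires an unchunked 'time' dimension.")
    return time_series.get_axis_num("time")


signal = xr.DataArray(np.zeros((4, 100)), dims=("voxel", "time"))
axis = sketch_validate_time_series(signal, "detrend")
print(axis)  # 1
```

Returning the axis number lets a caller pass it straight to NumPy-style functions that take an axis argument, without looking up the position of time again.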