Store intensity config in IR metadata #6059
Pull request overview
This PR exports IntensityConfig (high-bit-depth / domain-specific intensity preprocessing) into exported model metadata (rt_info / ONNX metadata). It propagates the intensity configuration from the data pipeline through CLI/Engine into DataInputParams and embeds it during export. This addresses issue #5952, which asks that exported models be self-describing so that inference-time preprocessing can be reconstructed from the model alone.
Changes:
- Extend DataInputParams to optionally carry IntensityConfig and serialize it via as_dict().
- Plumb intensity config from OTXDataModule → CLI/Engine model instantiation → exporter metadata embedding.
- Add unit tests covering metadata embedding for multiple IntensityConfig modes and DataInputParams.as_dict() behavior.
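As a rough illustration of the first bullet, the sketch below shows one way an optional intensity config could ride along in a parameter container and be serialized by as_dict(). Both classes are hypothetical stand-ins, not the real OTX definitions: the field names input_size, mean, and std, and the choice to omit the key when no config is set, are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class IntensityConfigSketch:
    """Hypothetical stand-in for IntensityConfig (only fields named in this PR)."""

    storage_dtype: str = "uint8"
    mode: str = "scale_to_unit"
    max_value: float | None = None
    window_center: float | None = None


@dataclass
class DataInputParamsSketch:
    """Hypothetical stand-in for DataInputParams."""

    input_size: tuple[int, int]
    mean: tuple[float, float, float]
    std: tuple[float, float, float]
    intensity_config: IntensityConfigSketch | None = None

    def as_dict(self) -> dict[str, Any]:
        out: dict[str, Any] = {
            "input_size": self.input_size,
            "mean": self.mean,
            "std": self.std,
        }
        # Assumption: the key is left out entirely when no config is attached,
        # so consumers of older dicts see no change.
        if self.intensity_config is not None:
            out["intensity_config"] = asdict(self.intensity_config)
        return out
```

Leaving the key out when the config is None keeps the serialized form backward compatible in this sketch; whether the real as_dict() does the same is not stated in the PR overview.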
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| library/src/otx/backend/native/exporter/base.py | Embed intensity-related keys (dtype, mode, parameters) into export metadata. |
| library/src/otx/backend/native/models/base.py | Add intensity_config to DataInputParams; parse intensity config dicts in preprocessing params. |
| library/src/otx/data/module.py | Expose training subset intensity config on the datamodule for downstream plumbing. |
| library/src/otx/cli/cli.py | Forward datamodule intensity config into data_input_params during CLI instantiation. |
| library/src/otx/backend/native/engine.py | Forward datamodule intensity config into model construction args. |
| library/tests/unit/backend/native/exporter/test_base.py | Add tests for intensity metadata embedding and DataInputParams.as_dict(). |
    # Forward the training subset's intensity config so that Engine / CLI
    # can pass it to OTXModel.data_input_params → exporter rt_info.
    # None means no subset or no intensity attr (old configs / no data pipeline).
    self.input_intensity_config = getattr(self.train_subset, "intensity", None)
OTXDataModule.from_otx_datasets() does not set input_intensity_config, but Engine/CLI now read datamodule.input_intensity_config. If a datamodule is created via from_otx_datasets (e.g. quantization/AutoConfigurator path), this can raise AttributeError. Consider initializing input_intensity_config in from_otx_datasets as well (e.g. from train_subset.intensity) or ensuring callers use getattr(..., None).
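The defensive-read option mentioned in this comment can be sketched as follows. The helper names read_intensity_config and build_model_params are hypothetical and exist only for illustration; the one thing taken from the PR is the attribute name input_intensity_config and the intensity_config key.

```python
from types import SimpleNamespace
from typing import Any


def read_intensity_config(datamodule: Any) -> Any:
    """Return the datamodule's intensity config, or None if the attribute is absent.

    Hypothetical helper: getattr with a default avoids AttributeError for
    datamodules built by a factory that never sets the attribute.
    """
    return getattr(datamodule, "input_intensity_config", None)


def build_model_params(datamodule: Any, base_params: dict[str, Any]) -> dict[str, Any]:
    """Copy base_params and add intensity_config only when one is available."""
    params = dict(base_params)
    cfg = read_intensity_config(datamodule)
    if cfg is not None:
        params["intensity_config"] = cfg
    return params


# A datamodule without the attribute (as from_otx_datasets would produce today)
legacy_dm = SimpleNamespace()
# A datamodule that does carry a config
new_dm = SimpleNamespace(input_intensity_config={"mode": "scale_to_unit"})

print(build_model_params(legacy_dm, {"input_size": (224, 224)}))
print(build_model_params(new_dm, {"input_size": (224, 224)}))
```

Initializing the attribute in the factory is arguably the cleaner fix, since it keeps the attribute contract in one place; the getattr read is a cheap guard for any construction path that is missed.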
    if self._datamodule.input_intensity_config is not None:
        params["intensity_config"] = self._datamodule.input_intensity_config
Directly accessing self._datamodule.input_intensity_config can raise AttributeError for datamodules created via OTXDataModule.from_otx_datasets(), which currently doesn’t define that attribute. Either ensure the datamodule factory sets input_intensity_config or use getattr(self._datamodule, "input_intensity_config", None) here.
Suggested change:
    - if self._datamodule.input_intensity_config is not None:
    -     params["intensity_config"] = self._datamodule.input_intensity_config
    + input_intensity_config = getattr(self._datamodule, "input_intensity_config", None)
    + if input_intensity_config is not None:
    +     params["intensity_config"] = input_intensity_config
    if self.datamodule.input_intensity_config is not None:
        _dip["intensity_config"] = self.datamodule.input_intensity_config
self.datamodule.input_intensity_config may not exist for datamodules created via OTXDataModule.from_otx_datasets() (it currently doesn’t define that attribute). This can raise AttributeError during CLI instantiation; consider using getattr(self.datamodule, "input_intensity_config", None) or making from_otx_datasets initialize it.
Suggested change:
    - if self.datamodule.input_intensity_config is not None:
    -     _dip["intensity_config"] = self.datamodule.input_intensity_config
    + input_intensity_config = getattr(self.datamodule, "input_intensity_config", None)
    + if input_intensity_config is not None:
    +     _dip["intensity_config"] = input_intensity_config
    # Map storage_dtype to the ModelAPI input_dtype convention
    _dtype_map = {"uint8": "u8", "uint16": "u16", "float32": "f32"}
    input_dtype = _dtype_map.get(intensity_cfg.storage_dtype, intensity_cfg.storage_dtype)
IntensityConfig.storage_dtype supports values beyond {uint8,uint16,float32} (e.g. "int16" per otx/config/data.py). The dtype mapping here will fall back to the raw string, which may not match the intended ModelAPI input_dtype convention. Consider extending _dtype_map (and/or validating storage_dtype) to cover all supported dtypes consistently.
Suggested change:
    - # Map storage_dtype to the ModelAPI input_dtype convention
    - _dtype_map = {"uint8": "u8", "uint16": "u16", "float32": "f32"}
    - input_dtype = _dtype_map.get(intensity_cfg.storage_dtype, intensity_cfg.storage_dtype)
    + # Map storage_dtype to the ModelAPI input_dtype convention.
    + # Keep the original value for unknown dtypes to preserve current behavior,
    + # but warn because ModelAPI expects the shorthand form (for example, "i16").
    + _dtype_map = {
    +     "int8": "i8",
    +     "int16": "i16",
    +     "int32": "i32",
    +     "uint8": "u8",
    +     "uint16": "u16",
    +     "uint32": "u32",
    +     "float16": "f16",
    +     "float32": "f32",
    +     "float64": "f64",
    + }
    + storage_dtype = intensity_cfg.storage_dtype
    + normalized_storage_dtype = storage_dtype.lower()
    + input_dtype = _dtype_map.get(normalized_storage_dtype, storage_dtype)
    + if input_dtype == storage_dtype and normalized_storage_dtype not in _dtype_map:
    +     log.warning(
    +         "Unsupported intensity storage_dtype '%s' for ModelAPI input_dtype mapping; "
    +         "using raw value in exported metadata.",
    +         storage_dtype,
    +     )
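The comment also raises validation as an alternative to falling back silently. A minimal sketch of that variant is shown here as a standalone function; the name to_modelapi_dtype and the strict flag are hypothetical, and the shorthand table simply mirrors the one in the suggested change rather than any authoritative ModelAPI list.

```python
import logging

log = logging.getLogger(__name__)

# Shorthand names copied from the suggested change; not an authoritative list.
_DTYPE_MAP = {
    "int8": "i8",
    "int16": "i16",
    "int32": "i32",
    "uint8": "u8",
    "uint16": "u16",
    "uint32": "u32",
    "float16": "f16",
    "float32": "f32",
    "float64": "f64",
}


def to_modelapi_dtype(storage_dtype: str, strict: bool = False) -> str:
    """Map a storage dtype name to its shorthand form.

    Hypothetical helper. With strict=False an unknown dtype is passed through
    unchanged after a warning; with strict=True it raises ValueError instead.
    """
    key = storage_dtype.lower()
    if key in _DTYPE_MAP:
        return _DTYPE_MAP[key]
    if strict:
        msg = f"Unsupported intensity storage_dtype {storage_dtype!r}"
        raise ValueError(msg)
    log.warning(
        "Unsupported intensity storage_dtype '%s'; using raw value in exported metadata.",
        storage_dtype,
    )
    return storage_dtype
```

Checking membership once makes the two-part condition in the suggestion unnecessary: the warning path is reached only when the normalized name is missing from the table.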
    extra_data[("model_info", "intensity_mode")] = intensity_cfg.mode

    if intensity_cfg.max_value is not None:
        extra_data[("model_info", "intensity_max_value")] = str(intensity_cfg.max_value)
    if intensity_cfg.window_center is not None:
IntensityConfig.max_value has an 'auto' meaning when None (see otx/config/data.py and build_intensity_transform auto-detection). For mode="scale_to_unit", the exporter currently omits intensity_max_value when max_value is None, so the exported rt_info can’t fully represent the effective preprocessing parameters unless the consumer duplicates the same auto-logic. Consider resolving and exporting the effective max_value (at least for non-uint8 dtypes) to make exports self-contained.
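One way to make the export self-contained is to resolve the value before writing it. The sketch below assumes that "auto" means the maximum representable value of an integer storage dtype; that is a guess for illustration, and the real auto-detection in build_intensity_transform may differ. The function names are hypothetical; only the ("model_info", ...) keys come from the diff.

```python
from __future__ import annotations

# Assumption: "auto" resolves to the integer dtype's maximum value.
_INT_DTYPE_MAX = {
    "uint8": 255,
    "uint16": 65535,
    "uint32": 4294967295,
    "int8": 127,
    "int16": 32767,
    "int32": 2147483647,
}


def resolve_effective_max_value(storage_dtype: str, max_value: float | None) -> float | None:
    """Return the explicit max_value, else a dtype-derived one, else None."""
    if max_value is not None:
        return float(max_value)
    dtype_max = _INT_DTYPE_MAX.get(storage_dtype.lower())
    # Float dtypes have no obvious default here, so they stay unresolved.
    return float(dtype_max) if dtype_max is not None else None


def intensity_metadata(storage_dtype: str, mode: str, max_value: float | None) -> dict[tuple[str, str], str]:
    """Build the intensity-related metadata entries with a resolved max value."""
    extra_data = {("model_info", "intensity_mode"): mode}
    effective = resolve_effective_max_value(storage_dtype, max_value)
    if effective is not None:
        extra_data[("model_info", "intensity_max_value")] = str(effective)
    return extra_data


print(intensity_metadata("uint16", "scale_to_unit", None))
```

With the resolved value written out, a consumer no longer has to reimplement the auto-detection rule to reproduce the preprocessing.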
Summary
Resolves #5952
How to test
Checklist