Extend DataInputParams with IntensityConfig fields for model export #5964

Closed

omkar-334 wants to merge 4 commits into open-edge-platform:develop from omkar-334:intensity

Conversation

@omkar-334
Contributor

Summary

  • library/src/otx/backend/native/models/base.py - Added fields to DataInputParams. Updated as_dict() and the dict-merge in _configure_preprocessing_params.
  • library/src/otx/backend/native/exporter/base.py - _extend_model_metadata now writes all intensity fields into the exported IR.
  • library/src/otx/data/module.py - Exposed intensity_config from train_subset.intensity in both __init__ and from_otx_datasets.
  • library/src/otx/backend/native/engine.py - Read intensity_config from the datamodule and pass its fields into the data_input_params dict.
  • library/src/otx/cli/cli.py - Same as engine, for the CLI path.
  • library/tests/unit/backend/native/models/test_base.py - Updated test_as_dict to match the new fields.

Resolves #5952

How to test

Export a model and verify that the new metadata keys (storage_dtype, intensity_mode, etc.) appear in the IR.

Checklist

  • The PR title and description are clear and descriptive
  • I have manually tested the changes
  • All changes are covered by automated tests
  • All related issues are linked to this PR (if applicable)
  • Documentation has been updated (if applicable)

@omkar-334 omkar-334 requested a review from a team as a code owner March 28, 2026 13:04
Copilot AI review requested due to automatic review settings March 28, 2026 13:04
@github-actions github-actions bot added the TEST Any changes in tests label Mar 28, 2026
Contributor

Copilot AI left a comment

Pull request overview

Extends the model preprocessing metadata pipeline so IntensityConfig settings (e.g., storage dtype and intensity mapping mode/parameters) can be carried through DataInputParams and embedded into exported model IR metadata (rt_info), enabling downstream consumers to reconstruct intensity handling at deployment time.

Changes:

  • Added intensity-related fields to DataInputParams and updated dict conversion / preprocessing-param merging.
  • Propagated IntensityConfig from the OTXDataModule into model instantiation paths (engine + CLI).
  • Embedded intensity-related values into exported model metadata and updated a unit test expectation for DataInputParams.as_dict().

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
library/src/otx/backend/native/models/base.py Extends DataInputParams and merges intensity defaults into preprocessing params.
library/src/otx/backend/native/exporter/base.py Writes intensity-related metadata into exported model rt_info.
library/src/otx/data/module.py Exposes intensity_config from the train subset on the datamodule.
library/src/otx/backend/native/engine.py Passes intensity fields from datamodule into data_input_params.
library/src/otx/cli/cli.py Mirrors engine behavior for CLI-driven model instantiation.
library/tests/unit/backend/native/models/test_base.py Updates DataInputParams.as_dict() unit test to include new fields.


Comment on lines +989 to +991
# Fall back to DataInputParams field defaults when model has no defaults.
fallback = DataInputParams(input_size=(0, 0), mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0))
default = default or fallback

Copilot AI Mar 28, 2026


In _configure_preprocessing_params, the comment says to fall back to DataInputParams field defaults, but the fallback instance hard-codes input_size=(0,0), mean=(0,0,0), std=(1,1,1). If a model ever returns a falsy/empty default, this would silently accept missing required preprocessing values and potentially proceed with an invalid input_size. Prefer raising a clear error when model defaults are unavailable, or restructure so only the new intensity fields default while required fields must come from either caller or model defaults.
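The restructuring the comment suggests could be sketched as follows. This is a hypothetical simplification, not the actual OTX code: the `DataInputParams` stand-in and `resolve_data_input_defaults` helper are illustrative names, and only the new intensity field keeps a class-level default while the required preprocessing values must be supplied.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class DataInputParams:  # simplified stand-in for the OTX class
    input_size: Tuple[int, int]
    mean: Tuple[float, float, float]
    std: Tuple[float, float, float]
    intensity_mode: str = "scale_to_unit"  # only the new field carries a default

def resolve_data_input_defaults(default: Optional[DataInputParams]) -> DataInputParams:
    """Raise a clear error instead of silently substituting an invalid input_size."""
    if default is None:
        raise ValueError(
            "Model provides no default DataInputParams; input_size, mean, "
            "and std must be supplied by the caller."
        )
    return default
```

This keeps the hard failure visible at configuration time rather than letting a zero `input_size` propagate into export.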

Comment on lines +184 to +197
("model_info", "storage_dtype"): dip.storage_dtype,
("model_info", "intensity_mode"): dip.intensity_mode,
("model_info", "percentile_low"): str(dip.percentile_low),
("model_info", "percentile_high"): str(dip.percentile_high),
("model_info", "scale_factor"): str(dip.scale_factor),
("model_info", "min_value"): str(dip.min_value),
("model_info", "repeat_channels"): str(dip.repeat_channels),
}
if dip.intensity_max_value is not None:
extra_data[("model_info", "intensity_max_value")] = str(dip.intensity_max_value)
if dip.window_center is not None:
extra_data[("model_info", "window_center")] = str(dip.window_center)
if dip.window_width is not None:
extra_data[("model_info", "window_width")] = str(dip.window_width)

Copilot AI Mar 28, 2026


_extend_model_metadata conditionally omits intensity_max_value/window_center/window_width when they are None. This conflicts with the PR description/issue goal of writing all intensity fields into exported rt_info, and means expected metadata keys may be missing for default configs (e.g. scale_to_unit with auto max_value). Consider always emitting these keys with a consistent sentinel value (e.g. empty string) so downstream consumers can rely on key presence.
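The sentinel approach the comment proposes could look like this. The `intensity_metadata` helper is a hypothetical sketch, not the actual exporter code; it always emits the optional keys, mapping `None` to an empty string so consumers can rely on key presence.

```python
def intensity_metadata(dip) -> dict:
    """Always emit the optional intensity keys; use "" as a sentinel for None
    so downstream consumers never have to probe for missing rt_info keys."""
    optional = ("intensity_max_value", "window_center", "window_width")
    return {
        ("model_info", name): "" if getattr(dip, name) is None else str(getattr(dip, name))
        for name in optional
    }
```

The trade-off is that consumers must then treat `""` as "unset", which should be documented alongside the metadata schema.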

Comment on lines +177 to +197
dip = self.data_input_params
extra_data = {
("model_info", "mean_values"): mean_str.strip(),
("model_info", "scale_values"): std_str.strip(),
("model_info", "resize_type"): self.resize_mode,
("model_info", "pad_value"): str(self.pad_value),
("model_info", "reverse_input_channels"): str(self.swap_rgb),
("model_info", "storage_dtype"): dip.storage_dtype,
("model_info", "intensity_mode"): dip.intensity_mode,
("model_info", "percentile_low"): str(dip.percentile_low),
("model_info", "percentile_high"): str(dip.percentile_high),
("model_info", "scale_factor"): str(dip.scale_factor),
("model_info", "min_value"): str(dip.min_value),
("model_info", "repeat_channels"): str(dip.repeat_channels),
}
if dip.intensity_max_value is not None:
extra_data[("model_info", "intensity_max_value")] = str(dip.intensity_max_value)
if dip.window_center is not None:
extra_data[("model_info", "window_center")] = str(dip.window_center)
if dip.window_width is not None:
extra_data[("model_info", "window_width")] = str(dip.window_width)

Copilot AI Mar 28, 2026


The new intensity-related metadata keys added in _extend_model_metadata are not covered by unit tests. Please add/extend exporter tests to assert the exported metadata contains the new keys (including the optional ones) and that their values match DataInputParams.

Contributor

@leoll2 leoll2 left a comment


Hi @omkar-334, thanks for the contribution. I would like to ask a few questions to verify your understanding of the issue:

  1. What does intensity_mode represent and why is it important to embed it in the model_info?
  2. Where are the intensity parameters meant to be consumed?
  3. Which other parameters should be non-null when intensity_mode="scale_to_unit"?
  4. Which model did you use for testing? Could you share the exported model as a .zip file?

params["mean"] = self._datamodule.input_mean
if self._datamodule.input_std is not None:
params["std"] = self._datamodule.input_std
ic = getattr(self._datamodule, "intensity_config", None)
Contributor


Why do you access the attribute via getattr here?

@omkar-334
Contributor Author

omkar-334 commented Mar 29, 2026

4. Which model did you use for testing? Could you share the exported model as a .zip file?

I used the mobilenet_v3 model for testing. Here is the link to the exported model zip file - https://drive.google.com/file/d/15x1QL6blluC2Y5qNKJQAYHA4Vl3f8YXv/view?usp=sharing

I used a script to check whether the exported and reloaded model contains the saved metadata:

from pathlib import Path

import openvino
import torch

from otx.backend.native.models.base import DataInputParams
from otx.backend.native.models.classification.multiclass_models.mobilenet_v3 import MobileNetV3MulticlassCls

# load model
model = MobileNetV3MulticlassCls(
    model_name="mobilenetv3_large",
    label_info=10,
    data_input_params=DataInputParams(
        input_size=(224, 224),
        mean=(0.485, 0.456, 0.406),
        std=(0.229, 0.224, 0.225),
        storage_dtype="uint16",
        intensity_mode="window",
        intensity_max_value=65535.0,
        window_center=40.0,
        window_width=400.0,
    ),
)
model.eval()


output_dir = Path("exported_model")
output_dir.mkdir(exist_ok=True)
dummy_input = torch.randn(1, 3, 224, 224)

# export model to ONNX
onnx_path = output_dir / "test_model.onnx"
torch.onnx.export(
    model,
    dummy_input,
    str(onnx_path),
    input_names=["input"],
    output_names=["output"],
)

# convert to OpenVINO
ov_model = openvino.convert_model(str(onnx_path))

exporter = model._exporter
metadata = exporter.metadata
metadata = exporter._extend_model_metadata(metadata)
exporter._embed_openvino_ir_metadata(ov_model, metadata)

# save
xml_path = output_dir / "test_model.xml"
openvino.save_model(ov_model, str(xml_path))

# reload and verify metadata
ov_model = openvino.Core().read_model(str(xml_path))

print("\n=== Intensity metadata in exported IR ===")
for key in [
    "storage_dtype", "intensity_mode", "intensity_max_value",
    "window_center", "window_width", "percentile_low",
    "percentile_high", "scale_factor", "min_value", "repeat_channels",
    "mean_values", "scale_values",
]:
    if ov_model.has_rt_info(["model_info", key]):
        val = ov_model.get_rt_info(["model_info", key]).value
        print(f"  {key}: {val}")
    else:
        print(f"  {key}: NOT FOUND")

print(f"\nModel saved to {output_dir.resolve()}")
print("Zip with: zip exported_model.zip exported_model/*")

Output:

~/personal/otxnew/library intensity*
.venv ❯ uv run python ../test.py               
warning: `VIRTUAL_ENV=/Users/omkar/personal/otxnew/.venv` does not match the project environment path `.venv` and will be ignored; use `--active` to target the active environment instead
init weight - https://github.com/d-li14/mobilenetv3.pytorch/blob/master/pretrained/mobilenetv3-large-1cd25616.pth?raw=true
UserWarning: The model and loaded state dict do not match exactly

unexpected key in source state_dict: classifier.0.weight, classifier.0.bias, classifier.3.weight, classifier.3.bias

W0329 23:32:19.848000 74185 library/.venv/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_schemas.py:455] Missing annotation for parameter 'input' from (input, boxes, output_size: 'Sequence[int]', spatial_scale: 'float' = 1.0, sampling_ratio: 'int' = -1, aligned: 'bool' = False). Treating as an Input.
W0329 23:32:19.848000 74185 library/.venv/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_schemas.py:455] Missing annotation for parameter 'boxes' from (input, boxes, output_size: 'Sequence[int]', spatial_scale: 'float' = 1.0, sampling_ratio: 'int' = -1, aligned: 'bool' = False). Treating as an Input.
W0329 23:32:19.848000 74185 library/.venv/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_schemas.py:455] Missing annotation for parameter 'input' from (input, boxes, output_size: 'Sequence[int]', spatial_scale: 'float' = 1.0). Treating as an Input.
W0329 23:32:19.848000 74185 library/.venv/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_schemas.py:455] Missing annotation for parameter 'boxes' from (input, boxes, output_size: 'Sequence[int]', spatial_scale: 'float' = 1.0). Treating as an Input.
[torch.onnx] Obtain model graph for `MobileNetV3MulticlassCls([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `MobileNetV3MulticlassCls([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
FutureWarning: `isinstance(treespec, LeafSpec)` is deprecated, use `isinstance(treespec, TreeSpec) and treespec.is_leaf()` instead.
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
Applied 92 of general pattern rewrite rules.
W0329 23:32:21.010000 74185 library/.venv/lib/python3.12/site-packages/torch/export/pt2_archive/_package.py:1086] Unable to load package. f must be a buffer or a file ending in .pt2. Instead got {/Users/omkar/personal/otxnew/library/exported_model/test_model.onnx}
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454] Ran into the following error when deserializing
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454] Traceback (most recent call last):
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]   File "/Users/omkar/personal/otxnew/library/.venv/lib/python3.12/site-packages/torch/export/__init__.py", line 449, in load
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]     pt2_contents = load_pt2(
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]                    ^^^^^^^^^
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]   File "/Users/omkar/personal/otxnew/library/.venv/lib/python3.12/site-packages/torch/export/pt2_archive/_package.py", line 1098, in load_pt2
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]     with PT2ArchiveReader(f) as archive_reader:
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]          ^^^^^^^^^^^^^^^^^^^
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]   File "/Users/omkar/personal/otxnew/library/.venv/lib/python3.12/site-packages/torch/export/pt2_archive/_package.py", line 195, in __init__
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]     self.archive_file = torch._C.PyTorchFileReader(archive_path_or_buffer)  # type: ignore[arg-type]
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
W0329 23:32:21.011000 74185 library/.venv/lib/python3.12/site-packages/torch/export/__init__.py:454] RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

=== Intensity metadata in exported IR ===
  storage_dtype: uint16
  intensity_mode: window
  intensity_max_value: 65535.0
  window_center: 40.0
  window_width: 400.0
  percentile_low: 1.0
  percentile_high: 99.0
  scale_factor: 1.0
  min_value: 0.0
  repeat_channels: 0
  mean_values: 0.485 0.456 0.406
  scale_values: 0.229 0.224 0.225

Model saved to /Users/omkar/personal/otxnew/library/exported_model
Zip with: zip exported_model.zip exported_model/*

@omkar-334
Contributor Author

omkar-334 commented Mar 29, 2026

Hi @omkar-334, thanks for the contribution. I would like to ask a few questions to verify your understanding of the issue:

  1. What does intensity_mode represent and why is it important to embed it in the model_info?

There are 4 supported intensity modes: 'scale_to_unit', 'window', 'percentile', and 'range_scale'. These are different modes for converting raw image pixel values to float32 before applying augmentations.
It should be present in model_info because we need to export it, so that the same preprocessing applies at inference.

  2. Where are the intensity parameters meant to be consumed?

They are consumed in the CPUAugmentationPipeline from library/src/otx/data/augmentation/pipeline.py.
The transforms are first built with the build_intensity_transform function from library/src/otx/data/augmentation/intensity.py and then prepended to the augmentations.
I think these params are also read when loading a model, and applied before inference.
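For illustration, the window-mode mapping described by window_center/window_width typically looks like the sketch below. This is a hedged example of standard intensity windowing, not the actual OTX transform from intensity.py, and the function name `window_intensity` is hypothetical.

```python
import numpy as np

def window_intensity(img: np.ndarray, center: float, width: float) -> np.ndarray:
    """Map raw intensities into [0, 1] around a window (illustrative sketch):
    values below center - width/2 clip to 0, values above center + width/2
    clip to 1, and the window interior maps linearly."""
    lo = center - width / 2.0
    out = (img.astype(np.float32) - lo) / max(width, 1e-8)
    return np.clip(out, 0.0, 1.0)
```

With center=40 and width=400 (the values in the test script above), a raw value of 40 lands exactly at 0.5.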

  3. Which other parameters should be non-null when intensity_mode="scale_to_unit"?

  1. storage_dtype must be non-null (default = uint8)
  2. intensity_max_value can be None - then it will be auto-detected from storage_dtype
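The auto-detection from storage_dtype can be sketched as follows. This is a hypothetical helper illustrating the described behavior, not the actual OTX pipeline code.

```python
import numpy as np
from typing import Optional

def scale_to_unit(img: np.ndarray, storage_dtype: str = "uint8",
                  max_value: Optional[float] = None) -> np.ndarray:
    """Divide by the dtype's maximum when intensity_max_value is None
    (sketch of the described fallback, not the actual OTX code)."""
    if max_value is None:
        max_value = float(np.iinfo(np.dtype(storage_dtype)).max)
    return img.astype(np.float32) / max_value
```

For uint16 input this divides by 65535 unless an explicit intensity_max_value overrides it.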

@omkar-334
Contributor Author

Hey @leoll2, the 2 failing tests fail because of AttributeError: Mock object has no attribute 'intensity'.
The test mocks use MagicMock(spec=SubsetConfig) for train_subset, which doesn't auto-populate dataclass fields like intensity. Two ways to fix:

  1. Add mock.train_subset.intensity = IntensityConfig() to the test fixtures
  2. Use getattr(self.train_subset, "intensity", IntensityConfig()) in OTXDataModule

Which should we do?

@kprokofi
Contributor

Thanks for the PR, but this enhancement was already included in a different PR.

@kprokofi kprokofi closed this Apr 21, 2026

Labels

TEST Any changes in tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Embed intensity metadata in exported model rt_info

4 participants