Address edge GetMemInfo edge cases by yuslepukhin · Pull Request #26021 · microsoft/onnxruntime · GitHub

@yuslepukhin
Member

Description

This fixes somewhat contrived edge cases that are present in our tests:

  • input propagates to output
  • output is produced by an initializer.

Motivation and Context

The upcoming Python API PR does not pass tests without it.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR addresses edge cases in the GetMemInfo functionality related to model inputs and outputs in contrived test scenarios. It fixes issues where input values propagate directly to outputs and where outputs are produced by constant initializers.

  • Enhanced memory info retrieval logic to handle edge cases in input/output relationships
  • Added comprehensive error handling for outputs produced by initializers or direct input propagation
  • Added null checks to prevent API calls when input/output counts are zero
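
The two edge cases can be pictured with a toy graph model. The sketch below is not ONNX Runtime's implementation: `ToyGraph`, `resolve_output_location`, `output_locations`, the device strings, and the lookup order are all hypothetical, chosen only to illustrate a fallback of the kind described (producing node first, then a pass-through input, then an initializer). Placing initializer-backed outputs on a default device is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyGraph:
    """Hypothetical stand-in for a model graph; not an ONNX Runtime type."""
    inputs: Dict[str, str] = field(default_factory=dict)        # input name -> device
    initializers: Set[str] = field(default_factory=set)         # constant tensor names
    node_outputs: Dict[str, str] = field(default_factory=dict)  # value name -> device of producing node
    outputs: List[str] = field(default_factory=list)            # graph output names


def resolve_output_location(graph: ToyGraph, name: str, default_device: str = "cpu") -> str:
    # Usual case: some node produces the output.
    if name in graph.node_outputs:
        return graph.node_outputs[name]
    # Edge case 1: a graph input is passed straight through to the output.
    if name in graph.inputs:
        return graph.inputs[name]
    # Edge case 2: the output is a constant initializer.
    # Assumption for this sketch: initializers live on the default device.
    if name in graph.initializers:
        return default_device
    raise KeyError(f"output {name!r} has no producer, input, or initializer")


def output_locations(graph: ToyGraph) -> List[str]:
    # With zero outputs this returns an empty list and performs no lookups,
    # in the spirit of the "skip the call when the count is zero" guard.
    return [resolve_output_location(graph, name) for name in graph.outputs]


graph = ToyGraph(
    inputs={"x": "gpu:0"},
    initializers={"w"},
    node_outputs={"y": "gpu:0"},
    outputs=["y", "x", "w"],
)
print(output_locations(graph))
```

The explicit `KeyError` at the end makes an unresolvable output fail loudly instead of silently reporting a default location.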

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| onnxruntime/test/shared_lib/test_inference.cc | Added tests for input pass-through and sparse output scenarios |
| onnxruntime/core/session/inference_session.cc | Enhanced GetInputOutputMemoryInfo with fallback logic for edge cases |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Added null checks before API calls in memory info methods |

skottmckay
skottmckay previously approved these changes Sep 16, 2025
Contributor

@skottmckay skottmckay left a comment

:shipit:

import sys

import numpy as np
import onnx

Check notice: Code scanning / CodeQL

Module is imported with 'import' and 'import from' (Note, test)

Module 'onnx' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.

Copilot Autofix

AI about 1 month ago

To fix the problem, remove the line from onnx import TensorProto, helper, numpy_helper (line 12) and replace all occurrences of TensorProto, helper, and numpy_helper with onnx.TensorProto, onnx.helper, and onnx.numpy_helper respectively, throughout the code snippet. This ensures code clarity and avoids confusion as recommended, and no APIs should change. The only file to be changed is onnxruntime/test/testdata/test_dangling_input_segment_ids.py. No new imports or definitions are needed, just refactoring for proper qualification.
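
For readers who want to reproduce this kind of notice locally, here is a rough approximation using Python's standard `ast` module. It is not how CodeQL implements the rule; `modules_imported_both_ways` is a hypothetical helper that only compares top-level module names appearing in `import X` and `from X import ...` statements.

```python
import ast
from typing import Set


def modules_imported_both_ways(source: str) -> Set[str]:
    """Return modules that appear in both `import X` and `from X import ...`."""
    plain: Set[str] = set()
    from_style: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import onnx` or `import numpy as np`: record the real module name.
            plain.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from onnx import helper`: absolute imports only.
            from_style.add(node.module)
    return plain & from_style


before = (
    "import numpy as np\n"
    "import onnx\n"
    "from onnx import TensorProto, helper, numpy_helper\n"
)
after = "import numpy as np\nimport onnx\n"

print(modules_imported_both_ways(before))
print(modules_imported_both_ways(after))
```

Dropping the `from onnx import ...` line, as the suggested patch does, leaves the check with nothing to report for `onnx`.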

Suggested changeset 1
onnxruntime/test/testdata/test_dangling_input_segment_ids.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/onnxruntime/test/testdata/test_dangling_input_segment_ids.py b/onnxruntime/test/testdata/test_dangling_input_segment_ids.py
--- a/onnxruntime/test/testdata/test_dangling_input_segment_ids.py
+++ b/onnxruntime/test/testdata/test_dangling_input_segment_ids.py
@@ -9,7 +9,6 @@
 
 import numpy as np
 import onnx
-from onnx import TensorProto, helper, numpy_helper
 
 DATA_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), "test_dangling_input_segment_ids")
 
@@ -20,7 +19,7 @@
 
 
 def make_node(op_type, inputs, outputs, name=None, doc_string=None, domain=None, **kwargs):
-    node = helper.make_node(op_type, inputs, outputs, name, doc_string, domain, **kwargs)
+    node = onnx.helper.make_node(op_type, inputs, outputs, name, doc_string, domain, **kwargs)
     if doc_string == "":
         node.doc_string = ""
     order_repeated_field(node.attribute, "name", kwargs.keys())
@@ -28,42 +27,42 @@
 
 
 def make_graph(*args, doc_string=None, **kwargs):
-    graph = helper.make_graph(*args, doc_string=doc_string, **kwargs)
+    graph = onnx.helper.make_graph(*args, doc_string=doc_string, **kwargs)
     if doc_string == "":
         graph.doc_string = ""
     return graph
 
 
-model = helper.make_model(
-    opset_imports=[helper.make_operatorsetid("", 14), helper.make_operatorsetid("com.microsoft", 1)],
+model = onnx.helper.make_model(
+    opset_imports=[onnx.helper.make_operatorsetid("", 14), onnx.helper.make_operatorsetid("com.microsoft", 1)],
     ir_version=7,
     graph=make_graph(
         name="embed_layernorm_graph",
         inputs=[
-            helper.make_tensor_value_info("input_ids", TensorProto.INT32, shape=[1, 4]),
-            helper.make_tensor_value_info("segment_ids", TensorProto.INT32, shape=[1, 4]),
+            onnx.helper.make_tensor_value_info("input_ids", onnx.TensorProto.INT32, shape=[1, 4]),
+            onnx.helper.make_tensor_value_info("segment_ids", onnx.TensorProto.INT32, shape=[1, 4]),
         ],
         outputs=[
-            helper.make_tensor_value_info("layernorm_out", TensorProto.FLOAT, shape=[1, 4, 4]),
-            helper.make_tensor_value_info("mask_index_out", TensorProto.INT32, shape=[1]),
+            onnx.helper.make_tensor_value_info("layernorm_out", onnx.TensorProto.FLOAT, shape=[1, 4, 4]),
+            onnx.helper.make_tensor_value_info("mask_index_out", onnx.TensorProto.INT32, shape=[1]),
         ],
         initializer=[
-            numpy_helper.from_array(
+            onnx.numpy_helper.from_array(
                 np.load(os.path.join(DATA_DIR, "const0_word_embed.npy")).astype("float32").reshape([32, 4]),
                 name="word_embed",
             ),
-            numpy_helper.from_array(
+            onnx.numpy_helper.from_array(
                 np.load(os.path.join(DATA_DIR, "const1_pos_embed.npy")).astype("float32").reshape([16, 4]),
                 name="pos_embed",
             ),
-            numpy_helper.from_array(
+            onnx.numpy_helper.from_array(
                 np.array(
                     [0.6185135841369629, 0.010364261455833912, 0.5386272668838501, 0.0030179566238075495],
                     dtype="float32",
                 ),
                 name="gamma",
             ),
-            numpy_helper.from_array(
+            onnx.numpy_helper.from_array(
                 np.array(
                     [0.9511938095092773, 0.9054020047187805, 0.7959669232368469, 0.9152743220329285], dtype="float32"
                 ),
EOF
Unable to commit as this autofix suggestion is now outdated
@yuslepukhin yuslepukhin merged commit d251f3a into main Sep 16, 2025
87 of 91 checks passed
@yuslepukhin yuslepukhin deleted the yuslepukhin/fix_getmeminfo_edge_cases branch September 16, 2025 17:32
yuslepukhin added a commit that referenced this pull request Sep 17, 2025
### Description
This pull request introduces several enhancements to ONNX Runtime's
Python and C++ APIs, focusing on improved device and memory information
handling, synchronization stream support, and tensor copy functionality.
It adds new Python bindings for device/memory types, exposes more
detailed session input/output metadata, and provides a Python-accessible
tensor copy API. The changes also refactor and extend the C++ API for
better stream and memory info management.

Key changes include:

### Device and Memory Information Enhancements

* Added Python bindings for `OrtMemoryInfoDeviceType`,
`OrtDeviceMemoryType`, and expanded `OrtDevice` to expose the memory
type via a new `mem_type` method. The `OrtMemoryInfo` Python class now
supports both legacy and new V2 constructors and exposes additional
properties such as device memory type and vendor ID.
[[1]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eR1801-R1810)
[[2]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eR1839)
[[3]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eL1941-R2005)
* Extended the Python `InferenceSession` object to provide access to
input/output `OrtMemoryInfo` and `OrtEpDevice` objects through new
properties and methods.
[[1]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eR2702-R2729)
[[2]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR202-R213)
[[3]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR591-R593)
[[4]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR607-R609)

### Synchronization Stream and Execution Provider Device Support

* Introduced Python bindings for `OrtSyncStream`, including creation via
`OrtEpDevice.create_sync_stream()` and retrieval of device-specific
`OrtMemoryInfo` via `OrtEpDevice.memory_info()`.
[[1]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eR1890-R1938)
[[2]](diffhunk://#diff-44e70fbe60cba71c94f1a46ec2b1facaa8e9475232dad6df5ecbea301e76d475R34-R44)
* Refactored the C++ API to generalize `SyncStream` handling, allowing
for unowned streams and improved type safety.
[[1]](diffhunk://#diff-17f64e8b38fcdcd25e90abcabeec4b420956b15fe63868a5d0b270c376bde209L1066-R1084)
[[2]](diffhunk://#diff-cc93f5f9d8078d3d3af14c9bb4c0c59e25a99f3ec75d7772ea20111ed7eb6ddeL672-R677)

### Tensor Copy Functionality

* Added a new Python-level `copy_tensors` function and corresponding C++
binding, enabling efficient copying of tensor data between `OrtValue`
objects, optionally using a synchronization stream.
[[1]](diffhunk://#diff-c46fc0e05521f706449c04aed599ac0229012c007a78b584519e71a57601d63eR1588-R1599)
[[2]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR1155-R1163)
[[3]](diffhunk://#diff-44e70fbe60cba71c94f1a46ec2b1facaa8e9475232dad6df5ecbea301e76d475R84)

### Miscellaneous Improvements and Fixes

* Changed the return type of the `OrtValue.data_ptr` method in the
Python binding from `int64_t` to `uintptr_t` for better cross-platform
compatibility.
[[1]](diffhunk://#diff-666c9002698d1bbd4215237231e5be98d7b33e5054f018dce952407027bd0473L336-R336)
[[2]](diffhunk://#diff-666c9002698d1bbd4215237231e5be98d7b33e5054f018dce952407027bd0473L347-R347)
* Minor improvements to error messages and device type handling in the
Python API (e.g., for `OrtDevice`).
[[1]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR1176)
[[2]](diffhunk://#diff-f0e8ba8cb8cb07b51b3be675bf62cec07e2eae1461341ce5801d33a57c8f57fdR1219-R1221)
* Included necessary C++ includes for plugin stream support.
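
One plausible reason an unsigned pointer-sized type is preferable for `data_ptr` is that addresses with the top bit set turn negative when read as a signed 64-bit integer. The sketch below illustrates that with `ctypes`; it does not call ONNX Runtime, the address is made up, and `ctypes.c_uint64` stands in for `uintptr_t` on the assumption of a 64-bit platform.

```python
import ctypes


def as_int64(address: int) -> int:
    # Reinterpret the bit pattern as a signed 64-bit integer (ctypes wraps silently).
    return ctypes.c_int64(address).value


def as_uint64(address: int) -> int:
    # Reinterpret the bit pattern as an unsigned 64-bit integer.
    return ctypes.c_uint64(address).value


# Hypothetical address in the upper half of a 64-bit address space.
high_address = 0xFFFF_8000_0000_1000

print(as_int64(high_address) < 0)                # signed view is negative
print(as_uint64(high_address) == high_address)   # unsigned view round-trips
```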

These changes collectively improve the flexibility and introspection
capabilities of ONNX Runtime's device, memory, and execution provider
interfaces, and make advanced features available to Python users.


### Motivation and Context

Depends on: #26021
adrianlizarraga pushed a commit that referenced this pull request Sep 24, 2025
### Description
This fixes somewhat contrived edge cases that are present in our tests:
  - input propagates to output
  - output is produced by an initializer.

### Motivation and Context
The upcoming Python API PR does not pass tests without it.
adrianlizarraga pushed a commit that referenced this pull request Sep 24, 2025
adrianlizarraga added a commit that referenced this pull request Sep 24, 2025
### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see #26057
  - main merge date: Sept 15, 11:33am
  - pr: #25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: #26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: #25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with #25689
  - main merge date: Sept 23, 10:42am
  - pr: #26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability #26132
  - main merge date: Sept 23, 8:54pm
  - pr: #26132
  - commit: 72e56e7


### Motivation and Context

---------

Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
adrianlizarraga added a commit that referenced this pull request Sep 24, 2025
### Description
- Regenerates the `input_propagate_to_output.onnx` model used in [this
unit
test](https://github.com/microsoft/onnxruntime/blob/35dcab5088118117acc6086c9b6dd6dd92c7060f/onnxruntime/test/shared_lib/test_inference.cc#L497-L506)
so that it uses an ONNX IR version compatible with ONNX 1.18.0 (i.e., IR
version < 12).
- Adds script `input_propagate_to_output.py` that can be used to
regenerate the `input_propagate_to_output.onnx` model.
- Embeds missing weight values that are needed to run the existing
`test_dangling_input_segment_ids.py` script.



### Motivation and Context
The main branch is using ONNX 1.19. However, this unit test also needs
to pass in the `rel-1.23.1` branch, which is still using ONNX 1.18.0.
So, by downgrading the model's IR version, the unit test can run in both
branches.
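
The reasoning above amounts to picking an IR version that every target branch can load. A minimal sketch of that choice follows; `is_loadable` and `pick_ir_version` are hypothetical helpers, the value 11 for ONNX 1.18.0 follows from the "IR version < 12" statement in the description, and the value 12 for ONNX 1.19 is an assumption.

```python
from typing import Dict


def is_loadable(model_ir_version: int, max_supported_ir_version: int) -> bool:
    # A model loads only if its IR version does not exceed what the library supports.
    return model_ir_version <= max_supported_ir_version


def pick_ir_version(max_supported_by_branch: Dict[str, int]) -> int:
    # The highest IR version that every branch can still load.
    return min(max_supported_by_branch.values())


max_ir = {
    "main (ONNX 1.19)": 12,          # assumption, not stated in the description
    "rel-1.23.1 (ONNX 1.18.0)": 11,  # from "IR version < 12"
}

target = pick_ir_version(max_ir)
print(target)
print(all(is_loadable(target, limit) for limit in max_ir.values()))
```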

See original PR that added the test models:
#26021
@snnn
Member

snnn commented Sep 25, 2025

This PR has been cherry-picked into the rel-1.23.1 branch in PR #26140. Removing the release:1.23.1 label.

adrianlizarraga added a commit that referenced this pull request Sep 26, 2025
TedThemistokleous added a commit to ROCm/onnxruntime that referenced this pull request Oct 17, 2025
* ORT 1.23.1 cherrypick 1 [REDO] (microsoft#26140)

### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see microsoft#26057
  - main merge date: Sept 15, 11:33am
  - pr: microsoft#25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: microsoft#26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: microsoft#25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with microsoft#25689
  - main merge date: Sept 23, 10:42am
  - pr: microsoft#26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability microsoft#26132
  - main merge date: Sept 23, 8:54pm
  - pr: microsoft#26132
  - commit: 72e56e7


### Motivation and Context

---------

Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>

* ORT 1.23.1 cherrypick 2 (microsoft#26182)

### Description
Adds the following commits to the `rel-1.23.1` branch for ORT 1.23.1:


- add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart
  - main merge date: July 31, 1:05am
  - pr: microsoft#25590
  - commit: e753643
- [build] fix WebAssembly build on macOS/arm64
  - main merge date: Aug 5, 8:07am
  - pr: microsoft#25653
  - commit: 53f152b
- [CPU] MoE Kernel (microsoft#25958)
  - main merge date: Sept 10, 4:54pm
  - pr: microsoft#25958
  - commit: 930e640
- [CPU] Block-wise QMoE kernel for CPU
  - main merge date: Sept 15, 8:32am
  - pr: microsoft#26009
  - commit: 5d17734
- [C#] Implement missing APIs
  - main merge date: Sept 24, 10:50am
  - pr: microsoft#26101
  - commit: 35dcab5
- Regenerate test model with ONNX IR < 12
  - main merge date: Sept 24, 2:50pm
  - pr: microsoft#26149
  - commit: 88f2652
- [CPU] Fix compilation errors because of unused variables
  - main merge date: Sept 25, 1:21pm
  - pr: microsoft#26147
  - commit: 42fcd71
- [EP ABI] Check if nodes specified in GetCapability() have already been
assigned
  - main merge date: Sept 26, 1:24am
  - pr: microsoft#26156
  - commit: 67d3ba0
- [QNN EP] Add dynamic option to set HTP performance mode
  - main merge date: Sept 26, 11:55am
  - pr: microsoft#26135
  - commit: 6cc40fd

---------

Co-authored-by: xieofxie <xieofxie@126.com>
Co-authored-by: hualxie <hualxie@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Akshay Sonawane <111780983+apsonawane@users.noreply.github.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: quic-tirupath <quic_tirupath@quicinc.com>
Co-authored-by: quic-ashwshan <quic_ashwshan@quicinc.com>

---------

Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: xieofxie <xieofxie@126.com>
Co-authored-by: hualxie <hualxie@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Akshay Sonawane <111780983+apsonawane@users.noreply.github.com>
Co-authored-by: quic-tirupath <quic_tirupath@quicinc.com>
Co-authored-by: quic-ashwshan <quic_ashwshan@quicinc.com>