Insights: microsoft/onnxruntime
Overview
36 Pull requests merged by 22 people
- Bump transformers from 4.48.0 to 4.52.1 in /tools/ci_build/requirements/transformers-test
#25429 merged
Jul 18, 2025 - Bump on-headers and compression in /js/react_native/e2e
#25439 merged
Jul 18, 2025 - increase timeout for onnxruntime-ios-packaging-pipeline
#25438 merged
Jul 18, 2025 - fix shape inference error for ep context nodes
#25398 merged
Jul 18, 2025 - Ovep Feature Rel 1.23
#25435 merged
Jul 18, 2025 - [OVEP] Update OV version to 2025.2.0
#25436 merged
Jul 18, 2025 - Revert "revert qnn sdk version (#25426)"
#25437 merged
Jul 18, 2025 - [QNN EP] Add EP-aware Reshape handler for Transpose optimization.
#25344 merged
Jul 18, 2025 - [CUDA] Support head_sink in flash attention for GQA
#25432 merged
Jul 17, 2025 - Enable free dimension override for graph optimization level 0
#25425 merged
Jul 17, 2025 - [NV RTX EP] Upstream changes from the win-ort
#25370 merged
Jul 17, 2025 - [TRT-EP] Add loadModelProto APIs
#25409 merged
Jul 17, 2025 - add sliding window support for webgpu gqa
#25372 merged
Jul 17, 2025 - Restore ability to handle non-hex string in device discovery vendor/device id.
#25427 merged
Jul 17, 2025 - revert qnn sdk version
#25426 merged
Jul 17, 2025 - Update docker images
#25418 merged
Jul 16, 2025 - [webgpu] Apply template to MatMulNBitsWideTile
#25353 merged
Jul 16, 2025 - Fix Build Error when tensor dumping is enabled
#25414 merged
Jul 16, 2025 - [QNN EP] Gpu backend test framework & test cases.
#25393 merged
Jul 16, 2025 - [webgpu] fix Slice implementation
#25415 merged
Jul 16, 2025 - Fix 2 device discovery issues.
#25397 merged
Jul 16, 2025 - [QNN-EP] Update ScatterND op to reject only QNN-CPU
#25403 merged
Jul 16, 2025 - [QNN-EP] Support GridSample of linear mode for ONNX opset 20+
#25408 merged
Jul 16, 2025 - [CPU] GQA supports attention scores output
#25319 merged
Jul 16, 2025 - Fix SigLIP casual mask bug
#25360 merged
Jul 15, 2025 - Fix some test issues when WebGPU and DML are enabled in the same build
#25401 merged
Jul 15, 2025 - [WebNN] Add rank range validation for rest ops
#25383 merged
Jul 15, 2025 - Bump lintrunner-adapters from 0.12.4 to 0.12.5
#25380 merged
Jul 15, 2025 - Add vendor id to OrtEpFactory and default ORT logger to CreateEpFactories
#25365 merged
Jul 15, 2025 - [QNN EP] Upgrade QNN to 2.36.1
#25388 merged
Jul 14, 2025 - Bump ruff from 0.12.2 to 0.12.3
#25382 merged
Jul 14, 2025 - Bump transformers from 4.48.0 to 4.52.1 in /onnxruntime/python/tools/transformers/models/llama
#25328 merged
Jul 14, 2025 - Fix number of layers in Whisper export
#25375 merged
Jul 14, 2025 - Bump clang-format from 20.1.7 to 20.1.8
#25381 merged
Jul 14, 2025 - [EP ABI] Update to use Node_GetEpName
#25363 merged
Jul 12, 2025
28 Pull requests opened by 24 people
- ORT perf test support for plugin EP
#25374 opened
Jul 12, 2025 - Optimize layout for SubgroupMatrixLoad on Intel
#25384 opened
Jul 14, 2025 - upgrade protobuf in response to security alert
#25386 opened
Jul 14, 2025 - [WebGPU EP] allow concat operator to handle large number of inputs
#25390 opened
Jul 14, 2025 - [webgpu] use u32 to represent f16 in uniform
#25391 opened
Jul 14, 2025 - [webgpu] Optimize FlashAttention for prefill
#25395 opened
Jul 15, 2025 - Update fusion_attention to properly convert bfloat16 values
#25404 opened
Jul 15, 2025 - 'QnnEpFactory' should provide a fully-qualified path to the backend
#25407 opened
Jul 15, 2025 - Fix the is_leaf check in TreeEnsemble
#25410 opened
Jul 15, 2025 - [EP ABI] Add documentation for OrtValue and ort_graph_to_proto util
#25411 opened
Jul 15, 2025 - add webgpu support for GatherBlockQuantized
#25413 opened
Jul 15, 2025 - Subgroup matrix
#25416 opened
Jul 16, 2025 - Remove one recursive function in TreeEnsemble
#25423 opened
Jul 16, 2025 - [EP ABI] Add Graph_GetModelPath API function
#25424 opened
Jul 16, 2025 - [CANN]Fix issue with negative dynamic tensor shape
#25431 opened
Jul 17, 2025 - [WebNN] Fix some spelling and naming issues
#25433 opened
Jul 17, 2025 - [webgpu] support And operator
#25440 opened
Jul 18, 2025 - [webgpu] support float16 type for Einsum operator
#25443 opened
Jul 18, 2025 - [QNN EP] Enable Conv Op with "auto_pad" param set as VALID
#25444 opened
Jul 18, 2025 - [NV RTX EP] Set Compute Capability only on Turing architecture
#25446 opened
Jul 18, 2025 - [VitisAI] Remove 4k alignment from preferred allocator
#25447 opened
Jul 18, 2025 - [VitisAI] Upstream changes from win-ort
#25448 opened
Jul 18, 2025 - [NV RTX EP] Iraut/vendor id impl
#25449 opened
Jul 18, 2025 - Update .config/1espt/PipelineAutobaseliningConfig.yml
#25450 opened
Jul 18, 2025 - Remove training packages from onnxruntime-ios-packaging-pipeline
#25451 opened
Jul 18, 2025 - Enable TSA for nuget packaging pipelines
#25452 opened
Jul 18, 2025 - [QNN EP] Minor fix to enable MatMulAddFusion
#25453 opened
Jul 18, 2025 - [EP ABI] Signatures for compatibility info methods
#25454 opened
Jul 18, 2025
11 Issues closed by 3 people
- [Performance]
#24787 closed
Jul 18, 2025 - ONNX model loading crashes on MacOS in protobuf function
#25420 closed
Jul 16, 2025 - [Build] compilation error: invalid instruction mnemonic 'vcvtneeph2ps'
#22519 closed
Jul 15, 2025 - [Build] Unable to build ONNX Runtime 1.22 due to dependency update
#25098 closed
Jul 15, 2025 - [Build] Unable to build due to eigen dependency
#25406 closed
Jul 15, 2025 - why not support ceil() in version 1.12.1
#24674 closed
Jul 15, 2025 - [Build] Cannot cross platform build for v3.20 alpine arm64
#24788 closed
Jul 15, 2025 - [Build] Fail to pass AutoEpSelection and OrtEpLibrary tests in Windows x64 QNN build
#24676 closed
Jul 14, 2025 - [Documentation] outdated documents about cuda version and onnxruntime
#24759 closed
Jul 14, 2025 - pip install --no-cache-dir onnxruntime-gpu --no-deps fails to install
#25288 closed
Jul 13, 2025
16 Issues opened by 16 people
- int4 quantized model can't run with DML provider
#25445 opened
Jul 18, 2025 - [Performance] NV TRT RTX provider performance slower than TensorRT on RTX 4000 Ada
#25442 opened
Jul 18, 2025 - [Mobile] QNN graph execute error. Error code: 6001
#25422 opened
Jul 16, 2025 - [Feature Request] Support BERT models In CoreML execution provider
#25421 opened
Jul 16, 2025 - The resize operator's behavior may use Round instead of Floor
#25417 opened
Jul 16, 2025 - Support output QK symbolic shape inference
#25412 opened
Jul 15, 2025 - LogSoftmax produces different results when opset=11 or 13
#25402 opened
Jul 15, 2025 - TreeEnsemble dies due to memory/segmentation errors with large membership values
#25400 opened
Jul 15, 2025 - [Build] libonnxruntime_providers_cuda.so built from soure (v1.22) is too large (1.4GB)
#25399 opened
Jul 15, 2025 - [Performance] Update ONNX runtime to increase performance on CoreML
#25396 opened
Jul 15, 2025 - Can model Optimizations support skip custom operator?
#25394 opened
Jul 15, 2025 - [Feature Request] [C#] Add support for System.Half for input/output tensors
#25392 opened
Jul 14, 2025 - [Feature Request] Injecting GPU memory in CUDA/TensorRT EPs
#25385 opened
Jul 14, 2025 - Invalid MIGraphX EP option: migraphx_load_compiled_path
#25379 opened
Jul 13, 2025 - [Build] Can't build CUDA on Windows
#25377 opened
Jul 13, 2025
48 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Attention Operator (CPU)
#25156 commented on
Jul 18, 2025 • 42 new comments - Plugin EP data transfer and Stream support.
#25254 commented on
Jul 18, 2025 • 20 new comments - [MIGraphx EP] Sync AMD changes upstream
#25338 commented on
Jul 18, 2025 • 13 new comments - [EP ABI] Get EP compiled model compatibility
#25331 commented on
Jul 18, 2025 • 9 new comments - Convert Initializers to OrtValues Phase 2
#25320 commented on
Jul 18, 2025 • 8 new comments - [Mlas] optimize MlasConv using thread partition opt
#25255 commented on
Jul 15, 2025 • 4 new comments - Enable CUDA Graph in nv_tensorrt_rtx EP
#25368 commented on
Jul 18, 2025 • 2 new comments - Iraut/update nv trt rtx ep doc
#25321 commented on
Jul 15, 2025 • 2 new comments - Bump protobuf from 3.20.2 to 4.25.8 in /onnxruntime/python/tools/transformers/models/llama
#25085 commented on
Jul 18, 2025 • 0 new comments - Bump protobuf from 3.20.2 to 4.25.8 in /onnxruntime/python/tools/transformers/models/whisper
#25086 commented on
Jul 18, 2025 • 0 new comments - Bump protobuf from 3.20.3 to 4.25.8 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements
#25087 commented on
Jul 18, 2025 • 0 new comments - Bump protobuf from 4.21.12 to 4.25.8 in /tools/ci_build/github/linux/docker/inference/aarch64/python/cpu/scripts
#25088 commented on
Jul 18, 2025 • 0 new comments - Update index.md
#25119 commented on
Jul 17, 2025 • 0 new comments - KleidiAI SGEMM/IGEMM/Quantized MatMul - Modular MLAS API Changes for KleidiAI
#25187 commented on
Jul 16, 2025 • 0 new comments - [WIP] Add some device discovery support for non-Windows platforms
#25228 commented on
Jul 15, 2025 • 0 new comments - [ARM CPU] SVE support for Elementwise kernels
#25238 commented on
Jul 14, 2025 • 0 new comments - Upgrade xnnpack to latest
#25275 commented on
Jul 14, 2025 • 0 new comments - FIX: dxcore include when compiling with older Windows SDK
#25297 commented on
Jul 13, 2025 • 0 new comments - Bump transformers from 4.50.0 to 4.51.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements
#25322 commented on
Jul 18, 2025 • 0 new comments - [NvTensorRTRTX EP]Disable Fast GELU operator in base model used for NV EP Unit Tests
#25323 commented on
Jul 14, 2025 • 0 new comments - Update python bindings to be able to use a shared allocator and/or IDataTransfer registered by a plugin EP in the Environment
#25346 commented on
Jul 14, 2025 • 0 new comments - Support read-only allocator for use with initializers
#25348 commented on
Jul 14, 2025 • 0 new comments - Add patch for WebGPU on Android to handle fp16 in uniforms
#25349 commented on
Jul 17, 2025 • 0 new comments - [webgpu] Enable per-run control for graph capture
#25367 commented on
Jul 14, 2025 • 0 new comments - [Build] Failed to build onnxruntime java api 1.20.0 on windows (C2039: 'system_clock': is not a member of 'std::chrono')
#24622 commented on
Jul 12, 2025 • 0 new comments - Non-zero status code returned while running LSTM node
#10768 commented on
Jul 12, 2025 • 0 new comments - onnxruntime outputs different results for different opset versions
#25050 commented on
Jul 13, 2025 • 0 new comments - [Build] ORT can't build with cuda 12.9
#24731 commented on
Jul 13, 2025 • 0 new comments - SafeIntOnOverflow() Integer overflow error when running inference in an ASGI server
#12288 commented on
Jul 14, 2025 • 0 new comments - onnxruntime-directml import interference with sklearn
#21724 commented on
Jul 14, 2025 • 0 new comments - [Delivery] Win ARM64 wheels + QNN
#19162 commented on
Jul 14, 2025 • 0 new comments - Multi-threaded GPU inferencing failing with whisper-small: Non-zero status code returned while running DecoderMaskedMultiHeadAttention node
#21413 commented on
Jul 15, 2025 • 0 new comments - OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build.
#23715 commented on
Jul 15, 2025 • 0 new comments - dockerfile with different stage
#17812 commented on
Jul 16, 2025 • 0 new comments - [Mobile] Android Native crash in [split_config.armeabi_v7a.apk!libonnxruntime.so] OrtSessionOptionsAppendExecutionProvider_Nnapi
#25138 commented on
Jul 16, 2025 • 0 new comments - Persistent Crashes on Android/armeabi-v7a
#25097 commented on
Jul 16, 2025 • 0 new comments - How can I output every node's output shape when infer with onnx models with lots of if branches
#25052 commented on
Jul 16, 2025 • 0 new comments - [Mobile] MatMulNbits Q8 Errors out on Android
#24769 commented on
Jul 16, 2025 • 0 new comments - [Build] Build fails: 'error : no operator "+=" matches these operands' with nv_bfloat16
#25162 commented on
Jul 16, 2025 • 0 new comments - [Mobile] TypeError: A bool tensor's data must be type of function Uint8Array() { [native code] }
#25294 commented on
Jul 16, 2025 • 0 new comments - [Web] Fail to link static Wasm library with WebNN EP support
#24936 commented on
Jul 17, 2025 • 0 new comments - [Performance] How does onnxruntime run in parallel mode?
#21259 commented on
Jul 17, 2025 • 0 new comments - Static quantize self-attention module not work
#17278 commented on
Jul 17, 2025 • 0 new comments - ORT 1.22.0 fails assertion on python import on aarch64, but not x86_64
#25103 commented on
Jul 18, 2025 • 0 new comments - [Web] Cannot import from web worker
#25096 commented on
Jul 18, 2025 • 0 new comments - Failed to load library libonnxruntime_providers_cuda.so I am getting the following erro
#19616 commented on
Jul 18, 2025 • 0 new comments - [TRT RTX EP] Implement GetEPContextNodes()
#24901 commented on
Jul 18, 2025 • 0 new comments - [MIGRAPHX] Add ORT generic interface build support for MigraphX
#25004 commented on
Jul 14, 2025 • 0 new comments