Inference
ONNX
Quick Answer
Open Neural Network Exchange: a standard format for representing trained models across frameworks.
ONNX is an open format for machine learning models: a model is stored as a graph of standardized operators, so any runtime that supports ONNX can run it. This makes models framework-agnostic: train in PyTorch, export to ONNX, and deploy with ONNX Runtime. Runtimes and tooling built around the format can apply optimizations such as quantization and operator fusion, and can target specific hardware. ONNX Runtime, for example, is optimized for both CPU and GPU inference. ONNX is a practical choice when deployment should not be tied to the training framework.
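To make the ideas of a framework-neutral graph and operator fusion concrete, here is a toy sketch in plain Python. It is not the ONNX file format and not ONNX Runtime's optimizer: `Node`, `KERNELS`, `run`, and `fuse_matmul_add` are hypothetical names invented for illustration, and only the operator names (`MatMul`, `Add`, `Relu`, `Gemm`) are borrowed from the ONNX operator set. The sketch assumes a model is an ordered list of operator nodes, and it fuses a `MatMul` followed by an `Add` into a single `Gemm` node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One operator in a toy model graph (hypothetical, not the ONNX schema)."""
    op: str        # operator name, e.g. "MatMul"
    inputs: list   # names of the values this node reads
    output: str    # name of the value this node produces


def matmul(a, b):
    # Plain nested-list matrix product: rows of a times columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, bias):
    # Add a bias vector to every row of a.
    return [[x + y for x, y in zip(row, bias)] for row in a]


def relu(a):
    return [[max(0.0, x) for x in row] for row in a]


# A "runtime" here is just a table of kernels keyed by operator name.
KERNELS = {
    "MatMul": matmul,
    "Add": add,
    "Relu": relu,
    "Gemm": lambda a, b, c: add(matmul(a, b), c),  # fused MatMul + Add
}


def run(graph, feeds, output_name):
    """Execute the nodes in order and return the requested value."""
    values = dict(feeds)
    for node in graph:
        args = [values[name] for name in node.inputs]
        values[node.output] = KERNELS[node.op](*args)
    return values[output_name]


def fuse_matmul_add(graph):
    """Replace each MatMul that feeds only the next Add with one Gemm node."""
    fused, i = [], 0
    while i < len(graph):
        node = graph[i]
        nxt = graph[i + 1] if i + 1 < len(graph) else None
        consumers = sum(node.output in other.inputs for other in graph)
        if (node.op == "MatMul" and nxt is not None and nxt.op == "Add"
                and nxt.inputs[0] == node.output and consumers == 1):
            fused.append(Node("Gemm", node.inputs + [nxt.inputs[1]], nxt.output))
            i += 2  # skip the Add that was folded into the Gemm
        else:
            fused.append(node)
            i += 1
    return fused


# A one-layer model: y = Relu(x @ w + b)
graph = [
    Node("MatMul", ["x", "w"], "h"),
    Node("Add", ["h", "b"], "z"),
    Node("Relu", ["z"], "y"),
]
feeds = {"x": [[1.0, 2.0]], "w": [[1.0, -1.0], [0.5, 2.0]], "b": [0.5, -4.0]}

print(run(graph, feeds, "y"))                   # [[2.5, 0.0]]
print(run(fuse_matmul_add(graph), feeds, "y"))  # [[2.5, 0.0]]
```

The fused graph has two nodes instead of three and produces the same output. Because the graph only names operators and values, any runtime that supplies kernels for those operator names could execute it, which is the property that lets a model trained in one framework be served by another runtime.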
Last verified: 2026-04-08