ONNX Exporter
Export PyTorch models to ONNX format with optional INT8 quantization
ONNX Export
Export PyTorch models to ONNX format for deployment. Supports:
- Basic ONNX export with graph optimization
- Dynamic INT8 quantization (no calibration needed)
- Static INT8 quantization (with calibration data)
- Output verification against the original model
Export Function
Inference Wrapper
Verification
Usage Examples
from fasterai.export.all import export_onnx, ONNXModel, verify_onnx
# Basic export
path = export_onnx(model, sample, "model.onnx")
# With quantization
path = export_onnx(model, sample, "model.onnx", quantize=True)
# Inference
onnx_model = ONNXModel("model.onnx")
output = onnx_model(input_tensor)
# Verify
assert verify_onnx(model, "model.onnx", sample)
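Because dynamic_batch=True by default, the exported graph accepts a variable batch size at runtime. A minimal sketch (the batch sizes are illustrative):

import torch

# Trace once with the original sample, then reuse the same graph for a larger batch
path = export_onnx(model, sample, "model.onnx", dynamic_batch=True)
onnx_model = ONNXModel("model.onnx")

out_one = onnx_model(sample)                   # original batch size
out_big = onnx_model(torch.cat([sample] * 8))  # 8x the batch, same exported graph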
export_onnx
def export_onnx(
    model:nn.Module,                         # PyTorch model to export
    sample:torch.Tensor,                     # Example input for tracing (with batch dim)
    output_path:str | Path,                  # Output .onnx file path
    opset_version:int=18,                    # ONNX opset version
    quantize:bool=False,                     # Apply INT8 quantization after export
    quantize_mode:str='dynamic',             # "dynamic" (no calibration) or "static"
    calibration_data:Iterable | None=None,   # DataLoader for static quantization
    optimize:bool=True,                      # Run ONNX graph optimizer
    dynamic_batch:bool=True,                 # Allow variable batch size at runtime
    input_names:list[str] | None=None,       # Names for input tensors
    output_names:list[str] | None=None,      # Names for output tensors
)->Path:
Export a PyTorch model to ONNX format with optional quantization
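Static quantization uses calibration_data to observe activation ranges before quantizing, whereas dynamic mode (the default) needs no calibration. A minimal sketch of the static path, assuming the calibration loader yields (inputs, targets) batches like a standard PyTorch DataLoader:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical calibration set: a few hundred representative inputs is typically enough
calib_ds = TensorDataset(torch.randn(256, 3, 224, 224), torch.zeros(256))
calib_dl = DataLoader(calib_ds, batch_size=32)

path = export_onnx(
    model, sample, "model_int8.onnx",
    quantize=True,
    quantize_mode='static',      # static mode requires calibration data
    calibration_data=calib_dl,   # representative inputs for activation ranges
)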
ONNXModel
def ONNXModel(
    path:str | Path,    # Path to the .onnx model file
    device:str='cpu'    # Device to run inference on
):
Wrapper for ONNX Runtime inference with PyTorch-like interface
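A sketch of the wrapper in an evaluation loop; it assumes the call interface shown in the Usage Examples (torch tensors in, a torch tensor of logits out) and a hypothetical test_dl DataLoader:

import torch

onnx_model = ONNXModel("model_int8.onnx", device='cpu')

preds = []
for xb, _ in test_dl:                  # test_dl: hypothetical (inputs, targets) DataLoader
    out = onnx_model(xb)               # same call style as the PyTorch model
    preds.append(out.argmax(dim=-1))   # assumes classification logits as output
preds = torch.cat(preds)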
verify_onnx
def verify_onnx(
    model:nn.Module,         # Original PyTorch model
    onnx_path:str | Path,    # Path to exported ONNX model
    sample:torch.Tensor,     # Test input tensor
    rtol:float=0.001,        # Relative tolerance
    atol:float=1e-05,        # Absolute tolerance
)->bool:
Verify ONNX model outputs match PyTorch model within tolerance
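INT8 quantization perturbs outputs slightly, so default tolerances that pass for an FP32 export are often too strict for a quantized one. A sketch with loosened tolerances (the values are illustrative, not prescriptive):

# FP32 export: default tolerances are usually fine
assert verify_onnx(model, "model.onnx", sample)

# INT8 export: allow for quantization error
assert verify_onnx(model, "model_int8.onnx", sample, rtol=0.05, atol=0.01)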