ONNX Exporter

Export PyTorch models to ONNX format with optional INT8 quantization

ONNX Export

Export PyTorch models to ONNX format for deployment. Supports:

- Basic ONNX export with graph optimization
- Dynamic INT8 quantization (no calibration needed)
- Static INT8 quantization (with calibration data)
- Output verification against the original model

Export Function

Inference Wrapper

Verification

Usage Examples

from fasterai.export.all import export_onnx, ONNXModel, verify_onnx

# Basic export
path = export_onnx(model, sample, "model.onnx")

# With quantization
path = export_onnx(model, sample, "model.onnx", quantize=True)

# Inference
onnx_model = ONNXModel("model.onnx")
output = onnx_model(input_tensor)

# Verify
assert verify_onnx(model, "model.onnx", sample)



export_onnx


def export_onnx(
    model:nn.Module, # PyTorch model to export
    sample:torch.Tensor, # Example input for tracing (with batch dim)
    output_path:str | Path, # Output .onnx file path
    opset_version:int=17, # ONNX opset version (17 recommended for compatibility)
    quantize:bool=False, # Apply INT8 quantization after export
    quantize_mode:str='dynamic', # "dynamic" (no calibration) or "static"
    calibration_data:Iterable | None=None, # DataLoader for static quantization
    optimize:bool=True, # Run ONNX graph optimizer
    dynamic_batch:bool=True, # Allow variable batch size at runtime
    input_names:list[str] | None=None, # Names for input tensors
    output_names:list[str] | None=None, # Names for output tensors
)->Path:

Export a PyTorch model to ONNX format with optional quantization
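
Static quantization needs a calibration pass over representative inputs to estimate activation ranges. A minimal sketch, assuming calibration_data accepts a standard torch DataLoader and that a TensorDataset batch layout is acceptable (the model, shapes, and loader below are illustrative, not part of the API):

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from fasterai.export.all import export_onnx

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())  # stand-in model
sample = torch.randn(1, 3, 32, 32)                    # example input, batch dim included

# Hypothetical calibration set: a modest number of representative inputs
calib_inputs = torch.randn(128, 3, 32, 32)
calib_loader = DataLoader(TensorDataset(calib_inputs), batch_size=16)

path = export_onnx(
    model, sample, "model_int8.onnx",
    quantize=True,
    quantize_mode='static',        # 'static' uses calibration; 'dynamic' needs none
    calibration_data=calib_loader,
)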



ONNXModel


def ONNXModel(
    path:str | Path, device:str='cpu'
):

Wrapper for ONNX Runtime inference with PyTorch-like interface
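
A short usage sketch, assuming the wrapper accepts and returns tensor-like values as the Usage Examples suggest; the input shapes are illustrative:

import torch
from fasterai.export.all import ONNXModel

onnx_model = ONNXModel("model.onnx", device='cpu')

# If the model was exported with dynamic_batch=True, any batch size should run
for bs in (1, 8):
    x = torch.randn(bs, 3, 32, 32)  # illustrative input shape
    y = onnx_model(x)               # called like a regular nn.Module
    print(bs, y.shape)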



verify_onnx


def verify_onnx(
    model:nn.Module, # Original PyTorch model
    onnx_path:str | Path, # Path to exported ONNX model
    sample:torch.Tensor, # Test input tensor
    rtol:float=0.001, # Relative tolerance
    atol:float=1e-05, # Absolute tolerance
)->bool:

Verify ONNX model outputs match PyTorch model within tolerance
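
Since INT8 outputs will not match FP32 bit-for-bit, a quantized export typically needs looser tolerances than the defaults. A minimal sketch, with model and sample assumed as in the Usage Examples above; the tolerance values are illustrative, not recommendations:

from fasterai.export.all import export_onnx, verify_onnx

# FP32 export: expect agreement at the default tolerances
fp32_path = export_onnx(model, sample, "model.onnx")
assert verify_onnx(model, fp32_path, sample)

# INT8 export: quantization error warrants looser tolerances
int8_path = export_onnx(model, sample, "model_int8.onnx", quantize=True)
if not verify_onnx(model, int8_path, sample, rtol=1e-1, atol=1e-1):
    print("Quantized model diverges beyond tolerance; inspect before deploying")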