# Quantizer

> Quantize your network

## Quantizer

`Quantizer(backend: str = 'x86', method: str = 'static', qconfig_mapping: Optional[Dict] = None, custom_configs: Optional[Dict] = None, use_per_tensor: bool = False, verbose: bool = False)`
Initialize a quantizer with the specified backend and options.
| | Type | Default | Details |
|---|---|---|---|
| backend | str | x86 | Target backend for quantization |
| method | str | static | Quantization method: 'static', 'dynamic', or 'qat' |
| qconfig_mapping | Optional | None | Optional custom quantization config |
| custom_configs | Optional | None | Custom module-specific configurations |
| use_per_tensor | bool | False | Force per-tensor quantization |
| verbose | bool | False | Enable verbose output |
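The `use_per_tensor` flag refers to the standard affine quantization scheme, in which a float tensor is mapped to int8 using a single scale and zero point. A minimal pure-Python sketch of that mapping (illustrative only; the function names here are not the library's internals):

```python
def choose_qparams(x_min, x_max, qmin=-128, qmax=127):
    """Compute a per-tensor scale and zero point for int8 affine quantization."""
    # Widen the range to include zero so that 0.0 is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize_values(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to clamped int8 codes."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize_values(q_values, scale, zero_point):
    """Approximately recover the original floats from the int8 codes."""
    return [(q - zero_point) * scale for q in q_values]

scale, zp = choose_qparams(-1.0, 1.0)
q = quantize_values([0.0, 0.5, -0.75, 1.0], scale, zp)
approx = dequantize_values(q, scale, zp)
```

Per-channel quantization keeps a separate `(scale, zero_point)` pair per output channel, which usually recovers accuracy at a small metadata cost; forcing per-tensor trades that accuracy for simpler kernels.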
## Quantizer.quantize

`Quantizer.quantize(model: torch.nn.Module, calibration_dl: Any, max_calibration_samples: int = 100, device: Union[str, torch.device] = 'cpu')`
Quantize a model using the specified method and settings.
| | Type | Default | Details |
|---|---|---|---|
| model | Module | | Model to quantize |
| calibration_dl | Any | | Dataloader for calibration |
| max_calibration_samples | int | 100 | Maximum number of samples to use for calibration |
| device | Union | cpu | Device to use for calibration |
| **Returns** | **Module** | | |
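`calibration_dl` and `max_calibration_samples` exist because static quantization must observe representative activations before it can fix scales and zero points. A hypothetical min-max observer sketches the idea (names and dataloader are illustrative, not the library's internals):

```python
class MinMaxObserver:
    """Track the running min/max of values seen during calibration."""
    def __init__(self):
        self.x_min = float("inf")
        self.x_max = float("-inf")

    def observe(self, batch):
        self.x_min = min(self.x_min, min(batch))
        self.x_max = max(self.x_max, max(batch))

    def qparams(self, qmin=-128, qmax=127):
        # Include zero in the range so 0.0 maps to an exact integer.
        lo, hi = min(self.x_min, 0.0), max(self.x_max, 0.0)
        scale = (hi - lo) / (qmax - qmin)
        return scale, int(round(qmin - lo / scale))

# Calibration loop capped by a sample budget, mirroring the
# max_calibration_samples parameter above (stand-in dataloader).
calibration_dl = [[0.1, 0.9], [-0.4, 0.2], [0.6, -0.1]]
max_calibration_samples = 2
obs = MinMaxObserver()
for i, batch in enumerate(calibration_dl):
    if i >= max_calibration_samples:
        break
    obs.observe(batch)
scale, zero_point = obs.qparams()
```

Dynamic quantization skips this pass and computes activation ranges at inference time, which is why a calibration dataloader only matters for the 'static' (and 'qat') methods.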