def Criteria(
    f:Callable[[torch.Tensor], torch.Tensor], # Function that transforms weights (e.g., torch.abs, torch.square)
    reducer:Callable=<function Reducer.mean>, # Method to reduce dimensions ('mean' or 'sum')
    normalizer:Callable|None=None, # Method to normalize scores (None, 'sum', 'standardization', 'mean', 'max', 'gaussian')
    needs_init:bool=False, # Whether this criteria needs the initial weights
    needs_update:bool=False, # Whether this criteria needs to track weight updates between iterations
    output_fn:Callable[[torch.Tensor, torch.Tensor], torch.Tensor]|None=None, # Function to combine current and reference weights
    return_init:bool=False, # Whether to return the transformed initial weights instead of final output
):
Evaluates neural network parameters based on various criteria for pruning
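One way to read the parameters above is as a small scoring pipeline: transform the weights with f, optionally combine them with a reference through output_fn, reduce, then normalize. The sketch below illustrates that reading in plain PyTorch. It is not the library's implementation: score_weights, reduce_dim and normalize are hypothetical names, and the order of the steps (in particular applying f to the reference as well) is an assumption.

```python
from typing import Callable, Optional

import torch


def score_weights(
    weights: torch.Tensor,
    f: Callable[[torch.Tensor], torch.Tensor],
    reference: Optional[torch.Tensor] = None,
    output_fn: Optional[Callable[[torch.Tensor, torch.Tensor], torch.Tensor]] = None,
    reduce_dim: Optional[int] = None,
    normalize: bool = False,
) -> torch.Tensor:
    """Hypothetical helper illustrating the roles of f, output_fn, reducer and normalizer."""
    scores = f(weights)
    if output_fn is not None and reference is not None:
        # Assumption: the reference weights go through the same transform f
        scores = output_fn(scores, f(reference))
    if reduce_dim is not None:
        # 'mean' reducer: collapse one dimension into a single score per group
        scores = scores.mean(dim=reduce_dim, keepdim=True)
    if normalize:
        # 'sum' normalizer: rescale so that all scores add up to 1
        scores = scores / scores.sum()
    return scores


w = torch.tensor([[1.0, -3.0], [6.0, 6.0]])

# Per-weight magnitude scores
per_weight = score_weights(w, torch.abs)

# One normalized score per row (e.g. per output neuron)
per_row = score_weights(w, torch.abs, reduce_dim=1, normalize=True)
```

Here per_weight keeps the shape of w, while per_row holds one score per row, which is the kind of grouping a structured pruning granularity would need.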
The following criteria compare the current weights against an updating reference, i.e. the weights from the previous training iteration, rather than against the initialization values, to better capture the training dynamics.
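The difference between an initialization reference and an updating reference can be sketched in a few lines of PyTorch. The three scoring functions below are plausible definitions inferred from the criteria names, not confirmed by this page, and the weight values are made up for illustration.

```python
import torch


# Assumed definitions, inferred from the criteria names (not the library's code)
def magnitude_increase(current: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    # How much the absolute value grew relative to the reference
    return current.abs() - reference.abs()


def movement(current: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    # How far the weight moved from the reference, in either direction
    return (current - reference).abs()


def movmag(current: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    # Movement weighted by the current value of the weight
    return (current * (current - reference)).abs()


init = torch.tensor([1.0, -1.0, 0.5])
steps = [
    torch.tensor([1.5, -0.5, 0.5]),   # weights after iteration 1
    torch.tensor([1.5, -2.0, 0.25]),  # weights after iteration 2
]

prev = init.clone()
for w in steps:
    updating_scores = movement(w, prev)  # reference = previous iteration
    prev = w.clone()                     # the reference follows the weights

init_scores = movement(steps[-1], init)  # reference = initialization
```

After the second iteration, updating_scores is zero for the first weight because it did not change in the last step, whereas init_scores still credits it for its earlier move away from initialization.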
Updating Magnitude Increase
demo_model(updating_magnitude_increase)
Updating Movement
demo_model(updating_movement, 50)
Updating Mov-Magnitude
demo_model(updating_movmag)
Activation-Based Criteria
The following criteria use input activation statistics collected during a calibration pass, providing data-aware importance scoring.
Wanda
Wanda (Sun et al., ICLR 2024) scores weight importance as |W| × ‖X‖₂, the product of each weight's magnitude and the L2 norm of its input activation. It is best suited to one-shot post-training sparsification and requires calibration data, passed via Sparsifier(data=...).
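The score itself is simple to compute for a single linear layer. The sketch below is a standalone illustration of the |W| × ‖X‖₂ formula in plain PyTorch; wanda_scores is a hypothetical helper, not part of the library's API, and it leaves out the calibration pass that Sparsifier would run to collect the activations.

```python
import torch


def wanda_scores(weight: torch.Tensor, activations: torch.Tensor) -> torch.Tensor:
    """Score each weight as |W_ij| * ||X_j||_2 (hypothetical helper).

    weight:      (out_features, in_features) weight matrix of a linear layer
    activations: (n_samples, in_features) calibration inputs to that layer
    """
    # L2 norm of each input feature over the calibration samples -> (in_features,)
    act_norm = activations.norm(p=2, dim=0)
    # Broadcast the per-input norm across every row of |W|
    return weight.abs() * act_norm


W = torch.tensor([[1.0, -2.0],
                  [0.5, 4.0]])
X = torch.tensor([[3.0, 0.0],
                  [4.0, 1.0]])

scores = wanda_scores(W, X)
```

In this toy example the first input feature has an activation norm of 5 and the second a norm of 1, so the weight of magnitude 1.0 in the first column outranks the weight of magnitude 4.0 in the second, which a pure magnitude criterion would not do.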