Criteria

Which parameters are important in a neural network?

The criteria implemented come from this paper.


source

Reducer


def Reducer(
    *args, **kwargs
):

Provides the reduction methods (e.g. `Reducer.mean`) used to collapse weight dimensions into importance scores.


source

Normalizer


def Normalizer(
    *args, **kwargs
):

Provides the normalization methods (e.g. `Normalizer.standardization`) used to rescale importance scores.


source

Criteria


def Criteria(
    f:Callable[[torch.Tensor], torch.Tensor], # Function that transforms weights (e.g., torch.abs, torch.square)
    reducer:Callable=Reducer.mean, # Method to reduce dimensions ('mean' or 'sum')
    normalizer:Callable | None=None, # Method to normalize scores (None, 'sum', 'standardization', 'mean', 'max', 'gaussian')
    needs_init:bool=False, # Whether this criteria needs the initial weights
    needs_update:bool=False, # Whether this criteria needs to track weight updates between iterations
    output_fn:Callable[[torch.Tensor, torch.Tensor], torch.Tensor] | None=None, # Function to combine current and reference weights
    return_init:bool=False, # Whether to return the transformed initial weights instead of final output
):

Evaluates neural network parameters based on various criteria for pruning.
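The class internals aren't shown here, so as a rough sketch of how `f`, `reducer`, and `normalizer` presumably compose (an assumption based on the parameter comments above, not the library's actual code):

```python
import torch

# Minimal emulation of a Criteria(f, reducer, normalizer) pipeline -- a sketch,
# not the real class: transform the weights, optionally reduce, optionally normalize.
def score(weights, f=torch.abs, reducer=None, normalizer=None):
    s = f(weights)          # transform weights, e.g. torch.abs or torch.square
    if reducer is not None:
        s = reducer(s)      # collapse dimensions, e.g. mean over a filter
    if normalizer is not None:
        s = normalizer(s)   # rescale scores, e.g. divide by their sum
    return s

w = torch.tensor([[1.0, -3.0], [2.0, -0.5]])
print(score(w))                                      # large_final-style magnitude
print(score(w, normalizer=lambda s: s / s.sum()))    # scores rescaled to sum to 1
```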

Magnitude Based Criteria

Random

demo_model(random)

Large Final Value

demo_model(large_final)

Squared Final Value

demo_model(squared_final)

Small Final Value

demo_model(small_final)

Init Based Criteria

Large Init Value

demo_model(large_init)

Small Init Value

demo_model(small_init)

Large Init Large Final Value

demo_model(large_init_large_final, 80)

Small Init Small Final Value

demo_model(small_init_small_final)

Increasing Magnitude

demo_model(magnitude_increase, 60)

Movement Pruning

demo_model(movement)

movmag = Criteria(noop, needs_init=True, output_fn=lambda x,y: torch.abs(torch.mul(x, torch.sub(x,y))))
demo_model(movmag)
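The `movmag` score combines magnitude and movement: with `x` the current weights and `y` the initial ones, it computes |x · (x − y)|, so a weight scores high when it is both large and has moved far from its initialization. On toy tensors (independent of the library):

```python
import torch

x = torch.tensor([1.0, -2.0, 0.1])   # current (final) weights
y = torch.tensor([0.5, -1.0, 0.4])   # initial weights
movmag_score = torch.abs(torch.mul(x, torch.sub(x, y)))
print(movmag_score)  # high for weights that are both large and moved far
```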

Update Based Criteria

The following criteria compare the weights against an updating reference value, i.e. their value from the previous training iteration rather than their initialization, to better capture the training dynamics.
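A sketch of the bookkeeping this implies (an assumption about the mechanism, not the library's actual code): keep a copy of the weights from the previous step and score against that copy instead of the init values.

```python
import torch

w = torch.tensor([1.0, -2.0])          # current weights
prev = w.clone()                        # reference saved from the last iteration
w = w + torch.tensor([0.1, -0.3])      # one (fake) training update
step_movement = torch.abs(w - prev)     # movement since the previous step,
print(step_movement)                    # not since initialization
```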

Updating Magnitude Increase

demo_model(updating_magnitude_increase)

Updating Movement

demo_model(updating_movement, 50)

Updating Mov-Magnitude

demo_model(updating_movmag)

New Ideas

updating_magnitude_increase = Criteria(torch.abs, needs_update=True, output_fn= lambda x,y: torch.abs(torch.sub(x,y)))

demo_model(updating_magnitude_increase)

updating_magnitude_increase = Criteria(torch.abs, needs_update=True, output_fn= lambda x,y: torch.sub(x,y))

demo_model(updating_magnitude_increase)

updating_magnitude_increase = Criteria(torch.square, needs_update=True, output_fn= lambda x,y: torch.abs(torch.sub(x,y)))

demo_model(updating_magnitude_increase)

updating_movmag = Criteria(noop, needs_update=True, output_fn=lambda x,y: torch.abs(torch.mul(x, torch.sub(x,y))))
demo_model(updating_movmag)

updating_movmag = Criteria(noop, needs_update=True, output_fn=lambda x,y: torch.abs(torch.mul(torch.square(x), torch.sub(x,y))))
demo_model(updating_movmag)

updating_movmag = Criteria(torch.square, needs_update=True, output_fn=lambda x,y: torch.abs(torch.mul(x, torch.sub(x,y))))
#updating_movmag = Criteria(noop, needs_update=True, output_fn=lambda x,y: torch.mul(x, torch.sub(x,y)))
demo_model(updating_movmag)

updating_movmag = Criteria(torch.abs, needs_update=True, output_fn=lambda x,y: torch.abs(torch.mul(x, torch.sub(x,y))))
#updating_movmag = Criteria(noop, needs_update=True, output_fn=lambda x,y: torch.mul(x, torch.sub(x,y)))
demo_model(updating_movmag, 30)

updating_movmag = Criteria(torch.abs, needs_update=True, output_fn=lambda x,y: torch.mul(x, torch.sub(x,y)))

demo_model(updating_movmag, 80)

updating_movmag = Criteria(torch.square, needs_update=True, output_fn=lambda x,y: torch.mul(x, torch.sub(x,y)))

demo_model(updating_movmag)

updating_movmag = Criteria(noop, needs_update=True, output_fn=lambda x,y: torch.mul(x, torch.sub(x,y)))

demo_model(updating_movmag)

updating_movement = Criteria(noop, needs_update=True, output_fn= lambda x,y: torch.abs(torch.sub(-x,y)))
demo_model(updating_movement, 50)

updating_movement = Criteria(torch.abs, needs_update=True, output_fn= lambda x,y: torch.abs(torch.sub(-x,y)))
demo_model(updating_movement)

updating_movement = Criteria(torch.abs, needs_update=True, output_fn= lambda x,y: torch.abs(torch.cosh(torch.sub(x,y))))
demo_model(updating_movement)

updating_movement = Criteria(torch.square, needs_update=True, output_fn= lambda x,y: torch.abs(torch.sub(x,y)))
demo_model(updating_movement)

updating_movement = Criteria(noop, needs_update=True, output_fn= lambda x,y: torch.sub(x,y))
demo_model(updating_movement)

from functools import partial

mine = partial(torch.pow, exponent=4)   # quartic transform, defined for experimentation
large_final = Criteria(torch.frac)      # experiment: fractional part instead of magnitude
demo_model(large_final)

First-order Taylor expansion on the weight (as per NVIDIA's Taylor pruning)
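In NVIDIA's Taylor pruning, a parameter's importance is the first-order Taylor estimate of the loss change when that parameter is removed, |w · ∂L/∂w|. A hedged sketch with a toy loss so a gradient exists (the library's exact formulation isn't shown here):

```python
import torch

w = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)
loss = (w ** 2).sum()              # toy loss, just so w.grad is populated
loss.backward()                    # d(w**2)/dw = 2*w
taylor_score = (w * w.grad).abs()  # |w * dL/dw|, first-order importance
print(taylor_score)
```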

scores = torch.randn(100).abs()
normed = Normalizer.standardization(scores)
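Assuming `Normalizer.standardization` computes the usual z-score, the same result can be reproduced by hand and checked to have (approximately) zero mean and unit standard deviation:

```python
import torch

scores = torch.randn(100).abs()
standardized = (scores - scores.mean()) / scores.std()  # z-score by hand
print(standardized.mean(), standardized.std())          # ~0 and ~1
```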

See Also

  • Sparsifier - Apply sparsification using these criteria
  • Pruner - Structured pruning with importance scoring
  • Granularity - Control what gets pruned (weights, filters, etc.)