Sparsifier
A sparse vector, as opposed to a dense one, is a vector which contains a lot of zeroes. When we speak about making a neural network sparse, we thus mean that the network’s weights are mostly zeroes.
With fasterai, you can do that thanks to the Sparsifier class.
Let’s start by creating a model:
model = resnet18()
As you probably know, weights in a convolutional neural network have 4 dimensions (\(c_{out} \times c_{in} \times k_h \times k_w\))
model.conv1.weight.ndim
4
In the case of ResNet18, the dimension of the first layer weights is \(64 \times 3 \times 7 \times 7\). We can thus plot each of the \(64\) filters as a \(7 \times 7\) color image (because they contain \(3\) channels).
plot_kernels(model.conv1)
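As a quick sanity check on these dimensions (plain Python, no fasterai needed), the filter count and kernel size multiply out to the parameter count reported for this layer later on:

```python
# Shape of the first convolution of ResNet18: (c_out, c_in, k_h, k_w)
c_out, c_in, k_h, k_w = 64, 3, 7, 7

# Each of the 64 filters holds c_in * k_h * k_w = 147 weights.
params_per_filter = c_in * k_h * k_w
total_params = c_out * params_per_filter  # 9,408 parameters in conv1
```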
The Sparsifier class allows us to remove some (parts of) the filters that are considered to be less useful than others. This can be done by first creating an instance of the class, specifying:
- The granularity, i.e. the structure of the parameters that you want to remove. Typically, we remove weights, vectors, kernels or even complete filters.
- The context, i.e. whether you want to consider each layer independently (local), or compare the parameters to remove across the whole network (global).
- The criteria, i.e. the way to assess the usefulness of a parameter. Common methods compare parameters by their magnitude, the lowest-magnitude ones being considered less useful.
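To make these notions concrete, here is a minimal, library-free sketch of the simplest combination (weight granularity, local context, magnitude criterion). This is an illustration only, not fasterai’s implementation:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` percent of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity / 100)   # how many weights to remove
    if k == 0:
        return list(weights)
    # The k-th smallest magnitude acts as the pruning threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = magnitude_prune([0.9, -0.1, 0.5, -0.02, 0.3, 0.7, -0.6, 0.05], 50)
```

Half of the weights (the four smallest in absolute value) are replaced by zeroes; the rest are untouched.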
The user can prune a single layer by using the Sparsifier.sparsify_layer method.
Sparsifier.sparsify_layer
Sparsifier.sparsify_layer (m:torch.nn.modules.module.Module, sparsity:float, round_to:Optional[int]=None)
Apply sparsification to a single layer
| | Type | Default | Details |
|---|---|---|---|
| m | Module | | The layer to sparsify |
| sparsity | float | | Target sparsity level (percentage) |
| round_to | Optional | None | Round to a multiple of this value |
| Returns | None | | |
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 6,615 70.31%
Layer 7 Conv2d 36,864 0 0.00%
Layer 10 Conv2d 36,864 0 0.00%
Layer 13 Conv2d 36,864 0 0.00%
Layer 16 Conv2d 36,864 0 0.00%
Layer 20 Conv2d 73,728 0 0.00%
Layer 23 Conv2d 147,456 0 0.00%
Layer 26 Conv2d 8,192 0 0.00%
Layer 29 Conv2d 147,456 0 0.00%
Layer 32 Conv2d 147,456 0 0.00%
Layer 36 Conv2d 294,912 0 0.00%
Layer 39 Conv2d 589,824 0 0.00%
Layer 42 Conv2d 32,768 0 0.00%
Layer 45 Conv2d 589,824 0 0.00%
Layer 48 Conv2d 589,824 0 0.00%
Layer 52 Conv2d 1,179,648 0 0.00%
Layer 55 Conv2d 2,359,296 0 0.00%
Layer 58 Conv2d 131,072 0 0.00%
Layer 61 Conv2d 2,359,296 0 0.00%
Layer 64 Conv2d 2,359,296 0 0.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 6,615 0.06%
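Note that the reported sparsity is 70.31% rather than exactly 70%. With 'filter' granularity only whole filters can be zeroed, so the achievable sparsity is a multiple of 1/64. A quick check of the figures above:

```python
# Each filter of conv1 holds 3 * 7 * 7 = 147 weights, so the 6,615 zeros
# reported above correspond to a whole number of zeroed filters.
params_per_filter = 3 * 7 * 7
pruned_filters = 6615 // params_per_filter   # 45 of the 64 filters
achieved = 100 * pruned_filters / 64         # closest achievable to 70%
```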
Most of the time, we may want to prune the whole model at once, using the Sparsifier.sparsify_model method, indicating the percentage of sparsity you want to apply.
Sparsifier.sparsify_model
Sparsifier.sparsify_model (sparsity:Union[float,List[float]], round_to:Optional[int]=None)
Apply sparsification to all matching layers in the model
| | Type | Default | Details |
|---|---|---|---|
| sparsity | Union | | Target sparsity level(s) |
| round_to | Optional | None | Round to a multiple of this value |
| Returns | None | | |
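Since the sparsity argument is typed Union[float, List[float]], a list presumably supplies one target per sparsified layer. A hypothetical helper (normalize_sparsities is not part of fasterai) sketching that distinction:

```python
def normalize_sparsities(sparsity, n_layers):
    """Expand a single target into one per layer; pass lists through unchanged."""
    if isinstance(sparsity, (int, float)):
        return [float(sparsity)] * n_layers
    if len(sparsity) != n_layers:
        raise ValueError("need one sparsity target per layer")
    return [float(s) for s in sparsity]
```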
There are several ways in which we can make the model sparse. You will find the most important ones below:
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 6,585 69.99%
Layer 7 Conv2d 36,864 25,805 70.00%
Layer 10 Conv2d 36,864 25,805 70.00%
Layer 13 Conv2d 36,864 25,805 70.00%
Layer 16 Conv2d 36,864 25,805 70.00%
Layer 20 Conv2d 73,728 51,609 70.00%
Layer 23 Conv2d 147,456 103,219 70.00%
Layer 26 Conv2d 8,192 5,734 70.00%
Layer 29 Conv2d 147,456 103,219 70.00%
Layer 32 Conv2d 147,456 103,219 70.00%
Layer 36 Conv2d 294,912 206,438 70.00%
Layer 39 Conv2d 589,824 412,877 70.00%
Layer 42 Conv2d 32,768 22,937 70.00%
Layer 45 Conv2d 589,824 412,877 70.00%
Layer 48 Conv2d 589,824 412,877 70.00%
Layer 52 Conv2d 1,179,648 825,753 70.00%
Layer 55 Conv2d 2,359,296 1,651,507 70.00%
Layer 58 Conv2d 131,072 91,750 70.00%
Layer 61 Conv2d 2,359,296 1,651,506 70.00%
Layer 64 Conv2d 2,359,296 1,651,507 70.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,834 70.00%
You now have a model that is \(70\%\) sparse!
Granularity
As we said earlier, the granularity defines the structure of the parameters that you will remove.
In the example above, we removed individual weights from each convolutional filter, meaning that we now have sparse filters, as can be seen in the image below:
plot_kernels(model.conv1)
Another granularity is, for example, removing column vectors from the filters. To do so, just change the granularity parameter accordingly.
model = resnet18()
sparsifier = Sparsifier(model, 'column', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
plot_kernels(model.conv1)
For more information and examples about the pruning granularities, I suggest you take a look at the corresponding section.
Context
The context defines where to look in the model, i.e. which weights are compared with each other. The two basic contexts are:
- local, i.e. we compare weights within each layer individually. This will lead to layers with similar levels of sparsity.
- global, i.e. we compare weights across the whole model. This will lead to layers with different levels of sparsity.
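The difference is easy to see on a toy example (plain Python, not fasterai): with a local threshold every layer loses the same fraction of weights, while a single global threshold can wipe out a small-magnitude layer entirely and leave a large-magnitude one intact.

```python
# Two toy "layers" whose weights live at very different scales.
layer_a = [0.1, -0.2, 0.3, -0.4]
layer_b = [1.0, -2.0, 3.0, -4.0]

def prune_local(layer, sparsity):
    # Threshold computed within this layer only.
    k = int(len(layer) * sparsity / 100)
    threshold = sorted(abs(w) for w in layer)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in layer]

def prune_global(layers, sparsity):
    # One threshold shared by every layer in the model.
    magnitudes = sorted(abs(w) for layer in layers for w in layer)
    k = int(len(magnitudes) * sparsity / 100)
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in layer]
            for layer in layers]

local_a, local_b = prune_local(layer_a, 50), prune_local(layer_b, 50)
global_a, global_b = prune_global([layer_a, layer_b], 50)
```

At 50% sparsity, the local context zeroes two weights in each layer, while the global context zeroes all of layer_a and nothing in layer_b.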
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 6,585 69.99%
Layer 7 Conv2d 36,864 25,805 70.00%
Layer 10 Conv2d 36,864 25,805 70.00%
Layer 13 Conv2d 36,864 25,805 70.00%
Layer 16 Conv2d 36,864 25,805 70.00%
Layer 20 Conv2d 73,728 51,609 70.00%
Layer 23 Conv2d 147,456 103,219 70.00%
Layer 26 Conv2d 8,192 5,734 70.00%
Layer 29 Conv2d 147,456 103,219 70.00%
Layer 32 Conv2d 147,456 103,219 70.00%
Layer 36 Conv2d 294,912 206,438 70.00%
Layer 39 Conv2d 589,824 412,876 70.00%
Layer 42 Conv2d 32,768 22,937 70.00%
Layer 45 Conv2d 589,824 412,877 70.00%
Layer 48 Conv2d 589,824 412,877 70.00%
Layer 52 Conv2d 1,179,648 825,753 70.00%
Layer 55 Conv2d 2,359,296 1,651,507 70.00%
Layer 58 Conv2d 131,072 91,750 70.00%
Layer 61 Conv2d 2,359,296 1,651,507 70.00%
Layer 64 Conv2d 2,359,296 1,651,507 70.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,834 70.00%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 6,340 67.39%
Layer 7 Conv2d 36,864 11,874 32.21%
Layer 10 Conv2d 36,864 11,788 31.98%
Layer 13 Conv2d 36,864 11,761 31.90%
Layer 16 Conv2d 36,864 11,882 32.23%
Layer 20 Conv2d 73,728 32,673 44.32%
Layer 23 Conv2d 147,456 65,162 44.19%
Layer 26 Conv2d 8,192 1,239 15.12%
Layer 29 Conv2d 147,456 64,940 44.04%
Layer 32 Conv2d 147,456 65,162 44.19%
Layer 36 Conv2d 294,912 174,317 59.11%
Layer 39 Conv2d 589,824 350,277 59.39%
Layer 42 Conv2d 32,768 7,123 21.74%
Layer 45 Conv2d 589,824 349,232 59.21%
Layer 48 Conv2d 589,824 349,235 59.21%
Layer 52 Conv2d 1,179,648 894,622 75.84%
Layer 55 Conv2d 2,359,296 1,789,781 75.86%
Layer 58 Conv2d 131,072 39,957 30.48%
Layer 61 Conv2d 2,359,296 1,789,913 75.87%
Layer 64 Conv2d 2,359,296 1,789,559 75.85%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,837 70.00%
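Even though the per-layer sparsities now vary widely, the overall figure is still the requested 70%: the overall sparsity is just total zeros over total parameters, as the totals from the report above confirm:

```python
# Totals from the report above: the global context hits the overall target
# even though individual layers end up far from 70%.
total_params, total_zeros = 11_166_912, 7_816_837
overall = 100 * total_zeros / total_params
```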
Criteria
The criteria defines how we select the parameters to remove. It is usually given by a scoring method. The most common one is large_final, i.e. selecting the parameters with the highest absolute value, as they are supposed to contribute the most to the final results of the model.
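A criterion can be thought of as a scoring function: parameters with the lowest score are pruned first. A minimal plain-Python sketch (not fasterai’s implementation) contrasting a large_final-style score with its small_final opposite:

```python
# Criteria as scoring functions: prune the weights with the LOWEST score.
def score_large_final(w):
    return abs(w)       # high magnitude = useful, so low |w| gets pruned

def score_small_final(w):
    return -abs(w)      # the opposite: high |w| gets pruned first

def prune_by_score(weights, sparsity, score):
    k = int(len(weights) * sparsity / 100)
    # Indices of the k lowest-scoring weights.
    drop = set(sorted(range(len(weights)),
                      key=lambda i: score(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

kept_large = prune_by_score([1.0, -4.0, 2.0, -3.0], 50, score_large_final)
kept_small = prune_by_score([1.0, -4.0, 2.0, -3.0], 50, score_small_final)
```

The two criteria keep complementary halves of the same layer, which is why the small_final report further down looks so different from the large_final one.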
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 6,340 67.39%
Layer 7 Conv2d 36,864 11,870 32.20%
Layer 10 Conv2d 36,864 11,904 32.29%
Layer 13 Conv2d 36,864 11,784 31.97%
Layer 16 Conv2d 36,864 11,769 31.93%
Layer 20 Conv2d 73,728 32,861 44.57%
Layer 23 Conv2d 147,456 64,797 43.94%
Layer 26 Conv2d 8,192 1,265 15.44%
Layer 29 Conv2d 147,456 65,236 44.24%
Layer 32 Conv2d 147,456 64,946 44.04%
Layer 36 Conv2d 294,912 174,580 59.20%
Layer 39 Conv2d 589,824 349,429 59.24%
Layer 42 Conv2d 32,768 6,896 21.04%
Layer 45 Conv2d 589,824 349,440 59.24%
Layer 48 Conv2d 589,824 350,151 59.37%
Layer 52 Conv2d 1,179,648 894,264 75.81%
Layer 55 Conv2d 2,359,296 1,790,074 75.87%
Layer 58 Conv2d 131,072 40,049 30.55%
Layer 61 Conv2d 2,359,296 1,789,386 75.84%
Layer 64 Conv2d 2,359,296 1,789,797 75.86%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,838 70.00%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', small_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 9,407 99.99%
Layer 7 Conv2d 36,864 702 1.90%
Layer 10 Conv2d 36,864 1,177 3.19%
Layer 13 Conv2d 36,864 281 0.76%
Layer 16 Conv2d 36,864 142 0.39%
Layer 20 Conv2d 73,728 4,066 5.51%
Layer 23 Conv2d 147,456 3,290 2.23%
Layer 26 Conv2d 8,192 9 0.11%
Layer 29 Conv2d 147,456 8,264 5.60%
Layer 32 Conv2d 147,456 1,489 1.01%
Layer 36 Conv2d 294,912 46,798 15.87%
Layer 39 Conv2d 589,824 85,884 14.56%
Layer 42 Conv2d 32,768 94 0.29%
Layer 45 Conv2d 589,824 101,028 17.13%
Layer 48 Conv2d 589,824 107,154 18.17%
Layer 52 Conv2d 1,179,648 1,059,786 89.84%
Layer 55 Conv2d 2,359,296 2,171,266 92.03%
Layer 58 Conv2d 131,072 99 0.08%
Layer 61 Conv2d 2,359,296 2,084,975 88.37%
Layer 64 Conv2d 2,359,296 2,130,925 90.32%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,836 70.00%
For more information and examples about the pruning criteria, I suggest you take a look at the corresponding section.
Remark
In some cases, you may want to impose that the remaining number of parameters be a multiple of 8; this can be done by passing the round_to parameter.
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 5,880 62.50%
Layer 7 Conv2d 36,864 23,040 62.50%
Layer 10 Conv2d 36,864 23,040 62.50%
Layer 13 Conv2d 36,864 23,040 62.50%
Layer 16 Conv2d 36,864 23,040 62.50%
Layer 20 Conv2d 73,728 50,688 68.75%
Layer 23 Conv2d 147,456 101,376 68.75%
Layer 26 Conv2d 8,192 5,632 68.75%
Layer 29 Conv2d 147,456 101,376 68.75%
Layer 32 Conv2d 147,456 101,376 68.75%
Layer 36 Conv2d 294,912 202,752 68.75%
Layer 39 Conv2d 589,824 405,504 68.75%
Layer 42 Conv2d 32,768 22,528 68.75%
Layer 45 Conv2d 589,824 405,504 68.75%
Layer 48 Conv2d 589,824 405,504 68.75%
Layer 52 Conv2d 1,179,648 811,008 68.75%
Layer 55 Conv2d 2,359,296 1,622,016 68.75%
Layer 58 Conv2d 131,072 90,112 68.75%
Layer 61 Conv2d 2,359,296 1,622,016 68.75%
Layer 64 Conv2d 2,359,296 1,622,017 68.75%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,667,449 68.66%
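The figures above are consistent with the number of kept filters being rounded up to a multiple of round_to (this is my reading of the numbers, not fasterai’s actual code): 64 filters at 70% would keep 20, rounded up to 24, giving the reported 62.50% sparsity.

```python
def kept_filters(n_filters, sparsity, round_to):
    # Keep the complement of the requested fraction, rounded UP
    # to the nearest multiple of `round_to` (assumed convention).
    keep = n_filters - int(n_filters * sparsity / 100)
    return ((keep + round_to - 1) // round_to) * round_to

kept = kept_filters(64, 70, 8)        # 20 kept, rounded up to 24
achieved = 100 * (64 - kept) / 64     # 62.5% sparsity, as reported
```

The same arithmetic reproduces the 68.75% reported for the 128-filter layers (39 kept, rounded up to 40).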
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'global', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 1 Conv2d 9,408 8,232 87.50%
Layer 7 Conv2d 36,864 0 0.00%
Layer 10 Conv2d 36,864 0 0.00%
Layer 13 Conv2d 36,864 0 0.00%
Layer 16 Conv2d 36,864 0 0.00%
Layer 20 Conv2d 73,728 69,120 93.75%
Layer 23 Conv2d 147,456 138,240 93.75%
Layer 26 Conv2d 8,192 0 0.00%
Layer 29 Conv2d 147,456 138,240 93.75%
Layer 32 Conv2d 147,456 129,024 87.50%
Layer 36 Conv2d 294,912 285,696 96.88%
Layer 39 Conv2d 589,824 571,392 96.88%
Layer 42 Conv2d 32,768 0 0.00%
Layer 45 Conv2d 589,824 571,392 96.88%
Layer 48 Conv2d 589,824 571,392 96.88%
Layer 52 Conv2d 1,179,648 1,161,216 98.44%
Layer 55 Conv2d 2,359,296 2,322,432 98.44%
Layer 58 Conv2d 131,072 0 0.00%
Layer 61 Conv2d 2,359,296 2,322,432 98.44%
Layer 64 Conv2d 2,359,296 2,322,432 98.44%
--------------------------------------------------------------------------------
Overall all 11,166,912 10,611,240 95.02%
For more information about granularities at which you can operate, please check the related page.