Sparsifier

A sparse vector, as opposed to a dense one, is a vector that contains many zeroes. When we speak about making a neural network sparse, we thus mean that the network's weights are mostly zeroes.

With fasterai, you can do that thanks to the Sparsifier class.

Let's start by creating a model:

model = resnet18()
As you probably know, weights in a convolutional neural network have 4 dimensions (\(c_{out} \times c_{in} \times k_h \times k_w\))
model.conv1.weight.ndim

4
In the case of ResNet18, the dimension of the first layer weights is \(64 \times 3 \times 7 \times 7\). We thus can plot each of the \(64\) filters as a \(7 \times 7\) color image (because they contain \(3\) channels).
plot_kernels(model.conv1)
The Sparsifier class allows us to remove the parts of the filters that are considered less useful than others. This can be done by first creating an instance of the class, specifying:

- The granularity, i.e. the structure of the parameters that you want to remove. Typically, we remove weights, vectors, kernels or even complete filters.
- The context, i.e. whether you want to consider each layer independently (local), or compare the parameters to remove across the whole network (global).
- The criteria, i.e. the way to assess the usefulness of a parameter. Common methods compare parameters using their magnitude, with the lowest-magnitude ones considered less useful.
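To make the magnitude idea concrete before diving into the API, here is a minimal, self-contained sketch in plain Python (the `magnitude_prune` helper is hypothetical and illustrative only, not fasterai's actual implementation): it zeroes out the lowest-magnitude fraction of a flat list of weights.

```python
# Minimal sketch of magnitude-based pruning (illustrative only, not fasterai
# internals): zero out the `sparsity` percent of weights with the smallest
# absolute value.

def magnitude_prune(weights, sparsity):
    n_remove = int(len(weights) * sparsity / 100)
    # Indices sorted by magnitude, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    removed = set(order[:n_remove])
    return [0.0 if i in removed else w for i, w in enumerate(weights)]

weights = [0.5, -0.1, 0.02, -0.8, 0.3, 0.05, -0.4, 0.9, 0.01, -0.2]
# At 50% sparsity, the five smallest-magnitude weights become zero
print(magnitude_prune(weights, 50))
```

fasterai applies the same principle, but at configurable granularities and over real 4-dimensional convolution weights.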
Users can prune a single layer by using the Sparsifier.sparsify_layer method.
Sparsifier.sparsify_layer
def sparsify_layer(
m:nn.Module, # The layer to sparsify
sparsity:float, # Target sparsity level (percentage)
round_to:Optional[int]=None, # Round to a multiple of this value
)->None:
Apply sparsification to a single layer
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 6,615 70.31%
Layer 1 Conv2d 36,864 0 0.00%
Layer 2 Conv2d 36,864 0 0.00%
Layer 3 Conv2d 36,864 0 0.00%
Layer 4 Conv2d 36,864 0 0.00%
Layer 5 Conv2d 73,728 0 0.00%
Layer 6 Conv2d 147,456 0 0.00%
Layer 7 Conv2d 8,192 0 0.00%
Layer 8 Conv2d 147,456 0 0.00%
Layer 9 Conv2d 147,456 0 0.00%
Layer 10 Conv2d 294,912 0 0.00%
Layer 11 Conv2d 589,824 0 0.00%
Layer 12 Conv2d 32,768 0 0.00%
Layer 13 Conv2d 589,824 1 0.00%
Layer 14 Conv2d 589,824 0 0.00%
Layer 15 Conv2d 1,179,648 0 0.00%
Layer 16 Conv2d 2,359,296 0 0.00%
Layer 17 Conv2d 131,072 0 0.00%
Layer 18 Conv2d 2,359,296 0 0.00%
Layer 19 Conv2d 2,359,296 0 0.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 6,616 0.06%
Most of the time, we may want to prune the whole model at once, using the Sparsifier.sparsify_model method, indicating the percentage of sparsity you want to apply.
Sparsifier.sparsify_model
def sparsify_model(
sparsity:Union[float, list[float]], # Target sparsity level(s)
round_to:Optional[int]=None, # Round to a multiple of this value
)->None:
Apply sparsification to all matching layers in the model
There are several ways to make that first layer sparse. You will find the most important ones below:
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 6,585 69.99%
Layer 1 Conv2d 36,864 25,805 70.00%
Layer 2 Conv2d 36,864 25,805 70.00%
Layer 3 Conv2d 36,864 25,805 70.00%
Layer 4 Conv2d 36,864 25,805 70.00%
Layer 5 Conv2d 73,728 51,609 70.00%
Layer 6 Conv2d 147,456 103,219 70.00%
Layer 7 Conv2d 8,192 5,734 70.00%
Layer 8 Conv2d 147,456 103,219 70.00%
Layer 9 Conv2d 147,456 103,219 70.00%
Layer 10 Conv2d 294,912 206,438 70.00%
Layer 11 Conv2d 589,824 412,877 70.00%
Layer 12 Conv2d 32,768 22,937 70.00%
Layer 13 Conv2d 589,824 412,877 70.00%
Layer 14 Conv2d 589,824 412,877 70.00%
Layer 15 Conv2d 1,179,648 825,753 70.00%
Layer 16 Conv2d 2,359,296 1,651,507 70.00%
Layer 17 Conv2d 131,072 91,750 70.00%
Layer 18 Conv2d 2,359,296 1,651,506 70.00%
Layer 19 Conv2d 2,359,296 1,651,507 70.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,834 70.00%
You now have a model that is \(70\%\) sparse!
Granularity
As we said earlier, the granularity defines the structure of the parameters that you will remove.
In the example above, we removed individual weights from each convolutional filter, meaning that we now have sparse filters, as can be seen in the image below:
plot_kernels(model.conv1)
Another granularity is, for example, removing column vectors from the filters. To do so, just change the granularity parameter accordingly.
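To illustrate what removing a column means, here is a hedged sketch (the `zero_column` helper is hypothetical, not fasterai's code) operating on a single \(3 \times 3\) kernel: rather than zeroing individual weights, a whole column is removed at once, for instance the one with the lowest total magnitude.

```python
# Illustrative sketch of 'column' granularity (not fasterai internals): an
# entire column of a k_h x k_w kernel is zeroed at once, instead of
# individual weights.

kernel = [
    [0.1, 0.5, 0.2],
    [0.3, 0.4, 0.6],
    [0.2, 0.1, 0.3],
]

def zero_column(kernel, col):
    return [[0.0 if j == col else w for j, w in enumerate(row)] for row in kernel]

# Score each column by its total magnitude and remove the weakest one
col_scores = [sum(abs(row[j]) for row in kernel) for j in range(len(kernel[0]))]
weakest = col_scores.index(min(col_scores))
print(zero_column(kernel, weakest))  # the first column has the lowest score here
```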
model = resnet18()
sparsifier = Sparsifier(model, 'column', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
plot_kernels(model.conv1)
For more information and examples about the pruning granularities, I suggest you take a look at the corresponding section.
Context
The context defines where to look in the model, i.e. where we compare weights from. The two basic contexts are:

- local, i.e. we compare weights within each layer individually. This will lead to layers with similar levels of sparsity.
- global, i.e. we compare weights across the whole model. This will lead to layers with different levels of sparsity.
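The difference can be sketched with a small, self-contained example (plain Python, hypothetical helpers, not fasterai's internals): local pruning removes the same fraction from every layer, while global pruning applies one magnitude threshold across all layers, so layers with small weights lose much more.

```python
# Illustrative sketch of local vs. global context (not fasterai internals).

def local_sparsities(layers, sparsity):
    # Each layer removes the same fraction of its own weights
    result = []
    for layer in layers:
        k = int(len(layer) * sparsity / 100)
        result.append(k / len(layer))
    return result

def global_sparsities(layers, sparsity):
    # A single magnitude threshold is computed over all weights at once
    all_w = sorted(abs(w) for layer in layers for w in layer)
    threshold = all_w[int(len(all_w) * sparsity / 100)]
    return [sum(abs(w) < threshold for w in layer) / len(layer) for layer in layers]

# One layer with large weights, one with small weights
layers = [[0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.4]]
print(local_sparsities(layers, 50))   # both layers lose half their weights
print(global_sparsities(layers, 50))  # the small-magnitude layer loses everything
```

This is exactly the pattern visible in the two reports below: uniform per-layer sparsity in the local case, and widely varying sparsity in the global case.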
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 6,585 69.99%
Layer 1 Conv2d 36,864 25,805 70.00%
Layer 2 Conv2d 36,864 25,805 70.00%
Layer 3 Conv2d 36,864 25,805 70.00%
Layer 4 Conv2d 36,864 25,805 70.00%
Layer 5 Conv2d 73,728 51,609 70.00%
Layer 6 Conv2d 147,456 103,219 70.00%
Layer 7 Conv2d 8,192 5,734 70.00%
Layer 8 Conv2d 147,456 103,219 70.00%
Layer 9 Conv2d 147,456 103,219 70.00%
Layer 10 Conv2d 294,912 206,438 70.00%
Layer 11 Conv2d 589,824 412,877 70.00%
Layer 12 Conv2d 32,768 22,937 70.00%
Layer 13 Conv2d 589,824 412,877 70.00%
Layer 14 Conv2d 589,824 412,876 70.00%
Layer 15 Conv2d 1,179,648 825,752 70.00%
Layer 16 Conv2d 2,359,296 1,651,507 70.00%
Layer 17 Conv2d 131,072 91,750 70.00%
Layer 18 Conv2d 2,359,296 1,651,507 70.00%
Layer 19 Conv2d 2,359,296 1,651,506 70.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,832 70.00%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 6,239 66.32%
Layer 1 Conv2d 36,864 11,890 32.25%
Layer 2 Conv2d 36,864 11,923 32.34%
Layer 3 Conv2d 36,864 11,748 31.87%
Layer 4 Conv2d 36,864 11,827 32.08%
Layer 5 Conv2d 73,728 32,599 44.22%
Layer 6 Conv2d 147,456 65,040 44.11%
Layer 7 Conv2d 8,192 1,217 14.86%
Layer 8 Conv2d 147,456 65,228 44.24%
Layer 9 Conv2d 147,456 64,803 43.95%
Layer 10 Conv2d 294,912 174,703 59.24%
Layer 11 Conv2d 589,824 349,858 59.32%
Layer 12 Conv2d 32,768 7,042 21.49%
Layer 13 Conv2d 589,824 349,389 59.24%
Layer 14 Conv2d 589,824 349,414 59.24%
Layer 15 Conv2d 1,179,648 894,852 75.86%
Layer 16 Conv2d 2,359,296 1,790,061 75.87%
Layer 17 Conv2d 131,072 39,351 30.02%
Layer 18 Conv2d 2,359,296 1,789,686 75.86%
Layer 19 Conv2d 2,359,296 1,789,966 75.87%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,836 70.00%
Criteria
The criteria defines how we select the parameters to remove. It is usually given by a scoring method. The most common one is large_final, i.e. keep the parameters with the highest absolute value, as they are assumed to contribute the most to the model's output.
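A criterion can be thought of as a scoring function: parameters with the lowest scores are removed. The sketch below (hypothetical helpers in plain Python, not fasterai's implementation) contrasts large_final with its opposite, small_final, which is used in the second example further down.

```python
# Illustrative sketch of scoring-based criteria (not fasterai internals):
# the n_remove lowest-scoring parameters are zeroed.

def large_final_score(w):
    # High magnitude => high score => kept
    return abs(w)

def small_final_score(w):
    # Low magnitude => high score => kept (the opposite selection)
    return -abs(w)

def prune(weights, score, n_remove):
    order = sorted(range(len(weights)), key=lambda i: score(weights[i]))
    removed = set(order[:n_remove])
    return [0.0 if i in removed else w for i, w in enumerate(weights)]

weights = [0.9, -0.05, 0.4, -0.7, 0.01]
print(prune(weights, large_final_score, 2))  # the two smallest magnitudes removed
print(prune(weights, small_final_score, 2))  # the two largest magnitudes removed
```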
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 6,281 66.76%
Layer 1 Conv2d 36,864 11,788 31.98%
Layer 2 Conv2d 36,864 11,748 31.87%
Layer 3 Conv2d 36,864 11,739 31.84%
Layer 4 Conv2d 36,864 11,915 32.32%
Layer 5 Conv2d 73,728 32,519 44.11%
Layer 6 Conv2d 147,456 65,121 44.16%
Layer 7 Conv2d 8,192 1,238 15.11%
Layer 8 Conv2d 147,456 65,236 44.24%
Layer 9 Conv2d 147,456 65,362 44.33%
Layer 10 Conv2d 294,912 174,830 59.28%
Layer 11 Conv2d 589,824 349,212 59.21%
Layer 12 Conv2d 32,768 7,123 21.74%
Layer 13 Conv2d 589,824 349,297 59.22%
Layer 14 Conv2d 589,824 349,364 59.23%
Layer 15 Conv2d 1,179,648 895,626 75.92%
Layer 16 Conv2d 2,359,296 1,789,071 75.83%
Layer 17 Conv2d 131,072 39,881 30.43%
Layer 18 Conv2d 2,359,296 1,790,398 75.89%
Layer 19 Conv2d 2,359,296 1,789,088 75.83%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,837 70.00%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', small_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 8,832 93.88%
Layer 1 Conv2d 36,864 636 1.73%
Layer 2 Conv2d 36,864 354 0.96%
Layer 3 Conv2d 36,864 168 0.46%
Layer 4 Conv2d 36,864 435 1.18%
Layer 5 Conv2d 73,728 9,451 12.82%
Layer 6 Conv2d 147,456 4,919 3.34%
Layer 7 Conv2d 8,192 28 0.34%
Layer 8 Conv2d 147,456 3,264 2.21%
Layer 9 Conv2d 147,456 6,210 4.21%
Layer 10 Conv2d 294,912 58,395 19.80%
Layer 11 Conv2d 589,824 129,374 21.93%
Layer 12 Conv2d 32,768 13 0.04%
Layer 13 Conv2d 589,824 177,530 30.10%
Layer 14 Conv2d 589,824 97,588 16.55%
Layer 15 Conv2d 1,179,648 1,179,646 100.00%
Layer 16 Conv2d 2,359,296 2,359,296 100.00%
Layer 17 Conv2d 131,072 1,106 0.84%
Layer 18 Conv2d 2,359,296 1,420,295 60.20%
Layer 19 Conv2d 2,359,296 2,359,295 100.00%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,816,835 70.00%
For more information and examples about the pruning criteria, I suggest you take a look at the corresponding section.
Remark
In some cases, you may want the remaining number of parameters to be a multiple of 8; this can be done by passing the round_to parameter.
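One plausible reading of round_to, consistent with the reports below but offered here as an assumption rather than a description of fasterai's exact code: the number of kept units is rounded up to a multiple of round_to, so the achieved sparsity ends up slightly below the target.

```python
# Hypothetical sketch of round_to (an assumption inferred from the sparsity
# reports, not fasterai's exact code): round the number of kept units up to
# a multiple of `round_to`, lowering the achieved sparsity slightly.

def achieved_sparsity(n_units, sparsity, round_to):
    kept = n_units - int(n_units * sparsity / 100)
    kept = ((kept + round_to - 1) // round_to) * round_to  # round up to multiple
    return 100 * (n_units - kept) / n_units

# 64 filters at a 70% target: exact pruning would keep 20 filters, but rounding
# up to a multiple of 8 keeps 24, i.e. 62.5% sparsity instead of 70%.
print(achieved_sparsity(64, 70, 8))
```

This matches the 62.50% reported for Layer 0 (64 filters) and the 68.75% reported for the 128-filter layers.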
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 5,880 62.50%
Layer 1 Conv2d 36,864 23,040 62.50%
Layer 2 Conv2d 36,864 23,040 62.50%
Layer 3 Conv2d 36,864 23,040 62.50%
Layer 4 Conv2d 36,864 23,040 62.50%
Layer 5 Conv2d 73,728 50,688 68.75%
Layer 6 Conv2d 147,456 101,376 68.75%
Layer 7 Conv2d 8,192 5,632 68.75%
Layer 8 Conv2d 147,456 101,376 68.75%
Layer 9 Conv2d 147,456 101,376 68.75%
Layer 10 Conv2d 294,912 202,752 68.75%
Layer 11 Conv2d 589,824 405,504 68.75%
Layer 12 Conv2d 32,768 22,528 68.75%
Layer 13 Conv2d 589,824 405,504 68.75%
Layer 14 Conv2d 589,824 405,504 68.75%
Layer 15 Conv2d 1,179,648 811,008 68.75%
Layer 16 Conv2d 2,359,296 1,622,016 68.75%
Layer 17 Conv2d 131,072 90,112 68.75%
Layer 18 Conv2d 2,359,296 1,622,016 68.75%
Layer 19 Conv2d 2,359,296 1,622,016 68.75%
--------------------------------------------------------------------------------
Overall all 11,166,912 7,667,448 68.66%
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'global', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity Report:
--------------------------------------------------------------------------------
Layer Type Params Zeros Sparsity
--------------------------------------------------------------------------------
Layer 0 Conv2d 9,408 8,232 87.50%
Layer 1 Conv2d 36,864 0 0.00%
Layer 2 Conv2d 36,864 0 0.00%
Layer 3 Conv2d 36,864 0 0.00%
Layer 4 Conv2d 36,864 0 0.00%
Layer 5 Conv2d 73,728 69,120 93.75%
Layer 6 Conv2d 147,456 138,240 93.75%
Layer 7 Conv2d 8,192 0 0.00%
Layer 8 Conv2d 147,456 138,240 93.75%
Layer 9 Conv2d 147,456 138,240 93.75%
Layer 10 Conv2d 294,912 285,696 96.88%
Layer 11 Conv2d 589,824 571,392 96.88%
Layer 12 Conv2d 32,768 0 0.00%
Layer 13 Conv2d 589,824 571,392 96.88%
Layer 14 Conv2d 589,824 571,392 96.88%
Layer 15 Conv2d 1,179,648 1,161,216 98.44%
Layer 16 Conv2d 2,359,296 2,322,432 98.44%
Layer 17 Conv2d 131,072 0 0.00%
Layer 18 Conv2d 2,359,296 2,322,432 98.44%
Layer 19 Conv2d 2,359,296 2,285,568 96.88%
--------------------------------------------------------------------------------
Overall all 11,166,912 10,583,592 94.78%
For more information about granularities at which you can operate, please check the related page.