Sparsifier

Make your neural network sparse with fasterai

A sparse vector, as opposed to a dense one, is a vector that contains a lot of zeroes. When we speak about making a neural network sparse, we thus mean that the network's weights are mostly zeroes.
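As a quick, purely illustrative example (this snippet is not part of fasterai), here is a dense and a sparse version of a small vector:

import torch

dense  = torch.tensor([0.3, -1.2, 0.8, 0.5, -0.1])  # few zeroes
sparse = torch.tensor([0.0,  0.0, 0.8, 0.0, -0.1])  # mostly zeroes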

With fasterai, you can do that thanks to the Sparsifier class.

Let’s start by creating a model

from torchvision.models import resnet18  # assuming the torchvision ResNet18 is used

model = resnet18()

As you probably know, weights in a convolutional neural network have 4 dimensions (\(c_{out} \times c_{in} \times k_h \times k_w\))

model.conv1.weight.ndim
4

In the case of ResNet18, the first layer's weights have dimension \(64 \times 3 \times 7 \times 7\). We can thus plot each of the \(64\) filters as a \(7 \times 7\) color image (because they contain \(3\) channels).
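We can verify this shape directly:

model.conv1.weight.shape
torch.Size([64, 3, 7, 7])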

plot_kernels(model.conv1)

The Sparsifier class allows us to remove some (parts of) the filters that are considered less useful than others. This can be done by first creating an instance of the class, specifying:

* the granularity, i.e. the structure of parameters to remove,
* the context, i.e. where in the model the weights are compared,
* the criteria, i.e. how the weights to remove are selected.
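As an illustration, an instantiation could look like the sketch below; the comments spell out the role of each argument, following the positional calls used throughout this page:

sparsifier = Sparsifier(model,        # the model to sparsify
                        'weight',     # granularity: the structure of parameters to remove
                        'local',      # context: where the weights are compared
                        large_final)  # criteria: how the weights to keep are scored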

Users can pass a single layer to prune by using the Sparsifier.sparsify_layer method.

source

Sparsifier.sparsify_layer


def sparsify_layer(
    m:nn.Module, # The layer to sparsify
    sparsity:float, # Target sparsity level (percentage)
    round_to:Optional[int]=None, # Round to a multiple of this value
)->None:

Apply sparsification to a single layer

model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      6,615         70.31%
Layer 1              Conv2d          36,864     0              0.00%
Layer 2              Conv2d          36,864     0              0.00%
Layer 3              Conv2d          36,864     0              0.00%
Layer 4              Conv2d          36,864     0              0.00%
Layer 5              Conv2d          73,728     0              0.00%
Layer 6              Conv2d          147,456    0              0.00%
Layer 7              Conv2d          8,192      0              0.00%
Layer 8              Conv2d          147,456    0              0.00%
Layer 9              Conv2d          147,456    0              0.00%
Layer 10             Conv2d          294,912    0              0.00%
Layer 11             Conv2d          589,824    0              0.00%
Layer 12             Conv2d          32,768     0              0.00%
Layer 13             Conv2d          589,824    1              0.00%
Layer 14             Conv2d          589,824    0              0.00%
Layer 15             Conv2d          1,179,648  0              0.00%
Layer 16             Conv2d          2,359,296  0              0.00%
Layer 17             Conv2d          131,072    0              0.00%
Layer 18             Conv2d          2,359,296  0              0.00%
Layer 19             Conv2d          2,359,296  0              0.00%
--------------------------------------------------------------------------------
Overall              all             11,166,912 6,616          0.06%

Most of the time, we may want to prune the whole model at once, using the Sparsifier.sparsify_model method, indicating the percentage of sparsity you want to apply.


source

Sparsifier.sparsify_model


def sparsify_model(
    sparsity:Union[float, list[float]], # Target sparsity level(s)
    round_to:Optional[int]=None, # Round to a multiple of this value
)->None:

Apply sparsification to all matching layers in the model

There are several ways in which we can make the model sparse. You will find the most important ones below:

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      6,585         69.99%
Layer 1              Conv2d          36,864     25,805        70.00%
Layer 2              Conv2d          36,864     25,805        70.00%
Layer 3              Conv2d          36,864     25,805        70.00%
Layer 4              Conv2d          36,864     25,805        70.00%
Layer 5              Conv2d          73,728     51,609        70.00%
Layer 6              Conv2d          147,456    103,219       70.00%
Layer 7              Conv2d          8,192      5,734         70.00%
Layer 8              Conv2d          147,456    103,219       70.00%
Layer 9              Conv2d          147,456    103,219       70.00%
Layer 10             Conv2d          294,912    206,438       70.00%
Layer 11             Conv2d          589,824    412,877       70.00%
Layer 12             Conv2d          32,768     22,937        70.00%
Layer 13             Conv2d          589,824    412,877       70.00%
Layer 14             Conv2d          589,824    412,877       70.00%
Layer 15             Conv2d          1,179,648  825,753       70.00%
Layer 16             Conv2d          2,359,296  1,651,507     70.00%
Layer 17             Conv2d          131,072    91,750        70.00%
Layer 18             Conv2d          2,359,296  1,651,506     70.00%
Layer 19             Conv2d          2,359,296  1,651,507     70.00%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,816,834     70.00%

You now have a model that is \(70\%\) sparse!
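If you want to double-check, you can count the zeroes of a layer yourself (a quick sanity check, not a fasterai API):

zeros = (model.conv1.weight == 0).sum().item()
total = model.conv1.weight.numel()
print(f'{100 * zeros / total:.2f}% sparsity')  # roughly 70%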

Granularity

As we said earlier, the granularity defines the structure of the parameters that you will remove.

In the previous example, we removed individual weights from each convolutional filter, meaning that we now have sparse filters, as can be seen in the image below:

plot_kernels(model.conv1)
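To make the difference between granularities more concrete, here is a rough sketch (not the fasterai implementation) of what a 70% mask looks like at the weight level versus the filter level:

import torch

w = torch.randn(64, 3, 7, 7)  # a conv weight with the same shape as conv1

# weight granularity: individual weights are zeroed anywhere in the tensor
weight_mask = (w.abs() >= w.abs().flatten().quantile(0.7)).float()

# filter granularity: whole 3x7x7 filters are kept or zeroed together
filter_scores = w.abs().sum(dim=(1, 2, 3))  # one score per filter
filter_mask = (filter_scores >= filter_scores.quantile(0.7)).float().view(-1, 1, 1, 1)

sparse_w = w * weight_mask  # or w * filter_mask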

Another granularity is, for example, removing column vectors from the filters. To do so, just change the granularity parameter accordingly.

model = resnet18()
sparsifier = Sparsifier(model, 'column', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
plot_kernels(model.conv1)

For more information and examples about the pruning granularities, I suggest you take a look at the corresponding section.

Context

The context defines where to look in the model, i.e. where the weights are compared. The two basic contexts are:

* local, i.e. we compare weights within each layer individually. This will lead to layers with similar levels of sparsity.
* global, i.e. we compare weights across the whole model. This will lead to layers with different levels of sparsity.
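The difference can be sketched as follows (a toy illustration, not the fasterai implementation):

import torch

layers = [torch.randn(1000), torch.randn(1000) * 3]  # two layers with different weight scales

# local: one threshold per layer, so each layer ends up roughly 70% sparse
local_thresholds = [l.abs().quantile(0.7) for l in layers]

# global: a single threshold over all weights, so the small-scale layer
# ends up sparser than the large-scale one
global_threshold = torch.cat([l.abs() for l in layers]).quantile(0.7)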

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      6,585         69.99%
Layer 1              Conv2d          36,864     25,805        70.00%
Layer 2              Conv2d          36,864     25,805        70.00%
Layer 3              Conv2d          36,864     25,805        70.00%
Layer 4              Conv2d          36,864     25,805        70.00%
Layer 5              Conv2d          73,728     51,609        70.00%
Layer 6              Conv2d          147,456    103,219       70.00%
Layer 7              Conv2d          8,192      5,734         70.00%
Layer 8              Conv2d          147,456    103,219       70.00%
Layer 9              Conv2d          147,456    103,219       70.00%
Layer 10             Conv2d          294,912    206,438       70.00%
Layer 11             Conv2d          589,824    412,877       70.00%
Layer 12             Conv2d          32,768     22,937        70.00%
Layer 13             Conv2d          589,824    412,877       70.00%
Layer 14             Conv2d          589,824    412,876       70.00%
Layer 15             Conv2d          1,179,648  825,752       70.00%
Layer 16             Conv2d          2,359,296  1,651,507     70.00%
Layer 17             Conv2d          131,072    91,750        70.00%
Layer 18             Conv2d          2,359,296  1,651,507     70.00%
Layer 19             Conv2d          2,359,296  1,651,506     70.00%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,816,832     70.00%

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      6,239         66.32%
Layer 1              Conv2d          36,864     11,890        32.25%
Layer 2              Conv2d          36,864     11,923        32.34%
Layer 3              Conv2d          36,864     11,748        31.87%
Layer 4              Conv2d          36,864     11,827        32.08%
Layer 5              Conv2d          73,728     32,599        44.22%
Layer 6              Conv2d          147,456    65,040        44.11%
Layer 7              Conv2d          8,192      1,217         14.86%
Layer 8              Conv2d          147,456    65,228        44.24%
Layer 9              Conv2d          147,456    64,803        43.95%
Layer 10             Conv2d          294,912    174,703       59.24%
Layer 11             Conv2d          589,824    349,858       59.32%
Layer 12             Conv2d          32,768     7,042         21.49%
Layer 13             Conv2d          589,824    349,389       59.24%
Layer 14             Conv2d          589,824    349,414       59.24%
Layer 15             Conv2d          1,179,648  894,852       75.86%
Layer 16             Conv2d          2,359,296  1,790,061     75.87%
Layer 17             Conv2d          131,072    39,351        30.02%
Layer 18             Conv2d          2,359,296  1,789,686     75.86%
Layer 19             Conv2d          2,359,296  1,789,966     75.87%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,816,836     70.00%

Criteria

The criteria defines how we select the parameters to remove. It is usually expressed as a scoring method. The most common one is large_final, i.e. keeping the parameters with the largest absolute value, as they are assumed to contribute the most to the final results of the model.
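At the level of a single tensor, the scoring can be sketched as follows (a toy illustration, not the fasterai implementation); small_final, used in the second example below, is simply the opposite choice:

import torch

w = torch.randn(64, 3, 7, 7)

scores_large_final = w.abs()   # large_final: large-magnitude weights score high and are kept
scores_small_final = -w.abs()  # small_final: small-magnitude weights score high and are kept

# in both cases, the 70% lowest-scoring weights are the ones set to zero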

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      6,281         66.76%
Layer 1              Conv2d          36,864     11,788        31.98%
Layer 2              Conv2d          36,864     11,748        31.87%
Layer 3              Conv2d          36,864     11,739        31.84%
Layer 4              Conv2d          36,864     11,915        32.32%
Layer 5              Conv2d          73,728     32,519        44.11%
Layer 6              Conv2d          147,456    65,121        44.16%
Layer 7              Conv2d          8,192      1,238         15.11%
Layer 8              Conv2d          147,456    65,236        44.24%
Layer 9              Conv2d          147,456    65,362        44.33%
Layer 10             Conv2d          294,912    174,830       59.28%
Layer 11             Conv2d          589,824    349,212       59.21%
Layer 12             Conv2d          32,768     7,123         21.74%
Layer 13             Conv2d          589,824    349,297       59.22%
Layer 14             Conv2d          589,824    349,364       59.23%
Layer 15             Conv2d          1,179,648  895,626       75.92%
Layer 16             Conv2d          2,359,296  1,789,071     75.83%
Layer 17             Conv2d          131,072    39,881        30.43%
Layer 18             Conv2d          2,359,296  1,790,398     75.89%
Layer 19             Conv2d          2,359,296  1,789,088     75.83%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,816,837     70.00%

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', small_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      8,832         93.88%
Layer 1              Conv2d          36,864     636            1.73%
Layer 2              Conv2d          36,864     354            0.96%
Layer 3              Conv2d          36,864     168            0.46%
Layer 4              Conv2d          36,864     435            1.18%
Layer 5              Conv2d          73,728     9,451         12.82%
Layer 6              Conv2d          147,456    4,919          3.34%
Layer 7              Conv2d          8,192      28             0.34%
Layer 8              Conv2d          147,456    3,264          2.21%
Layer 9              Conv2d          147,456    6,210          4.21%
Layer 10             Conv2d          294,912    58,395        19.80%
Layer 11             Conv2d          589,824    129,374       21.93%
Layer 12             Conv2d          32,768     13             0.04%
Layer 13             Conv2d          589,824    177,530       30.10%
Layer 14             Conv2d          589,824    97,588        16.55%
Layer 15             Conv2d          1,179,648  1,179,646    100.00%
Layer 16             Conv2d          2,359,296  2,359,296    100.00%
Layer 17             Conv2d          131,072    1,106          0.84%
Layer 18             Conv2d          2,359,296  1,420,295     60.20%
Layer 19             Conv2d          2,359,296  2,359,295    100.00%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,816,835     70.00%

For more information and examples about the pruning criteria, I suggest you take a look at the corresponding section.

Remark

In some cases, you may want to force the remaining number of parameters to be a multiple of 8; this can be done by passing the round_to parameter.
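As a concrete illustration of the effect (assuming the number of kept filters is rounded to a multiple of 8), consider the first layer in the report below, which has 64 filters:

total_filters = 64
kept = 24  # keeping 30% of 64 filters gives 19.2, rounded to a multiple of 8
sparsity = (total_filters - kept) / total_filters
print(f'{100 * sparsity:.2f}%')  # 62.50%, the value reported for Layer 0 below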

model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      5,880         62.50%
Layer 1              Conv2d          36,864     23,040        62.50%
Layer 2              Conv2d          36,864     23,040        62.50%
Layer 3              Conv2d          36,864     23,040        62.50%
Layer 4              Conv2d          36,864     23,040        62.50%
Layer 5              Conv2d          73,728     50,688        68.75%
Layer 6              Conv2d          147,456    101,376       68.75%
Layer 7              Conv2d          8,192      5,632         68.75%
Layer 8              Conv2d          147,456    101,376       68.75%
Layer 9              Conv2d          147,456    101,376       68.75%
Layer 10             Conv2d          294,912    202,752       68.75%
Layer 11             Conv2d          589,824    405,504       68.75%
Layer 12             Conv2d          32,768     22,528        68.75%
Layer 13             Conv2d          589,824    405,504       68.75%
Layer 14             Conv2d          589,824    405,504       68.75%
Layer 15             Conv2d          1,179,648  811,008       68.75%
Layer 16             Conv2d          2,359,296  1,622,016     68.75%
Layer 17             Conv2d          131,072    90,112        68.75%
Layer 18             Conv2d          2,359,296  1,622,016     68.75%
Layer 19             Conv2d          2,359,296  1,622,016     68.75%
--------------------------------------------------------------------------------
Overall              all             11,166,912 7,667,448     68.66%

model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'global', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()

Sparsity Report:
--------------------------------------------------------------------------------
Layer                Type            Params     Zeros      Sparsity  
--------------------------------------------------------------------------------
Layer 0              Conv2d          9,408      8,232         87.50%
Layer 1              Conv2d          36,864     0              0.00%
Layer 2              Conv2d          36,864     0              0.00%
Layer 3              Conv2d          36,864     0              0.00%
Layer 4              Conv2d          36,864     0              0.00%
Layer 5              Conv2d          73,728     69,120        93.75%
Layer 6              Conv2d          147,456    138,240       93.75%
Layer 7              Conv2d          8,192      0              0.00%
Layer 8              Conv2d          147,456    138,240       93.75%
Layer 9              Conv2d          147,456    138,240       93.75%
Layer 10             Conv2d          294,912    285,696       96.88%
Layer 11             Conv2d          589,824    571,392       96.88%
Layer 12             Conv2d          32,768     0              0.00%
Layer 13             Conv2d          589,824    571,392       96.88%
Layer 14             Conv2d          589,824    571,392       96.88%
Layer 15             Conv2d          1,179,648  1,161,216     98.44%
Layer 16             Conv2d          2,359,296  2,322,432     98.44%
Layer 17             Conv2d          131,072    0              0.00%
Layer 18             Conv2d          2,359,296  2,322,432     98.44%
Layer 19             Conv2d          2,359,296  2,285,568     96.88%
--------------------------------------------------------------------------------
Overall              all             11,166,912 10,583,592    94.78%

For more information about the granularities at which you can operate, please check the related page.