Fully-Connected Layers Decomposer

Factorize heavy FC layers into smaller ones

Overview

The FC_Decomposer class reduces model size by factorizing large fully-connected (Linear) layers into two smaller layers using Singular Value Decomposition (SVD). This is particularly effective for models with large FC layers like VGG or older architectures with big classifier heads.

Key Benefits:

  • Reduces parameter count without changing the model's external interface (same inputs and outputs)
  • No retraining required (though fine-tuning may improve accuracy)
  • Works on any model with Linear layers

When to Use FC Decomposition

| Scenario | Recommendation |
|---|---|
| Large classifier heads (e.g., VGG’s 4096→4096→1000) | Highly recommended - significant savings |
| Modern architectures (ResNet, EfficientNet) | Limited benefit - already efficient |
| Transformer attention layers | Use with caution - may hurt performance |
| Pre-deployment optimization | Good complement to pruning/quantization |
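One rough way to judge which row a model falls into is to measure how much of its parameter budget sits in Linear layers. The sketch below is illustrative only: `linear_share` is a hypothetical helper, not part of fasterai, and the toy model merely stands in for a convolutional backbone with a large classifier head.

```python
import torch.nn as nn


def linear_share(model: nn.Module) -> float:
    """Fraction of a model's parameters that live in nn.Linear layers."""
    total = sum(p.numel() for p in model.parameters())
    linear = sum(
        p.numel()
        for m in model.modules()
        if isinstance(m, nn.Linear)
        for p in m.parameters()
    )
    return linear / total if total else 0.0


# Toy stand-in used only for counting parameters (it is never run forward)
toy = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),   # 16*3*3*3 + 16 = 448 parameters
    nn.Flatten(),
    nn.Linear(1024, 512),              # 1024*512 + 512 = 524,800 parameters
    nn.Linear(512, 10),                # 512*10 + 10 = 5,130 parameters
)
share = linear_share(toy)
print(f"Share of parameters in Linear layers: {share:.1%}")
```

A high share suggests FC decomposition has room to help; a low share, as is typical for architectures that end in a single small classifier, suggests the gains will be marginal.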

Compression Ratio

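For a Linear layer with weight shape (out_features, in_features):

  • Original parameters: out_features × in_features + out_features (with bias)
  • After decomposition (keeping k singular values): k × in_features + out_features × k + out_features
  • Compression ratio: for a square layer, assuming k = (1 - percent_removed) × in_features, roughly 1 / (2 × (1 - percent_removed)), ignoring the bias

Decomposition only saves parameters when k < (in_features × out_features) / (in_features + out_features). For a square layer, that means removing more than half of the singular values.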
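The parameter counts above can be checked with plain arithmetic. The following is a minimal sketch, not library code: the helper names are hypothetical, and the choice of k = 1024 for a VGG-style 4096→4096 layer is only an example.

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of a dense layer: weight matrix plus optional bias."""
    return out_features * in_features + (out_features if bias else 0)


def decomposed_params(in_features: int, out_features: int, k: int, bias: bool = True) -> int:
    """Parameter count after factorizing into (in -> k) and (k -> out) layers."""
    return k * in_features + out_features * k + (out_features if bias else 0)


def break_even_rank(in_features: int, out_features: int) -> float:
    """Rank at which the two factor matrices hold as many weights as the original."""
    return in_features * out_features / (in_features + out_features)


# VGG-style 4096 -> 4096 layer, keeping k = 1024 singular values (example value)
original = linear_params(4096, 4096)
compressed = decomposed_params(4096, 4096, k=1024)
ratio = original / compressed

print(f"original={original:,} compressed={compressed:,} ratio={ratio:.2f}x")
print(f"break-even rank: {break_even_rank(4096, 4096):.0f}")
```

Keeping a quarter of the singular values of a square layer roughly halves its parameter count, and keeping exactly half of them (k = 2048 here) gives no reduction at all.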

How It Works

SVD decomposes a weight matrix into three matrices: \(W = U \Sigma V^T\)

Where:

  • \(U\) contains left singular vectors (output features)
  • \(\Sigma\) is diagonal with singular values (importance scores)
  • \(V^T\) contains right singular vectors (input features)

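By keeping only the top \(k\) singular values, we approximate \(W \approx U_k \Sigma_k V_k^T\) and store it as two smaller matrices, one of shape (out_features, k) and one of shape (k, in_features), trading accuracy for compression.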
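To make the idea concrete, here is a minimal PyTorch sketch of a rank-k factorization of a single Linear layer. It is an illustration under stated assumptions, not the FC_Decomposer implementation: `factorize_linear` is a hypothetical helper, and folding \(\Sigma_k\) into the first layer is one of several equivalent choices.

```python
import torch
import torch.nn as nn


def factorize_linear(layer: nn.Linear, k: int) -> nn.Sequential:
    """Approximate `layer` with two smaller Linear layers using a rank-k SVD."""
    W = layer.weight.data                                  # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)    # singular values in descending order
    has_bias = layer.bias is not None

    first = nn.Linear(layer.in_features, k, bias=False)
    second = nn.Linear(k, layer.out_features, bias=has_bias)

    # first computes (Sigma_k V_k^T) x, second computes U_k (...) + b
    first.weight.data = (torch.diag(S[:k]) @ Vh[:k, :]).clone()   # (k, in_features)
    second.weight.data = U[:, :k].clone()                          # (out_features, k)
    if has_bias:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)


torch.manual_seed(0)
dense = nn.Linear(64, 32)
x = torch.randn(8, 64)

full_rank = factorize_linear(dense, k=32)   # nothing discarded
low_rank = factorize_linear(dense, k=8)     # lossy, but fewer weights

with torch.no_grad():
    exact_error = (dense(x) - full_rank(x)).abs().max().item()
    approx_error = (dense(x) - low_rank(x)).abs().max().item()

print(f"max abs error at full rank: {exact_error:.2e}")
print(f"max abs error at rank 8:    {approx_error:.2e}")
```

At full rank the two-layer version reproduces the original outputs up to floating-point error. Lowering k shrinks the layer but increases the approximation error, which is the accuracy-for-compression trade-off described above.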


Usage Example

from fasterai.misc.fc_decomposer import FC_Decomposer
from torchvision.models import vgg16

# Load a model with large FC layers
# (torchvision >= 0.13; older releases use vgg16(pretrained=True))
model = vgg16(weights="IMAGENET1K_V1")

# Decompose, removing 50% of singular values
decomposer = FC_Decomposer()
compressed_model = decomposer.decompose(model, percent_removed=0.5)

# Check parameter reduction
original_params = sum(p.numel() for p in model.parameters())
compressed_params = sum(p.numel() for p in compressed_model.parameters())
print(f"Compression: {original_params/compressed_params:.2f}x")

See Also

  • FC Decomposer Tutorial - Step-by-step walkthrough with examples
  • BN Folding - Another optimization technique to reduce inference overhead
  • Pruner - Remove entire filters for structured compression