🚀 Model Compression Pipeline


A comprehensive framework for optimizing deep learning models through advanced compression techniques

Key Features • Architecture • Installation • Usage • Results • Roadmap • Contributing

📑 Overview

The Model Compression Pipeline is an end-to-end framework designed to compress state-of-the-art deep learning models while preserving accuracy. This project implements various compression techniques including pruning, quantization, and knowledge distillation, allowing researchers and practitioners to optimize models for deployment on resource-constrained devices.

✨ Key Features

  • Modular Architecture: Easily extensible framework for experimenting with different compression techniques
  • Multiple Compression Techniques:
    • 🔪 Pruning: Remove redundant weights and connections
    • 🔢 Quantization: Convert weights to lower-precision formats
    • 🧠 Knowledge Distillation: Train smaller "student" models from larger "teacher" models
    • 🎫 Lottery Ticket Hypothesis: Find and train sparse subnetworks from their original initial weights
  • Model Support:
    • CNN architectures (ResNet, MobileNet, EfficientNet)
    • Vision Transformers (ViT variants)
  • Dataset Integration:
    • CIFAR-10/100, ImageNet, Oxford Flowers102
  • Comprehensive Evaluation:
    • Accuracy, model size, inference latency, memory usage benchmarks
    • Detailed visualization and reporting

๐Ÿ—๏ธ Architecture

The pipeline consists of the following key components:

model_compression_pipeline/
├── src/               # Source code
│   ├── data/          # Data loading and preprocessing
│   ├── models/        # Model architectures and training
│   ├── compression/   # Compression techniques implementation
│   ├── evaluation/    # Benchmarking and comparison
│   └── utils/         # Helper utilities
├── experiments/       # Jupyter notebooks for experiments
├── docs/              # Documentation
└── results/           # Saved models and metrics

Compression Techniques

| Technique | Description | Benefits |
|---|---|---|
| Pruning | Removes weights based on magnitude, importance, or structure | Reduces model size and computation with minimal accuracy impact |
| Quantization | Converts 32-bit floats to lower precision (8-bit, 4-bit, 2-bit) | Significantly decreases model size and improves inference speed |
| Knowledge Distillation | Trains compact models using the outputs of larger models | Creates smaller, faster models that retain knowledge from larger ones |
| Lottery Ticket Hypothesis | Finds sparse subnetworks with performance comparable to the full network | Identifies highly efficient subnetworks that train effectively from initialization |
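To make the first row concrete: magnitude pruning keeps the largest-magnitude weights and zeros the rest. Below is a minimal, framework-free sketch of the idea (the pipeline itself presumably operates on PyTorch modules; this is an illustration, not the repository's implementation):

```python
def magnitude_prune(weights, prune_rate):
    """Zero out the fraction `prune_rate` of weights with smallest magnitude."""
    k = int(len(weights) * prune_rate)
    if k == 0:
        return list(weights)
    # Threshold = the k-th smallest |w|; everything at or below it is cut.
    # (Ties at the threshold may prune slightly more than k weights.)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

The same thresholding logic applies per-layer or globally in real pipelines; structured variants remove whole channels or filters instead of individual weights.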

🔧 Installation

Prerequisites

  • Python 3.8+
  • CUDA-compatible GPU (recommended)

Setup

# Clone the repository
git clone https://github.com/1Utkarsh1/model-compression-pipeline.git
cd model-compression-pipeline

# Install dependencies
pip install -r requirements.txt

📊 Examples

Full Pipeline Workflow

# Train a baseline model and apply all compression techniques
python src/main.py --mode full --model resnet50 --dataset cifar10

Individual Techniques

Baseline Model Training
# Train baseline ResNet50 on CIFAR-10
python src/main.py --mode baseline --model resnet50 --dataset cifar10 --epochs 100
Pruning
# Apply magnitude-based pruning with 50% sparsity
python src/main.py --mode prune --model resnet50 --dataset cifar10 --prune_rate 0.5 --prune_method magnitude
Quantization
# Apply 8-bit post-training quantization
python src/main.py --mode quantize --model resnet50 --dataset cifar10 --bits 8 --quantize_method post_training
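As a rough illustration of what 8-bit post-training quantization does, affine (asymmetric) quantization picks a scale and zero point so the tensor's [min, max] range maps onto the integer grid [0, 255]. A self-contained sketch (not the repository's implementation, which lives in src/compression/):

```python
def quantize_tensor(values, num_bits=8):
    """Affine quantization: map [min, max] of `values` onto [0, 2^bits - 1]."""
    qmin, qmax = 0, (1 << num_bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant tensors
    zero_point = round(qmin - lo / scale)     # integer that represents real 0.0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate real values from the integer representation."""
    return [(qi - zero_point) * scale for qi in q]
```

The round trip introduces at most about one quantization step (`scale`) of error per value, which is why 8-bit quantization typically costs little accuracy while quartering storage.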
Knowledge Distillation
# Distill from ResNet50 to ResNet18
python src/main.py --mode distill --model resnet50 --dataset cifar10 --student resnet18
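Conceptually, distillation trains the student to match the teacher's temperature-softened output distribution. A minimal sketch of the soft-target loss (the repository's actual loss is in src/compression/ and would typically also blend in the hard-label cross-entropy):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.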
Lottery Ticket Hypothesis
# Apply Lottery Ticket pruning with 5 iterations
python src/main.py --mode lottery_ticket --model resnet50 --dataset cifar10 --lottery_iterations 5 --lottery_prune_percent 0.2
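With `--lottery_prune_percent 0.2`, each iteration prunes 20% of the weights that survived the previous rounds, so 5 iterations leave 0.8^5 ≈ 33% of the weights. A sketch of that schedule (an illustration of the arithmetic, not the repository's code):

```python
def lottery_ticket_schedule(prune_percent, iterations):
    """Fraction of weights remaining after each round of iterative pruning.

    In the Lottery Ticket procedure, each round prunes `prune_percent` of the
    surviving weights; survivors are then rewound to their original
    initialization and retrained.
    """
    remaining, schedule = 1.0, []
    for _ in range(iterations):
        remaining *= 1.0 - prune_percent
        schedule.append(remaining)
    return schedule
```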
Generate Comparison Report
# Generate a comprehensive HTML report comparing all techniques
python src/main.py --mode report --model resnet50 --dataset cifar10

📈 Results

Here's a comparison of different compression techniques applied to ResNet50 on CIFAR-10:

| Model | Accuracy | Size (MB) | Inference Time (ms) | Memory Usage (MB) |
|---|---|---|---|---|
| Baseline (ResNet50) | 92.5% | 97.8 | 125 | 550 |
| Pruned (50%) | 91.8% | 49.2 | 110 | 320 |
| Quantized (8-bit) | 92.1% | 24.6 | 85 | 210 |
| Distilled (ResNet18) | 89.3% | 44.7 | 60 | 290 |
| Lottery Ticket (5 iter) | 91.2% | 31.5 | 105 | 280 |

🔮 Advanced Usage

Custom Models

You can easily extend the pipeline to support custom models:

# In src/models/custom_model.py
import torch.nn as nn

class MyCustomModel(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Define your model architecture
        self.features = ...
        # feature_dim = number of features produced by self.features
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x)

# Then register it in the load_baseline_model function

Custom Datasets

Support for new datasets can be added as follows:

# In src/data/data_loader.py
# Add to get_transforms and load_dataset functions
elif dataset_name.lower() == 'my_dataset':
    # Define transforms
    train_transforms = transforms.Compose([...])
    test_transforms = transforms.Compose([...])
    
    # Load dataset
    train_dataset = MyDataset(...)
    val_dataset = MyDataset(...)
    test_dataset = MyDataset(...)

Custom Compression Techniques

The modular architecture allows adding new compression techniques:

# In src/compression/my_technique.py
def apply_my_compression(model, ...):
    # Implement your compression logic
    return compressed_model

๐Ÿ—บ๏ธ Roadmap

  • Implement basic pruning techniques
  • Implement quantization (8-bit, 4-bit, 2-bit)
  • Implement knowledge distillation
  • Add Vision Transformer support
  • Add Lottery Ticket Hypothesis implementation
  • Support for hardware-aware compression
  • Add NLP model support (BERT, GPT variants)
  • Deploy compressed models to mobile/edge devices
  • Add AutoML for finding optimal compression strategies
  • Support for continuous compression during training

📑 Publications

If you use this work in your research, please cite:

@software{model_compression_pipeline,
  author = {Utkarsh Rajput},
  title = {Model Compression Pipeline},
  year = {2025},
  url = {https://github.com/1Utkarsh1/model-compression-pipeline}
}

👥 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines.

Getting Started with Development

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgements


Made with ❤️ by the Model Compression Pipeline Team
Github • Website • Contact
