A comprehensive framework for optimizing deep learning models through advanced compression techniques
Key Features • Architecture • Installation • Usage • Results • Roadmap • Contributing
The Model Compression Pipeline is an end-to-end framework designed to compress state-of-the-art deep learning models while preserving accuracy. This project implements various compression techniques including pruning, quantization, and knowledge distillation, allowing researchers and practitioners to optimize models for deployment on resource-constrained devices.
- Modular Architecture: Easily extensible framework for experimenting with different compression techniques
- Multiple Compression Techniques:
  - Pruning: Remove redundant weights and connections
  - Quantization: Convert weights to lower-precision formats
  - Knowledge Distillation: Train smaller "student" models from larger "teacher" models
  - Lottery Ticket Hypothesis: Find sparse subnetworks that train effectively from their original initialization
- Model Support:
  - CNN architectures (ResNet, MobileNet, EfficientNet)
  - Vision Transformers (ViT variants)
- Dataset Integration:
  - CIFAR-10/100, ImageNet, Oxford Flowers102
- Comprehensive Evaluation:
  - Accuracy, model size, inference latency, and memory usage benchmarks (see the measurement sketch after this list)
  - Detailed visualization and reporting
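As a minimal sketch of how size and latency numbers of this kind can be collected in plain PyTorch (the `benchmark` helper below is illustrative only and is not part of the pipeline's evaluation module):

```python
import time
import torch
import torchvision.models as models

def benchmark(model, input_shape=(1, 3, 224, 224), runs=50, device="cpu"):
    """Rough size/latency benchmark; illustrative only."""
    model = model.to(device).eval()
    # Model size: total parameter bytes, reported in MB
    size_mb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1e6
    x = torch.randn(*input_shape, device=device)
    with torch.no_grad():
        for _ in range(5):          # warm-up iterations
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        latency_ms = (time.perf_counter() - start) / runs * 1000
    return size_mb, latency_ms

size_mb, latency_ms = benchmark(models.resnet18(weights=None))
print(f"size: {size_mb:.1f} MB, latency: {latency_ms:.1f} ms")
```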
The pipeline consists of the following key components:
```
model_compression_pipeline/
├── src/                    # Source code
│   ├── data/               # Data loading and preprocessing
│   ├── models/             # Model architectures and training
│   ├── compression/        # Compression techniques implementation
│   ├── evaluation/         # Benchmarking and comparison
│   └── utils/              # Helper utilities
├── experiments/            # Jupyter notebooks for experiments
├── docs/                   # Documentation
└── results/                # Saved models and metrics
```
| Technique | Description | Benefits |
|---|---|---|
| Pruning | Removes weights based on magnitude, importance, or structure | Reduces model size and computation with minimal accuracy impact |
| Quantization | Converts 32-bit floats to lower-precision (8-bit, 4-bit, 2-bit) | Significantly decreases model size and improves inference speed |
| Knowledge Distillation | Trains compact models using the output of larger models | Creates smaller, faster models that retain knowledge from larger ones |
| Lottery Ticket Hypothesis | Finds sparse subnetworks with comparable performance to full networks | Identifies highly efficient subnetworks that train effectively from initialization |
- Python 3.8+
- CUDA-compatible GPU (recommended)
```bash
# Clone the repository
git clone https://github.com/1Utkarsh1/model-compression-pipeline.git
cd model-compression-pipeline

# Install dependencies
pip install -r requirements.txt
```

```bash
# Train a baseline model and apply all compression techniques
python src/main.py --mode full --model resnet50 --dataset cifar10
```
**Baseline Model Training**

```bash
# Train baseline ResNet50 on CIFAR-10
python src/main.py --mode baseline --model resnet50 --dataset cifar10 --epochs 100
```
**Pruning**

```bash
# Apply magnitude-based pruning with 50% sparsity
python src/main.py --mode prune --model resnet50 --dataset cifar10 --prune_rate 0.5 --prune_method magnitude
```
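Under the hood, magnitude pruning zeroes the weights with the smallest absolute values. A minimal sketch of the idea using PyTorch's built-in pruning utilities (the pipeline's own implementation in `src/compression/` may differ):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import resnet50

model = resnet50(weights=None)

# Prune 50% of the smallest-magnitude weights in every conv/linear layer
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make pruning permanent (bake zeros into the tensor)
```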
**Quantization**

```bash
# Apply 8-bit post-training quantization
python src/main.py --mode quantize --model resnet50 --dataset cifar10 --bits 8 --quantize_method post_training
```
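As a rough illustration of post-training quantization, PyTorch's dynamic quantization converts the weights of selected layer types to int8 after training. This is a simplified stand-in, not the pipeline's actual quantizer; conv-heavy models typically use static post-training quantization (prepare/convert) instead, and the 4-/2-bit modes are not shown here:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(weights=None).eval()

# Dynamic post-training quantization: Linear weights stored as int8,
# activations quantized on the fly at inference time
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```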
**Knowledge Distillation**

```bash
# Distill from ResNet50 to ResNet18
python src/main.py --mode distill --model resnet50 --dataset cifar10 --student resnet18
```
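The core of knowledge distillation is a loss that mixes hard-label cross-entropy with a KL term on temperature-softened teacher/student logits. A minimal sketch of such a loss (the temperature and alpha values are illustrative, not the pipeline's defaults):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```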
**Lottery Ticket Hypothesis**

```bash
# Apply Lottery Ticket pruning with 5 iterations
python src/main.py --mode lottery_ticket --model resnet50 --dataset cifar10 --lottery_iterations 5 --lottery_prune_percent 0.2
```
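Iterative lottery-ticket pruning alternates training, magnitude pruning, and rewinding the surviving weights to their original initialization. A compact sketch of that loop (`train_one_round` is a placeholder for your own training code, not a pipeline function):

```python
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def find_lottery_ticket(model, train_one_round, iterations=5, prune_percent=0.2):
    initial_state = copy.deepcopy(model.state_dict())  # weights at initialization
    for _ in range(iterations):
        train_one_round(model)                          # train the (partially pruned) network
        for module in model.modules():                  # prune smallest-magnitude weights per layer
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                prune.l1_unstructured(module, name="weight", amount=prune_percent)
        # Rewind surviving weights to their original initialization, keeping the masks
        with torch.no_grad():
            for name, module in model.named_modules():
                if hasattr(module, "weight_orig"):
                    module.weight_orig.copy_(initial_state[f"{name}.weight"])
    return model
```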
**Generate Comparison Report**

```bash
# Generate a comprehensive HTML report comparing all techniques
python src/main.py --mode report --model resnet50 --dataset cifar10
```

Here's a comparison of different compression techniques applied to ResNet50 on CIFAR-10:
| Model | Accuracy | Size (MB) | Inference Time (ms) | Memory Usage (MB) |
|---|---|---|---|---|
| Baseline (ResNet50) | 92.5% | 97.8 | 125 | 550 |
| Pruned (50%) | 91.8% | 49.2 | 110 | 320 |
| Quantized (8-bit) | 92.1% | 24.6 | 85 | 210 |
| Distilled (ResNet18) | 89.3% | 44.7 | 60 | 290 |
| Lottery Ticket (5 iter) | 91.2% | 31.5 | 105 | 280 |
You can easily extend the pipeline to support custom models:
```python
# In src/models/custom_model.py
import torch.nn as nn

class MyCustomModel(nn.Module):
    def __init__(self, num_classes, feature_dim=512):
        super().__init__()
        # Define your model architecture (feature extractor + classifier head)
        self.features = nn.Sequential(
            nn.Conv2d(3, feature_dim, kernel_size=3, padding=1),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x)

# Then register it in the load_baseline_model function
```
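Registration might look something like the sketch below; the actual signature of `load_baseline_model` isn't shown in this README, so treat the wiring here as an illustrative assumption:

```python
# Hypothetical wiring -- adapt to the real load_baseline_model in src/models/
from src.models.custom_model import MyCustomModel

def load_baseline_model(model_name, num_classes):
    if model_name.lower() == 'my_custom_model':
        return MyCustomModel(num_classes=num_classes)
    # ... existing branches (resnet50, mobilenet, ...) ...
    raise ValueError(f"Unknown model: {model_name}")
```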
Support for new datasets can be added as follows:

```python
# In src/data/data_loader.py
# Add to the get_transforms and load_dataset functions
elif dataset_name.lower() == 'my_dataset':
    # Define transforms
    train_transforms = transforms.Compose([...])
    test_transforms = transforms.Compose([...])

    # Load dataset splits
    train_dataset = MyDataset(...)
    val_dataset = MyDataset(...)
    test_dataset = MyDataset(...)
```

The modular architecture allows adding new compression techniques:
```python
# In src/compression/my_technique.py
def apply_my_compression(model, **kwargs):
    # Implement your compression logic and return the compressed model
    ...
    return model
```

- Implement basic pruning techniques
- Implement quantization (8-bit, 4-bit, 2-bit)
- Implement knowledge distillation
- Add Vision Transformer support
- Add Lottery Ticket Hypothesis implementation
- Support for hardware-aware compression
- Add NLP model support (BERT, GPT variants)
- Deploy compressed models to mobile/edge devices
- Add AutoML for finding optimal compression strategies
- Support for continuous compression during training
If you use this work in your research, please cite:
```bibtex
@software{model_compression_pipeline,
  author = {Utkarsh Rajput},
  title  = {Model Compression Pipeline},
  year   = {2025},
  url    = {https://github.com/1Utkarsh1/model-compression-pipeline}
}
```

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- PyTorch for the deep learning framework
- TensorFlow Model Optimization Toolkit for inspiration on compression techniques
- Hugging Face Transformers for transformer model implementations
- Jonathan Frankle and Michael Carbin for the Lottery Ticket Hypothesis