• Framework for Easy Deployment of Compressed and Optimized Models (FEDCOM)

    A Collaborative Initiative Between Odia Generative AI and the Norwegian BioAI Lab for Deep Learning Model Compression and Deployment

  • Problem Statement

    Compressing deep learning models to make them suitable for resource-constrained environments is a challenging task. The process typically involves applying advanced techniques like quantization, pruning, and distillation, which demand substantial technical expertise and extensive experimentation. These complexities often hinder users from deploying efficient models on devices with limited computational power or memory, thereby creating a gap between the potential of cutting-edge AI and its practical usability in real-world applications.

  • Solution

    Our project offers a structured framework that simplifies the compression of deep learning models and addresses the challenges of deploying them in resource-constrained environments. The workflow lets users load pre-trained models; select and configure compression techniques such as quantization, pruning, and distillation; apply optional fine-tuning or calibration to recover accuracy; and evaluate the resulting models. The framework also exports deployment-ready models and provides visual insight into the trade-offs between model size, inference speed, and accuracy. This streamlines the compression process, making it accessible and efficient for users with diverse levels of expertise.
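
    To make this concrete, a minimal sketch of what the intended workflow could look like from the user's side follows. The fedcom package name, its functions, and every parameter below are hypothetical illustrations of the planned interface, not an existing API:

        # Hypothetical usage sketch: the fedcom package and every call below
        # are assumptions about the planned interface, not existing code.
        from fedcom import load_model, compress, evaluate, export

        model = load_model("resnet50", weights="pretrained")    # load a pre-trained model
        compressed = compress(
            model,
            techniques=["quantization", "pruning"],             # select techniques
            quantization_bits=8,                                # configure parameters
            sparsity=0.5,
            finetune_epochs=2,                                  # optional accuracy recovery
        )
        report = evaluate(compressed, metrics=["latency", "memory", "accuracy"])
        export(compressed, format="onnx", path="resnet50_int8.onnx")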

  • Scope

    The initial phase of framework development will focus on establishing a robust foundation with the following features:

    Model Support

    • Support multimodal deep learning models

    Compression Techniques

    • Quantization: Reduce precision to optimize memory and computational efficiency.
    • Pruning: Remove redundant weights and structures to minimize model size.
    • Knowledge Distillation: Implement teacher-student training workflows to create lightweight models without significant loss in performance.
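
    For reference, each of these techniques has a standard counterpart in PyTorch. The sketch below assumes PyTorch as the backend (the framework's backend choice is not fixed here) and shows one minimal form of each technique:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F
        import torch.nn.utils.prune as prune

        model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

        # Quantization: dynamic int8 quantization of the Linear layers
        # (weights stored as int8, activations quantized on the fly).
        quantized = torch.ao.quantization.quantize_dynamic(
            model, {nn.Linear}, dtype=torch.qint8
        )

        # Pruning: zero out the 50% smallest-magnitude weights of the first layer.
        prune.l1_unstructured(model[0], name="weight", amount=0.5)

        # Knowledge distillation: train a student to match the teacher's
        # temperature-softened output distribution (Hinton et al., 2015).
        def distillation_loss(student_logits, teacher_logits, temperature=2.0):
            soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
            log_probs = F.log_softmax(student_logits / temperature, dim=-1)
            return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2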

    Workflow Capabilities

    • Load pre-trained models and allow users to configure compression parameters (e.g., sparsity level, quantization bits).
    • Include optional fine-tuning or calibration steps to recover accuracy post-compression.
    • Evaluate compressed models with metrics like inference speed, memory usage, and task accuracy.
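
    A sketch of how the evaluation step could be implemented for a PyTorch classifier; the benchmark helper below is illustrative, not part of an existing API:

        import time
        import torch

        @torch.no_grad()
        def benchmark(model, dataloader, warmup=5, runs=50):
            # Illustrative helper: measures latency, parameter memory, and accuracy.
            model.eval()
            sample, _ = next(iter(dataloader))
            for _ in range(warmup):                  # warm-up to stabilize timings
                model(sample)
            start = time.perf_counter()
            for _ in range(runs):
                model(sample)
            latency_ms = (time.perf_counter() - start) / runs * 1000

            # Parameter memory in megabytes (weights only, buffers excluded).
            memory_mb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1e6

            correct = total = 0
            for inputs, labels in dataloader:        # task accuracy on held-out data
                preds = model(inputs).argmax(dim=-1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
            return {"latency_ms": latency_ms, "memory_mb": memory_mb,
                    "accuracy": correct / total}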

    Minimal Visualization and Reporting

    • Provide basic visualizations to highlight the trade-offs between model size, speed, and accuracy.
    • Offer concise reports summarizing the impact of applied compression techniques.
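
    As a sketch of the kind of plot intended here, using matplotlib; the variant names and numbers below are placeholder values for illustration only, not measured results:

        import matplotlib.pyplot as plt

        # Placeholder values for illustration, not measured results.
        variants = ["baseline", "int8 quantized", "50% pruned"]
        size_mb = [98.0, 25.0, 49.0]
        accuracy = [0.76, 0.75, 0.74]

        fig, ax = plt.subplots()
        ax.scatter(size_mb, accuracy)
        for name, x, y in zip(variants, size_mb, accuracy):
            ax.annotate(name, (x, y))                # label each variant on the plot
        ax.set_xlabel("Model size (MB)")
        ax.set_ylabel("Top-1 accuracy")
        ax.set_title("Size vs. accuracy trade-off")
        fig.savefig("tradeoff.png")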

    Deployment Readiness

    Export compressed models in deployment-ready formats compatible with resource-constrained environments.
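
    As one example of such a format, a PyTorch model can be exported to ONNX, which lightweight runtimes like ONNX Runtime can execute on edge devices. A minimal sketch, with a stand-in model:

        import torch
        import torch.nn as nn

        model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
        dummy_input = torch.randn(1, 784)            # example input fixes the graph shape

        # Export to ONNX; TorchScript or TFLite would be analogous targets.
        torch.onnx.export(
            model,
            dummy_input,
            "model_compressed.onnx",
            input_names=["input"],
            output_names=["logits"],
            dynamic_axes={"input": {0: "batch"}},    # allow variable batch size
        )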

    Ease of Use 

    Ensure a simple and intuitive interface for configuring and running compression workflows, making it accessible to users with varying expertise levels.

  • High-Level Architecture

    Modules to be developed

  • Team

    • Dr. Sonal Khosla: Researcher, OdiaGenAI (Project Coordinator)
    • AR Kamaldeen: AI Engineer, OdiaGenAI
    • Debasish Dhal: AI Engineer, OdiaGenAI
    • Sambit Sekhar: Founder, OdiaGenAI
    • Prof. Dilip Prasad: Professor, UiT The Arctic University of Norway
    • SK Sahid: AI Engineer, OdiaGenAI
    • Sahil Khan: Intern, OdiaGenAI
    • Pritiprava Mishra: Researcher, OdiaGenAI

  • Contact

    Feel free to reach out to us with any questions about the project or collaboration opportunities.