
# Awesome LLM Compression Safety



## 🤗 Introduction

Model compression techniques, such as quantization, pruning, distillation, and low-rank adaptation, are widely used to reduce the deployment cost of language models while maintaining performance. However, compression can result in:

- Increased stereotype generation
- Reduced robustness to adversarial attacks
- Increased calibration errors (see the measurement sketch after this list)
- Higher model uncertainty
- Broader safety risks in downstream applications
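
To make the calibration point concrete, here is a minimal, self-contained sketch (not taken from any listed paper; the toy classifier, synthetic data, and `ece` helper are illustrative assumptions) that magnitude-prunes a small PyTorch model and compares expected calibration error before and after:

```python
# Minimal sketch: measure how magnitude pruning shifts calibration (ECE).
# Everything here (model, synthetic data, ece helper) is illustrative only.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

def ece(probs: torch.Tensor, labels: torch.Tensor, n_bins: int = 10) -> float:
    """Expected calibration error: |accuracy - confidence| averaged over bins."""
    conf, preds = probs.max(dim=1)
    edges = torch.linspace(0, 1, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (preds[mask] == labels[mask]).float().mean()
            total += mask.float().mean().item() * (acc - conf[mask].mean()).abs().item()
    return total

# Toy classifier standing in for a compressed LM head.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
x = torch.randn(2048, 32)
labels = torch.randint(0, 4, (2048,))

# (In practice the model would be trained first; skipped here for brevity.)
pruned = copy.deepcopy(model)
for module in pruned.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)  # drop 50% of weights

with torch.no_grad():
    p_full = model(x).softmax(dim=1)
    p_pruned = pruned(x).softmax(dim=1)

print(f"ECE full:   {ece(p_full, labels):.4f}")
print(f"ECE pruned: {ece(p_pruned, labels):.4f}")
```

On a real LLM, the same statistic would be computed over next-token probabilities on held-out text before and after compression.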

This repository curates papers on the evaluation and mitigation of compression-induced safety degradation in LLMs, VLMs, and multimodal models, covering robustness, calibration, and alignment.
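
For quantization, the most heavily represented technique in the tables below, a similarly minimal toy sketch (again an illustrative assumption, not any paper's method: simulated symmetric per-tensor int8 on a stand-in linear layer) measures how far the output distribution drifts once weights are round-tripped through 8 bits:

```python
# Toy sketch: simulated per-tensor int8 weight quantization and the resulting
# output drift, measured as KL divergence. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def quantize_int8(w: torch.Tensor) -> torch.Tensor:
    """Round-trip a tensor through symmetric per-tensor int8."""
    scale = w.abs().max() / 127.0
    return (w / scale).round().clamp(-127, 127) * scale

layer = nn.Linear(64, 16)
x = torch.randn(512, 64)

with torch.no_grad():
    logits_fp = layer(x)
    layer.weight.copy_(quantize_int8(layer.weight))
    logits_q = layer(x)

    # KL(full || quantized) over softmax outputs: how far behavior moved.
    kl = F.kl_div(
        logits_q.log_softmax(dim=1), logits_fp.softmax(dim=1),
        reduction="batchmean",
    )
print(f"Mean output KL after int8 round-trip: {kl.item():.6f}")
```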

Papers are currently listed chronologically in a single main section, with benchmarks broken out separately; finer-grained subcategories will be introduced as the collection grows.

Contributions are welcome! Please open an issue or submit a pull request following the existing format.


## Table of Contents

- [🤗 Introduction](#-introduction)
- [📑 Papers](#-papers)
  - [Benchmarks](#benchmarks)
- [Cite](#cite)


## 📑 Papers

| Date (YY.MM) | Institute | Publication | Paper |
|--------------|-----------|-------------|-------|
| 20.10 | Google Research | arXiv | Characterising Bias in Compressed Models |
| 22.01 | University of California | arXiv | Can Model Compression Improve NLP Fairness |
| 22.01 | University of Cambridge | ICPR 2022 | The Effect of Model Compression on Fairness in Facial Expression Recognition |
| 22.12 | NAVER LABS Europe; IDIAP Research Institute; EPFL | EMNLP 2022 | What Do Compressed Multilingual Machine Translation Models Forget? |
| 23.05 | NJIT; Microsoft Research; Rice University | EACL 2023 | Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding |
| 23.07 | Microsoft Research; IIT Dhanbad; BITS Pilani | ACL 2023 | A Comparative Study on the Impact of Model Compression Techniques on Fairness in Language Models |
| 23.10 | UT Austin; Apple | ICLR 2024 | Compressing LLMs: The Truth Is Rarely Pure and Never Simple |
| 23.12 | Carnegie Mellon University; Universidade NOVA de Lisboa; Allen Institute for AI | EMNLP 2023 | Understanding the Effect of Model Compression on Social Bias in Large Language Models |
| 23.12 | Cohere For AI; Dyania Health; University of Virginia | ICML 2024 | On the Fairness Impacts of Hardware Selection in Machine Learning |
| 24.02 | Princeton University | ICML 2024 | Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications |
| 24.03 | UT Austin; Drexel; MIT; UIUC; Duke; LLNL; CAIS; UC Berkeley; UChicago | ICML 2024 | Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression |
| 24.05 | Tennessee Tech University | FLAIRS 2024 | Beyond Size and Accuracy: The Impact of Model Compression on Fairness |
| 24.05 | ETH Zurich | NeurIPS 2024 | Exploiting LLM Quantization |
| 24.06 | Université Lumière Lyon 2; Université Claude Bernard Lyon 1 | NAACL 2024 | When Quantization Affects Confidence of Large Language Models? |
| 24.06 | Penn State; NEC Labs America | NAACL 2024 | Pruning as a Domain-Specific LLM Extractor |
| 24.10 | Oregon State University | NeurIPS 2024 (RBFM) | You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models |
| 24.11 | MIT | BlackboxNLP @ EMNLP 2024 | Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning |
| 24.11 | University of Utah; Google DeepMind | EMNLP 2024 | Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression |
| 24.11 | Cohere; Cohere For AI | EMNLP 2024 | How Does Quantization Affect Multilingual LLMs? |
| 25.05 | University of Calgary; Vector Institute | ICML 2025 | Does Compression Exacerbate Large Language Models' Social Bias? |
| 25.05 | Ruhr University Bochum; UAR Research Center | NAACL 2025 | The Impact of Inference Acceleration on Bias of LLMs |
| 25.05 | East China Normal University | ACL 2025 Findings | Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models |
| 25.07 | Red Hat AI; IST Austria | ACL 2025 | "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-offs in LLM Quantization |
| 25.08 | Sofia University; Tsinghua University; Hebrew University; TU Darmstadt; ATHENE | arXiv | How Quantization Shapes Bias in Large Language Models |
| 25.08 | Dalhousie University | AACL 2025 | Interpreting the Effects of Quantization on LLMs |
| 25.09 | Université Lumière Lyon 2; Université Claude Bernard Lyon 1; École Centrale de Lyon; LIRIS CNRS | arXiv | Fair-GPTQ: Bias-Aware Quantization for Large Language Models |
| 25.10 | Tennessee Tech University | arXiv | Downsized and Compromised? Assessing the Faithfulness of Model Compression |
| 25.10 | University of Hong Kong; Huawei | NeurIPS 2025 | Preserving LLM Capabilities through Calibration Data Curation |
| 25.10 | ETH Zurich | arXiv | Fewer Weights, More Problems: A Practical Attack on LLM Pruning |
| 25.11 | UMass Amherst; Microsoft; University of Maryland | EMNLP 2025 | Does Quantization Affect Models' Performance on Long-Context Tasks? |
| 25.11 | Iowa State University | arXiv | Decomposed Trust: Exploring Privacy, Adversarial Robustness, Fairness, and Ethics of Low-Rank LLMs |
| 25.11 | Seoul National University | arXiv | Alignment-Aware Quantization for LLM Safety |
| 25.11 | GE HealthCare | NeurIPS 2025 (Lock-LLM Workshop) | Compressed but Compromised? A Study of Jailbreaking in Compressed LLMs |
| 25.12 | Huazhong University of Science and Technology | ICML 2025 | Understanding the Unfairness in Network Quantization |
| 26.01 | Universitas Indonesia | arXiv | Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection |
| 26.02 | University College London; University of Tübingen | arXiv | UniComp: A Unified Evaluation of Large Language Model Compression |
| 26.02 | UC Berkeley; Meta Superintelligence Labs | arXiv | Uncertainty Drives Social Bias Changes in Quantized Large Language Models |

### Benchmarks

| Date (YY.MM) | Institute | Publication | Paper |
|--------------|-----------|-------------|-------|
| 24.03 | UT Austin; Drexel; MIT; UIUC; Duke; LLNL; CAIS; UC Berkeley; UChicago | ICML 2024 | Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression |
| 24.12 | IBM Research Europe; Trinity College Dublin; Imperial College London | NeurIPS 2024 (SafeGenAI) | HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment |
| 25.02 | Skolkovo Institute of Science and Technology; AIRI; HSE University | arXiv | Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models |
| 25.02 | Harbin Institute of Technology (Shenzhen); Illinois Institute of Technology | arXiv | Benchmarking Post-Training Quantization in LLMs: A Comprehensive Taxonomy |

## Cite

If you use this repository in your research, you can cite it as:

```bibtex
@misc{proskurina2026awesome_llm_compression_safety,
  title        = {Awesome LLM Compression Safety},
  author       = {Proskurina, Irina},
  year         = {2026},
  howpublished = {\url{https://github.com/upunaprosk/Awesome-LLM-Compression-Safety}},
  note         = {GitHub repository, accessed 2026}
}
```
