
# Awesome LLM Compression Safety



## 🤗 Introduction

Model compression techniques, such as quantization, pruning, distillation, and low-rank adaptation, are widely used to reduce the deployment cost of language models while maintaining performance. However, compression can result in:

- Increased stereotype generation
- Reduced robustness to adversarial attacks
- Increased calibration errors (see the measurement sketch after this list)
- Higher model uncertainty
- Broader safety risks in downstream applications
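
To make the calibration point concrete, here is a minimal, self-contained sketch (not taken from any listed paper; the toy classifier, synthetic data, and `ece` helper are illustrative assumptions) that magnitude-prunes a small PyTorch model and compares expected calibration error before and after:

```python
# Minimal sketch: measure how magnitude pruning shifts calibration (ECE).
# Everything here (model, synthetic data, ece helper) is illustrative only.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

def ece(probs: torch.Tensor, labels: torch.Tensor, n_bins: int = 10) -> float:
    """Expected calibration error: |accuracy - confidence| averaged over bins."""
    conf, preds = probs.max(dim=1)
    edges = torch.linspace(0, 1, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (preds[mask] == labels[mask]).float().mean()
            total += mask.float().mean().item() * (acc - conf[mask].mean()).abs().item()
    return total

# Toy classifier standing in for a compressed LM head.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
x = torch.randn(2048, 32)
labels = torch.randint(0, 4, (2048,))

# (In practice the model would be trained first; skipped here for brevity.)
pruned = copy.deepcopy(model)
for module in pruned.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)  # drop 50% of weights

with torch.no_grad():
    p_full = model(x).softmax(dim=1)
    p_pruned = pruned(x).softmax(dim=1)

print(f"ECE full:   {ece(p_full, labels):.4f}")
print(f"ECE pruned: {ece(p_pruned, labels):.4f}")
```

On a real LLM, the same statistic would be computed over next-token probabilities on held-out text before and after compression.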

This repository curates papers on the evaluation and mitigation of compression-induced safety degradation in LLMs, VLMs, and multimodal models, covering robustness, calibration, and alignment.
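
For quantization, the most heavily represented technique in the tables below, a similarly minimal toy sketch (again an illustrative assumption, not any paper's method: simulated symmetric per-tensor int8 on a stand-in linear layer) measures how far the output distribution drifts once weights are round-tripped through 8 bits:

```python
# Toy sketch: simulated per-tensor int8 weight quantization and the resulting
# output drift, measured as KL divergence. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def quantize_int8(w: torch.Tensor) -> torch.Tensor:
    """Round-trip a tensor through symmetric per-tensor int8."""
    scale = w.abs().max() / 127.0
    return (w / scale).round().clamp(-127, 127) * scale

layer = nn.Linear(64, 16)
x = torch.randn(512, 64)

with torch.no_grad():
    logits_fp = layer(x)
    layer.weight.copy_(quantize_int8(layer.weight))
    logits_q = layer(x)

    # KL(full || quantized) over softmax outputs: how far behavior moved.
    kl = F.kl_div(
        logits_q.log_softmax(dim=1), logits_fp.softmax(dim=1),
        reduction="batchmean",
    )
print(f"Mean output KL after int8 round-trip: {kl.item():.6f}")
```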

Papers are currently listed chronologically in a single main section, with benchmarks broken out separately; finer-grained subcategories will be introduced as the collection grows.

Contributions are welcome! Please open an issue or submit a pull request following the existing format.


## Table of Contents

- [🤗 Introduction](#-introduction)
- [📑 Papers](#-papers)
  - [Benchmarks](#benchmarks)
- [Cite](#cite)


## 📑 Papers

| Date (YY.MM) | Institute | Publication | Paper |
|--------------|-----------|-------------|-------|
| 20.10 | Google Research | arXiv | Characterising Bias in Compressed Models |
| 22.01 | University of California | arXiv | Can Model Compression Improve NLP Fairness |
| 22.01 | University of Cambridge | ICPR 2022 | The Effect of Model Compression on Fairness in Facial Expression Recognition |
| 22.12 | NAVER LABS Europe; IDIAP Research Institute; EPFL | EMNLP 2022 | What Do Compressed Multilingual Machine Translation Models Forget? |
| 23.05 | NJIT; Microsoft Research; Rice University | EACL 2023 | Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding |
| 23.07 | Microsoft Research; IIT Dhanbad; BITS Pilani | ACL 2023 | A Comparative Study on the Impact of Model Compression Techniques on Fairness in Language Models |
| 23.10 | UT Austin; Apple | ICLR 2024 | Compressing LLMs: The Truth Is Rarely Pure and Never Simple |
| 23.12 | Carnegie Mellon University; Universidade NOVA de Lisboa; Allen Institute for AI | EMNLP 2023 | Understanding the Effect of Model Compression on Social Bias in Large Language Models |
| 23.12 | Cohere For AI; Dyania Health; University of Virginia | ICML 2024 | On the Fairness Impacts of Hardware Selection in Machine Learning |
| 24.02 | Princeton University | ICML 2024 | Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications |
| 24.03 | UT Austin; Drexel; MIT; UIUC; Duke; LLNL; CAIS; UC Berkeley; UChicago | ICML 2024 | Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression |
| 24.05 | Tennessee Tech University | FLAIRS 2024 | Beyond Size and Accuracy: The Impact of Model Compression on Fairness |
| 24.05 | ETH Zurich | NeurIPS 2024 | Exploiting LLM Quantization |
| 24.06 | Université Lumière Lyon 2; Université Claude Bernard Lyon 1 | NAACL 2024 | When Quantization Affects Confidence of Large Language Models? |
| 24.06 | Penn State; NEC Labs America | NAACL 2024 | Pruning as a Domain-Specific LLM Extractor |
| 24.10 | Oregon State University | NeurIPS 2024 (RBFM) | You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models |
| 24.11 | MIT | BlackboxNLP @ EMNLP 2024 | Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning |
| 24.11 | University of Utah; Google DeepMind | EMNLP 2024 | Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression |
| 24.11 | Cohere; Cohere For AI | EMNLP 2024 | How Does Quantization Affect Multilingual LLMs? |
| 25.05 | University of Calgary; Vector Institute | ICML 2025 | Does Compression Exacerbate Large Language Models' Social Bias? |
| 25.05 | Ruhr University Bochum; UAR Research Center | NAACL 2025 | The Impact of Inference Acceleration on Bias of LLMs |
| 25.05 | East China Normal University | ACL 2025 Findings | Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models |
| 25.07 | Red Hat AI; IST Austria | ACL 2025 | "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-offs in LLM Quantization |
| 25.08 | Sofia University; Tsinghua University; Hebrew University; TU Darmstadt; ATHENE | arXiv | How Quantization Shapes Bias in Large Language Models |
| 25.08 | Dalhousie University | AACL 2025 | Interpreting the Effects of Quantization on LLMs |
| 25.09 | Université Lumière Lyon 2; Université Claude Bernard Lyon 1; École Centrale de Lyon; LIRIS CNRS | arXiv | Fair-GPTQ: Bias-Aware Quantization for Large Language Models |
| 25.10 | Tennessee Tech University | arXiv | Downsized and Compromised? Assessing the Faithfulness of Model Compression |
| 25.10 | University of Hong Kong; Huawei | NeurIPS 2025 | Preserving LLM Capabilities through Calibration Data Curation |
| 25.10 | ETH Zurich | arXiv | Fewer Weights, More Problems: A Practical Attack on LLM Pruning |
| 25.11 | UMass Amherst; Microsoft; University of Maryland | EMNLP 2025 | Does Quantization Affect Models' Performance on Long-Context Tasks? |
| 25.11 | Iowa State University | arXiv | Decomposed Trust: Exploring Privacy, Adversarial Robustness, Fairness, and Ethics of Low-Rank LLMs |
| 25.11 | Seoul National University | arXiv | Alignment-Aware Quantization for LLM Safety |
| 25.11 | GE HealthCare | NeurIPS 2025 (Lock-LLM Workshop) | Compressed but Compromised? A Study of Jailbreaking in Compressed LLMs |
| 25.12 | Huazhong University of Science and Technology | ICML 2025 | Understanding the Unfairness in Network Quantization |
| 26.01 | Universitas Indonesia | arXiv | Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection |
| 26.02 | University College London; University of Tübingen | arXiv | UniComp: A Unified Evaluation of Large Language Model Compression |
| 26.02 | UC Berkeley; Meta Superintelligence Labs | arXiv | Uncertainty Drives Social Bias Changes in Quantized Large Language Models |

### Benchmarks

| Date (YY.MM) | Institute | Publication | Paper |
|--------------|-----------|-------------|-------|
| 24.03 | UT Austin; Drexel; MIT; UIUC; Duke; LLNL; CAIS; UC Berkeley; UChicago | ICML 2024 | Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression |
| 24.12 | IBM Research Europe; Trinity College Dublin; Imperial College London | NeurIPS 2024 (SafeGenAI) | HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment |
| 25.02 | Skolkovo Institute of Science and Technology; AIRI; HSE University | arXiv | Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models |
| 25.02 | Harbin Institute of Technology (Shenzhen); Illinois Institute of Technology | arXiv | Benchmarking Post-Training Quantization in LLMs: A Comprehensive Taxonomy |

## Cite

If you use this repository in your research, you can cite it as:

```bibtex
@misc{proskurina2026awesome_llm_compression_safety,
  title        = {Awesome LLM Compression Safety},
  author       = {Proskurina, Irina},
  year         = {2026},
  howpublished = {\url{https://github.com/upunaprosk/Awesome-LLM-Compression-Safety}},
  note         = {GitHub repository, accessed 2026}
}
```
