A framework for measuring the environmental impact of ML inference. Tracks CO2 emissions, energy consumption, and water usage across different hardware setups.
Training gets all the attention, but inference runs 24/7 in production. We built this to answer: "How much does running this model actually cost the environment?"
Available on PyPI:

```bash
pip install ml-ecolyzer
```

With framework-specific dependencies:

```bash
pip install ml-ecolyzer[huggingface]  # transformers, diffusers
pip install ml-ecolyzer[pytorch]      # torchvision, torchaudio
pip install ml-ecolyzer[all]          # everything
```

Quick start:

```python
from mlecolyzer import EcoLyzer

config = {
    "project": "my_analysis",
    "models": [{"name": "gpt2", "task": "text"}],
    "datasets": [{"name": "wikitext", "task": "text", "limit": 100}],
}

eco = EcoLyzer(config)
results = eco.run()

summary = results["final_report"]["analysis_summary"]
print(f"CO2: {summary['total_co2_emissions_kg']:.6f} kg")
print(f"Energy: {summary['total_energy_kwh']:.6f} kWh")
```

Tracked metrics:

- CO2 emissions - Based on power draw and regional carbon intensity
- Energy usage - Via NVIDIA-SMI, psutil, or RAPL
- Water footprint - Cooling overhead varies by hardware tier
- ESS (Environmental Sustainability Score) - Parameters per gram of CO2, useful for comparing models
ESS = Effective Parameters (M) / CO2 (g)
Higher ESS = more efficient. INT8 models typically score ~74% higher than FP32.
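The ESS formula above is simple enough to sketch directly. The numbers below are hypothetical, chosen only to illustrate the ~74% INT8 figure; the package computes ESS internally from measured emissions:

```python
def ess(effective_params_millions: float, co2_grams: float) -> float:
    """Environmental Sustainability Score: effective parameters (M) per gram of CO2."""
    return effective_params_millions / co2_grams

# Hypothetical example: a 350M-parameter model emitting 5 g of CO2 per workload.
# If INT8 quantization cuts emissions by a factor of 1.74 at the same parameter
# count, ESS comes out ~74% higher.
fp32_score = ess(350, 5.0)         # 70.0
int8_score = ess(350, 5.0 / 1.74)  # 121.8, i.e. ~74% higher than FP32
```

Because ESS normalizes by model size, it lets you compare a small efficient model against a large wasteful one on the same scale.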
- GPUs: A100, T4, RTX series, GTX series
- CPU-only works too
- Frameworks: HuggingFace, PyTorch, scikit-learn
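As a rough sketch of how power sampling on NVIDIA GPUs turns into an energy figure: this is not ML-EcoLyzer's internal code, just a simplified illustration. The `nvidia-smi` query flags are standard; the trapezoidal integration and function names are my own:

```python
import subprocess

def read_gpu_power_watts() -> float:
    """Sample instantaneous GPU power draw via nvidia-smi (requires an NVIDIA GPU)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"]
    )
    return float(out.decode().strip().splitlines()[0])

def energy_kwh(power_samples_watts: list, interval_s: float) -> float:
    """Integrate evenly spaced power samples (W) into energy (kWh), trapezoidal rule."""
    if len(power_samples_watts) < 2:
        return 0.0
    joules = sum(
        (a + b) / 2 * interval_s
        for a, b in zip(power_samples_watts, power_samples_watts[1:])
    )
    return joules / 3.6e6  # 1 kWh = 3.6 MJ

# e.g. a steady 300 W draw sampled every second for 60 s ≈ 0.005 kWh
```

On CPU-only hosts the same integration applies, with the power samples coming from psutil estimates or RAPL counters instead.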
YAML configs are also supported:

```yaml
project: "benchmark_run"
models:
  - name: "facebook/opt-350m"
    task: "text"
    quantization:
      enabled: true
      target_dtype: "int8"
datasets:
  - name: "wikitext"
    task: "text"
    limit: 500
hardware:
  device_profile: "auto"
output:
  output_dir: "./results"
  export_formats: ["json", "csv"]
```

CLI usage:

```bash
# Single run
mlecolyzer analyze --model gpt2 --dataset wikitext --task text

# System info
mlecolyzer info
```

We ran 1,500+ inference configs across:
- Hardware: GTX 1650, RTX 4090, Tesla T4, A100
- Models: GPT-2, OPT, Qwen, LLaMA, Phi, Whisper, ViT
- Precisions: FP32, FP16, INT8
Key findings:
- A100 has poor ESS when underutilized (overkill for small batches)
- Consumer GPUs (RTX/T4) often more efficient for single-batch inference
- Quantization helps a lot, especially INT8
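A toy illustration of the first finding, with invented numbers (not taken from the benchmark): an underutilized A100 burns far more power than a small single-batch job needs, so it emits more CO2 per run than a modest GPU and its ESS suffers accordingly.

```python
def ess(effective_params_millions: float, co2_grams: float) -> float:
    """Effective parameters (M) per gram of CO2; higher is better."""
    return effective_params_millions / co2_grams

# Hypothetical single-batch run of a 350M-parameter model.
# The A100 finishes faster but draws much more power while doing so.
a100_score = ess(350, co2_grams=2.0)  # 175.0
t4_score   = ess(350, co2_grams=0.8)  # 437.5 -- the smaller GPU wins on ESS
```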
See CONTRIBUTING.md. PRs welcome.

```bash
# Dev setup
pip install -e ".[dev]"
pytest
```

If you use ML-EcoLyzer, please cite:

```bibtex
@misc{mlecolyzer2026,
  title={ML-EcoLyzer: A Framework for Quantifying the Environmental Impact of Machine Learning Inference},
  author={Minoza, Jose Marie Antonio and Laylo, Rex Gregor and Villarin, Christian and Ibanez, Sebastian},
  year={2026},
  note={AAAI Workshop on AI for Environmental Science},
  eprint={2511.06694},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  doi={10.48550/arXiv.2511.06694}
}
```

License: MIT
