GitHub - Abu-Sameer-66/Mistral7B-Tox21-Molecular-Optimization: Native fine-tuning of Mistral-7B on the Tox21 dataset using LoRA and 4-bit quantization, achieving a competitive 0.72 ROC-AUC.

🧬 Project Overview

This repository contains a complete multi-task toxicity screening pipeline built on OLMo-7B and Mistral-7B, both fine-tuned on the Tox21 benchmark. Developed for DeepChem GSoC 2026, the project highlights a weakness in how baselines are often reported (the gap between random-split and scaffold-split scores) and shows that LLMs can reach competitive generalization when evaluated on a scaffold split.

📊 Results — Best Experiment

OLMo-7B QLoRA — Scaffold Split — Mean ROC-AUC: 0.7225

Task	ROC-AUC
NR-AR	0.7179
NR-AR-LBD	0.8454
NR-AhR	0.7312
NR-Aromatase	0.7062
NR-ER	0.6888
NR-ER-LBD	0.8326
NR-PPAR-gamma	0.7343
SR-ARE	0.6968
SR-ATAD5	0.6411
SR-HSE	0.6352
SR-MMP	0.7394
SR-p53	0.7010
MEAN	0.7225
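
As a quick sanity check, the reported mean can be reproduced from the per-task values. The snippet below is a minimal sketch that assumes the mean is the unweighted average over the twelve tasks; the dictionary name `task_auc` is illustrative and does not come from the repository code.

```python
from statistics import mean

# Per-task scaffold-split ROC-AUC values copied from the table above.
task_auc = {
    "NR-AR": 0.7179,
    "NR-AR-LBD": 0.8454,
    "NR-AhR": 0.7312,
    "NR-Aromatase": 0.7062,
    "NR-ER": 0.6888,
    "NR-ER-LBD": 0.8326,
    "NR-PPAR-gamma": 0.7343,
    "SR-ARE": 0.6968,
    "SR-ATAD5": 0.6411,
    "SR-HSE": 0.6352,
    "SR-MMP": 0.7394,
    "SR-p53": 0.7010,
}

# Unweighted (macro) average: every task counts equally,
# regardless of how many labelled molecules it has.
mean_auc = mean(list(task_auc.values()))
print(f"Mean ROC-AUC over {len(task_auc)} tasks: {mean_auc:.4f}")
```

Rounded to four decimals, this matches the 0.7225 reported in the MEAN row.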

🔍 Key Scientific Finding — Random vs Scaffold Gap

Model	Split	ROC-AUC
RF + ECFP	Random (published baseline)	0.8183
RF + ECFP	Scaffold (honest evaluation)	0.6135
Gap		0.2048

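This gap of 0.20 shows how much a random split can inflate reported performance: molecules that share a scaffold end up in both the train and test sets, so the score overestimates real-world generalization. Baselines reported on random splits are therefore not directly comparable with the scaffold-split LLM numbers in this repository. The LLM results are not underperforming; they are measured under the stricter protocol.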

This finding directly justifies why the DeepChem GSoC project must use ScaffoldSplitter as the evaluation standard.
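
To illustrate why a scaffold split is stricter, here is a minimal pure-Python sketch of a group-based split. It is not DeepChem's `ScaffoldSplitter` implementation and not the code in `dataset.py`; the function name `scaffold_split` and the toy scaffold strings are assumptions made for the example. In practice the scaffold keys would typically come from a cheminformatics toolkit, for example RDKit's Murcko scaffold utilities.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8):
    """Split indices so that no scaffold appears in both train and test.

    `scaffolds` holds one precomputed scaffold string per molecule
    (hypothetical input; how it is computed is outside this sketch).
    """
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest scaffold groups first, so rare scaffolds tend to land in test.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    cutoff = frac_train * len(scaffolds)
    train, test = [], []
    for group in ordered:
        # A whole group goes to one side; it is never divided.
        if len(train) + len(group) <= cutoff:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


# Toy data: ten molecules that share four scaffolds.
scaffolds = (
    ["c1ccccc1"] * 4 + ["c1ccncc1"] * 3 + ["C1CCCCC1"] * 2 + ["c1ccoc1"]
)
train_idx, test_idx = scaffold_split(scaffolds, frac_train=0.7)

train_scaffolds = {scaffolds[i] for i in train_idx}
test_scaffolds = {scaffolds[i] for i in test_idx}
print("Shared scaffolds:", train_scaffolds & test_scaffolds)
```

Because every scaffold group is assigned to exactly one side, the test set contains only scaffolds the model never saw during training. A random split gives no such guarantee.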

🚀 Key Scientific Engineering

Challenge	Solution	Impact
NaN Loss Crashes	fp32 loss upcasting + gradient clipping	Eliminated fp16 gradient underflow on imbalanced Tox21
Class Imbalance	8x oversampling of toxic minority class	Prevented bias toward non-toxic majority labels
Memory Bottleneck	4-bit NF4 + gradient checkpointing	OLMo-7B fits in 16GB VRAM
Invalid SMILES	RDKit sanitization — dropped 8 metal-ion SMILES	Prevented tokenizer instability from [Hg+2], [Fe+2]
Baseline Inflation	Scaffold split enforced throughout	True out-of-distribution generalization
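
The class-imbalance row can be sketched as follows. This is a simplified illustration, not the repository's implementation: it assumes a single binary label per molecule and reads "8x oversampling" as each toxic example appearing eight times in the training set. The helper name `oversample_minority` is hypothetical.

```python
import random


def oversample_minority(samples, labels, factor=8, seed=0):
    """Repeat each positive (toxic) example `factor` times, then shuffle."""
    rng = random.Random(seed)
    resampled = []
    for sample, label in zip(samples, labels):
        copies = factor if label == 1 else 1
        resampled.extend([(sample, label)] * copies)
    # Shuffle so the duplicated positives are not clustered in one batch.
    rng.shuffle(resampled)
    return resampled


# Toy training set: one toxic molecule among four.
smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
labels = [0, 1, 0, 0]

train_set = oversample_minority(smiles, labels, factor=8)
n_pos = sum(label for _, label in train_set)
print(f"{len(train_set)} examples, {n_pos} positive")
```

Oversampling of this kind should be applied to the training split only, after the scaffold split, so that duplicated molecules cannot leak into the evaluation set.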

🛠️ Tech Stack

📂 File Index

File	Description
`tox21_mistral_benchmark.py`	Mistral-7B training pipeline, scaffold split, 8x oversampling
`dataset.py`	Tox21 data loading, RDKit sanitization, scaffold split logic
`model.py`	OLMo-7B 4-bit NF4 wrapper with LoRA config
`train.py`	Training loop with checkpoint saving
`OLMo_Tox21_MultiTask_Final.ipynb`	Final OLMo-7B run — Mean ROC-AUC 0.7225
`graphconv-tox21-deepchem.ipynb`	RF baseline + scaffold vs random gap discovery
`requirements.txt`	Dependencies

💻 How to Run

pip install -r requirements.txt

# Modular pipeline (OLMo-7B)
python dataset.py    # prepare data
python train.py      # train model

# Mistral-7B benchmark
python tox21_mistral_benchmark.py

# Full notebook — Kaggle
# kaggle.com/sameernadeem66/graphconv-tox21-deepchem

🔗 Part of DeepChem GSoC 2026 Research

Task	Model	Result	Repo
BACE Classification	Mistral-7B QLoRA	0.8371 ROC-AUC	BACE Repo
BBBP Classification	Mistral-7B QLoRA	0.7141 ROC-AUC	BBBP Repo
ClinTox Classification	Mistral-7B QLoRA	0.9913 ROC-AUC	ClinTox Repo
Tox21 Multi-Task	OLMo-7B QLoRA	0.7225 Mean ROC-AUC	This Repo
ESOL Regression	OLMo-7B + Reg Head	0.8582 RMSE	ESOL Repo
SMILES Generation	OLMo-7B + RDKit TSM	20/20 = 100% valid	Generation Repo


---

## Update the Tox21 row in the proposal

**Page 4 table — Tox21 row:**

Old:

Tox21 | Multi-Task | Native GraphConv | CPU/T4 | 0.6859 & 0.72 BB


New:
