A production-style notebook that converts a donor registry into calibrated outreach scores and budgeted outreach decisions.
Notebook: global-blood-donation-registry.ipynb
Case study: CASE_STUDY.md
- A single notebook: global-blood-donation-registry.ipynb
- Exported artifacts under ./artifacts/ (scored donors + model bundle + metrics)
A calibrated probability for eligible donors:
Outreach Score = P(donated_next_6m)
Then the notebook turns scores into actions:
- Top‑K outreach for a fixed outreach capacity
- Net‑benefit threshold that maximizes expected value under simple costs/benefits
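The two policies above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's code: the function names, the scores, and the cost and benefit figures are all hypothetical. It assumes a fixed cost per contact and a fixed benefit per donation, in which case contacting a donor has positive expected value exactly when `p * benefit > cost`, so the break-even threshold is `cost / benefit`.

```python
from typing import Sequence


def top_k(scores: Sequence[float], k: int) -> list[int]:
    """Return indices of the k highest scores (ties keep original order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order[:max(k, 0)]


def break_even_threshold(contact_cost: float, donation_benefit: float) -> float:
    """Contact pays off when p * benefit - cost > 0, i.e. p > cost / benefit."""
    if donation_benefit <= 0:
        raise ValueError("donation_benefit must be positive")
    return contact_cost / donation_benefit


def expected_net_benefit(
    scores: Sequence[float],
    threshold: float,
    contact_cost: float,
    donation_benefit: float,
) -> float:
    """Sum of expected value over donors whose score exceeds the threshold."""
    return sum(p * donation_benefit - contact_cost for p in scores if p > threshold)


# Hypothetical calibrated scores for five eligible donors.
scores = [0.05, 0.40, 0.12, 0.75, 0.22]

print(top_k(scores, 2))  # [3, 1]

# Hypothetical economics: a contact costs 2 units, a donation is worth 10.
t = break_even_threshold(contact_cost=2.0, donation_benefit=10.0)
print(t)  # 0.2
print(expected_net_benefit(scores, t, contact_cost=2.0, donation_benefit=10.0))
```

Top‑K fits a hard capacity limit; the threshold fits a setting where every contact with positive expected value is worth making. The break-even rule is only meaningful if the scores are calibrated probabilities.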
The notebook expects these CSV files:
Required
- blood_donation_registry_ml_ready.csv
- blood_population_distribution.csv
- blood_compatibility_lookup.csv
- data_dictionary.csv
- Download the dataset files.
- Place them under:
data/raw/
See data/raw/README.md.
If local files are not present, the notebook falls back to the Kaggle input paths.
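A fallback of this kind could look like the sketch below. It is an assumption about the approach, not the notebook's actual code: `resolve_data_dir` is a hypothetical helper that returns the first candidate directory containing all four required files.

```python
from pathlib import Path

# File names taken from the "Required" list above.
REQUIRED_FILES = [
    "blood_donation_registry_ml_ready.csv",
    "blood_population_distribution.csv",
    "blood_compatibility_lookup.csv",
    "data_dictionary.csv",
]


def resolve_data_dir(candidates):
    """Return the first directory that contains every required CSV file."""
    for candidate in candidates:
        directory = Path(candidate)
        if all((directory / name).is_file() for name in REQUIRED_FILES):
            return directory
    raise FileNotFoundError(
        "Required CSV files not found in any of: "
        + ", ".join(str(c) for c in candidates)
    )
```

A call might pass `data/raw` first and a Kaggle directory second, for example `resolve_data_dir(["data/raw", "/kaggle/input/<dataset-slug>"])`, where `<dataset-slug>` is a placeholder for whatever name the dataset is mounted under.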
python -m venv .venv
# Windows: .\.venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install -r requirements.txt
Open and run:
global-blood-donation-registry.ipynb
The notebook will export:
- artifacts/donor_outreach_scored.csv
- artifacts/model_bundle.joblib
- artifacts/metrics_holdout.json
See artifacts/README.md.
- Scores are computed only for eligible donors.
- Calibration makes thresholds usable for decision policies.
- Threshold policies make trade-offs explicit (contact cost vs expected donation benefit).
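One way to see why calibration matters for thresholds is a reliability table: group donors by predicted probability and compare the mean prediction in each bin with the observed donation rate. The sketch below is illustrative only; `reliability_table` is a hypothetical helper and the data are toy values, not notebook results.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Return (mean predicted probability, observed rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins on [0, 1]; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        rate = sum(y for _, y in members) / len(members)
        rows.append((mean_p, rate, len(members)))
    return rows


# Toy data: predicted probabilities and whether the donor actually donated.
probs = [0.1, 0.1, 0.9, 0.9]
outcomes = [0, 0, 1, 1]
for mean_p, rate, count in reliability_table(probs, outcomes, n_bins=2):
    print(f"predicted={mean_p:.2f} observed={rate:.2f} n={count}")
```

When the predicted and observed columns roughly agree, a score of 0.2 can be read as about a 20% chance of donating, which is what a cost/benefit threshold assumes.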
MIT (code). Dataset licensing depends on the dataset source you download.