Dose-Finding Study: Optimal Design & MED Identification using R

An end-to-end dose-finding analysis pipeline in R covering optimal study design, exploratory analysis, MED identification, and PK/PD modelling, from sample size calculation to a safety-adjusted minimum effective dose.

Overview

This project simulates and analyses a Phase II dose-finding clinical trial for a hypothetical drug tested at four dose levels (0, 10, 100, and 200 mg). The goal is to identify the Minimum Effective Dose (MED) — the lowest dose that produces a clinically meaningful change in pharmacodynamic effect relative to placebo — while accounting for exposure variability and side-effect risk.

The analysis is structured into four tasks, each building on the last.

Tasks

Task 1 — Sample Size & Optimal Dose Design

Calculated the sample size from Cohen's d using pwr
Built a sigmoidal Emax (Hill) model in PopED to characterise the dose–response relationship
Optimised active dose levels using Adaptive Random Search (ARS) + Local Search (LS)
Generated model predictions with and without prediction intervals
Quantified parameter uncertainty via Monte Carlo sampling from the parameter distribution implied by the Fisher Information Matrix (FIM)
Explored sensitivity of dose–response shape across variations in Emax, ED50, and Hill coefficient
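
The first two steps above can be sketched in base R. This is only an illustration under stated assumptions: the effect size, power, and Emax parameters below are hypothetical planning values rather than the ones used in the script, and `power.t.test()` from the stats package stands in for the pwr call.

```r
# Hypothetical planning value; not taken from the study itself.
d_assumed <- 0.8   # assumed Cohen's d (standardised difference vs. placebo)

# With sd = 1, delta equals Cohen's d. pwr::pwr.t.test(d = d_assumed, ...)
# is the corresponding call in the pwr package.
ss <- power.t.test(delta = d_assumed, sd = 1, sig.level = 0.05,
                   power = 0.80, type = "two.sample",
                   alternative = "one.sided")
n_per_group <- ceiling(ss$n)

# Sigmoidal Emax (Hill) model: E0 + Emax * dose^h / (ED50^h + dose^h)
sigmoid_emax <- function(dose, e0, emax, ed50, hill) {
  e0 + emax * dose^hill / (ed50^hill + dose^hill)
}

# Predicted effect at the four study doses (parameter values are made up)
doses <- c(0, 10, 100, 200)
pred  <- sigmoid_emax(doses, e0 = 0, emax = 100, ed50 = 50, hill = 2)

# Simple sensitivity check: vary the Hill coefficient, keep the rest fixed
hill_grid   <- c(0.5, 1, 2, 4)
sensitivity <- sapply(hill_grid, function(h) {
  sigmoid_emax(doses, e0 = 0, emax = 100, ed50 = 50, hill = h)
})
dimnames(sensitivity) <- list(dose = doses, hill = hill_grid)
```

Each column of `sensitivity` is one dose-response curve, so plotting the columns against `doses` shows how the Hill coefficient changes the steepness around ED50.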

Task 2 — Exploratory Data Analysis

Loaded and tidied simulated clinical trial data in a realistic format (data_7.csv) using dplyr and tidyr
Stratified sampling of 200 subjects across four dose groups
Computed summary statistics: mean, SD, median, trimmed mean, Hodges-Lehmann estimate
Visualisations: boxplots, histograms, spaghetti plots of effect over time, and a full pairwise correlation matrix with significance annotation (GGally)
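
As a rough illustration of the summary statistics listed above, the sketch below computes them per dose group in base R. The data are simulated stand-ins and the column names `dose` and `effect` are assumptions, not the actual columns of data_7.csv. The Hodges-Lehmann estimate is taken from `wilcox.test()` here, where the script may use DescTools instead.

```r
set.seed(1)
# Simulated stand-in data; column names are hypothetical.
dat <- data.frame(
  dose   = rep(c(0, 10, 100, 200), each = 50),
  effect = rnorm(200, mean = rep(c(0, 30, 80, 95), each = 50), sd = 20)
)

summarise_effect <- function(x) {
  c(mean    = mean(x),
    sd      = sd(x),
    median  = median(x),
    trimmed = mean(x, trim = 0.1),   # 10% trimmed mean
    # One-sample Hodges-Lehmann estimate (pseudo-median)
    hl      = unname(wilcox.test(x, conf.int = TRUE)$estimate))
}

# One row per dose group, one column per statistic
summary_by_dose <- t(sapply(split(dat$effect, dat$dose), summarise_effect))
```

Comparing the `mean`, `trimmed`, and `hl` columns is a quick way to spot groups where outliers pull the mean away from the robust estimates.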

Task 3 — Statistical Testing & MED Identification

Pairwise dose vs. placebo comparisons using:
- Welch's t-test (one-sided, µ = 75, α = 0.05)
- Hodges-Lehmann (Wilcoxon) robust estimator
Multiple comparison correction via Holm's method
MED defined as the lowest dose whose 95% lower confidence bound exceeds the target effect (Δ = 75)
Side-effect analysis using Fisher's Exact Test at each dose level
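
The testing logic above might look roughly like the following. This is a sketch on simulated data with hypothetical group means and side-effect counts, not the script's actual code or results.

```r
set.seed(42)
# Simulated stand-in data (hypothetical means and SD, not the study data).
doses  <- c(0, 10, 100, 200)
effect <- lapply(c(0, 30, 100, 120), function(m) rnorm(50, mean = m, sd = 10))
names(effect) <- doses
target <- 75   # clinically meaningful difference vs. placebo

# One-sided Welch t-test of each active dose against placebo, H0: diff <= 75
active <- doses[doses > 0]
res <- do.call(rbind, lapply(active, function(d) {
  tt <- t.test(effect[[as.character(d)]], effect[["0"]],
               alternative = "greater", mu = target, var.equal = FALSE)
  data.frame(dose    = d,
             diff    = unname(tt$estimate[1] - tt$estimate[2]),
             lower95 = tt$conf.int[1],   # one-sided 95% lower bound
             p       = tt$p.value)
}))

# Holm correction across the three dose-vs-placebo comparisons
res$p_holm <- p.adjust(res$p, method = "holm")

# MED: lowest dose whose 95% lower confidence bound exceeds the target
med <- min(res$dose[res$lower95 > target])

# Side effects: hypothetical 2x2 counts for one dose vs. placebo
se_tab <- matrix(c(12, 38, 4, 46), nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("dose", "placebo"),
                                 side_effect = c("yes", "no")))
fisher_p <- fisher.test(se_tab)$p.value
```

Using the lower confidence bound rather than the point estimate means a dose only qualifies as the MED when the data rule out a difference below the target at the chosen confidence level.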

Task 4 — PK/PD Modelling & Exposure-Response

Fitted 12 models total across three exposure metrics (Dose, AUC, Cmax):
- Simple linear regression
- Multiple linear regression (with covariates: age, weight, sex, height, side effects)
- Hyperbolic Emax (no Hill coefficient)
- Sigmoidal Emax (Hill equation) using nlsLM
Model selection via AIC and BIC
Best model: Sigmoidal Emax (AUC-based) — supported by lowest AIC/BIC
Diagnostic plots: observed vs. predicted, residuals vs. fitted, histogram of residuals, QQ-plot
MED via Monte Carlo simulation: parametric bootstrap (B = 10,000) from the multivariate parameter distribution to identify the critical AUC, i.e. the smallest AUC at which the 95% lower bound of the predicted effect exceeds Δ = 75
Converted AUC threshold to dose using a linear PK bridge model
Safety analysis: logistic regression modelling side-effect probability at MED vs. placebo
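
To make the model comparison and the Monte Carlo step more concrete, here is a minimal sketch on simulated exposure-response data. Several things are assumptions for illustration: the data and parameter values are made up, the baseline effect is fixed at zero, base `nls()` replaces `nlsLM`, `MASS::mvrnorm()` does the multivariate normal sampling, and the 5th percentile is used as a one-sided 95% lower bound.

```r
set.seed(7)
# Simulated exposure-response data; all values are hypothetical.
auc    <- runif(200, min = 1, max = 400)
effect <- 120 * auc^2 / (100^2 + auc^2) + rnorm(200, sd = 5)
dat    <- data.frame(auc = auc, effect = effect)

# Candidate models: simple linear vs. sigmoidal Emax (baseline fixed at 0)
fit_lin  <- lm(effect ~ auc, data = dat)
fit_emax <- nls(effect ~ emax * auc^hill / (ec50^hill + auc^hill),
                data = dat, start = list(emax = 100, ec50 = 80, hill = 1.5))

ic <- data.frame(model = c("linear", "sigmoidal Emax"),
                 AIC   = c(AIC(fit_lin), AIC(fit_emax)),
                 BIC   = c(BIC(fit_lin), BIC(fit_emax)))

# Draw B parameter sets from a multivariate normal centred on the estimates
B     <- 10000
draws <- MASS::mvrnorm(B, mu = coef(fit_emax), Sigma = vcov(fit_emax))

# For each AUC on a grid, take the 5th percentile of the predicted effect
grid  <- seq(1, 400, by = 1)
lower <- sapply(grid, function(a) {
  pred <- draws[, "emax"] * a^draws[, "hill"] /
    (draws[, "ec50"]^draws[, "hill"] + a^draws[, "hill"])
  quantile(pred, 0.05)
})

# Critical AUC: smallest exposure whose lower bound exceeds the target of 75
crit_auc <- min(grid[lower > 75])
```

The critical AUC can then be mapped back to a dose with whatever PK relationship links dose and AUC, which in this project is a linear bridge model.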

Files

| File | Description |
| --- | --- |
| `Dose finding study.R` | Full analysis script covering all 4 tasks |
| `data_7.csv` | Simulated clinical trial dataset (200 subjects, 4 dose groups) |

Key Results

Optimal dose levels identified via PopED differ from naive equal-spacing assumptions
Welch t-test and Hodges-Lehmann agree on MED identification
Sigmoidal Emax (AUC) outperforms all linear models on AIC/BIC
The MED is derived from the 95% lower confidence bound of the predicted effect, not just from a point estimate
Side-effect risk at MED is quantified and benchmarked against placebo

Dependencies

```r
install.packages(c(
  "pwr", "readr", "dplyr", "tidyr", "ggplot2",
  "PopED", "minpack.lm", "mvtnorm", "patchwork",
  "DoseFinding", "DescTools", "boot", "GGally", "MASS"
))
```

How to Run

Clone the repo
Place `data_7.csv` in your working directory (or update the `setwd()` path in Task 2)
Source or run `Dose finding study.R` section by section; each task is delimited with `# Task N----` comments

Context

This analysis was completed as part of the Preclinical and Clinical Data Analysis course in the MSc Pharmaceutical Modelling programme at Uppsala University.

Part of my pharmacometrics portfolio — see my GitHub profile for more.
