This repository is building an agent-style interface for common OHDSI study design tasks. The current implementation is strongest in two areas:
- Phenotype recommendation for target and outcome cohort selection
- Keeper-assisted concept generation, profile extraction, and row adjudication for phenotype validation
It also includes R demos of workflows that call the ACP/MCP flows to design Strategus incidence-rate and cohort-method analyses.
The project separates orchestration from deterministic tooling:
- acp_agent/: ACP server that exposes the flow endpoints and handles LLM orchestration
- mcp_server/: MCP server that exposes retrieval, prompt, vocabulary, and Keeper tools
- core/: pure validation and business logic shared by ACP and MCP
- R/slashOhdsiStrategusAssistant/: R-side Strategus workflow package and canonical shell entrypoints
Researchers often have three immediate bottlenecks when designing an OHDSI study:
- finding a reasonable starting phenotype definition for a study intent
- refining or validating that phenotype before using it in downstream analyses
- moving from phenotype selection into a reproducible study workflow
This repo addresses those bottlenecks by combining:
- phenotype retrieval from an indexed phenotype library
- constrained LLM ranking or critique with deterministic validation
- Keeper-oriented tooling for concept generation, OMOP profile extraction, and row-level adjudication using sanitized summaries only
- R shells that turn selected cohorts into reproducible Strategus incidence and cohort-method workflows
At no point should raw row-level patient data be sent directly to an LLM.
Implemented flow:
- Retrieve phenotype candidates with MCP phenotype_search
- Build the prompt and schema with MCP phenotype_prompt_bundle
- Rank candidates with an OpenAI-compatible LLM
- Validate and filter results in core
- Return diagnostics and explicit fallback metadata if the LLM output is unusable
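The last two steps of the flow above (validate and filter, then fall back explicitly) might look roughly like the following sketch. It is illustrative only: the function name, the `cohort_id` key, the result fields, and the assumption that the LLM returns a JSON list of IDs are all hypothetical and are not taken from `core`.

```python
import json


def validate_ranking(candidates, llm_output, max_results=10):
    """Filter an LLM ranking against retrieved candidates; fall back if unusable.

    candidates: list of dicts in retrieval order, each with a "cohort_id" key
                (hypothetical shape).
    llm_output: raw LLM text, assumed here to be a JSON list of cohort IDs.
    """
    known = {c["cohort_id"]: c for c in candidates}

    def fallback(reason):
        # Explicit fallback: keep retrieval order and say why the LLM was ignored.
        return {"results": candidates[:max_results], "fallback": True, "reason": reason}

    try:
        ranked_ids = json.loads(llm_output)
    except (TypeError, ValueError) as exc:  # JSONDecodeError subclasses ValueError
        return fallback(f"unparseable LLM output: {exc}")
    if not isinstance(ranked_ids, list):
        return fallback("LLM output was not a JSON list")

    kept, dropped, seen = [], [], set()
    for cid in ranked_ids:
        if type(cid) in (int, str) and cid in known:
            if cid not in seen:  # ignore repeated IDs
                seen.add(cid)
                kept.append(known[cid])
        else:
            dropped.append(cid)  # IDs the retrieval step never returned

    if not kept:
        return fallback("no ranked IDs matched the retrieved candidates")
    return {"results": kept[:max_results], "fallback": False, "dropped": dropped}
```

The point of the design is that the LLM can only reorder or narrow what retrieval already returned; anything else is dropped and reported rather than passed downstream.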
Related implemented flows:
- phenotype_recommendation
- phenotype_recommendation_advice
- phenotype_improvements
- phenotype_intent_split
- cohort_methods_intent_split
- concept_sets_review
- cohort_critique_general_design
This same recommendation path is already wired into the R Strategus incidence shell and the cohort-method shell.
Primary references:
- docs/TESTING.md
- docs/WORKFLOW_PHENOTYPE_RECOMMENDATION.md
- docs/PHENOTYPE_VALIDATION_REVIEW.md
- docs/SPEC_KEEPER_INTERFACE.md
- docs/R_STRATEGUS_INCIDENCE_SHELL.md
- docs/R_STRATEGUS_COHORT_METHODS_SHELL.md
- docs/WORKFLOW_INCIDENCE.md
- docs/ROADMAP.md
- docs/R_PACKAGE_ARCHITECTURE_PLAN.md
Keeper-assisted phenotype validation is the other strong implemented story. It covers concept generation, case-review input preparation, and row adjudication.
Implemented workflow:
- Generate Keeper-oriented concept sets with keeper_concept_sets_generate
- Extract OMOP-backed Keeper profiles with keeper_profiles_generate
- Convert those profiles into review rows
- Sanitize each row before any LLM call
- Run phenotype_validation_review to adjudicate a single review row as yes, no, or unknown
Current characteristics:
- concept generation can use Hecate-backed, generic-search, or DB-backed vocabulary tooling
- profile extraction is deterministic only and does not call an LLM
- downstream adjudication is constrained by fail-closed sanitization and a small label set
- the R Strategus shells now generate ACP-based 04_keeper_review.R scripts that persist Keeper workflow state for reuse and resume
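To make "fail-closed sanitization" concrete, here is a minimal sketch of one way such a check could work: the row is rejected outright if it contains anything unexpected, rather than having the offending field silently dropped. The allowlist is an assumption based on the example `keeper_row` fields shown elsewhere in this README, and `sanitize_row` and `SanitizationError` are hypothetical names; the repository's actual sanitizer may behave differently.

```python
# Assumed allowlist, taken from the example keeper_row in this README.
ALLOWED_FIELDS = {
    "age", "gender", "visitContext", "presentation",
    "priorDisease", "priorDrugs", "afterDrugs",
}


class SanitizationError(ValueError):
    """Raised when a review row cannot be proven safe to send to an LLM."""


def sanitize_row(row):
    """Return a copy of the row, or raise if anything unexpected is present."""
    unexpected = set(row) - ALLOWED_FIELDS
    if unexpected:
        # Fail closed: refuse the whole row instead of stripping fields.
        raise SanitizationError(f"unexpected fields: {sorted(unexpected)}")
    clean = {}
    for key, value in row.items():
        # Only flat summary values are accepted; nested records are refused.
        if isinstance(value, bool) or not isinstance(value, (str, int)):
            raise SanitizationError(f"unsupported value type for {key!r}")
        clean[key] = value
    return clean
```

With this shape, a row carrying an identifier such as `person_id` never reaches the LLM call, because the exception stops the workflow before a request is built.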
Primary references:
Use the phenotype recommendation workflow when you need a defensible starting cohort definition for a target or outcome.
- Start MCP and ACP
- Call phenotype_recommendation with a study intent
- Review returned candidates and diagnostics
- If needed, call phenotype_recommendation_advice for next-step guidance
- Optionally call phenotype_improvements on a selected cohort
- If you are working in R, continue through slashOhdsiStrategusAssistant::runStrategusIncidenceShell()
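For callers who prefer Python over a shell, the flow calls in the steps above can be sketched with the standard library alone. The base URL and the `/flows/<name>` path follow the examples in this README; the helper names are hypothetical, and the assumption that every flow returns a JSON body has not been checked against the ACP server.

```python
import json
import urllib.request

ACP_BASE_URL = "http://127.0.0.1:8765"  # STUDY_AGENT_HOST / STUDY_AGENT_PORT from this README


def build_flow_request(flow, payload, base_url=ACP_BASE_URL):
    """Build (but do not send) a JSON POST request for an ACP flow."""
    return urllib.request.Request(
        f"{base_url}/flows/{flow}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_flow(flow, payload, base_url=ACP_BASE_URL, timeout=120):
    """Send the request and decode the response, assumed to be JSON."""
    request = build_flow_request(flow, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires running MCP and ACP services):
# result = call_flow("phenotype_recommendation", {
#     "study_intent": "Identify clinical risk factors for older adult patients "
#                     "who experience an adverse event of acute gastrointestinal bleeding",
#     "top_k": 20, "max_results": 10, "candidate_limit": 10,
# })
# print(json.dumps(result, indent=2))
```

Keeping request construction separate from sending makes it easy to inspect or log the exact payload before anything leaves the machine.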
Use the Keeper validation workflow when you need a practical validation loop around a phenotype.
- Call keeper_concept_sets_generate for the phenotype of interest
- Approve the concept sets you want to use for extraction
- Call keeper_profiles_generate against your OMOP data
- Take one generated rows[] entry at a time
- Send the sanitized row to phenotype_validation_review
- Repeat row adjudication as needed to review more sampled cases
pip install -e ".[dev]"
The project currently uses a simple split:
- pyproject.toml defines the Python package, runtime dependencies, console scripts, and optional dev tools.
- environment.yml bootstraps a Conda or Micromamba environment with the Python tooling commonly used in this repo.
- uv.lock is not tracked as a repo source of truth. If you use uv locally, generate your own lockfile after cloning.
Official local workflow:
conda env create -f environment.yml
conda activate study-agent
pip install -e ".[dev]"
Optional uv workflow for users who prefer it:
uv lock
uv run pytest
The repo does not currently require uv. Docker builds the runtime in two layers: environment.yml provides the Micromamba/Conda base environment, and then pyproject.toml is used by pip install -e . to install the Python package and console entrypoints inside that environment.
export MCP_TRANSPORT=http
export MCP_HOST=127.0.0.1
export MCP_PORT=8790
export MCP_PATH=/mcp
study-agent-mcp

export STUDY_AGENT_MCP_URL="http://127.0.0.1:8790/mcp"
export STUDY_AGENT_HOST=127.0.0.1
export STUDY_AGENT_PORT=8765
study-agent-acp
If you want LLM-backed phenotype flows, also set an OpenAI-compatible endpoint:
export LLM_API_KEY=<YOUR_KEY>
export LLM_API_URL="<URL_BASE>/api/chat/completions"
export LLM_MODEL=<MODEL_NAME>
This has been tested with Open WebUI using locally hosted models, with LLM Shim using cloud services (OpenAI and Bedrock models), and with an embedding model served by the Hugging Face Text Embeddings Inference service.
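Before starting the LLM-backed flows, a small preflight check can confirm that the three variables above are set. This is a convenience sketch only; `missing_llm_settings` is a hypothetical helper, and the services themselves may validate their configuration differently.

```python
import os

# The three LLM settings named in this README.
LLM_ENV_VARS = ("LLM_API_KEY", "LLM_API_URL", "LLM_MODEL")


def missing_llm_settings(env=None):
    """Return the names of LLM settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in LLM_ENV_VARS if not env.get(name, "").strip()]


# Example usage:
# missing = missing_llm_settings()
# if missing:
#     raise SystemExit(f"Set these variables first: {', '.join(missing)}")
```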
If you want phenotype retrieval, you also need an indexed phenotype library. See docs/PHENOTYPE_INDEXING.md.
Current indexing workflow:
- Build catalog.jsonl plus sparse_index.pkl from OHDSI and/or CIPHER source files.
- Optionally enable LLM-derived retrieval keywords during that build.
- Build dense.index separately when embedding infrastructure is available, either during the main build with --build-dense or later with --build-dense --dense-only.
The retrieval layer reads from PHENOTYPE_INDEX_DIR, which should point to the built output directory. The source phenotype files do not need to live under that directory. In the default Docker/Compose setup, the index is expected on the host at ./data/phenotype_index and is mounted into the container at /data/phenotype_index. If you set PHENOTYPE_INDEX_DIR in .env, make sure the mounted volume path is updated to match; otherwise the container will still only see the default mounted index location.
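A quick way to confirm that `PHENOTYPE_INDEX_DIR` points at a usable build is to look for the artifacts named above. The sketch below assumes that all three files sit at the top level of the index directory and that `dense.index` is optional; both are inferences from this README rather than confirmed behavior, and `check_index_dir` is a hypothetical helper.

```python
from pathlib import Path

REQUIRED_FILES = ("catalog.jsonl", "sparse_index.pkl")  # assumed required
DENSE_FILE = "dense.index"                               # assumed optional


def check_index_dir(index_dir):
    """Report which phenotype index artifacts exist in index_dir."""
    root = Path(index_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    return {
        "index_dir": str(root),
        "missing_required": missing,
        "has_dense": (root / DENSE_FILE).is_file(),
        "ready": not missing,
    }


# Example usage:
# import os
# print(check_index_dir(os.environ["PHENOTYPE_INDEX_DIR"]))
```

Running the same check inside the container (against /data/phenotype_index) and on the host helps spot the mount mismatch described above.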
curl -s -X POST http://127.0.0.1:8765/flows/phenotype_recommendation \
-H 'Content-Type: application/json' \
-d '{"study_intent":"Identify clinical risk factors for older adult patients who experience an adverse event of acute gastrointestinal bleeding","top_k":20,"max_results":10,"candidate_limit":10}'

curl -s -X POST http://127.0.0.1:8765/flows/keeper_concept_sets_generate \
-H 'Content-Type: application/json' \
-d '{"phenotype":"Gastrointestinal bleeding",
"domain_keys":["doi","alternativeDiagnosis","symptoms"],
"candidate_limit":5,
"include_diagnostics":true
}'

curl -s -X POST http://127.0.0.1:8765/flows/phenotype_validation_review \
-H 'Content-Type: application/json' \
-d '{
"disease_name": "Gastrointestinal bleeding",
"keeper_row": {
"age": 44,
"gender": "Male",
"visitContext": "Inpatient Visit",
"presentation": "Gastrointestinal hemorrhage",
"priorDisease": "Peptic ulcer",
"priorDrugs": "celecoxib",
"afterDrugs": "naproxen"
}
}'

- Installation, smoke tests, and provider-specific examples: docs/TESTING.md
- Implemented service inventory: docs/SERVICE_REGISTRY.yaml
- Docker setup: see compose.yaml and .env.example. The default containerized phenotype index path is ./data/phenotype_index on the host, mounted to /data/phenotype_index in the container.
- ACP and MCP component details: acp_agent/README.md, mcp_server/README.md
- Open an issue or discussion if a workflow is unclear or under-documented
- Submit PRs that tighten the implemented workflow docs before adding new service claims
- Join the discussion on the OHDSI Forums
Near-term priorities:
- strengthen phenotype recommendation and improvement workflows for study design and Strategus handoff
- expand Keeper-assisted concept generation and profile-review workflows for phenotype validation
- improve researcher-facing workflow documentation, smoke tests, and deployment guidance
Active expansion areas:
- data-quality interpretation tied to study intent
- more phenotype authoring support beyond recommendation and improvement
- broader study-design critique and cohort authoring services
For the broader future-service catalog, see docs/ROADMAP.md.
The repository still contains broader plans that are not the main implemented story yet. Treat these as exploratory or partial unless the docs for a specific flow say otherwise:
- generalized protocol-writing and critique services
- broader data-quality interpretation services
- wider cohort authoring and design-review service families beyond the currently implemented lint/recommendation paths
- expansion toward a larger study-agent service catalog
The planned-service inventory in older docs should not be read as "fully available now".

