FPLX — Stochastic Inference & Constrained Optimization for Fantasy Premier League

FPLX treats FPL squad selection as a dynamic decision problem under uncertainty. Instead of producing static point projections, it models each player's underlying form as a latent variable, fuses noisy evidence (match stats, injury news, fixture difficulty) via probabilistic filters, and feeds the resulting distributions into a constrained optimizer.

The system has three layers: Perceive (news signals, fixture data) → Infer (HMM + Kalman Filter → fused predictions with uncertainty) → Act (ILP/Greedy squad selection).
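The Act layer's greedy mode can be pictured with a minimal sketch. This is not FPLX's optimizer: the function name `greedy_squad`, the player dictionaries, and the quota format are all hypothetical, and a greedy pass is a heuristic that an ILP solver would generally beat.

```python
def greedy_squad(players, budget, quota):
    """Pick players by expected points per unit cost, within budget and position quotas."""
    chosen, spent = [], 0.0
    counts = {pos: 0 for pos in quota}
    # Rank by value: expected points divided by cost, best first.
    ranked = sorted(players, key=lambda p: p["ep"] / p["cost"], reverse=True)
    for p in ranked:
        pos = p["pos"]
        if pos not in quota or counts[pos] >= quota[pos]:
            continue  # position already full, or not requested
        if spent + p["cost"] > budget:
            continue  # cannot afford this player
        chosen.append(p)
        counts[pos] += 1
        spent += p["cost"]
    return chosen, spent


# Hypothetical toy data, not real FPL prices or projections.
players = [
    {"name": "A", "pos": "FWD", "cost": 10.0, "ep": 8.0},
    {"name": "B", "pos": "FWD", "cost": 5.0, "ep": 5.0},
    {"name": "C", "pos": "MID", "cost": 6.0, "ep": 5.4},
    {"name": "D", "pos": "MID", "cost": 4.0, "ep": 3.0},
]
squad, spent = greedy_squad(players, budget=12.0, quota={"FWD": 1, "MID": 1})
print([p["name"] for p in squad], spent)
```

Ranking by points per cost is one of several reasonable greedy criteria; it is used here only because it fits in a few lines.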

Inference Pipeline

The inference pipeline is the core contribution. Each player is modeled independently:

                         NewsSignal ──────┐
                                          ▼
Points history ──► HMM (discrete) ──► Transition perturbation
     │               │                    │
     │               ▼                    │
     │          P(S_t | data)             │
     │          {Injured, Slump,          │
     │           Average, Good, Star}     │
     │               │                    │
     ├──► Kalman Filter (continuous) ◄────┘
     │          x̂_t, P_t              Process noise shock
     │               │
     │          FixtureSignal ──► Observation noise modulation
     │               │
     ▼               ▼
  Fusion (inverse-variance weighting)
     │
     ▼
  E[P], Var[P]  ──►  ILP Optimizer

HMM tracks 5 discrete form states: {Injured, Slump, Average, Good, Star}. Each state has a Gaussian emission model defining expected points. The Forward-Backward algorithm computes smoothed posteriors; Viterbi decoding finds the most likely state sequence. Baum-Welch (EM) learns transition and emission parameters from data.
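As an illustration of the smoothing step, here is a minimal NumPy sketch of scaled Forward-Backward with Gaussian emissions. The transition matrix, emission means, and standard deviations are made-up values for the example, not FPLX's learned parameters, and `forward_backward` is a hypothetical name rather than the library's API.

```python
import numpy as np

STATES = ["Injured", "Slump", "Average", "Good", "Star"]


def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def forward_backward(obs, A, pi, means, stds):
    """Return smoothed posteriors P(S_t | all observations), shape (T, K)."""
    obs = np.asarray(obs, dtype=float)
    T, K = len(obs), len(pi)
    B = gaussian_pdf(obs[:, None], means[None, :], stds[None, :])  # emission likelihoods

    # Forward pass, rescaled at each step to avoid numerical underflow.
    alpha = np.zeros((T, K))
    scale = np.zeros(T)
    alpha[0] = pi * B[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, using the same scaling factors.
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[t + 1] * beta[t + 1])) / scale[t + 1]

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)


# Illustrative parameters (assumptions): sticky transitions, rising emission means.
K = len(STATES)
A = np.full((K, K), 0.05) + 0.75 * np.eye(K)  # each row sums to 1
pi = np.full(K, 1.0 / K)
means = np.array([0.0, 2.0, 4.0, 6.0, 9.0])
stds = np.array([0.7, 1.2, 1.5, 1.8, 2.5])

points = [6, 8, 5, 7, 9, 3, 6, 8, 7, 5, 0, 0, 0, 1, 2]
gamma = forward_backward(points, A, pi, means, stds)
print(STATES[int(gamma[11].argmax())])  # most probable state during the run of zeros
```

Each row of `gamma` is a probability distribution over the five states for one gameweek.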

Kalman Filter tracks a continuous latent variable representing the player's true point potential. A random-walk state model captures gradual form drift. Under its linear-Gaussian assumptions, the filter produces minimum-MSE estimates with an uncertainty bound at each gameweek.
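A scalar random-walk Kalman filter is short enough to sketch in full. The noise values `q` and `r`, the prior `x0`/`p0`, and the function name are assumptions chosen for illustration; FPLX's actual defaults may differ.

```python
import numpy as np


def kalman_random_walk(obs, q=0.5, r=4.0, x0=4.0, p0=10.0):
    """Filter a 1-D random-walk state observed with Gaussian noise.

    q: process noise variance (how fast form drifts)
    r: observation noise variance (how noisy weekly points are)
    """
    x, p = x0, p0
    means, variances = [], []
    for y in obs:
        # Predict: a random walk keeps the mean and adds process noise.
        p = p + q
        # Update: blend prediction and observation using the Kalman gain.
        k = p / (p + r)
        x = x + k * (y - x)
        p = (1.0 - k) * p
        means.append(x)
        variances.append(p)
    return np.array(means), np.array(variances)


points = np.array([6, 8, 5, 7, 9, 3, 6, 8, 7, 5, 0, 0, 0, 1, 2], dtype=float)
means, variances = kalman_random_walk(points)
print(f"last estimate: {means[-1]:.2f} +/- {variances[-1] ** 0.5:.2f}")
```

A larger `q` lets the estimate react faster to a change in form, at the cost of a noisier track.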

News injection is where signals enter the inference. When NewsSignal classifies news as "ruled out," the HMM transition matrix for that timestep is perturbed (10x boost toward Injured state), and the Kalman Filter's process noise is inflated (true form may have jumped). Unlike the post-hoc scalar multiplier used in static pipelines, which only rescales the final projection, this changes the model's beliefs before the prediction is made.
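One plausible reading of the "10x boost" is to multiply the Injured column of the transition matrix by 10 and renormalize each row. The sketch below assumes that interpretation; `boost_transition` is a hypothetical helper, the matrix and base `q` are made up, and the ×5.0 process-noise factor is the value this README lists for unavailable players.

```python
import numpy as np


def boost_transition(A, state, factor):
    """Scale transitions into `state` by `factor`, then renormalize every row."""
    A_pert = np.array(A, dtype=float)  # np.array copies, so A is left untouched
    A_pert[:, state] *= factor
    return A_pert / A_pert.sum(axis=1, keepdims=True)


K = 5  # Injured, Slump, Average, Good, Star
A = np.full((K, K), 0.05) + 0.75 * np.eye(K)  # illustrative sticky matrix

A_news = boost_transition(A, state=0, factor=10.0)  # state 0 = Injured

q = 0.5            # illustrative base process noise
q_news = q * 5.0   # inflated for the timestep where the news lands

print(A[2, 0], round(A_news[2, 0], 3))  # P(Average -> Injured) before and after
```

The perturbed matrix would apply only at the timestep of the news; other timesteps keep the original `A`.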

Fixture injection modulates the Kalman Filter's observation noise. Harder opponents produce noisier point observations (R multiplied by 1.5 for difficulty-5 fixtures, 0.8 for difficulty-1).
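Only the two endpoint multipliers (0.8 and 1.5) are stated above. The sketch below fills in difficulties 2 to 4 by linear interpolation, which is an assumption made for illustration, as is the function name.

```python
import numpy as np


def observation_noise(r_base, difficulty):
    """Scale the base observation noise R by fixture difficulty (1 = easiest, 5 = hardest)."""
    if not 1 <= difficulty <= 5:
        raise ValueError("difficulty must be between 1 and 5")
    # Endpoints come from the README; the values in between are interpolated (assumption).
    multiplier = float(np.interp(difficulty, [1, 5], [0.8, 1.5]))
    return r_base * multiplier


r_easy = observation_noise(4.0, 1)
r_hard = observation_noise(4.0, 5)
print(r_easy, r_hard)
```

A larger R lowers the Kalman gain, so a poor score against a hard opponent moves the form estimate less than the same score against an easy one.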

Fusion combines HMM and Kalman outputs via inverse-variance weighting. If the two estimates are treated as independent, the fused variance is lower than that of either component alone. The output is a (mean, variance) pair per player that feeds into the optimizer.
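Inverse-variance weighting for two estimates fits in a few lines. The function name and the input numbers are illustrative; the formula is the standard one for combining independent estimates.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Combine two independent estimates, weighting each by its precision (1 / variance)."""
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    fused_var = 1.0 / (w_a + w_b)
    fused_mean = fused_var * (w_a * mean_a + w_b * mean_b)
    return fused_mean, fused_var


# Illustrative inputs: a vague HMM estimate and a sharper Kalman estimate.
fused_mean, fused_var = fuse(mean_a=5.0, var_a=4.0, mean_b=7.0, var_b=1.0)
print(f"E[P]={fused_mean:.2f}, Var[P]={fused_var:.2f}")
```

The fused mean lands closer to the more certain input, and the fused variance is below both input variances.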

Quick Start

Installation

git clone https://github.com/fnhirwa/fantasypl.git
cd fantasypl
pip install -e "."

# With ML models
pip install -e ".[ml]"

# With ILP optimizer
pip install -e ".[optimization]"

# Everything
pip install -e ".[all]"

Inference Pipeline (recommended)

from fplx import FPLModel

model = FPLModel(
    budget=100,
    horizon=1,
    formation="auto",
    config={"model_type": "inference"}
)

model.load_data(source="api")   # fetches FPL API + collects news
model.fit()                     # runs HMM + KF + fusion per player

squad = model.select_best_11()
print(squad.summary())

# Uncertainty is available for downstream use
for pid, ep in sorted(model.expected_points.items(), key=lambda x: -x[1])[:10]:
    var = model.expected_variance[pid]
    print(f"  Player {pid}: E[P]={ep:.2f}, std={var**0.5:.2f}")

Legacy Pipeline (baselines)

from fplx import FPLModel

# Rolling average baseline
model = FPLModel(budget=100, config={"model_type": "baseline"})
model.load_data(source="api")
model.fit()
squad = model.select_best_11()

# XGBoost (requires fplx[ml])
model = FPLModel(budget=100, config={"model_type": "xgboost"})
model.load_data(source="api")
model.fit()
squad = model.select_best_11()

Direct Inference (without FPLModel)

import numpy as np
from fplx.inference.pipeline import PlayerInferencePipeline
from fplx.signals.news import NewsSignal

# Player's gameweek-by-gameweek points
points = np.array([6, 8, 5, 7, 9, 3, 6, 8, 7, 5, 0, 0, 0, 1, 2])

pipeline = PlayerInferencePipeline()
pipeline.ingest_observations(points)

# Inject injury news at gameweek 11 (timesteps are 0-indexed, so index 10)
news = NewsSignal().generate_signal("Ruled out for 3 weeks")
pipeline.inject_news(news, timestep=10)

result = pipeline.run()
ep_mean, ep_var = pipeline.predict_next()

print(f"Next GW: E[P]={ep_mean:.2f}, std={ep_var**0.5:.2f}")
print(f"Current state: {result.viterbi_path[-1]}")  # 0=Injured,...,4=Star
print(f"P(Injured): {result.smoothed_beliefs[-1, 0]:.3f}")

News Signal Integration

FPLX uses news data from the FPL API itself. Every player in the bootstrap-static response includes:

| Field | Example | Usage |
| --- | --- | --- |
| `news` | `"Knee injury - expected back 01 Feb"` | Parsed by `NewsSignal` |
| `status` | `"i"` (injured), `"d"` (doubtful), `"a"` (available) | Classified into perturbation category |
| `chance_of_playing_next_round` | `25` (percent) | Augments news text |

NewsCollector snapshots this data per gameweek (persisted as JSON in ~/.fplx/news/). NewsSnapshot.to_news_signal_input() enriches the raw text with status codes and chance percentages, then routes through the existing NewsSignal.generate_signal() parser.
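To make the data flow concrete, the sketch below pulls the three fields out of an already-decoded bootstrap-static payload and builds an enriched text string. It assumes players sit under an `elements` list with an `id` field. The helper names and the bracketed output format are hypothetical and do not reproduce what `NewsSnapshot.to_news_signal_input()` actually emits.

```python
def extract_news(payload):
    """Collect the news-related fields for every player in a decoded payload."""
    rows = []
    for el in payload.get("elements", []):
        rows.append({
            "player_id": el["id"],
            "news": el.get("news") or "",
            "status": el.get("status") or "a",
            "chance": el.get("chance_of_playing_next_round"),  # may be None
        })
    return rows


def enrich(row):
    """Append status and chance to the raw news text (hypothetical format)."""
    text = row["news"] or "no news"
    parts = [f"status={row['status']}"]
    if row["chance"] is not None:
        parts.append(f"chance={row['chance']}%")
    return f"{text} [{', '.join(parts)}]"


# Hand-written sample payload, not a real API response.
sample = {
    "elements": [
        {"id": 1, "news": "Knee injury - expected back 01 Feb", "status": "i",
         "chance_of_playing_next_round": 25},
        {"id": 2, "news": "", "status": "a", "chance_of_playing_next_round": None},
    ]
}
rows = extract_news(sample)
print(enrich(rows[0]))
```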

The parsed signal maps to inference perturbations:

| News Category | HMM Perturbation | KF Process Noise |
| --- | --- | --- |
| Unavailable (`"ruled out"`, status=`i`) | Injured state ×10 | Q × 5.0 |
| Doubtful (`"late fitness test"`, status=`d`) | Injured ×3, Slump ×2 | Q × 2.0 |
| Rotation (`"rotation risk"`, `"benched"`) | Slump ×2, Average ×1.5 | Q × 1.5 |
| Positive (`"back in training"`, status=`a`) | Good ×2, Star ×1.5 | Q × 1.0 |
| Neutral (no news) | No perturbation | No change |
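The table can be encoded directly as data. In the sketch below the multipliers are copied from the table, while the dictionary layout, the category keys, and the helper `state_multipliers` are hypothetical and not taken from the FPLX source.

```python
import numpy as np

STATES = ["Injured", "Slump", "Average", "Good", "Star"]

# Multipliers from the table above; states not listed keep a factor of 1.0.
PERTURBATIONS = {
    "unavailable": {"hmm": {"Injured": 10.0}, "q_mult": 5.0},
    "doubtful": {"hmm": {"Injured": 3.0, "Slump": 2.0}, "q_mult": 2.0},
    "rotation": {"hmm": {"Slump": 2.0, "Average": 1.5}, "q_mult": 1.5},
    "positive": {"hmm": {"Good": 2.0, "Star": 1.5}, "q_mult": 1.0},
    "neutral": {"hmm": {}, "q_mult": 1.0},
}


def state_multipliers(category):
    """Return one multiplier per HMM state, in STATES order, for a news category."""
    boosts = PERTURBATIONS[category]["hmm"]
    return np.array([boosts.get(state, 1.0) for state in STATES])


doubtful = state_multipliers("doubtful")
print(doubtful, PERTURBATIONS["doubtful"]["q_mult"])
```

Keeping the mapping in one table-like structure makes it easy to audit against the documentation and to tune a single factor without touching inference code.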

Development

git clone https://github.com/fnhirwa/fantasypl.git
cd fantasypl
python -m venv env
source env/bin/activate
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format
ruff format fplx/
ruff check fplx/ --fix

Research Context

This project implements the pipeline described in:

FPLX: A Framework for Stochastic Inference and Constrained Optimization in High-Variance Sports Environments

The key insight is treating FPL squad selection as a Perceive → Infer → Act loop where uncertainty propagates end-to-end from observation noise through to squad selection. This contrasts with standard approaches that decouple forecasting from optimization and discard uncertainty at the interface.

References: Matthews et al. (AAAI 2012), Tamimi & Tran (IJCSS 2025), Ramezani (arXiv 2025), Brill et al. (arXiv 2024).
