This repository provides the core implementation for Adversarial Water-Filling (AWF) and the AWF wireless foundation model used in the manuscript:
Adversarial Water-Filling: Theory, Algorithms, and a Wireless Foundation Model
The code implements a domain-specific learned solver for mercury/water-filling AWF problems with discrete constellations, spatial linear constraints, and adversarial interference. The model combines permutation-invariant channel set encoding, constraint-aware graph message passing, and learned primal-dual update dynamics.
Adversarial water-filling models a minimax resource-allocation problem in which a transmitter allocates power across channels while an adversary allocates interference power. This repository focuses on the discrete-constellation mercury/water-filling setting, where the derivative of the mutual-information function is represented using precomputed MMSE interpolation tables.
The implementation includes:
- QAM constellation simulation for MMSE and mutual-information tables.
- Online generation of AWF problem instances.
- Sparse, group, prefix, and dense constraint structures.
- A Perceiver-style channel set encoder.
- A bipartite GNN module for constraint-aware message passing.
- Learned primal-dual extragradient-style rollout dynamics.
- Mirror-Prox / projected primal-dual baselines.
- Evaluation scripts for size, modulation, and constraint generalization.
AWF/
├── train.py # Main training and evaluation driver
├── eval.py # Checkpoint evaluation script
├── requirements.txt # Python dependencies
├── awf_model_final.pt # Model weights
├── LICENSE # Apache-2.0 license
└── mercury/
    ├── constants.py # Modulation definitions and train/test groups
    ├── data.py # AWF instance generation and constraint matrices
    ├── evaluation.py # Evaluation utilities
    ├── losses.py # Training loss and residual objectives
    ├── metrics.py # Objective, feasibility, and KKT metrics
    ├── model.py # AWF foundation model architecture
    ├── optimization.py # Best-response and Mirror-Prox style baselines
    ├── qam.py # QAM constellation and MMSE/I table construction
    ├── training.py # Training loop and curriculum
    └── utils.py # Projection, interpolation, and helper utilities
Clone the repository:
git clone https://github.com/convexsoft/AWF.git
cd AWF
Create and activate a Python environment:
conda create -n awf python=3.10
conda activate awf
Install dependencies:
pip install -r requirements.txt
The current requirements.txt contains:
torch
numpy
matplotlib
A CUDA-capable GPU is recommended for training and evaluation. The scripts can also run on CPU, but runtime will be longer.
Run the main training script:
python train.py
The script will:
- set the random seed;
- select CUDA if available;
- build logarithmically spaced SNR grids;
- simulate MMSE and mutual-information tables for the supported QAM constellations;
- train the AWF foundation model on online-generated instances;
- save a timestamped model checkpoint;
- run baseline checks and generalization evaluations.
The public script is configured as a lightweight reproducibility entry point. For larger paper-scale experiments, increase the number of epochs, training steps, evaluation batches, and repeated runs as needed.
After training, evaluate a saved checkpoint using:
python eval.py --checkpoint path/to/checkpoint.pt
For example:
python eval.py --checkpoint awf_model_XXXX_final.pt
The evaluation script loads the model, reconstructs the MMSE/I tables, and reports:
- modulation-format generalization;
- held-out 256QAM generalization;
- constraint-structure generalization over sparse, group, prefix, and dense constraints.
A device can be specified explicitly:
python eval.py --checkpoint path/to/checkpoint.pt --device cuda
or
python eval.py --checkpoint path/to/checkpoint.pt --device cpu
The code generates AWF instances online. Each instance contains:
- channel gains beta;
- noise powers sigma;
- transmit budget P;
- adversarial interference budget N;
- linear constraint matrix A;
- constraint threshold vector p_hat;
- modulation-dependent distribution token;
- modulation ID.
The default budget mode is:
per_channel_fixed
In this mode, per-channel budgets are sampled and scaled with the number of channels:
P = m * P_bar,
N = m * N_bar.
This setting keeps the per-channel resource scale comparable across different problem sizes.
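The scaling rule above can be sketched in a few lines. This is a minimal illustration, not the code in mercury/data.py: the function name sample_budgets and the sampling ranges for P_bar and N_bar are hypothetical, chosen only to make the example runnable.

```python
import random

def sample_budgets(m, p_bar_range=(0.5, 2.0), n_bar_range=(0.1, 1.0), rng=None):
    """Sample per-channel budgets and scale them by the channel count m.

    The ranges are illustrative assumptions, not the repository defaults.
    """
    rng = rng or random.Random()
    p_bar = rng.uniform(*p_bar_range)  # per-channel transmit budget
    n_bar = rng.uniform(*n_bar_range)  # per-channel interference budget
    return m * p_bar, m * n_bar, p_bar, n_bar

# The total budgets grow with m, but the per-channel scale stays the same.
for m in (8, 32, 128):
    P, N, p_bar, n_bar = sample_budgets(m, rng=random.Random(0))
    print(m, P / m, N / m)
```

Because the same seed is reused for each size, the printed per-channel values are identical across m, which is the property the per_channel_fixed mode is meant to preserve.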
The implementation supports several linear-constraint families:
sparse
group
prefix
dense
The default training setup uses sparse random nonnegative constraints. Generalization evaluation additionally tests group, prefix, and dense structures.
In the instance generator, each row of A is normalized to unit sum. The threshold vector p_hat is generated by sampling a feasible power allocation and adding positive random slack to ensure a nonempty feasible constraint set.
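The construction described above (unit-sum rows, thresholds built from a feasible allocation plus slack) might look roughly like the following sketch. It assumes the constraints take the form A p <= p_hat with p >= 0. The function name make_sparse_constraints, the density, the slack range, and the way the feasible allocation is drawn are all illustrative assumptions and may differ from mercury/data.py.

```python
import random

def make_sparse_constraints(m, k, density=0.3, total_power=1.0, rng=None):
    """Build k sparse nonnegative constraints over m channels (illustrative)."""
    rng = rng or random.Random()
    A = []
    for _ in range(k):
        # Sparse nonnegative row: each entry is nonzero with probability `density`.
        row = [rng.random() if rng.random() < density else 0.0 for _ in range(m)]
        if sum(row) == 0.0:
            row[rng.randrange(m)] = 1.0  # avoid an all-zero row
        s = sum(row)
        A.append([a / s for a in row])  # normalize the row to unit sum
    # Sample a nonnegative allocation p0 that uses the whole power budget.
    w = [rng.random() for _ in range(m)]
    w_sum = sum(w)
    p0 = [total_power * wi / w_sum for wi in w]
    # Thresholds: A @ p0 plus strictly positive slack, so p0 satisfies A p <= p_hat.
    p_hat = [
        sum(a * p for a, p in zip(row, p0)) + rng.uniform(0.01, 0.1)
        for row in A
    ]
    return A, p_hat, p0

A, p_hat, p0 = make_sparse_constraints(m=6, k=4, rng=random.Random(0))
print([round(sum(row), 6) for row in A])  # each row sums to 1
```

Since p0 satisfies every constraint with strict inequality, the feasible set is guaranteed to be nonempty regardless of how the rows of A were drawn.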
The supported modulations are:
16QAM
64QAM
256QAM
The default training modulations are:
16QAM
64QAM
The default test groups are:
mod16 -> 16QAM
mod64 -> 64QAM
mixed -> 16QAM and 64QAM
heldout256 -> 256QAM
Thus, 256QAM is used as a held-out modulation format to evaluate distribution generalization.
The mercury/water-filling formulation uses the I-MMSE relationship. The code constructs interpolation tables by Monte Carlo simulation over a logarithmically spaced SNR grid.
By default:
- table SNR grid: approximately [-10, 30] dB with 128 points;
- distribution-token SNR grid: approximately [-10, 30] dB with 32 points;
- Monte Carlo samples per SNR point: 15000.
These tables are generated at runtime by mercury/qam.py.
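To make the table construction concrete, here is a minimal standard-library sketch of a Monte Carlo MMSE estimate for square QAM on a logarithmically spaced SNR grid. It assumes the complex channel y = sqrt(snr) * x + n with unit-variance circular Gaussian noise and a unit-energy, equiprobable constellation. The function names, the small grid, and the reduced sample count are illustrative choices; this is not the implementation in mercury/qam.py.

```python
import math
import random

def qam_constellation(order):
    """Unit-average-energy square QAM constellation as a list of complex points."""
    side = int(round(math.sqrt(order)))
    levels = [2 * i - (side - 1) for i in range(side)]
    points = [complex(a, b) for a in levels for b in levels]
    scale = math.sqrt(sum(abs(x) ** 2 for x in points) / len(points))
    return [x / scale for x in points]

def mmse_monte_carlo(points, snr, n_samples, rng):
    """Estimate E|x - E[x|y]|^2 for y = sqrt(snr) * x + n, with n ~ CN(0, 1)."""
    root = math.sqrt(snr)
    sigma = math.sqrt(0.5)  # std of each real noise component
    total = 0.0
    for _ in range(n_samples):
        x = rng.choice(points)
        y = root * x + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        d = [abs(y - root * c) ** 2 for c in points]
        d_min = min(d)  # subtract the minimum for numerical stability
        w = [math.exp(d_min - di) for di in d]
        x_hat = sum(wi * c for wi, c in zip(w, points)) / sum(w)  # conditional mean
        total += abs(x - x_hat) ** 2
    return total / n_samples

def log_spaced_snr_grid(lo_db, hi_db, n):
    """Linear SNR values whose dB values are evenly spaced on [lo_db, hi_db]."""
    step = (hi_db - lo_db) / (n - 1)
    return [10 ** ((lo_db + i * step) / 10) for i in range(n)]

rng = random.Random(123)
grid = log_spaced_snr_grid(-10.0, 30.0, 9)  # far coarser than the 128-point default
points = qam_constellation(16)
table = [mmse_monte_carlo(points, s, 1000, rng) for s in grid]
for s, v in zip(grid, table):
    print(f"{10 * math.log10(s):6.1f} dB  MMSE ~ {v:.4f}")
```

A table like this can then be interpolated at arbitrary SNR values. Under the I-MMSE relationship, the derivative of the mutual information (in nats) with respect to SNR equals the MMSE for this complex channel model, which is why the MMSE table is enough to represent that derivative.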
The public train.py script uses:
seed = 123
budget_mode = per_channel_fixed
optimizer = Adam
learning_rate = 1e-4
batch_size = 16
epochs = 1
steps_per_epoch = 100
The script saves a final checkpoint named like:
awf_model_<timestamp>_final.pt
The code uses stochastic components, including:
- Monte Carlo construction of MMSE/mutual-information tables;
- online random generation of AWF instances;
- random channel gains, noise powers, budgets, and constraints;
- stochastic model training;
- hardware-dependent GPU execution behavior.
Exact bitwise reproduction is therefore not expected. Results should be compared using averaged metrics over repeated runs.
For runtime measurement, use warm-up runs and CUDA synchronization when timing GPU execution. Runtime may vary depending on GPU temperature, power limits, background processes, memory pressure, and CUDA scheduling.
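A timing helper along these lines is one way to follow that advice. It is a hedged sketch: time_call and its parameters are hypothetical names, not part of this repository. The sync argument is where a GPU synchronization call such as PyTorch's torch.cuda.synchronize would be passed, so that queued kernels finish before the clock is read.

```python
import time

def time_call(fn, warmup=3, repeats=10, sync=None):
    """Return per-call wall-clock times in seconds, measured after warm-up runs."""
    for _ in range(warmup):
        fn()  # warm-up: caches, lazy initialization, kernel compilation
    if sync is not None:
        sync()  # make sure warm-up work has finished before timing
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued GPU work before stopping the clock
        times.append(time.perf_counter() - start)
    return times

# CPU-only usage; on a GPU, pass sync=torch.cuda.synchronize instead.
times = time_call(lambda: sum(i * i for i in range(10_000)))
print(f"median: {sorted(times)[len(times) // 2]:.6f} s")
```

Reporting the median (or the mean with a standard deviation) over the timed repeats reduces the influence of one-off stalls caused by scheduling or background load.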
A typical reproducibility workflow is:
python train.py
python eval.py --checkpoint path/to/checkpoint.pt
For more stable results, run multiple repetitions and report the mean and standard deviation.
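Aggregating a metric across repeated runs needs only the standard library. The sketch below assumes you have already collected one scalar per run. The numbers in example_metric_values are placeholders for illustration only, not results from the manuscript, and summarize is a hypothetical helper name.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation) of a list of per-run metrics."""
    mean = statistics.mean(values)
    # Sample standard deviation needs at least two runs.
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std

# Placeholder values for illustration only; replace with your own per-run metrics.
example_metric_values = [0.10, 0.12, 0.11]
mean, std = summarize(example_metric_values)
print(f"{mean:.3f} +/- {std:.3f} over {len(example_metric_values)} runs")
```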
This repository contains the core implementation and scripts needed to reproduce the main experimental workflow in the manuscript. Additional internal scripts for large-scale hyperparameter sweeps and unpublished extensions are not included in this public release.
This project is released under the Apache License 2.0. See LICENSE for details.