Repository for training toy GPT-2 models and experimenting with sparse autoencoders.
- `checkpoints` - Training output
- `config` - Model and training configurations
- `data` - Datasets to be used for training
- `experiments` - Scripts for experiments
- `models` - GPT and SAE model definitions
- `training` - Model trainers
Python requirements are listed in `requirements.txt`. To install them, run:

```shell
pip install -r requirements.txt
```
Each dataset directory contains a `prepare.py` script. Run it as a module from the repository root to prepare that dataset for training. Example:

```shell
python -m data.shakespeare.prepare
```
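What a `prepare.py` script does varies by dataset, but the general shape is to build a vocabulary, encode the raw text into token ids, and split the result into train and validation sets. The sketch below is a minimal character-level illustration of that flow; the function name, split logic, and return values are assumptions, not the repo's actual code.

```python
def prepare(text, val_fraction=0.1):
    """Toy character-level dataset preparation (illustrative only)."""
    # Build a character vocabulary and a char -> id lookup table.
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}

    # Encode the full corpus into token ids.
    ids = [stoi[ch] for ch in text]

    # Hold out the last val_fraction of tokens for validation.
    split = int(len(ids) * (1 - val_fraction))
    return ids[:split], ids[split:], stoi

train_ids, val_ids, vocab = prepare("hello world")
```

A real script would typically also persist the encoded splits to disk so the trainer can memory-map them.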
GPT configurations are stored in `config/gpt`, and the trainer is located at `training/gpt.py`. To run training, use:

```shell
python -m training.gpt --config=shakespeare_64x4
```
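A named config like `shakespeare_64x4` presumably bundles model and training hyperparameters. The dataclass below is a hypothetical sketch of what such an entry could contain; the field names, the values, and the reading of "64x4" as 64-dim embeddings with 4 layers are all guesses about the naming scheme, not the repo's documented schema.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    """Hypothetical training config entry (field names are guesses)."""
    name: str
    n_layer: int          # number of transformer blocks
    n_head: int           # attention heads per block
    n_embd: int           # embedding width
    block_size: int       # context length in tokens
    learning_rate: float

# Assumed reading: "64x4" = 64-dim embeddings, 4 layers.
shakespeare_64x4 = GPTConfig(
    name="shakespeare_64x4",
    n_layer=4,
    n_head=4,
    n_embd=64,
    block_size=128,
    learning_rate=1e-3,
)
```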
Distributed data parallel (DDP) training is supported via `torchrun`; `--nproc_per_node` sets the number of GPUs to use on the machine:

```shell
torchrun --standalone --nproc_per_node=8 -m training.gpt --config=shakespeare_64x4
```
SAE configurations are stored in `config/sae`, and the trainers are located under `training/sae`. To train sparse autoencoders on a previously trained GPT model (referenced via `--load_from`), use:

```shell
python -m training.sae.concurrent --config=standard.shakespeare_64x4 --load_from=shakespeare_64x4
```
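For orientation, the core computation a sparse autoencoder performs is: an overcomplete ReLU encoder over model activations, a linear decoder that reconstructs them, and a loss combining reconstruction error with an L1 sparsity penalty on the latent features. The pure-Python sketch below illustrates that computation only; it is not the repo's implementation, and all names and the `l1_coeff` default are illustrative.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    """Dense matrix-vector product over plain lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """One SAE forward pass: encode, decode, and compute the loss."""
    # Overcomplete ReLU encoder: latent dim > input dim.
    h = relu([a + b for a, b in zip(matvec(W_enc, x), b_enc)])
    # Linear decoder reconstructs the input activations.
    x_hat = [a + b for a, b in zip(matvec(W_dec, h), b_dec)]
    # Squared-error reconstruction term plus L1 sparsity penalty.
    recon = sum((a - b) ** 2 for a, b in zip(x_hat, x))
    sparsity = l1_coeff * sum(abs(a) for a in h)
    return x_hat, h, recon + sparsity
```

In practice this runs batched on GPU (e.g. in PyTorch), with the latent dimension several times wider than the residual-stream width.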