This directory contains the documentation for PRIME-RL. It is organized into the following sections:
- Entrypoints - Overview of the main components (orchestrator, trainer, inference) and how to run SFT, RL, and evals
- Configs - Configuration system using TOML files, CLI arguments, and environment variables
- Environments - Installing and using verifiers environments from the Environments Hub
- Async Training - Understanding asynchronous off-policy training and step semantics
- Logging - Logging with loguru, torchrun, and Weights & Biases
- Checkpointing - Saving and resuming training from checkpoints
- Benchmarking - Performance benchmarking and throughput measurement
- Deployment - Training deployment on single-GPU, multi-GPU, and multi-node clusters
- Troubleshooting - Common issues and their solutions