This directory contains the documentation for PRIME-RL. It is organized into the following sections:
  • Entrypoints - Overview of the main components (orchestrator, trainer, inference) and how to run SFT, RL, and evals
  • Configs - Configuration system using TOML files, CLI arguments, and environment variables
  • Environments - Installing and using verifiers environments from the Environments Hub
  • Async Training - Understanding asynchronous off-policy training and step semantics
  • Logging - Logging with loguru, torchrun, and Weights & Biases
  • Checkpointing - Saving and resuming training from checkpoints
  • Benchmarking - Performance benchmarking and throughput measurement
  • Deployment - Deploying training on a single GPU, multiple GPUs, or a multi-node cluster
  • Troubleshooting - Common issues and their solutions