`{output_dir}/logs/`. For RL training, we recommend streaming file logs into tmux panes (as set up by `scripts/tmux.sh`).

### tmux helper (`scripts/tmux.sh`)
`scripts/tmux.sh` sets up a tmux session for RL runs with three panes (one per subprocess):

- **Trainer**: run `uv run rl ...` here
- **Orchestrator**: follows `{output_dir}/logs/orchestrator.stdout`
- **Inference**: follows `{output_dir}/logs/inference.stdout`
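For illustration, here is a rough Python sketch of the pane layout the helper creates. The real helper is a shell script; the session name (`rl`) and `output_dir` value are assumptions for this example:

```python
# Illustrative only: approximates the three-pane layout of scripts/tmux.sh
# via Python's subprocess module; the actual helper is a shell script.
import subprocess

output_dir = "outputs/my_run"  # assumption: your run's {output_dir}

def tmux(*args: str) -> None:
    """Run a tmux command, raising on failure."""
    subprocess.run(["tmux", *args], check=True)

tmux("new-session", "-d", "-s", "rl")     # pane 0: run `uv run rl ...` here
tmux("split-window", "-h", "-t", "rl:0")  # pane 1: orchestrator log
tmux("split-window", "-v", "-t", "rl:0")  # pane 2: inference log
tmux("send-keys", "-t", "rl:0.1",
     f"tail -F {output_dir}/logs/orchestrator.stdout", "Enter")
tmux("send-keys", "-t", "rl:0.2",
     f"tail -F {output_dir}/logs/inference.stdout", "Enter")
tmux("attach-session", "-t", "rl")
```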
## Logger Architecture

### `setup_logger` and `get_logger`
We use a singleton pattern with a module-level global logger instance (`_LOGGER`).

- `setup_logger(log_level, log_file)` - Initializes the global logger exactly once:
  - Creates an isolated loguru `Logger` instance (not the default `loguru.logger`) to prevent third-party code from hijacking our logs
  - Adds a stdout handler with colorized output
  - Optionally adds a file handler (deletes existing file first)
  - Raises `RuntimeError` if called twice
- `get_logger()` - Returns the global logger instance:
  - Raises `RuntimeError` if `setup_logger` hasn't been called yet
  - Safe to call from any module after initialization
- `reset_logger()` - Resets the global logger to `None`:
  - Used in subprocesses that inherit parent state (e.g., env workers)
  - Used in tests between test cases
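For reference, here is a minimal sketch of this singleton pattern, assuming loguru's documented deep-copy recipe for creating independent loggers; handler options and error messages are illustrative, not the repository's exact implementation:

```python
# Minimal sketch of the singleton logger pattern described above.
from __future__ import annotations

import copy
import sys
from pathlib import Path

from loguru import logger as _default_loguru_logger

_LOGGER = None  # module-level global logger instance


def setup_logger(log_level: str, log_file: str | None = None):
    """Initialize the global logger exactly once."""
    global _LOGGER
    if _LOGGER is not None:
        raise RuntimeError("setup_logger() was already called")
    # Per loguru's recipe for independent loggers: strip the default
    # handler, then deep-copy to get an isolated Logger instance that
    # third-party code touching `loguru.logger` cannot hijack.
    _default_loguru_logger.remove()
    _LOGGER = copy.deepcopy(_default_loguru_logger)
    _LOGGER.add(sys.stdout, level=log_level, colorize=True)
    if log_file is not None:
        Path(log_file).parent.mkdir(parents=True, exist_ok=True)
        Path(log_file).unlink(missing_ok=True)  # delete existing file first
        _LOGGER.add(log_file, level=log_level)
    return _LOGGER


def get_logger():
    """Return the global logger; raises if setup_logger() was never called."""
    if _LOGGER is None:
        raise RuntimeError("Call setup_logger() before get_logger()")
    return _LOGGER


def reset_logger() -> None:
    """Reset the singleton to None (subprocesses, tests)."""
    global _LOGGER
    _LOGGER = None
```

With this pattern, a subprocess that inherits the parent's module state calls `reset_logger()` and then `setup_logger(...)` with its own log file.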
## RL Log File Structure

For RL training, logs are organized by component:

| Component | Log Path | Description |
|---|---|---|
| RL (parent) | logs/rl.log | Main process that spawns subprocesses |
| Inference | logs/inference.stdout | vLLM inference server stdout/stderr |
| Orchestrator | logs/orchestrator.log | Rollout generation, buffer, scheduling |
| Trainer | logs/trainer/rank_{N}.log | Training process (one file per GPU rank) |
| Env Workers | logs/env_workers/{env_name}/worker_{N}.log | Per-environment worker logs (optional) |
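As an illustration of the per-rank trainer files, here is a sketch of how a rank-specific log path might be derived, reusing the `setup_logger` sketch from the previous section (the `RANK` lookup and `output_dir` value are assumptions for this example):

```python
import os

output_dir = "outputs/my_run"  # assumption: your run's {output_dir}

# torchrun exports RANK for each training process; keying the log file on it
# reproduces the logs/trainer/rank_{N}.log layout from the table above.
rank = int(os.environ.get("RANK", "0"))
log_file = f"{output_dir}/logs/trainer/rank_{rank}.log"
setup_logger(log_level="INFO", log_file=log_file)
```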
### Per-Environment Worker Logging

Environment workers run in separate subprocesses to isolate event loop lag. Worker logging is controlled at the orchestrator level via `orchestrator.log`:

- With `env_worker_logs = true`, logs are written to `logs/env_workers/{env_name}/worker_{N}.log`.
- Set `env_worker_logs = false` to disable worker file logging (workers inherit parent process logging).
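A sketch of the corresponding config, assuming a TOML layout where `env_worker_logs` lives under the orchestrator's `log` table (the exact key nesting may differ):

```toml
# Assumed config layout; verify the table/key names against your config schema.
[orchestrator.log]
env_worker_logs = true  # set to false to disable per-worker log files
```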
### Torchrun

For multi-node training with `torchrun`, all ranks log simultaneously. To filter console output to the master rank only, you can pass torchrun's `--local-ranks-filter 0` flag. Per-rank stdout/stderr is redirected to `outputs/torchrun/{rdzv_id}/attempt_0/{rank}/{stdout,stderr}.log`.