- A dataset of task inputs
- A harness for the model (tools, sandboxes, context management, etc.)
- A reward function or rubric to score the model’s performance
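
To make the three pieces concrete, here is a minimal plain-Python sketch of how a dataset, a harness, and a reward function fit together. Every name in it (`Task`, `run_harness`, `exact_match`, `evaluate`, `toy_model`) is hypothetical and chosen for illustration; none of them are part of the verifiers API.

```python
from dataclasses import dataclass
from typing import Callable

# All names below are hypothetical illustrations, not verifiers APIs.


@dataclass
class Task:
    """One dataset row: a task input plus its reference answer."""
    prompt: str
    answer: str


def run_harness(model: Callable[[str], str], task: Task) -> str:
    """Harness: a single-turn rollout that passes the prompt to the model."""
    return model(task.prompt)


def exact_match(completion: str, answer: str) -> float:
    """Reward function: 1.0 if the completion equals the answer, else 0.0."""
    return 1.0 if completion.strip() == answer.strip() else 0.0


def evaluate(model, dataset, reward) -> float:
    """Run every task through the harness and average the rewards."""
    scores = [reward(run_harness(model, task), task.answer) for task in dataset]
    return sum(scores) / len(scores)


def toy_model(prompt: str) -> str:
    """A stand-in 'model' that always answers "4"."""
    return "4"


dataset = [Task("2 + 2 = ?", "4"), Task("3 * 3 = ?", "9")]
mean_reward = evaluate(toy_model, dataset, exact_match)
```

A real harness would also manage tools, sandboxes, and context, and a rubric could combine several reward functions; this sketch keeps each piece to a single function so the division of responsibilities is visible.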
Getting Started
Ensure you have uv installed, as well as the prime CLI tool:
uv init), installs verifiers (with uv add verifiers), creates the recommended workspace structure, and downloads useful starter files:
verifiers to an existing project:
my_env with a basic environment template.
load_environment function which returns an instance of the Environment object, and which can accept custom arguments. For example:
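
As a minimal sketch of that shape, the snippet below defines a `load_environment` function that takes custom keyword arguments and returns an environment object. It assumes a stand-in `ToyEnvironment` dataclass in place of the library's `Environment` class, and the arguments `num_examples` and `case_sensitive` are hypothetical; a real environment module would construct and return one of the library's own environment types instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for the library's Environment object."""
    dataset: list[dict]
    reward_func: Callable[[str, str], float]


def load_environment(num_examples: int = 3, case_sensitive: bool = True) -> ToyEnvironment:
    """Build an environment; both keyword arguments are illustrative only."""
    # A tiny synthetic dataset of addition questions.
    dataset = [
        {"question": f"What is {i} + {i}?", "answer": str(2 * i)}
        for i in range(1, num_examples + 1)
    ]

    def correct_answer(completion: str, answer: str) -> float:
        # Optionally ignore case before comparing.
        if not case_sensitive:
            completion, answer = completion.lower(), answer.lower()
        return 1.0 if completion == answer else 0.0

    return ToyEnvironment(dataset=dataset, reward_func=correct_answer)


# Custom arguments are passed as ordinary keyword arguments.
env = load_environment(num_examples=2)
```

Keeping all configuration in the function signature means the same module can be loaded with different settings without editing its code.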
./configs/endpoints.toml.
View local evaluation results in the terminal UI:
environment -> model -> run). Press Enter on a run to open rollout details, b to go back, tab to cycle panes, e and x to expand or collapse history, pageup and pagedown to scroll history, and c for Copy Mode.
To publish the environment to the Environments Hub, run: