- A dataset of task inputs
- A harness for the model (tools, sandboxes, context management, etc.)
- A reward function or rubric to score the model’s performance
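As a rough illustration of how these three pieces fit together, here is a minimal standard-library sketch. It does not use the verifiers API: `Task`, `run_harness`, `exact_match_reward`, `evaluate`, and `toy_model` are all hypothetical names chosen for this example, and the "model" is a plain Python function.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical task record; the field names are illustrative only.
@dataclass
class Task:
    prompt: str
    answer: str


# 1. A dataset of task inputs.
dataset = [
    Task(prompt="What is 2 + 3?", answer="5"),
    Task(prompt="What is 7 * 6?", answer="42"),
]


# 2. A harness: here just a single model call with whitespace cleanup.
def run_harness(model: Callable[[str], str], task: Task) -> str:
    return model(task.prompt).strip()


# 3. A reward function: 1.0 for an exact match, 0.0 otherwise.
def exact_match_reward(completion: str, task: Task) -> float:
    return 1.0 if completion == task.answer else 0.0


def evaluate(model: Callable[[str], str], tasks: list) -> float:
    """Run every task through the harness and return the mean reward."""
    rewards = [exact_match_reward(run_harness(model, t), t) for t in tasks]
    return sum(rewards) / len(rewards)


# Toy stand-in for a model: it only knows one answer.
def toy_model(prompt: str) -> str:
    return "5" if "2 + 3" in prompt else "unknown"


mean_reward = evaluate(toy_model, dataset)
```

A real harness would typically also manage tools, sandboxes, and context, but the division of responsibilities stays the same.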
Getting Started
Ensure you have `uv` installed, as well as the `prime` CLI tool:
Workspace setup initializes a Python project (with `uv init`), installs `verifiers` (with `uv add verifiers`), creates the recommended workspace structure, and downloads useful starter files:
To add `verifiers` to an existing project:
Initializing an environment creates `my_env` with a basic environment template.
Each environment module exposes a `load_environment` function, which returns an instance of the `Environment` object and can accept custom arguments. For example:
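A minimal sketch of that shape follows. So that it runs on its own, it uses a stand-in `Environment` dataclass rather than the real verifiers class, and the custom arguments `num_examples` and `max_operand` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Environment:
    """Stand-in for illustration only; not the verifiers Environment class."""

    dataset: list
    reward_funcs: list = field(default_factory=list)


def exact_match(completion: str, answer: str) -> float:
    """Reward 1.0 when the stripped completion equals the reference answer."""
    return 1.0 if completion.strip() == answer else 0.0


def load_environment(num_examples: int = 3, max_operand: int = 10) -> Environment:
    """Build an environment; both keyword arguments are hypothetical options."""
    dataset = [
        {"question": f"What is {i} + {max_operand}?", "answer": str(i + max_operand)}
        for i in range(num_examples)
    ]
    return Environment(dataset=dataset, reward_funcs=[exact_match])


# Callers can override the defaults to get a differently sized environment.
env = load_environment(num_examples=2, max_operand=5)
```

Keeping all configuration in keyword arguments with defaults means the same module can be loaded unchanged or customized at call time.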
Configure API endpoints in `./configs/endpoints.toml`.
View local evaluation results in the terminal UI:
Press `c` to open Copy Mode for prompt/completion text; highlight and press `c` again to copy.
To publish the environment to the Environments Hub, do: