This guide covers setup, testing, and contributing to the verifiers package.
Table of Contents
- Setup
- Project Structure
- Prime CLI Plugin Export
- Running Tests
- Writing Tests
- Contributing
- Contributor Practices
- Common Issues
- Environment Development
- Quick Reference
Setup
Prerequisites
- Python 3.13 recommended for CI parity with Ty checks
- uv package manager
Installation
Project Structure
Prime CLI Plugin Export
Verifiers exports a plugin consumed by prime so command behavior is sourced from verifiers modules.
Entry point:
- api_version (current: 1)
- command modules:
  - eval_module (verifiers.cli.commands.eval)
  - gepa_module (verifiers.cli.commands.gepa)
  - install_module (verifiers.cli.commands.install)
  - init_module (verifiers.cli.commands.init)
  - setup_module (verifiers.cli.commands.setup)
  - build_module (verifiers.cli.commands.build)
- build_module_command(module_name, args) to construct a subprocess invocation for a command module
- Add new prime-facing command logic under verifiers/cli/commands/.
- Export new command modules through PrimeCLIPlugin in verifiers/cli/plugins/prime.py.
- Keep verifiers/scripts/* as thin compatibility wrappers that call into verifiers/cli.
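To make the shape of the exported plugin concrete, here is a minimal sketch. The class name PluginSketch is hypothetical and stands in for the real plugin class. The way build_module_command assembles its result (current interpreter plus -m) is an assumption for illustration, not the confirmed implementation. Only the field names and module paths come from the list above.

```python
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PluginSketch:
    """Hypothetical stand-in for the exported plugin; not the real class."""

    api_version: int = 1
    eval_module: str = "verifiers.cli.commands.eval"
    gepa_module: str = "verifiers.cli.commands.gepa"
    install_module: str = "verifiers.cli.commands.install"
    init_module: str = "verifiers.cli.commands.init"
    setup_module: str = "verifiers.cli.commands.setup"
    build_module: str = "verifiers.cli.commands.build"

    def build_module_command(self, module_name: str, args: Sequence[str]) -> list[str]:
        # Assumption: a command module is launched with the current
        # interpreter via "-m"; the real plugin may build this differently.
        return [sys.executable, "-m", module_name, *args]


plugin = PluginSketch()
command = plugin.build_module_command(plugin.eval_module, ["--help"])
print(command)
```

Under this assumption, the host CLI only needs the module path and argument list to spawn a subprocess, so the command logic itself can stay inside verifiers.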
Running Tests
Writing Tests
Test Structure
Using Mocks
The test suite provides a MockClient in conftest.py that implements the Client interface.
Guidelines
- Test both success and failure cases
- Use descriptive test names that explain what’s being tested
- Leverage existing fixtures from conftest.py
- Group related tests in test classes
- Keep tests fast - use mocks instead of real API calls
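The guidelines above can be sketched as a small test class. FakeClient and extract_answer are hypothetical names invented for this example. They do not reflect the actual MockClient or Client interface in conftest.py, which should be preferred in real tests.

```python
from dataclasses import dataclass, field


@dataclass
class FakeClient:
    """Hypothetical mock: returns a canned reply and records prompts."""

    reply: str
    prompts: list[str] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


def extract_answer(client: FakeClient, prompt: str) -> str:
    """Hypothetical code under test: fail loudly on an empty reply."""
    reply = client.complete(prompt).strip()
    if not reply:
        raise ValueError("empty reply from client")
    return reply


class TestExtractAnswer:
    """Related tests grouped in one class, covering success and failure."""

    def test_returns_stripped_reply_on_success(self) -> None:
        client = FakeClient(reply="  42\n")
        assert extract_answer(client, "What is 6 * 7?") == "42"
        assert client.prompts == ["What is 6 * 7?"]

    def test_raises_on_empty_reply(self) -> None:
        client = FakeClient(reply="   ")
        # With pytest available, pytest.raises(ValueError) says this more directly.
        try:
            extract_answer(client, "anything")
        except ValueError as exc:
            assert "empty reply" in str(exc)
        else:
            raise AssertionError("expected ValueError")
```

Because the fake client returns a canned string, both tests run without network access or API credentials.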
Contributing
Workflow
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make changes following existing patterns
- Add tests for new functionality
- Run tests: uv run pytest tests/
- Install hooks once per clone: uv run pre-commit install
- Commit and push (hooks run automatically on each commit/push)
- Update docs if adding/changing public APIs
- Submit PR with clear description
Code Style
- Strict ruff enforcement via pre-commit hooks
- ty runs in the pre-push hook via uv run --python 3.13 ty check verifiers
- Use type hints for function parameters and returns
- Write docstrings for public functions/classes
- Keep functions focused and modular
- Fail fast, fail loud - no defensive programming or silent fallbacks
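As an illustration of the last point, the sketch below contrasts a silent fallback with a version that fails immediately. Both function names and the max_tokens setting are hypothetical, chosen only for the example.

```python
def parse_max_tokens_silent(raw: str) -> int:
    """Discouraged: a bad value is swallowed and replaced by a default."""
    try:
        return int(raw)
    except ValueError:
        return 1024  # the caller never learns the input was invalid


def parse_max_tokens(raw: str) -> int:
    """Preferred: invalid input raises immediately with a clear message."""
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"max_tokens must be positive, got {value}")
    return value
```

The second version surfaces a misconfiguration at the point where it enters the program instead of letting a default mask it.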
PR Checklist
- Tests pass locally (uv run pytest tests/)
- Pre-commit and pre-push hooks pass on latest commit/push
- Added tests for new functionality
- Updated documentation if needed
Contributor Practices
Public Surface
Treat public config, docs, starter examples, skills, and generated agent guidance as one surface. If a behavior changes for users, update all matching surfaces in the same patch.
For TOML config, keep one shape across eval, GEPA, RL, and Hosted Training. Normalize old or alternate inputs at the loader boundary, then keep examples on the current golden path.
For v1 Taskset/Harness environments, put task data, task-owned tools, user behavior, metrics, rewards, and task-specific configuration on the Taskset. Use the base vf.Harness unless the harness owns a reusable execution adapter such as a CLI, framework program, sandboxed program, or nested harness flow.
Validation By Change Type
- Core runtime or shared config parsing: run the focused unit tests plus uv run pre-commit run --all-files.
- Example environment behavior: run the focused tests and a real prime eval run smoke test when credentials and endpoint access are available.
- Environment packaging: exercise tests/test_envs.py for the changed environment so a fresh venv installs the environment package and its dependencies.
- Docs or generated agent guidance: run uv run python scripts/sync.py and include the regenerated files.
- Release prep: verify the version source, release notes commit range, uv build, and final worktree status.
- PR/CI follow-up: inspect the live review thread, check run, or log before patching, then rerun the smallest check that proves the fix.
Downstream Checks
Before changing dependencies, optional extras, lockfiles, exported config fields, or upload/eval metadata, trace the consumers in prime-cli, prime-rl, Hosted Training, and public docs when they are in scope. Update the consumer or document the compatibility boundary rather than assuming transitive behavior remains safe.
Common Issues
Import Errors
Integration Tests
Test Failures
Environment Development
Creating a New Environment Module
Environment Module Structure
Quick Reference
Essential Commands
CLI Tools
| Command | Description |
|---|---|
| prime eval run | Run evaluations on environments |
| prime env init | Initialize new environment from template |
| prime env install | Install environment module |
| prime lab setup | Set up training workspace |
| prime eval view | Terminal UI for browsing evals and rollout details |
| prime rl run | Launch Hosted Training |
Project Guidelines
- Environments: Installable modules with load_environment() function
- Parsers: Extract structured data from model outputs
- Rubrics: Define multi-criteria evaluation functions
- Tests: Comprehensive coverage with mocks for external dependencies