This guide covers setup, testing, and contributing to the verifiers package.
Table of Contents
- Setup
- Running Tests
- Writing Tests
- Contributing
- Common Issues
- Environment Development
- Quick Reference
Setup
Prerequisites
- Python 3.13 recommended for CI parity with Ty checks
- uv package manager
Installation
# Clone and install for development
git clone https://github.com/PrimeIntellect-ai/verifiers.git
cd verifiers
# CPU-only development:
uv sync
# GPU-based trainer development:
uv sync --all-extras
# Install pre-commit hooks (including pre-push Ty gate):
uv run pre-commit install
Project Structure
verifiers/
├── verifiers/ # Main package
│ ├── envs/ # Environment classes
│ │ ├── integrations/ # Third-party wrappers (TextArena, ReasoningGym)
│ │ └── experimental/ # Newer environments (MCP, Harbor, etc.)
│ ├── parsers/ # Parser classes
│ ├── rubrics/ # Rubric classes
│ ├── rl/ # Training infrastructure
│ │ ├── inference/ # vLLM server utilities
│ │ └── trainer/ # Trainer implementation
│ ├── cli/ # Prime-facing CLI modules and plugin exports
│ ├── scripts/ # Compatibility wrappers around verifiers/cli commands
│ └── utils/ # Utilities
├── environments/ # Installable environment modules
├── configs/ # Example training configurations
├── tests/ # Test suite
└── docs/ # Documentation
Prime CLI Plugin Export
Verifiers exports a plugin that prime consumes, so command behavior is sourced from verifiers modules.
Entry point:
from verifiers.cli.plugins.prime import get_plugin
plugin = get_plugin()
The plugin exposes:
- api_version (current: 1)
- command modules:
  - eval_module (verifiers.cli.commands.eval)
  - gepa_module (verifiers.cli.commands.gepa)
  - install_module (verifiers.cli.commands.install)
  - init_module (verifiers.cli.commands.init)
  - setup_module (verifiers.cli.commands.setup)
  - build_module (verifiers.cli.commands.build)
- build_module_command(module_name, args) to construct a subprocess invocation for a command module
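The shape of this contract can be sketched with a stand-in class. The field and method names below come from this guide; the module map and the `python -m` command construction are illustrative assumptions, not the real implementation in verifiers/cli/plugins/prime.py:

```python
# Hedged stand-in for the plugin contract described above.
# The real PrimeCLIPlugin lives in verifiers/cli/plugins/prime.py;
# the command construction here is an illustrative guess.
import sys

class PrimeCLIPlugin:
    api_version = 1

    # Map of command names to module paths, mirroring the exported modules
    modules = {
        "eval": "verifiers.cli.commands.eval",
        "install": "verifiers.cli.commands.install",
    }

    def build_module_command(self, module_name, args):
        # Build a `python -m <module> <args...>` subprocess invocation
        return [sys.executable, "-m", self.modules[module_name], *args]

plugin = PrimeCLIPlugin()
cmd = plugin.build_module_command("eval", ["--help"])
# cmd ends with: ["-m", "verifiers.cli.commands.eval", "--help"]
```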
Contributor guidance:
- Add new prime-facing command logic under verifiers/cli/commands/.
- Export new command modules through PrimeCLIPlugin in verifiers/cli/plugins/prime.py.
- Keep verifiers/scripts/* as thin compatibility wrappers that call into verifiers/cli.
Running Tests
# Run all tests
uv run pytest tests/
# Run with coverage
uv run pytest tests/ --cov=verifiers --cov-report=html
# Run specific test file
uv run pytest tests/test_parser.py
# Stop on first failure with verbose output
uv run pytest tests/ -xvs
# Run tests matching a pattern
uv run pytest tests/ -k "xml_parser"
# Run environment tests
uv run pytest tests/test_envs.py -vv
# Run environment tests across all CPU cores
uv run pytest -n auto tests/test_envs.py -vv
# Run specific environment tests
uv run pytest tests/test_envs.py -k math_python
The test suite includes 380+ tests covering parsers, rubrics, environments, and utilities.
Writing Tests
Test Structure
class TestFeature:
    """Test the feature functionality."""

    def test_basic_functionality(self):
        """Test normal operation."""
        # Arrange
        feature = Feature()
        # Act
        result = feature.process("input")
        # Assert
        assert result == "expected"

    def test_error_handling(self):
        """Test error cases."""
        with pytest.raises(ValueError):
            Feature().process(invalid_input)
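To make the skeleton concrete, here is a minimal self-contained version of the same Arrange/Act/Assert pattern, with a toy Feature class (hypothetical, not part of verifiers) standing in:

```python
# Toy stand-in for Feature (hypothetical), to make the skeleton concrete.
class Feature:
    def process(self, text):
        if not isinstance(text, str):
            raise ValueError("input must be a string")
        return text.upper()

# Arrange / Act / Assert
feature = Feature()
result = feature.process("input")
assert result == "INPUT"

# Error case: non-string input should raise ValueError
raised = False
try:
    feature.process(None)
except ValueError:
    raised = True
assert raised
```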
Using Mocks
The test suite provides a MockClient in conftest.py that implements the Client interface:
def test_with_mock(mock_client):
    mock_client.set_default_responses(chat_response="test answer")
    env = vf.SingleTurnEnv(client=mock_client, model="test", ...)
    # Test without real API calls
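The idea behind such a test double can be sketched in a few lines. The real MockClient lives in tests/conftest.py and implements the full Client interface; only set_default_responses below is taken from this guide, and the chat method is an illustrative assumption:

```python
# Hedged sketch of a MockClient-style test double.
# set_default_responses mirrors the fixture described above;
# the chat method is illustrative, not the real interface.
class MockClient:
    def __init__(self):
        self._chat_response = ""

    def set_default_responses(self, chat_response=""):
        self._chat_response = chat_response

    def chat(self, messages):
        # Return the canned response regardless of input
        return self._chat_response

client = MockClient()
client.set_default_responses(chat_response="test answer")
response = client.chat([{"role": "user", "content": "hi"}])
assert response == "test answer"
```

Because the canned response is fixed up front, tests stay fast and deterministic with no real API calls.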
Guidelines
- Test both success and failure cases
- Use descriptive test names that explain what's being tested
- Leverage existing fixtures from conftest.py
- Group related tests in test classes
- Keep tests fast - use mocks instead of real API calls
Contributing
Workflow
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make changes following existing patterns
- Add tests for new functionality
- Run tests: uv run pytest tests/
- Install hooks once per clone: uv run pre-commit install
- Commit and push (hooks run automatically on each commit/push)
- Update docs if adding/changing public APIs
- Submit a PR with a clear description
Code Style
- Strict ruff enforcement via pre-commit hooks
- ty runs in the pre-push hook via uv run --python 3.13 ty check verifiers
- Use type hints for function parameters and returns
- Write docstrings for public functions/classes
- Keep functions focused and modular
- Fail fast, fail loud - no defensive programming or silent fallbacks
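The style points above can be combined in one small example. The helper below is hypothetical (not a verifiers API): typed signature, docstring, and a loud ValueError instead of a silent fallback when input is malformed:

```python
# Hypothetical helper illustrating the expected style:
# type hints, docstring, fail-fast error handling.
def extract_answer(text: str, tag: str = "answer") -> str:
    """Return the content of the first <tag>...</tag> block in text.

    Raises ValueError if the tag is missing (fail fast, fail loud).
    """
    start = text.find(f"<{tag}>")
    end = text.find(f"</{tag}>")
    if start == -1 or end == -1:
        raise ValueError(f"no <{tag}> block found")
    # Skip past the opening tag (len(tag) + 2 for the angle brackets)
    return text[start + len(tag) + 2 : end].strip()

answer = extract_answer("<answer> 42 </answer>")
assert answer == "42"
```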
Common Issues
Import Errors
# Ensure package is installed in development mode
uv sync
Integration Tests
# Install optional dependencies for specific integrations
uv sync --extra ta # for TextArenaEnv
uv sync --extra rg # for ReasoningGymEnv
Test Failures
# Debug specific test
uv run pytest tests/test_file.py::test_name -vvs --pdb
Environment Development
Creating a New Environment Module
# Initialize template
prime env init my-environment
# Install locally for testing
prime env install my-environment
# Test your environment
prime eval run my-environment -m openai/gpt-4.1-mini -n 5
Environment Module Structure
# my_environment.py
import verifiers as vf

def load_environment(**kwargs):
    """Load the environment."""
    dataset = vf.load_example_dataset("dataset_name")
    parser = vf.XMLParser(fields=["reasoning", "answer"])

    def reward_func(parser, completion, answer, **kwargs):
        return 1.0 if parser.parse_answer(completion) == answer else 0.0

    rubric = vf.Rubric(
        funcs=[reward_func, parser.get_format_reward_func()],
        weights=[1.0, 0.2],
        parser=parser,
    )

    return vf.SingleTurnEnv(
        dataset=dataset,
        parser=parser,
        rubric=rubric,
        **kwargs,
    )
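In isolation, the reward function above reduces to an exact-match check. With a stand-in parser (the real XMLParser extracts the answer field; the stub below just strips whitespace), it behaves like this:

```python
# StubParser is a stand-in for vf.XMLParser, for illustration only.
class StubParser:
    def parse_answer(self, completion):
        # Real parsers extract the <answer> field; this stub
        # just returns the completion stripped of whitespace.
        return completion.strip()

# Same reward function as in the module above
def reward_func(parser, completion, answer, **kwargs):
    return 1.0 if parser.parse_answer(completion) == answer else 0.0

match_reward = reward_func(StubParser(), " 42 ", "42")
miss_reward = reward_func(StubParser(), "41", "42")
assert match_reward == 1.0
assert miss_reward == 0.0
```

The format reward from parser.get_format_reward_func() is then blended in by the Rubric with a smaller weight (0.2 vs. 1.0), so correctness dominates the total score.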
Quick Reference
Essential Commands
# Development setup
uv sync # CPU-only
uv sync --all-extras # With RL/training extras
uv run pre-commit install # One-time per clone (installs pre-commit + pre-push)
# Run tests
uv run pytest tests/ # All tests
uv run pytest tests/ -xvs # Debug mode
uv run pytest tests/ --cov=verifiers # With coverage
# Run environment tests
uv run pytest tests/test_envs.py -vv # All environments
uv run pytest tests/test_envs.py -k math_python # Specific environment
# Linting
uv run ruff check --fix . # Fix lint errors
uv run ruff format --check verifiers tests # Verify Python formatting
uv run ty check verifiers # Type check (matches CI Ty target)
# Environment tools
prime env init new-env # Create environment
prime env install new-env # Install environment
prime eval run new-env -m openai/gpt-4.1-mini -n 5 # Test environment
prime eval tui # Browse evals in the tree browser
| Command | Description |
|---|---|
| prime eval run | Run evaluations on environments |
| prime env init | Initialize a new environment from a template |
| prime env install | Install an environment module |
| prime lab setup | Set up a training workspace |
| prime eval tui | Terminal UI for browsing evals and rollout details |
| prime rl run | Launch Hosted Training |
| uv run prime-rl | Launch prime-rl training |
Project Guidelines
- Environments: installable modules with a load_environment() function
- Parsers: extract structured data from model outputs
- Rubrics: define multi-criteria evaluation functions
- Tests: comprehensive coverage with mocks for external dependencies
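The parser role above can be illustrated with a minimal stand-in. TinyXMLParser below is a toy (verifiers' real XMLParser has a richer API); it only shows the core idea of pulling named fields out of tagged model output:

```python
# Toy parser illustrating the "extract structured data" role.
# Not the real vf.XMLParser - a minimal sketch of the idea.
import re

class TinyXMLParser:
    def __init__(self, fields):
        self.fields = fields

    def parse(self, text):
        out = {}
        for field in self.fields:
            # Grab the first <field>...</field> block, if present
            m = re.search(rf"<{field}>(.*?)</{field}>", text, re.DOTALL)
            out[field] = m.group(1).strip() if m else None
        return out

parser = TinyXMLParser(fields=["reasoning", "answer"])
parsed = parser.parse("<reasoning>2 + 2</reasoning><answer>4</answer>")
assert parsed == {"reasoning": "2 + 2", "answer": "4"}
```

A rubric then scores the parsed fields against the reference answer, keeping parsing and evaluation as separate, composable pieces.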