Getting Started

Welcome to Verifiers! This library provides a flexible framework for creating RL environments and evaluations with custom multi-turn interaction protocols.

What is Verifiers?

Verifiers enables you to:

Define custom interaction protocols between models and environments
Build agents, multi-turn conversations, tool-augmented reasoning, and interactive games
Create reusable evaluation environments with multi-criteria reward functions
Train models with the included RL trainer (via vf-rl) or integrate with other RL frameworks

Key features:

First-class OpenAI-compatibility for ChatCompletions and Completions
Extensible multi-turn interactions via MultiTurnEnv
Native tool calling support with ToolEnv
Modular reward functions through Rubric classes
End-to-end async compatibility with sync support where you want it
Full-spectrum scaling from CPU evaluations in Jupyter to multi-node GPU RL training
Environments as Python modules for easy installation, sharing, and reuse

Installation

Basic Installation

For evaluation and API model usage:

uv add verifiers

Training Support

For RL training with the included trainer:

uv add 'verifiers[rl]'

To use the latest main with RL extras:

uv add 'verifiers[rl] @ git+https://github.com/PrimeIntellect-ai/verifiers.git@main'

Latest Development Version

To use the latest main branch:

uv add verifiers@git+https://github.com/PrimeIntellect-ai/verifiers.git

Development Setup

For contributing to verifiers:

git clone https://github.com/PrimeIntellect-ai/verifiers.git
cd verifiers
uv sync --all-extras && uv pip install flash-attn --no-build-isolation
uv run pre-commit install

Integration with prime-rl

For large-scale FSDP training, see prime-rl.

Integration with Prime Intellect Environments Hub

Coming soon.

Documentation

Overview — Core concepts and architecture. Start here if you’re new to Verifiers to understand how environments orchestrate interactions. Environments — Creating custom interaction protocols with MultiTurnEnv, ToolEnv, and basic rubrics.

Advanced Usage

Components — Advanced rubrics, tools, parsers, with practical examples. Covers judge rubrics, tool design, and complex workflows. Training — GRPO training and hyperparameter tuning. Read this when you’re ready to train models with your environments.

Reference

Development — Contributing to verifiers Type Reference — Understanding data structures

Docs

​What is Verifiers?

​Installation

​Basic Installation

​Training Support

​Latest Development Version

​Development Setup

​Integration with prime-rl

​Integration with Prime Intellect Environments Hub

​Documentation

​Getting Started

​Advanced Usage

​Reference