About
Environments Hub is a community-powered platform for aggregating and showcasing environments, both for RL training and downstream evaluation. You can view all available environments on our platform hub.
Motivation
There are a few inter-related issues we see with the current ecosystem for both evals and RL environments which we’re aiming to address with this hub:
- Despite the rapidly growing interest in training LLMs with RL, there is currently no established community platform for exploring and sharing train-ready environments.
- Environment implementations are often tied to a specific RL training stack and can be difficult to adapt to a new trainer.
- Popular evaluation suites (lm_eval, lighteval, openbench, simple-evals, HELM) offer convenient entrypoints into many single-turn Q&A evals, but these suites generally lack support for tasks which are agentic in nature or require complex infrastructure setups (TAU-bench, TerminalBench, SWE-bench), resulting in a proliferation of independent eval repos without shared entrypoints or specs.
- RL environments and agent evals are basically the same thing (dataset + harness + scoring rules), but current open-source efforts generally treat them as fundamentally separate.
- Realistic agent environments can be complex pieces of software requiring dependencies and versioning, and are ill-served by monorepo structures for environment collections which can quickly become unmaintainable.
Environments on the Hub are Python packages, each defined by a pyproject.toml, and are distributed as wheels. By adopting the verifiers spec, development efforts can focus on task-specific components (datasets, tools or harnesses, reward functions) and automatically leverage existing infrastructure for running evaluations or training models with RL.
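To make the "dataset + harness + scoring rules" framing concrete, here is a minimal sketch in plain Python of how one environment object could serve both evaluation and RL data collection. Every name in it (Environment, rollouts, evaluate, training_batch, and the toy helpers) is hypothetical and chosen for illustration; it does not use or reproduce the verifiers API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# NOTE: all names here are hypothetical illustrations, not the verifiers API.
Policy = Callable[[str], str]  # maps a prompt to a completion


@dataclass
class Environment:
    dataset: list[dict]                    # examples: {"prompt": ..., "answer": ...}
    harness: Callable[[Policy, str], str]  # runs a policy on a prompt, returns the completion
    score: Callable[[str, str], float]     # scoring rule: (completion, answer) -> reward

    def rollouts(self, policy: Policy) -> list[dict]:
        """Run the policy on every example and attach a reward."""
        results = []
        for example in self.dataset:
            completion = self.harness(policy, example["prompt"])
            reward = self.score(completion, example["answer"])
            results.append(
                {"prompt": example["prompt"], "completion": completion, "reward": reward}
            )
        return results


def evaluate(env: Environment, policy: Policy) -> float:
    """Evaluation view: report the mean reward over the dataset."""
    rewards = [r["reward"] for r in env.rollouts(policy)]
    return sum(rewards) / len(rewards)


def training_batch(env: Environment, policy: Policy) -> list[tuple[str, str, float]]:
    """RL view: the same rollouts become (prompt, completion, reward) samples."""
    return [(r["prompt"], r["completion"], r["reward"]) for r in env.rollouts(policy)]


# Toy task-specific components: a single-turn harness and an exact-match reward.
def single_turn_harness(policy: Policy, prompt: str) -> str:
    return policy(prompt)


def exact_match(completion: str, answer: str) -> float:
    return 1.0 if completion.strip() == answer else 0.0


toy_env = Environment(
    dataset=[
        {"prompt": "2 + 2 =", "answer": "4"},
        {"prompt": "3 * 3 =", "answer": "9"},
    ],
    harness=single_turn_harness,
    score=exact_match,
)


def toy_policy(prompt: str) -> str:
    # Stand-in for a model: always answers "4".
    return "4"


mean_reward = evaluate(toy_env, toy_policy)
batch = training_batch(toy_env, toy_policy)
```

In this sketch the only difference between an eval and an RL environment is what the caller does with the rollouts: evaluate aggregates rewards into a score, while training_batch hands the same samples to a trainer. A real environment would swap in its own dataset, harness (for example a multi-turn tool loop) and reward functions.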
Resources
- Prime CLI: Command-line tool to install, upload and manage environments
  - GitHub: https://github.com/PrimeIntellect-ai/prime-cli
  - Documentation: CLI Overview
- Verifiers: A library of modular components for creating RL environments and training LLM agents.
  - GitHub: https://github.com/willccbb/verifiers
  - Documentation: https://verifiers.readthedocs.io/en/latest/
- Prime RL: A library for large-scale RL training with FSDP.