This page covers common issues you may encounter when using Lab and Hosted Training, along with their solutions.

Environment Issues

The verifiers library is not installed in your project.
uv add verifiers
Or if you’re installing a specific environment:
prime env install my-env
This usually means a module name collision: your environment name conflicts with an existing Python package, or the environment wasn't installed correctly.
Solutions:
  • Rename your environment to avoid conflicts
  • Reinstall: prime env install my-env
  • Check that your environment module exposes a load_environment function at the top level (see the sketch below)
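For the last point, here is a minimal sketch of what the top-level module should look like. The function body is a placeholder; what you construct and return depends on your environment.
# Top-level module of your environment package (e.g. my_env/__init__.py)
def load_environment(**kwargs):
    """Entry point that must be importable at the top level of the module."""
    # Placeholder body: construct and return your verifiers environment here.
    raise NotImplementedError("build and return your environment")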
The environment ID doesn’t match any published environment.
# Check the exact ID
prime env info owner/environment-name

# Install a specific version
prime env install owner/environment-name@latest
Make sure you’re using the correct owner/name format and that you have access if it’s a private environment.
The environment requires API keys that aren't set. The error message will list which keys are missing.
For local evaluation:
export OPENAI_API_KEY=sk-...
For Hosted Training, add secrets to your config:
env_file = ["secrets.env"]
Or use the CLI to manage secrets:
prime secrets             # Global secrets
prime env secrets         # Per-environment secrets
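If a local run still reports a missing key after you exported it, confirm the variable is actually visible to the process you are running from. A minimal check (OPENAI_API_KEY is just the example from above; substitute whichever key the error names):
import os

for key in ("OPENAI_API_KEY",):
    print(key, "is set" if os.environ.get(key) else "is NOT set")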

Training Issues

The task is too hard for the model at its current capability level. The model can't solve any examples, so there's no reward signal to learn from.
Solutions:
  • Try a larger or more capable model
  • Use easier examples (filter your dataset or adjust environment args)
  • Increase max_tokens in [sampling] — the model may need more space to reason
  • Check your rubric implementation for bugs that might always return 0 (see the sketch after this list)
  • Run a baseline evaluation first: prime eval run my-env -m <model> -n 20 -r 1
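For the rubric check, the quickest test is to call the reward function directly on an output that should clearly score above zero. This is only a sketch; my_reward_fn is a hypothetical stand-in for whatever reward function your rubric actually uses:
# Sanity-check a reward function outside of training.
def my_reward_fn(completion: str, answer: str) -> float:
    return 1.0 if answer in completion else 0.0

known_good = "Working through the steps, the final answer is 42."
print(my_reward_fn(known_good, "42"))  # if this is always 0.0, the rubric is the bug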
The task is too easy: the model already solves everything.
Solutions:
  • Use harder examples or a more challenging dataset split
  • Add more demanding rubric criteria
  • Use a smaller model that has more room to improve
If reward isn't improving during training, there could be several causes:
  • Low reward diversity: If all rollouts for an example get the same reward, there's no contrast for the model to learn from. Increase rollouts_per_example (16–32) to get more variation (see the diagnostic sketch after this list).
  • Learning rate too low: Try increasing learning_rate (e.g., from 1e-4 to 3e-4).
  • Batch size too small: Larger batches provide more stable gradient estimates. Try batch_size = 512.
  • Task mismatch: The task may not be suitable for RL training. Ensure the reward function produces a meaningful gradient of scores, not just binary 0/1.
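To check reward diversity directly, group per-rollout rewards by example and count the distinct values in each group. A sketch, assuming you have collected (example_id, reward) pairs from an evaluation or training log; the data format and values below are illustrative, so adapt them to however you record rewards:
from collections import defaultdict
from statistics import pstdev

rollout_rewards = [("ex1", 0.0), ("ex1", 0.0), ("ex2", 1.0), ("ex2", 0.0)]

by_example = defaultdict(list)
for example_id, reward in rollout_rewards:
    by_example[example_id].append(reward)

for example_id, rewards in by_example.items():
    print(f"{example_id}: {len(set(rewards))} distinct reward(s), std={pstdev(rewards):.3f}")
# Examples whose rollouts all share one reward value provide no contrast to learn from.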
A field in your TOML config has the wrong type or an invalid value.
Common causes:
  • String values not quoted: write model = "Qwen/Qwen3-4B", not model = Qwen/Qwen3-4B
  • Integer where float expected or vice versa
  • Missing required sections like [sampling] or [[env]]
Double-check your config against the config reference.
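If the error message doesn't point at the offending field, parsing the file locally often will. A sketch using Python's standard-library tomllib (Python 3.11+); train.toml is a placeholder for your config file name, and where each key lives depends on your config:
import tomllib

# Unquoted strings and other syntax errors raise TOMLDecodeError with a line number.
with open("train.toml", "rb") as f:
    config = tomllib.load(f)

# Spot-check the sections listed above as required.
for section in ("sampling", "env"):
    if section not in config:
        print(f"warning: no [{section}] section found")
print("parsed OK; top-level keys:", sorted(config))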
The model you specified isn’t currently supported for Hosted Training.
# Check available models
prime rl models
The model list is subject to change during the beta period. See Models & Pricing for the current list.

CLI Issues

The CLI isn’t installed or isn’t on your PATH.
# Install or reinstall
uv tool install prime

# If already installed, upgrade
uv tool install -U prime

# Verify
prime --version   # Should be >= 0.5.15
If you installed with uv tool install and it’s still not found, make sure ~/.local/bin is in your PATH.
Your CLI session may have expired.
prime login
This opens a browser window to re-authenticate.
If a command fails and the error isn't clear, check the following:
  • Your config file is valid TOML (no syntax errors)
  • The model is available: prime rl models
  • The environment ID is correct and accessible
  • You’re authenticated: prime login
  • Your CLI is up to date: uv tool install -U prime

Evaluation Issues

If you're hitting rate limits, reduce concurrency when running evaluations:
prime eval run my-env -m openai/gpt-4.1-mini -n 100 -c 8
The -c flag controls maximum concurrent requests. Lower it if you’re hitting rate limits.
Qwen3 and DeepSeek-R1 models have chat templates that automatically remove <think> tags from message history. This conflicts with ThinkParser.
Solution: Use MaybeThinkParser or Parser instead of ThinkParser in your environment:
# Instead of:
parser = vf.ThinkParser(extract_fn=my_fn)

# Use:
parser = vf.MaybeThinkParser(extract_fn=my_fn)
If evaluation results are inconsistent between runs, check the following:
  • Check rollouts_per_example: Low values (1–2) produce noisy results. Use at least 3–5 for reliable metrics.
  • Check num_examples: Very small sample sizes can be misleading.
  • Check sampling temperature: High temperatures produce more variation between runs.
  • Check your rubric: Make sure reward functions handle edge cases such as empty responses and malformed outputs (see the sketch below).
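For the last point, a defensive reward function handles empty and unparsable completions explicitly instead of raising. A sketch; parse_answer is a hypothetical stand-in for however your environment extracts the final answer:
def parse_answer(completion: str) -> str:
    # Placeholder extraction: take the last non-empty line as the answer.
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no answer found")
    return lines[-1]

def reward_with_edge_cases(completion: str, answer: str) -> float:
    if not completion or not completion.strip():
        return 0.0  # empty response: score it, don't crash
    try:
        parsed = parse_answer(completion)
    except Exception:
        return 0.0  # malformed output: treat as incorrect
    return 1.0 if parsed == answer else 0.0

print(reward_with_edge_cases("", "42"))                    # 0.0
print(reward_with_edge_cases("The answer is:\n42", "42"))  # 1.0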

Environment Development Issues

If your environment works locally but fails on Hosted Training, common causes include:
  • Missing dependencies: Make sure all required packages are listed in your environment’s pyproject.toml
  • Missing secrets: API keys available locally may not be set for hosted runs. Use env_file or prime secrets.
  • Hardcoded paths: Avoid absolute file paths in your environment code
  • Network access: Some external APIs may not be reachable from the hosted environment
Start with a local evaluation using verbose output:
prime eval run my-env -m openai/gpt-4.1-mini -n 5 -r 1 -v
The -v flag enables verbose logging. You can also test your environment directly in Python:
from verifiers import load_environment

env = load_environment("my-env")
# Inspect the dataset, rubric, etc.
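Continuing from the snippet above, dir() shows what the environment object exposes without guessing attribute names:
from verifiers import load_environment

env = load_environment("my-env")
print(type(env).__name__)
print([name for name in dir(env) if not name.startswith("_")])  # public attributes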

Getting Help

If your issue isn’t covered here:
  • Discord: Join the Prime Intellect Discord for community support and Q&A
  • Research Support: Fill out the research support form for hands-on assistance
  • Feedback: Use the thumbs-up/down on any docs page to let us know what’s helpful or missing