This page covers common issues you may encounter when using Lab and Hosted Training, along with their solutions.
## Environment Issues
### ModuleNotFoundError: No module named 'verifiers'
The `verifiers` package isn't installed in your active Python environment. Install it (for example, `pip install verifiers`) or reinstall your environment with `prime env install my-env`.
### load_environment not found
- Rename your environment to avoid conflicts
- Reinstall: `prime env install my-env`
- Check that your environment module exposes a `load_environment` function at the top level (see the sketch below)
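For illustration, a minimal sketch (the submodule name is hypothetical): if `load_environment` is defined in a submodule, re-export it from the package root so `from my_env import load_environment` works.

```python
# my_env/__init__.py: re-export load_environment at the package top level
# (the submodule name `environment` is illustrative)
from .environment import load_environment

__all__ = ["load_environment"]
```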
### Environment not found on the Hub
Check that the environment ID uses the `owner/name` format and that you have access if it's a private environment.
### MissingKeyError when loading environment
The environment requires an API key or other secret that isn't set. Link the required secrets to your environment via the Environments Hub, or provide them locally (for example via `env_file` in your config).
## Training Issues
### Reward is always 0.0
- Try a larger or more capable model
- Use easier examples (filter your dataset or adjust environment args)
- Increase `max_tokens` in `[sampling]` — the model may need more space to reason
- Check your rubric implementation for bugs that might always return 0
- Run a baseline evaluation first: `prime eval run my-env -m <model> -n 20 -r 1`
### Reward is always 1.0
- Use harder examples or a more challenging dataset split
- Add more demanding rubric criteria
- Use a smaller model that has more room to improve
### Reward is not changing / training seems stuck
- Low reward diversity: If all rollouts for an example get the same reward, there's no contrast for the model to learn from. Increase `rollouts_per_example` (16–32) to get more variation.
- Learning rate too low: Try increasing `learning_rate` (e.g., from `1e-4` to `3e-4`).
- Batch size too small: Larger batches provide more stable gradient estimates. Try `batch_size = 512`.
- Task mismatch: The task may not be suitable for RL training. Ensure the reward function produces a meaningful gradient of scores, not just binary 0/1 (see the sketch below).
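As a hedged illustration (the function below is hypothetical, written in the style of a `verifiers` reward function), a rubric can award partial credit so rollouts receive contrastive scores instead of a flat 0 or 1:

```python
import verifiers as vf

def partial_credit(completion, answer, **kwargs) -> float:
    # Completions may be chat-formatted; take the last message's content
    text = completion[-1]["content"] if isinstance(completion, list) else completion
    if answer in text:
        return 1.0  # full credit for an exact match
    # Otherwise score word overlap in [0, 0.5) rather than returning a flat 0,
    # so near-misses are distinguishable from total failures
    gold = set(answer.split())
    overlap = len(gold & set(text.split())) / max(len(gold), 1)
    return 0.5 * overlap

rubric = vf.Rubric(funcs=[partial_credit], weights=[1.0])
```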
### Pydantic validation error in config
- String values not quoted: `model = Qwen/Qwen3-4B` → `model = "Qwen/Qwen3-4B"`
- Integer where float expected, or vice versa
- Missing required sections like `[sampling]` or `[[env]]` (see the example below)
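Putting those rules together, a hedged sketch of a config that parses cleanly (values and key placement are illustrative; check the config reference for the exact schema):

```toml
model = "Qwen/Qwen3-4B"  # strings must be quoted

[sampling]
max_tokens = 2048        # integer, not a quoted string

[[env]]
id = "owner/my-env"      # the key name here is illustrative
```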
### Model not available
Check that the model ID is spelled correctly and that it appears in the list of supported models: `prime train models`.
## CLI Issues
### prime: command not found
If you installed with `uv tool install` and the command is still not found, make sure `~/.local/bin` is in your `PATH`.
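For example, add the uv tool directory to your shell profile (standard shell syntax; the path is uv's default install location):

```bash
export PATH="$HOME/.local/bin:$PATH"
```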
### Authentication failed / not logged in
Run `prime login` to authenticate, then retry the command.
### prime train run fails immediately
- Your config file is valid TOML (no syntax errors)
- The model is available: `prime train models`
- The environment ID is correct and accessible
- You're authenticated: `prime login`
- Your CLI is up to date: `uv tool install -U prime`
## Evaluation Issues
### Rate limit exceeded (429 errors)
The `-c` flag controls the maximum number of concurrent requests. Lower it if you're hitting rate limits.
### ThinkParser failures with Qwen3 models
Qwen3's chat template strips `<think>` tags from message history, which conflicts with `ThinkParser`. Solution: use `MaybeThinkParser` or `Parser` instead of `ThinkParser` in your environment.
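A minimal sketch of the swap (assuming a `verifiers`-style environment; the surrounding setup is omitted):

```python
import verifiers as vf

# Before: fails when <think> tags are stripped from message history
# parser = vf.ThinkParser()

# After: tolerate responses with or without a <think> block
parser = vf.MaybeThinkParser()
# or, if you don't need think-tag handling at all:
# parser = vf.Parser()
```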
### Evaluation results look wrong or inconsistent
- Check `rollouts_per_example`: Low values (1–2) produce noisy results. Use at least 3–5 for reliable metrics (see the example after this list).
- Check `num_examples`: Very small sample sizes can be misleading.
- Check sampling temperature: High temperatures produce more variation between runs.
- Check your rubric: Make sure reward functions handle edge cases (empty responses, malformed outputs, etc.).
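For example, reusing the flags from the baseline command shown earlier (the values here are illustrative):

```bash
# More rollouts per example (-r) and a larger sample (-n) reduce noise
prime eval run my-env -m <model> -n 50 -r 5
```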
## Environment Development Issues
### Environment works locally but fails in Hosted Training
- Missing dependencies: Make sure all required packages are listed in your environment's `pyproject.toml` (see the sketch below)
- Missing secrets: API keys available locally may not be set for hosted runs. Link them to your environment via the Environments Hub or use `env_file` in your config as a fallback.
- Hardcoded paths: Avoid absolute file paths in your environment code
- Network access: Some external APIs may not be reachable from the hosted environment
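For the dependency point, a hedged sketch of the relevant `pyproject.toml` fields (names and versions are illustrative):

```toml
[project]
name = "my-env"
version = "0.1.0"
dependencies = [
    "verifiers",
    "requests",  # list every package your environment imports at runtime
]
```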
### How do I debug my environment?
The `-v` flag enables verbose logging. You can also test your environment directly in Python.
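A minimal sketch (this assumes the `verifiers` loading and evaluation API; the model name and counts are illustrative):

```python
import verifiers as vf
from openai import OpenAI

# Load the installed environment by ID and run a few rollouts locally
env = vf.load_environment("my-env")
client = OpenAI()  # any OpenAI-compatible endpoint works

results = env.evaluate(client, "gpt-4.1-mini", num_examples=5, rollouts_per_example=1)
print(results.reward)  # inspect per-rollout rewards directly
```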
## Getting Help
If your issue isn't covered here:
- Discord: Join the Prime Intellect Discord for community support and Q&A
- Research Support: Fill out the research support form for hands-on assistance
- Feedback: Use the thumbs-up/down on any docs page to let us know what’s helpful or missing