The model list is subject to change during the Private Beta as we adjust and expand our infrastructure. Run `prime rl models` for the most up-to-date list.

## Available Models
| Model | Parameters | Architecture | Notes |
|---|---|---|---|
| `meta-llama/Llama-3.2-1B-Instruct` | 1B | Dense | Smallest option, good for rapid prototyping |
| `HuggingFaceTB/SmolLM3-3B` | 3B | Dense | Compact model for lightweight tasks |
| `PrimeIntellect/Qwen3-0.6B-Reverse-Text-SFT` | 0.6B | Dense | Example SFT model for testing |
| `Qwen/Qwen3-4B-Instruct-2507` | 4B | Dense | Recommended for validation runs |
| `Qwen/Qwen3-4B-Thinking-2507` | 4B | Dense | Thinking variant with chain-of-thought |
| `Qwen/Qwen3-30B-A3B-Instruct-2507` | 30B (3B active) | MoE | Recommended for experimentation |
| `Qwen/Qwen3-30B-A3B-Thinking-2507` | 30B (3B active) | MoE | Thinking variant |
| `Qwen/Qwen3-235B-A22B-Instruct-2507` | 235B (22B active) | MoE | Production scale |
| `Qwen/Qwen3-235B-A22B-Thinking-2507` | 235B (22B active) | MoE | Thinking variant |
| `PrimeIntellect/INTELLECT-3` | 106B (12B active) | MoE | Prime Intellect’s own open-source model |
| `arcee-ai/Trinity-Mini` | — | — | Arcee’s Trinity model |
## Pricing
Models are currently free to use with rate limits during the Private Beta. Paid usage will be enabled in the near future. Pricing will be per million tokens, billed separately for input, output, and training tokens. Discounts for prefix cache hits are applied automatically.

## Choosing a Model
### For Validation and Debugging
Start with a small, fast model to verify your environment and config work correctly before committing compute to a larger run. Recommended: `Qwen/Qwen3-4B-Instruct-2507` or `meta-llama/Llama-3.2-1B-Instruct`.
### For Experimentation
MoE models with small active parameter counts give strong performance per token. Recommended: `Qwen/Qwen3-30B-A3B-Instruct-2507`.
### For Production Training
For serious training runs where you want the strongest possible results, use the larger MoE models. Recommended: `Qwen/Qwen3-235B-A22B-Instruct-2507` or `PrimeIntellect/INTELLECT-3`.
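The stage-based recommendations above can be condensed into a small helper. This is an illustrative sketch using model IDs from the table; the stage names and the `pick_model` function are this guide's convention, not part of any official API.

```python
# Map run stages to the recommended model IDs from the table above.
# Illustrative only -- the tiering reflects this guide's advice,
# not an official Prime Intellect API.
RECOMMENDED = {
    ("validation", "instruct"): "Qwen/Qwen3-4B-Instruct-2507",
    ("validation", "thinking"): "Qwen/Qwen3-4B-Thinking-2507",
    ("experimentation", "instruct"): "Qwen/Qwen3-30B-A3B-Instruct-2507",
    ("experimentation", "thinking"): "Qwen/Qwen3-30B-A3B-Thinking-2507",
    ("production", "instruct"): "Qwen/Qwen3-235B-A22B-Instruct-2507",
    ("production", "thinking"): "Qwen/Qwen3-235B-A22B-Thinking-2507",
}

def pick_model(stage: str, needs_reasoning: bool = False) -> str:
    """Return the recommended model ID for a given run stage."""
    variant = "thinking" if needs_reasoning else "instruct"
    return RECOMMENDED[(stage, variant)]

print(pick_model("validation"))
# -> Qwen/Qwen3-4B-Instruct-2507
print(pick_model("production", needs_reasoning=True))
# -> Qwen/Qwen3-235B-A22B-Thinking-2507
```

Parameterizing the model ID this way makes it easy to validate a config on a small model, then swap in a larger one for the real run.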
### Instruct vs. Thinking Variants
Models with `-Thinking` in the name are trained for extended chain-of-thought reasoning. They tend to produce longer outputs with explicit reasoning steps, which can be beneficial for tasks that require multi-step problem solving (math, code, logic). However, they also consume more tokens per response.
Use Instruct variants for tasks where concise responses are preferred or where the task is straightforward. Use Thinking variants for tasks that benefit from extended reasoning, such as math questions or complex coding problems.
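Because billing will be per million tokens, the longer outputs of Thinking variants translate directly into higher per-response cost. A minimal sketch of the billing model described under Pricing, with placeholder rates and discount factor (actual Private Beta prices have not been published):

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,  # portion of input served from the prefix cache
    input_rate: float = 0.50,      # $ per 1M input tokens (placeholder)
    output_rate: float = 2.00,     # $ per 1M output tokens (placeholder)
    cache_discount: float = 0.5,   # cached input billed at 50% (placeholder)
) -> float:
    """Estimate request cost under per-million-token billing with a
    prefix-cache discount. All rates are hypothetical."""
    uncached = input_tokens - cached_input_tokens
    cost = (
        uncached * input_rate
        + cached_input_tokens * input_rate * cache_discount
        + output_tokens * output_rate
    ) / 1_000_000
    return round(cost, 6)

# Same prompt, but a Thinking variant emitting ~4x the output tokens
# costs ~4x on the output side:
instruct_cost = estimate_cost(10_000, 500)
thinking_cost = estimate_cost(10_000, 2_000)
```

The exact numbers will differ once pricing is announced, but the shape of the trade-off holds: output length dominates the cost of Thinking-variant responses.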