Hosted Training supports a range of open-weights models. This page lists the currently available models, their pricing, and guidance on choosing the right model for your use case.
Available Models
Prices are in USD per million tokens, billed separately for input, output, and training.

| Model | Input ($ / 1M) | Output ($ / 1M) | Train ($ / 1M) |
|---|---|---|---|
| Qwen/Qwen3.5-0.8B | 0.02 | 0.06 | 0.06 |
| Qwen/Qwen3.5-2B | 0.05 | 0.15 | 0.15 |
| Qwen/Qwen3.5-4B | 0.10 | 0.30 | 0.30 |
| Qwen/Qwen3.5-9B | 0.20 | 0.60 | 0.60 |
| Qwen/Qwen3.5-35B-A3B | 0.25 | 0.75 | 1.00 |
| Qwen/Qwen3.5-122B-A10B | 0.50 | 1.50 | 2.00 |
| Qwen/Qwen3.5-397B-A17B | 1.00 | 3.00 | 4.00 |
| meta-llama/Llama-3.2-1B-Instruct | 0.02 | 0.06 | 0.06 |
| meta-llama/Llama-3.2-3B-Instruct | 0.05 | 0.15 | 0.15 |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | 0.15 | 0.45 | 0.60 |
| nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | 0.30 | 0.90 | 1.20 |
| openai/gpt-oss-20b | 0.10 | 0.30 | 0.40 |
| openai/gpt-oss-120b | 0.25 | 0.75 | 1.00 |
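
Because the three token categories are billed separately, a rough budget for a run can be worked out from the table. The following is a minimal sketch, not an official calculator: the `estimate_cost` helper and the token counts are hypothetical, the prices are copied from three rows of the table above, and how the platform counts tokens in each category is not specified on this page.

```python
# Prices in USD per 1M tokens, copied from the table on this page.
# Check the live pricing via the CLI before relying on these numbers.
PRICES = {
    "Qwen/Qwen3.5-0.8B": {"input": 0.02, "output": 0.06, "train": 0.06},
    "Qwen/Qwen3.5-35B-A3B": {"input": 0.25, "output": 0.75, "train": 1.00},
    "Qwen/Qwen3.5-397B-A17B": {"input": 1.00, "output": 3.00, "train": 4.00},
}


def estimate_cost(model, input_tokens, output_tokens, train_tokens):
    """Return an estimated cost in USD (hypothetical helper, for illustration)."""
    price = PRICES[model]
    total = (
        input_tokens * price["input"]
        + output_tokens * price["output"]
        + train_tokens * price["train"]
    )
    # Prices are quoted per million tokens.
    return total / 1_000_000


# Hypothetical run: 50M input, 200M output, and 250M training tokens.
cost = estimate_cost("Qwen/Qwen3.5-35B-A3B", 50_000_000, 200_000_000, 250_000_000)
print(f"Estimated cost: ${cost:.2f}")  # Estimated cost: $412.50
```

Running the same token counts against a smaller model first gives a quick sense of how much a validation run costs relative to the full run.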
Choosing a Model
For Validation and Debugging
Start with a small, fast model to verify that your environment and config work correctly before committing compute to a larger run. Recommended: Qwen/Qwen3.5-0.8B or meta-llama/Llama-3.2-1B-Instruct
For Experimentation
Mixture-of-experts (MoE) models with small active parameter counts give strong performance per token. Recommended: Qwen/Qwen3.5-35B-A3B
For Production Training
For serious training runs where you want the strongest possible results, use the larger MoE models. Recommended: Qwen/Qwen3.5-397B-A17B or nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
Thinking Mode
Qwen3.5 and Nemotron models support a thinking mode that produces extended chain-of-thought reasoning before the final answer. Toggle it via [sampling].enable_thinking in your config. Thinking mode tends to help on tasks that benefit from multi-step reasoning (math, code, logic), at the cost of longer outputs and therefore more billed output tokens.
Checking Available Models
Always use the CLI to check the current list of supported models. Add --output json to get the live pricing alongside the model list. The live list may differ from this page, because models are added regularly.