> ## Documentation Index
> Fetch the complete documentation index at: https://docs.primeintellect.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Faqs

## Getting Started

### How do I quickly test my environment?

Use `prime eval run` with a small sample:

```bash theme={null}
prime eval run my-environment -m openai/gpt-4.1-mini -n 5
```

The `-s` flag prints sample outputs so you can see what's happening.

### How do I see what the model is outputting?

**If using `prime eval run`**: Results are saved automatically. Browse them interactively with:

```bash theme={null}
prime eval view
```

The TUI opens a single run browser (`environment -> model -> run`). Press `Enter` on a run to open rollout details, `b` to go back, `tab` to cycle panes, `e` and `x` to expand or collapse history, `pageup` and `pagedown` to scroll history, and `c` for Copy Mode.

**If using the Python API** (`env.generate()` / `env.evaluate()`):

```python theme={null}
vf.print_prompt_completions_sample(outputs, n=3)
```

### How do I enable debug logging?

Set the `VF_LOG_LEVEL` environment variable:

```bash theme={null}
VF_LOG_LEVEL=DEBUG prime eval run my-environment -m openai/gpt-4.1-mini -n 5
```

## Environments

### Which environment class should I use?

* **SingleTurnEnv**: One prompt, one response (Q\&A, classification)
* **MultiTurnEnv**: Custom back-and-forth interaction (games, simulations)
* **ToolEnv**: Model calls Python functions (search, calculator)
* **StatefulToolEnv**: Tools that need per-rollout state (sandbox IDs, sessions)

### What does `max_turns=-1` mean?

Unlimited turns. The rollout continues until a stop condition is triggered (e.g., model stops calling tools, or a custom condition you define).

### How do I add a custom stop condition?

Use the `@vf.stop` decorator on a method that returns `True` to end the rollout:

```python theme={null}
@vf.stop
async def task_completed(self, state: State) -> bool:
    return "DONE" in state["completion"][-1]["content"]
```

### How do I handle tool call errors gracefully?

In `ToolEnv`, customize error handling:

```python theme={null}
env = ToolEnv(
    tools=[my_tool],
    error_formatter=lambda e: f"Error: {type(e).__name__}: {e}",
    stop_errors=[CriticalError],  # These errors end the rollout
)
```

Non-critical errors are returned to the model as tool responses so it can retry.

## Reward Functions

### What arguments can my reward function receive?

Reward functions receive any of these via `**kwargs`:

* `completion` - the model's response
* `answer` - ground truth from dataset
* `prompt` - the input prompt
* `state` - full rollout state
* `parser` - the rubric's parser (if set)
* `task` - `vf.Task` object for taskset-backed environments
* `info` - metadata dict from dataset

Just include the ones you need in your function signature.

### How do group reward functions work?

Group reward functions receive plural arguments (`completions`, `answers`, `states`) and return a list of floats. They're detected automatically by parameter names:

```python theme={null}
def relative_reward(completions: list, answers: list, **kwargs) -> list[float]:
    # Score all completions for an example together
    scores = [compute_score(c, a) for c, a in zip(completions, answers)]
    # Normalize relative to group
    max_score = max(scores) if scores else 1.0
    return [s / max_score for s in scores]
```

## Training

### How do I use a local vLLM server?

Point the client to your local server:

```python theme={null}
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

outputs = await env.evaluate(client, model="your-model-name", ...)
```

### Which `client_type` should I use for RL training?

Three options trade off control vs simplicity:

* **`openai_chat_completions`** (MITO) — server-side templating, text only. Standard OpenAI path. The trainer re-tokenizes for training, which can drift across multi-turn rollouts and fragment them into multiple samples.
* **`openai_chat_completions_token`** (TITO) — server-side templating, returns token IDs alongside text. The trainer doesn't re-tokenize. Use when the server's chat template is stable across turns.
* **`renderer`** *(experimental)* — client-side tokenization via a per-model renderer in the [`renderers` package](https://github.com/PrimeIntellect-ai/verifiers/tree/main/packages/renderers). Install it with `uv add "verifiers[renderers]"` before using `client_type="renderer"`. Stronger token-preservation in theory: `bridge_to_next_turn` keeps multi-turn rollouts merged into one sample and survives mid-completion truncation cleanly. Hand-coded renderers exist only for a subset of models and corner cases are still being shaken out.

For production training, use `openai_chat_completions_token` — it's the tried-and-tested path. Try `renderer` if you want the stronger guarantees and your model has a hand-coded renderer. See [Inference Client Types](/verifiers/training#inference-client-types) for the full breakdown.
