# Inference Overview

> Access powerful language models through the Prime Intellect Inference API

Prime Intellect Inference provides OpenAI-compatible API access to state-of-the-art language models. The service routes requests to multiple model providers, giving you flexible model selection for running large-scale evaluations.

## Getting Started

### 1. Get Your API Key

First, obtain your API key from the [Prime Intellect Platform](https://app.primeintellect.ai):

1. Navigate to your account settings
2. Go to the API Keys section
3. Generate a new API key with **Inference** permission enabled

<Warning>
  Make sure to select the **Inference** permission when creating your API key. Without this permission, your requests will fail with authentication errors.
</Warning>

<img src="https://mintcdn.com/primeintellect/KPJjrHR3y_Xe3o0d/images/inference-permission.png?fit=max&auto=format&n=KPJjrHR3y_Xe3o0d&q=85&s=6607c00991f0e3013fc15a5fbf6d4695" alt="Select Inference Permission" width="3597" height="2160" data-path="images/inference-permission.png" />

### 2. Set Up Authentication

Set your API key as an environment variable:

```bash theme={null}
export PRIME_API_KEY="your-api-key-here"
```
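
In application code it can help to fail fast when the variable is missing, rather than sending a request that will be rejected. The following is a minimal sketch that assumes only the `PRIME_API_KEY` variable name used above; `load_api_key` is a hypothetical helper for illustration, not part of any Prime Intellect SDK.

```python
import os


def load_api_key(env=None):
    """Return the key stored in PRIME_API_KEY, or raise if it is unset or blank.

    `load_api_key` is a hypothetical helper, not a Prime Intellect API.
    """
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    key = env.get("PRIME_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "PRIME_API_KEY is not set; export it before calling the API."
        )
    return key
```

Calling `load_api_key()` once at startup surfaces a missing key before any request is made.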

### 3. Access through the CLI or API

You can use Prime Inference in two ways:

#### Prime CLI (Recommended for Evaluations)

The Prime CLI gives you direct access to inference models and is especially useful for running evaluations:

```bash theme={null}
# List available models
prime inference models

# Use with environment evaluations (most common use case)
prime env eval gsm8k -m meta-llama/llama-3.1-70b-instruct -n 25
```

<Note>
  **For evaluations**: See the [Environment Evaluations guide](/tutorials-environments/evaluating) for comprehensive examples and best practices.
</Note>

#### Direct API Access (OpenAI-Compatible)

<Note>
  **Team accounts**: Include the `X-Prime-Team-ID` header to use team credits instead of your personal account balance. Find your team ID via `prime teams list` or on your [Team Profile page](https://app.primeintellect.ai/dashboard/team-profile).
</Note>

<CodeGroup>
  ```python Python theme={null}
  import openai
  import os

  # Personal account
  client = openai.OpenAI(
      api_key=os.environ.get("PRIME_API_KEY"),
      base_url="https://api.pinference.ai/api/v1"
  )

  # Team account (add X-Prime-Team-ID header).
  # This reassigns `client`; keep only the configuration you need.
  client = openai.OpenAI(
      api_key=os.environ.get("PRIME_API_KEY"),
      base_url="https://api.pinference.ai/api/v1",
      default_headers={
          "X-Prime-Team-ID": "your-team-id-here"
      }
  )

  # Make a chat completion request
  response = client.chat.completions.create(
      model="meta-llama/llama-3.1-70b-instruct",
      messages=[
          {"role": "user", "content": "What is Prime Intellect?"}
      ]
  )

  print(response.choices[0].message.content)
  ```

  ```bash cURL theme={null}
  curl -X POST https://api.pinference.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $PRIME_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "meta-llama/llama-3.1-70b-instruct",
      "messages": [
        {"role": "user", "content": "What is Prime Intellect?"}
      ]
    }'

  # With team account (add X-Prime-Team-ID header)
  curl -X POST https://api.pinference.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $PRIME_API_KEY" \
    -H "X-Prime-Team-ID: your-team-id-here" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "meta-llama/llama-3.1-70b-instruct",
      "messages": [
        {"role": "user", "content": "What is Prime Intellect?"}
      ]
    }'
  ```

  ```javascript JavaScript theme={null}
  const response = await fetch('https://api.pinference.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PRIME_API_KEY}`,
      'Content-Type': 'application/json',
      // Add 'X-Prime-Team-ID': 'your-team-id-here' for team accounts
    },
    body: JSON.stringify({
      model: 'meta-llama/llama-3.1-70b-instruct',
      messages: [
        { role: 'user', content: 'What is Prime Intellect?' }
      ]
    })
  });

  const data = await response.json();
  console.log(data.choices[0].message.content);
  ```
</CodeGroup>
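
If the same code has to serve both personal and team accounts, the team header can be added conditionally. The sketch below uses only the header names shown in the examples above; `build_headers` is a hypothetical helper, and the key and team ID values are placeholders.

```python
def build_headers(api_key, team_id=None):
    """Build request headers for the chat completions endpoint.

    `build_headers` is a hypothetical helper for illustration only.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if team_id:
        # Send the team header only when usage should be billed to team credits.
        headers["X-Prime-Team-ID"] = team_id
    return headers


# Placeholder values, not real credentials.
personal = build_headers("your-api-key-here")
team = build_headers("your-api-key-here", team_id="your-team-id-here")
```

Leaving the header out entirely for personal use, rather than sending an empty value, keeps the request identical to the personal-account examples above.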

## Available Models

Prime Inference provides access to various state-of-the-art language models. You can list all available models with the Prime CLI, the OpenAI client, or the models endpoint directly.

### Get All Available Models

<CodeGroup>
  ```bash Prime CLI theme={null}
  # List all available models
  prime inference models
  ```

  ```python OpenAI Client theme={null}
  # List all available models (assumes `client` is configured as in Getting Started)
  models = client.models.list()
  for model in models.data:
      print(f"Model: {model.id}")
  ```

  ```bash cURL theme={null}
  curl -H "Authorization: Bearer $PRIME_API_KEY" \
       https://api.pinference.ai/api/v1/models
  ```
</CodeGroup>
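
When calling the endpoint without the OpenAI client, the model IDs have to be extracted from the JSON body yourself. The sketch below assumes the response follows the OpenAI-style list shape (a top-level `data` array whose items carry an `id`), which is what the `models.data` and `model.id` accesses in the client example imply. The sample payload is illustrative, not real API output, and `model_ids` is a hypothetical helper.

```python
import json


def model_ids(payload):
    """Return the sorted model IDs from an OpenAI-style model list response.

    Accepts either a raw JSON string or an already-parsed dict.
    """
    body = json.loads(payload) if isinstance(payload, str) else payload
    return sorted(item["id"] for item in body.get("data", []))


# Illustrative payload only; the real response may contain more fields.
sample = '{"object": "list", "data": [{"id": "meta-llama/llama-3.1-70b-instruct"}]}'
print(model_ids(sample))
```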

## Pricing and Billing

Prime Inference uses token-based pricing with competitive rates:

* **Input tokens**: Charged for tokens in your prompt
* **Output tokens**: Charged for tokens in the model's response
* **Billing**: Automatic deduction from your Prime Intellect account balance

Pricing varies by model. More detailed pricing will be published soon and made available through the models API.
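
As a rough illustration of how token-based billing adds up, the sketch below combines input and output token counts with per-model rates. Everything specific here is an assumption: the rates are made-up placeholders (actual prices are not published in this document), the per-million-token unit is chosen only for the example, and `estimate_cost` is a hypothetical helper. With an OpenAI-compatible response, the token counts would typically come from `response.usage.prompt_tokens` and `response.usage.completion_tokens`, assuming the service returns a `usage` object.

```python
def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate the cost of one request.

    Rates are expressed per one million tokens. The unit and the values
    passed below are placeholders, not actual Prime Intellect prices.
    """
    total = prompt_tokens * input_rate + completion_tokens * output_rate
    return total / 1_000_000


# Placeholder rates for illustration only.
cost = estimate_cost(
    prompt_tokens=1_000_000,
    completion_tokens=500_000,
    input_rate=1.0,
    output_rate=2.0,
)
```

For authoritative figures, rely on the Billing Dashboard rather than a local estimate.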

### Viewing Your Inference Usage

Track your inference usage and billing on the [Billing Dashboard](https://app.primeintellect.ai/dashboard/billing) under the **Inference** tab:

<img src="https://mintcdn.com/primeintellect/KPJjrHR3y_Xe3o0d/images/inference-billing.png?fit=max&auto=format&n=KPJjrHR3y_Xe3o0d&q=85&s=28a90042c25de4ac46bb2dfbb0bbe8b4" alt="Inference Billing" width="5028" height="2160" data-path="images/inference-billing.png" />

## Next Steps

<CardGroup cols={2}>
  <Card title="Advanced Usage" icon="code" href="/inference/usage">
    Streaming responses, advanced parameters, and more examples
  </Card>

  <Card title="Team Accounts" icon="users" href="/inference/team-accounts">
    Using inference with team accounts and managing team billing
  </Card>

  <Card title="Troubleshooting" icon="circle-question" href="/inference/troubleshooting">
    Fix common inference errors, including insufficient funds and team billing context
  </Card>
</CardGroup>

<CardGroup cols={2}>
  <Card title="Environment Evaluations" icon="flask" href="/tutorials-environments/evaluating">
    **Primary use case**: Learn how to run model evaluations using `prime env eval` with inference models
  </Card>

  <Card title="Inference API Reference" icon="book" href="/api-reference/inference-models">
    Detailed documentation for models and chat completion endpoints
  </Card>
</CardGroup>
