Prime Intellect Inference provides OpenAI-compatible API access to state-of-the-art language models. Our inference service routes requests to various model providers, offering flexible model selection designed for running large-scale evaluations.
Inference API is currently in closed beta. Features and model availability may change as we continue to improve the service.

Getting Started

1. Get Your API Key

First, obtain your API key from the Prime Intellect Platform:
  1. Navigate to your account settings
  2. Go to the API Keys section
  3. Generate a new API key with Inference permission enabled
Make sure to select the Inference permission when creating your API key. Without this permission, your requests will fail with authentication errors.

2. Set Up Authentication

Set your API key as an environment variable:
export PRIME_API_KEY="your-api-key-here"

3. Access through the CLI or API

You can use Prime Inference in two ways: through the Prime CLI or directly via the OpenAI-compatible API. The Prime CLI provides easy access to inference models and is especially useful for running evaluations:
# List available models
prime inference models

# Use with environment evaluations (most common use case)
prime env eval gsm8k -m meta-llama/llama-3.1-70b-instruct -n 25
For evaluations: See the Environment Evaluations guide for comprehensive examples and best practices.

Direct API Access (OpenAI-Compatible)

Team accounts: Include the X-Prime-Team-ID header to use team credits instead of your personal account balance. Find your team ID via prime teams list or on your Team Profile page.
import openai
import os

# Personal account
client = openai.OpenAI(
    api_key=os.environ.get("PRIME_API_KEY"),
    base_url="https://api.pinference.ai/api/v1"
)

# Team account (add X-Prime-Team-ID header)
client = openai.OpenAI(
    api_key=os.environ.get("PRIME_API_KEY"),
    base_url="https://api.pinference.ai/api/v1",
    default_headers={
        "X-Prime-Team-ID": "your-team-id-here"
    }
)

# Make a chat completion request
response = client.chat.completions.create(
    model="meta-llama/llama-3.1-70b-instruct",
    messages=[
        {"role": "user", "content": "What is Prime Intellect?"}
    ]
)

print(response.choices[0].message.content)

Available Models

Prime Inference provides access to various state-of-the-art language models. You can list all available models using the models endpoint:

Get All Available Models

# List all available models
prime inference models
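
If you are calling the API directly, the same list is exposed through the OpenAI-compatible models endpoint. Below is a minimal sketch using the Python SDK, assuming the base URL and PRIME_API_KEY environment variable shown above:

import os

import openai

# Point the OpenAI SDK at the Prime Inference base URL shown above
client = openai.OpenAI(
    api_key=os.environ.get("PRIME_API_KEY"),
    base_url="https://api.pinference.ai/api/v1"
)

# List the models exposed by the OpenAI-compatible models endpoint
for model in client.models.list():
    print(model.id)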

Pricing and Billing

Prime Inference uses token-based pricing with competitive rates:
  • Input tokens: Charged for tokens in your prompt
  • Output tokens: Charged for tokens in the model’s response
  • Billing: Automatic deduction from your Prime Intellect account balance
Pricing varies by model. We will provide more details on pricing soon and make it available through the models API.
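
Token counts for each request are returned in the usage field of the response, so you can track consumption per call. The sketch below reuses the client from the Direct API Access example above; the per-token rates are hypothetical placeholders, since actual pricing varies by model and has not yet been published.

# Token usage is reported on every chat completion response
response = client.chat.completions.create(
    model="meta-llama/llama-3.1-70b-instruct",
    messages=[{"role": "user", "content": "Hello"}]
)

usage = response.usage
print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")

# Hypothetical example rates (USD per 1M tokens) -- replace with the
# actual rates for your model once pricing is published
INPUT_RATE = 0.50
OUTPUT_RATE = 1.50
cost = (usage.prompt_tokens * INPUT_RATE + usage.completion_tokens * OUTPUT_RATE) / 1_000_000
print(f"Estimated cost: ${cost:.6f}")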

Viewing Your Inference Usage

Track your inference usage and billing on the Billing Dashboard under the Inference tab.

Next Steps
