The Inference API is currently in closed beta. Features and model availability may change as we continue to improve the service.
Getting Started
1. Get Your API Key
First, obtain your API key from the Prime Intellect Platform:
- Navigate to your account settings
- Go to the API Keys section
- Generate a new API key for inference access
2. Set Up Authentication
Set your API key as an environment variable.
3. Access through the CLI or API
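For step 2, the key can be exported in your shell. The variable name PRIME_API_KEY below is an assumption; use whatever name your client or tooling reads:

```shell
# Export the key for the current shell session.
# PRIME_API_KEY is an illustrative name, not necessarily the one your client expects.
export PRIME_API_KEY="your-api-key-here"
```

To persist it across sessions, add the same line to your shell profile (e.g. `~/.bashrc` or `~/.zshrc`).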
You can use Prime Inference in two ways:
Prime CLI (Recommended for Evaluations)
The Prime CLI provides easy access to inference models and is especially useful for running evaluations.
For evaluations: See the Environment Evaluations guide for comprehensive examples and best practices.
Direct API Access (OpenAI-Compatible)
Available Models
Prime Inference provides access to various state-of-the-art language models. You can list all available models using the models endpoint.
Get All Available Models
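A minimal stdlib sketch of listing models. The base URL `https://api.primeintellect.ai/v1` and the environment variable name `PRIME_API_KEY` are assumptions (confirm both in the platform dashboard); the response is assumed to follow OpenAI's standard list shape:

```python
import json
import os
import urllib.request

# Assumed base URL for the OpenAI-compatible API; confirm in the dashboard.
BASE_URL = "https://api.primeintellect.ai/v1"

def build_models_request() -> urllib.request.Request:
    """Build an authorized GET /models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {os.environ.get('PRIME_API_KEY', '')}"},
    )

def model_ids(body: dict) -> list:
    """Extract model IDs from an OpenAI-style list response."""
    return [m["id"] for m in body.get("data", [])]

# To actually send the request:
#   with urllib.request.urlopen(build_models_request()) as resp:
#       print(model_ids(json.load(resp)))
```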
Using the Inference Endpoint
Basic Chat Completion
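A stdlib-only sketch of a chat completion request. The base URL, environment variable name, and model ID are placeholders; the request and response bodies follow OpenAI's chat format:

```python
import json
import os
import urllib.request

# Assumed base URL for the OpenAI-compatible API; confirm in the dashboard.
BASE_URL = "https://api.primeintellect.ai/v1"

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style POST /chat/completions request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('PRIME_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "your-model-id",  # placeholder; pick a real ID from GET /models
    [{"role": "user", "content": "Say hello."}],
)
# To send and read the reply:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```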
Streaming Responses
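Streamed responses arrive as server-sent events: each `data:` line carries a JSON chunk whose `delta` holds the next content fragment, with a `[DONE]` sentinel closing the stream. A sketch of the client-side parsing, assuming the events match OpenAI's streaming format:

```python
import json

def stream_text(lines):
    """Yield content fragments from OpenAI-style SSE 'data:' lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separator lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # sentinel marking the end of the stream
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# With a real request (set "stream": True in the payload), pass the decoded
# response lines to stream_text and print each fragment as it arrives.
```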
For real-time applications, use streaming to receive responses as they're generated.
Advanced Parameters
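The standard OpenAI sampling and length parameters slot straight into the request payload. A sketch with illustrative values (none of these are recommendations, and the model ID is a placeholder):

```python
# Illustrative OpenAI-style payload; all values are examples only.
payload = {
    "model": "your-model-id",        # placeholder; pick an ID from GET /models
    "messages": [{"role": "user", "content": "Summarize SSE in one line."}],
    "temperature": 0.7,              # sampling randomness
    "max_tokens": 256,               # cap on generated tokens
    "top_p": 0.9,                    # nucleus sampling
    "frequency_penalty": 0.0,        # discourage repeated tokens
    "presence_penalty": 0.0,         # encourage new topics
    "stop": ["\n\n"],                # stop sequences
    "stream": False,                 # set True for streaming
}
```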
Prime Inference supports all standard OpenAI API parameters.
Pricing and Billing
Prime Inference uses token-based pricing with competitive rates:
- Input tokens: Charged for tokens in your prompt
- Output tokens: Charged for tokens in the model’s response
- Billing: Automatic deduction from your Prime Intellect account balance
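Cost is therefore a linear function of the token counts reported in each response's `usage` block. A sketch with placeholder per-million-token rates (these numbers are NOT actual Prime Intellect pricing; see the platform for real rates):

```python
# Placeholder rates in dollars per million tokens; purely illustrative.
INPUT_RATE_PER_M = 0.50
OUTPUT_RATE_PER_M = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Compute the cost of one request from its usage counts."""
    return (
        input_tokens * INPUT_RATE_PER_M / 1_000_000
        + output_tokens * OUTPUT_RATE_PER_M / 1_000_000
    )

# The 'usage' field of a chat completion response reports
# prompt_tokens and completion_tokens, which map to the two arguments here.
```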
Support and Feedback
Since the Inference API is in closed beta, we welcome your feedback:
- Issues: Report bugs or request features on GitHub
- Discord: Join our community Discord for support
- Email: Contact us at contact@primeintellect.ai
API Reference
The Prime Intellect Inference API provides OpenAI-compatible endpoints.
Available Endpoints
GET /models - List Available Models
Returns a list of all available models that you can use for inference requests.
Response:
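An illustrative response body, assuming OpenAI's standard list shape (the model ID and owner are placeholders):

```json
{
  "object": "list",
  "data": [
    {
      "id": "example-model-id",
      "object": "model",
      "created": 1700000000,
      "owned_by": "example-org"
    }
  ]
}
```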
GET /models/{model_id} - Get Model Details
Retrieves detailed information about a specific model.
Parameters:
- model_id (path): The ID of the model to retrieve
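An illustrative response, following OpenAI's model object shape (all values are placeholders):

```json
{
  "id": "example-model-id",
  "object": "model",
  "created": 1700000000,
  "owned_by": "example-org"
}
```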
POST /chat/completions - Create Chat Completion
Creates a model response for the given chat conversation.
Request Body:
Response:
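An illustrative response body in OpenAI's chat completion shape (IDs, timestamps, and token counts are placeholders):

```json
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "example-model-id",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 8, "total_tokens": 17}
}
```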
Interactive API Documentation: Full interactive API documentation with request/response examples will be available soon. The current endpoints are fully compatible with OpenAI’s API format.