Skip to main content
POST
/
api
/
v1
/
evaluations
/
Create Evaluation
curl --request POST \
  --url https://api.primeintellect.ai/api/v1/evaluations/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "<string>",
  "team_id": "<string>",
  "environments": [
    {
      "id": "<string>",
      "version_id": "<string>"
    }
  ],
  "suite_id": "<string>",
  "run_id": "<string>",
  "model_name": "<string>",
  "dataset": "<string>",
  "framework": "<string>",
  "task_type": "<string>",
  "description": "<string>",
  "tags": [],
  "metadata": {},
  "metrics": {}
}'
{
  "evaluation_id": "<string>",
  "name": "<string>",
  "status": "RUNNING",
  "eval_type": "suite",
  "environment_ids": [
    "<string>"
  ],
  "suite_id": "<string>",
  "run_id": "<string>",
  "version_id": "<string>",
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Request to create a new evaluation

name
string
required

Name of the evaluation

team_id
string | null

Team ID if creating evaluation for a team

environments
EnvironmentReference · object[] | null

List of environment references with optional version IDs

suite_id
string | null

Suite ID if this evaluation is part of a suite

run_id
string | null

Run ID for Prime RL runs (optional)

model_name
string | null

Model name

dataset
string | null

Dataset name

framework
string | null

Framework used (e.g., 'prime-rl', 'openai/evals')

task_type
string | null

Type of task (e.g., 'classification', 'generation')

description
string | null

Description of the evaluation

tags
string[]

Tags for categorization

metadata
object | null

Additional metadata

metrics
object | null

High-level metrics summary

Response

Successful Response

Response after creating an evaluation

evaluation_id
string
required

ID of the created evaluation

name
string
required
status
enum<string>
required

Evaluation status enum

Available options:
RUNNING,
COMPLETED,
FAILED,
CANCELLED
eval_type
enum<string>
required

Type: 'prime_rl', 'environment', or 'suite'

Available options:
suite,
training,
environment
created_at
string<date-time>
required
environment_ids
string[] | null
suite_id
string | null
run_id
string | null
version_id
string | null
I