Before you start, make sure you have an API key with the Availability -> Read permission.

Retrieving Availability Data

Suppose you want to check pricing options for a single H100 GPU, with the location restricted to the United States or Canada. To do this, send a request to our availability endpoint as shown below:

curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/?regions=united_states&regions=canada&gpu_count=1&gpu_type=H100_80GB' \
  --header 'Authorization: Bearer your_api_key'

You can generate request samples using our interactive Availability API documentation.

This is a GET request, and it requires an Authorization header containing Bearer followed by your API key. Query parameters let you filter offers by region, GPU count, GPU type, and other criteria. This example uses regions, gpu_count, and gpu_type, with regions repeated once per country.
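If you prefer to call the endpoint from a script, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_availability_url and fetch_availability are made up for illustration, and the only parameters used are the ones shown in the curl example above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.primeintellect.ai/api/v1/availability/"


def build_availability_url(regions, gpu_count, gpu_type):
    """Build the availability URL, repeating `regions` once per value."""
    params = {
        "regions": list(regions),
        "gpu_count": gpu_count,
        "gpu_type": gpu_type,
    }
    # doseq=True expands the list into regions=a&regions=b
    return BASE_URL + "?" + urllib.parse.urlencode(params, doseq=True)


def fetch_availability(api_key, regions, gpu_count, gpu_type):
    """Send the GET request and return the decoded JSON body."""
    request = urllib.request.Request(
        build_availability_url(regions, gpu_count, gpu_type),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Same filters as the curl example: one H100 in the United States or Canada.
url = build_availability_url(["united_states", "canada"], 1, "H100_80GB")
```

Calling fetch_availability("your_api_key", ["united_states", "canada"], 1, "H100_80GB") would then send the request built above.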

Understanding the Response Object

Here is an example of the JSON response you might receive:

Example response
{
  "H100_80GB": [
    {
        "cloudId": "NVIDIA H100 PCIe",
        "gpuType": "H100_80GB",
        "socket": "PCIe",
        "provider": "runpod",
        "dataCenter": "US-KS-2",
        "country": "US",
        "gpuCount": 1,
        "gpuMemory": 80,
        "disk": {
            "minCount": 80,
            "defaultCount": 80,
            "maxCount": 1000,
            "pricePerUnit": 0.00014,
            "step": 1,
            "defaultIncludedInPrice": false
        },
        "vcpu": {
            "defaultCount": 16
        },
        "memory": {
            "defaultCount": 251
        },
        "stockStatus": "Low",
        "security": "secure_cloud",
        "prices": {
            "onDemand": 2.69,
            "communityPrice": null,
            "isVariable": null,
            "currency": "USD"
        },
        "images": [
            "ubuntu_22_cuda_12",
            "cuda_12_1_pytorch_2_2",
            "cuda_11_8_pytorch_2_1",
            "stable_diffusion",
            "flux",
            "axolotl",
            "bittensor",
            "vllm_llama_8b",
            "vllm_llama_70b",
            "vllm_llama_405b"
        ],
        "isSpot": null,
        "prepaidTime": null
    }
  ]
}

For a full breakdown of the response schema, see the Get Gpu Availability Endpoint documentation.

Let’s walk through the most important fields in the response object:

provider

Indicates the company or platform providing the GPU. You may encounter multiple offers from the same provider.
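Because one provider can appear several times, it can help to group the offers before comparing them. The sketch below assumes the response shape from the example above (an object keyed by GPU type, each value a list of offers). The function name offers_by_provider and the second data center name are made up for illustration.

```python
from collections import defaultdict


def offers_by_provider(response):
    """Map each provider name to the list of its offers."""
    grouped = defaultdict(list)
    # The response is keyed by GPU type; each value is a list of offers.
    for offers in response.values():
        for offer in offers:
            grouped[offer["provider"]].append(offer)
    return dict(grouped)


# Trimmed-down sample: two offers from one provider.
# "EXAMPLE-DC" is a placeholder, not a real data center name.
sample_response = {
    "H100_80GB": [
        {"provider": "runpod", "cloudId": "NVIDIA H100 PCIe", "dataCenter": "US-KS-2"},
        {"provider": "runpod", "cloudId": "NVIDIA H100 PCIe", "dataCenter": "EXAMPLE-DC"},
    ]
}
grouped = offers_by_provider(sample_response)
```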

cloudId

A unique identifier provided by the GPU provider. This value is required if you plan to provision the GPU through our provisioning API.

dataCenter

This field is optional, but if it is present in the response you must pass it along when provisioning. When a provider operates multiple data centers with the same cloudId, this value tells the provisioning API which data center to use.
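The rule for cloudId and dataCenter can be sketched as a small helper that picks out the identifiers you would carry over to a provisioning call. The function name provisioning_identifiers and the shape of the returned dictionary are assumptions for illustration. The actual provisioning request format is not covered here.

```python
def provisioning_identifiers(offer):
    """Collect the identifiers from an offer that provisioning would need."""
    identifiers = {"cloudId": offer["cloudId"]}
    # dataCenter is optional, but must be passed along whenever it is present.
    if offer.get("dataCenter") is not None:
        identifiers["dataCenter"] = offer["dataCenter"]
    return identifiers


with_dc = provisioning_identifiers(
    {"cloudId": "NVIDIA H100 PCIe", "dataCenter": "US-KS-2"}
)
without_dc = provisioning_identifiers({"cloudId": "NVIDIA H100 PCIe"})
```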

disk, vcpu, memory

Each of these fields describes a resource available with the selected GPU. Some providers let you change the amount from defaultCount, so you can customize, for example, the number of vCPUs in your instance. When a value is customizable, the property also contains minCount, maxCount, pricePerUnit, step, and defaultIncludedInPrice.

In the example above, your server includes 16 vCPUs and 251 GB of RAM, with customizable disk space. When disk space is adjustable, its cost is charged separately from the amounts in the prices property. Here’s an example breakdown of the total hourly cost:

GPU cost: $2.69
vcpu cost: $0.00
memory cost: $0.00
disk cost: $0.0112 (80 units * $0.00014)

Total cost: $2.7012

The total cost depends on the disk size you send to the provisioning API. The value can range from minCount to maxCount in increments of step, and you pay pricePerUnit per hour for every unit you request.

Some providers offer predefined server configurations with defaultIncludedInPrice set to true. In that case, defaultCount is included in the base price, but any change (even a reduction) incurs additional charges.
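The cost breakdown above can be reproduced with a short calculation. This is a sketch under stated assumptions: it covers only offers where defaultIncludedInPrice is false, it uses the onDemand price as the base, and it assumes valid disk sizes are counted in steps starting from minCount. The function names are made up for illustration.

```python
def disk_cost_per_hour(disk, units):
    """Hourly disk cost when the default is NOT bundled into the base price."""
    if disk.get("defaultIncludedInPrice"):
        # Surcharges on bundled defaults are not modeled in this sketch.
        raise ValueError("bundled default pricing is not modeled here")
    if not disk["minCount"] <= units <= disk["maxCount"]:
        raise ValueError("units must be between minCount and maxCount")
    # Assumption: valid sizes are minCount, minCount + step, and so on.
    if (units - disk["minCount"]) % disk["step"] != 0:
        raise ValueError("units must be reachable from minCount by step")
    return units * disk["pricePerUnit"]


def total_cost_per_hour(offer, disk_units):
    """Base GPU price plus the separately charged disk cost."""
    return offer["prices"]["onDemand"] + disk_cost_per_hour(offer["disk"], disk_units)


# Values copied from the example response above.
offer = {
    "disk": {
        "minCount": 80,
        "defaultCount": 80,
        "maxCount": 1000,
        "pricePerUnit": 0.00014,
        "step": 1,
        "defaultIncludedInPrice": False,
    },
    "prices": {"onDemand": 2.69},
}
total = total_cost_per_hour(offer, 80)  # 2.69 + 80 * 0.00014
```

Requesting more disk raises only the disk component, so comparing offers at the disk size you actually need gives a fairer picture than comparing base prices alone.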

prices and security

There are two types of offers:

  • Secure Cloud
  • Community Cloud

Secure Cloud means that the GPU comes from a provider that maintains security standards and hosts its GPUs in a secured data center. In contrast, Community Cloud offers come either from the community or from sources where we have limited information about the data center.

If the security type is set to secure_cloud, then the price is defined in prices -> onDemand. Otherwise, community pricing is stored in prices -> communityPrice.

Some providers may also use dynamic pricing for their instances. If the field isVariable is set to true, the price may fluctuate based on demand.
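The pricing rules in this section can be sketched as two small helpers. The names base_price and is_variable_price are made up for illustration. The community offer below is also hypothetical: its security value and price are placeholders, since the text above only states that anything other than secure_cloud uses communityPrice.

```python
def base_price(offer):
    """Return the price field that applies to the offer's security type."""
    prices = offer["prices"]
    if offer["security"] == "secure_cloud":
        return prices["onDemand"]
    # Any other security type is treated as a community offer.
    return prices["communityPrice"]


def is_variable_price(offer):
    """True only when isVariable is explicitly true; null counts as fixed."""
    return offer["prices"].get("isVariable") is True


# Secure offer, with values copied from the example response above.
secure_offer = {
    "security": "secure_cloud",
    "prices": {"onDemand": 2.69, "communityPrice": None, "isVariable": None},
}
# Hypothetical community offer; the security value and price are placeholders.
community_offer = {
    "security": "community_cloud",
    "prices": {"onDemand": None, "communityPrice": 1.99, "isVariable": True},
}
```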