Before you start, ensure that you have an API key with the Instances -> Read and write permission.
Retrieving offers from the availability API
For an in-depth guide to using the availability endpoint, refer to this guide.
Our goal is to provision an H100 GPU using availability data. First, we call the availability endpoint to get the current offers:
curl --request GET \
--url 'https://api.primeintellect.ai/api/v1/availability/?gpu_type=H100_80GB&regions=united_states&regions=canada&gpu_count=1' \
--header 'Authorization: Bearer your_api_key'
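If you are scripting this call, the query string can be built safely with the standard library. This is a sketch of the same request URL as the curl call above; a list of pairs (rather than a dict) preserves the repeated regions key:

```python
from urllib.parse import urlencode

BASE = "https://api.primeintellect.ai/api/v1/availability/"

# A list of pairs (not a dict) so the repeated "regions" key is preserved.
params = [
    ("gpu_type", "H100_80GB"),
    ("regions", "united_states"),
    ("regions", "canada"),
    ("gpu_count", 1),
]
url = BASE + "?" + urlencode(params)
```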
We will use the Hyperstack provider and the following offer:
{
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"provider": "hyperstack",
"dataCenter": "CANADA-1",
"country": "CA",
"gpuCount": 1,
"gpuMemory": 80,
"disk": {
"minCount": null,
"defaultCount": 100,
"maxCount": null,
"pricePerUnit": null,
"step": null,
"defaultIncludedInPrice": null,
"additionalInfo": null
},
"vcpu": {
"minCount": null,
"defaultCount": 180,
"maxCount": null,
"pricePerUnit": null,
"step": null,
"defaultIncludedInPrice": null,
"additionalInfo": null
},
"memory": {
"minCount": null,
"defaultCount": 180,
"maxCount": null,
"pricePerUnit": null,
"step": null,
"defaultIncludedInPrice": null,
"additionalInfo": null
},
"internetSpeed": null,
"interconnect": null,
"interconnectType": null,
"provisioningTime": null,
"stockStatus": "Available",
"security": "secure_cloud",
"prices": {
"onDemand": 1.9,
"communityPrice": null,
"isVariable": null,
"currency": "USD"
},
"images": [
"ubuntu_22_cuda_12",
"cuda_12_1_pytorch_2_2",
"cuda_11_8_pytorch_2_1",
"stable_diffusion",
"flux",
"axolotl",
"bittensor",
"vllm_llama_8b",
"vllm_llama_70b",
"vllm_llama_405b"
],
"isSpot": null,
"prepaidTime": null
}
This GPU configuration has fixed resources (minCount, maxCount, and pricePerUnit are null), so we don't have to adjust anything and can simply use the defaults.
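When scripting, you may want to pick an offer programmatically. A sketch, assuming the endpoint returns a list of offer objects shaped like the one above; all offers here except the first are made-up sample data:

```python
# Made-up sample data; only the first offer mirrors the example above.
offers = [
    {"cloudId": "n3-H100x1", "provider": "hyperstack",
     "stockStatus": "Available", "prices": {"onDemand": 1.9}},
    {"cloudId": "example-a", "provider": "example",
     "stockStatus": "Available", "prices": {"onDemand": 2.4}},
    {"cloudId": "example-b", "provider": "example",
     "stockStatus": "Unavailable", "prices": {"onDemand": 1.5}},
]

# Keep only in-stock offers, then take the cheapest on-demand price.
available = [o for o in offers if o["stockStatus"] == "Available"]
best = min(available, key=lambda o: o["prices"]["onDemand"])
```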
Creating the Instance Request Body
First, let's go through the Create Pod endpoint and explain how it works. The request requires a body with pod, provider, and an optional team definition.
pod
The pod object defines the instance's characteristics:
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "ubuntu_22_cuda_12",
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud"
}
We can choose any name, but the rest of the parameters are copied from the availability offer. Since this offer includes a dataCenterId and country, we pass those values during provisioning; their presence indicates that the provider has GPUs with the same cloudId available in different locations. We also copy the rest of the GPU definition data:
gpuType -> gpuType
socket -> socket
gpuCount -> gpuCount
security -> security
The last thing to do is select an image. Available values are listed in the images property of the availability offer; we'll use the default ubuntu_22_cuda_12 image.
provider
"provider": {
"type": "hyperstack"
},
The provider
object is straightforward, we only need to specify the type
, which in our case is hyperstack
.
team
"team": {
"teamId": "my_team_id"
}
If you want to assign the pod to a specific team, include the team object.
Sending the Create Request
With all parts configured, the final request looks like this:
curl --request POST \
--url https://api.primeintellect.ai/api/v1/pods/ \
--header 'Authorization: Bearer your_api_key' \
--header 'Content-Type: application/json' \
--data '{
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "ubuntu_22_cuda_12",
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud"
},
"provider": {
"type": "hyperstack"
}
}'
On successful completion, you should receive a 200 OK response with the pod details in the response body.
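The same request can be assembled in Python using only the standard library. This is a sketch: the request object is built but not actually sent, and your_api_key is a placeholder:

```python
import json
import urllib.request

body = {
    "pod": {
        "name": "My first pod",
        "cloudId": "n3-H100x1",
        "gpuType": "H100_80GB",
        "socket": "PCIe",
        "gpuCount": 1,
        "image": "ubuntu_22_cuda_12",
        "dataCenterId": "CANADA-1",
        "country": "CA",
        "security": "secure_cloud",
    },
    "provider": {"type": "hyperstack"},
}

req = urllib.request.Request(
    "https://api.primeintellect.ai/api/v1/pods/",
    data=json.dumps(body).encode(),
    headers={
        "Authorization": "Bearer your_api_key",
        "Content-Type": "application/json",
    },
    method="POST",
)
# Sending it (commented out so this sketch stays side-effect free):
# with urllib.request.urlopen(req) as resp:
#     pod_details = json.load(resp)
```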
Modifying instance resources
If the availability offer allows resource customization, you can adjust the default resources during provisioning. Below is an example of a modified resource configuration:
"disk": {
"minCount": 50,
"defaultCount": 100,
"maxCount": 1000,
"pricePerUnit": 0.0003,
"step": 10,
"defaultIncludedInPrice": false,
"additionalInfo": null
},
"vcpu": {
"minCount": 4,
"defaultCount": 16,
"maxCount": 32,
"pricePerUnit": 0.004,
"step": 2,
"defaultIncludedInPrice": true,
"additionalInfo": null
},
This configuration enables changes to the disk and vcpu specifications.
Note that the default vcpu count is included in the price (defaultIncludedInPrice is true), but the default disk is not, which results in an additional cost of $0.03 per hour when using the default disk size (100 units * $0.0003).
Increase disk size
To increase the disk size, set a new diskSize value in the pod object when sending the create request. For this example, since the provider supports increments of 10 with a minimum of 50, we'll set the disk size to 200. This adjustment affects the total hourly cost as follows:
GPU cost: $2.69
vcpu cost: $0.00 (we're not paying for vcpu because `defaultIncludedInPrice == true`)
memory cost: $0.00
disk cost: $0.06 (200 units * $0.0003)
Total cost: $2.75
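The breakdowns in this section can be reproduced with a small helper. The pricing logic is a sketch inferred from this example offer: disk always bills per unit, while vcpu bills only when changed from its default of 16, and $2.69 is the GPU rate used in these examples:

```python
def hourly_cost(gpu=2.69, disk=100, vcpu=16,
                disk_price=0.0003, vcpu_price=0.004, vcpu_default=16):
    # Disk is not included in the base price, so it always bills per unit.
    disk_cost = disk * disk_price
    # The default vcpu count is free; any other count bills the full amount.
    vcpu_cost = 0.0 if vcpu == vcpu_default else vcpu * vcpu_price
    return round(gpu + disk_cost + vcpu_cost, 2)

print(hourly_cost(disk=200))  # 2.75, matching the breakdown above
```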
So the final request will look like:
curl --request POST \
--url https://api.primeintellect.ai/api/v1/pods/ \
--header 'Authorization: Bearer your_api_key' \
--header 'Content-Type: application/json' \
--data '{
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "ubuntu_22_cuda_12",
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud",
"diskSize": 200
},
"provider": {
"type": "hyperstack"
}
}'
Modifying vcpu
This case is a little more complicated. Because defaultIncludedInPrice lets us use the default of 16 vcpus for free, there are two scenarios in which we pay an additional amount for vcpus:
Increasing vcpu
Raising vcpu to 20 will increase the cost beyond the base instance price:
GPU cost: $2.69
vcpu cost: $0.08 (20 units * $0.004)
memory cost: $0.00
disk cost: $0.03 (default 100 units * $0.0003)
Total cost: $2.80
Decreasing vcpu
Reducing vcpu to 10 can also increase costs compared to the default configuration. This is because some servers use predefined containers, and altering configurations may incur additional fees, making it more economical to use the default setup:
GPU cost: $2.69
vcpu cost: $0.04 (10 units * $0.004)
memory cost: $0.00
disk cost: $0.03 (default 100 units * $0.0003)
Total cost: $2.76
Example request with adjusted disk and vcpu
With both disk and vcpu increased, our request will look like:
curl --request POST \
--url https://api.primeintellect.ai/api/v1/pods/ \
--header 'Authorization: Bearer your_api_key' \
--header 'Content-Type: application/json' \
--data '{
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "ubuntu_22_cuda_12",
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud",
"diskSize": 200,
"vcpus": 20
},
"provider": {
"type": "hyperstack"
}
}'
and the total cost breakdown is as follows:
GPU cost: $2.69
vcpu cost: $0.08 (20 units * $0.004)
memory cost: $0.00
disk cost: $0.06 (200 units * $0.0003)
Total cost: $2.83
Dynamic pricing
Certain offers feature dynamic pricing, meaning that rates may fluctuate throughout the instance’s lifetime due to factors such as:
- Price being pegged to a foreign currency and fluctuating with exchange rates
- Payment in alternative currencies or tokens
- Market-based pricing
If an offer supports dynamic pricing, prices -> isVariable will be set to true. In this case, it's recommended to specify a maxPrice when provisioning the instance to set a cap:
curl --request POST \
--url https://api.primeintellect.ai/api/v1/pods/ \
--header 'Authorization: Bearer your_api_key' \
--header 'Content-Type: application/json' \
--data '{
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "ubuntu_22_cuda_12",
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud",
"maxPrice": 2.70,
},
"provider": {
"type": "hyperstack"
}
}'
This configuration limits provisioning to instances at or below the specified price. However, prices may still vary during the instance’s lifetime.
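In a script, you can attach maxPrice only when the offer is actually flagged as variable. A sketch: build_pod and its offer dicts are illustrative, with fields copied from the availability example earlier in this guide:

```python
def build_pod(offer, name="My first pod", max_price=None):
    # Copy the GPU definition fields from an availability offer.
    pod = {
        "name": name,
        "cloudId": offer["cloudId"],
        "gpuType": offer["gpuType"],
        "socket": offer["socket"],
        "gpuCount": offer["gpuCount"],
        "security": offer["security"],
    }
    # Cap the hourly rate only for dynamically priced offers.
    if offer["prices"].get("isVariable") and max_price is not None:
        pod["maxPrice"] = max_price
    return pod

offer = {"cloudId": "n3-H100x1", "gpuType": "H100_80GB", "socket": "PCIe",
         "gpuCount": 1, "security": "secure_cloud",
         "prices": {"onDemand": 1.9, "isVariable": True}}
variable_pod = build_pod(offer, max_price=2.70)

fixed_offer = {**offer, "prices": {"onDemand": 1.9, "isVariable": None}}
fixed_pod = build_pod(fixed_offer, max_price=2.70)
```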
Using Custom Templates
When creating an instance, you may want to use your own custom environment instead of the options available through the availability endpoint. You can select from any public templates or your private ones, which you can view on the Templates Page.
To use a custom template, copy its customTemplateId and set the image to custom_template.
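In code, switching to a custom template only changes two fields of the pod object; the template ID here is the same illustrative one used in the example request:

```python
pod = {
    "name": "My first pod",
    "cloudId": "n3-H100x1",
    "gpuType": "H100_80GB",
    "socket": "PCIe",
    "gpuCount": 1,
    # Select a custom template instead of a provider-supplied image:
    "image": "custom_template",
    "customTemplateId": "cm2szl4a20001tl3pyq7ua6o7",
    "dataCenterId": "CANADA-1",
    "country": "CA",
    "security": "secure_cloud",
}
```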
Here’s an example request to use a custom template:
curl --request POST \
--url https://api.primeintellect.ai/api/v1/pods/ \
--header 'Authorization: Bearer your_api_key' \
--header 'Content-Type: application/json' \
--data '{
"pod": {
"name": "My first pod",
"cloudId": "n3-H100x1",
"gpuType": "H100_80GB",
"socket": "PCIe",
"gpuCount": 1,
"image": "custom_template",
"customTemplateId": "cm2szl4a20001tl3pyq7ua6o7"
"dataCenterId": "CANADA-1",
"country": "CA",
"security": "secure_cloud",
},
"provider": {
"type": "hyperstack"
}
}'
Not all templates are compatible with every provider and GPU. Ensure the selected GPU matches the template's GPU requirements before using it. These details are available in the template view by clicking the View button on the template card.