The HOST address for the API server is:
https://api.gpuhub.com/

Authentication

All requests must include your API token in the Authorization header:

headers = {"Authorization": "your token"}

1 List Private Images

→ Request

POST /api/v1/dev/image/private/list
Request Body
{
  "page_index": 1,
  "page_size": 10
}

← Response

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data.list | List | List of image objects |

Image Object Fields

| Parameter | Type | Description |
|---|---|---|
| id | Int | Image ID |
| image_name | String | Image name |
| image_uuid | String | Image UUID (used when creating deployments) |
Response (on success)
{
    "code": "Success",
    "msg": "",
    "data": {
        "list": [
            {
                "id": 111,
                "created_at": "2022-01-20T18:34:08+08:00",
                "updated_at": "2022-01-20T18:34:08+08:00",
                "image_uuid": "image-db8346e037",
                "image_name": "image name",
                "status": "finished"
            }
        ],
        "page_index": 1,
        "page_size": 10,
        "max_page": 1,
        "offset": 0
    }
}
Python
import requests
headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}
url = "https://api.gpuhub.com/api/v1/dev/image/private/list"
body = {
    "page_index": 1,
    "page_size": 10,
}
response = requests.post(url, json=body, headers=headers)
print(response.content.decode())

2 Create Deployment

→ Request

POST /api/v1/dev/deployment
Place all parameters in the request body. Details are as follows:
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | String | Yes | Deployment name |
| deployment_type | String | Yes | Deployment type: ReplicaSet, Job, or Container |
| replica_num | Int | Required for ReplicaSet & Job | Desired number of container replicas |
| parallelism_num | Int | Required for Job only | Max number of containers running in parallel (for Job type) |
| reuse_container | Bool | No | Whether to reuse stopped containers (significantly speeds up creation). Default: false |
| reuse_container_scope | String | No | Scope of reuse: all (account-wide) or deployment (current deployment only). Default: all |
| service_6006_port_protocol | String | No | Protocol for port 6006 mapping: http or tcp. Default: http |
| service_6008_port_protocol | String | No | Protocol for port 6008 mapping: http or tcp. Default: http |
| container_template | Container Object | Yes | Container template (see below) |

container_template Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| dc_list | list<String> | Yes | List of available regions (data centers). See Appendix for values |
| cuda_v_from | Int | Yes | Minimum CUDA version supported by driver (e.g., 118 = CUDA 11.8). See Appendix |
| cuda_v_to | Int | Yes | Maximum CUDA version (higher versions are backward compatible) |
| gpu_name_set | list<String> | Yes | Allowed GPU models (e.g., ["RTX 5090", "RTX Pro 6000"]) |
| gpu_num | Int | Yes | Number of GPUs per container |
| cpu_num_from / cpu_num_to | Int | Yes | CPU core range (in vCPU units) |
| memory_size_from / memory_size_to | Int | Yes | Memory range in GB |
| price_from / price_to | Int | Yes | Price range in USD × 1000 (e.g., $0.10/hr → 100, $9.00/hr → 9000) |
| image_uuid | String | Yes | UUID of your private image or a public base image |
| cmd | String | Yes | Startup command (executed when the container starts) |
| cmd_before_shutdown | String | No | Command executed before the container stops (5-second timeout; forced stop on timeout) |
Request Body
{
 "name": "api-auto-created",
 "deployment_type": "ReplicaSet",
 "replica_num": 2,
 "reuse_container": true,
 "container_template": {
   "dc_list": ["sgp1", "us-west1"],
   "gpu_name_set": ["RTX 5090", "RTX Pro 6000"],
   "cuda_v_from": 118,
   "cuda_v_to": 128,
   "gpu_num": 1,
   "cpu_num_from": 4,
   "cpu_num_to": 64,
   "memory_size_from": 16,
   "memory_size_to": 256,
   "cmd": "python main.py",
   "price_from": 100,    // base price: $0.10/hr
   "price_to": 9000,     // base price: $9.00/hr
   "image_uuid": "image-xxxxxxxx"
 }
}
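Because price_from / price_to encode the hourly rate as USD × 1000, a pair of small converters (the function names are ours, not part of the API) helps avoid unit mistakes:

```python
def usd_per_hour_to_price(usd: float) -> int:
    """Convert an hourly USD rate to the integer form the API expects (USD x 1000)."""
    return round(usd * 1000)


def price_to_usd_per_hour(price: int) -> float:
    """Convert an API price value back to USD per hour."""
    return price / 1000
```

For example, usd_per_hour_to_price(0.10) yields 100 and price_to_usd_per_hour(9000) yields 9.0, matching the values in the table above.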

← Response

Response (on success)
{
   "code": "Success",
   "msg": "",
   "data": {
       "deployment_uuid": "833f1cd5a764fa3"
   }
}
Python
import requests

headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment"

# Example 1: Create a ReplicaSet deployment (auto-scaling)
body = {
    "name": "my-inference-service",
    "deployment_type": "ReplicaSet",
    "replica_num": 2,
    "reuse_container": True,
    "container_template": {
        "dc_list": ["sgp1", "us-west1"],
        "gpu_name_set": ["RTX 5090", "RTX Pro 6000"],
        "gpu_num": 1,
        "cuda_v_from": 118,
        "cuda_v_to": 128,
        "cpu_num_from": 4,
        "cpu_num_to": 64,
        "memory_size_from": 16,
        "memory_size_to": 256,
        "cmd": "python main.py",
        "price_from": 100,   # $0.10/hr (price × 1000)
        "price_to": 9000,    # $9.00/hr (price × 1000)
        "image_uuid": "image-xxxxxxxx"
    }
}

response = requests.post(url, json=body, headers=headers)
print(response.content.decode())

# Example 2: Create a Job deployment (batch processing)
body = {
    "name": "batch-training-job",
    "deployment_type": "Job",
    "replica_num": 4,
    "parallelism_num": 1,
    "reuse_container": True,
    "container_template": {
        "dc_list": ["sgp1"],
        "gpu_name_set": ["RTX 5090"],
        "gpu_num": 1,
        "cuda_v_from": 118,
        "cuda_v_to": 128,
        "cpu_num_from": 8,
        "cpu_num_to": 64,
        "memory_size_from": 32,
        "memory_size_to": 256,
        "cmd": "python train.py",
        "price_from": 100,
        "price_to": 9000,
        "image_uuid": "image-xxxxxxxx"
    }
}

response = requests.post(url, json=body, headers=headers)
print(response.content.decode())

# Example 3: Create a single Container deployment
body = {
    "name": "debug-container",
    "deployment_type": "Container",
    "reuse_container": True,
    "container_template": {
        "dc_list": ["sgp1"],
        "gpu_name_set": ["RTX 5090"],
        "gpu_num": 1,
        "cuda_v_from": 118,
        "cuda_v_to": 128,
        "cpu_num_from": 4,
        "cpu_num_to": 32,
        "memory_size_from": 16,
        "memory_size_to": 128,
        "cmd": "sleep infinity",
        "price_from": 100,
        "price_to": 9000,
        "image_uuid": "image-xxxxxxxx"
    }
}

response = requests.post(url, json=body, headers=headers)
print(response.content.decode())

3 List Deployments

→ Request

POST /api/v1/dev/deployment/list
Place the following parameters in the request body:
| Parameter | Type | Required | Description |
|---|---|---|---|
| page_index | Int | Yes | Page number (starting from 1) |
| page_size | Int | Yes | Number of items per page |
| name | String | No | Filter by exact deployment name (exact match only, no fuzzy search) |
| status | String | No | Filter by deployment status: running (active deployments) or stopped (stopped deployments); leave empty to return all |
| deployment_uuid | String | No | Filter by specific deployment UUID |
Request Body
{
  "page_index": 1,
  "page_size": 10,
  "name": "my-api",           // optional: filter by exact deployment name
  "status": "running",        // optional: "running" | "stopped" | empty (all)
  "deployment_uuid": "xxx"    // optional: filter by deployment UUID
}

← Response

The fields returned in each deployment object have the same meaning as the parameters used when creating the deployment.
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": {
    "list": [
      {
        "id": 214,
        "uuid": "53a677bb3e281b8",
        "name": "xxxx",
        "deployment_type": "Container",
        "status": "stopped",
        "replica_num": 1,
        "parallelism_num": 1,
        "reuse_container": true,
        "starting_num": 0,
        "running_num": 0,
        "finished_num": 2,
        "image_uuid": "image-db8346e037",
        "template": {
          "dc_list": ["sgp1", "us-west1"],
          "gpu_name_set": ["RTX 5090"],
          "gpu_num": 1,
          "cpu_num_from": 1,
          "cpu_num_to": 100,
          "memory_size_from": 1073741824,
          "memory_size_to": 274877906944,
          "price_from": 100,
          "price_to": 9000,
          "cuda_v_from": 118,
          "cuda_v_to": 128,
          "cmd": "sleep 100"
        },
        "price_estimates": 0,
        "created_at": "2023-01-05T20:34:07+08:00",
        "updated_at": "2023-01-05T20:34:07+08:00",
        "stopped_at": null
      }
    ],
    "page_index": 1,
    "page_size": 10,
    "offset": 0,
    "max_page": 1,
    "result_total": 3
  }
}
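Unlike the other endpoints, this one has no Python sample above; here is a minimal sketch in the same style (the token and filter values are placeholders):

```python
import requests

API_HOST = "https://api.gpuhub.com"


def list_deployments(token: str, page_index: int = 1, page_size: int = 10,
                     status: str = "") -> dict:
    """List deployments, optionally filtered by status ("running" or "stopped")."""
    headers = {"Authorization": token, "Content-Type": "application/json"}
    body = {"page_index": page_index, "page_size": page_size}
    if status:  # omit the key entirely to list all deployments
        body["status"] = status
    resp = requests.post(f"{API_HOST}/api/v1/dev/deployment/list",
                         json=body, headers=headers)
    return resp.json()


# result = list_deployments("your token", status="running")
# print(result["data"]["list"])
```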

4 Scale Replicas (ReplicaSet only)

→ Request

PUT /api/v1/dev/deployment/replica_num
Place the following parameters in the request body:
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_uuid | String | Yes | UUID of the deployment to scale |
| replica_num | Int | Yes | Desired number of replicas (must be greater than 0) |
Request Body
{
  "deployment_uuid": "53a677bb3e281b8",
  "replica_num": 3
}

← Response

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | null | No data returned |
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": null
}
Python
import requests

headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment/replica_num"

body = {
    "deployment_uuid": "5be3045703152b9",
    "replica_num": 16
}

response = requests.put(url, json=body, headers=headers)
print(response.json())

5 Stop Entire Deployment

→ Request

PUT /api/v1/dev/deployment/operate
Place the following parameters in the request body:
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_uuid | String | Yes | UUID of the deployment to operate on |
| operation | String | Yes | Operation to perform: stop (stop the deployment) or delete (delete the deployment) |
Request Body
{
  "deployment_uuid": "53a677bb3e281b8",
  "operation": "stop"
}

← Response

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | null | No data returned |
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": null
}
Python

import requests
headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}
url = "https://api.gpuhub.com/api/v1/dev/deployment/operate"
body = {
    "deployment_uuid": "5be3045703152b9",
    "operation": "stop"
}
response = requests.put(url, json=body, headers=headers)
print(response.content.decode())

6 Delete Entire Deployment

→ Request

If the deployment is still running, the system will automatically stop it first, then delete it completely (including all containers and associated resources).
DELETE /api/v1/dev/deployment
Request Body
{
  "deployment_uuid": "xxx"
}

← Response

| Parameter | Type | Description |
|---|---|---|
| code | String | Success on success |
| msg | String | Empty when successful |
| data | null | No data returned |
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": null
}
Python
import requests
headers = {
    "Authorization": "your_token_here",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment"
body = {"deployment_uuid": "5be3045703152b9"}

response = requests.delete(url, json=body, headers=headers)
print(response.json())

7 List Container Events

→ Request

POST /api/v1/dev/deployment/container/event/list
Place the following parameters in the request body:
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_uuid | String | Yes | Deployment UUID |
| deployment_container_uuid | String | No | Specific container UUID (leave empty for all) |
| page_index | Int | Yes | Page number (starting from 1) |
| page_size | Int | Yes | Number of items per page |
| offset | Int | No | Starting offset (used for polling new events) |
Request Body
{
  "deployment_uuid": "da497aea1eb8343", 
  "deployment_container_uuid": "", 
  "page_index": 1, 
  "page_size": 10,
  "offset": 0
}

← Response

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | Object | Container event data |

Event Object Fields

| Parameter | Type | Description |
|---|---|---|
| deployment_container_uuid | String | UUID of the container |
| status | String | Container state (e.g., creating, starting, running, shutting_down, shutdown) |
| created_at | String | Timestamp when the state change occurred |
Response (on success)
{
    "code": "Success",
    "data": {
        "list": [
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "shutdown",
                "created_at": "2022-12-13T16:42:45+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "shutting_down",
                "created_at": "2022-12-13T16:42:40+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "running",
                "created_at": "2022-12-13T16:34:57+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "oss_merged",
                "created_at": "2022-12-13T16:34:55+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "starting",
                "created_at": "2022-12-13T16:34:55+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "created",
                "created_at": "2022-12-13T16:34:54+08:00"
            },
            {
                "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
                "status": "creating",
                "created_at": "2022-12-13T16:34:47+08:00"
            }
        ],
        "page_index": 1,
        "page_size": 10,
        "offset": 0,
        "max_page": 1
    },
    "msg": ""
}
Python
import requests
headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment/container/event/list"
body = {
    "deployment_uuid": "da497aea1eb8343", 
    "deployment_container_uuid": "", 
    "page_index": 1, 
    "page_size": 10,
    "offset": 0
}

response = requests.post(url, json=body, headers=headers)
print(response.json())
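When polling this endpoint repeatedly, the same state transitions come back on every call; one simple client-side approach (the helper name is ours, not part of the API) is to de-duplicate events by their identity:

```python
def new_events(events, seen):
    """Return only events not already in `seen`; update `seen` in place.

    Events are keyed by (container UUID, status, timestamp), which together
    identify one state transition in the event list.
    """
    fresh = []
    for ev in events:
        key = (ev["deployment_container_uuid"], ev["status"], ev["created_at"])
        if key not in seen:
            seen.add(key)
            fresh.append(ev)
    return fresh
```

Call it with the `data.list` array from each poll and a `set()` that persists between polls.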

8 List Containers

→ Request

Inside the container, you can get the current container’s UUID via the environment variable: AutoDLContainerUUID
POST /api/v1/dev/deployment/container/list
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_uuid | String | Yes | Deployment UUID |
| container_uuid | String | No | Filter by specific container UUID |
| date_from / date_to | String | No | Filter by container creation time range |
| gpu_name | String | No | Filter by GPU model |
| cpu_num_from / cpu_num_to | Int | No | CPU core range |
| memory_size_from / memory_size_to | Int | No | Memory range (bytes) |
| price_from / price_to | Float | No | Base price range (USD × 1000) |
| released | Boolean | No | Include released containers |
| status | List<String> | No | Filter by status, e.g., ["running"] |
| page_index | Int | Yes | Page number (default: 1) |
| page_size | Int | Yes | Items per page (default: 10) |
| offset | Int | No | Starting offset |
Request Body
{
    "deployment_uuid": "da497aea1eb8343", 
    "page_index": 1, 
    "page_size": 10
}

← Response

Response Fields

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data.list | Array | List of container objects |

Container Object

| Field | Type | Description |
|---|---|---|
| uuid | String | Container UUID |
| version | String | Container version (auto-generated or user-specified on deployment update) |
| data_center | String | Region / data center code (e.g., sgp1) |
| deployment_uuid | String | Parent deployment UUID |
| machine_id | String | Physical host ID |
| status | String | Current status (running, stopped, starting, etc.) |
| gpu_name | String | GPU model (e.g., RTX 5090) |
| gpu_num | Int | Number of GPUs allocated |
| cpu_num | Int | Number of CPU cores |
| memory_size | Int | Memory size in bytes |
| image_uuid | String | Image UUID |
| price | Float | Base price in USD × 1000 (e.g., 1600 = $1.60/hr) |
| info | Object | Connection details (see below) |
| started_at | String | Time when the container started running |
| stopped_at | String | Time when the container stopped (null if running) |
| created_at | String | Creation timestamp |
| updated_at | String | Last update timestamp |

info Object (Connection Details)

| Field | Type | Description |
|---|---|---|
| ssh_command | String | Full SSH login command |
| root_password | String | Root password for SSH |
| service_6006_port_url | String | Public HTTPS URL for port 6006 |
| service_6008_port_url | String | Public HTTP URL for port 6008 |
| service_url, proxy_host, custom_port | - | Deprecated: use the two service URLs above |
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": {
    "list": [
      {
        "uuid": "53a677bb3e281b8-f94411a60c-63c24009",
        "data_center": "sgp1",
        "deployment_uuid": "da497aea1eb8343",
        "status": "running",
        "gpu_name": "RTX 5090",
        "gpu_num": 1,
        "cpu_num": 16,
        "memory_size": 68719476736,
        "image_uuid": "image-xxxxxxxx",
        "price": 1600,
        "info": {
          "ssh_command": "ssh -p 22345 [email protected]",
          "root_password": "xxxxxxxxxx",
          "service_6006_port_url": "https://sgp1.gpuhub.com:22346",
          "service_6008_port_url": "http://sgp1.gpuhub.com:22348"
        },
        "started_at": "2025-11-19T10:43:03+08:00",
        "created_at": "2025-11-19T10:42:50+08:00"
      }
    ],
    "page_index": 1,
    "page_size": 10,
    "max_page": 1
  }
}
Python
import requests

headers = {
    "Authorization": "your_token_here",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment/container/list"

body = {
    "deployment_uuid": "da497aea1eb8343",
    "status": ["running"],
    "page_index": 1,
    "page_size": 10
}

response = requests.post(url, json=body, headers=headers)
print(response.json())

9 Stop a Container

→ Request

POST /api/v1/dev/deployment/container/event/stop
Place the following parameters in the request body:
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_container_uuid | String | Yes | UUID of the container to stop |
| decrease_one_replica_num | Boolean | No | ReplicaSet only: if true, decreases the desired replica count by 1 after stopping (default: false) |
| no_cache | Boolean | No | If true, prevents the stopped container from being cached for reuse (default: false) |
| cmd_before_shutdown | String | No | Command to run before shutdown (5-second timeout; force-killed on timeout). Overrides any cmd_before_shutdown set at deployment creation; in rare cases both may run |
Request Body
{
     "deployment_container_uuid": "da497aea1eb8343-f94411a60c-a394fb30",
     "decrease_one_replica_num": false,
     "no_cache": false,
     "cmd_before_shutdown": "sleep 5"
}

← Response

Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | null | No data returned |
Response (on success)
{
    "code": "Success",
    "msg": "",
    "data": null
}
Python
import requests
headers = {
    "Authorization": "your token",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/deployment/container/event/stop"
body = {
    "deployment_container_uuid": "da497aea1eb8343-f94411a60c-a394fb30",
    "decrease_one_replica_num": False,
    "no_cache": False,
    "cmd_before_shutdown": "sleep 5"
}

response = requests.post(url, json=body, headers=headers)
print(response.content.decode())

10 Host Blacklist

→ Request

To avoid bad machines: if a container experiences unknown issues (e.g., slow startup, crashes), you can blacklist its host to prevent future scheduling on that machine.
POST /api/v1/dev/deployment/blacklist
| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_container_uuid | String | Yes | UUID of the problematic container |
| expire_in_minutes | Int | No | Blacklist duration in minutes. Default: 1440 (24 hours). Max: 43200 (30 days) |
| comment | String | No | Optional note (e.g., "Slow boot") |
Request Body
{
  "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
  "expire_in_minutes": 1440,
  "comment": "Slow startup — avoid this host"
}

← Response

Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | null | No data returned |
Response (on success)
{
    "code": "Success",
    "msg": "",
    "data": null
}
Python
import requests

headers = {"Authorization": "your token", "Content-Type": "application/json"}
url = "https://api.gpuhub.com/api/v1/dev/deployment/blacklist"

body = {
    "deployment_container_uuid": "da497aea1eb8343-f94411a60c-1502e6e2",
    "expire_in_minutes": 1440,
    "comment": "Slow startup — avoid this host"
}

response = requests.post(url, json=body, headers=headers)
print(response.json())

11 Get Active Blacklist

→ Request

GET /api/v1/dev/deployment/blacklist
No body required.

← Response

Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | List | List of active blacklist entries |

Blacklist Entry Object

| Field | Type | Description |
|---|---|---|
| created_at | String | Timestamp when the blacklist entry was first created |
| updated_at | String | Timestamp of the last update (updates when the expiry time is extended) |
| data_center | String | Region / data center code (e.g., sgp1) |
| expired_time | String | Timestamp when the entry will automatically expire |
| machine_id | String | Physical host ID that is blacklisted |
| msg | String | Optional comment provided when creating the entry |
Success Response
{
  "code": "Success",
  "msg": "",
  "data": [
    {
      "created_at": "2025-03-25T17:42:55+08:00",
      "updated_at": "2025-03-25T17:48:11+08:00",
      "data_center": "sgp1",
      "machine_id": "24fb4ca36a",
      "expired_time": "2025-03-26T17:48:11+08:00",
      "msg": "Slow startup — avoid this host"
    }
  ]
}
Python
import requests

headers = {"Authorization": "your token"}
url = "https://api.gpuhub.com/api/v1/dev/deployment/blacklist"

response = requests.get(url, headers=headers)
print(response.json())

12 Real-time GPU Stock by Region

→ Request

POST /api/v1/dev/machine/region/gpu_stock

Use this endpoint before creating deployments to check real-time availability and avoid scheduling failures. Stock is calculated assuming 1 GPU per container: even if 2 GPUs are idle in a region, they may sit on different machines, so a container requiring 2 GPUs might still fail to schedule.

Request Body Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| region_sign | String | Yes | Region code (see Appendix) |
| cuda_v_from | Int | No | Minimum supported CUDA version (e.g., 118 = CUDA 11.8) |
| cuda_v_to | Int | No | Maximum supported CUDA version |
| gpu_name_set | List<String> | No | Filter by specific GPU models |
| memory_size_from | Int | No | Minimum memory size in GB |
| memory_size_to | Int | No | Maximum memory size in GB |
| cpu_num_from | Int | No | Minimum CPU cores |
| cpu_num_to | Int | No | Maximum CPU cores |
| price_from | Int | No | Minimum price (USD × 1000, e.g., $0.10/hr → 100) |
| price_to | Int | No | Maximum price (USD × 1000) |
Request Body
{
  "region_sign": "sgp1",
  "cuda_v_from": 118,
  "cuda_v_to": 128
}

← Response

Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | List | List of GPU inventory objects |

GPU Inventory Object

| Field | Type | Description |
|---|---|---|
| {GPU_MODEL} | Object | Key is the GPU model name |
| idle_gpu_num | Int | Number of currently idle GPUs |
| total_gpu_num | Int | Total number of GPUs |
Response (on success)
{
  "code": "Success",
  "msg": "",
  "data": [
    {
      "RTX 5090": {
        "idle_gpu_num": 312,
        "total_gpu_num": 2850
      }
    },
    {
      "RTX Pro 6000": {
        "idle_gpu_num": 48,
        "total_gpu_num": 640
      }
    }
  ]
}
Python
import requests

headers = {
    "Authorization": "your_token_here",
    "Content-Type": "application/json"
}

url = "https://api.gpuhub.com/api/v1/dev/machine/region/gpu_stock"

body = {
    "region_sign": "sgp1",
    "cuda_v_from": 118,
    "cuda_v_to": 128
}

response = requests.post(url, json=body, headers=headers)
print(response.json())
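To act on the stock data before calling Create Deployment, the response can be filtered client-side; a small sketch (the helper name is ours, not part of the API):

```python
def models_with_idle_gpus(stock_data, min_idle=1):
    """Given the `data` list from the gpu_stock response, return the GPU
    model names that currently have at least `min_idle` idle GPUs."""
    available = []
    for entry in stock_data:  # each entry: {"<GPU model>": {idle/total counts}}
        for model, counts in entry.items():
            if counts["idle_gpu_num"] >= min_idle:
                available.append(model)
    return available
```

With the sample response above, models_with_idle_gpus(data, min_idle=100) would keep only "RTX 5090"; the result can feed directly into gpu_name_set when creating a deployment.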

13 Check Duration Package Balance

→ Request

GET /api/v1/dev/deployment/ddp/overview
Request Parameters (Query String)

| Parameter | Type | Required | Description |
|---|---|---|---|
| deployment_uuid | String | Yes | UUID of the deployment |

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | String | Response status. Success on success |
| msg | String | Error message. Empty when successful |
| data | List | List of prepaid package objects |

Prepaid Package Object

| Field | Type | Description |
|---|---|---|
| gpu_type | String | GPU model (e.g., RTX 5090) |
| total | Int | Total seconds across all prepaid packages |
| balance | Int | Seconds currently remaining (unused) |
| dc_list | String | Regions covered by the package (comma-separated if multiple, e.g., sgp1,us-west1) |

← Response

Response (on success)
{
    "code": "Success",
    "data": [
        {
            "gpu_type": "RTX 5090",
            "total": 86400,
            "balance": 83829,
            "dc_list": "sgp1,us-west1"
        }
    ],
    "msg": ""
}
Python
import requests

headers = {"Authorization": "your_token_here"}
url = "https://api.gpuhub.com/api/v1/dev/deployment/ddp/overview"
params = {"deployment_uuid": "your_deployment_uuid_here"}

response = requests.get(url, params=params, headers=headers)
print(response.json())
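Since total and balance are expressed in seconds, a quick conversion to hours (the helper name is ours) makes the numbers easier to read:

```python
def remaining_hours(packages):
    """Map gpu_type -> remaining hours, given the `data` list from
    /api/v1/dev/deployment/ddp/overview (balance is in seconds)."""
    return {p["gpu_type"]: round(p["balance"] / 3600, 2) for p in packages}
```

For the sample response above, a balance of 83829 seconds works out to about 23.29 hours.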

14 Appendix

region_sign & dc_list

Values for dc_list (preferred) or the deprecated region_sign parameter when creating a deployment. After a container starts, the current region is also available inside the container via the AutoDLDataCenter environment variable.

| Region Name | dc_list Value | Old region_sign |
|---|---|---|
| Singapore (Primary) | sgp1 | sgp1 |

Public Image UUID

| Framework | Image UUID | Description |
|---|---|---|
| PyTorch | base-image-12be412037 | CUDA 11.1 · torch 1.9.0 · Ubuntu 18.04 |
| PyTorch | base-image-u9r24vthlk | CUDA 11.3 · torch 1.10.0 · Ubuntu 20.04 |
| PyTorch | base-image-l374uiucui | CUDA 11.3 · torch 1.11.0 · Ubuntu 20.04 |
| PyTorch | base-image-l2t43iu6uk | CUDA 11.8 · torch 2.0.0 · Ubuntu 20.04 (recommended) |
| TensorFlow | base-image-0gxqmciyth | CUDA 11.2 · tf 2.5.0 · Ubuntu 18.04 |
| TensorFlow | base-image-uxeklgirir | CUDA 11.2 · tf 2.9.0 · Ubuntu 20.04 |
| TensorFlow | base-image-4bpg0tt88l | CUDA 11.4 · tf 1.15.5 · Ubuntu 20.04 |
| Miniconda | base-image-mbr2n4urrc | CUDA 11.6 · Ubuntu 20.04 |
| Miniconda | base-image-qkkhitpik5 | CUDA 10.2 · Ubuntu 18.04 |
| Miniconda | base-image-h041hn36yt | CUDA 11.1 · Ubuntu 18.04 |
| Miniconda | base-image-7bn8iqhkb5 | CUDA 11.3 · Ubuntu 20.04 |
| Miniconda | base-image-k0vep6kyq8 | CUDA 9.0 · Ubuntu 16.04 |
| TensorRT | base-image-l2843iu23k | CUDA 11.8 · TensorRT 8.5.1 · Ubuntu 20.04 |
| TensorRT | base-image-l2t43iu6uk | CUDA 11.8 · torch 2.0.0 + TensorRT · Ubuntu 20.04 |

More images are added regularly; contact support for the latest list or request a custom base image.

CUDA Version values

| CUDA Version | Value for cuda_v_from / cuda_v_to |
|---|---|
| 11.8 | 118 |
| 12.0 | 120 |
| 12.1 | 121 |
| 12.2 | 122 |
If your framework needs CUDA 11.5 (or any unlisted version), choose the lowest available version ≥ your requirement (e.g., 11.8 → 118). Higher drivers are backward-compatible, but picking a version that’s too high reduces available GPUs. Always select the smallest compatible value to maximize scheduling success.
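The mapping is simply major × 10 + minor; a tiny converter (our naming, assuming single-digit minor versions as in the table) saves looking it up:

```python
def cuda_version_value(version: str) -> int:
    """Convert a CUDA version string such as "11.8" into the integer the
    API expects (118). Assumes a single-digit minor version, as in the
    table of supported values."""
    major, minor = version.split(".")
    return int(major) * 10 + int(minor)
```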

Container Environment Variables

| Variable Name | Description |
|---|---|
| AutoDLContainerUUID | Unique ID of the current container |
| AutoDLDeploymentUUID | UUID of the parent deployment |
| AutoDLDataCenter | Current region / data center code |
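Inside a container these can be read with standard environment lookups; a minimal Python sketch (the empty-string defaults apply only when running outside a deployment container):

```python
import os

# Set by the platform inside deployment containers; absent elsewhere,
# hence the empty-string defaults.
container_uuid = os.environ.get("AutoDLContainerUUID", "")
deployment_uuid = os.environ.get("AutoDLDeploymentUUID", "")
data_center = os.environ.get("AutoDLDataCenter", "")

print(f"container={container_uuid} deployment={deployment_uuid} dc={data_center}")
```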