API Reference
Submit ONNX inference requests and retrieve results through a simple, synchronous REST API. Authenticate with your API key and start running inference in minutes.
Base URL: `https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net`
Content-Type: `application/json`
Every request must include your API key in the X-Api-Key header.
Your API key is generated when you register and can be found in the task creation dialog when selecting "Fill by API" mode.
```
// Include in every request
X-Api-Key: pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
⚠ Security
Keep your API key secret. Do not expose it in client-side code or public repositories. Rotate your key from the console if you suspect it has been compromised.
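One way to make sure the header is attached to every call is a shared `requests.Session` configured once at startup. A minimal sketch (the key shown is the placeholder from above, not a real credential):

```python
import requests

# Placeholder key for illustration; substitute the key from your console.
API_KEY = "pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# A Session attaches these headers to every request made through it,
# so no endpoint can accidentally be called unauthenticated.
session = requests.Session()
session.headers.update({
    "X-Api-Key": API_KEY,
    "Content-Type": "application/json",
})
```

Keeping the key in an environment variable rather than source code also helps avoid committing it to a repository.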
Submit an inference request to an existing task that has been configured with "Fill by API" mode. The request blocks until the inference completes (up to 180 seconds), then returns the result synchronously.
POST /api/inference/tasks/{taskId}
| Parameter | Type | Description |
|---|---|---|
| taskId | GUID | The ID of the task created with fillBindingsViaApi: true |
| Field | Type | Required | Description |
|---|---|---|---|
| bindings | array | Yes | Array of input tensor bindings to feed into the ONNX graph |
| bindings[].tensorName | string | Yes | Name of the input tensor as defined in the ONNX model |
| bindings[].payloadType | string | Yes | One of: `Json`, `Text`, `Binary` |
| bindings[].payload | string \| null | Conditional | Inline payload data. Required for `Json` and `Text` payload types |
| bindings[].fileUrl | string \| null | Conditional | URL pointing to a binary file. Required for the `Binary` payload type |
```bash
curl -X POST https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net/api/inference/tasks/{taskId} \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "bindings": [
      {
        "tensorName": "input_ids",
        "payloadType": "Json",
        "payload": "[[101, 2054, 2003, 1996, 3007, 1997, 2605, 102]]",
        "fileUrl": null
      },
      {
        "tensorName": "attention_mask",
        "payloadType": "Json",
        "payload": "[[1, 1, 1, 1, 1, 1, 1, 1]]",
        "fileUrl": null
      }
    ]
  }'
```
```bash
curl -X POST https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net/api/inference/tasks/{taskId} \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "bindings": [
      {
        "tensorName": "image",
        "payloadType": "Binary",
        "payload": null,
        "fileUrl": "https://your-storage.blob.core.windows.net/inputs/image.npy"
      }
    ]
  }'
```
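Note that `payload` carries tensor data as a JSON *string*, not as a nested JSON value (compare the quoted `"[[101, ...]]"` in the example above). When building a binding programmatically, serialize the nested list first. A sketch in Python:

```python
import json

# Token IDs for one sequence, as a nested list (batch of 1).
input_ids = [[101, 2054, 2003, 1996, 3007, 1997, 2605, 102]]

# json.dumps turns the nested list into the string the API expects
# in the payload field; sending the raw list would be rejected.
binding = {
    "tensorName": "input_ids",
    "payloadType": "Json",
    "payload": json.dumps(input_ids),
    "fileUrl": None,
}
```

The same round-trip applies on the way out: a `Json` payload parses back to the original nested list.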
The endpoint blocks until inference completes or times out (180s). The response always includes:
| Field | Type | Description |
|---|---|---|
| id | GUID | Subtask identifier for this inference run |
| state | string | One of: `success`, `failed`, `pending` |
| data | object \| null | Inference results. Present when state is `success` |
| error | string \| null | Error message. Present when state is `failed` |
```json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "state": "success",
  "data": {
    "output_logits": [[0.12, -0.34, 0.98, 1.45, ...]]
  },
  "error": null
}
```
```json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "state": "failed",
  "data": null,
  "error": "Subtask failed to execute."
}
```
| Code | Meaning |
|---|---|
| 200 | Inference completed (check state for result) |
| 400 | Invalid request body or task not configured for API bindings |
| 401 | Missing or invalid API key |
| 404 | Task not found |
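Because a 200 response can still carry `state: "failed"`, callers need to handle transport-level errors (400/401/404) and inference-level failures separately. A sketch of one way to do this with `requests` (the helper name and error-handling policy are ours, not part of the API):

```python
import requests

def run_inference(base_url, task_id, api_key, bindings, timeout=190):
    """Submit bindings and separate HTTP errors from inference failures.

    The server blocks for up to 180 seconds, so the client timeout
    allows a small margin on top of that. Sketch only.
    """
    resp = requests.post(
        f"{base_url}/api/inference/tasks/{task_id}",
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        json={"bindings": bindings},
        timeout=timeout,
    )
    resp.raise_for_status()  # surfaces 400 / 401 / 404 as exceptions
    body = resp.json()
    if body["state"] == "failed":
        raise RuntimeError(f"inference failed: {body['error']}")
    return body  # state is "success" or "pending"
```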
Retrieve the result of a previously submitted inference subtask. Useful if you need to re-check a result or if the original submit request timed out while the subtask was still pending.
GET /api/inference/subtasks/{subtaskId}
| Parameter | Type | Description |
|---|---|---|
| subtaskId | GUID | The subtask id returned from the Submit Inference endpoint |
```bash
curl https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net/api/inference/subtasks/{subtaskId} \
  -H "X-Api-Key: pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
Returns the same response structure as the Submit Inference endpoint.
```json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "state": "success",
  "data": { ... },
  "error": null
}
```
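If the original submit request timed out while the subtask was still `pending`, this endpoint can be polled until the state settles. A sketch in Python; the interval and attempt count are arbitrary choices, not documented limits:

```python
import time
import requests

def wait_for_result(base_url, subtask_id, api_key, interval=5, max_attempts=36):
    """Poll the subtask endpoint until state is no longer 'pending'."""
    for _ in range(max_attempts):
        resp = requests.get(
            f"{base_url}/api/inference/subtasks/{subtask_id}",
            headers={"X-Api-Key": api_key},
        )
        resp.raise_for_status()
        body = resp.json()
        if body["state"] != "pending":
            return body  # "success" or "failed"
        time.sleep(interval)
    raise TimeoutError(f"subtask {subtask_id} still pending after polling")
```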
Complete integration examples in popular languages.
```python
import requests

API_KEY = "pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
BASE_URL = "https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net"
TASK_ID = "your-task-id-here"

# Submit inference
response = requests.post(
    f"{BASE_URL}/api/inference/tasks/{TASK_ID}",
    headers={
        "Content-Type": "application/json",
        "X-Api-Key": API_KEY,
    },
    json={
        "bindings": [
            {
                "tensorName": "input_ids",
                "payloadType": "Json",
                "payload": "[[101, 2054, 2003, 102]]",
                "fileUrl": None,
            }
        ]
    },
)

result = response.json()
print(result["state"])  # "success", "failed", or "pending"
print(result["data"])   # inference output

# Re-check result later
subtask_id = result["id"]
poll = requests.get(
    f"{BASE_URL}/api/inference/subtasks/{subtask_id}",
    headers={"X-Api-Key": API_KEY},
)
print(poll.json())
```
```javascript
const API_KEY = "pk-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
const BASE_URL = "https://infinite-gpu-backend-bvh8a7c3fdgxd7c5.canadacentral-01.azurewebsites.net";
const TASK_ID = "your-task-id-here";

// Submit inference
const response = await fetch(
  `${BASE_URL}/api/inference/tasks/${TASK_ID}`,
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Api-Key": API_KEY,
    },
    body: JSON.stringify({
      bindings: [
        {
          tensorName: "input_ids",
          payloadType: "Json",
          payload: "[[101, 2054, 2003, 102]]",
          fileUrl: null,
        },
      ],
    }),
  }
);

const result = await response.json();
console.log(result.state); // "success", "failed", or "pending"
console.log(result.data);  // inference output

// Re-check result later
const poll = await fetch(
  `${BASE_URL}/api/inference/subtasks/${result.id}`,
  { headers: { "X-Api-Key": API_KEY } }
);
console.log(await poll.json());
```