
RunPod REST API

GPU cloud computing for AI and machine learning workloads

RunPod provides on-demand GPU cloud infrastructure optimized for AI, machine learning, and deep learning workloads. Developers use RunPod's REST API to programmatically deploy serverless endpoints, manage GPU instances, train models, and run inference at scale with flexible pricing and instant provisioning.

Base URL: https://api.runpod.io/v2

API Endpoints

Method  Endpoint                  Description
GET     /pods                     List all active GPU pods in your account
POST    /pods                     Create a new GPU pod instance with specified configuration
GET     /pods/{podId}             Get detailed information about a specific pod
DELETE  /pods/{podId}             Terminate a running GPU pod instance
POST    /pods/{podId}/start       Start a stopped GPU pod instance
POST    /pods/{podId}/stop        Stop a running GPU pod instance
GET     /endpoints                List all serverless endpoints in your account
POST    /endpoints                Create a new serverless endpoint for inference
POST    /run/{endpointId}         Queue an asynchronous inference request on a serverless endpoint
POST    /runsync/{endpointId}     Execute a synchronous inference request with immediate response
POST    /run/{endpointId}/health  Check the health status of a serverless endpoint
GET     /status/{requestId}       Get the status and results of an asynchronous inference request
GET     /gpus                     List available GPU types and their specifications
GET     /user                     Get current user account information and credits
GET     /templates                List available container templates for pod deployment
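Every endpoint in the table takes the same Bearer-token authentication against the base URL above. As a minimal sketch (the `build_request` helper is illustrative, not part of any RunPod SDK), a call can be assembled like this:

```python
import json

BASE_URL = "https://api.runpod.io/v2"  # base URL listed above

def build_request(path, api_key, payload=None):
    """Assemble URL, headers, and JSON body for a RunPod REST call.

    Illustrative helper (not an official SDK function): every endpoint
    in the table above uses the same Bearer-token headers.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps(payload) if payload is not None else None
    return f"{BASE_URL}{path}", headers, body

# Example: list all active pods (GET /pods)
url, headers, _ = build_request("/pods", "YOUR_API_KEY")
```

Pass the resulting URL, headers, and body to any HTTP client; for POST endpoints, supply the JSON payload as the third argument.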

Code Examples

curl -X POST https://api.runpod.io/v2/run/YOUR_ENDPOINT_ID \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "input": {
      "prompt": "A beautiful sunset over mountains",
      "num_inference_steps": 50
    }
  }'
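The request above targets the asynchronous /run/{endpointId} endpoint, which returns a request id immediately; the result is then retrieved by polling GET /status/{requestId}. A sketch of that polling loop, with the HTTP call injected as a callable so only the flow is shown (the terminal state names follow RunPod's documented job lifecycle, but confirm them against the current API docs):

```python
import time

# Terminal job states; names assume RunPod's documented job lifecycle.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}

def wait_for_result(fetch_status, request_id, interval=2.0, max_polls=150):
    """Poll GET /status/{requestId} until the job reaches a terminal state.

    fetch_status is any callable mapping a request id to the decoded
    JSON status payload; injecting it keeps the HTTP layer swappable.
    """
    for _ in range(max_polls):
        status = fetch_status(request_id)
        if status.get("status") in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {request_id} did not finish in time")
```

Feed it the id field from the /run response, and read the job's output from the returned status payload once it completes.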

Connect RunPod to AI

Deploy a RunPod MCP server on IOX Cloud and connect it to Claude, ChatGPT, Cursor, or any AI client. Your AI assistant gets direct access to RunPod through these tools:

deploy_gpu_pod: Deploy a GPU pod instance with specified hardware configuration and container image for custom workloads
run_inference: Execute inference requests on serverless endpoints for AI models like Stable Diffusion, LLMs, or custom models
manage_endpoints: Create, update, and manage serverless endpoints with automatic scaling and load balancing
monitor_jobs: Track status and retrieve results from asynchronous GPU jobs and inference requests
check_gpu_availability: Query available GPU types, pricing, and real-time availability across different regions

Deploy in 60 seconds

Describe what you need, AI generates the code, and IOX deploys it globally.

Deploy RunPod MCP Server →
