
RunPod REST API

GPU cloud computing for AI and machine learning workloads

RunPod provides on-demand GPU cloud infrastructure optimized for AI, machine learning, and deep learning workloads. Developers use RunPod's REST API to programmatically deploy serverless endpoints, manage GPU instances, train models, and run inference at scale with flexible pricing and instant provisioning.

Base URL: https://api.runpod.io/v2

API Endpoints

Method  Endpoint                  Description
GET     /pods                     List all active GPU pods in your account
POST    /pods                     Create a new GPU pod instance with specified configuration
GET     /pods/{podId}             Get detailed information about a specific pod
DELETE  /pods/{podId}             Terminate a running GPU pod instance
POST    /pods/{podId}/start       Start a stopped GPU pod instance
POST    /pods/{podId}/stop        Stop a running GPU pod instance
GET     /endpoints                List all serverless endpoints in your account
POST    /endpoints                Create a new serverless endpoint for inference
POST    /run/{endpointId}         Queue an asynchronous inference request on a serverless endpoint
POST    /runsync/{endpointId}     Execute a synchronous inference request with immediate response
POST    /run/{endpointId}/health  Check the health status of a serverless endpoint
GET     /status/{requestId}       Get the status and results of an asynchronous inference request
GET     /gpus                     List available GPU types and their specifications
GET     /user                     Get current user account information and credits
GET     /templates                List available container templates for pod deployment
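Every endpoint in the table takes the same Bearer-token authentication against the base URL above. As a minimal sketch (the `build_request` helper is illustrative, not part of any RunPod SDK), a call can be assembled like this:

```python
import json

BASE_URL = "https://api.runpod.io/v2"  # base URL listed above

def build_request(path, api_key, payload=None):
    """Assemble URL, headers, and JSON body for a RunPod REST call.

    Illustrative helper (not an official SDK function): every endpoint
    in the table above uses the same Bearer-token headers.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps(payload) if payload is not None else None
    return f"{BASE_URL}{path}", headers, body

# Example: list all active pods (GET /pods)
url, headers, _ = build_request("/pods", "YOUR_API_KEY")
```

Pass the resulting URL, headers, and body to any HTTP client; for POST endpoints, supply the JSON payload as the third argument.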

Code Examples

curl -X POST https://api.runpod.io/v2/run/YOUR_ENDPOINT_ID \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "input": {
      "prompt": "A beautiful sunset over mountains",
      "num_inference_steps": 50
    }
  }'
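The request above targets the asynchronous /run/{endpointId} endpoint, which returns a request id immediately; the result is then retrieved by polling GET /status/{requestId}. A sketch of that polling loop, with the HTTP call injected as a callable so only the flow is shown (the terminal state names follow RunPod's documented job lifecycle, but confirm them against the current API docs):

```python
import time

# Terminal job states; names assume RunPod's documented job lifecycle.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}

def wait_for_result(fetch_status, request_id, interval=2.0, max_polls=150):
    """Poll GET /status/{requestId} until the job reaches a terminal state.

    fetch_status is any callable mapping a request id to the decoded
    JSON status payload; injecting it keeps the HTTP layer swappable.
    """
    for _ in range(max_polls):
        status = fetch_status(request_id)
        if status.get("status") in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {request_id} did not finish in time")
```

Feed it the id field from the /run response, and read the job's output from the returned status payload once it completes.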

Connect RunPod to AI

Deploy a RunPod MCP server on IOX Cloud and connect it to Claude, ChatGPT, Cursor, or any AI client. Your AI assistant gets direct access to RunPod through these tools:

deploy_gpu_pod: Deploy a GPU pod instance with specified hardware configuration and container image for custom workloads
run_inference: Execute inference requests on serverless endpoints for AI models like Stable Diffusion, LLMs, or custom models
manage_endpoints: Create, update, and manage serverless endpoints with automatic scaling and load balancing
monitor_jobs: Track status and retrieve results from asynchronous GPU jobs and inference requests
check_gpu_availability: Query available GPU types, pricing, and real-time availability across different regions

Deploy in 60 seconds

Describe what you need, AI generates the code, and IOX deploys it globally.

Deploy RunPod MCP Server →
