API Documentation
CostPilot is an OpenAI-compatible API proxy that intelligently routes your requests to the cheapest model capable of handling them.
Change one line of code, save 40-70% on AI costs
Drop-in replacement for the OpenAI SDK — just swap your base URL.
Quick Start
Create an API key
Go to the Dashboard and generate a new key.
Replace your base URL
Point your OpenAI client to CostPilot — that's it.
Python
# Python
from openai import OpenAI

client = OpenAI(
    base_url="https://costpilot.nweva.com/api/v1",
    api_key="cp_your_key_here",
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
TypeScript
// TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://costpilot.nweva.com/api/v1",
  apiKey: "cp_your_key_here",
});
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
cURL
curl -X POST https://costpilot.nweva.com/api/v1/chat/completions \
  -H "Authorization: Bearer cp_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
API Reference
/api/v1/chat/completions
Fully compatible with the OpenAI Chat Completions API. Streaming (SSE) is supported.
Headers
Authorization: Bearer cp_xxx (your API key)
X-CostPilot-Force: true (optional; bypasses smart routing)
Body
Standard OpenAI format: model, messages, stream, temperature, etc.
Response
Standard OpenAI format. When stream: true, returns Server-Sent Events.
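As a rough sketch of the headers and streaming behaviour described above, the following builds a chat request with only the Python standard library and pulls text out of Server-Sent Events lines. The helper names (`build_chat_request`, `sse_text`, `stream_chat`) are illustrative and not part of CostPilot. The `data: {...}` and `data: [DONE]` framing is an assumption based on OpenAI's streaming format, which this endpoint is described as compatible with.

```python
import json
import urllib.request

BASE_URL = "https://costpilot.nweva.com/api/v1"


def build_chat_request(api_key, model, messages, stream=False, force=False):
    """Build (but do not send) a POST request to /chat/completions."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if force:
        # Optional header from the docs: bypass smart routing.
        headers["X-CostPilot-Force"] = "true"
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def sse_text(line):
    """Return the text delta carried by one SSE line, or None.

    Assumes OpenAI-style framing: 'data: <json chunk>' lines,
    terminated by 'data: [DONE]'.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # comments, blank separators, other SSE fields
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    choices = json.loads(payload).get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def stream_chat(request):
    """Send the request and yield text pieces as they arrive (network call)."""
    with urllib.request.urlopen(request) as resp:
        for raw in resp:  # the response object iterates line by line
            text = sse_text(raw.decode("utf-8"))
            if text:
                yield text
```

To print a streamed reply, one could pass the result of `build_chat_request(..., stream=True)` to `stream_chat` and print each yielded piece as it arrives.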
/api/v1/models
Returns the list of supported models.
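A minimal sketch of querying the model list with the standard library follows. Several details are assumptions rather than documented behaviour: that the endpoint is called with GET, that it expects the same `Authorization` header as chat completions, and that the response uses OpenAI's `{"data": [{"id": ...}]}` list shape. The function names are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://costpilot.nweva.com/api/v1/models"


def build_models_request(api_key):
    """Build (but do not send) a request for the model list."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",  # assumption: the docs do not state the method
    )


def model_ids(payload):
    """Extract model ids, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [item["id"] for item in payload.get("data", [])]


def list_models(api_key):
    """Send the request and return the model ids (network call)."""
    with urllib.request.urlopen(build_models_request(api_key)) as resp:
        return model_ids(json.load(resp))
```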
Admin Endpoints
All admin endpoints require the X-Admin-Secret header.
/api/keys
Create a new API key.
{ "name": "My App", "plan": "free|pro|scale" }
/api/keys
List all API keys (masked).
/api/keys?key=cp_xxx
Revoke an API key.
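The key-creation call might be prepared as in the sketch below, which validates the plan before building the request. The docs give only the path and the body shape, so the host, the POST method, and the helper name `build_create_key_request` are assumptions made for illustration.

```python
import json
import urllib.request

# Assumed host: the docs list only the /api/keys path.
KEYS_URL = "https://costpilot.nweva.com/api/keys"
VALID_PLANS = ("free", "pro", "scale")


def build_create_key_request(admin_secret, name, plan="free"):
    """Build (but do not send) a request that creates an API key."""
    if plan not in VALID_PLANS:
        raise ValueError(f"plan must be one of {VALID_PLANS}, got {plan!r}")
    body = json.dumps({"name": name, "plan": plan}).encode("utf-8")
    return urllib.request.Request(
        KEYS_URL,
        data=body,
        headers={
            "X-Admin-Secret": admin_secret,  # required by all admin endpoints
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: the docs do not state the method
    )
```

Rejecting an unknown plan locally avoids a round trip for a request that the server would presumably refuse.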
Routing Logic
CostPilot classifies each request by complexity and routes it to the most cost-effective model:
Short prompts, factual Q&A → gpt-4o-mini
Summaries, analysis, simple code → gpt-4o
Reasoning, long creative, complex code → Requested model
Force bypass: Send the header X-CostPilot-Force: true to skip classification and use the exact model you specified.
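The routing table and the force bypass can be pictured with the small sketch below. It is not CostPilot's implementation: the classifier itself is not documented, so the function starts from an already assigned tier, and the tier names `simple`, `medium`, and `complex` are hypothetical labels for the three rows above. How a requested model that is already cheaper than the routed one is handled is not described, so the sketch ignores that case.

```python
# Hypothetical tier labels for the three rows of the routing table.
TIER_MODELS = {
    "simple": "gpt-4o-mini",  # short prompts, factual Q&A
    "medium": "gpt-4o",       # summaries, analysis, simple code
    "complex": None,          # reasoning, long creative, complex code
}


def choose_model(tier, requested_model, force=False):
    """Pick the model to call for a request already classified into a tier."""
    if force:
        # X-CostPilot-Force: true skips classification entirely.
        return requested_model
    routed = TIER_MODELS[tier]
    # None means "keep whatever the caller asked for".
    return routed if routed is not None else requested_model
```

For example, `choose_model("simple", "gpt-4-turbo")` selects `gpt-4o-mini`, while the same call with `force=True` keeps `gpt-4-turbo`.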
Pricing
| Plan | Price | Requests |
|---|---|---|
| Free | $0 | 1,000 req/month |
| Pro | $29/mo | 50,000 req/month |
| Scale | $99/mo | 500,000 req/month |
Model Pricing
| Model | Input $/1M tokens | Output $/1M tokens |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| gpt-3.5-turbo | $0.50 | $1.50 |
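To see how these per-token prices turn into per-request costs, the sketch below computes the cost of a single request and the fraction saved when it is routed to a cheaper model. The prices are copied from the table above; the function names and the token counts in the example are illustrative.

```python
# (input, output) price in USD per 1M tokens, from the Model Pricing table.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-3.5-turbo": (0.50, 1.50),
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request with the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def savings_fraction(requested, routed, input_tokens, output_tokens):
    """Fraction of cost saved by serving the request with `routed` instead."""
    base = request_cost(requested, input_tokens, output_tokens)
    return 1 - request_cost(routed, input_tokens, output_tokens) / base
```

With 1,000 input and 500 output tokens, a gpt-4o request costs about $0.0075 and the same request on gpt-4o-mini about $0.00045, a saving of roughly 94% for that one request. Overall savings depend on how many requests are actually routed to a cheaper model.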