VedasLab AI Gateway
API Documentation v3.0
Use our unified API to integrate state-of-the-art AI models — GPT-5, Claude 4.5, Gemini 3 Pro, and more — into your applications through a single endpoint. Switch providers by changing only the model parameter.
OpenAI-Compatible API
Our API follows the OpenAI Chat Completions format. If your app already works with OpenAI, just change the base URL and API key — no other code changes needed.
Authentication
All API requests require authentication. Send your API key via the X-My-API-Key header.
X-My-API-Key: your_api_key_here
Keep Your Key Secret
Never expose your API key in client-side code or public repositories. Use environment variables or backend proxies in production.
Get your API key from the User Dashboard.
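Following the advice above, one common pattern is to load the key from an environment variable at startup. A minimal Python sketch — the variable name `VEDASLAB_API_KEY` is illustrative, not mandated by the API:

```python
import os

def load_api_key(env_var: str = "VEDASLAB_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

This keeps the key out of source control; in production the same idea extends naturally to a secrets manager or backend proxy.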
Base URL
All API requests should be sent to:
https://api.vedaslab.in/public/api.php?path={endpoint}
Chat Completions
?path=chat/completions
GitHub API Proxy (under development)
?path=user/repos
Quick Start
Make your first AI request in 30 seconds. Copy and paste this into your terminal:
curl -X POST "https://api.vedaslab.in/public/api.php?path=chat/completions" \
-H "Content-Type: application/json" \
-H "X-My-API-Key: YOUR_API_KEY" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "Say hello!"}
]
}'
That's It!
Replace YOUR_API_KEY with your actual key and you'll get a JSON response from GPT-4o. Change the model parameter to use any other available model.
/chat/completions
Generate an AI response for a conversation. Supports all available models.
Request Headers
X-My-API-Key
required
Your API key for authentication.
Content-Type
required
Must be application/json
Body Parameters
model
required
string
Model ID to use. Example: gpt-4o, claude-sonnet-4, gpt-5
messages
required
array
Conversation messages. Each message has:
role
string
system, user, or assistant
content
string | array
Message text, or array for vision (text + image_url objects)
stream
optional
boolean
Set to true for Server-Sent Events streaming. Default: false
temperature
optional
number
Sampling temperature (0 to 2). Lower = focused, higher = creative. Default: 1
max_tokens
optional
integer
Maximum tokens in the response.
curl -X POST "https://api.vedaslab.in/public/api.php?path=chat/completions" \
-H "Content-Type: application/json" \
-H "X-My-API-Key: YOUR_API_KEY" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
}'
Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1738368000,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 12,
"total_tokens": 37
}
}
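A small helper for pulling the reply text and token usage out of a response shaped like the JSON above. This is a sketch against the documented field names, not an official SDK:

```python
def parse_completion(response: dict) -> dict:
    """Extract the assistant's reply and token usage from a chat completion."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }
```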
Streaming Responses
Set "stream": true to receive responses as Server-Sent Events (SSE), delivering tokens in real-time as they are generated.
How It Works
Send a normal chat request but add "stream": true
Response arrives as text/event-stream with chunks
Each chunk is data: {json}, final chunk is data: [DONE]
const response = await fetch(
  "https://api.vedaslab.in/public/api.php?path=chat/completions",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-My-API-Key": "YOUR_API_KEY"
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{ role: "user", content: "Hello!" }],
      stream: true
    })
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // A network chunk can end mid-line, so accumulate into a buffer
  // and only process complete lines.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const data = line.slice(6); // remove "data: "
    if (data === "[DONE]") continue;
    const chunk = JSON.parse(data);
    const content = chunk.choices[0]?.delta?.content;
    if (content) process.stdout.write(content);
  }
}
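The SSE framing described above (`data: {json}` lines terminated by `data: [DONE]`) can be parsed the same way in Python. The transport is left out so the parsing logic stands alone; wire it up to any HTTP client that yields response lines:

```python
import json

def parse_sse_lines(lines):
    """Yield content tokens from 'data: {json}' SSE lines, stopping at [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:
            yield content

# Example with the framing from the docs:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print("".join(parse_sse_lines(sample)))  # → Hello!
```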
Vision / Image Analysis
Send images for analysis using vision-capable models like gpt-4o. Pass images as base64 data URLs.
curl -X POST "https://api.vedaslab.in/public/api.php?path=chat/completions" \
-H "Content-Type: application/json" \
-H "X-My-API-Key: YOUR_API_KEY" \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,/9j/4AAQ..."
}
}
]
}
]
}'
Image Formats
Supported formats: JPEG, PNG, GIF, WebP. Max size: 5MB. Pass images as base64 data URLs in the image_url.url field.
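Building the data URL itself is just base64 encoding plus a MIME-type prefix. A sketch — the file path in the usage comment is illustrative:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL for the image_url.url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Usage:
# with open("photo.jpg", "rb") as f:
#     url = to_data_url(f.read())
```

Remember the 5 MB cap applies to the encoded payload you send, so check sizes before encoding large files.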
Available Models
Models available depend on your API key's access tier. Toggle below to see what's available for each tier. This list is auto-synced from the server — new models appear instantly when enabled.
Free Tier Models
| Model ID | Short Code | Provider | Context | Description |
|---|---|---|---|---|
| *(model list loads dynamically from the server)* |||||
Free tier: 20 requests/hour, 100 requests/day.
Free Tier
Access to 5 models. 100 req/day limit. Perfect for testing & personal use.
Premium Tier
Access to 11+ premium models. 5,000 req/day. Priority latency & support.
Both (All Access)
Access to all models. 10,000 req/day. Highest limits & full model access.
Reasoning / Thinking Models
Some models (e.g. claude-sonnet-4.5, gemini-2.5-pro) return an extended response that includes an internal thinking step before the final answer. Your code must handle both formats.
Standard Response
{
"choices": [{
"message": {
"role": "assistant",
"content": "The answer is 42."
},
"finish_reason": "stop"
}]
}
Thinking Model Response
{
"choices": [{
"message": {
"role": "assistant",
"content": [
{
"type": "thinking",
"thinking": "Let me reason through this..."
},
{
"type": "text",
"text": "The answer is 42."
}
]
},
"finish_reason": "stop"
}]
}
Key Difference
When content is a string, it's a standard response. When it's an array, it's a thinking model. Always check typeof content before parsing.
Handling Both Formats
function extractContent(response) {
const message = response.choices[0].message;
const content = message.content;
// Standard response: content is a string
if (typeof content === 'string') {
return { text: content, thinking: null };
}
// Thinking model: content is an array of blocks
if (Array.isArray(content)) {
let text = '';
let thinking = '';
for (const block of content) {
if (block.type === 'thinking') {
thinking += block.thinking;
} else if (block.type === 'text') {
text += block.text;
}
}
return { text, thinking: thinking || null };
}
return { text: String(content), thinking: null };
}
// Usage
const result = extractContent(apiResponse);
console.log(result.text); // The actual answer
console.log(result.thinking); // Internal reasoning (or null)
def extract_content(response):
message = response["choices"][0]["message"]
content = message["content"]
# Standard response: content is a string
if isinstance(content, str):
return {"text": content, "thinking": None}
# Thinking model: content is a list of blocks
if isinstance(content, list):
text_parts, thinking_parts = [], []
for block in content:
if block.get("type") == "thinking":
thinking_parts.append(block["thinking"])
elif block.get("type") == "text":
text_parts.append(block["text"])
return {
"text": "".join(text_parts),
"thinking": "".join(thinking_parts) or None
}
return {"text": str(content), "thinking": None}
# Usage
result = extract_content(api_response)
print(result["text"]) # The actual answer
print(result["thinking"]) # Internal reasoning (or None)
Streaming with Thinking Models
When streaming, thinking models send delta.content chunks that may include type: "thinking" blocks before type: "text" blocks. Buffer and parse the type field to separate thinking from final output.
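One way to sketch that separation, assuming each streamed delta carries the same block shapes shown above (`type: "thinking"` / `type: "text"`) — the exact chunk layout may vary by model, so treat this as a starting point:

```python
def split_stream_blocks(delta_blocks):
    """Separate streamed content blocks into thinking text and answer text."""
    thinking, text = [], []
    for block in delta_blocks:
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text.append(block.get("text", ""))
    return "".join(thinking), "".join(text)
```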
Rate Limits
Rate limits are applied per API key based on your subscription tier. Every response includes headers showing your current usage.
| Tier | Hourly Limit | Daily Limit | Max IPs |
|---|---|---|---|
| Free | 20 requests | 100 requests | 2 |
| Premium | 500 requests | 5,000 requests | 5 |
| Both | 1,000 requests | 10,000 requests | 10 |
Response Headers
X-RateLimit-Limit
Your hourly request limit
X-RateLimit-Remaining
Requests remaining in current hour
X-RateLimit-Reset
Unix timestamp when the limit resets
X-RateLimit-Daily-Limit
Your daily request limit
X-RateLimit-Daily-Remaining
Requests remaining today
X-RateLimit-Warning
Warning message when approaching limit (80%+)
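These headers can drive a simple client-side backoff. A sketch that computes how long to pause from `X-RateLimit-Remaining` and `X-RateLimit-Reset` (header names follow the table above; the clock handling is deliberately simplified):

```python
def seconds_until_reset(headers: dict, now: int) -> int:
    """Return seconds to pause when the hourly quota is exhausted, else 0."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0  # quota left, no need to wait
    reset = int(headers.get("X-RateLimit-Reset", now))
    return max(0, reset - now)
```

In a real client you would call this with `now=int(time.time())` after each response and sleep for the returned number of seconds before retrying.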
Error Codes
The API uses standard HTTP status codes. Errors return a JSON body with details.
{
"error": "Access Denied: Your API key is restricted to Free models only."
}
{
"error": "Model Temporarily Unavailable",
"message": "The model 'gpt-5' has been temporarily disabled by the administrator. Please try a different model.",
"code": "MODEL_DISABLED",
"available_models": "GET /public/models.php for available models"
}
| Status | Error | What To Do |
|---|---|---|
| 401 | Missing API Key | Add X-My-API-Key header to your request |
| 401 | Invalid API Key | Check your key is correct and active |
| 403 | Access Denied (Model) | You're requesting a model outside your tier. Upgrade or use a different model |
| 403 | Model Disabled (MODEL_DISABLED) | The admin has temporarily disabled this model. Check the code field for "MODEL_DISABLED" and use /models.php to list available alternatives |
| 403 | IP Not Whitelisted | Your IP is not in the whitelist for this key. Contact admin |
| 429 | Rate Limit Exceeded | Wait for X-RateLimit-Reset timestamp or reduce request frequency |
| 500 | Server Error | Server configuration issue. Contact support |
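The table above can be condensed into a small client-side policy. A sketch — the action names are illustrative, and the `code` field check follows the MODEL_DISABLED example shown earlier:

```python
def next_action(status: int, body: dict) -> str:
    """Map an error response to a client action, per the error table."""
    if status == 401:
        return "fix_api_key"          # missing or invalid key
    if status == 403 and body.get("code") == "MODEL_DISABLED":
        return "fallback_model"       # e.g. retry with gpt-4o
    if status == 403:
        return "upgrade_or_switch_model"
    if status == 429:
        return "wait_and_retry"       # respect the rate-limit headers
    if status >= 500:
        return "contact_support"
    return "ok"
```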
Integration Tips
Our API is OpenAI-compatible, so you can use existing SDKs with a base URL override.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.vedaslab.in/public/api.php?path=chat/completions",
default_headers={
"X-My-API-Key": "YOUR_API_KEY"
}
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "YOUR_API_KEY",
baseURL: "https://api.vedaslab.in/public/api.php?path=chat/completions",
defaultHeaders: {
"X-My-API-Key": "YOUR_API_KEY"
}
});
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "user", content: "Hello!" }
]
});
console.log(response.choices[0].message.content);
Works With Any OpenAI-Compatible Client
LangChain, LlamaIndex, Vercel AI SDK, and other frameworks that support custom OpenAI base URLs will work with VedasLab AI Gateway. Just point the base URL at the gateway and pass the API key. Note that OpenAI-style SDKs append /chat/completions to the base URL themselves; if requests fail, call the endpoint directly as shown in the curl examples.
Agent Integration Guide
Give this prompt to your AI agent (Copilot, Cursor, Claude, etc.) so it knows how to use the VedasLab API out of the box. Copy it and paste it into your agent's system instructions or chat.
You have access to the VedasLab AI Gateway API. Use it to make AI model requests.
## API Details
- **Base URL:** `https://api.vedaslab.in/public/api.php?path=chat/completions`
- **Method:** POST
- **Auth Header:** `X-My-API-Key: YOUR_API_KEY`
- **Content-Type:** `application/json`
## Request Format (OpenAI-compatible)
```json
{
"model": "MODEL_ID",
"messages": [
{"role": "system", "content": "System instruction"},
{"role": "user", "content": "User message"}
],
"stream": false,
"temperature": 0.7,
"max_tokens": 4096
}
```
## Available Models
Fetch the live list: GET `https://api.vedaslab.in/public/models.php?format=flat`
Common models: gpt-4o (free), gpt-4.1 (free), gpt-5 (premium), claude-sonnet-4 (premium), gemini-2.5-pro (premium).
Short codes: f1-f5 for free, p1-p11 for premium (e.g. "p4" = claude-sonnet-4.5).
## Parsing Responses
Some models return thinking/reasoning blocks. Always handle both formats:
**Standard response** → `choices[0].message.content` is a string.
**Thinking model** → `choices[0].message.content` is an array:
```json
[
{"type": "thinking", "thinking": "Internal reasoning..."},
{"type": "text", "text": "Final answer"}
]
```
**Parser logic:**
```
content = response.choices[0].message.content
if content is string → use directly
if content is array → find block where type="text", use its .text field
```
## Error Handling
- `401` → Invalid/missing API key
- `403 + code:"MODEL_DISABLED"` → Model disabled by admin, pick another model
- `403` → Model not in your tier (free/premium)
- `429` → Rate limited, wait for `Retry-After` header
- `500` → Server error
## Streaming (SSE)
Set `"stream": true`. Read `text/event-stream` chunks:
- Each line: `data: {json}` with `choices[0].delta.content`
- Final line: `data: [DONE]`
## Vision
Pass images as base64 in the messages array:
```json
{"role": "user", "content": [
{"type": "text", "text": "Describe this image"},
{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
]}
```
## Rules
1. Always include both `X-My-API-Key` and `Content-Type: application/json` headers.
2. Check if `content` is a string or array before extracting text.
3. If a model returns 403 with `MODEL_DISABLED`, automatically fall back to gpt-4o.
4. Respect rate limit headers. If 429, wait `Retry-After` seconds.
5. For the live models list, call GET /public/models.php?format=flat — never hardcode model lists.
How to Use This
- Click Copy Prompt above
- Open your AI agent's settings or system prompt
- Paste the prompt
- Replace YOUR_API_KEY with your actual key
- Your agent can now call the VedasLab API directly
Compatible Agents
- GitHub Copilot (custom instructions)
- Cursor AI (rules / system prompt)
- Claude Projects (knowledge base)
- ChatGPT Custom GPTs (instructions)
- LangChain / AutoGen agents
- Any OpenAI-compatible agent framework
Security Reminder
Only paste your API key in private agent configurations. Never share the prompt with your real key in public repositories, screenshots, or shared documents.
VedasLab AI Gateway © 2026 — vedaslab.in