API Documentation OpenAI-Compatible
Use https://www.shaibar.com as your base URL — drop-in replacement for OpenAI, Claude, and other providers.
1 Authentication
All API requests require a Bearer token. Get your API key from the Token Management page.
# cURL
curl https://www.shaibar.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Python (OpenAI SDK)
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://www.shaibar.com/v1/"  # IMPORTANT: keep the trailing slash
)
chat = client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": "Hello!"}]
)
print(chat.choices[0].message.content)
Note: the base URL must end with a trailing slash: https://www.shaibar.com/v1/. Without it you'll get a 404.
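Because a missing trailing slash is easy to overlook, it may help to normalize the base URL in one place and load the key from the environment rather than hard-coding it. The following is a minimal sketch; the environment variable names `SHAIBAR_API_KEY` and `SHAIBAR_BASE_URL` and the helper names are assumptions made for illustration, not part of the service.

```python
import os
from typing import Mapping

DEFAULT_BASE_URL = "https://www.shaibar.com/v1/"


def normalize_base_url(url: str) -> str:
    """Return the URL with exactly one trailing slash."""
    return url.rstrip("/") + "/"


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the API key and base URL from environment variables.

    SHAIBAR_API_KEY and SHAIBAR_BASE_URL are hypothetical names
    chosen for this sketch.
    """
    api_key = env.get("SHAIBAR_API_KEY")
    if not api_key:
        raise RuntimeError("Set SHAIBAR_API_KEY to your API key")
    base_url = normalize_base_url(env.get("SHAIBAR_BASE_URL", DEFAULT_BASE_URL))
    return {"api_key": api_key, "base_url": base_url}
```

With the OpenAI SDK this could be wired in as `client = OpenAI(**load_settings())`.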
2 Base URL
Use this as your OpenAI-compatible base URL in any SDK. Works with:
- OpenAI Python/JS SDK
- LangChain, LlamaIndex
- Any OpenAI-compatible client
- cURL / HTTP clients
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Chat completions (main endpoint) |
| /v1/completions | POST | Text completions (legacy) |
| /v1/embeddings | POST | Embeddings |
| /v1/models | GET | List available models |
| /v1/models/{model} | GET | Get model info |
3 Available Models
All prices are in USD. Chinese models are routed through direct provider channels — no markup on token costs.
| Model ID | Provider | Strengths | Est. Price |
|---|---|---|---|
| deepseek-chat | DeepSeek V3 | Best value, strong reasoning, fast | ~$0.27 / 1M tokens |
| deepseek-reasoner | DeepSeek R1 | Chain-of-thought reasoning, math, coding | ~$0.55 / 1M tokens |
| qwen-plus | Qwen 2.5 Plus | Balanced, good multilingual | ~$0.40 / 1M tokens |
| qwen-max | Qwen 2.5 Max | Highest quality, complex tasks | ~$1.20 / 1M tokens |
| minimax-text-01 | MiniMax Text-01 | Long context, code, multilingual | ~$0.35 / 1M tokens |
| glm-4-flash | GLM-4 | Fast, low latency | ~$0.10 / 1M tokens |
| moonshot-v1-128k | Moonshot V1 | 128K context window | ~$0.60 / 1M tokens |
| yi-lightning | Yi Lightning | Fast, multilingual, creative | ~$0.40 / 1M tokens |
| bailian-v2 | Bailian (Alibaba) | Open-source compatible, fast | ~$0.15 / 1M tokens |
Full model list: GET /v1/models
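As a rough illustration of calling GET /v1/models without an SDK, the sketch below uses only the Python standard library. It assumes the response follows the OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`), which this page implies but does not spell out; the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://www.shaibar.com/v1/"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for the models endpoint."""
    return urllib.request.Request(
        BASE_URL + "models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list:
    """Pull model IDs out of a response body.

    Assumes the OpenAI-style list shape; adjust if the actual
    response differs.
    """
    return sorted(item["id"] for item in payload.get("data", []))


def list_model_ids(api_key: str) -> list:
    """Fetch the model list over the network and return the sorted IDs."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return extract_model_ids(json.load(resp))
```

Calling `list_model_ids("YOUR_API_KEY")` performs the network request; the other two helpers have no side effects and can be checked offline.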
4 Chat Completions
OpenAI-compatible. Request and response formats follow the standard /v1/chat/completions interface.
# cURL example (non-streaming; set "stream": true and add -N to stream)
curl https://www.shaibar.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain quantum computing in simple terms."}
],
"stream": false,
"max_tokens": 500,
"temperature": 0.7
}'
# Python — streaming response
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://www.shaibar.com/v1/"
)
stream = client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": "Hello!"}],
stream=True
)
for chunk in stream:
    # Some chunks carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
// JavaScript / Node.js — OpenAI SDK
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://www.shaibar.com/v1/',
});
const chat = await client.chat.completions.create({
model: 'qwen-plus',
messages: [{ role: 'user', content: 'Hi' }],
});
console.log(chat.choices[0].message.content);
Request body parameters (OpenAI format):
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID from the available models list |
| messages | array | Yes | Array of {role, content} message objects |
| stream | boolean | No | Enable SSE streaming (default: false) |
| max_tokens | integer | No | Max response tokens (default: 4096) |
| temperature | float | No | Randomness 0–2 (default: 0.7) |
| top_p | float | No | Nucleus sampling (default: 1.0) |
| stop | string/array | No | Stop sequences |
| frequency_penalty | float | No | -2.0 to 2.0 (default: 0) |
| presence_penalty | float | No | -2.0 to 2.0 (default: 0) |
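One way to catch out-of-range parameters before a request is sent is to build the body through a small validating helper. The sketch below checks only the ranges stated in the table above and omits optional fields that are not set, so server-side defaults apply; `build_chat_payload` is an illustrative name, not part of any SDK.

```python
from typing import Optional, Sequence, Union


def _check_range(name: str, value: Optional[float], low: float, high: float) -> None:
    """Raise ValueError if a provided value falls outside [low, high]."""
    if value is not None and not (low <= value <= high):
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_chat_payload(
    model: str,
    messages: Sequence[dict],
    *,
    stream: bool = False,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop: Union[str, Sequence[str], None] = None,
    frequency_penalty: Optional[float] = None,
    presence_penalty: Optional[float] = None,
) -> dict:
    """Return a request body for /v1/chat/completions."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    if max_tokens is not None and max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    _check_range("temperature", temperature, 0.0, 2.0)
    _check_range("frequency_penalty", frequency_penalty, -2.0, 2.0)
    _check_range("presence_penalty", presence_penalty, -2.0, 2.0)

    payload = {"model": model, "messages": list(messages), "stream": stream}
    optional = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stop": stop,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    # Leave unset options out so the server-side defaults are used.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

The returned dict can be serialized with `json.dumps` for a raw HTTP call, or unpacked into `client.chat.completions.create(**payload)` when using the OpenAI SDK.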
5 Billing & Top-Up
Prices are denominated in USD. You pay with USDT (TRON TRC-20, Ethereum ERC-20, or BSC BEP-20). No credit card required.
To top up:
- Go to shaibar.com/deposit_web3.html
- Generate a TRC-20/ERC-20/BEP-20 deposit address
- Send USDT to that address — deposits auto-credit within ~1-3 block confirmations
- Minimum deposit: 1 USDT
Balance & usage: Check your balance and usage logs in the Dashboard. Charges are deducted from your balance according to the tokens processed, at the per-1M-token rate listed for each model.
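To get a feel for what a workload might cost, the arithmetic can be sketched as tokens divided by one million, multiplied by the model's rate. The rates below are the estimated prices from the model table on this page. Applying a single rate to the combined input and output token count is a simplifying assumption, so treat the result as a rough estimate rather than a bill.

```python
from decimal import Decimal

# Estimated USD price per 1M tokens, taken from the model table on this page.
PRICE_PER_MILLION = {
    "deepseek-chat": Decimal("0.27"),
    "deepseek-reasoner": Decimal("0.55"),
    "qwen-plus": Decimal("0.40"),
    "qwen-max": Decimal("1.20"),
    "minimax-text-01": Decimal("0.35"),
    "glm-4-flash": Decimal("0.10"),
    "moonshot-v1-128k": Decimal("0.60"),
    "yi-lightning": Decimal("0.40"),
    "bailian-v2": Decimal("0.15"),
}


def estimate_cost_usd(model: str, total_tokens: int) -> Decimal:
    """Rough cost estimate: one rate applied to all tokens (an assumption)."""
    if total_tokens < 0:
        raise ValueError("total_tokens must not be negative")
    if model not in PRICE_PER_MILLION:
        raise KeyError(f"no price listed for model {model!r}")
    return PRICE_PER_MILLION[model] * Decimal(total_tokens) / Decimal(1_000_000)
```

For example, `estimate_cost_usd("qwen-max", 500_000)` works out to 0.60 USD under these assumptions. `Decimal` is used so that small amounts do not pick up floating-point rounding noise.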
6 Chinese API Quirks (Read This!)
These are real differences between Chinese AI APIs and OpenAI that affect how you build your integration:
| Topic | What to expect |
|---|---|
| System prompts | Chinese models (especially DeepSeek, Qwen) handle system prompts well, but keep them concise. Very long system prompts may reduce output quality. |
| Output length | Default max_tokens varies by provider. Set it explicitly. DeepSeek R1 (reasoning) may produce very long responses — increase limit to 8192+ for complex tasks. |
| Tool use / Function calling | DeepSeek V3 and Qwen support function calling. Test with stream: false first. Streaming function calls are complex — disable stream for tool-use heavy apps. |
| Context window | Most models: 32K–128K context. MiniMax Text-01 supports up to 1M tokens. Sending near-max context is slow and expensive — test with shorter inputs first. |
| Rate limits | Per-key RPM/TPM limits are enforced. Default limits are generous but not unlimited. If you hit 429, implement exponential backoff. Check X-RateLimit-* response headers. |
| Streaming | SSE streaming works with OpenAI SDK. Some clients (Postman, Insomnia) may not auto-parse SSE correctly — use a real SDK or curl -N for testing. |
| Latency | Expect 1-5s first-token latency for non-streaming. Streaming starts faster. DeepSeek V3 is generally the fastest. Qwen Max is slower but higher quality. |
| JSON mode | Set response_format: {"type": "json_object"} for JSON output. Works on DeepSeek and Qwen. Always include "JSON" in your prompt as well. |
| Multi-turn conversations | Send the full conversation history each request (standard OpenAI way). Chinese APIs do not maintain server-side sessions — you manage context client-side. |
| Batch requests | No native batch endpoint. For parallel processing, use async/threading in Python with multiple API calls. Chinese providers handle concurrent requests well. |
| Cost tracking | Prices are per 1M tokens (input + output counted separately on most models). Monitor usage at shaibar.com/console/log. |
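The exponential backoff suggested for 429 responses in the rate-limits row could look roughly like the sketch below. `RateLimited` is a hypothetical stand-in for whatever your HTTP client raises on a 429, and the delay values are illustrative defaults, not limits published by the service.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 response."""


def with_backoff(
    call: Callable[[], T],
    *,
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    is_rate_limit: Callable[[Exception], bool] = lambda exc: isinstance(exc, RateLimited),
) -> T:
    """Run call(), retrying on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limit(exc) or attempt == max_retries:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

With the OpenAI Python SDK, a 429 surfaces as `openai.RateLimitError`, so the wrapper could be used as `with_backoff(lambda: client.chat.completions.create(...), is_rate_limit=lambda e: isinstance(e, openai.RateLimitError))`. The SDK client also accepts a `max_retries` argument, which may be enough on its own for simple cases.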
Questions? Email support@shaibar.com · Dashboard · Top Up USDT