# API Translation Guide

A comprehensive guide to Aspy’s bidirectional API translation system: enabling any client to talk to any backend, regardless of API format.

## Overview

Aspy translates between OpenAI and Anthropic API formats in both directions:
```
┌────────────────────────────────────────────────────────────────────────┐
│ DIRECTION 1: Claude Code → Other Models (PRIMARY USE CASE)             │
│                                                                        │
│ Claude Code ──(Anthropic)──► Aspy ──(OpenAI)──► GPT-4/Azure/Ollama/etc │
│             ◄──(Anthropic)──      ◄──(OpenAI)──                        │
└────────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ DIRECTION 2: OpenAI Tools → Claude                                     │
│                                                                        │
│ LangChain/SDK ──(OpenAI)──► Aspy ──(Anthropic)──► Claude API           │
│               ◄──(OpenAI)──      ◄──(Anthropic)──                      │
└────────────────────────────────────────────────────────────────────────┘
```
The killer feature: Claude Code becomes a universal AI coding interface. Point it at any model.
## Use Cases

### Primary: Claude Code as Universal Interface
Claude Code speaks the Anthropic format exclusively. With translation enabled:
| You Want To Use | How It Works |
|---|---|
| GPT-5.x | Claude Code → Aspy → OpenAI API |
| Gemini Pro 3 | Claude Code → Aspy → OpenRouter endpoint |
| Azure OpenAI | Claude Code → Aspy → Azure endpoint |
| Ollama | Claude Code → Aspy → localhost:11434 |
| Any OpenAI-compatible | Claude Code → Aspy → that endpoint |
Claude Code’s Anthropic-format requests are translated to OpenAI format and sent to your chosen backend; responses are translated back to Anthropic format.
### Secondary: OpenAI Tools → Claude

Existing OpenAI integrations can route through Aspy to Claude:

| Tool | Configuration |
|---|---|
| LangChain | Set `base_url` to Aspy (see the sketch below) |
| OpenAI Python SDK | `client = OpenAI(base_url="http://localhost:8080/dev-1")` |
| Any `/v1/chat/completions` client | Point at Aspy |
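For LangChain specifically, a minimal sketch assuming the `langchain-openai` package (the model name and key value are illustrative):

```python
from langchain_openai import ChatOpenAI

# Point LangChain's OpenAI client at Aspy; "gpt-4" is remapped to a
# Claude model by [translation.model_mapping].
llm = ChatOpenAI(
    model="gpt-4",
    base_url="http://localhost:8080/dev-1/v1",
    api_key="your-anthropic-key",  # forwarded upstream to Anthropic
)

print(llm.invoke("Say hello in one word.").content)
```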
## Configuration

### Basic Setup

```toml
# ~/.config/aspy/config.toml
[translation]
enabled = true      # Master switch
auto_detect = true  # Auto-detect format from request
```
### Direction 1: Claude Code → OpenAI Backend

To route Claude Code requests to an OpenAI-compatible backend:

```toml
[translation]
enabled = true
auto_detect = true

# Map Anthropic models to OpenAI models
[translation.model_mapping]
"claude-sonnet-4-20250514" = "gpt-4"
"claude-3-haiku-20240307" = "gpt-3.5-turbo"

# Configure upstream (where to send translated requests)
[proxy]
upstream_url = "https://api.openai.com"  # Or Azure, Ollama, etc.
```

Then configure Claude Code:

```bash
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080/dev-1
# Claude Code now routes through Aspy → OpenAI
```
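To sanity-check this route end to end without Claude Code, a sketch using the official `anthropic` Python SDK pointed at Aspy (port and model names match the config above; how the upstream key is supplied depends on your setup, so the key here is a placeholder):

```python
from anthropic import Anthropic

# Anthropic-format client → Aspy → translated to OpenAI → your backend
client = Anthropic(
    base_url="http://127.0.0.1:8080/dev-1",
    api_key="placeholder",  # upstream auth handling depends on your Aspy setup
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # remapped to "gpt-4" by the config above
    max_tokens=64,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```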
### Direction 2: OpenAI Clients → Claude

To let OpenAI-format clients talk to Claude:

```toml
[translation]
enabled = true
auto_detect = true

# Map OpenAI models to Anthropic models
[translation.model_mapping]
"gpt-4" = "claude-sonnet-4-20250514"
"gpt-4-turbo" = "claude-sonnet-4-20250514"
"gpt-4o" = "claude-sonnet-4-20250514"
"gpt-3.5-turbo" = "claude-3-haiku-20240307"

# Upstream is Anthropic (default)
[proxy]
upstream_url = "https://api.anthropic.com"
```
### Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Master switch for translation |
| `auto_detect` | bool | `true` | Auto-detect API format from request |
| `model_mapping` | map | (see below) | Bidirectional model name mappings |
### Default Model Mappings

| OpenAI Model | Anthropic Model |
|---|---|
| `gpt-4` | `claude-sonnet-4-20250514` |
| `gpt-4-turbo` | `claude-sonnet-4-20250514` |
| `gpt-4o` | `claude-sonnet-4-20250514` |
| `gpt-3.5-turbo` | `claude-3-haiku-20240307` |
| `o1` | `claude-sonnet-4-20250514` |
| `o1-mini` | `claude-3-haiku-20240307` |

Mappings work bidirectionally; the same config handles both directions.
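Conceptually, a single table serves both directions. A minimal sketch of such a lookup (illustrative, not Aspy’s internals):

```python
MODEL_MAPPING = {
    "gpt-4": "claude-sonnet-4-20250514",
    "gpt-3.5-turbo": "claude-3-haiku-20240307",
}

def map_model(name: str) -> str:
    """Return the mapped model name, trying both directions."""
    if name in MODEL_MAPPING:
        return MODEL_MAPPING[name]
    # Reverse direction: Anthropic name → OpenAI name
    reverse = {v: k for k, v in MODEL_MAPPING.items()}
    return reverse.get(name, name)  # unmapped names pass through

assert map_model("gpt-4") == "claude-sonnet-4-20250514"
assert map_model("claude-sonnet-4-20250514") == "gpt-4"
```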
## Format Detection

When `auto_detect = true`, Aspy determines the API format using:

1. Path (highest priority):
   - `/v1/chat/completions` → OpenAI format
   - `/v1/messages` → Anthropic format
2. Headers:
   - `openai-organization` header → OpenAI
   - `anthropic-version` header → Anthropic
   - `Bearer sk-...` auth → OpenAI
   - `x-api-key` header → Anthropic
3. Body structure (fallback):
   - Model prefix `gpt-`, `o1-` → OpenAI
   - Model prefix `claude` → Anthropic
   - OpenAI-specific fields (`frequency_penalty`, `logprobs`, `n`) → OpenAI
   - Anthropic-specific content types (`tool_use`, `thinking`) → Anthropic
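A simplified sketch of this priority order (illustrative only, not Aspy’s implementation):

```python
def detect_format(path: str, headers: dict, body: dict) -> str:
    """Return "openai" or "anthropic"; priority is path > headers > body."""
    # 1. Path (highest priority)
    if path.endswith("/v1/chat/completions"):
        return "openai"
    if path.endswith("/v1/messages"):
        return "anthropic"
    # 2. Headers
    if "openai-organization" in headers:
        return "openai"
    if "anthropic-version" in headers:
        return "anthropic"
    if headers.get("authorization", "").startswith("Bearer sk-"):
        return "openai"
    if "x-api-key" in headers:
        return "anthropic"
    # 3. Body structure (fallback)
    model = body.get("model", "")
    if model.startswith(("gpt-", "o1-")):
        return "openai"
    if model.startswith("claude"):
        return "anthropic"
    if any(k in body for k in ("frequency_penalty", "logprobs", "n")):
        return "openai"
    return "anthropic"  # e.g. tool_use/thinking content types
```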
## Request Translation

### Anthropic → OpenAI (Direction 1)

| Anthropic Parameter | OpenAI Equivalent | Notes |
|---|---|---|
| `model` | `model` | Mapped via `model_mapping` |
| `messages` | `messages` | Content blocks flattened |
| `system` | `messages[0]` | Prepended as system message |
| `max_tokens` | `max_tokens` | Direct mapping |
| `temperature` | `temperature` | Scaled: Anthropic 0-1 → OpenAI 0-2 |
| `top_p` | `top_p` | Direct mapping |
| `stop_sequences` | `stop` | Direct mapping |
| `stream` | `stream` | Direct mapping |
| `tools` | `tools` | Structure adapted |
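A condensed sketch of this parameter mapping (illustrative only; the real translator also flattens content blocks and adapts tool structures):

```python
def anthropic_to_openai(req: dict, model_mapping: dict) -> dict:
    """Translate an Anthropic /v1/messages body to OpenAI format (sketch)."""
    messages = list(req["messages"])
    if "system" in req:
        # Anthropic's top-level system prompt becomes messages[0]
        messages.insert(0, {"role": "system", "content": req["system"]})

    out = {
        "model": model_mapping.get(req["model"], req["model"]),
        "messages": messages,
        "max_tokens": req["max_tokens"],
    }
    if "temperature" in req:
        out["temperature"] = req["temperature"] * 2  # 0-1 → 0-2
    if "stop_sequences" in req:
        out["stop"] = req["stop_sequences"]
    for key in ("top_p", "stream"):
        if key in req:
            out[key] = req[key]
    return out
```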
### OpenAI → Anthropic (Direction 2)

| OpenAI Parameter | Anthropic Equivalent | Notes |
|---|---|---|
| `model` | `model` | Mapped via `model_mapping` |
| `messages` | `messages` + `system` | System messages extracted |
| `max_tokens` | `max_tokens` | Default: 4096 if not specified |
| `temperature` | `temperature` | Scaled: OpenAI 0-2 → Anthropic 0-1 |
| `top_p` | `top_p` | Direct mapping |
| `stop` | `stop_sequences` | Converted to array |
| `stream` | `stream` | Direct mapping |
| `tools` | `tools` | Similar structure |
| `tool_choice` | `tool_choice` | Direct mapping |
### Ignored Parameters

These parameters are accepted but not translated (no equivalent in the target format):

- `frequency_penalty` / `presence_penalty`
- `logprobs` / `top_logprobs`
- `n` (both APIs return 1 completion)
- `logit_bias`
- `thinking` (Anthropic-specific, filtered in translation)
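The reverse direction, sketched the same way: system extraction, `stop` normalization, the 4096 `max_tokens` default, and the ignored parameters dropped. Again illustrative, not Aspy’s code:

```python
IGNORED = {"frequency_penalty", "presence_penalty", "logprobs",
           "top_logprobs", "n", "logit_bias"}

def openai_to_anthropic(req: dict, model_mapping: dict) -> dict:
    """Translate an OpenAI /v1/chat/completions body to Anthropic format."""
    system_parts, messages = [], []
    for msg in req["messages"]:
        if msg["role"] == "system":
            system_parts.append(msg["content"])  # extracted to top level
        else:
            messages.append(msg)

    out = {
        "model": model_mapping.get(req["model"], req["model"]),
        "messages": messages,
        "max_tokens": req.get("max_tokens", 4096),  # Anthropic requires it
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    if "temperature" in req:
        out["temperature"] = req["temperature"] / 2  # 0-2 → 0-1
    if "stop" in req:
        stop = req["stop"]
        out["stop_sequences"] = [stop] if isinstance(stop, str) else stop
    for key in ("top_p", "stream", "tools", "tool_choice"):
        if key in req:
            out[key] = req[key]  # tools/tool_choice also need structural tweaks
    return out  # IGNORED keys are simply never copied
```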
## Response Translation

### Streaming Responses (`stream: true`)

This is the critical path: Claude Code uses `stream: true` for 95%+ of requests.
#### Anthropic → OpenAI (Direction 1)

| Anthropic Event | OpenAI Event |
|---|---|
| `message_start` | Initial chunk with `role: "assistant"` |
| `content_block_start` (text) | (no event, wait for delta) |
| `content_block_start` (tool_use) | Tool call header with `id`, `name` |
| `content_block_delta` (text) | `choices[].delta.content` |
| `content_block_delta` (input_json) | Tool arguments streaming |
| `content_block_delta` (thinking) | Filtered out |
| `content_block_stop` | (internal index tracking) |
| `message_delta` | `choices[].finish_reason` |
| `message_stop` | `data: [DONE]` |
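In spirit, the streaming translator is a small state machine over SSE events. A heavily simplified, text-only sketch of Direction 1 (tool-call and thinking handling omitted; not Aspy’s actual implementation):

```python
import json

def anthropic_sse_to_openai(events):
    """Translate parsed Anthropic SSE events into OpenAI-style SSE lines.

    `events` is an iterable of (event_type, data) pairs as parsed from the
    upstream stream.
    """
    def chunk(delta, finish=None):
        return "data: " + json.dumps({
            "object": "chat.completion.chunk",
            "choices": [{"index": 0, "delta": delta, "finish_reason": finish}],
        })

    for event_type, data in events:
        if event_type == "message_start":
            yield chunk({"role": "assistant"})
        elif event_type == "content_block_delta":
            if data["delta"].get("type") == "text_delta":
                yield chunk({"content": data["delta"]["text"]})
            # thinking deltas are filtered out here
        elif event_type == "message_delta":
            yield chunk({}, finish="stop")
        elif event_type == "message_stop":
            yield "data: [DONE]"
```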
#### OpenAI → Anthropic (Direction 2)

| OpenAI Event | Anthropic Event |
|---|---|
| Initial chunk with role | `message_start` |
| `delta.content` | `content_block_delta` (text_delta) |
| `delta.tool_calls` | `content_block_start` + deltas |
| `finish_reason` | `message_delta` with `stop_reason` |
| `data: [DONE]` | `message_stop` |
### Buffered Responses (`stream: false`)

The complete response is translated in one pass; the output matches the target format’s structure.
### Stop Reason Mapping

| Anthropic | OpenAI |
|---|---|
| `end_turn` | `stop` |
| `tool_use` | `tool_calls` |
| `max_tokens` | `length` |
| `stop_sequence` | `stop` |
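As lookup tables (illustrative; note the reverse direction is lossy, since both `end_turn` and `stop_sequence` map to `stop`):

```python
STOP_REASON_ANTHROPIC_TO_OPENAI = {
    "end_turn": "stop",
    "tool_use": "tool_calls",
    "max_tokens": "length",
    "stop_sequence": "stop",  # collapses with end_turn in OpenAI terms
}

# Reverse for Direction 2; "stop" can only map back to "end_turn"
STOP_REASON_OPENAI_TO_ANTHROPIC = {
    "stop": "end_turn",
    "tool_calls": "tool_use",
    "length": "max_tokens",
}
```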
## Usage Examples

### Example 1: Claude Code → GPT-5.x

```toml
# ~/.config/aspy/config.toml
[translation]
enabled = true
auto_detect = true

[translation.model_mapping]
"claude-sonnet-4-20250514" = "gpt-5.1"
"claude-3-haiku-20240307" = "gpt-4o-mini"

[proxy]
upstream_url = "https://api.openai.com"
```

```bash
# Terminal 1: Start Aspy
aspy

# Terminal 2: Configure Claude Code
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080/dev-1
export OPENAI_API_KEY=sk-...  # Your OpenAI key

# Now use Claude Code normally - requests go to GPT-5.1
```
### Example 2: Claude Code → Ollama (Local)

```toml
# ~/.config/aspy/config.toml
[translation]
enabled = true
auto_detect = true

[translation.model_mapping]
"claude-sonnet-4-20250514" = "llama2"
"claude-3-haiku-20240307" = "codellama"

[proxy]
upstream_url = "http://localhost:11434"
```

```bash
# Start Ollama
ollama serve

# Start Aspy
aspy

# Configure Claude Code
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080/dev-1

# Claude Code now talks to your local Ollama models
```
### Example 3: OpenAI SDK → Claude

```python
from openai import OpenAI

# Point at Aspy instead of OpenAI
client = OpenAI(
    base_url="http://localhost:8080/dev-1/v1",
    api_key="your-anthropic-key",  # Anthropic key, not OpenAI
)

response = client.chat.completions.create(
    model="gpt-4",  # Gets mapped to Claude
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    # delta.content is None on the role-only and final chunks
    print(chunk.choices[0].delta.content or "", end="")
```
### Example 4: Test with curl (OpenAI format → Claude)

```bash
# Start Aspy
aspy

# Send OpenAI-format request
curl http://localhost:8080/dev-1/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```

Expected output (OpenAI SSE format):

```
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":...,"model":"gpt-4","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":...,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
...
data: [DONE]
```
## Verification Checklist

| Check | Command | Expected |
|---|---|---|
| Config loaded | Check startup banner | Shows `✓ translation` in Pipeline |
| Buffered works | curl with `stream: false` | Returns target-format JSON |
| Streaming works | curl with `stream: true` | Returns target-format SSE |
| Model preserved | Check response | Returns same model name sent |
| Tools work | curl with tools | Tool calls in target format |
## Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| `○ translation (disabled)` in startup | `enabled = false` | Set `enabled = true` in config |
| Request not translated | Format not detected | Check path matches expected format |
| Response in wrong format | Detection failed | Ensure auto_detect = true |
| Model name wrong in response | Missing mapping | Add to [translation.model_mapping] |
| Streaming incomplete | TCP fragmentation | (Should work - report bug if not) |
| Tool calls missing | Different structure | Check tool format for target API |
| Claude Code not connecting | Wrong env var | Ensure ANTHROPIC_BASE_URL set |
## Architecture Notes

### How It Fits Together
| System | Purpose | When It Runs |
|---|---|---|
| Translation | Format conversion (OpenAI ↔ Anthropic) | Pre/post proxy |
| Augmentation | Inject content into SSE streams | During streaming |
| EventProcessor | Transform parsed ProxyEvents | Post-parsing |
### Processing Order (Streaming)
1. Request arrives (any format)
2. Format detected
3. Request translated (if needed)
4. Forwarded to upstream
5. Response chunk arrives
6. Real-time extraction (RAW format for tool registration, thinking streaming)
7. Augmentation injection (in upstream format)
8. Translation to client format (if needed)
9. Forward to client
Translation happens at the **output** stage, preserving internal observability: Aspy always sees the raw upstream format internally, regardless of what clients send or receive.
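A sketch of that ordering for one streamed chunk (hypothetical names, mirroring steps 6-9 above):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class StreamStage:
    """One chunk's trip through the output pipeline (hypothetical names)."""
    observe: Callable[[str], None]    # step 6: extraction on the RAW format
    augment: Callable[[str], str]     # step 7: SSE injection (upstream format)
    translate: Callable[[str], str]   # step 8: convert to client format
    needs_translation: bool

def handle_chunk(chunk: str, stage: StreamStage) -> str:
    stage.observe(chunk)                # observability sees the raw format
    chunk = stage.augment(chunk)        # still in upstream format
    if stage.needs_translation:
        chunk = stage.translate(chunk)  # translation happens last
    return chunk                        # step 9: forwarded to the client
```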
### Model Name Preservation

The `TranslationContext` carries `original_model` through the request-response cycle:

```
Client: "gpt-4" → Aspy captures → Backend: "claude-sonnet-4" → Response: "gpt-4"
```

The client always sees the model name it originally requested.
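A sketch of what such a context might carry (only `original_model` is named by this guide; the other fields and the helper are assumptions):

```python
from dataclasses import dataclass

@dataclass
class TranslationContext:
    """Carried from request translation through to response translation."""
    original_model: str  # e.g. "gpt-4", exactly as the client sent it
    mapped_model: str    # e.g. "claude-sonnet-4-20250514" (assumed field)
    client_format: str   # "openai" or "anthropic" (assumed field)

def restore_model(response: dict, ctx: TranslationContext) -> dict:
    # The backend reports its own model name; overwrite it with the
    # name the client originally requested.
    response["model"] = ctx.original_model
    return response
```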
## Quick Reference

```bash
# Enable translation
cat >> ~/.config/aspy/config.toml << 'EOF'
[translation]
enabled = true
auto_detect = true
EOF

# Custom model mapping (both directions)
cat >> ~/.config/aspy/config.toml << 'EOF'
[translation.model_mapping]
"gpt-4" = "claude-sonnet-4-20250514"
"claude-sonnet-4-20250514" = "gpt-4"
EOF

# Test OpenAI → Claude
curl http://localhost:8080/dev-1/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hi"}]}'

# Configure Claude Code → other models
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080/dev-1
```