Getting Started with Smart Routing

Learn how Eden AI’s smart routing system automatically selects the best AI model for your requests.

Overview

Smart routing is Eden AI’s intelligent model selection system that automatically chooses the optimal AI model for your requests. Instead of manually selecting models, you use the special identifier @edenai and let the system analyze your request to pick the best provider and model.

What you’ll learn:
  • How smart routing works
  • Basic usage with default models
  • Customizing candidate pools
  • Understanding routing decisions
  • When to use smart routing vs. fixed models

How It Works

The routing system follows this flow:
  1. Your request arrives with model: "@edenai"
  2. The Eden AI router service analyzes the request context: message content, tools/functions, and request parameters
  3. The router queries the NotDiamond API
  4. The optimal model is selected
  5. The request is executed with the selected model
  6. The response is returned and includes the selected model's identifier
Key components:
  1. NotDiamond Integration - Powered by NotDiamond, an AI routing engine that analyzes request context
  2. Model Inventory - Database of available models with capabilities and pricing
  3. Redis Cache - Caches available models (1-hour TTL) for performance
  4. Validation Layer - Ensures models are available and properly formatted

Basic Usage

Quick Start: Default Routing

The simplest way to use smart routing is to set model: "@edenai" without specifying candidates. The system will choose from all available models.
import requests
import json

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "@edenai",  # Activates smart routing
    "messages": [
        {"role": "user", "content": "Explain machine learning"}
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

selected_model = None
for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: ') and line_str != 'data: [DONE]':
            data = json.loads(line_str[6:])

            # The first chunk contains the selected model
            if not selected_model and 'model' in data:
                selected_model = data['model']
                print(f"Router selected: {selected_model}")

            # Process content
            content = data.get('choices', [{}])[0].get('delta', {}).get('content', '')
            if content:
                print(content, end='', flush=True)
Response includes selected model:
data: {"id":"...","model":"openai/gpt-4o","choices":[{"delta":{"content":"Machine"},...}],...}

Custom Candidate Pool

Restrict routing to specific models using router_candidates:
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "@edenai",
    # Only choose from these models
    "router_candidates": [
        "openai/gpt-4o",
        "anthropic/claude-sonnet-4-5",
        "google/gemini-2.0-flash"
    ],
    "messages": [
        {"role": "user", "content": "Write a Python function to sort a list"}
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

# Process response...
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))
Benefits of custom candidates:
  • Control over which models can be selected
  • Cost optimization by limiting to budget-friendly models
  • Quality control by restricting to tested models
  • Use case optimization (e.g., code-focused models for coding tasks); see the sketch after this list
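For example, per-use-case candidate pools can be kept in application code and applied when building the request. A minimal sketch (the task categories and pool contents below are illustrative, not recommendations):
# Hypothetical candidate pools per task type; adjust to models you have tested
CANDIDATE_POOLS = {
    "code": ["openai/gpt-4o", "anthropic/claude-sonnet-4-5"],
    "summarization": ["google/gemini-2.0-flash", "cohere/command-r-plus"],
    "general": ["openai/gpt-4o", "google/gemini-2.0-flash", "anthropic/claude-sonnet-4-5"]
}

def build_payload(task_type: str, message: str) -> dict:
    """Build a routed request restricted to the pool for this task type."""
    return {
        "model": "@edenai",
        "router_candidates": CANDIDATE_POOLS.get(task_type, CANDIDATE_POOLS["general"]),
        "messages": [{"role": "user", "content": message}],
        "stream": True
    }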

Model Format

Models are specified in the format:
provider/model
Examples:
  • openai/gpt-4o
  • anthropic/claude-sonnet-4-5
  • google/gemini-2.0-flash
  • cohere/command-r-plus
Finding available models: Use the /v3/llm/models endpoint to list all available models:
import requests

url = "https://api.edenai.run/v3/llm/models"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

response = requests.get(url, headers=headers)
models = response.json()

# List all models
for model in models['data']:
    print(f"{model['id']} - {model['description']}")
    print(f"  Context: {model['context_length']} tokens")
    print(f"  Pricing: {model['pricing']}")
    print()
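Building on the listing above, the same response can be filtered into a router_candidates pool. A minimal sketch, assuming each entry exposes the id and context_length fields shown above (the provider allow-list and the 100,000-token threshold are illustrative assumptions):
# Keep only models from a few preferred providers with a large context window
preferred_providers = {"openai", "anthropic", "google"}

candidates = [
    m["id"]
    for m in models["data"]
    if m["id"].split("/")[0] in preferred_providers
    and m.get("context_length", 0) >= 100_000
]

payload = {
    "model": "@edenai",
    "router_candidates": candidates,
    "messages": [{"role": "user", "content": "Summarize this report"}],
    "stream": True
}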

Routing with OpenAI SDK

Smart routing works seamlessly with the official OpenAI SDK:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_EDEN_AI_KEY",
    base_url="https://api.edenai.run/v3/llm"
)

# Default routing
stream = client.chat.completions.create(
    model="@edenai",
    messages=[
        {"role": "user", "content": "Explain neural networks"}
    ],
    stream=True
)

selected_model = None
for chunk in stream:
    if not selected_model and chunk.model:
        selected_model = chunk.model
        print(f"\nRouter selected: {selected_model}\n")

    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
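The SDK’s create() method has no dedicated router_candidates argument, but vendor-specific fields can be passed through its extra_body parameter. A minimal sketch, assuming the V3 endpoint accepts the same router_candidates field shown in the raw HTTP examples above:
# Restrict routing to a custom pool via the SDK's extra_body passthrough
stream = client.chat.completions.create(
    model="@edenai",
    messages=[{"role": "user", "content": "Explain neural networks"}],
    stream=True,
    extra_body={
        "router_candidates": [
            "openai/gpt-4o",
            "anthropic/claude-sonnet-4-5"
        ]
    }
)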

When to Use Smart Routing

Use Smart Routing When:

✅ Optimizing cost/performance - Let the system balance quality and cost
✅ Exploring new use cases - Don’t know which model works best yet
✅ Handling diverse requests - Different queries need different models
✅ Minimizing maintenance - No need to update code when better models launch
✅ A/B testing models - Compare routing vs. fixed model performance

Use Fixed Models When:

❌ Strict latency requirements - Routing adds 100-500ms overhead
❌ High-frequency APIs - 100+ requests/second may hit router limits
❌ Compliance requirements - Must use specific certified models
❌ Consistent output format - Need identical behavior across requests
❌ Already optimized - You’ve tested and know the best model for your use case

Understanding Routing Latency

Smart routing introduces a small overhead:
Phase | Latency | Notes
Routing decision | 100-500 ms | Analyzing the request and selecting a model
First token | + routing time | The first token includes the routing overhead
Subsequent tokens | No overhead | Normal streaming after the first token
Example timeline:
Request sent → [300ms routing] → [500ms first token] → [streaming...]
Total to first token: ~800ms
Compare with fixed model:
Request sent → [500ms first token] → [streaming...]
Total to first token: ~500ms
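To verify the overhead for your own workload, you can measure time to first token for a routed request and a fixed-model request. A minimal sketch (the fixed comparison model is an illustrative choice):
import time
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

def time_to_first_token(model: str) -> float:
    """Return seconds from sending the request until the first streamed line arrives."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Say hello"}],
        "stream": True
    }
    start = time.monotonic()
    response = requests.post(url, headers=headers, json=payload, stream=True, timeout=30)
    for line in response.iter_lines():
        if line:
            return time.monotonic() - start
    return float("inf")

print(f"Routed: {time_to_first_token('@edenai'):.2f}s")
print(f"Fixed:  {time_to_first_token('openai/gpt-4o'):.2f}s")  # illustrative fixed model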
Optimization tips:
  • Use custom candidates (3-5 models) to reduce routing time
  • Cache routing decisions at application level for repeated queries (see the sketch after this list)
  • Consider fixed models for latency-critical applications
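A minimal sketch of the application-level caching tip, assuming your requests can be grouped into coarse categories and that reusing the router’s previous pick for a category is acceptable for your use case:
# Maps a query category to the model the router last selected for it.
# A plain dict is used for brevity; a shared store such as Redis would be
# more appropriate in production.
routing_cache = {}

def pick_model(query_category: str) -> str:
    """Return a cached model for this category, or "@edenai" to route again."""
    return routing_cache.get(query_category, "@edenai")

def remember_selection(query_category: str, selected_model: str) -> None:
    """Store the router's pick so later requests in this category skip routing."""
    routing_cache[query_category] = selected_model
After a routed request completes, call remember_selection() with the model name returned in the response; subsequent requests for the same category can then send that model directly and avoid the routing overhead.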

Error Handling

The router has built-in fallback mechanisms:
import requests
import json

def chat_with_router(message: str):
    """Chat with router and handle errors."""
    url = "https://api.edenai.run/v3/llm/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "@edenai",
        "messages": [{"role": "user", "content": message}],
        "stream": True
    }

    try:
        response = requests.post(
            url,
            headers=headers,
            json=payload,
            stream=True,
            timeout=30  # Set timeout
        )
        response.raise_for_status()

        full_response = ""
        for line in response.iter_lines():
            if line:
                line_str = line.decode('utf-8')
                if line_str.startswith('data: '):
                    data_str = line_str[6:]
                    if data_str != '[DONE]':
                        data = json.loads(data_str)
                        content = data.get('choices', [{}])[0].get('delta', {}).get('content', '')
                        full_response += content

        return {"success": True, "response": full_response}

    except requests.exceptions.Timeout:
        return {"success": False, "error": "Routing timed out"}
    except requests.exceptions.HTTPError as e:
        return {"success": False, "error": f"HTTP error: {e}"}
    except Exception as e:
        return {"success": False, "error": f"Unexpected error: {e}"}

# Usage
result = chat_with_router("Hello!")
if result["success"]:
    print(result["response"])
else:
    print(f"Error: {result['error']}")
Common errors:
  • 503 Service Unavailable - Router service temporarily down
  • 422 Validation Error - Invalid model candidates
  • Timeout - Routing took too long (>30s)

Best Practices

1. Choose Appropriate Candidates

Do:
  • Limit to 3-5 models for faster routing
  • Group models by similar capabilities
  • Test candidates with your specific workload
  • Include at least one budget-friendly option
Don’t:
  • Specify 20+ candidates (slows routing)
  • Mix specialized models (code + creative)
  • Use untested models in production

2. Monitor Performance

Do:
  • Track which models get selected
  • Monitor routing latency
  • A/B test routing vs. fixed models
  • Set up alerts for routing failures
Don’t:
  • Deploy without monitoring
  • Assume routing is always optimal
  • Ignore cost patterns

3. Handle Errors Gracefully

Do:
  • Set appropriate timeouts (30s recommended)
  • Implement fallback to fixed models (see the sketch after this list)
  • Log routing failures for analysis
  • Retry with exponential backoff
Don’t:
  • Use infinite timeouts
  • Ignore routing errors
  • Rely solely on routing without fallback
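A hedged sketch combining those recommendations: retry the routed request with exponential backoff, then fall back to a fixed model if routing keeps failing (the fallback model and retry count are illustrative choices, not Eden AI defaults):
import time
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

FALLBACK_MODEL = "openai/gpt-4o"  # illustrative fixed fallback

def routed_request_with_fallback(payload: dict, max_retries: int = 3) -> requests.Response:
    """Try smart routing with exponential backoff; fall back to a fixed model on failure."""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=payload, stream=True, timeout=30)
            response.raise_for_status()
            return response
        except (requests.exceptions.Timeout, requests.exceptions.HTTPError):
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...

    # Routing kept failing: retry once with a fixed model instead of "@edenai"
    fallback_payload = {**payload, "model": FALLBACK_MODEL}
    fallback_payload.pop("router_candidates", None)
    response = requests.post(url, headers=headers, json=fallback_payload, stream=True, timeout=30)
    response.raise_for_status()
    return response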

Quick Reference

Request Parameters

Parameter | Type | Required | Description
model | string | Yes | Set to "@edenai" to activate routing
router_candidates | string[] | No | List of models to choose from (default: all models)
messages | object[] | Yes | Conversation messages (used for routing context)
tools | object[] | No | Function definitions (considered in routing)
stream | boolean | Yes | Must be true for V3

Response Fields

The selected model is returned in the response:
{
  "id": "chatcmpl-...",
  "model": "openai/gpt-4o",  // Selected model
  "choices": [...]
}

Supported Features

Smart routing works with all V3 LLM features:
  • ✅ Streaming (mandatory)
  • ✅ Function calling / Tools (see the example after this list)
  • ✅ Vision / Multimodal
  • ✅ Multi-turn conversations
  • ✅ System messages
  • ✅ Temperature and other parameters
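For example, a routed request with a tool definition is built exactly like any other V3 request; the tools array is simply part of the context the router analyzes. A minimal sketch (get_weather is a hypothetical tool, not part of the API):
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "@edenai",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration only
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"]
                }
            }
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)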