
Getting Started with Smart Routing

Learn how Eden AI’s smart routing system automatically selects the best AI model for your requests.

Overview

Smart routing is Eden AI’s intelligent model selection system that automatically chooses the optimal AI model for each request. Instead of selecting a model manually, you set the special identifier @edenai and let the system analyze your request to pick the best provider and model.

What you’ll learn:
  • How smart routing works
  • Basic usage with default models
  • Customizing candidate pools
  • Understanding routing decisions
  • When to use smart routing vs. fixed models

How It Works

The routing system follows this flow:

Your request (model: "@edenai")
        ↓
Eden AI Router Service
        ↓
Analyze request context:
  - Message content
  - Tools/functions
  - Request parameters
        ↓
Query NotDiamond API
        ↓
Select optimal model
        ↓
Execute request with selected model
        ↓
Response (includes selected model info)
Key components:
  1. NotDiamond Integration - Powered by NotDiamond, an AI routing engine that analyzes request context
  2. Model Inventory - Database of available models with capabilities and pricing
  3. Redis Cache - Caches available models (1-hour TTL) for performance
  4. Validation Layer - Ensures models are available and properly formatted

Basic Usage

Quick Start: Default Routing

The simplest way to use smart routing is to set model: "@edenai" without specifying candidates. The system will choose from all available models.

The response stream includes the selected model:
data: {"id":"...","model":"openai/gpt-4o","choices":[{"delta":{"content":"Machine"},...}],...}
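
A minimal streaming request for default routing, sketched with Python's standard library. The endpoint URL here is an assumption; use the exact V3 URL from your Eden AI dashboard.

```python
import json
import os
import urllib.request

# Assumed endpoint path -- check your Eden AI dashboard for the exact V3 URL.
EDEN_URL = "https://api.edenai.run/v3/llm/chat/completions"

# Setting model to "@edenai" activates smart routing; stream must be true for V3.
payload = {
    "model": "@edenai",
    "stream": True,
    "messages": [
        {"role": "user", "content": "Explain machine learning in one sentence."}
    ],
}

def stream_completion() -> None:
    req = urllib.request.Request(
        EDEN_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['EDENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        for raw in resp:
            line = raw.decode().strip()
            if line.startswith("data: "):
                print(line)  # each SSE chunk reports the selected model
```

Each `data:` line carries a chunk like the one above, with the routed model in the `model` field.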

Custom Candidate Pool

Restrict routing to specific models with the router_candidates parameter.

Benefits of custom candidates:
  • Control over which models can be selected
  • Cost optimization by limiting to budget-friendly models
  • Quality control by restricting to tested models
  • Use case optimization (e.g., code-focused models for coding tasks)
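
For example, a request body that limits routing to three candidates might look like this (a sketch; field names follow the parameter table later on this page):

```python
# Only the listed candidates can be selected by the router.
payload = {
    "model": "@edenai",
    "stream": True,
    "router_candidates": [
        "openai/gpt-4o",
        "anthropic/claude-sonnet-4-5",
        "google/gemini-2.0-flash",
    ],
    "messages": [{"role": "user", "content": "Summarize this report."}],
}
```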

Model Format

Models are specified in the format:
provider/model
Examples:
  • openai/gpt-4o
  • anthropic/claude-sonnet-4-5
  • google/gemini-2.0-flash
  • cohere/command-r-plus
Finding available models: Use the /v3/llm/models endpoint to list all available models:
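
A sketch of querying the inventory endpoint; the base URL and the response shape (a list of entries with an `id` field) are assumptions, so check the API reference for the exact schema.

```python
import json
import urllib.request

# Assumed base URL for the model inventory endpoint.
MODELS_URL = "https://api.edenai.run/v3/llm/models"

def list_models(api_key: str) -> list[str]:
    req = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        data = json.load(resp)
    # Assumed shape: a list of {"id": "provider/model", ...} entries.
    return [m["id"] for m in data]
```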

Routing with OpenAI SDK

Smart routing works seamlessly with the official OpenAI SDK:

When to Use Smart Routing

Use Smart Routing When:

  • ✅ Optimizing cost/performance - Let the system balance quality and cost
  • ✅ Exploring new use cases - Don’t know which model works best yet
  • ✅ Handling diverse requests - Different queries need different models
  • ✅ Minimizing maintenance - No need to update code when better models launch
  • ✅ A/B testing models - Compare routing vs. fixed model performance

Use Fixed Models When:

  • ❌ Strict latency requirements - Routing adds 100-500ms overhead
  • ❌ High-frequency APIs - 100+ requests/second may hit router limits
  • ❌ Compliance requirements - Must use specific certified models
  • ❌ Consistent output format - Need identical behavior across requests
  • ❌ Already optimized - You’ve tested and know the best model for your use case

Understanding Routing Latency

Smart routing introduces a small overhead:
| Phase | Latency | Notes |
|---|---|---|
| Routing decision | 100-500ms | Analyzing request and selecting model |
| First token | +routing time | First token includes routing overhead |
| Subsequent tokens | No overhead | Normal streaming after first token |
Example timeline:
Request sent → [300ms routing] → [500ms first token] → [streaming...]
Total to first token: ~800ms
Compare with fixed model:
Request sent → [500ms first token] → [streaming...]
Total to first token: ~500ms
Optimization tips:
  • Use custom candidates (3-5 models) to reduce routing time
  • Cache routing decisions at application level for repeated queries
  • Consider fixed models for latency-critical applications
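
The application-level caching tip can be sketched as a small lookup table keyed by a hash of the prompt; `route_fn` here is a placeholder for whatever call performs the routed model selection in your application.

```python
import hashlib

# Maps a prompt hash to the model the router previously selected,
# so repeated queries skip the 100-500ms routing step.
_routing_cache: dict[str, str] = {}

def cached_model(prompt: str, route_fn) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _routing_cache:
        _routing_cache[key] = route_fn(prompt)  # only routed once per prompt
    return _routing_cache[key]
```

In production you would likely bound the cache size and add a TTL, mirroring the router's own 1-hour model-inventory cache.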

Error Handling

The router has built-in fallback mechanisms.

Common errors:
  • 503 Service Unavailable - Router service temporarily down
  • 422 Validation Error - Invalid model candidates
  • Timeout - Routing took too long (>30s)
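
One way to handle these cases client-side, sketched in Python: retry 503s with exponential backoff, surface 422s immediately (the request itself is wrong), and fall back to a fixed model when routing stays unavailable. `send_request` is a placeholder for your actual HTTP call, and FALLBACK_MODEL is an illustrative choice.

```python
import time
import urllib.error

FALLBACK_MODEL = "openai/gpt-4o"  # illustrative fixed-model fallback

def robust_request(payload: dict, send_request, max_retries: int = 3,
                   backoff: float = 1.0) -> dict:
    for attempt in range(max_retries):
        try:
            return send_request(payload)
        except urllib.error.HTTPError as exc:
            if exc.code == 422:
                raise  # invalid candidates: fix the request, don't retry
            if exc.code == 503 and attempt < max_retries - 1:
                time.sleep(backoff * 2 ** attempt)  # exponential backoff
                continue
            break
    # Routing unavailable: retry once with a fixed model instead.
    fixed = dict(payload, model=FALLBACK_MODEL)
    fixed.pop("router_candidates", None)
    return send_request(fixed)
```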

Best Practices

1. Choose Appropriate Candidates

Do:
  • Limit to 3-5 models for faster routing
  • Group models by similar capabilities
  • Test candidates with your specific workload
  • Include at least one budget-friendly option
Don’t:
  • Specify 20+ candidates (slows routing)
  • Mix specialized models (code + creative)
  • Use untested models in production

2. Monitor Performance

Do:
  • Track which models get selected
  • Monitor routing latency
  • A/B test routing vs. fixed models
  • Set up alerts for routing failures
Don’t:
  • Deploy without monitoring
  • Assume routing is always optimal
  • Ignore cost patterns

3. Handle Errors Gracefully

Do:
  • Set appropriate timeouts (30s recommended)
  • Implement fallback to fixed models
  • Log routing failures for analysis
  • Retry with exponential backoff
Don’t:
  • Use infinite timeouts
  • Ignore routing errors
  • Rely solely on routing without fallback

Next Steps

Quick Reference

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Set to "@edenai" to activate routing |
| router_candidates | string[] | No | List of models to choose from (default: all models) |
| messages | object[] | Yes | Conversation messages (used for routing context) |
| tools | object[] | No | Function definitions (considered in routing) |
| stream | boolean | Yes | Must be true for V3 |
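
Putting the parameters together, a full (hypothetical) request body might look like this; the tool definition is illustrative.

```python
request_body = {
    "model": "@edenai",                  # activates smart routing
    "stream": True,                      # mandatory for V3
    "router_candidates": [               # optional candidate pool
        "openai/gpt-4o",
        "anthropic/claude-sonnet-4-5",
    ],
    "messages": [                        # conversation context for routing
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "tools": [                           # function definitions, considered in routing
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
```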

Response Fields

The selected model is returned in the response:
{
  "id": "chatcmpl-...",
  "model": "openai/gpt-4o",  // Selected model
  "choices": [...]
}

Supported Features

Smart routing works with all V3 LLM features:
  • ✅ Streaming (mandatory)
  • ✅ Function calling / Tools
  • ✅ Vision / Multimodal
  • ✅ Multi-turn conversations
  • ✅ System messages
  • ✅ Temperature and other parameters