Skip to main content

Advanced Router Usage

Master advanced routing patterns, optimization strategies, and production best practices.

Overview

This guide covers advanced routing techniques for production applications, including cost optimization, context-aware routing, multi-turn conversations, and performance tuning. What you’ll learn:
  • Cost-optimized routing strategies
  • Context-aware model selection
  • Multi-turn conversation handling
  • Performance optimization techniques
  • Function calling with routing
  • Production deployment patterns

Cost Optimization Strategies

Strategy 1: Tiered Routing by Query Complexity

Route simple queries to cheaper models and complex queries to premium models:

Strategy 2: Dynamic Budget Management

Track spending and adjust routing based on budget:

Context-Aware Routing

Use Case-Specific Candidate Pools

Define different candidate pools for different use cases:

Multi-Turn Conversations

Stateful Conversation with Routing

Maintain conversation state while using smart routing:

Function Calling with Routing

Smart Routing with Tools

Combine smart routing with function calling:

Performance Optimization

Strategy 1: Client-Side Caching

Cache routing decisions for repeated queries:

Strategy 2: Parallel Requests with Routing

Make multiple routed requests in parallel:

Production Deployment Patterns

Pattern 1: Fallback to Fixed Model

Implement graceful fallback when routing fails:

Best Practices Summary

Cost Optimization

  • ✅ Use tiered routing based on query complexity
  • ✅ Track spending and adjust candidates dynamically
  • ✅ Limit candidates to 3-5 models for faster routing
  • ✅ Use budget models for simple queries
  • ❌ Don’t use premium-only candidates for all queries

Performance

  • ✅ Cache routing decisions at application level
  • ✅ Use async/parallel requests for batch processing
  • ✅ Set appropriate timeouts (30s recommended)
  • ✅ Monitor routing latency in production
  • ❌ Don’t make synchronous serial requests

Reliability

  • ✅ Implement fallback to fixed models
  • ✅ Handle routing failures gracefully
  • ✅ Log routing errors for analysis
  • ✅ Set up alerting for high failure rates
  • ❌ Don’t rely solely on routing without fallback

Context Awareness

  • ✅ Define use case-specific candidate pools
  • ✅ Adjust candidates based on request characteristics
  • ✅ Consider tools/functions in candidate selection
  • ✅ Maintain conversation context across turns
  • ❌ Don’t use same candidates for all use cases

Next Steps