Working with Media Files
Send images, documents, and other media files to LLMs for analysis and understanding.Overview
Eden AI V3 LLM endpoints support multimodal inputs, allowing you to send:- Images - For visual understanding and analysis
- Documents - PDFs and text files for processing
- Mixed content - Combine text prompts with media
- Analyzing screenshots and diagrams
- Extracting data from images and documents
- Visual question answering
- Chart and graph interpretation
- Receipt and invoice processing
Supported Input Types
V3 supports multiple ways to send media to LLMs:| Input Type | Format | Best For | Example |
|---|---|---|---|
| HTTP(S) URL | Direct link | Publicly accessible files | https://example.com/image.jpg |
| Base64 Data URL | Inline encoded data | Small files, secure data | data:image/jpeg;base64,... |
| File Upload | UUID from /v3/upload | Reusable files, large files | 550e8400-e29b-... |
| Base64 File Data | Raw base64 or data URL | PDFs, documents | data:application/pdf;base64,... |
Image Inputs
Using Image URLs
The simplest method for publicly accessible images:Using Base64 Image Data
For inline images or when URLs aren’t available:Using Uploaded Files
For reusable images or better performance:Document Inputs
PDF and Document Files
Send PDFs and documents for analysis:Base64 Document Data
For inline document processing:Mixed Content Messages
Multiple Images
Send multiple images in a single message:Text + Images + Documents
Combine different media types:Practical Examples
Analyze a Screenshot
Extract Receipt Data
Summarize PDF Document
Chart Analysis
Provider Support Matrix
Different providers have varying multimodal capabilities:| Provider | Models | Image URLs | Base64 Images | PDF/Docs | Max Image Size | Max File Size |
|---|---|---|---|---|---|---|
| OpenAI | gpt-4o, gpt-4-turbo | ✓ | ✓ | ✓ | 20 MB | 512 MB |
| Anthropic | claude-3-opus, claude-3-5-sonnet | ✓ | ✓ | ✓ | 5 MB | 10 MB |
| gemini-1.5-pro, gemini-1.5-flash | ✓ | ✓ | ✓ | 20 MB | 2 GB | |
| Mistral | pixtral-12b | ✓ | ✓ | - | 10 MB | - |
Best Practices
Choosing Input Method
Use HTTP(S) URLs when:- Images are publicly accessible
- You want to minimize request payload size
- Files are already hosted
- Processing the same file multiple times
- Files are large (reduces repeated upload overhead)
- Better performance is needed
- Files are small (5 MB)
- URLs aren’t available
- Security/privacy requires inline data
Optimizing Performance
Image optimization:- Resize large images before uploading
- Use appropriate compression
- Consider using URLs for public images
- Extract relevant pages from large PDFs
- Use text extraction for text-heavy documents
- Consider OCR preprocessing for scanned documents
Prompting Strategies
Be specific:Error Handling
Common Issues
File too large:Handling Errors
Next Steps
- Vision Capabilities - Deep dive into image analysis
- File Attachments - Working with documents and PDFs
- Upload Files - File upload and management
- Streaming Responses - Handle SSE streaming
- Chat Completions - Core LLM usage guide