Working with Media Files

Send images, documents, and other media files to LLMs for analysis and understanding.

Overview

Eden AI V3 LLM endpoints support multimodal inputs, allowing you to send:
  • Images - For visual understanding and analysis
  • Documents - PDFs and text files for processing
  • Mixed content - Combine text prompts with media
Multimodal capabilities enable use cases like:
  • Analyzing screenshots and diagrams
  • Extracting data from images and documents
  • Visual question answering
  • Chart and graph interpretation
  • Receipt and invoice processing

Supported Input Types

V3 supports multiple ways to send media to LLMs:
Input Type       | Format                 | Best For                     | Example
HTTP(S) URL      | Direct link            | Publicly accessible files    | https://example.com/image.jpg
Base64 Data URL  | Inline encoded data    | Small files, secure data     | data:image/jpeg;base64,...
File Upload      | UUID from /v3/upload   | Reusable files, large files  | 550e8400-e29b-...
Base64 File Data | Raw base64 or data URL | PDFs, documents              | data:application/pdf;base64,...
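Each of these maps to a content part inside a message. The shapes below are a quick reference (the UUID and base64 payloads are placeholders taken from the table above):

# Image by URL
{"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}

# Image as an inline base64 data URL
{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}

# Previously uploaded file, referenced by the UUID returned from /v3/upload
{"type": "file", "file": {"file_id": "550e8400-e29b-..."}}

# Document as inline base64 data
{"type": "file", "file": {"file_data": "data:application/pdf;base64,..."}}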

Image Inputs

Using Image URLs

The simplest method for publicly accessible images:
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/photo.jpg"
                    }
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Using Base64 Image Data

For inline images or when URLs aren’t available:
import base64
import requests

# Read and encode image
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URL
data_url = f"data:image/jpeg;base64,{image_data}"

payload = {
    "model": "anthropic/claude-3-5-sonnet-20241022",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {
                    "type": "image_url",
                    "image_url": {"url": data_url}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)
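If the image type varies, the MIME prefix for the data URL can be derived with Python's standard-library mimetypes module instead of hard-coding image/jpeg. A minimal helper (the to_data_url name is just for illustration):

import base64
import mimetypes

def to_data_url(path):
    """Build a data URL, guessing the MIME type from the file extension."""
    mime, _ = mimetypes.guess_type(path)  # e.g. "image/png" for .png files
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"

data_url = to_data_url("screenshot.png")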

Using Uploaded Files

For reusable images or better performance:
import requests

# Step 1: Upload the image
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

files = {"file": open("screenshot.png", "rb")}
upload_response = requests.post(upload_url, headers=upload_headers, files=files)
file_id = upload_response.json()["file_id"]

print(f"Uploaded file ID: {file_id}")

# Step 2: Use the file in LLM request
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "google/gemini-1.5-pro",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze this screenshot and list all UI elements."},
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(llm_url, headers=llm_headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Document Inputs

PDF and Document Files

Send PDFs and documents for analysis:
import requests

# Upload PDF document
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

files = {"file": open("report.pdf", "rb")}
data = {"purpose": "llm-analysis"}

upload_response = requests.post(
    upload_url,
    headers=upload_headers,
    files=files,
    data=data
)
file_id = upload_response.json()["file_id"]

# Analyze the PDF with LLM
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-3-5-sonnet-20241022",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document and extract key findings."},
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(llm_url, headers=llm_headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Base64 Document Data

For inline document processing:
import base64
import requests

# Read and encode PDF
with open("invoice.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URL for PDF
data_url = f"data:application/pdf;base64,{pdf_data}"

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all line items and totals from this invoice."},
                {
                    "type": "file",
                    "file": {"file_data": data_url}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

Mixed Content Messages

Multiple Images

Send multiple images in a single message:
import requests
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two images and describe the differences."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/before.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/after.jpg"}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

Text + Images + Documents

Combine different media types:
import requests
payload = {
    "model": "anthropic/claude-3-5-sonnet-20241022",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Review the chart and supporting documentation. Provide analysis."
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"}
                },
                {
                    "type": "file",
                    "file": {"file_id": "550e8400-e29b-41d4-a716-446655440000"}
                }
            ]
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

Practical Examples

Analyze a Screenshot

import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "This is a screenshot of an error message. What's wrong and how do I fix it?"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/error-screenshot.png"}
                }
            ]
        }
    ],
    "stream": True,
    "max_tokens": 500
}

response = requests.post(url, headers=headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: '):
            data = line_str[6:]
            if data != '[DONE]':
                print(data)

Extract Receipt Data

import requests
import base64
import json

# Read receipt image
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-3-5-sonnet-20241022",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the following from this receipt: merchant name, date, total amount, items purchased. Format as JSON."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data}"
                    }
                }
            ]
        }
    ],
    "stream": True,
    "temperature": 0.2  # Lower temperature for structured extraction
}

response = requests.post(url, headers=headers, json=payload, stream=True)

full_response = ""
for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: '):
            data = line_str[6:]
            if data != '[DONE]':
                # Parse the chunk and accumulate the streamed content
                chunk = json.loads(data)
                if chunk.get('choices', [{}])[0].get('delta', {}).get('content'):
                    full_response += chunk['choices'][0]['delta']['content']

print("Extracted data:", full_response)

Summarize PDF Document

import requests

# Upload PDF
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

files = {"file": open("research-paper.pdf", "rb")}
upload_response = requests.post(upload_url, headers=upload_headers, files=files)
file_id = upload_response.json()["file_id"]

# Request summary
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "google/gemini-1.5-pro",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Provide a comprehensive summary of this research paper, including methodology, key findings, and conclusions."
                },
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ],
    "stream": True,
    "max_tokens": 1000
}

response = requests.post(llm_url, headers=llm_headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Chart Analysis

import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Analyze this chart and provide: 1) Main trends, 2) Notable outliers, 3) Key insights, 4) Recommendations"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sales-chart.png"}
                }
            ]
        }
    ],
    "stream": True,
    "temperature": 0.3
}

response = requests.post(url, headers=headers, json=payload, stream=True)

for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Provider Support Matrix

Different providers have varying multimodal capabilities:
Provider  | Models                           | Max Image Size | Max File Size
OpenAI    | gpt-4o, gpt-4-turbo              | 20 MB          | 512 MB
Anthropic | claude-3-opus, claude-3-5-sonnet | 5 MB           | 10 MB
Google    | gemini-1.5-pro, gemini-1.5-flash | 20 MB          | 2 GB
Mistral   | pixtral-12b                      | 10 MB          | -
Support for image URLs, base64 images, and PDF/document input varies by model.
See Vision Capabilities for a detailed provider comparison.
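Because these limits differ, it can help to validate files locally before sending a request. A minimal pre-flight check, assuming the image size limits from the table above (verify current values in the provider comparison):

import os

# Approximate per-provider image size limits in MB (from the table above)
MAX_IMAGE_MB = {"openai": 20, "anthropic": 5, "google": 20, "mistral": 10}

def check_image_size(path, model):
    """Raise if a local image exceeds the provider's documented image size limit."""
    provider = model.split("/")[0]  # e.g. "anthropic/claude-3-5-sonnet-..." -> "anthropic"
    limit_mb = MAX_IMAGE_MB.get(provider)
    if limit_mb is None:
        return  # unknown provider: skip the local check
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > limit_mb:
        raise ValueError(f"{path} is {size_mb:.1f} MB, above the {limit_mb} MB limit for {provider}")

check_image_size("receipt.jpg", "anthropic/claude-3-5-sonnet-20241022")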

Best Practices

Choosing Input Method

Use HTTP(S) URLs when:
  • Images are publicly accessible
  • You want to minimize request payload size
  • Files are already hosted
Use uploaded files (UUID) when:
  • Processing the same file multiple times
  • Files are large (reduces repeated upload overhead)
  • Better performance is needed
Use base64 when:
  • Files are small (under 5 MB)
  • URLs aren’t available
  • Security/privacy requires inline data
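These rules can be folded into a small helper. The sketch below simply encodes the guidance above (the 5 MB threshold and function name are illustrative):

import os

def choose_input_method(path=None, public_url=None, reuse=False):
    """Suggest an input method following the guidelines above."""
    if public_url:
        return "url"       # already hosted and publicly accessible
    if reuse or os.path.getsize(path) > 5 * 1024 * 1024:
        return "upload"    # large or frequently reused files: upload once, pass the file_id
    return "base64"        # small, private, one-off files: inline as a data URL

print(choose_input_method(path="receipt.jpg"))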

Optimizing Performance

Image optimization:
  • Resize large images before uploading (see the sketch below)
  • Use appropriate compression
  • Consider using URLs for public images
Document optimization:
  • Extract relevant pages from large PDFs
  • Use text extraction for text-heavy documents
  • Consider OCR preprocessing for scanned documents
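For example, a large image can be downscaled and recompressed with Pillow before encoding or uploading (a sketch; assumes the Pillow package is installed):

from PIL import Image

# Shrink to fit within 1024x1024 (aspect ratio preserved) and recompress as JPEG
with Image.open("photo.jpg") as img:
    img.thumbnail((1024, 1024))
    img.convert("RGB").save("photo-small.jpg", "JPEG", quality=85)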

Prompting Strategies

Be specific:
# Vague
"What's in this image?"

# Specific
"List all visible UI components in this screenshot, including buttons, text fields, and their labels."
Provide context:
{
    "type": "text",
    "text": "This is a medical chart showing patient vitals over 24 hours. Identify any concerning trends."
}
Use structured output:
{
    "type": "text",
    "text": "Extract data as JSON with fields: date, vendor, total, items[]."
}

Error Handling

Common Issues

File too large:
{
  "error": {
    "code": "file_too_large",
    "message": "File size exceeds provider limit of 20 MB"
  }
}
Unsupported format:
{
  "error": {
    "code": "unsupported_format",
    "message": "Image format .bmp is not supported"
  }
}
Invalid base64:
{
  "error": {
    "code": "invalid_base64",
    "message": "Invalid base64 data in data URL"
  }
}

Handling Errors

import requests

try:
    response = requests.post(url, headers=headers, json=payload, stream=True)
    response.raise_for_status()

    for line in response.iter_lines():
        if line:
            print(line.decode('utf-8'))

except requests.exceptions.HTTPError as e:
    if e.response.status_code == 413:
        print("File too large. Try compressing or resizing.")
    elif e.response.status_code == 422:
        print("Invalid request:", e.response.json())
    else:
        print(f"HTTP error: {e}")

Next Steps