Send images, documents, and other media files to LLMs for analysis and understanding.

Overview

Eden AI V3 LLM endpoints support multimodal inputs, allowing you to send:
  • Images - For visual understanding and analysis
  • Documents - PDFs and text files for processing
  • Mixed content - Combine text prompts with media
Multimodal capabilities enable use cases like:
  • Analyzing screenshots and diagrams
  • Extracting data from images and documents
  • Visual question answering
  • Chart and graph interpretation
  • Receipt and invoice processing

Supported Input Types

V3 supports multiple ways to send media to LLMs:
| Input Type | Format | Best For | Example |
|---|---|---|---|
| HTTP(S) URL | Direct link | Publicly accessible files | https://example.com/image.jpg |
| Base64 Data URL | Inline encoded data | Small files, secure data | data:image/jpeg;base64,... |
| File Upload | UUID from /v3/upload | Reusable files, large files | 550e8400-e29b-... |
| Base64 File Data | Raw base64 or data URL | PDFs, documents | data:application/pdf;base64,... |
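All four input types reduce to two content-part shapes: `image_url` parts (URLs and data URLs) and `file` parts (upload UUIDs and inline file data). A small helper can build the right part; the function name and `kind` labels here are illustrative, not part of the API — only the part shapes mirror the payloads shown throughout this guide.

```python
def media_part(kind, value):
    """Build a multimodal content part for a V3 chat payload.

    kind: "url" or "data_url"      -> image_url part
          "file_id" or "file_data" -> file part
    (Helper and kind names are illustrative; the dict shapes follow
    the request examples in this guide.)
    """
    if kind in ("url", "data_url"):
        return {"type": "image_url", "image_url": {"url": value}}
    if kind == "file_id":
        return {"type": "file", "file": {"file_id": value}}
    if kind == "file_data":
        return {"type": "file", "file": {"file_data": value}}
    raise ValueError(f"unknown input kind: {kind}")
```

A message's `content` list is then just a text part followed by any number of parts built this way.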

Image Inputs

Using Image URLs

The simplest method for publicly accessible images:
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/photo.jpg"
                    }
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result)

Using Base64 Image Data

For inline images or when URLs aren’t available:
import base64
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Read and encode image
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URL
data_url = f"data:image/jpeg;base64,{image_data}"

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {
                    "type": "image_url",
                    "image_url": {"url": data_url}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

Using Uploaded Files

For reusable images or better performance:
import requests

# Step 1: Upload the image
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

with open("screenshot.png", "rb") as f:
    files = {"file": f}
    upload_response = requests.post(upload_url, headers=upload_headers, files=files)
file_id = upload_response.json()["file_id"]

print(f"Uploaded file ID: {file_id}")

# Step 2: Use the file in LLM request
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "google/gemini-2.5-pro",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze this screenshot and list all UI elements."},
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ]
}

response = requests.post(llm_url, headers=llm_headers, json=payload)
result = response.json()
print(result)

Document Inputs

PDF and Document Files

Send PDFs and documents for analysis:
import requests

# Upload PDF document
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

with open("report.pdf", "rb") as f:
    upload_response = requests.post(
        upload_url,
        headers=upload_headers,
        files={"file": f},
        data={"purpose": "llm-analysis"},
    )
file_id = upload_response.json()["file_id"]

# Analyze the PDF with LLM
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document and extract key findings."},
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ]
}

response = requests.post(llm_url, headers=llm_headers, json=payload)
result = response.json()
print(result)

Base64 Document Data

For inline document processing:
import base64
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Read and encode PDF
with open("invoice.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URL for PDF
data_url = f"data:application/pdf;base64,{pdf_data}"

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all line items and totals from this invoice."},
                {
                    "type": "file",
                    "file": {"file_data": data_url}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

Mixed Content Messages

Multiple Images

Send multiple images in a single message:
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two images and describe the differences."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/before.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/after.jpg"}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

Text + Images + Documents

Combine different media types:
import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Review the chart and supporting documentation. Provide analysis."
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"}
                },
                {
                    "type": "file",
                    "file": {"file_id": "550e8400-e29b-41d4-a716-446655440000"}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

Practical Examples

Analyze a Screenshot

import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "This is a screenshot of an error message. What's wrong and how do I fix it?"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/error-screenshot.png"}
                }
            ]
        }
    ],
    "max_tokens": 500
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result)

Extract Receipt Data

import requests
import base64

# Read receipt image
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the following from this receipt: merchant name, date, total amount, items purchased. Format as JSON."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data}"
                    }
                }
            ]
        }
    ],
    "temperature": 0.2  # Lower temperature for structured extraction
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print("Extracted data:", result)

Summarize PDF Document

import requests

# Upload PDF
upload_url = "https://api.edenai.run/v3/upload"
upload_headers = {"Authorization": "Bearer YOUR_API_KEY"}

with open("research-paper.pdf", "rb") as f:
    upload_response = requests.post(upload_url, headers=upload_headers, files={"file": f})
file_id = upload_response.json()["file_id"]

# Request summary
llm_url = "https://api.edenai.run/v3/llm/chat/completions"
llm_headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "google/gemini-2.5-pro",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Provide a comprehensive summary of this research paper, including methodology, key findings, and conclusions."
                },
                {
                    "type": "file",
                    "file": {"file_id": file_id}
                }
            ]
        }
    ],
    "max_tokens": 1000
}

response = requests.post(llm_url, headers=llm_headers, json=payload)
result = response.json()
print(result)

Chart Analysis

import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Analyze this chart and provide: 1) Main trends, 2) Notable outliers, 3) Key insights, 4) Recommendations"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sales-chart.png"}
                }
            ]
        }
    ],
    "temperature": 0.3
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result)

Provider Support Matrix

Different providers have varying multimodal capabilities:
| Provider | Models | Image URLs | Base64 Images | PDF/Docs | Max Image Size | Max File Size |
|---|---|---|---|---|---|---|
| OpenAI | gpt-4o, gpt-4-turbo | ✓ | ✓ | ✓ | 20 MB | 512 MB |
| Anthropic | claude-opus-4-5, claude-sonnet-4-5 | ✓ | ✓ | ✓ | 5 MB | 10 MB |
| Google | gemini-2.5-pro, gemini-2.5-flash | ✓ | ✓ | ✓ | 20 MB | 2 GB |
| Mistral | pixtral-12b | ✓ | ✓ | — | 10 MB | — |
See Vision Capabilities for detailed provider comparison.

Best Practices

Choosing Input Method

Use HTTP(S) URLs when:
  • Images are publicly accessible
  • You want to minimize request payload size
  • Files are already hosted
Use uploaded files (UUID) when:
  • Processing the same file multiple times
  • Files are large (reduces repeated upload overhead)
  • Better performance is needed
Use base64 when:
  • Files are small (under ~5 MB)
  • URLs aren’t available
  • Security/privacy requires inline data
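One reason to keep inline files small: base64 turns every 3 bytes into 4 ASCII characters, so an inline payload is roughly a third larger than the original file (before the `data:` prefix is even added):

```python
import base64

raw = b"\x00" * 3_000_000           # stand-in for ~3 MB of image bytes
encoded = base64.b64encode(raw)     # every 3 bytes become 4 characters
print(len(encoded) / len(raw))      # -> 1.3333333333333333
```

A 5 MB image therefore produces a request body of roughly 6.7 MB, which is why larger files are better served by `/v3/upload` or a URL.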

Optimizing Performance

Image optimization:
  • Resize large images before uploading
  • Use appropriate compression
  • Consider using URLs for public images
Document optimization:
  • Extract relevant pages from large PDFs
  • Use text extraction for text-heavy documents
  • Consider OCR preprocessing for scanned documents
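Resizing itself requires an imaging library, but the target dimensions are simple arithmetic: scale the longer side down to a cap while preserving aspect ratio. The 1568 px cap below is illustrative, not a documented limit:

```python
def fit_within(width, height, max_side=1568):
    """Scale (width, height) so the longer side is at most max_side,
    preserving aspect ratio. The default cap is illustrative."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)

print(fit_within(3000, 2000))  # (1568, 1045)
print(fit_within(800, 600))    # (800, 600) -- already small enough
```

Feed the result to whatever imaging library you use for the actual resize before encoding or uploading.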

Prompting Strategies

Be specific:
# Vague
"What's in this image?"

# Specific
"List all visible UI components in this screenshot, including buttons, text fields, and their labels."
Provide context:
{
    "type": "text",
    "text": "This is a medical chart showing patient vitals over 24 hours. Identify any concerning trends."
}
Use structured output:
{
    "type": "text",
    "text": "Extract data as JSON with fields: date, vendor, total, items[]."
}

Error Handling

Common Issues

File too large:
{
  "error": {
    "code": "file_too_large",
    "message": "File size exceeds provider limit of 20 MB"
  }
}
Unsupported format:
{
  "error": {
    "code": "unsupported_format",
    "message": "Image format .bmp is not supported"
  }
}
Invalid base64:
{
  "error": {
    "code": "invalid_base64",
    "message": "Invalid base64 data in data URL"
  }
}

Handling Errors

import requests

url = "https://api.edenai.run/v3/llm/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
}

try:
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    result = response.json()
    print(result)

except requests.exceptions.HTTPError as e:
    if e.response.status_code == 413:
        print("File too large. Try compressing or resizing.")
    elif e.response.status_code == 422:
        print("Invalid request:", e.response.json())
    else:
        print(f"HTTP error: {e}")

Next Steps