Batch processing

Let's process multiple files in a single request.

What is Batch processing?

Batch processing is a method of handling a large amount of data or tasks all at once, rather than individually or in real-time. It involves grouping the data or tasks into batches and processing them together as a single unit. This approach is efficient because it reduces the overhead associated with processing each item separately. Batch processing is commonly used in various fields, such as data analysis, computer programming, and data entry, where large volumes of data need to be processed in a systematic and automated manner.

Features available for Batch processing

Eden AI Batch processing is available for all Eden AI synchronous features:

  • audio: text_to_speech
  • image: anonymization, explicit_content, face_detection, generation, landmark_detection, logo_detection, object_detection
  • OCR: identity_parser, invoice_parser, ocr, receipt_parser, resume_parser
  • text: anonymization, chat, code_generation, custom_classification, custom_named_entity_recognition, embeddings, generation, keyword_extraction, moderation, named_entity_recognition, question_answer, search, sentiment_analysis, spell_check, summarize, syntax_analysis, topic_extraction
  • translation: automatic_translation, document_translation, language_detection

How to use Batch processing?

You can access the API reference to perform batch processing requests. Batch processing is an asynchronous API, so you first need to make a POST request containing all the data you want to process (text or file URLs). Here is a Python code sample:

import requests

headers = {"Authorization": "Bearer your_API_key"}

url = "https://api.edenai.run/v2/text/sentiment_analysis/batch/test/"

# Each entry in "requests" is one call to process, with the parameters
# required by the subfeature (here, text/sentiment_analysis)
payload = {
    "requests": [
        {
            "text": "It's -25 outside and I am so hot.",
            "language": "en",
            "providers": "google"
        },
        {
            "text": "Overall I am satisfied with my experience at Amazon, but two areas of major improvement needed.",
            "language": "en",
            "providers": "google"
        }
    ]
}

response = requests.post(url, json=payload, headers=headers)
print(response.text)

In this example, we made a batch request for sentiment analysis. Depending on the subfeature you want to use, you need to adapt the URL: https://api.edenai.run/v2/{feature}/{subfeature}/batch/{name}/

{feature} can take one of the following values: audio, image, OCR, text, translation.

{subfeature} can take any of the values listed above, associated with the corresponding {feature}.

{name} is the name of your batch; this value is optional.

Then, in "requests", you can set up the different calls you want to process, with all the parameters required by the subfeature you use, as shown in the sketch below.
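For instance, switching to another subfeature only changes the {feature}, {subfeature} and {name} parts of the URL and the per-request parameters. Here is a minimal sketch of a small helper that builds the batch URL and submits the requests, reusing the sentiment analysis example above (the helper function and its name are our own illustration, not part of the Eden AI SDK):

import requests

API_KEY = "your_API_key"  # replace with your Eden AI API key

def submit_batch(feature, subfeature, name, requests_payload):
    """Build the batch URL and POST the list of per-request payloads.
    Illustrative helper, not part of the Eden AI SDK."""
    url = f"https://api.edenai.run/v2/{feature}/{subfeature}/batch/{name}/"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    response = requests.post(url, json={"requests": requests_payload}, headers=headers)
    response.raise_for_status()
    return response.json()

# Same sentiment analysis batch as above, submitted through the helper
result = submit_batch(
    "text",
    "sentiment_analysis",
    "test",
    [
        {"text": "It's -25 outside and I am so hot.", "language": "en", "providers": "google"},
        {"text": "Overall I am satisfied with my experience at Amazon, but two areas of major improvement needed.", "language": "en", "providers": "google"},
    ],
)
print(result)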

This POST request will return the ID of the batch ({name}, or a generated ID if you did not define {name}).

You can then access the results with a GET request:

url = "https://api.edenai.run/v2/text/sentiment_analysis/batch/test/"

headers = {"Authorization": "Bearer your_API_key"}

response = requests.get(url, headers=headers)

print(response.text)
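Because batch processing is asynchronous, the results may not be available right after the POST request. Below is a minimal polling sketch; the completion check is an assumption made for illustration, so adapt the field names to the actual response format documented in the API reference:

import time
import requests

url = "https://api.edenai.run/v2/text/sentiment_analysis/batch/test/"
headers = {"Authorization": "Bearer your_API_key"}

# Poll the batch endpoint until results are available.
for _ in range(10):
    response = requests.get(url, headers=headers)
    data = response.json()
    print(data)
    # Hypothetical completion check: replace "status"/"finished" with the
    # real fields returned by the API (see the API reference)
    if data.get("status") == "finished":
        break
    time.sleep(5)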