Visual AI predictions

There are various methods for making predictions from image models:

Method | Description | See...
UI predictions | Use the same dataset format as the one used to create the project. | Make Predictions tab
Model Deployment: API (real-time, small datasets) | Use base64 format (described below). | Deploy tab (UI); Prediction API
Model Deployment: Batch (large datasets) | Use base64 format (described below). | Batch prediction scripts
Portable Prediction Server | Use base64 format (described below). | Community example

Base64 encoding format

Most prediction methods use a single CSV file. If your training dataset consists of a ZIP archive of image files, the prediction dataset must be converted to a different format so that it is fully contained in a single CSV file.

Sample scripts

See the links below, from the DataRobot Community, for help with visual data conversion:

Note

Log in to GitHub before accessing these GitHub resources.

The following shows sample usage:

python visualai_data_prep.py pred_dataset.zip pred_dataset.csv image

Where:

  • pred_dataset.zip is the input dataset (ZIP of images).
  • pred_dataset.csv is the output file, which can be used with the Prediction API.
  • image is the name of the image column.
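
For readers who want to see the general shape of such a conversion, the following is a minimal sketch. It is not the community visualai_data_prep.py script, whose internals are not shown here. The function name zip_images_to_csv and the list of image extensions are hypothetical, chosen only for illustration, and the sketch assumes the ZIP contains nothing but image files to score.

```python
import base64
import csv
import zipfile
from pathlib import PurePosixPath

# Assumption: which file extensions count as images (not taken from DataRobot docs).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def zip_images_to_csv(zip_path, csv_path, image_column):
    """Write one CSV row per image file in the ZIP and return the file names in row order."""
    written = []
    with zipfile.ZipFile(zip_path) as archive, open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow([image_column])  # header row: a single image column
        for name in sorted(archive.namelist()):
            if PurePosixPath(name).suffix.lower() not in IMAGE_EXTENSIONS:
                continue  # skip directory entries and non-image files
            # Encode the file bytes exactly as stored; the pixels are never decoded.
            encoded = base64.b64encode(archive.read(name)).decode("utf-8")
            writer.writerow([encoded])
            written.append(name)
    return written


# Hypothetical usage, mirroring the command above:
# names = zip_images_to_csv("pred_dataset.zip", "pred_dataset.csv", "image")
```

The returned list of names lets you match each prediction row back to its source file, since the CSV itself carries only the image column.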

Deep dive

To convert a set of image files into a single CSV file, each image must be converted to base64 text. This format allows DataRobot to embed images as a regular text column in the CSV. Encoding binary image data as base64 is a simple operation that most programming languages support through their standard libraries.

Here is an example in Python:

import base64
import pandas as pd
from io import BytesIO
from PIL import Image


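def image_to_base64(image: Image.Image) -> str:
    # JPEG cannot store an alpha channel or a palette, so convert those modes first
    if image.mode not in ('RGB', 'L'):
        image = image.convert('RGB')
    # write the image as a JPEG file in memory, then base64-encode the file bytes
    img_bytes = BytesIO()
    image.save(img_bytes, 'jpeg', quality=90)
    image_base64 = base64.b64encode(img_bytes.getvalue()).decode('utf-8')
    return image_base64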


# let's build a CSV with a single row that contains an image
# the same general approach works if you have multiple image rows or columns
image = Image.open('cat.jpg')
image_base64 = image_to_base64(image)

df = pd.DataFrame({'animal_image': [image_base64]})
df.to_csv('prediction_dataset.csv', index=False)
print(df)
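
Before uploading, it can help to confirm that every image cell in the CSV decodes back to a real image file. The sketch below is one possible check using only the Python standard library. The function name check_image_column is hypothetical, and the check recognizes only JPEG and PNG file signatures, which is an assumption made for illustration rather than a statement about the formats DataRobot accepts.

```python
import base64
import csv

JPEG_MAGIC = b"\xff\xd8\xff"        # first bytes of a JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # first bytes of a PNG file


def check_image_column(csv_path, image_column):
    """Return how many rows were checked; raise ValueError on the first bad cell."""
    # Base64 images easily exceed the csv module's default field size limit,
    # so raise the limit before reading.
    csv.field_size_limit(2**31 - 1)
    checked = 0
    with open(csv_path, newline="") as f:
        for row_number, row in enumerate(csv.DictReader(f), start=1):
            # validate=True rejects characters outside the base64 alphabet
            raw = base64.b64decode(row[image_column], validate=True)
            if not raw.startswith((JPEG_MAGIC, PNG_MAGIC)):
                raise ValueError(f"row {row_number}: not a JPEG or PNG file")
            checked += 1
    return checked


# Hypothetical usage with the file written above:
# check_image_column("prediction_dataset.csv", "animal_image")
```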

Note

Base64-encode the bytes of the image file itself (for example, the JPEG or PNG file), not the decoded pixel contents. This example uses PIL.Image to open the file, but you can also base64-encode an image file directly.
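
As a minimal sketch of encoding files directly, the following reads each image file's bytes and base64-encodes them without PIL or pandas. The helper names file_to_base64 and build_prediction_csv are hypothetical, and the default column name animal_image is reused from the example above only for consistency.

```python
import base64
import csv
from pathlib import Path


def file_to_base64(path):
    """Base64-encode the raw bytes of an image file; the pixels are never decoded."""
    return base64.b64encode(Path(path).read_bytes()).decode("utf-8")


def build_prediction_csv(image_paths, csv_path, image_column="animal_image"):
    """Write a CSV with one image column and one row per input file."""
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow([image_column])
        for path in image_paths:
            writer.writerow([file_to_base64(path)])


# Hypothetical usage:
# build_prediction_csv(["cat.jpg", "dog.jpg"], "prediction_dataset.csv")
```

Because the file bytes are passed through unchanged, this approach avoids the recompression that happens when an image is opened and saved again as JPEG.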


Updated October 26, 2021