Visual AI predictions

There are various methods for making predictions from image models:

| Method | Description | See... |
|---|---|---|
| UI predictions | Use the same dataset format as the one used to create the project (upload a ZIP archive with one or more images). | Make Predictions tab |
| Model Deployment: API (real-time, small datasets) | Use base64 format (described below). | |
| Model Deployment: Batch (large datasets) | Use base64 format (described below). | Batch prediction scripts |
| Portable Prediction Server | Use base64 format (described below). | Portable Prediction Server |

Base64 encoding format

If your training dataset consists of a ZIP archive with one or more image files, the prediction dataset must be converted to a base64-encoded format so that it is fully contained in a single CSV file.

Sample scripts

See the links below for help with visual data conversion:

Note

Log in to GitHub before accessing these GitHub resources.

The following shows sample usage:

python visualai_data_prep.py pred_dataset.zip pred_dataset.csv image

Where:

  • pred_dataset.zip is the input dataset (ZIP of images).
  • pred_dataset.csv is the output file, which can be used with the Prediction API.
  • image is the name of the image column.
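
The conversion described above can be sketched with the Python standard library alone. This is not the contents of visualai_data_prep.py; it is a minimal illustration under stated assumptions: the function name zip_images_to_csv and the IMAGE_EXTENSIONS set are hypothetical, every image in the ZIP becomes one CSV row, and the file bytes are encoded as-is without re-encoding the image.

```python
import base64
import csv
import zipfile
from pathlib import PurePosixPath

# Hypothetical set of file extensions treated as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def zip_images_to_csv(zip_path, csv_path, image_column="image"):
    """Write one CSV row per image in the ZIP, base64-encoding the raw file bytes.

    Returns the number of image rows written.
    """
    rows_written = 0
    with zipfile.ZipFile(zip_path) as archive, \
            open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow([image_column])  # header row
        for name in sorted(archive.namelist()):
            if PurePosixPath(name).suffix.lower() not in IMAGE_EXTENSIONS:
                continue  # skip directory entries and non-image files
            encoded = base64.b64encode(archive.read(name)).decode("ascii")
            writer.writerow([encoded])
            rows_written += 1
    return rows_written
```

With the file names from the sample usage, a call would look like zip_images_to_csv("pred_dataset.zip", "pred_dataset.csv", "image").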

Deep dive

To convert a set of image files into a single CSV file, each image must be converted to base64 text. This format allows DataRobot to embed images as a regular text column in the CSV. Encoding binary image data as base64 is a simple operation that most programming languages support through a standard or widely available library.

Here is an example in Python:

import base64
import pandas as pd
from io import BytesIO
from PIL import Image


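def image_to_base64(image: Image.Image) -> str:
    # save the image as JPEG into an in-memory buffer
    img_bytes = BytesIO()
    image.save(img_bytes, 'jpeg', quality=90)
    # base64-encode the JPEG bytes and return them as text
    image_base64 = base64.b64encode(img_bytes.getvalue()).decode('utf-8')
    return image_base64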


# let's build a CSV with a single row that contains an image
# the same general approach works if you have multiple image rows or columns
image = Image.open('cat.jpg')
image_base64 = image_to_base64(image)

df = pd.DataFrame({'animal_image': [image_base64]})
df.to_csv('prediction_dataset.csv', index=False)
print(df)
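
Before sending a CSV like this for predictions, it can be useful to confirm that the image column decodes back to valid image bytes. The following is a minimal sketch of such a check, not part of any DataRobot tooling; the helper names decode_image_column and looks_like_jpeg are hypothetical, and the JPEG check only inspects the leading signature bytes rather than fully parsing the image.

```python
import base64
import csv

# Base64 image columns can exceed the csv module's default field size limit,
# so raise it before reading.
csv.field_size_limit(2**31 - 1)

# JPEG files begin with these three bytes.
JPEG_MAGIC = b"\xff\xd8\xff"


def decode_image_column(csv_path, image_column):
    """Return the decoded bytes of every value in a base64 image column."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [
            # validate=True raises an error on characters outside the base64 alphabet
            base64.b64decode(row[image_column], validate=True)
            for row in csv.DictReader(f)
        ]


def looks_like_jpeg(data):
    """Cheap sanity check: does the payload start with the JPEG signature?"""
    return data.startswith(JPEG_MAGIC)
```

For the example above, decode_image_column('prediction_dataset.csv', 'animal_image') would return a one-element list whose bytes should pass looks_like_jpeg.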

Note

Encode a binary image file (not decoded pixel contents) to base64. This example uses PIL.Image to open the file, but you can base64-encode an image file directly.
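
Encoding the file directly, as the note describes, might look like the sketch below. The helper name file_to_base64 is hypothetical; it reads the file's bytes unchanged and never decodes the image, so no image library is needed.

```python
import base64


def file_to_base64(path):
    """Base64-encode a file's raw bytes without decoding the image."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

The returned string can be placed in the image column of the DataFrame in the same way as the value returned by image_to_base64.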


Updated May 31, 2022