Visual AI predictions
There are various methods for making predictions from image models:
Method | Description | See... |
---|---|---|
UI predictions | Use the same dataset format as the one used to create the project (upload a ZIP archive with one or more images). | Make Predictions tab |
Model Deployment: API (real-time, small datasets) | Use the base64 format (described below). | |
Model Deployment: Batch (large datasets) | For the API Client and HTTP Interface options, use the same dataset format as the original dataset used to create the project (i.e., upload a ZIP archive with one or more images). For the CLI Interface option use the base64 format (described below). | Batch prediction scripts |
Portable Prediction Server | Use the base64 format (described below). | Portable Prediction Server |
Base64 encoding format
If your training dataset consists of a ZIP archive with one or more image files, the prediction dataset must be converted so that each image is base64-encoded as text and the whole dataset is contained in a single CSV file.
Sample scripts
See the links below for help with visual data conversion:
- Tutorial: Getting Predictions for Visual AI Projects via API Calls
- DataRobot Python package: Preparing data for predictions using the DataRobot library
- Script: Comprehensive data prep script
Note
Log in to GitHub before accessing these GitHub resources.
The following shows sample usage:
python visualai_data_prep.py pred_dataset.zip pred_dataset.csv image
Where:
- visualai_data_prep.py is the comprehensive data prep script used for converting images to base64 format.
- pred_dataset.zip is the input dataset (a ZIP archive of images).
- pred_dataset.csv is the output file, which can be used with the prediction API.
- image is the name of the image column.
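To illustrate what such a conversion involves, here is a minimal sketch that uses only the Python standard library. It is not the visualai_data_prep.py script itself: the function name zip_images_to_csv and the IMAGE_EXTENSIONS filter are assumptions made for illustration, and the sketch writes only a single image column.

```python
import base64
import csv
import zipfile
from pathlib import PurePosixPath

# Assumption: these extensions identify image files; adjust to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def zip_images_to_csv(zip_path: str, csv_path: str, image_column: str = "image") -> int:
    """Write one CSV row per image in the ZIP, with the file bytes base64-encoded.

    Returns the number of image rows written.
    """
    rows_written = 0
    with zipfile.ZipFile(zip_path) as archive, \
            open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow([image_column])  # header row
        for name in sorted(archive.namelist()):
            if name.endswith("/"):
                continue  # skip directory entries
            if PurePosixPath(name).suffix.lower() not in IMAGE_EXTENSIONS:
                continue  # skip non-image files
            raw_bytes = archive.read(name)  # the encoded file, not decoded pixels
            writer.writerow([base64.b64encode(raw_bytes).decode("ascii")])
            rows_written += 1
    return rows_written
```

Calling zip_images_to_csv("pred_dataset.zip", "pred_dataset.csv", "image") would mirror the sample command above. If the model was trained with additional feature columns, they would need to be added to each row as well.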
Deep dive
To convert a set of image files into a single CSV file, each image must be converted to base64 text. This format allows DataRobot to embed images as a regular text column in the CSV. Encoding binary image data as base64 is a simple operation that most programming languages support through a standard library.
Here is an example in Python:
import base64
from io import BytesIO

import pandas as pd
from PIL import Image


def image_to_base64(image: Image.Image) -> str:
    # Re-encode the image as JPEG in memory, then base64-encode those bytes.
    img_bytes = BytesIO()
    # JPEG has no alpha channel, so convert to RGB before saving.
    image.convert('RGB').save(img_bytes, 'jpeg', quality=90)
    image_base64 = base64.b64encode(img_bytes.getvalue()).decode('utf-8')
    return image_base64


# let's build a CSV with a single row that contains an image
# the same general approach works if you have multiple image rows or columns
image = Image.open('cat.jpg')
image_base64 = image_to_base64(image)
df = pd.DataFrame({'animal_image': [image_base64]})
df.to_csv('prediction_dataset.csv', index=False)
print(df)
Note
Encode a binary image file (not decoded pixel contents) to base64. This example uses PIL.Image to open the file, but you can base64-encode an image file directly.
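As a minimal sketch of the direct approach mentioned in the note, the following encodes a file's bytes exactly as they are stored on disk, without opening the image in PIL. The helper names file_to_base64 and base64_to_bytes are hypothetical and are not part of any DataRobot library.

```python
import base64
from pathlib import Path


def file_to_base64(path: str) -> str:
    """Return the file's raw bytes as base64 text, without decoding the image."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def base64_to_bytes(text: str) -> bytes:
    """Inverse of file_to_base64, useful for checking a round trip."""
    return base64.b64decode(text)
```

Because the bytes are never decoded and re-encoded, the image is not recompressed, and decoding the base64 text yields the original file contents byte for byte.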