# Make batch predictions with Google Cloud storage

> Make batch predictions with Google Cloud storage - Learn how to make and write predictions to Google
> Cloud Storage.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.294690+00:00` (UTC).

## Primary page

- [Make batch predictions with Google Cloud storage](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html): Full documentation for this topic (HTML).

## Sections on this page

- [Requirements](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html#requirements): In-page section heading.
- [Configure a GCP service account](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html#configure-a-gcp-service-account): In-page section heading.
- [Create stored credentials](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html#create-stored-credentials): In-page section heading.
- [Run the prediction job](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html#run-the-prediction-job): In-page section heading.
- [Documentation](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/gcs-pred.html#documentation): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [Python API client user guide](https://docs.datarobot.com/en/docs/api/dev-learning/python/index.html): Linked from this page.
- [Python code examples](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/index.html): Linked from this page.
- [Prediction code examples](https://docs.datarobot.com/en/docs/api/dev-learning/python/py-code-examples/prediction-examples/index.html): Linked from this page.
- [A DataRobot deployment](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/deploy-methods/index.html): Linked from this page.
- [Deployments > Predictions > Prediction API](https://docs.datarobot.com/en/docs/classic-ui/predictions/realtime/code-py.html): Linked from this page.
- [Prediction API overview](https://docs.datarobot.com/en/docs/api/reference/predapi/index.html): Linked from this page.
- [DataRobot Batch Prediction API](https://docs.datarobot.com/en/docs/api/reference/batch-prediction-api/index.html): Linked from this page.

## Documentation content

# Make batch predictions with Google Cloud storage

The DataRobot Batch Prediction API allows you to take in large datasets and score them against deployed models running on a prediction server. The API also provides flexible options for the intake and output of these files.

In this tutorial, you will learn how to use the DataRobot Python Client package (which calls the Batch Prediction API) to set up a batch prediction job. The job reads an input file for scoring from Google Cloud Storage (GCS) and then writes the results back to GCS.

## Requirements

To use the code provided in this tutorial, make sure you have the following:

- Python 2.7 or 3.4+
- The DataRobot Python package (version 2.21.0+)
- A DataRobot deployment
- A GCS bucket
- A service account with access to the GCS bucket (detailed below)
- A scoring dataset that lives in the GCS bucket to use with your DataRobot deployment
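As a quick sanity check against the requirements above, the sketch below verifies the interpreter version before you run anything else; the `pip` line in the comment assumes the client is installed from PyPI under the package name `datarobot`.

```
import sys

# Install the client first if needed:
#   pip install "datarobot>=2.21.0"

# The tutorial requires Python 2.7 or 3.4+
assert sys.version_info[:2] == (2, 7) or sys.version_info >= (3, 4), (
    "Python 2.7 or 3.4+ is required"
)
print("Python version OK:", sys.version.split()[0])
```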

## Configure a GCP service account

Running batch prediction jobs requires the appropriate credentials to read and write to GCS. You must create a service account within the Google Cloud Platform that has access to the GCS bucket, then download a key for the account to use in the batch prediction job.

1. To retrieve these credentials, log into the Google Cloud Platform console and select **IAM & Admin > Service Accounts** from the sidebar.
2. Click **Create Service Account**. Provide a name and description for the account, then click **Create > Done**.
3. On the **Service Accounts** page, find the account that you just created, navigate to the **Details** page, and click **Keys**.
4. Go to the **Add Key** menu and click **Create new key**. Select JSON for the key type and click **Create** to generate a key and download a JSON file with the information required for the batch prediction job.
5. Return to your GCS bucket and navigate to the **Permissions** tab. Click **Add**, enter the email address for the service account user you created, and give the account the "Storage Admin" role. Click **Save** to confirm the changes. This grants your GCP service account access to the GCS bucket.
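After step 4, you can sanity-check the downloaded key before handing it to DataRobot. The sketch below uses a redacted stand-in for the key file's contents (the project, email, and ID values are placeholders) and simply confirms the JSON describes a service account and carries the fields the credential code in the next section expects:

```
import json

# Redacted stand-in for the downloaded key file (placeholder values)
sample_key_json = '''{
    "type": "service_account",
    "project_id": "my-project",
    "private_key_id": "abc123",
    "private_key": "-----BEGIN PRIVATE KEY-----\\n...\\n-----END PRIVATE KEY-----\\n",
    "client_email": "scorer@my-project.iam.gserviceaccount.com",
    "client_id": "123456789",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token"
}'''

key = json.loads(sample_key_json)

# A service-account key must declare its type and carry the signing material
assert key["type"] == "service_account"
for field in ("project_id", "private_key", "client_email", "token_uri"):
    assert field in key, "missing field: " + field
print("Key looks like a service-account key")
```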

## Create stored credentials

After downloading the JSON key, use the following code to create a new credential object within DataRobot. The credentials will be used in the batch prediction job to connect to the GCS bucket. Open the JSON key file and copy its contents into the key variable. The DataRobot Python client reads the JSON data as a dictionary and parses it accordingly.

```
import datarobot as dr

# Set the name for the GCP credential in DataRobot
DR_CREDENTIAL_NAME = "YOUR GCP DATAROBOT CREDENTIAL NAME"

# Create a GCP-specific credential
# NOTE: This cannot be done from the UI

# The key can be generated and downloaded ready to drop in from within GCP:
# 1. Go to IAM & Admin -> Service Accounts
# 2. Search for the service account you want to use (or create a new one)
# 3. Go to Keys
# 4. Click Add Key -> Create new key
# 5. Select the JSON key type
# 6. Copy the contents of the JSON file into the `key` dictionary below
key = {
    "type": "service_account",
    "project_id": "**********",
    "private_key_id": "***************",
    "private_key": "-----BEGIN PRIVATE KEY-----\n********\n-----END PRIVATE KEY-----\n",
    "client_email": "********",
    "client_id": "********",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/*********"
}

credential = dr.Credential.create_gcp(
    name=DR_CREDENTIAL_NAME,
    gcp_key=key
)

# Look up the ID of the credential object just created
credential_id = None
for cred in dr.Credential.list():
    if cred.name == DR_CREDENTIAL_NAME:
        credential_id = cred.credential_id
        break
print(credential_id)
```
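Instead of pasting the key contents into the script, you can load the JSON file directly. The sketch below writes a stand-in key to a temporary file purely to demonstrate the pattern; in a real script, `key_path` would point at the file you downloaded from GCP.

```
import json
import tempfile

# Stand-in for the key file downloaded from GCP (placeholder values)
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"type": "service_account", "project_id": "my-project"}, f)
    key_path = f.name

# In a real script: key_path = "path/to/your-downloaded-key.json"
with open(key_path) as f:
    key = json.load(f)

assert key["type"] == "service_account"
# `key` can now be passed to dr.Credential.create_gcp(..., gcp_key=key)
```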

## Run the prediction job

With a credential object created, you can now configure the batch prediction job. Set both `intake_settings` and `output_settings` to the `gcp` type, and give each the `gs://` URL of the GCS file to read from or write to (the output file does not need to exist in advance). Additionally, provide the ID of the credential object created above. The code below creates and runs the batch prediction job and, once finished, reports the job's status. It also demonstrates how to configure the job to return both Prediction Explanations and passthrough columns for the scoring data.

> [!NOTE]
> You can find the deployment ID in the sample code output of the [Deployments > Predictions > Prediction API](https://docs.datarobot.com/en/docs/classic-ui/predictions/realtime/code-py.html) tab (with Interface set to "API Client").

```
import datarobot as dr

DEPLOYMENT_ID = 'YOUR DEPLOYMENT ID'

# Set GCP info
GCP_BUCKET_NAME = "YOUR GCS BUCKET NAME"
GCP_INPUT_SCORING_FILE = "YOUR INPUT SCORING FILE NAME"
GCP_OUTPUT_RESULTS_FILE = "YOUR OUTPUT RESULTS FILE NAME"

# Set up the batch prediction job
# Input: Google Cloud Storage
# Output: Google Cloud Storage
job = dr.BatchPredictionJob.score(
    deployment=DEPLOYMENT_ID,
    intake_settings={
        'type': 'gcp',
        'url': "gs://{}/{}".format(GCP_BUCKET_NAME, GCP_INPUT_SCORING_FILE),
        'credential_id': credential_id
    },
    output_settings={
        'type': 'gcp',
        'url': "gs://{}/{}".format(GCP_BUCKET_NAME, GCP_OUTPUT_RESULTS_FILE),
        'credential_id': credential_id
    },
    # Request up to 5 Prediction Explanations per row; remove this
    # argument if explanations are not needed
    max_explanations=5,

    # Pass these scoring-data columns through to the output; remove this
    # argument if passthrough columns are not needed
    passthrough_columns=['column1', 'column2']
)

job.wait_for_completion()
print(job.get_status())
```

When the job completes successfully, you will see the output file in the GCS bucket.
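To confirm where the results landed, a small helper (hypothetical, not part of the DataRobot client) can split a `gs://` URL back into the bucket and object name used by GCS tooling:

```
def parse_gcs_url(url):
    """Split a gs://bucket/path/to/object URL into (bucket, object_name)."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError("not a GCS URL: " + url)
    bucket, _, object_name = url[len(prefix):].partition("/")
    return bucket, object_name

bucket, name = parse_gcs_url("gs://my-bucket/results/predictions.csv")
print(bucket, name)  # my-bucket results/predictions.csv
```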

## Documentation

- Prediction API overview
- DataRobot Batch Prediction API
