Data Prep platform support 2021.1

All Installations

  • Interactive Pipeline: Apache Spark 2.4.5 Standalone
  • Static Batch Pipeline: Apache Spark 2.4.5 Standalone
  • Mongo: versions 3.6, 4.0, and 4.2
  • OS: RHEL 8.0+ (<9.x)
  • Java: OpenJDK 8 update 181+; Oracle JDK 8 update 162+
  • Browser: latest versions of Firefox ESR & Google Chrome
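The version requirements above can be checked mechanically before an installation. The following is a minimal sketch, not part of Data Prep or any DataRobot tooling: the function names `major_minor` and `check_environment` are hypothetical, and the version strings are assumed to be collected by the operator (for example from `spark-submit --version`, `mongod --version`, and `/etc/redhat-release`). It covers only the Spark, Mongo, and OS entries; Java and browser checks are left out.

```python
from typing import List

# Values copied from the "All Installations" list above.
SUPPORTED_SPARK = "2.4.5"
SUPPORTED_MONGO = ("3.6", "4.0", "4.2")
SUPPORTED_RHEL_MAJOR = 8  # RHEL 8.0+ and below 9.x


def major_minor(version: str) -> str:
    """Return the 'major.minor' prefix of a dotted version string."""
    return ".".join(version.split(".")[:2])


def check_environment(spark: str, mongo: str, rhel: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means no mismatch was found.

    Hypothetical helper for illustration only.
    """
    problems = []
    if spark != SUPPORTED_SPARK:
        problems.append(f"Spark {spark} is not {SUPPORTED_SPARK}")
    if major_minor(mongo) not in SUPPORTED_MONGO:
        problems.append(f"Mongo {mongo} is not one of {', '.join(SUPPORTED_MONGO)}")
    if int(rhel.split(".")[0]) != SUPPORTED_RHEL_MAJOR:
        problems.append(f"RHEL {rhel} is outside 8.0+ (<9.x)")
    return problems


if __name__ == "__main__":
    # Example input values are assumptions, not taken from a real host.
    for line in check_environment(spark="2.4.5", mongo="4.2.8", rhel="8.4"):
        print(line)
```

An empty result only means that none of the three checked versions conflicts with the list above; it does not confirm that the installation as a whole is supported.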

AWS

  • Data Library: S3
  • Dynamic Batch Pipeline: EMR 5.30.0 on YARN

Azure

  • Data Library: Azure Blob
  • Data Library: Azure Data Lake Storage Gen 2

Google Cloud Platform

  • Data Library: Google Cloud Storage

Private Cloud

| Data Library | Spark version and deployment mode (interactive) | Spark version and deployment mode (batch) | Batch type |
|---|---|---|---|
| S3 | Open Source Spark (Standalone), running on EC2 | Open Source Spark (Standalone), running on EC2 | Static |
| S3 | Open Source Spark (Standalone), running on EC2 | EMR Spark 5.30 | Dynamic |
| Azure Blob | Open Source Spark (Standalone), running on Azure VM | Open Source Spark (Standalone) | Static |
| Azure Data Lake Storage Gen 2 | Open Source Spark (Standalone), running on Azure VM | Open Source Spark (Standalone) | Static |
| Google Cloud Storage | Open Source Spark (Standalone), running on GCS VM | Open Source Spark (Standalone) | Static |

Cloudera CDH

| Supported storage for Data Library | Spark version for Pipeline | Standalone | YARN |
|---|---|---|---|
| Cloudera CDH 6.1.0 – CDH 6.3.3 | CDH 6.1.0 Spark – CDH 6.3.3 Spark | - | |
| Cloudera CDH 6.1.0 – CDH 6.3.3 | Apache Spark 2.4.0 | Yes | - |

Updated October 24, 2021