100 GB Ingest
Since version 11.0, DataRobot supports 100 GB ingest into the Data Registry from different sources. This document describes the infrastructure requirements for enabling large data ingest.
Disk size
Ingesting large files through URLs and data stages requires larger disk volumes. For a smooth 100 GB ingest workflow, we recommend increasing the ephemeral node volume size to 600 GB. The minimum required size is 250 GB. However, when other work is running on the cluster, the minimum may not be enough for a robust workflow, because some ingests can fail after running out of disk space.
Note that ingest through JDBC drivers and native database connectors does not use internal storage.
Minimum disk size: 250 GB

Recommended disk size: 600 GB
These values apply to AKS, EKS, GKE, and OpenShift.
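As a rough illustration of the sizing guidance above, the following sketch classifies a volume against the 250 GB minimum and the 600 GB recommendation. It is not a DataRobot tool: the names `classify_volume` and `check_path` are hypothetical, the use of decimal gigabytes is an assumption, and the path to inspect depends on where the ephemeral volume is mounted in your environment.

```python
import shutil

# Thresholds taken from this document.
MIN_DISK_GB = 250
RECOMMENDED_DISK_GB = 600

# Assumption: decimal gigabytes (10**9 bytes); adjust if your
# platform reports sizes in GiB (2**30 bytes).
BYTES_PER_GB = 10**9


def classify_volume(total_gb: float) -> str:
    """Classify a volume size against the documented thresholds."""
    if total_gb < MIN_DISK_GB:
        return "insufficient"
    if total_gb < RECOMMENDED_DISK_GB:
        return "minimum"
    return "recommended"


def check_path(path: str = "/") -> str:
    """Classify the total size of the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return classify_volume(usage.total / BYTES_PER_GB)


if __name__ == "__main__":
    # Hypothetical usage: replace "/" with the ephemeral volume mount point.
    print(check_path("/"))
```

A result of `minimum` means the volume meets the required size but may still run out of space when other workloads share the node.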
Other notes
Local file uploads do not support 100 GB ingest.