Import data to DataRobot¶
The DataRobot platform provides many methods of ingesting data for machine learning—uploading local files, entering a URL, and connecting to external databases, among others.
This tutorial focuses on importing data using the DataRobot interface. You can also import to the AI Catalog or import using the API (see the API Quickstart to get started with the DataRobot API).
Takeaways¶
This tutorial:
- Provides guidelines for importing to DataRobot.
- Walks through the steps of importing to DataRobot directly.
Guidelines for imports¶
Review the following data guidelines for AutoML, Time Series, and Visual AI projects prior to importing.
For AutoML projects¶
- The data must be in a flat-file, tabular format.
- You must have a column that includes the target you are trying to predict.
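
As a rough pre-upload check, the two AutoML guidelines above can be sketched with Python's standard `csv` module. The function name `check_target_column` and the sample columns are hypothetical, and this is only a local sanity check under those assumptions, not part of DataRobot's own validation.

```python
import csv
import io

def check_target_column(csv_text: str, target: str) -> bool:
    """Return True if the header contains `target` and every row matches the header width."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if target not in header:
        return False
    # A flat, tabular file has the same number of fields in every row.
    return all(len(row) == len(header) for row in reader)

# Hypothetical sample data with "churned" as the target column.
sample = "age,income,churned\n34,52000,0\n51,61000,1\n"
print(check_target_column(sample, "churned"))    # True
print(check_target_column(sample, "defaulted"))  # False
```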
For Time Series projects¶
- The data must be in a flat-file, tabular format.
- You must include a date/time feature for each row.
- When using time series modeling, DataRobot detects the time step (the delta between rows, measured as a number and a time-delta unit), for example (15, “minutes”). Your dataset must have a row for each time-delta unit. For example, if you are predicting seven days in the future (time step equals 7, days), then your dataset must have a row for each day for the entire date range; similarly, if you are forecasting out seven years, then your data must have one row for each year for the entire date range.
- You must have a column that includes the target that you are trying to predict.
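
The requirement of one row per time-delta unit can be checked before uploading. The following is a minimal sketch, assuming the timestamps have already been parsed into `datetime.date` values; the helper `missing_steps` and the sample dates are hypothetical and not part of DataRobot.

```python
from datetime import date, timedelta

def missing_steps(dates, step):
    """Return the expected timestamps between the first and last date that are absent."""
    present = set(dates)
    current, end = min(dates), max(dates)
    missing = []
    while current <= end:
        if current not in present:
            missing.append(current)
        current += step
    return missing

# Hypothetical daily series with one day missing.
daily = [date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 4), date(2023, 1, 5)]
gaps = missing_steps(daily, timedelta(days=1))
print(gaps)  # [datetime.date(2023, 1, 3)]
```

An empty result suggests the series has a row for every step. Note that `timedelta` only covers fixed-length units such as minutes or days; monthly or yearly data would need a calendar-aware check instead.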
For Visual AI projects¶
- Set up folders that contain images for each class and name the folder for that class. Create a ZIP archive of that folder of folders and upload it to DataRobot.
- You can also add tabular data if you include the links to the images within the top folder. You can find more information on that here.
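
The folder-of-folders layout described above can be packaged with Python's standard library. This is a sketch under stated assumptions: the class names `cat` and `dog`, the placeholder files, and the helper `zip_image_folders` are hypothetical, and placing the class folders at the root of the archive is one reasonable reading of the guideline rather than a confirmed requirement.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

def zip_image_folders(top_folder: Path, archive_stem: Path) -> Path:
    """Create <archive_stem>.zip containing every class folder under top_folder."""
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=str(top_folder)))

# Build a tiny hypothetical dataset: one subfolder per class.
work = Path(tempfile.mkdtemp())
top = work / "images"
for class_name in ("cat", "dog"):
    (top / class_name).mkdir(parents=True)
    (top / class_name / "example.jpg").write_bytes(b"placeholder bytes, not a real image")

archive = zip_image_folders(top, work / "images")
with zipfile.ZipFile(archive) as zf:
    # Skip directory entries so only files are listed.
    names = sorted(n for n in zf.namelist() if not n.endswith("/"))
print(names)  # ['cat/example.jpg', 'dog/example.jpg']
```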
Import to DataRobot¶
To import to DataRobot, sign in and navigate to the Begin a project page by clicking the DataRobot logo in the top left. There are other methods of accessing this page depending on your account type.
The following table describes the methods you can use to import to DataRobot:
Import method | Description |
---|---|
Drag and drop | Drag and drop a file from your computer onto the Begin a project page. |
Import from | Choose an import source, such as Local file, URL, or Data Source (described in the sections below). |
Browse | Browse the AI Catalog. You can import, store, blend, and share your data through the AI Catalog. |
File types | View the accepted formats for imports. See Dataset requirements for more details. |
Upload a local file¶
Click Local file and browse for a file or drag a file directly onto the Begin a project page.
DataRobot uploads the data and creates a project.
Import from a URL¶
Use a URL to import your data. The URL can be local, HTTP, HTTPS, Google Cloud Storage, Azure Blob Storage, or S3 (S3 URLs must use HTTP).
- Click URL.
- Enter the URL to your data and click Create New Project.
DataRobot imports the data and creates a project.
Note
The ability to import from Google Cloud, Azure Blob Storage, or S3 using a URL needs to be configured for your organization's installation. Contact your system administrator for information about configured import methods.
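
Before pasting a URL, it can help to check its scheme locally. The sketch below uses only `urllib.parse`; the function `describe_import_url`, its messages, and the example URLs are hypothetical, and the set of schemes your installation accepts is an assumption to confirm with your administrator.

```python
from urllib.parse import urlparse

def describe_import_url(url: str) -> str:
    """Classify a URL by scheme as a rough pre-check before importing."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "ok: HTTP(S) URL"
    if scheme == "s3":
        # The guidance above says S3 locations must be given as an HTTP URL.
        return "rewrite: use an HTTP URL for S3 data"
    if scheme == "":
        return "invalid: no scheme found"
    return "check: confirm with your administrator that this scheme is configured"

print(describe_import_url("https://example.com/data.csv"))
print(describe_import_url("s3://example-bucket/data.csv"))
```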
Import from a data source¶
Before importing from a data source, configure a JDBC connection to the external database.
- Click Data Source.
- Search and select a data source. You can also choose to add a new data connection.
- Choose an account.
- Select the data you want to connect to.
- Click to create a project.
DataRobot connects to the data and creates a project.
What's next?¶
After you import your data, DataRobot creates a project and performs Exploratory Data Analysis.
Learn more¶
Documentation:
- Dataset requirements
- Import to DataRobot directly
- Import and create projects in the AI Catalog
- Connect to data sources