As your source datasets grow larger, they become slower to import and harder to work with in data preparation tools. To address this, Data Prep offers an Interactive Mode feature, which lets you work faster on a portion of your data, at a portion size that you decide is right for your project's needs. You can then prep that portion interactively in a Data Prep project without ever bringing the full dataset into the project.
Your Data Prep Administrator must enable this feature in your application.
The major advantages of the Interactive Mode feature include the following:
- You don't need to wait for the entire dataset to load into your library before you begin working with it in a Data Prep project. Instead, you define a portion size for datasets. When that portion size is reached, the data is available for prep in a project while the remainder of the dataset continues loading in the library.
- When you've finished prepping your data in the project, you can apply the transformations to all of the data in the native datasets through the Automatic Project Flows feature.
- You can always reset the dataset portion that you want to work with in Interactive Mode. For example, after working in a project with a portion limit of 50,000 rows per dataset, you may realize you need larger portions from each dataset. Changing the portion size is a one-step operation for your Data Prep Administrator. Your project then recognizes that the portion limits have changed and gives you the option to refresh your datasets to pick up the new data.
- Your interactive experience in Data Prep projects is optimized because you work only with the defined portions of your datasets in your project.
- You have more flexibility in how you work with large projects. Data Prep projects in Interactive Mode have a row limit that defines the maximum number of rows that can be prepared within a project. Your Data Prep Administrator sets this limit, which lets the Administrator tune the interactive experience to the available system resources.
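The general pattern behind the first two advantages (build the prep steps on a bounded portion, then run the same steps over all of the data) can be sketched outside the product. The following is a conceptual illustration in plain Python only; it is not Data Prep's API, and the function names, the row structure, and the portion size of 50 are hypothetical.

```python
from itertools import islice


def take_portion(rows, portion_size):
    """Return at most portion_size rows from an iterable of rows."""
    return list(islice(rows, portion_size))


def clean(row):
    """Hypothetical prep step: trim whitespace and upper-case the country code."""
    return {"id": row["id"], "country": row["country"].strip().upper()}


def full_dataset():
    """Stand-in for a large source dataset, yielded one row at a time."""
    for i in range(1_000):
        yield {"id": i, "country": " us " if i % 2 else "ca"}


# Develop and check the prep step against a small portion only.
portion = take_portion(full_dataset(), portion_size=50)
preview = [clean(row) for row in portion]

# Once the step looks right, run the same step over every row.
result = [clean(row) for row in full_dataset()]
```

Because the same `clean` step runs in both passes, the preview built on the portion matches the first rows of the full result, which is what makes working on a portion a reliable way to design the steps.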
By default, Interactive Mode is not enabled for your Data Prep projects and you will need to contact your Data Prep Administrator to enable it. Before enabling Interactive Mode, you should consider the following points:
- For existing projects, use the Profiling feature for the datasets in those projects. Profiling the datasets gives you fuller insight into the data and informs your choice of the optimal portion size for your datasets.
- Existing projects with datasets whose row counts now exceed the defined portion size are not dynamically updated to remove any rows. Instead, when you open those projects, you have the option to use the Refresh Datasets feature to enforce the row portion for each dataset. The row portions are applied only if you elect to refresh those datasets.
After Interactive Mode is enabled, an icon displays to indicate that you're operating on a portion of the dataset.
When you hover over the icon, the row portion value displays so that you can quickly see the value enforced for all datasets.
To determine the total number of rows in a dataset, go to the library page where that total is displayed for each dataset. Additionally, the library page provides information specific to the Interactive Mode feature so that you can determine:
- The loading status of a dataset and when its interactive portion is available for use in a project.
- The AnswerSets that have been published from projects in Interactive Mode.
See Data Prep library for details.