AI Catalog fast registration
AI Catalog fast registration is off by default. Contact your DataRobot representative or administrator for information on enabling the feature.
Feature flag: Enable Fast Registration Workflow in AI Catalog
Fast registration, now available for public preview, allows you to quickly register large datasets in the AI Catalog by specifying the first N rows to be used for registration instead of the full dataset. This gives you faster access to data to use for testing and Feature Discovery.
To use fast registration in the AI Catalog:
Navigate to the AI Catalog.
Click Add to catalog and select your data source. Fast registration is only available when adding a dataset from a new data connection, an existing data connection, or a URL.
In the resulting window, enter the data source information (in this example, a URL).
Select the appropriate policy for your use case—either Create snapshot or Create dynamic.
For both snapshot and dynamic policies, the AI Catalog dataset calculates EDA1 using only the specified number of rows, taken from the start of the dataset. For example, if you specify 1,000 rows, EDA1 is calculated using the first 1,000 rows of the dataset.
The two policies differ in what a consumer of the dataset sees. When a snapshot dataset is consumed (for example, to create a project), the consumer sees only the specified number of rows. When a dynamic dataset is consumed, the consumer sees the full set of rows.
Select the fast registration data upload option. For snapshot, select Upload data partially, and for dynamic, select Use partial data for EDA.
Specify the number of rows to use for data ingest during registration and click Save.
- If you select Create dynamic as the policy, the partial set of rows is only used to calculate EDA1. All subsequent consumption will use the full dataset.
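The difference between the two policies can be sketched as a toy model in plain Python. This is only an illustration of the behavior described above, not the DataRobot API: the function `fast_register`, its parameters, and the sample data are all hypothetical names introduced for this example.

```python
from itertools import islice


def fast_register(rows, n_rows, policy):
    """Toy model of fast registration (hypothetical, not the DataRobot API).

    Returns (eda_rows, consumer_rows):
    - eda_rows: rows used to calculate EDA1 at registration time
    - consumer_rows: rows a consumer (for example, a new project) would see
    """
    if policy not in ("snapshot", "dynamic"):
        raise ValueError(f"unknown policy: {policy!r}")
    rows = list(rows)
    # Both policies calculate EDA1 on the first n_rows rows only.
    eda_rows = list(islice(rows, n_rows))
    # Snapshot: consumers see only the partial rows.
    # Dynamic: consumers see the full dataset.
    consumer_rows = eda_rows if policy == "snapshot" else rows
    return eda_rows, consumer_rows


# Assumed sample data: a 5,000-row dataset registered with the first 1,000 rows.
full_dataset = [{"id": i} for i in range(5000)]
snap_eda, snap_seen = fast_register(full_dataset, 1000, "snapshot")
dyn_eda, dyn_seen = fast_register(full_dataset, 1000, "dynamic")

print(len(snap_eda), len(snap_seen))  # 1000 1000
print(len(dyn_eda), len(dyn_seen))    # 1000 5000
```

In this model, both policies use the same 1,000 rows for EDA1, and only the rows returned to the consumer differ.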