The AI Catalog is a centralized collaboration hub for working with data and related assets. It lets you find, share, tag, and reuse data, which helps shorten time to production. The catalog provides easy access to the data needed to answer a business problem while ensuring security, compliance, and consistency. With the AI Catalog, you can:
- Perform simple data preparation, using SQL scripts for targeted results.
- Create datasets without the full commitment of creating projects.
- Find, access, delete, and reuse the assets you need.
- Share data without sharing projects, decreasing risks and costs around data duplication.
- Support data security and governance through selective addition to the catalog, role-based sharing, and an audit trail, which reduces friction and speeds up model adoption.
| Topic | Description |
|---|---|
| Import data and create projects | Import data into the AI Catalog and, from there, create a DataRobot project. |
| **Interact with catalog assets** | |
| Work with catalog assets | View and modify asset details, create snapshots, and create projects from a data entry. |
| Manage catalog assets | Share, delete, and download data assets. |
| Schedule data snapshots | Set up schedules for data snapshots in the AI Catalog to keep a dataset in sync with its source data. |
| Prepare data with Spark SQL | Enrich, transform, shape, and blend together datasets using Spark SQL queries within the AI Catalog. |
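
To give a feel for the kind of query the Spark SQL topic refers to, the following is a minimal sketch of blending two datasets with a single SQL statement. It is not AI Catalog code: it uses Python's built-in `sqlite3` module as a stand-in for Spark SQL, whose dialect differs in places, and the table names, column names, sample rows, and the `blend_orders_by_customer` helper are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows standing in for two catalog datasets.
customers = [(1, "Acme", "US"), (2, "Globex", "DE"), (3, "Initech", "US")]
orders = [(101, 1, 250.0), (102, 1, 100.0), (103, 2, 75.5)]


def blend_orders_by_customer(customers, orders):
    """Join two tables and aggregate order totals per customer."""
    # sqlite3 is used here only as a stand-in SQL engine (assumption).
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)"
        )
        conn.execute(
            "CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)"
        )
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

        # LEFT JOIN keeps customers without orders; COALESCE turns their
        # missing total (NULL) into 0.0.
        query = """
            SELECT c.name,
                   c.country,
                   COUNT(o.id) AS order_count,
                   COALESCE(SUM(o.amount), 0.0) AS total_amount
            FROM customers AS c
            LEFT JOIN orders AS o ON o.customer_id = c.id
            GROUP BY c.id, c.name, c.country
            ORDER BY c.id
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = blend_orders_by_customer(customers, orders)
for row in result:
    print(row)
```

The same join-and-aggregate pattern is what "blend" and "enrich" typically mean in SQL-based data preparation; the exact syntax accepted in the AI Catalog should be checked against the Spark SQL documentation.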
Self-Managed AI Platform: Elasticsearch
For Self-Managed AI Platform users, DataRobot recommends enabling Elasticsearch for significantly improved search matches, relevancy, and rankings. Contact your DataRobot representative for help configuring and deploying Elasticsearch.