Vector databases
Premium
DataRobot's Generative AI capabilities are a premium feature; contact your DataRobot representative for enablement information. Try this functionality for yourself in a limited capacity in the DataRobot trial experience.
A vector database is a collection of unstructured text that is broken into chunks, with an embedding generated for each chunk. Both the chunks and their embeddings are stored in the database and are available for retrieval. Using a vector database is optional: you can assign one to an LLM blueprint to ground the LLM's responses in specific information during retrieval-augmented generation (RAG). The role of the vector database is to enrich the prompt with relevant context before the prompt is sent to the LLM.
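The chunk, embed, retrieve, and enrich sequence described above can be illustrated with a minimal, self-contained Python sketch. It is a conceptual illustration only and does not use or represent DataRobot's implementation or API. The function names (`chunk_text`, `embed`, `retrieve`, `enrich_prompt`), the sentence-based chunking, and the word-count "embedding" are all assumptions made for the example; a real vector database uses a learned embedding model and configurable chunking.

```python
import math
import re
from collections import Counter


def chunk_text(text):
    """Split text into chunks. Assumption: one sentence per chunk."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_vector_db(text):
    """Store each chunk alongside its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text)]


def retrieve(db, query, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(db, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def enrich_prompt(db, question, top_k=1):
    """Prepend retrieved context to the question before it goes to an LLM."""
    context = "\n".join(retrieve(db, question, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical sample text standing in for an imported data source.
document = (
    "Refunds are issued within 14 days of purchase. "
    "Support is available by email on weekdays. "
    "Shipping takes three to five business days."
)
db = build_vector_db(document)
question = "How many days does shipping take?"
prompt = enrich_prompt(db, question)
print(prompt)
```

In this sketch the LLM never sees the full document; it sees only the chunk judged most similar to the question, which is the sense in which the vector database grounds the response.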
The simplified workflow for working with vector databases is as follows:
- Import a data source, from which the vector database will be created, to the Data registry.
- Add the data source as a vector database to a Use Case.
- Set the configuration, embeddings, and chunking.
- Create the vector database and add it to an LLM blueprint in the playground.
See the considerations related to vector databases for guidance when working with DataRobot GenAI capabilities.
Working with vector databases includes the following:
Topic | Description |
---|---|
Add data sources | Add internal and external data sources, and use the actions available from the Vector databases tab in the Use Case directory. |
Create a vector database | Create and configure a vector database. |
Versioning vector databases | Create versions of a vector database to track changes and fine-tune it. |