Use an embedding NVIDIA NIM to create a vector database
Premium
The use of NVIDIA Inference Microservices (NIM) in DataRobot requires access to premium features for GenAI experimentation and GPU inference. Contact your DataRobot representative or administrator for information on enabling the required features.
The NVIDIA Inference Microservices (NIM) available through the Registry include embedding models. You can add a deployed embedding model to a Use Case to create a vector database: a collection of unstructured text that is broken into chunks, with embeddings generated for each chunk. Both the chunks and the embeddings are stored in the vector database and are available for retrieval.
Vector databases can optionally be used to ground LLM responses in specific information, and can be assigned to an LLM blueprint for use during a RAG operation. The role of the vector database is to enrich the prompt with relevant context before it is sent to the LLM.
The following embedding NIM are available:
arctic-embed-l
llama-3.2-nv-embedqa-1b-v2
nv-embedqa-e5-v5
nv-embedqa-e5-v5-pb24h2
nv-embedqa-mistral-7b-v2
nvclip
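The flow described above (chunks stored alongside their embeddings, then retrieved to enrich the prompt) can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `toy_embed` is a hypothetical word-count stand-in for a deployed embedding NIM, and `VectorStore` and `build_prompt` are hypothetical names, not DataRobot APIs.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory store that keeps each chunk next to its embedding."""

    def __init__(self):
        self.entries = []  # list of (chunk, embedding) pairs

    def add(self, chunk):
        self.entries.append((chunk, toy_embed(chunk)))

    def retrieve(self, query, top_k=1):
        """Return the top_k chunks most similar to the query."""
        query_vec = toy_embed(query)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Enrich the prompt with retrieved context before it is sent to an LLM."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
for chunk in [
    "Refunds are issued within 14 days of purchase.",
    "Support is available on weekdays from 9 to 5.",
]:
    store.add(chunk)

question = "When are refunds issued?"
prompt = build_prompt(question, store.retrieve(question, top_k=1))
```

In this sketch the question shares words only with the refund chunk, so that chunk is the one placed in the prompt. A real embedding model compares meaning rather than exact word overlap.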
Create a vector database with a registered embedding NIM
After you register an embedding NIM, you can add it to a vector database. DataRobot handles the deployment process automatically.
To create a vector database with a registered embedding NVIDIA NIM:
-
On the Registry > Models tab, click the icon next to + Register model, then click Import from NVIDIA NGC.
-
In the Import from NVIDIA NGC panel, on the Select NIM tab, click an embedding NIM in the gallery.
Search the gallery
To narrow your search for an embedding model, use Search, filter by Publisher, or click Sort by to order the gallery by date added or alphabetically (ascending or descending).
-
Review the model information from the NVIDIA NGC source, then click Next.
-
On the Register model tab, configure the following fields and click Register:
| Field | Description |
|---|---|
| Registered model name / Registered model | Configure one of the following. Registered model name: When registering a new model, enter a unique and descriptive name for the new registered model. If you choose a name that exists anywhere within your organization, a warning appears. Registered model: When saving as a version of an existing model, select the existing registered model you want to add a new version to. |
| Registered version name | Automatically populated with the model name and the word `version`. Change the version name or modify the default version name as necessary. |
| Registered model version | Assigned automatically. This displays the expected version number of the version (e.g., V1, V2, V3) you create. This is always V1 when you select Register as a new model. |
| Resource bundle | Recommended automatically. If possible, DataRobot translates the GPU requirements for the selected model into a resource bundle. In some cases, DataRobot can't detect a compatible resource bundle. To identify a resource bundle with sufficient VRAM, review the documentation for that NIM. |
| NVIDIA NGC API key | Select the credential associated with your NVIDIA NGC API key. |
| Optional settings | |
| Registered version description | Enter a description of the business problem this model package solves, or, more generally, describe the model represented by this version. |
| Tags | Click + Add tag and enter a Key and a Value for each key-value pair you want to tag the model version with. Tags added when registering a new model are applied to V1. |
-
After the registered model builds, navigate to Workbench and open a Use Case.
-
In a Use Case, on the Vector databases tab, either:
If you have already added one or more vector databases to the Use Case, click the + Add vector database button in the upper right.
If you haven't added a vector database to the Use Case before, click Create vector database in the center of the page.
-
On the Create vector database panel, enter a descriptive Name. Then, in the Data source dropdown, select from the data sources associated with the Use Case or click Add data to add new data from the Data Registry.
-
In the Embedding model dropdown, click the embedding NIM you registered. Then, configure the vector database Text chunking settings and click Create vector database.
The selected embedding model is deployed to Console when you create the vector database. If necessary, this process creates a new prediction environment for NIM embeddings.
After creating a vector database, you can manage and version it, or add it to an LLM in the playground to inform responses.
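The Text chunking settings govern how the source text is broken into chunks before an embedding is generated for each one. As a rough illustration only, the sketch below splits text into fixed-size character windows that overlap. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical and are not meant to match specific DataRobot setting names.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into character windows of chunk_size sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=2)
# chunks == ["abcdefghij", "ijklmnopqr", "qrstuvwxyz"]
```

Larger chunks carry more context per embedding, while smaller chunks make retrieval more targeted. Overlap reduces the chance that a sentence is cut in half at a chunk boundary.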
Create a vector database with a deployed embedding NIM
If you've already registered and deployed an embedding NIM, you can add it to a vector database as a deployed embedding model.
To create a vector database with a registered and deployed embedding NVIDIA NIM:
-
In a Use Case, on the Vector databases tile, either:
If you have already added one or more vector databases to the Use Case, click the + Add vector database button in the upper right.
If you haven't added a vector database to the Use Case before, click Create vector database in the center of the page.
-
On the Create vector database panel, enter a descriptive Name. Then, in the Data source dropdown, select from the data sources associated with the Use Case or click Add data to add new data from the Data Registry.
-
In the Embedding model dropdown, click Add deployed embedding model.
-
On the next page, configure the following settings to add the NVIDIA NIM embedding model, then click Validate and add:
| Field | Description |
|---|---|
| Name | Enter a descriptive name for the embedding model you're creating. |
| Deployment name | In the list, locate the name of the NVIDIA NIM embedding model registered and deployed in DataRobot and click the deployment name. |
| Prompt column name | Enter `input` as the prompt column name. |
| Response column name | Enter `result` as the response column name. |
Validation process
The validation process can take a few minutes. A notification appears when the process starts and again when it succeeds or fails.
-
After the validation of the deployed embedding model succeeds, open the Embedding model menu, then, under Deployed embedding models, select the NVIDIA NIM embedding model.
-
Configure the vector database Text chunking settings, then click Create vector database.
After creating a vector database, you can manage and version it, or add it to an LLM in the playground to inform responses.
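The Prompt column name (`input`) and Response column name (`result`) entered when adding the deployed embedding model identify the column that carries the text sent to the deployment and the column that carries what it returns. As a hedged illustration of that column contract, the sketch below writes text chunks to CSV under an `input` column and reads a `result` column back. The helper names and the sample response rows are hypothetical. The actual request and response formats of a deployment may differ.

```python
import csv
import io

PROMPT_COLUMN = "input"     # value entered for Prompt column name
RESPONSE_COLUMN = "result"  # value entered for Response column name


def build_csv_payload(texts):
    """Serialize text chunks as CSV with a single prompt column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[PROMPT_COLUMN])
    writer.writeheader()
    for text in texts:
        writer.writerow({PROMPT_COLUMN: text})
    return buffer.getvalue()


def read_results(rows):
    """Pull the response column out of each returned row (assumed row shape)."""
    return [row[RESPONSE_COLUMN] for row in rows]


payload = build_csv_payload(["first chunk", "second chunk"])

# Hypothetical response rows, invented only to exercise read_results.
sample_rows = [{"result": [0.1, 0.2]}, {"result": [0.3, 0.4]}]
embeddings = read_results(sample_rows)
```

If the column names configured in DataRobot do not match the columns the deployment actually uses, a lookup like the one in `read_results` would fail, which is the kind of mismatch the validation step is there to surface.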