GenAI feature considerations¶
When working with generative AI capabilities in DataRobot, consider the following. Note that as the product continues to develop, some considerations may change.
Trial users: See the considerations specific to the DataRobot free trial, including supported LLM base models.
General considerations¶
- If a multilingual dataset exceeds the limit associated with the multilingual model, DataRobot defaults to the jinaai/jina-embedding-t-en-v1 embedding model.
- Deployments created from custom models with training data attached that have extra columns cannot be used unless column filtering is disabled on the custom model.
- When using LLMs that are either BYO or deployed from the playground and require a runtime parameter pointing to the endpoint associated with their credentials, be aware of the vendor's model versioning and end-of-life schedules. As a best practice, deploy to production only with endpoints that are generally available. (Models provided in the playground manage this for you.)
- An API key named [Internal] DR API Access for GenAI Experimentation is created for you when you access the playground or vector database in the UI.
- BYO embeddings functionality is available for self-managed users only. When many users run vector database creation jobs that use BYO embeddings in parallel, LLM playground functionality may be degraded until those jobs complete.
- Only one aggregated metric job can run at a time. If an aggregation job is currently running, the Configure aggregation button is disabled and the "Aggregation job in progress; try again when it completes" tooltip appears.
- You cannot delete insights via the UI. Once configured, insights can be disabled but not removed; to remove them, use the API.
LLM availability¶
The following table describes the availability of LLMs:
| LLM | Max context window | Max completion tokens |
|---|---|---|
| Amazon Titan* | 8,000 | 8,000 |
| Anthropic Claude 2.1 | 200,000 | 4,096 |
| Anthropic Claude 3 Haiku | 200,000 | 4,096 |
| Anthropic Claude 3 Sonnet | 200,000 | 4,096 |
| Azure OpenAI GPT-4 | 8,192 | 8,192 |
| Azure OpenAI GPT-4 32k | 32,768 | 32,768 |
| Azure OpenAI GPT-4 Turbo | 128,000 | 4,096 |
| Azure OpenAI GPT-3.5 Turbo* | 4,096 | 4,096 |
| Azure OpenAI GPT-3.5 Turbo 16k | 16,384 | 16,384 |
| Google Bison* | 4,096 | 2,048 |
| Google Gemini 1.5 Flash | 1,048,576 | 8,192 |
| Google Gemini 1.5 Pro | 2,097,152 | 8,192 |
* Available for trial users.
Sharing and permissions¶
The following table describes GenAI component-related user permissions. All roles (Consumer, Editor, Owner) refer to the user's role in the Use Case; access to the various functions is based on those Use Case roles:
Permissions for GenAI functions
| Function | Use Case Consumer | Use Case Editor | Use Case Owner |
|---|---|---|---|
| **Vector database** | | | |
| *Vector database creators* | | | |
| Create vector database | ✘ | ✔ | ✔ |
| Edit vector database info | ✘ | ✔ | ✔ |
| Delete vector database | ✘ | ✔ | ✔ |
| *Vector database non-creators* | | | |
| Edit vector database info | ✘ | ✘ | ✔ |
| Delete vector database | ✘ | ✘ | ✔ |
| **Playground** | | | |
| *Playground creators* | | | |
| Create playground | ✘ | ✔ | ✔ |
| Rename playground | ✘ | ✔ | ✔ |
| Edit playground description | ✘ | ✔ | ✔ |
| Delete playground | ✘ | ✔ | ✔ |
| *Playground non-creators* | | | |
| Edit playground description | ✘ | ✘ | ✔ |
| Delete playground | ✘ | ✘ | ✔ |
| *Playground → Assessment tab* | | | |
| Configure assessment | ✘ | ✔ | ✔ |
| Enable/disable assessment metrics | ✘ | ✔ | ✔ |
| *Playground → Tracing tab* | | | |
| Download log | ✔ | ✔ | ✔ |
| Upload to AI Catalog | ✔ | ✔ | ✔ |
| **LLM blueprint created by others (shared Use Case)** | | | |
| Configure | ✘ | ✘ | ✘ |
| Send prompts (from Configuration) | ✘ | ✘ | ✘ |
| Generate aggregated metrics | ✘ | ✔ | ✔ |
| Create conversation (from Comparison) | ✘ | ✘ | ✘ |
| Upvote/downvote responses | ✔ | ✔ | ✔ |
| Star/favorite | ✘ | ✘ | ✘ |
| Copy to new LLM blueprint | ✘ | ✔ | ✔ |
| Delete | ✘ | ✘ | ✘ |
| Register | ✘ | ✘ | ✘ |
Playground considerations¶
- Playgrounds can be shared for viewing, and users with editor or owner access can perform additional actions within the shared playground, such as creating blueprints. While non-creators cannot prompt an LLM blueprint in the playground, they can make a copy and submit prompts to that copy.
- You can only prompt LLM blueprints that you created (in both the Configuration and Comparison views). To see the results of prompting another user's LLM blueprint in a shared Use Case, copy the blueprint; you can then chat with the same settings applied.
- Each user can submit 5000 LLM prompts per day across all LLMs. Deleted prompts and responses count toward this limit, but only successful prompt-response pairs are counted, and bring-your-own (BYO) LLM calls are excluded. Limits for trial users are different, as described here.
Vector database considerations¶
- By default, DataRobot uses the Facebook AI Similarity Search (FAISS) vector database.
- When determining the number of contexts to retrieve from the vector database, DataRobot allocates 3/4 of the excess token budget (derived from the LLM's context size, as defined below) to retrieved documents and the rest to chat history (if applicable); see the sketch after this list.
- The token budget comprises the system prompt, the user prompt, and the max completion length. The excess token budget is: context size - (max completion length + system prompt + user prompt).
- If there is no chat history, the whole excess budget is used for document retrieval. Similarly, if there is no vector database, the excess budget is used for history.
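For illustration, here is a minimal sketch of that allocation logic, assuming token counts are already available for each component. The function and parameter names are illustrative only and are not part of DataRobot's implementation or API:

```python
def split_excess_budget(context_size: int, max_completion: int,
                        system_prompt_tokens: int, user_prompt_tokens: int,
                        has_vector_db: bool = True, has_history: bool = True) -> dict:
    """Illustrative split of the excess token budget between retrieved
    documents and chat history, following the rules described above."""
    excess = context_size - (max_completion + system_prompt_tokens + user_prompt_tokens)
    excess = max(excess, 0)

    if has_vector_db and has_history:
        docs_budget = (excess * 3) // 4        # 3/4 to retrieved documents
        history_budget = excess - docs_budget  # remainder to chat history
    elif has_vector_db:
        docs_budget, history_budget = excess, 0    # no history: all to documents
    else:
        docs_budget, history_budget = 0, excess    # no vector database: all to history

    return {"documents": docs_budget, "history": history_budget}

# Example: an 8,192-token context window with a 1,024-token completion limit
print(split_excess_budget(8192, 1024, system_prompt_tokens=200, user_prompt_tokens=100))
# {'documents': 5151, 'history': 1717}
```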
The following sections describe considerations related to vector databases:
Supported dataset types¶
When uploading datasets for creating a vector database, the only supported format is .zip. DataRobot processes the .zip to create a .csv containing text columns with an associated reference ID (file path) column; the reference ID column is created automatically when the .zip is uploaded. All files must be either in the root of the archive or in a single folder inside the archive; a folder tree hierarchy is not supported (see the packaging sketch after the file-type list below).
Regarding file types, DataRobot provides the following support:
- .txt documents.
- PDF documents:
    - Text-based PDFs are supported.
    - Image-based PDFs are not fully supported; images are generally ignored but do not cause errors.
    - Documents with mixed image and text content are supported; only the text is parsed.
    - Documents consisting only of images result in empty documents and are ignored.
    - Datasets consisting of image-only documents (no text) are not processable.
- .docx documents are supported, but the older .doc format is not.
- .md documents, and the .markdown variant, are supported.
- A mix of all supported document types in a single dataset is allowed.
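Putting the archive-layout and file-type rules together, the following is a minimal packaging sketch. It assumes Python's standard zipfile module and a local source folder; the function name and paths are illustrative, and any tool that produces a flat .zip of supported files works equally well:

```python
import zipfile
from pathlib import Path

def build_flat_archive(source_dir: str, archive_path: str = "vdb_dataset.zip") -> None:
    """Collect supported documents and write them to the root of a .zip,
    flattening any directory structure (nested folder trees are not supported)."""
    supported = {".txt", ".pdf", ".docx", ".md", ".markdown"}
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in Path(source_dir).rglob("*"):
            if path.is_file() and path.suffix.lower() in supported:
                # arcname=path.name places every file at the archive root;
                # ensure file names are unique to avoid collisions when flattening
                archive.write(path, arcname=path.name)

build_flat_archive("docs/")  # produces vdb_dataset.zip ready for upload
```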
Dataset limits¶
The global 1GB dataset limit is applied during vector database creation, after the text is extracted from the documents. Additional dynamic limits are listed below:
- jinaai/jina-embedding-t-en-v1: supported to the 1GB global limit
- sentence-transformers/all-MiniLM-L6-v2: supported to the 650MB limit
- cl-nagoya/sup-simcse-ja-base: supported to the 250MB limit
- Multilingual-e5-base: supported to the 250MB limit
- E5-base-v2: supported to the 250MB limit
- E5-large-v2: supported to the 100MB limit
Playground deployment considerations¶
Consider the following when registering and deploying LLMs from the playground:
- Setting API keys through the DataRobot credential management system is supported. Those credentials are accessed as environment variables in a deployment; see the sketch after this list.
- Registration and deployment are supported for:
    - All base LLMs in the playground.
    - LLMs with vector databases.
- The creation of a custom model version from an LLM blueprint associated with a large vector database (500+ MB) can be time consuming. You can leave the model workshop while the model is created without losing your progress.
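As an illustration of the first bullet, deployed custom model code can read such a credential from its environment. This is a minimal sketch; the variable name OPENAI_API_KEY and the helper function are assumptions rather than a fixed DataRobot convention, so use whatever name your runtime parameters or credentials map to:

```python
import os

def get_llm_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API key that credential management exposed to the deployment
    as an environment variable (the variable name is illustrative)."""
    api_key = os.environ.get(var_name)
    if not api_key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "check the runtime parameters/credentials configured for this deployment."
        )
    return api_key

# e.g., inside custom.py, pass the key to your LLM client of choice:
# client = OpenAI(api_key=get_llm_api_key())
```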
Trial user considerations¶
The following considerations apply only to DataRobot free trial users:
- You can create up to 15 vector databases, computed across multiple Use Cases. Deleted vector databases are included in this count.
- You can make 1,000 LLM API calls. Deleted prompts and responses also count toward this limit, but only successful prompt-response pairs are counted.
See also the section on LLM availability.