The GenAI glossary provides brief definitions of terms relevant to GenAI capabilities in DataRobot.
Chatting¶
Sending prompts (and, as a result, LLM payloads) to LLM endpoints based on a single LLM blueprint and receiving responses from the LLM. In this case, context from previous prompts and responses is sent along with the payload.
Citations¶
The chunks of text from the vector database used during the generation of LLM responses.
Deploying (from a playground)¶
LLM blueprints and all their associated settings are registered in the Registry and can be deployed with DataRobot's production suite of products.
Embeddings¶
A numerical (vector) representation of text, or a collection of numerical representations of text. The action of generating embeddings means taking a chunk of unstructured text and using a text embedding model to convert the text to a numerical representation. The chunk is the input to the embedding model and the embedding is the “prediction” or output of the model.
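The text-to-vector idea can be sketched with a toy, deterministic "embedding" (a hashing trick invented purely for illustration; real text embedding models are learned neural networks):

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy embedding sketch: hash character trigrams into a fixed-size
    vector, then L2-normalize. Only illustrates the text -> vector idea;
    it is not a real embedding model."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        trigram = text[i:i + 3].lower()
        h = int(hashlib.md5(trigram.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

chunk = "Embeddings map text to numbers."
vector = embed(chunk)
print(len(vector))  # fixed dimensionality regardless of text length
```

Because every chunk maps to a vector of the same fixed dimension, chunks can later be compared by vector similarity, which is what a vector database exploits.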
Foundation model¶
A type of artificial intelligence model that is pre-trained on a broad range of Internet text. Examples of text foundation models are GPT-4 and PaLM. These models are referred to as “foundation” because they can be fine-tuned for a wide variety of specific tasks, much like how a foundation supports a range of structures above it.
Generative AI¶
A type of artificial intelligence that can create new content. It is designed to learn patterns in data and generate new content that fits the same patterns. The most common use cases of generative AI include creating synthetic images, music, voiceovers, and text. A popular form of generative AI is the generative adversarial network (GAN), which uses two neural networks, a generator and a discriminator, to create and refine its output.
Large language model (LLM)¶
An algorithm that uses deep learning techniques and large datasets to understand, summarize, generate, and predict new content.
LLM blueprint¶
The saved blueprint, available to be used for deployment. LLM blueprints represent the full context for what is needed to generate a response from an LLM; the resulting output can be compared within the playground. This information is captured in the LLM blueprint settings.
LLM blueprint components¶
The entities that make up the LLM blueprint settings: the vector database, the embedding model used to generate the vector database, LLM settings, the system prompt, and so on. These components can either be offered natively within DataRobot or be brought in from external sources.
LLM blueprint draft¶
A draft of the LLM blueprint that can be used for experimentation and evaluation and ultimately saved as a blueprint that can be deployed.
LLM blueprint settings¶
The parameters sent to the LLM to generate a response (in conjunction with the user-entered prompt). They include a single LLM, LLM settings, optionally a system prompt, and optionally a vector database. If no vector database is assigned, then the LLM uses its learnings from training to generate a response. LLM blueprint settings are configurable so that you can experiment with different configurations.
LLM payload¶
The bundle of contents sent to the LLM endpoint to generate a response. This includes the user prompt, LLM settings, system prompt, and information retrieved from the vector database.
LLM response¶
Generated text from the LLM based on the payload sent to an LLM endpoint.
LLM settings¶
Parameters that define how an LLM intakes a user prompt and generates a response. They can be adjusted within the LLM blueprint to alter the response. These parameters are currently represented by the "Temperature", "Token selection probability cutoff (Top P)", and "Max completion tokens" settings.
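The effect of "Temperature" and "Top P" can be sketched with a toy next-token sampler (a pure-Python illustration, not DataRobot's or any LLM's actual implementation; `sample_next_token` and the example logits are invented for this sketch):

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float = 1.0,
                      top_p: float = 1.0, seed: int = 0) -> str:
    """Temperature rescales logits before softmax (lower = more
    deterministic); Top P keeps only the smallest set of tokens whose
    cumulative probability reaches top_p (nucleus sampling)."""
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())
    exps = {t: math.exp(l - m) for t, l in scaled.items()}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}
    # Keep highest-probability tokens until their mass reaches top_p.
    kept, cum = {}, 0.0
    for t, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[t] = p
        cum += p
        if cum >= top_p:
            break
    total = sum(kept.values())
    rng = random.Random(seed)
    return rng.choices(list(kept), [p / total for p in kept.values()])[0]

logits = {"cat": 2.0, "dog": 1.0, "car": -1.0}
print(sample_next_token(logits, temperature=0.5, top_p=0.9))
```

With a very low temperature and a small Top P, sampling collapses to always picking the most likely token; raising either widens the set of plausible outputs.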
Playground¶
The place where you create and interact with LLM blueprints (LLMs and their associated settings), comparing the response of each to help determine which to use in production. Many LLM blueprints can live within a playground. A playground is an asset of a Use Case; multiple playgrounds can exist in a single Use Case.
Comparison¶
The place to add LLM blueprints to the playground for comparison, submit prompts to these LLM blueprints, and evaluate the rendered responses. When comparing, a single prompt is sent to each LLM blueprint to generate a single response, without referencing previous prompts. This allows users to compare responses from multiple LLM blueprints.
Prompt¶
The input entered during chatting that is used to generate the LLM response.
See system prompt.
Retrieval Augmented Generation (RAG)¶
The process of retrieving relevant information from a vector database and sending it, along with the prompt, system prompt, and LLM settings, to the LLM endpoint so that the LLM generates a response grounded in the data in the vector database. This operation may optionally incorporate orchestration to execute a chain of multiple prompts.
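The retrieve-then-assemble step can be sketched as follows (a toy word-overlap retriever stands in for vector-database search, and the payload fields, `retrieve`, and `build_payload` are hypothetical names invented for this sketch):

```python
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Toy retrieval step: score chunks by word overlap with the query.
    A real RAG pipeline searches a vector database by embedding
    similarity instead."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))
    return scored[:k]

def build_payload(prompt: str, system_prompt: str, chunks: list[str]) -> dict:
    """Assemble the bundle sent to the LLM endpoint: system prompt,
    retrieved context, user prompt, and LLM settings."""
    return {
        "system_prompt": system_prompt,
        "context": retrieve(prompt, chunks),
        "prompt": prompt,
        "settings": {"temperature": 0.2, "max_completion_tokens": 256},
    }

docs = ["Invoices are archived monthly.",
        "Refunds take five business days.",
        "Support hours are 9 to 5."]
payload = build_payload("How long do refunds take?",
                        "Answer from context only.", docs)
print(payload["context"][0])
```

Grounding comes from the `context` field: the LLM is asked to answer from the retrieved chunks rather than solely from what it learned during training.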
System prompt¶
The system prompt, an optional field, is a "universal" prompt prepended to all individual prompts. It instructs and formats the LLM response. The system prompt can impact the structure, tone, format, and content that is created during the generation of the response.
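Prepending can be sketched with the role/content message convention common to chat-style LLM APIs (the convention and the `build_messages` helper are illustrative assumptions, not a specific DataRobot API):

```python
def build_messages(system_prompt: str, chat_history: list[dict],
                   user_prompt: str) -> list[dict]:
    """Place the system prompt first so it applies to every exchange,
    then replay any chat history, then append the new user prompt."""
    messages = [{"role": "system", "content": system_prompt}]
    messages += chat_history
    messages.append({"role": "user", "content": user_prompt})
    return messages

msgs = build_messages("Answer in one sentence.", [], "What is RAG?")
print(msgs[0]["role"])  # the system prompt always comes first
```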
Token¶
A token is the smallest unit of text an LLM can work with. When processing a user prompt, the LLM splits the input text into tokens and generates the output token by token. Different LLMs can separate the same text into tokens differently as they learn their vocabulary from the data they were trained on. Depending on the language and the LLM, a token can be a character, a group of characters, a word, or any other unit of text. An approximation of token length for GPT models is 1 token ≈ 4 characters in English.
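The rule of thumb above can be turned into a quick estimate (a hypothetical helper; exact counts require the specific model's own tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token rule of
    thumb for English text with GPT-style models."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Tokenization splits text into small units."))
```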
Unstructured text¶
Text that cannot fit cleanly into a table. The most common example is large blocks of text in a document or form.
Vector database¶
A collection of chunks of unstructured text and corresponding text embeddings for each chunk, indexed for easy retrieval. Vector databases can optionally be used to ground LLM responses in specific information and can be assigned to an LLM blueprint for use during a RAG operation. A vector database is created by breaking a collection of unstructured text into chunks, generating embeddings for each chunk, and storing both the chunks and the embeddings in a database, where they are available for retrieval by some service. Updating a vector database means adding chunks of text and their embeddings to, or removing them from, the originally created database.
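The create-and-retrieve lifecycle described above can be sketched as follows (the `ToyVectorDB` class and its character-bucket "embedding" are invented for illustration and are not DataRobot's implementation):

```python
import math

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for a real text embedding model: bucket character
    # codes into a fixed-size vector and L2-normalize it.
    vec = [0.0] * dim
    for ch in text.lower():
        vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class ToyVectorDB:
    """Minimal vector-database sketch: store (chunk, embedding) pairs
    and retrieve the chunks most similar to a query embedding."""
    def __init__(self):
        self.rows: list[tuple[str, list[float]]] = []

    def add(self, text: str, chunk_size: int = 40) -> None:
        # Creation: split unstructured text into chunks and embed each.
        for i in range(0, len(text), chunk_size):
            chunk = text[i:i + chunk_size]
            self.rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Retrieval: rank stored chunks by dot-product similarity
        # between their embeddings and the query embedding.
        q = embed(query)
        scored = sorted(self.rows,
                        key=lambda row: -sum(a * b for a, b in zip(q, row[1])))
        return [chunk for chunk, _ in scored[:k]]

db = ToyVectorDB()
db.add("Refunds take five business days. Support hours are 9 to 5.")
print(db.search("refund timing")[0])
```

Updating corresponds to calling `add` with new text (or deleting rows); the chunks and their embeddings always move together.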