

Chat

Chatting is the activity of sending prompts to, and receiving responses from, the LLM. A chat is a collection of chat prompts. Once you have set the configuration for your LLM, send it prompts (from the entry box in the lower part of the panel) to determine whether further refinements are needed before considering your LLM blueprint for deployment.

Chatting in the playground is a "conversation" in which subsequent prompts can ask follow-up questions. The following example asks the LLM for Python code that runs DataRobot Autopilot.
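For reference, a response to that example prompt might resemble the minimal sketch below. It assumes the datarobot Python client; the endpoint, token, file name, and target column are placeholders, not values from this documentation.

```python
# Minimal sketch of running DataRobot Autopilot with the Python client.
# The endpoint, token, dataset path, and target column are placeholders.
import datarobot as dr

# Connect to DataRobot (placeholder credentials)
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Create a project from a local training file
project = dr.Project.create(sourcedata="training_data.csv", project_name="Autopilot example")

# Start Autopilot on the chosen target column
project.set_target(target="target_column", mode=dr.AUTOPILOT_MODE.QUICK)

# Block until Autopilot finishes, then list the leaderboard models
project.wait_for_autopilot()
print(project.get_models())
```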

The results of follow-up questions depend on whether context awareness is enabled (see the continuation of the example). Use the playground to test and tune prompts until you are satisfied with the system prompt and configuration. Then, click Save configuration at the bottom of the right-hand panel.

Context-aware chatting

When configuring an LLM blueprint, you set the history awareness in the Prompting tab.

There are two context states, which control whether chat history is sent with the prompt to provide relevant context for responses.

State  Description
Context-aware  When sending input, previous chat history is included with the prompt. This state is the default.
No context  Each prompt is sent as an independent input, with no history from the chat.

You can switch between one-time (no context) and context-aware prompting within a chat. Each mode keeps its own independent set of history context; going from context-aware, to no context, and back to context-aware clears the earlier history from the prompt. (This takes effect only when a new prompt is submitted.)
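Conceptually, the difference between the two states comes down to whether the accumulated chat history is packaged with the new prompt. The following provider-agnostic sketch illustrates the idea; it is not a DataRobot API, and the message format is only an assumption for illustration.

```python
from typing import Dict, List


def build_messages(history: List[Dict[str, str]], prompt: str, context_aware: bool) -> List[Dict[str, str]]:
    """Assemble the message list sent to the LLM for a single prompt."""
    new_message = {"role": "user", "content": prompt}
    if context_aware:
        # Context-aware: prior prompts and responses travel with the new prompt.
        return history + [new_message]
    # No context: the prompt is sent as an independent, stand-alone input.
    return [new_message]


# Example: the same follow-up question with and without context
history = [
    {"role": "user", "content": "Write Python code to run DataRobot Autopilot."},
    {"role": "assistant", "content": "<code returned by the LLM>"},
]
follow_up = "Add a comment to that code explaining the target argument."

print(len(build_messages(history, follow_up, context_aware=True)))   # 3 messages: history + follow-up
print(len(build_messages(history, follow_up, context_aware=False)))  # 1 message: follow-up only
```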

Context state is reported in two ways:

  1. A badge to the right of the LLM blueprint name, in both the configuration and comparison views, reports the current context state.

  2. In the configuration view, dividers show the state of the context setting:

Using the example above, you could then prompt the LLM to make a change to "that code." With context awareness enabled, the LLM knows which code is being referenced because it is "aware" of the previous conversation history:

See the prompting reference for information on crafting optimized prompts (including few-shot prompting).

Single vs comparison chats

Chatting with a single LLM blueprint is a good way to tune before starting prompt comparisons with other LLM blueprints. Comparison lets you compare responses between LLM blueprints to help decide which to move to production.

Note

You can only do comparison prompting with LLM blueprints that you created. To see the results of prompting another user's LLM blueprint in a shared Use Case, copy the blueprint; you can then chat with the same settings applied. This is intentional behavior because prompting an LLM blueprint impacts the chat history, which can impact the responses that are generated. However, you can provide response feedback to assist development.

Single LLM blueprint chat

When you first configure an LLM blueprint, part of the creation process includes chatting. Set the configuration and save it to activate chatting:

After reviewing the chat results, adjust the configuration as needed and send prompts again. Use the additional actions available within each chat result to retrieve more information and act on the prompt:

Option  Description
View configuration  Shows the configuration used by that prompt in the Configuration panel on the right. If you haven't changed configurations while chatting, no change is apparent. Use this tool to recall previous settings and restore the LLM blueprint to those settings.
Open trace  Opens the tracing log, which shows all components and prompting activity used in generating LLM responses.
Delete prompt and response  Removes both the prompt and response from the chat history. Once deleted, they are no longer considered as context for future responses.

When you send prompts to the LLM, DataRobot keeps a record of that chat. You can either add to the context of an existing chat or start a new chat, which does not carry over any of the context from other chats in the history:

Starting a new chat allows you to have multiple independent conversation threads with a single blueprint. In this way, you can evaluate the LLM blueprint based on different types of topics, without bringing in the history of the previous prompt response, which could "pollute" the answers. While you could also do this by switching context off, submitting a prompt, and then switching it back on, starting a new chat is a simpler solution.

Click Start new chat to begin with a clean history; once the prompt is submitted, DataRobot renames the chat from New chat to the words from your prompt.
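If you drive the playground programmatically, the same "independent thread" behavior can be sketched with the GenAI module of the datarobot Python client. The class and parameter names below (LLMBlueprint.get, Chat.create, ChatPrompt.create, wait_for_completion) are assumptions that may vary by client version; treat this as a sketch under those assumptions, not a definitive recipe.

```python
# Sketch only: class/parameter names are assumed and may differ by datarobot client version.
import datarobot as dr
from datarobot.models.genai.chat import Chat
from datarobot.models.genai.chat_prompt import ChatPrompt
from datarobot.models.genai.llm_blueprint import LLMBlueprint

dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# An existing, saved LLM blueprint (placeholder ID)
llm_blueprint = LLMBlueprint.get("LLM_BLUEPRINT_ID")

# Two chats against the same blueprint keep completely separate histories
code_chat = Chat.create(name="Code questions", llm_blueprint=llm_blueprint)
docs_chat = Chat.create(name="Documentation questions", llm_blueprint=llm_blueprint)

# Prompts sent to one chat never become context for the other
ChatPrompt.create(text="Write Python code to run Autopilot.", chat=code_chat, wait_for_completion=True)
ChatPrompt.create(text="Summarize the deployment docs.", chat=docs_chat, wait_for_completion=True)
```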

Comparison LLM blueprint chat

Once you are satisfied with the LLM blueprint, click Comparison in the breadcrumbs to compare responses with other LLM blueprints.

If you determine that further tuning is needed after having started a comparison, you can still modify the configuration of individual LLM blueprints:

To compare LLM blueprint chats side-by-side, see the LLM blueprint comparison documentation.

Response feedback

Use the response feedback "thumbs" to rate the prompt answer. Responses are recorded in the Tracing tab, in the User feedback column. The response, as part of the exported feedback sent to the AI Catalog, can be used, for example, to train a predictive model.

Citations

Citations are a metric that is on by default (as are Latency, Prompt Tokens, and Response Tokens). Citations provide a list of the top reference document chunks, based on relevance to the prompt, retrieved from the vector database (VDB). Be aware that the embedding model used to create the VDB in the first place can affect the quality of the citations retrieved.

Note

Citations only appear when the LLM blueprint being queried has an associated VDB. While citations are one of the available metrics, you do not need the assessment functionality enabled to have citations returned.

Use citations as a safety check to validate LLM responses. Beyond validating responses, citations also let you verify proper and appropriate retrieval from the VDB: are you retrieving the chunks from your docs that you want to provide as context to the LLM? Additionally, if you enable the Faithfulness metric, which measures whether the LLM response matches the source, it relies on the citation output to determine relevance.
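To make the "top chunks by relevance" idea concrete, here is a generic sketch of the kind of retrieval a vector database performs. It is not DataRobot's internal implementation, and the embed function is a stand-in for a real embedding model.

```python
import numpy as np


def embed(text: str) -> np.ndarray:
    """Stand-in for the embedding model used to build the VDB (not a real model)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vector = rng.normal(size=384)
    return vector / np.linalg.norm(vector)


def top_citations(prompt: str, chunks: list, k: int = 3) -> list:
    """Return the k document chunks most relevant to the prompt (cosine similarity)."""
    prompt_vec = embed(prompt)
    scored = [(chunk, float(embed(chunk) @ prompt_vec)) for chunk in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


chunks = [
    "Autopilot trains many models and ranks them on the leaderboard.",
    "Deployments serve predictions from a selected model.",
    "Vector databases store chunk embeddings for retrieval.",
]
for chunk, score in top_citations("How does Autopilot work?", chunks):
    print(f"{score:.3f}  {chunk}")
```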

Faithfulness score

The faithfulness score is calculated using a factual consistency metric approach, in which a similarity score is computed between the facts retrieved from the vector database and the text generated by the LLM blueprint. The similarity metric used is ROUGE-1. DataRobot GenAI uses an improved version of ROUGE-1 based on insights from "The limits of automatic summarization according to ROUGE".
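As a rough illustration of the similarity calculation, standard ROUGE-1 measures unigram overlap between the retrieved facts (reference) and the generated text (candidate). The sketch below shows the standard formula only, not DataRobot's improved variant, and uses simple whitespace tokenization as an assumption.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """Standard ROUGE-1: unigram overlap between reference and candidate text."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Count overlapping unigrams, clipped to the reference counts
    overlap = sum(min(ref_counts[token], cand_counts[token]) for token in cand_counts)
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


facts = "Autopilot trains and ranks multiple models on the leaderboard"
generated = "Autopilot automatically trains multiple models and ranks them"
print(rouge_1(facts, generated))
```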


Updated June 19, 2024