
Talk to my Data Agent quickstart

Talk to my Data Agent allows you to upload raw data from your preferred data source and then ask questions; the agent recommends business analyses, generating charts, tables, and code to help you interpret the results. You can view a video walkthrough on the DataRobot YouTube channel.

In this quickstart, you will:

  • Select the Talk to my Data Agent template from the Application Template Gallery.
  • Configure the application template in a codespace.
  • Build the Pulumi stack and open the application.
  • Load data, automatically generating a data dictionary.
  • Interact with the agent, which converts natural language queries into SQL/Python code to explain what the data shows, why patterns exist, and what next steps to take.

Scalability

When dealing with large datasets, connecting to Snowflake or BigQuery allows the analysis to run directly in the warehouse using SQL. This is ideal because the data never leaves the cloud and the heavy computation is pushed down to it.
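As a rough illustration of this push-down approach, a plain-language question such as "average price by city" translates into a single aggregation query executed in the warehouse. The table and column names below are hypothetical; the actual SQL is generated by the agent:

```python
def average_price_by_city_sql(table: str) -> str:
    """Illustrative SQL an agent might generate so that the aggregation
    runs entirely inside Snowflake or BigQuery (hypothetical schema)."""
    return (
        f"SELECT city, AVG(price) AS avg_price "
        f"FROM {table} "
        f"GROUP BY city "
        f"ORDER BY avg_price DESC"
    )

print(average_price_by_city_sql("REAL_ESTATE.SALES_2021"))
```

Only the small aggregated result crosses the network, which is why this scales to datasets far larger than local memory.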

Data quality

The AI agent can talk to disparate datasets as if they were unified, performing joins or merges as needed without explicit guidance. On ingestion, the agent also proactively cleans common data quality issues, including data inconsistencies, special characters, and formatting problems.

Prerequisites

To build a Talk to my Data Agent application from the Application Template Gallery, you need:

  • Access to GenAI and MLOps functionality in DataRobot
  • A DataRobot API token
  • A DataRobot endpoint
  • Large language model (LLM) credentials for one of the following:

    • Azure OpenAI
    • VertexAI
    • Anthropic on Amazon Web Services (AWS)

1. Open the Talk to my Data Agent template

From Workbench, in the Use Case directory, click Browse application templates.

Select Talk to my Data Agent and click Open in a codespace in the upper-right corner.

This walkthrough focuses on working with application templates in a codespace; however, you can instead click Copy repository URL and paste the URL in your browser to open the template in GitHub.

DataRobot opens and begins initializing a codespace. Once the session starts, the template files appear on the left and the README opens in the center. To learn more about the codespace interface, see Codespace sessions.

Tip

DataRobot automatically creates a Use Case, so you can access this codespace (and any resulting assets) from the Use Case directory in the future.

2. Configure the codespace

Follow the instructions included in the README file.

In the .env file, accessed from the file browser on the left, the following fields are required:

  • DATAROBOT_API_TOKEN: Retrieved from User settings > API keys and tools in DataRobot.
  • DATAROBOT_ENDPOINT: Retrieved from User settings > API keys and tools in DataRobot.
  • PULUMI_CONFIG_PASSPHRASE: A self-selected alphanumeric passphrase.
  • LLM credentials: All application templates use generative AI, so DataRobot provides out-of-the-box support for Azure OpenAI, VertexAI (Google Cloud), and Anthropic on AWS.

Note

Make sure to remove the # to the left of the populated LLM credentials.

In the example above, the # was manually removed from lines 23 and 24.
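As a sketch, a filled-in .env might look like the following. All values are placeholders, and the credential variable name shown for the LLM provider is hypothetical; use the exact variable names given in the template's own .env comments:

```
DATAROBOT_API_TOKEN=<your-api-token>
DATAROBOT_ENDPOINT=<your-datarobot-endpoint>
PULUMI_CONFIG_PASSPHRASE=<any-alphanumeric-passphrase>

# Credentials for your chosen LLM provider, with the # removed:
AZURE_OPENAI_API_KEY=<your-key>   # hypothetical variable name
```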

3. Execute the Pulumi stack and open the application

Click the Terminal tile in the left panel.

In the resulting terminal pane, run python quickstart.py YOUR_PROJECT_NAME, replacing YOUR_PROJECT_NAME with a unique name, and then press Enter.

Executing the Pulumi stack can take several minutes. Once complete, DataRobot prints a URL at the bottom of the terminal output. To view the deployed application, copy and paste the URL into your browser.

Tip

DataRobot also creates an application in Registry. To access this application again, you can navigate to Registry > Applications.

4. Load and explore data

Upload one or more datasets from a .csv or multi-tabbed Excel file, or connect directly to a data source, including Snowflake, BigQuery, or the AI Catalog in DataRobot. Once ingested, the AI agent combines and cleans the data, automatically generating and opening a data dictionary for immediate analysis.

The Dictionary is AI-generated, providing clear column definitions and metadata. You can edit columns to incorporate your own business perspective and then click Download Data Dictionary to export the updated definitions.
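To show the shape of such a dictionary, here is a minimal pandas stand-in: one row per column with its type and an example value. The real agent also writes plain-language definitions you can edit before downloading; the dataset here is a made-up slice:

```python
import pandas as pd

# Hypothetical slice of the real-estate dataset.
df = pd.DataFrame({"city": ["Toronto", "Ottawa"],
                   "price": [900_000, 650_000],
                   "property_type": ["Condo", "Detached"]})

# Minimal data dictionary: column name, dtype, and an example value.
dictionary = pd.DataFrame({
    "column": df.columns,
    "dtype": [str(t) for t in df.dtypes],
    "example": [df[c].iloc[0] for c in df.columns],
})

# Export the definitions, analogous to Download Data Dictionary.
dictionary.to_csv("data_dictionary.csv", index=False)
print(dictionary)
```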

The dataset in this example contains 2021 real estate information for Ontario, Canada, including information about real estate transactions, details about the properties sold (e.g., location, type, size, and features), as well as demographic and economic data for the areas where the properties are located.

5. Talk to the agent

To begin talking to the AI agent, click AI Data Analyst. In the search bar at the bottom, enter a request in plain language. In this example, the request is: Show me the average price of properties on a map, by city. Let's use one of those open street maps.

Immediately, the agent communicates that it understands that you want to see the average price of properties on a map, by city, using an open street map. After processing the request, the agent provides actionable insights and visualizations in the form of a table with the requested information, an interactive open street map, and a bar chart highlighting the most expensive regions at a glance.
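Behind the scenes, the agent turns the request into code. A hedged sketch of the aggregation step it might generate (the map rendering is omitted, and the data is a toy stand-in for the Ontario dataset):

```python
import pandas as pd

# Toy stand-in for the Ontario real-estate data.
df = pd.DataFrame({
    "city": ["Toronto", "Toronto", "Ottawa"],
    "price": [1_000_000, 800_000, 600_000],
})

# "Show me the average price of properties, by city"
avg_price = df.groupby("city", as_index=False)["price"].mean()
print(avg_price)
```

The resulting table is what feeds the map and the bar chart of the most expensive regions.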

If you want more detail after interacting with the results, you can ask a follow-up question. In this example, the follow-up request is: Break that down further by property type. There's no need to start over; the agent uses the initial request as the starting point for the follow-up.

You can save valuable chats by clicking Save Chat on the left, or click New Chat to start over.