{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Price elasticity of demand notebook\n", "\n", "This notebook helps you understand the impact that changes in price will have on consumer demand for a given product. Business analysts that measure price elasticity and business users that require elasticity as an input to make pricing decisions will benefit from this notebook.\n", "\n", "Following this workflow will allow you to identify relationships between price and demand, maximize revenue by properly pricing products, monitor price elasticities for changes in price and demand, and reduce manual processes used to obtain and update price elasticities. See the full description of the use case in the business overview for more background information." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import datetime\n", "from datetime import date, timedelta\n", "\n", "import datarobot as dr\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "dr_dark_blue = \"#08233F\"\n", "dr_orange = \"#FF7F0E\"\n", "dr_blue = \"#1F77B4\"\n", "\n", "pd.set_option(\"display.max_columns\", None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect to DataRobot\n", "\n", "Read more about different options for [connecting to DataRobot from the client](https://docs.datarobot.com/en/docs/api/api-quickstart/api-qs.html)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If the config file is not in the default location described in the API Quickstart guide, '~/.config/datarobot/drconfig.yaml', then you will need to call\n", "# dr.Client(config_path='path-to-drconfig.yaml')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import data\n", "\n", "Access the training dataset [here](https://datarobot.app.box.com/s/lzver7v68s3y693zgy0zqm5vvxp3br17)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Date</th><th>SKUName</th><th>Sales</th><th>PriceBaseline</th><th>PriceActual</th><th>PctOnSale</th><th>Marketing</th><th>hotDay</th><th>sunnyDay</th><th>EconChangeGDP</th><th>EconJobsChange</th><th>AnnualizedCPI</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>7/1/12</td><td>Heck 97% Pork Sausages 400g</td><td>109673</td><td>3.25</td><td>2.93</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>1</th><td>7/2/12</td><td>Heck 97% Pork Sausages 400g</td><td>131791</td><td>3.25</td><td>2.97</td><td>8.65</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>2</th><td>7/3/12</td><td>Heck 97% Pork Sausages 400g</td><td>134711</td><td>3.25</td><td>2.96</td><td>8.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>3</th><td>7/4/12</td><td>Heck 97% Pork Sausages 400g</td><td>97640</td><td>3.25</td><td>2.92</td><td>10.08</td><td>July In Store Credit Card Signup Discount; In ...</td><td>Yes</td><td>No</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>7/5/12</td><td>Heck 97% Pork Sausages 400g</td><td>129538</td><td>3.25</td><td>2.93</td><td>9.80</td><td>July In Store Credit Card Signup Discount; ID5...</td><td>No</td><td>No</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Date SKUName Sales PriceBaseline \\\n", "0 7/1/12 Heck 97% Pork Sausages 400g 109673 3.25 \n", "1 7/2/12 Heck 97% Pork Sausages 400g 131791 3.25 \n", "2 7/3/12 Heck 97% Pork Sausages 400g 134711 3.25 \n", "3 7/4/12 Heck 97% Pork Sausages 400g 97640 3.25 \n", "4 7/5/12 Heck 97% Pork Sausages 400g 129538 3.25 \n", "\n", " PriceActual PctOnSale \\\n", "0 2.93 9.96 \n", "1 2.97 8.65 \n", "2 2.96 8.96 \n", "3 2.92 10.08 \n", "4 2.93 9.80 \n", "\n", " Marketing hotDay sunnyDay \\\n", "0 July In Store Credit Card Signup Discount; In ... No No \n", "1 July In Store Credit Card Signup Discount; In ... No No \n", "2 July In Store Credit Card Signup Discount; In ... No No \n", "3 July In Store Credit Card Signup Discount; In ... Yes No \n", "4 July In Store Credit Card Signup Discount; ID5... No No \n", "\n", " EconChangeGDP EconJobsChange AnnualizedCPI \n", "0 0.5 NaN 0.02 \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train = pd.read_csv(\"1. DR_DEMO_priceOpt_training.csv\")\n", "train.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a project\n", "\n", "The snippets below create and configure a DataRobot project." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Create a DataRobot project from the training dataframe\n", "proj = dr.Project.create(train, project_name=\"Price elasticity of demand\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Retrieve the Informative Features list that DataRobot generates for the project\n", "informative = [\n", "    feat_list for feat_list in proj.get_featurelists() if feat_list.name == \"Informative Features\"\n", "][0]\n", "# Create the modeling feature list: Informative Features without the derived \"Date (Year)\" feature\n", "new_fl = proj.create_featurelist(\n", "    \"new_fl\", [feat for feat in informative.features if feat != \"Date (Year)\"]\n", ")\n", "# Create a one-feature list so that blueprints supporting monotonic constraints\n", "# keep predicted demand from increasing as PriceActual increases\n", "mono_dec = proj.create_featurelist(\"mono_dec\", [\"PriceActual\"])\n", "advanced_options = dr.AdvancedOptions(\n", "    monotonic_decreasing_featurelist_id=mono_dec.id, only_include_monotonic_blueprints=False\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initiate Autopilot" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Project(Price elasticity of demand)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proj.set_target(\n", "    target=\"Sales\",\n", "    mode=dr.enums.AUTOPILOT_MODE.FULL_AUTO,\n", "    advanced_options=advanced_options,\n", "    featurelist_id=new_fl.id,\n", "    worker_count=-1,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import scoring data\n", "\n", "Import the [scoring dataset](https://datarobot.app.box.com/s/7qgnaw595unuot3w2le9609g5wzc58ek) and use it to simulate the effects of different price levels." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Date</th><th>SKUName</th><th>PriceBaseline</th><th>PriceActual</th><th>PctOnSale</th><th>Marketing</th><th>hotDay</th><th>sunnyDay</th><th>EconChangeGDP</th><th>EconJobsChange</th><th>AnnualizedCPI</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>2/27/20</td><td>Heck 97% Pork Sausages 400g</td><td>3.25</td><td>2.93</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>1</th><td>2/27/20</td><td>Yeo Valley Organic Greek Style Natural Yogurt ...</td><td>1.30</td><td>1.17</td><td>10.37</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>2</th><td>2/27/20</td><td>Linda McCartney Rosemary Vegetarian Sausages x...</td><td>2.00</td><td>1.78</td><td>11.21</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>3</th><td>2/27/20</td><td>Goodfella's Stonebaked Thin Margherita Pizza 345g</td><td>1.50</td><td>1.33</td><td>11.09</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>4</th><td>2/27/20</td><td>New Covent Garden Potato &amp; Leek Soup 600g</td><td>1.50</td><td>1.33</td><td>11.16</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Date SKUName PriceBaseline \\\n", "0 2/27/20 Heck 97% Pork Sausages 400g 3.25 \n", "1 2/27/20 Yeo Valley Organic Greek Style Natural Yogurt ... 1.30 \n", "2 2/27/20 Linda McCartney Rosemary Vegetarian Sausages x... 2.00 \n", "3 2/27/20 Goodfella's Stonebaked Thin Margherita Pizza 345g 1.50 \n", "4 2/27/20 New Covent Garden Potato & Leek Soup 600g 1.50 \n", "\n", " PriceActual PctOnSale Marketing \\\n", "0 2.93 9.96 July In Store Credit Card Signup Discount; In ... \n", "1 1.17 10.37 July In Store Credit Card Signup Discount; In ... \n", "2 1.78 11.21 July In Store Credit Card Signup Discount; In ... \n", "3 1.33 11.09 July In Store Credit Card Signup Discount; In ... \n", "4 1.33 11.16 July In Store Credit Card Signup Discount; In ... \n", "\n", " hotDay sunnyDay EconChangeGDP EconJobsChange AnnualizedCPI \n", "0 No No 0.5 NaN 0.02 \n", "1 No No 0.5 NaN 0.02 \n", "2 No No 0.5 NaN 0.02 \n", "3 No No 0.5 NaN 0.02 \n", "4 No No 0.5 NaN 0.02 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "score = pd.read_csv(\"2. DR_DEMO_priceOpt_forScoring.csv\")\n", "score.head()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>SKUName</th><th>discountLevel</th><th>Date</th><th>PriceBaseline</th><th>PriceActual</th><th>PctOnSale</th><th>Marketing</th><th>hotDay</th><th>sunnyDay</th><th>EconChangeGDP</th><th>EconJobsChange</th><th>AnnualizedCPI</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Heck 97% Pork Sausages 400g</td><td>0.74</td><td>2/27/20</td><td>3.25</td><td>2.4050</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>1</th><td>Heck 97% Pork Sausages 400g</td><td>0.75</td><td>2/27/20</td><td>3.25</td><td>2.4375</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>2</th><td>Heck 97% Pork Sausages 400g</td><td>0.76</td><td>2/27/20</td><td>3.25</td><td>2.4700</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>3</th><td>Heck 97% Pork Sausages 400g</td><td>0.77</td><td>2/27/20</td><td>3.25</td><td>2.5025</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"    <tr><th>4</th><td>Heck 97% Pork Sausages 400g</td><td>0.78</td><td>2/27/20</td><td>3.25</td><td>2.5350</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " SKUName discountLevel Date PriceBaseline \\\n", "0 Heck 97% Pork Sausages 400g 0.74 2/27/20 3.25 \n", "1 Heck 97% Pork Sausages 400g 0.75 2/27/20 3.25 \n", "2 Heck 97% Pork Sausages 400g 0.76 2/27/20 3.25 \n", "3 Heck 97% Pork Sausages 400g 0.77 2/27/20 3.25 \n", "4 Heck 97% Pork Sausages 400g 0.78 2/27/20 3.25 \n", "\n", " PriceActual PctOnSale Marketing \\\n", "0 2.4050 9.96 July In Store Credit Card Signup Discount; In ... \n", "1 2.4375 9.96 July In Store Credit Card Signup Discount; In ... \n", "2 2.4700 9.96 July In Store Credit Card Signup Discount; In ... \n", "3 2.5025 9.96 July In Store Credit Card Signup Discount; In ... \n", "4 2.5350 9.96 July In Store Credit Card Signup Discount; In ... \n", "\n", " hotDay sunnyDay EconChangeGDP EconJobsChange AnnualizedCPI \n", "0 No No 0.5 NaN 0.02 \n", "1 No No 0.5 NaN 0.02 \n", "2 No No 0.5 NaN 0.02 \n", "3 No No 0.5 NaN 0.02 \n", "4 No No 0.5 NaN 0.02 " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create different price points in the scoring dataset\n", "skuList = score[\"SKUName\"].unique()\n", "# Price multipliers from 0.74 to 1.25 (26% below to 25% above the baseline price)\n", "discountList = [pct / 100 for pct in range(74, 126)]\n", "\n", "# Build every (SKU, multiplier) combination and attach the scoring features for each SKU\n", "index = pd.MultiIndex.from_product([skuList, discountList], names=[\"SKUName\", \"discountLevel\"])\n", "indexDF = pd.DataFrame(index=index).reset_index()\n", "scorePerm = pd.merge(indexDF, score, on=[\"SKUName\"], how=\"left\")\n", "# Overwrite the actual price with the simulated price\n", "scorePerm[\"PriceActual\"] = scorePerm[\"discountLevel\"] * scorePerm[\"PriceBaseline\"]\n", "scorePerm.head()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# scorePerm.to_csv(\"2. 
DR_DEMO_priceOpt_forScoring_allPerm.csv\",index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Test predictions \n", "\n", "Use the scoring dataset with the top-performing model to generate demand predictions for each stock keeping unit (SKU) at different price points." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "proj = dr.Project.get(project_id=\"your-project-id\")\n", "model = dr.ModelRecommendation.get(\n", " proj.id, dr.enums.RECOMMENDED_MODEL_TYPE.RECOMMENDED_FOR_DEPLOYMENT\n", ").get_model()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "dataset = proj.upload_dataset(scorePerm)\n", "pred_job = model.request_predictions(dataset.id)\n", "preds = pred_job.get_result_when_complete()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>SKUName</th><th>discountLevel</th><th>Date</th><th>PriceBaseline</th><th>PriceActual</th><th>PctOnSale</th><th>Marketing</th><th>hotDay</th><th>sunnyDay</th><th>EconChangeGDP</th><th>EconJobsChange</th><th>AnnualizedCPI</th><th>row_id</th><th>prediction</th><th>revenue</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Heck 97% Pork Sausages 400g</td><td>0.74</td><td>2/27/20</td><td>3.25</td><td>2.4050</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td><td>0</td><td>115643.445312</td><td>278122.485977</td></tr>\n",
"    <tr><th>1</th><td>Heck 97% Pork Sausages 400g</td><td>0.75</td><td>2/27/20</td><td>3.25</td><td>2.4375</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td><td>1</td><td>115643.445312</td><td>281880.897949</td></tr>\n",
"    <tr><th>2</th><td>Heck 97% Pork Sausages 400g</td><td>0.76</td><td>2/27/20</td><td>3.25</td><td>2.4700</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td><td>2</td><td>115643.445312</td><td>285639.309922</td></tr>\n",
"    <tr><th>3</th><td>Heck 97% Pork Sausages 400g</td><td>0.77</td><td>2/27/20</td><td>3.25</td><td>2.5025</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td><td>3</td><td>115643.445312</td><td>289397.721895</td></tr>\n",
"    <tr><th>4</th><td>Heck 97% Pork Sausages 400g</td><td>0.78</td><td>2/27/20</td><td>3.25</td><td>2.5350</td><td>9.96</td><td>July In Store Credit Card Signup Discount; In ...</td><td>No</td><td>No</td><td>0.5</td><td>NaN</td><td>0.02</td><td>4</td><td>115643.445312</td><td>293156.133867</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " SKUName discountLevel Date PriceBaseline \\\n", "0 Heck 97% Pork Sausages 400g 0.74 2/27/20 3.25 \n", "1 Heck 97% Pork Sausages 400g 0.75 2/27/20 3.25 \n", "2 Heck 97% Pork Sausages 400g 0.76 2/27/20 3.25 \n", "3 Heck 97% Pork Sausages 400g 0.77 2/27/20 3.25 \n", "4 Heck 97% Pork Sausages 400g 0.78 2/27/20 3.25 \n", "\n", " PriceActual PctOnSale Marketing \\\n", "0 2.4050 9.96 July In Store Credit Card Signup Discount; In ... \n", "1 2.4375 9.96 July In Store Credit Card Signup Discount; In ... \n", "2 2.4700 9.96 July In Store Credit Card Signup Discount; In ... \n", "3 2.5025 9.96 July In Store Credit Card Signup Discount; In ... \n", "4 2.5350 9.96 July In Store Credit Card Signup Discount; In ... \n", "\n", " hotDay sunnyDay EconChangeGDP EconJobsChange AnnualizedCPI row_id \\\n", "0 No No 0.5 NaN 0.02 0 \n", "1 No No 0.5 NaN 0.02 1 \n", "2 No No 0.5 NaN 0.02 2 \n", "3 No No 0.5 NaN 0.02 3 \n", "4 No No 0.5 NaN 0.02 4 \n", "\n", " prediction revenue \n", "0 115643.445312 278122.485977 \n", "1 115643.445312 281880.897949 \n", "2 115643.445312 285639.309922 \n", "3 115643.445312 289397.721895 \n", "4 115643.445312 293156.133867 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "predsAll = pd.concat([scorePerm, preds], axis=1)\n", "predsAll[\"revenue\"] = (\n", " predsAll[\"prediction\"] * predsAll[\"PriceActual\"]\n", ") # - predsAll['prediction']*predsAll['PriceBaseline']*0.2\n", "predsAll.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View distribution of demand\n", "\n", "Use the snippets below to view the distribution of demand for different price points for one SKU." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "oneSKU = predsAll[predsAll[\"SKUName\"] == \"New Covent Garden Potato & Leek Soup 600g\"]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "oneSKU = predsAll[predsAll[\"SKUName\"] == \"New Covent Garden Potato & Leek Soup 600g\"]\n", "\n", "plt.rcParams.update({\"text.color\": \"white\", \"axes.labelcolor\": \"white\"})\n", "fig = plt.figure(figsize=(8, 8))\n", "ax1 = fig.add_subplot(1, 1, 1, facecolor=dr_dark_blue)\n", "ax2 = ax1.twinx()  # Second y-axis that shares the same x-axis\n", "plt.title(\"Optimum Price Point\")\n", "\n", "# discountLevel is a price multiplier, so 1.0 corresponds to the baseline price\n", "ax1.set_xlabel(\"Price Relative to Baseline (1.0 = Baseline Price)\")\n", "ax1.tick_params(axis=\"x\", labelcolor=\"white\", colors=\"white\")\n", "\n", "ax1.set_ylabel(\"Total Revenue\", color=dr_blue)\n", "ax1.tick_params(axis=\"y\", labelcolor=dr_blue, colors=dr_blue)\n", "ax1.plot(oneSKU.discountLevel, oneSKU.revenue, color=dr_blue)\n", "\n", "ax2.set_ylabel(\"Total Sales\", color=dr_orange)\n", "ax2.tick_params(axis=\"y\", labelcolor=dr_orange, colors=dr_orange)\n", "ax2.plot(oneSKU.discountLevel, oneSKU.prediction, color=dr_orange)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select the price point that maximizes revenue for each SKU." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "# Keep the rows where revenue equals the maximum revenue for that SKU\n", "idx = predsAll.groupby(\"SKUName\")[\"revenue\"].transform(\"max\") == predsAll[\"revenue\"]\n", "maxSales = predsAll.loc[idx, [\"SKUName\", \"discountLevel\", \"PriceActual\"]].copy()\n", "# Express the price multiplier as a percent change from baseline; round before casting\n", "# so that floating-point values such as -6.999... do not truncate to -6\n", "maxSales[\"discountLevel\"] = (\n", "    ((maxSales[\"discountLevel\"] - 1) * 100).round().astype(int).astype(str) + \"%\"\n", ")\n", "finalData = pd.merge(score, maxSales, on=[\"SKUName\"], how=\"left\")\n", "finalData.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy a model to production\n", "\n", "\n", "If you are happy with the model's performance, you can deploy it to a production environment with [MLOps](https://docs.datarobot.com/en/mlops/index.html). 
Deploying the model frees up modeling workers, because data scored through a deployment does not use them. A deployment also removes the limit on the amount of data you can score, so you can score over 100GB. Deployments offer model management benefits as well: service monitoring, data drift tracking, model comparison, retraining, and more." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "autoscroll": "auto" }, "outputs": [], "source": [ "# Retrieve a prediction server\n", "prediction_server = dr.PredictionServer.list()[0]\n", "\n", "# Deploy the recommended model retrieved in the \"Test predictions\" section\n", "deployment = dr.Deployment.create_from_learning_model(\n", "    model.id,\n", "    label=\"Price elasticity\",\n", "    description=\"Predict the optimal price for a product based on demand\",\n", "    default_prediction_server_id=prediction_server.id,\n", ")\n", "deployment.id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before proceeding, provide the deployed model's deployment ID (retrieved from the deployment's [Overview tab](https://docs.datarobot.com/en/docs/mlops/monitor/dep-overview.html) or from the Deployment object in the Python client with `deployment.id`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure batch predictions\n", "\n", "After the model has been deployed, DataRobot creates an endpoint for real-time scoring. The deployment also allows you to use DataRobot's batch prediction API to score large datasets with a deployed DataRobot model. \n", "\n", "The batch prediction API provides flexible intake and output options when scoring large datasets using prediction servers. 
The API is exposed through the DataRobot Public API and can be consumed with any REST-enabled client or with the Public API bindings in DataRobot's Python client." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set the deployment ID" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deployment_id = \"YOUR_DEPLOYMENT_ID\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Determine input and output options\n", "\n", "DataRobot's batch prediction API allows you to read scoring data from, and write predictions to, multiple sources. You can reuse the credentials and data sources that you have already set up in the UI. Credentials are usernames and passwords, while data sources are any databases with which you have previously established a connection (e.g., Snowflake). View the example code below outlining how to query credentials and data sources.\n", "\n", "See the documentation for the full list of supported [input](https://docs.datarobot.com/en/docs/predictions/batch/batch-prediction-api/intake-options.html) and [output options](https://docs.datarobot.com/en/docs/predictions/batch/batch-prediction-api/output-options.html) and more information about [data connections](https://docs.datarobot.com/en/docs/data/connect-data/data-conn.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The snippet below shows how you can query all credentials tied to a DataRobot account." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dr.Credential.list()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The call above returns a list of credential sets. The alphanumeric string in each item is the credential ID, which you can use to reference those credentials through the API."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The snippet below shows how you can query all data sources tied to a DataRobot account. The first line returns every data store; the second line prints the alphanumeric ID of the first data store in the list." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5e6696ff820e737a5bd78430\n" ] } ], "source": [ "dr.DataStore.list()\n", "print(dr.DataStore.list()[0].id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Scoring examples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The snippets below demonstrate how to score data with the Batch Prediction API. Edit the `intake_settings` and `output_settings` to suit your needs. You can mix and match intake and output types as needed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Score from CSV to CSV" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Scoring without Prediction Explanations\n", "if False:\n", "    dr.BatchPredictionJob.score(\n", "        deployment_id,\n", "        intake_settings={\n", "            \"type\": \"localFile\",\n", "            \"file\": \"inputfile.csv\",  # Provide the filepath, Pandas dataframe, or file-like object here\n", "        },\n", "        output_settings={\"type\": \"localFile\", \"path\": \"outputfile.csv\"},\n", "    )\n", "\n", "# Scoring with Prediction Explanations\n", "if False:\n", "    dr.BatchPredictionJob.score(\n", "        deployment_id,\n", "        intake_settings={\n", "            \"type\": \"localFile\",\n", "            \"file\": \"inputfile.csv\",  # Provide the filepath, Pandas dataframe, or file-like object here\n", "        },\n", "        output_settings={\"type\": \"localFile\", \"path\": \"outputfile.csv\"},\n", "        max_explanations=3,  # Compute Prediction Explanations for up to this number of features\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Score from S3 to S3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if False:\n", "    dr.BatchPredictionJob.score(\n", "        deployment_id,\n", "        intake_settings={\n", "            \"type\": \"s3\",\n", "            \"url\": \"s3://theos-test-bucket/lending_club_scoring.csv\",  # Provide the S3 URL of the file to score here\n", "            \"credential_id\": \"YOUR_CREDENTIAL_ID_FROM_ABOVE\",  # Provide the ID of your stored credentials here\n", "        },\n", "        output_settings={\n", "            \"type\": \"s3\",\n", "            \"url\": \"s3://theos-test-bucket/lending_club_scored2.csv\",\n", "            \"credential_id\": \"YOUR_CREDENTIAL_ID_FROM_ABOVE\",\n", "        },\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Score from JDBC to JDBC" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# data_store and cred are placeholders for items selected from\n", "# dr.DataStore.list() and dr.Credential.list() above\n", "if False:\n", "    dr.BatchPredictionJob.score(\n", "        deployment_id,\n", "        intake_settings={\n", "            \"type\": \"jdbc\",\n", "            \"table\": \"table_name\",\n", "            \"schema\": \"public\",\n", "            \"data_store_id\": data_store.id,  # Provide the ID of your datastore here\n", "            \"credential_id\": cred.credential_id,  # Provide the ID of your stored credentials here\n", "        },\n", "        output_settings={\n", "            \"type\": \"jdbc\",\n", "            \"table\": \"table_name\",\n", "            \"schema\": \"public\",\n", "            \"statement_type\": \"insert\",\n", "            \"data_store_id\": data_store.id,\n", "            \"credential_id\": cred.credential_id,\n", "        },\n", "    )" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }