Blueprint Workshop basic steps notebook
In [1]:
import datarobot as dr
In [2]:
from datarobot_bp_workshop import Workshop, Visualize
In [3]:
with open('../api.token', 'r') as f:
    token = f.read()
dr.Client(token=token, endpoint='https://app.datarobot.com/api/v2')
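Token files often end with a trailing newline, which `f.read()` returns unchanged. The helper below is a minimal sketch of a stricter loader that strips surrounding whitespace and fails early on an empty file. The name `read_token` is hypothetical and not part of `datarobot` or `datarobot_bp_workshop`.

```python
from pathlib import Path


def read_token(path):
    """Return the API token stored in `path`, without surrounding whitespace.

    Hypothetical helper for illustration; it is not part of the DataRobot client.
    """
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        # Fail here rather than sending an empty token to the API.
        raise ValueError(f"No token found in {path}")
    return token
```

The returned string could then be passed as `token=` in the `dr.Client(...)` call shown above.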
Initialize the workshop
In [4]:
w = Workshop()
Build a blueprint
In [5]:
w.Task('PNI2')
Out[5]:
Missing Values Imputed (quick median) (PNI2)
Input Summary: (None)
Output Method: TaskOutputMethod.TRANSFORM
In [6]:
w.Tasks.PNI2()
Out[6]:
Missing Values Imputed (quick median) (PNI2)
Input Summary: (None)
Output Method: TaskOutputMethod.TRANSFORM
In [7]:
pni = w.Tasks.PNI2(w.TaskInputs.NUM)
rdt = w.Tasks.RDT5(pni)
binning = w.Tasks.BINNING(pni)
keras = w.Tasks.KERASC(rdt, binning)
keras.set_task_parameters_by_name(learning_rate=0.123)
keras_blueprint = w.BlueprintGraph(keras, name='A blueprint I made with the Python API').save()
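The cell above wires four tasks into a directed acyclic graph: numeric input feeds `PNI2`, whose output feeds both `RDT5` and `BINNING`, and both of those feed `KERASC`. The sketch below mirrors that wiring using only the standard library, to show the order in which the tasks have to run. The plain-dictionary layout is an illustrative assumption and not how `datarobot_bp_workshop` stores a blueprint.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a task code from the cell above; each value is the set of
# inputs that task consumes. "NUM" stands in for w.TaskInputs.NUM.
# This layout is an assumption made for illustration only.
blueprint_edges = {
    "PNI2": {"NUM"},
    "RDT5": {"PNI2"},
    "BINNING": {"PNI2"},
    "KERASC": {"RDT5", "BINNING"},
}

# static_order() yields every node only after all of its inputs.
execution_order = list(TopologicalSorter(blueprint_edges).static_order())
print(execution_order)
```

`RDT5` and `BINNING` do not depend on each other, so their relative order is not fixed. The input always comes first and the final estimator last.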
In [8]:
user_blueprint_id = keras_blueprint.user_blueprint_id
Visualize the blueprint
In [9]:
keras_blueprint.show()
Inspect tasks
In [10]:
pni
Out[10]:
Missing Values Imputed (quick median) (PNI2)
Input Summary: Numeric Data
Output Method: TaskOutputMethod.TRANSFORM
In [11]:
rdt
Out[11]:
Smooth Ridit Transform (RDT5)
Input Summary: Missing Values Imputed (quick median) (PNI2)
Output Method: TaskOutputMethod.TRANSFORM
In [12]:
binning
Out[12]:
Binning of numerical variables (BINNING)
Input Summary: Missing Values Imputed (quick median) (PNI2)
Output Method: TaskOutputMethod.TRANSFORM
In [13]:
keras
Out[13]:
Keras Neural Network Classifier (KERASC)
Input Summary: Smooth Ridit Transform (RDT5) | Binning of numerical variables (BINNING)
Output Method: TaskOutputMethod.PREDICT
Task Parameters:
  learning_rate (learning_rate) = 0.123
In [14]:
keras.task_parameters.learning_rate
Out[14]:
0.123
In [15]:
keras.task_parameters.batch_size = 32
In [16]:
keras
Out[16]:
Keras Neural Network Classifier (KERASC)
Input Summary: Smooth Ridit Transform (RDT5) | Binning of numerical variables (BINNING)
Output Method: TaskOutputMethod.PREDICT
Task Parameters:
  batch_size (batch_size) = 32
  learning_rate (learning_rate) = 0.123
In [17]:
keras_blueprint
Out[17]:
Name: 'A blueprint I made with the Python API'
Input Data: Numeric
Tasks: Missing Values Imputed (quick median) | Smooth Ridit Transform | Binning of numerical variables | Keras Neural Network Classifier
Validation
Test validation by intentionally providing the wrong input data type.
In [18]:
pni = w.Tasks.PNI2(w.TaskInputs.CAT)
rdt = w.Tasks.RDT5(pni)
binning = w.Tasks.BINNING(pni)
keras = w.Tasks.KERASC(rdt, binning)
keras.set_task_parameters_by_name(learning_rate=0.123)
invalid_keras_blueprint = w.BlueprintGraph(keras)
In [19]:
invalid_keras_blueprint.save('A blueprint with warnings (PythonAPI)', user_blueprint_id=user_blueprint_id).show()
In [20]:
binning.set_task_parameters_by_name(max_bins=-22)
Out[20]:
Binning of numerical variables (BINNING)
Input Summary: Missing Values Imputed (quick median) (PNI2)
Output Method: TaskOutputMethod.TRANSFORM
Task Parameters:
  max_bins (b) = -22
In [21]:
invalid_keras_blueprint.save('A blueprint with warnings (PythonAPI)', user_blueprint_id=user_blueprint_id).show()
Binning of numerical variables (BINNING)
Invalid value(s) supplied
  max_bins (b) = -22
    - Must be a 'intgrid' parameter defined by: [2, 500]
Failed to save: parameter validation failed.
In [22]:
keras.validate_task_parameters()
Keras Neural Network Classifier (KERASC)
All parameters valid!
Out[22]:
Update to the original valid blueprint
In [23]:
pni = w.Tasks.PNI2(w.TaskInputs.NUM)
rdt = w.Tasks.RDT5(pni)
binning = w.Tasks.BINNING(pni)
keras = w.Tasks.KERASC(rdt, binning)
keras.set_task_parameters_by_name(learning_rate=0.123)
keras_blueprint = w.BlueprintGraph(keras)
blueprint_graph = keras_blueprint.save('A blueprint I made with the Python API', user_blueprint_id=user_blueprint_id)
Get help on a task
In [24]:
help(w.Tasks.PNI2)
Help on PNI2 in module datarobot_bp_workshop.factories object:

class PNI2(datarobot_bp_workshop.friendly_repr.FriendlyRepr)
 |  Missing Values Imputed (quick median)
 |
 |  Impute missing values on numeric variables with their median and create indicator variables to mark imputed records
 |
 |  Parameters
 |  ----------
 |  output_method: string, one of (TaskOutputMethod.TRANSFORM).
 |  task_parameters: dict, which may contain:
 |
 |    scale_small (s): select, (Default=0)
 |      Possible Values: [False, True]
 |
 |    threshold (t): int, (Default=10)
 |      Possible Values: [1, 99999]
 |
 |  Method resolution order:
 |      PNI2
 |      datarobot_bp_workshop.friendly_repr.FriendlyRepr
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __call__(zelf, *inputs, output_method=None, task_parameters=None, output_method_parameters=None, x_transformations=None, y_transformations=None, freeze=False, version=None)
 |
 |  __friendly_repr__(zelf)
 |
 |  documentation(zelf, auto_open=False)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  description = 'Impute missing values on numeric variables with ...eate...
 |
 |  label = 'Missing Values Imputed (quick median)'
 |
 |  task_code = 'PNI2'
 |
 |  task_parameters = scale_small (s): select, (Default=0)
 |
 |  threshold (t):...
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from datarobot_bp_workshop.friendly_repr.FriendlyRepr:
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from datarobot_bp_workshop.friendly_repr.FriendlyRepr:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
List task categories
In [25]:
w.list_categories(show_tasks=True)
Custom - Awesome Model (CUSTOMR_6019ae978cc598a46199cee1) - "My Custom Task" (CUSTOMR_608e42ac186a7242380a6a98) - "My Custom Task" (CUSTOMR_608e42ecd5eb0dc5f28d0dda) - "My Custom Task" (CUSTOMR_608e43fc01f9f466aa8d0d81) - My Custom Ridge Regressor w/ Imputation (CUSTOMR_608e5a4ed5eb0dc5f28d0ea0) - My Custom Ridge Regressor w/ Imputation (CUSTOMR_608e5bc8b66a4934d58d0d4e) - My Custom Ridge Regressor w/ Imputation (CUSTOMR_608ef72b6f13f54305667783) - My Custom Ridge Regressor w/ Imputation (CUSTOMR_608ef74c5dda651931052422) - Second model (CUSTOMC_6019d18adfa83afbad99cdb8) - My Imputation Task (CUSTOMT_6188b0e6fb465717f029fd05) - Image Featurizer (CUSTOMT_61b452e57fd5b0629a2f4fd3) - Maybe Broken? (CUSTOMT_61b7d3f26f8e01a1a8f7bc0c) Preprocessing Numeric Preprocessing Data Quality - Numeric Data Cleansing (NDC) Dimensionality Reducer - Truncated Singular Value Decomposition (SVD2) - Partial Principal Components Analysis (PPCA) - Truncated Singular Value Decomposition (SVD) Scaling - Impose Uniform Transform (UNIF3) - Log Transformer (LOGT) - Smooth Ridit Transform (RDT5) - Standardize (RST) - Search for best transformation including Smooth Ridit (BTRANSF6) - Transparent Search for best transformation (BTRANSF6T) - Transform on the link function scale (LINK) - Ridit Transform (SRDT3) - Standardize (ST) - Sparse Interaction Machine (SPOLY) - Constant Splines (GS) - One-Hot Encoding (PDM3) - Numeric Data Cleansing (NDC) - Missing Values Imputed (quick median) (PNI2) - Missing Values Imputed (arbitrary or quick median) (PNIA4) - Normalizer (NORM) - Search for ratios (RATIO3) - Binning of numerical variables (BINNING) - Search for differences (DIFF3) Categorical Preprocessing - Categorical Embedding (CATEMB) - Category Count (PCCAT) - One-Hot Encoding (PDM3) - Ordinal encoding of categorical variables (ORDCAT2) - Univariate credibility estimates with L2 (CRED1b1) - Buhlmann credibility estimates for high cardinality features (CRED1) Text Preprocessing - TextBlob Sentiment 
Featurizer (TEXTBLOB_SENTIMENT) - NLTK Sentiment Featurizer (NLTK_SENTIMENT) - One-Hot Encoding (PDM3) - Pretrained TinyBERT Featurizer (TINYBERTFEA) - SpaCy Named Entity Recognition Detector (SPACY_NAMED_ENTITY_RECOGNITION) - Fasttext Word Vectorization and Mean text embedding (TXTEM1) - Keras encoding of text variables (KERAS_TOKENIZER) - Matrix of word-grams occurrences (PTM3) Image Preprocessing - OpenCV Detect Largest Rectangle (OPENCV_DETECT_LARGEST_RECTANGLE) - OpenCV Image Featurizer (OPENCV_FEATURIZER) - Grayscale Downscaled Image Featurizer (IMG_GRAYSCALE_DOWNSCALED_IMAGE_FEATURIZER) - No Post Processing (IMAGE_POST_PROCESSOR) - Pretrained Multi-Level Global Average Pooling Image Featurizer (IMGFEA) Summarized Categorical Preprocessing - Summarized Categorical to Sparse Matrix (CDICT2SP) - Single Column Converter for Summarized Categorical (SCBAGOFCAT2) Geospatial Preprocessing - Spatial Neighborhood Featurizer (GEO_NEIGHBOR_V1) - Geospatial Location Converter (GEO_IN) Models Regression - eXtreme Gradient Boosted Trees Quantile Regressor with Early Stopping (ESQUANTXGBR) - ExtraTrees Regressor (RFR) - Elastic-Net Regressor (L1 / Least-Squares Loss) (ENETCDWC) - Light Gradient Boosted Trees Regressor with Early Stopping (ESLGBMTR) - eXtreme Gradient Boosted Trees Regressor (PXGBR2) - Ridge Regression (RIDGE) - Nystroem Kernel SVM Regressor (ASVMER) - eXtreme Gradient Boosted Trees Regressor with Early Stopping and Unsupervised Learning Features (UESXGBR2) - eXtreme Gradient Boosted Trees Regressor (XGBR2) - eXtreme Gradient Boosted Trees Regressor (XL_PXGBR2) - Nystroem Kernel SVM Regressor (ASVMSKR) - Partial Least-Squares Regression (PLS) - Gaussian Process Regressor with Rational Quadratic Kernel (GPRRQ) - Eureqa Regressor (EQR) - Auto-Tuned Char N-Gram Text Modeler using token counts (CNGER2) - Frequency-Severity Generalized Additive Model (FSGG2) - Hot Spots (XPRIMR) - Linear Regression (GLMCD) - Frequency-Severity ElasticNet (FSEE) - Gradient Boosted 
Trees Regressor with Early Stopping (Least-Squares Loss) (ESGBR2) - Light Gradient Boosting on ElasticNet Predictions (RES_ESLGBMTR) - Support Vector Regressor (Radial Kernel) (SVMR2) - Regularized Quantile Regressor with Keras (KERAS_REGULARIZED_QUANTILE_REG) - Auto-tuned K-Nearest Neighbors Regressor (Euclidean Distance) (KNNR) - Lasso Regression (LASSO2) - Gaussian Process Regressor with Radial Basis Function Kernel (GPRRBF) - XRuleFit Regressor (XRULEFITR) - Frequency-Severity Light Gradient Boosted Trees (FSLL) - Ridge Regression (RIDGEWC) - Stochastic Gradient Descent Regression (SGDR) - Eureqa Generalized Additive Model (EQ_ESXGBR) - Elastic-Net Regressor (L1 / Least-Squares Loss) with K-Means Distance Features (KMDENETCD) - eXtreme Gradient Boosting on ElasticNet Predictions (RES_XGBR2) - Auto-Tuned Word N-Gram Text Modeler using token counts (WNGER2) - Auto-Tuned Summarized Categorical Modeler (SCENETR) - Keras Neural Network Regressor (KERASR) - Elastic-Net Regressor (L1 / Least-Squares Loss) (ENETCD) - eXtreme Gradient Boosted Trees Regressor (XL_XGBR2) - Gaussian Process Regressor with Dot Product Kernel (GPRDP) - Dropout Additive Regression Trees Regressor (PLGBMDR) - Elastic-Net Regressor (L1 / Least-Squares Loss) with Binned numeric features (BENETCD2) - eXtreme Gradient Boosted Trees Regressor with Early Stopping (XL_ESXGBR2) - Auto-tuned Stochastic Gradient Descent Regression (SGDRA) - RuleFit Regressor (RULEFITR) - Gaussian Process Regressor with Exponential Sine Squared Kernel (GPRESS) - Adaboost Regressor (ABR) - Elastic-Net Regressor (L1 / Least-Squares Loss) with Unsupervised Learning Features (UENETCD) - Gaussian Process Regressor with Matern Kernel (GPRM) - Light Gradient Boosting on ElasticNet Predictions (RES_PLGBMTR) - Gradient Boosted Trees Quantile Regressor with Early Stopping (QESGBR2) - ExtraTrees Regressor (Shallow) (SHAPRFR) - Statsmodels Quantile Regressor (QUANTILER) - eXtreme Gradient Boosted Trees Regressor with Early Stopping 
(ESXGBR2) - LightGBM Random Forest Regressor (PLGBMRFR) - Frequency-Cost ElasticNet (FCEE) - Frequency-Severity eXtreme Gradient Boosted Trees (FSXX2) - Gradient Boosted Trees Quantile Regressor (QGBR2) - eXtreme Gradient Boosting on ElasticNet Predictions (RES_ESXGBR2) Binary Classification - Stochastic Gradient Descent Classifier (SGDC) - LightGBM Random Forest Classifier (PLGBMRFC) - Bernoulli Naive Bayes classifier (scikit-learn) (BNBC) - Dropout Additive Regression Trees Classifier (PLGBMDC) - Auto-Tuned Char N-Gram Text Modeler using token counts (CNGEC2) - Gaussian Process Classifier with Matern Kernel (GPCM) - Gradient Boosted Trees Classifier with Early Stopping (ESGBC) - XRuleFit Classifier (XRULEFITC) - Support Vector Classifier (Radial Kernel) (SVMC2) - Multinomial Naive Bayes classifier (scikit-learn) (MNBC) - Adaboost Classifier (ABC) - eXtreme Gradient Boosting on ElasticNet Predictions (RES_XGBC2) - Elastic-Net Classifier (L1 / Binomial Deviance) (LENETCDWC) - Gaussian Process Classifier with Radial Basis Function Kernel (GPCRBF) - Light Gradient Boosted Trees Classifier with Early Stopping (ESLGBMTC) - Eureqa Classifier (EQC) - ExtraTrees Classifier (Gini) (SHAPRFC) - Logistic Regression (LR) - Keras Neural Network Classifier (KERASC) - Nystroem Kernel SVM Classifier (ASVMEC) - eXtreme Gradient Boosted Trees Classifier with Early Stopping and Unsupervised Learning Features (UESXGBC2) - RuleFit Classifier (RULEFITC) - Regularized Logistic Regression (L2) (LR1) - Nystroem Kernel SVM Classifier (ASVMSKC) - Naive Bayes combiner classifier (CNBC) - Light Gradient Boosted Trees Classifier with Early Stopping and Unsupervised Learning Features (UESLGBMTC) - Light Gradient Boosting on ElasticNet Predictions (RES_PLGBMTC) - Elastic-Net Classifier (L1 / Binomial Deviance) with K-Means Distance Features (KMDLENETCD) - ExtraTrees Classifier (Gini) (RFC) - Hot Spots (XPRIMC) - Partial Least-Squares Classification (PLSC) - Auto-tuned K-Nearest Neighbors 
Classifier (Euclidean Distance) (KNNC) - Eureqa Generalized Additive Model Classifier (EQ_ESXGBC) - eXtreme Gradient Boosted Trees Classifier (XL_XGBC2) - Auto-Tuned Summarized Categorical Modeler (SCLENETC) - Light Gradient Boosting on ElasticNet Predictions (RES_ESLGBMTC) - Elastic-Net Classifier with Naive Bayes Feature Weighting (NB_LENETCD) - eXtreme Gradient Boosted Trees Classifier (XGBC2) - Elastic-Net Classifier (L1 / Binomial Deviance) (LENETCD) - Gaussian Naive Bayes classifier (scikit-learn) (GNBC) - Logistic Regression (LRCD) - eXtreme Gradient Boosted Trees Classifier with Early Stopping (ESXGBC2) - eXtreme Gradient Boosted Trees Classifier (PXGBC2) - Auto-Tuned Word N-Gram Text Modeler using token counts (WNGEC2) Multi-class Classification - Stochastic Gradient Descent Classifier (SGDC) - LightGBM Random Forest Classifier (PLGBMRFC) - Dropout Additive Regression Trees Classifier (PLGBMDC) - Gradient Boosted Trees Classifier with Early Stopping (ESGBC) - Light Gradient Boosted Trees Classifier with Early Stopping (ESLGBMTC) - ExtraTrees Classifier (Gini) (SHAPRFC) - Logistic Regression (LR) - Regularized Logistic Regression (L2) (LR1) - Light Gradient Boosted Trees Classifier with Early Stopping and Unsupervised Learning Features (UESLGBMTC) - Light Gradient Boosting on ElasticNet Predictions (RES_PLGBMTC) - Keras Neural Network Classifier (KERASMULTIC) - ExtraTrees Classifier (Gini) (RFC) - Light Gradient Boosting on ElasticNet Predictions (RES_ESLGBMTC) - eXtreme Gradient Boosted Trees Classifier (XGBC2) - Elastic-Net Classifier (L1 / Binomial Deviance) (LENETCD) - Logistic Regression (LRCD) - eXtreme Gradient Boosted Trees Classifier with Early Stopping (ESXGBC2) - eXtreme Gradient Boosted Trees Classifier (PXGBC2) Boosting - eXtreme Gradient Boosted Trees Regressor (XL_PXGBR2) - eXtreme Gradient Boosting on ElasticNet Predictions (RES_XGBC2) - Light Gradient Boosting on ElasticNet Predictions (RES_ESLGBMTR) - eXtreme Gradient Boosting on 
ElasticNet Predictions (RES_XGBR2) - eXtreme Gradient Boosted Trees Regressor (XL_XGBR2) - Light Gradient Boosting on ElasticNet Predictions (RES_PLGBMTC) - eXtreme Gradient Boosted Trees Regressor with Early Stopping (XL_ESXGBR2) - eXtreme Gradient Boosted Trees Classifier (XL_XGBC2) - Light Gradient Boosting on ElasticNet Predictions (RES_ESLGBMTC) - Light Gradient Boosting on ElasticNet Predictions (RES_PLGBMTR) - eXtreme Gradient Boosting on ElasticNet Predictions (RES_ESXGBR2) Unsupervised Anomaly Detection - Local Outlier Factor Anomaly Detection (ADLOF) - Mahalanobis Distance Ranked Anomaly Detection with PCA and Calibration (ADMAHAL_PCA_CAL) - Keras Autoencoder (KERAS_AUTOENCODER) - Isolation Forest Anomaly Detection (ADISOFOR) - Mahalanobis Distance Ranked Anomaly Detection with PCA (ADMahalPCA) - Keras Autoencoder with Calibration (KERAS_AUTOENCODER_CAL) - Isolation Forest Anomaly Detection with Calibration (ADISOFOR_CAL) - Double Median Absolute Deviation Anomaly Detection (ADDMAD) - Keras Variational Autoencoder (KERAS_VARIATIONAL_AUTOENCODER) - Keras Variational Autoencoder with Calibration (KERAS_VARIATIONAL_AUTOENCODER_CAL) - One-Class SVM Anomaly Detection with Calibration (ADOSVM_CAL) - Local Outlier Factor Anomaly Detection with Calibration (ADLOF_CAL) - Anomaly Detection with Supervised Learning (XGB) (ADXGB) - One-Class SVM Anomaly Detection (ADOSVM) - Anomaly Detection with Supervised Learning (XGB) and Calibration (ADXGB2_CAL) - Double Median Absolute Deviation Anomaly Detection with Calibration (ADDMAD_CAL) Clustering - K-Means Clustering (KMEANS) Calibration - Calibrate predictions with RF (CALIB_V2_RFC) - Text fit on Residuals (L1 / Least-Squares Loss) (XL_ENETCD) - Calibrate predictions: Weighted Calibration (SWCAL) - Calibrate predictions (CALIB) - Text fit on Residuals (L1 / Binomial Deviance) (XL_LENETCD) - Fit High Cardinality and Text (XLF_LENETCD) - Text fit on Residuals (L1 / Least-Squares Loss) (RES_FDENETCD) - Calibrate 
predictions (CALIB2) - Calibrate predictions: Platt (PLACAL2) - Fit High Cardinality and Text (XLF_ENETCD) Other Column Selection - Converter for Text Mining (SCTXT2) - Single Column Converter for Summarized Categorical (SCBAGOFCAT) - Single Column Converter (SCPICK2) - Single Column Converter (SCPICK) - Converter for Text Mining (SCTXT4) - Multiple Column Selector (MCPICK) Automatic Feature Selection - Feature Selection for Ratios/Differences (FS_RFR2) - Feature Selection for dimensionality reduction (FS_RFCDR2) - Feature Selection for dimensionality reduction (FS_RFCDR_LASSO) - Feature Selection for dimensionality reduction (FS_RFRDR_LASSO) - Feature Selection using L1 Regularization (FS_XL_LASSO2) - Rare Feature Masking (RFMASK) - Feature Selection for Ratios/Differences (FS_RFC2) - Feature Selection for dimensionality reduction (FS_RFRDR2) - Bind branches (BIND)
Out[25]:
Search for tasks by name
In [26]:
w.search_tasks('keras')
Out[26]:
Keras Autoencoder with Calibration: [KERAS_AUTOENCODER_CAL] - Keras Autoencoder for Anomaly Detection with Calibration
Keras Autoencoder: [KERAS_AUTOENCODER] - Keras Autoencoder for Anomaly Detection
Keras Neural Network Classifier: [KERASC] - Keras Neural Network Classifier
Keras Neural Network Classifier: [KERASMULTIC] - Keras Neural Network Multi-Class Classifier
Keras Neural Network Regressor: [KERASR] - Keras Neural Network Regressor
Keras Variational Autoencoder with Calibration: [KERAS_VARIATIONAL_AUTOENCODER_CAL] - Keras Variational Autoencoder for Anomaly Detection with Calibration
Keras Variational Autoencoder: [KERAS_VARIATIONAL_AUTOENCODER] - Keras Variational Autoencoder for Anomaly Detection
Keras encoding of text variables: [KERAS_TOKENIZER] - Text encoding based on Keras Tokenizer class
Regularized Quantile Regressor with Keras: [KERAS_REGULARIZED_QUANTILE_REG] - Regularized Quantile Regression implemented in Keras
Search for custom tasks
In [27]:
w.search_tasks('Awesome')
Out[27]:
Awesome Model: [CUSTOMR_6019ae978cc598a46199cee1] - This is the best model ever.
Flexible search
In [28]:
w.search_tasks('bins')
Out[28]:
Binning of numerical variables: [BINNING] - Bin numerical values into non-uniform bins using decision trees
Elastic-Net Regressor (L1 / Least-Squares Loss) with Binned numeric features: [BENETCD2] - Bin numerical values into non-uniform bins using decision trees, followed by Elasticnet model using block coordinate descent-- a common form of derivated-free optimization. Based on lightning CDRegressor.
In [29]:
w.search_tasks('Pre-proc')
Out[29]:
In [30]:
[a.task_code for a in w.search_tasks('decision')]
Out[30]:
['BINNING', 'BENETCD2', 'RFC', 'RFR']
In [31]:
w.Tasks.RFC
Out[31]:
ExtraTrees Classifier (Gini): [RFC] - Random Forests based on scikit-learn. Random forests are an ensemble method where hundreds (or thousands) of individual decision trees are fit to bootstrap re-samples of the original dataset. ExtraTrees are a variant of RandomForests with even more randomness.
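The searches above match on task descriptions as well as labels: `'bins'` and `'decision'` both find `BINNING` through its description text. The sketch below reproduces that behaviour with a case-insensitive substring match over a small hand-written catalog. It is an assumption about the general idea, not the matching logic `w.search_tasks` actually uses. `CATALOG` and `search_catalog` are hypothetical names, and the catalog text is copied (in part shortened) from outputs in this notebook.

```python
# Hypothetical miniature catalog: task code -> (label, description).
# The BENETCD2 and RFC descriptions are shortened from the outputs above.
CATALOG = {
    "BINNING": (
        "Binning of numerical variables",
        "Bin numerical values into non-uniform bins using decision trees",
    ),
    "BENETCD2": (
        "Elastic-Net Regressor (L1 / Least-Squares Loss) with Binned numeric features",
        "Bin numerical values into non-uniform bins using decision trees, "
        "followed by Elasticnet model using block coordinate descent",
    ),
    "RFC": (
        "ExtraTrees Classifier (Gini)",
        "Random Forests based on scikit-learn. Individual decision trees "
        "are fit to bootstrap re-samples of the original dataset.",
    ),
    "PDM3": (
        "One-Hot Encoding",
        "One-Hot (or dummy-variable) transformation of categorical features",
    ),
}


def search_catalog(query):
    """Return task codes whose label or description contains `query`."""
    needle = query.lower()
    return [
        code
        for code, (label, description) in CATALOG.items()
        if needle in label.lower() or needle in description.lower()
    ]


print(search_catalog("bins"))
print(search_catalog("decision"))
```

Because the match also covers descriptions, a query can surface tasks whose label never mentions the search term, as `RFC` does for `'decision'`.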
Short description
In [32]:
w.Tasks.PDM3.description
Out[32]:
'One-Hot (or dummy-variable) transformation of categorical features'
View task documentation
In [33]:
binning.documentation()
Out[33]:
'https://app.datarobot.com/model-docs/tasks/BINNING-Binning-of-numerical-variables.html'
View task parameter values
As an example, let's look at the binning task.
In [34]:
binning.get_task_parameter_by_name('max_bins')
Out[34]:
20
Modify task parameters
In [35]:
binning.set_task_parameters_by_name(max_bins=22)
Out[35]:
Binning of numerical variables (BINNING)
Input Summary: Missing Values Imputed (quick median) (PNI2)
Output Method: TaskOutputMethod.TRANSFORM
Task Parameters:
  max_bins (b) = 22
Set task parameters using the key
Alternatively, you can use the short name directly.
In [36]:
binning.task_parameters.b = 22
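The notebook addresses the same parameter two ways: by its full name (`max_bins`) and by its short key (`b`). One way to picture that relationship is a container that resolves either spelling to a single stored value. The class below is a toy sketch of that idea. `ParamStore` is a hypothetical name, and nothing here reflects how `datarobot_bp_workshop` implements `task_parameters`.

```python
class ParamStore:
    """Toy parameter container that accepts a full name or its short key."""

    def __init__(self, aliases):
        # aliases maps full name -> short key, e.g. {"max_bins": "b"}.
        self._key_for = dict(aliases)
        self._values = {}

    def _resolve(self, name):
        """Map a full name or a short key to the short key used for storage."""
        if name in self._key_for:
            return self._key_for[name]
        if name in self._key_for.values():
            return name
        raise KeyError(f"Unknown parameter: {name!r}")

    def set(self, name, value):
        self._values[self._resolve(name)] = value

    def get(self, name):
        return self._values[self._resolve(name)]


store = ParamStore({"max_bins": "b"})
store.set("max_bins", 22)  # set through the full name
print(store.get("b"))      # read back through the short key
```

Storing the value under one canonical key means both spellings always agree, whichever one was used last.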
Validate parameters
In [37]:
binning.task_parameters.b = -22
In [38]:
binning.validate_task_parameters()
Binning of numerical variables (BINNING)
Invalid value(s) supplied
  max_bins (b) = -22
    - Must be a 'intgrid' parameter defined by: [2, 500]
Out[38]:
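The validation message above rejects `-22` because `max_bins` is constrained to `[2, 500]`. A local pre-check along the same lines can catch such values before calling `save()`. The function below is a minimal sketch, assuming the bounds are inclusive integers. `validate_int_range` is a hypothetical helper, and the real `'intgrid'` type may carry rules this sketch does not model.

```python
def validate_int_range(name, value, low, high):
    """Return a list of error messages; an empty list means the value is valid.

    Hypothetical helper: assumes an inclusive integer range [low, high].
    """
    errors = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        errors.append(f"{name} = {value!r} - Must be an integer")
    elif not low <= value <= high:
        errors.append(f"{name} = {value!r} - Must be in [{low}, {high}]")
    return errors


print(validate_int_range("max_bins", -22, 2, 500))
print(validate_int_range("max_bins", 22, 2, 500))
```

Returning a list instead of raising on the first problem makes it easy to collect messages for several parameters and report them together.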
In [39]:
binning.set_task_parameters(b=22)
Out[39]:
Binning of numerical variables (BINNING)
Input Summary: Missing Values Imputed (quick median) (PNI2)
Output Method: TaskOutputMethod.TRANSFORM
Task Parameters:
  max_bins (b) = 22
Validate task parameters
In [40]:
binning.validate_task_parameters()
Binning of numerical variables (BINNING)
All parameters valid!
Out[40]:
Pass user_blueprint_id to update an existing blueprint in your personal repository.
In [41]:
blueprint_graph = keras_blueprint.save('A blueprint I made with the Python API (updated)', user_blueprint_id=user_blueprint_id)
In [42]:
assert user_blueprint_id == blueprint_graph.user_blueprint_id
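The assertion above relies on update-in-place behaviour: saving with an existing `user_blueprint_id` keeps the same identifier instead of creating a second entry. The in-memory sketch below illustrates that create-or-update pattern in isolation. `BlueprintRepo` is a hypothetical stand-in, and how DataRobot stores blueprints server-side is not shown in this notebook.

```python
import uuid


class BlueprintRepo:
    """Toy in-memory repository illustrating create-or-update by identifier."""

    def __init__(self):
        self._items = {}

    def save(self, name, graph, blueprint_id=None):
        if blueprint_id is None:
            # No identifier supplied: create a new entry.
            blueprint_id = uuid.uuid4().hex
        elif blueprint_id not in self._items:
            raise KeyError(f"No blueprint with id {blueprint_id!r}")
        # Existing identifier: overwrite the stored entry in place.
        self._items[blueprint_id] = {"name": name, "graph": graph}
        return blueprint_id

    def get(self, blueprint_id):
        return self._items[blueprint_id]

    def __len__(self):
        return len(self._items)


repo = BlueprintRepo()
first_id = repo.save("A blueprint", graph=["PNI2", "KERASC"])
second_id = repo.save("A blueprint (updated)", graph=["PNI2", "KERASC"], blueprint_id=first_id)
print(first_id == second_id, len(repo))
```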
Retrieve a blueprint
You can retrieve a blueprint from your saved blueprints.
In [43]:
w.get(user_blueprint_id).show()