Distributed Feature Discovery¶
Distributed feature discovery is an optional feature that can be enabled on demand. DataRobot currently supports it only on AWS installations. DataRobot uses the AWS EMR Serverless 7.3.0 base Docker image, which is currently exposed to multiple CVEs: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-730-common-vulnerabilities-exposures.html . Make sure that this complies with your company's policies.
Dependencies¶
- All requirements for enabling the Spark Batch feature in the compute Spark service are satisfied
- The image is present in ECR as described here
- The environment variables are configured as shown in the table below
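One way to confirm the second dependency is to ask ECR whether the repository contains any images. The sketch below only builds the AWS CLI command for that check from a registry host such as the value of CSP_SPARK_SAFER_CUSTOM_IMAGE_HOST. The helper name `describe_images_command` is hypothetical and not part of DataRobot, and the host pattern is an assumption based on the usual shape of private ECR registry hosts.

```python
import re
from typing import List

# Private ECR registry hosts usually look like
# <account>.dkr.ecr.<region>.amazonaws.com (assumption of this sketch).
# The account part is matched as "one or more digits" so that the sample
# host from this guide also parses.
ECR_HOST = re.compile(
    r"^(?P<account>\d+)\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


def describe_images_command(host: str, repo: str = "spark-batch-image") -> List[str]:
    """Build an `aws ecr describe-images` command for the given registry host.

    Hypothetical helper: it only assembles the argument list and does not
    call AWS itself.
    """
    match = ECR_HOST.match(host)
    if match is None:
        raise ValueError(f"not an ECR registry host: {host!r}")
    return [
        "aws", "ecr", "describe-images",
        "--registry-id", match["account"],
        "--repository-name", repo,
        "--region", match["region"],
    ]


if __name__ == "__main__":
    # Placeholder host taken from the example in this guide.
    cmd = describe_images_command("11122233334444.dkr.ecr.us-east-1.amazonaws.com")
    print(" ".join(cmd))
```

Running the printed command with credentials for the target account should list the images in the repository. An empty result or an error suggests the image has not been pushed yet.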
Configuration Values¶
To configure these options, refer to the Tuning DataRobot Environment Variables section of this guide.
| Setting | Description | Default |
|---|---|---|
| CSP_SPARK_SAFER_CUSTOM_IMAGE_HOST | ECR registry used to store spark-batch-image. Example: 11122233334444.dkr.ecr.us-east-1.amazonaws.com | None |
| CSP_SPARK_SAFER_CUSTOM_IMAGE_REPO | Repository name in ECR | spark-batch-image |
| ENABLE_SAFER_DISTRIBUTED_MODE | Flag to enable distributed feature discovery | false |
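As an illustration of how the three settings relate, here is a minimal pre-flight check. It assumes the values are available as a plain mapping of strings (for example `os.environ`) and that the flag is written as `true` or `false`. The function name `check_distributed_fd_env` is hypothetical, and how DataRobot itself parses these variables is not covered here.

```python
import os
from typing import List, Mapping

# Default repository name from the configuration table.
DEFAULT_REPO = "spark-batch-image"


def check_distributed_fd_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems: List[str] = []

    # Assumption: the flag is spelled "true" or "false" (default: false).
    flag = env.get("ENABLE_SAFER_DISTRIBUTED_MODE", "false").strip().lower()
    if flag not in ("true", "false"):
        problems.append("ENABLE_SAFER_DISTRIBUTED_MODE should be 'true' or 'false'")
    if flag != "true":
        # Feature is off (or the flag is invalid), so the image settings are not checked.
        return problems

    # The registry host has no default, so it must be set explicitly.
    host = env.get("CSP_SPARK_SAFER_CUSTOM_IMAGE_HOST", "").strip()
    if not host:
        problems.append("CSP_SPARK_SAFER_CUSTOM_IMAGE_HOST has no default and must be set")
    elif "/" in host:
        # Also catches values that include a scheme such as https://.
        problems.append("CSP_SPARK_SAFER_CUSTOM_IMAGE_HOST should be a bare registry host")

    # The repository falls back to its documented default when unset.
    repo = env.get("CSP_SPARK_SAFER_CUSTOM_IMAGE_REPO", DEFAULT_REPO).strip()
    if not repo:
        problems.append("CSP_SPARK_SAFER_CUSTOM_IMAGE_REPO must not be empty")

    return problems


if __name__ == "__main__":
    for problem in check_distributed_fd_env(os.environ):
        print(problem)
```

The check deliberately stops early when the flag is off, since the image host and repository only matter once distributed mode is enabled.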
Enable feature¶
The feature can be enabled per user:
- Log in to the application
- Click on the Settings icon
- Select Feature Access
- Search for Enable Feature Discovery in Distributed Mode and make sure it's enabled.