Recommended monitoring endpoints¶
The following URLs are recommendations only. DataRobot supports a variety of cluster sizes and layouts. Work with DataRobot Customer Success to determine the best monitoring solution so your users aren't affected by operational issues.
Services¶
Monitor the following service routes:
| Service | URL |
|---|---|
| GUI Application Health | /v1/health/?service=app&text=true |
| Upload Service Health | /v1/health/?service=appupload&text=true |
| Data Ingestion Manager Health | /v1/health/?service=datasetsserviceapi&text=true |
| Internal API Endpoint | /v1/health/?service=internalapi&text=true |
| Public API Server | /v1/health/?service=publicapi&text=true |
| Mongo Services Health | /v1/health/?service=mongo&text=true |
| Redis Service Health | /v1/health/?service=redis&text=true |
| Synchronous Prediction API | /v1/health/?service=predictionapi&text=true |
| Dedicated Prediction API Status | /v1/health/?service=dedicatedpredictionnginx&text=true |
| ModMon rsyslog Master | /v1/health/?service=modmonrsyslogmaster&text=true |
| ModMon database | /v1/health/?service=pgsql&text=true |
| Rabbit Status | /v1/health/?service=rabbit&text=true |
| Tableau Extension Status | /v1/health/?service=tableauextension&text=true |
| Worker Task Manager Status | /v1/health/?service=taskmanager&text=true |
| Elasticsearch | /v1/health/?service=elasticsearch&text=true |
Note
If you aren't sure what type of prediction service your cluster supports, contact DataRobot Support.
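As a minimal sketch, the service routes in the table above could be polled with a short Python script. The base URL `https://datarobot.example.com` and the helper names are hypothetical. The sketch also assumes that a 2xx HTTP status indicates a healthy service; the response body format is not described here, so it is not inspected.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical base URL; replace with the address of your own cluster.
BASE_URL = "https://datarobot.example.com"

# Service identifiers copied from the "Services" table.
SERVICES = [
    "app", "appupload", "datasetsserviceapi", "internalapi", "publicapi",
    "mongo", "redis", "predictionapi", "dedicatedpredictionnginx",
    "modmonrsyslogmaster", "pgsql", "rabbit", "tableauextension",
    "taskmanager", "elasticsearch",
]


def service_health_url(service, base_url=BASE_URL):
    """Build the health route for one service, e.g. ?service=app&text=true."""
    query = urlencode({"service": service, "text": "true"})
    return f"{base_url}/v1/health/?{query}"


def is_healthy(url, timeout=10, opener=urlopen):
    """Return True on a 2xx response.

    Assumption: the HTTP status alone signals health. Connection errors,
    timeouts, and HTTP error statuses are all treated as unhealthy.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


def check_all_services(services=SERVICES, check=is_healthy):
    """Map each service name to its health result."""
    return {name: check(service_health_url(name)) for name in services}
```

Calling `check_all_services()` from a scheduler such as cron returns a dictionary that can be forwarded to whatever alerting tool you already use. Trim `SERVICES` to the services that actually run on your cluster.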
Test jobs¶
Regularly check the following routes for health, as they indicate whether the full end-to-end worker systems are functioning correctly:
| End-to-End Test | URL |
|---|---|
| Modeling Jobs | /v1/health/?name=Secure%20Worker%20Ping%20Job&text=true |
| Data Analysis Jobs | /v1/health/?name=EDA%20Worker%20Ping%20Job&text=true |
| Dedicated Prediction route | /v1/health/?name=Dedicated%20Prediction%20Server%20Health%20route&text=true |
Host-based checks¶
Include a check against each host in your cluster so you can quickly determine where a failure has occurred:
/v1/health/?address=myhost.com
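A per-host check can be sketched as a loop over the hosts in your cluster. The host names, base URL, and function names below are hypothetical, and the `check` callable stands in for whatever probe your monitoring tool provides (for example, an HTTP GET that returns `True` on success).

```python
from urllib.parse import urlencode

# Hypothetical values; replace with your cluster's address and host names.
BASE_URL = "https://datarobot.example.com"
HOSTS = ["node1.example.com", "node2.example.com", "node3.example.com"]


def host_health_url(host, base_url=BASE_URL):
    """Build the host-based health route, e.g. ?address=myhost.com."""
    return f"{base_url}/v1/health/?{urlencode({'address': host})}"


def failing_hosts(hosts, check):
    """Return the hosts whose health check reports a failure.

    `check` is any callable that takes a URL and returns True when the
    host is healthy; how health is decided is left to the caller.
    """
    return [host for host in hosts if not check(host_health_url(host))]
```

Keeping one monitor per host means an alert names the failing machine directly, instead of only reporting that some part of the cluster is unhealthy.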
Fine-grained checks¶
Use separate monitors for important, business-sensitive applications such as dedicated prediction nodes.
Further support¶
DataRobot maintains a cloud deployment with a high number of simultaneous users and exceptional rates of simultaneous modeling and predictions. If you have questions about how to set up monitoring on your cluster or would like advice on how your particular situation could be better monitored, contact DataRobot Support.