DataRobot Ingress Reference Settings¶
Ingress in Kubernetes is an API object that manages external access to the services in a cluster, typically HTTP. For the Ingress resource to work, the cluster must have an ingress controller running. Cluster administrators are expected to select ingress controllers according to their requirements. There are dozens of ingress controllers available. Some controllers are maintained by the Kubernetes team (e.g., AWS, GCE, and Nginx controllers), while others are provided by third-party vendors.
DataRobot is compatible with a wide range of ingress controllers. Nginx Ingress, HAProxy Ingress, and OpenShift Ingress have been fully validated by the DataRobot team, but other controllers will also work as long as some basic requirements are met. These requirements are documented below.
If you don't know which ingress controller to pick, we recommend using Nginx Ingress.
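For readers new to the concept, the following is a minimal sketch of a standard Kubernetes Ingress object routed through an Nginx ingress class. The resource name, namespace, host, and backend service are hypothetical placeholders and are not DataRobot resources; DataRobot creates its own Ingress objects through its helm chart.

```yaml
# Illustrative only: all names below are hypothetical placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: example
spec:
  # Selects which ingress controller handles this Ingress.
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
```

Without a running ingress controller that matches `ingressClassName`, an object like this is accepted by the API server but no traffic is routed.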
Nginx Ingress Reference Settings¶
This section details the way DataRobot sets up the ingress-nginx controller using a helm chart. Client clusters may not use these exact settings, as each client will have its own administrator. This is provided as a reference and starting point for cluster admins who want to see how DataRobot does it.
Kubernetes provides extensive documentation about the controller here: https://github.com/kubernetes/ingress-nginx/tree/main/docs. This location includes user guides, examples, hardening recommendations, and many other details.
DataRobot uses the ingress-nginx chart. This chart comes from the kubernetes.github.io helm repo, as seen below. The internal DataRobot automation uses the latest chart available in the repo.
$ helm repo list
NAME            URL
ingress-nginx   https://kubernetes.github.io/ingress-nginx
This chart can be added to the client's local helm repo list with the following command:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
AWS¶
The Nginx Ingress Controller release can be configured with the values.yaml below.
controller:
  admissionWebhooks:
    enabled: false
  config:
    use-proxy-protocol: "true"
    worker-processes: "2"
  extraArgs:
    annotations-prefix: nginx.ingress.kubernetes.io
  ingressClassResource:
    default: true
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: <a list of DR internal tags>
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:DR_AWS_ACCOUNT_NUMBER:certificate/DR_ACM_CERTIFICATE_ID
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-subnets: <comma-separated list of AWS subnets IDs>
    ports:
      http: 80
    targetPorts:
      https: http
    type: LoadBalancer
defaultBackend:
  digest: null
  enabled: true
imagePullSecrets: []
The helm command to install it will be something like this, assuming the above values are inside a file named ingress-nginx_values.yaml:
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --values ingress-nginx_values.yaml --namespace ingress-nginx --debug
WARNING: By default, the Nginx Ingress Controller uses an AWS Classic Load Balancer.
NOTE: For further details, we recommend consulting the official documentation for deploying on AWS.
AWS Network Load Balancer¶
To use an AWS Network Load Balancer as the load balancer, the AWS Load Balancer Controller must be installed.
The Nginx Ingress Controller release can then be configured with the values.yaml below.
controller:
  admissionWebhooks:
    enabled: false
  config:
    use-forwarded-headers: "true"
    use-gzip: "true"
    use-proxy-protocol: "true"
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-subnets: <comma-separated list of AWS subnets IDs>
      service.beta.kubernetes.io/aws-load-balancer-type: external
    externalTrafficPolicy: Local
NOTE: For further details, we recommend consulting the official documentation for installing the AWS Load Balancer Controller and for using nginx-ingress with an NLB.
Azure¶
The Nginx Ingress Controller release can be configured with the values.yaml below.
controller:
  admissionWebhooks:
    enabled: false
  service:
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout: "30"
    externalTrafficPolicy: Local
    type: LoadBalancer
  watchIngressWithoutClass: true
The helm command to install it will be something like this, assuming the above values are inside a file named ingress-nginx_values.yaml:
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --values ingress-nginx_values.yaml --namespace ingress-nginx --debug
NOTE: For further details, we recommend consulting the official documentation for deploying on Azure
Azure Ingress-Terminated TLS with cert-manager¶
See the cert-manager and LetsEncrypt for Ingress Terminated TLS section for details on the cert-manager and LetsEncrypt settings DataRobot uses for ingress terminated TLS.
Google Cloud¶
The Nginx Ingress Controller release can be configured with the values.yaml below.
controller:
  admissionWebhooks:
    enabled: false
  service:
    externalTrafficPolicy: Local
    type: LoadBalancer
  watchIngressWithoutClass: true
The helm command to install it will be something like this, assuming the above values are inside a file named ingress-nginx_values.yaml:
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --values ingress-nginx_values.yaml --namespace ingress-nginx --debug
NOTE: For further details, we recommend consulting the official documentation for deploying on GCP
Google Cloud Ingress-Terminated TLS with cert-manager¶
See the cert-manager and LetsEncrypt for Ingress Terminated TLS section for details on the cert-manager and LetsEncrypt settings DataRobot uses for ingress-terminated TLS. Note that although Google Cloud provides its own certificate service, it cannot be integrated with the Nginx ingress controller.
Compatibility with Ingress Controllers¶
DataRobot leverages the standard Kubernetes Ingress and does not impose strict dependencies on particular Ingress Controllers. Nevertheless, certain DataRobot HTTP endpoints come with specific non-functional prerequisites, such as maximum body size or request timeout. Administrators may need to adjust either DataRobot's ingress annotations or the ingress controller configuration to meet these requirements effectively.
DataRobot is shipped with corresponding annotations for the following ingress controllers out of the box:
For other Ingress Controllers, administrators must ensure the following requirements are met:
- The Ingress Controller must support the websockets protocol.
- The apigateway Ingress should support requests up to 1124M in size that take up to 600 seconds.
- The core Ingress should support requests up to 20G in size that take up to 600 seconds.
- The nbx-ingress Ingress should support requests up to 1124M in size.
- The nbx-websockets Ingress should support requests up to 15M in size that take up to 3600 seconds.
- If the nbx-websockets-notebooks-deployment deployment is configured with two or more replicas (default: 1), "sticky sessions" should be implemented at the ingress side.
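As an illustration of how these requirements translate into controller-specific settings, the following is a minimal sketch for the Nginx Ingress Controller. It assumes the per-component ingress.annotations layout used in the HAProxy example in this document. The annotation names are standard ingress-nginx annotations, but the values are derived from the list above rather than copied from the configuration DataRobot ships, so treat them as a starting point only.

```yaml
# Sketch only: values derived from the requirements list, not from DataRobot's shipped settings.
apigateway:
  ingress:
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "1124m"    # maximum request body size
      nginx.ingress.kubernetes.io/proxy-read-timeout: "600"   # seconds
      nginx.ingress.kubernetes.io/proxy-send-timeout: "600"   # seconds
core:
  ingress:
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "20g"
      nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
      nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
nbx-ingress:
  ingress:
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "1124m"
nbx-websockets:
  ingress:
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "15m"
      # Long timeouts keep idle websocket connections open.
      nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
      nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
      # Cookie-based sticky sessions; only needed with two or more
      # nbx-websockets-notebooks-deployment replicas.
      nginx.ingress.kubernetes.io/affinity: "cookie"
      nginx.ingress.kubernetes.io/session-cookie-name: "nbx-websockets-session-persistence-cookie"
```

Nginx interprets the m and g size suffixes as binary units, so "1124m" is slightly larger than the 1124000000 bytes used in the HAProxy example; this errs on the permissive side of the requirement.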
OpenShift Ingress configuration example¶
DataRobot supports OpenShift Ingress; please use the following values.yaml as a starting point:
global:
  domain: "datarobot.apps.example.com"
  ingressClassName: openshift-default
  ingress:
    annotations:
      route.openshift.io/termination: edge
DataRobot's domain should be a subdomain of the domain name configured at the ingress controller side (please see the OpenShift Ingress documentation for more details).
We recommend terminating TLS at the ingress controller side by adding the annotation route.openshift.io/termination: edge. Additional annotations supported by OpenShift Ingress are documented on the Route configuration page in OCP docs.
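As a hedged example of such additional annotations, the sketch below raises the OpenShift router timeout to match the request durations listed in the requirements section. It rests on two assumptions that should be verified for your environment: that annotations set on an Ingress are carried over to the Route that OpenShift generates from it, and that per-component ingress.annotations (the layout shown in the HAProxy example in this document) take precedence over the global ones.

```yaml
# Sketch only: confirm annotation propagation and override behavior before relying on this.
global:
  ingress:
    annotations:
      route.openshift.io/termination: edge
      haproxy.router.openshift.io/timeout: 600s    # server-side timeout for the generated Route
nbx-websockets:
  ingress:
    annotations:
      haproxy.router.openshift.io/timeout: 3600s   # longer timeout for websocket traffic
```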
HAProxy Ingress configuration example¶
The following values.yaml snippet implements the requirements above for HAProxy Ingress. This configuration is already shipped with the app and should be treated as informational; it can be used as a reference for configuring other ingress controllers.
global:
  ingressClassName: haproxy
  ingress:
    annotations: {} # additional global ingress annotations go here
apigateway:
  ingress:
    annotations:
      haproxy-ingress.github.io/proxy-body-size: "1124000000"
      haproxy-ingress.github.io/timeout-server: "600s"
      haproxy-ingress.github.io/timeout-server-fin: "600s"
      haproxy-ingress.github.io/timeout-client: "600s"
      haproxy-ingress.github.io/timeout-client-fin: "600s"
      haproxy-ingress.github.io/timeout-http-request: "600s"
core:
  ingress:
    annotations:
      haproxy-ingress.github.io/proxy-body-size: "20000000000"
      haproxy-ingress.github.io/timeout-server: "600s"
      haproxy-ingress.github.io/timeout-server-fin: "600s"
      haproxy-ingress.github.io/timeout-client: "600s"
      haproxy-ingress.github.io/timeout-client-fin: "600s"
      haproxy-ingress.github.io/timeout-http-request: "600s"
nbx-ingress:
  ingress:
    annotations:
      haproxy-ingress.github.io/proxy-body-size: "1124000000"
nbx-websockets:
  ingress:
    annotations:
      haproxy-ingress.github.io/proxy-body-size: "15000000"
      haproxy-ingress.github.io/timeout-server: "3600s"
      haproxy-ingress.github.io/timeout-server-fin: "3600s"
      haproxy-ingress.github.io/timeout-client: "3600s"
      haproxy-ingress.github.io/timeout-client-fin: "3600s"
      haproxy-ingress.github.io/timeout-http-request: "3600s"
      cookie-persistence: "nbx-websockets-session-persistence-cookie"