# Platform

> Platform - Questions having to do with the DataRobot underlying platform and general model building
> architectures.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.679806+00:00` (UTC).

## Primary page

- [Platform](https://docs.datarobot.com/en/docs/reference/robot-to-robot/rr-platform.html): Full documentation for this topic (HTML).

## Sections on this page

- [What is an end-to-end ML platform?](https://docs.datarobot.com/en/docs/reference/robot-to-robot/rr-platform.html#what-is-an-end-to-end-ml-platform): In-page section heading.
- [Single-tenant vs multi-tenant SaaS?](https://docs.datarobot.com/en/docs/reference/robot-to-robot/rr-platform.html#single-tenant-vs-multi-tenant): In-page section heading.
- [What is Kubernetes and why is running it natively important?](https://docs.datarobot.com/en/docs/reference/robot-to-robot/rr-platform.html#what-is-kubernetes-and-why-is-running-it-natively-important): In-page section heading.
- [CPUs vs GPUs](https://docs.datarobot.com/en/docs/reference/robot-to-robot/rr-platform.html#cpus-vs-gpus): In-page section heading.

## Related documentation

- [Reference documentation](https://docs.datarobot.com/en/docs/reference/index.html): Linked from this page.
- [ELI5](https://docs.datarobot.com/en/docs/reference/robot-to-robot/index.html): Linked from this page.

## Documentation content

# Platform

## What is an end-to-end ML platform?

Think of it as baking a loaf of bread. If you take ready-made bread mix and follow the recipe, but someone else eats it, that's not end-to-end. If you harvest your own wheat, mill it into flour, make your loaf from scratch (flour, yeast, water, etc.), try out several different recipes, take the best loaf, eat some of it yourself, and then watch to see whether it goes moldy, that's end-to-end.

## Single-tenant vs multi-tenant SaaS?

DataRobot supports both single-tenant and multi-tenant SaaS. Here's what that means.

**ELI5:**
Single-tenant: You rent an apartment. When you're not using it, neither is anybody else. You can leave your stuff there without being concerned that others will mess with it.

Multi-tenant: You stay in a hotel room.

Multi-tenant: Imagine a library with many individual, locked rooms, where every reader has a designated room for their personal collection, but the core library collection at the center of the space is shared, allowing everyone to access those resources. For the most part, you have plenty of privacy and control over your personal collection, but there's only one copy of each book at the center of the building, so it's possible for someone to check out the entire collection on a particular topic, leaving others to wait their turn.

Single-tenant: Imagine a library network of many individual branches, where each branch carries a complete collection while still providing private rooms. Readers don't need to share the central collection of their branch with others, but the branches are maintained by the central library committee, ensuring that the contents of each branch are regularly updated for all readers.

Self-managed: Some readers don't want to use our library space and instead want to make a copy to use in their own home. These folks make a copy of the library and resources and take them home, and then maintain them on their own schedule with their own personal resources. This gives them even more privacy and control over their content, but they lose the convenience of automated updates, new books, and library management.

**Robot-to-robot:**
`Robot 1`

What do we mean by single-tenant and multi-tenant SaaS? Especially with respect to the DataRobot cloud?

`Robot 2`

Single-tenant and multi-tenant generally refer to the architecture of a software-as-a-service (SaaS) application. In a single-tenant architecture, each customer has their own dedicated instance of the DataRobot application. This means that their DataRobot is completely isolated from other customers, and the customer has full control over their own instance of the software (it is self-managed). In our case, these deployment options fall in this category:

- Virtual Private Cloud (VPC), customer-managed
- AI Platform, DataRobot-managed

In a multi-tenant SaaS architecture, multiple customers share a single instance of the DataRobot application, running on a shared infrastructure. This means that the customers do not have their own dedicated instance of the software, and their data and operations are potentially stored and running alongside other customers, while still being isolated through various security controls. This is what our DataRobot Managed Cloud offers.

In a DataRobot context, multi-tenant SaaS is a single core DataRobot app (app.datarobot.com) running on a core set of instances/nodes. All customers use the same job queue and resource pool.

In single-tenant, we instead run a custom environment for each customer and connect to it over a private connection. This means that resources are dedicated to a single customer, which allows for more restriction of access and more customizability.

`Robot 3`

- Single-tenant = We manage a cloud install for one customer.
- Multi-tenant = We manage multiple customers on one instance; this is https://app.datarobot.com/

`Robot 2`

In a single-tenant environment, one customer's resource load is isolated from any other customer's, which avoids someone's extremely large and resource-intensive job affecting others. That said, we isolate our workers in multi-tenant too, so even if a large job is running for one user, it doesn't affect other users. We also have worker limits to prevent one user from hogging all the workers.

`Robot 1`

Ah okay, I see...

`Robot 2`

Single-tenant's more rigid separation is a way to balance the benefits of self-managed (privacy, dedicated resources, etc.) and the benefits of cloud (you don't have to maintain your own servers/hardware, software updates and general maintenance are handled by DataRobot, etc.).

`Robot 1`

Thank you very much Robot 2 (and 3)... I understand this concept much better now!

`Robot 2`

Glad I could help clarify it a bit! Note that I'm not directly involved in single-tenant development, so I don't have details on how we're implementing it, but this is accurate as to the general motivation to host single-tenant SaaS options alongside our multi-tenant environments.
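
The worker-limit idea from the conversation above can be illustrated with a small sketch. This is not DataRobot's scheduler; `schedule`, `total_workers`, and `per_user_limit` are hypothetical names chosen for illustration, and the only assumption taken from the discussion is that all users share one job queue and one worker pool, with a cap on how many workers a single user can hold.

```python
def schedule(jobs, total_workers, per_user_limit):
    """Assign queued jobs to a shared worker pool, capping workers per user.

    jobs: list of (user, job_name) tuples in queue order (hypothetical shape).
    Returns two lists: jobs that got a worker, and jobs left waiting.
    """
    running, waiting = [], []
    in_use = {}  # user -> number of workers that user currently holds
    for user, job in jobs:
        pool_has_room = len(running) < total_workers
        user_has_room = in_use.get(user, 0) < per_user_limit
        if pool_has_room and user_has_room:
            running.append((user, job))
            in_use[user] = in_use.get(user, 0) + 1
        else:
            waiting.append((user, job))
    return running, waiting


# One user submits three jobs ahead of another user's single job.
queue = [("alice", "a1"), ("alice", "a2"), ("alice", "a3"), ("bob", "b1")]
running, waiting = schedule(queue, total_workers=4, per_user_limit=2)
print(running)  # alice holds two workers, bob still gets one
print(waiting)  # alice's third job waits even though a worker is free
```

In this toy model, the per-user cap is what keeps one user from taking the whole pool. A single-tenant setup would correspond to giving each customer a separate pool altogether.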


## What is Kubernetes and why is running it natively important?

Kubernetes is an open source platform for hosting applications and scheduling dynamic application workloads.

Before Kubernetes, most applications were hosted by launching individual servers and deploying software to them—that's your database node, your webserver node, etc.

Kubernetes uses container technology and a control plane to abstract the individual servers, allowing application deployments to easily change size in response to load and handle common needs like rolling updates, automatic recovery from node failure, etc.
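
As a rough illustration of "changing size in response to load," the sketch below uses the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of the observed metric to the target metric, rounded up. It is simplified (no tolerance band, no minimum or maximum bounds), the function name and example numbers are illustrative assumptions, and it says nothing about how DataRobot itself configures scaling.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Simplified Horizontal Pod Autoscaler rule:
    ceil(current_replicas * current_metric / target_metric).
    """
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 replicas averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(4, 90, 60))  # 6
# 4 replicas averaging 30% CPU against a 60% target: scale down.
print(desired_replicas(4, 30, 60))  # 2
```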

It's important for DataRobot to run natively on Kubernetes because Kubernetes has become the world's most popular application hosting platform. Users' infrastructure teams have Kubernetes clusters and want to deploy third-party vendor software to them rather than maintaining bespoke virtual machines for every application. This means easier installation because many infrastructure teams already know how to set up or provide a Kubernetes cluster.

Interesting links:

["Smooth Sailing with Kubernetes"](https://cloud.google.com/kubernetes-engine/kubernetes-comic)

## CPUs vs GPUs

Here’s a good image from NVIDIA that helps to compare CPUs to GPUs.

CPUs are designed to coordinate and calculate a bunch of math. They have a bunch of routing set up, and they have drivers (or operating systems) built to make that pathing and organizing as easy as the simple calculations. Because they're designed to be a "brain" for a computer, they're built to do it all.

GPUs are designed to be specialized for, well, graphics, hence the name. To quickly render video and 3D graphics, you want a bunch of very simple calculations performed all at once. Instead of having one "thing" [a CPU core] calculating the color for every pixel of a 1920x1080 display [a total of 2,073,600 pixels], maybe you have 1920 "things" [GPU cores], each dedicated to one vertical line of 1080 pixels and all running in parallel.

"Split this hex code for this pixel's color into separate R, G, and B values and send them to the screen's pixel matrix" is a much simpler task than, say, the "convert this video file into a series of frames, combine them with the current display frame of this other application, be prepared to interrupt this task to catch and respond to keyboard/mouse input, and keep this background process running the whole time..." tasks that a CPU might be doing. Because of this, an individual GPU core can be slower and more limited than a CPU core while still being useful, and it might have unique methods to complete its calculations so it can be specialized for a particular purpose [3D rendering takes more flexibility than "display to screen"]. Maybe it only knows very simple conversions or can't keep track of what it used to be doing; "history" isn't always useful for displaying graphics, especially if there's a CPU and a buffer [RAM] keeping track of history for you.
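
To make the "simple task" concrete, here is a minimal Python sketch of the hex-splitting step described above. The function name `split_hex` and the sample colors are illustrative assumptions. Python runs these calls one after another, so this only shows how small and independent each per-pixel calculation is; a GPU would hand many such calculations to many simple cores at once.

```python
def split_hex(color):
    """Split a hex color such as '#1E90FF' into (R, G, B) integers."""
    value = color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


# A tiny "row" of three pixels; each pixel is handled independently.
row = ["#FF0000", "#00FF00", "#1E90FF"]
rgb_row = [split_hex(c) for c in row]
print(rgb_row)  # [(255, 0, 0), (0, 255, 0), (30, 144, 255)]

# Number of such independent calculations for one 1920x1080 frame.
pixels_per_frame = 1920 * 1080
print(pixels_per_frame)  # 2073600
```

No call depends on the result of another, which is the property that lets this kind of work be spread across many cores.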

Since CPUs are meant to be usable for a lot of different things, there tend to be a lot of operating systems/drivers to translate between the higher-level code I might write and the machine's specific registers and routing. But since a GPU is made with the default assumption "this is going to make basic graphics data more scalable," it often has more specialized machine functionality, and its drivers can be much more limited in many cases. It might be harder to find a translator that can tell the GPU how to do the very specific thing that would be helpful in a specific use case, compared with the multiple helpful translators ready to explain to your CPU how to do what you need.
