Codespace sessions

This page outlines how to start a codespace session, upload files to a codespace, and manage its contents. You can view and manage codespaces from the Notebooks tab of the Use Case home page.

Although you cannot share individual codespaces directly with other users, in Workbench you can share Use Cases that contain codespaces. Therefore, to share a codespace with another user, you must share the entire Use Case so that they have access to all associated assets.

Start a codespace session

To manage the contents of the codespace file system or edit and execute its files, you must first start the codespace's environment. Click the environment icon to configure the environment. The environment image determines the coding language, dependencies, and open-source libraries used in the notebook. The default image for a codespace is a pre-built Python image. To see the list of all packages available in the default image, hover over that image in the Environment tab:

In addition to built-in environments, you can also use a custom environment for the codespace session by selecting it from the Environment dropdown.

To begin a codespace session, start the environment by toggling it on in the toolbar.

Wait a moment for the environment to initialize, and once it displays the Started status, you can begin working with the codespace. When the codespace session begins, the file system volume is mounted to the path /home/notebooks/storage/.
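
As a quick check from a notebook cell, the following minimal sketch lists the top-level contents of the mounted volume. The helper name list_storage is hypothetical; only the mount path comes from the text above. It returns an empty list when the path is not present, for example when run outside an active codespace session.

```python
from pathlib import Path

# Mount point stated in the documentation above.
STORAGE_ROOT = Path("/home/notebooks/storage")


def list_storage(root: Path = STORAGE_ROOT) -> list[str]:
    """Return the sorted names of the top-level entries under root.

    Returns an empty list if root does not exist or is not a directory.
    """
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


print(list_storage())
```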

Once the session is running and the file system volume is mounted, you can upload files and folders to the file system and manage them in the side panel. You can also create new folders, notebook files, and non-notebook files from the file browser UI. For example, to create a new notebook, click the “Create notebook” icon.

For AI Platform users (Self-Managed users excluded), DataRobot provides backup functionality and retention policies for codespaces. DataRobot takes snapshots of the codespace volume on session shutdown and on codespace deletion, and retains them for 30 days so that you can restore the codespace data.

Codespace environment variables

For codespace entities, environment variables are defined at the codespace level and not the individual notebook file level. When a codespace session is started, DataRobot sets all environment variables defined in the Environment Variables tab. You can retrieve these environment variables, via code, from any notebooks in the codespace file system.
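
Because these values are set as ordinary environment variables, standard Python can read them from a notebook cell. The sketch below is one way to do that; the key MY_API_ENDPOINT and the helper get_setting are hypothetical names used only for illustration.

```python
import os
from typing import Optional


def get_setting(key: str, default: Optional[str] = None) -> Optional[str]:
    """Read one environment variable, falling back to default if it is unset."""
    return os.environ.get(key, default)


# MY_API_ENDPOINT is a hypothetical key; replace it with a key that you
# defined in the Environment Variables tab.
endpoint = get_setting("MY_API_ENDPOINT", default="https://example.com")
print(endpoint)
```

Supplying a default keeps the notebook runnable when the variable has not been defined for the codespace.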

To access environment variables, click the lock icon in the sidebar. Then click Create new entry.

In the dialog box, enter the key and value for a single entry; optionally provide a description.

To add multiple variables, select Bulk import. In the entry window, enter each variable, on a new line, in the following format:

KEY=VALUE # DESCRIPTION

Note

Any existing environment variable with the same key will have its value overwritten by the new value specified.
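
To make the bulk import format concrete, the following sketch parses entries of the form KEY=VALUE # DESCRIPTION into a dictionary. It is only an illustration of the format, not DataRobot's parser: the function name parse_bulk_entries is hypothetical, and treating the first # on a line as the start of the description is an assumption. Later entries overwrite earlier ones with the same key, mirroring the note above.

```python
def parse_bulk_entries(text: str) -> dict[str, tuple[str, str]]:
    """Parse lines of the form KEY=VALUE # DESCRIPTION.

    Returns a mapping of key -> (value, description). A later line
    overwrites an earlier line that uses the same key.
    """
    entries: dict[str, tuple[str, str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the first "#" starts the optional description.
        assignment, _, description = line.partition("#")
        key, separator, value = assignment.partition("=")
        if not separator or not key.strip():
            raise ValueError(f"Expected KEY=VALUE, got: {raw_line!r}")
        entries[key.strip()] = (value.strip(), description.strip())
    return entries


# Hypothetical sample entries.
sample = """
API_URL=https://example.com # Service endpoint
TIMEOUT=30
TIMEOUT=60 # Overrides the earlier value
"""
print(parse_bulk_entries(sample))
```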

When you have finished adding environment variables, click Save.

Add files to a codespace

Each codespace has its own persistent file system. You can upload any number of files and folders to the codespace file system using the file browser in the side panel. To work with notebooks, upload .ipynb files.

You must start the codespace's environment before uploading files to it. For more information on how to configure an environment, see the Manage the notebook environment documentation.

After starting the environment, upload files or folders from your local machine by dragging and dropping them into the upload modal, or by clicking Upload and selecting File or Folder.

Create files

You can create new folders, notebook files, and non-notebook files directly from a codespace. Navigate to the Codespace files panel and use the icons described in the following table.

#   Field              Description
1   Create notebook    Creates an executable .ipynb file. Provide a name for the notebook file.
2   Create file        Creates a new file in the current folder of the codespace. Provide a name and specify the file type with an extension.
3   Create directory   Creates a folder within the current folder of the codespace. Provide a name for the folder.

After you create a file, it appears in the codespace folder.

Manage files

You can manage and edit files from directly within a codespace. Select the Actions menu to the right of the file you want to edit to view the available actions.

Work with notebooks in a codespace

To edit and execute a notebook within a codespace, double-click a notebook file (.ipynb) in the file browser to open it.

The codespace interface supports a tabbed experience, so you can open, view, and edit multiple files at the same time. Similar to Jupyter, opening a notebook file will start a kernel process for that notebook. Each opened notebook will run in its own kernel.

DataRobot indicates which notebooks are running in active kernels with the purple notebook icon in the file browser. Inactive notebooks use a white icon.

To shut down a kernel, open the Actions menu for a notebook file and select Shut down kernel.

Work with non-notebook files

In addition to editing and executing notebooks, codespaces provide a text editor for viewing and editing other file types. For example, you can view image files and edit Python utility scripts.

Use Git with codespaces

Notebooks are represented as .ipynb files in a codespace, which allows you to version both your notebook and non-notebook files using an external Git repository. In addition to initializing a new codespace from a Git repo, you can directly use the Git CLI in a codespace's integrated terminal during an active session.

To create a new terminal instance in a codespace session, first ensure that the codespace environment is running.

From the sidebar, select the terminal icon at the bottom of the page to create a new terminal window.

Once the terminal session is running, you can use Git CLI commands like git pull or git push to sync the codespace with a remote repository and push your changes.

Note

If you push changes to a remote repo, you will be prompted in the terminal to configure and authenticate your Git account.

For example, you can use git clone to clone a Git repo to the codespace file system, as shown below.

Work with a private GitHub repository

To work with a private repository in a codespace, unlike a public one, you must first authenticate with GitHub. DataRobot recommends the GitHub CLI as the easiest way to authenticate. However, use this approach only if you don't plan to collaborate on the codespace with other users, because only one personal access token can be assigned to the GH_TOKEN codespace environment variable at a time. If other users plan to start and edit this codespace, consider using the Git CLI authentication method instead.

Authenticate with GitHub CLI

  1. Create a GitHub personal access token (PAT) that works with DataRobot SSO. Follow the steps provided by GitHub to create the PAT. You must use a GitHub account associated with the “datarobot” organization.
  2. Authorize the PAT associated with your organization for use with SAML single sign-on (if necessary).
  3. Create a new codespace by selecting Add new > Codespace > Add codespace.
  4. In the Environment Variables side panel, create a codespace environment variable with the key GH_TOKEN and the value set to the PAT created in previous steps.
  5. Start the codespace session. Once the codespace is running, create a terminal session.
  6. In the newly created terminal instance, install the GitHub CLI (gh) with the following command:

    curl -sS https://webi.sh/gh | sh

  7. DataRobot sets the GH_TOKEN environment variable at the start of the codespace session. gh automatically authenticates using that token. You can immediately proceed to cloning your repo to the codespace file system using the following gh command:

    gh repo clone <repository>

  8. Once cloning completes, the contents of your repository appear in the codespace file manager.
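
Before running gh repo clone, it can help to confirm that both prerequisites from the steps above are in place. The following is a minimal sketch, runnable from a notebook cell; the function name github_cli_problems is hypothetical. It checks only that GH_TOKEN is set to a non-empty value and that a gh executable is on PATH. It does not validate the token with GitHub.

```python
import os
import shutil
from typing import Mapping


def github_cli_problems(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems that would prevent gh from authenticating."""
    problems = []
    if not env.get("GH_TOKEN"):
        problems.append("GH_TOKEN is not set")
    # Look for the gh executable on the PATH taken from the given environment.
    if shutil.which("gh", path=env.get("PATH", "")) is None:
        problems.append("gh was not found on PATH")
    return problems


issues = github_cli_problems(os.environ)
print(issues if issues else "GH_TOKEN is set and gh is installed")
```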

Authenticate with Git CLI

  1. Create a GitHub personal access token (PAT) and verify its compatibility with single sign-on (if necessary). Follow the steps provided by GitHub to create the PAT. You must use a GitHub account associated with your organization.
  2. Authorize the PAT associated with your organization for use with SAML single sign-on (if necessary).
  3. Create a new codespace by selecting Add new > Codespace > Add codespace.
  4. Start the codespace session. Once the codespace is running, create a terminal session.
  5. In the terminal, cd into the “storage” directory before cloning a repo.

  6. Clone the repo with git clone as you would in a Bash-like environment. When you are prompted for a username and password, provide the PAT you configured to work with DataRobot SSO in step 1 as the password.

  7. Once cloning completes, the contents of your repository appear in the codespace file manager.

Persistent dependency installations

When you install custom dependencies into a codespace at runtime during an active session, Python and pip dependencies and HuggingFace artifacts persist across sessions if they are installed to the user virtual environment. Persistent dependency installation is supported only when a Python-based image is used for the codespace session.

A codespace has two virtual environments:

  • The user virtual environment (/home/notebooks/storage/.venv): A virtual environment that persists any custom dependencies installed at runtime across codespace sessions. This venv is associated with the codespace (it is stored in the codespace file system), so all users who have access to the codespace get the same set of persisted pip installations when they start a codespace session.

  • The kernel virtual environment (/etc/system/kernel/.venv): A built-in image virtual environment that holds all dependencies that DataRobot provides as part of the selected notebook environment image. This virtual environment does not persist custom dependencies that you install into it; it keeps them only for the duration of the session.

Although you cannot directly access the user virtual environment from the UI (it is not shown in the codespace file system panel), you can access it via the terminal. Custom dependencies installed there persist across sessions. Install them with !pip install <PACKAGE_NAME> from a notebook cell, or with pip install <PACKAGE_NAME> (without the leading !) from the terminal.

By default, all new dependencies you install via !pip install will go into the user venv. The user venv takes precedence if you install a different version of a library DataRobot provides as a part of the built-in notebook image.

Installation considerations for DataRobot packages

If you use the dependency installation feature to install the datarobotx package, or a version of the datarobot package newer than the one preinstalled in the built-in images, you can break preinstalled packages that install themselves into the datarobot package (e.g., datarobot-mlops-connected-client). This happens because any package or package version that you install in an active codespace session goes into the user virtual environment, which takes precedence over the kernel virtual environment. For example, installing datarobotx creates a datarobot package in the user venv, so the datarobot package in the kernel venv can no longer be resolved. However, pip still reports datarobot-mlops-connected-client as installed.

To resolve this error, run pip install with --force-reinstall for every shadowed library that extends the datarobot package (e.g., datarobot-mlops-connected-client). For example: pip install datarobot-mlops-connected-client --force-reinstall.

To check where a dependency was imported from, run the following command in a notebook cell or in the terminal:

pip list -v
Package                          Version      Location                                             Installer
-------------------------------- ------------ ---------------------------------------------------- ---------
...
datarobot                        3.3.0        /etc/system/kernel/.venv/lib/python3.9/site-packages pip
...
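
As a complement to the pip list -v output above, the following sketch asks Python's import system where a package would be loaded from and labels the result by venv. The helper names classify_location and package_origin are hypothetical; the two venv paths are the ones given earlier in this section. It uses only the standard library and does not import the package.

```python
import importlib.util
from typing import Optional

# Paths of the two virtual environments described earlier in this section.
USER_VENV = "/home/notebooks/storage/.venv"
KERNEL_VENV = "/etc/system/kernel/.venv"


def classify_location(path: Optional[str]) -> str:
    """Label a file path as belonging to the user venv, kernel venv, or other."""
    if not path:
        return "other"
    if path.startswith(USER_VENV + "/"):
        return "user venv"
    if path.startswith(KERNEL_VENV + "/"):
        return "kernel venv"
    return "other"


def package_origin(name: str) -> tuple[Optional[str], str]:
    """Return (path, label) for a top-level package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, "not found"
    path = spec.origin
    # Namespace packages have no origin; fall back to their first search path.
    if path is None and spec.submodule_search_locations:
        path = next(iter(spec.submodule_search_locations))
    return path, classify_location(path)


print(package_origin("datarobot"))
```

If a package is labeled "user venv" when you expected the built-in copy, a runtime installation is shadowing the version from the kernel venv.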

To check the size of a user virtual environment, run the following command in the terminal.

(.venv) [notebooks@kernel ~/storage]$ du -h . -d 1
14M     ./.venv
14M     .
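
The same check can be approximated from a notebook cell. The sketch below sums the sizes of all regular files under a directory; the function name directory_size_bytes is hypothetical. Note that it adds up apparent file sizes, while du reports disk usage, so the two numbers can differ somewhat.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all files under root, skipping symbolic links.

    Returns 0 if root does not exist.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_path = Path(dirpath) / filename
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


size = directory_size_bytes(Path("/home/notebooks/storage/.venv"))
print(f"{size / (1024 * 1024):.1f} MiB")
```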

If your user virtual environment seems broken and you want to recreate it, run the following commands in the terminal.

rm -rf /home/notebooks/storage/.venv
python -m venv /home/notebooks/storage/.venv

Disable persistent dependency installations

If you want to disable persistent dependency installation for a given codespace, first add a new environment variable NOTEBOOKS_NO_PERSISTENT_DEPENDENCIES=1 to the codespace. Then, run rm -rf /home/notebooks/storage/.venv from the codespace terminal to remove the existing user venv. After that, restart your codespace session. Any Python dependencies installed at runtime will no longer be persisted between sessions.
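
After following these steps, a short check from a notebook cell can confirm the resulting state. This is a sketch only: the function name persistence_report is hypothetical, and treating the value 1 as the only disabling value is an assumption based on the instruction above.

```python
import os
from pathlib import Path
from typing import Mapping

USER_VENV = Path("/home/notebooks/storage/.venv")
DISABLE_FLAG = "NOTEBOOKS_NO_PERSISTENT_DEPENDENCIES"


def persistence_report(env: Mapping[str, str], venv_path: Path = USER_VENV) -> str:
    """Summarize whether persistent dependency installation looks disabled."""
    # Assumption: only the value "1" disables persistence.
    flag_set = env.get(DISABLE_FLAG) == "1"
    if flag_set and venv_path.is_dir():
        return "flag set, but the user venv still exists: remove it and restart"
    if flag_set:
        return "persistent installs disabled"
    return "persistent installs enabled"


print(persistence_report(os.environ))
```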


Updated September 6, 2024