Execute a custom transformer using GraphSAGE

Access this AI accelerator on GitHub

Tabular data is one of the most common ways to represent data in machine learning, but it is not the only one. Many real-world problems involve relationships between entities that graph structures capture better. Graph data represents entities as nodes and relationships as edges, making it a powerful tool for capturing relational dependencies. Common use cases for graph-based learning include social networks, recommendation systems, fraud detection, and molecular property prediction. In these applications, geometric deep learning techniques (i.e., deep learning approaches applied to non-Euclidean data like graphs) have grown in popularity in recent years. Deep learning is particularly well-suited to this type of information because it can learn representations automatically, especially from unstructured data.

Despite their advantages, graph-based learning techniques are often overlooked for traditional tabular data. This is potentially due to an underlying question: how do you represent tabular data as a graph? Thankfully, methods like k-Nearest Neighbors (kNN) graphs, which connect each row to the k rows most similar to it, can do much of the heavy lifting for you.
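As a rough illustration of the kNN graph idea, the sketch below treats each row of a small table as a node and links it to its nearest rows by Euclidean distance. It is a minimal, standard-library-only example and not the accelerator's actual code; the function name `knn_edges`, the toy `rows` data, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math


def knn_edges(rows, k):
    """Return directed edges (i, j) linking each row to its k nearest rows.

    Hypothetical helper for illustration: uses brute-force Euclidean
    distance, which is fine for tiny tables but slow for large ones.
    """
    edges = []
    for i, row in enumerate(rows):
        # Distance from row i to every other row, paired with that row's index
        dists = [(math.dist(row, other), j)
                 for j, other in enumerate(rows) if j != i]
        dists.sort()
        # Keep only the k closest rows as neighbors of row i
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Toy table with two numeric features per row (made-up values)
rows = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
edges = knn_edges(rows, k=1)
print(edges)
```

With real data, numeric features would normally be scaled first so that no single column dominates the distance, and an optimized nearest-neighbor library would replace the brute-force loop.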

In this accelerator, explore how geometric deep learning can be leveraged to extract graph-based features to enrich datasets for supervised tasks. You can achieve this by:

  • Converting a tabular dataset into a graph representation using kNN graphs
  • Training a GraphSAGE-based neural network to generate unsupervised node embeddings
  • Packaging the solution as a DataRobot Custom Transformer
  • Evaluating its impact on downstream machine learning tasks in DataRobot
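To give a feel for how graph-based features can enrich a row, the sketch below applies the neighborhood-averaging step at the core of GraphSAGE's mean aggregator: each node's features are concatenated with the mean of its neighbors' features. This is a simplified illustration only. A real GraphSAGE layer also applies learned weights and a nonlinearity and is trained, none of which appears here, and the names `mean_aggregate`, `features`, and `neighbors` are hypothetical.

```python
def mean_aggregate(features, neighbors):
    """Concatenate each node's features with the mean of its neighbors' features.

    Illustrative only: no learned weights, no nonlinearity, no training.
    """
    enriched = []
    for v, feat in enumerate(features):
        nbrs = neighbors.get(v, [])
        if nbrs:
            # Average each feature dimension over the node's neighbors
            mean = [sum(features[u][d] for u in nbrs) / len(nbrs)
                    for d in range(len(feat))]
        else:
            # Isolated node: fall back to a zero vector
            mean = [0.0] * len(feat)
        enriched.append(list(feat) + mean)
    return enriched


# Toy node features and adjacency (made-up values; in practice the
# adjacency could come from a kNN graph built over the table's rows)
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
neighbors = {0: [1, 2], 1: [0], 2: [1]}
enriched = mean_aggregate(features, neighbors)
print(enriched)
```

Each output row carries its original columns plus a summary of its neighborhood, which is the kind of extra signal a downstream supervised model can use.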

Updated February 27, 2025