# GraphSAGE custom transformer

> GraphSAGE custom transformer - Convert a tabular dataset into a graph representation, train a
> GraphSAGE-based neural network, and package the solution as a DataRobot custom transformer.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.574766+00:00` (UTC).

## Primary page

- [GraphSAGE custom transformer](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/custom-transform.html): Full documentation for this topic (HTML).

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [AI accelerators](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/index.html): Linked from this page.
- [Custom model development](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/index.html): Linked from this page.

## Documentation content

[Access this AI accelerator on GitHub](https://github.com/datarobot-community/ai-accelerators/blob/main/advanced_ml_and_api_approaches/GDL%20Featurizer/GDL%20Featurizer.ipynb)

Tabular data is one of the most common ways to represent data in machine learning, but it is not the only structure available. Many real-world problems involve relationships between entities that are better captured by graph structures. Graph data represents entities as nodes and relationships as edges, making it a powerful tool for capturing relational dependencies. Common use cases for graph-based learning include social networks, recommendation systems, fraud detection, and molecular property prediction. In these applications, [geometric deep learning](https://graphics.stanford.edu/courses/cs233-18-spring/ReferencedPapers/GCNN_Geometric%20deep%20learning-%20going%20beyond%20Euclidean%20data.pdf) techniques (i.e., deep learning approaches applied to non-Euclidean data like graphs) have grown in popularity in recent years. Deep learning is particularly well suited to this type of data because of its ability to learn representations automatically, especially from unstructured data.

Despite their advantages, graph-based learning techniques are often overlooked for traditional tabular data. This is potentially due to an underlying question: how do you represent tabular data as a graph? Thankfully, methods like [k-Nearest Neighbors (kNN)](https://en.wikipedia.org/wiki/Nearest_neighbor_graph) graphs, which connect each row to the rows most similar to it, can do much of the heavy lifting for you.
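To make the kNN idea concrete, here is a minimal, dependency-free sketch of building a directed kNN graph from numeric rows. The function name `knn_graph`, the toy table, and the choice of Euclidean distance are illustrative assumptions, not the accelerator's actual code; a real pipeline would likely use a library routine and scale features first.

```python
import math

def knn_graph(rows, k):
    """Return a directed edge list: (i, j) means row j is among row i's k nearest rows."""
    edges = []
    for i, row in enumerate(rows):
        # Euclidean distance from row i to every other row (self excluded).
        dists = [(math.dist(row, other), j) for j, other in enumerate(rows) if j != i]
        dists.sort()
        # Keep only the k closest rows as neighbors of row i.
        edges.extend((i, j) for _, j in dists[:k])
    return edges

# Hypothetical toy table: four rows with two numeric features each.
rows = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
edges = knn_graph(rows, k=1)
print(edges)
```

Each row becomes a node and each `(i, j)` pair becomes an edge, so the two tight clusters in the toy table end up as two disconnected pairs of nodes. This brute-force loop compares every pair of rows, so it is only suitable for small tables.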

In this accelerator, explore how geometric deep learning can be leveraged to extract graph-based features to enrich datasets for supervised tasks. You can achieve this by:

- Converting a tabular dataset into a graph representation using kNN graphs
- Training a GraphSAGE-based neural network to generate unsupervised node embeddings
- Packaging the solution as a DataRobot custom transformer
- Evaluating its impact on downstream machine learning tasks in DataRobot
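As a rough illustration of the embedding step listed above, the sketch below applies one GraphSAGE-style layer with a mean aggregator: each node combines its own features with the average of its neighbors' features, then applies a ReLU. The function name `sage_mean_layer`, the toy graph, and the fixed identity weights are assumptions for illustration only. The accelerator's actual architecture, training objective, and library are not shown here, and in a trained model the weights would be learned.

```python
def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer: ReLU(W_self @ h_v + W_neigh @ mean(h_u for u in N(v))).

    features:  list of feature vectors, one per node
    neighbors: neighbors[i] is the list of node indices adjacent to node i
    w_self, w_neigh: weight matrices given as lists of rows (out_dim x in_dim)
    """
    def matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    out = []
    for i, h in enumerate(features):
        nbrs = neighbors[i]
        if nbrs:
            # Element-wise mean of the neighbors' feature vectors.
            mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(h))]
        else:
            # Isolated node: no neighbor signal to aggregate.
            mean = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, mean))]
        out.append([max(0.0, v) for v in z])  # ReLU
    return out

# Hypothetical toy graph: three nodes, two features each, untrained identity weights.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1, 2], [0], [0, 1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
embeddings = sage_mean_layer(features, neighbors, identity, identity)
print(embeddings)
```

Using separate self and neighbor weight matrices is equivalent to multiplying a single matrix by the concatenation of the two vectors. The resulting per-node vectors are the kind of graph-derived features that could be appended to the original table for a downstream supervised model.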
