# Panel data self-joins

> Panel data self-joins - Explore how to implement self-joins in panel data analysis.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-01T23:10:47.744957+00:00` (UTC).

## Primary page

- [Panel data self-joins](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/time-series/self-joins.html): Full documentation for this topic (HTML).

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [AI accelerators](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/index.html): Linked from this page.
- [Time series and specific use cases](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/time-series/index.html): Linked from this page.

## Documentation content

# Panel data self-joins

[Access this AI accelerator on GitHub](https://github.com/datarobot-community/ai-accelerators/tree/main/use_cases_and_horizontal_approaches/Self_join_technique_for_panel_data)

In this accelerator, explore how to implement self-joins in panel data analysis. Regardless of your industry, if you work with panel data, this guide is tailored to help you accelerate feature engineering and extract valuable insights.

Panel data, which records multiple observations of the same subjects over time, is common across many domains. While panel data is often spread across multiple tables, it can also exist in a single dataset with multiple features suitable as panel dimensions. The self-join technique enables automated, time-aware feature engineering with just one dataset, generating hundreds of candidate features built from lagged aggregations and statistics. Combining these features within panel dimensions can substantially improve predictive model performance.
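
To make the idea concrete, here is a minimal, library-free sketch of a time-aware self-join. It is not the accelerator's implementation: the column names (`carrier`, `dep_time`, `delay_min`), the sample rows, and the helper `self_join_lagged` are all assumptions made for illustration. Each row is matched against strictly earlier rows that share the same panel key, and the matches are aggregated into lagged features.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical panel: one row per flight, with `carrier` as the panel dimension.
flights = [
    {"carrier": "AA", "dep_time": datetime(2024, 1, 1, 8), "delay_min": 10},
    {"carrier": "AA", "dep_time": datetime(2024, 1, 1, 12), "delay_min": 50},
    {"carrier": "AA", "dep_time": datetime(2024, 1, 2, 9), "delay_min": 20},
    {"carrier": "DL", "dep_time": datetime(2024, 1, 1, 9), "delay_min": 5},
    {"carrier": "DL", "dep_time": datetime(2024, 1, 2, 10), "delay_min": 40},
]


def self_join_lagged(rows, key, window):
    """Join rows to themselves on `key`, keep only strictly earlier rows
    inside the lookback `window`, and aggregate the matched delays."""
    out = []
    for primary in rows:
        start = primary["dep_time"] - window
        history = [
            r["delay_min"]
            for r in rows
            if r[key] == primary[key]
            and start <= r["dep_time"] < primary["dep_time"]
        ]
        out.append({
            **primary,
            f"{key}_prior_count": len(history),
            f"{key}_prior_mean_delay": mean(history) if history else None,
        })
    return out


features = self_join_lagged(flights, key="carrier", window=timedelta(days=2))
for row in features:
    print(row["carrier"], row["dep_time"],
          row["carrier_prior_count"], row["carrier_prior_mean_delay"])
```

The strict `<` comparison keeps a row from matching itself, so a flight's own delay never feeds its own features. Repeating the join with other keys, windows, and aggregations is how the number of candidate features grows quickly.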

The accelerator focuses on predicting airline take-off delays of 30 minutes or more to illustrate the self-join technique. However, this framework applies broadly across verticals and can easily be adapted to your use case. Starting from a single dataset, the accelerator joins it to itself four times across different features and engineers time-based features from each join, using the AI Catalog for data management.
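
The pattern of joining one dataset to itself several times can be sketched as a loop over join keys. The four keys below (`carrier`, `origin`, `dest`, `tail_number`), the sample rows, the 7-day window, and the helper `prior_delay_rate` are hypothetical; the source does not say which features the accelerator joins on. Only the 30-minute threshold comes from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta

DELAY_THRESHOLD_MIN = 30  # target from the text: delay of 30 minutes or more
JOIN_KEYS = ["carrier", "origin", "dest", "tail_number"]  # hypothetical keys

flights = [
    {"carrier": "AA", "origin": "JFK", "dest": "LAX", "tail_number": "N1",
     "dep_time": datetime(2024, 1, 1, 8), "delay_min": 45},
    {"carrier": "AA", "origin": "BOS", "dest": "LAX", "tail_number": "N2",
     "dep_time": datetime(2024, 1, 1, 9), "delay_min": 5},
    {"carrier": "DL", "origin": "JFK", "dest": "SFO", "tail_number": "N3",
     "dep_time": datetime(2024, 1, 1, 10), "delay_min": 30},
    {"carrier": "AA", "origin": "JFK", "dest": "LAX", "tail_number": "N1",
     "dep_time": datetime(2024, 1, 2, 8), "delay_min": 0},
]

# Binary target: 1 if the flight was delayed by the threshold or more.
for f in flights:
    f["is_delayed"] = int(f["delay_min"] >= DELAY_THRESHOLD_MIN)


def prior_delay_rate(rows, key, window):
    """For each row, return the share of delayed flights among strictly
    earlier rows with the same `key` value inside the lookback window."""
    index = defaultdict(list)
    for r in rows:
        index[r[key]].append(r)
    rates = []
    for primary in rows:
        start = primary["dep_time"] - window
        prior = [
            r["is_delayed"]
            for r in index[primary[key]]
            if start <= r["dep_time"] < primary["dep_time"]
        ]
        rates.append(sum(prior) / len(prior) if prior else None)
    return rates


# One self-join per key; each join contributes its own prefixed feature.
window = timedelta(days=7)
for key in JOIN_KEYS:
    for f, rate in zip(flights, prior_delay_rate(flights, key, window)):
        f[f"{key}_prior_delay_rate_7d"] = rate

print({k: v for k, v in flights[-1].items() if k.endswith("_7d")})
```

Prefixing each derived column with its join key keeps the features from different joins distinguishable when they are later grouped or filtered.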

The accelerator covers data preparation with multiple joins and time horizons, and shows how to mitigate target leakage by using multiple feature lists and time gaps in time-aware joins.
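
A time gap can be pictured as moving the end of the lookback window earlier than the prediction point, so that outcomes not yet known at prediction time are excluded. The sketch below is an assumption-laden illustration, not the accelerator's code: the helper `prior_delays`, the 2-hour gap, the column names, and the two feature lists are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical history for one carrier; the last row is the one being scored.
flights = [
    {"carrier": "AA", "dep_time": datetime(2024, 1, 1, 8), "delay_min": 60},
    {"carrier": "AA", "dep_time": datetime(2024, 1, 1, 20), "delay_min": 0},
    {"carrier": "AA", "dep_time": datetime(2024, 1, 2, 7), "delay_min": 90},
    {"carrier": "AA", "dep_time": datetime(2024, 1, 2, 8), "delay_min": 15},
]


def prior_delays(rows, primary, key, window, gap):
    """Collect delays from rows with the same key whose departure falls in
    [prediction point - gap - window, prediction point - gap)."""
    end = primary["dep_time"] - gap
    start = end - window
    return [
        r["delay_min"]
        for r in rows
        if r[key] == primary[key] and start <= r["dep_time"] < end
    ]


target_row = flights[-1]
window = timedelta(days=1)

# Without a gap, a flight that left one hour earlier is included, although
# its final delay may not be known yet when the prediction is made.
no_gap = prior_delays(flights, target_row, "carrier", window, timedelta(0))

# With a 2-hour gap (an assumed value), that recent flight is excluded.
with_gap = prior_delays(flights, target_row, "carrier", window, timedelta(hours=2))

print(no_gap, with_gap)

# Hypothetical feature lists: both drop the raw outcome column, and the
# stricter one also drops features derived without a gap.
all_columns = [
    "carrier", "dep_time", "delay_min",
    "carrier_prior_mean_delay_gap0h", "carrier_prior_mean_delay_gap2h",
]
feature_lists = {
    "no_raw_outcome": [c for c in all_columns if c != "delay_min"],
    "gapped_only": [
        c for c in all_columns
        if c != "delay_min" and not c.endswith("gap0h")
    ],
}
```

Training a model on each feature list and comparing results is one way to check whether a suspiciously strong feature owes its strength to leakage.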

Panel data analysis unlocks valuable insights into how subjects evolve over time, yet it is often overlooked when only a single dataset is available.
