# Reinforcement learning

> Reinforcement learning - Implement a model based on the Q-learning algorithm.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.575844+00:00` (UTC).

## Primary page

- [Reinforcement learning](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/reinforce-learn.html): Full documentation for this topic (HTML).

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [AI accelerators](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/index.html): Linked from this page.
- [Custom model development](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/index.html): Linked from this page.

## Documentation content

[Access this AI accelerator on GitHub](https://github.com/datarobot-community/ai-accelerators/blob/main/advanced_ml_and_api_approaches/Reinforcement%20learning/Reinforcement_Learning.ipynb)

In this accelerator, you implement a simple model based on the Q-learning algorithm. It demonstrates a basic form of reinforcement learning that doesn't require a deep understanding of neural networks or advanced mathematics, and it shows how you might deploy such a model in DataRobot.
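For reference, the standard tabular Q-learning update can be written as below. The symbols are conventional notation rather than names taken from the accelerator: $s$ and $a$ are the current state and action, $r$ is the reward received, $s'$ is the next state, $\alpha$ is the learning rate, and $\gamma$ is the discount factor. The notebook may use different names or hyperparameter values.

$$
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
$$

Each update moves the stored estimate for one state/action pair a fraction $\alpha$ of the way toward the observed reward plus the discounted best estimate available from the next state.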

The example uses the Grid World problem, in which an agent learns to navigate a grid to reach a goal.

The accelerator walks through the following steps:

1. Define state and action space
2. Create a Q-table to store expected rewards for each state/action combination
3. Implement a learning algorithm and train a model
4. Evaluate the model
5. Deploy the model to a DataRobot REST API endpoint
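
Steps 1 through 4 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the accelerator's code: the 4x4 grid, the reward values, the hyperparameters, and the names `SIZE`, `START`, `GOAL`, `ACTIONS`, `step`, `train`, and `evaluate` are all hypothetical. Deployment (step 5) is left out because it depends on the DataRobot API covered in the notebook itself.

```python
import random

# Step 1: state and action space (hypothetical 4x4 grid, not the notebook's layout).
SIZE = 4
START, GOAL = (0, 0), (3, 3)
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right


def step(state, action):
    """Apply an action; a move off the grid leaves the agent where it is."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), SIZE - 1)
    col = min(max(state[1] + d_col, 0), SIZE - 1)
    next_state = (row, col)
    done = next_state == GOAL
    # Assumed reward scheme: +1 at the goal, a small penalty for every other move.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Steps 2 and 3: build a Q-table and fill it with tabular Q-learning."""
    rng = random.Random(seed)
    # Q-table: one list of action values per grid cell, initialised to zero.
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            # Epsilon-greedy selection: explore sometimes, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                ties = [a for a, value in enumerate(q[state]) if value == best]
                action = rng.choice(ties)
            next_state, reward, done = step(state, action)
            # Q-learning target: no bootstrapping from the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def evaluate(q, max_steps=50):
    """Step 4: follow the greedy policy from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    q_table = train()
    print(evaluate(q_table))
```

A dictionary keyed by `(row, column)` keeps the Q-table readable for a grid this small; a NumPy array indexed by state and action would be the more common choice for larger problems.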