Fine-tune and deploy LLMs in a codespace¶
Access this AI accelerator on GitHub
This accelerator illustrates an end-to-end workflow for fine-tuning and deploying an LLM using features of Hugging Face, Weights & Biases (W&B), and DataRobot.
Specifically, the accelerator walks you through the following steps:
- Downloading an LLM from the Hugging Face Hub.
- Acquiring a dataset from Hugging Face.
- Leveraging DataRobot codespaces, notebooks, and GPU resources to facilitate fine-tuning via Hugging Face and W&B.
- Leveraging DataRobot MLOps to register and deploy a model as an inference endpoint.
- Leveraging DataRobot's LLM Playground to evaluate and compare your fine-tuned LLM against available LLMs.
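The first two steps above can be sketched as follows. This is a minimal, hedged sketch assuming the `transformers` and `datasets` libraries; the dataset ID `imdb` is an illustrative example, not the dataset the accelerator necessarily uses, and the gated `meta-llama/Llama-3.2-1B` model requires a Hugging Face access token.

```python
# Sketch: download a base model and a dataset from the Hugging Face Hub.
# MODEL_ID and the example dataset are illustrative placeholders.
MODEL_ID = "meta-llama/Llama-3.2-1B"  # gated model; needs a HF token
EXAMPLE_DATASET_ID = "imdb"           # hypothetical example dataset


def download_assets(model_id: str = MODEL_ID,
                    dataset_id: str = EXAMPLE_DATASET_ID):
    """Fetch the tokenizer, base model, and training split from the Hub."""
    # Imports are deferred so this module loads without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from datasets import load_dataset

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    train_split = load_dataset(dataset_id, split="train")
    return model, tokenizer, train_split


if __name__ == "__main__":
    model, tokenizer, train_split = download_assets()
```

In a DataRobot codespace, the downloaded weights land in the notebook's writable storage, so keep the model size in mind (see the considerations below).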
The accelerator uses Hugging Face as a common example that you can modify based on your needs, and Weights & Biases to keep track of your experiments: W&B lets you visualize training loss in real time and log prompt results for review during fine-tuning. If you decide to do hyperparameter tuning, you can do so with W&B Sweeps.
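A minimal sketch of this tracking pattern, assuming the `wandb` package from the pinned requirements; the project name, hyperparameters, and loss values are illustrative, and `mode="disabled"` is used here so the sketch runs without an API key (switch to the default online mode in the codespace).

```python
# Sketch: log training loss to W&B during a fine-tuning loop.
def train_with_tracking(num_steps: int = 5):
    import wandb  # deferred import so the module loads without wandb

    run = wandb.init(
        project="llm-finetuning",                      # hypothetical project name
        config={"lr": 2e-4, "model": "llama-3.2-1B"},  # example hyperparameters
        mode="disabled",                               # no-op mode for this sketch
    )
    for step in range(num_steps):
        stand_in_loss = 1.0 / (step + 1)  # placeholder for the real loss
        wandb.log({"train/loss": stand_in_loss}, step=step)
    run.finish()
```

Logged metrics appear in the W&B run dashboard as live charts, which is where the real-time loss visualization mentioned above comes from.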
Considerations¶
This accelerator has been tested in a DataRobot codespace with a GPU resource bundle. requirements.txt pins the versions of the required libraries.
Notebook images in DataRobot have limited writable space (about 20GB), so checkpointing models during fine-tuning is discouraged; if you do checkpoint, limit how many checkpoints you keep. This accelerator opts to fine-tune llama-3.2-1B since it is on the smaller side.
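One way to cap checkpoint disk usage is through the saving options of Hugging Face `TrainingArguments`. The sketch below is a hedged example: the keyword names (`save_strategy`, `save_steps`, `save_total_limit`) are real `TrainingArguments` parameters, but the specific values and output directory are illustrative, not the accelerator's settings.

```python
# Sketch: checkpoint sparingly to stay within the ~20GB writable space.
CHECKPOINT_KWARGS = {
    "output_dir": "checkpoints",  # illustrative path inside the notebook
    "save_strategy": "steps",
    "save_steps": 500,            # checkpoint infrequently
    "save_total_limit": 1,        # keep only the most recent checkpoint
}


def build_training_arguments(**overrides):
    """Build TrainingArguments with disk-friendly checkpointing defaults."""
    from transformers import TrainingArguments  # deferred: needs torch

    return TrainingArguments(**{**CHECKPOINT_KWARGS, **overrides})
```

With `save_total_limit=1`, older checkpoints are deleted as new ones are written, so disk usage stays roughly constant regardless of how long training runs.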
Use Weights & Biases to track the experiment. The W&B API key is read from .env.
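A stdlib-only sketch of loading that key from `.env`, assuming simple `KEY=VALUE` lines; `WANDB_API_KEY` is the environment variable the `wandb` client reads, and the parsing here is a minimal stand-in for a helper like `python-dotenv`.

```python
# Sketch: export KEY=VALUE pairs from a .env file into the environment.
import os


def load_dotenv(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines and export them via os.environ."""
    loaded = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip().strip('"')
    os.environ.update(loaded)
    return loaded
```

After calling `load_dotenv()`, `wandb.login()` (or `wandb.init()`) picks up `WANDB_API_KEY` from the environment without the key ever appearing in the notebook.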
If you don't have a W&B account, get one at the W&B sign up page.