Overview
TRL (Transformer Reinforcement Learning) is a Hugging Face library designed to train transformer language models with Reinforcement Learning. It covers the full post-training pipeline: from Supervised Fine-tuning (SFT) to Reward Modeling (RM), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO).
Key Features
- SFTTrainer: A wrapper around Hugging Face
Trainerfor easy Supervised Fine-Tuning. - DPOTrainer: For Direct Preference Optimization.
- PPOTrainer: For classical RLHF pipelines.
Agent Post-Training
TRL is increasingly being used to fine-tune models to act as Agents (tool use, reasoning paths, step-by-step thinking). By using datasets of function-calling and agentic behavior, you can use TRL to post-train base models to explicitly output structured commands and reflect on tool outputs.
Examples & Notebooks
- SFT Nemotron Notebook - A great starting point for Supervised Fine-Tuning.
TODO: Add specific code snippets for SFTTrainer and post-training agentic behaviors.