Build Better AI Agents with RL & Fine-Tuning (Kyle from OpenPipe)

AI Tinkerers - "One-Shot" by Joe Heitzeberg

Episode Notes

What you’ll learn:

• How reinforcement learning can reduce AI agent error rates by up to 60% and drastically lower inference costs.

• The critical difference between supervised fine-tuning and RL for agentic workflows, and why RL is essential for true agent reliability.

• A practical, code-level walkthrough of building and training an email search agent, running on a 14-billion-parameter open-source model, that outperforms OpenAI's GPT-3.5.

• Strategies for generating high-quality synthetic data and designing nuanced reward functions with 'partial credit' to effectively train your agents (see the sketch after this list).

• Key use cases where RL fine-tuning delivers the most significant benefits, including real-time voice agents and high-volume applications.
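
To make the 'partial credit' idea concrete, here is a minimal Python sketch of a reward function for an email-search rollout. The Trajectory fields, weights, and scoring rules are illustrative assumptions for this episode summary, not OpenPipe's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    """One agent rollout on an email-search task (hypothetical fields)."""
    answered: bool                   # did the agent return an answer at all?
    answer_correct: bool             # does the answer match the golden label?
    sources_cited: list[str] = field(default_factory=list)   # message ids the agent cited
    golden_sources: list[str] = field(default_factory=list)  # message ids that hold the answer
    num_turns: int = 0               # search/read steps the agent used

def reward(t: Trajectory, max_turns: int = 10) -> float:
    """Score a rollout in [0, 1], granting partial credit so the policy
    gets a useful gradient signal even before it can answer perfectly."""
    score = 0.0
    if t.answered:
        score += 0.1                 # partial credit: produced *an* answer
    if t.golden_sources:
        found = len(set(t.sources_cited) & set(t.golden_sources))
        score += 0.4 * found / len(t.golden_sources)  # partial credit: found the right emails
    if t.answer_correct:
        score += 0.5                 # full credit only for a correct final answer
    # Small efficiency bonus: fewer turns yields a slightly higher reward.
    score += 0.05 * max(0, (max_turns - t.num_turns) / max_turns)
    return min(score, 1.0)

# Example: the agent found one of two relevant emails but answered incorrectly,
# so it still earns a partial reward instead of a flat zero.
print(reward(Trajectory(answered=True, answer_correct=False,
                        sources_cited=["msg_17"],
                        golden_sources=["msg_17", "msg_42"],
                        num_turns=4)))
```

The design point is that an all-or-nothing reward makes early training nearly impossible; graded signals like these let the model improve incrementally.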

Kyle Corbitt is the founder of OpenPipe, a platform dedicated to helping enterprises build ...

Keywords
Reinforcement learning for LLMs, AI fine-tuning, AI agent reliability, OpenPipe AI, LLM optimization, AI cost reduction, Reinforcement learning tutorial, AI latency optimization, Custom AI models, AI Tinkerers podcast