Instruction Tuning & RLHF | Podcast episode on RSS.com

Adapticx AI by Adapticx Technologies Ltd

S6 · E2

9 Jan 2026

28:15

Episode notes

In this episode, we explore how large language models learned to follow instructions—and why this shift turned raw text generators into reliable AI assistants. We trace the move from early, unaligned models to instruction-tuned systems shaped by human feedback.

We explain supervised fine-tuning, reward models, and reinforcement learning from human feedback (RLHF), showing how human preference became the key signal for usefulness, safety, and control. The episode also looks at the limits of RLHF and how newer, automated alignment methods aim to scale instruction learning more efficiently.

This episode covers:

Why early LLMs struggled with instructions
Supervised instruction tuning (SFT)
RLHF and reward modeling
Helpfulness, truthfulness, and safety trade-offs
Bias, cost, and s ...

Keywords

Artificial Intelligence, chatgpt, RLHF, Instruction tuning

Where the episode was created

Country

United Kingdom