Arash Ahmadian on Rethinking RLHF
Episode 51

TalkRL: The Reinforcement Learning Podcast

March 25, 2024 · 33m 30s

Show Notes

Arash Ahmadian is a researcher at Cohere and Cohere For AI focussed on preference training of large language models. He’s also a researcher at the Vector Institute of AI.

Featured Reference

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker


Topics

Reinforcement Learning, Machine Learning, Artificial Intelligence