PLAY PODCASTS
31 - Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling

31 - Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling

ICLR 2017 paper by Hakan Inan, Khashayar Khosravi…

NLP Highlights · Allen Institute for Artificial Intelligence

October 6, 201711m 19s

Audio is streamed directly from the publisher (podtrac.com) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Show Notes

ICLR 2017 paper by Hakan Inan, Khashayar Khosravi, Richard Socher, presented by Waleed. The paper presents some tricks for training better language models. It introduces a modified loss function for language modeling, where producing a word that is similar to the target word is not penalized as much as producing a word that is very different to the target (I've seen this in other places, e.g., image classification, but not in language modeling). They also give theoretical and empirical justification for tying input and output embeddings. https://www.semanticscholar.org/paper/Tying-Word-Vectors-and-Word-Classifiers-A-Loss-Fra-Inan-Khosravi/424aef7340ee618132cc3314669400e23ad910ba