
#38 – Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Misreading Chat · Jun Mukai
November 8, 2018 · 30m 2s
Show Notes
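Mukai talks about algorithms that, as a preprocessing step for neural natural language processing, split complex words into pieces drawn from a limited vocabulary.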
- Neural Machine Translation of Rare Words with Subword Units
- [1804.10959] Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
- google/sentencepiece: Unsupervised text tokenizer for Neural Network-based text generation.
- vocabulary for chromium class names