SMOTE: makin' yourself some fake minority data
June 13, 2016 · 14m 37s
Show Notes
Machine learning on imbalanced classes: surprisingly tricky. Many (most?) algorithms tend to just assign the majority class label to all the data and call it a day. SMOTE (Synthetic Minority Over-sampling Technique) is an algorithm for manufacturing new minority class examples, to help your algorithm better identify them in the wild.
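For the curious, here is a rough sketch of the core idea: pick a minority example, pick one of its nearest minority neighbors, and drop a new point somewhere on the line segment between them. This is a simplified illustration, not the reference implementation from the linked paper; the function name `smote_sketch` and its parameters (`n_new`, `k`, `seed`) are made up for this example, and it assumes purely numeric features.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors.

    Hypothetical helper for illustration only; assumes numeric features.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority examples")
    k = min(k, n - 1)  # cannot have more neighbors than other points

    # Pairwise squared distances among minority examples only.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest minority neighbors of each point.
    neighbors = np.argsort(d2, axis=1)[:, :k]

    # For each new point: a random base example and one of its neighbors.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate a random fraction of the way from base to neighbor.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy minority class with four 2-D examples (made-up numbers).
X_min = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
synthetic = smote_sketch(X_min, n_new=6, k=2)
print(synthetic.shape)  # (6, 2)
```

Because every synthetic point sits between two real minority examples, the new data stays inside the region the minority class already occupies, rather than just duplicating existing rows.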
Relevant links:
https://www.jair.org/media/953/live-953-2037-jair.pdf
Topics
datascience, machinelearning, lineardigressions