SMOTE: makin' yourself some fake minority data

Linear Digressions

June 13, 2016, 14m 37s

Show Notes

Machine learning on imbalanced classes: surprisingly tricky. Many (most?) algorithms tend to just assign the majority class label to all the data and call it a day. SMOTE is an algorithm for manufacturing new minority class examples for yourself, to help your algorithm better identify them in the wild.

Relevant links:
https://www.jair.org/media/953/live-953-2037-jair.pdf
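
For the curious, here is a rough sketch of how "manufacturing" a minority example can work. It is not taken from the episode: it assumes the usual SMOTE recipe of picking a minority point, picking one of its nearest minority neighbors, and dropping a new point somewhere on the line between them. The function name `smote_sketch`, its parameters, and the toy data are all made up for illustration, and it uses only the Python standard library.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors.

    Hypothetical helper for illustration only, not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority examples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority example to start from.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Place the new point a random fraction of the way toward that neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority class with two features (made-up numbers).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point sits between two real minority points, the new examples stay inside the region the minority class already occupies instead of being exact copies. For real work, a maintained implementation is a safer bet than this sketch.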

Topics

datascience, machinelearning, lineardigressions