Unsupervised Dimensionality Reduction: UMAP vs t-SNE

January 13, 2020 (29m 34s)

Show Notes

Dimensionality reduction redux: this episode covers UMAP, an unsupervised algorithm designed to make high-dimensional data easier to visualize, cluster, and so on. It’s similar to t-SNE but has some advantages. We give a quick recap of t-SNE, especially its connection to information theory, then get into how UMAP is different (many say better). Between the time we recorded and released this episode, an interesting argument made the rounds on the internet: that UMAP’s advantages largely stem from good initialization, not from anything inherent in the algorithm. We don’t cover that argument here because it wasn’t out yet when we recorded, but you can find a link to the paper below.

Relevant links:
https://pair-code.github.io/understanding-umap/
https://www.biorxiv.org/content/10.1101/2019.12.19.877522v1
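
For anyone curious what the information-theory connection in t-SNE looks like in practice, here is a minimal sketch, not taken from the episode. t-SNE scores an embedding by the KL divergence between pairwise similarities in the original space (P) and in the low-dimensional map (Q), where Q uses a Student-t kernel. The function names, the three-point toy data, and the hand-picked P values below are all assumptions made for illustration; a real implementation derives P from the data with Gaussian kernels and a perplexity setting.

```python
import math


def student_t_similarities(points):
    """Low-dimensional similarities q_ij: Student-t kernel (one degree of
    freedom), normalized so that all pairs i != j sum to 1."""
    weights = {}
    for i, a in enumerate(points):
        for j, b in enumerate(points):
            if i != j:
                squared_dist = sum((x - y) ** 2 for x, y in zip(a, b))
                weights[(i, j)] = 1.0 / (1.0 + squared_dist)
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


def kl_divergence(p, q):
    """KL(P || Q) over the pairs in p; terms with p_ij == 0 contribute 0."""
    return sum(p_ij * math.log(p_ij / q[pair])
               for pair, p_ij in p.items() if p_ij > 0)


# Hypothetical high-dimensional similarities (made up, sums to 1):
# points 0 and 1 are close neighbors, point 2 is far from both.
p = {(0, 1): 0.40, (1, 0): 0.40,
     (0, 2): 0.05, (2, 0): 0.05,
     (1, 2): 0.05, (2, 1): 0.05}

# Two candidate 2-D maps: one keeps 0 and 1 together, one does not.
good_map = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
bad_map = [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0)]

good_cost = kl_divergence(p, student_t_similarities(good_map))
bad_cost = kl_divergence(p, student_t_similarities(bad_map))
print(f"good map cost: {good_cost:.3f}")
print(f"bad map cost:  {bad_cost:.3f}")
```

The map that keeps the two neighbors together gets the lower cost, which is the quantity t-SNE's gradient descent tries to reduce. UMAP optimizes a different, cross-entropy-style objective, so this sketch only illustrates the t-SNE side of the comparison.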

Topics

data science, machine learning, linear digressions