PLAY PODCASTS
The Enron Dataset

The Enron Dataset

In 2000, Enron was one of the largest and compani…

Linear Digressions

February 9, 201512m 27s

Audio is streamed directly from the publisher (feeds.soundcloud.com) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Show Notes

In 2000, Enron was one of the largest and companies in the world, praised far and wide for its innovations in energy distribution and many other markets. By 2002, it was apparent that many bad apples had been cooking the books, and billions of dollars and thousands of jobs disappeared. In the aftermath, surprisingly, one of the greatest datasets in all of machine learning was born--the Enron emails corpus. Hundreds of thousands of emails amongst top executives were made public; there's no realistic chance any dataset like this will ever be made public again. But the dataset that was released has gone on to immortality, serving as the basis for a huge variety of advances in machine learning and other fields. http://www.technologyreview.com/news/515801/the-immortal-life-of-the-enron-e-mails/

Topics

datasciencemachinelearninglineardigressions