Optimized Optimized Web Crawling

Linear Digressions

November 4, 2018 · 19m 42s

Show Notes

Last week’s episode, about methods for optimized web crawling logic, left off on a bit of a cliffhanger: the data scientists had found a solution to the problem, but it wasn’t something that the engineers (who own the search codebase, remember) liked very much. It was black-boxy, hard to parallelize, and introduced a lot of complexity to their code. This episode takes a second crack at it: we formulate the problem a little differently and end up with a different, arguably more elegant solution.

Relevant links:
http://www.unofficialgoogledatascience.com/2018/07/by-bill-richoux-critical-decisions-are.html
http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Lectures/KKT.pdf

Topics

datascience, machinelearning, lineardigressions