
78. Where do corpora come from?, with Matt Honnibal and Ines Montani
NLP Highlights · Allen Institute for Artificial Intelligence
January 15, 2019 · 30m 21s
Show Notes
Most NLP projects rely crucially on the quality of annotations used for training and evaluating models. In this episode, Matt and Ines of Explosion AI tell us how Prodigy can improve data annotation and model development workflows. Prodigy is an annotation tool implemented as a Python library, and it comes with a web application and a command-line interface. A developer can define input data streams and design simple annotation interfaces. Prodigy can help break down complex annotation decisions into a series of binary decisions, and it provides easy integration with spaCy models. In an active learning framework, developers can specify how models should be updated as new annotations come in.
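To make the idea of breaking a complex annotation decision into binary decisions concrete, here is a minimal sketch in plain Python. It does not use Prodigy's API: the function names `binary_tasks` and `accepted_labels`, the dictionary keys, and the "accept"/"reject" answer strings are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Iterator, List


def binary_tasks(texts: Iterable[str], labels: List[str]) -> Iterator[Dict[str, str]]:
    """Turn one multi-label question per text into one yes/no question
    per (text, label) pair. Hypothetical helper, not part of Prodigy."""
    for text in texts:
        for label in labels:
            yield {"text": text, "label": label}


def accepted_labels(answered: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group the labels the annotator accepted by text."""
    result: Dict[str, List[str]] = {}
    for task in answered:
        accepted = result.setdefault(task["text"], [])
        if task["answer"] == "accept":
            accepted.append(task["label"])
    return result


# One text and two candidate labels become two binary questions.
tasks = list(binary_tasks(["spaCy is fast"], ["PRODUCT", "PERSON"]))

# Simulated annotator answers (assumed values, for illustration only).
answers = ["accept", "reject"]
annotated = [dict(task, answer=answer) for task, answer in zip(tasks, answers)]

print(accepted_labels(annotated))
```

Each question the annotator sees is a single yes/no choice, and the accepted answers are merged back into one set of labels per text.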
Prodigy: https://prodi.gy
Prodigy recipe scripts: https://github.com/explosion/prodigy-recipes
Twitter:
https://twitter.com/_inesmontani
https://twitter.com/honnibal