
Crowdsourced AI benchmarks have serious flaws, some experts say
TechCrunch Industry News · TechCrunch
April 24, 2025 · 5m 29s
Show Notes
AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models. But some experts say that there are serious problems with this approach from an ethical and academic perspective.