Crowdsourced AI benchmarks have serious flaws, some experts say

April 24, 2025 · 5m 29s

Show Notes

AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models. But some experts say that there are serious problems with this approach from an ethical and academic perspective.
