
#143 – SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
Misreading Chat · Hajime Morrita
December 11, 2024 · 36m 7s
Show Notes
Morita read a paper about AI that reads GitHub Issues and fixes bugs. Please send your comments and feedback via Reddit or our listener mailbox. Reviews and stars on iTunes are appreciated, too.
- [2310.06770] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
- [2405.15793] SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- SWE-bench
- Introducing SWE-bench Verified | OpenAI
- The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic