Apollo: An Exploration of Video Understanding in Large Multimodal Models

AI Papers Podcast Daily · AIPPD

December 17, 2024 · 22m 10s

Show Notes

This episode covers Apollo, a new family of AI models built to understand video. The researchers behind it set out to study how well large multimodal models actually understand video. They found that many current approaches and evaluations are weaker than they appear, because they lean more on the text that accompanies a video than on the video itself. To improve on this, they compared many different ways of breaking a video up and representing it for a model. They also found that they did not need to train the very largest models to get good results: choices that work at a smaller scale carry over to a larger one, which should let other people do similar research without huge compute budgets. In the end, Apollo performs strongly on video understanding and outperforms some models that are much larger. The researchers expect their findings to help others build even better video understanding models.

https://arxiv.org/pdf/2412.10360