More Than Pixels – Unlock your image data with Vision-Language Models (sps24)

Chaos Computer Club - archive feed · Johannes Kolbe

October 18, 2024 · 28m 45s



Show Notes

Join us on two Vision-Language Adventures! We'll uncover the information hidden inside big image collections, with Vision-Language Models (VLMs) showing us the way. Who knows which forgotten gems await us?

In the first part, we'll use CLIP and FAISS to go on a treasure hunt in your photo collection. You'll learn how to search millions of images with ease, using natural language. Bye-bye endless scrolling, hour-long tagging, and frustrated folder searching.

In the second part, we'll harness the power of VLMs to caption images, translating pixels to words. Then we'll use the BERTopic library to reveal even deeper insights into your photo collections.

By the end of this talk, you'll be equipped with the knowledge and tools to unlock new insights, identify patterns, and make your image data work harder for you.

This talk is for an intermediate audience: some knowledge of Computer Vision, NLP, or general Deep Learning will help.

About this event: https://c3voc.de
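To make the first part a little more concrete, here is a minimal sketch of the search step only, not the speaker's code. It assumes every image and every text query has already been turned into an embedding vector in a shared space (the job CLIP does in the talk), and it replaces FAISS with a brute-force inner-product search in plain Python. The embeddings are random stand-ins, and the names `normalize`, `search`, and `img_042.jpg` are hypothetical.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so the inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def search(query, index, k=3):
    """Return the k (score, image_id) pairs with the highest inner product.

    This is an exhaustive scan; a library such as FAISS would replace this
    loop once the collection grows to millions of images.
    """
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, emb)), image_id)
        for image_id, emb in index.items()
    ]
    scored.sort(reverse=True)
    return scored[:k]


# ASSUMPTION: random vectors stand in for real image embeddings.
rng = random.Random(0)
DIM = 16
index = {
    f"img_{i:03d}.jpg": normalize([rng.gauss(0, 1) for _ in range(DIM)])
    for i in range(100)
}

# ASSUMPTION: a text query whose embedding lands close to one image's embedding,
# simulated here by adding a little noise to that image's vector.
query = [x + 0.02 * rng.gauss(0, 1) for x in index["img_042.jpg"]]

results = search(query, index, k=3)
for score, image_id in results:
    print(f"{image_id}  similarity={score:.3f}")
```

With real embeddings, the query vector would come from the text encoder and the index vectors from the image encoder; the ranking logic stays the same.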

Topics

56264, import, Data Science & More, Aula, 2024
