MiniRAG: Simple Retrieval-Augmented Generation for Small Language Models

March 9, 2025 · 18m 21s


Show Notes


This episode introduces MiniRAG, a Retrieval-Augmented Generation (RAG) system designed for Small Language Models (SLMs) running in resource-constrained environments. To work around the limitations of SLMs, MiniRAG combines two components: a semantic-aware heterogeneous graph indexing mechanism and a lightweight, topology-enhanced retrieval approach. According to the paper, it outperforms existing lightweight RAG systems while using significantly less storage space, and it remains robust when the underlying model is switched from a Large Language Model (LLM) to an SLM. The paper also introduces LiHuaWorld, a new benchmark dataset designed for evaluating lightweight RAG systems under realistic on-device scenarios. Experiments show that MiniRAG's architecture lets it reach performance comparable to LLM-based methods even when it runs on SLMs. A detailed analysis validates the contribution of each key component and demonstrates the effectiveness of the proposed query-guided reasoning path discovery mechanism.
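To make the idea of a heterogeneous graph index with topology-aware retrieval more concrete, here is a minimal Python sketch. It is not MiniRAG's implementation: the class name `HeteroGraphIndex`, the methods `add_chunk` and `retrieve`, the co-occurrence linking rule, and the 1.0 / 0.5 scoring weights are all assumptions made for illustration, and the sample chunks are invented. It only shows the general pattern of storing text chunks and entities as two node types, then ranking chunks by how closely they sit to the entities named in a query.

```python
from collections import defaultdict


class HeteroGraphIndex:
    """Toy index with two node types: text chunks and entity names (hypothetical)."""

    def __init__(self):
        self.chunks = {}                          # chunk_id -> chunk text
        self.entity_to_chunks = defaultdict(set)  # entity -> chunks mentioning it
        self.entity_links = defaultdict(set)      # entity -> co-occurring entities

    def add_chunk(self, chunk_id, text, entities):
        """Store a chunk and connect it to the entities it mentions."""
        self.chunks[chunk_id] = text
        names = [e.lower() for e in entities]
        for name in names:
            self.entity_to_chunks[name].add(chunk_id)
            # Assumption: entities that share a chunk are linked to each other.
            for other in names:
                if other != name:
                    self.entity_links[name].add(other)

    def retrieve(self, query_entities, top_k=2):
        """Rank chunks by graph proximity to the query entities."""
        seeds = {e.lower() for e in query_entities
                 if e.lower() in self.entity_to_chunks}
        scores = defaultdict(float)
        for seed in seeds:
            # Chunks that mention a query entity directly get full weight.
            for chunk_id in self.entity_to_chunks[seed]:
                scores[chunk_id] += 1.0
            # Chunks one hop away (via a linked entity) get half weight.
            for neighbor in self.entity_links[seed] - seeds:
                for chunk_id in self.entity_to_chunks[neighbor]:
                    scores[chunk_id] += 0.5
        # Sort by descending score, breaking ties by chunk id.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [chunk_id for chunk_id, _ in ranked[:top_k]]


# Invented sample data, for illustration only.
index = HeteroGraphIndex()
index.add_chunk("c1", "Alice booked a table at the cafe.", ["Alice", "cafe"])
index.add_chunk("c2", "The cafe closes at 9 pm.", ["cafe"])
index.add_chunk("c3", "Bob plays guitar on Fridays.", ["Bob"])

print(index.retrieve(["Alice"]))  # ['c1', 'c2']
```

The query for "Alice" returns `c2` even though that chunk never mentions her, because it is connected through the shared "cafe" entity. Following graph links in this way needs no embedding model or strong language understanding at query time, which is the kind of property that matters when the generator is a small model.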

← All episodes of Tech Unplugged