
Cassandra: A Decentralized Storage System for Scalable Data Management
Tech Unplugged · Sublimetechie
Show Notes
This podcast introduces Cassandra, a decentralized storage system designed to manage large datasets across many commodity servers. Cassandra prioritizes high availability and fault tolerance, and it is built to keep running on infrastructure where component failures are routine. It offers a simple data model that gives users dynamic control over data layout. Developed at Facebook to meet the needs of Inbox Search, Cassandra handles high write throughput and replicates data across data centers. The system combines well-known techniques for scalability and availability, such as consistent hashing, replication, and gossip-based membership. On each node, local persistence relies on a commit log and in-memory data structures, and the system adapts to network and server load conditions. The episode closes with experiences from implementing and maintaining Cassandra, covering its practical applications and ongoing development efforts.
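To make the consistent hashing and replication ideas from the notes more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Cassandra's actual code: the `Ring` class, the `token` helper, the node names, and the choice of MD5 with one token per node are all hypothetical.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is an assumption, used only as a stable hash)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical hash ring: each node owns one position on the ring."""

    def __init__(self, nodes):
        # Sort nodes by their ring position so lookups can use binary search.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key, n=3):
        """Return up to n distinct nodes, walking clockwise from the key's position."""
        start = bisect.bisect_right(self._tokens, token(key))
        count = min(n, len(self._entries))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42"))
```

The point of the design is that adding or removing a node only changes ownership for keys near that node's position on the ring, so most keys keep the same replica set.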