The Library of Life: Inside GenBank’s 34 Trillion Base Pairs
Episode 3183

pplpod

February 27, 2026 · 19m 46s

Show Notes

In this episode of pplpod, we unzip the massive history and mechanics of GenBank, the world’s most significant open-access DNA sequence database. Maintained by the NCBI (National Center for Biotechnology Information), this genetic library has grown exponentially since its founding in 1982, doubling in size roughly every 18 months.

Join us as we explore:

  • The Scale of Science: How GenBank currently houses over 4.7 billion nucleotide sequences and data on more than 580,000 species, ranging from Homo sapiens and Triticum aestivum (wheat) to SARS-CoV-2.
  • A Historic Shift: The database's evolution from the theoretical biology groups at Los Alamos National Laboratory to a global bioinformatics essential.
  • The Data Dilemma: The challenges of maintaining accuracy in open science, including how sequences from misidentified specimens can complicate research. One example is the 75% error rate found in certain fish sequences.

Whether you are interested in genetics, open science data, or the infrastructure of biological discovery, this deep dive explains how scientists catalog the building blocks of life.