The publicly funded Human Genome Project (HGP) launched in October 1990 and announced formal completion in April 2003, timed to the 50th anniversary of Watson and Crick's double-helix paper. Twenty institutions in six countries (United States, United Kingdom, France, Germany, Japan, and China) shared the work. NHGRI coordinated the US effort, while centers such as the Sanger Centre in the UK, the Whitehead Institute at MIT, and Washington University in St. Louis carried the heaviest sequencing loads. Budgeted at $3 billion, the project came in at roughly $2.7 billion.
What 3.05 Billion Base Pairs Actually Showed
The reference genome contains about 3.05 × 10⁹ base pairs in one haploid set of 23 chromosomes, yet it holds only around 20,000 protein-coding genes. Researchers had expected closer to 100,000, so the smaller count forced a rethink of how regulation, splicing, and non-coding DNA shape human biology.
The Race With Celera
While the public consortium ran a clone-by-clone strategy, Craig Venter's Celera Genomics pursued whole-genome shotgun sequencing in parallel. The two efforts published back-to-back on 15–16 February 2001, with the HGP paper appearing in Nature and the Celera paper in Science. The competition compressed timelines by years and pushed both teams to release data faster than either had originally planned.
Finishing the Job: T2T and the Pangenome
The 2003 "complete" sequence covered about 92 percent of the genome; repetitive heterochromatin near centromeres and telomeres stayed unreadable for almost two decades. The Telomere-to-Telomere Consortium closed those gaps with T2T-CHM13, published in March 2022, adding nearly 200 million base pairs and 99 genes predicted to be protein-coding. The Human Pangenome Reference Consortium then published a draft pangenome in May 2023 built from 47 individuals across global populations, supplementing a single linear reference that drew most of its sequence from a handful of donors.
From $2.7 Billion to $200
Sequencing cost has fallen faster than Moore's Law. The first human genome cost about $2.7 billion in 2003. By 2007 the price was near $1 million per genome, by 2014 it hit $1,000, and on Illumina's NovaSeq X platform a 2024 genome runs around $200. That curve is what turned sequencing from a one-time international project into a routine clinical assay.
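To put the Moore's Law comparison in numbers, here is a rough Python sketch that turns the milestones quoted above into cost-halving times. It assumes a smooth exponential decline between milestones and treats Moore's Law as a halving every two years; both are simplifications, and names such as `MILESTONES` and `halving_time` are illustrative only.

```python
import math

# Cost milestones (year, USD per genome) exactly as quoted in the text above.
MILESTONES = [(2003, 2.7e9), (2007, 1e6), (2014, 1e3), (2024, 200.0)]

# Assumption: Moore's Law modeled as a cost halving every 2 years.
MOORE_HALVING_YEARS = 2.0


def halving_time(start, end):
    """Years for cost to halve between two (year, cost) points,
    assuming a steady exponential decline across the interval."""
    (year0, cost0), (year1, cost1) = start, end
    halvings = math.log2(cost0 / cost1)
    return (year1 - year0) / halvings


# Halving time across the whole 2003-2024 span.
overall = halving_time(MILESTONES[0], MILESTONES[-1])

# Halving time for each consecutive pair of milestones.
intervals = [
    (a[0], b[0], halving_time(a, b))
    for a, b in zip(MILESTONES, MILESTONES[1:])
]

print(f"overall: cost halves every {overall:.2f} years")
for first, last, years in intervals:
    pace = "faster" if years < MOORE_HALVING_YEARS else "slower"
    print(f"{first}-{last}: every {years:.2f} years ({pace} than the 2-year pace)")
```

On these figures the overall halving time comes out a little under one year, well ahead of the assumed two-year pace. The 2014 to 2024 leg on its own is slower than that pace, so most of the gain came in the first decade.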
Clinical Payoff
The downstream medicine is concrete. BRCA1 and BRCA2 testing now guides breast and ovarian cancer risk decisions for millions of patients. CYP2C9 and VKORC1 variants inform warfarin starting doses, with the aim of reducing bleeding events. CFTR sequencing supports cystic fibrosis carrier screening for prospective parents. Whole-exome and whole-genome sequencing diagnoses thousands of rare pediatric conditions each year that previously took families through diagnostic odysseys lasting five to ten years.
What the Reference Enabled
The HGP did not just produce a string of letters; it produced the scaffolding that genome-wide association studies (GWAS), cancer genomics, CRISPR target design, ancestry analysis, and pharmacogenomics all rest on. T2T closed the structural gaps, the pangenome is broadening representation, and per-genome cost keeps falling. The 1990 plan was to read the human instruction set once. Thirty-six years later, the field is reading millions of genomes and using them to make daily clinical calls.