Project Wants to Sequence the DNA of All Complex Life on Earth

“We’re only just beginning to understand the full majesty of life on Earth,” wrote the founding members of the Earth BioGenome Project in 2018. The ambitious project raised eyebrows when first announced. It seeks to genetically profile over a million plants, animals, and fungi. Documenting these genomes is the first step to building an atlas of complex life on Earth.

Many living species remain mysterious to science. A database resulting from the project would be a precious resource for monitoring biodiversity. It could also shed light on the genetic “dark matter” of complex life, inspiring new biomaterials and medicines or sparking ideas for synthetic biology. Other insights could tailor agricultural practices to ramp up food production and feed a growing global population.

In other words, digging into living creatures’ genetic data is set to unveil “unimaginable biological secrets,” wrote the team.

The problem? A hefty price tag. With an estimated cost of $4.7 billion, even the founders of the project called it a moonshot. However, against all odds, the project has made progress, with 3,000 genomes already sequenced and 10,000 more species expected by 2026.

Although it is lagging behind its original goal of sequencing roughly 1.7 million genomes in a decade, the project still hopes to reach that number by 2032. That is later than first planned, but at a much lower cost thanks to more efficient DNA sequencing technologies.

Meanwhile, the international team has also built infrastructure to share gene sequencing data. Machine learning methods are helping the consortium analyze thousands of datasets, characterize new species, and monitor DNA data for endangered ones.

Genetic material is everywhere. It’s an abundant resource for making sense of life on Earth. As genetic sequencing becomes faster, cheaper, and more reliable, recent studies have begun digging into the information carried by DNA from species across the globe.

One method, dubbed metagenomics, captures and analyzes all the DNA gathered from a particular source, anywhere from city sewers to boiling hot springs, to paint a broad genetic picture of the bacteria living in that environment. The Earth BioGenome Project, or EBP, is instead aiming to sequence the genomes of individual eukaryotic creatures: basically, those that keep most of their DNA in a nut-like structure, or nucleus, inside each cell.

Humans and other animals, plants, and fungi all fall into this group. By one estimate, there are roughly 10 to 15 million eukaryotic species on our planet, but just a little over two million have been documented.

Sequencing DNA from eukaryotic cells could vastly expand our knowledge of Earth’s genetic diversity. Such a database could also be a treasure trove for synthetic biology. Scientists have already tinkered with the genetic blueprints of life in bacteria and yeast cells. Deciphering, and then reprogramming, their genes has led to advances such as coaxing bacterial cells to pump out biofuels, degradable materials, and medicines such as insulin.

Charting eukaryotes’ genomes could further inspire new materials or medicines. For example, cytarabine, a chemotherapy drug, was initially isolated from a sponge-like sea creature and approved by the FDA to treat blood cancers that spread to the brain. Plant-derived medications are also already being used to tackle viral infections or to control pain. Hundreds of medicines derived from nearly 400,000 different plant species have already been approved and are on the market. Similarly, deciphering plant genetics has galvanized ideas for new biodegradable materials and biofuels.

Genetic sequences from complex organisms can “provide the raw materials for genome engineering and synthetic biology to produce valuable bioproducts at industrial scale,” wrote the team.

Medical and industrial uses aside, the effort also documents biodiversity. Creating a digital DNA library of all known eukaryotic life can pinpoint which species are most at risk, including species not yet fully characterized, and provide data for earlier intervention.

“For the first time in history, it is possible to efficiently sequence the genomes of all known species and to use genomics to help discover the remaining 80 to 90 percent of species that are currently hidden from science,” wrote the team.