Scientists Launch the Genomic Encyclopedia of Bacteria and Archaea

Scientists estimate that there are approximately 4 × 10^30 microbes living on the planet. To put this number into perspective, that is 4,000,000,000,000,000,000,000,000,000,000 microbes compared to a mere 6,793,220,750 humans.

All life on the planet can be grouped into one of three branches on the tree of life – eukaryotes, bacteria, and archaea. More complex organisms, like humans, trees, fish, many fungi, and even single-celled 'microbial-type' organisms like diatoms, algae, and amoebae belong to the eukaryotic branch of the evolutionary tree. These organisms are made up of eukaryotic cells that are characterized by a membrane-bound nucleus. Most microbes belong to one of the other two branches. Bacteria are prokaryotic cells that do not contain a membrane-bound nucleus. Archaea have characteristics of both eukaryotic and prokaryotic cells, but, like the bacteria, lack a membrane-bound nucleus.

Organisms located on the bacterial and archaeal branches are significant because, though a small number can be plant and animal pathogens, the vast majority are responsible for recycling nutrients, fixing carbon dioxide, and mediating important agricultural and industrial processes. "Microbes mediate almost every conceivable biological process on the planet," said Eddy Rubin, Director of the DOE Joint Genome Institute (JGI), the high-throughput DNA sequencing facility supported by the U.S. Department of Energy's Office of Science and located in Walnut Creek, Calif.

Due to their importance, bacteria and archaea have been a focus of many genome sequencing projects, with more than 1,000 genomes currently available. "Genome sequencing has revolutionized our understanding of the diverse roles that [these organisms] play," said Rubin. But the diversity of organisms within this genome database is low. "[What is known provides only a] narrow window into the diversity of bacteria and archaea," said Jonathan Eisen, a professor at the University of California, Davis and adjunct scientist at the DOE JGI.

Think of phylogenetic diversity as all of the separate branches growing from the main "trunk" of the Tree of Life. "Most of these separate branches within the bacteria and archaea have not yet been sampled in regard to genome sequencing," said Eisen.

To expand the genomic sampling of bacterial and archaeal diversity, Eisen and others at the DOE JGI developed the Genomic Encyclopedia of Bacteria and Archaea (GEBA), a project designed to fill in the genomic gaps in the tree of life. Also participating in this work were researchers from many institutions, including multiple DOE National Laboratories and the German Collection of Microorganisms and Cell Cultures (DSMZ).

According to Eisen, "GEBA does not focus on specific organisms or processes, but instead represents a way to build a framework of diversity." By understanding microbial diversity as a whole, the scientific community can better understand microbes of scientific importance.

Photo credit: Karin Higgins

Eisen collecting samples from the back of the Alvin submarine.

For the pilot study, the research team selected 200 microbial candidates. "We focused on organisms from which we could get DNA quickly and belonged to a branch in the bacterial or archaeal portion of the tree of life for which genomes were not available," said Eisen.

Focusing on 56 genomes in the pilot study, the researchers showed that better genomic sampling of the tree of life can lead to fundamental improvements in the discovery of new diversity as well as the interpretation and analysis of existing data from other organisms. There are practical benefits as well.

The genome of one microbe sequenced in the GEBA pilot study encodes enzymes capable of breaking down plant matter in highly acidic environments. This new information could have important implications for developing pretreatment processes that produce plant-derived transportation fuels more efficiently and economically.

Eisen is calling for an expanded effort. "This pilot study still only covers a tiny fraction of archaea and bacteria diversity. But by sequencing an additional 1,000 organism genomes, we can account for about 50 percent of known microbial diversity for organisms that can be grown in a lab. This can be accomplished in the next few years."


Photo credit: Karin Higgins

Researcher Jonathan Eisen

Future work lies in deciphering the diversity of organisms that cannot be grown in the lab. "That is the next frontier," said Eisen. "[To accomplish this,] we will need about 10,000 genomes to understand this diversity."

The Department of Energy (DOE) funded this project through the Office of Science's Office of Biological and Environmental Research. DOE invests in science and in solving critical issues impacting people's daily lives and the nation's future. To learn more about DOE visit

The DOE Joint Genome Institute, supported by the Office of Science, is headquartered in Walnut Creek, Calif., and provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges. To learn more about JGI visit

For 100 years, the University of California, Davis, has engaged in teaching, research and public service that matter to California and transform the world.

This article was written by Stacy W. Kish.