Branching Out with Phylogenetically Driven Genome Sequencing

What's Hot in Biology, November/December 2011

by Jeremy Cherfas

Your gut (and mine) harbors about 100 trillion microbial cells, roughly 10 times the number that make up the rest of your body (and mine). These microbes have a profound effect on health, helping to extract energy and nutrition from food, and changes in the gut flora are associated with obesity and some bowel diseases.

The paper at #3, from the Metagenomics of the Human Intestinal Tract Consortium (MetaHIT), details the microbial gene set, 150 times larger than the human gene complement, and the more than 1,000 bacterial species that can be found in the gut. It identified a minimal metagenome shared by all the people studied, including overweight individuals and those suffering from bowel disease. Many of the genes are closely linked to nutrition, helping to break down complex molecules such as polysaccharides and to synthesize essential amino acids, vitamins, and short-chain fatty acids.

The metagenomic approach harvests and sequences the DNA of all the organisms living in a particular environment, in this case human feces. This gives it an edge over other survey techniques, which are often incomplete because many bacteria cannot be cultured.

As a result, metagenomics often identifies not just species but entire metabolic pathways that a more conventional survey might miss. The MetaHIT Consortium estimated an increase of 30% in the number of functional categories it identified, compared to previous studies. Even metagenomics, however, is not enough for some bacterial completists.

At #1 is a paper from Jonathan Eisen at the University of California, Davis, and a large group of colleagues, who set out to see whether they could target microbes for sequencing specifically to better understand their evolution, phylogenetic history, and functioning.

A woman works with human genetic material at a laboratory in Munich, May 2011. REUTERS/Michael Dald.

The team’s quest started from the observation that almost everything we knew about bacterial evolution and family trees was derived from just three of the 40 or so phyla of bacteria. The dominant phylogenetic tree was derived from the sequence of one small piece of RNA, and there were bits of the tree that made no sense. So Eisen and his colleagues set out to draw up a "Genomic Encyclopedia of Bacteria and Archaea" (GEBA).

They first made a list of branches of the tree that had little or no sequence data available and sent it to Hans-Peter Klenk at the DSMZ (the German Collection of Microorganisms and Cell Cultures), who identified about 200 microbes from those branches in the collection.

The DSMZ grew the microbes and sent them to the Joint Genome Institute of the U.S. Department of Energy, which sequenced them and returned the data to Eisen and colleagues. As Eisen recalls in a blog entry discussing the paper, they "spent a good deal of time then analyzing the data asking a pretty simple question: are there any general benefits that come from this 'phylogeny driven' approach to sequencing genomes compared to what one might find with sequencing just any random genome?" Simple answer: yes.

The paper reports on sequences from 56 of the selected microbes, 53 Bacteria and 3 Archaea. The 53 GEBA bacteria species delivered up to 4.4 times more phylogenetic diversity than a randomly chosen set of 53 non-GEBA bacteria. The distantly related GEBA species also vastly increased the rate of discovery of novel gene families and gene functions, compared to sampling more closely related groups such as those within a phylum or family. More than 10% of the protein families identified from the GEBA sequences, as the authors note, "showed no significant sequence similarity" to existing known proteins. One of those proteins is especially significant and exciting.
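To make the diversity comparison concrete, here is a minimal sketch of one common way to quantify phylogenetic diversity: sum the lengths of every branch needed to connect a chosen set of species to the root of the tree (often called Faith's PD). The article does not spell out the exact metric, so treat this definition as an assumption. The toy tree, the species names, and the function `phylogenetic_diversity` are all invented for illustration and are not GEBA data.

```python
# Toy rooted tree: child -> (parent, branch length to parent).
# Names and lengths are hypothetical, chosen only to illustrate the idea.
TREE = {
    "cladeA": ("root", 1.0),
    "cladeB": ("root", 1.0),
    "a1": ("cladeA", 0.1),
    "a2": ("cladeA", 0.1),
    "a3": ("cladeA", 0.1),
    "b1": ("cladeB", 0.5),
    "b2": ("cladeB", 0.7),
}


def phylogenetic_diversity(tips, tree=TREE):
    """Sum the lengths of all distinct branches on the paths from tips to root."""
    counted = set()  # nodes whose branch to the parent is already included
    total = 0.0
    for tip in tips:
        node = tip
        # Walk toward the root; stop early once we reach a branch that an
        # earlier tip already contributed, since its ancestors are counted too.
        while node in tree and node not in counted:
            parent, length = tree[node]
            counted.add(node)
            total += length
            node = parent
    return total


# Three close relatives share almost all of their path to the root ...
clustered = phylogenetic_diversity(["a1", "a2", "a3"])
# ... while three species drawn from across the tree cover far more of it.
spread = phylogenetic_diversity(["a1", "b1", "b2"])

print(f"clustered sample: {clustered:.1f}")
print(f"spread sample:    {spread:.1f}")
```

In this toy case the sample spread across both clades covers more than twice the branch length of the clustered one, which is the kind of gain a phylogeny-driven choice of genomes is meant to deliver.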

BARP—for bacterial actin-related protein—is, as its name suggests, closely related to actin, one of the proteins responsible in eukaryotes for maintaining cell shape and for movement inside and outside cells. Actin itself has never been found in bacteria and archaea, although they do have a protein called MreB that, like actin in eukaryotes, is part of the cell scaffolding.

BARP, isolated from the marine bacterium Haliangium ochraceum, is clearly related structurally and in sequence to actin. The function of BARP is not yet known, but its presence in a bacterial species suggests an evolutionary antecedent for eukaryote actin.

Based on these first 56 genomes, and the amount of new knowledge they have delivered, the GEBA team is confident both that targeted sequencing is a valid approach to scientific exploration and that "the benefits of phylogenetically driven genome sequencing show no sign of saturating with these first 56 genomes." They calculate that 1,520 phylogenetically diverse sequences would reveal half of the diversity present in known bacterial and archaeal cultures.

But there’s the rub. The vast majority of microbe species cannot currently be cultured. They reveal themselves only through efforts like the MetaHIT Consortium’s metagenome of human gut inhabitants. Data from metagenomic sequences, however, allow the GEBA group to estimate that only 9,218 sequences from currently uncultured species would capture half of that diversity. With the cost of sequencing plummeting faster even than the cost of computing, and new approaches to metagenomics flourishing, it probably won’t be long before we have both a much more complete tree of life and a better understanding of what it means.

Dr. Jeremy Cherfas is Senior Science Writer at Bioversity International, Rome, Italy.

What's Hot in Biology
Rank | Paper | Cites This Period (May-Jun 11) | Rank Last Period (Mar-Apr 11)
1 | D.Y. Wu, et al., "A phylogeny-driven genomic encyclopedia of Bacteria and Archaea," Nature, 462(7276): 1056-60, 24 December 2009. [6 U.S. and German institutions] *535UB | 71 | +
2 | The 1000 Genomes Project Consortium (D.L. Altshuler, et al.), "A map of human genome variation from population-scale sequencing," Nature, 467(7319): 1061-73, 28 October 2010. [78 institutions worldwide] *671XW | 59 | 4
3 | J.J. Qin, et al., "A human gut microbial gene catalogue established by metagenomic sequencing," Nature, 464(7285): 59-65, 4 March 2010. [14 institutions worldwide] *563GZ | 45 | 1
4 | Y. Tanaka, et al., "Genome-wide association of IL28B with response to pegylated interferon-α and ribavirin therapy for chronic hepatitis C," Nature Genetics, 41(10): 1105-9, October 2009. [17 Japanese institutions] *500UG | 43 | 2
5 | C. Choudhary, et al., "Lysine acetylation targets protein complexes and co-regulates major cellular functions," Science, 325(5942): 834-40, 14 August 2009. [Max Planck Inst. Biochem., Martinsried, Germany; U. Copenhagen, Denmark] *487AK | 37 | 3
6 | K. Kim, et al., "Epigenetic memory in induced pluripotent stem cells," Nature, 467(7313): 285-90, 16 September 2010. [7 U.S. institutions] *650CF | 36 | +
7 | M. Costanzo, et al., "The genetic landscape of a cell," Science, 327(5964): 425-31, 22 January 2010. [15 institutions worldwide] *546BS | 35 | +
8 | T.M. Teslovich, et al., "Biological, clinical and population relevance of 95 loci for blood lipids," Nature, 466(7307): 707-13, 5 August 2010. [117 institutions worldwide] *634EN | 35 | +
9 | H.L. Guo, et al., "Mammalian microRNAs predominantly act to decrease target mRNA levels," Nature, 466(7308): 835-40, 12 August 2010. [Whitehead Inst., Cambridge, MA; Howard Hughes Med. Inst.; MIT, Cambridge; U. Calif., San Francisco; Calif. Inst. Quantitative Biosci., San Francisco] *636TT | 35 | 7
10 | J. Schmutz, et al., "Genome sequence of the palaeopolyploid soybean," Nature, 463(7278): 178-83, 14 January 2010. [17 U.S. institutions] *543MQ | 32 | +
SOURCE: Thomson Reuters Hot Papers Database.



