Science Watch® - Tracking Trends and Performance in Basic Research
May/June 2004


MEGA2 Proves Mega-Useful in Analyzing Evolution and Variation
by Jeremy Cherfas
WHAT'S HOT IN BIOLOGY
Rank  Paper  Citations This Period (Nov-Dec 03)  Rank Last Period (Sep-Oct 03)
1 S. Kumar, et al., "MEGA2: molecular evolutionary genetics analysis software," Bioinformatics, 17(12): 1244-5, December 2001. [Arizona St. U., Tempe; Tokyo Metropolitan U., Japan; Pennsylvania St. U., University Park] *507UW 97 1
2 R.H. Waterston, et al. (Mouse Genome Sequencing Consortium), "Initial sequencing and comparative analysis of the mouse genome," Nature, 420(6915): 520-62, 5 December 2002. [46 institutions worldwide] *621VK 73 2
3 A.-C. Gavin, et al., "Functional organization of the yeast proteome by systematic analysis of protein complexes," Nature, 415(6868): 141-7, 10 January 2002. [Cellzome AG, Heidelberg, Germany; EMBL, Heidelberg; CGM-CNRS, Gif sur Yvette Cedex, France] *509PR 50 3
4 A. Bateman, et al., "The Pfam protein families database," Nucleic Acids Res., 30(1): 276-80, 1 January 2002. [Wellcome Trust Sanger Inst. and Europ. Bioinformatics Inst., Cambridge, U.K.; SIB, ISREC, Lausanne, Switzerland; Howard Hughes Med. Inst., Washington U. Sch. Med., St. Louis, MO; Karolinska Inst., Stockholm, Sweden] *508FB 47 5
5 T.R. Brummelkamp, R. Bernards, R. Agami, "A system for stable expression of short interfering RNAs in mammalian cells," Science, 296(5567): 550-3, 19 April 2002. [Netherlands Cancer Inst., Amsterdam; Ctr. Biomedical Genetics, Netherlands] *544UE 47 4
6 Y. Ho, et al., "Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry," Nature, 415(6868): 180-3, 10 January 2002. [MDS Proteomics, Toronto, Canada; Mount Sinai Hosp., Toronto, Canada; U. Toronto, Canada] *509PR 44 6
7 J. Yu, et al., "A draft sequence of the rice genome (Oryza sativa L. ssp. indica)," Science, 296(5565): 79-92, 5 April 2002. [12 institutions worldwide] *539FA 37 8
8 S.A. Goff, et al., "A draft sequence of the rice genome (Oryza sativa L. ssp. japonica)," Science, 296(5565): 92-100, 5 April 2002. [6 U.S. institutions] *539FA 37 7
9 M.J. Gardner, et al., "Genome sequence of the human malaria parasite Plasmodium falciparum," Nature, 419(6906): 498-511, 3 October 2002. [13 institutions worldwide] *599RF 37 9
10 Y. Jiang, et al., "Crystal structure and mechanism of a calcium-gated potassium channel," Nature, 417(6888): 515-22, 30 May 2002. [Howard Hughes Med. Inst., Rockefeller U., New York, NY] *556QK 31
SOURCE: ISI's Hot Papers Database

The torrent of data spewing out of DNA sequencers in labs around the world is threatening to drown biologists even as it enables them to ask fundamental questions. No surprise, then, that a tool to tame the flood has been the most highly cited paper for two issues in a row. MEGA2 is the latest offspring of a software package that streamlines molecular evolutionary genetics analysis.

Masatoshi Nei, Professor of Biology at Pennsylvania State University and one of the founders of the field, tells Science Watch he was "surprised" at the popularity of the paper he co-authored with Sudhir Kumar and colleagues at the Arizona Biodesign Institute, Arizona State University. "We simply tried to produce a computer program package that can easily be used by investigators without spending much time," he says. "There are many other computer programs for phylogenetic analysis, but they are not easy to use for uninitiated researchers. For this reason, we also developed simple analytical approaches that can be understood intuitively."

Even without molecular data, figuring out the evolutionary relationship between two organisms is fraught with problems. If they share some particular characteristic, is that because they both inherited it from a common ancestor? Or is it because both have independently arrived at the same kind of solution to a common problem? If they are radically different, is that because they are not closely related, or because they are living radically different lives? Molecular data add several layers of complexity. DNA that codes for absolutely vital functions will be very constrained and will not change much; it thus offers little information about relationships. And DNA that does not code will be free to mutate, but may then mutate back. Then there are the difficulties created by insertions and deletions, rearrangements of larger chunks of sequence, and so on.

A variety of methods have been devised to deal with these and other complexities, and, as Nei points out, there are a huge number of computer programs designed to help researchers use molecular information to explore evolutionary history. Kumar, now Director of the Center for Evolutionary Functional Genomics at Arizona State University, started working on MEGA in 1991 while a Ph.D. student in Nei’s lab. He says that one of the great strengths of MEGA2 is the way it handles large datasets. The difficulty arises because the number of possible evolutionary trees grows faster than exponentially as the number of sequences increases. "Finding the optimal tree is impossible to do," Kumar tells Science Watch, adding that "you cannot exhaustively compute" all the options.
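Kumar's point about exhaustive search can be made concrete with a textbook result that is not taken from the MEGA2 paper itself: for n labelled sequences there are (2n-5)!! distinct unrooted, strictly bifurcating trees, that is, the product of the odd numbers from 3 up to 2n-5. The short Python sketch below evaluates that count; the function name is chosen here purely for illustration.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Count unrooted, strictly bifurcating trees on n_taxa labelled tips.

    Uses the standard double-factorial formula (2n - 5)!!, valid for n >= 3.
    For fewer than three tips there is only one possible arrangement.
    """
    if n_taxa < 3:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., (2n - 5).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


# Illustrative dataset sizes (hypothetical, chosen only to show the growth).
for n in (4, 5, 10, 20, 50):
    print(f"{n:>3} sequences: {count_unrooted_trees(n):.3e} possible trees")
```

Ten sequences already allow 2,027,025 different trees, and every further sequence multiplies the total by a larger odd number, which is why programs rely on fast approximations rather than scoring every candidate tree.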

Nei had helped to devise a technique called the neighbor-joining method, which gives a very good approximation to the best tree (N. Saitou, M. Nei, "The neighbor-joining method: a new method for reconstructing phylogenetic trees," Molecular Biology and Evolution, 4[4]: 406-425, 1987). That paper is something of a citation champion itself, having by now logged more than 10,000 cites. "With MEGA you can do a neighbor-joining tree in less than a minute on an ordinary PC," Kumar says. The results bear that out. MEGA2 has been used with huge datasets, for example a recent analysis of 1,060 sequences of human mitochondrial DNA.
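For readers curious about what neighbor-joining actually does, here is a minimal Python sketch of the procedure in its commonly taught form: repeatedly pick the pair of nodes that minimizes a corrected distance, join them under a new internal node, and shrink the distance table by one. This is not MEGA2's code. The function name, the input format (a dictionary keyed by pairs of labels), the internal labels "node0", "node1" and the four-sequence example distances are all assumptions made for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree from pairwise distances by neighbor-joining.

    names: list of tip labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node, neighbor, branch_length) edges.
    Internal nodes are labelled "node0", "node1", ... (illustrative choice).
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: the sum of its distances to all others.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimizes the corrected distance criterion.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node u.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges.append((i, u, li))
        edges.append((j, u, lj))
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical distances for four sequences, constructed to fit a tree exactly.
example_names = ["A", "B", "C", "D"]
example_dist = {
    frozenset(("A", "B")): 3, frozenset(("A", "C")): 8,
    frozenset(("A", "D")): 9, frozenset(("B", "C")): 9,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 9,
}

for node, neighbor, length in neighbor_joining(example_names, example_dist):
    print(f"{node} -- {neighbor}: {length}")
```

Each round examines every remaining pair once and then removes one node from the table, so the whole tree is built in a number of steps that grows only polynomially with the number of sequences. That is the property that makes the method practical on large datasets, whereas exhaustive search is not.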

Nei has recently used MEGA to build phylogenetic trees for hundreds of MADS-box genes (MADS-box genes are highly conserved sequences that are widely distributed and that seem to control many important aspects of development; some biologists believe they will prove as important as homoeobox genes). The genes were identified from the Arabidopsis and rice sequences but their function is currently unknown. "Our analysis," Nei tells Science Watch, "identified several groups of genes whose functions are likely to be similar within groups." (J. Nam, et al., "Type I MADS-box genes have experienced faster birth-and-death evolution than type II MADS-box genes in angiosperms," PNAS, 101[7]: 1910-15, 17 February 2004). Experimenters are now following up on suggestions that emerged from the phylogenetic trees to work out what these genes do. And that could have profound consequences because other MADS-box genes are known to be involved in the timing of flowering. Manipulating those genes could offer a route to improving the yield of rice and other cereals.

This, in essence, is how both Kumar and Nei see the usefulness of MEGA2. "Researchers can obtain the results quickly and spend more time for thinking about the biological meanings of the results," Nei says. "MEGA2 is a biological tool for biologists, not a bioinformatics tool for bioinformaticians," notes Kumar. "The focus has been on the bench scientists all along, to enable them to discover novel patterns through comparative sequence analysis."

As more and more data become available, the task of making sense of them becomes harder and harder. Are more powerful computers and cleverer programs the answer? Not for Nei. "I do not think high-speed computers can solve major problems in evolutionary biology," he tells Science Watch. "The most important equipment is the human brain. Only human brains can extract important scientific principles from vast amounts of DNA sequences. However, computers are certainly useful for sorting out important factors."

Dr. Jeremy Cherfas is Science Writer at the International Plant Genetic Resources Institute, Rome, Italy.

Science Watch®, May/June 2004, Vol. 15, No. 3
Citing URL: http://www.sciencewatch.com/may-june2004/sw_may-june2004_page8.htm
