Science Watch® - Tracking Trends and Performance in Basic Research
November/December 2005


Even in Biology, Good Workers Ignore Their Tools
by Jeremy Cherfas
WHAT'S HOT IN BIOLOGY
Rank | Paper | Citations This Period (May-Jun 05) | Rank Last Period (Mar-Apr 05)
1 | M. Zuker, et al., "Mfold web server for nucleic acid folding and hybridization prediction," Nucl. Acids Res., 31(13): 3406-15, 1 July 2003. [Rensselaer Polytech. Inst., Troy, NY] *695LT | 52 | 1
2 | A. Caspi, et al., "Influence of life stress on depression: Moderation by a polymorphism in the 5-HTT gene," Science, 301(5631): 386-9, 18 July 2003. [King’s College London, U.K.; U. Wisconsin, Madison; U. Otago, Dunedin, New Zealand] *702CG | 45 |
3 | T. Schwede, et al., "SWISS-MODEL: an automated protein homology-modeling server," Nucl. Acids Res., 31(13): 3381-5, 1 July 2003. [U. Basel, Switzerland; Swiss Inst. Bioinformatics, Basel; Novartis AG, Basel; GlaxoSmithKline, Research Triangle Park, NC] *695LT | 44 | 3
4 | F. Heil, et al., "Species-specific recognition of single-stranded RNA via toll-like receptor 7 and 8," Science, 303(5663): 1526-9, 5 March 2004. [U. Munich, Germany; Osaka U., Japan; Japan Sci. Tech. Corp., Osaka; Coley Pharmaceut. Grp., Wellesley, MA] *800AA | 41 |
5 | J.M. Alonso, et al., "Genome-wide insertional mutagenesis of Arabidopsis thaliana," Science, 301(5633): 653-7, 1 August 2003. [Salk Inst. Biol. Stud., La Jolla, CA; Plant Biotech. Inst., Saskatoon, Canada; U. Calif., San Diego] *706UN | 40 |
6 | W.-K. Huh, et al., "Global analysis of protein localization in budding yeast," Nature, 425(6959): 686-91, 16 October 2003. [U. Calif. San Francisco, Howard Hughes Med. Inst., San Francisco, CA] *732DA | 39 | 8
7 | R.A. Gibbs, et al. (The International HapMap Consortium), "The International HapMap Project," Nature, 426(6968): 789-96, 18/25 December 2003. [74 institutions worldwide] *754QM | 37 | 6
8 | J.D. Bendtsen, et al., "Improved prediction of signal peptides: SignalP 3.0," J. Mol. Biol., 340(4): 783-95, 16 July 2004. [Tech. U. Denmark, Lyngby; Stockholm U., Sweden] *838SU | 35 |
9 | S.S. Diebold, et al., "Innate antiviral responses by means of TLR7-mediated recognition of single-stranded RNA," Science, 303(5663): 1529-31, 5 March 2004. [London Res. Inst., U.K.; Osaka U., Japan; RIKEN Res. Ctr., Yokohama, Japan; Japan Sci. Tech. Corp., Osaka] *800AA | 33 |
10 | M. Stephens, P. Donnelly, "A comparison of Bayesian methods for haplotype reconstruction from population genotype data," Am. J. Hum. Genet., 73(5): 1162-9, November 2003. [U. Washington, Seattle] *742AR | 32 |
SOURCE: ISI’s Hot Papers Database.

Repeatedly present the same stimulus to an animal and, if there are no consequences, the animal will stop responding. This is habituation, and it afflicts even these Top Ten reports. The papers at #1 and #3 have been hovering in the top five for so long, preceded by so many other, similar papers, that they have become invisible. And not surprisingly. They describe tools that most molecular biologists probably use several times a day. You don’t expect a carpenter to care much exactly what hammer he is using, as long as the nail goes in.

What prompts a close look now? This, from the paper at #8: "We found a surprisingly high error rate in Swiss-Prot, where, for example, of the order of 7% of the Gram-positive entries had either wrong cleavage site position and/or wrong annotation of the experimental evidence." One molecular tool dissing another molecular tool, even in the most polite terms, is bound to evoke a fresh response, and so it did.

Four of the 10 most-cited papers are about tools for making sense of molecular data. Sequences are now a commodity, to be obtained as cheaply and quickly as possible. Scientists add value by interpreting the sequence to make biological sense of it. Michael Zuker’s Mfold web server, at #1, predicts how DNA and RNA will fold up and hybridize, allowing researchers to design antisense sequences that will block particular messages, among other things. Torsten Schwede and his group created SWISS-MODEL, at #3, which does a similar kind of job for proteins, predicting the structure of a protein from its amino acid sequence. At #10, Matthew Stephens and Peter Donnelly compare Bayesian statistical methods for reconstructing haplotypes, the sets of variants that sit together on a single chromosome, from population genotype data. And at #8, Søren Brunak and his colleagues at the Technical University of Denmark and at Stockholm University describe the latest improved version of their system for predicting the presence of signal peptides in protein sequences.

Brunak’s group teaches its software, called SignalP, to recognize signal peptides. Show it the sequence of several peptides known to be signals. Let the program work out their salient features. Now show it an unknown sequence and ask SignalP to decide whether it represents a signal or something else. The known signals that make up the lessons are obviously crucial. So the Scandinavian team went over a whole slew of purported signal sequences in databases such as Swiss-Prot (which also regularly features in the highly cited list) to make absolutely certain that they were what they claimed to be. Some were not, hence the mention of errors.
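The train-then-ask routine described above can be sketched in a few lines of Python. This is a toy illustration only, not SignalP's method: the published SignalP relies on neural networks and hidden Markov models, whereas the sketch below swaps in a much simpler log-odds score of amino acid composition. The sequences, the window length, and every function name are invented for the example and do not come from Swiss-Prot or any other database.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 16  # N-terminal residues examined; an arbitrary choice for this sketch

# Invented training "lessons": NOT real database entries.
TOY_SIGNALS = ["MKLLLLALLAVSAAQA", "MRALLVLLLAAVGASA", "MKTLALLLVLAAASHA"]
TOY_NON_SIGNALS = ["MSDEEKRKQEDNGSTP", "MTEQDKRSPEENGKDE", "MGSKDEERQTNPDEKS"]


def residue_frequencies(sequences, pseudocount=1.0):
    """Smoothed amino acid frequencies over the first WINDOW residues."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[:WINDOW])
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def train(signals, non_signals):
    """Learn one log-odds weight per residue from the labeled examples."""
    pos = residue_frequencies(signals)
    neg = residue_frequencies(non_signals)
    return {a: math.log(pos[a] / neg[a]) for a in AMINO_ACIDS}


def score(weights, sequence):
    """Sum the weights over the N-terminal window; unknown letters count as 0."""
    return sum(weights.get(res, 0.0) for res in sequence[:WINDOW])


def looks_like_signal(weights, sequence):
    """Call the sequence a signal when its total log-odds score is positive."""
    return score(weights, sequence) > 0.0


if __name__ == "__main__":
    weights = train(TOY_SIGNALS, TOY_NON_SIGNALS)
    for unknown in ["MKALLLVLAAALSASA", "MSEEDKKRNDEQGTPS"]:
        print(unknown, looks_like_signal(weights, unknown))
```

Even in this toy, the point of the paragraph survives: a mislabeled training sequence shifts the learned weights, so the quality of the known examples sets a ceiling on the quality of the predictions.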

Not that this is a big deal. Researchers do not get upset at this kind of checking, and the community as a whole gains from a more accurate dataset. It gains, too, from the new version of SignalP, which is now somewhat more adept at spotting signal sequences. To deal with issues about the accuracy of sequences, and especially of the annotations that make sense of the sequence, several research communities have created curated databases. Human curators check all the data submitted, so what the database lacks in quantity it more than makes up for in quality. But while a web search reveals many such curated databases, where are the peer-reviewed papers describing them? By rights, they ought to be highly cited, but don’t seem to be.

Michael Zuker notes that his #1 paper is "the first and only paper describing in detail what was already a popular web site," originally launched in 1996. He adds that he asks people to cite the article if they publish articles containing useful results obtained on the Mfold web server. The score so far is roughly a citation a day. But researchers are querying the Mfold server about 800,000 times a month, and many of those are student bioinformaticians whose teachers have set them problems to solve. When they are publishing, will they be citing the tools they have used? Or will we all be so habituated to the idea of web servers that make sense of sequences that there will be no point? Indeed, in the rapidly evolving world of web services and web publishing, will citation counts remain the best indication of a result’s importance?

Dr. Jeremy Cherfas is Science Writer at the International Plant Genetic Resources Institute, Rome.


 

Science Watch®, November/December 2005, Vol. 16, No. 6
Citing URL: http://www.sciencewatch.com/nov-dec2005/sw_nov-dec2005_page8.htm





(c) 2008 The Thomson Corporation.
Thomson Scientific