The paper is given as the reference on the Dali server web site. The server
has been running for about 15 years at a series of locations. The Dali
server is used by structural biologists to compare newly solved protein
structures against the Protein Data Bank (PDB). Similarities to other
proteins can help to elucidate the function of an uncharacterized protein
and shed light on molecular evolution.
Does it describe a new discovery, methodology, or
synthesis of knowledge?
The paper describes an update of the original Dali algorithm (Holm L;
Sander C, "Protein structure comparison by alignment of distance matrices"
J. Mol. Biol. 233: 123-38, 1993).
Would you summarize the significance of your paper in
layman's terms?
Hundreds of new structures are added to the PDB each week. The paper
reports a change to a data structure within Dali which speeds up database
updates. The Dali server uses pre-computed similarities between PDB
structures in order to find all the structural neighbors of the query
structure.
The idea is that one usually finds a few highly similar structures using
quick heuristics. Restricting the search space to neighbors of these
previously found matches allows the exclusion of large parts of the
database without explicit comparison.
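The search idea described above can be illustrated with a small sketch. This is not the Dali server's code: the function name `neighbor_walk`, the dictionary of precomputed neighbor lists, and the `is_similar` callback are all hypothetical stand-ins chosen for illustration. The sketch starts from a few seed matches (assumed to come from a quick heuristic) and explicitly compares only entries that appear in the neighbor lists of confirmed hits.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


def neighbor_walk(
    seeds: Iterable[str],
    neighbors: Dict[str, Set[str]],
    is_similar: Callable[[str], bool],
) -> Tuple[Set[str], Set[str]]:
    """Return (hits, compared) after expanding from the seed matches.

    Only entries reachable through neighbor lists of confirmed hits are
    ever compared explicitly; everything else is skipped.
    """
    hits: Set[str] = set()
    compared: Set[str] = set()
    frontier = list(seeds)
    while frontier:
        entry = frontier.pop()
        if entry in compared:
            continue
        compared.add(entry)  # one explicit (expensive) comparison
        if is_similar(entry):
            hits.add(entry)
            # Follow the precomputed neighbors of a confirmed hit only.
            frontier.extend(neighbors.get(entry, ()))
    return hits, compared


# Toy database (hypothetical): A, B, C form one family; X and Y another.
toy_neighbors = {
    "A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"},
    "D": {"B"}, "X": {"Y"}, "Y": {"X"},
}
truly_similar = {"A", "B", "C"}  # stand-in for a real structure comparison
hits, compared = neighbor_walk(["A"], toy_neighbors, truly_similar.__contains__)
```

In this toy example the entries X and Y are never compared against the query, which is the saving the answer refers to. The approach is a heuristic: a true match that is not linked to any confirmed hit through the neighbor lists would be missed.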
How did you become involved in this research, and were
there any problems along the way?
I was initially interested in protein structure prediction, where the
problem of optimizing a sum-of-pairs function comes up when aligning a
sequence, or rather a contact map predicted from it, to the contact map of
a protein with known structure.
Structure comparison by aligning two known contact maps was a similar
problem, where the correctness of the result was easy to check visually.
The structure comparison program yielded biologically interesting results
so I ended up pursuing that line of research.
Our formulation of the protein structure alignment problem belongs to a
class of problems that computer scientists call NP-complete. This means
that no known algorithm with a practical running time is guaranteed to find
the exact optimum. Therefore, the server has to compromise between speed
and robustness.
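To make the sum-of-pairs objective mentioned above concrete, here is a deliberately simplified sketch on binary contact maps. It is not the scoring function used by Dali, which works on distance matrices; the function name `sum_of_pairs_score` and the toy maps are assumptions made for illustration. The score of an alignment is accumulated over pairs of aligned positions, so the value of matching one residue depends on how all the other residues are matched.

```python
from typing import List, Sequence, Tuple

ContactMap = Sequence[Sequence[int]]  # symmetric 0/1 matrix, 1 = residues in contact


def sum_of_pairs_score(
    map_a: ContactMap,
    map_b: ContactMap,
    alignment: List[Tuple[int, int]],
) -> int:
    """Count contacts present in both maps for every pair of aligned positions."""
    score = 0
    for k, (i_a, i_b) in enumerate(alignment):
        for j_a, j_b in alignment[k + 1:]:
            if map_a[i_a][j_a] and map_b[i_b][j_b]:
                score += 1
    return score


# Toy maps (hypothetical): protein A has 4 residues, protein B has 3.
map_a = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
map_b = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

in_register = sum_of_pairs_score(map_a, map_b, [(0, 0), (1, 1), (2, 2)])
shifted = sum_of_pairs_score(map_a, map_b, [(1, 0), (2, 1), (3, 2)])
```

Shifting the alignment by one residue changes which contacts coincide, so the two alignments score differently. Because each term couples two aligned pairs, the best alignment cannot be built up one position at a time in the way it can for ordinary sequence alignment, which gives some intuition for why practical methods settle for heuristics.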
Where do you see your research leading in the
future?
Our goal is to integrate protein sequence and structural comparisons for
function prediction.
Liisa Holm, Ph.D.
Professor & Group Leader
Bioinformatics Group
Institute of Biotechnology
Department of Biological and Environmental Sciences
University of Helsinki
Helsinki, Finland