Science Watch® - Tracking Trends and Performance in Basic Research
November/December 2006



 Cambridge’s Randy J. Read: A Structured Approach to Biophysics

With the sequencing of the human genome now largely a done deal, researchers have turned their attention to proteomics, or the large-scale study of proteins, as the next great frontier in biology and medicine. This pursuit depends, in turn, on the very same technology, X-ray crystallography, that Crick and Watson used fifty years ago to elucidate the double-helix structure of DNA. Since then chemists and biochemists have solved the three-dimensional structure of tens of thousands of proteins, a number that will assuredly grow exponentially in the coming years.

Randy Read
"The projects I initiate myself tend to be things where knowing the structure of a protein tells you something about the disease," says Randy J. Read of the University of Cambridge, U.K.

The key to solving structures today is software: computer programs that use the data from protein structures already elucidated to help make sense of the X-ray diffraction patterns of structures still unsolved. And so the more protein structures elucidated, the easier it is to solve the next one. In this business of comparing and solving structures, by far the hottest program out there at present is known as Phaser, the creation of Randy J. Read and his colleagues from the University of Cambridge, U.K. And while Phaser is helping researchers solve protein structures they had long considered unsolvable, the program and its predecessors have also catapulted Read into Thomson’s Essential Science Indicators listings of the top-cited researchers in chemistry as well as in biology & biochemistry. Read co-authored a 1998 paper that has now been cited more than 8,000 times and that currently ranks #3 among all papers tracked by Essential Science Indicators over the last decade (A.T. Brunger, et al., Acta Crystallograph. Sect. D; see accompanying table, paper #1). More recently, Read’s 2004 Phaser paper, "Likelihood-enhanced fast rotation functions" (L.C. Storoni, A.J. McCoy, R.J. Read, Acta Crystallograph. Sect. D, 60: 432-8, 2004), was a fixture for the better part of a year in this publication’s regular Top Ten list of the hottest papers categorized under the heading of chemistry.

Read, 49, received his bachelor’s degree in biochemistry from the University of Alberta in 1979 and his doctorate from the same institution seven years later. In 1986, he moved to The Netherlands, where he worked as a post-doctoral fellow at the University of Groningen. In 1988, he returned to the University of Alberta, where he stayed as first an assistant and then associate professor for the next decade. Since 1998, Read has been a professor of protein crystallography at the University of Cambridge in the U.K.

Professor Read spoke to Science Watch from his office in Cambridge.

SW:  You’ve been doing protein crystallography for over a quarter-century. What was the state of the art when you started in the late 1970s?

Labor intensive. You’d expect to work for a couple of years to do a single structure for a paper. If it was moderately interesting, you could expect to publish it in Nature. And you basically knew everybody in the field, although it wasn’t like the 1960s, when everybody could get together in Austria for a skiing holiday. Now there are thousands of people in the area, and if you want a Ph.D. you need half a dozen structures and a biological story to go with them.

SW:  Was it simply improved computing technology that drove the change?

It’s actually a combination of computing technology and synchrotron radiation, which allowed you to do experiments you could never do before. Synchrotron radiation allows you to tune the wavelength, and you can do it close to where a particular element in the crystal absorbs X-rays. That changes part of the diffraction pattern so it comes from just that subset of atoms. This is called anomalous diffraction, and you can then deduce where those atoms must be, figure out the phase information, and solve the structure. It’s taken a lot of the trial and error out of solving a structure. Wayne Hendrickson at Columbia worked a lot of this out. Now, if you can grow a crystal, you can take it to a synchrotron and solve the structure while you’re sitting there. Then this ties into our program, Phaser, and the fact that as protein structure databases get bigger, more and more often you know what the structure is going to look like, because you’ve seen something like it before. Then you need only one dataset and the technique of molecular replacement to get the initial model. So it’s a combination of all these things: better algorithms, better hardware, and new experiments.

SW:  With proteomics hailed as the next frontier in molecular biology and medicine, how necessary is it to actually have the three-dimensional structure of a protein? Wouldn’t it be enough, for instance, to simply know what it binds to in signal pathways?

This may be a slightly parochial viewpoint, but most people don’t feel they really understand what’s going on in biology until they can put it into the context of the structures of the relevant proteins. The ultimate step in reductionism is to understand things at the level of what’s happening with the individual atoms. If we want to understand how proteins associate or how enzymes work, if we want to design drugs to stop certain enzymes from acting, we need the three-dimensional structures of the proteins involved. There’s a big feeling that the whole human genome project is a preliminary step to understanding how things really work; the next important step is to understand the structure of all the proteins involved in doing these things. That’s where the whole structural genomics initiative started. It’s a big thing, and that doesn’t even include other aspects such as the systems biology, which is another big thing. We want to understand how things are working. That’s why every major drug company has crystallographers in-house to look at how the drugs they’re developing bind to targets and then figure out how to make them better.

SW:  When did you start working on Phaser, and what prompted you?


Highly Cited Papers by Randy J. Read and Colleagues, Published Since 1996
(Ranked by total citations)

Rank | Paper | Citations
1 | A.T. Brunger, et al., "Crystallography & NMR system: A new software suite for macromolecular structure determination," Acta Crystallograph. Sect. D, 54: 905-21, 1998. | 8,276
2 | J.A. Huntington, R.J. Read, R.W. Carrell, "Structure of a serpin-protease complex shows inhibition by deformation," Nature, 407(6806): 923-6, 2000. | 319
3 | P.D. Adams, et al., "Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement," PNAS, 94(10): 5018-23, 1997. | 256
4 | P.I. Kitov, et al., "Shiga-like toxins are neutralized by tailored multivalent carbohydrate ligands," Nature, 403(6770): 669-72, 2000. | 245
5 | N.S. Pannu, R.J. Read, "Improved structure refinement through maximum likelihood," Acta Crystallograph. Sect. A, 52: 659-68, 1996. | 224

SOURCE: Thomson Scientific Web of Science

We started about six years ago. A couple of things motivated us. One is that I had always written my programs in FORTRAN, and I couldn’t find anybody willing to write FORTRAN. So we decided to start a new program from scratch. Also because of seeing ways in which different methods are built on the same framework, we wanted to make something that could do all these different things in one program. I was also fortunate in having a couple of really talented people on the team: Airlie McCoy, who is still at the center of Phaser developments, and Laurent Storoni, who has since moved on to a position as a lecturer.

SW:  Is there a lot of competition with programs that can be used to solve protein structures?

There are probably three or four major programs that people use to solve structures by this technique that we use, of molecular replacement. One of the more popular ones is called MOLREP; another is called AmoRe. The older programs aren’t used that much anymore.

SW:  So what does Phaser do that has so many researchers citing it left and right? Is it faster than other programs? Or does it solve structures that others don’t?

It’s not necessarily faster than the other new programs. It’s comparable in speed, maybe even a little slower, but not so slow that it stops people from using it. The key thing is that it’s more sensitive to the correct answer. It does have a better signal-to-noise than traditional methods. A lot of people switched to using Phaser because we also made it very much easier to use than the older methods. Probably a majority of structures could have been solved with other methods, but it seems that there are a significant number of cases where better methods make a difference. We were a bit surprised, actually. We had various test cases, and we could see we were getting a better signal-to-noise ratio. The answer was clearer with Phaser than with other methods. After we released it, it seemed like just about everybody I talked to had some dataset stored on a CD-ROM that they’d given up on. Then they tried to solve it with Phaser and it worked. Although it doesn’t work on everything, of course.

SW:  What is it about Phaser that allows it to solve these stubborn structures?

It’s based on new ideas using likelihood to tell how the model agrees with the data. Traditional methods are based on comparing this thing called a Patterson map, which you can calculate from X-ray intensity data, to the Patterson map calculated from a model. You have to figure out how to rotate and translate models, however, so that the calculated Patterson map agrees with the measured Patterson map. The problem with that method is that it has no way of accounting for sources of error in the whole thing. Particularly as the protein becomes less closely related to the model you have, there isn’t a good way to account for those errors. Phaser can do that, using this likelihood method we developed. It can also take better account of the prior information we have, which traditional methods can’t do. If you’re doing the structure of a protease-inhibitor complex, for instance, you might find where the protease is, but the inhibitor is a small part and hard to find on its own. The point now is that once you know where the protease is, it makes it much easier to find the inhibitor.
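As a rough illustration of the traditional Patterson-matching idea Read describes, the sketch below runs a brute-force rotation search on a made-up two-dimensional "molecule." It is a toy under stated assumptions, not the method used by Phaser or any other crystallographic program: the atom coordinates, the Gaussian peak width, the 5-degree step, and every function name are hypothetical. The "observed" Patterson peaks are simply generated from the same model at a known angle, so there is no model error at all, which is exactly the situation the traditional approach handles well.

```python
import math
from itertools import permutations


def rotate(points, angle_deg):
    """Rotate 2D points counter-clockwise about the origin."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def patterson_vectors(points):
    """All interatomic difference vectors: the peaks of a toy Patterson map."""
    return [(bx - ax, by - ay) for (ax, ay), (bx, by) in permutations(points, 2)]


def overlap_score(model_vecs, observed_vecs, width=0.3):
    """Sum of Gaussian overlaps between two sets of Patterson peaks."""
    score = 0.0
    for mx, my in model_vecs:
        for ox, oy in observed_vecs:
            d2 = (mx - ox) ** 2 + (my - oy) ** 2
            score += math.exp(-d2 / (2 * width ** 2))
    return score


def rotation_search(model, observed_vecs, step_deg=5):
    """Try each orientation and keep the one whose Patterson peaks fit best.

    A Patterson map is centrosymmetric, and in 2D inversion is the same as a
    180-degree rotation, so only angles in [0, 180) need to be searched.
    """
    best_angle, best_score = None, -1.0
    for angle in range(0, 180, step_deg):
        vecs = patterson_vectors(rotate(model, angle))
        score = overlap_score(vecs, observed_vecs)
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle, best_score


# Hypothetical four-atom model and a "crystal" containing it rotated by 40 degrees.
MODEL = [(0.0, 0.0), (1.5, 0.2), (2.1, 1.7), (0.4, 2.9)]
TRUE_ANGLE = 40
observed = patterson_vectors(rotate(MODEL, TRUE_ANGLE))

best_angle, best_score = rotation_search(MODEL, observed)
print(f"best orientation: {best_angle} degrees (score {best_score:.2f})")
```

Because the observed peaks here come from a perfect copy of the model, the search recovers the true orientation. The overlap score has no term for how wrong the model might be, which is the limitation Read points to.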

SW:  When you’re not creating the next generation of structure-solving programs, or solving structures for others, how do you choose what to work on yourself?

The projects I initiate myself tend to be things where knowing the structure of a protein tells you something about the disease. One of the themes of my research, for instance, is trying to understand bacterial toxins. We’ve worked on the structure of pertussis toxin, which, from the point of view of vaccine design, can help us come up with a better whooping cough vaccine, one that causes fewer side effects. It turns out that the main side effects of the whooping cough vaccine are caused by the pertussis toxin itself. The other one we’ve worked on is the Shiga-like toxin from E. coli, the bacteria that cause food poisoning. If you eat meat contaminated with certain strains of E. coli, you end up with blood in your stools and possibly with kidney damage. Very old and very young people can die from the effects. We’re interested in the structure of the toxin to try to figure out ways of intervening and stopping this effect.

SW:  Once you have the structure, how does that help you come up with an intervention?

In the case of pertussis, it was actually fairly easy to come up with things that make a better vaccine candidate. You look at the structure, figure out what residues there are in the active site, and then change a few of those residues to make an inactive enzyme. It still looks the same to the immune system, but now the side effects may be reduced. In the case of the Shiga-like toxin from E. coli, we teamed up with some very talented carbohydrate chemists and came up with a compound designed to bind with very high affinity to the toxin. It turns out that the toxin binds to carbohydrates on the cell surface. The binding part of the toxin is a pentamer, meaning it has five-fold symmetry. One of the chemists, Pavel Kitov, came up with a very clever five-fold symmetric molecule, called a starfish molecule, which binds with sub-nanomolar affinity to the toxin. The jury is still out whether that will be effective as an intervention, but it works in principle.

SW:  What’s the biggest challenge these days facing protein crystallographers? In other words, what needs to be solved to move the field ahead?

I think it’s now come down to getting the crystals: finding the right set of conditions in which the protein will make well-ordered crystals that will diffract well. Once you’ve got the crystal, the techniques for solving the structure have really become quite powerful. So getting enough protein and crystallizing it is probably the single biggest challenge. This part is still a bit of an art, and although people have been trying to make it into more of a science, and although they’re making progress, it’s not there yet. There is a whole area of structural genomics devoted to high-throughput crystallography and expanding the number of known structures as quickly as possible. A lot of the throughput technologies are devoted to looking at this crystallization question: making slight variations on proteins, using robotics to deal with the expression and purification of lots of different versions of the proteins, then monitoring the crystal growth, etc.

SW:  You’re one of the co-authors on a 1998 paper, "Crystallography & NMR system: A new software suite for macromolecular structure determination," which has now been cited a remarkable 8,200 times. Tell us about this paper and why this technology is so popular.

That’s a program called CNS that was initiated by Axel Brunger and which allows for the use of other structure-determination methods, such as NMR spectroscopy and electron microscopy. Axel envisioned it as a replacement for the program X-plor that he’d developed earlier, and that’s what it’s become. He started that at just about the time we developed something related to the work we’re doing on molecular replacement, again with maximum likelihood but applied to the problem of refining the structure to get a model that agrees better with the data. Our way of doing this was better than the traditional method, using what are called least-squares targets, and so we got involved in this whole big collaboration to replace X-plor. And X-plor itself was very popular, so when the new CNS program came out, many of the users switched over. A lot of people contributed a lot of good ideas to it, and it’s a very powerful program. It really took off, especially in the States. There’s an interesting divide, in that people in Europe tend to use programs developed in Europe that do similar things.
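The contrast between a least-squares target and a likelihood target can be sketched with a deliberately simple one-parameter fit. This is only a toy, assuming independent Gaussian errors with known standard deviations; the actual refinement targets in CNS are considerably more elaborate and are not reproduced here. All numbers and function names below are hypothetical. The task is to estimate a scale factor k relating calculated values to observed ones when one observation is known to be far less reliable than the rest.

```python
import math


def least_squares_scale(obs, calc):
    """Scale k minimizing sum((obs - k*calc)^2), every observation weighted equally."""
    return sum(o * c for o, c in zip(obs, calc)) / sum(c * c for c in calc)


def max_likelihood_scale(obs, calc, sigmas):
    """Scale k maximizing a Gaussian likelihood with per-observation sigma.

    Under this error model the likelihood down-weights unreliable
    observations by 1/sigma^2.
    """
    num = sum(o * c / s ** 2 for o, c, s in zip(obs, calc, sigmas))
    den = sum(c * c / s ** 2 for c, s in zip(calc, sigmas))
    return num / den


def neg_log_likelihood(k, obs, calc, sigmas):
    """Negative log of the Gaussian likelihood; lower means better agreement."""
    return sum(
        0.5 * ((o - k * c) / s) ** 2 + math.log(s * math.sqrt(2 * math.pi))
        for o, c, s in zip(obs, calc, sigmas)
    )


# Hypothetical data: the true scale is 2.0, and the last observation is
# both badly off and flagged as unreliable (large sigma).
TRUE_K = 2.0
calc = [1.0, 2.0, 3.0, 4.0]
obs = [2.0, 4.1, 5.9, 12.0]
sigmas = [0.1, 0.1, 0.1, 4.0]

k_ls = least_squares_scale(obs, calc)
k_ml = max_likelihood_scale(obs, calc, sigmas)
print(f"least squares:      k = {k_ls:.3f}")
print(f"maximum likelihood: k = {k_ml:.3f}")
```

In this toy case the unweighted least-squares estimate is pulled toward the unreliable point, while the likelihood estimate stays close to the true scale because the error model tells it how much to trust each observation.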

SW:  Is CNS the last word in this? Is there something coming to replace it?

I’m involved in another collaboration, which involves some of the same people and is being funded by the NIH. This is a new project called PHENIX. The guy coordinating that is Paul Adams, who was on the CNS paper. The idea is to really automate the whole process: from having a diffraction dataset to having a model you can deposit at the Protein Data Bank, and do it all with the most advanced computing tools. With CNS, the program had its own scripting language. It defined the sort of programming language you could use to solve structures. PHENIX will use a scripting language called Python, which is much, much more powerful.

SW:  Why "Python"?

The guy who developed it was a big Monty Python fan. All of the examples have to do with spam and eggs.

SW:  And when do you think it will be finished?

We’re just in the process of finalizing the new release. The package is not doing everything we want it to do yet; it’s still a work in progress. But we think the version about to come out will be very powerful and will be something people will choose to use.

Science Watch®, November/December 2006, Vol. 17, No. 6
Citing URL: http://www.sciencewatch.com/nov-dec2006/sw_nov-dec2006_page3.htm

(c) 2008 The Thomson Corporation.
Thomson Scientific