Science Watch® - Tracking Trends and Performance in Basic Research
July/August 2007



Yale’s Josephine Hoh: Vision on Genes and Disease


At the center of the retina is a circular area of yellow photoreceptor cells that play a critical role in our central vision. This area, known as the macula, commonly deteriorates as we get older, causing a vision-impairment disorder known as age-related macular degeneration, or AMD, which afflicts some 10 million Americans. If you were a young researcher wanting to make a name for yourself, one way to do it might be to find the gene or genes that predispose us to AMD. Another way might be to pioneer a new method for identifying the genes involved in such common chronic diseases.

When Josephine Hoh, a statistician turned genetic epidemiologist, managed to do both, just three years out of her post-doc, the results were nothing short of spectacular. In April 2005, Hoh and her colleagues published "Complement factor H polymorphism in age-related macular degeneration" in Science, simultaneously demonstrating that a technique known as a genome-wide association study could be remarkably effective in pinpointing genes related to common diseases. As a result, Hoh has become a hot researcher in biomedicine, with her Science article garnering over 300 citations in two short years (see table below) and earning the #4 spot in this issue’s Medicine Top Ten.


"It seems that researchers are now using this approach on every common human disease," says Josephine Hoh of Yale University, New Haven, Connecticut.

Hoh received her bachelor of science degree in mathematics from the National Tsing-Hua University in Taiwan. She then spent several years as a research and teaching assistant at Rutgers University and worked as a statistician in the pharmaceutical industry while pursuing her Ph.D., which she received from Rutgers. This was followed by a year as a post-doctoral fellow at Columbia University, and then a four-year stint working in the laboratory of Jurg Ott at Rockefeller University. In 2003, she joined the faculty in the Department of Epidemiology and Public Health at Yale University, where she is now an associate professor.

Hoh spoke to Science Watch from her Yale office in New Haven, Connecticut.

SW: How did you make such a successful transition from mathematics and biostatistics to genetic epidemiology?

When I went to work as a post-doc with Jurg Ott at Rockefeller, I really had to start from scratch in biology. So, while I was working as a researcher, I taught myself by going to classes and attending a seminar series. Because Dr. Ott is a statistical geneticist, the effort in his lab involves collaborating with other people; other labs generate the data and his lab develops methods to analyze it, and that’s what I was doing. We invented a lot of new approaches to genetic-data analysis, but I consider my main achievement at that time to have been learning a lot of biology and learning the logic of biology, that is, how biologists think. Although I’d have to say I’m still learning that.

SW: And how did that blossom into research on AMD?

When I moved to Yale, to the Department of Epidemiology and Public Health, they were doing classic epidemiology, although a few senior faculty recognized that genetics and molecular biology promised to transform the field. I was encouraged to apply for an R01 grant from NIH, and was subsequently awarded the grant to study new computational methods that would address questions such as which genes have to be transcribed to generate a particular tissue type: say, what makes heart heart, why liver is liver, why muscle is muscle. A problem in these studies is that liver or muscle tissue or most other tissues comprise several different cell types. That’s a very difficult situation to study in computational analysis. So I talked with some biologists, who suggested I study a tissue called retinal pigment epithelium, or RPE for short, which is all one single cell type. The other interesting aspect of this is that when RPE is defective or dies, the overlying light receptors languish and die too, leading to macular degeneration.

So I was lucky to get the grant and to work on these methods of transcription gene analysis. But I soon found it problematic to use data generated from various people’s laboratories, where each dataset was probably produced under different experimental conditions and environments. Thus, different datasets contained inconsistent patterns, which are difficult to discern. Thinking that this would not lead me anywhere, I got frustrated. I wanted to explore attacking the problem from another angle, perhaps from the direction of human genetics, to find the genes associated with AMD first. If I could do that, I could then come back to RPE to see whether and how these AMD genes are involved in making RPE cells distinct.

SW: How did you approach that problem?


Highly Cited Papers by Josephine Hoh, Published Since 2001
(Ranked by total citations)

Rank | Paper | Citations
1 | R.J. Klein, et al., "Complement factor H polymorphism in age-related macular degeneration," Science, 308(5720): 385-9, 2005. | 316
2 | J.J. Liu, et al., "A genomewide screen for autism susceptibility loci," Am. J. Hum. Genet., 69(2): 327-40, 2001. | 127
3 | J. Hoh, A. Wille, J. Ott, "Trimming, weighting, and grouping SNPs in human case-control association studies," Genome Res., 11(12): 2115-9, 2001. | 87
4 | J. Hoh, J. Ott, "Mathematical multi-locus approaches to localizing complex human trait genes," Nature Rev. Genet., 4(9): 701-9, 2003. | 72
5 | J. Hoh, et al., "The p53MH algorithm and its application in detecting p53-responsive genes," PNAS, 99(13): 8467-72, 2002. | 72

SOURCE: Thomson Scientific Web of Science

We used an approach called genome-wide association mapping. To do this, we needed DNA samples. I contacted the ophthalmologist at the National Eye Institute (NEI) to see if I could get DNA samples from both cases and controls—people with macular degeneration and people without. The approach was very attractive, for if it was successful, we would establish a simple way for finding genes involved in diseases. It’s not a Mendelian approach, where you’re sifting through family pedigrees, which are normally hard to obtain, to track down the responsible genes. Here we were using unrelated cases and controls to find the genes involved in common diseases.

SW: Why hadn’t this been done before?

There had been theoretical analyses suggesting that in order to find genes with a case-control analysis, you would need an inordinate number of samples, both cases and controls. Thousands, probably. So, few people had pursued it. There was one "Insight" article, which is very famous, by Neil Risch in Nature in 2000, in which he suggested that a better way to go might be to collect family samples and use relatives as controls (N.J. Risch, Nature, 405[6788]: 847-56, 2000). For certain, you’d have well-matched controls. But collecting samples is very difficult, especially for age-related diseases.

SW: How did you get around that problem?

In a sense, I did nothing more than apply Risch’s concept of well-matched cases and controls. The cases and controls I actually used were collected as part of a clinical trial, the Age-Related Eye Disease Study (AREDS). This was a big NIH-sponsored multi-center clinical trial looking at the effect of zinc, beta-carotene, and other supplements on macular degeneration. It’s a two-way factorial design, involving more than 4,000 subjects around the U.S. And the lucky thing was that they had taken DNA samples from both cases and controls and stored them in a cell repository along with the informed consents of the DNA donors. So we had the luxury of selecting a subset in which the cases were well matched to the controls, which is exactly what I needed. I was also fortunate that the two principal investigators at the NEI, Rick Ferris and Emily Chew, were extremely cooperative and supportive, although they were also skeptical of this approach and probably of me, too, as a junior person just coming into the field. We knew that people had been looking for these genes for 20 years or more without success.

SW: What do you consider the most important factor in succeeding at this kind of case-control approach?

The well-chosen cases and controls. In my opinion, given that you have enough resources, the most important factor is how you select your cases and controls. There were 4,000 cases and 4,000 controls in AREDS to choose from, a rich resource. We decided to carefully choose the very severe cases to begin with. These patients had more deposits in the back of the retina, called drusen, which ophthalmologists can see during eye exams. Also, AMD has two forms: a dry form and a wet form. Most Caucasians who get AMD have the dry form, which progresses slowly. The wet form is less common in this ethnic group and often progresses very quickly. In the wet form, new capillaries grow into and through the RPE, and blood leaks into the retina through the RPE. In severe cases, both forms have very large drusen. So what I took from AREDS was 50 severe cases of the dry form, 50 of the wet form, and 50 very well-matched controls. Alice Henning, a very good statistician in AREDS, did the match based not only on age, gender, and smoking status, but also on the color of the eyes. We didn’t want to find out at the end, for instance, that we had identified the gene for blue eyes. She also matched all kinds of possible indirect environmental factors. So we started off with very carefully chosen and matched cases and controls, and I think that was one key to the success of this whole study.

SW: Is this the kind of study that could only be done in the past few years because of new advances in genetic technology? Or could it always have been done?

At the time, people were recommending the use of microsatellite genotyping, using chips that have about 400 markers across the genome. I thought, however, that since all previous studies on AMD genetics had used that technology, it might not take me too far from those previous results. I’d heard about the projected availability of genome microarray chips, the so-called SNP chips. An SNP is a single-nucleotide polymorphism, a single nucleotide in DNA that has two versions, or alleles. These chips weren’t even on the market yet, but I subsequently contacted Affymetrix, and they agreed to sell me 150 chips in their early-access program.

SW: Was it expensive?

For each individual sample I spent a lot of money—about $2,000. I kept applying for grants from NIH and getting rejected. Finally, I convinced the Raymond and Beverly Sackler Foundation to support me.

SW: Why was NIH rejecting you?

Well, there were reasons for rejection each time. The first time, for example, the criticism was that there was no hypothesis in my proposed approach and, hence, it was a "fishing expedition." The second time, the sample size was judged to be too small. The standard way to hunt for disease genes at that time was to look first at possible biochemical pathways and guess a few dozen genes of the genome that might vary between patients and controls. The fallback strategy was to do family studies. Since nobody had ever succeeded with a case-control study, and since our proposed sample size was not thousands, but 100 cases in two groups (dry and wet) and 50 controls, almost everyone believed we would fail to find anything of interest.

SW: How difficult was it to do the ultimate analysis once you had the SNPs?

For each individual, we had more than 100,000 SNPs of data, so that’s a huge amount for most biostatisticians. This was the easiest and most exciting part of the study for me. And because the initial design was so well conceived, once we organized the data we didn’t need any sophisticated methods whatsoever; the conclusions almost leaped out and we really found the marker for the disease on the first pass. The marker was a genetic variant in a region near the disease-causative variant that either codes for protein or regulates the function and action of the gene. But once we had the marker, we sequenced the nearest genes, and that led us to the protein-altering mutation that we reported. We also tried to locate the gene in the eye and, sure enough, saw that this gene was made in RPE cells. I have to say it also helped that I was collaborating with really good people: not only Emily Chew and Rick Ferris with AREDS, but Robert Klein, who was a new post-doc in Jurg Ott’s lab and who earned his spot as first author on the Science paper. He’s really good and very, very smart.
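To make the idea of a simple first-pass case-control scan concrete, here is a minimal sketch in Python. It is an illustration only, not the analysis used in the Science paper: it assumes a plain allelic chi-square test on each SNP with a Bonferroni correction for the number of SNPs tested, and the function names, SNP labels, and allele counts are all hypothetical.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles of one SNP.
    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if min(row_case, row_ctrl, col_a, col_b) == 0:
        return 0.0, 1.0  # no variation in a row or column: nothing to test
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_hits(snp_counts, alpha=0.05):
    """Return (snp_id, chi2, p) for SNPs passing alpha / (number of SNPs tested)."""
    threshold = alpha / len(snp_counts)
    hits = []
    for snp_id, counts in snp_counts.items():
        chi2, p = allelic_chi_square(*counts)
        if p < threshold:
            hits.append((snp_id, chi2, p))
    return sorted(hits, key=lambda hit: hit[2])  # most significant first


# Hypothetical allele counts: (case allele A, case allele B, control A, control B).
# 100 cases contribute 200 alleles and 50 controls contribute 100 alleles per SNP.
toy_counts = {
    "snp_A": (120, 80, 30, 70),
    "snp_B": (100, 100, 50, 50),
    "snp_C": (110, 90, 50, 50),
}

for snp_id, chi2, p in bonferroni_hits(toy_counts):
    print(snp_id, round(chi2, 2), p)
```

Under these assumptions, a scan of roughly 100,000 SNPs would use a per-SNP threshold of 0.05 / 100,000 = 5e-7, which is why only a marker with a large difference in allele frequency between cases and controls would stand out in a sample of 100 cases and 50 controls.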

SW: Is the gene only expressed in the eye?

No, this gene is everywhere, mostly in the blood. It’s mainly produced in the liver, but also in RPE. And the mutation we identified actually changes the protein—this complement factor H (CFH). It’s a complement-system regulator. In mammals, the complement system is involved in what’s called innate immunity. CFH modulates the activities of the pathway that the immune system can use to kill pathogens coming into a host cell. It’s a regulator of the killer complement molecules that kill pathogens outside cells.

SW: Do you know what role it plays in AMD?

No. That’s my current research focus. I really want to figure out what’s going on. There are hypotheses, but nobody has solid evidence.

SW: Do you think your paper is being cited so frequently because of the CFH discovery, or because researchers are citing it for this case-control technique you used?

A combination of both: probably half of the citations come from the AMD field and the other half from human genetics. Research on AMD has flourished since, but I think the most significant effect of the research has been on other common diseases, such as diabetes, heart disease, hypertension, and schizophrenia. It seems that researchers are now using this approach on every common human disease. As I noted, it’s called a whole genome-wide association study, and it even has an acronym: GWA. So the impact is on much more than just AMD research. If you read Science and Nature regularly, you’ll see that practically every few months a new GWA study is being published. For example, Judy Cho at Yale just published a major GWA on genetic predisposition to Crohn’s disease. Another paper using the same GWA strategy just came out on diabetes, and another was recently published on obesity.

SW: Has your life changed dramatically since the publication of the Science paper?

In general, not really, but perhaps it’s provided a good extension of my scientific capacity, from a number-crunching statistician to a statistical geneticist who can discover genes linked to human disease. And now it’s time to reinvent myself again, to become a biologist who can address, for example, the mechanisms underlying the progression from genes to disease. I just need to work harder than before.


Science Watch®, July/August 2007, Vol. 18, No. 4
Citing URL: http://www.sciencewatch.com/july-aug2007/sw_july-aug2007_page3.htm

(c) 2008 The Thomson Corporation.
Thomson Scientific