According to a recent analysis of Essential Science Indicators data from
Thomson Reuters, the work of Dr. Steve Porcella has entered the top 1% in
the field of Microbiology. Dr. Porcella's current record in this field
includes 15 papers cited a total of 648 times between January 1, 1998 and
August 31, 2008. His overall record for this period in the database
includes 37 papers cited a total of 1,167 times.
Dr. Porcella is the Chief of the Research Technologies Section at Rocky
Mountain Labs, which is part of the National Institute of Allergy and
Infectious Diseases (NIAID), and oversees the RTS Genomics Unit within that
section.
In the interview below,
he talks with ScienceWatch.com about his highly
cited work.
Would you tell us a bit about your educational
background and research experiences?
I have a Ph.D. in Microbiology from the Division of Biological Sciences at
the University of Montana, where I worked with Dr. Ralph Judd. I was
fortunate to perform some of my graduate research at Rocky Mountain Labs
(RML), NIAID, NIH, where I worked on Neisseria gonorrhoeae and worked with
a "new at the time" slab acrylamide 370S DNA sequencer. A post-doctoral
appointment in the Department of Microbiology at the University of Texas
Southwestern Medical School in Dallas led to my work on Treponema
pallidum and Borrelia burgdorferi and management of a 377 DNA
sequencer. A second post-doctoral appointment under Dr. John Swanson at RML
allowed me to collaborate and work extensively with Dr. Tom Schwan on
Borrelia hermsii, Borrelia burgdorferi, and Borrelia
turicatae. All of my postdoctoral work involved genomics approaches,
protein work, bioinformatics, and nucleic acid work.
Also at RML, but while under Dr. James Musser, now at the Methodist
Hospital Research Institute in Houston, I worked on Streptococcus
pyogenes. Dr. Musser tasked me with setting up an ambitious genomic
sequencing, microarray (spotted), and Q-PCR technology group at RML.
After Dr. Musser's departure, this technology group officially became the
RML Research Technologies Section (RTS), part of the intramural NIAID
Research Technologies Branch under Branch Chief Dr. Robert Hohman. Today I
am the Section Chief of the RTS, which now comprises three technology
Units—Genomics, Electron Microscopy, and Flow Cytometry. We have over
20 individuals in the RTS, ranging from highly skilled and trained
technical staff members to student researchers.
I spend much of my time working closely with the Genomics Unit, where the
molecular challenges are enormous, the workload always growing, and the
range of projects very diverse. The RTS Genomics Unit currently provides
technology support for high throughput (HT), high-density microarrays (all
custom and commercial Affymetrix products), HT-DNA sequencing (capillary,
454, SOLiD), HT-Robotics, HT-Q-PCR, HT phenotype microarray, HT patient
cohort and pathogen genotyping, and bioinformatics for all of the above.
Our goal within the RTS is to provide advanced technology support, from
experimental design to publishable figures, to all intramural investigators
within the NIAID. These scientists are all actively pursuing the NIAID
mission of performing basic and applied research contributing to the
development of diagnostic reagents, therapeutics, and vaccines against
human pathogens and human disease.
What would you say is the main focus of your research,
and what drew you to this area?
My main focus is genomics, both at the host and pathogen level. Many of the
investigators that we support are specifically interested in either the
host response to infection or the pathogen response during the infectious
process, and sometimes both at the same time. We also support those working
on genetic or immunological disorders, the latter of which can lead to
life-threatening infections. Last, but not least, we also work with labs
studying vectors such as fleas, ticks, mosquitoes, and their roles in the
transmission of disease. Our support of pathogen work also focuses on
analysis of newly discovered pathogen isolates, host genetics, host
susceptibility to disease, and pathogen genetics.
We work with many different scientists, labs, models, pathogens, and
patient cohorts, across the NIAID Intramural program. Through this
diversity, we have developed novel protocols and technologies specific to,
or compatible across, many different models/systems. We routinely perform DNA
sequencing (single clone to genome level), RNA/DNA isolation, high
throughput microarray and the attendant data analysis, high throughput
Q-PCR for sample quantitation or microarray dataset validation, human or
pathogen genotyping, and complex genomic or statistical microarray
bioinformatics.
Several keys to these efforts are as follows:
1) Performing pathogen genome sequencing to high quality and deep
annotation, while providing a myriad of comparative analytical efforts, all
with the goal of efficiently communicating the complexity of the data to
the investigators.
2) Rapidly incorporating genome-level data into new custom expression
arrays where we can quickly perform Comparative Genomic Hybridization
(CGH), or determine differential expression profiles under different
conditions, over time, or under different treatments. At this time, we have
developed 8 custom Affymetrix arrays containing more than 60 different
pathogens.
3) Understanding the fundamentals of sound, randomized, multiple-replicate
experimental design is critical in order to provide the best data possible,
free from batch effects or other artifactual external influences.
Typically, we prefer six biological replicates per time point or condition,
as this gives us the best numbers for statistical significance in the end,
but also allows for loss of one to three samples while still maintaining
numbers appropriate for analysis. Paramount is that the experiment to be
performed will provide the data for the question being asked. Equally
important to all of this is communicating effectively with investigators so
that the wet-bench work, RTS efforts, and bioinformatics are all using
consistent information.
4) When accomplished, the processes described above lend themselves to
rapid Q-PCR validation efforts where we often see high concordance (a
statistically significant correlation) with the chip-based expression data. This latter
effort enables researchers to move quickly to publication, or to pursue
other experiments for confirmation (protein-based, for example).
5) We provide all the support needed to get data into the public domain and
fast-track the acquisition of accession numbers for publication.
6) Generation of publication-quality tables and figures. Because our RTS
scientists have worked with the technology every day, from start to finish,
for many years now, they understand the data and its interpretation better
than anyone. Therefore, with the principal investigator's guidance, we are
capable of producing any figure or table needed to efficiently tell the
story in a publication format. This extra effort by my group, which we make
a priority, is the reason why so many collaborative papers are being
produced.
7) Staying on top of the latest technology. This is a no-brainer, and as an
example, we have recently acquired a SOLiD sequencer and are aggressively
pursuing this technology for many of the topics described above.
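The randomized, multiple-replicate design in point 3 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the RTS's actual procedure: the function names, the condition labels, and the choice of three processing batches are hypothetical. The idea is to shuffle the six biological replicates of each condition and spread them evenly across batches, so that no batch is confounded with a single condition.

```python
import random
from collections import Counter


def assign_batches(conditions, replicates=6, n_batches=3, seed=0):
    """Randomly assign samples to processing batches, balanced by condition.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    assignment = {}
    for condition in conditions:
        samples = [f"{condition}_rep{i + 1}" for i in range(replicates)]
        rng.shuffle(samples)  # randomize which replicate lands in which batch
        # Random starting batch, so uneven splits do not always favor batch 0.
        offset = rng.randrange(n_batches)
        for i, sample in enumerate(samples):
            assignment[sample] = (i + offset) % n_batches
    return assignment


def batch_counts(assignment):
    """Count how many samples of each condition fall in each batch."""
    counts = Counter()
    for sample, batch in assignment.items():
        condition = sample.rsplit("_rep", 1)[0]
        counts[(condition, batch)] += 1
    return counts


plan = assign_batches(["wild_type", "mutant"], replicates=6, n_batches=3, seed=42)
counts = batch_counts(plan)
```

With six replicates and three batches, every batch receives two samples of each condition, so losing a sample or even a whole batch still leaves both conditions represented.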
Finally, what drew my interest to this area? It is simple—early in my
graduate career I saw that genomic technologies or approaches could give
many answers to phenotypic observations/questions. This concept is
especially important for research on intractable, hard-to-study pathogens
or host-pathogen inter-relationships. Microarray research is basically an
extension of the genomic record, as are Q-PCR (another tool),
genotyping, and phenotype arrays.
Bioinformatics is challenging, but our philosophy has always been, "focus
on the downstream bioinformatics, while we are pursuing the implementation
of a new technology, with the goal of preventing a bottleneck downstream."
It is this approach, I believe, that has allowed us to rapidly and
efficiently get new technologies up and running at high accuracy within
reasonable timeframes and meet nearly all investigators' needs.
Also, DNA and, to a lesser extent, RNA are fairly easy and stable to work
with compared with proteins. Therefore, high throughput methodologies
enable rapid and extensive scientific discovery under the right conditions
across many different labs, models, and experimental questions. Finally,
there is nothing like the big data rush, when you are the first to see
those discoveries come and you communicate that to the investigators.
Your most-cited original paper in our database is
the 2002 PNAS article, "Genome sequence and comparative
microarray analysis of serotype M18 group A Streptococcus strains
associated with acute rheumatic fever outbreaks." Would you walk our
readers through this paper, its goals, findings, and
significance?
This was a landmark paper because we sequenced a complex bacterial genome
in-house at a time when only big sequencing centers were taking on these
labor-intensive projects. At the time, only one other Group A Streptococcus
(GAS) genome had been sequenced, that of an M1 isolate, genetically
distinct from M1 strains commonly responsible for GAS infections. M18
strains had been associated for decades with Acute Rheumatic Fever (ARF)
outbreaks in the US. As soon as we finished the M18 genome we made a
spotted array of it and hybridized DNAs from 36 M18 strains cultivated from
diverse localities. The results showed few differences in gene content;
however, phage and phage-like elements turned out to be the primary source
of variation. These array data were further supported by high-throughput
comparative gene sequencing of 500 isolates collected during the same ARF
outbreaks.
Therefore, horizontal gene transfer events were described as being
important if not critical sources of genomic diversity among M18 strains.
The data also showed that M18 strains recovered from two ARF outbreaks 12
years apart in Salt Lake City (SLC) were nearly genetically identical, and
therefore the increase in ARF cases in SLC in 1998-1999 was associated with
a resurgence of an M18 clone common in 1987-1988 in the same area.
An interesting point: shortly after this paper was published, we started a
seven-genome sequencing project. To my knowledge, this was the most anyone
anywhere had undertaken in terms of multiple, highly related isolates
sequenced at the same time. Cross-contamination was a big fear during the
sequencing phase of all isolates, so we were meticulous in protecting
against it. A company, Integrated Genomics out of Chicago, was instrumental
in accomplishing that goal. Many of the subsequent papers we published on
Streptococcus were a direct result of that multi-genome sequencing
effort.
Judging from your papers in our database, you do quite a
bit of work with group A Streptococcus. What is it about this organism
that warrants such interest? What other organisms do you work
on?
Currently, my group is not actively doing work on Group A Streptococcus
pyogenes (GAS); that effort tapered off greatly when Dr. Musser took
an appointment at Methodist Hospital. My group does do an enormous amount
of work with many different bacterial pathogens described below. However,
Group A Streptococcal infections are a huge public health problem in
developed and underdeveloped countries around the world. A half million
deaths occur worldwide each year due to GAS infections. Globally, more
than 600 million cases per year can be attributed to GAS infections. In the
US alone, in 2000, acute pharyngitis was responsible for 11 million office
visits. Invasive GAS infections can occur at a rate of approximately 3.5
per 100,000 with roughly 1,500 of those resulting in death. A vaccine
remains elusive and more research and funding are needed to reduce the
impact of this very important human pathogen in the US and the world.
Pathogens my group has worked on or is currently working on, in terms of
providing technical support, are as follows (I won’t list all the
species or strains): Borrelia, Coxiella, Chlamydia, Rickettsia,
Francisella, Granulibacter, Salmonella, Burkholderia, Mycobacteria,
Yersinia, Staphylococcus, Brucella, Giardia, Malaria, Langat virus,
prions, and a host of other viruses. We are also providing technical
support on whole-tissue or flow-sorted cell analysis for human, mouse, rat,
hamster, and guinea pig models, and we are also devoting a lot of effort to
individuals with genetic immune disorders, particularly those individuals
whose mutations are not known but which may be life-threatening.
You recently co-authored a paper in Infection and
Immunity, "The Chlamydia trachomatis plasmid is a
transcriptional regulator of chromosomal genes and a virulence
factor." Would you tell our readers a little bit about this
research?
I frequently use this paper as an example of the type of project that my
group regularly contributes to. Normally, Chlamydia trachomatis
contains a 7.5 kb cryptic plasmid of unknown function. Drs. Carlson and
Caldwell had an isolate that lacked this plasmid. Growth kinetics, plaquing
efficiency, and plaque size all showed no difference between plasmid-minus
and plasmid-containing strains. Failure to accumulate glycogen granules
was the only major, observed phenotype of the plasmid-minus strain. Based
upon some preliminary Q-PCR, it was known at what time during the
infectious cycle glycogen synthase (enzyme involved in glycogen synthesis)
was maximally expressed in the wild type relative to the mutant.
Therefore, a microarray experiment was designed to look at six biological
replicates of each condition (wild type and mutant), including tissue culture
cells only (Chlamydia is an obligate intracellular parasite) at that time point.
One key hurdle with obligate parasites and microarrays is that you are
dealing with a mixed sample, usually with host RNA in great abundance
relative to the pathogen. Accurate quantitation of the Chlamydial RNA, in a
host RNA background, is also inherently difficult for the following reason:
Chlamydia is a biphasic organism, with an active replication phase and
a dormant spore-like phase. No true constitutively expressed gene exists
for Q-PCR quantitation methods. Therefore, we extracted RNA and DNA
simultaneously (to use the DNA for normalization and as a measure of
genomic equivalents). We tested by Q-PCR three different
"constitutive-like" genes appropriate for the replicating phase,
approximated the amount of pathogen RNA that was there, and ran the chips.
We like to use PCA plots of the chip data to demonstrate that replicates
group more closely with each other than across conditions, and that was
shown in the paper.
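The normalization idea described above, using co-extracted DNA as a measure of genome equivalents, can be sketched in Python. This is a hedged illustration only, not the method used in the paper. It assumes a conventional Q-PCR standard curve of the form Ct = slope * log10(copies) + intercept, and it assumes the same curve applies to both the RNA-derived and DNA-derived measurements. The function names and numeric values are hypothetical.

```python
def copies_from_ct(ct, slope, intercept):
    """Convert a threshold cycle (Ct) to a copy number via a standard curve.

    Assumes Ct = slope * log10(copies) + intercept (illustrative assumption).
    """
    return 10 ** ((ct - intercept) / slope)


def transcripts_per_genome(rna_ct, dna_ct, slope, intercept):
    """Express pathogen transcript abundance per genome equivalent.

    rna_ct: Ct of the target measured from the RNA (cDNA) fraction.
    dna_ct: Ct of the same target measured from the co-extracted DNA.
    """
    rna_copies = copies_from_ct(rna_ct, slope, intercept)
    genome_copies = copies_from_ct(dna_ct, slope, intercept)
    return rna_copies / genome_copies


# Hypothetical values: slope of -3.32 corresponds to roughly 100% efficiency.
ratio = transcripts_per_genome(rna_ct=25.0, dna_ct=28.32, slope=-3.32, intercept=38.0)
```

Dividing by genome copies ties the estimate of pathogen RNA to the number of organisms present, rather than to total RNA, which in a mixed sample is dominated by host material.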
Venn diagrams, quality filters, and statistical analysis showed 29 genes
passed all criteria for significant differential expression. Twenty-two of
these genes were encoded on the chromosome, including glycogen synthase,
while the others were on the plasmid. ID-50 experiments in mice confirmed
that the plasmid appears to play a significant role, perhaps more so in
vivo, with regard to virulence and gene expression control. This is a
very important study that opens up a wide range of research opportunities
on this important human pathogen and it opens other avenues of exploration
in other organisms where cryptic plasmids are known to exist.
How has the scope of our knowledge with regard to
bacterial genomes, genomics and new technologies changed over the past
decade?
When the first fluorescent DNA sequencer became available, no one could
have guessed how much progress would be made from those early days. The
growth and development of genomics technologies has almost been logarithmic
in its pace. There have been pitfalls, and manufacturer mistakes, but by
and large the strides made have been far-reaching and with great
demonstrated impact. A question some ask is, "Are we better off with these
technologies in-house and operating 24/7 in support of biomedical
research?" Cost, resources, and complexities are valid arguments against
them. My group and I believe 100% that the benefit greatly outweighs the
drawbacks. The discoveries and efforts associated with these technologies
would not have been possible with older methodologies or techniques. My
group strongly believes that the best is still to come. With new
technologies coming and computing infrastructure catching up, future
endeavors look very promising.
A saying we use here is that there is a genomics technology revolution
taking place. Never before has such technology been available to answer so
many important and wide-ranging questions about pathogenesis and human
disease. When you look at the last 200 years of research and compare it to
the last 10 years, the recent accomplishments are mind-boggling. We must
never lose sight of this recent impact in biomedical research. My group
puts a lot of effort into staying on that cutting edge of the revolution.
We are taking advantage of the technologies available and those soon to be
released. We are aggressively pursuing their full implementation towards
the discovery of new diagnostics, therapeutics, and vaccines in human
disease.
What are your hopes for this field for the
future?
My fundamental hope is that computing and bioinformatics keep pace with
the development of new, cutting-edge genomics technologies. Another very
important factor is the chemistry and high throughput aspects of these
technologies. Protocols or techniques can be modified or improved to give
better yields, higher-quality data, more reproducible results, higher
throughput (96-, 384-, or 1,536-well formats), and hopefully all at lower costs.
Chemistry improvements must go hand-in-hand with technology developments
and an effort always needs to be made to reduce costs. We are very
interested in single molecule sequencing, as current next-generation
technologies require substantial input amounts of material. We are also
interested in higher density arrays, in smaller formats, again to reduce
sample amount, which for much of our work is a constant issue. Better and
more deeply annotated genomes, and genome updates, are something that will
constantly be needed.
Therefore, with regard to arrays and other fixed platform technologies, it
is imperative that new, streamlined bioinformatics processing for updating
that information be available. As you can see, we are interested in high
throughput aspects of technologies, all of which can generate enormous
amounts of data. Data processing times slow us down in terms of getting
that data to the investigators. We are always looking for ways to automate
our analysis to maintain a high throughput pipeline.
A key, but overlooked, aspect of this is databases. Quick, inexpensive,
easy-to-use databases for sample tracking, data tracking, inventories, data
hand-offs, and storage of raw and processed data are greatly needed.
Especially attractive are those databases that include some level of
automated data analysis, whether it is quality analysis, or even volume
tracking. We are always looking for biomedical off-the-shelf databases, but
find few opportunities available. We have the experience and capability to
make databases in-house that are specific to our needs, but we would rather
spend our time instead on data generation, data analysis, and data hand-off
to investigators.
Whole-genome, high-density human genotyping platforms with the ability to
provide data on SNPs, copy number, methylation, Indels, or splice variants
are of great interest to my group. We are always looking for high-density,
high-coverage platforms that can accomplish mutation detection.
One last important comment: by working together, great things can be
accomplished. Our Genomics Unit here has a very team-oriented philosophy.
Because of our outstanding RTS staff members and their team-oriented work
ethic, I believe many good things are still to come in our efforts in
biomedical research and scientific discovery.
Stephen F. Porcella, Ph.D.
Section Chief
Research Technologies Section
Genomics, Flow Cytometry, Electron Microscopy Units
Research Technologies Branch
RML
NIAID, NIH
Hamilton, MT, USA
Smoot JC, et al., "Genome sequence and comparative microarray analysis of
serotype M18 group A Streptococcus strains associated with acute rheumatic
fever outbreaks," Proc. Natl. Acad. Sci. USA 99(7): 4668-73, 1 April 2002.
Source: Essential Science Indicators from Thomson Reuters.
Keywords: genomics, host, pathogen, host response to
infection, life-threatening infections, vectors, microarray analysis,
genome sequencing, Group A Streptococcus, pathogenic organisms,
Chlamydia, virulence factor, genomics technologies.