John D. Storey talks with
ScienceWatch.com and answers a few questions about
this month's Fast Moving Front in the field of
Article: Significance analysis of time course
Authors: Storey, JD; Xiao, WZ; Leek, JT; Tompkins, RG; Davis, RW
Journal: PROC NAT ACAD SCI USA, 102 (36): 12837-12842 SEP 6
Addresses: Univ Washington, Dept Biostat, Seattle, WA 98195 USA.
Stanford Univ, Stanford Genome Technol Ctr, Palo Alto, CA
Stanford Univ, Dept Biochem, Palo Alto, CA 94304 USA.
Shriners Burn Ctr, Dept Surg, Boston, MA 02114 USA.
Harvard Univ, Massachusetts Gen Hosp, Sch Med, Boston, MA
Why do you think your paper is highly cited?
Does it describe a new discovery, methodology, or synthesis of
This paper provides a statistical approach for analyzing gene expression
studies carried out over a time course. Gene expression is essentially a
dynamic process, so characterizing gene expression variation over time is
of fundamental importance. Besides the high interest in this question, the
fact that the statistical approach can be applied to data through my lab's
EDGE software package has probably contributed to the number of citations.
The article represents a new set of statistical methodologies.
Would you summarize the significance of your paper
in layman's terms?
The article provides a set of methodologies to analyze the main types of
study designs and statistical questions that one might ask in a time course
gene expression study, where the levels at which genes are turned on are
measured over a period of time.
The article provides a method that allows one to detect genes whose
expression shows any change over time. For example, as a proof of concept
in the paper, we identified genes whose expression changes in the human
kidney as one ages.
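As a rough illustration of what testing one gene for change over time can
look like, the sketch below compares a flat (no change) fit with a
straight-line trend and judges the resulting F-statistic by permuting the
time labels. This is a simplified stand-in, not the method of the paper or
of the EDGE software: the straight-line model, the permutation scheme, and
the function names trend_f_statistic and permutation_p_value are all
assumptions made for the example.

```python
import random


def trend_f_statistic(times, values):
    """F-statistic comparing a flat fit with a straight-line fit over time.

    Assumes at least three measurements taken at two or more distinct times.
    """
    n = len(values)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    # Residual sum of squares under the flat (no change) model.
    rss_flat = sum((y - mean_y) ** 2 for y in values)
    if rss_flat == 0:
        return 0.0  # constant expression: no evidence of change
    # Residual sum of squares after fitting a least-squares line.
    rss_line = rss_flat - sxy ** 2 / sxx
    if rss_line <= 0:
        return float("inf")  # the line fits the data exactly
    return (rss_flat - rss_line) / (rss_line / (n - 2))


def permutation_p_value(times, values, n_perm=999, seed=0):
    """Share of shuffled data sets whose F-statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = trend_f_statistic(times, values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        # Shuffling breaks any link between expression level and time.
        rng.shuffle(shuffled)
        if trend_f_statistic(times, shuffled) >= observed:
            hits += 1
    return (1 + hits) / (n_perm + 1)


# Hypothetical expression values for one gene measured at ten time points.
times = list(range(10))
rising = [1.0, 1.4, 2.1, 2.4, 3.2, 3.4, 4.1, 4.6, 4.9, 5.5]
print(permutation_p_value(times, rising))
```

A small p-value suggests that the gene's expression is related to time. In
a genome-wide study the same test would be run on every gene, so the
p-values would then need a multiple-testing adjustment.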
It also provides a method to detect genes whose changes in expression over
time differ between two or more groups. In the paper, we
identified genes whose expression changes over a 24-hour period in human
blood were different between a group treated with endotoxin versus a
How did you become involved in this research and
were any particular problems encountered along the way?
I initially became interested in this research because the decreasing cost
of microarrays was allowing researchers to make genome-wide measurements of
gene expression levels on many more samples in a given study.
It was clear that this would lead to gene expression being measured over
time rather than in "static" conditions where the passage of time is
ignored. A statistical approach for this type of study would certainly be
needed. The project took off when I joined a large-scale NIH project called
"Inflammation and the Host Response to Injury," which is led by Dr. Ronald
Tompkins of Massachusetts General Hospital.
This project involves measuring gene expression over the course of
treatment for individuals who have been subjected to blunt force trauma,
such as from an automobile accident. In collaboration with the Ron Davis
lab of Stanford University, also participating in this NIH project, we
began to develop the statistical approach.
The main challenge of this research was to provide a single framework that
is applicable to the many different types of study designs and questions
which might be considered. We also had to make the methods understandable
to researchers across a wide range of areas of expertise.
Where do you see your research leading in the
My research is aimed at developing and applying quantitative approaches in
genomics to contribute to an understanding of the molecular biology of the
cell and the causes of human disease.
As new genomics technologies continue to emerge and the costs of existing
technologies decrease, we are faced with more and more data of
increasing complexity. My current and future research involves integrating
multiple types of genomic data beyond the gene expression levels considered
in the paper. These multiple data types may be measured at different points
in time, in different conditions, and in different tissue types.
Specific to this paper, we have developed improved algorithms for carrying
out the analysis of time-course gene expression studies. This includes
speeding up the calculations as well as increasing the statistical power of
the calculations by adapting our "optimal discovery procedure" approach to
the time-course setting.
These new developments will be included in a forthcoming release of our
EDGE software package. We have also applied these methodologies in the
large-scale NIH project mentioned above, and we have made much progress in
understanding how early gene expression changes relate to recovery from
Do you foresee any social or political implications
for your research?
No, this doesn't seem likely.
John D. Storey, Ph.D.
Department of Molecular Biology
Lewis-Sigler Institute for Integrative Genomics
Princeton, NJ, USA
KEYWORDS: AGING; DIFFERENTIAL EXPRESSION; EXPRESSION
ARRAYS; Q VALUES; TIME SERIES.