Valadi K. Jayaraman, Bhaskar
D. Kulkarni, Piyushkumar Mundra, Madhan Kumar, and Krishna
Kumar Kandaswamy talk with ScienceWatch.com and
answer a few questions about this month's Fast Breaking
Paper in the field of Engineering.
Article Title: Using pseudo amino acid composition
to predict protein subnuclear localization: Approached with
PSSM
Authors: Mundra, P; Kumar, M; Kumar, KK; Jayaraman, VK;
Kulkarni, BD
Journal: PATTERN RECOGNITION LETT
Volume: 28
Issue: 13
Page: 1610-1615
Year: OCT 1 2007
* Natl Chem Lab, Chem Engn & Proc Dev Div, Dr Homi
Bhabha Rd, Pune 411008, Maharashtra, India.
* Natl Chem Lab, Chem Engn & Proc Dev Div, Pune 411008,
Maharashtra, India.
Why do you think your paper is
highly cited?
Nuclear proteins operating in related pathways, or those that
share common functionality, tend to be localized in specific
compartments within the nucleus. Protein subnuclear localization
prediction has tremendous biological significance as the mislocalization
of proteins can lead to genetic diseases and cancer. Our support vector
machine-based methodologies have yielded more accurate
predictions.
Does it describe a new discovery, methodology, or
synthesis of knowledge?
This paper studies different methods for extracting knowledge from sequence
information in the form of features for machine learning algorithms. These
include evolutionary information in the form of position-specific scoring
matrix (PSSM) features, pseudo amino acid composition features (as proposed
by Kuo-Chen Chou at the Gordon Life Science Institute in San Diego), and
five-factor solution score features (as derived by William R. Atchley of
the Department of Genetics at North Carolina State University from nearly
500 amino acid properties).
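As a rough illustration of what one of these feature types looks like in
practice, the sketch below computes a pseudo amino acid composition vector in
the general style proposed by Chou: 20 normalized composition terms followed
by a few sequence-order correlation terms. It is not the implementation used
in the paper. The function names, the use of a single Kyte-Doolittle
hydropathy scale as the residue property, and the defaults lam=3 and
weight=0.05 are all assumptions chosen for illustration.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here only as an illustrative
# residue property (an assumption, not necessarily the paper's choice).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(scale):
    """Rescale a property to zero mean and unit standard deviation
    over the 20 standard amino acids."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def pseudo_aac(sequence, lam=3, weight=0.05, scale=HYDROPATHY):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector."""
    seq = sequence.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = normalize(scale)

    # Conventional amino acid composition: occurrence frequency of each residue.
    freqs = [seq.count(a) / len(seq) for a in AMINO_ACIDS]

    # Sequence-order terms: mean squared property difference between
    # residues that are k positions apart, for k = 1..lam.
    thetas = []
    for k in range(1, lam + 1):
        pairs = len(seq) - k
        total = sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(pairs))
        thetas.append(total / pairs)

    # Joint normalization so that all 20 + lam components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


features = pseudo_aac("MKVLAAGIVGLLLAQPSW")
print(len(features))
```

A vector like this would be computed for every protein in the dataset and
passed, alone or alongside PSSM-derived features, to a classifier such as a
support vector machine.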
Would you summarize the significance of your paper in
layman's terms?
Our methodology can speed up the process of protein annotation and
subsequent relevant discoveries.
How did you become involved in this research, and were
there any problems along the way?
We have been working on important in-silico predictions of protein
and gene functions for the past several years and found this particular
problem to be potentially crucial. The biggest difficulty was choosing a
relevant dataset. Previous work on this problem by Hong-Bin Shen and
Kuo-Chen Chou, at the Institute of Image Processing and Pattern Recognition
of Shanghai Jiao Tong University, led us to use the same dataset, which
demonstrated excellent prediction results.
Where do you see your research leading in the
future?
Considering its biological implications, we believe more research attention
will be directed toward protein subnuclear localization prediction
problems. Newer methods for mathematically representing protein sequences
may be proposed to further increase prediction accuracy.
Do you foresee any social or political implications for
your research?
Our work could speed up the nuclear protein annotation process, which may
help the medical community, particularly those individuals working in the
fields of human genetics and cancer research.
Valadi K. Jayaraman, Ph.D.
Senior Scientist, Chemical Engineering and Process Development
Division
National Chemical Laboratory (NCL)
Pune, India
Bhaskar D. Kulkarni, Ph.D.
Deputy Director & Head, Chemical Engineering and Process Development
Division
National Chemical Laboratory (NCL)
Pune, India
Piyushkumar Mundra
Ph.D. student
School of Computer Engineering
Nanyang Technological University
Singapore
Krishna Kumar Kandaswamy
Ph.D. student
University of Luebeck
Luebeck, Germany
Keywords: nuclear proteins, protein subnuclear localization
prediction, mislocalization of proteins, support vector machine-based
methodologies, machine learning algorithms, position specific scoring
matrix, pseudo amino acid composition features, five factor solution
score features, protein subnuclear localization prediction problems,
nuclear protein annotation process.