Ponnuthurai Nagaratnam Suganthan, Ganesan Pugalenthi, Ke
Tang & Govindaraju Archunan talk with
ScienceWatch.com and answer a few questions about
this month's New Hot Paper in the field of Computer
Article Title: A machine learning approach for the
identification of odorant binding proteins from
G;Tang, K;Suganthan, PN;Archunan, G;Sowdhamini,
Journal: BMC BIOINFORMATICS, Volume: 8, Year: no.-351 SEP
* Natl Ctr Biol Sci, UAS GKVK Campus,Bellary Rd, Bangalore
560065, Karnataka, India.
* Natl Ctr Biol Sci, Bangalore 560065, Karnataka,
* Nanyang Technol Univ, Sch Elect & Elect Engn,
Singapore 639798, Singapore.
* Univ Sci & Technol China, Dept Comp Sci &
Technol, NICAL, Hefei 230026, Anhui, Peoples R China.
* Bharathidasan Univ, Dept Anim Sci, Tiruchchirappalli
Why do you think your paper is highly
Olfaction is an essential mechanism in vertebrates and insects for
behavioral response, survival, and reproduction. Odorant binding proteins
(OBPs) play a major role in olfaction and are thought to act as carriers
that transport odorants from the environment to the nasal epithelium in
vertebrates and to the sensillar lymph in insects.
In addition, OBPs are reported to be involved in many other functions,
such as odorant recognition and deactivation of odorants. The presence of
OBPs in the non-sensory tissues of insects suggests that they also have
non-sensory roles. Over the past three decades, OBPs have received much
attention due to their multifunctional nature, diversity in sequence, and
three-dimensional
There could possibly be four main reasons for the high citation rate.
Identification of OBPs is a challenging task, since OBPs show very
low sequence similarity between species, or even within the same
Our work describes a new methodology, named "regularized least
squares classifier" (RLSC), for identification of OBPs and the
prediction performance was examined using a leave-one-out
The features used in this study are new and provide a significant
contribution to the classification. These features can also be used
for other similar sequence-based prediction works.
Our method is quick and quite useful for the identification of new
OBPs drawn from sequence databases.
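
For readers unfamiliar with the terms, the regularized least squares
classifier and the leave-one-out evaluation mentioned above can be
illustrated with a minimal sketch. It is not the authors' implementation:
the linear kernel, the regularization value `lam`, the helper names, and
the two-dimensional toy data are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def linear_kernel(u, v):
    """Plain dot product; the kernel choice is an assumption."""
    return sum(a * b for a, b in zip(u, v))


def rlsc_train(X, y, lam=0.1):
    """Solve (K + lam * n * I) c = y for the dual coefficients c."""
    n = len(X)
    K = [[linear_kernel(X[i], X[j]) + (lam * n if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, [float(v) for v in y])


def rlsc_predict(X_train, c, x):
    """Label a new point by the sign of the kernel expansion."""
    score = sum(ci * linear_kernel(xi, x) for ci, xi in zip(c, X_train))
    return 1 if score >= 0 else -1


def leave_one_out_accuracy(X, y, lam=0.1):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        c = rlsc_train(X_tr, y_tr, lam)
        hits += int(rlsc_predict(X_tr, c, X[i]) == y[i])
    return hits / len(X)


# Made-up toy data: +1 stands for "OBP", -1 for "non-OBP".
X = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0],
     [-1.0, -0.1], [-0.9, 0.1], [-0.8, -0.2]]
y = [1, 1, 1, -1, -1, -1]
print(leave_one_out_accuracy(X, y))
```

Retraining once per held-out sample keeps the sketch easy to follow. A
real feature set with over a thousand dimensions would more likely use a
numerical library than the hand-written solver shown here.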
Does it describe a new discovery, methodology, or
synthesis of knowledge?
OBPs are very diverse in sequence as well as in structure. For example,
vertebrate OBPs share an 8-stranded β-barrel, whereas insect OBPs
contain an alpha-helical barrel and six highly conserved cysteines.
Sequence-similarity-based search methods such as PSI-BLAST and HMM methods
fail due to poor sequence similarity. Thus, identification of OBPs is a
difficult task and requires an efficient method, irrespective of the
Our RLSC approach extracts knowledge from the protein sequence in the form
of features for machine learning algorithms. Each protein sequence was
represented by 1,463 feature indices. The features used in this study
differ from those in other sequence-based prediction work. Generally,
dipeptide and tripeptide frequencies are computed over the full set of 20
amino acids. In our study, we employed a new approach to derive such
combinations. First, the 20 amino acids were reduced to 11 groups on the
basis of their physico-chemical properties. The dipeptide and tripeptide
combinations were then derived from these 11 groups. This also reduces the
dimensionality of
From the computer science viewpoint, RLSC is a well-established technique
whose history can be traced back to the early years of the 20th
century. This work also shows how synthesis of knowledge from different
areas (in our case, biology and computer science) can promote scientific
Would you summarize the significance of your paper
in layman's terms?
OBPs are important elements in the olfaction mechanism. The biological
functions of OBPs are still unclear, and more OBP sequences are required
for a complete understanding of the olfaction mechanism. This approach can
be used to annotate and identify new OBPs from sequence databases. Since
our approach does not depend on sequence similarity, it is very useful for
identifying OBPs that have very poor sequence similarity. Our methodology
can speed up the process of OBP annotation and subsequent relevant
discoveries. The features used for the classification can be
applied to other sequence-based prediction work.
How did you become involved in this research, and
were there any problems along the way?
Ramanathan Sowdhamini's group has a long-standing interest in distant
relationships, superfamilies, and the development of sequence search
strategies. We became involved through our continuing interest in
olfaction and OBPs, which we share with Professor Govindaraju Archunan's
laboratory in the Animal Science Department of Bharathidasan University.
Furthermore, Ganesan Pugalenthi's visit to Professor Ponnuthurai
Nagaratnam Suganthan's laboratory, which has strong expertise in machine
learning approaches, brought the field of machine learning closer to the
detection of odorant binding proteins.
Yes, feature selection and improving the accuracy were quite
challenging. Applying the RLSC technique to this problem was
quite novel, and the detected gene products had to be carefully validated.
We did not have role models in any of the three areas. These were some
of the problems we faced.
Where do you see your research leading in the
Until now, identification of OBPs has been carried out using sequence
similarity search methods such as PSI-BLAST and HMM. Since OBPs are very
dissimilar in sequence, successful identification can be achieved only by
using methods that do not rely on sequence similarity. This work reports a
well-established RLSC method that is more accurate, fast, and efficient
for OBP identification. Considering the biological implications of
OBPs, we believe more research attention will be directed towards the
identification of OBPs and of new OBP families. Newer methods for
mathematically representing protein sequences may be proposed to further
increase prediction accuracy.
Do you foresee any social or political implications
for your research?
A complete understanding of OBPs and their role in the olfaction mechanism
requires more sequence data. Our work speeds up the OBP annotation process,
which may help the medical community, particularly those individuals
working in the field of olfaction.
Insects present numerous unsolved mysteries, such as metamorphosis,
cyclical parthenogenesis, social behaviors, and a complicated chemical
communication system. Some insect species, such as mosquitoes, plant
hoppers, and aphids, may harm human health or damage agricultural
products, whereas others, such as the honeybee and the silkworm, are
valuable agricultural or industrial resources. A better understanding of
odorant/pheromone binding proteins and their role in the chemical
communication of insects helps not only in developing efficient strategies
for pest control, but also in advancing basic research in the life
sciences. Knowledge of odorant binding and transport mechanisms is also
useful in olfactotherapy.
Prof. Ramanathan Sowdhamini
Computational Biology group
National Centre for Biological Sciences,
TATA Institute of Fundamental Research (TIFR)
Prof. Ponnuthurai Nagaratnam Suganthan
Machine Learning Lab
School of Electrical and Electronic Engineering,
Nanyang Technological University
Machine Learning Lab
School of Electrical and Electronic Engineering
Nanyang Technological University
Dr. Ke Tang
Nature Inspired Computation and Applications Laboratory (NICAL)
Department of Computer Science and Technology
University of Science and Technology of China
Hefei, Anhui, China
Dr. Govindaraju Archunan
Center for Pheromone Technology
Department of Animal Science
Bharathidasan University Tiruchirappalli,
KEYWORDS: SUPPORT VECTOR MACHINES; AMINO-ACID-COMPOSITION;
SUBCELLULAR LOCATION PREDICTION; PHEROMONE-BINDING; ENSEMBLE CLASSIFIER;
PATTERN-RECOGNITION; MULTIPLE SITES; DATABASE; CLONING;