Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5
Sunny Sharma, Amritpal Singh, Dr. Rajinder Singh "Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5". International Journal of Computer Trends and Technology (IJCTT) V34(3):139-143, April 2016. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.
Abstract -
Drug discovery, disease detection, and the
prediction of molecular class are areas of great
significance for research. Over the past few decades,
several approaches have been used to improve the
accuracy of Human Protein Function (HPF)
prediction. This study concentrates on one such
approach: HPF prediction from sequence derived
features (SDF) using decision trees and their
variants, based on the C5 and C4.5 algorithms as
implemented in the tools See5 and SIPINA.
Additional sequence derived features were identified
and incorporated to improve the training data. The
sequence data obtained from HPRD (Human Protein
Reference Database) was extended in both the
number of sequences and the features used to relate
a sequence to a specific class, which enhances the
power of the training data. Multiple techniques were
examined for prediction accuracy and compared
extensively with one another and with previous
research results. The overall accuracy obtained was
64% with See5 and 88% with SIPINA.
Keywords
HPF, C5, C4.5, See5, Decision Tree, SDF, SIPINA.