Combining statistical and fuzzy-rough classifiers for cancer subtype prediction

Paper Details

Research Paper 10/08/2022

Sukanta Majumder, Ansuman Kumar, Anindya Halder
J. Biodiv. & Environ. Sci. 21(2), 92-99, August 2022.
Copyright Statement: Copyright 2022; The Author(s).
License: CC BY-NC 4.0

Abstract

Cancer prediction from gene expression data is a challenging area of research in bioinformatics and machine learning. In gene expression data, labeled samples are very limited compared to unlabeled samples, and labeling unlabeled data is expensive. Therefore, a single classifier trained with limited training samples often fails to produce the desired result. In this situation, a combination of classifiers can be effective because it ensembles the results of the individual classifiers, which can improve cancer prediction accuracy. In this article, a novel method, combining statistical and fuzzy-rough classifiers (CSFRC), is proposed for cancer prediction; it uses support vector machine and naive Bayes as statistical classifiers together with a fuzzy-rough nearest neighbor classifier. The proposed method is able to deal with the uncertainty, overlap and indiscernibility usually present in the cancer subtype classes of gene expression data. The proposed method is validated on eight publicly available gene expression datasets. Experimental results suggest that the proposed method performs better than the other compared classifiers for cancer subtype prediction from gene expression data. The proposed method turns out to be very effective in cancer prediction from gene expression data, particularly when individual classifiers trained with limited samples do not perform well.
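The abstract does not state how CSFRC merges the outputs of its three classifiers, so the following is only a minimal sketch of one common combination rule, majority voting, and not the authors' procedure. The function names, the tie-breaking priority order, and the toy class labels are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def majority_vote(predictions, priority):
    """Combine one predicted label per classifier into a single label.

    predictions: dict mapping classifier name -> predicted label
    priority:    list of classifier names used to break ties
                 (an assumption of this sketch, not taken from the paper)
    """
    counts = Counter(predictions.values())
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    # With a clear winner, `tied` holds one label; otherwise the first
    # classifier in `priority` whose label is among the tied ones decides.
    for name in priority:
        if name in predictions and predictions[name] in tied:
            return predictions[name]
    raise ValueError("priority must name at least one classifier in predictions")


def combine(per_classifier, priority):
    """Apply majority voting sample by sample.

    per_classifier: dict mapping classifier name -> list of predicted labels,
                    all lists having the same length
    """
    names = list(per_classifier)
    n_samples = len(per_classifier[names[0]])
    return [
        majority_vote({name: per_classifier[name][i] for name in names}, priority)
        for i in range(n_samples)
    ]


# Made-up predictions for four test samples (illustrative only, not real results).
toy_predictions = {
    "svm": ["ALL", "AML", "ALL", "AML"],
    "naive_bayes": ["ALL", "ALL", "AML", "AML"],
    "frnn": ["AML", "ALL", "MLL", "AML"],
}
combined = combine(toy_predictions, priority=["frnn", "svm", "naive_bayes"])
print(combined)
```

In this sketch a three-way disagreement (the third toy sample) is resolved in favour of the first classifier in the priority list; other rules, such as weighting each classifier by its validation accuracy, would be equally plausible readings of "combining" classifiers.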
