Combining statistical and fuzzy-rough classifiers for cancer subtype prediction

Paper Details

Research Paper 10/08/2022

Sukanta Majumder, Ansuman Kumar, Anindya Halder
J. Biodiv. & Environ. Sci. 21(2), 92-99, August 2022.
Copyright Statement: Copyright 2022; The Author(s).
License: CC BY-NC 4.0

Abstract

Cancer prediction from gene expression data is a challenging research area in bioinformatics and machine learning. In gene expression data, labeled samples are very limited compared to unlabeled samples, and labeling unlabeled data is expensive. Therefore, a single classifier trained with limited training samples often fails to produce the desired result. In this situation, a combination of classifiers can be effective, as it ensembles the results of the individual classifiers, which can improve cancer prediction accuracy. In this article, a novel method, combining statistical and fuzzy-rough classifiers (CSFRC), is proposed for cancer prediction; it uses support vector machine and naive Bayes as statistical classifiers together with a fuzzy-rough nearest neighbor classifier. The proposed method is able to deal with the uncertainty, overlap and indiscernibility usually present in the cancer subtype classes of gene expression data. The proposed method is validated on eight publicly available gene expression datasets. Experimental results suggest that the proposed method performs better than the other compared classifiers for cancer subtype prediction from gene expression data. The proposed method turns out to be very effective in cancer prediction from gene expression data, particularly when the individual classifiers do not perform well with limited training samples.
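
The abstract does not state how the outputs of the three classifiers are combined. As a minimal sketch of one common option, the Python snippet below applies a plurality (majority) vote to per-sample labels. The function name `majority_vote`, the tie-breaking rule, and the example labels are illustrative assumptions, not the combination rule or data used in CSFRC.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine label predictions from several classifiers by plurality vote.

    predictions[i][j] is the label that classifier i assigns to sample j.
    Assumption (not from the paper): when several labels tie, the label
    proposed by the classifier listed first wins.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same samples")

    combined = []
    for votes in zip(*predictions):  # votes for one sample, in classifier order
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical predictions for four test samples (labels are made up).
svm_pred = ["A", "B", "A", "C"]    # support vector machine
nb_pred = ["A", "A", "B", "C"]     # naive Bayes
frnn_pred = ["B", "B", "C", "C"]   # fuzzy-rough nearest neighbor

print(majority_vote([svm_pred, nb_pred, frnn_pred]))
```

In the third sample all three classifiers disagree, so the assumed tie-breaking rule falls back to the first classifier in the list. Other combination schemes, such as weighting each classifier by its validation accuracy, would fit the same interface.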
