Comparative Performance of Machine Learning Algorithms in Prediction of Cervical Cancer

Emmanuel, Ahishakiye; Waweru, Mwangi; Petronilla, Muthoni; Lawrence, Nderu; Ruth, Wario

dc.contributor.author	Emmanuel, Ahishakiye
dc.contributor.author	Waweru, Mwangi
dc.contributor.author	Petronilla, Muthoni
dc.contributor.author	Lawrence, Nderu
dc.contributor.author	Ruth, Wario
dc.date.accessioned	2023-04-17T07:07:23Z
dc.date.available	2023-04-17T07:07:23Z
dc.date.issued	2021-10
dc.identifier.citation	Ahishakiye, E., Omulo, E. O., Taremwa, D., & Wario, R. (2017). Comparative Analysis of Open source Business Intelligence tools for Crime Data Analytics. IJLRET.	en_US
dc.identifier.uri	https://hdl.handle.net/20.500.12504/1308
dc.description.abstract	Cervical cancer is among the most common types of cancer affecting women around the world despite the advances in prevention, screening, diagnosis, and treatment during the past decade. Cervical cancer can be treated if diagnosed in its early stages. Machine learning algorithms such as multi-layer perceptron, decision trees, random forest, K-Nearest Neighbor, and Naïve Bayes have been used for the prediction of cervical cancer to aid in its early diagnosis. In this study, we compare the performance of six algorithms, namely ensemble methods (AdaBoost, Stochastic Gradient Boosting, Random Forests, and Extra Trees) and classification algorithms (K-Nearest Neighbor and Support Vector Machine), in the prediction of cervical cancer based on risk factors. Ensemble methods were selected because they combine several machine learning techniques into one model to decrease variance or bias, or to improve performance, while the classification methods were selected because our dataset was generally categorical and they could therefore work well with our problem domain. Experimental results revealed that none of the algorithms performed well on the imbalanced dataset. Experiments on the balanced dataset revealed improved performance. The performance metrics used include F1-score, Area Under Curve (AUC), and Recall. Extra Trees performed better than the rest on the F1-score metric, Stochastic Gradient Boosting and Random Forest performed better than the rest on the AUC metric, K-Nearest Neighbors outperformed the rest on the recall metric, and Extra Trees had the best accuracy, 0.96. The application of machine learning methods in the prediction of cervical cancer using risk factors may lead to early detection of the disease, which can be treated if diagnosed early.
Overall, ensemble methods performed better than classification methods on both the imbalanced and balanced datasets.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE Xplore	en_US
dc.subject	Cervical cancer	en_US
dc.subject	Algorithms	en_US
dc.subject	Machine Learning	en_US
dc.subject	Comparative Performance	en_US
dc.title	Comparative Performance of Machine Learning Algorithms in Prediction of Cervical Cancer	en_US
dc.type	Presentation	en_US

Files in this item

Name: Emmanuel Ahishakiye 2021 Screen.PNG
Size: 156.2Kb
Format: PNG image
Description: screenshot


This item appears in the following Collection(s)

Journal Articles
