Comparative Performance of Machine Learning Algorithms in Prediction of Cervical Cancer
dc.contributor.author | Emmanuel, Ahishakiye | |
dc.contributor.author | Waweru, Mwangi | |
dc.contributor.author | Petronilla, Muthoni | |
dc.contributor.author | Lawrence, Nderu | |
dc.contributor.author | Ruth, Wario | |
dc.date.accessioned | 2023-04-17T07:07:23Z | |
dc.date.available | 2023-04-17T07:07:23Z | |
dc.date.issued | 2021-10 | |
dc.description.abstract | Cervical cancer is among the most common types of cancer affecting women around the world, despite the advances in prevention, screening, diagnosis, and treatment during the past decade. Cervical cancer can be treated if diagnosed in its early stages. Machine learning algorithms such as multi-layer perceptron, decision trees, random forest, K-Nearest Neighbor, and Naïve Bayes have been used for the prediction of cervical cancer to aid in its early diagnosis. In this study, we compare the performance of ensemble methods (AdaBoost, Stochastic Gradient Boosting, Random Forests, and Extra Trees) and classification algorithms (K-Nearest Neighbor and Support Vector Machine) in the prediction of cervical cancer based on risk factors. Ensemble methods were selected because they combine several machine learning techniques into one model to decrease variance, reduce bias, or improve performance, while the classification methods were selected because our dataset was generally categorical and they could therefore work well with our problem domain. Experimental results revealed that none of the algorithms performed well on the imbalanced dataset. Experiments on the balanced dataset revealed improved performance. The performance metrics used include F1-score, Area Under the Curve (AUC), and Recall. Extra Trees performed better than the rest on the F1-score metric, Stochastic Gradient Boosting and Random Forest performed better than the rest on the AUC metric, K-Nearest Neighbors outperformed the rest on the recall metric, and Extra Trees had the best accuracy, 0.96. The application of machine learning methods in the prediction of cervical cancer using risk factors may lead to early detection of the disease, which can be treated if diagnosed early. Six algorithms were considered in this study. The general performance reveals that ensemble methods performed better than classification methods on both the imbalanced and balanced datasets. | en_US |
dc.identifier.citation | Ahishakiye, E., Omulo, E. O., Taremwa, D., & Wario, R. (2017). Comparative Analysis of Open source Business Intelligence tools for Crime Data Analytics. IJLRET. | en_US |
dc.identifier.uri | https://hdl.handle.net/20.500.12504/1308 | |
dc.language.iso | en | en_US |
dc.publisher | IEEE Xplore | en_US |
dc.subject | Cervical cancer | en_US |
dc.subject | Algorithms | en_US |
dc.subject | Machine Learning | en_US |
dc.subject | Comparative Performance | en_US |
dc.title | Comparative Performance of Machine Learning Algorithms in Prediction of Cervical Cancer | en_US |
dc.type | Presentation | en_US |
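The abstract reports F1-score, AUC, and recall alongside accuracy, and notes poor results on the imbalanced dataset. The following is a minimal sketch, not the authors' code, of how these metrics can be computed for binary labels and why accuracy alone can mislead when positives are rare. The function names and the toy labels are assumptions made for illustration only; they do not come from the study's data.

```python
# Illustrative sketch only: toy labels, hypothetical helper names.
# Labels are assumed binary, with 1 = positive (disease) and 0 = negative.

def _counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = _counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    tp, _, fn, _ = _counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(y_true, y_pred):
    """Fraction of predicted positives that are actual positives."""
    tp, fp, _, _ = _counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced set: 18 negatives, 2 positives.
y_true = [0] * 18 + [1] * 2
# A degenerate model that always predicts "negative".
always_negative = [0] * 20

print("accuracy:", accuracy(y_true, always_negative))
print("recall:  ", recall(y_true, always_negative))
print("f1:      ", f1_score(y_true, always_negative))
```

On this toy set, the always-negative model scores high on accuracy while finding none of the positive cases, which is one reason recall, F1-score, and AUC are commonly reported for imbalanced medical datasets.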
Files
Original bundle
- Name: Emmanuel Ahishakiye 2021 Screen.PNG
- Size: 156.22 KB
- Format: Portable Network Graphics
- Description: screenshot