Improving Classification Accuracy of Breast Cancer Using Ensemble Methods

Document Type: Original Article

Authors

1 Department of Information Technology, Faculty of Computers and Information, Luxor University

2 Head of Integration Test Department, Egyptian Space Agency, Egypt

3 Egyptian Company for Blood Transfusion Services, Egypt

4 Department of Computer and Electronics Engineering, Thebes Higher Institute of Engineering

5 Department of Physics and Engineering Mathematics Mattaria, Faculty of Engineering, Helwan University, Cairo, Egypt - Department of Software Engineering and Information Technology, Faculty of Engineering and Technology, Egyptian Chinese University, Egypt

Abstract

Artificial intelligence plays an important role in the medical sector, especially in improving patient healthcare, where early detection and diagnosis of disease increase the probability of recovery. Breast cancer ranks first among the most common types of cancer, both globally and regionally. This paper uses machine learning techniques to present a non-invasive method for diagnosing and classifying breast diseases based on mammograms and ultrasound images. Statistical features are extracted from the images (smoothness, perimeter, area, concavity, compactness, symmetry, size, diameter, concave and radius) to identify breast tissue as a malignant or a benign tumor and to support long-term prediction aimed at prevention. Four learning algorithms are used: support vector machine (SVM), multilayer perceptron (MLP), naïve Bayes (NB) and decision tree (DT). Each is used to build a model capable of classifying breast tissue as malignant or benign based on up to 30 features. Ensemble methods, namely bagging, boosting and stacking, were then applied to the same dataset used with the individual classifiers in order to improve classification accuracy. The results showed that SVM achieved the highest accuracy, at 97.89%, followed by the MLP classifier at 95.61% and NB at 92.62%. The experimental results also showed that the ensemble methods gave higher accuracy than an individual classifier: the accuracy of the decision tree (DT) increased from 93.15% as an individual classifier to 97.71% using the stacking algorithm.
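
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, and it uses the library's bundled Wisconsin diagnostic breast cancer dataset (30 numeric features) as a stand-in, since the abstract does not name the dataset. The helper names `build_models` and `evaluate`, the hyperparameters, the 80/20 split and the logistic-regression meta-learner are all illustrative assumptions, so the accuracies it prints should not be expected to match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Return the four individual classifiers and three ensembles (assumed settings)."""
    # SVM and MLP are sensitive to feature scale, so they are wrapped with a scaler.
    svm = make_pipeline(StandardScaler(), SVC(random_state=seed))
    mlp = make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=seed))
    nb = GaussianNB()
    dt = DecisionTreeClassifier(random_state=seed)
    return {
        "SVM": svm,
        "MLP": mlp,
        "NB": nb,
        "DT": dt,
        # Bagging: many full decision trees trained on bootstrap samples.
        "Bagging(DT)": BaggingClassifier(
            DecisionTreeClassifier(random_state=seed), n_estimators=50, random_state=seed
        ),
        # Boosting: AdaBoost over depth-1 trees (decision stumps).
        "Boosting(DT)": AdaBoostClassifier(
            DecisionTreeClassifier(max_depth=1, random_state=seed),
            n_estimators=50,
            random_state=seed,
        ),
        # Stacking: the four base models feed a logistic-regression meta-learner.
        "Stacking": StackingClassifier(
            estimators=[("svm", svm), ("mlp", mlp), ("nb", nb), ("dt", dt)],
            final_estimator=LogisticRegression(max_iter=1000),
            cv=5,
        ),
    }


def evaluate(seed=0):
    """Fit every model on one stratified 80/20 split and return test accuracies."""
    X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset, an assumption
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for name, acc in evaluate().items():
        print(f"{name:14s} accuracy = {acc:.4f}")
```

A single train/test split is used here only to keep the sketch short. Cross-validation would give a more stable estimate when comparing individual classifiers against ensembles.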

Keywords