Classification of Leukocytes: Comparison of different feature extraction and machine learning approaches

Abstract

Blood cells fall into three types: erythrocytes, leukocytes, and platelets. To evaluate a patient's health, a Complete Blood Count (CBC) is necessary. The CBC is among the most frequently performed tests worldwide, and when evaluated manually by physicians it is time-consuming and susceptible to errors. Recently, the scientific community has worked to automate CBC evaluation. Automation benefits laboratories worldwide and patients, who obtain faster and more reliable results. One of the challenges in automating CBC is the classification of leukocytes, or White Blood Cells (WBC). These cells are part of the immune system and protect the body against infections. The most common types of WBC are neutrophils, eosinophils, monocytes, and lymphocytes. These four types resemble one another, and most techniques have difficulty telling them apart. Several techniques in the literature tackle this issue. Some extract features from the cells in the images and pass them to an expert system. Others follow feature extraction with classical Machine Learning (ML) techniques. More recently, artificial neural networks have been applied to this problem; in these, the network's layers extract the features automatically before classification. The objective of this paper is to compare different techniques in order to improve the reliability and reduce the time spent evaluating CBC. To that end, two feature extraction techniques are tested, and ML techniques are then used to classify the extracted features into the four types of leukocytes. The feature extraction methods tested are Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP). The ML techniques tested are Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), and a convolutional neural network (CNN). The main goal of this paper is to enable the proposal of new techniques and products to support laboratories, physicians, and patients.
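As an illustration of the feature extraction step named in the abstract, the following minimal sketch computes a basic 3x3 LBP histogram and a single-cell gradient orientation histogram, which is the building block of HOG. It is not the authors' implementation: the function names, the 256-bin and 9-bin sizes, and the plain nested-list image format are assumptions made for this example, and a full HOG descriptor would additionally split the image into a grid of cells and normalize over blocks.

```python
import math

def lbp_histogram(img):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel
    and return a normalised 256-bin histogram of the resulting codes."""
    # Neighbour offsets, clockwise from top-left; the order fixes the bit weights.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):          # skip the 1-pixel border
        for c in range(1, cols - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def orientation_histogram(img, bins=9):
    """Unsigned gradient orientation histogram over one cell (the whole image),
    weighted by gradient magnitude and L2-normalised."""
    hist = [0.0] * bins
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = img[r][c + 1] - img[r][c - 1]   # centred horizontal difference
            gy = img[r + 1][c] - img[r - 1][c]   # centred vertical difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold to [0, 180)
            index = min(int(angle / (180.0 / bins)), bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0  # avoid division by zero
    return [h / norm for h in hist]

def feature_vector(img):
    """Concatenate both descriptors into one vector for a classifier."""
    return lbp_histogram(img) + orientation_histogram(img)

# Hypothetical 5x5 grayscale patch containing a vertical edge.
patch = [[0, 0, 0, 10, 10] for _ in range(5)]
features = feature_vector(patch)
```

In a pipeline like the one the abstract describes, one such vector per cell image would be passed to a classifier such as an SVM or XGBoost, whereas a CNN would learn its features directly from the pixels.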

Publication
Proceedings of the 11th International Conference on Production Research – Americas
