Automated sickle cell disease identification in human red blood cells using a lensless single random phase encoding biosensor and convolutional neural networks


Abstract

We present a compact, field-portable, lensless, single random phase encoding biosensor for automated classification between healthy and sickle cell disease human red blood cells. Microscope slides containing 3 µl wet mounts of whole blood samples from healthy and sickle cell disease afflicted human donors are input into a lensless single random phase encoding (SRPE) system for disease identification. A partially coherent laser source (laser diode) illuminates the cells under inspection, the object complex amplitude propagates to and is pseudorandomly encoded by a diffuser, and the intensity of the diffracted complex waveform is captured by a CMOS image sensor. The recorded opto-biological signatures are transformed using local binary pattern map generation during preprocessing and then input into a pretrained convolutional neural network for classification between healthy and disease states. We further provide analysis comparing the performance of several neural network architectures to optimize our classification strategy. Additionally, we assess the performance and computational savings of classifying on subsets of the opto-biological signatures with substantially reduced dimensionality, including one-dimensional crops of the recorded signatures. To the best of our knowledge, this is the first report of a lensless SRPE biosensor for human disease identification. As such, the presented approach and results can be significant for low-cost disease identification both in the field and for healthcare systems in developing countries that suffer from constrained resources.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Turbid and diffusive media are traditionally considered optical obstructions to be computationally remedied in post-processing. Many approaches have been developed to successfully image through such diffusive media [1–4]. Recent studies, however, instead harness the pseudorandom scattering properties of optical diffusers and substitute them for a conventional lens to reduce system cost, complexity, and size [5–9]. Such lensless systems computationally reconstruct an object’s image instead of performing direct imaging. Alternatively, some approaches use diffusive media in mask-based sensing strategies to perform classification tasks, such as handwritten digit recognition or automated cell identification, directly on the signal captured by the sensor, without the need to reconstruct an image [10–15].

Automated cell identification may provide a minimally invasive approach for point-of-care disease diagnosis. Such a capability would be essential in low-resource areas where dedicated healthcare facilities, which are often needed for accurate diagnostics, may be few and far between. These areas may also lack access to trained experts, expensive test kits, etc. [16]. To address this need, several optical technologies have been proposed as point-of-care diagnostic systems. Among these optical technologies, 3D imaging approaches using quantitative phase imaging (QPI), including digital holographic microscopy (DHM), have accomplished automated disease diagnosis and cell characterization, classification, and monitoring [17–22]. Additionally, optical coherence tomography coupled with fluorescence microscopy has achieved 3D in vivo cell imaging and tracking [23]. These 3D imaging approaches were successful in their tasks; however, systems with reduced sensitivity to mechanical noise and a lensless design can increase field-use practicality due to reduced complexity, size, and cost [19].

Single random phase encoding (SRPE) was proposed in [11] as a compact, low-cost, and lensless approach to cell identification. In the SRPE system, a laser diode light source transilluminates a sample. Intricate intracellular details modulate the light, which then propagates to a diffuser that retains the spatial frequency information by encoding and scattering the signal. The intensity of this encoded pattern, referred to as the opto-biological signature (OBS), is recorded by a CMOS image sensor and is used to train a machine learning model for cell classification. One previously used approach assesses the entire captured opto-biological signature based on statistical features of its spatial distribution [11,12]; however, convolutional neural networks (CNNs) were shown to improve classification of animal red blood cell (RBC) samples in lensless pseudorandom cell identification [13]. CNNs apply successive convolutional filtering over many stacked layers to automatically extract features and patterns of increasing complexity from the input [24]. The lensless nature of such a single random phase encoding system removes considerable constraints on the numerical aperture and field of view otherwise imposed by the presence of a lens. Such lens-based restrictions may result in loss of information because high spatial frequencies diffracted by the object may not be incident on the lens. Unlike the shearing digital holographic approach, which requires a sparse sample of blood cells [13], the proposed SRPE system may use dense wet-mounted samples containing thousands of blood cells. Additionally, the system’s design is generalized for use with any sample placed at the object plane.

An important application of cell identification with lensless SRPE is its potential use for automated diagnosis of diseases such as sickle cell disease (SCD). Sickle cell disease is a collection of inherited red blood cell disorders that, in the form of sickle cell anemia, affects approximately 300,000 newborns worldwide each year. Sickle cell anemia is the most common and most severe type of sickle cell disease [25,26]. Additionally, approximately 5% of the world’s population carries trait genes for hemoglobin disorders, primarily sickle cell disease and thalassemia [27]. Sickle cell disease affects hemoglobin and is caused by inheriting a mutated beta globin gene, which leads to production of defective hemoglobin [25]. This abnormal hemoglobin cannot carry oxygen as effectively, which can result in hypoxia that drives abnormal polymerization of the mutated hemoglobin. Consequently, the originally flexible and biconcave RBCs are misshapen into rigid, crescent-shaped cells. These deformed cells do not flow easily and can stick to blood vessel walls, events which can block blood flow, resulting in severe pain, potential stroke, or organ damage [25]. While traditional hemoglobin electrophoresis testing for sickle cell disease is routine for all newborns in the United States [28], low-resource areas in regions such as sub-Saharan Africa, where the disease is highly prevalent [29], are not afforded this luxury and could therefore benefit from a low-cost and portable diagnostic system.

In this paper, we investigate a compact, lensless, and 3D-printed biosensor for classification between healthy human whole blood and human sickle cell disease whole blood. To our knowledge, this is the first study of live human red blood cells and disease screening in pseudorandom cell identification, as neither was considered in previous related works [11–13]. Slides containing a small volume of live human RBC samples are inserted into the SRPE system, after which the recorded opto-biological signatures are used to fine-tune a pretrained CNN for this classification task. Multiple CNN architectures are compared to assess their respective performance, and the best-performing model is selected. Additionally, we apply local binary pattern map generation as a preprocessing step, which improves classification performance. This technique was not used in previous related systems [11–13] but was shown in [10] to improve classification of scenes for incoherent mask-based imaging. Finally, we investigate the distribution of classifiable information within the OBS to explore methods for substantially reducing the network input size, thereby increasing network training speed and data portability. While this paper reports automated sickle cell disease identification, the success of the system indicates its potential for rapid, low-cost identification of other diseases, which we will pursue in the future, and it is an important step toward rapid, lensless, compact disease screening in low-resource areas.

The rest of the paper is organized as follows: Section 2 presents an overview of the proposed system along with its specifications as well as a brief mathematical description. Section 3 provides experimental classification results, proposed dimensionality reduction methods using 1D processing, and related discussions. Finally, Section 4 presents conclusions and notes future work to be conducted.

2. Methods

2.1 System overview

The proposed SRPE system combines a laser diode, light shaping diffuser, and CMOS image sensor into a compact 3D-printed bio-sensing instrument. The entire 3D-printed system measures 70 mm × 130 mm × 155 mm and weighs ∼130 g. Sensing of cells under inspection is accomplished by placing them onto a positively charged glass slide underneath a slip of borosilicate cover glass. Positively charged slides are used in place of neutral glass slides for improved cell adhesion. The glass slide is then inserted between the laser diode and the light shaping diffuser, which has an $80^\circ$ dispersion angle. The laser diode transilluminates the specimen (red blood cells) under inspection, and the wavefront is modulated by the phase and amplitude of the micro-biological specimen. This modulated waveform carries identifying information about the cells to the light shaping diffuser over a propagation distance, ${z_1}$, of 3.67 mm. The diffuser encodes the complex amplitude of the sample by dispersing the entire waveform into a pseudorandom scatter through phase modulation. The diffuse waveform then propagates a distance, ${z_2}$, of 26.75 mm to the CMOS image sensor, where the intensity of the complex waveform is recorded. To reduce information loss due to diffraction, the propagation distances ${z_1}$ and ${z_2}$ are minimized to the lower bounds set by the physical constraints of the setup. A diagram of the system is presented in Fig. 1.

Fig. 1. (a) 3D-printed experimental setup for single random phase encoding system, and (b) progression of waveform propagation beginning from the laser diode source to the object plane, diffuser, and lensless CMOS detector with each respective coordinate system denoted.

We use a FLIR Systems Blackfly S USB3 CMOS sensor (BFS-U3-27S5M-C, 1936 × 1464, 4.5 µm square pixels). This sensor was chosen for its 71 dB dynamic range, high saturation capacity (24,859 $e^-$), and low noise (5.8 $e^-$). The exposure time was set to 143 µs. All automatic color correction parameters such as white balance and gain were disabled. To avoid saturating pixels, a low-power laser source and a brief exposure time are used. The light source used for experiments is a Coherent StingRay laser diode (638.3 nm, 0.991 mW) with a user-adjusted focus and an elliptical beam (1 mm × 3 mm). The light shaping diffuser is an $80^\circ$ diffusing angle unmounted polycarbonate holographic diffuser sheet (0.78 mm thick, >85% transmission efficiency), which is cut to size for use in the SRPE system. The entire system is mounted on an aluminum optical breadboard which rests on four duro-50 Sorbothane mounting feet for vibration isolation.

2.2 Mathematical description

The use of a single random-phase diffuser to encode the waveform and retain high spatial frequency information increases the amount of object information incident on the sensor [30]. Additionally, the random-phase diffuser disperses the waveform such that the resulting encoded intensity pattern can be considered as a white random process that has greater bandwidth than a lens-based imaging system [11]. The mathematical model is as follows: we begin with the complex field leaving the object plane which has received amplitude and phase modulation from the sample and is given as [31]:

$$u_0(x,y) = |a_{obj}(x,y)|\,e^{j\varphi_{obj}(x,y)}. $$
In this equation, ${u_0}({x,y} )$ is the complex waveform leaving the object plane with modulated amplitude and phase, ${a_{obj}}({x,y} )$ and ${\varphi _{obj}}({x,y} )$ respectively, from the specimen under inspection. The complex field then propagates a distance, ${z_1}$, where it becomes incident on the diffuser which imposes a pseudo-random phase modulation described as:
$$u_1(\eta,\nu) = [u_0(x,y) \ast h_1(\eta,\nu)]\,e^{j\varphi_{rand}(\eta,\nu)}, $$
where ${u_1}({\eta ,\nu } )$ is the complex field leaving the diffuser, ${\varphi _{rand}}({\eta ,\nu } )$ is the random phase component imposed by the diffuser, and ${\ast} $ is the convolution operator. ${h_1}({\eta ,\nu } )$ is the Fresnel diffraction convolution kernel describing free space propagation over a distance of ${z_1}$ and is given as:
$$h_1(\eta,\nu) = \frac{e^{jkz_1}}{j\lambda z_1}\exp\left[\frac{jk}{2z_1}(\eta^2 + \nu^2)\right], $$
where $\lambda $ is the wavelength of the light source and k is the wave number. The field then travels a distance of ${z_2}$ to the sensor, located in the Fraunhofer region, at which point the signature from the final complex waveform is recorded and is given as:
$$u_2(\alpha,\beta) = \frac{e^{jkz_2}}{j\lambda z_2}\exp\left[\frac{jk}{2z_2}(\alpha^2 + \beta^2)\right]\mathcal{F}\{u_1(\eta,\nu)\}\Big|_{f_\eta = \frac{\alpha}{\lambda z_2},\, f_\nu = \frac{\beta}{\lambda z_2}}, $$
where ${u_2}({\alpha ,\beta } )$ is the complex waveform incident on the image sensor, $\mathrm{{\cal F}}\{\cdot \}$ is the Fourier transform, and ${f_\eta }$ and ${f_\nu }$ are spatial frequencies. The image sensor records the intensity, I, of the resultant complex waveform, $I = {|{{u_2}({\alpha ,\beta } )} |^2}$. Local coordinate systems used in this mathematical description of the waveform’s progression are illustrated in Fig. 1.
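For readers who wish to experiment with the forward model numerically, a minimal simulation sketch of Eqs. (1)–(4) follows. Only the wavelength, propagation distances, and 4.5 µm sampling pitch are taken from the text; the grid size, the synthetic object, and the use of a Fresnel transfer-function propagator for both legs are illustrative assumptions rather than the exact numerical model used in the paper.

```python
import numpy as np

# Minimal numerical sketch of the SRPE forward model in Eqs. (1)-(4).
# Assumptions: 512 x 512 grid, a toy Gaussian "cell" object, and a Fresnel
# transfer-function propagator for both legs.
N = 512
dx = 4.5e-6                   # sampling interval (sensor pixel pitch)
wavelength = 638.3e-9         # laser diode wavelength
k = 2 * np.pi / wavelength
z1, z2 = 3.67e-3, 26.75e-3    # object-to-diffuser and diffuser-to-sensor distances

fx = np.fft.fftfreq(N, d=dx)
FX, FY = np.meshgrid(fx, fx)

def propagate(u, z):
    """Free-space propagation over distance z (Fresnel transfer function)."""
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(u) * H)

rng = np.random.default_rng(0)
Y, X = np.mgrid[-N // 2:N // 2, -N // 2:N // 2]
cell = np.exp(-(X**2 + Y**2) / 40.0**2)            # toy object standing in for an RBC
u0 = (1.0 - 0.3 * cell) * np.exp(1j * 1.5 * cell)  # a_obj * exp(j*phi_obj), Eq. (1)

u1 = propagate(u0, z1) * np.exp(1j * rng.uniform(0, 2 * np.pi, (N, N)))  # Eq. (2)
u2 = propagate(u1, z2)                             # field at the CMOS sensor, Eq. (4)
obs = np.abs(u2)**2                                # recorded opto-biological signature
```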

2.3 Data collection procedure

The specimens under inspection are human whole blood samples. The healthy and SCD whole blood samples each came from a single donor. Each slide was prepared as a 3 µl wet mount of the whole blood samples. All data were collected over a consecutive 3-day period beginning immediately upon delivery from the supplier to ensure the viability of the cells. On each day, two separate data sets were recorded for each class multiple hours apart. Formulation of a single data set is as follows: a micropipette was used to place 3 µl of the sample onto a charged glass slide, after which a borosilicate coverslip was placed on top of the sample. About 30 seconds were permitted to pass to allow the sample to settle and adhere to the positively charged surface after spreading to the extent of the coverslip. The slide was then inserted into the SRPE system and aligned such that the laser source passed through the sample containing thousands of blood cells. Next, a video was recorded for 5 seconds at 4 fps with an exposure time of 0.143 ms. The slide was then laterally translated to inspect a new area and another video was recorded. 10 different locations were recorded per slide, and 15 slides were used per class. Accordingly, there were 3,000 frames per class per set (20 frames per video × 10 locations × 15 slides). The complete data set contained 36,000 frames, split evenly between classes.

As a pre-processing step before classification by a CNN, we apply a domain transformation through local binary pattern (LBP) map generation as proposed in [10]. LBP was previously shown to suppress global disturbances and enhance the input’s basic-level features in mask-based imaging [10]. To implement LBP map generation, each pixel is reassigned a new value based on its relation to the 8 nearest neighbors in a 3 × 3 grid surrounding it. The central pixel is used as a threshold against which the surrounding pixels are assigned a 1 if they meet or exceed the threshold and a 0 otherwise. These 8 binary values create an 8-bit binary number when concatenated clockwise around the central pixel. When converted back to decimal, this number becomes the central pixel’s new value. This transformation is computed for each pixel of all recorded opto-biological signatures. Figure 2 visualizes the differences between the original input and the LBP domain; a short implementation sketch is given after the figure.

Fig. 2. (a) Raw opto-biological signature recorded at the image sensor of the SRPE system for a healthy red blood cell (RBC), and (b) generated local binary pattern map of (a).
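As a concrete reference for the preprocessing step above, the following is a minimal sketch of the 3 × 3 LBP transform described in this section. The clockwise neighbor ordering and the edge padding are assumptions (any fixed ordering yields an equivalent 8-bit encoding), and library implementations such as skimage.feature.local_binary_pattern provide comparable functionality.

```python
import numpy as np

def lbp_map(image):
    """3x3 local binary pattern map: each pixel's 8 neighbors are thresholded
    against the central pixel and concatenated clockwise into an 8-bit code."""
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")   # assumed edge handling at the borders
    # Clockwise neighbor offsets starting from the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros(img.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = padded[1 + dy : 1 + dy + img.shape[0],
                          1 + dx : 1 + dx + img.shape[1]]
        # Neighbor >= center contributes a 1 at this bit position.
        out |= (neighbor >= img).astype(np.uint8) << (7 - bit)
    return out
```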

2.4 Deep learning model

We tested and compared 4 deep learning models for cell classification. The first was AlexNet, a simple and common 8-layer classification network (5 convolutional layers followed by 3 fully connected layers) [32] which was previously studied in lensless pseudorandom cell identification [13]. We then considered 3 alternate architectures in an attempt to improve the classification abilities of our system. The second was VGG19, a much deeper 19-layer network (16 convolutional layers followed by 3 fully connected layers) which is heavily based on the AlexNet architecture in that the sequential model applies multiple successive convolutional blocks immediately feeding into fully connected layers [33]. VGG19 makes use of more layers and smaller filter sizes to limit the number of tunable parameters per convolutional layer, which speeds up training. The third model tested was ResNet-50, a very deep residual learning network with 48 convolutional layers which utilizes shortcut connections between layers to prevent degrading accuracy as the network deepens [34]. Finally, we tested SqueezeNet, a deep 18-layer model developed for transportability and computational speed in machine vision applications [35]. SqueezeNet uses a convolutional layer followed by 8 fire modules and then another convolutional layer. Each fire module is 2 convolutional layers deep, where the first is a squeeze layer with a 1 × 1 filter and the second is an expansion layer with a combination of 1 × 1 and 3 × 3 filters.

All models were pretrained on the ImageNet database [36]. We then replaced the final layer of each pretrained model with a single sigmoid-activated neuron to suit the binary classification task. The adapted pretrained models were retrained on our data set to tune the preset weights and biases before classifying the test data. Each network was optimized using a grid search to select the best performing mini-batch size (32, 64, 128, or 256), bias and weight learn rate factor (1, 5, or 10), and optimizer (Adam or SGDM). Minimum loss on a validation set was used as the metric for best performance as well as for early stopping during training (patience of 6 with a validation frequency of 20 iterations). Table 1 presents the optimized parameters for each model.

Table 1. Optimized parameters for each of the tested classification networks
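As an illustration of this transfer-learning setup, the sketch below adapts an ImageNet-pretrained AlexNet to the binary healthy/SCD task and enumerates the hyperparameter grid described above. It is written in PyTorch as a hedged stand-in for whichever framework was actually used; the base learning rate, the use of BCEWithLogitsLoss (which folds the sigmoid into the loss), and the omitted data loaders, training loop, and early stopping are assumptions.

```python
import itertools
import torch
import torch.nn as nn
from torchvision import models

def build_binary_alexnet():
    """ImageNet-pretrained AlexNet with the 1000-way layer replaced by one output neuron."""
    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    model.classifier[6] = nn.Linear(4096, 1)   # single output; sigmoid applied via the loss
    return model

criterion = nn.BCEWithLogitsLoss()

# Grid searched in the text: mini-batch size, learn-rate factor for the new layer,
# and optimizer (Adam or SGD with momentum).
grid = itertools.product([32, 64, 128, 256], [1, 5, 10], ["adam", "sgdm"])
for batch_size, lr_factor, opt_name in grid:
    model = build_binary_alexnet()
    base_lr = 1e-4                             # assumed base learning rate
    param_groups = [
        {"params": [p for n, p in model.named_parameters()
                    if not n.startswith("classifier.6")], "lr": base_lr},
        {"params": list(model.classifier[6].parameters()), "lr": base_lr * lr_factor},
    ]
    if opt_name == "adam":
        optimizer = torch.optim.Adam(param_groups)
    else:
        optimizer = torch.optim.SGD(param_groups, lr=base_lr, momentum=0.9)
    # ...build DataLoaders with batch_size, train with early stopping on validation
    # loss, and keep the configuration with the minimum validation loss...
```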

3. Results

3.1 Sickle cell disease classification

Training and testing of each model was performed as a three-fold cross validation wherein the data were separated based on the day they were collected, and data from each of the 3 days took a turn as the testing set. Doing so allowed a fair assessment of the system’s ability to accurately classify diseased blood and protected against overfitting to time-dependent properties of the live samples, noting that morphological changes may occur over time [37,38]. Therefore, it is important to ensure that the system can accurately classify inputs recorded from samples on a different day than the inputs used to train the model without overfitting to external parameters. Performance of the CNN classifier is assessed with and without LBP map generation as a pre-processing technique.

Summative CNN performance metrics comparing all 4 models with both the standard opto-biological signature input and the LBP map input are presented in Table 2. These metrics consist of accuracy, precision, recall, F1 score, and Matthews correlation coefficient (MCC). Precision, recall, F1 score, and accuracy all take values between 0 and 1, and each conveys different information about model performance. Accuracy is the number of correctly classified samples divided by the total number of data points; $({TP + TN} )/({P + N} )$, where $TP$ is the number of true positive cases, $TN$ is the number of true negative cases, P is the number of positive cases in the data, and N is the number of negative cases in the data. Precision denotes the percentage of correctly selected positive classes relative to the total number of predicted positive classes; $TP/({TP + FP} )$, where $FP$ is the number of false positive cases. Similarly, recall denotes the percentage of correctly selected positive classes relative to the total number of possible positive classes; $TP/({TP + FN} )$, where $FN$ is the number of false negative cases. F1 score is the harmonic mean of precision and recall. The value of MCC lies in the range of -1 to 1 and expresses the correlation between observed and predicted binary classes, with 0 representing a random guess. Each of these performance metrics agrees that pre-processing with LBP map generation improves classification performance.

Table 2. Mean performance metrics comparing between 4 optimized CNN architectures with both standard OBS and LBP map input. CNN: convolutional neural network. LBP: local binary pattern. OBS: opto-biological signature. MCC: Matthews correlation coefficient. F1 score is the harmonic mean of precision and recall.
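For completeness, a small sketch computing the metrics defined above from binary labels and predictions is given below; equivalent functions are available in sklearn.metrics, and this self-contained version is provided only to make the definitions concrete.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = float(np.sum(y_true & y_pred))     # true positives
    tn = float(np.sum(~y_true & ~y_pred))   # true negatives
    fp = float(np.sum(~y_true & y_pred))    # false positives
    fn = float(np.sum(y_true & ~y_pred))    # false negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of P and R
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}
```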

The receiver operating characteristic (ROC) curves in Fig. 3 illustrate the LBP input case of each model and demonstrate that the simple sequential architectures of AlexNet and VGG19 perform best, with AlexNet generalizing to this data set better than all other models. Figure 3 also displays the area under the curve (AUC) for each network. AUC is a single scalar value used to represent performance based on the two-dimensional ROC curve. AUC represents an area on a unit square, with values between 0 and 1 and an AUC of 0.5 representing a random guess. As such, AUC denotes the probability that a randomly selected positive class sample will rank above a randomly selected negative class sample.

Fig. 3. Receiver operating characteristic (ROC) curves and area under the curve (AUC) for local binary pattern input to each optimized deep learning model.

All performance evaluations in Table 2 and Fig. 3 represent the mean performance over the testing sets. Because a three-fold cross validation procedure was implemented to account for the split data collection process, three separate models were trained using alternating $2/3$ portions of the overall set, and the remaining $1/3$ was used to test the given model; therefore, every data sample was used for testing. In conclusion, the AlexNet architecture combined with LBP map generation for pre-processing displays the best classification performance for this sickle cell disease identification task.

The optimized system demonstrates point-of-care disease diagnosis capability, as opto-biological signatures can be processed and classified on the order of seconds to a minute. Local binary pattern map generation computes in 19.7 seconds per image, which AlexNet then classifies in 0.86 seconds per image. This allows for higher throughput than digital holographic systems, which additionally require a reconstruction step [19]. Overall, this is an exceptional turnaround time compared to traditional hemoglobin electrophoresis testing, which requires samples to be sent to a lab, with results returned a day or two later.

3.2 Dimensionality reduction with 1D crops of opto-biological signatures

In addition to retaining high spatial frequency information through encoding, the diffuser scatters the signal and spreads it as a white process across the entire sensor. In effect, classifiable information regarding the sample is expected to be uniformly distributed across the opto-biological signature. This is in contrast to a conventional lens-based system, in which object details are mapped to respective camera pixels. We exploit this aspect of SRPE by performing classification on small subsets of the original data to demonstrate system capability with the computational advantage of a significantly reduced input size.

Dimensionality reduction is implemented by cropping 1D rows or columns of pixels from each original OBS captured by the image sensor and using them as the input for classification. These new inputs are classified using a manual 1D adaptation of AlexNet, as this architecture was shown to provide the best results in 2D. Our 1D CNN is effectively AlexNet with every relevant height dimension set to 1 (see Supplement 1 for a network summary; a condensed sketch follows the figures below). Using this architecture, we first investigate the relevance of orientation and location of these 1D OBS crops with respect to classification performance by taking 30 evenly spaced horizontal and vertical lines across each OBS for comparison. Figure 4 illustrates the location of each line relative to a full-size OBS, and Fig. 5 presents ROC curves demonstrating classification performance on data sets comprised of the different 1D crop locations.

Fig. 4. Subsets of the opto-biological signatures (OBS) captured by image sensor used in dimensionality reduction experiments. 30 data sets with reduced dimensionality are created by taking 1D line crops at each of the depicted locations in red. 15 crop locations were vertical and the other 15 horizontal. The 1D lines were made to contain the same number of pixels (1464) irrespective of orientation.

Fig. 5. Receiver operating characteristic (ROC) curves and area under the curve (AUC) for (a) horizontal and (b) vertical 1D OBS crops displaying near identical performance independent of location and orientation. See Fig. 4 for details of horizontal and vertical extracts. In black solid line, performance using the full 2D OBS is shown for general comparison. OBS: opto-biological signature.
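A condensed sketch of such a 1D adaptation is shown below. The exact configuration used here is detailed in Supplement 1, so the layer widths, the adaptive pooling stage, and the PyTorch framing are illustrative assumptions; the single output neuron is paired with a sigmoid through the loss function, as in the 2D models.

```python
import torch.nn as nn

class AlexNet1D(nn.Module):
    """AlexNet-style feature extractor collapsed to 1D inputs (illustrative layer sizes)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=3, stride=2),
            nn.Conv1d(64, 192, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=3, stride=2),
            nn.Conv1d(192, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv1d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv1d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=3, stride=2),
        )
        self.pool = nn.AdaptiveAvgPool1d(6)      # fixed-length features for any crop length
        self.classifier = nn.Sequential(
            nn.Dropout(), nn.Linear(256 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(), nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 1),                  # single output neuron (sigmoid via the loss)
        )

    def forward(self, x):                        # x: (batch, 1, n_pixels)
        x = self.pool(self.features(x))
        return self.classifier(x.flatten(1))
```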

Evidently, the location and orientation at which 1D lines are cropped from the OBS has a negligible impact on classification. The vertical line set has a mean test accuracy of 78.61% with a standard deviation of 1.25%, and the horizontal line set has a mean test accuracy of 79.21% with a standard deviation of 1.08%. Small discrepancies in classification accuracy across different locations can be attributed to the randomness inherent in the initialization and training of artificial neural networks. To further support this claim, we apply a one-way ANOVA at the 0.05 significance level across the testing accuracies of the 3 testing folds for all 30 crop locations. The test returns a p-value of 0.46, indicating that there is not a statistically significant difference in performance between different crop orientations and locations within the OBS. We conclude that perturbations in testing accuracy between crop locations stem from the random initialization of intractable network parameters, and the statistical hypothesis test supports the assertion that classifiable information is uniformly spread across the sensor by the diffuser.
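The hypothesis test above can be reproduced in a few lines; the sketch below uses scipy.stats.f_oneway, and the per-fold accuracies are randomly generated placeholders rather than the measured values.

```python
import numpy as np
from scipy.stats import f_oneway

# One-way ANOVA across the 30 crop locations, where each group holds that
# location's 3 cross-validation fold test accuracies (placeholder values here).
rng = np.random.default_rng(0)
accuracies_by_location = [rng.normal(0.79, 0.012, size=3) for _ in range(30)]

f_stat, p_value = f_oneway(*accuracies_by_location)
print(f"F = {f_stat:.2f}, p = {p_value:.2f}")  # p > 0.05 -> no significant location effect
```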

Lastly, having established the homogeneity of signal distribution on the image sensor, we examine the relationship between the number of pixels within the input and classification performance. We compare 3 different dimensionality reduction methods for this assessment: 1D crops, 2D crops, and 1D pixel vectors created from randomly selected pixels within the OBS. For the 1D crops, we create 92 data sets, each consisting of increasingly longer line crops taken from the original OBS data set. The location and orientation (either vertical or horizontal) of these line crops are randomized between data sets, and the crop sizes range from 84 to 1,924 pixels in intervals of 20 pixels. For the 2D crops, we create 45 data sets ranging from 12 × 12 to 100 × 100 randomly placed window crops. For the 1D pixel vectors created from randomly selected pixels, we assemble 23 data sets ranging from 144 to 6,724 pixels; however, there is no longer a spatial relation between neighboring values as there is for direct 1D crops.
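A minimal sketch of the three reduction methods is given below, applied to a single full-size OBS. The random placement schedules and the data-set assembly are simplified, and the placeholder array stands in for a recorded signature.

```python
import numpy as np

rng = np.random.default_rng(0)

def crop_1d(obs, length):
    """Random 1D line crop of `length` pixels, horizontal or vertical."""
    # Crops longer than the sensor height must be horizontal.
    if length > obs.shape[0] or rng.integers(2) == 1:
        row = rng.integers(obs.shape[0])
        start = rng.integers(obs.shape[1] - length + 1)
        return obs[row, start:start + length]
    col = rng.integers(obs.shape[1])
    start = rng.integers(obs.shape[0] - length + 1)
    return obs[start:start + length, col]

def crop_2d(obs, size):
    """Randomly placed square window crop of size x size pixels."""
    r = rng.integers(obs.shape[0] - size + 1)
    c = rng.integers(obs.shape[1] - size + 1)
    return obs[r:r + size, c:c + size]

def random_pixels(obs, n):
    """n pixels drawn at random from the OBS (no spatial adjacency preserved)."""
    idx = rng.choice(obs.size, size=n, replace=False)
    return obs.ravel()[idx]

obs = rng.random((1464, 1936))         # placeholder OBS matching the sensor dimensions
line = crop_1d(obs, 1024)
window = crop_2d(obs, 64)
pixels = random_pixels(obs, 6400)
```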

We then train and test either our manual 1D AlexNet or the standard 2D AlexNet, corresponding to the appropriate input, for each of the constructed data sets across each of the 3 dimensionality reduction methods. Figure 6 presents a scatter plot demonstrating the change in testing accuracy as the number of pixels within the input varies. For small inputs, the 1D cropped input outperforms a 2D crop containing the same number of pixels. When the 2D crop grows to contain 6,400 pixels and above, it surpasses the largest 1D crop (1,924 pixels) in performance. It is important to note that 1D crops are limited by the sensor dimensions and therefore cannot exceed 1,936 pixels, while the 2D crop and random 1D pixel vector can span any number of pixels ranging from a few up to the entire image (1464 × 1936). In our tests, the 1D randomized pixel input is outclassed in every regard; however, it still classifies at 75% accuracy using as few as 6,400 randomly selected pixels. All 3 reduction methods display an upward trend in test accuracy as the number of pixels within the input increases, an expected result as the amount of information regarding the sample increases. These results suggest that a substantial amount of redundant information exists within the original OBS, as classification can still be performed using a very small fraction (0.035%, or 1,000 pixels) of the full input.

Fig. 6. Testing accuracy as a function of number of pixels within the input OBS for 3 different dimensionality reduction methods applied to the OBS. All methods trend upward in accuracy as pixel count of the OBS increases. OBS: opto-biological signature.

Each data reduction approach offers significant advantages in terms of data processing and storage. When training each deep learning model using the originally recorded opto-biological signatures and LBP map generation, models could take upwards of 6-10 hours to reach their early stopping criteria. As a consequence of the cross-validation procedure, these training times were tripled, meaning the full model set could take a day to train. In contrast, the reduced data sets each train their respective deep learning models in as little as 3-5 minutes, meaning full models could be finalized in 10-15 minutes. Additionally, in a point-of-care setting that is short on computational resources, data storage becomes a matter to consider. The original data from these experiments is ∼67 GByte, whereas each of the reduced data sets takes up at most 10-100 MByte, a space savings of roughly 3 orders of magnitude.

As seen using the 1D crop method for dimensionality reduction, successful classification using single lines of pixels taken from each original intensity pattern could permit future work to be conducted using a linear pixel array rather than a traditional 2D CMOS image sensor [39]. This would allow significantly higher framerates to be achieved in any potential time-series applications, such as flow cytometry with circulating tumor cells for cancer screening [40], and would considerably expedite data handling.

Future work aims to improve the system’s ability to capture identifying information of cells. One potential avenue for increased performance is the inclusion of motility information, which was previously shown to increase classification abilities in digital holographic microscopy systems [19]. Additional work may consider the potential benefits of using two diffusers rather than one, as the resulting white process also becomes wide-sense stationary [12]. Additionally, the use of a second diffuser before the object plane to illuminate the specimen with a diffuse waveform may amplify the distinguishing optical properties of the sample and increase classification performance. Further future work may involve the use of microfluidics [13] to focus on single cell isolation or reduced sample concentration for motility observations. Lastly, future iterations of the SRPE system may consider varying parameters such as the propagation distances (${z_1}$ and ${z_2}$) or diffuser specifications (diffusing angle and substrate material) in an investigation of their potential impact on classification performance.

4. Conclusions

In this paper, we proposed a novel approach to disease identification in humans through the combined application of deep learning and a lensless pseudorandom phase encoding biosensor. A compact, low-cost, and 3D-printed system is implemented to capture opto-biological signatures from healthy and diseased cells under inspection. In the experiments, we used a small volume of healthy and sickle cell disease human red blood cells. A convolutional neural network is trained and tested using the images of these signatures. The diseased RBCs can be classified on the order of seconds to a minute. The lensless nature of the system relaxes the numerical aperture and field-of-view limitations suffered by lens-based systems and reduces overall complexity and cost. Performance is improved through local binary pattern map generation, which suppresses global disturbances and enhances local features. The system reaches an accuracy of 88.70% and an AUC of 0.9622 for sickle cell disease whole blood samples. Additionally, we have introduced an SRPE application with substantially reduced dimensionality of the captured opto-biological signatures, including 1D signatures, and have assessed the performance and computational savings of classifying sickle cell disease using 1D signature data. The proposed dimensionality reduction approaches can significantly reduce the requirements for training deep learning models, data storage, and data processing. While the presented work is concerned with discrimination between healthy and sickle cell disease red blood cells, the SRPE system can be trained for a variety of classification tasks depending on the training procedure. Unlike previous pseudorandom phase encoding studies [11–13], which discriminated between micro-objects, this work presents the first study of human disease identification using the lensless pseudorandom phase encoding system. Future work may explore other diseases or cell types, seek to increase the captured cell information to improve classification accuracy, compare performance with a variety of digital holography based systems [41], and test other deep learning algorithms.

Acknowledgments

P. M. Douglass and T. O’Connor acknowledge support through the GAANN fellowship.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time.

Supplemental document

See Supplement 1 for supporting content, including details on the 1D CNN architecture adapted from AlexNet and implemented for classification on dimensionally reduced data.

References

1. S. Li and J. Zhong, “Dynamic imaging through turbid media based on digital holography,” J. Opt. Soc. Am. 31(3), 480–486 (2014). [CrossRef]  

2. J. W. Goodman, W. H. Huntley, D. W. Jackson, and M. Lehmann, “Wavefront-reconstruction imaging through random media,” Appl. Phys. Lett. 8(12), 311–313 (1966). [CrossRef]  

3. E. Leith, C. Chen, H. Chen, Y. Chen, D. Dilworth, J. Lopez, J. Rudd, P.-C. Sun, J. Valdmanis, and G. Vossler, “Imaging through scattering media with holography,” J. Opt. Soc. Am. A 9(7), 1148–1153 (1992). [CrossRef]  

4. V. Bianco, M. Paturzo, A. Finizio, D. Balduzzi, R. Puglisi, A. Galli, and P. Ferraro, “Clear coherent imaging in turbid microfluidics by multiple holographic acquisitions,” Opt. Lett. 37(20), 4212–4214 (2012). [CrossRef]  

5. Y. Li, G. N. McKay, N. J. Durr, and L. Tian, “Diffuser-based computational imaging funduscope,” Opt. Express 28(13), 19641–19654 (2020). [CrossRef]  

6. K. Monakhova, J. Yurtsever, G. Kuo, N. Antipa, K. Yanny, and L. Waller, “Learned reconstructions for practical mask-based lensless imaging,” Opt. Express 27(20), 28075–28090 (2019). [CrossRef]  

7. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “DiffuserCam: lensless single-exposure 3D imaging,” Optica 5(1), 1–9 (2018). [CrossRef]  

8. X. Jin, D. M. S. Wei, and Q. Dai, “Point spread function for diffuser cameras based on wave propagation and projection model,” Opt. Express 27(9), 12748–12761 (2019). [CrossRef]  

9. O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics 8(10), 784–790 (2014). [CrossRef]  

10. X. Pan, T. Nakamura, X. Chen, and M. Yamaguchi, “Lensless inference camera: incoherent object recognition through a thin mask with LBP map generation,” Opt. Express 29(7), 9758–9771 (2021). [CrossRef]  

11. B. Javidi, S. Rawat, S. Komatsu, and A. Markman, “Cell identification using single beam lensless imaging with pseudo-random phase encoding,” Opt. Lett. 41(15), 3663–3666 (2016). [CrossRef]  

12. B. Javidi, A. Markman, and S. Rawat, “Automatic multicell identification using a compact lensless single and double random phase encoding system,” Appl. Opt. 57(7), B190–B196 (2018). [CrossRef]  

13. T. O’Connor, C. Hawxhurst, L. M. Shor, and B. Javidi, “Red blood cell classification in lensless single random phase encoding using convolutional neural networks,” Opt. Express 28(22), 33504–33515 (2020). [CrossRef]  

14. Z. W. Wang, V. Vineet, F. Pittaluga, S. N. Sinha, O. Cossairt, and S. Bing Kang, “Privacy-preserving action recognition using coded aperture videos,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2019).

15. T. Ando, R. Horisaki, and J. Tanida, “Speckle-learning-based object recognition through scattering media,” Opt. Express 23(26), 33902–33910 (2015). [CrossRef]  

16. I. J. Ezennia, S. O. Nduka, and O. I. Ekwunife, “Cost benefit analysis of malaria rapid diagnostic test: the perspective of Nigerian community pharmacists,” Malar. J. 16(1), 7 (2017). [CrossRef]  

17. A. Anand, V. K. Chhaniwal, N. R. Patel, and B. Javidi, “Automatic identification of malaria-infected RBC with digital holographic microscopy using correlation algorithms,” IEEE Photonics J. 4(5), 1456–1464 (2012). [CrossRef]  

18. I. Moon, A. Anand, M. Cruz, and B. Javidi, “Identification of malaria-infected red blood cells via digital shearing interferometry and statistical inference,” IEEE Photonics J. 5(5), 6900207 (2013). [CrossRef]  

19. B. Javidi, A. Markman, S. Rawat, T. O’Connor, A. Anand, and B. Andemariam, “Sickle cell disease diagnosis based on spatio-temporal cell dynamics analysis using 3D printed shearing digital holographic microscopy,” Opt. Express 26(10), 13614–13627 (2018). [CrossRef]  

20. J. Yoon, Y. Jo, M. Kim, K. Kim, S. Lee, S. Kang, and Y. Park, “Identification of non-activated lymphocytes using three-dimensional refractive index tomography and machine learning,” Sci. Rep. 7(1), 6654 (2017). [CrossRef]  

21. Y. Jo, S. Park, J. Jung, J. Yoon, H. Joo, M. Kim, S. Kang, M. C. Choi, S. Y. Lee, and Y. Park, “Holographic deep learning for rapid optical screening of anthrax spores,” Sci. Adv. 3(8), e1700606 (2017). [CrossRef]  

22. M. Hejna, A. Jorapur, J. S. Song, and R. L. Judson, “High accuracy label-free classification of single-cell kinetic states from holographic cytometry of human melanoma cells,” Sci. Rep. 7(1), 11943 (2017). [CrossRef]  

23. X. Li, W. Zhang, W. Y. Wang, X. Wu, Y. Li, X. Tan, D. L. Matera, B. M. Baker, Y. M. Paulus, X. Fan, and X. Wang, “Optical coherence tomography and fluorescence microscopy dual-modality imaging for in vivo single-cell tracking with nanowire lasers,” Biomed. Opt. Express 11(7), 3659–3672 (2020). [CrossRef]  

24. A. Géron, Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 2nd ed. (O’Reilly, 2019).

25. “Sickle Cell Disease,” https://www.nhlbi.nih.gov/health-topics/sickle-cell-disease/

26. A. Sedrak and N. P. Kondamudi, “Sickle cell disease,” StatPearls (2021).

27. “Sickle Cell Disease,” https://www.afro.who.int/health-topics/sickle-cell-disease

28. “Hemoglobin Electrophoresis,” https://medlineplus.gov/lab-tests/hemoglobin-electrophoresis/

29. S. D. Grosse, I. Odame, H. K. Atrash, D. D. Amendah, F. B. Piel, and T. N. Williams, “Sickle cell disease in Africa,” Am. J. Prev. Med. 41(6), S398–S405 (2011). [CrossRef]  

30. A. Stern and B. Javidi, “Random projections imaging with extended space-bandwidth product,” J. Display Technol. 3(3), 315–320 (2007). [CrossRef]  

31. J. W. Goodman, Introduction to Fourier Optics, (McGraw-Hill, 1996).

32. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in NIPS, 2012.

33. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in ICLR, 2015.

34. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in CVPR, 2016.

35. F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” in ICLR, 2017.

36. J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition.

37. I. Moon, F. Yi, Y. H. Lee, B. Javidi, D. Boss, and P. Marquet, “Automated quantitative analysis of 3D morphology and mean corpuscular hemoglobin in human red blood cells stored in different periods,” Opt. Express 21(25), 30947–30957 (2013). [CrossRef]  

38. Y. Park, C. A. Best, K. Badizadegan, R. R. Dasari, M. S. Feld, T. Kuriabova, M. L. Henle, A. J. Levine, and G. Popescu, “Measurement of red blood cell mechanics during morphological changes,” Proc. Natl. Acad. Sci. U.S.A. 107(15), 6731–6736 (2010). [CrossRef]  

39. V. Bianco, M. Paturzo, V. Marchesano, I. Gallotta, E. D. Schiavi, and P. Ferraro, “Optofluidic holographic microscopy with custom field of view (FoV) using a linear array detector,” Lab Chip 15(9), 2117–2124 (2015). [CrossRef]  

40. A. Lopresti, F. Malergue, F. Bertucci, M. L. Liberatoscioli, S. Garnier, Q. DaCosta, P. Finetti, M. Gilabert, J. L. Raoul, D. Birnbaum, C. Acquaviva, and E. Mamessier, “Sensitive and easy screening for circulating tumor cells by flow cytometry,” JCI Insight 4(14), e128180 (2019). [CrossRef]  

41. B. Javidi, A. Carnicer, A. Anand, G. Barbastathis, W. Chen, P. Ferraro, J. W. Goodman, R. Horisaki, K. Khare, M. Kujawinska, R. A. Leitgeb, P. Marquet, T. Nomura, A. Ozcan, Y. K. Park, G. Pedrini, P. Picart, J. Rosen, G. Saavedra, N. Shaked, A. Stern, E. Tajahuerce, L. Tian, G. Wetzstein, and M. Yamaguchi, “Roadmap on digital holography,” Opt. Express 29(22), 35078–35118 (2021). [CrossRef]  
