Automatic segmentation of nine retinal layer boundaries in OCT images of non-exudative AMD patients using deep learning and graph search

Leyuan Fang; David Cunefare; Chong Wang; Robyn H. Guymer; Shutao Li; Sina Farsiu

doi:10.1364/BOE.8.002732

1. Introduction

Optical coherence tomography (OCT) can acquire 3D cross sectional images of human tissue at micron resolutions [1], which has been widely used for a variety of medical and industrial imaging applications [1,2]. The high resolution of OCT enables the visualization of multiple retinal cell layers and biomarkers of retinal and neurodegenerative diseases, including age-related macular degeneration (AMD) [3–5], diabetic retinopathy [6], glaucoma [7], Alzheimer’s disease [8], and amyotrophic lateral sclerosis [9]. For the study of many retinal diseases, accurate quantification of layer thicknesses in the acquired OCT images is crucial to advance our understanding of such factors as disease severity and pathogenic processes, and to identify potential biomarkers of disease progression. Moreover, segmentation of retinal layer boundaries is the first step in creating vascular pattern images from the popular new OCT angiography imaging modalities [10–13]. Since manual segmentation of OCT images is time consuming and subjective, it is necessary to develop automatic layer segmentation algorithms.

Over several decades, a multitude of OCT retinal segmentation algorithms have been developed, which can be generally classified into the following two categories: fixed mathematical model based methods and machine learning based methods [14,15]. Mathematic model based methods construct a fixed or adaptive model based on prior assumptions for the structure of the input images, and include A-scan [16,17], active contour [18–21], sparse high order potentials [22], and 2D/3D graph [23–30] based methods. Machine learning based methods formulate layer segmentation as a classification problem, where features are extracted from each layer or its boundaries and used to train a classifier (e.g. support vector machine, neural networks, or random forest classifiers) for determining layer boundaries [29,31–33].

In recent years, deep learning [34–36] based neural networks have been demonstrated to be a very powerful tool in the field of computer vision [37,38]. One of the most established realizations of deep learning is the convolutional neural network (CNN) [38], which automatically learns a hierarchy of increasingly complex features and a related classifier directly from training data sets. Recent works have extended the CNN framework to complex medical image analysis, such as retinal blood vessels segmentation [39,40], retinal hemorrhage detection [41], brain tumor segmentation [42], and cerebral microbleeds detection [43]. Very recently, a CNN based method was proposed [44] as an alternative to classic machine learning methods [5] for classification of normal and pathologic OCT images. In addition, the CNN model has also been applied to analyze OCT images of skin, aiming to characterize heathy skin and healing wounds [45].

Some of the more successful segmentation techniques are hybrids combining two or more approaches, e.g. in [46], active contours is combined with Markov random fields to create a global layer segmentation method. Other methods also combine the CNN model with additional techniques (e.g. conditional random fields, Markov random fields, and graph cut) for specific applications [47–51]. Most recently (after the conference version of our paper was accepted for presentation at the 2017 ARVO conference), a related method based on multi-scale convolutional neural networks combined with graph search was published for segmenting the choroid in OCT retinal images [51]. Our paper is in the same class of segmentation algorithms, which combines the CNN model with graph search methodology (termed as CNN-GS) for the automatic segmentation of nine layer boundaries on human retinal OCT images. In this method, we first decompose training OCT images into patches. Then, we utilize a CNN to automatically extract representative features from patches centered on specific retinal layer boundaries and train a corresponding classifier to delineate nine layer boundaries. We use the CNN classifier to generate class labels and probability maps for the layer boundaries. Finally, we use a modified variation of our popular graph theory and dynamic programming (GTDP) method [23], in which instead of using gradient based edge weights, we utilize these CNN based probability maps to create the final boundaries. An exciting property of our approach is that it requires fewer ad hoc rules as compared to many previous fixed mathematical model based methods for segmentation of inner retinal layers in non-exudative AMD eyes.

The rest of this paper is organized as follows. Section II briefly reviews the visible retinal layers in human non-exudative AMD OCT images, the 2D graph based GTDP layer segmentation algorithm, and the CNN model. Section III details the proposed CNN-GS method for the layer segmentation of OCT images. Section IV presents experimental results on clinical human non-exudative AMD OCT data. Conclusions and suggestions for future works are given in Section V.

2. Review

In this Section, we briefly review the retinal layers visible in OCT images of non-exudative AMD subjects, the GTDP layer segmentation algorithm, and the CNN model.

2.1 Human non-exudative AMD OCT image

Figure 1 illustrates a representative retinal OCT image of a patient with non-exudative AMD, with layers labeled. Note that following the terminology of our previous work [52], we refer to the area between the apex of the drusen and RPE layer to Bruch's membrane as retinal pigment epithelium and drusen complex (RPEDC) [53]. The difficulty in segmenting OCT images of non-exudative AMD as compared to normal eyes is the abnormal deformation (and ultimately atrophy) of the retinal layers, especially in the RPE layer in the form of drusen (highlighted by pink rectangles in Fig. 1). On another front, other normal anatomic (e.g. large vessels) and pathologic features (e.g. hyperreflective foci) affect the accuracy of segmentation algorithms developed for normal and diseased retina (e.g. the very basic implementation of the GTDP algorithm as discussed in the next subsection). Through the years, sophisticated software packages have been developed that apply a myriad of ad hoc rules to enhance the performance of these algorithms in the face of specific pathologies. Machine learning algorithms, such as the CNN-GS method described in the remainder of this paper, can be used as an alternative approach to reduce the reliance of segmentation techniques on ad hoc rules.

Fig. 1 Illustration of a retinal OCT image of a patient with non-exudative AMD with nine boundaries between the inner limiting membrane (ILM) labeled in blue and Bruch’s membrane (BrM) labeled in yellow. Eight layers consist of 1-2) nerve fiber layer (NFL); 2-3) ganglion cell layer and inner plexiform layer (GCL + IPL); 3-4) inner nuclear layer (INL); 4-5) outer plexiform layer (OPL); 5-6) outer nuclear layer (ONL); 6-7) inner segment (IS); 7-8) outer segment (OS); and 8-9) retinal pigment epithelium (RPE) and drusen complex (RPEDC).

Download Full Size | PDF

2.2. GTDP layer segmentation

The GTDP algorithm [23] represents each OCT B-scan as a graph of nodes where each node corresponds to a pixel. Neighboring pixels are connected in the graph by links called edges, and each edge is assigned a weight. The weights $w_{a b}$ for each edge between two nodes $a$ and $b$ are calculated based on the intensity gradients

w_{a b} = 2 - (g_{a} + g_{b}) + w_{\min},

where

g_{a}

and

g_{b}

are the vertical intensity gradients at node a and b, respectively, and

w_{\min}

is the minimum possible weight in the graph. The final step is to find a set of connected edges (called a path) that bisects the image. The GTDP method adopts Dijkstra’s algorithm [54] to select the minimum weighted path, which corresponds to a boundary between retinal layers.

2.3. CNN model

A CNN classifier uses a series of transforming layers to extract and classify the features from an input image. Commonly used layers for a CNN classifier include: 1) convolution layers; 2) pooling layers; 3) fully connected layers; and 4) soft max classification layers [38, 39]. Figure 2 illustrates a typical architecture of a CNN model, described in the following.

Fig. 2 Illustration of a typical CNN architecture with two convolution layers, two max pooling layers, one fully connected layer, and one soft max classification layer.

Download Full Size | PDF

Assuming an input two dimensional image x (of size N × M × 1), the first convolution layer convolves the image with $K_{1}$ different spatial kernels (of size $n_{1}$ × $m_{1}$ × 1) to obtain a three dimensional volume of feature maps (of size $N_{1}$ × $M_{1}$ × $K_{1}$ ). Later convolutional layers filter input volumes with different spatial kernels (of size $n_{1}$ × $m_{1}$ × $K_{i - 1}$ ) to obtain a new volume of feature maps (e.g. of $N_{1}$ × $M_{1}$ × $K_{1}$ ). After the convolutions, each unit in a feature map is connected to the previous layer by the corresponding kernel’s weights. Applying multiple kernel convolutions will increase the number of feature maps, creating high computational burdens for future steps. Thus, a pooling layer is often applied after convolution layers, which fuses nearby spatial information (in a kernel window of size w × w) with the max or averaging operations in order to reduce the dimensions of the feature maps [39]. After several convolutional and pooling layers, the high level features are combined in fully connected layers, where the outputs have full connections to all values in the previous layer, with separate weighs for each connection. Finally, a soft max classification layer is applied to the final fully connected layer, which determines the probability of the input image belonging to each class [38]. A CNN that has multiple convolutional, max pooling, and fully connected layers, is termed a deep neural network. In additional to the above four types of basic layers, other commonly used layers include rectified linear units (ReLU) layers and normalization layers [42]. The ReLU layers are usually applied after convolutional layers to non-linearly transform the data using the function $\max (0, x)$ . Here, $x$ is the input to the ReLU layer. The simple transformation in ReLU layers can greatly accelerate the CNN training process [38].

The initial CNN layer weights are randomly selected. The training set is split into mini-batches, with B images per batch. Given a batch of training patches, the CNN uses multiple convolution and pooling layers to extract features and then classify each patch based on the probabilities from the soft max classification layer. After that, the CNN calculates the error between the classification result and the reference label, and then utilizes the backpropagation process [55] to tune all the layer weights to minimize this error. The above process will be repeated several epochs, until the whole CNN model becomes convergent. Here, an epoch is defined as when all batches have been seen, and multiple epochs are used for training [41]. More details about the CNN model can be found in [56].

3. CNN-GS framework for OCT segmentation

We propose the CNN-GS method, which combines CNN and graph search models for the automatic segmentation of OCT images. The CNN-GS method is composed of two main parts: 1) CNN layer boundary classification; and 2) graph search layer segmentation based on the CNN probability maps. The outline of the proposed CNN-GS algorithm is illustrated in Fig. 3.

Fig. 3 Outline of the proposed CNN-GS algorithm.

Download Full Size | PDF

3.1 CNN layer boundary classification

Since there are variations in intensity ranges between OCT images, we first perform intensity normalization on both the training and testing images. Following the intensity normalization method in [32], we first rescale the intensity values of the B-scan X, $I_{X}$ , to be between [0, 1] as follows:

(I_{X} - I_{m i n}) / (I_{m a x} - I_{m i n}),

where

I_{m a x}

and

I_{m i n}

stand for the maximum and minimum values in the B-scan X, respectively. We then apply a median filter with a mask of size 20 × 2. Next, we find the maximum pixel intensity value

I_{m}

from the whole filtered B-scan. We set the value of all pixels in the unfiltered intensity scaled B-scan that are larger than 1.05 ×

I_{m}

to 1.05 ×

I_{m}

. Finally, the intensity values of all pixels are normalized by dividing by the max value in the B-scan. Then, we train a CNN to extract features of specific retinal layer boundaries and to classify nine layer boundaries on OCT images. Specifically, we assign labels “1-9” to layer boundaries from “ILM” to “BrM”. Any pixels that are not on the target layer boundaries, either in or out of the retina, are assigned the label “0”. In the training step, we first extract patches (of size 33 × 33 pixels) centered on each pixel of nine manually segmented layer boundaries from OCT B-scans. These extracted patches are regarded as the positive training samples (with labels “1-9”). In addition, for each A-scan of the OCT B-scans, we randomly select one pixel from non-boundary regions (e.g. layer or background regions) and extract its patch (also of size 33 × 33 pixels) to construct the negative training samples (with label “0”). Both the positive and negative training patches and their classifications are used to train the CNN. In this paper, we use a modified Cifar-CNN [38,57] architecture. After the training process, we obtain a new CNN model with optimized layer weights.

In the testing step, for each pixel of the test OCT image, we extract a patch (of size 33 × 33) centered on that pixel. Classification of all patches from each image would create a high computational burden. Therefore, we first utilize the GTDP algorithm [23] to attain a pilot segmentation of the ILM (top) and BrM (bottom) (see Fig. 1) boundaries and only use patches located between these two boundaries. Since there might be slight errors in the segmentation, the pilot estimation of ILM and BrM boundaries are moved up $T_{u p}$ pixels and down $T_{d o w n}$ pixels, respectively. Finally, we apply the trained CNN to each patch. The CNN outputs a class label and ten probabilities (corresponding to nine layer boundaries and non-boundary regions) for each patch. The output label and probabilities correspond to the center pixel of that patch taken from the full-sized image.

3.2 Graph search layer segmentation based on CNN probability map

The class labels from the CNN are often not precise enough to localize the layer boundaries. Therefore, we use the class probabilities for each pixel with a modified GTDP method to refine the boundaries. Specifically, as described above, we divide each OCT B-scan into 10 classes (consisting of 9 classes of boundaries and 1 class of non-boundary regions). Let t be the class label. For the patch centered at pixel a, the CNN outputs 10 classes of probabilities $P_{a, t}, t \in {0, 1, 2, \dots, 9}$ . The sum of these probabilities for each patch is 1. The higher the probability value for one specific layer boundary, the more likely the pixel belongs to that boundary. We create probability maps for each class by extracting the corresponding probability for that class from all pixels in the image. The probability maps for 9 layer boundaries are illustrated in Fig. 4. For each pixel in the probability map of the t-th layer boundary, larger values correspond to a higher probability that this pixel will be classified as that layer boundary. As shown in Fig. 4(e)-4(m), to better visualize the probability maps, we used a color map where red and blue correspond to larger and smaller probability values, respectively.

Fig. 4 (a) A sample OCT B-scan from our data set, where a sample A-scan is delineated with red. Examples of zoomed in patches extracted from this A-scan are shown in (b). The vertical light-to-dark and dark-to-light gradient images for (a) used in the GTDP algorithm [23], normalized to values between 0 and 1, are shown in (c) and (d). The probability maps created by the CNN for each of the target nine layer boundaries on this B-scan are shown in (e)-(m). Different colors (from blue to red, as shown in the colorbar on the right) represent the probability values in these probability maps. The CNN classification results for test image (a) are shown in (n), where for each pixel, the assigned class corresponds to the class with the highest probability value.

Download Full Size | PDF

To segment the t-th layer boundary, we use the probabilities from the t-th map to compute graph edge weights $w_{a b, t}^{\Pr o b}$ :

w_{a b, t}^{\Pr o b} = 2 - (P_{a, t} + P_{b, t g b}) + w_{m i n},

where a and b are neighboring pixels. Finally, as in [23], we set the minimum graph weight (

w_{m i n}

) to

10^{- 5}

and use Dijkstra’s shortest path algorithm [54] to find the optimal path that bisects the image.

4. Experimental results

4.1 Data sets descriptions

To validate the effectiveness of the proposed CNN-GS layer segmentation method, our experiments used 117 SD-OCT volumes from 39 participants with non-exudative AMD (50 years of age or older). This data set was originally introduced and described in detail in our previous study [58], which was approved by the human research ethics committee of the Royal Victorian Eye and Ear hospital and conducted in adherence with the declaration of Helsinki. All the participants were imaged at three time points over a 12-month period at 6-month intervals. The SD-OCT volumes were acquired using a Spectralis HRA + OCT device (Heidelberg Engineering, Heidelberg, Germany). Each volume includes 48 to 49 B-scans (of size 496 × 1024 pixels). Each B-scan is the average of up to 25 frames acquired at almost the same position to reduce noise. Each B-scan was semi-automatically segmented by DOCTRAP software (Duke University, Durham, NC, USA). Specifically, DOCTRAP utilizes the GTDP algorithm [23, 52] as the core segmentation engine combined with a set of ad hoc rules to automatically segment the B-scans into eight layers. Next, the automatically segmented layers were carefully reviewed and corrected by an expert grader to attain the gold standard grading.

From the data set, we randomly selected 19 eyes for training the CNN model. 171 OCT B-scans from the 57 volumes were used (for each volume, one B-scan was taken from foveal region and from the peripheral regions above and below the fovea, respectively). We used the remaining 60 volumes from 20 eyes (2915 B-scans) for the testing. Note that, the CNN training data set was completely separate from the data set used for the testing.

4.2 Parameter setting

In our experiment, we adopted the Cifar-CNN architecture from the MatConvNet platform [59] (Downloaded at: http://www.vlfeat.org/matconvnet/) for training and testing the CNN model. The architecture and parameters of the Cifar-CNN used in our experiments are presented in Table 1. The parameters for the Cifar-CNN model in Table 1 were chosen to be the default values of Matconvnet. These parameters were already tuned by the Matconvnet researchers in constructing the Cifar-CNN model [59]. For the sake of completeness, we tested other parameters but did not achieve better results than with the default parameters. The patch size was chosen with respect to the resolution (pixel-pitch) of the images in our data set and the size of anatomical features of interest. The number of patches in each batch, B, for training was 100. The model was trained for 45 epochs, and the weight decay and learning rates were kept at their default values. In our experiments, utilizing more epochs did not significantly decrease the training error, but increased computational cost. The parameters for adjusting initial GTDP segmentation of the ILM and BrM boundaries ( $T_{u p}$ and $T_{d o w n}$ ) were set to 15 and 20 pixels, respectively. These can avoid segmentation errors from the GTDP algorithm.

Table 1. Architecture of the Cifar-CNN Used in Our Experiments

View Table | View all tables in this article

4.3 Layer segmentation results

After training the CNN model, we then evaluated its performance using a separate test data set. We applied the proposed CNN-GS method, DOCTRAP software, and the publicly available OCTExplorer software (downloaded at: https://www.iibi.uiowa.edu/content/iowa-reference-algorithms-human-and-murine-oct-retinal-layer-analysis-and-display) [26, 60, 61] to the 60 OCT volumes from the test set and compared their results with the manually corrected segmentations. OCTExplorer is a 3D OCT layer segmentation software, which utilizes correlations among nearby B-scans for segmentation [60]. The latest version OCTExplorer 4.0.0 (beta) was used in our experiments. To have a fair comparison with OCTExplorer, we strived to match layer boundary delineations of the OCTExplorer and the gold standard grading. Note that there might exist a bias between the manual and OCTExplorer segmentations for each boundary. Such biases arise when the convention in marking the location of a certain layer boundary by one method is consistently different to that of another method. To correct for any bias, we applied pixel shifts to each segmented boundary from OCTExplorer and found the shift that minimized the absolute pixel difference with respect to the corresponding manually segmented boundary across the test data set. We found that the best results are attained when we shift each boundary in the automatic segmentation of the OCTExplorer down by bias values of 1, 1, 1, 1, 1, 1, 2, and 1 pixels, respectively.

To attain quantitative performance metrics, first, for each B-scan in the test data set, we calculated the mean thickness difference (in pixels) between the automated and manual segmentations for all layers (eight in the cases of DOCTRAP and CNN-GS, and seven in the case of OCTExplorer). Next, after taking the absolute value of these differences, the mean and standard deviation across all 2915 B-scans from the 60 volumes were calculated. These values are shown in Table 2. Note that, OCTExplorer does not segment the ONL-IS boundary, so a combination of the two layers (ONL + IS) is reported in Table 2 to allow for comparison between the methods. The total retinal thickness (in Table 2) stands for the thickness between the ILM and BrM boundaries. Figure 5 also illustrates the visual comparison results of the CNN-GS, DOCTRAP, OCTExplorer, and manual segmentation. As can be seen, the proposed CNN-GS method performed better than the OCTExplorer software on all segmented layers except the OS in terms of mean difference. However, it is important to note that the manual grader and CNN-GS aimed to segment the BrM boundary, while the OCTExplorer software targeted the outer boundary of RPE [26, 60, 61]. This mismatch in boundary definition in large has contributed to the differences of OCTExplorer with manual grading for the RPEDC layer and total retina thicknesses. On another front, for the sake of completeness, we have also reported the results of the automated DOCTRAP software before manual correction. It should be noted that since the manual grading is based on the semi-automatic correction of the DOCTRAP results, there is a positive bias towards the reported accuracy of the DOCTRAP software.

Table 2. Differences (in pixels) in segmentation between manual grading and automated grading using our CNN-GS method, OCTExplorer software, and DOCTRAP software. The best results for each layer are labeled in bold. (*) emphasizes that OCTExplorer does not delineate the challenging Bruch’s membrane boundary and instead is targeted at segmenting the outer boundary of RPE, which in part explains the large differences with manual grading. (**) emphasizes that since manual grading was based on correcting the DOCTRAP results, there is a positive bias toward the reported DOCTRAP accuracy.

View Table | View all tables in this article

Fig. 5 Visual comparisons among the manual segmentations, OCTExplorer, DOCTRAP, and CNN-GS results on three non-exudative AMD images from the testing set.

Download Full Size | PDF

Our CNN-GS algorithm was implemented in MATLAB R2016b and run on a desktop PC with a GPU NVIDIA GeForce GTX 980 equipped on an Intel Core i7 3.5 GHz machine. The average run time of our CNN-GS algorithm per B-Scan is 43.1 seconds.

5. Conclusions

In this paper, we presented a novel convolutional neural networks and graph search based method named CNN-GS for automatic segmentation of nine layer boundaries on non-exudative AMD OCT images. The CNN-GS method utilizes a CNN to extract effective features of specific retinal layer boundaries and train a corresponding classifier to delineate eight layers. In addition, we further applied graph search methods on the probability maps generated by the CNN to obtain the final boundaries. Our experimental results on a relatively large data set of 60 OCT volumes (2915 B-scans) from non-exudative AMD eyes demonstrate the effectiveness of the proposed CNN-GS method. Note that we could have used a deeper CNN, which is expected to further improve our segmentation performance, but at the cost of a higher computational burden.

A major disadvantage of the proposed method is its reliance on the availability of a large annotated data set. The black-boxed design of the CNN makes customization and performance analysis of each step less tractable and reduces the options for customization. Also, there are still some parameters such as patch and filter size that are empirically selected. Moreover, in its current implementation, CNN-GS is more computationally intensive as compared to competing methods. We note that the majority of the computational cost is incurred by the CNN classification of each patch of the B-scan. Our CNN model is implemented in the Matconvnet platform and is expected to be accelerated by using the Caffe or Python platforms. In addition, classification of the large number of patches per B-scan creates a very high computational burden, and in our future work we plan to design a deep convolutional network model to process each B-scan as a whole to increase efficiency.

The results of this study are encouraging because of the simplicity and versatility of the proposed method. We emphasize that the core framework of the CNN-GS method is not specifically tailored for non-exudative AMD structures. This is in contrast to many previous automatic algorithms which required a multitude of ad hoc rules to make them suitable for segmenting images from AMD eyes (e.g [52].). We expect that this framework will be applicable for many other types of disease by simply replacing or extending the training data set with manually segmented images of the disease of interest. It is expected that in some more challenging cases, modifications and customizations to this learning based technique will be needed. Such modifications are expected to be much less extensive than what is required for repurposing fixed mathematical model based methods. Thus, the proposed method is an important step toward the ultimate goal of attaining a universally applicable OCT segmentation software.

Note that, each 33 × 33 patch (e.g. Fig. 4(b)) is used to calculate only the probability values of the central pixel in that patch for the layer probability maps (e.g. Figs. 4(e)-4(m)). The fixed patch sizes and the (5 × 5) convolutional filters adopted in the CNN model are not considered to be optimal as they were chosen empirically. Therefore, one of our ongoing works is to design shape adaptive filters which can adjust their sizes according to the resolution of the OCT system, and the size of the anatomic and pathologic structures of interest.

Our segmentations were demonstrated to be close to semi-automatic grading, which we deem as the gold standard. However, we should emphasize that there is no guarantee that the gold standard human marking, even if we do not consider inter or intra grader variabilities, perfectly represent the true anatomic and pathological features of interest. OCT images, as in other ophthalmic imaging technologies [62], contain imaging artifacts. A good example is the variability in visualizing Henle's fiber layer, which severely affects the delineation of the OPL-ONL boundary [63].

In this paper, the proposed CNN-GS method was only trained and tested on non-exudative AMD SDOCT images. In further publications, we will extend the proposed CNN-GS model to handle other kinds of pathologies seen in other diseases of the eye. In addition, there are strong incentives to apply the CNN methodology to other OCT image applications (e.g. image denoising, interpolation, and lesion detection).

Funding

Duke/Duke-NUS pilot collaborative grant and National Natural Science Foundation of China (NSFC) (61325007, 61501180).

References and links

1. D. Huang, E. A. Swanson, C. P. Lin, J. S. Schuman, W. G. Stinson, W. Chang, M. R. Hee, T. Flotte, K. Gregory, C. A. Puliafito, and J. G. Fujimoto, “Optical coherence tomography,” Science 254(5035), 1178–1181 (1991). [CrossRef] [PubMed]

2. S. Bhat, I. V. Larina, K. V. Larin, M. E. Dickinson, and M. Liebling, “4D reconstruction of the beating embryonic heart from two orthogonal sets of parallel optical coherence tomography slice-sequences,” IEEE Trans. Med. Imaging 32(3), 578–588 (2013). [CrossRef] [PubMed]

3. P. A. Keane, S. Liakopoulos, R. V. Jivrajka, K. T. Chang, T. Alasil, A. C. Walsh, and S. R. Sadda, “Evaluation of optical coherence tomography retinal thickness parameters for use in clinical trials for neovascular age-related macular degeneration,” Invest. Ophthalmol. Vis. Sci. 50(7), 3378–3385 (2009). [CrossRef] [PubMed]

4. P. Malamos, C. Ahlers, G. Mylonas, C. Schütze, G. Deak, M. Ritter, S. Sacu, and U. Schmidt-Erfurth, “Evaluation of segmentation procedures using spectral domain optical coherence tomography in exudative age-related macular degeneration,” Retina 31(3), 453–463 (2011). [CrossRef] [PubMed]

5. P. P. Srinivasan, L. A. Kim, P. S. Mettu, S. W. Cousins, G. M. Comer, J. A. Izatt, and S. Farsiu, “Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images,” Biomed. Opt. Express 5(10), 3568–3577 (2014). [CrossRef] [PubMed]

6. J. C. Bavinger, G. E. Dunbar, M. S. Stem, T. S. Blachley, L. Kwark, S. Farsiu, G. R. Jackson, and T. W. Gardner, “The effects of diabetic retinopathy and pan-retinal photocoagulation on photoreceptor cell function as assessed by dark adaptometry DR and PRP effects on photoreceptor cell function,” Invest. Ophthalmol. Vis. Sci. 57(1), 208–217 (2016). [CrossRef] [PubMed]

7. C. A. Puliafito, M. R. Hee, C. P. Lin, E. Reichel, J. S. Schuman, J. S. Duker, J. A. Izatt, E. A. Swanson, and J. G. Fujimoto, “Imaging of macular diseases with optical coherence tomography,” Ophthalmology 102(2), 217–229 (1995). [CrossRef] [PubMed]

8. B. Knoll, J. Simonett, N. J. Volpe, S. Farsiu, M. Ward, A. Rademaker, S. Weintraub, and A. A. Fawzi, “Retinal nerve fiber layer thickness in amnestic mild cognitive impairment: Case-control study and meta-analysis,” Alzheimers Dement (Amst) 4(8), 85–93 (2016). [PubMed]

9. J. M. Simonett, R. Huang, N. Siddique, S. Farsiu, T. Siddique, N. J. Volpe, and A. A. Fawzi, “Macular sub-layer thinning and association with pulmonary function tests in Amyotrophic Lateral Sclerosis,” Sci. Rep. 6(29187), 1–6 (2016). [CrossRef] [PubMed]

10. J. Wang, M. Zhang, A. D. Pechauer, L. Liu, T. S. Hwang, D. J. Wilson, D. Li, and Y. Jia, “Automated volumetric segmentation of retinal fluid on optical coherence tomography,” Biomed. Opt. Express 7(4), 1577–1589 (2016). [CrossRef] [PubMed]

11. J. Polans, D. Cunefare, E. Cole, B. Keller, P. S. Mettu, S. W. Cousins, M. J. Allingham, J. A. Izatt, and S. Farsiu, “Enhanced visualization of peripheral retinal vasculature with wavefront sensorless adaptive optics optical coherence tomography angiography in diabetic patients,” Opt. Lett. 42(1), 17–20 (2017). [CrossRef] [PubMed]

12. L. Fang, S. Li, D. Cunefare, and S. Farsiu, “Segmentation based sparse reconstruction of optical coherence tomography images,” IEEE Trans. Med. Imaging 36(2), 407–421 (2017). [CrossRef] [PubMed]

13. L. Fang, S. Li, X. Kang, J. A. Izatt, and S. Farsiu, “3-D adaptive sparsity based image compression with applications to optical coherence tomography,” IEEE Trans. Med. Imaging 34(6), 1306–1320 (2015). [CrossRef] [PubMed]

14. D. C. DeBuc, “A review of algorithms for segmentation of retinal image data using optical coherence tomography,” in Image Segmentation (InTech, 2011), pp. 15–54.

15. R. Kafieh, H. Rabbani, M. D. Abramoff, and M. Sonka, “Intra-retinal layer segmentation of 3D optical coherence tomography using coarse grained diffusion map,” Med. Image Anal. 17(8), 907–928 (2013). [CrossRef] [PubMed]

16. D. Koozekanani, K. Boyer, and C. Roberts, “Retinal thickness measurements from optical coherence tomography using a Markov boundary model,” IEEE Trans. Med. Imaging 20(9), 900–916 (2001). [CrossRef] [PubMed]

17. H. Ishikawa, D. M. Stein, G. Wollstein, S. Beaton, J. G. Fujimoto, and J. S. Schuman, “Macular segmentation with optical coherence tomography,” Invest. Ophthalmol. Vis. Sci. 46(6), 2012–2017 (2005). [CrossRef] [PubMed]

18. M. Mujat, R. Chan, B. Cense, B. Park, C. Joo, T. Akkin, T. Chen, and J. de Boer, “Retinal nerve fiber layer thickness map determined from optical coherence tomography images,” Opt. Express 13(23), 9480–9491 (2005). [CrossRef] [PubMed]

19. M. A. Mayer, J. Hornegger, C. Y. Mardin, and R. P. Tornow, “Retinal nerve fiber layer segmentation on FD-OCT scans of normal subjects and glaucoma patients,” Biomed. Opt. Express 1(5), 1358–1383 (2010). [CrossRef] [PubMed]

20. S. Farsiu, S. J. Chiu, J. A. Izatt, and C. A. Toth, “Fast detection and segmentation of Drusen in retinal optical coherence tomography images,” in Proceedings of Photonics West, Ophthalmic Technologies (SPIE, 2008),pp. 1–12.

21. S. Niu, L. de Sisternes, Q. Chen, T. Leng, and D. L. Rubin, “Automated geographic atrophy segmentation for SD-OCT images using region-based C-V model via local similarity factor,” Biomed. Opt. Express 7(2), 581–600 (2016). [CrossRef] [PubMed]

22. J. Oliveira, S. Pereira, L. Gonçalves, M. Ferreira, and C. A. Silva, “Multi-surface segmentation of OCT images with AMD using sparse high order potentials,” Biomed. Opt. Express 8(1), 281–297 (2017). [CrossRef] [PubMed]

23. S. J. Chiu, X. T. Li, P. Nicholas, C. A. Toth, J. A. Izatt, and S. Farsiu, “Automatic segmentation of seven retinal layers in SDOCT images congruent with expert manual segmentation,” Opt. Express 18(18), 19413–19428 (2010). [CrossRef] [PubMed]

24. S. J. Chiu, C. A. Toth, C. Bowes Rickman, J. A. Izatt, and S. Farsiu, “Automatic segmentation of closed-contour features in ophthalmic images using graph theory and dynamic programming,” Biomed. Opt. Express 3(5), 1127–1140 (2012). [CrossRef] [PubMed]

25. F. LaRocca, S. J. Chiu, R. P. McNabb, A. N. Kuo, J. A. Izatt, and S. Farsiu, “Robust automatic segmentation of corneal layer boundaries in SDOCT images using graph theory and dynamic programming,” Biomed. Opt. Express 2(6), 1524–1538 (2011). [CrossRef] [PubMed]

26. X. Chen, M. Niemeijer, L. Zhang, K. Lee, M. D. Abràmoff, and M. Sonka, “Three-dimensional segmentation of fluid-associated abnormalities in retinal OCT: probability constrained graph-search-graph-cut,” IEEE Trans. Med. Imaging 31(8), 1521–1531 (2012). [CrossRef] [PubMed]

27. B. Keller, D. Cunefare, D. S. Grewal, T. H. Mahmoud, J. A. Izatt, and S. Farsiu, “Length-adaptive graph search for automatic segmentation of pathological features in optical coherence tomography images,” J. Biomed. Opt. 21(7), 076015 (2016). [CrossRef] [PubMed]

28. J. Tian, B. Varga, G. M. Somfai, W.-H. Lee, W. E. Smiddy, and D. C. DeBuc, “Real-time automatic segmentation of optical coherence tomography volume data of the macular region,” PLoS One 10(8), e0133908 (2015). [CrossRef] [PubMed]

29. P. P. Srinivasan, S. J. Heflin, J. A. Izatt, V. Y. Arshavsky, and S. Farsiu, “Automatic segmentation of up to ten layer boundaries in SD-OCT images of the mouse retina with and without missing layers due to pathology,” Biomed. Opt. Express 5(2), 348–365 (2014). [CrossRef] [PubMed]

30. S. P. K. Karri, D. Chakraborthi, and J. Chatterjee, “Learning layer-specific edges for segmenting retinal layers with large deformations,” Biomed. Opt. Express 7(7), 2888–2901 (2016). [CrossRef] [PubMed]

31. Q. Yang, C. A. Reisman, K. Chan, R. Ramachandran, A. Raza, and D. C. Hood, “Automated segmentation of outer retinal layers in macular OCT images of patients with retinitis pigmentosa,” Biomed. Opt. Express 2(9), 2493–2503 (2011). [CrossRef] [PubMed]

32. A. Lang, A. Carass, M. Hauser, E. S. Sotirchos, P. A. Calabresi, H. S. Ying, and J. L. Prince, “Retinal layer segmentation of macular OCT images using boundary classification,” Biomed. Opt. Express 4(7), 1133–1152 (2013). [CrossRef] [PubMed]

33. K. McDonough, I. Kolmanovsky, and I. V. Glybina, “A neural network approach to retinal layer boundary identification from optical coherence tomography images,” in IEEE Conference on Computational Intelligence in Bioinformatics and computational biology, (IEEE, 2015),1–8. [CrossRef]

34. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef] [PubMed]

35. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science 313(5786), 504–507 (2006). [CrossRef] [PubMed]

36. G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Comput. 18(7), 1527–1554 (2006). [CrossRef] [PubMed]

37. W. Ouyang, X. Wang, X. Zeng, S. Qiu, P. Luo, Y. Tian, H. Li, S. Yang, Z. Wang, and C.-C. Loy, “Deepid-net: Deformable deep convolutional neural networks for object detection,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2015),pp. 2403–2412. [CrossRef]

38. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proceedings of the Advances in Neural Information Processing Systems, (MIT Press, 2012),pp. 1097–1105.

39. P. Liskowski and K. Krawiec, “Segmenting retinal blood vessels with deep neural networks,” IEEE Trans. Med. Imaging 35(11), 2369–2380 (2016). [CrossRef] [PubMed]

40. Q. Li, B. Feng, L. Xie, P. Liang, H. Zhang, and T. Wang, “A cross-modality learning approach for vessel segmentation in retinal images,” IEEE Trans. Med. Imaging 35(1), 109–118 (2016). [CrossRef] [PubMed]

41. M. J. van Grinsven, B. van Ginneken, C. B. Hoyng, T. Theelen, and C. I. Sánchez, “Fast convolutional neural network training using selective data sampling: Application to hemorrhage detection in color fundus images,” IEEE Trans. Med. Imaging 35(5), 1273–1284 (2016). [CrossRef] [PubMed]

42. S. Pereira, A. Pinto, V. Alves, and C. A. Silva, “Brain tumor segmentation using convolutional neural networks in MRI images,” IEEE Trans. Med. Imaging 35(5), 1240–1251 (2016). [CrossRef] [PubMed]

43. Qi Dou, Hao Chen, Lequan Yu, Lei Zhao, V. C. Jing Qin, Defeng Wang, Mok, Lin Shi, and Pheng-Ann Heng, “Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks,” IEEE Trans. Med. Imaging 35(5), 1182–1195 (2016). [CrossRef] [PubMed]

44. S. P. K. Karri, D. Chakraborty, and J. Chatterjee, “Transfer learning based classification of optical coherence tomography images with diabetic macular edema and dry age-related macular degeneration,” Biomed. Opt. Express 8(2), 579–592 (2017). [CrossRef] [PubMed]

45. D. Sheet, S. P. K. Karri, and A. Katouzian, “Deep learning of tissue specific speckle representations in optical coherence tomography and deeper exploration for in situ histology,” in IEEE International Symposium on Biomedical Imaging, (IEEE, 2015),pp.777–780. [CrossRef]

46. I. Ghorbel, F. Rossant, I. Bloch, S. Tick, and M. Paques, “Automated segmentation of macular layers in OCT images and quantitative evaluation of performances,” Pattern Recognit. 44(8), 1590–1603 (2011). [CrossRef]

47. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected CRFs,” in Proceedings of International Conference on Learning Representation, (2015)

48. C. Li and M. Wand, “Combining Markov random fields and convolutional neural networks for image synthesis,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2016),pp. 2479–2486. [CrossRef]

49. S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineety, Z. Su, D. Du, C. Huang, and P. H. S. Torr, “Conditional random fields as recurrent neural networks,” in Proceedings of International Conference on Computer Vision, (IEEE, 2015),pp. 1529–1537. [CrossRef]

50. M. Niepert, M. Ahmed, and K. Kutzkov, “Learning convolutional neural networks for graphs,” in Proceedings of International Conference Machine Learning, (2016)

51. X. Sui, Y. Zheng, B. Wei, H. Bi, J. Wu, X. Pan, Y. Yin, and S. Zhang, “Choroid segmentation from optical coherence tomography with graph-edge weights learned from deep convolutional neural networks,” Neurocomputing 237, 332–341 (2017).

52. S. J. Chiu, J. A. Izatt, R. V. O’Connell, K. P. Winter, C. A. Toth, and S. Farsiu, “Validated automatic segmentation of AMD pathology including drusen and geographic atrophy in SD-OCT images,” Invest. Ophthalmol. Vis. Sci. 53(1), 53–61 (2012). [CrossRef] [PubMed]

53. S. Farsiu, S. J. Chiu, R. V. O’Connell, F. A. Folgar, E. Yuan, J. A. Izatt, C. A. Toth, and Age-Related Eye Disease Study 2 Ancillary Spectral Domain Optical Coherence Tomography Study Group, “Quantitative classification of eyes with and without intermediate age-related macular degeneration using optical coherence tomography,” Ophthalmology 121(1), 162–172 (2014). [CrossRef] [PubMed]

54. E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numer. Math. 1(1), 269–271 (1959). [CrossRef]

55. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Handwritten digit recognition with a back-propagation netw,” in Proceedings of Advances in Neural Information Processing Systems (MIT Press, 1990), pp. 396–404.

56. J. Schmidhuber, “Deep learning in neural networks: an overview,” Neural Netw. 61(10), 85–117 (2015). [CrossRef] [PubMed]

57. H. R. Roth, L. Lu, J. Liu, J. Yao, A. Seff, K. Cherry, L. Kim, and R. M. Summers, “Improving computer-aided detection using convolutional neural networks and random view aggregation,” IEEE Trans. Med. Imaging 35(5), 1170–1181 (2016). [CrossRef] [PubMed]

58. Z. Wu, D. Cunefare, E. Chiu, C. D. Luu, L. N. Ayton, C. A. Toth, S. Farsiu, and R. H. Guymer, “Longitudinal associations between microstructural changes and microperimetry in the early stages of age-related macular degeneration longitudinal structure and function associations in AMD,” Invest. Ophthalmol. Vis. Sci. 57(8), 3714–3722 (2016). [CrossRef] [PubMed]

59. A. Vedaldi and K. Lenc, “Matconvnet: Convolutional neural networks for matlab,” in Proceedings of the ACM International Conference on Multimedia, (ACM, 2015),PP. 689–692. [CrossRef]

60. B. Antony, M. D. Abràmoff, L. Tang, W. D. Ramdas, J. R. Vingerling, N. M. Jansonius, K. Lee, Y. H. Kwon, M. Sonka, and M. K. Garvin, “Automated 3-D method for the correction of axial artifacts in spectral-domain optical coherence tomography images,” Biomed. Opt. Express 2(8), 2403–2416 (2011). [CrossRef] [PubMed]

61. M. D. Abràmoff, M. K. Garvin, and M. Sonka, “Retinal imaging and image analysis,” IEEE Rev. Biomed. Eng. 3(1), 169–208 (2010). [CrossRef] [PubMed]

62. D. Cunefare, R. F. Cooper, B. Higgins, D. F. Katz, A. Dubra, J. Carroll, and S. Farsiu, “Automatic detection of cone photoreceptors in split detector adaptive optics scanning light ophthalmoscope images,” Biomed. Opt. Express 7(5), 2036–2050 (2016). [CrossRef] [PubMed]

63. B. J. Lujan, A. Roorda, R. W. Knighton, and J. Carroll, “Revealing Henle’s Fiber Layer Using Spectral Domain Optical Coherence Tomography,” Invest. Ophthalmol. Vis. Sci. 52(3), 1486–1492 (2011). [CrossRef] [PubMed]

	Type	Filter size	Stride	Filter number	Padding
Layer 1	Convolution	5 × 5 × 1	1 × 1	32	2
Layer 2	Max pool	3 × 3	2 × 2	—	0
Layer 3	ReLU	—	—	—	—
Layer 4	Convolution	5 × 5 × 32	1 × 1	32	2
Layer 5	ReLU	—	—	—	—
Layer 6	Average Pool	3 × 3	2 × 2	—	0
Layer 7	Convolution	5 × 5 × 32	1 × 1	64	2
Layer 8	ReLU	—	—	—	—
Layer 9	Average Pool	3 × 3	2 × 2	—	0
Layer 10	Fully Connected	4 × 4 × 64	—	64	0
Layer 11	ReLU	—	—	—	—
Layer 12	Fully Connected	1 × 1 × 64	—	10	0
Layer 13	Softmaxloss	—	—	—	—

Retinal Layer	CNN-GS vs manual grader		OCTExplorer vs manual grader		DOCTRAP ** vs manual grader
Retinal Layer	Mean Difference	Standard Deviation	Mean Difference	Standard Deviation	Mean Difference	Standard Deviation
NFL	0.57	0.58	0.68	1.30	0.48	0.74
GCL + IPL	0.46	0.67	0.65	1.28	0.51	0.85
INL	0.22	0.30	0.65	1.02	0.43	1.42
OPL	0.66	0.89	0.84	1.19	0.70	0.96
ONL	0.77	1.00	—	—	1.00	1.46
IS	0.16	0.29	—	—	0.08	0.20
ONL + IS	0.71	0.93	1.40	3.11	0.98	1.44
OS	0.98	0.96	0.76	0.59	0.96	1.01
RPEDC*	1.15	1.61	2.25	2.83	1.33	3.17
Total Retina*	1.26	1.24	3.40	6.98	0.50	3.57

	Type	Filter size	Stride	Filter number	Padding
Layer 1	Convolution	5 × 5 × 1	1 × 1	32	2
Layer 2	Max pool	3 × 3	2 × 2	—	0
Layer 3	ReLU	—	—	—	—
Layer 4	Convolution	5 × 5 × 32	1 × 1	32	2
Layer 5	ReLU	—	—	—	—
Layer 6	Average Pool	3 × 3	2 × 2	—	0
Layer 7	Convolution	5 × 5 × 32	1 × 1	64	2
Layer 8	ReLU	—	—	—	—
Layer 9	Average Pool	3 × 3	2 × 2	—	0
Layer 10	Fully Connected	4 × 4 × 64	—	64	0
Layer 11	ReLU	—	—	—	—
Layer 12	Fully Connected	1 × 1 × 64	—	10	0
Layer 13	Softmaxloss	—	—	—	—

Retinal Layer	CNN-GS vs manual grader		OCTExplorer vs manual grader		DOCTRAP ** vs manual grader
Retinal Layer	Mean Difference	Standard Deviation	Mean Difference	Standard Deviation	Mean Difference	Standard Deviation
NFL	0.57	0.58	0.68	1.30	0.48	0.74
GCL + IPL	0.46	0.67	0.65	1.28	0.51	0.85
INL	0.22	0.30	0.65	1.02	0.43	1.42
OPL	0.66	0.89	0.84	1.19	0.70	0.96
ONL	0.77	1.00	—	—	1.00	1.46
IS	0.16	0.29	—	—	0.08	0.20
ONL + IS	0.71	0.93	1.40	3.11	0.98	1.44
OS	0.98	0.96	0.76	0.59	0.96	1.01
RPEDC*	1.15	1.61	2.25	2.83	1.33	3.17
Total Retina*	1.26	1.24	3.40	6.98	0.50	3.57

Automatic segmentation of nine retinal layer boundaries in OCT images of non-exudative AMD patients using deep learning and graph search

Abstract

1. Introduction

2. Review

2.1 Human non-exudative AMD OCT image

2.2. GTDP layer segmentation

2.3. CNN model

3. CNN-GS framework for OCT segmentation

3.1 CNN layer boundary classification

3.2 Graph search layer segmentation based on CNN probability map

4. Experimental results

4.1 Data sets descriptions

4.2 Parameter setting

4.3 Layer segmentation results

5. Conclusions

Funding

References and links

Cited By

Figures (5)

Tables (2)

Equations (3)

Biomedical Optics Express