Hybrid deep learning and optimal graph search method for optical coherence tomography layer segmentation in diseases affecting the optic nerve

Zhi Chen; Zhi Chen; Honghai Zhang; Honghai Zhang; Edward F. Linton; Brett A. Johnson; Yun Jae Choi; Yun Jae Choi; Mark J. Kupersmith; Milan Sonka; Milan Sonka; Mona K. Garvin; Mona K. Garvin; Mona K. Garvin; Randy H. Kardon; Randy H. Kardon; Jui-Kai Wang; Jui-Kai Wang; Jui-Kai Wang

doi:10.1364/BOE.516045

1. Introduction

Optical coherence tomography (OCT) imaging is essential for the diagnosis, progression monitoring and damage assessment of diseases affecting the optic nerve, most of which cause some degree of irreversible vision loss [1–3]. OCT enables visualization and thickness measurement of the retinal nerve fiber layer (RNFL) and ganglion cell-inner plexiform layer (GCIPL), which are reduced due to optic nerve damage. These measures provide important clinical information about structural changes needed for monitoring the progression of diseases. For glaucoma patients, the gradual progression of nerve damage and thinning of these layers are visible [3–6] and can be quantified [3,7]. For patients with multiple sclerosis (MS), subtle regional thinning over time in RNFL and GCIPL thickness are correlated with factors beyond visual function, such as reduced brain volume, neuronal loss (demonstrated on pathology of the eye), cognitive impairment, and overall disability [8–10]. For patients with non-arteritic ischemic optic neuropathy (NAION), the condition is evident by the initial manifestation of substantial retinal swelling, followed by rapid thinning of the RNFL and inner layers of the retina in the subsequent months [2,11,12]. Figure 1 shows several examples of OCT images with thickening and thinning of retina layers. The significant thickening and thinning of retina layers can cause standard automated segmentation methods to fail to identify RNFL and GCIPL correctly [13,14], confining eye care providers to following glaucoma progression primarily by visual field (VF) outcomes, which depends on determining the threshold light perception in different locations of patients’ visual field and can be inconsistent particularly when the visual field loss is severe [15]. To ensure reliable and reproducible retinal structural quantification in OCT, an accurate layer segmentation method that is robust to extreme variations in layer thickness is critical.

Fig. 1. Examples of OCT central B-scans showing thickening and thinning. (a) A NAION affected eye shows edema at the region close to the optic nerve head and fluid around the outer retina. (b) The same patient’s unaffected fellow eye at the same visit shows no swelling and atrophy. (c) The NAION affected eye after six months shows significant GCIPL thinning, and the RNFL is hardly identifiable. (d) An independent eye with advanced glaucoma also shows severe thinning of the RNFL and GCIPL.

Download Full Size | PDF

Traditionally, experts design automated OCT layer segmentation approaches based on domain knowledge assuming regular retinal thicknesses, surface topology, and image properties of OCT devices. Previously, we developed The Iowa Reference Algorithms (also known as OCTExplorer [40]) with highly integrated specific designs and parameters to fit automated layer segmentation in both macular and optic-nerve-head (ONH) volumetric scans for different OCT devices. It was designed based on Layered Optimal Graph Image Segmentation for Multiple Objects and Surfaces (LOGISMOS, Sec. 2.2), a general purpose method for optimally segmenting multiple $n$-D surfaces that mutually interact within individual and/or between objects [16–19]. Since LOGISMOS theoretically guarantees surface/object topology by design, it has been widely used and exceeds segmentation accuracy compared to other standard automated methods [20] in multi-surface/object segmentation applications. However, the LOGISMOS parameters in OCTExplorer were not tailored for severe retinopathy and/or retinal deformation, resulting in unreliable segmentation results and structural measurements in extreme cases of thickening and thinning of the retinal layers. Although LOGISMOS can achieve the desired accuracy by using fine-tuned parameters (surface constraints) for specific retina pathology [21–23], such parameter tuning is often difficult for users without sufficient understanding of the underlying algorithm.

Recent advances in deep learning methods have demonstrated exceptional capabilities in modeling retinal structure by directly learning features from input data (with precise, high-quality manually traced layers/surfaces in cases of supervised retinal layer segmentation). Among various deep learning architectures, U-Net [24] based approaches have gained significant attention in OCT layer segmentation. Mishra et al. [25] proposed using a U-Net to create 2D probability maps of drusen and reticular pseudodrusen (RPD) and combine them with the image gradient to guide a shortest-path algorithm to identify retinal surfaces in 2D B-scans. He et al. [26] trained a residual U-Net with two output branches for two probability maps of the retinal layers and junction surfaces and merged probability maps in an iterative surface topology module to estimate the retinal surface locations. Furthermore, Yadav et al. [27] proposed a cascaded two-stage design, in which the first U-Net identified the retina in 2D OCT B-scan and the following U-Net focused on identifying each retinal layer based on the pre-segmented retina. Most recently, He et al. [28] aimed to segment longitudinal OCT scans by proposing a long-short-term-memory U-Net (LSTM-UNet) to incorporate adjacent 2D B-scans information and convert the segmented surfaces as longitudinal priors. A hidden Markov model was also used to avoid segmenting all past images in the OCT longitudinal segmentation task to reduce GPU memory usage. Mukherjee et al. [29] utilized 3D U-Net using blocks of 8 neighboring B-scans to produce single-pixel surfaces generated from A-scan-wise maximal probability and a 3D autoencoder was used as constraint-inducing regularizer to enforce surface smoothness and improve surface topology.

Existing automated approaches have shown excellent performance in identifying retinal surfaces, but several major issues remain. First, although 2D segmentation methods do not require intense computational power, the segmentation results can be locally unstable due to the absence of volumetric contextual information. Some deep-learning methods require intricate pre- and post-processing steps, such as hole filling, outlier detection, and surface/layer topology checking [27], which are often empirically tuned according to the training data and can cause unforeseen errors in different test datasets. Another common strategy in post-processing is to refine the segmented surface according to the topological order of the retina [25] such that the errors occur on the first-processed surface can easily propagate to other surfaces that were originally segmented correctly.

In this study, a hybrid approach, Deep LOGISMOS, was developed to combine the strengths of deep learning and LOGISMOS while overcoming their respective limitations. A two-stage true 3D pipeline was implemented to improve the accuracy, robustness and generalizability. The novelties and advantages of this work are as follows.

A two-stage segmentation approach is used – pre-segmentation locates the whole retina region in the full image and final segmentation identifies individual layers. Each stage employs a dedicated deep learning network, whose output probability map is used to derive in-region cost for the following LOGISMOS optimization that also incorporates weighted gradient-based on-surface cost to improve segmentation robustness and accuracy.

High quality deep learning probability maps eliminate the need for complex parameter tuning for LOGISMOS. The built-in topology guarantees in LOGISMOS eliminate the need for complex and dataset-specific post-processing of deep learning results. State-of-the-art multi-threading maximum flow solver substantially reduces running time of LOGISMOS graph optimization. True 3D implementation is less sensitive to outlier B-scans with heavy noise or artifacts.

In addition to the traditional segmentation accuracy metrics, the clinical usability of the developed method is also assessed by quality of resulting retinal layer thickness maps. The robustness and generalizability are evaluated across cohorts and OCT devices.

This work does not involve creating a new deep learning segmentation network or augmenting existing capabilities of LOGISMOS. Instead, it determines the usefulness of effective combination of two methodologies. The deep learning components in this study were implemented using nnU-Net framework [30] but can be potentially replaced with any network capable of producing probability maps for retinal layers. The LOGISMOS graph structure remains consistent with that proposed in [18] while using in-region costs derived from deep learning.

2. Methods

As shown in Fig. 2, the proposed method consists of two stages: pre-segmentation and final segmentation. The retina and top/bottom background regions are first segmented during the pre-segmentation stage; the target retinal layers/surfaces are then simultaneously identified in 3D in the final segmentation stage. The five target retinal layers are: retinal nerve fiber layer (RNFL), ganglion cell-inner plexiform layer (GCIPL), inner nuclear plus outer plexiform layer (INOPL), outer nuclear layer (ONL), and retinal pigment epithelium (RPE) complex. The target surfaces are the bounding surfaces of the five interested layers, including: the inner limiting membrane (ILM), RNFL-GCL, IPL-INL, OPL-ONL, upper and lower RPE complex surfaces. The pre-segmentation stage is implemented using a 2D U-Net network followed by a two-surface 3D LOGISMOS, while the final segmentation stage uses a 3D U-Net network followed by a six-surface 3D LOGISMOS.

Fig. 2. Deep LOGISMOS workflow. The pre-segmentation stage identifies the whole retina region, while the final segmentation stage identifies individual surfaces within focal retina region.

Download Full Size | PDF

2.1. Surface-oriented two-stage segmentation

OCT images inherently have a significant observed imbalance between the retina and background, as the retina constitutes a minor portion of the overall image. The penetration of OCT is approximately 2 mm and thickness of normal retina rarely goes beyond 400 $\mu$m [31], which means less than 20% of the B-scan is occupied by retina. Even with existing pathology considered, the majority of B-scan voxels still belong to the background. Considering the thickness of individual layers is even smaller percentage of the B-scans, this skewed representation of the retina and its layers relative to the background might lead the deep learning network to overemphasize the background when training a multi-layer segmentation model utilizing complete B-scans.

In order to more effectively manage this imbalance between background and retinal layers, as well as reduce the computational cost for subsequent 3D U-Net based multi-layer segmentation, we implemented pre-segmentation of the entire retinal region using a 2D U-Net component from the nnU-Net framework. The retinal region, in this context, is defined as the area between the ILM and lower bounding surface of the RPE complex. This allowed the generation of probability maps distinctly for the bottom background, the retinal region, and the top background. Although a 3D U-Net could be utilized for pre-segmentation, we found that the 2D U-Net was sufficient for accurate retina delineations with significantly reduced computational demands.

Typically, the pre-segmentation process provides a reasonable labeling of the retinal region. However, the retinal region can exhibit complex and irregular features due to various factors, including artifacts, local pathologies, OCT deficiencies, blood vessels, etc. Furthermore, when applying the trained model to an unseen dataset, the pre-segmentation results may produce unexpected false predictions such as holes, irregular attachments, incorrect labels and other similar anomalies. These inaccuracies stem from unseen patterns that arise from different imaging device characteristics and cohort properties. Converting such region-based segmentation to surface-based segmentation may not be straightforward and might require tailored post-processing. Moreover, having an accurate outline of the retinal region is essential to achieve final segmentation accuracy. To better handle such complexities, the LOGISMOS framework was employed by utilizing the probability maps of 3 regions (bottom background, retina and top background) from 2D U-Net output to achieve the 2-surface segmentation of the retinal region. More details about our Deep LOGISMOS framework are discussed in Section 2.2.

Following the pre-segmentation, the retinal regions are flattened based on the top surface (ILM), which typically has less complexity than the bottom surface (OB-RPE). This flattening is intended to provide better layer connectivity, facilitating improved multi-layer/-surface segmentation in both 3D U-Net and LOGISMOS algorithms. Additionally, the retinal region is cropped based on the maximum retinal thickness. A top and bottom margin are added to the cropped retinal region, allowing for some errors from unseen complexities and incorporate some contextual information.

The final segmentation of individual layers is performed using the 3D U-Net component from the nnU-Net framework. The training phase uses the retinal region cropped based on manually labeled surfaces to ensure that the training is independent of the pre-segmentation model. The inference phase uses the retinal region cropped based on the pre-segmentation results. The 3D U-Net was trained to segment each layer independently, with training labels consisting of the cropped background and five retinal layers (derived from six surfaces) within the focal and cropped retinal region. In spite of all efforts to address label imbalance and improve the 3D connectivity of the retinal layers, the 3D U-Net may still encounter challenges in generalizing to unseen patterns and inevitable artifacts, and occasionally produce unexpected false predictions. Similar to the pre-segmentation stage, we opted for a unified LOGISMOS framework instead of applying customized post-processing techniques for unforeseen failure cases.

2.2. Deep LOGISMOS

As shown in Fig. 3, the 3D image to segment is covered by columns of inter-connected graph nodes such that each target surface intersects with each column exactly once and the graph edges connecting graph nodes are used to enforce geometric constraints. The smoothness constraint $\Delta _s$ determines the maximal allowed shape variation between neighboring columns of the same surface. If node $n$ on a given column is the on-surface node, the on-surface nodes on the neighboring columns can only reside within $[n-\Delta _s, n+\Delta _s]$. Surface separation constraints, $\Delta _l$ and $\Delta _u$, determine the bounds of distances between two surfaces on the corresponding columns.

Fig. 3. LOGISMOS graph construction. (a) Two terrain-like target surfaces in 3D volume image. Each target surface intersects with each column exactly once. (b) Intra-column edges and inter-column edges for surface smoothness constraints, $\Delta _s$. (c) Edges connecting corresponding columns from two non-crossing surfaces to enforce lower and upper bounds of surface separation constraints, $\Delta _l$ and $\Delta _u$, respectively.

Download Full Size | PDF

Each graph node is assigned a cost that can be on-surface cost, in-region cost, or their combination. Let $s,c,n$ denote the indices of surface, column and node and $S,C,N$ denote the number of surfaces, columns per surface and nodes per column, respectively. The on-surface cost emulates the unlikeness of target surface passing thought a given node. The on-surface cost of node $n$ on column $c$ of surface $s$ is thus denoted as $\mathcal {C}_{os}(s,c,n)$. A given multi-surface segmentation can be represented by its surface function $\mathbb {SF}$ such that $n=\mathbb {SF}(s,c)$ defines the location of surface $s$ on column $c$. The total on-surface cost for a segmentation is

(1)$$\mathbf{C}_{os} = \sum_{s=1}^{S} \sum_{c=1}^{C} \mathcal{C}_{os}(s,c,\mathbb{SF}(s,c)).$$

When the target surfaces are non-crossing, the above surface function has an equivalent region function $\mathbb {RF}$ such that $r=\mathbb {RF}(c,n)$ assigns node $n$ on column $c$ to one of the $S+1$ regions (Fig. 3(a)). The in-region cost $\mathcal {C}_{ir}(r,c,n)$ emulates the unlikeness of $(c,n)$ belonging to region $r\in [0,S]$. The total in-region cost for a segmentation is

(2)$$\mathbf{C}_{ir} = \sum_{c=1}^{C} \sum_{n=1}^{N} \mathcal{C}_{ir}(\mathbb{RF}(c,n),c,n).$$

On-surface and in-region costs can be used individually or together with adjustable weights as

(3)$$w_s \mathbf{C_{os}} + w_r \alpha \mathbf{C_{ir}},$$

where $w_s$ and $w_r$ are used to adjust relative contributions of on-surface and in-region costs. Because $\mathbf {C_{os}}$ and $\mathbf {C_{ir}}$ are computed on different numbers of cost values, $\alpha$ must be used as an additional correction factor. When on-surface and in-region cost have the same absolute value range, $\alpha =S/N$. After the graph is constructed using proper cost functions, the desired segmentation with globally minimal total cost while satisfying geometric constraints can be found by maximum flow of the graph. For more details about the graph construction, converting above costs to terminal capacity of the graph nodes, and finding the maximum flow of the graph, please refer to [16,18].

Deep LOGISMOS utilizes in-region cost derived from high-quality probability maps from deep learning to achieve fast and reliable layer identification. Deep learning, while regionally robust, may not always produce very accurate surface delineation and may produce local errors due to noise or artifacts. Therefore, Gaussian gradient based on-surface cost was combined with learned in-region cost. Using $w_s<w_r$ in Eq. (3), the main role of on-surface cost is providing an additional nudge when the in-region cost is insufficient.

One limitation of the previous OCTExplorer is its long running time associated with Boykov-Kolmogorov maximum flow solver [32]. IBFS and EIBFS solvers [33,34] can achieve 4–8 times speedup on the same graph and an additional 2–4 times speedup using a multi-threading implementation [35]. The running time of these maximum flow solvers also heavily depend on the quality of the cost, using cost with less ambiguity such as that derived from deep learning can further reduce the running time substantially comparing to hand-crafted cost.

3. Experimental methods

3.1. Datasets

Table 1 lists the detailed information about the training and test datasets used in this study. For the training data, 124 Cirrus OCT macular volumes (Carl Zeiss Meditc, Germany) from 31 NAION participants (both eyes and two visits) were manually selected from the Quark Pharmaceutical clinical trial (ClinicalTrials.gov Identifier: NCT02341560). For the training set, the inclusion criteria of OCT images from both the NAION-affected study eye and the unaffected fellow eye across two visits aim to maximize the utility of limited data. This approach covers a broad spectrum of retinal patterns from swelling to atrophy, providing the model with diverse learning opportunities. Reference tracings are available for three cross-sectional test datasets. Test-NAION includes 40 Cirrus OCT macular scans of 20 additional participants from the same Quark NAION trial. Test-G includes 29 Cirrus OCT macular scans of affected eye from randomly selected 29 glaucoma patients from the University of Iowa Hospitals and Clinics. Test-JHU is the publicly available dataset that included 21 and 14 Spectralis OCT macular scans (Heidelberg Engineering Inc., Germany) from 21 multiple sclerosis (MS) and 14 control subjects, respectively, from the Johns Hopkins Hospital [36]. Additionally, a longitudinal test dataset (Test-G-L) contains 155 Cirrus OCT macular scans from 15 independent glaucoma subjects from the University of Iowa Hospitals and Clinics, averaging 10.33$\pm$0.7 semi-annual (162.32$\pm$14.63 day interval) scans of the study eye was used to test the robustness of our method in longitudinal data. Reference tracings of six surfaces – ILM, RNFL-GCL, IPL-INL, OPL-ONL, upper and lower RPE – were created by manually modifying the resulting surfaces produced by OCTExplorer for the Training, Test-NAION, and Test-G datasets. In addition, this study included two imaging protocols for the Cirrus OCT scans: 128 B-scans of 512$\times$1024 pixels and 200 B-scans of 200$\times$1024 pixels, physically covering 6$\times$6$\times$2 mm$^3$, and each Spectralis OCT scan contained 49 B-scans of 1024$\times$496 pixels, physically covering 6$\times$6$\times$1.8 mm$^3$. Further insights into our model’s approach to managing diverse OCT dimensions are elaborated in Section 3.2.

Table 1. Summary of training and test datasets.

View Table | View all tables in this article

The study protocol was approved by the University of Iowa’s Institutional Review Board and adhered to the tenets of the Declaration of Helsinki. The Quark NAION trial data was prospectively collected during a study that was approved by numerous IRBs and participants provided consent prior to enrollment.

3.2. Deep learning

We incorporated two U-Net architectures from the nnU-Net framework [30] into our segmentation pipeline – a 2D nnU-Net for pre-segmentation and a 3D nnU-Net for final segmentation. For both U-Net variants, we employed Stochastic Gradient Descent (SGD) with mini-batches [37] as the optimizer, configured with a momentum of 0.99 and a weight decay of 0.005. The learning rate was initially set as 0.01 and then gradually decreased throughout the training. Both the 2D and 3D U-Net underwent an initial training phase of 30 epochs to reach a performance plateau and were fine-tuned for an additional 15 epochs to optimize their performance. The proposed models were implemented using the PyTorch platform [38] and trained on an NVIDIA Tesla V100 GPU.

As described above and outlined in Table 1, our segmentation model was trained on Cirrus OCT images from a single cohort of NAION patients and then applied to various test datasets to assess its performance and generalizability. To facilitate the application of our model across different cohorts, devices and protocols, we standardized the size of the B-scan image input to the 2D and 3D U-Net models to 512$\times$512 and 512$\times$256, respectively. The B-scans were restored to the original size and resolution for final LOGISMOS segmentation.

The patch size of 128$\times$128$\times$128 was used in the 3D nnU-Net model. This patch size was chosen to manage the computational demands of processing 3D scans while still preserving sufficient image details for effective segmentation. During the inference phase, the patches, along with the sliding window technique, were utilized to generate the probability maps for Cirrus OCT images, which usually have more than 120 B-scans.

However, in contrast to 128 or 200 B-scans per OCT volume in the collected Cirrus data, the Spectralis OCT scans used in the Test-JHU dataset have only 49 B-scans per volume, which causes a challenge during the inference phase using the 3D U-Net with 128 B-scans patch. To address this, the number of B-scans of Spectralis OCT images was increased to 145 by inserting two linearly interpolated B-scans between each two neighboring original B-scans. This approach preserved all the original B-scans without introducing artifacts and brought the scan to a comparable range with the Cirrus OCT images in the training dataset. The original B-scans were extracted after the inference phase.

At the end of this stage, the resulting probability maps were mapped back to the original image by resizing and reversing the flattening and cropping steps. This remapping serves two purposes. First, the output probability maps can be visualized in the context of the original image for easy quality check. Second, the next LOGISMOS step can be performed independently without the knowledge regarding the resizing, flattening and cropping parameters used for the deep learning.

3.3. Deep LOGISMOS

The in-region costs of Deep LOGISMOS were directly derived from probability maps with simple probability-to-unlikeness conversion. The on-surface costs were computed based on intensity gradient along A-scan and the known bright-to-dark or dark-to-bright pattern of given surfaces. The ranges of absolute values of in-region and on-surface costs were linearly scaled to [0, 100] so the correction factor $\alpha$ in Eq. (3) is $S/N$. The relative contribution of in-region and on-surface costs, $w_r$ and $w_s$, were experimentally set to 0.6 and 0.4 for pre-segmentation, 0.9 and 0.1 for final segmentation, by testing several $w_r>w_s$ combinations on selected difficult cases and observing frequency and severity of local errors.

LOGISMOS optimization was always performed in 3D to segment all target surfaces simultaneously. For pre-segmentation, graph columns covered complete A-scans with 256 evenly spaced nodes. For final segmentation, graph columns covered A-scans within the retina range identified by pre-segmentation with one-to-one node-to-voxel correspondence, thus achieving implicit flattening and cropping. The LOGISMOS constraints were $\Delta _s=8$ node, $\Delta _l=0$ and $\Delta _u=0.5N$, where $N$ is the number of nodes per column. These constraints were extremely relaxed and applied to both segmentation stages, and were standardized for all test datasets.

3.4. Validation metrics

The deep learning segmentation results – assigning each voxel a unique label based on corresponding values in the probability maps – have no guaranteed correct topology and therefore were only evaluated by the Dice similarity coefficient. Deep LOGISMOS results with correct topology were compared with manual tracing using the Dice coefficient, surface positioning error, and layer thickness error.

To demonstrate the robustness and quality of clinically oriented metrics derived from Deep LOGISMOS, the GCIPL layer thicknesses were measured in an elliptical annulus grid centered on the fovea using standard Cirrus macular analysis settings: vertical inner and outer radii of 0.5 and 2.0 mm, respectively; horizontal inner and outer radii of 0.6 and 2.4 mm, respectively [5].

4. Results

The training dataset included 124 scans from 31 NAION participants, and the model training time was approximately 2.5 and 4.0 hours for the pre-segmentation and final segmentation stages, respectively, using a Tesla V100 GPU. For each input OCT volume, the pre-segmentation stage took approximately 40 seconds and 1 second for the deep learning inference (using a Tesla V100 GPU) and the LOGISMOS graph optimization (using an 8-core modern CPU), respectively. The subsequent final segmentation stage took approximately 5 seconds with deep learning inference and 10–15 seconds with LOGISMOS optimization. For comparison, OCTExplorer took around 210 seconds to segment a Cirrus OCT macular scan.

After training, Deep LOGISMOS successfully identified six target surfaces in all 259 OCT scans from four test datasets without failures. Figure 4 shows several examples from subjects in the Test-NAION dataset. The Deep LOGISMOS results were evaluated against the reference tracings. For comparison, the results of OCTExplorer and deep-learning-only (referred to as Deep Learning and was only evaluated by Dice due to lack of correct topology) were also evaluated. The performance evaluation results are listed in Tables 2, 3 and 4.

Fig. 4. Example segmentation results of NAION subjects. Column 1: Original images. Column 2: Deep learning pre-segmentation of (top to bottom) the top background (no color), the retina and the bottom background. Column 3: Deep LOGISMOS pre-segmentation of ILM and lower RPE surfaces. Column 4: Deep learning final segmentation of RNFL, GCIPL, INOPL, ONL, RPE complex. Column 5: Deep LOGISMOS final segmentation showing six target surfaces.

Download Full Size | PDF

Table 2. Dice similarity coefficients of five target retinal layers.

View Table | View all tables in this article

Table 3. Surface positioning errors^a (voxels) of six target retinal surfaces.

View Table | View all tables in this article

Table 4. Layer thickness errors^a (voxel) of five retina layers.

View Table | View all tables in this article

4.1. Test-NAION dataset

The characteristics of the Test-NAION dataset was similar to the training dataset, in which all subjects were from the same Quark NAION study and all OCT images were acquired by Cirrus device. Figure 5 illustrates a comparison example among retinal thickness maps generated based on the segmentation results from Deep LOGISMOS, OCTExplorer, and manual tracing. Comparing to Deep LOGISMOS, OCTExplorer was struggling to stably determine a thin RNFL, and then the segmentation errors propagated to other layers in the inner retina.

Fig. 5. Retinal thickness maps generated from segmentation using Deep LOGISMOS, OCTExplorer, and manual tracing. The dashed lines mark the B-scan location in the thickness maps. The color scale is adjusted to highlight the variation of the thicknesses in different layers. The segmentation errors propagate among the inner retinal layers of OCTExplorer results. Horizontal line artifacts exist in the thickness maps derived from manual tracings because surfaces were traced on individual B-scans without sufficient visual feedback to check and maintain surface continuity between B-scans.

Download Full Size | PDF

Table 2 shows Deep LOGISMOS achieved significantly higher Dice coefficients than the deep-learning-only approach in all five retinal layers. For the layers that are highly affected by NAION (showing apparent thinning of the inner retina, including RNFL, GCIPL, and INOPL), Deep LOGISMOS significantly outperformed OCTExplorer by yielding much higher mean Dice coefficients and lower standard deviations. For surface positioning errors, Deep LOGISMOS produced outstanding results in Table 3, especially for the bounding surfaces of GCIPL (i.e., RNFL-GCL and IPL-INL) on which OCTExplorer showed approximately 6–8 voxel signed and unsigned errors and they are reduced by more than 50${\%}$ in Deep LOGISMOS. Since the thin-retina-related segmentation errors are often caused by misidentified regional GCIPL, the neighboring RNFL, INOPL (sometimes even ONL) thicknesses are mainly affected. Deep LOGISMOS has shown significantly better thickness measurement accuracy in RNFL, INOPL, and ONL in Table 4.

4.2. Test-G dataset

The Test-G dataset was used to assess the performance of Deep LOGISMOS on images acquired by the same OCT device but from a different glaucoma cohort. Table 2 shows that Deep LOGISMOS had significantly higher Dice coefficients than the deep-learning-only approach in all five target layers. Compared to OCTExplorer, Deep LOGISMOS showed much lower standard deviations of the Dice coefficients at RNFL, GCIPL, and INOPL; the improvement in GCIPL was significant ($p$-value 0.012). Regarding the surface positioning errors listed in Table 3, Deep LOGISMOS demonstrated an excellent ability to accurately identify the junction surfaces of RNFL-GCL, IPL-INL, and OPL-ONL (which are the three surfaces most affected by severe retinal atrophy) showing lower signed and unsigned errors and standard deviations. This consistent performance is also reflected in the layer thickness errors in Table 4.

4.3. Test-JHU dataset

The Test-JHU dataset was used to assess the performance of Deep LOGISMOS on images acquired by a different OCT device (Spectralis) from eyes with a different disease (multiple sclerosis). Table 2 shows that Deep LOGISMOS yielded the highest Dice coefficients in all five target layers ($p$-value < 0.05). For surface positioning errors, Deep LOGISMOS achieved a sub-voxel level for the mean signed/unsigned errors and standard deviations in all target layers except for the lower RPE. The Deep LOGISMOS results are comparable with those from other deep-learning methods that were directly trained using Spectralis OCTs [27]. For the layer thickness error measurements in Table 4, both OCTExplorer and Deep LOGISMOS show sub-voxel-level errors in all target layers except for the RPE complex.

Comparing with Cirrus images, the Spectralis images have higher quality, especially clearer distinction between layers. Therefore, the improvement in Dice introduced by Deep LOGISMOS over deep learning was not as substantial as on the other test datasets in Table 2. However, the success of the final segmentation was made possible by the robust pre-segmentation in which LOGISMOS is imperative in overcoming errors in deep learning as shown in Fig. 10.

4.4. Test-G-L dataset

The Test-G-L dataset was used to test the stability of Deep LOGISMOS for longitudinal macular OCT scans, which is essential for monitoring disease progression. A standard Cirrus elliptical annulus grid, was centered at the fovea and for the regional thicknesses of GCIPL in six sectors (i.e., N: nasal, S: Superior, T: Temporal, and I: Inferior). Figure 6 shows a comparison example of the regional GCIPL thicknesses in sectors calculated by Deep LOGISMOS and OCTExplorer. Both the radar plot and longitudinal thickness maps demonstrate that Deep LOGISMOS was able to stably identify the GCIPL even when the thickness had reached the floor value. For this extremely challenging case, the OCTExplorer results were only reliable in the nasal sectors.

Fig. 6. Longitudinal GCIPL thickness measurements from a subject in the Test-G-L dataset. The radar plots highlight the thickness estimations being regionally unstable over time. The example thickness maps help visualize the unstable regions.

Download Full Size | PDF

Reference tracings were not available for Test-G-L because the process is extremely time-consuming for longitudinal datasets. However, it is known that the included 15 subjects did not exhibit rapid glaucoma progression and the visit intervals were less than six months. Therefore, it is expected that the sector thickness changes ($\Delta$T) in the GCIPL annulus grid would be relatively small and stable. Figure 7 shows distributions of $\Delta$T based on the segmentation results from Deep LOGISMOS and OCTExplorer. Deep LOGISMOS yielded highly consistent $\Delta$T measurements (|$\Delta$T| < 2 $\mu$m) that agree with the expectation derived from the clinical findings. On the other hand, dramatic $\Delta$T changes (e.g., |$\Delta$T| > 4 $\mu$m) produced by OCTExplorer, suggests unreliable measurements due to inconsistent segmentation.

Fig. 7. Distributions of sector-wise (Cirrus macular annulus grid) thickness change ($\Delta$T) obtained from Deep LOGISMOS and OCTExplorer results. 1 voxel = 1.95$\mu$m along A-scan direction. Shown thickness maps derived from Deep LOGISMOS results.

Download Full Size | PDF

5. Discussion

Deep LOGISMOS can robustly and accurately segment retinal layers even in challenging cases of NAION and glaucoma, as shown in Figs. 8 and 9. Although mild segmentation errors still occurred in Fig. 8(e,f) and Fig. 9(d), Deep LOGISMOS, with its built-in topology, was able to eliminate the need for complex and dataset-specific post-processing of deep learning results and limit the size of the affected region and the severity of errors. Figure 10 shows that the capability of Cirrus-trained deep learning model in segmenting Spectralis images is substantially compromised in the pre-segmentation stage, producing many mislabeled regions. However, the LOGISMOS optimization, without adopting new parameter set or data-specific parameter tuning, overcame these negative effects of cross-device inference and produced correct pre-segmentation results. After that, the final segmentation showed no apparent differences between the Spectralis and Cirrus OCT images.

Fig. 8. Challenging cases in Test-NAION with individual B-scans and the segmented surfaces from Deep LOGISMOS. (a) a hyper-reflective region under RPE complex, (b) shadows under separate inner retinal vessels, (c) shadows from vitreous media opacity, (d) shadows under multiple retinal vessels close to each other, (e) weak OCT signal, (f) weak OCT signal and pigment epithelial detachment, (g) optic disc edema, (h) sub-retinal and intra-retinal fluid due to the edema, (i) severe motion artifact, and (j) hypo-reflectivity at the inner retina. Yellow arrows: challenging regions. (i, j) reconstructed vertical B-scans perpendicular to the regular horizontal B-scan.

Download Full Size | PDF

Fig. 9. Challenging cases in Test-G with the individual B-scans and the segmented surfaces from Deep LOGISMOS. (a) asymmetric thinning of the inner retina, (b) extremely thin inner retina with a steep tilt, (c) vitreous opacity, (d) hyper-reflective vitreous material overlaying the retina, (e) detached vitreous, and (f) loss of foveal contour due to epiretinal membrane. Yellow arrows: challenging regions.

Download Full Size | PDF

Fig. 10. Four challenging cases in Test-JHU. Although the region estimation is challenging for deep learning in the pre-segmentation stage, the following Deep LOGISMOS still successfully identified correct retinal surfaces, preventing error propagation to later stages.

Download Full Size | PDF

The reference tracings of the training, test-NAION, and test-G datasets were achieved by manual modification of OCTExplorer results to reduce human efforts. Since the underlying pathology mainly affects the RNFL and GCIPL, their related surfaces were subjected to more manual modifications than ILM, ONL and RPE which were often visually reasonable and thus did not require modification in most situations. This time-saving approach might introduce bias into the trained model and quantitative results and it also caused slightly larger (but still acceptable) errors on RPE in Tables 3, and 4. This potential bias in reference tracings does not exist in the independently traced test-JHU dataset and Deep LOGISMOS outperformed OCTExplorer equally in all retinal layers/surfaces.

Potential future work may include extending our layer segmentation approach to simultaneously identify retinal layers and regions with fluid accumulations, including cases with age-related macular degeneration, diabetic retinopathy, and radiation retinopathy. The implemented workflow can also be further enhanced with Just-Enough Interaction (JEI) [39] to allow intuitive and efficient correction of segmentation errors.

6. Conclusion

This study presented Deep LOGISMOS, a novel hybrid framework that combines deep learning and optimal graph search for accurate and robust segmentation of retinal layers in OCT images. The results demonstrated that Deep LOGISMOS consistently outperformed existing standard automated algorithms (OCTExplorer) and deep learning alone. Trained on NAION cases acquired by Cirrus, the method showed great flexibility and generalizability when applied two various disease cohorts (NAION, glaucoma, and MS) and images acquired by different device (Spectralis). It also provided stable and robust longitudinal quantification of retinal layers, enabling more reliable monitoring of disease progression over time.

In summary, Deep LOGISMOS addresses key limitations of current OCT layer segmentation methods and provides a robust automated approach to extract clinically meaningful retinal structural information, especially in cases of significant retinal layer thickening or thinning where current algorithms fail. With further validation, this hybrid framework has the potential to aid diagnosis, enhance understanding of pathophysiology, and improve management of diseases affecting the optic nerve.

Funding

Rehabilitation Research and Development Service (I01RX001786, I01RX003797, I50RX003002); National Institute of Biomedical Imaging and Bioengineering (EB004640); National Eye Institute (EB004640, EY023279, EY032522).

Acknowledgment

We acknowledge the support from the Department of Veterans Affairs (VA) Center for the Prevention and Treatment of Visual Loss (CPTVL) Pilot Project Grant; The New York Eye and Ear Infirmary Foundation, New York, NY.

Disclosures

Z. Chen, None; H. Zhang, None; E.F. Linton, None; B.A. Johnson, None; Y.J. Choi, None; M.J. Kupersmith, None; M. Sonka, University of Iowa (Patents), Medical Imaging Applications, LLC (Co-Founder), VIDA Diagnostics, Inc. (Co-Founder); M.K. Garvin, University of Iowa (Patent); R.H. Kardon, FaceX, LLC (Co-Founder), J-K Wang, None.

Data Availability

The data used to generate the results in this study are not available to the public at this time, but can be obtained from the authors upon request.

References

1. A. Petzold, J. F. de Boer, S. Schippling, et al., “Optical coherence tomography in multiple sclerosis: a systematic review and meta-analysis,” The Lancet Neurol. 9(9), 921–932 (2010). [CrossRef]

2. M. J. Kupersmith, M. K. Garvin, J.-K. Wang, et al., “Retinal ganglion cell layer thinning within one month of presentation for non-arteritic anterior ischemic optic neuropathy,” Investig. Ophthal. Vis. Sci. 57(8), 3588–3593 (2016). [CrossRef]

3. A. Geevarghese, G. Wollstein, H. Ishikawa, et al., “Optical coherence tomography and glaucoma,” Annu. Rev. Vis. Sci. 7(1), 693–726 (2021). [CrossRef]

4. R. N. Weinreb, T. Aung, and F. A. Medeiros, “The pathophysiology and treatment of glaucoma: a review,” JAMA 311(18), 1901–1911 (2014). [CrossRef]

5. J.-C. Mwanza, J. D. Oakley, D. L. Budenz, et al., “Macular ganglion cell-inner plexiform layer: Automated detection and thickness reproducibility with spectral domain-optical coherence tomography in glaucoma,” Invest. Ophthalmol. Vis. Sci. 52(11), 8323–8329 (2011). [CrossRef]

6. G. Mahmoudinezhad, V. Mohammadzadeh, J. Martinyan, et al., “Comparison of ganglion cell layer and ganglion cell/inner plexiform layer measures for detection of early glaucoma,” Ophthalmol. Glaucoma 6(1), 58–67 (2023). [CrossRef]

7. Z. Guo, Y. H. Kwon, K. Lee, et al., “Optical coherence tomography analysis based prediction of Humphrey 24-2 visual field thresholds in patients with glaucoma,” Invest. Ophthalmol. Vis. Sci. 58(10), 3975–3985 (2017). [CrossRef]

8. S. Saidha, O. Al-Louzi, J. N. Ratchford, et al., “Optical coherence tomography reflects brain atrophy in multiple sclerosis: A four-year study,” Ann. Neurol. 78(5), 801–813 (2015). [CrossRef]

9. E. H. Martinez-Lapiscina, S. Arnow, J. A. Wilson, et al., “Retinal thickness measured with optical coherence tomography and risk of disability worsening in multiple sclerosis: A cohort study,” The Lancet Neurol. 15(6), 574–584 (2016). [CrossRef]

10. N. Giedraitiene, E. Drukteiniene, R. Kizlaitiene, et al., “Cognitive decline in multiple sclerosis is related to the progression of retinal atrophy and presence of oligoclonal bands: A 5-year follow-up study,” Front. Neurol. 12, 678735 (2021). [CrossRef]

11. S. Berry, W. V. Lin, A. Sadaka, et al., “Nonarteritic anterior ischemic optic neuropathy: cause, effect, and management,” Eye Brain 9, 23–28 (2017). [CrossRef]

12. A. J. Green, S. McQuaid, S. L. Hauser, et al., “Ocular pathology in multiple sclerosis: retinal atrophy and inflammation irrespective of disease duration,” Brain 133(6), 1591–1601 (2010). [CrossRef]

13. Y. H. Hwang, M. K. Kim, and D. W. Kim, “Segmentation errors in macular ganglion cell analysis as determined by optical coherence tomography,” Ophthalmology 123(5), 950–958 (2016). [CrossRef]

14. R. A. Alshareef, A. Goud, M. Mikhail, et al., “Segmentation errors in macular ganglion cell analysis as determined by optical coherence tomography in eyes with macular pathology,” Int. J. Retin. Vitr. 3(1), 25 (2017). [CrossRef]

15. A. Thenappan, E. Tsamis, Z. Z. Zemborain, et al., “Detecting progression in advanced glaucoma: Are optical coherence tomography global metrics viable measures?” Optom. Vis. Sci. 98(5), 518–530 (2021). [CrossRef]

16. K. Li, X. Wu, D. Z. Chen, et al., “Optimal surface segmentation in volumetric images — a graph-theoretic approach,” IEEE Trans. Pattern Anal. Mach. Intell. 28(1), 119–134 (2006). [CrossRef]

17. M. Haeker, M. D. Abràmoff, R. Kardon, et al., “Segmentation of the surfaces of the retinal layer from OCT images,” in MICCAI 2006 Proceedings, Part I, vol. 4190 of Lecture Notes in Computer Science (Springer, 2006), pp. 800–807.

18. M. K. Garvin, M. D. Abramoff, X. Wu, et al., “Automated 3-D intraretinal layer segmentation of macular spectral-domain optical coherence tomography images,” IEEE Trans. Med. Imaging 28(9), 1436–1447 (2009). [CrossRef]

19. Y. Yin, X. Zhang, R. Williams, et al., “LOGISMOS — layered optimal graph image segmentation of multiple objects and surfaces: cartilage segmentation in the knee joint,” IEEE Trans. Med. Imaging 29(12), 2023–2037 (2010). [CrossRef]

20. L. Zhang, Z. Guo, H. Zhang, et al., “Assisted annotation in Deep LOGISMOS: Simultaneous multi-compartment 3D MRI segmentation of calf muscles,” Med. Phys. 50(8), 4916–4929 (2023). [CrossRef]

21. K. Lee, M. Niemeijer, M. K. Garvin, et al., “Segmentation of the optic disc in 3-D OCT scans of the optic nerve head,” IEEE Trans. Med. Imaging 29(1), 159–168 (2010). [CrossRef]

22. J.-K. Wang, R. H. Kardon, M. J. Kupersmith, et al., “Automated quantification of volumetric optic disc swelling in papilledema using spectral-domain optical coherence tomography,” Invest. Ophthalmol. Vis. Sci. 53(7), 4069–4075 (2012). [CrossRef]

23. H. Bogunovic, M. Sonka, Y. H. Kwon, et al., “Multi-surface and multi-field co-segmentation of 3-D retinal optical coherence tomography,” IEEE Trans. Med. Imaging 33(12), 2242–2253 (2014). [CrossRef]

24. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in MICCAI 2015 (Springer International Publishing, 2015), pp. 234–241.

25. Z. Mishra, A. Ganegoda, J. Selicha, et al., “Automated retinal layer segmentation using graph-based algorithm incorporating deep-learning-derived information,” Sci. Rep. 10(1), 9541 (2020). [CrossRef]

26. Y. He, A. Carass, Y. Liu, et al., “Structured layer surface segmentation for retina OCT using fully convolutional regression networks,” Medical Image Anal. 68, 101856 (2021). [CrossRef]

27. S. K. Yadav, R. Kafieh, H. G. Zimmermann, et al., “Intraretinal layer segmentation using cascaded compressed U-Nets,” J. Imaging 8(5), 139 (2022). [CrossRef]

28. Y. He, A. Carass, Y. Liu, et al., “Longitudinal deep network for consistent OCT layer segmentation,” Biomed. Opt. Express 14(5), 1874–1893 (2023). [CrossRef]

29. S. Mukherjee, T. D. Silva, P. Grisso, et al., “Retinal layer segmentation in optical coherence tomography (OCT) using a 3D deep-convolutional regression network for patients with age-related macular degeneration,” Biomed. Opt. Express 13(6), 3195–3210 (2022). [CrossRef]

30. F. Isensee, P. F. Jaeger, S. A. Kohl, et al., “nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation,” Nat. Methods 18(2), 203–211 (2021). [CrossRef]

31. C. E. Myers, B. E. Klein, S. M. Meuer, et al., “Retinal thickness measured by spectral-domain optical coherence tomography in eyes without retinal abnormalities: The Beaver Dam eye study,” Am. J. Ophthalmol. 159(3), 445–456.e1 (2015). [CrossRef]

32. Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision,” IEEE Trans. Pattern Anal. Machine Intell. 26(9), 1124–1137 (2004). [CrossRef]

33. A. V. Goldberg, S. Hed, H. Kaplan, et al., “Maximum flows by incremental breadth-first search,” in Algorithms–ESA 2011, vol. 6942 of Lecture Notes in Computer Science (Springer, 2011), pp. 457–468.

34. A. V. Goldberg, S. Hed, H. Kaplan, et al., “Faster and more dynamic maximum flow by incremental breadth-first search,” in Algorithms–ESA 2015, vol. 9294 of Lecture Notes in Computer Science (Springer, 2015), pp. 619–630.

35. J. Liu and J. Sun, “Parallel graph-cuts by adaptive bottom-up merging,” in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, (IEEE, 2010), pp. 2181–2188.

36. Y. He, A. Carass, S. D. Solomon, et al., “Retinal layer parcellation of optical coherence tomography images: Data resource for multiple sclerosis and healthy controls,” Data in Brief 22, 601–604 (2019). [CrossRef]

37. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun. ACM 60(6), 84–90 (2017). [CrossRef]

38. A. Paszke, S. Gross, F. Massa, et al., “PyTorch: an imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32 (Curran Associates, Inc., 2019), pp. 8024–8035.

39. H. Zhang, K. Lee, Z. Chen, et al., “Chapter 11 - LOGISMOS-JEI: Segmentation using optimal graph search and just-enough interaction,” in Handbook of Medical Image Computing and Computer Assisted Intervention, S. K. Zhou, D. Rueckert, G. Fichtinger, eds. (Academic Press, 2020), The Elsevier and MICCAI Society Book Series, pp. 249–272.

40. https://iibi.uiowa.edu/oct-reference.

Dataset	Device	Subjects	Images	Image Size^a	Voxel Size ( $μ$ m)	B-scans
Training	Cirrus^b	31 NAION	124^d	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	15,872
Test-NAION	Cirrus	20 NAION	40^e	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	5,120
Test-G	Cirrus	29 Glaucoma	10	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	1,280
			19	200 $\times$ 1024 $\times$ 200	30.00 $\times$ 1.95 $\times$ 30.00	3,800
Test-JHU	Spectralis^c	21 MS	21	1024 $\times$ 496 $\times$ 49	5.78 $\times$ 3.87 $\times$ 123.72	1,029
	Spectralis	14 Control	14	1024 $\times$ 496 $\times$ 49	5.78 $\times$ 3.87 $\times$ 123.72	686
Test-G-L^f	Cirrus	15 Glaucoma	27^f	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	3,456
			128^f	200 $\times$ 1024 $\times$ 200	30.00 $\times$ 1.95 $\times$ 30.00	25,600

	RNFL	GCIPL	INOPL	ONL	RPE
Test-NAION: Dice (%)
OCTExplorer	70.31 $\pm$ 21.85	72.52 $\pm$ 22.18	82.44 $\pm$ 15.51	96.25 $\pm$ 3.48	99.04 $\pm$ 2.00
Deep Learning	61.79 $\pm$ 16.95	85.56 $\pm$ 3.62	88.18 $\pm$ 2.84	91.79 $\pm$ 2.24	89.93 $\pm$ 4.02
Deep LOGISMOS	77.47 $\pm$ 10.30^a^b	89.97 $\pm$ 3.59^a^b	91.63 $\pm$ 2.87^a^b	94.72 $\pm$ 2.32^b	94.28 $\pm$ 3.70^b
Test-G: Dice (%)
OCTExplorer	86.93 $\pm$ 11.10	85.26 $\pm$ 12.55	91.43 $\pm$ 8.85	98.58 $\pm$ 1.96	99.85 $\pm$ 0.27
Deep Learning	74.03 $\pm$ 6.77	86.92 $\pm$ 2.32	89.07 $\pm$ 1.83	92.50 $\pm$ 1.27	89.55 $\pm$ 2.23
Deep LOGISMOS	85.41 $\pm$ 4.86^b	90.63 $\pm$ 2.56^a^b	93.15 $\pm$ 1.56^b	96.30 $\pm$ 1.05^b	96.09 $\pm$ 1.49^b
Test-JHU: Dice (%)
OCTExplorer	84.16 $\pm$ 4.41	86.60 $\pm$ 2.88	85.35 $\pm$ 1.38	89.91 $\pm$ 1.52	86.86 $\pm$ 3.10
Deep Learning	91.90 $\pm$ 2.91	93.09 $\pm$ 2.02	92.60 $\pm$ 1.29	95.68 $\pm$ 1.38	92.56 $\pm$ 1.36
Deep LOGISMOS	92.43 $\pm$ 2.12^a^b	94.06 $\pm$ 1.76^a^b	93.91 $\pm$ 1.37^a^b	95.98 $\pm$ 1.61^a^b	93.90 $\pm$ 0.93^a^b

	ILM	RNFL-GCL	IPL-INL	OPL-ONL	Upper RPE	Lower RPE
Test-NAION: Signed, 1 voxel=1.95 $μ$ m.
OCTExplorer	−0.53 $\pm$ 2.84	7.90 $\pm$ 6.66	6.72 $\pm$ 6.14	2.41 $\pm$ 2.60	−0.04 $\pm$ 0.57	−0.03 $\pm$ 0.64
Deep LOGISMOS	−0.03 $\pm$ 0.91	1.53 $\pm$ 1.64^b	0.51 $\pm$ 1.43^b	0.38 $\pm$ 1.45^b	0.18 $\pm$ 1.84	−0.21 $\pm$ 1.59
Test-NAION: Unsigned, 1 voxel=1.95 $μ$ m.
OCTExplorer	0.88 $\pm$ 2.97	8.18 $\pm$ 6.42	7.05 $\pm$ 5.99	2.79 $\pm$ 2.61	0.33 $\pm$ 1.02	0.41 $\pm$ 1.08
Deep LOGISMOS	1.77 $\pm$ 1.60^b	3.06 $\pm$ 1.42^b	2.65 $\pm$ 1.42^b	2.73 $\pm$ 1.43	1.99 $\pm$ 2.19	1.90 $\pm$ 1.58
Test-G: Signed, 1 voxel=1.95 $μ$ m.
OCTExplorer	0.01 $\pm$ 0.09	3.08 $\pm$ 3.39	3.44 $\pm$ 3.17	1.02 $\pm$ 1.50	0.00 $\pm$ 0.03	0.00 $\pm$ 0.06
Deep LOGISMOS	0.07 $\pm$ 0.48	−0.66 $\pm$ 1.55^b	0.39 $\pm$ 0.89^b	0.10 $\pm$ 0.72^b	0.58 $\pm$ 0.31	0.75 $\pm$ 0.68
Test-G: Unsigned, 1 voxel=1.95 $μ$ m.
OCTExplorer	0.10 $\pm$ 0.19	3.72 $\pm$ 2.99	3.49 $\pm$ 3.21	1.04 $\pm$ 1.51	0.04 $\pm$ 0.08	0.05 $\pm$ 0.09
Deep LOGISMOS	0.97 $\pm$ 0.44	2.65 $\pm$ 0.95^b	2.08 $\pm$ 0.70^b	1.92 $\pm$ 0.55	1.02 $\pm$ 0.44	1.31 $\pm$ 0.45
Test-JHU: Signed, 1 voxel=3.87 $μ$ m.
OCTExplorer	−1.31 $\pm$ 0.34	−1.99 $\pm$ 0.25	−2.03 $\pm$ 0.25	−2.19 $\pm$ 0.21	−1.65 $\pm$ 0.37	−2.71 $\pm$ 0.75
Deep LOGISMOS	0.65 $\pm$ 0.35^b	0.06 $\pm$ 0.38^b	0.01 $\pm$ 0.35^b	−0.06 $\pm$ 0.31^b	0.59 $\pm$ 0.42^b	−1.12 $\pm$ 0.68^b
Test-JHU: Unsigned, 1 voxel=3.87 $μ$ m.
OCTExplorer	1.42 $\pm$ 0.29	2.13 $\pm$ 0.26	2.13 $\pm$ 0.22	2.30 $\pm$ 0.18	1.69 $\pm$ 0.35	2.73 $\pm$ 0.73
Deep LOGISMOS	0.84 $\pm$ 0.23^b	0.89 $\pm$ 0.22^b	0.95 $\pm$ 0.22^b	0.89 $\pm$ 0.21^b	0.69 $\pm$ 0.39^b	1.32 $\pm$ 0.35^b

	RNFL	GCIPL	INOPL	ONL	RPE
Test-NAION: 1 voxel=1.95 $μ$ m.
OCTExplorer	8.42 $\pm$ 7.12	−1.18 $\pm$ 1.93	−4.31 $\pm$ 4.38	−2.45 $\pm$ 2.43	0.02 $\pm$ 0.32
Deep LOGISMOS	1.56 $\pm$ 1.15^b	−1.02 $\pm$ 1.20	−0.13 $\pm$ 0.78^b	−0.20 $\pm$ 2.03^b	−0.39 $\pm$ 1.83
Test-G: 1 voxel=1.95 $μ$ m.
OCTExplorer	3.07 $\pm$ 3.39	0.36 $\pm$ 1.96	−2.42 $\pm$ 2.03	−1.02 $\pm$ 1.50	0.00 $\pm$ 0.04
Deep LOGISMOS	−0.73 $\pm$ 1.70^b	1.05 $\pm$ 2.10	−0.29 $\pm$ 0.75^b	0.48 $\pm$ 0.63^b	0.17 $\pm$ 0.59
Test-JHU: 1 voxel=3.87 $μ$ m.
OCTExplorer	−0.68 $\pm$ 0.31	−0.04 $\pm$ 0.24	−0.17 $\pm$ 0.19	0.55 $\pm$ 0.33	−1.07 $\pm$ 0.54
Deep LOGISMOS	−0.59 $\pm$ 0.41	−0.06 $\pm$ 0.43	−0.06 $\pm$ 0.34	0.65 $\pm$ 0.51	−1.71 $\pm$ 0.42

Dataset	Device	Subjects	Images	Image Size^a	Voxel Size ( $μ$ m)	B-scans
Training	Cirrus^b	31 NAION	124^d	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	15,872
Test-NAION	Cirrus	20 NAION	40^e	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	5,120
Test-G	Cirrus	29 Glaucoma	10	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	1,280
			19	200 $\times$ 1024 $\times$ 200	30.00 $\times$ 1.95 $\times$ 30.00	3,800
Test-JHU	Spectralis^c	21 MS	21	1024 $\times$ 496 $\times$ 49	5.78 $\times$ 3.87 $\times$ 123.72	1,029
	Spectralis	14 Control	14	1024 $\times$ 496 $\times$ 49	5.78 $\times$ 3.87 $\times$ 123.72	686
Test-G-L^f	Cirrus	15 Glaucoma	27^f	512 $\times$ 1024 $\times$ 128	11.72 $\times$ 1.95 $\times$ 46.88	3,456
			128^f	200 $\times$ 1024 $\times$ 200	30.00 $\times$ 1.95 $\times$ 30.00	25,600

Hybrid deep learning and optimal graph search method for optical coherence tomography layer segmentation in diseases affecting the optic nerve

Abstract

1. Introduction

2. Methods

2.1. Surface-oriented two-stage segmentation

2.2. Deep LOGISMOS

3. Experimental methods

3.1. Datasets

3.2. Deep learning

3.3. Deep LOGISMOS

3.4. Validation metrics

4. Results

4.1. Test-NAION dataset

4.2. Test-G dataset

4.3. Test-JHU dataset

4.4. Test-G-L dataset

5. Discussion

6. Conclusion

Funding

Acknowledgment

Disclosures

Data Availability

References

Data Availability

Cited By

Figures (10)

Tables (4)

Equations (3)

Biomedical Optics Express