Optica Publishing Group

Neural network-based inverse model for diffuse reflectance spectroscopy

Open Access

Abstract

In diffuse reflectance spectroscopy, the retrieval of the optical properties of a target requires the inversion of a measured reflectance spectrum. This is typically achieved through the use of forward models such as diffusion theory or Monte Carlo simulations, which are iteratively applied to optimize the solution for the optical parameters. In this paper, we propose a novel neural network-based approach for solving this inverse problem, and validate its performance using experimentally measured diffuse reflectance data from a previously reported phantom study. Our inverse model was developed from a neural network forward model that was pre-trained with data from Monte Carlo simulations. The neural network forward model then creates a lookup table to invert the diffuse reflectance to the optical coefficients. We describe the construction of the neural network-based inverse model and test its ability to accurately retrieve optical properties from experimentally acquired diffuse reflectance data in liquid optical phantoms. Our results indicate that the developed neural network-based model achieves comparable accuracy to the traditional Monte Carlo-based inverse model while offering improved speed and flexibility, potentially providing an alternative for developing faster clinical diagnosis tools. This study highlights the potential of neural networks in solving inverse problems in diffuse reflectance spectroscopy.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Diffuse reflectance spectroscopy (DRS) is a well-developed and widely-used optical spectroscopic technique for non-invasive sensing of tissue function and physiology. DRS measures the amount of light that is diffusely reflected from the tissue surface after travelling within the medium for a path-length of several mm-cm from the optical source to a detection fiber. When the incident light is a steady-state, broadband source, DRS captures the spectrally-resolved diffusely reflected light spectrum measured at a specific source-detector geometry. The measured quantity, diffuse reflectance, is a function of the absorption coefficient $\mu _a(\lambda )$, the scattering coefficient $\mu _\mathrm {s}(\lambda )$ and the anisotropy factor $g$; far from the light source, $\mu _s$ and $g$ are coupled together as the reduced scattering coefficient $\mu _s'=\mu _s(1-g)$. By measuring the diffuse reflectance spectrum, we can inversely estimate the absorption coefficient, $\mu _a(\lambda )$, and the scattering coefficient, $\mu _s(\lambda )$, or the reduced scattering coefficient, $\mu _s'(\lambda )$, of the tissue, which in turn can be used to infer the tissue structure and physiological composition [1]. This process usually requires both a forward model, which provides an estimate of the measured reflectance given the optical coefficients and source-detector geometry, and a corresponding inverse model, which iteratively applies the forward model to fit the reflectance measured by the experimental system [1,2]. An optimization algorithm is then employed to find the optimal solution for the optical parameters.

In DRS, each measurement (from a channel) provides a diffuse reflectance spectrum, $R(\lambda )$, which needs to be inverted to extract $\mu _a(\lambda )$ and $\mu _s'(\lambda )$. Traditionally, both analytical and numerical approaches have been used for inversions [3]. A widely used analytical approach is based on photon diffusion theory, which is an approximation of the more exact radiative transport equation [3]. Numerical methods include the Finite Element Method (FEM) [4] and the Finite Difference Method (FDM) [5], which discretize the problem domain and solve the resulting system of equations. However, these methods can suffer from ill-posedness, limited accuracy due to grid resolution, and high computational cost. On the other hand, the iterative Monte Carlo (MC) approach to model diffuse reflectance in turbid media is also commonly used [3,6]. However, since forward calculations based on MC simulations are computationally expensive, they are typically used to generate suitable lookup tables (LUTs) for practical use as inverse solvers [7–10]. In the MC-based LUT method (MCLUT), MC-simulated diffuse reflectance (at the source-detector geometry of interest) is cast into a finite grid of absorption and scattering coefficients. Most commonly, the tissue is modeled as a semi-infinite, homogeneous medium, and the diffuse reflectance is thus sought as a function of only two parameters: the optical absorption coefficient $\mu _\mathrm {a}$ and the reduced scattering coefficient $\mu _\mathrm {s}'$. Note that although diffuse reflectance in fact depends on all three of $\mu _a, \mu _s, g$, to reduce dimensionality and keep experimental setups simple, it is typical to infer only the absorption $\mu _a$ and the reduced scattering coefficient $\mu _s'$ instead of decoupling $\mu _s$ and $g$. This sacrifices little accuracy when the source-detector distance is larger than the mean free path ($\mathrm {MFP}$), according to diffusion approximation theory [3].

With the advent of machine learning (ML), studies have been carried out to solve problems in biophotonics with ML models. These techniques leverage the inherent capacity of ML models to learn complex patterns and relationships from large datasets, enabling accurate predictions and analysis of optical properties. In terms of measurement technique, existing ML-based models deal with DRS variants different from ours, such as frequency-domain DRS (FD-DRS), which estimates optical properties by measuring the amplitude and phase shift of the diffusely reflected light [11–14], spatially resolved DRS [15–17], or subdiffusive reflectance, in which measurements are made closer than the $\mathrm {MFP}$ and therefore more optical properties are involved (the first similarity parameter $\gamma$) [17,18]. Generic DRS, the technique we work with, requires less expensive devices and less complicated experimental setups than FD-DRS, and its simpler mechanism makes data interpretation considerably easier. There are some ML-based inverse models that also deal with generic DRS. However, many of them directly estimate physiological parameters from measured diffuse reflectance [19–21]. Others lack validation on experimental DRS data or real-world data [16,22]. For practical applications, validation on lab-measured data and real-world data is necessary.

Our study first explores the utilization of neural networks to develop faster surrogate forward models for generic DRS. Further, we propose to replace the MC simulations in the traditional MCLUT method with the trained surrogate forward model, to construct a neural network-based inverse model that extracts optical parameters ($\mu _a, \mu _s'$) from DRS spectra. We intend to develop an ML-based inverse model that is able to estimate optical properties from diffuse reflectance for broader applications. By inferring only optical properties, the solution can be combined with the property profile of any tissue of interest to further estimate physiological properties. Additionally, this NN-based model allows exploration of separating $\mu _s'$ into $\mu _s$ and $g$, especially near the light source, which is the topic of our ongoing research.

To build the inverse model, we first train a neural network forward model (NNFM) using diffuse reflectance data obtained from MC simulations conducted in a semi-infinite homogeneous geometry. The NNFM learns the mapping between input optical properties and corresponding diffuse reflectance values for different source-detector separations. Subsequently, the trained NNFM is employed to generate a lookup table of diffuse reflectance, as a comparison to traditional MC lookup tables. The inverse model is validated on two experimental phantom datasets and compared with the traditional MCLUT method. Our results demonstrate that the NN-based inverse model achieves comparable accuracy to the gold-standard MCLUT method while being faster and more versatile.

In this paper, we first provide comprehensive details on the development and training of the NNFM model and evaluate its performance within and beyond the training data range and with varying sizes of training data. Then we integrate the NNFM with the LUT approach, positioning it as the forward model. The accuracy of this NN-based inverse model is compared with the conventional MCLUT in extracting wavelength-dependent optical properties from experimentally measured DRS data. Furthermore, we discuss the errors associated with the NN-based inverse model and highlight its flexibility in handling different scenarios.

2. Methods

2.1 Forward model

2.1.1 Monte Carlo data

Monte Carlo (MC) has been the standard forward method in diffuse reflectance spectroscopy because of its accuracy and versatility. The MC model simulates the stochastic propagation of photons in a medium, initially proposed and described in [6]. The program we have worked with is a steady-state MC program for multi-layered tissues (MCML) initially developed by Lihong Wang and Steven Jacques [23]. This program offers flexibility and ease of use, allowing users to specify various parameters such as photon energy (wavelength) and the optical properties of the medium, including the absorption coefficient ($\mu _a$), scattering coefficient ($\mu _s$), and anisotropy factor ($g$), which govern the probability of a photon being absorbed or scattered and its direction after scattering. Additionally, the refractive indices ($n$) of the fiber, target medium, and underlying medium can be specified. The program also incorporates geometrical information, including source-detector distances (SDDs), medium thickness, and grid structure.

A total of 5000 MC simulations were run to obtain the data for training the neural network. The optical parameters $\mu _\mathrm {a}, \mu _\mathrm {s}, g$ were randomly sampled using the Latin hyper-cube method [24] to better fill the sampling space. The ranges of the sampling space of the optical parameters were chosen based on typical phantom data: $\mu _\mathrm {a}$ was sampled within (0, 10) $\mathrm {cm^{-1}}$, $\mu _\mathrm {s}$ in (100, 350) $\mathrm {cm^{-1}}$ and $g$ in (0.8, 1.0), which resulted in $\mu _s'$ ranging between 20-350 $\mathrm {cm^{-1}}$. Fig. 1(a) shows the sample points of $\mu _\mathrm {a}$ vs. $\mu _\mathrm {s}$. The data has been uploaded as Data File 1 in the supplemental material.
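The sampling step described above can be sketched with SciPy's Latin hypercube sampler; this is a minimal illustration under the stated parameter ranges, not the authors' actual script (the seed and use of `scipy.stats.qmc` are assumptions).

```python
# Latin hypercube sampling of (mu_a, mu_s, g), a sketch of the scheme above.
import numpy as np
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=3, seed=0)  # hypothetical seed, for reproducibility
unit_samples = sampler.random(n=5000)      # 5000 points in [0, 1)^3

# Scale to the ranges used for the MC simulations:
# mu_a in (0, 10) cm^-1, mu_s in (100, 350) cm^-1, g in (0.8, 1.0)
lower = [0.0, 100.0, 0.8]
upper = [10.0, 350.0, 1.0]
params = qmc.scale(unit_samples, lower, upper)

mu_a, mu_s, g = params[:, 0], params[:, 1], params[:, 2]
mu_s_prime = mu_s * (1.0 - g)  # reduced scattering coefficient
```

Each of the 5000 rows of `params` would then define one MCML simulation.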


Fig. 1. (a): The 5000 sample points of $\mu _\mathrm {a}, \mu _\mathrm {s}$ in the MC simulations. Points are sampled using Latin hyper-cube method to fill up the space. The red dashed lines indicate where we truncated the data for the extrapolation tests that are discussed in Section 3.1.2 (Data File 1). (b): The histogram distribution of $\mathrm {MFP}$ of our sample points. In our MC simulations the shortest source detector separation distance was 0.14 $\mathrm {cm}$ so that for the vast majority of cases simulated $\mathrm {MFP} \ll$ SDD (Data File 1).


According to the diffusion approximation, when the SDD is larger than the mean free path, given by $\mathrm {MFP} = 1/(\mu _\mathrm {a}+\mu _\mathrm {s}')$, the dependency of diffuse reflectance $R(\mu _\mathrm {a}, \mu _\mathrm {s}, g)$ on the optical parameters is approximately reduced to $R(\mu _\mathrm {a}, \mu _\mathrm {s}')$ [3]. A histogram of the $\mathrm {MFP}$ in the training data is shown in Fig. 1(b). In the MC simulations used to generate training data, we chose 11 SDDs from 0.14 $\mathrm {cm}$ to 0.34 $\mathrm {cm}$ in 0.02 $\mathrm {cm}$ increments to simulate the diffuse reflectance. From Fig. 1(b), we observe that for a large fraction of our simulations $R$ should depend only on $(\mu _\mathrm {a}, \mu _\mathrm {s}')$ because $\mathrm {MFP} \ll \,$SDD, which justifies solving for $\mu _\mathrm {s}'$ instead of $\mu _\mathrm {s}$ and $g$ individually.
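The MFP check above is easy to reproduce numerically; the sketch below uses plain uniform sampling over the stated ranges purely for illustration (the paper used Latin hypercube samples) and compares the MFP to the shortest SDD of 0.14 cm.

```python
# Check the diffusion-approximation condition MFP << SDD for sampled points.
import numpy as np

rng = np.random.default_rng(0)
mu_a = rng.uniform(0.0, 10.0, 5000)       # cm^-1, sampling range from above
mu_s = rng.uniform(100.0, 350.0, 5000)    # cm^-1
g = rng.uniform(0.8, 1.0, 5000)

mfp = 1.0 / (mu_a + mu_s * (1.0 - g))     # mean free path, cm
fraction_diffusive = np.mean(mfp < 0.14)  # shortest SDD = 0.14 cm
```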

2.1.2 Neural network forward model (NNFM)

The MC data is then used to train a neural network forward model (NNFM) that learns the mapping from optical parameters ($\mu _\mathrm {a}, \mu _\mathrm {s}, g$) to diffuse reflectance $R$. 90% of the MC data (4500 instances) were used for training and 10% (500 instances) were used for testing the trained model. The NNFM takes the optical parameters ($\mu _\mathrm {a}, \mu _\mathrm {s}, g$) as the input and predicts the diffuse reflectance from the 11 SDDs; that is, the NN has 3 inputs and 11 outputs.

Alternatives to the current ML method were also explored. We employed a user-friendly neural network called DJINN [25] that automatically determines optimal hyperparameters; its performance can be found in our previous work [26]. We also examined the accuracy of a couple of simple machine learning models, for example, a nonlinear regression model without hidden layers and a simple gradient boosting regressor, whose performance illustrations have been included as Visualization 3 and Visualization 4 in the supplemental material. These alternative approaches were not able to fit the forward problem as well as the deep fully-connected neural networks considered in this work.

To fit the forward MC calculations with a fully-connected neural network, we fine-tuned and cross-validated the network architecture through a series of adjustments guided by parsimony and robustness [27]. Ultimately, our model consists of four fully-connected hidden layers with 10, 30, 80, and 50 neurons, respectively, resulting in a total of 7461 trainable parameters. The architecture of the Neural Network Forward Model (NNFM) is depicted in Fig. 2.


Fig. 2. Structure of the neural network forward model (NNFM). The 4 fully connected hidden layers have 10, 30, 80, and 50 neurons, respectively.


The output data are scaled into $[0, 1]$ because diffuse reflectance values at the 11 SDDs vary significantly from each other. The rectified linear unit (ReLU) activation function [28] was applied in all hidden layers, while sigmoid was applied to the output layer to match the scale of the output values. Mean squared error (MSE) was used to optimize the weights in the neural network. An earlier study comparing the time efficiency of the NNFM to Monte Carlo was reported in our paper [26], and more details are given in the discussion in Section 4.
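The architecture and activations described above can be sketched in NumPy. The weights below are random placeholders (the real model was trained on the MC data with a deep-learning framework), but the layer sizes reproduce the reported 7461 trainable parameters.

```python
# NumPy sketch of the NNFM forward pass: 3 inputs, hidden layers of
# 10/30/80/50 neurons with ReLU, and 11 sigmoid outputs (one per SDD).
import numpy as np

layer_sizes = [3, 10, 30, 80, 50, 11]

rng = np.random.default_rng(0)
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nnfm_forward(optical_params):
    """Map (mu_a, mu_s, g) to scaled diffuse reflectance at the 11 SDDs."""
    h = optical_params
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)                       # ReLU on hidden layers
    return sigmoid(h @ weights[-1] + biases[-1])  # sigmoid matches [0, 1] scale

# Parameter count matches the 7461 trainable parameters reported above.
n_params = sum(W.size + b.size for W, b in zip(weights, biases))

# Example call (random placeholder weights; real inputs would be scaled first):
r_pred = nnfm_forward(np.array([0.5, 0.2, 0.9]))
```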

2.2 Neural network-based inverse model

The typically-used MC-based inverse model generates its LUT over a grid of $\mu _\mathrm {a}, \mu _\mathrm {s}'$ and assumes $g$ to be a constant. At each point of $(\mu _\mathrm {a}, \mu _\mathrm {s}')$, a Monte Carlo simulation is performed to calculate the diffuse reflectance. In our approach, the NNFM generates the LUT instead. The NNFM takes in values for all grid points of $(\mu _\mathrm {a}, \mu _\mathrm {s}')$ simultaneously and produces predictions for diffuse reflectance across the entire parameter space. To maintain consistency with the legacy MC-based inverse model, we assume a constant value of $g=0.9$.

2.2.1 Lookup table

Compared to the traditional MC-based inverse model, the trained NNFM takes the place of MC simulations for generating the LUT. The grid of optical coefficients used in the NN-based inverse model was kept identical to the previously used MC-based inverse model in [29]: 82 points for $\mu _\mathrm {a}$ were distributed between 0 and 200 $\mathrm {cm}^{-1}$ and 66 points for $\mu _\mathrm {s}'$ between 0.1 and 150 $\mathrm {cm}^{-1}$. The anisotropy factor $g$ was fixed at 0.9. For any given inputs of $\mu _\mathrm {a}, \mu _\mathrm {s},$ and $g$, the trained NNFM provided the predicted diffuse reflectance $R$ at all points of the LUT grid to generate the table.
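A minimal sketch of this table-generation step is given below. The `nnfm_predict` function is a dummy stand-in for the trained NNFM (so the sketch runs on its own), and the conversion $\mu_s = \mu_s'/(1-g)$ reflects our reading that the NNFM takes $(\mu_a, \mu_s, g)$ as input; both are assumptions for illustration.

```python
# Generate the LUT with a forward model in place of per-grid-point MC runs.
import numpy as np

N_SDD = 11
mu_a_grid = np.linspace(0.0, 200.0, 82)   # 82 absorption points, cm^-1
mu_sp_grid = np.linspace(0.1, 150.0, 66)  # 66 reduced-scattering points, cm^-1
G_FIXED = 0.9                             # anisotropy held constant

def nnfm_predict(params):
    # Dummy stand-in for the trained NNFM; columns are (mu_a, mu_s, g).
    mu_a, mu_s = params[:, 0:1], params[:, 1:2]
    return np.exp(-0.01 * mu_a - 0.001 * mu_s) * np.ones((len(params), N_SDD))

# Build every (mu_a, mu_s') grid combination; convert mu_s' back to mu_s via
# mu_s = mu_s' / (1 - g), then predict the whole table in one batched call.
A, S = np.meshgrid(mu_a_grid, mu_sp_grid, indexing="ij")
mu_s = S / (1.0 - G_FIXED)
inputs = np.column_stack([A.ravel(), mu_s.ravel(), np.full(A.size, G_FIXED)])
lut = nnfm_predict(inputs).reshape(82, 66, N_SDD)
```

The single batched call is what distinguishes this from MCLUT, which would require one MC simulation per grid point.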

2.2.2 Solving the inverse problem

For any measured diffuse reflectance data, an initial guess for $\mu _\mathrm {a}, \mu _\mathrm {s}'$ is chosen. The squared-error between the measured spectrum of diffuse reflectance $R_M(\lambda _i)$ and the values from the LUT, $R_{LUT}(\lambda _i), i=1\cdots {N}$, corresponding to the initial guess $\mu _\mathrm {a}, \mu _\mathrm {s}'$ in the table, is calculated. The squared-error is given by

$$E = \sum_{i=1}^N [R_\mathrm{LUT}(\lambda_i)-R_\mathrm{M}(\lambda_i)]^2.$$
where $\lambda _i, i=1\ldots {N}$ are the chosen wavelengths for the measured spectrum $R_M(\lambda )$. Then, the Levenberg–Marquardt least-squares criterion is employed to find the optimal solution for the optical parameters. The guess for $\mu _\mathrm {a},\mu _\mathrm {s}'$ is updated by stepping between grid points in the LUT until spectra of $\mu _\mathrm {a}, \mu _\mathrm {s}'$ are found that minimize the error $E$.

2.2.3 Experimental phantoms for validation

We validated the performance of our NN-based inverse model using DRS data from two previously reported experimental data sets [29].

The DRS system consists of a light source using a tungsten halogen lamp (HL2000-HP, OceanOptics, Dunedin, FL), detectors from a charge-coupled device (CCD) array-based spectrometer (USB4000, OceanOptics, Dunedin, FL), and SMA-terminated fiber-optic probes connected to both the light source and the detectors. A picture of the DRS system can be found in Fig. 1 of [29]. Before performing a measurement, ambient light in the environment is turned off. During a measurement, the light source illuminates the target phantom through one of the probes; the other probes then receive the diffused light at the surface of the phantom and transfer it to the spectrometer. The spectrometer analyzes the diffused light to give a spectrum of light intensity over wavelength. To eliminate noise from the light source itself, the environment, and the device, two strategies are taken: 1) The light source was pre-warmed in advance and kept on during the whole set of measurements. This approach helps stabilize the light source output and mitigate fluctuations. 2) A reference spectrum obtained by measuring a reference medium, a white puck, was employed as a scaling spectrum. This reference spectrum is used to calibrate the raw DRS data acquired from the phantoms, effectively reducing noise and enhancing the accuracy of the measurements.

The liquid phantoms were prepared by mixing known volumes of monodisperse polystyrene microsphere suspension as scatterer and hemoglobin powder dissolved in water as absorber, characterized by an optical absorption spectrophotometer (Cary 300, Varian Inc, Walnut Creek, CA). The liquid phantoms tested in this report were prepared following protocols established by previous studies [9,10]. A prepared phantom's optical absorption and scattering data were obtained by measuring the absorption of the hemoglobin solution in a spectrophotometer and using Mie theory to estimate its scattering [10]. Here, we used two phantom data sets. The first data set was obtained in 12 liquid phantoms that were prepared by serial additions of small volumes of the absorber to an initial optical phantom of known $\mu _\mathrm {s}'$ and $\mu _\mathrm {a}$; these phantoms thus had nearly identical scattering but increasing absorption. The second data set contained 11 phantoms which were prepared by serial additions of the polystyrene microspheres to an initial phantom and thus had nearly constant $\mu _\mathrm {a}$ but increasing $\mu _\mathrm {s}'$. For phantom set 1, the expected absorption coefficient was calculated to be between $10^{-4}$ and 3.5 $\mathrm {cm}^{-1}$ while the $\mu _\mathrm {s}'$ varied from 150 to 260 $\mathrm {cm}^{-1}$ over the wavelength range 450-650 nm. For the second phantom set, the $\mu _\mathrm {a}$ varied between $10^{-4}$ and 1.6 $\mathrm {cm}^{-1}$ while the $\mu _\mathrm {s}'$ ranged between 100 and 350 $\mathrm {cm}^{-1}$. For all these phantoms, $g$ varies between 0.915 and 0.935.

Note that while the reduced scattering coefficient of some of the phantoms may seem relatively high compared to commonly encountered biological tissues, various studies have reported a wide range of scattering coefficients, with tissues such as white matter and skin epidermal layers demonstrating notably high values [30]. We intend to evaluate the NN-based inverse model across a spectral region spanning from the visible to the near-infrared (NIR) across a variety of tissue sites, including the skin layers and brain.

Before solving the inverse problem for the phantoms, calibration of the measured diffuse reflectance spectra needs to be performed. This was done by choosing one phantom in the data set as the reference for the other phantoms. The raw measured diffuse reflectance spectrum of each phantom was divided by the spectrum of the reference phantom. This calibration process is called reconstruction, and different reconstructions result in different spectra and solutions for the optical parameters. In other words, for phantom set 1, there were 11 reconstructions for each phantom because there were 12 phantoms in total and any of the other 11 phantoms can serve as the reference. For phantom set 2, there were 10 different reconstructions for each phantom as there were 11 phantoms.
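The reconstruction step reduces to an element-wise division; a minimal sketch (array names and shapes are illustrative assumptions):

```python
# Reconstruction: divide each raw phantom spectrum by a reference spectrum.
import numpy as np

def reconstruct(raw_spectra, reference_index):
    """Calibrate raw phantom spectra against one phantom chosen as reference.

    raw_spectra: array of shape (n_phantoms, n_wavelengths)
    Returns calibrated spectra for every phantom except the reference itself.
    """
    raw_spectra = np.asarray(raw_spectra, dtype=float)
    reference = raw_spectra[reference_index]
    others = np.delete(raw_spectra, reference_index, axis=0)
    return others / reference

# Tiny illustrative example: 3 phantoms, 2 wavelengths, phantom 0 as reference.
calibrated = reconstruct([[2.0, 4.0], [1.0, 2.0], [4.0, 8.0]], reference_index=0)
```

Running this over every choice of reference phantom yields the 11 (set 1) or 10 (set 2) reconstructions per phantom described above.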

3. Results

3.1 Performance of NNFM

The performance of the converged NNFM is presented in Fig. 3. We use the normalized mean absolute error (NMAE), which is calculated as the mean absolute error (MAE) between the true data $\mathbf {y}$ and the predicted value $\hat {\mathbf {y}}$, divided by the mean value of $|\mathbf {y}|$ over all instances in the data:

$$\mbox{NMAE}=\frac{\mathrm{MAE}(\mathbf{y}, \mathbf{\hat{y}})} {\mathrm{mean}(|\mathbf{y}|)}$$
where $\mathbf {y}$ is the true data and $\hat {\mathbf {y}}$ is the predicted value over all instances in the data. We use NMAE because relative error is misleading due to the presence of very small values in diffuse reflectance, especially at large SDDs [26].
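The metric is a one-line computation; a direct implementation of the definition above:

```python
# NMAE: mean absolute error normalized by the mean magnitude of the true data.
import numpy as np

def nmae(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(y_true))
```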


Fig. 3. Predictions of diffuse reflectance as estimated by the NNFM versus the values computed via MC. The black solid line represents a perfect fit where $R_\mathrm {MC} = R_\mathrm {NNFM}$. The dots are predictions for the test data set. The color of the points indicates the SDD as illustrated in the color bar. For example, blue dots are predictions for diffuse reflectance at 0.16 $\mathrm {cm}$, and red dots are predictions for data at 0.34 $\mathrm {cm}$. Note that values at each of the source-detector distances are individually scaled into [0, 1] so as to be shown in the same figure. The overall NMAE of all predictions is 0.018. The performance of simpler ML models can be found in supplemental material Visualization 3 and Visualization 4, which show less accurate predictions.


These results demonstrate that the NNFM is capable of reproducing results consistent with the MC calculation over a range of diffuse reflectance and SDDs.

3.1.1 Data size for training NNFM

As MC data is needed to train the NNFM, it is reasonable to wonder how the size of the training data affects the performance of the NN. The training set size was varied as a percentage of the 5000 MC samples: 2%, 4%, 6%, 8%, 10%, 20%, 30%,…, 80%, 90%. The test data remained 10% of the original data ($5000 \times 10\% = 500$ MC simulations) and was fixed in all the comparisons. Both sets were randomly selected from the whole MC data set without replacement. We trained the NNFM independently 20 times for each training set size and obtained a total of 20 NMAE values. Fig. 4 shows the results of these comparisons. The training result data has been uploaded as Data File 2 in the supplemental material.
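The shape of this sweep can be sketched as follows; `train_and_score` is a hypothetical placeholder returning a dummy score (a real run would train the NNFM on the given subset and return its test NMAE).

```python
# Training-size study: for each fraction, re-train 20 times on fresh random
# subsets and record the score on a fixed, held-out test set.
import numpy as np

N_TOTAL = 5000
fractions = [0.02, 0.04, 0.06, 0.08, 0.10] + [round(0.1 * k, 2) for k in range(2, 10)]
rng = np.random.default_rng(0)
test_idx = rng.choice(N_TOTAL, size=500, replace=False)   # fixed 10% test set
pool = np.setdiff1d(np.arange(N_TOTAL), test_idx)         # remaining 4500 samples

def train_and_score(train_idx, test_idx):
    # Placeholder: a real run would train the NNFM on train_idx and
    # return its NMAE on test_idx. Here we return a dummy decreasing score.
    return 1.0 / np.sqrt(len(train_idx))

results = {}
for frac in fractions:
    n_train = int(frac * N_TOTAL)
    scores = [train_and_score(rng.choice(pool, size=n_train, replace=False),
                              test_idx)
              for _ in range(20)]
    results[frac] = (np.mean(scores), np.var(scores))  # mean and error-bar size
```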


Fig. 4. Error of the NNFM over varying sizes of training data. The red dots are averaged total NMAEs and the error bars represent the variance when re-training the model 20 times (Data File 2).


As depicted in Fig. 4, the error steadily converges as the amount of training data increases. The difference between the NMAE values for 80% and 90% of the training data is approximately 0.002, indicating a small derivative with respect to adding more data. Given this observation, it is unlikely that the error would converge to a substantially smaller value even with an additional 4500 instances of MC data. Considering this and the additional cost of generating more MC data, we kept the model trained with 4500 instances, which accounts for 90% of the available MC data.

3.1.2 Extrapolation

To investigate the performance of the NNFM when dealing with optical parameters outside the range of the training data, we split the data as indicated by the red lines in Fig. 1(a). We repeated the training of the model after removing the upper or lower 10% of $\mu _\mathrm {a}$ or $\mu _\mathrm {s}$ from the training set and placing it in the test set. For example, the red dashed line on the left splits the data to investigate extrapolation for $\mu _\mathrm {a}$ below the values in the training data: the NNFM was trained on the data on the right side of the red dashed line and makes predictions for data on the left. Similarly, the red dashed line on the top splits the data to investigate extrapolation for $\mu _\mathrm {s}$ above that line. In total, we investigated four extrapolation cases, one for each 10% edge of the data in Fig. 1(a).

The results of this analysis are presented in Fig. 5. As shown in the figure, the NNFM is robust when extrapolating above the training data, i.e., when $\mu _\mathrm {a}$ or $\mu _\mathrm {s}'$ is larger than the data in the training set. However, it loses more accuracy when extrapolating to smaller $\mu _\mathrm {a}$ and $\mu _\mathrm {s}'$. In particular, we observe that the error of extrapolating to smaller $\mu _\mathrm {a}$ is significantly larger than extrapolating to smaller $\mu _\mathrm {s}'$, and the predicted diffuse reflectance is underestimated. This is likely because the ML model has not encountered sufficient instances during training where almost no photons are absorbed. When extrapolating towards smaller $\mu _s$, by contrast, there is less scattering but not vanishing scattering, so the model has no problem extrapolating towards smaller $\mu _s$. To validate this explanation, we applied MaxAbsScaler and logarithmic transformations to the features; the extrapolation results can be found in the supplemental material Visualization 1 and Visualization 2. The extrapolation did not materially change with these alternate approaches. Therefore, we hypothesize that it is not the scaling of the data that is causing the extrapolation issue.


Fig. 5. (a) Extrapolation of $\mu _\mathrm {a}$ below the training data. The model was trained with data on the right side of the red dashed line on the left of Fig. 1(a), and the predictions of the trained NNFM for the data on the left side of the line are shown here. NMAE=0.162. Performance of extrapolating to upper $\mu _s$ using MaxAbsScaler and a logarithmic scaler can be found in supplemental material Visualization 1 and Visualization 2. (b) Extrapolation of $\mu _\mathrm {a}$ above the training data. NMAE=0.112. (c) Extrapolation of $\mu _\mathrm {s}$ below the training data. NMAE=0.057. (d) Extrapolation of $\mu _\mathrm {s}$ above the training data. NMAE=0.027. In each of the figures, 10% of the data is removed from one edge of the parameter range shown in Fig. 1(a) and excluded from the training set. The NNFM is trained with the remaining 90% of the data. The color of each dot indicates the SDD as in Fig. 3.


Overall, our results suggest that we can have more confidence in the predictions made by the NNFM for data with larger optical parameters than our training data. These findings also indicate that the training data should include the minimum possible optical parameter values that the model may be asked to evaluate.

3.2 Fitted diffuse reflectance spectra of measured phantoms

The model first fits the $R(\lambda )$ spectra of the phantoms utilizing the Levenberg–Marquardt least-squares criterion to minimize the squared-error between the fitted spectrum and the measured spectrum, as described in Section 2.2.2. The fitted $R$ spectra of phantoms in phantom set 1 using the MC-based inverse model and the NN-based inverse model are compared in Fig. 6. Only fits of the phantoms with the 3rd, 5th, 7th, 9th, and 11th absorption levels (from low absorption to high absorption) are shown, using phantom 1 as the reference. Fig. 6(a) shows the fitted spectra using the LUT generated by MC simulations and Fig. 6(b) shows the fitted spectra using the LUT generated by the NNFM. In both Fig. 6(a) and Fig. 6(b), the upper plot is the fitted diffuse reflectance spectra and the lower plot is the residual between the measured and fitted spectra.


Fig. 6. (a) MC-based inverse model fitted R spectra and the residual (Dataset 1, Ref. [33] and Dataset 2, Ref. [34]). (b) NN-based inverse model fitted R spectra and the residual. Fitted R spectra of the 3rd, 5th, 7th, 9th, and 11th phantoms in set 1 when using phantom 1 as reference are shown. In the diffuse reflectance plots, dashed lines are the measured diffuse reflectance, and the curves with dots are the fitted diffuse reflectance using the corresponding inverse model. Colors indicate different absorption levels (Dataset 1, Ref. [33] and Dataset 3, Ref. [35]).


From the residual plots, it is observed that the MC-based fitting has good agreement with the NN-based fitting, with some minor discrepancies, which is expected since the NNFM is a data-driven approximation of forward MC simulations rather than a first-principles approach like MC. As shown in Fig. 6, the fitted reflectance is worse for phantoms with high absorption and at longer wavelengths, when the absorption coefficient of the reconstructed targets is significantly different from that of the reference phantom (phantom 1 for the data here). When we reconstructed the same reflectance data using phantom 11 as the reference, we observed that the fitting to phantom 1 became worse (data not shown).

3.3 Inverse solution for measured phantoms

The wavelength-averaged values over 450-650 $\mathrm {nm}$ of the inferred optical parameters are calculated and shown in Fig. 7. The error bars are the standard deviation (std) over the different reconstructions using different reference phantoms and the circles are the mean values. To evaluate the accuracy, we calculate the Euclidean distance (ED) between the circles and their expected values. The Euclidean distance is the square root of the squared-error defined in Eq. (1).


Fig. 7. (a) Solution for $\mu _\mathrm {a}$ of phantom set 1 using the MC-based inverse model. ED$=0.28$ (Dataset 1, Ref. [33] and Dataset 2, Ref. [34]). (b) Solution for $\mu _\mathrm {a}$ of phantom set 1 using the NN-based inverse model. ED$=0.25$ (Dataset 1, Ref. [33] and Dataset 3, Ref. [34]). (c) Solution for $\mu _\mathrm {s}'$ of phantom set 2 using the MC-based inverse model. ED=$5.26$ (Dataset 4, Ref. [36] and Dataset 5, Ref. [37]). (d) Solution for $\mu _\mathrm {s}'$ of phantom set 2 using the NN-based inverse model. ED=$4.14$ (Dataset 4, Ref. [36] and Dataset 6, Ref. [38]). In the above figures, wavelength-averaged values of solutions over 450-650 $\mathrm {nm}$ are shown. The dashed line represents the perfect fit. The circles are the mean values over different reconstructions for the wavelength-averaged solutions. The error bars are the variance over all the reconstructions.


From Fig. 7, we see that the NN-based inverse model is able to solve for $\mu _\mathrm {a}$ and $\mu _\mathrm {s}'$ with accuracy comparable to the MC-based model. Moreover, the NNFM-based results are more accurate in estimating $\mu _\mathrm {s}'$ than the MC-based results. However, the variation in the recovered absorption across reconstructions is larger with the NNFM than with the MC-based method.

4. Discussion

In this paper, we first used an NNFM as a surrogate model for forward MC simulations of photon transport to estimate the diffuse reflectance reaching a detector. Once trained, unlike the traditional MCLUT, which requires an MC simulation for each point of the grid, the NNFM can make predictions for grid points rapidly, significantly saving time and computational cost. In our previous work [26], we compared the MC simulation time distribution over 1750 sample points of ($\mu _a, \mu _s, g$) with the training and prediction times of the NNFM. MC simulation times ranged from 10-3500 s, while NNFM training took only 75 s and prediction took 0.093 s for 350 predictions, 0.113 s for 1000 predictions, and 0.112 s for 1500 predictions. The NNFM is thus faster than MC simulation by a factor of $10^2-10^4$; even accounting for the training time, it is $5-10^2$ times faster. Furthermore, while the time of an MC simulation depends on the number of photons being simulated and the values of the optical properties, the time to make an NNFM prediction does not vary with these factors. In contrast to MC predictions and lookup-table interpolation, NNFM prediction time also does not necessarily increase linearly with the number of predictions, owing to the batching efficiencies exploited by modern ML libraries. Consequently, the NNFM can generate tables with higher speed and more flexibility, since making predictions for any grid takes nearly the same computational cost, offering a potentially faster tool for clinical diagnosis.
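The table-generation step described above amounts to a single batched call to the trained forward model over a parameter grid. The snippet below is a sketch of that pattern; `nnfm_predict` is a toy closed-form stand-in for the trained NNFM, and the grid ranges are illustrative assumptions, not the paper's actual values.

```python
import numpy as np

# Stand-in for the trained NNFM: any callable mapping a batch of
# (mu_a, mu_s) rows to predicted diffuse reflectance. A real trained
# network would expose the same batched interface.
def nnfm_predict(params):
    mu_a, mu_s = params[:, 0], params[:, 1]
    return np.exp(-mu_a) / (1.0 + 1.0 / mu_s)   # toy reflectance model

# Build the lookup table with one batched prediction instead of one
# MC simulation per grid point.
mu_a_grid = np.linspace(0.1, 10.0, 50)
mu_s_grid = np.linspace(5.0, 50.0, 50)
aa, ss = np.meshgrid(mu_a_grid, mu_s_grid, indexing="ij")
grid = np.column_stack([aa.ravel(), ss.ravel()])  # shape (2500, 2)

lut = nnfm_predict(grid).reshape(aa.shape)        # (50, 50) table
print(lut.shape)  # (50, 50)
```

Because the whole grid is evaluated in one vectorized call, refining or changing the grid costs essentially the same as generating it in the first place.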

We also investigated how much MC data is needed to sufficiently train an NNFM. In our numerical experiments, we found that 2000 or more MC simulations are sufficient to build an accurate model. Extrapolation was also studied by removing particular ranges of $\mu _\mathrm {a}$ and $\mu _\mathrm {s}$ from the extremes of the training data. We demonstrated that the NNFM is robust in extrapolation towards higher values of both $\mu _\mathrm {a}$ and $\mu _\mathrm {s}$, but the accuracy suffers when the model is asked to extrapolate below the values used to train the model.
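A minimal sketch of such a training-size study is shown below, using the NMAE metric of Eq. (2) and a small scikit-learn network trained on synthetic forward data. The toy reflectance model, sample counts, and network size are our assumptions standing in for the actual MC training set and NNFM architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def nmae(y_true, y_pred):
    # normalized mean absolute error: MAE(y, y_hat) / mean(|y|), as in Eq. (2)
    return np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(y_true))

# synthetic forward data standing in for MC simulations
X = rng.uniform([0.1, 5.0], [10.0, 50.0], size=(3000, 2))   # (mu_a, mu_s)
y = np.exp(-X[:, 0]) / (1.0 + 1.0 / X[:, 1])                # toy reflectance

for frac in (0.1, 0.5, 0.9):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=frac, random_state=0)
    model = MLPRegressor(hidden_layer_sizes=(30, 30), max_iter=500,
                         random_state=0).fit(X_tr, y_tr)
    print(f"train fraction {frac}: NMAE = {nmae(y_te, model.predict(X_te)):.3f}")
```

Repeating the loop with independent retrainings (as in Fig. 4) would give the spread of NMAE at each training-set size.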

Note that while the neural network forward model (NNFM) is trained and tested solely on Monte Carlo (MC) simulation data, the performance of the inverse model built on the trained NNFM in estimating optical parameters from DRS data is the best test of the approach. Thus, although certain assumptions are made regarding the geometry, phantom profiles, and fiber profiles in the MC data, their precise details become less significant as long as the feasibility of the inverse model on DRS data is convincingly demonstrated. This ensures that the inverse model's practical effectiveness is not limited by the specifics of the MC simulation assumptions, supporting its potential for accurate and reliable estimation of optical parameters in real-world scenarios.

We then modified the traditional MCLUT by replacing the MC forward simulations with the trained NNFM. The NN-based inverse model was validated on two experimental data sets. Our results show that the LUT based on the NNFM is a feasible approach with accuracy in inferring $\mu _\mathrm {a}$ and $\mu _\mathrm {s}$ similar to the traditional MCLUT, although the NNFM approach led to somewhat larger variance across reconstructions. Overall, the NN-based inverse model outperforms the traditional MCLUT in efficiency and flexibility while achieving comparable accuracy. These phantom results indicate that NNFM-based inversion should be tested on real-world biological tissues in future work.
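The LUT inversion itself reduces to finding the table entry that minimizes the squared error $E$ of Eq. (1). The sketch below illustrates this with a toy two-channel table; the grids, the toy spectra, and the function name are all illustrative assumptions.

```python
import numpy as np

def lut_invert(r_measured, lut, mu_a_grid, mu_s_grid):
    """Return the (mu_a, mu_s) grid point whose tabulated reflectance
    spectrum minimizes the squared error E of Eq. (1).

    r_measured: measured reflectance, shape (n_wavelengths,)
    lut: tabulated reflectance, shape (n_mu_a, n_mu_s, n_wavelengths)
    """
    err = np.sum((lut - r_measured) ** 2, axis=-1)   # E at every grid point
    i, j = np.unravel_index(np.argmin(err), err.shape)
    return float(mu_a_grid[i]), float(mu_s_grid[j])

# toy table: the "spectrum" is two copies of exp(-mu_a) * mu_s / 50
mu_a_grid = np.linspace(0.5, 2.0, 4)
mu_s_grid = np.linspace(10.0, 40.0, 4)
aa, ss = np.meshgrid(mu_a_grid, mu_s_grid, indexing="ij")
spec = np.exp(-aa) * ss / 50.0
lut = np.stack([spec, spec], axis=-1)

target = lut[2, 1]                   # spectrum of a known grid point
print(lut_invert(target, lut, mu_a_grid, mu_s_grid))  # → (1.5, 20.0)
```

Whether the table comes from MC simulations or from the NNFM, this search step is identical; only the cost of building `lut` differs.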

With the NNFM being a fast and easy approach to making predictions for the diffuse reflectance, a continuous inverse model without lookup tables is currently being investigated by the authors. This would allow the inverse model to choose parameter values that do not lie on a grid, as they must in the LUT approach, and could lead to more accurate results. Moreover, while in this paper we kept $g$ constant as in the traditional MCLUT method, neural networks would make it possible to predict $\mu _\mathrm {a}$, $\mu _\mathrm {s}$, and $g$ without combining $\mu _\mathrm {s}$ and $g$ into a single parameter. This would allow the model to account for the fact that $g$ is not constant in many realistic scenarios. We would also repeat the training study completed in Section 3.1.1 to verify that such a model with varying $g$ has similar convergence properties.
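Such a continuous, LUT-free inversion could, for example, hand the surrogate directly to a bounded optimizer. The sketch below uses `scipy.optimize.minimize` with a toy closed-form stand-in for the NNFM; the surrogate, the source-detector distances, and the parameter bounds are our assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

SDD = np.array([0.16, 0.23, 0.34])   # source-detector distances (cm)

# Toy smooth surrogate standing in for the trained NNFM.
def surrogate(mu_a, mu_s):
    return mu_s / (mu_s + 10.0) * np.exp(-mu_a * SDD)

r_measured = surrogate(1.2, 25.0)    # synthesized "measurement"

def objective(p):
    # squared error E between the surrogate prediction and the measurement
    return np.sum((surrogate(p[0], p[1]) - r_measured) ** 2)

# Continuous inversion: minimize E directly over (mu_a, mu_s)
# instead of searching a pre-computed grid.
res = minimize(objective, x0=[2.0, 15.0],
               bounds=[(0.1, 10.0), (5.0, 50.0)])
print(res.x)  # ≈ [1.2, 25.0]
```

Because the optimizer queries the surrogate at arbitrary points, the solution is not restricted to grid nodes, which is the main advantage over the LUT search.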

In conclusion, we have demonstrated the potential of building a data-driven, generic inverse model for DRS to estimate optical parameters from measured diffuse reflectance spectra, and we will continue developing a more accurate and more user-friendly inverse model to serve as a potential clinical diagnostic tool.

Disclosures

We declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Data File 1, Ref. [31], Data File 2, Ref. [32], Dataset 1, Ref. [33], Dataset 2, Ref. [34], Dataset 3, Ref. [35], Dataset 4, Ref. [36], Dataset 5, Ref. [37], Dataset 6, Ref. [38].

References

1. R. Wilson, K. Vishwanath, and M.-A. Mycek, “Optical methods for quantitative and label- free sensing in living human tissues: principles, techniques, and applications,” Adv. Phys.: X 1(4), 523–543 (2016). [CrossRef]  

2. S. R. Arridge, “Optical tomography in medical imaging,” Inverse Probl. 15(2), R41–R93 (1999). [CrossRef]  

3. S. L. Jacques and B. W. Pogue, “Tutorial on diffuse light transport,” J. Biomed. Opt. 13(4), 041302 (2008). [CrossRef]  

4. M. Schweiger, S. Arridge, and D. Delpy, “Application of the finite-element method for the forward and inverse models in optical tomography,” J Math Imaging Vis 3(3), 263–283 (1993). [CrossRef]  

5. L. Bass, O. Nikolaeva, A. Bykov, and M. Kirillin, “Finite difference methods for solving the transport equation in the problems of optical biomedical diagnostics,” J. Biomed. Photonics Eng. 3(1), 010311 (2017). [CrossRef]  

6. S. A. Prahl, “A Monte Carlo model of light propagation in tissue,” in Dosimetry of Laser Radiation in Medicine and Biology, vol. 10305, G. J. Mueller, D. H. Sliney, and R. F. Potter, eds., International Society for Optics and Photonics (SPIE, 1989).

7. N. Rajaram and J. W. Tunnell, “Lookup table-based inverse model for determining tissue optical properties,” in Biomedical Optics (Optica Publishing Group, 2008), p. BTuF23.

8. I. Fredriksson, M. Larsson, and T. Strömberg, “Inverse Monte Carlo method in a multilayered tissue model for diffuse reflectance spectroscopy,” J. Biomed. Opt. 17(4), 1–13 (2012). [CrossRef]  

9. G. M. Palmer and N. Ramanujam, “Monte carlo-based inverse model for calculating tissue optical properties. part i: Theory and validation on synthetic phantoms,” Appl. Opt. 45(5), 1062–1071 (2006). [CrossRef]  

10. J. E. Bender, K. Vishwanath, L. K. Moore, J. Q. Brown, V. Chang, G. M. Palmer, and N. Ramanujam, “A robust monte carlo model for the extraction of biological absorption and scattering,” IEEE Trans. Biomed. Eng. 56(4), 960–968 (2009). [CrossRef]  

11. X. He, X. Jiang, X. Fu, Y. Gao, and X. Rao, “Least squares support vector machine regression combined with monte carlo simulation based on the spatial frequency domain imaging for the detection of optical properties of pear,” Postharvest Biol. Technol. 145, 1–9 (2018). [CrossRef]  

12. B. G. Silva, M. R. Gonçalves, G. H. S. Alves, Á. Monte, and D. M. Cunha, “Determination of optical properties of skin tissues using spatial domain frequency imaging and random forests,” Tech. rep., EasyChair (2022).

13. S. Panigrahi and S. Gioux, “Machine learning approach for rapid and accurate estimation of optical properties using spatial frequency domain imaging,” J. Biomed. Opt. 24(07), 071606 (2019). [CrossRef]  

14. S. Xing, J. Zhang, Y. Luo, Y. Yang, and X. Fu, “Extracting tissue optical properties and detecting bruised tissue in pears quickly and accurately based on spatial frequency domain imaging and machine learning,” Foods 12(2), 238 (2023). [CrossRef]  

15. Y. Zhou, X. Fu, Y. Ying, and Z. Fang, “An integrated fiber-optic probe combined with support vector regression for fast estimation of optical properties of turbid media,” Anal. Chim. Acta 880, 122–129 (2015). [CrossRef]  

16. T. J. Farrell, B. C. Wilson, and M. S. Patterson, “The use of a neural network to determine tissue optical properties from spatially resolved diffuse reflectance measurements,” Phys. Med. Biol. 37(12), 2281–2286 (1992). [CrossRef]  

17. M. Ivančič, P. Naglič, F. Pernuš, B. Likar, and M. Bürmen, “Efficient estimation of subdiffusive optical parameters in real time from spatially resolved reflectance by artificial neural networks,” Opt. Lett. 43(12), 2901–2904 (2018). [CrossRef]  

18. J. An, Q. Zhang, L. Zhang, C. Liu, D. Liu, M. Jia, and F. Gao, “Neural network-based optimization of sub-diffuse reflectance spectroscopy for improved parameter prediction and efficient data collection,” J. Biophotonics 16, e202200375 (2023). [CrossRef]  

19. J. Moros, F. I nón, S. Garrigues, and M. de la Guardia, “Near-infrared diffuse reflectance spectroscopy and neural networks for measuring nutritional parameters in chocolate samples,” Anal. Chim. Acta 584(1), 215–222 (2007). [CrossRef]  

20. I. Fredriksson, M. Larsson, and T. Strömberg, “Machine learning for direct oxygen saturation and hemoglobin concentration assessment using diffuse reflectance spectroscopy,” J. Biomed. Opt. 25(11), 112905 (2020). [CrossRef]  

21. M. H. Nguyen, Y. Zhang, F. Wang, J. De La Garza Evia Linan, M. K. Markey, and J. W. Tunnell, “Machine learning to extract physiological parameters from multispectral diffuse reflectance spectroscopy,” J. Biomed. Opt. 26(05), 052912 (2021). [CrossRef]  

22. B. Hokr and J. Bixler, “Machine learning estimation of tissue optical properties,” Sci. Rep. 11(1), 6561 (2021). [CrossRef]  

23. L. Wang and S. L. Jacques, “Monte Carlo Multi-Layered (MCML),” (1994). Software.

24. R. G. McClarren, Uncertainty Quantification and Predictive Computational Science (Springer, 2018).

25. K. D. Humbird, J. L. Peterson, and R. G. McClarren, “Deep neural network initialization with decision trees,” (2018).

26. Q. Lan, R. G. McClarren, and K. Vishwanath, “Neural network forward model and transfer learning calibration from Monte Carlo to diffuse reflectance spectroscopy,” in Optical Tomography and Spectroscopy of Tissue XIV, vol. 11639, S. Fantini and P. Taroni, eds., International Society for Optics and Photonics (SPIE, 2021), pp. 49–57.

27. R. G. McClarren, Machine Learning for Engineers (Springer, 2021).

28. X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in International Conference on Artificial Intelligence and Statistics (China, 2011).

29. K. Vishwanath, K. Chang, D. Klein, Y. F. Deng, V. Chang, J. E. Phelps, and N. Ramanujam, “Portable, fiber-based, diffuse reflection spectroscopy (drs) systems for estimating tissue optical properties,” Appl. Spectrosc. 65(2), 206–215 (2011). [CrossRef]  

30. S. Jacques, “Optical properties of biological tissues: A review,” Phys. Med. Biol. 58(11), R37–R61 (2013). [CrossRef]  

31. Q. Lan, “MC data for training,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898936.

32. Q. Lan, “NNFM with varying training size,” figshare (2023) https://doi.org/10.6084/m9.figshare.23500770.

33. Q. Lan, “Measured Hb data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898936.

34. Q. Lan, “MC fitted Hb data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898960.

35. Q. Lan, “NN fitted Hb data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898969.

36. Q. Lan, “Measured PS data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898927.

37. Q. Lan, “MC fitted PS data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898978.

38. Q. Lan, “NN fitted PS data,” figshare (2023) https://doi.org/10.6084/m9.figshare.23898975.

Supplementary Material (12)

Name       Description
Data File 1       MC data we used to train the forward model.
Data File 2       We varied the training data percentage from 5%-95% to test the performance of the neural network fitting the forward problem of diffuse reflectance spectroscopy. With each training data size, the neural network was trained independently for 20 times.
Dataset 1       Measured data of the phantoms with fixed mua (phantom set 1). The measured diffuse reflectance spectra, expected mua, expected mus, and expected g are also included.
Dataset 2       Extracted optical properties of the phantoms with fixed mua using the MCLUT inverse model. The fitted diffuse reflectance, extracted mua, extracted mus, and extracted g are also included.
Dataset 3       Extracted optical properties of the phantoms with fixed mua using the NN-based inverse model. The fitted diffuse reflectance, extracted mua, extracted mus, and extracted g are also included.
Dataset 4       Measured data of the PS phantoms (phantom set 2). The measured diffuse reflectance spectra, expected mua, expected mus, and expected g are also included.
Dataset 5       Extracted optical properties of the PS phantoms using the MCLUT inverse model. The fitted diffuse reflectance, extracted mua, extracted mus, and extracted g are also included.
Dataset 6       Extracted optical properties of the PS phantoms using the NN-based inverse model. The fitted diffuse reflectance, extracted mua, extracted mus, and extracted g are also included.
Visualization 1       Applying the MaxAbsScaler on the features for extrapolation towards low absorptions.
Visualization 2       Applying a logarithm on the features for extrapolation towards low absorptions.
Visualization 3       Using a simple gradient boosting regressor to fit the forward model, as a comparison to the deep neural network forward model. Diffuse reflectance at all source-detector distances is scaled into [0, 1].
Visualization 4       Using a simple nonlinear regressor to fit the forward model, as a comparison to the deep neural network forward model. Diffuse reflectance at all source-detector distances is scaled into [0, 1].


Figures (7)

Fig. 1.
Fig. 1. (a): The 5000 sample points of $\mu _\mathrm {a}, \mu _\mathrm {s}$ in the MC simulations. Points are sampled using the Latin hypercube method to fill the space. The red dashed lines indicate where we truncated the data for the extrapolation tests discussed in Section 3.1.2 (Data File 1). (b): The histogram of the $\mathrm {MFP}$ of our sample points. In our MC simulations, the shortest source-detector separation distance was 0.14 $\mathrm {cm}$, so that for the vast majority of simulated cases $\mathrm {MFP} \ll$ SDD (Data File 1).
Fig. 2.
Fig. 2. Structure of the neural network forward model (NNFM). The 4 fully connected hidden layers have 10, 30, 80, and 50 neurons, respectively.
Fig. 3.
Fig. 3. Predictions of diffuse reflectance as estimated by the NNFM versus the values computed via MC. The black solid line represents a perfect fit where $R_\mathrm {MC} = R_\mathrm {NNFM}$. The dots are predictions for the test data set. The color of the points indicates the SDD as illustrated in the color bar; for example, blue dots are predictions for diffuse reflectance at 0.16 $\mathrm {cm}$, and red dots are predictions for data at 0.34 $\mathrm {cm}$. Note that values at each of the source-detector distances are individually scaled into [0, 1] so as to be shown in the same figure. The overall NMAE of all predictions is 0.018. Performance of simpler ML models can be found in supplemental material Visualization 3 and Visualization 4, which show less accurate predictions.
Fig. 4.
Fig. 4. Error of the NNFM over varying sizes of training data. The red dots are the averaged total NMAEs and the error bars represent the variance when re-training the model 20 times (Data File 2).
Fig. 5.
Fig. 5. (a) Extrapolation of $\mu _\mathrm {a}$ below the training data. The model was trained with data on the right side of the red dashed line on the left of Fig. 1(a), and the predictions of the trained NNFM for the data on the left side of the line are shown here. NMAE=0.162. Performance of extrapolating to higher $\mu _s$ using the MaxAbsScaler and a logarithm scaler can be found in supplemental material Visualization 1 and Visualization 2. (b) Extrapolation of $\mu _\mathrm {a}$ above the training data. NMAE=0.112. (c) Extrapolation of $\mu _\mathrm {s}$ below the training data. NMAE=0.057. (d) Extrapolation of $\mu _\mathrm {s}$ above the training data. NMAE=0.027. In each of the figures, 10% of the data is removed from the corresponding edge of the parameter range shown in Fig. 1(a) and excluded from the training set; the NNFM is trained with the remaining 90% of the data. The color of each dot indicates the SDD as in Fig. 3.
Fig. 6.
Fig. 6. (a) MC-based inverse model fitted R spectra and the residual (Dataset 1, Ref. [33] and Dataset 2, Ref. [34]). (b) NN-based inverse model fitted R spectra and the residual. Fitted R spectra of phantoms 3, 5, 7, 9, and 11 in set 1, using phantom 1 as the reference, are shown. In the diffuse reflectance plots, dashed lines are the measured diffuse reflectance and the curves with dots are the fitted diffuse reflectance from the corresponding inverse model. Colors indicate different absorption levels (Dataset 1, Ref. [33] and Dataset 3, Ref. [35]).
Fig. 7.
Fig. 7. (a) Solution for $\mu _\mathrm {a}$ of phantom set 1 using the MC-based inverse model. ED$=0.28$ (Dataset 1, Ref. [33] and Dataset 2, Ref. [34]). (b) Solution for $\mu _\mathrm {a}$ of phantom set 1 using the NN-based inverse model. ED$=0.25$ (Dataset 1, Ref. [33] and Dataset 3, Ref. [35]). (c) Solution for $\mu _\mathrm {s}'$ of phantom set 2 using the MC-based inverse model. ED$=5.26$ (Dataset 4, Ref. [36] and Dataset 5, Ref. [37]). (d) Solution for $\mu _\mathrm {s}'$ of phantom set 2 using the NN-based inverse model. ED$=4.14$ (Dataset 4, Ref. [36] and Dataset 6, Ref. [38]). Wavelength-averaged values of the solutions over 450-650 $\mathrm {nm}$ are shown. The dashed line represents a perfect fit. The circles are the mean values of the wavelength-averaged solutions over the different reconstructions. The error bars are the standard deviation over all the reconstructions.

Equations (2)

$$E = \sum_{i=1}^{N} \left[ R_\mathrm{LUT}(\lambda_i) - R_\mathrm{M}(\lambda_i) \right]^2. \tag{1}$$

$$\mathrm{NMAE} = \frac{\mathrm{MAE}(y, \hat{y})}{\mathrm{mean}(|y|)} \tag{2}$$