Deep learning and sub-band fluorescence imaging-based method for caries and calculus diagnosis embeddable on different smartphones

Cheng Wang; Cheng Wang; Rongjun Zhang; Xiaoling Wei; Le Wang; Peiyu Wu; Peiyu Wu; Qi Yao; Qi Yao

doi:10.1364/BOE.479818

1. Introduction

High prevalence rate, low treatment rate, and high failure rate of dental restoration are currently the three major challenges of dental caries prevention and treatment [1]. The Lancet published a study in 2017 showing that permanent dental caries ranked first in prevalence and second in incidence among 328 diseases [2]. Globally, 2.3 billion people have untreated, permanent dental caries and 532 million children have untreated deciduous dental caries [3]. Successful periodontal treatment requires completely removing calculus [4–6]. Screening and diagnosis of early dental caries (ICDAS II codes 1 and 2) are beneficial for disease control and even reversal of the disease [7]. However, for early changes in enamel, traditional clinical methods (visual and palpation) and radiological techniques have a high risk of missed detections and rely strongly on clinician experience [8,9]. In addition, it is difficult to get timely screening and diagnosis of early dental caries due to the constraint of medical resources, especially in a region with limited medical resources [10]. Therefore, universal community and home caries screening are critical for dental health management.

Optical-based methods have recently been increasingly used for caries detection, such as quantitative light-induced fluorescence (QLF), near-infrared transillumination (TI), optical coherence tomography (OCT), Polarized Raman spectroscopy, terahertz spectroscopy [11–16]. Compared with other optical detection methods, the QLF technique has the advantages of high sensitivity and specificity, simple structure, and easy operation. It has the potential for developing miniaturized, portable community and home caries detection devices. Although some commercial products for caries detection are available, they are difficult to be applied to community and home caries detection because they are expensive and require specialized medical knowledge [17–19]. Consequently, it is urgent to construct a QLF-based method that can be applied to community and home caries screening.

The principle of detecting dental caries based on QLF is that when the blue-violet light is irradiated to the normal tooth enamel, an emission peak will be generated at 480-510 nm. Moreover, with the progress of caries disease, one or two continuously enhanced emission peaks will be produced at 590-710 nm. The color of tooth tissue also transitions from green to red [20]. The increasing emission peak at 590-710 nm is due to the accumulation of porphyrins, a byproduct of bacterial metabolism, on the surface of dental caries [21]. The fluorescence spectrum of calculus is very similar to that of caries with extensive cavities (ICDAS II codes 5 and 6), so it is necessary to distinguish between caries and calculus for diagnosis [22].

Many studies have been published on detecting dental caries based on fluorescence spectroscopy. Specifically, 1) Different excitation wavelengths. A.C. Ribeiro Figueiredo et al. compared the fluorescence spectra of enamel, dentin, and caries at three different laser excitation wavelengths [23]. Currently, most research employs 405 nm LEDs or lasers as the excitation light source. 2) Fluorescence spectral features and quantification. Betsy Joseph et al. compared the differences in the peak ratio (510/630 nm) of fluorescence spectra of the four stages of caries and found statistically significant differences [24]. Sung-Ae Son et al. analyzed the slope (550-600 nm), spectral area (550-590 nm), and peak ratio (625/667 nm) of the fluorescence spectra of different stages of caries [25]. 3) Color features and quantification. Qingguang Chen et al. studied the chromaticity parameters of different stages of dental caries and found statistically significant differences [26]. Our previous work investigated the visual representation of different stages of caries in the CIE 1931 chromaticity diagram [27]. 4) Multi-spectral or hyperspectral imaging features. Ahmed L. Abdel Gawad et al. employed a hyperspectral imaging technique to compare the image distinctions between different stages of dental caries and calculus at various wavelengths [28]. The above methods try to find the characteristic differences at stages of caries in fluorescence spectrum, color, or imaging and establish a mapping relationship to quantify it. However, these studies seldom consider using these characteristic differences for automatic diagnosis, which is critical for community and home caries screening.

Recently some methods based on fluorescence or reflectance spectroscopy and imaging combined with machine learning have been applied to the automatic diagnosis of dental caries. However, these methods also have some limitations. Specifically, 1) RGB camera-based method. Duc Long Duong et al. used support vector machines (SVM) in combination with smartphone images to automatically identify caries on the occlusal surface [29]. However, the accuracy of identifying dental caries using only RGB images from smartphones is very low. Furthermore, RGB images are device-dependent, leading to a poor generalization of deep learning models. 2) Hyperspectral-based method. The study by Robin Vosahlo et al. evaluated the performance of hyperspectral-based combined with machine learning in diagnosing early occlusal lesion detection and discrimination from stains [30]. Although hyperspectral techniques have achieved promising accuracy, the high cost of the equipment and the excessive acquisition time limit its application to community and home caries screening. 3) Multimodal imaging-based method. Cheng Wang et al. constructed a portable dental imaging system with dual channels of white light and fluorescence combined with deep learning to obtain high diagnostic performance [31]. Although the multimodal-based method can improve performance, it still does not solve the difficulty of device dependency, which will limit its application in general.

Given the above discussion and the need for community and home caries screening tools in terms of accuracy, portability, and low cost, this study proposes a method for automatic diagnosis based on fluorescence spectral sub-band imaging combined with deep learning. The method first collects caries’ fluorescence imaging information under different wavebands to form a 6-channel fluorescence image. It then performs automatic diagnosis through a 2-D-3-D hybrid convolutional neural network (CNN) combined with the attention mechanism. The proposed method has high accuracy and can be applied to different smartphones, potentially in caries screening.

2. Materials and methods

2.1 Tooth specimens

Eighty-three dental specimens were obtained from eighty-three different individuals at the Department of Endodontics, Shanghai Stomatological Hospital, Fudan University. These procedures complied with the Declaration of Helsinki (2013 version). The appropriate institutional review board approved this study. Teeth are extracted due to various oral diseases. Tooth specimens were preserved in a 10% formalin solution which can retain most of the fluorescence effect [32]. The time from tooth extraction to test completion was limited to two weeks.

2.2 Six-channel fluorescence image acquisition

The first stage of the proposed method is to obtain 6-channel fluorescence images. As shown in Fig. 1, the hyperspectral fluorescence images of dental caries at different wavebands (470-780 nm, 580-780 nm) were acquired. Notedly, the filters were used here to eliminate the excitation light, and the specific band ranges were chosen according to caries and dental calculus fluorescence spectral features [4,22,25,27]. The experimental equipment includes a 405 nm LED light source (6 W, FHWM 15 nm), a 470 nm high-pass filter, a 580 nm high-pass filter, a hyperspectral camera (SR-5000, TOPCON), and a computer with pre-installed software. The computer with pre-installed software can obtain hyperspectral fluorescence image data cubes. During the experiment, step 1: a high-pass filter of 470 nm, was placed in front of the hyperspectral camera to obtain image data cubes in the 470-780 nm band. Step2: the 580 nm high-pass filter was placed in front of the hyperspectral camera lens to obtain hyperspectral fluorescence data from 580-780 nm. The obtained image cubes have a spatial resolution of 1376 × 1024, and spectral information is 470-780 nm band and 580-780 nm band with 10 nm interval, respectively. As described in Eqs. (1) and (2):

(1)$$HSI({x,y,{\lambda_{470 - 780}}} )= SPD({x,y,{\lambda_{380 - 780}}} )\times {T_1}({{\lambda_{470 - 780}}} )$$

(2)$$HS{I^{\prime}}({x,y,{\lambda_{580 - 780}}} )= SPD({x,y,{\lambda_{380 - 780}}} )\times {T_1}({{\lambda_{470 - 780}}} )\times {T_2}({{\lambda_{580 - 780}}} )$$

Fig. 1. Schematic diagram of 6-channel fluorescence image acquisition.

Download Full Size | PDF

In Eq. (1)–(2), HSI and HSI’ represent hyperspectral data cubes; SPD is the fluorescence spectral power distribution of dental caries and calculus; x and y are the coordinates of the pixel points. λ represents the number of spectral bands of the spectrum, and T₁ and T₂ are the transmittance of the high-pass filter.

The RGB image can be accurately simulated from the hyperspectral image according to the specified linear photoelectric transfer relationship because the hyperspectral image and the RGB image are physically correlated, as in Eq. (3)–(4) [33–36]. RGB cameras are device-dependent, meaning that different RGB cameras may capture the same scene differently due to RGB cameras may have different spectral sensitivity functions (SSFs), exposure times, and white balances [37]. Consequently, in our experiments, we keep the exposure times consistent and generate virtual RGB images that approximate the real images by Eq. (3)–(4). The SSF we employed was the CIE 1931 color-matching function.

(3)$$RGB({x,y,3} )= HSI({x,y,{\lambda_{470 - 780}}} )\times SSF({\lambda _{470 - 780}},3)$$

(4)$${R^{\prime}}{G^{\prime}}{B^{\prime}}({x,y,3} )= HS{I^{\prime}}({x,y,{\lambda_{580 - 780}}} )\times SSF({\lambda _{580 - 780}},3)$$

In Eq. (3)–(4), RGB and R'G'B’ represent channel reading; SSF is the spectral sensitivity function.

As shown in Fig. 1, after obtaining fluorescence images at different wavebands (470-780 nm and 580-780 nm) by calculation, the two images are concatenated in the spectral dimension to obtain the RGBR'G'B’ fluorescence image with six channels.

2.3 Dataset

Category: This study is an image segmentation task involving 5 classifications. According to ICDAS II standards, the tooth surface can be divided into four caries stages and calculus: ICDAS II 0, ICDAS II 1-2, ICDAS II 3-4, ICDAS II 5-6, and calculus. Notably, different categories may exist on the same tooth surface, as shown in Fig. 2.

Fig. 2. Two different labeling methods. (a) Pixel-level precision labeling; (b) Region of interest (ROI) labeling combined with pixel blocks extraction.

Download Full Size | PDF

Labeling: Supervised deep learning requires labeling the dataset prior to training. The conventional data labeling method is pixel-level precise labeling, as shown in Fig. 2(a). Each of the different categories of regions in the fluorescent image is precisely annotated by a professional dentist using a labeling tool. Although this pixel-level labeling facilitates the training of end-to-end networks, it suffers some shortcomings. First, this labeling method is labor-intensive and time-consuming, requiring precise annotation of every region and category in the image, including the boundaries, which may lead to marking errors in uncertain areas [38]. In addition, the number of healthy enamel regions is much greater than that of carious and calculus regions, which leads to imbalances among the sample sizes of the different categories, further causing the trained model to predict the accuracy of each category with relatively large deviations. This study uses the region of interest (ROI) labeling combined with the pixel block extraction method. As seen in Fig. 2(b), first, the dentist annotated different categories of ROIs on the fluorescent images according to ICDAS II standards. Notably, the dentist does not need to annotate each region and boundary of the tooth surface here, only marking the obvious categories, which saves annotation time and eliminates uncertain annotations. Subsequently, pixel blocks with 5 × 5 × 6 size were extracted from the annotated ROIs to form the dataset. The label of each pixel block is consistent with the label of the ROIs. The motivation for extracting a pixel block is considering both spectral and neighborhood features for the pixel point. The too-large size of pixel blocks will increase the computational cost, and too-small pixel blocks cannot be extracted enough useful information by CNN, so a spatial size of 5 × 5 is used in this paper.

Building dataset: The distribution of the dataset is listed in Table 1. First, eighty-three dental specimens were divided into fifty-four for training, eight for validation, sixteen for testing, and five to validate the model's generalization ability. Although each tooth specimen had five surfaces, most of the surfaces were healthy enamel, and to reduce data redundancy, only 1-2 tooth surfaces were measured for each tooth specimen. A total of 125 tooth surfaces were examined. Finally, 2380-pixel blocks with different categories were extracted from the 120 examined tooth surfaces labeled by the professional dentist according to Fig. 2(b). Three tooth surfaces validating the model's generalization ability were labeled according to Fig. 2(a). The three pixel-level finely labeled images were used as ground truth (GT) for comparison with the visualization results, as discussed in Section 3.4 below. Two tooth surfaces were employed to test the performance of the proposed method transferred to different smartphones, which is introduced in Section 4.2. The above process ensures that the training, validation, and testing set do not overlap throughout the model construction process.

Table 1. Dataset distribution

View Table | View all tables in this article

Data Enhancement: For improving the generalization ability of the network, the RGBR'G'B’ gain of the pixel blocks in the training set is first fixed at 1, and then the RGBR'G'B’ values are adjusted upward and downward by 25% respectively to simulate fluorescent images in different brightness scenes. The validation set and test set remain unchanged. The training samples were 4998 after data enhancement. The validation samples are 238, and the test sample is 476.

2.4 Network architectures for diagnosis

2-D-CNN-based methods usually limit their performance due to ignoring inter-layer feature correlation, especially when the spectral dimension contains a large amount of valid information [39]. The 2-D-3-D hybrid CNN-based method can balance well the performance and computational effort of the model [40]. In order to sufficiently exploit the spectral dimension information, we use the Efficient Channel Attention (ECA) module, which can involve a few parameters while bringing significant performance gains [41]. As shown in Fig. 3, the proposed network contains three 3D-CNNs, one ECA module, two 2D-CNNs, and three fully connected layers. The exact layer types are listed in Table 2. The input pixel blocks first enter three 3-D-CNN layers to be extracted for spectral and spatial features, then pass through the ECA module after shape change. ECA module captures local cross-channel interactions by considering each channel and its several neighbors [41]. Next, spatial feature learning is performed by two 2-D-CNN layers. Finally, classification diagnosis is made by three fully connected layers. Batch Normalization (BN) layers and activation functions ReLU are appended to each convolution process. The dropout layer is added to the fully connected layer to prevent overfitting.

Fig. 3. The network structure of 2-D-3-D hybrid CNN combined with the attention mechanism.

Download Full Size | PDF

Table 2. The proposed network exact layer type

View Table | View all tables in this article

As seen in Fig. 4, the input is a whole fluorescent image during the test, and two additional pre-processing operations are required before the image enters the model. 1) The separation of tooth surface and background. The background is not included as a category in the proposed method and cannot participate in the calculation process. Otsu thresholding segmentation is employed as a tool to segment the tooth surface and the background, which is one of the most widely used image segmentation methods [42,43]. 2) Pixel block extraction. Each pixel point on the tooth surface is segmented into blocks of 5 × 5 × 6 sizes to meet the requirements of the proposed network input.

Fig. 4. The prediction process of the whole fluorescence images during the test.

Download Full Size | PDF

2.5 Metrics

The model performance was evaluated with accuracy, sensitivity, specificity, and F1 score. The more effective the diagnosis, the closer the accuracy, specificity, sensitivity, and F1 score of the model are to 1.

(5)$$ACC = \frac{{({TP + TN} )}}{{({TP + FN + TN + FP} )}}$$

(6)$$TPR = \frac{{TP}}{{TP + FN}}$$

(7)$$TNR = \frac{{TN}}{{TN + FP}}$$

(8)$$F1 = {\left( {\frac{{2 + \frac{{FP}}{{TP}} + \frac{{FN}}{{TP}}}}{2}} \right)^{ - 1}}$$

In Eq. (4)–(7), ACC, TPR, TNR, and F1 denote accuracy, sensitivity, specificity, and F1 score, respectively. TP and TN represent the sample sizes correctly diagnosed as positive and correctly diagnosed as negative. FP and FN represent the sample sizes incorrectly diagnosed as positive and incorrectly diagnosed as negative.

3. Experiments and results

3.1 Implementation details

All model training and testing were implemented on the public PyTorch platform and NVIDIA RTX 3090 GPU with 24GB memory. The data pre-processing was conducted by MATLAB 2020. The loss function used the cross-entropy function, and the optimizer chose Adam. The learning rate was set to 0.0001. Early stop mechanisms were used during training. The batch size was 256. The training epochs were 200, and the epochs of the early stop were 30.

3.2 Different input experiments and ablation study

In order to validate that 6-channel fluorescence images (RGBR'G'B’) provide more useful information to the network compared to 3-channel fluorescence images (RGB or R'G'B’), a comparison of different input experiments was conducted. Furthermore, ablation studies were conducted to verify the impact of the 2-D-3-D hybrid CNN and ECA module on performance. The corresponding hyperspectral fluorescence images (HSI) were also used for comparison and ablation study. For 3-channel fluorescent image input (RGB or R'G'B’), the 2-D-CNN was used as a baseline due to the small and conventional number of channels, and the impact of adding an ECA module on performance was verified. The 3-D-CNN was used as a baseline for 6-channel fluorescence images (RGBR'G'B’) and hyperspectral images (HSI) while verifying the effect of the 2-D-3-D hybrid structure and ECA module on the model performance. The results of the different inputs performance comparison and ablation study are listed in Table 3. On the baseline model, the performance of the 6-channel input is higher than the 3-channel input and less than the HSI input. With the same model, the RGB input performs better than R'G'B’ since RGB is mapped from a broader spectral band. For the same input, both the ECA module and the 2-D-3-D hybrid CNN structure can significantly improve the model's performance. The proposed 6-channel input coupled with 2-D-3-D hybrid CNN combined with ECA achieves 96.43% accuracy, significantly higher than the 3-channel input. Other metrics have the same results, which indicates that the 6-channel input can provide more information that can help diagnose. Although the model performance of the 6-channel input is lower than that of the HSI input, the 6-channel image acquisition cost is much lower than that of the hyperspectral image.

Table 3. Performance comparison of different inputs and ablation study results^a

View Table | View all tables in this article

3.3 Comparison with other methods

As listed in Table 4, the proposed method has a competitive performance compared to other similar methods. Diagnostic categories and ACC are employed to compare the performance of different methods. The model's performance is positively correlated with the diagnostic category and ACC. The literature [44] employed panoramic radiographs combined with deep learning to diagnose carious and non-carious teeth, and the accuracy is 87%. It is further stated that caries diagnosis based on radiological methods does not apply to early caries [8]. The literature [29] used smartphones and machine learning methods to build a three-classification caries diagnosis model with an accuracy of 92.4%. In the literature [45], an in vivo caries diagnosis study was performed based on smartphones combined with deep learning, and the model achieved two classifications, and the accuracy was 81%. These results demonstrate that the diagnostic accuracy of the RGB 3-channel imaging-based method is low. Although literature [47] used multispectral reflectance spectroscopy combined with machine learning to achieve 98.1% accuracy, only a 2-classification diagnosis was achieved. The literature [46] implemented a 5-classification diagnosis using Near-infrared (NIR) hyperspectral imaging combined with machine learning, and the accuracy was 95.8%. The proposed method achieves five classifications, including calculus diagnosis, with 96.4% accuracy. The proposed method of sub-band fluorescence imaging combined with deep learning offers close to hyperspectral and multispectral diagnostic performance while having advantages in cost and portability.

Table 4. Comparison of the diagnostic effects of different methods ^a

View Table | View all tables in this article

3.4 Visualization results

In order to analyze the diagnostic performance of different inputs for stages of caries and calculus in more detail, Receiver Operating Characteristic (ROC) curves were employed as a visualization tool. For the 3-channel input of RGB and R'G'B’, the 2-D-CNN combined with the ECA module was used, and for the 6-channel RGBR'G'B’ input and HSI input, the 2-D-3-D hybrid CNN combined with ECA module was performed. The ROC curves of the five stages of caries and calculus diagnosis results are shown in Fig. 5. All inputs are sensitive to the detection of normal tooth enamel (ICDAS II 0). For ICDAS II 1-2 and ICDAS II 3-4, the R'G'B’ input has the worst recognition ability. The other three have good recognition ability, indicating the useful fluorescence band for diagnosing early caries is 470-580 nm. In diagnosing ICDAS II 5-6 and dental calculus, the R'G'B’ input outperformed the RGB input, indicating that the 580-780 nm fluorescence band provided adequate identification information. Overall, HSI can identify the stages of caries and calculus well, as it contains further useful information. The 6-channel RGBR'G'B’ input performs second best and outperforms the 3-channel RGB input and R'G'B’ input.

Fig. 5. Receiver Operating Characteristic (ROC) curves for different stages of dental caries and calculus.

Download Full Size | PDF

Pseudo-color plots are applied to compare the effects of different inputs on diagnostic performance qualitatively and verify the model's generalization ability. Three dental fluorescence images were selected which do not overlap with the training set, validation set, and test set samples in Section 2.3. The professional dentist annotated the three fluorescent images with pixel-level precision according to Fig. 2(a). The annotated results act as Ground Truth (GT). The model applied for different inputs is the same as those that generate the ROC curves. As shown in Fig. 6, the HSI input produces results closest to GT. The performance of RGBR'G'B’ input produces slightly lower results than that of HSI input. The RGB input and R'G'B’ input produce the poor performance of the prediction results, specifically in terms of correct classification rate and boundaries. The conclusions obtained are consistent with that generated by the ROC curves above.

Fig. 6. Visual comparison between four inputs for the diagnosis of dental caries and calculus. Different colors represent different stages of dental caries and calculus.

Download Full Size | PDF

4. Transfer to different RGB camera devices

4.1 Virtual fluorescence image validation

RGB camera devices, including smartphones, and digital cameras, have device dependencies, meaning that different devices may capture the same scene differently. RGB camera devices capture images can be divided into two processes: first, the raw image (raw-RGB data) is generated by the sensor based on the light response as shown in Eq. (1)–(4). Then the raw image is passed through the Image Signal Processing (ISP) to form the final image (sRGB data) [48]. The main factors that affect the raw-RGB data are SSF and light exposure [37]. The effect of light exposure on the raw-RGB data can be eliminated to some extent by setting the same light exposure or by model generalization. However, different SSF has significant effects on the raw-RGB data. MICA Toolbox, developed by Vision Ecologist, provides SSFs of common cameras [49]. Figure 7 shows the SSFs of six different cameras, and it can be seen that these SSFs have differences in peak and Full Width at Half Maximum (FWHM), which will cause different raw-RGB data. We use this publicly available data to verify the performance of the proposed method transferred to different RGB camera devices. Specifically, according to Eq. (1)–(4), the acquired hyperspectral data were utilized to obtain virtual RGBR'G'B’ 6-channel fluorescence images acting with six SSFs. Then the dataset was built according to the previous annotation, and the 2-D-3-D hybrid CNN combined with the ECA module was employed to build the diagnostic model. The test results are listed in Table 5. The mean accuracy of the SSFs was 96.82%, which was close to the results of Section 3.2. Among the six SSFs, the red channel sensitive curve with larger FWHM corresponds to higher performance, such as Nikon D7000 CoastalOpt 105 mm, Samsung NX1000 Nikkor EL 80 mm, and Sony A7 Nikkor EL 80 mm. These results indicate that a larger FWHM of the red channel sensitive curve positively affects the model performance.

Fig. 7. Spectral sensitivity function (SSF) of six different cameras.

Download Full Size | PDF

Table 5. Performance of the proposed method under the action of spectral sensitivity functions of six different cameras^a

View Table | View all tables in this article

4.2 Actual fluorescence image validation

To further validate the performance of the proposed method, an experiment was conducted to capture actual fluorescence images of dental samples using different smartphones. The experiment consists of the SSFs measurement and the actual fluorescence image acquisition. Specifically,1) SSFs measurement. The SSFs of two smartphones (Redmi 10X and HUAWEI P30) were measured using 22 narrow-band light sources according to the method in the literature [50,51]. These narrow-band light sources are generated by a spectral calibration light source (FT-2300, Labsphere) and a high-precision programmable power supply (LPS-600, Labsphere). As shown in Fig. 8, (a)-(b), the SSFs of the two smartphones have a similar trend but differ in peak ratio. The measured SSFs are combined with the tooth fluorescence spectroscopy dataset in Section 2.3 can obtain the prediction model by training. Table 5 shows the performance metrics of the prediction model on the test dataset. 2) The actual fluorescence image acquisition. As illustrated in Fig. 8(c), the two smartphones were used to acquire fluorescence images of two tooth samples at different wavebands (470-780 nm and 580-780 nm), respectively. The professional mode was adopted for capturing fluorescent images, and parameters such as shutter and sensitivity were set the same for each smartphone. A customized image calibration procedure aligned the acquired fluorescence images (RGB and R'G'B’) to eliminate pixel point shifts due to dithering during the shooting process. Finally, the two fluorescent images were concatenated as RGBR'G'B’ image, which was taken as the input to the deep learning model. In addition, hyperspectral images of the two tooth samples were acquired for comparison.

Fig. 8. (a), (b) The SSFs of two consumer smartphones measured experimentally; (c) The schematic diagram of the experiment to capture actual fluorescence images.

Download Full Size | PDF

Figure 9 shows the visual comparison of the model's predicted results for actual fluorescence images captured by two consumer smartphones. The first two columns are the actual fluorescence images for the input prediction model. The caries stages and calculus were labeled in the ROIs of the actual images. The prediction results were compared for four different inputs, HSI, RGBR'G'B’, RGB, and R'G'B’. The highest-performance neural network architecture in Table 3 was used for different inputs. The predicted results from the hyperspectral image input have the highest performance that can accurately identify the category of caries stages and calculus in ROIs. The RGBR'G'B’ image input can accurately identify the category of the ROIs but is less accurate than the hyperspectral input in terms of boundaries. RGB or R'G'B’ inputs are deficient in recognizing the category and boundaries of ROIs. RGB input is insensitive to moderate caries, heavy caries, and calculus, which are easily misidentified. In contrast, R'G'B’ input is not sensitive to early caries. These are consistent with the previous results. In summary, for actual fluorescent images, the proposed method still has a superior performance.

Fig. 9. Visual comparison of the model's predicted results for actual fluorescence images captured by two consumer smartphones. The fluorescent images captured by the cameras in the 470-780 nm band have labels in the ROIs. Different colors represent different stages of dental caries and calculus.

Download Full Size | PDF

4.3 Potential application flow

The proposed method preserves the original hyperspectral data and can build a 6-channel RGBR'G'B’ dataset similar to real scenes based on the SSFs of different RGB camera devices. Combined with the powerful power of deep learning, it is expected to solve the device-dependence problem of RGB cameras, which provides a new way of thinking and a new approach to promoting community and home caries screening. Figure 10 shows a schematic diagram of the application of the proposed method on different smartphones. After obtaining the smartphone type and corresponding SSF, the cloud will train the diagnostic model using a hyperspectral database combined with deep learning and pass it to the smartphone. The smartphone will collect fluorescent images combined with customized filters and light source devices, and finally, the model outputs the diagnostic results.

Fig. 10. Schematic diagram of the application of the proposed method to different smartphones.

Download Full Size | PDF

5. Discussion

5.1 Analysis of the reasons for the validity of six-channels RGBR'G'B’ fluorescence imaging

The proposed method first collected fluorescence images of caries in the 470-780 nm band and 580-780 nm band through two filters. Then the two fluorescence images were concatenated in the spectral dimension to form the 6-channel RGBR'G'B’ fluorescence image, which was taken as the input to the diagnostic model. Compared with the 3-channel fluorescence images (RGB or R'G'B’), the 6-channel fluorescence images have better diagnostic performance because more valid information is provided. With the development of dental caries, the corresponding fluorescence spectrum will change in two places: 1) the peak in the 480-510 nm band decreases continuously, which is due to the gradual destruction of healthy tooth enamel structure by caries, resulting in the decrease of incident light absorption and increase of scattering; 2) 1-2 emission peaks will be formed gradually in the 590-710 nm band, which is due to the accumulation of the content of bacteria and its metabolite porphyrins with the development of caries [27]. Researchers have found that long wavelength bands (near 680 nm and 700 nm) are more sensitive for diagnosing dentin caries and calculus compared to short wavelength bands when studying hyperspectral or multispectral fluorescence spectra of caries and calculus [22,28]. Inspired by this phenomenon, we fused imaging information in the 470-780 nm band and 580-780 nm band to form 6-channel fluorescence images, which provided more effective information for diagnosing dental calculus and dentin caries. Figure 5 and Fig. 6 show that the ROC curves and visualization comparison results indicate the advantage of 6-channel input for dentin caries and calculus diagnosis.

5.2 Analysis of advantages and disadvantages

The proposed method has higher diagnostic performance than the RGB camera device. For model construction, the 2-D-3-D hybrid CNN combined with an ECA module is employed to build the neural network, which improves the model's performance compared with the traditional 2-D-CNN or 3-D-CNN. In addition, to address the device-dependent problem of RGB cameras, the proposed method embeds a device containing two filters and an LED light source on a smartphone. It uses the hyperspectral and SSF databases stored in the cloud for deep learning model training, which can enable the diagnosis of dental caries using different smartphones. The proposed method is advantageous over existing methods in terms of cost, accuracy, and applicability. It has the potential to develop diagnostic tools for dental caries in the community and at home.

Here also exist some limitations. First, The RGBR'G'B’ fluorescence image input combined with the deep learning model can accurately identify caries stages and calculus classes but needs more precision regarding boundaries. It is partly because RGBR'G'B’ fluorescence images lack spectral detail information compared to hyperspectral fluorescence images. On the other hand, the ROI labeling combined with the pixel block extraction method used in this study may lead to the loss of boundary features. Further optimization of labeling methods and algorithms may improve the accuracy of boundary recognition. Next, two filters and an LED light source have not been developed as an integrated module for embedding on the smartphone. Then, smartphone manufacturers generally do not publicize SSFs and must build the corresponding test method to optimize the SSF database in the cloud. Measuring SSFs using multiple narrow-band light sources is complex and lengthy, and is only suitable for laboratory testing. Developing a simpler method for estimating SSFs would be more beneficial to the application of the proposed method. Finally, manufacturers have differences in processing raw-RGB data, called ISP, for example, white balance and color space transformation. Therefore, the proposed method needs to process raw-RGB data to be well applied to different phones, but not each smartphone can allow users to access raw-RGB data. Subsequently, we will work on developing the overall embedded device and exploring more convenient methods to evaluate the SSFs of the smartphone.

6. Conclusion

The development of rapid, objective, high-precision, low-cost, and portable caries diagnostic tools will facilitate caries screening in the community and at home. Given the challenges of the high price of hyperspectral devices and the low accuracy and device dependency of smartphones, we propose a modeling method of sub-band fluorescence imaging combined with deep learning. The method has close to the hyperspectral performance and the potential to transfer to different smartphones providing new ideas and methods for developing community and home dental caries diagnostic tools.

Funding

National Natural Science Foundation of China (62275051); State Key Laboratory of Applied Optics (SKLAO2021001A16); School of Pharmacy, Fudan University (yg2021-028).

Acknowledgments

This work was financially supported by National Natural Science Foundation [62275051]; State Key Laboratory of Applied Optics [SKLAO2021001A16]; Medical Engineering Fund of Fudan University [yg2021-028].

Disclosures

There are no conflicts to declare.

Data availability

The data that support the findings of this study are available on request from the corresponding author.

References

1. L. Cheng, L. Zhang, L. Yue, J. Ling, M. Fan, D. Yang, and X. Zhou, “Expert consensus on dental caries management,” Int J Oral Sci 14(1), 17 (2022). [CrossRef]

2. L. James, D. Abate, H. Abate, et al., “Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017,,” The Lancet 392, 1789 (2018). [CrossRef]

3. GBD 2017 Oral Disorders Collaborators, “Global, regional, and national levels and trends in burden of oral conditions from 1990 to 2017: a systematic analysis for the global burden of disease 2017 study,” J. Dental Res. 99, 362 (2020). [CrossRef]

4. O.-H. Tung, S.-Y. Lee, Y.-L. Lai, and H.-F. Chen, “Characteristics of subgingival calculus detection by multiphoton fluorescence microscopy,” J. Biomed. Opt. 16(6), 066017 (2011). [CrossRef]

5. S. Gonchukov, T. Biryukova, A. Sukhinina, and Y. Vdovin, “Fluorescence detection of dental calculus,” Laser Phys. Lett. 7(11), 812–816 (2010). [CrossRef]

6. J. E. Frencken, P. Sharma, L. Stenhouse, D. Green, D. Laverty, and T. Dietrich, “Global epidemiology of dental caries and severe periodontitis - a comprehensive review,” J Clin Periodontol 44, S94–S105 (2017). [CrossRef]

7. N. B. Pitts, D. T. Zero, P. D. Marsh, K. Ekstrand, J. A. Weintraub, F. Ramos-Gomez, J. Tagami, S. Twetman, G. Tsakos, and A. Ismail, “Dental caries,” Nat. Rev. Dis. Primer 3(1), 17030 (2017). [CrossRef]

8. R.H. Selwitz, A.I. Ismail, and N.B. Pitts, “Dental caries,” The Lancet 369, 52–59 (2007). [CrossRef]

9. V. Majanga and S. Viriri, “Automatic blob detection for dental caries,” Applied Sciences 11, 9232 (2021). [CrossRef]

10. F. F. Demarco, K. Collares, F. H. Coelho-de-Souza, M. B. Correa, M. S. Cenci, R. R. Moraes, and N. J Opdam, “Anterior composite restorations: A systematic review on long-term survival and reasons for failure,” Dent. Mater. 31(10), 1214–1224 (2015). [CrossRef]

11. N. Abogazalah and M. Ando, “Alternative methods to visual and radiographic examinations for approximal caries detection,” J. Oral Sci. 59(3), 315–322 (2017). [CrossRef]

12. A. C. T. Ko, M. Hewko, M. G. Sowa, C. C. Dong, and B Cleghorn, “Detection of early dental caries using polarized Raman spectroscopy.,” Opt. Express 14(1), 203–215 (2006). [CrossRef]

13. J. Cai, M. Guang, J. Zhou, Y. Qu, H. Xu, Y. Sun, and X. Wu, “Dental caries diagnosis using terahertz spectroscopy and birefringence,” Opt. Express 30(8), 13134–13147 (2022). [CrossRef]

14. F. Casalegno, T. Newton, R. Daher, M. Abdelaziz, A. Lodi-Rizzini, F. Schürmann, I. Krejci, and H. Markram, “Caries Detection with Near-Infrared Transillumination Using Deep Learning,” J. Dent. Res. 98(11), 1227–1233 (2019). [CrossRef]

15. N. Miyamoto, T. Adachi, F. Boschetto, M. Zanocco, T. Yamamoto, E. Marin, S. Somekawa, R. Ashida, W. Zhu, N. Kanamura, I. Nishimura, and G. Pezzotti, “Molecular Fingerprint Imaging to Identify Dental Caries Using Raman Spectroscopy,” Materials 13(21), 4900 (2020). [CrossRef]

16. E.-S. Kim, E.-S. Lee, S.-M. Kang, E.-H. Jung, E. d. Josselin de Jong, H.-I. Jung, and B.-I. Kim, “A new screening method to detect proximal dental caries using fluorescence imaging,” Photodiagn. Photodyn. Ther. 20, 257–262 (2017). [CrossRef]

17. E. Betrisey, N. Rizcalla, I. Krejci, and S. Ardu, “Caries diagnosis using light fluorescence devices: VistaProof and DIAGNOdent,” Odontology 102(2), 330–335 (2014). [CrossRef]

18. M. Melo, A. Pascual, I. Camps, Á del Campo, and J. Ata-Ali, “Caries diagnosis using light fluorescence devices in comparison with traditional visual and tactile evaluation: a prospective study in 152 patients,” Odontology 105(3), 283–290 (2017). [CrossRef]

19. Y. Iwami, A. Shimizu, H. Yamamoto, M. Hayashi, F. Takeshige, and S. Ebisu, “In vitro study of caries detection through sound dentin using a laser fluorescence device, DIAGNOdent: Caries detection using laser fluorescence,” Eur. J. Oral Sci. 111(1), 7–11 (2003). [CrossRef]

20. Q. G. Chen, H. H. Zhu, Y. Xu, B. Lin, and H. Chen, “Quantitative method to assess caries via fluorescence imaging from the perspective of autofluorescence spectral analysis,” Laser Phys. 25(8), 085601 (2015). [CrossRef]

21. M.-A. I. Timoshchuk, J. S. Ridge, A. L. Rugg, L. Y. Nelson, A. S. Kim, and E. J. Seibel, “Real-time porphyrin detection in plaque and caries: a case study,” in P. Rechmann and D. Fried, eds. (2015), p. 93060C.

22. S. P. Singh, P. Fält, I. Barman, A. Koistinen, R. R. Dasari, and A. M. Kullaa, “Objective identification of dental abnormalities with multispectral fluorescence imaging,” J. Biophotonics 10(10), 1279–1286 (2017). [CrossRef]

23. A. C. Ribeiro Figueiredo, C. Kurachi, and V. S. Bagnato, “Comparison of fluorescence detection of carious dentin for different excitation wavelengths,” Caries Res. 39(5), 393–396 (2005). [CrossRef]

24. B. Joseph, C. S. Prasanth, J. L. Jayanthi, J. Presanthila, and N. Subhash, “Detection and quantification of dental plaque based on laser-induced autofluorescence intensity ratio values,” J. Biomed. Opt 20(4), 048001 (2015). [CrossRef]

25. S.-A. Son, K.-H. Jung, C.-C. Ko, and Y. H. Kwon, “Spectral characteristics of caries-related autofluorescence spectra and their use for diagnosis of caries stage,” J. Biomed. Opt 21(1), 015001 (2016). [CrossRef]

26. Q. Chen, H. Zhu, Y. Xu, B. Lin, and H. Chen, “Discrimination of dental caries using colorimetric characteristics of fluorescence spectrum,” Caries Res. 49(4), 401–407 (2015). [CrossRef]

27. C. Wang, R. Zhang, Y. Jiang, J. Li, N. Liu, L. Wang, P. Wu, J. He, Q. Yao, and X. Wei, “Fluorescence Spectrometry based Chromaticity Mapping, Characterization, and Quantitative Assessment of Dental Caries,” Photodiagnosis Photodyn. Ther. 37, 102711 (2022). [CrossRef]

28. A. L. Abdel Gawad, Y. El-Sharkawy, H. S. Ayoub, A. F. El-Sherif, and M. F. Hassan, “Classification of dental diseases using hyperspectral imaging and laser induced fluorescence,” Photodiagn. Photodyn. Ther. 25, 128–135 (2019). [CrossRef]

29. D. L. Duong, M. H. Kabir, and R. F. Kuo, “Automated caries detection with smartphone color photography using machine learning,” Health Informatics J. 27(2), 146045822110075 (2021). [CrossRef]

30. R. Vosahlo, J. Golde, J. Walther, E. Koch, C. Hannig, and F. Tetschke, “Differentiation of occlusal discolorations and carious lesions with hyperspectral imaging in vitro,” Appl. Sci. 12(14), 7312 (2022). [CrossRef]

31. C. Wang, H. Qin, G. Lai, G. Zheng, H. Xiang, J. Wang, and D. Zhang, “Automated classification of dual channel dental imaging of auto-fluorescence and white lightby convolutional neural networks,” J. Innov. Opt. Health Sci. 13(04), 2050014 (2020). [CrossRef]

32. P. Francescut, B. Zimmerli, and A. Lussi, “Influence of different storage methods on laser fluorescence values: a two-year study,” Caries Res. 40(3), 181–185 (2006). [CrossRef]

33. L. -J. Zhang, J. Jiang, H. Jiang, J. -J. Zhang, and X. Jin, Improving Training-based Reflectance Reconstruction via White-balance and Link Function, 2018 37th Chinese Control Conference (CCC), 2018, pp. 8616–8621.

34. Berk Kaya, Yigit Baran Can, and Radu. Timofte, Towards spectral estimation from a single RGB image in the wild. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). IEEE, 2019.

35. Biebele Joslyn Fubara, Mohamed Sedky, and David Dyke, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 480–481.

36. B. Arad, R. Timofte, O. Ben-Shahar, Y.-T. Lin, G. Finlayson, and S. Givati, “NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image,” (2020).

37. X. Gao, T. Wang, J. Yang, J. Tao, Y. Qiu, Y. Meng, B. Mao, P. Zhou, and Y. Li, “Deep-learning-based hyperspectral imaging through a RGB camera,” J. Electron. Imag.30(05), (2021).

38. L. Ju, X. Wang, X. Zhao, P. Bonnington, T. Drummond, and Z Ge, “Leveraging regular fundus images for training UWF fundus diagnosis models via adversarial learning and pseudo-labeling,” IEEE Transactions on Medical Imaging 40(10), 2911–2925 (2021). [CrossRef]

39. J. Li, C. Wu, R. Song, Y. Li, W. Xie, L. He, and X. Gao, “Deep Hybrid 2-D-3-D CNN Based on Dual Second-Order Attention With Camera Spectral Sensitivity Prior for Spectral Super-Resolution,” IEEE Trans. Neural Netw. Learning Syst.1–12 (2021).

40. S. K. Roy, G. Krishna, S. R. Dubey, and B. B. Chaudhuri, “HybridSN: Exploring 3-D–2-D CNN Feature Hierarchy for Hyperspectral Image Classification,” IEEE Geosci. Remote Sensing Lett. 17(2), 277–281 (2020). [CrossRef]

41. Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks, (2020).

42. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern. 9(1), 62–66 (1979). [CrossRef]

43. Changhua Lu, Ping Zhu, and Yong Cao, “The segmentation algorithm of improvement a two-dimensional Otsu and application research,” in 2010 2nd International Conference on Software Technology and Engineering (IEEE, 2010), p. 5608908.

44. S. Vinayahalingam, S. Kempers, L. Limon, D. Deibel, T. Maal, M. Hanisch, S. Bergé, and T. Xi, “Classification of caries in third molars on panoramic radiographs using deep learning,” Sci. Rep. 11(1), 12609 (2021). [CrossRef]

45. M. T. G. Thanh, N. Van Toan, V. T. N. Ngoc, N. T. Tra, C. N. Giap, and D. M. Nguyen, “Deep Learning Application in Dental Caries Detection Using Intraoral Photos Taken by Smartphones,” Appl. Sci. 12(11), 5504 (2022). [CrossRef]

46. P. Usenik, M. Bürmen, A. Fidler, F. Pernuš, and B. Likar, “Automated Classification and Visualization of Healthy and Diseased Hard Dental Tissues by Near-Infrared Hyperspectral Imaging,” Appl. Spectrosc. 66(9), 1067–1074 (2012). [CrossRef]

47. A. Procházka, J. Charvát, O. Vyšata, and D. Mandic, “Incremental deep learning for reflectivity data recognition in stomatology,” Neural Comput. Appl. 34(9), 7081–7089 (2022). [CrossRef]

48. L. Yahiaoui, J. Horgan, S. Yogamani, and C. Hughes, “Impact analysis and tuning strategies for camera image signal processing parameters in computer vision,” Irish Machine Vision and Image Processing conference (IMVIP), Vol. 2 (2011).

49. C. P. van den Berg, J. Troscianko, J. A. Endler, N. J. Marshall, and K. L. Cheney, “Quantitative Colour Pattern Analysis (QCPA): A comprehensive framework for the analysis of colour patterns in nature,” Methods Ecol. Evol. 11(2), 316–332 (2020). [CrossRef]

50. P. L. Vora and J. E. Farrell, “Digital color cameras—2—Spectral response,” Tech. Report (1997).

51. M. M. Darrodi, G. Finlayson, T. Goodman, and M. Mackiewicz, “Reference data set for camera spectral sensitivity estimation,” JOSA A , 32(3), 381–391 (2015). [CrossRef]

Dataset	No. of tooth specimens	No. of tooth surfaces measured	No. of extracted pixel blocks (size:5 × 5 × 6)
Training Set	54	84	420 (ICDAS II 0)
			356 (ICDAS II 1,2)
			330 (ICDAS II 3,4)
			280 (ICDAS II 5,6)
			280 (Calculus)
Validation Set	8	12	60 (ICDAS II 0)
			52 (ICDAS II 1,2)
			46 (ICDAS II 3,4)
			40 (ICDAS II 5,6)
			40 (Calculus)
Test Set	16	24	120 (ICDAS II 0)
			102 (ICDAS II 1,2)
			94 (ICDAS II 3,4)
			80 (ICDAS II 5,6)
			80 (Calculus)
Generalization capability validation	5	5	-
Total	83	125	2380

Layer (type)	Output Shape
Input	(5, 5, 6, 1)
conv3d_1 (Conv3D + BN + ReLU)	(5, 5, 6, 32)
conv3d_2 (Conv3D + BN + ReLU)	(5, 5, 6, 64)
conv3d_3 (Conv3D + BN + ReLU)	(5, 5, 6, 96)
reshape_1 (Reshape)	(5, 5, 576)
eca_1 (ECA module)	(5, 5, 576)
conv2d_1 (Conv2D + BN + ReLU)	(5, 5, 128)
conv2d_2 (Conv2D + BN + ReLU)	(3, 3, 256)
flatten_1 (Flatten)	(2304)
fcn_1 (Full connection + Dropout)	(1000)
fcn_2 (Full connection + Dropout)	(500)
fcn_3 (Full connection + Dropout)	(5)

Input	Channels	Method	ACC (%) ↑	Sen (%) ↑	Spec (%) ↑	F1 (%) ↑
RGB	3	2-D-CNN	90.69	90.19	97.71	90.06
RGB	3	2-D-CNN + ECA	92.23	91.68	98.08	91.68

R'G'B'	3	2-D-CNN	85.22	84.73	96.32	84.64
R'G'B'	3	2-D-CNN + ECA	86.63	86.09	96.67	86.18

RGBR'G'B'	6	3-D-CNN	92.58	92.13	98.18	91.99
		2-D-3-D Hybrid CNN	94.33	94.01	98.61	93.87
		2-D-3-D Hybrid CNN + ECA	96.43	96.08	99.11	96.13

HSI	32	3-D-CNN	98.11	98.09	99.52	98.10
		2-D-3-D Hybrid CNN	98.53	98.48	99.62	98.55
		2-D-3-D Hybrid CNN + ECA	99.16	99.19	99.78	99.22

Detection method	Diagnostic Category	ACC (%) ↑
Panoramic radiographs and deep learning [44]	2-classes (caries and non-caries)	87.0
Smartphone images and machine learning [29]	3-classes (no surface change, visually non-cavitated, cavitated)	92.4
Smartphone images and deep learning [45]	2-classes (non-cavity and cavitated lesions)	81.0
Near-infrared (NIR) hyperspectral imaging and machine learning [46]	5-classes (enamel, dentin, calculus, enamel caries, dentin caries)	95.8
Multispectral and machine learning [47]	2-classes (healthy, caries)	98.1
Sub-band fluorescence imaging and deep learning (Ours)	5-classes(ICDAS II 0, ICDAS II 1-2, ICDAS II 3-4, ICDAS II 5-6, and calculus)	96.4

Spectral sensitivity function (SSF)	ACC (%) ↑	Sen (%) ↑	Spec (%) ↑	F1 (%) ↑
Canon 5D MKIII Tamron 150 to 600mm	96.64	96.38	99.17	96.39
Canon 700D 18 to 55mm	95.80	95.43	98.97	95.41
Nikon D7000 CoastalOpt 105mm	97.48	97.29	99.38	97.30
Olympus PEN E-PL5 Olympus 60mm	96.22	95.93	99.10	95.91
Samsung NX1000 Nikkor EL 80mm	98.53	98.50	99.63	98.48
Sony A7 Nikkor EL 80mm	97.06	96.88	99.27	96.88
Redmi 10X	96.85	96.50	99.22	96.53
HUAWEI P30	96.01	95.58	99.02	95.62
Mean	96.82	96.56	99.22	96.57

Deep learning and sub-band fluorescence imaging-based method for caries and calculus diagnosis embeddable on different smartphones

Abstract

1. Introduction

2. Materials and methods

2.1 Tooth specimens

2.2 Six-channel fluorescence image acquisition

2.3 Dataset

2.4 Network architectures for diagnosis

2.5 Metrics

3. Experiments and results

3.1 Implementation details

3.2 Different input experiments and ablation study

3.3 Comparison with other methods

3.4 Visualization results

4. Transfer to different RGB camera devices

4.1 Virtual fluorescence image validation

4.2 Actual fluorescence image validation

4.3 Potential application flow

5. Discussion

5.1 Analysis of the reasons for the validity of six-channels RGBR'G'B’ fluorescence imaging

5.2 Analysis of advantages and disadvantages

6. Conclusion

Funding

Acknowledgments

Disclosures

Data availability

References

Data availability

Cited By

Figures (10)

Tables (5)

Equations (8)

Biomedical Optics Express