Tumor tissue classification based on micro-hyperspectral technology and deep learning

Bingliang Hu; Bingliang Hu; Jian Du; Jian Du; Zhoufeng Zhang; Zhoufeng Zhang; Quan Wang; Quan Wang

doi:10.1364/BOE.10.006370

1. Introduction

At present, precision medicine research has become the focus of biomedical field. The new diagnosis and treatment of malignant tumor is one of the important research contents [1]. Classification based on the location and cytological characteristic of primary tumor can no longer meet the practical needs. Because in traditional histopathology [2], pathologists usually need to perform a series of work such as fixation and staining on pathological sections. The whole diagnosis process is cumbersome and time consuming. The workload is large and the diagnosis result is easily affected by human experience. With the rapid development of new medical imaging technology and medical analysis, future histopathology is expected to achieve greater progress and innovation on the basis of computer-aided analysis.

Hyperspectral imaging technology combines two traditional optical diagnostic methods of spectral analysis and optical imaging [3–5]. While medical microspectral imaging is an integrated crossover technology based on many disciplines such as clinical medicine, imaging, and pathological analysis. Spectral analysis can obtain a complete spectrum of a certain point on the biological tissue sample in the wavelength range of interest [6], and thereby analyze the chemical composition and physical characteristic of different pathological tissues; optical imaging technology provides spatial distribution information of each micro structure to achieve a visual representation of different pathological structures. Medical micro-hyperspectra combines two-dimensional image information with one-dimensional spectral signal into a three-dimensional datacube, which not only includes rich spatial information, but also contains spectral information that reflects biological tissue characteristics. In the visible and near infrared spectra, light absorption can reveal the state of vascular proliferation and high metabolism. Both of them are related to hemoglobin concentration and oxygen saturation, which are important markers of cancer [7].

With rapid development of hyperspectral imaging technology and precision medicine theory in recent years, the application of hyperspectral technology to close-range medical diagnosis has become a new research trend. For example, Liu collected micro-hyperspectral data of motor nerve cells and sensory nerve cells, and used improved Spectral Angel Mapping algorithm to classify [8]. However, the algorithm is only sensitive to transmission intensity and cannot identify subtle spectral differences. Ortega used SVM and ANN to distinguish healthy brain tissue from tumorous brain tissue in vitro [9]. The optimal sensitivity and specificity were over 92%. However, due to the lack of spatial information, accurate positioning and related medical interpretation of tumor tissue cannot be performed. Zhu proposed an identification method based on the characteristics of spectral regions and the average accuracy reached 95% [10]. However, the method used artificial extraction features (a total of 7 features), which is only validated on small sample datasets (50 samples per group), and not scalable enough. In general, due to the constraints of imaging performance [9], big data accumulation [10,11], and traditional spectral algorithms [8,12], micro-hyperspectral research is still in the preliminary exploratory stage [13,14]. In view of current research situation, this paper proposes the following three problems and improves them:

1. Individual differences of spectral features in micro tissues: In order to effectively apply spectral information in histopathology, it is necessary to study individual differences and consistency of spectral features in visible and near-infrared bands.
2. Spatial-spectral features and medical contacts in micro tissues: In order to effectively integrate with traditional medical diagnosis, it is necessary to establish a medical contact with the combination of spectral information and image information.
3. Analytical model of micro-hyperspectral dataset: Due to the large amount of spectral data and samples, a more accurate convolutional neural network model needs to be established.

Deep learning is a research hotspot in the field of artificial intelligence. Since Hinton proposed deep learning theory [15], it has been widely used in image classification, target detection, facial recognition and other fields. Involving different levels of data representation, deep learning usually requires a multi-layer neural network. This structure can learn more essential deep features, which is conducive to extract abstract and invariant features. Convolutional Neural Network (CNN) is one of the most representative models [16]. The unique theory of weight sharing and local connection makes CNN full of richer biological significance. For hyperspectral image processing, the effective structure and training method used by CNN to extract complex spectral features still need further study.

In recent years, the continuous improvement of hyperspectral imaging speed and spatial-spectral resolution make it possible to study subtle spectra of biological tissue. Therefore, this paper use the self-developed micro-hyperspectral instrument to image gastric cancer tissues. We take the micro-hyperspectral data of 30 patients with gastric cancer as the research object. The differences of micro-hyperspectral features between gastric cancer tissues and normal tissues is explored. And an efficient joint classification method is established based on CNN model. At the same time, we also study the model visualization and medical interpretation related to the tumor spectral-spatial features.

2. Material and methods

2.1 Experimental framework

The experimental framework in this paper is shown in Fig. 1. First, Raw HS images are acquired according to the standardized micro-hyperspectral data acquisition process which is jointly formulated by the laboratory and the pathology department. Second, we perform data preprocessing that uniformity calibration and Savitzky-Golay smoothing and 1st-derivation are applied to spatial dimension and spectral dimension respectively to remove system noise. Then experimental datasets are established for spectral data and image data respectively.

Fig. 1. Micro-hyperspectral data processing framework of gastric cancer.

Download Full Size | PDF

Focusing on the above three issues for modeling and analysis, we establish a Spectral-CNN classification model (Spec-CNN for short) based on the spectral features of gastric cancer tissue to investigate the effect of different structural parameters on model performance. The dataset is divided into individual samples to study the individual difference and consistency of hyperspectral features.

In traditional diagnosis, pathologists mainly observe the difference between cells which includes morphological characteristics and arrangement of cells. In order to establish the relationship between spatial-spectral features and traditional medical features, two classification models, Spectral-Spatial-CNN-1 and Spectral-Spatial- CNN-2 (SS-CNN-1, SS-CNN-2 for short), are proposed in the next step. The effective combination of spectral fingerprint information and image morphology information can improve the information utilization of spectral data.

In view of the characteristics of high dimensionality and large amount of data in micro-hyperspectral datasets, Spectral-Spatial-CNN-3 model (SS-CNN-3 for short) based on three-dimensional convolution is proposed to learn more comprehensive image features of gastric cancer tissues. It can obtain the optimal classification performance while the model parameters and training time are controllable.

2.2 Microscopic hyperspectral imaging system

The micro-hyperspectral imaging system used in this research is developed by our laboratory. For medical micro-spectra, the system is mainly for microscopic imaging, and the signal intensity is weak. There are high requirements for spectral resolution and number of spectral band. Therefore, the system adopts built-in scanning principle and dispersion principle, which mainly consists of a microscopic imaging system (1), a built-in scanning spectrum imager (2), a data transmission line (3) and a data acquisition system (4). The system schematic and software interface are shown in Fig. 2. The spectral range of the design system covers 350-1000 nm which reaches the requirement of wide spectral range, the number of spectral channel is greater than 200 and the spectral resolution is better than 1 nm; the image size is 716×596 pixels, the magnification of the microscope objective is 20×, the numerical aperture is 0.65, the spatial resolution reaches 0.5µm which meets the requirement of high imaging resolution; the number of data quantization bits is 16, and the image scan time is less than 30s which has the advantages of simple structure and convenient operation. All the above performance indexes meet the research requirements of subtle spectra of this paper.

Fig. 2. Schematic of micro-hyperspectral imaging system. (a) Imaging system. (b) Software surface. (c) Electric control scanning mechanism.

Download Full Size | PDF

A wide-spectral tungsten halogen lamp (400nm-2500nm, 50W) is used as lighting source in the experiment. The beam enters the built-in scanning spectral imager through the measured target, that is through entrance slit, collimator lens, dispersing elements and focusing lens in turn, the one-dimensional spatial information and spectral information of the field are obtained on the imaging plane. The data acquisition software sends control instructions to the controller through the serial port, and the controller outputs control signals to the driver. The driver drives the stepper motor to rotate and then drives the precision displacement table to move, so as to achieve the acquisition of another one-dimensional spatial information. At the same time, image data acquisition and storage are completed by the software.

2.3 Experimental dataset

The experimental sample was histopathological section of gastric cancer. The samples were from the Department of Pathology, the First Affiliated Hospital of Xi'an Jiaotong University. The pathological sections of 30 patients (26 cases of gastric cancer and 4 cases of normal) were included. Each sample contains a detailed pathological diagnosis report, including specific cancer status, cancer staging and so on. According to the degree of tumor malignancy, gastric cancer is classified into I-level high differentiation, II-level middle differentiation, III-level poor differentiation and IV-level undifferentiation. The degree of tumor malignancy gradually deepens as the level increases. The 30 samples used in this experiment were all diagnosed by doctors. Each sample contains at least two pathological sections. In addition, the following characteristics of this experimental sample should be noted:

(1) Doctor diagnosed some samples which found no gastric cancer as normal tissue (P9, P11, P20, P30);
(2) There were cases in which different sections have different diagnosis in some samples. For example, two sections of P14 patient were diagnosed as III-level poorly differentiated and IV-level undifferentiated respectively;
(3) The same section may have the characteristics of two types at the same time. For example, one section of P5 included both I-level high-differentiation and II-level differentiation;
(4) Usually a section included both normal tissue and cancerous tissue.

The experiment strictly followed the standardized data acquisition process, and the micro-hyperspectral system was used for data acquisition. The magnification of the objective lens was set to 20×. Each section can collect 8-10 hyperspectral images depending on the area of the sampled tissue. Pseudo-color synthesis of the acquired hyperspectral image is performed and given to the doctors for diagnostic marking. Doctors first judge whether the image contains gastric cancer tissue, and then circle the gastric cancer tissue range and make a classification mark. At the same time, for some images with obvious characteristics of cancer cells, each cancer cell needs to be circled separately. For example, the area marked by the red line in Fig. 3(a) is the region of gastric cancer tissue, and the area in green circle in (b) is a single cancer cell.

Fig. 3. (a) Gastric cancer tissue markers by red line. (b) Gastric cancer cell markers in green circles.

Download Full Size | PDF

After data marking, we obtained a hyperspectral image of the gastric cancer tissue. The specific sample condition is shown in Table 1. The size of each sample data is 596×716×256 pixels (image width × image height × band number). The ROIs are determined by visual inspection based on the gastric cancer region and the normal tissue region marker and reconfirmed by doctors. The typical spectral curves are then manually extracted and saved, and an average of 2-4 ROIs can be obtained for each hyperspectral image. Finally, 36546 spectral curves of gastric cancer and 28782 spectral curve samples of normal tissue were obtained, which constitute the experimental dataset of this study.

Table 1. Experimental dataset.

View Table | View all tables in this article

2.4 Methods

2.4.1 Hyperspectral data preprocessing

Due to the non-uniformity of the detector array response and the vignetting of the imaging optical system, the output response of the system to the same input brightness is somewhat different, so that some strip noises appear on the image. These strip noises will influence the subsequent analysis and processing. Therefore, we first perform uniformity calibration processing for the raw data collected. Commonly used uniformity calibration algorithms include average statistical calibration, local mean, low-pass filter calibration, and adaptive calibration. In this paper, the average statistical calibration is applied which is more suitable for push-scan image data. The calibration results are shown in Fig. 4. The upper row is the image of different spectral bands before calibration, and the lower row is the image of different spectral bands after calibration. It can be seen that after the calibration, the narrow stripe noises in the original image are well processed, and a wide bright strip noise in the central region is also well suppressed.

Fig. 4. The comparison between the pre-calibration (upper row) and post-calibration (lower row). The spectral image of (a) 80th, (b) 130th, (c) 180th, (d) 230th, (e) pseudo-color image.

Download Full Size | PDF

In addition, in order to eliminate the effect of high frequency noise and baseline offset, we use S-G smoothing and 1st-derivation to preprocess the spectral dimension [17]. The post-processing and the establishment of related models also take the 1st-derivative spectral data as input. Taking into account factors such as spectral resolution, smoothing denoising effect and effective information retention, setting the moving window width to 7(spectral band) can achieve better result. In addition, we consider that the spectral data of 350∼410nm and 910∼1000nm is greatly interfered by noise, and the data of this part is removed. The remaining part of 410∼910nm is reserved as the research object. As shown in Fig. 5.

Fig. 5. 1st-derivative spectral curves of gastric cancer and normal tissue.

Download Full Size | PDF

In order to have a clearer understanding of the spectral differences between gastric tissues, Spectral Angle Mapping (SAM) is introduced here to analyze the spectral similarity. The similarity between two spectra is determined by calculating the angle. The calculation formula of SAM is as follows.

(1)$$\alpha = {\cos ^{ - 1}}\left[ {\frac{{\sum\limits_{i = 1}^{{n_b}} {{t_i}{r_i}} }}{{(\sum\limits_{i = 1}^{{n_b}} {t_i^2{)^{{\textstyle{1 \over 2}}}}(} \sum\limits_{i = 1}^{{n_b}} {r_i^2{)^{{\textstyle{1 \over 2}}}}} }}} \right]$$

where n_b is the number of band, t and r represent the test spectrum and the reference spectrum respectively. If the calculation result of spectral angle is large, it means that the spectral difference between the two is large and vice versa. The average spectral data between gastric cancer tissue and normal tissue are substituted into formula (1). The calculated result of 0.06 indicates that the spectral change of the two is similar. So it is difficult to distinguish the gastric cancer tissue and normal tissue by traditional methods such as a threshold setting of spectral angle.

2.4.2 CNN model

(1) Model concept

The Convolutional Neural Network is one of the most representative models in deep learning theory, which typically includes several convolutional layers, pooled layers and fully connected layers. The concept is derived from the artificial neural network. There are connections between neurons in adjacent layers, but no connection between neurons in the same layer or across layers. This structure simulates the mechanism by which human brain interprets data. Compared with shallow learning such as artificial neural network, deep learning models are good at using big data to learn features and emphasizing the depth of the model and the importance of feature learning. The idea of CNN local connection is derived from the structure of visual system in biology. As shown in Fig. 6, the neurons in upper layer respond only to a few neurons in the lower layer. For spectral data, an input unit connected by an implicit unit requires only a part of spectral signal in the spectral input sequence, which can effectively avoid complex feature extraction and data reconstruction. Also derived from the biological point of view is the theory of weight sharing. The activity of adjacent neurons tends to be similar, so that the same connection parameters can be shared. This means that for the same convolution kernel, the features extracted in a certain sequence region are equally applicable to other regions, which greatly simplifies the network structure. The convolutional layers use convolution kernels to process the input, which can reduce weighted join and introduce sparsity. The convolution formula can be expressed as,

(2)$$h_i^k = \textrm{ReLU}(({W^k}x)_{i} + {b^k})$$

where x is the model input or the feature of the previous layer, h is the output characteristic of the current layer, W is the convolution kernel, b is the offset, ReLU is the activation function, and the kth feature is the output after convolution and nonlinear activation. Taking one-dimensional spectral data as an example, suppose a part of spectral data contains m bands, and the convolutional layer has a total of n convolution kernels. The size of the convolution kernel is a, the step size is 1, and n characteristic sequences of size (m-a + 1) can be obtained through convolutional layer. In general, CNN includes typical operations such as convolution, nonlinearity, pooling, and batch normalization. And the initial shallow features of the data are gradually transformed into deep feature representations, which can then complete complex classification task by operations such as LR. The deep learning models proposed in this research are all implemented by the Keras framework [18]. Keras is now widely used in deep model construction such as CNN and RNN. It is based on the second development of Theano, with a highly modular design and high programming efficiency, enabling more efficient optimization, tuning, and evaluation of deep network.

Fig. 6. Schematic diagram of local connection theory.

Download Full Size | PDF

For medical hyperspectral data, the closer sample points are in spatial position, the greater the likelihood of belonging to a homogeneous organization. The spatial neighborhood centered on the target sample point is used as its characteristic representation, which can eliminate the noise interference that a single sample point is susceptible to, so as to better represent the spectral characteristics of the category to which it belongs.

(2) Model architecture

Four CNN-based analysis models are proposed respectively for the problems summarized above. As shown in Fig. 7. After the raw data is preprocessed, the spectral information of each sample point is extracted based on the doctor's mark. The spectral curve of each point is directly input to model. The input node is 200 (number of bands) and the output is 2 (classification category). Considering the number of input node, output node and samples, we establish Spec-CNN composed of five layers. As shown in Fig. 8, I1 is the input layer, C2 is a convolution layer, P3 is a maximum pooling layer, F4 is a fully connected layer, and O5 is the output layer. After passing through the convolutional layer and pooling layer, the input spectral vector is converted into a feature vector and finally classified by Logistic Regression. Individual differences of spectral characteristics in gastric cancer tissues are studied based on Spec-CNN model.

Fig. 7. Establishment process of four CNN models.

Download Full Size | PDF

Fig. 8. Schematic diagram of Spec-CNN model structure.

Download Full Size | PDF

Spec-CNN simplifies model parameters by simple spectral input, which is an effective application of high-dimensional data processing. In order to introduce image information and establish medical contacts, the data size can be reduced by dimension reduction [19–21]. PCA is one of the commonly used methods, which can reduce multiple spectral dimensions to a linear combination of several spectral principal components. For example, Makantasis used the first three principal components in remote sensing image classification [22]. Liang introduced random PCA (R-PCA) and explored the different effect of different principal components applied to the classification model [23]. In this paper, the modeling is based on the idea of small neighborhood with more spectra and large neighborhood with less spectra respectively. In SS-CNN-1 model, the data dimension was reduced to an acceptable range by PCA processing. The small neighborhood data of central sample point is used as the experimental samples and then split side by side as the model input. At the same time, the effect of actual sample size on classification accuracy is studied. The model structure of SS-CNN-2 is similar to that of SS-CNN-1, but the input data formats are different. In order to establish more medical image contacts, SS-CNN-2 uses a larger spatial neighborhood as input. Mainly studying the ability of model to extract morphological characteristics of micro tissues, it sacrifices part of the spectral information. The model selects large (w×w pixels) neighborhood and first three principal components for the two-dimensional input of sample data. Although the first three principal components cannot cover all the spectral dimension data, they also include most of the spectral information. After several layers of convolution and pooling, the input image data is converted into some feature vector representation [24,25].

SS-CNN-1 and SS-CNN-2 can extract local spatial features but still lose some spectral information. In order to further improve the classification accuracy, SS-CNN-3 will be established based on the full spectral information. This means that preprocessing operations such as PCA may not be required, and the resulting deep classification model is trained directly in an end-to-end manner. Our proposed SS-CNN-3 model treats the input as small window data around the target pixel. While taking into account the cell size, we use small convolution kernel to perform the convolution operation. If a large neighborhood is used as input, the window boundary has a large impact on the classification of central sample point [26,27], which will affect the performance of the model. Compared with the SS-CNN-2 of large neighborhood with less spectra, SS-CNN-3 is a lightweight model that is less affected by local edges, and is more suitable for precise classification of gastric cancer spectra.

3. Results and analysis

3.1 Spectral feature

3.1.1 Experimental results of Spec-CNN

For traditional hyperspectral data processing, the process usually needs two parts of feature extraction and classification. The feature extraction process may require manual calculation. The method based on CNN model can combine these two parts into one and make full use of the effective data features. Therefore, this section firstly establishes the Spec-CNN model to extract spectral features and realize classification. The specific parameters of the model are shown in Table 2.

Table 2. Structure and parameters of Spec-CNN.

View Table | View all tables in this article

where n represents the number of convolution kernel or the number of neuron in each layer, and k represents the size of convolution kernel or pooling. The Spec-CNN consists of 5 layers including a convolutional layer and a pooling layer. The dataset is divided as shown in Table 3.

Table 3. Dataset division.

View Table | View all tables in this article

The experimental data is divided into four categories (I, II, III and IV) according to different types of gastric cancer, which are input into the network for modeling respectively. Taking IV-level undifferentiated as an example, 6522 cases of gastric cancer samples are all used as negative samples. The same numbers of samples in normal set are taken as positive samples, meanwhile the training set and test set are divided by 7:3. The principle of sample division in each iteration training is random division. In this experiment, some model parameters such as the number and size of convolution layers and convolution kernels, the number of neurons in fully-connected layer are optimized. The experimental results are shown in Fig. 9.

Fig. 9. Influences of different parameters on Spec-CNN. (a) Number of convolution layers. (b) Number and size of convolution kernels. (c) Number of neurons in fully connected layer.

Download Full Size | PDF

The model classification accuracy is best when the number of convolution layers is 1 in (a). As the number of layers increases, the classification efficiency decreases, and similar patterns appear on the four types of samples. More layers will increase the risk of overfitting. The x and y coordinates represent the convolution kernel size and the number of convolution kernels respectively in (b). As the size and number increase, the classification accuracy gradually increases and the best is achieved around the value of 20. It is stated that within a certain range, more convolution kernels combined with larger convolution kernel size could bring optimal result probably. The sample I and II were tested with 100 neurons optimally, and the III and IV samples were 120 optimally in (c). Because when the number of neurons in fully-connected layer is too small, the slow convergence of the model leads to a poor accuracy. And considering the training efficiency, it is more appropriate to set the number of neurons to 100. In addition, the setting of mini-batch has little effect on the model. And the number of iterations needs to be set according to the trend of training curve.

We set mini-batch to 32 in Spec-CNN training. The learning rate is set to 0.1 and the momentum factor is set to 0.9. Each feature map of the convolutional layer C2 includes (200-17 + 1) = 184 nodes. There are 22×(17 + 1) = 396 parameters between I1 and C2, and (22×(184/4) + 1)×100 = 101300 parameters between F4 and P3 layer. The sum of all the trainable parameter reaches 100,000.

Table 4 shows the classification results of the Spec-CNN model applied to the four sample sets. The results of II, III and IV are better, and the average accuracy, specificity and sensitivity are above 90%, which is better than sample I. This rule is also in line with the common sense of medical pathology, because the level of I differentiation is relatively high which is very similar to normal tissue. It is indicated that the spectral characteristics and morphological characteristics of gastric cancer have certain similarities in the distribution of characteristics. With the increase of grade, the level of tumor malignancy is deepened, the difference between cancerous tissue and normal tissue is more obvious, and the classification effect of the model is also improved accordingly.

Table 4. Classification results of Spec-CNN in dataset I, II, III and IV.

View Table | View all tables in this article

In addition, the obtained Spec-CNN model is compared with artificial neural network (ANN) and SVM. Taking IV samples as an example, the performance differences of three algorithms for hyperspectral data analysis are studied. Among them, the neural network is provided with three hidden layers, and the model structure is input (200) - fully connection (60) - fully connection (40) - fully connection (20) – output (2). The SVM uses the Radial Basis Function (RBF) as the kernel. For the two selectable parameters c and γ in the model, the 2-D grid search is used for parameter optimization (c = 2-12, 2-11… 212; γ = 2-14, 2-13… 214), and the model is constructed with optimal parameters. This process is based on the SVM function toolkit in MATLAB. The model training of CNN algorithm takes a long time, but the test speed is similar to the others. Table 5 lists the classification results for CNN and the other two models. Obviously, the CNN model proposed in this paper has obtained best classification effect. In addition to the higher accuracy, the sensitivity and specificity of the gastric cancer samples are 94.99% and 93.76%, respectively, which are better than those of ANN and SVM.

Table 5. Classification results of three types of models in dataset IV.

View Table | View all tables in this article

3.1.2 Individual differentiation of gastric cancer tissue spectra

In this section, we focus on the individualized differences in gastric cancer and the possibility of practical application. Two cases are proposed according to the research purpose:

(1) Case study 1 excludes the variability of data among different patients, which focus on the spectral features of each patient's gastric cancer tissue. Unlike the dataset partitioning pattern above, each patient in this case is treated as an independent sample, and each patient's sample data is used independently to train the respective model. The dataset removes some cases which lack of gastric cancer tissue (such as P10, P12, etc.), and the remaining 17 samples are used as research objects. Based on Spec-CNN-1, each patient's data is independently modeled, and then its classification accuracy, sensitivity and specificity are counted. Table 6 shows the detailed classification results. It can be seen that different patient cases have obtained higher classification accuracy, and the average accuracy is 93.66%. At the same time, the model is highly sensitive and specific to gastric cancer tissues and normal tissues, and the sensitivity and specificity of all cases in III and IV are higher than 90%. From the perspective of different types of gastric cancer, from I to IV, the classification accuracy is gradually improved. Two samples in I (P16, P29) are less than 90%, and there are three samples (P24, P27, P28) reached over 97% in IV. The above results indicate that CNN is applied to the excellent performance of individual samples. And for each individual patient, its own characteristic of gastric cancer tissue is similar, and easy to discriminate.
(2) Case study 2 focuses on the actual application scenario, and each patient's data is separately taken out as a test set. The other remaining patients’ data are used as a training set. This process simulates the real pathological diagnosis process. When a new sample is needed for diagnosis, only the model trained by the previous patient's data can be used for classification diagnosis. The idea of this solution is closer to the actual needs. For each specific patient, data was modeled using other patients and also tuned based on the Spec-CNN. Table 7 detailed classification results. Compared with the results of Case study 1, the classification accuracy of Case study 2 is relatively poor, and the sensitivity and specificity are also worse. At the same time, the difference of results between different patients is obvious. For example, the accuracy of P17 is only 69.54%, which indicates that the gastric cancer characteristic of patient 17 may be different from those of other patients. The model fails to learn similar features of gastric cancer from other samples and results in poor classification results. For the samples P6 and P27, the classification results are relatively good. Although the effect is poorer than that of Case study 1, the accuracy of about 90% indicates that the model has learned effective features. The model performance can be further improved by increasing the number of sample data.

Table 6. Experimental results of Case study 1.

View Table | View all tables in this article

Table 7. Experimental results of Case study 2.

View Table | View all tables in this article

Overall, the classification result of Case study 2 is not as good as Case study 1. It indicates that the spectral features of gastric cancer tissue have certain dependence on the individual patient, and the spectral features of different patients are different. In the above study, model training sample is scarce, and only four patients’ data averagely are used for modeling. The model doesn't have enough sample information and is highly dependent on the only training samples. So it is impossible to train a more generalized model. Therefore, in subsequent studies, it is essential to obtain more case samples and expand the dataset. Abnormal samples also need to be collected and analyzed. New samples after preprocessed can be directly added to the established database and the model can be updated.

3.2 Visualization of spatial-spectral features

This section is based on the central sample points to extract neighborhood information as the model input. Table 8 lists the details of the CNN architectures. SS-CNN-1 uses a small neighborhood (7×7 pixels) with more spectra (10) as input, which focuses more on spectral feature extraction; while SS-CNN-2 uses a large neighborhood (27×27 pixels) with less spectra (3), which focuses more on spatial information analysis.

Table 8. Structure and parameters of SS-CNN-1 and SS-CNN-2.

View Table | View all tables in this article

In the SS-CNN-1 model, the data structure of each sample point is 7×7×10, that is, the window neighborhood is 7 pixels. The first 10 principal components are retained, and 7×7×10 data is expanded to 49×10 as the model input. The convolutional layer C2 has 8 convolution kernels and the size of kernel is 3×3. C4 has 16 convolution kernels and the size is 4×4. The size of the pooling kernel is 2×2, and the number of neurons in the fully connected layer is 100. The model is applied to the IV sample set of gastric cancer. After 150 training iterations, the classification accuracy is 96.27%, which is better than that of Spec-CNN.

It is found in the experiment that the neighborhood size of input data and the number of principal component spectra have great influences on the model performance, so we test these two parameters separately. First, the data of different neighborhood sizes are extracted, and other parameters are kept unchanged. The classification results are shown in Table 9. It can be seen that the best results are obtained when the input data is 7×7×10, and the accuracy, specificity and sensitivity indicators are all optimal, and 9×9×10 is suboptimal. If the neighborhood is too small, it is difficult to extract structural information of the cell. After the value exceeds 9, the accuracy decreases slowly as the size increases. The experimental data were subjected to 20× magnification. After sampling calculation, the obtained image data has a size of about 1µm per pixel, and the actual size of gastric cell is about 10 to 20µm. The neighborhood size of 7 to 9 pixels means that the diameter of 7 to 9µm can cover the nucleus and most of the cytoplasmic information. This experimental result is consistent with medical practice. For this 20× magnified gastric cancer tissue, when the neighborhood size is 7 to 9 pixels, it is more suitable for obtaining micro morphological information and structural features and obtaining an optimal model. The classification results for different principal component selections are shown in Fig. 10. When the principal component is selected to be 10, the effect is optimal, and then the accuracy gradually decreased. The first 10 principal components that contain 98.65% spectral information are more suitable for modeling. The other more principal components also contain more redundant data, which is not conducive to improve the model performance.

Fig. 10. Influences of different parameters on SS-CNN-1. (a) Classification results of different neighborhood sizes. (b) Classification results of different principal components.

Download Full Size | PDF

Table 9. Classification results of different neighborhood sizes.

View Table | View all tables in this article

In SS-CNN-2, the first three principal components of the large neighborhood (25×25×3) are used as input images. The convolution layer C2 and C4 have 16 and 32 convolution kernels respectively. After 150 training iterations, the classification accuracy of IV sample set reached 94.81%. Feature analysis is very important for understanding the mechanism of deep learning. The following focuses on the visual analysis of the features extracted by SS-CNN-2, and the characteristics of the model parameters and the microstructure characteristics are studied.

At the beginning of the model training, the weights of each layer are randomly initialized, and then the weights are continuously updated through training and learning. Figure 11 shows the weights of the different convolution kernels in C2, and each small image (4×4) represents a convolution kernel. The C2 layer has 16 convolution kernels, and the intensity of each pixel represents the corresponding weight. It can be seen that most convolution kernels have unique structural features after training and the model has the ability to extract deeper data features.

Fig. 11. Convolution kernel visualization of C2 layer.

Download Full Size | PDF

Figure 12 shows the characteristics of the input image after passing through the layers of the model. The (a) is the original input image, and when passing through the first convolution layer C2, (b) is 16 feature maps converted after being processed by 16 convolution kernels. Then, through the ReLU function, as shown in (c), the ReLU function cuts off the feature smaller than 0, so there are local imageless phenomenon in some of the feature maps. After the maximum pooling operation, the size of the feature map becomes 11×11 in (d). After passing through the second convolutional layer C4, the number of (e) feature maps becomes 32, and the (f) feature map after ReLU is further abstracted. In order to further analyze the characteristics of each layer, the feature maps are presented by pseudo color synthesis. It can be seen that different feature maps are activated by different object types, different network layers encode different feature types, and features learned by higher layers are more abstract and easy to distinguish. Some feature maps show the distribution characteristics of nucleus and stroma, and some show the structural characteristics of cell membrane and fibrous tissue.

Fig. 12. Feature maps of each layer of gastric tissue.

Download Full Size | PDF

3.3 Optimal classification model

In order to train the optimal classification model, SS-CNN-3 takes the neighborhood w×w×b - full spectral information - of the central sample point as the input, where w×w pixels is the neighborhood window and b is the number of spectral bands. Then the deep spatial-spectral feature extraction is performed based on three-dimensional convolution. Assuming that the input size of each layer is s¹×s²×s³, then in the first 3D convolutional layer C2 ($s_2^1 = s_2^2 = w, \, s_2^3 = b$), each convolution kernel $(k_2^1 \times k_2^2 \times k_2^3)$ will generate a feature map with size of $(s_2^1 - k_2^1 + 1) \times (s_2^2 - k_2^2 + 1) \times (s_2^3 - k_2^3 + 1)$. It is used as an input to the P3 layer, and the pooling layer P3 samples the input feature map. C4 contains n₄ convolution kernels with size of $(k_4^1 \times k_4^2 \times k_4^3)$, and a total of n₄ feature map cubes with size of $(s_4^1 - k_4^1 + 1) \times (s_4^2 - k_4^2 + 1) \times (s_4^3 - k_4^3 + 1)$ are obtained. They are expanded into a feature vector as the input to F5 layer.

According to the previous section on the study of spatial features in gastric cancer and the discoveries of Tran [28] and Lin [29] in remote sensing hyperspectral data processing, the small convolution kernel with deeper structure is more conducive to extracting spectral features. Therefore, we fix the spatial dimension of the 3D convolution kernel to 4×4 and 3×3, only slightly changing the spectral depth of convolution kernel. Table 10 shows the specific structural parameters of the SS-CNN-3 model, with a 9×9×200 data cube as input, that is a small neighborhood window set to 9×9 pixels. Performing two convolution operations and one pooling reduces the input to 1×1, so the model contains only two 3D convolution layers C2 and C4. The number of convolution kernels for C2 is 10, and the number of convolution kernels for C4 is twice. This ratio setting draws on the experience of previous classical CNN models [30]. The C2 convolution kernel size is 4×4×5, the C4 convolution kernel size is 3×3×5, and the pooling size is 2×2×2.

Table 10. Model structure and parameters of SS-CNN-3.

View Table | View all tables in this article

At the same time, Dropout and ReLU are applied in training to improve the model performance. Dropout is set to 0.3, and the classification results using ReLU and Sigmoid on datasets III and IV are shown in Fig. 13. It can be seen that the training error curve using Sigmoid fluctuates greatly, and the convergence speed is slower than that of ReLU, especially in dataset IV. At the same time, the model using ReLU can obtain lower training error at the end of training, which shows that the application of ReLU can accelerate convergence and improve the model accuracy.

Fig. 13. Training error curves of ReLU and Sigmoid in (a) dataset III and (b) IV.

Download Full Size | PDF

The model is applied to the IV sample set, and the overall accuracy is 97.57%, which is 1.3% higher than the result of SS-CNN-1. Table 11 summarizes the training parameters of Spec-CNN, SS-CNN-1, SS-CNN-2, and SS-CNN-3.

Table 11. Training results of each model.

View Table | View all tables in this article

In general, the classification accuracy of multidimensional CNNs are higher than those of Spec-CNN, which indicates that the analysis of the spectral features combined with image dimensions is beneficial for improving the model performance. The classification performance of SS-CNN-2 is lower than SS-CNN-1 and SS-CNN-3, which indicates that the format of the input data has a great influence on the performance of the model. Although the input data of the large neighborhood with less spectra contains more spatial dimension information, it also ignores more spectral information, which is the key to improving model performance.

CNN typically takes longer to train the network than other machine learning algorithms. It can also be seen from Table 11 that the training time of SS-CNN-3 is significantly more than the other three categories due to the larger input data (9×9×200) and the relatively more (198031) trainable parameters. The most training parameter is SS-CNN-2 (214009). The reason is that the feature vector of more dimensions before the fully-connection layer (2048) needs to train 204900 parameters after connecting with 100 neurons, so the training time of SS-CNN-2 is also relatively longer. In contrast, SS-CNN-1 with its small input and relatively simple model construction obtained the least training parameters (34345) and the fastest training time (284.11s). Spec-CNN with its single-layer convolution structure achieved the fastest test time (0.61s). Although SS-CNN-3 has a long training time, its more complex network structure extracts deeper spectral features, so the optimal model classification performance (97.57%) is obtained.

4. Conclusion

The main contribution of this study is to explore the spatial-spectral features of tumor tissue by the microscopic hyperspectral technique and deep learning method, and to establish a basis for further research on the medical contacts between spectral features of tumor tissue with diagnostic criteria, mutation, prognosis, and so on. In this work, a hyperspectral database of 28,542 spectral samples (including 30 patients) is established. The convolutional neural network is applied to the identification of gastric cancer pathological tissues and the classification accuracy reaches 97.57%. It is indicated that micro-hyperspectral is a suitable technique for automatically detecting cancerous tissue from pathological sections. In addition, the medical implication of analysis result and the model features are briefly explained, which is crucial for the emerging medical hyperspectral technology. The closer medical contacts and model interpretation are still the next research priorities. The relevant analytical methods proposed in this paper can also be extended to other biomedical objects, such as skin melanoma, microbial pathogens, and human stones. The next step is to establish a complete automated classification diagnosis based on more samples and to extend the study to unstained samples.

Funding

Xi’an Key Laboratory for Biomedical Spectroscopy (201805050ZD1CG34); National Natural Science Foundation of China (61501456).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

References

1. J. Ferlay, I. Soerjomataram, R. Dikshit, S. Eser, C. Mathers, M. Rebelo, D. M. Parkin, D. Forman, and F. Bray, “Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012,” Int. J. Cancer 136(5), E359–E386 (2015). [CrossRef]

2. T. Vodinh, D. L. Stokes, M. B. Wabuyele, M. E. Martin, J. M. Song, R. Jagannathan, E. Michaud, R. J. Lee, and X. Pan, “A hyperspectral imaging system for in vivo optical diagnostics. Hyperspectral imaging basic principles, instrumental systems, and applications of biomedical interest,” IEEE Eng. Med. Biol. Mag. 23(5), 40–49 (2004). [CrossRef]

3. B. Luo and L. Zhang, “Robust Autodual Morphological Profiles for the Classification of High-Resolution Satellite Images,” IEEE Trans. Geosci. Remote Sens. 52(2), 1451–1462 (2014). [CrossRef]

4. Y. Suzuki, H. Okamoto, M. Takahashi, T. Kataoka, and Y. Shibata, “Mapping the spatial distribution of botanical composition and herbage mass in pastures using hyperspectral imaging,” Grassland Science. 58(1), 1–7 (2012). [CrossRef]

5. H. Hou, Z. Fang, Y. Zhang, M. Dong, and Y. Liu, “Simulation and in vivo Experimental Study on Noninvasive Spectral Detection of Skin Cholesterol,” Chin. J. Laser 43(9), 0907001 (2016). [CrossRef]

6. A. Dong, J. Li, B. Zhang, and M. Liang, “Hyperspectral Image Classification Algorithm Based on Spectral Clustering and Sparse Representation,” Acta Optica Sinica. 37(8), 0828005 (2017). [CrossRef]

7. D. C. Kellicut, J. M. Weiswasser, S. Arora, E. Freeman, R. A. Lew, C. Shuman, J. R. Mansfield, and A. N. Sidawy, “Emerging Technology: Hyperspectral Imaging,” Perspect. Vasc. Surg. 16(1), 53–57 (2004). [CrossRef]

8. H. Liu, W. Gu, Q. Li, Y. Wang, Z. Chen, and X. Qin, “Nerve Classification with Hyperspectral Imaging Technology,” Spectrosc. Spect. Anal. 35(1), 38–43 (2015). [CrossRef]

9. S. Ortega, G. M. Callico, M. L. Plaza, R. Camacho, H. Fabelo, and R. Sarmiento, “Hyperspectral database of pathological in-vitro human brain samples to detect carcinogenic tissues,” in IEEE International Symposium on Biomedical Imaging (2016), pp. 369–372.

10. S. Zhu, K. Su, Y. Liu, H. Yin, Z. Li, F. Huang, Z. Chen, W. Chen, G. Zhang, and Y. Chen, “Identification of cancerous gastric cells based on common features extracted from hyperspectral microscopic images,” Biomed. Opt. Express 6(4), 1135–1145 (2015). [CrossRef]

11. H. Akbari, L. V. Halig, H. Zhang, D. Wang, Z. G. Chen, and B. Fei, “Detection of Cancer Metastasis Using a Novel Macroscopic Hyperspectral Method,” Proc. SPIE 8317, 831711 (2012). [CrossRef]

12. Q. Li, C. Li, H. Liu, Z. Mei, Y. Wang, and F. Guo, “Skin cells segmentation algorithm based on spectral angle and distance score,” Opt. Laser Technol. 74, 79–86 (2015). [CrossRef]

13. H. Akbari, L. V. Halig, D. M. Schuster, O. Adeboye, M. Viraj, P. T. Nieh, G. Z. Chen, and F. Baowei, “Hyperspectral imaging and quantitative analysis for prostate cancer detection,” J. Biomed. Opt. 17(7), 0760051 (2012). [CrossRef]

14. A. O. Gerstner, W. Laffers, F. Bootz, D. L. Farkas, R. Martin, J. Bendix, and B. Thies, “Hyperspectral imaging of mucosal surfaces in patients,” J. Biophoton. 5(3), 255–262 (2012). [CrossRef]

15. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science 313(5786), 504–507 (2006). [CrossRef]

16. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE 86(11), 2278–2324 (1998). [CrossRef]

17. M. J. Baker, J. Trevisan, P. Bassan, R. Bhargava, H. J. Butler, K. M. Dorling, P. R. Fielden, S. W. Fogarty, N. J. Fullwood, K. A. Heys, C. Hughes, P. Lasch, P. L. Martin-Hirsch, B. Obinaju, G. D. Sockalingum, J. Sulé-Suso, R. J. Strong, M. J. Walsh, B. R. Wood, P. Gardner, and F. L. Martin, “Using Fourier transform IR spectroscopy to analyze biological materials,” Nat. Protoc. 9(8), 1771–1791 (2014). [CrossRef]

18. Keras Documentation. http://keras.io.

19. M. Kamruzzaman, D. Barbin, G. Elmasry, D. W. Sun, and P. Allen, “Potential of hyperspectral imaging and pattern recognition for categorization and authentication of red meat,” Innovative Food Sci. Emerging Technol. 16(1), 316–325 (2012). [CrossRef]

20. J. A. Sanz, A. M. Fernandes, E. Barrenechea, S. Silva, V. Santos, N. Gonçalves, D. Paternain, A. Jurio, and P. Melo-Pinto, “Lamb muscle discrimination using hyperspectral imaging: Comparison of various machine learning algorithms,” J. Food Eng. 174, 92–100 (2016). [CrossRef]

21. D. Wu and D. Sun, “Color measurements by computer vision for food quality control – A review,” Trends Food Sci. Technol. 29(1), 5–20 (2013). [CrossRef]

22. K. Makantasis, K. Karantzalos, A. Doulamis, and N. Doulamis, “Deep supervised learning for hyperspectral data classification through convolutional neural networks,” in 2015 IEEE International Geoscience and Remote Sensing Symposium (2015), pp. 4959–4962.

23. H. Liang and Q. Li, “Hyperspectral imagery classification using sparse representations of convolutional neural network features,” Remote Sens. 8(2), 99 (2016). [CrossRef]

24. L. Wei, G. Wu, Z. Fan, and D. Qian, “Hyperspectral Image Classification Using Deep Pixel-Pair Features,” IEEE Trans. Geosci. Remote Sens. 55(2), 844–853 (2017). [CrossRef]

25. J. Yue, W. Zhao, S. Mao, and H. Liu, “Spectral–spatial classification of hyperspectral images using deep convolutional neural networks,” Remote Sensing Letters. 6(6), 468–477 (2015). [CrossRef]

26. Y. Chen, H. Jiang, C. Li, X. Jia, and P. Ghamisi, “Deep feature extraction and classification of hyperspectral images based on convolutional neural networks,” IEEE Trans. Geosci. Remote Sens. 54(10), 6232–6251 (2016). [CrossRef]

27. L. Ying, H. Zhang, and S. Qiang, “Spectral-spatial classification of hyperspectral imagery with 3D convolutional neural network,” Remote Sens. 9(1), 67–69 (2017). [CrossRef]

28. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning spatiotemporal features with 3d convolutional networks,” in Proceedings of the IEEE international conference on computer vision (2015), pp. 4489–4497.

29. L. Lin, H. Dong, and X. Song, “DBN-based Classification of Spatial-spectral Hyperspectral Data,” in Advances in Intelligent Information Hiding and Multimedia Signal Processing (Springer, 2017), pp. 53–60.

30. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems (2012), pp. 1097–1105.

Patient identifier	Differentia-tion	Image sample		Spectral sample
Patient identifier	Differentia-tion	Canceration	Normal	Canceration	Normal
P1	II	24	55	2977	3743
P2	II	23	26	3166	2907
P3	I	3	31	212	1914
P4	III	21	10	2235	1329
P5	II	13	50	1778	2808
P6	II	4	35	766	1853
P7	II	5	21	786	2262
P8	I	6	26	484	3083
P9	-	-	-	-	-
P10	III	17	-	2351	-
P11	-	-	-	-	-
P12	III	20	-	5015	-
P13	III	33	-	4653	-
P14	IV	22	17	1665	1634
P15	I	-	22	-	582
P16	I	3	9	505	531
P17	III	5	4	794	522
P18	IV	3	11	264	406
P19	III	-	5	-	599
P20	-	-	-	-	-
P21	I	8	-	1371	-
P22	I	9	2	2394	-
P23	II	3	6	160	1219
P24	IV	2	9	349	913
P25	IV	-	5	-	270
P26	IV	8	-	2065	-
P27	IV	7	15	1430	1587
P28	IV	4	2	749	380
P29	I	2	1	375	240
P30	-	-	-	-	-

Sample		Training set	Test set	Sum
Gastric cancer	I	3703	1638	5341
	II	6750	2893	9643
	III	10534	4514	15048
	IV	4566	1956	6522
Normal		70%	30%	28782

Sample	Accuracy(%)	Sensitivity(%)	Specificity(%)
I	81.53	80.59	82.48
II	90.24	90.98	89.49
III	90.63	91.14	90.12
IV	94.38	94.99	93.76

Method	Accuracy(%)	Sensitivity(%)	Specificity(%)
ANN	87.47	89.57	85.38
RBF-SVM	89.06	88.24	89.88
Spec-CNN	94.38	94.99	93.76

Sample Type	Patient	Accuracy	Sensitivity	Specificity
I	P3	93.09	92.06	93.21
	P8	92.80	91.72	92.97
	P16	88.71	86.75	90.57
	P29	86.41	85.71	87.50
	Avg.	90.25	89.06	91.06
II	P1	93.10	94.85	91.71
	P2	93.96	94.20	93.69
	P5	92.80	90.99	93.94
	P6	95.66	96.94	95.14
	P7	95.73	95.74	95.72
	P23	86.44	89.58	86.03
	Avg.	92.95	93.72	92.71
III	P4	91.85	91.94	91.71
	P14	95.96	96.19	95.71
	P17	93.15	90.76	96.79
	Avg.	93.65	92.96	94.73
IV	P18	95.00	96.20	94.21
	P24	99.73	100	99.63
	P27	97.90	97.20	98.53
	P28	100	100	100
	Avg.	98.16	98.35	98.10

Tumor tissue classification based on micro-hyperspectral technology and deep learning

Abstract

1. Introduction

2. Material and methods

2.1 Experimental framework

2.2 Microscopic hyperspectral imaging system

2.3 Experimental dataset

2.4 Methods

2.4.1 Hyperspectral data preprocessing

2.4.2 CNN model

3. Results and analysis

3.1 Spectral feature

3.1.1 Experimental results of Spec-CNN

3.1.2 Individual differentiation of gastric cancer tissue spectra

3.2 Visualization of spatial-spectral features

3.3 Optimal classification model

4. Conclusion

Funding

Disclosures

References

Cited By

Figures (13)

Tables (11)

Equations (2)

Biomedical Optics Express

Sample Type	Patient	Accuracy	Sensitivity	Specificity
I	P3	80.38	82.54	80.14
	P8	72.31	84.83	70.35
	P16	79.68	80.79	78.62
	P29	80.43	81.25	79.17
	Avg.	78.20	82.35	77.07
II	P1	86.40	88.69	84.58
	P2	85.01	82.51	87.73
	P5	83.27	83.68	83.02
	P6	90.82	92.14	90.27
	P7	82.15	84.68	81.27
	P23	86.68	89.58	86.30
	Avg.	85.72	86.88	85.53
III	P4	82.30	83.73	79.90
	P14	88.98	87.58	90.41
	P17	69.54	68.49	71.15
	Avg.	80.27	79.93	80.49
IV	P18	81.50	86.08	78.51
	P24	87.27	89.42	86.45
	P27	91.71	95.34	88.45
	P28	81.07	79.91	83.33
	Avg.	85.39	87.69	84.19

Model n×(k1×k2)	I1	C2	P3	C4	F5	O6	Accuracy(%)
SS-CNN-1	(7×7)×10	8×(3×3)	2×2	16×(4×4)	100	2	96.27
SS-CNN-2	25×25	16×(4×4)	2×2	32×(4×4)	100	2	94.81

Data Size	Accuracy(%)	Sensitivity(%)	Specificity(%)
3×3×10	91.95	91.56	92.33
5×5×10	91.23	89.42	93.05
7×7×10	96.27	95.91	96.63
9×9×10	96.17	95.76	96.57
11×11×10	94.48	95.19	93.76

Model	Total parameters	Training time	Test time	Accuracy
Spec-CNN	101797	581.50s	0.61s	94.39%
SS-CNN-1	34345	284.11s	1.04s	96.27%
SS-CNN-2	214009	1325.91s	1.82s	94.81%
SS-CNN-3	198031	9697.44s	15.84s	97.57%