Polarization-based probabilistic discriminative model for quantitative characterization of cancer cells

Jiachen Wan; Yang Dong; Jing-Hao Xue; Liyan Lin; Shan Du; Jia Dong; Yue Yao; Chao Li; Hui Ma

doi:10.1364/BOE.456649

1. Introduction

Cancer is one of the leading causes of death in 127 countries [1]. Trends indicate that cancer will soon become the leading cause of premature death, given the declining rate of premature death from cardiovascular diseases. Among all cancer types, female breast cancer was the most commonly diagnosed cancer in 2021, surpassing lung cancer [2]. Liver cancer is the third leading cause of cancer death worldwide. For cancer diagnosis, pathologists examine pathological tissues under high-resolution microscopes to look for textural indications of cancer. While this method will remain the gold standard for the coming decades, such manual examination is qualitative, subjective, and labor-intensive [3]. There is therefore a high clinical demand for a method capable of wide-field quantitative evaluation of pathological slides.

In recent years, optical imaging techniques combined with machine learning models have been gradually emerging in digital pathology and diagnosis, enriching the forms of input data available to the models and expanding their ability to acquire microstructural information [4–6]. The Mueller matrix is a comprehensive description of a sample’s polarization properties, containing abundant microstructural information and optical properties [7–9]. To quantitatively decode the Mueller matrix, sets of polarization parameters with physical meanings have been derived through experiments and simulations [10–13] and have shown promise in the microstructural characterization of complex biological specimens [14–17]. It has been shown that the contrast mechanisms of 2D images of polarization parameters depend on the sample’s polarization characteristics and less on the imaging resolution [18,19], which makes it possible to characterize microstructures that may not be visible to human vision under a low-resolution system. These polarization parameters can be used alone, or as polarimetry basis parameters (PBPs) from which polarimetry feature parameters (PFPs) with explicit connections to the sample’s microstructural characteristics are constructed by various machine learning methods. For example, the Mueller matrix polar decomposition parameters $D$, $P$, and $\Delta$ can be associated with diattenuation, polarizance, and depolarization, respectively [13]. Previously, we proposed a linear discriminant analysis (LDA) based approach for deriving PFPs as simplified linear functions of the PBPs, for quantitative characterization of cells and fiber collagen in various breast pathological tissues [5].

Inspired by this, in this paper we propose a novel framework for deriving complex sigmoid-transformed PFPs for quantitative characterization of a more specific and finer microstructure than cells: cancer cells. The polarization-based probabilistic discriminative model (P-PDM), designed according to the degrees of complexity of the pathological features and their corresponding polarization characteristics, aims at objective cancer cell identification. Cancerous cells are identified by first identifying all the cells and then separating the normal cells from the cancerous cells. The nodes in the P-PDM are built using L0 regularized linear logistic regression (LR) classifiers [20] and connected using conditional probability and Bayes’ theorem. Specifically, we measured the PBPs of pathological tissue samples with the Mueller matrix microscope [21,22] as input data of the P-PDM. The model outputs sigmoid-transformed PFPs whose forms depend on the number of nodes used, the edges connecting the nodes, and the probability formulas describing the nodes, which can be dissected to extract the physical meanings. A comparison is made between the L0 [20] and L1 [23] regularization methods, and L0 regularization is selected for model stability. We then demonstrate the viability of the proposed P-PDM in hematoxylin and eosin (H&E) sections of breast cancer tissues and liver cancer tissues. For each tissue type, two final PFPs are derived that characterize the cells and the cancer cells, respectively, with accuracy and sensitivity of 85% and above. The pixel-based model learns the polarization features of single pixels, which can be measured even when the corresponding microstructure is not visible to human vision, and then decides which structure type each pixel in the polarization images belongs to. Therefore, the derived final PFPs may accomplish these tasks in a low-resolution and wide-field system, which paves the way for quantitative and rapid screening of cancer cells in pathological sections.

2. Related work

Polarization imaging-based machine learning methods for aided cancer diagnosis in pathology have been gradually emerging. Dremin et al. introduced a high-resolution polarization imaging technique combined with the k-means method for automatically clustering the pixels of breast cancer tissues into three types: fat tissue, benign fibrosis, and epithelial carcinoma located inside the milk duct [24]. This work shows the possibility of classification, but the performance of the proposed model on a large number of samples is not reported. Sindhoora et al. calculated 19 gray level co-occurrence matrix features of polarization parameter images measured under a 40× objective lens and fed them into a support vector machine classifier to detect the tumor regions of ductal cancer tissues [25]. Zhou et al. processed high-resolution polarization hyperspectral images using multiple machine learning classifiers, such as random forest, support vector machine, Gaussian naive Bayes, and logistic regression [26]. They explored the possibility of using classifiers based on polarization hyperspectral features for automatic detection of head and neck squamous cell carcinoma on H&E-stained tissue sections. Christian et al. constructed a decision-theoretic framework for biological tissue classification using Mueller polarimetry, introduced a preprocessing method involving superpixels, and then tested different machine learning models on ex vivo specimens of uterine cervix [27]. Ivanov et al. measured Mueller matrix images of ex vivo human colon samples and decomposed them using symmetric decomposition and depolarization metric calculus. With the help of supervised and unsupervised machine learning methods, including logistic regression, random forest, support vector machine, and principal component analysis, they showed promising results in distinguishing healthy and tumor human colon samples [28]. Wang et al. combined principal component scores with polarimetric parameters for structure classification, proposing the term digital staining of H&E slides. The method was validated on intestinal metaplasia and normal glands [29].

In addition to machine learning models based on handcrafted features, some researchers are also investigating the use of deep convolutional neural networks (CNNs) to extract features from polarization images of cancer tissues for automatic aided diagnosis of cancer cells. Zhao et al. established a Mueller matrix image data set of pathological tissues of giant cell tumor of bone at 40× magnification [30]. The data set was fed into a CNN to extract microstructural information as deep features, and Mueller matrix derived parameters were calculated as manual features. A multi-parameter fusion network was then proposed to fuse the two kinds of features for automatic detection of pathological samples of giant cell tumor of bone. Xia et al. and Ma et al. proposed a ReSE Net [31] and a MuellerNet [32], respectively, for the classification of different breast cancer cells. High-resolution Mueller matrix images of single cancer cells were measured and used as input data of the networks. ReSE Net adds a SENet on top of ResNet, obtaining the weight of each polarization feature channel according to its importance for classification. MuellerNet includes a normal stream composed of a ResNet for processing light intensity images and a polarization stream composed of a CNN with an attention mechanism for processing polarization images. The accuracies of the two are about 88% and 86%, respectively. Roa et al. used a ResNet-based deep learning model to segment cervical collagen and elastin, and compared the model performance with second harmonic generation and two-photon excitation fluorescence. The authors also proposed a CNN-K-NN hybrid model, or the use of U-Net if data are abundant [33].

Based on the above, to our knowledge, this is the first work to derive sigmoid-transformed PFPs with physical meanings and feature specificity for quantitative characterization of cancer cells at the pixel level in breast and liver pathological tissues in a low-resolution and wide-field system.

3. Methods

3.1 Pathological samples

To test our method, we used two kinds of H&E-stained pathological tissue slides (4 μm thick): breast cancer and liver cancer. Pathological breast cancer tissue slices from 9 patients were provided by the University of Chinese Academy of Sciences Shenzhen Hospital. The liver cancer pathological tissue samples were acquired from 7 clinical patients of Fujian Medical University Cancer Hospital. Each region of interest (ROI) in this study is 1000×1000 pixels. In each pathological tissue sample, pathologists selected 6 ROIs, forming three groups of two. The three groups respectively contain a large number of non-cell tissues (fiber structures, background, or other tissues, denoted as N tissues), non-cancer cells (N cells), and cancer cells (C cells). The target microstructure in each ROI was labelled manually by pathologists using a MATLAB graphical user interface to produce the mask used as ground truth for cross validation of the models. More details about the pathological features and how pathologists labelled the target microstructures in H&E images are provided in Section 1 of Supplement 1. In total, 54 ROIs of breast cancer tissues and 42 ROIs of liver cancer tissues were analyzed. This work was approved by the Ethics Committees of the University of Chinese Academy of Sciences Shenzhen Hospital and Fujian Medical University Cancer Hospital.

3.2 Data acquisition

3.2.1 Experimental setup

The H&E-stained sections from biopsy were placed in the Mueller matrix microscope to obtain the sample’s Mueller matrix image under a 4× objective lens and its H&E pathological image under a 20× objective lens. Figure 1(a) shows the photograph and schematic of the Mueller matrix microscope. The microscope was constructed by adding a polarization state generator and analyzer to a commercial transmission-light microscope [21], and works by the typical dual rotating retarder method [22]. This instrument has been described in previous publications [5,6,15,19]. Details on the construction and optimization of the system can be found in [34–36]. For completeness, more information about the instrument is provided in Section 2 of Supplement 1. Figure 1(b) is an example of the measured Mueller matrix of a breast cancer pathological section under a 4× objective lens. The element m11 is the sample’s intensity image, and the other 15 elements represent the sample’s complete polarization features and are all normalized by m11. Figure 1(c) is the corresponding H&E image obtained under a 20× objective lens, in which several target microstructures are labelled by pathologists with solid lines in different colors.
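The normalization by m11 described above can be sketched in a few lines of Python. This is a minimal illustration, not the instrument's actual processing code: the array layout `(H, W, 4, 4)`, the function name `normalize_mueller`, and the handling of dark pixels are assumptions made for the example.

```python
import numpy as np

def normalize_mueller(mm, eps=1e-12):
    """Divide every element of a per-pixel Mueller matrix image by m11.

    mm is assumed to have shape (H, W, 4, 4), with mm[..., 0, 0] holding m11.
    Returns (m11, normalized). Pixels whose |m11| <= eps become NaN so that
    they can be excluded later (an assumption of this sketch).
    """
    mm = np.asarray(mm, dtype=float)
    m11 = mm[..., 0, 0]
    # Replace near-zero intensities by NaN instead of dividing by zero.
    safe = np.where(np.abs(m11) > eps, m11, np.nan)
    normalized = mm / safe[..., None, None]
    return m11, normalized
```

After this step the normalized m11 element equals 1 at every valid pixel, and the remaining 15 elements no longer scale with the light source intensity.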

Fig. 1. Data acquisition method. (a) Photograph and schematic of the Mueller matrix microscope. P: polarizer. R: quarter-wave plate; (b) An example of measuring Mueller matrix under a 4× objective lens; (c) The corresponding H&E image obtained under a 20× objective lens.


3.2.2 Polarimetry basis parameters

Although the Mueller matrix contains a sample’s complete polarization properties and abundant microstructural information, it is often inconvenient to use the matrix elements directly, since they lack explicit connections to the microstructures and are sensitive to sample orientation. Recently, multiple techniques based on physical theory have been adopted to derive several sets of PBPs, which have clear physical meanings and are either insensitive or explicitly related to the orientation angle of the sample. The input data of the designed machine learning model are single pixels, each with a series of polarization features. If the polarization features are affected by the sample orientation, the convergence and robustness of the trained model may be impaired, making the extracted PFPs unstable. Therefore, we use the azimuthal-invariant PBPs, rather than Mueller matrix elements or polarization intensity images (which are affected by the light source intensity), as input features for deriving new feature-specific polarization parameters, the PFPs. Table S1 (Section 3 in Supplement 1) summarizes the formulas and physical meanings of PBPs commonly used in polarimetry. These PBPs are all used as the input polarization features of samples to derive PFPs for target microstructure characterization in this study. Lu and Chipman proposed the Mueller matrix polar decomposition (MMPD) method and derived linear retardation δ, diattenuation D, depolarization Δ, and optical rotation ψ [10]. The Mueller matrix transformation (MMT) method proposed in our previous study extracted anisotropy degree t₁, polarizance b, and circular birefringence β [11]. The Mueller matrix rotation invariant (MMRI) parameters also decode effective information from the Mueller matrix, including linear polarizance P_L, linear diattenuation D_L, and the linear birefringence related parameters r_L and q_L [12].

In addition to the above parameters, a Mueller matrix asymmetry parameter (MMAP), namely P_TMS, has been proposed. For a transversal mirror symmetric (TMS) sample, the Mueller matrix elements satisfy specific symmetry relations, namely $\textrm{m24} = -\textrm{m42}$ and $\textrm{m34} = -\textrm{m43}$ [13]. The parameter P_TMS measures the breaking of this symmetry, and its value for a TMS sample is 0. Previous studies of various pathological tissues have shown that the MMPD, MMT, MMRI, and MMAP parameters in transmission polarimetry have good potential for probing microstructures and facilitating medical diagnosis [5–7,15,21,31].

3.3 Overview

In this study, we propose a P-PDM for quantitative characterization of cancer cells under a low-resolution and wide-field system. First, we took Mueller matrix images of the pathological tissues under a 4× objective and H&E images under a 20× objective. The element m11 of the Mueller matrix represents the intensity image of the sample, which can be used as the fixed image for pixel-level registration with the H&E image of the sample (Fig. 2(a)). The pixel-level registration adopts the affine transformation method [37] and generates the transformation matrix T. In the H&E image, the target microstructures were manually labelled by experienced pathologists. The ROIs were selected in the overlapping area of the two images (Fig. 2(b)). Then sets of polarization parameters derived in earlier studies were calculated from the Mueller matrix of each ROI as the PBPs (Fig. 2(c)). The labels were transformed by the matrix T to produce label masks, which were mapped onto the PBP images to select target pixels of the three microstructures (Fig. 2(d)). Next, we selected the best PBP groups from MMPD, MMT, MMRI, and MMAP as the input features of the selected classifiers (Fig. 2(e)). The classifiers used in this study include the P-PDM based on L0 regularization, the P-PDM based on L1 regularization, an artificial neural network (ANN), and LDA. The P-PDM outputs two sigmoid-transformed PFPs, which can be used for quantitative characterization of the cells and C cells in the ROI (Fig. 2(f)).

Fig. 2. The framework for quantitative characterization of cancer cells. Panels (a)–(f) outline the steps from the input of the Mueller matrix image and H&E image of the pathological tissue sample to the output of sigmoid-transformed PFPs for obtaining diagnostic indicators of cancer cells.


3.4 Algorithm architecture

3.4.1 Image registration

In this study, the input data of the P-PDM were the target pixel values in the PBP images. These pixels were selected by the label maps in the corresponding H&E images. The H&E images were obtained under a 20× objective, consistent with the objective lens used in clinical observation of cell morphology for evaluation and diagnosis. Therefore, the resolution of the H&E images is sufficient for pathologists to label the target microstructures, and the labelling is rarely affected by error sources such as saturated pixels and mixed typology information. To produce the mask that is mapped directly onto the sample’s PBP images to select pixels, we need pixel-by-pixel registration between the sample’s Mueller matrix image and H&E image, which yields the transformation matrix T. As shown in Fig. 2(a), we adopted the affine transformation method to transform the H&E image (moving image) to match the m11 image (fixed image). Specifically, we called the cpselect function to start the Control Point Selection Tool in MATLAB. After several feature points were selected in the m11 image and the corresponding H&E image, the fitgeotrans function was called with the transformation type set to “affine” to produce the transformation matrix T. The pathologists’ markings of the target microstructure on the H&E image and the matrix T were then passed to the imwarp function to produce the mask used to select target pixels in the PBPs as input of the classifiers.
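The registration in this work was done with MATLAB's cpselect, fitgeotrans, and imwarp. As a language-neutral illustration of what the fitting step computes, the following Python sketch estimates an affine matrix from matched control points by least squares. The function names `fit_affine` and `apply_affine` are hypothetical, and the matrix layout uses the column-vector convention, which need not match the layout MATLAB returns.

```python
import numpy as np

def fit_affine(moving_pts, fixed_pts):
    """Least-squares affine transform that maps moving_pts onto fixed_pts.

    Both inputs are (N, 2) arrays of matched (x, y) control points, N >= 3.
    Returns a 3x3 matrix T with [x', y', 1]^T = T @ [x, y, 1]^T.
    """
    moving = np.asarray(moving_pts, dtype=float)
    fixed = np.asarray(fixed_pts, dtype=float)
    if moving.shape != fixed.shape or moving.shape[0] < 3:
        raise ValueError("need at least three matched control points")
    A = np.hstack([moving, np.ones((moving.shape[0], 1))])  # (N, 3)
    # Solve A @ P = fixed for the 3x2 parameter block P.
    P, *_ = np.linalg.lstsq(A, fixed, rcond=None)
    T = np.eye(3)
    T[:2, :] = P.T
    return T

def apply_affine(T, pts):
    """Map (N, 2) points through the 3x3 affine matrix T."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return (homog @ T.T)[:, :2]
```

With more than three control points the system is overdetermined, so small errors in manually picked points are averaged out rather than fitted exactly.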

3.4.2 Polarization-based probabilistic discriminative model

Here, the designed P-PDM consists of two L0 regularized linear LR classifiers connected by prior knowledge. The model is designed to first extract all the cell pixels and then isolate the cancerous-cell pixels from the normal-cell pixels. LR classifiers were adopted as nodes in the P-PDM because (i) the LR classifier is a discriminative model that does not assume any prior distribution of the input data [38], which matters here since the probability distributions of the input features, i.e., the PBPs, in certain classes may not be Gaussian; and (ii) some studies have preliminarily demonstrated the ability of LR classifiers to characterize different biological tissue samples based on polarization data [39].

The L0 regularization is imposed using orthogonal matching pursuit (OMP) [40]. Given the number of allowed parameters to be used in the linear model, OMP logistic regression finds the sparse solution to the classification problem using a greedy approach [40]. The hyperparameter for this model is the number of parameters allowed.
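A greedy selection of this kind can be sketched as follows. This is an illustrative numpy implementation under stated assumptions, not the implementation in [40] or the code used in this study: features are assumed to be standardized, the next feature is chosen by the magnitude of the log-loss gradient at the current fit, and each refit uses Newton steps with a tiny ridge term added only for numerical stability. The names `omp_logistic` and `n_nonzero` are hypothetical.

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def _fit_logistic(X, y, n_iter=50, ridge=1e-6):
    """Newton fit of a logistic model with intercept; returns [b, w1, ...]."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = _sigmoid(A @ w)
        grad = A.T @ (p - y)
        s = p * (1.0 - p)
        H = (A * s[:, None]).T @ A + ridge * np.eye(A.shape[1])
        step = np.linalg.solve(H, grad)
        w -= step
        if np.max(np.abs(step)) < 1e-8:
            break
    return w

def omp_logistic(X, y, n_nonzero):
    """Greedy logistic regression with at most n_nonzero nonzero coefficients.

    Returns (selected feature indices, full coefficient vector, intercept).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    w = _fit_logistic(X[:, np.array(selected, dtype=int)], y)
    for _ in range(n_nonzero):
        idx = np.array(selected, dtype=int)
        p = _sigmoid(w[0] + X[:, idx] @ w[1:])
        grad = X.T @ (p - y) / len(y)   # log-loss gradient per feature
        grad[idx] = 0.0                 # never reselect a feature
        selected.append(int(np.argmax(np.abs(grad))))
        w = _fit_logistic(X[:, np.array(selected, dtype=int)], y)
    coef = np.zeros(X.shape[1])
    coef[np.array(selected, dtype=int)] = w[1:]
    return selected, coef, w[0]
```

The only hyperparameter is `n_nonzero`, which plays the role of the number of allowed parameters described above.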

Note that the output of each node should resemble a probability or posterior probability to enable the classification post-processing. The raw output of a linear model is simply a combination of input features rather than a calibrated probability. Therefore, we introduced the sigmoid function, a bounded differentiable real function ranging from 0 to 1, to map the outputs of the nodes into probabilities [41]. Bayes’ theorem is then applied to derive the sigmoid-transformed PFPs by multiplying these probabilities. The resulting PFP value of a pixel is the unconditional probability that the pixel belongs to the class of interest. Specifically:

Given a pixel from the ROI image, the aim of the P-PDM is to predict the class to which this pixel belongs according to its series of PBPs. The P-PDM estimates the probability that a given pixel belongs to the cells class or the C cells class, and outputs two sigmoid-transformed PFPs for the characterization of the respective target microstructures. The prior knowledge that cancer cells are a subclass of cells is fully utilized in designing the graphical model. Specifically, given a randomly sampled pixel ${\textrm{x}_{\textrm{i,j}}}$ from the ROI image, $\textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ is the probability that ${\textrm{x}_{\textrm{i,j}}}$ belongs to the cells class (denoted by ${\textrm{S}_{\textrm{Cells}}}$). Given a pixel ${\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}$, the conditional probability that the pixel belongs to the C cells class (denoted by ${\textrm{S}_{\textrm{C cells}}}$) is $\textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$. The probability formulas of interest are sigmoid functions of PBP combinations, given by

$$\begin{aligned} P(x_{i,j} \in S_{\textrm{Cells}}) &= \frac{1}{1 + \exp\left(-PFP(x_{i,j} \in S_{\textrm{Cells}})\right)} \\ P(x_{i,j} \in S_{\textrm{C cells}} \mid x_{i,j} \in S_{\textrm{Cells}}) &= \frac{1}{1 + \exp\left(-PFP(x_{i,j} \in S_{\textrm{C cells}} \mid x_{i,j} \in S_{\textrm{Cells}})\right)} \end{aligned} \tag{1}$$

where $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ is a linear combination of PBPs, obtained by training an L0 regularized LR classifier (named ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$, in which C is the hyperparameter of the classifier) for distinguishing cells from other microstructures in pathological tissues. $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ in (1) is the linear combination of PBPs learnt by an L0 regularized LR classifier (named ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$) for the recognition of cancer cells given that the pixel belongs to the cells class.

As mentioned in Section 3.1, pathologists labelled three target microstructures in the ROIs (N tissues, N cells, and C cells) in breast cancer and liver cancer pathological tissues. During the training process, at least 10000 pixels were randomly sampled from each target microstructure, carrying the feature information of the labelled area. Even if a small number of pixels contained artifacts or the target region contained some noise, these pixels were included in training to improve the robustness of the model. Grid search based on cross validation was used to determine the hyperparameter C of the LR classifiers, which is the number of parameters used in the OMP model.

In each fold of the cross validation, one ROI is selected as the testing set, and the rest form the training set. For the ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ node, which differentiates cells from other structures, the L0 regularized LR classifier is trained using the selected PBPs of the different structures: non-cell tissues (N tissues) are labelled as the negative class, and non-cancer cells and cancer cells (N cells and C cells) are labelled as the positive class. For ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$, which differentiates C cells from N cells, the classifier is trained using the N cells and C cells data set, labelled as normal and cancerous, respectively.
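The leave-one-ROI-out scheme described above can be written as a small generator. This is a minimal sketch in plain Python; the function name `leave_one_roi_out` and the use of integer ROI identifiers are assumptions for illustration.

```python
def leave_one_roi_out(roi_ids):
    """Yield (train_ids, test_id) pairs, holding out exactly one ROI per fold."""
    roi_ids = list(roi_ids)
    for i, test_id in enumerate(roi_ids):
        # All ROIs except the i-th one form the training set of this fold.
        yield roi_ids[:i] + roi_ids[i + 1:], test_id
```

For example, iterating over `leave_one_roi_out(range(54))` produces one fold per breast cancer ROI, each with 53 training ROIs.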

For ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ training, we randomly sampled 10000 pixels from the N cells class and 10000 pixels from the C cells class and labelled them as the positive class. Meanwhile, 20000 pixels were sampled from the N tissues class and labelled as the negative class. The input data of ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ are the PBPs of these pixels, and the output is $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, which is a polarization parameter for quantitative characterization of cells (vs. non-cell tissues) in pathological tissues. For the identification of cells in breast cancer and liver cancer, the number of parameters used was chosen as 2, based on grid search with the OMP model, cross validation, and prior research [5].

For ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ training, we labelled 10000 pixels sampled from the C cells class as the positive class, and labelled 10000 pixels sampled from the N cells class as the negative class. After training, ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ produces $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, which can be used to predict if a pixel belongs to the C cells class or not, given that the pixel belongs to the cell class. After grid search based on the cross validation, the number of parameters used was determined as 2 in breast cancer and liver cancer tissues.
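The construction of the two balanced training sets can be sketched as below. This is an illustration only: the function names, the `(n_pixels, n_PBPs)` array layout, and the reuse of the same sampled cell pixels for both nodes are assumptions, since the text does not state whether the two nodes share samples.

```python
import numpy as np

def sample_rows(pixels, n, rng):
    """Randomly draw n rows (pixels) without replacement."""
    idx = rng.choice(pixels.shape[0], size=n, replace=False)
    return pixels[idx]

def build_node_datasets(n_tissue, n_cell, c_cell, n_per_cell_class=10000, seed=0):
    """Return ((X, y) for the cells node, (X, y) for the C cells node).

    Each input is an (n_pixels, n_PBPs) array of labelled pixels of one class.
    """
    rng = np.random.default_rng(seed)
    n = n_per_cell_class
    n_cells = sample_rows(n_cell, n, rng)
    c_cells = sample_rows(c_cell, n, rng)
    tissues = sample_rows(n_tissue, 2 * n, rng)
    # Cells node: N cells and C cells are positive, N tissues are negative.
    X_cells = np.vstack([n_cells, c_cells, tissues])
    y_cells = np.concatenate([np.ones(2 * n), np.zeros(2 * n)])
    # C cells node: C cells are positive, N cells are negative.
    X_c = np.vstack([c_cells, n_cells])
    y_c = np.concatenate([np.ones(n), np.zeros(n)])
    return (X_cells, y_cells), (X_c, y_c)
```

Both sets are class-balanced by construction, so neither node is biased toward the more abundant microstructure.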

By implementing the linear LR classifiers, the PBPs combinations $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ and $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ can be obtained, which are used to calculate the two probability formulas in (1). The output of the P-PDM is the final sigmoid-transformed PFPs, which are defined as the products of probability formulas, calculated as

$$\begin{aligned} PFP_{\textrm{Cells}} &= P(x_{i,j} \in S_{\textrm{Cells}}) \\ PFP_{\textrm{C cells}} &= P(x_{i,j} \in S_{\textrm{C cells}}) = P(x_{i,j} \in S_{\textrm{Cells}}) \times P(x_{i,j} \in S_{\textrm{C cells}} \mid x_{i,j} \in S_{\textrm{Cells}}) \end{aligned} \tag{2}$$

where $\textrm{PF}\textrm{P}_\textrm{Cells}$ and $\textrm{PF}\textrm{P}_\textrm{C cells}$ are the polarization feature parameters that specifically and quantitatively identify cells and cancer cells in pathological tissues, respectively. We can observe from (1) and (2) that identifying more detailed and specific microstructures requires more nodes, resulting in a more complex PFP. Each node is represented by PBPs with clear physical meanings selected by an OMP-based LR classifier, so the final PFPs, which consist of a number of interconnected nodes, allow deep interpretation of physical meanings. The P-PDM was implemented with the open-source library Scikit-learn in Python version 3.7.6 on an Intel Core i7-9700 CPU @ 3.00 GHz.
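To make Eqs. (1) and (2) concrete, the following sketch evaluates the two PFPs for a single pixel. It is a minimal illustration in plain Python: the function names, the dictionary representation of PBPs, and every coefficient value used with it are hypothetical and do not come from the trained models.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_score(pbps, coefs, intercept):
    """Linear combination of the PBPs named in coefs, plus an intercept."""
    return intercept + sum(coefs[name] * pbps[name] for name in coefs)

def p_pdm(pbps, cells_node, c_cells_node):
    """Return (PFP_Cells, PFP_C_cells) for one pixel.

    Each node is a (coefs, intercept) pair; coefs maps PBP names to weights.
    """
    p_cells = sigmoid(linear_score(pbps, *cells_node))            # Eq. (1), first line
    p_c_given_cells = sigmoid(linear_score(pbps, *c_cells_node))  # Eq. (1), second line
    return p_cells, p_cells * p_c_given_cells                     # Eq. (2)
```

Because the second output is the first multiplied by a probability, PFP_C cells can never exceed PFP_Cells, which encodes the prior knowledge that cancer cells are a subclass of cells.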

3.4.3 Artificial neural network and linear discriminant analysis

Linear discriminant analysis (LDA) is one of the most basic machine learning models for classification tasks. Given two groups of data to be separated and classified, the LDA algorithm finds a linear hyperplane in the feature space that separates the groups, assuming Gaussian distributions [42]. The viability of LDA for PFP extraction was presented and discussed in our previous work, where it provided clear interpretability and satisfactory performance for simple structure identification tasks [5]. However, as we move on to identifying more complex pathological structures such as cancerous cells, the performance of the LDA algorithm suffers because of its linearity. To implement LDA in our work, we used the discriminant analysis package in the Scikit-learn library.

The artificial neural network (ANN) is at the other end of the model-complexity spectrum. Given an input data point, an ANN processes it with a layered directed graph of hidden units, where each hidden unit is essentially a tunable linear model followed by a non-linear activation function, designed to mimic biological neural activity [43]. The coefficients in the hidden units are tuned using gradient descent during training. The ANN is often the baseline model for supervised learning tasks. It benefits from its nonlinear decision boundary and is capable of performing complex classification tasks, but it is analogous to a black box and lacks model interpretability. For implementation, we first normalized the input features and then trained the model using MLPClassifier in Scikit-learn.

Our proposed P-PDM sits between the two ends of this spectrum: it offers good interpretability thanks to its model structure inspired by prior knowledge, while maintaining performance comparable to that of the ANN. For implementation details of LDA and ANN, see [5].

For comparison with the P-PDM, we adopted a three-class ANN classifier, which can be used to identify cells, C cells, and N tissues in pathological tissues. Cross validation was used to determine the hyperparameters of the three-class ANN classifier. After grid search, we obtained the parameter settings of the two optimized ANN classifiers for the cancer cell recognition tasks in breast and liver cancer tissues, respectively. For breast cancer, hidden_layer_sizes was set to (50) and learning_rate_init to 0.00001. For liver cancer, hidden_layer_sizes was set to (75,75) and learning_rate_init to 0.01. In addition, for comparison with our previous study, we employed four two-class LDA classifiers for the quantitative characterization of the cells and C cells in breast and liver cancer tissues. The data set and labels of the ANN and LDA classifiers are the same as those of the P-PDM. ANN and LDA were implemented with the open-source library Scikit-learn in Python version 3.7.6 on an Intel Core i7-9700 CPU @ 3.00 GHz.

4. Results

Before analyzing the experimental results, recall the three types involved: non-cell tissues (N tissues), non-cancer cells (N cells), and cancer cells (C cells), with N cells and C cells belonging to cells. The output of P-PDM is two sigmoid-transformed PFPs – $\textrm{PF}\textrm{P}_\textrm{Cells}$ and $\textrm{PF}\textrm{P}_\textrm{C cells}$, which have great potential for specific and quantitative characterization of cells and C cells in pathological tissues, respectively.

4.1 Selection of polarimetry basis parameters

Here, we considered the four PBP groups commonly adopted in polarimetry (Section 3 in Supplement 1). However, there are significant correlations between these observables. To ensure the robustness and convergence of the model, we studied the selection of input PBPs to improve the linear independence of the input variables. Specifically, PBP groups were selected before being used as input of the machine learning classifiers.

During cross validation, the four groups of PBPs, i.e., MMPD, MMT, MMRI, and MMAP, were fed into the proposed P-PDM, individually and in combination. We calculated and analyzed the average accuracy of classifying the different cells when different PBP groups were used as input data. Based on the classification performance, we determined which PBP groups should be selected, and demonstrated the necessity and advantages of combining them. As shown in Table 1, (i) in breast tissues, the combination of MMT and MMRI yields the best overall classification accuracy for cell and cancer cell detection; and (ii) in liver tissues, the classifier employing the MMPD, MMT, and MMAP groups as input data has the most balanced performance for the identification of cells and cancer cells. Therefore, MMT and MMRI were selected as the input parameter groups in breast tissues, while the MMPD, MMT, and MMAP parameter groups were selected in liver tissues.

Table 1. Performance of the P-PDM with different PBPs groups as input data^a


4.2 Physical interpretability of nodes

Figures 3 and 4 respectively summarize the outputs of ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ and ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ in breast cancer and liver cancer pathological tissues. These outputs are simplified linear functions of the PBPs and can be used to calculate the probability formulas in (1). Once the probability formulas describing the nodes of the P-PDM are obtained, the final sigmoid-transformed PFPs can be derived by (2). For L0 regularization, the hyperparameter is the number of nonzero parameters, and the corresponding results are shown in Fig. 3(a) and (c). For L1 regularization, the hyperparameter C is the inverse of the regularization strength, and the results of the grid search are shown in Fig. 3(b) and (d). In Fig. 3(e)–(h), the x-axis lists the input PBPs and the y-axis lists the rounds of cross validation, so that each row shows the linear combination of PBPs with optimized coefficients for one round. The color bar represents the coefficient values of the PBPs. The training set and test set vary from round to round, so the resulting PFPs vary as well. For L0 regularization using the OMP method, the selected PBPs appear stable, meaning that the same PBPs are selected with similar coefficients. In comparison, the L1 regularization method produces unstable results: in each round of cross validation, different PBPs are selected with varying coefficients. The experimental results indicate that the P-PDM based on L0 regularization can derive the simplest form of $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ as the product of sigmoid functions of simplified and stable PBP combinations. These simplified and stable parameters not only quantitatively characterize cancer cells in complex pathological tissues, but also pave the way for physical interpretation.

Fig. 3. Linear combination of PBPs used for calculating probability formulas which identify the node $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ of P-PDM in breast (a), (b), (e), and (f) and liver cancer pathological tissues (c), (d), (g), and (h). Panels (a)–(d) show the results of hyperparameter selection, and panels (e)–(h) show the linear coefficients obtained in each round of cross validation, for different pathological samples (breast or liver cancer) and regularization methods (L0 or L1).


Fig. 4. Linear combination of PBPs used for calculating probability formulas which identify the node $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ of P-PDM in (a) breast cancer and (b) liver cancer pathological tissues. Each row is the resulting coefficient for each cross validation round.


The physical interpretability of the nodes in the P-PDM based on L0 regularization comes from the PBPs, whose physical meanings are clear. The simplified linear functions of the PBPs describe the correlation between polarization characteristics and the pathological features of interest, and can explain these nodes of the model to an extent. Fig. 4 presents the first node of the P-PDM, $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, whose sigmoid function is $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$, used to identify cells from tissues. From it we can conclude that (i) in breast cancer tissues, pixels with high anisotropy (t₁) and low linear birefringence (r_L) are more likely to be cells. Of note, both r_L and q_L are parameters related to linear birefringence, and their values are equal in the case of transverse mirror symmetry. Table 1 shows that the accuracy obtained with the transverse mirror symmetry related parameters as input is about 40%, indicating that the mirror symmetry may not be broken in some breast cells. Therefore, r_L and q_L are used interchangeably in a few ROIs in Fig. 4(a); and (ii) in liver cancer tissues, the coefficient of t₁ is high and that of δ is low, meaning that cells have strong anisotropy and low linear birefringence.

In addition, $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ can be obtained by multiplying the sigmoid functions of the first node $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ and the second node $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, which may explain how the polarization features vary from normal cells to cancer cells: (i) in breast cancer tissues, a decrease in linear (q_L) and circular (β) birefringence signals cancer progression; and (ii) in liver tissues, cancer cells have strong transverse mirror asymmetry (P_TMS) and polarizance (b).
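The product rule described above can be written out in a few lines. This is a minimal sketch that assumes the standard logistic sigmoid and treats the two linear node outputs as given numbers. The function and argument names are illustrative and do not come from the paper, whose Eqs. (1) and (2) are not reproduced in this section.

```python
import math


def sigmoid(z):
    """Standard logistic function, mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def pfp_cells(l_cells):
    """First node: probability that a pixel belongs to a cell."""
    return sigmoid(l_cells)


def pfp_c_cells(l_cells, l_c_cells_given_cells):
    """Chain rule: P(cancer cell) = P(cell) * P(cancer cell | cell)."""
    return sigmoid(l_cells) * sigmoid(l_c_cells_given_cells)


# A pixel with neutral scores on both nodes gets 0.5 * 0.5 = 0.25.
example = pfp_c_cells(0.0, 0.0)
```

Because the second factor is at most 1, the cancer-cell value of a pixel can never exceed its cell value, which matches the nested definition of the two sets.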

4.3 Quantitative characterization results

The performance of the two final PFPs can be validated on the ROIs from the test samples. The PFPs are calculated from the PBPs of each ROI. Using the PFPs, the cells in the ROIs can be identified and their types can be predicted quantitatively. In Fig. 5, we summarize the quantitative characterization results for different types of cells obtained from the final PFPs in breast cancer and liver cancer pathological tissues. The ROIs presented in Fig. 5 were selected randomly from the test set. They do not represent the performance in all cases and are shown only to illustrate part of the characterization results of the PFPs. In Fig. 5, the H&E images are the ground truth for the PFPs’ characterization results; the corresponding cell areas lie inside the black solid line and outside the blue solid line, as labelled by pathologists. From Fig. 5, we can observe the following patterns. First, in Fig. 5(a) and (c), the two ROIs are composed of N cells and N tissues. In these ROIs, high values of $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ indicate the positions of cells. Meanwhile, $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, which is sensitive to cancer cells, has almost no high values in these regions of healthy breast and liver tissues. Second, most cells in Fig. 5(b) and (d) are C cells. Therefore, in the ROIs composed of C cells and N tissues, the 2D images of $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ and $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ both show obvious contrast at the cell positions. Third, there are also some misclassified pixels. For example, in the healthy liver tissue of Fig. 5(c), there are a few highlighted pixels in the 2D image of $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, which means that the PFP predicts pixels belonging to non-cancer cells as cancer cells.

Fig. 5. Quantitative characterization results of different types of cells using the two final PFPs, $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ and $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$: (a) and (c) are ROIs composed of N cells and N tissues in breast and liver cancer tissues, respectively; (b) and (d) are ROIs composed of C cells and N tissues in breast and liver cancer tissues, respectively. In each of (a)–(d), the first column presents the H&E image, in which the corresponding cell areas lie inside the black solid line and outside the blue solid line, and the second and third columns show the images of the two final PFPs.


Based on the above analysis, we can conclude that (i) $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ has great potential for identifying all kinds of cells, and $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, with its stronger specificity, may be considered a powerful tool for quantitative recognition of cancer cells in the two pathological tissues; and (ii) taking full advantage of polarization imaging, the PFPs’ characterization ability depends less on image resolution, which makes cancer cell screening possible under a 4× objective lens.

4.4 Performance of classifiers

The performance of the two PFPs derived by the P-PDM based on L0 regularization, the two LDA classifiers, and the three-class ANN classifier was evaluated at the pixel level on the test set. The training and test process is described in Section 3.4. The average values of accuracy, precision, and recall were calculated on test data that do not overlap with the training set. Table 2 summarizes the performance of cell classification in pathological tissues of breast cancer and liver cancer using P-PDM, LDA, and ANN, from which we can observe that (i) in both pathological tissues, the performance differences between the P-PDM and the ANN are moderate, indicating that a model with only a few nodes can perform comparably to a complex ANN; (ii) for the identification of cancer cells, LDA has limited ability to recognize the target microstructures. For example, for the quantitative characterization of C cells in liver cancer, the accuracy of the P-PDM reaches 0.854 while that of LDA is 0.738; and (iii) the P-PDM achieves high performance with the simplest model possible, using no more than four polarimetry basis parameters for the classification of cancer cells, which is much simpler than other machine learning methods and opens up the possibility of physical interpretation of the result. Of note, although the preliminary experimental results point in a positive direction, a larger database is still needed to validate them.
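For reference, the three pixel-level scores can be computed from the counts of true and false positives and negatives. The sketch below uses the standard definitions; the function name `pixel_metrics` and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
def pixel_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for flattened pixel labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # cancer-cell pixel correctly flagged
        elif pred == positive:
            fp += 1          # other pixel wrongly flagged
        elif truth == positive:
            fn += 1          # cancer-cell pixel missed
        else:
            tn += 1          # other pixel correctly left unflagged
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy example: 8 pixels, 4 of which are truly positive.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
guess = [1, 1, 0, 0, 0, 1, 0, 1]
scores = pixel_metrics(truth, guess)
```

Precision penalizes false alarms and recall penalizes misses, so a classifier can trade one against the other while keeping a similar accuracy.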

Table 2. Performance of P-PDM, LDA, and ANN on the microstructural feature recognition


5. Discussion and concluding remarks

The PFPs derived by the proposed P-PDM show potential as indicators for quantitative characterization of cancer cells in pathological tissues with maximal physical interpretability. We designed the P-PDM by integrating prior knowledge, which keeps the model simple, with only a few nodes. The nodes are built using L0 regularized LR classifiers based on orthogonal matching pursuit and are defined by linear combinations of PBPs with physical meanings. We connected the nodes by employing conditional probability and Bayes’ theorem. Therefore, each node in this model can be dissected to extract its physical meaning and its correspondence to microstructural features. Such a P-PDM allows us to analyze how polarization features vary between healthy and cancerous cells. We demonstrated the viability of the P-PDM on pathological tissues of breast cancer and liver cancer. For each tissue type, two PFPs were derived that characterize the cells and the cancer cells, respectively, with satisfactory performance scores and with the simplest possible form, using no more than 4 PBPs. We also compared the proposed model with a three-class ANN. According to Table 2, in breast tissues the P-PDM scores slightly higher than the ANN on accuracy, recall, and precision; in liver tissues its recall is lower than that of the ANN (0.875 versus 0.959) but its precision is higher (0.881 versus 0.798), so its overall accuracy is on par (0.854 versus 0.857). Therefore, the proposed model has performance comparable to the ANN but is computationally much simpler, considering that its number of parameters is orders of magnitude smaller. Notably, the PFPs can work under a 4× objective and separate the cells into more specific categories that are otherwise distinguishable only under high magnification, since the contrast of polarization imaging depends less on imaging resolution. This may pave the way for rapid scanning and quantitative analysis of whole pathological sections in a low-resolution, wide-field system, laying a cornerstone for primary screening of cancer cells in clinical practice.

The limitations of this model are twofold. First, the manual labels provided by pathologists are expensive and labor intensive. Second, the performance of the P-PDM decreases as the model becomes deeper, because the error in each node propagates through the entire model. To potentially address the first limitation and improve the P-PDM, one could take advantage of semi-supervised learning methods, which utilize unlabeled data to improve model performance.
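The second limitation can be illustrated with a toy calculation. The sketch assumes, purely for illustration, that a pixel is classified correctly only if every node along its path decides correctly, and that node errors are independent. Neither assumption is claimed by the paper, and the function name `path_accuracy` is hypothetical.

```python
from math import prod


def path_accuracy(node_accuracies):
    """Probability that every node on a path is correct, assuming independence."""
    return prod(node_accuracies)


# Two nodes at 90% each give 81%; a third node lowers it further.
two_nodes = path_accuracy([0.9, 0.9])
three_nodes = path_accuracy([0.9, 0.9, 0.9])
```

Under these assumptions the end-to-end accuracy shrinks multiplicatively with depth, which is one way to read the statement that errors propagate through the model.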

In summary, the proposed P-PDM leverages the strength of polarization imaging to classify cancer cells at pixel level, making the identification quantitative, objective, interpretable, and less dependent on imaging resolution.

Funding

National Natural Science Foundation of China (11974206, 61527826); Shenzhen Bureau of Science and Innovation (JCYJ20170412170814624); Beijing Municipal Administration of Hospitals’ Youth Programme (QML20191206).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. F. Bray, M. Laversanne, E. Weiderpass, and I. Soerjomataram, “The ever-increasing importance of cancer as a leading cause of premature death worldwide,” Cancer 127(16), 3029–3030 (2021).

2. H. Sung, J. Ferlay, R. L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal, and F. Bray, “Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries,” CA Cancer J. Clin. 71(3), 209–249 (2021).

3. B. Saikia, K. Gupta, and U. N. Saikia, “The modern histopathologist: in the changing face of time,” Diagn. Pathol. 3(1), 25–29 (2008).

4. Y. Rivenson, H. Wang, Z. Wei, K. D. Haan, Y. Zhang, Y. Wu, H. Günaydın, J. E. Zuckerman, T. Chong, A. E. Sisk, L. M. Westbrook, W. D. Wallace, and A. Ozcan, “Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning,” Nat. Biomed. Eng. 3(6), 466–477 (2019).

5. Y. Dong, J. Wan, L. Si, Y. Meng, Y. Dong, S. Liu, H. He, and H. Ma, “Deriving polarimetry feature parameters to characterize microstructural features in histological sections of breast tissues,” IEEE Trans. Biomed. Eng. 68(3), 881–892 (2021).

6. Y. Dong, J. Wan, X. Wang, J. H. Xue, J. Zou, H. He, P. Li, A. Hou, and H. Ma, “A Polarization-imaging-based machine learning framework for quantitative pathological diagnosis of cervical precancerous lesions,” IEEE Trans. Med. Imaging 40(12), 3728–3738 (2021).

7. C. He, H. He, J. Chang, B. Chen, H. Ma, and M. J. Booth, “Polarisation optics for biomedical and clinical applications: a review,” Light: Sci. Appl. 10(1), 194 (2021).

8. N. Ghosh and I. A. Vitkin, “Tissue polarimetry: concepts, challenges, applications, and outlook,” J. Biomed. Opt. 16(11), 110801 (2011).

9. V. V. Tuchin, “Polarized light interaction with tissues,” J. Biomed. Opt. 21(7), 071114 (2016).

10. S. Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A 13(5), 1106–1113 (1996).

11. H. He, R. Liao, N. Zeng, P. Li, Z. Chen, X. Liu, and H. Ma, “Emerging new tool for characterizing the microstructural feature of complex biological specimen,” J. Lightwave Technol. 37(11), 2534–2548 (2019).

12. P. Li, D. Lv, H. He, and H. Ma, “Separating azimuthal orientation dependence in polarization measurements of anisotropic media,” Opt. Express 26(4), 3791–3800 (2018).

13. P. Li, Y. Dong, J. Wan, H. He, T. Aziz, and H. Ma, “Polaromics: deriving polarization parameters from a Mueller matrix for quantitative characterization of biomedical specimen,” J. Phys. D: Appl. Phys. 55(3), 034002 (2022).

14. P. Schucht, H. R. Lee, M. H. Mezouar, E. Hewer, A. Raabe, M. Murek, I. Zubak, J. Goldberg, E. Kövari, A. Pierangelo, and T. Novikova, “Visualization of white matter fiber tracts of brain tissue sections with wide-field imaging Mueller polarimetry,” IEEE Trans. Med. Imaging 39(12), 4376–4382 (2020).

15. Y. Dong, J. Qi, H. He, C. He, S. Liu, J. Wu, D. S. Elson, and H. Ma, “Quantitatively characterizing the microstructural features of breast ductal carcinoma tissues in different progression stages by Mueller matrix microscope,” Biomed. Opt. Express 8(8), 3643–3655 (2017).

16. Y. Dong, H. He, W. Sheng, J. Wu, and H. Ma, “A quantitative and non-contact technique to characterize microstructural variations of skin tissues during photo-damaging process based on Mueller matrix polarimetry,” Sci. Rep. 7(1), 14702 (2017).

17. N. Ghosh, M. Wood, and I. A. Vitkin, “Mueller matrix decomposition for extraction of individual polarization parameters from complex turbid media exhibiting multiple scattering, optical activity, and linear birefringence,” J. Biomed. Opt. 13(4), 044036 (2008).

18. Y. Liu, Y. Dong, S. Lu, R. Meng, and H. Ma, “Comparison between image texture and polarization features in histopathology,” Biomed. Opt. Express 12(3), 1593–1608 (2021).

19. Y. Shen, R. Huang, H. He, S. Liu, Y. Dong, J. Wu, and H. Ma, “Comparative study of the influence of imaging resolution on linear retardance parameters derived from the Mueller matrix,” Biomed. Opt. Express 12(1), 211–225 (2021).

20. R. Rubinstein, M. Zibulevsky, and M. Elad, “Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit,” Tech. Rep. 40 (Computer Science Department, Israel Institute of Technology, 2008).

21. Y. Wang, H. He, J. Chang, N. Zeng, S. Liu, M. Li, and H. Ma, “Differentiating characteristic microstructural features of cancerous tissues using Mueller matrix microscope,” Micron 79, 8–15 (2015).

22. D. H. Goldstein, “Mueller matrix dual-rotating retarder polarimeter,” Appl. Opt. 31(31), 6676–6683 (1992).

23. S. I. Lee, H. Lee, P. Abbeel, and A. Y. Ng, “Efficient L1 regularized logistic regression,” in 21th National Conference on Artificial Intelligence Conference (Association for the Advance of Artificial Intelligence) (2014), paper 06.

24. V. Dremin, O. Sieryi, M. Borovkova, J. Näpänkangas, I. Meglinski, and A. Bykov, “Histological imaging of unstained cancer tissue samples by circularly polarized light,” in European Conferences on Biomedical Optics 2021 (ECBO) (2021), paper EM3A.3.

25. K. M. Sindhoora, K. U. Spandana, D. Ivanov, E. Borisova, U. Raghavendra, S. Rai, S. P. Kabekkodu, K. K. Mahato, and N. Mazumder, “Machine-learning-based classification of Stokes-Mueller polarization images for tissue characterization,” in Journal of Physics: Conference Series (2021), paper 012045.

26. X. Zhou, L. Ma, W. Brown, J. V. Little, A. Y. Chen, L. L. Myers, B. D. Sumer, and B. Fei, “Automatic detection of head and neck squamous cell carcinoma on pathologic slides using polarized hyperspectral imaging and machine learning,” Proc. SPIE 11603, 116030Q (2021).

27. C. Heinrich, J. Rehbinder, A. Nazac, B. Teig, A. Pierangelo, and J. Zallat, “Mueller polarimetric imaging of biological tissues: classification in a decision-theoretic framework,” J. Opt. Soc. Am. A 35(12), 2046–2057 (2018).

28. I. Deyan, D. Viktor, G. Tsanislava, B. Alexander, N. Tatiana, O. Razvigor, and M. Igor, “Polarization-Based Histopathology Classification of Ex Vivo Colon Samples Supported by Machine Learning,” Front. Phys. 9, 814787 (2022).

29. W. Wang, L. G. Lim, S. Srivastava, J. Bok-Yan So, A. Shabbir, and Q. Liu, “Investigation on the potential of Mueller matrix imaging for digital staining,” J. Biophotonics 9(4), 364–375 (2016).

30. Y. Zhao, J. Zang, M. Reda, K. Feng, G. Cheng, Z. Ren, S. G. Kong, S. Su, H. X. Huang, and H. Huang, “Detecting giant cell tumor of bone lesions using Mueller matrix polarization microscopic imaging and multi-parameters fusion network,” IEEE Sens. J. 20(13), 7208–7215 (2020).

31. L. Xia, Y. Yao, Y. Dong, M. Wang, H. Ma, and L. Ma, “Mueller polarimetric microscopic images analysis based classification of breast cancer cells,” Opt. Commun. 475, 126194 (2020).

32. D. Ma, Z. Lu, L. Xia, Q. Liao, W. Yang, H. Ma, R. Liao, L. Ma, and Z. Liu, “MuellerNet: a hybrid 3D–2D CNN for cell classification with Mueller matrix images,” Appl. Opt. 60(22), 6682–6694 (2021).

33. C. Roa, V. N. Du Le, M. Mahendroo, I. Saytashev, and J. Ramella-Roman, “Auto-detection of cervical collagen and elastin in Mueller matrix polarimetry microscopic images using K-NN and semantic segmentation classification,” Biomed. Opt. Express 12(24), 2236–2249 (2021).

34. R. M. A. Azzam, “Photopolarimetric measurement of the Mueller matrix by Fourier analysis of a single detected signal,” Opt. Lett. 2(6), 148–150 (1978).

35. D. H. Goldstein and R. A. Chipman, “Error analysis of a Mueller matrix polarimeter,” J. Opt. Soc. Am. A 7(4), 693–700 (1990).

36. K. M. Twietmeyer, R. A. Chipman, A. E. Elsner, Y. Zhao, and D. Vannasdale, “Mueller matrix retinal imager with optimized polarization conditions,” Opt. Express 16(26), 21339–21354 (2008).

37. B. Zitová and J. Flusser, “Image registration methods: a survey,” Image Vision Comput. 21(11), 977–1000 (2003).

38. J. H. Xue and D. M. Titterington, “Comment on “On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes”,” Neural Process Lett. 28(3), 169–187 (2008).

39. C. Rodríguez, A. V. Eeckhout, L. Ferrer, E. G. Caurel, E. G. Arnay, J. Campos, and A. Lizana, “Polarimetric data-based model for tissue recognition,” Biomed. Opt. Express 12(8), 4852–4872 (2021).

40. Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Proceedings of 27th Asilomar Conference on Signals, Systems and Computers (1993), pp. 40–44 vol. 1.

41. J. Platt, Advances in Large Margin Classifiers: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, (Massachusetts Institute of Technology, 2000), pp. 61–75.

42. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, (Springer, 2009), vol. 2, pp. 106–119.

43. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, (Springer, 2009), vol. 2, pp. 389–395.

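Table 1 data. Accuracy of the P-PDM for each combination of PBP groups (column labels as in the source table; consistent with Section 4.1, D, T, R, and P correspond to MMPD, MMT, MMRI, and MMAP, and ALL is D+T+R+P).

| PBP groups | Breast: Cells | Breast: C cells | Liver: Cells | Liver: C cells |
|---|---|---|---|---|
| D | 0.941 | 0.869 | 0.905 | 0.717 |
| T | 0.881 | 0.768 | 0.818 | 0.762 |
| R | 0.940 | 0.834 | 0.917 | 0.783 |
| P | 0.460 | 0.465 | 0.518 | 0.713 |
| D+T | 0.936 | 0.859 | 0.926 | 0.754 |
| D+R | 0.940 | 0.872 | 0.902 | 0.754 |
| D+P | 0.883 | 0.829 | 0.622 | 0.819 |
| T+R | 0.936 | 0.873 | 0.921 | 0.778 |
| T+P | 0.870 | 0.765 | 0.749 | 0.853 |
| R+P | 0.926 | 0.811 | 0.625 | 0.737 |
| T+R+P | 0.936 | 0.863 | 0.922 | 0.852 |
| D+R+P | 0.886 | 0.836 | 0.610 | 0.822 |
| D+T+P | 0.937 | 0.858 | 0.927 | 0.856 |
| D+T+R | 0.936 | 0.869 | 0.927 | 0.790 |
| ALL | 0.935 | 0.859 | 0.927 | 0.854 |

Table 2 data. Performance of P-PDM, LDA, and ANN on the recognition of C cells.

| Tissue | Accuracy: P-PDM | Accuracy: LDA | Accuracy: ANN | Recall: P-PDM | Recall: LDA | Recall: ANN | Precision: P-PDM | Precision: LDA | Precision: ANN |
|---|---|---|---|---|---|---|---|---|---|
| Breast cancer tissues | 0.871 | 0.824 | 0.846 | 0.871 | 0.863 | 0.850 | 0.883 | 0.816 | 0.852 |
| Liver cancer tissues | 0.854 | 0.738 | 0.857 | 0.875 | 0.720 | 0.959 | 0.881 | 0.706 | 0.798 |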



Biomedical Optics Express