
Increasing signal-to-noise ratio in over-determined Mueller matrices

Open Access

Abstract

This work investigates how the signal-to-noise ratio (SNR) of an over-determined Mueller matrix can be improved by changing the method of calculation. Specifically, our investigation focused on comparing the SNRs achieved using the vector methodology from the field of partial Mueller polarimetry and the matrix methodology. We use experimentally derived measurements from an investigation into the time-varying signal produced by the Mueller matrix of an electro-optic bismuth silicon oxide (BSO) crystal undergoing cyclical impact of a helium plasma ionisation wave. Our findings show that the vector methodology is superior to the matrix methodology, with a maximum SNR of 7.54 versus 4.97. We put forth that the superiority of the vector methodology is due to its greater flexibility, which results in the Mueller matrix being calculated from better-conditioned matrices and from intensity measurements with higher levels of SNR.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Polarimetry is a useful technique in the toolbox of remote sensing diagnostics. Broadly speaking, it can be divided into two sub-disciplines: Stokes polarimetry [1–7] and Mueller polarimetry [8–14]. The former is concerned with determining the polarisation state of light, and has been employed in many fields ranging from astronomy to the study of beetles [15]. Mueller polarimetry, on the other hand, is concerned with calculating how a particular medium affects the polarisation state of light; this interaction is mathematically represented as a 4x4 matrix known as the Mueller matrix. This focus on materials has proven invaluable in fields such as ellipsometry, crystallography and oncology [16–21], where the Mueller matrix of biological tissue or thin films provides useful insights into their properties [22]. Alongside these static measurements, Mueller polarimetry can also be used to make time-resolved measurements of external variables such as temperature and electric field. This functionality is added by introducing a set of crystals whose interaction with polarised light is predictably and reversibly dependent on these external factors [23,24]. In this paper we used an electro-optic crystal, whose Mueller matrix depends on the electric field through the crystal. In particular we used BSO [25], as we have previously shown that it can be used to measure the electric field of charge deposited by a plasma ionisation wave, which is an area of interest in the field of plasma catalysis. A full description of plasma catalysis is beyond the scope of this paper and is not relevant to the specific insights outlined; for our purposes it is enough to say that the electric field from the plasma creates a time-varying change in the Mueller matrix of the BSO, which can be measured in a polarimeter. This variation in time of the BSO’s Mueller matrix can be thought of as a signal, and it is this signal we are referring to whenever SNR is mentioned throughout the text.

We have previously written about using over-determination to overcome statistical noise [26], and in this paper we build on that concept, now with the aim of maximising the SNR of a time-varying signal. Over-determining the Mueller matrix involves calculating the matrix using more measurements than are strictly necessary. For example, 16 linearly independent measurements is the minimum number required to fully determine the Mueller matrix, so any Mueller matrix calculated from more than 16 measurements (at least 16 of which are linearly independent) is considered over-determined. In this paper we have a collection of 36 distinct measurements that can be used to calculate the Mueller matrix, and of course we do not have to use all 36 at once: we can use any combination of measurements with a total number between 16 and 36. This results in a very large number of ways to combine these measurements, some of which will result in a Mueller matrix with higher SNR than others. Therefore, to produce a Mueller matrix with the highest possible SNR, we need a way of calculating the matrix using the subset of measurements with the highest SNR. When we began this process of finding that subset, we were collecting the measurements in a matrix and then solving for $\mathbf {M}$, the Mueller matrix, using the standard calculation shown in Eq. (1).

$$\mathbf{I}=\mathbf{AMW} \rightarrow \mathbf{M} = \mathbf{A^{{-}1}IW^{{-}1}}$$
where $\mathbf {I}$ is the matrix containing the measurements, and $\mathbf {W}$ and $\mathbf {A}$ are matrices containing the Stokes vectors of the polarisation state generator (PSG) and polarisation state analyser (PSA) respectively. Unfortunately we quickly ran into a problem: some of the possible subsets of measurements cannot be represented using Eq. (1). For example, let’s say we were calculating the Mueller matrix using 16 of our 36 total measurements, as shown in Eq. (2), and we now want to calculate the Mueller matrix using 17 measurements, as shown in Eq. (3). You can see immediately in Eq. (3) that a new row has to be added to the $\mathbf {I}$ matrix to accommodate this new measurement, as well as a new Stokes vector added to the $\mathbf {W}$ matrix, both of which are highlighted in bold. However, the remaining three entries of this row also have to be populated, so the measurements $I_{17}$, $I_{18}$ and $I_{19}$ must be added so that the row is not empty, which means we are no longer calculating the Mueller matrix with 17 measurements, which is what we initially set out to do. This raises the question of whether it is even feasible to calculate the Mueller matrix using 17 measurements when they are collected in matrix form. Eq. (3) shows that it is not: the $\mathbf {I}$ matrix must be completely filled with measurements, and therefore whole rows or columns have to be added at a time, rather than individual measurements.
$$\begin{bmatrix} I_0 & I_1 & I_2 & I_3 \\ I_4 & I_5 & I_6 & I_7 \\ I_8 & I_9 & I_{10} & I_{11} \\ I_{12} & I_{13} & I_{14} & I_{15} \end{bmatrix}= \begin{bmatrix} a_{0}^{0} & a_{1}^{0} & a_{2}^{0} & a_{3}^{0} \\ \vdots & \vdots & \vdots & \vdots \\ a_{0}^{3} & a_{1}^{3} & a_{2}^{3} & a_{3}^{3} \end{bmatrix} \begin{bmatrix} m_{0,0} & \ldots & m_{0,3} \\ \vdots & \ddots & \vdots \\ m_{3,0} & \ldots & m_{3,3} \end{bmatrix} \begin{bmatrix} w_{0}^{0} & \ldots & w_{0}^{3} \\ w_{1}^{0} & \ldots & w_{1}^{3} \\ w_{2}^{0} & \ldots & w_{2}^{3} \\ w_{3}^{0} & \ldots & w_{3}^{3} \end{bmatrix}$$
$$\begin{bmatrix} I_0 & I_1 & I_2 & I_3 \\ I_4 & I_5 & I_6 & I_7 \\ I_8 & I_9 & I_{10} & I_{11} \\ I_{12} & I_{13} & I_{14} & I_{15} \\ \mathbf{I_{16}} & \mathbf{I_{17}} & \mathbf{I_{18}} & \mathbf{I_{19}} \end{bmatrix}= \begin{bmatrix} a_{0}^{0} & a_{1}^{0} & a_{2}^{0} & a_{3}^{0} \\ \vdots & \vdots & \vdots & \vdots \\ a_{0}^{3} & a_{1}^{3} & a_{2}^{3} & a_{3}^{3} \end{bmatrix} \begin{bmatrix} m_{0,0} & \ldots & m_{0,3} \\ \vdots & \ddots & \vdots \\ m_{3,0} & \ldots & m_{3,3} \end{bmatrix} \begin{bmatrix} w_{0}^{0} & \ldots & w_{0}^{3} & \mathbf{w_{0}^{4}} \\ w_{1}^{0} & \ldots & w_{1}^{3} & \mathbf{w_{1}^{4}} \\ w_{2}^{0} & \ldots & w_{2}^{3} & \mathbf{w_{2}^{4}} \\ w_{3}^{0} & \ldots & w_{3}^{3} & \mathbf{w_{3}^{4}} \end{bmatrix}$$

In our example we have a total of 36 measurements taken using 6 distinct Stokes vectors in both the PSA and PSG. We can therefore only perform calculations with $\mathbf {I}$ matrices of size (4x4), (4x5), (5x4), (4x6), (6x4), (5x5), (5x6), (6x5) and (6x6), corresponding to measurement subsets of sizes 16, 20, 24, 25, 30 and 36, which is only a fraction of the possible subset sizes between 16 and 36. We therefore need a way of calculating the Mueller matrix in which measurements can be added or removed individually rather than as whole rows or columns. Thankfully, such a framework already exists within the field of partial polarimetry [27–34]. As the name suggests, its purpose is to measure the Mueller matrix only partially rather than as the whole 4x4 matrix. In many applications only one or a few elements of the Mueller matrix are of interest, so measuring the entire matrix in order to extract a single element is inefficient and can potentially introduce errors. In order to calculate individual elements of the Mueller matrix, the matrix is transformed into a vector by flattening it row-wise; we denote the result of this operation $\mathbf {M}^{\prime }$. Now that the Mueller matrix has been transformed into a vector we must also transform $\mathbf {I}$, $\mathbf {A}$ and $\mathbf {W}$. First, $\mathbf {I}$ is likewise flattened row-wise into a vector, which we denote $\mathbf {I}^{\prime }$. The transformation of $\mathbf {A}$ and $\mathbf {W}$ is more involved, as it requires taking the Kronecker product of the two to yield a matrix we denote $\mathbf {P}$. With all the necessary matrices transformed, we can relate them using Eq. (4), where $\mathbf {P}$, $\mathbf {I}^{\prime }$ and $\mathbf {M}^{\prime }$ are defined in Eq. (5).

$$\mathbf{I}^{\prime}=\mathbf{PM}^{\prime}$$
$$\mathbf{P}=\mathbf{A}{\otimes}\mathbf{W}^{T},\;\mathbf{I}^{\prime}=Vec(\mathbf{I}),\mathbf{M}^{\prime}=Vec(\mathbf{M})$$
where $\otimes$ is the Kronecker product, $Vec()$ flattens a matrix into a column vector and $T$ denotes the transpose. This representation is not intuitive at first glance, so Eq. (6) provides an example of how Eq. (2) can be represented using the transformations outlined in Eqs. (4) and (5).
$$\begin{bmatrix} a_{0}^{0}w_{0}^{0} & a_{0}^{0}w_{1}^{0} & a_{0}^{0}w_{2}^{0} & a_{0}^{0}w_{3}^{0} & \dots & a_{3}^{0}w_{0}^{0} & a_{3}^{0}w_{1}^{0} & a_{3}^{0}w_{2}^{0} & a_{3}^{0}w_{3}^{0} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ a_{0}^{3}w_{0}^{3} & a_{0}^{3}w_{1}^{3} & a_{0}^{3}w_{2}^{3} & a_{0}^{3}w_{3}^{3} & \dots & a_{3}^{3}w_{0}^{3} & a_{3}^{3}w_{1}^{3} & a_{3}^{3}w_{2}^{3} & a_{3}^{3}w_{3}^{3} \end{bmatrix} \begin{bmatrix} m_{0,0} \\ m_{0,1} \\ m_{0,2} \\ m_{0,3} \\ \vdots \\ m_{3,0} \\ m_{3,1} \\ m_{3,2} \\ m_{3,3} \end{bmatrix}= \begin{bmatrix} I_0 \\ I_1 \\ I_2 \\ \vdots \\ I_{15} \end{bmatrix}$$

Throughout the following text, when we refer to the vector methodology we mean the methodology shown in Eqs. (4) to (6); when we mention the matrix methodology we mean the more usual $\mathbf {I}=\mathbf {AMW}$ method, where $\mathbf {I}$ and $\mathbf {M}$ are matrices.
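To make the two routes concrete, the sketch below (Python with NumPy; the function names are our own, not from the original work) solves for the Mueller matrix both ways. It assumes $\mathbf{A}$ holds one PSA Stokes vector per row, $\mathbf{W}$ one PSG Stokes vector per column, and that $\mathbf{I}$ and $\mathbf{M}$ are flattened row-wise, consistent with Eqs. (1) and (4) to (6).

```python
import numpy as np

def mueller_matrix_method(I, A, W):
    """Matrix methodology, Eq. (1): M = A^-1 I W^-1.
    Pseudo-inverses are used so that the over-determined case
    (more than four PSA/PSG states) is handled as well."""
    return np.linalg.pinv(A) @ I @ np.linalg.pinv(W)

def mueller_vector_method(I_vec, A, W):
    """Vector methodology, Eqs. (4)-(6): I' = P M' with P = A (x) W^T.
    I_vec is the row-wise flattened intensity vector; any subset of its
    entries (with the matching rows of P) could be used instead."""
    P = np.kron(A, W.T)                         # Eq. (5)
    M_vec, *_ = np.linalg.lstsq(P, I_vec, rcond=None)
    return M_vec.reshape(4, 4)                  # undo the row-wise flattening
```

For a complete set of measurements both routes return the same least-squares estimate; the difference only appears once individual measurements are dropped, which the vector route permits and the matrix route does not.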

2. Experimental details

The measurements were made using a setup similar to that used in our previous work on over-determination [26]; a schematic is shown in Fig. 1. The optical path begins with an LED generating incoherent, collimated, monochromatic red light (625 nm). This light travels through the PSG, which is composed of a linear polariser (LP) mounted at $180^{\circ }$, followed by a pair of $Meadowlark^{TM}$ liquid crystal variable retarders (LCVRs) [35] mounted at $45^{\circ }$ and $0^{\circ }$ respectively. Each pair of LCVRs was managed by a Meadowlark control unit that applies a voltage between 0 V and 10 V to the liquid crystal, and thus alters the retardance of the liquid crystal over a range of $15^{\circ }$ to $350^{\circ }$. Once the light passes through the PSG, it travels through the BSO, which is undergoing exposure to ionisation waves, or so-called plasma bullets [36,37]. The light then moves through a pair of lenses before it passes through the PSA. In an ideal system these two lenses would not be necessary, as the light would be perfectly collimated throughout the whole setup. However, the light source used was only capable of producing collimated light over distances shorter than the length of this setup, so this sequence of lenses was added to ensure that the light entering the PSA was collimated and that the sample was in focus at the camera. The PSA is made up of the same components as the PSG, except that the order of the components (from left to right) is now as follows: two LCVRs mounted at $0^{\circ }$ and $45^{\circ }$, then a linear polariser mounted at $180^{\circ }$. Immediately in front of the $Andor^{TM}$ iStar ICCD camera there is a lens that magnifies the image before it enters the camera’s array of detectors, which measure the intensity of the light and produce a 1024x1024 pixel image. The camera used for these measurements had gate widths ranging from seconds down to nanoseconds, as well as an adjustable gain ranging from 1 to 4095 that, when applied, increased the intensity of the images taken. As the timescales involved in plasma phenomena are quite short, a gate width of 1 $\mu$s and a gain of 2000 were used to capture the effects accurately. Because the plasma generation is cyclical and uniform in time, with levels of jitter between each image capture much lower than the 1 $\mu$s exposure used, we are confident that the measurements presented here are consistent. In total, images were taken over delays ranging from 0 to 30 $\mu$s after the plasma reaches the outer electrode, with each image being the accumulation of 5 separate exposures. The Stokes vectors were held constant over the whole range of delays, and were only changed once the complete set had been recorded. The BSO itself was a 30x30x0.5 mm crystal placed perpendicular to the light beam so that only $E_z$ was measured, and the jet capillary was placed 4 mm away from the BSO at an angle of $45^{\circ }$. For the image capture process, 6 distinct Stokes vectors were used in both the PSG and PSA, resulting in an over-determined system of 36 measurements. The Stokes vectors used for both the PSG and PSA are taken from the diamond set, so called because they sit on the vertices of a diamond shape in the Poincaré sphere [1,2,38,39]. An explicit description is shown in Eq. (7).
This particular set of Stokes vectors was chosen firstly because the 36 measurements it produces are numerous enough to highlight the difference between the vector and matrix methods whilst not being too cumbersome to show in figures. Secondly, these Stokes vectors span the full Stokes space, so the resulting system of measurements meets the requirement, stated above, that over-determination needs more than 16 measurements containing 16 linearly independent ones.

$$\mathbf{A} = \frac{1}{2} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{bmatrix}, \mathbf{W} = \mathbf{A}^T$$

Fig. 1. A schematic of the equipment used to make measurements.

3. Figures of interest

In this section we show how we calculate our comparison statistics: input SNR, output SNR and condition number. The subscripts $n$ and $m$ correspond to a matrix element, and the superscript $t$ denotes the time, e.g. $\mathbf {I}_{4,3}^{12}$ represents the element of the $\mathbf {I}$ matrix in the 4th row and 3rd column at $t=12\,\mu$s. We start with our definitions of input noise and output noise, denoted $\rho _{I}^{t}$ and $\rho _{M}^{t}$ respectively, where there is a single value per moment in time. These two values are derived from $\mathbf {R}$ and $\mathbf {G}$, matrices of normalised values from $\mathbf {I}$ and $\mathbf {M}$ respectively, where each element is normalised between 0 and 1 as follows.

$$\mathbf{T}_{n,m}^{t} = |\mathbf{I}_{n,m}^{t} - \mathbf{I}_{n,m}^{0}|$$
$$\mathbf{Q}_{n,m}^{t} = |\mathbf{M}_{n,m}^{t} - \mathbf{M}_{n,m}^{0}|$$
$$\mathbf{R}_{n,m}^{t} = \frac{\mathbf{T}_{n,m}^{t} - Min^{t}\,(\mathbf{T}_{n,m}^{t\geq0})}{Max^{t}\,(\mathbf{T}_{n,m}^{t\geq0}) - Min^{t}\,(\mathbf{T}_{n,m}^{t\geq0})}$$
$$\mathbf{G}_{n,m}^{t} = \frac{\mathbf{Q}_{n,m}^{t} - Min^{t}\,(\mathbf{Q}_{n,m}^{t\geq0})}{Max^{t}\,(\mathbf{Q}_{n,m}^{t\geq0}) - Min^{t}\,(\mathbf{Q}_{n,m}^{t\geq0})}$$
where $Min^{t}\,(\mathbf {T}_{n,m}^{t\geq 0})$ and $Max^{t}\,(\mathbf {T}_{n,m}^{t\geq 0})$ denote the minimum and maximum values across time for a given $n$ and $m$. The calculation of $\rho _{I}^{t}$ and $\rho _{M}^{t}$ uses the weighted standard deviation formula, with the weights defined in Eq. (12) and Eq. (13). As can be seen, the weights correspond to the absolute change in value over time for each $n, m$ of the Mueller matrix $\mathbf {M}$ and of the experimentally determined intensity matrix $\mathbf {I}$. The absolute change was chosen so that the output is consistent whether the values in the $\mathbf {M}$ and $\mathbf {I}$ matrices increase or decrease.
$$\mathbf{H}_{n,m}^I = Max^{t}\,(\mathbf{T}_{n,m}^{t\geq0})$$
$$\mathbf{H}_{n,m}^M = Max^{t}\,(\mathbf{Q}_{n,m}^{t\geq0})$$
$$\rho_{I}^{t} = \sqrt{ \frac{\sum_{n,m = 1}^{6}\mathbf{H}_{n,m}^{I}\left(\mathbf{R}_{n,m}^t - \overline{\mathbf{R}^t}\right)^2}{\frac{35}{36}\sum_{n,m = 1}^{6}\mathbf{H}_{n,m}^{I}}}$$
$$\rho_{M}^{t} = \sqrt{ \frac{\sum_{n,m = 1}^{4}\mathbf{H}_{n,m}^{M}\left(\mathbf{G}_{n,m}^t - \overline{\mathbf{G}^t}\right)^2}{\frac{15}{16}\sum_{n,m = 1}^{4}\mathbf{H}_{n,m}^{M}}}$$
where $\sum _{n,m=1}^{4}$ is an abbreviation of $\sum _{n=1}^{4}\sum _{m=1}^{4}$. Now that we have the equations for input noise and output noise we can calculate the input SNR and output SNR as follows.
$$Input\;SNR = \frac{1}{30}\sum_{t=1}^{30}\left(\frac{\sum_{n,m=1}^{6}\left(\frac{\mathbf{H}_{n,m}^I\mathbf{R}_{n,m}^t}{\mathbf{\rho}_{I}^{t}}\right)}{\sum_{n,m=1}^{6}\mathbf{H}_{n,m}^{I}}\right)$$
$$Output\;SNR = \frac{1}{30}\sum_{t=1}^{30}\left(\frac{\sum_{n,m=1}^{4}\left(\frac{\mathbf{H}_{n,m}^{M}\mathbf{G}_{n,m}^{t}}{\mathbf{\rho}_{M}^{t}}\right)}{\sum_{n,m=1}^{4}\mathbf{H}_{n,m}^M}\right)$$
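As a concrete illustration, the following sketch (NumPy; `input_snr` is our own hypothetical helper) evaluates Eqs. (8), (10), (12), (14) and (16) on a stack of intensity matrices; the output SNR of Eqs. (9), (11), (13), (15) and (17) follows the same pattern with the 4x4 Mueller matrices in place of the 6x6 intensity matrices. One interpretational assumption: the overline in Eq. (14) is read here as the $\mathbf{H}$-weighted mean of $\mathbf{R}^t$, consistent with the weighted standard deviation formula.

```python
import numpy as np

def input_snr(I_stack):
    """Input SNR of Eq. (16).

    I_stack: array of shape (31, 6, 6); I_stack[0] is the reference frame
    at t = 0 and I_stack[1:] are the 30 delayed frames. Assumes every
    element changes at least somewhat over time (so Max > Min in Eq. (10)).
    """
    T = np.abs(I_stack - I_stack[0])                   # Eq. (8)
    Tmin, Tmax = T.min(axis=0), T.max(axis=0)
    R = (T - Tmin) / (Tmax - Tmin)                     # Eq. (10)
    H = Tmax                                           # Eq. (12)

    n_el = H.size                                      # 36 elements
    snr_per_t = []
    for t in range(1, I_stack.shape[0]):
        mean_R = np.sum(H * R[t]) / np.sum(H)          # weighted mean (assumption)
        rho = np.sqrt(np.sum(H * (R[t] - mean_R) ** 2)
                      / ((n_el - 1) / n_el * np.sum(H)))   # Eq. (14)
        snr_per_t.append(np.sum(H * R[t] / rho) / np.sum(H))
    return np.mean(snr_per_t)                          # Eq. (16)
```

Restricting the sums to a chosen subset of measurements gives the input SNR of that subset, which is the quantity maximised over subsets in Section 4.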

The equation for condition number is shown below

$$Condition\,number = \|\mathbf{P}\|\|\mathbf{P}^{{-}1}\|$$
where $\|\cdot\|$ denotes the Frobenius norm and $\mathbf {P}$ is the Kronecker product of $\mathbf {A}$ and $\mathbf {W}$ shown in Eq. (5).
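Eq. (18) can be transcribed directly as below; the only assumption we add is that, for rectangular (over-determined) $\mathbf{P}$ matrices, the inverse is taken as the Moore-Penrose pseudo-inverse.

```python
import numpy as np

def frobenius_condition_number(P):
    """Condition number of Eq. (18): ||P|| * ||P^-1|| with Frobenius norms.
    For non-square P (more than 16 measurements) the pseudo-inverse is
    used in place of the inverse (our assumption)."""
    return np.linalg.norm(P, 'fro') * np.linalg.norm(np.linalg.pinv(P), 'fro')
```

Applied to the $\mathbf{P}$ matrices built from subsets of the rows of $\mathbf{A}\otimes\mathbf{W}^{T}$, this is the quantity reported in Fig. 4(a).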

4. Results

In this section we directly compare the vector method of calculation with the matrix method. The specific statistics used for comparison are as follows: the output SNR, which is the SNR of the calculated Mueller matrix and is defined in Eq. (17); the input SNR, which is the SNR of the experimentally derived intensity measurements used to calculate the Mueller matrix and is defined in Eq. (16); and finally the condition number of the $\mathbf {P}$ matrices used in calculating the Mueller matrix, which we define in Eq. (18). As well as comparing these three statistics, we also show our experimentally derived data (see the Experimental details section for more information). Some of this data can be seen in Fig. 2(a), which shows the change in each element of the BSO’s Mueller matrix as a function of time; the exact numbers refer to the average value taken across a 1x1 mm area centred upon the plasma impact site (shown as the red square in Fig. 2(b)). To show that this 1x1 mm area is spatially homogeneous, we have included Fig. 2(b), which shows the spatial distribution of values taken from the difference $\mathbf {M}_{n,m}^{15} - \mathbf {M}_{n,m}^{0}$. For this particular figure, a change in the Mueller matrix is displayed rather than the Mueller matrix itself because the change has a smaller range of values and is therefore easier to compare; we reiterate, however, that the principal purpose of Fig. 2(b) is to show that we are taking the spatial average across a homogeneous region. In addition to verifying the spatial homogeneity, we have also included the standard deviation of each element of the Mueller matrix across this area, shown as the grey dashed lines in Fig. 2(a). Alongside this example of a Mueller matrix calculated from our experimentally derived data, we also show the data itself in Fig. 3, which is a visual representation of the $\mathbf {T}$ matrix used in calculating the input SNR. The purpose of this figure is to highlight the varying levels of SNR between the experimentally derived measurements, and it is subsequently used to highlight the difficulty of maximising the input SNR when these values are collected in a matrix. These results were averaged across the same 1x1 mm area shown in Fig. 2(b). We also stress that the Mueller matrix shown in Fig. 2(a) is not the definitive Mueller matrix: as Table 1 shows, there are billions of ways to calculate the Mueller matrix from our collection of data, so this is just one of those multitudes and is primarily intended as an example to aid our explanations.

If we are to maximise the input SNR, Eq. (16) shows that we must draw a subset of values from $\mathbf {I}$ with as little variance as possible (i.e. minimise $\rho _{I}^{t}$) whilst also having as large a signal as possible (i.e. maximise $\mathbf {H}_{n,m}^I$). This is where the drawbacks of the matrix method become clear. To create a subset of measurements in the form of a matrix, whole rows or columns of measurements have to be added at a time, rather than individual measurements. This requirement means that there are subsets of measurements that cannot be used when represented in the form $\mathbf {I}=\mathbf {AMW}$, and consequently the subset with the globally maximum input SNR cannot, in general, be found and used to calculate the Mueller matrix with the matrix methodology. Contrast this with the vector methodology, where individual measurements can be added or removed: we are not limited in the subsets of measurements that can be used, and thus the subset with the globally maximum input SNR can be found and used to calculate the Mueller matrix.

This point is perhaps best illustrated in Fig. 3. This figure displays a similar analysis to Fig. 2(a), except that the subplots relate to elements of the $\mathbf {T}$ matrix, the absolute change in value of the intensity matrix $\mathbf {I}$ over time (see Eq. (8) for more details). Each row and column is related to a particular Stokes vector in the $\mathbf {A}$ and $\mathbf {W}$ matrix respectively. As can clearly be seen, not all the measurements have the same level of signal: for example, the [1,1,0,0]-[1,0,0,-1] (PSA vector-PSG vector) measurement shows a very clear signal with values exceeding 100 counts, whereas the [1,1,0,0]-[1,0,1,0] measurement shows a similar form of signal with values much lower than 100. Alongside differing magnitudes of signal, we can also see varying levels of noise. In particular, [1,0,0,1]-[1,0,1,0] shows a signal of around 25 counts but with a very high level of noise, especially when compared to [1,0,-1,0]-[1,0,1,0], which has the same level of signal but much lower noise. In the case of maximising input SNR, the question is then: how do we include as many high-signal measurements and exclude as many low-signal ones as possible? For instance, suppose we want to include all measurements except the [1,0,1,0]-[1,0,1,0] measurement in our calculation. If we were collecting the measurements in a matrix, we would have to remove the entire [1,0,1,0] row or column, and therefore remove 5 other measurements alongside the single element we actually want to exclude. If the measurements are collected in a vector, however, we simply remove the measurement from $\mathbf {I}^{\prime }$ and delete the associated row of the $\mathbf {P}$ matrix, a process that leaves all the other measurements unaffected, as sketched in the code example below. So clearly, if we want to work with as much control as possible, we must use the vector framework. Exploring the flexibility afforded by the vector methodology further, it becomes possible to move entirely away from the set-based approach commonly used in Mueller polarimetry, i.e. using Stokes vectors from the cubic set, diamond set, etc., in the PSG and PSA. Using the vector method it is entirely possible to use a completely new pair of Stokes vectors in the PSG and PSA for each new measurement. The ramifications of this are considerable: to date, most investigations that compare different measurement schemes have only compared pure sets against each other, e.g. pure cubic vs pure diamond, but now pairs and sets of pairs of Stokes vectors can be combined in such a way that any non-singular $\mathbf {P}$ matrix can be created, dramatically increasing the number of possible measurement schemes that can be compared.
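The row-removal step just described can be made explicit. The sketch below builds the diamond-set $\mathbf{A}$ of Eq. (7), forms $\mathbf{P}$, and excludes only the [1,0,1,0]-[1,0,1,0] pairing; the intensity vector is synthetic (forward-simulated for an identity Mueller matrix, purely for illustration), and the row index assumes the row-wise ordering of Eqs. (4) to (6).

```python
import numpy as np

# Diamond-set PSA matrix from Eq. (7); the PSG matrix is W = A^T.
A = 0.5 * np.array([[1,  1,  0,  0], [1, -1,  0,  0],
                    [1,  0,  1,  0], [1,  0, -1,  0],
                    [1,  0,  0,  1], [1,  0,  0, -1]])
W = A.T
P = np.kron(A, W.T)                   # 36 x 16, one row per PSA/PSG pairing

# Illustrative intensities: forward-simulate I' = P M' for an identity M.
I_vec = P @ np.eye(4).flatten()

# Excluding the [1,0,1,0]-[1,0,1,0] pairing (PSA state 2 with PSG state 2)
# removes one row of P and one entry of I'; the other 35 are untouched.
keep = np.arange(36) != (2 * 6 + 2)
M_vec, *_ = np.linalg.lstsq(P[keep], I_vec[keep], rcond=None)
print(np.allclose(M_vec.reshape(4, 4), np.eye(4)))   # True
```

In the matrix form of Eq. (1) the same exclusion would force the removal of the whole [1,0,1,0] row or column, i.e. five additional measurements.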
To further illustrate the advantages of the vector method over the matrix method, we analysed the input SNR, output SNR and condition number produced by both calculation methods on our own data; the outcome can be seen in Table 1 and Fig. 4. We hasten to add that both methods use data drawn from the same complete dataset; the variation between them lies in which subsets of the complete dataset are used. It should be highlighted that, due to the extremely large number of possible unique subsets available to the vector methodology, we have limited the investigation to a randomly chosen selection of 10,000 subsets per value of $N$, where $N$ denotes the size of the subsets drawn from the complete dataset of 36 measurements. Since it cannot feasibly be proven that the extreme values shown here correspond to global optima, they should be interpreted as the best values found by the random search rather than guaranteed global extrema. Looking at Table 1, starting on the left we have the $N$ column. These values of $N$ were chosen because they are the values compatible with the matrix method; for instance, it is impossible to have $N=23$ using the matrix methodology, since 23 is a prime number and within this methodology $N = N_{A}N_{W}$, where $N_{A}$ and $N_{W}$ are the number of Stokes vectors in the PSA and PSG respectively. As stated previously, in the vector framework measurements can be added or removed at will, so any value of $N$ between 16 and 36 can be analysed; however, we have omitted the rows where it is impossible to use the matrix method, and instead provide a full analysis in Fig. 4. The most striking difference between the two methods can be seen in the “Number of subsets” columns: the number of possible unique subsets is roughly $10^{5}$ to $10^{7}$ times greater for the vector method than for the matrix method, a colossal difference in magnitude. The reason for this marked difference is given by Eq. (19), which provides the number of unique subsets that can be drawn using each method, where $N_{C}^{mat}$ and $N_{C}^{vec}$ are the number of unique subsets for the matrix and vector methods respectively. Looking at the $36!$ ($36!=3.7199 \times 10^{41}$) within the equation for $N_{C}^{vec}$, we can immediately see why the number is so high for the vector method; by contrast, the equation for the matrix method contains only $(6!)^2=518400$ as an indication of the scale.

$$N_{C}^{mat} = \left(\frac{6!}{N_{A}!(6-N_{A})!}\right) \left(\frac{6!}{N_{W}!(6-N_{W})!}\right),\; N_{C}^{vec} = \frac{36!}{N!(36-N)!}$$
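Eq. (19) can be evaluated directly with the standard library; the short sketch below reproduces the scale difference for $N=16$ (taking $N_A = N_W = 4$ for the matrix method).

```python
from math import comb

def n_subsets_matrix(N_A, N_W):
    """Eq. (19), matrix method: choose N_A of the 6 PSA states and N_W of the 6 PSG states."""
    return comb(6, N_A) * comb(6, N_W)

def n_subsets_vector(N):
    """Eq. (19), vector method: choose any N of the 36 individual measurements."""
    return comb(36, N)

print(n_subsets_matrix(4, 4))   # 225
print(n_subsets_vector(16))     # 7307872110
```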

Next, if we compare the input SNR and output SNR of both methods, we immediately see that the vector method provides either the same value or a better one. The reason is that every Mueller matrix that can be calculated using the matrix method can also be calculated using the vector method; the set of Mueller matrices determined using the matrix method is in fact a small subset of those that can be calculated using the vector method. It follows logically that the maximum input SNR and maximum output SNR in the vector framework must be greater than or equal to the maximum input SNR and maximum output SNR in the matrix framework. The same restriction of course holds for the minimum condition number, or indeed for any measure. This is where the true power of the vector method lies: the results from the matrix method are a subset of the results from the vector method, so the vector method has to be at least as good as the matrix method by simple logic of construction.

Fig. 4 shows a comparison between the two calculation methods, where the maximum, minimum and mean values of condition number, input SNR and output SNR are shown for each value of $N$, the number of measurements drawn from the total set of 36. Please note that the 99th percentile is displayed in Fig. 4(a) due to the extremely large values produced in the calculation of the condition number. As stated previously, the matrix method is only capable of calculating the Mueller matrix using $N$ = 16, 20, 24, 25, 30 and 36; consequently, the results shown in Fig. 4 only have matrix-method values for these values of $N$. Indeed, one of the main purposes of Fig. 4 is to highlight clearly how limiting the matrix method is compared to the vector method. As a specific example, Fig. 4(b) shows that the maximum value of input SNR for $N$ = 17 is higher than that for $N$ = 16, but as we have stated many times before, the matrix method is completely unable to perform calculations using this value of $N$, and therefore the only way to calculate the Mueller matrix using measurements with this particular high level of input SNR is to use the vector method. This is apparent in all the sub-figures of Fig. 4, although each sub-figure has unique characteristics worthy of investigation. Beginning with Fig. 4(a), we see that, as expected, increasing the number of measurements results in a lower condition number, and that the relationship between the two is non-linear. We can also clearly see that the vector method produces a larger range of condition numbers than the matrix method; indeed, the matrix method has no range in values for each specific value of $N$. On average, the condition numbers produced by the vector method tend to be higher than those generated by the matrix method, but this does not mean the vector method is inferior in this regard: the average may be higher, but the minimum values produced are much lower than those from the matrix method. For example, for $N$ = 16 we have minimum values of 32 and 24.67 for the matrix and vector methods respectively. Turning to Fig. 4(c), we can see that increasing the number of measurements has a positive, linear influence on the output SNR, a relationship shared by both calculation methods. Just like Figs. 4(a) and 4(b), Fig. 4(c) shows that the vector method has a wider range of values than the matrix method, resulting in both lower and higher values of output SNR. In terms of the maximum output SNR, the vector method is substantially superior, with maximum values of 7.54 versus 4.97.
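The random search over subsets described above can be sketched generically as follows; `best_random_subset` is our own naming, `score` is any callable mapping a set of measurement indices to a figure of merit (for example the input SNR of that subset, or the negative of its condition number), and the default of 10,000 trials per $N$ matches the sampling budget stated earlier.

```python
import numpy as np

def best_random_subset(score, N, n_measurements=36, n_trials=10_000, seed=0):
    """Draw n_trials random subsets of size N from the available
    measurements and return the subset with the highest score."""
    rng = np.random.default_rng(seed)
    best_idx, best_val = None, -np.inf
    for _ in range(n_trials):
        idx = np.sort(rng.choice(n_measurements, size=N, replace=False))
        val = score(idx)
        if val > best_val:
            best_idx, best_val = idx, val
    return best_idx, best_val
```

The chosen indices select the rows of $\mathbf{P}$ and the entries of $\mathbf{I}^{\prime}$ that enter the least-squares solve of Eq. (4); as noted above, the extrema found this way are the best of the sampled subsets, not guaranteed global optima.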

Fig. 2. Spatial and temporal changes in the Mueller matrix of the BSO. The impact, stasis and subsequent discharging are clearly represented by the sharp rise, plateau and decay shown in Fig. 2(a).

Fig. 3. Visual representation of the $\mathbf {T}_{n,m}^{t}$ values over time defined in Eq. (8). Each row and column corresponds to a different PSG and PSA vector respectively. The $x$ and $y$ axes correspond to time in microseconds and absolute change in signal, measured in counts.

Fig. 4. A comparison between the vector and matrix calculation methods showing the condition number, input SNR and output SNR achieved with each calculation method and for each number of measurements used in the calculation. The markers denote the average, and the top and bottom of each error bar correspond to the maximum and minimum value. Please note that the 99th percentile is displayed for Fig. 4(a) due to the extremely large values produced in the calculation of the condition number.

Table 1. Comparison between the matrix and vector methods.

5. Conclusion

This paper compares the efficacy of increasing the SNR of a time-varying, over-determined Mueller matrix using the vector method of calculation versus the matrix method. The data used were experimentally derived from an investigation into the time-varying signal of an electro-optic BSO crystal undergoing cyclical impact from a helium plasma ionisation wave. To compare the two methods we calculated the SNR of the intensity measurements used for the calculation and the SNR of the resulting Mueller matrices, which we denoted input SNR and output SNR respectively. Alongside this, we calculated the condition number of the matrices used to determine the Mueller matrices. All three statistics clearly show that the vector method is consistently better than the matrix method. Specifically, the vector method achieves a maximum output SNR of 7.54 versus 4.97, a maximum input SNR of 8.43 versus 7.93, and consistently lower minimum condition numbers. We posit that this superiority stems from the much greater flexibility afforded by the vector method, which allows individual intensity measurements to be added to or removed from the calculation, in contrast to the matrix method, which has to add or remove whole rows or columns at a time. We further highlight that the Mueller matrices calculated by the matrix method are in fact a very small subset of those calculable by the vector method; consequently, the vector methodology cannot be worse than the matrix methodology.

Funding

HORIZON EUROPE Marie Sklodowska-Curie Actions (813393).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Dataset 1 [40].

References

1. M. R. Foreman and F. Goudail, “On the equivalence of optimization metrics in Stokes polarimetry,” Opt. Eng. 58(08), 1 (2019). [CrossRef]  

2. A. Peinado, A. Lizana, J. Vidal, C. Iemmi, and J. Campos, “Optimization and performance criteria of a Stokes polarimeter based on two variable retarders,” Opt. Express 18(10), 9815–9830 (2010). [CrossRef]  

3. X. Tu, S. McEldowney, C. Guido, Y. Zou, M. Smith, N. Brock, S. Miller, L. Jiang, and S. Pau, “Division of focal plane red – green – blue full-Stokes imaging polarimeter,” Appl. Opt. 59(22), G33 (2020). [CrossRef]  

4. X. Li, F. Goudail, P. Qi, T. Liu, and H. Hu, “Integration time optimization and starting angle autocalibration of full Stokes imagers based on a rotating retarder,” Opt. Express 29(6), 9494 (2021). [CrossRef]  

5. H. Dong, M. Tang, and Y. Gong, “Noise properties of uniformly-rotating RRFP Stokes polarimeters,” Opt. Express 21(8), 9674 (2013). [CrossRef]  

6. N. Hagen and Y. Otani, “Stokes polarimeter performance: general noise model and analysis,” Appl. Opt. 57(15), 4283 (2018). [CrossRef]  

7. T. Mu, C. Zhang, Q. Li, and R. Liang, “Error analysis of single-snapshot full-Stokes division-of-aperture imaging polarimeters,” Opt. Express 23(8), 10822 (2015). [CrossRef]  

8. F. Goudail, “Optimal Mueller matrix estimation in the presence of additive and Poisson noise for any number of illumination and analysis states,” Opt. Lett. 42(11), 2153 (2017). [CrossRef]  

9. R. A. Chipman, E. A. Sornsin, and J. L. Pezzaniti, “Mueller matrix imaging polarimetry: an overview,” Int. Symp. on Polariz. Anal. Appl. to Device Technol. 2873, 5–12 (1996). [CrossRef]  

10. V. Devlaminck, P. Terrier, and J.-M. Charbois, “Physically admissible parameterization for differential Mueller matrix of uniform media,” Opt. Lett. 38(9), 1410 (2013). [CrossRef]  

11. P. R. T. Munro and P. Török, “Properties of high-numerical-aperture,” Opt. Lett. 33(21), 2428–2430 (2008). [CrossRef]  

12. I. J. Vaughn and B. G. Hoover, “Noise reduction in a laser polarimeter based on discrete waveplate rotations,” Opt. Express 16(3), 2091 (2008). [CrossRef]  

13. R. Ossikovski, M. Anastasiadou, S. B. Hatit, E. Garcia-Caurel, and A. D. Martino, “Depolarizing mueller matrices: How to decompose them?” Phys. Status Solidi A 205(4), 720–727 (2008). [CrossRef]  

14. X. Li, F. Goudail, and S.-C. Chen, “Self-calibration for Mueller polarimeters based on DoFP polarization imagers,” Opt. Lett. 47(6), 1415 (2022). [CrossRef]  

15. K. Serkowski, “Optical Polarimeters in Astronomy,” in Proc. SPIE, vol. 112 (1977), pp. 12–13.

16. O. Arteaga and B. Kahr, “Mueller matrix polarimetry of bianisotropic materials [Invited],” J. Opt. Soc. Am. B 36(8), F72 (2019). [CrossRef]  

17. M. Shopa, Y. Shopa, M. Shribak, E. Kostenyukova, I. Pritula, and O. Bezkrovnaya, “Polarimetric studies of l-arginine-doped potassium dihydrogen phosphate single crystals,” J. Appl. Crystallogr. 53(5), 1257–1265 (2020). [CrossRef]  

18. E. Du, H. He, N. Zeng, M. Sun, Y. Guo, J. Wu, S. Liu, and H. Ma, “Mueller matrix polarimetry for differentiating characteristic features of cancerous tissues,” J. Biomed. Opt. 19(7), 076013 (2014). [CrossRef]  

19. R. M. Azzam, “Use of a light beam to probe the cell surface in vitro,” Surf. Sci. 56, 126–133 (1976). [CrossRef]  

20. J. Qi and D. S. Elson, “Mueller polarimetric imaging for surgical and diagnostic applications: a review,” J. Biophotonics 10(8), 950–982 (2017). [CrossRef]  

21. A. Laskarakis, S. Logothetidis, E. Pavlopoulou, and M. Gioti, “Mueller matrix spectroscopic ellipsometry formulation and application,” Thin Solid Films 455-456, 43–49 (2004). [CrossRef]  

22. H. He, R. Liao, N. Zeng, P. Li, Z. Chen, X. Liu, and H. Ma, “Mueller matrix polarimetry-An emerging new tool for characterizing the microstructural feature of complex biological specimen,” J. Lightwave Technol. 37(11), 2534–2548 (2019). [CrossRef]  

23. H. B. Maris, “Photo-elastic Properties of Transparent Cubic Crystals*,” J. Opt. Soc. Am. 15(4), 194 (1927). [CrossRef]  

24. F. Vachss and L. Hesselink, “Measurement of the electrogyratory and electro-optic effects in BSO and BGO,” Opt. Commun. 62(3), 159–165 (1987). [CrossRef]  

25. I. F. Vasconcelos, R. S. De Figueiredo, S. J. De Guedes Lima, and A. S. Sombra, “Bismuth silicon oxide (Bi12SiO20-BSO) and bismuth titanium oxide (Bi12TiO20-BTO) obtained by mechanical alloying,” J. Mater. Sci. Lett. 18(22), 1871–1874 (1999). [CrossRef]  

26. H. Philpott, “Optimizing Mueller polarimetry in noisy systems through over-determination,” Appl. Opt. 60(31), 9594–9606 (2021). [CrossRef]  

27. A. S. Alenin and J. Scott Tyo, “Structured decomposition design of partial Mueller matrix polarimeters,” J. Opt. Soc. Am. A 32(7), 1302 (2015). [CrossRef]  

28. G. Anna, H. Sauer, F. Goudail, and D. Dolfi, “Fully tunable active polarization imager for contrast enhancement and partial polarimetry,” Appl. Opt. 51(21), 5302–5309 (2012). [CrossRef]  

29. G. Anna, F. Goudail, and D. Dolfi, “Optimal discrimination of multiple regions with an active polarimetric imager,” Opt. Express 19(25), 25367 (2011). [CrossRef]  

30. O. Arteaga and R. Ossikovski, “Complete Mueller matrix from a partial polarimetry experiment: the 9-element case,” J. Opt. Soc. Am. A 36(3), 416 (2019). [CrossRef]  

31. O. Arteaga and R. Ossikovski, “Complete Mueller matrix from a partial polarimetry experiment: the 12-element case,” J. Opt. Soc. Am. A 36(3), 416 (2019). [CrossRef]  

32. S. Savenkov, R. Muttiah, E. Oberemok, and A. Klimov, “Incomplete active polarimetry: Measurement of the block-diagonal scattering matrix,” J. Quant. Spectrosc. Radiat. Transfer 112(11), 1796–1802 (2011). [CrossRef]  

33. N. Quan, C. Zhang, and T. Mu, “Optimal configuration of partial Mueller matrix polarimeter for measuring the ellipsometric parameters in the presence of Poisson shot noise and Gaussian noise,” Photonics Nanostructures - Fundam. Appl. 29, 30–35 (2018). [CrossRef]  

34. X. Li, H. Hu, L. Wu, and T. Liu, “Optimization of instrument matrix for Mueller matrix ellipsometry based on partial elements analysis of the Mueller matrix,” Opt. Express 25(16), 18872 (2017). [CrossRef]  

35. A. De Martino, Y.-K. Kim, E. Garcia-Caurel, B. Laude, and B. Drévillon, “Optimized Mueller polarimeter with liquid crystals,” Opt. Lett. 28(8), 616 (2003). [CrossRef]  

36. X. Lu, G. V. Naidis, M. Laroussi, and K. Ostrikov, “Guided ionization waves: Theory and experiments,” Phys. Rep. 540(3), 123–166 (2014). [CrossRef]  

37. E. Slikboer, “Electric Field and Charge Measurements in Plasma Bullets using the Pockels Effect,” Ph.D. thesis, Eindhoven University of Technology (2015).

38. J. Dai, F. Goudail, M. Boffety, and J. Gao, “Estimation precision of full polarimetric parameters in the presence of additive and Poisson noise,” Opt. Express 26(26), 34081 (2018). [CrossRef]  

39. A. Peinado, A. Lizana, J. Vidal, C. Iemmi, I. Moreno, J. Campos, and B. Aires, “Analysis, optimization and implementation of a variable retardance based polarimeter,” EPJ Web of Conferences pp. 1–7 (2010).

40. H. Philpott, E. Garcia-Caurel, O. Guaitella, and A. Sobota, “Mueller polarimetry measurements,” figshare (2023), https://doi.org/10.6084/m9.figshare.23634144.

Supplementary Material (1)

Dataset 1: Mueller polarimetry measurements
