Unscented Kalman filters for polarization state tracking and phase noise mitigation

Jokhakar Jignesh; Bill Corcoran; Chen Zhu; Arthur Lowery

doi:10.1364/OE.24.022282

1. Introduction

Coherent communication systems support spectrally efficient modulation schemes where data is encoded on the phase, amplitude and polarization states of the optical carrier. However, the finite linewidth of the laser used as a carrier results in a random phase perturbation. Additionally, small geometric variations in the optical fiber randomly rotate the state of polarization at the receiver end, and this rotation changes even with slight vibrations. The digital coherent receiver must dynamically track the phase and polarization states of the system in order to recover the transmitted signals fully.

Commonly, with the quadrature phase shift keying (QPSK) format, adaptive filters based on the constant modulus algorithm (CMA) [1] are used in conjunction with the Viterbi-Viterbi phase estimation (VVPE) algorithm, where the CMA takes care of the effects of polarization rotation and VVPE mitigates the effects of the phase noise. For higher-order quadrature-amplitude modulation (QAM) formats, the multi-modulus algorithm (MMA) or Wiener-filter-based estimation with pilot-symbol-aided maximum likelihood phase estimation is employed [2]. Kalman filters provide an alternative method, and have been proven to give optimum estimates compared with all other estimators, assuming that the noise sources associated with the system are Gaussian [3]. Recently, Marshall et al. proposed using a Kalman filter, more precisely an extended Kalman filter (EKF), and showed faster convergence than the conventional CMA + VVPE approach [4]. Linear Kalman filters (LKFs) have also been used for polarization state tracking based on a radius-directed method [5]. However, this LKF did not provide joint tracking of phase and polarization, and the proposed system requires significant modifications when shifting to higher-order QAM.

In this paper we propose to use, in place of an EKF, an ‘unscented’ Kalman filter (UKF) that has been shown to accurately capture all the moments of the parameter to be estimated in contrast to only the 1st and 2nd-order moments provided by the EKF [3]; thus, it gives more-accurate estimates. However, this improvement in performance from the UKF comes at the cost of increased complexity, leading us to discuss modified versions of the UKF and EKF (R-UKF and R-EKF) with reduced complexities [6]. This paper is an extension to our previous work in [6], operating at higher baud rates and investigating operation with both QPSK and 16-QAM modulation formats. The UKF was found to provide performance enhancement in comparison with the previously proposed EKF or CMA/MMA algorithms but at the cost of increased complexity, whereas the R-UKF outperformed the previous algorithms with reduced complexity but at moderate or higher optical signal-to-noise ratios (OSNR).

Section 2 discusses the phase and polarization observation model and unscented Kalman filters in detail. Section 3 proposes modifications to reduce the complexities of Kalman filter algorithms. Section 4 describes the experimental setup and shows the performance of each algorithm for back-to-back noise loading as well as 800-km fiber optical link configuration for 20G-baud QPSK and 16-QAM modulated signals. We present our conclusions in Section 5. The appendices provide the unscented Kalman filter algorithm proposed in this paper and an intuitive and detailed discussion of the mathematical model.

2. Kalman filters

Unlike the CMA or other adaptive filter algorithms such as least mean square (LMS), recursive least square (RLS) etc. that calculate the tap coefficients of a digital finite impulse response filter towards reducing a cost function, Kalman filters use a mathematical model of transmission impairments to mitigate their effects. In other words, instead of estimating the correct coefficients of the taps in filter, Kalman filters attempt to estimate the accurate values of the parameters in the mathematical model of distortions. Since these parameters are random, the MMSE criterion is considered instead of the maximum likelihood (ML) [7]. When the noises associated with the mathematical model have Gaussian distributions, Kalman filters can provide optimal estimates in terms of minimum variance [3]. The mathematical model (‘observation’ model) for the impairments of phase noise and polarization rotation, is given as

Z = e^{j θ} (\begin{matrix} a + j b & c + j d \\ - c + j d & a - j b \end{matrix}) (\begin{matrix} T_{x} \\ T_{y} \end{matrix}) + N

where Z is the received polarization-multiplexed signal, T_x and T_y are the transmitted symbols on the X and Y polarizations respectively, [a, b, c, d] are the polarization state parameters determining the polarization rotation, θ is the phase noise parameter and N is the additive Gaussian noise term, mostly due to amplified spontaneous emission from the EDFAs in the system. A more detailed discussion of the observation model is given in Appendix B. The parameters [a, b, c, d] follow a Wiener process, i.e. a_{n} = a_{n - 1} + Δa, where n denotes the time instance and Δa is Gaussian distributed [4]. The parameters b, c and d follow similar distributions. Since the parameters follow a Wiener process, Wiener filters may seem appropriate for their estimation [2]. However, as the parameters [a, b, c, d, θ] are non-stationary, Kalman filters should give more-accurate estimates and thus prove to be optimum estimators for these parameters [7].

The Kalman filter used here estimates the parameters [a, b, c, d, θ] and performs the inverse of the mathematical model on the received signal. The estimates of [T_x T_y] can now be calculated using maximum likelihood, T_x and T_y being deterministic points of constellations.
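The inversion step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`jones`, `apply_model`, `invert_model`, `decide_qpsk`) are hypothetical, the noise term N is omitted, and the parameters are assumed to be known exactly rather than estimated by a filter.

```python
import cmath
import math

def jones(a, b, c, d):
    """2x2 polarization matrix of Eq. (1), stored as nested lists."""
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def apply_model(state, tx, ty):
    """Noise-free forward model: Z = exp(j*theta) * M * [Tx, Ty]."""
    a, b, c, d, theta = state
    m = jones(a, b, c, d)
    ph = cmath.exp(1j * theta)
    return (ph * (m[0][0] * tx + m[0][1] * ty),
            ph * (m[1][0] * tx + m[1][1] * ty))

def invert_model(state, zx, zy):
    """Inverse model: [Tx, Ty] = exp(-j*theta) * M^-1 * [Zx, Zy]."""
    a, b, c, d, theta = state
    m = jones(a, b, c, d)
    # For this matrix the determinant equals a^2 + b^2 + c^2 + d^2.
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    ph = cmath.exp(-1j * theta)
    tx = ph * (m[1][1] * zx - m[0][1] * zy) / det
    ty = ph * (-m[1][0] * zx + m[0][0] * zy) / det
    return tx, ty

def decide_qpsk(sym):
    """Nearest-point decision on the QPSK constellation (+-1 +-1j)/sqrt(2)."""
    re = 1.0 if sym.real >= 0 else -1.0
    im = 1.0 if sym.imag >= 0 else -1.0
    return complex(re, im) / math.sqrt(2)

if __name__ == "__main__":
    state = (0.8, 0.2, 0.5, 0.1, 0.3)  # illustrative [a, b, c, d, theta]
    tx = complex(1, 1) / math.sqrt(2)
    ty = complex(-1, 1) / math.sqrt(2)
    zx, zy = apply_model(state, tx, ty)
    print(invert_model(state, zx, zy))
```

With exact parameters the inverse recovers the transmitted pair up to rounding. In the filter itself the parameters are estimates, so the nearest-point decision (the maximum-likelihood choice for equiprobable symbols in Gaussian noise) absorbs the residual error.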

In spite of the benefits offered by Kalman filters, linear Kalman filters cannot be used for our application because the observation model in Eq. (1) is non-linear, as the phase parameter (θ) is an exponent. As a solution to this, Marshall et al. proposed using an EKF that performs the linearization of the non-linear model by taking its partial derivatives with respect to each parameter, then using this linearized model with regular LKF [4], as shown in Fig. 1(a). However, this linearization causes inaccuracies in the parameter estimation and thus only leads to sub-optimal estimates [3]. Hence, we propose to use an ‘unscented’ Kalman filter (UKF), which is able to cope with nonlinear models.

Fig. 1 Block representation of extended Kalman filter (EKF) and unscented Kalman filter (UKF).


The UKF calculates (2L + 1) sigma points, where L is the number of parameters, to accurately capture all the moments of the probability density function (PDF) of the parameters, then passes them through the actual model without linearization [3]. The sigma points, propagated through the actual model, capture the PDF of the signal at the output of the model more accurately than the EKF that just performs the scaling of the mean and variance of the PDF. As a result, UKF should provide better performance than the EKF [3].
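The difference between propagating sigma points and linearizing can be seen on the phase term alone. The sketch below is an illustration under stated assumptions, not the filter used in this work: it treats a single Gaussian parameter θ (L = 1), uses the basic unscented transform with the heuristic κ = 3 − L rather than the scaled weights of Appendix A, and compares the predicted mean of e^{jθ} with the closed-form value. All function names are hypothetical.

```python
import cmath
import math

def linearized_mean(theta_mean):
    """First-order (EKF-style) prediction: the function evaluated at the mean."""
    return cmath.exp(1j * theta_mean)

def unscented_mean(theta_mean, sigma, kappa=2.0):
    """Unscented prediction for one parameter (L = 1) using 2L + 1 = 3 points."""
    L = 1
    spread = math.sqrt((L + kappa) * sigma ** 2)
    points = [theta_mean, theta_mean + spread, theta_mean - spread]
    weights = [kappa / (L + kappa)] + [1.0 / (2 * (L + kappa))] * 2
    # Each sigma point goes through the nonlinear function unchanged.
    return sum(w * cmath.exp(1j * p) for w, p in zip(weights, points))

def exact_mean(theta_mean, sigma):
    """Closed form of E[exp(j*theta)] for Gaussian theta."""
    return cmath.exp(1j * theta_mean) * math.exp(-sigma ** 2 / 2)

if __name__ == "__main__":
    sigma = 0.5  # illustrative phase standard deviation in radians
    print("linearized:", abs(linearized_mean(0.0)))
    print("unscented: ", abs(unscented_mean(0.0, sigma)))
    print("exact:     ", abs(exact_mean(0.0, sigma)))
```

For this example the linearized prediction ignores the shrinkage of the mean caused by the phase spread, while the three-point unscented prediction follows the exact value closely.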

Figure 1(b) is the block diagram of the UKF. The unknown state parameter vector to be estimated is S = [a, b, c, d, θ], $\bar{t}$ = [T_x T_y], $P_{i}$ is the a posteriori estimate covariance, ${(S_{i})}_{k}^{-}$ is the k^th sigma point calculated, k = 1 to 2L + 1, and ${\hat{t}}_{i}$ is the estimated symbol vector. The UKF algorithm for our application is given in Appendix A.

3. Modified Kalman filters

As UKFs can provide more accurate estimates than EKFs, using a UKF should improve system performance. However, this improvement comes at the cost of increased algorithmic complexity, which arises from the additional computations required to calculate the sigma points. This has motivated us to reduce the complexity of the Kalman filter algorithms we use.

To avoid singularity issues in the system, the parameters [a, b, c, d, θ] are restricted to being real valued [4]. However, due to the phase noise term (e^jθ) in the observation model, the Kalman filter gives complex values for these parameters. Marshall et al. proposed splitting each complex row in the algorithm matrices into two consecutive rows, the first row being the real part and the second row being the imaginary part [4]. Although this solves the singularity problem, the overall complexity of the algorithm remains the same. As a solution to this, we propose to combine two real-valued parameters into a complex-valued parameter, i.e. take a + jb = $\tilde{a}$ and c + jd = $\tilde{c}$. As a result, the parameters to be estimated by the Kalman filter now become S = [$\tilde{a}$, $\tilde{c}$, θ]. We name the EKF and UKF with this new reduced state vector the R-EKF and R-UKF respectively. Since the Kalman filter assigns a complex value to each of these parameters, no splitting is required. Moreover, as the number of unknown parameters reduces from 5 in the previous versions to 3 in the modified versions, the orders of the algorithm’s complex matrices reduce from 5 × 5, 5 × m or m × 5 to 3 × 3, 3 × m or m × 3, where m can take values within {1,3}. This reduces the number of complex multiplications in the algorithm, thus reducing the computational complexity. Additionally, since the parameters a, b, c, d always appear in pairs such as a + jb and c + jd, and never individually in the algorithm, combining these parameters is straightforward. The combined parameters $\tilde{a}$ and $\tilde{c}$ still follow a Wiener process, i.e. $\tilde{a_{n}}$ = $\tilde{a_{n - 1}}$ + $Δ \tilde{a}$ and $\tilde{c_{n}}$ = $\tilde{c_{n - 1}}$ + $Δ \tilde{c}$, where $Δ \tilde{a}$ = $Δ a$ + j $Δ b$ and $Δ \tilde{c}$ = $Δ c$ + j $Δ d$, and thus have a circular Gaussian distribution in the complex plane with zero means and finite variances.
Table 1 shows the number of required complex multiplications per symbol for each algorithm and thus gives an idea of the complexity reduction with the proposed techniques.

Table 1. Number of complex multiplications per symbol detection required by the algorithms.


The UKF is more computationally complex than an EKF. However, with the proposed complexity reduction giving the R-UKF and R-EKF, an R-UKF is less computationally complex than an EKF, and the R-EKF has the least complexity of all algorithms considered in this study.
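The parameter combining behind the R-EKF and R-UKF can be sketched in a few lines of Python. The helper names are hypothetical and the entry counts only illustrate the 5 × 5 to 3 × 3 reduction of the covariance matrix; they are not the multiplication counts of Table 1.

```python
def reduce_state(a, b, c, d, theta):
    """Pack [a, b, c, d, theta] into the reduced state [a~, c~, theta]."""
    return complex(a, b), complex(c, d), theta

def jones_full(a, b, c, d):
    """Polarization matrix written with the four real parameters."""
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def jones_reduced(a_t, c_t):
    """Same matrix from the two complex parameters: -c + jd = -conj(c~), a - jb = conj(a~)."""
    return [[a_t, c_t],
            [-c_t.conjugate(), a_t.conjugate()]]

def covariance_entries(n_params):
    """Number of entries in the n x n state covariance matrix."""
    return n_params * n_params

if __name__ == "__main__":
    a_t, c_t, theta = reduce_state(0.8, 0.2, 0.5, 0.1, 0.3)
    print(jones_reduced(a_t, c_t) == jones_full(0.8, 0.2, 0.5, 0.1))
    print(covariance_entries(5), covariance_entries(3))
```

Because the second row of the matrix is fixed by the conjugates of the first, no information is lost by tracking only the two complex parameters and θ.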

4. Experimental setup

20-Gbaud digital signals were generated using an arbitrary waveform generator (AWG) to drive an optical IQ modulator. Polarization multiplexing was emulated using an optical delay line, polarization beam splitter (PBS) and combiner (PBC) as shown in Fig. 2. In the 'back-to-back' configuration, optical noise covering a 200-GHz bandwidth was added to vary the received OSNR. At the receiver end, the signal was amplified using an EDFA and filtered using a BPF with a 200-GHz bandwidth centered at the set transmission wavelength, then fed into a 25-GHz electrical bandwidth integrated coherent receiver. The outputs of the coherent receiver were connected to a 40-GSa/s 28-GHz bandwidth digital signal oscilloscope (DSO). The test algorithms were run as offline DSP.

Fig. 2 Experimental setup for a) back-to-back configuration and b) 800-km transmission link configuration; CMZM: complex Mach-Zehnder modulator, BPF: band pass filter, VOA: variable optical attenuator, ECL: external cavity laser, EDFA: erbium doped fiber amplifier.


Figure 3 gives the DSP chain performed offline. After being sampled at 40 GSa/s, i.e. at 2 samples per symbol, the complex signals in both polarizations were passed through static frequency-domain chromatic dispersion (CD) compensation using the overlap-add method [8] for the optical fiber link configuration. The frequency offsets in each polarization were then separately estimated using the spectral search method [9] in all cases and compensated for. The signals in each polarization were then resampled to one sample per symbol before being passed on to the algorithms under test. The internal clock of the DSO can be assumed to be stable, but a few phase distortions are still added to the sampled signal. However, these distortions can be taken care of by longer tap lengths in the CMA/MMA equalizers or by optimizing the initial variances in the Kalman filters. The optimum tap length in our system was found to be 41 taps for both CMA and MMA filters. Similarly, the Kalman filters, aided by their decision-directed nature, are able to equalize these effects by intelligently updating the parameters in the mathematical distortion model. This may even cause the Kalman filters to give complex values for the real parameters to be estimated, which again supports the concept of the R-UKF and R-EKF. As shown in Fig. 3, the Kalman filters replace the CMA + VVPE/MMA + ML algorithms.

Fig. 3 Digital signal processing flow for a) CMA + VVPE/MMA + ML b) Kalman filters under test.


While QPSK is used for 100Gb/s channels for optical transport, future 400Gb/s channels are likely to use 16QAM as a modulation scheme [10]. Thus, the performance of our proposed systems has been measured for 20 Gbaud 16-QAM and QPSK signals. The signal quality factor (Q) after recovery by the test algorithms was measured. At lower OSNRs where sufficient errors were measured (above the FEC limit), the Q-value calculated from the constellation variance (Q_SNR) is equal to the Q-value calculated from the finite BER (Q_BER), where Q_BER for any M-QAM modulation format is [11]

Q_{BER} = 20 \log_{10} \left( \sqrt{\frac{2 (M - 1)}{3}} \times \mathrm{erfc}^{-1} \left( \frac{BER \times \log_{2} \sqrt{M}}{1 - \frac{1}{\sqrt{M}}} \right) \right)
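The Q_BER expression can be evaluated with the Python standard library alone, as in the sketch below. The function names are hypothetical, and the inverse complementary error function is obtained from the standard normal quantile through the identity erfc^{-1}(x) = −Φ^{-1}(x/2)/√2.

```python
import math
from statistics import NormalDist

def erfc_inv(x):
    """Inverse complementary error function for 0 < x < 2."""
    return -NormalDist().inv_cdf(x / 2.0) / math.sqrt(2.0)

def q_from_ber(ber, M):
    """Q_BER in dB for square M-QAM, following the expression above."""
    arg = ber * math.log2(math.sqrt(M)) / (1.0 - 1.0 / math.sqrt(M))
    return 20.0 * math.log10(math.sqrt(2.0 * (M - 1) / 3.0) * erfc_inv(arg))

def ber_from_q(q_db, M):
    """Inverse mapping, used here only as a consistency check."""
    x = 10.0 ** (q_db / 20.0) / math.sqrt(2.0 * (M - 1) / 3.0)
    return (1.0 - 1.0 / math.sqrt(M)) / math.log2(math.sqrt(M)) * math.erfc(x)

if __name__ == "__main__":
    for M in (4, 16):
        print(M, q_from_ber(3.8e-3, M))
```

For a BER of 3.8 × 10⁻³ this gives a Q of about 8.5 dB for QPSK (M = 4) and roughly 15.2 dB for 16-QAM (M = 16).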

Although the R-UKF and R-EKF reduce the complexity, some changes need to be made to avoid the singularity in the system. The Jacobian matrix is

J = e^{j θ} [\begin{matrix} Z_{x} & j Z_{x} & Z_{y} & j Z_{y} & j (a + j b) Z_{x} + j (c + j d) Z_{y} \\ Z_{y} & - j Z_{y} & - Z_{x} & j Z_{x} & j (- c + j d) Z_{x} + j (a - j b) Z_{y} \end{matrix}]

For R-EKF it reduces to

J = [\begin{matrix} e^{j θ} Z_{x} & e^{j θ} Z_{y} & j (e^{j θ} \tilde{a} Z_{x} + e^{j θ} \tilde{c} Z_{y}) \\ e^{- j θ} Z_{y}^{*} & - e^{- j θ} Z_{x}^{*} & j (e^{- j θ} \tilde{c} Z_{x}^{*} - e^{- j θ} \tilde{a} Z_{y}^{*}) \end{matrix}]

For the R-EKF, the data symbol estimated in the Y polarization is the conjugate of the actual transmitted symbol. For both the R-UKF and R-EKF, the estimates of the parameter θ will be complex. We noted that the system avoids singularity only when that complex value is considered. Taking only the real or imaginary part, or the absolute value of the complex estimate, leads to either singularity or divergence of the filter. The explanation for this is that, since we have combined two parameters into one, the three parameters work together to enforce the received signal transformations onto the desired constellation points. This causes the Kalman filters to make the θ parameter complex valued. In other words, the modified Kalman filters have fewer degrees of freedom to achieve the desired results and thus need all the parameters to be complex. Additionally, in our system, the values for the observation noise variance σ_{z}^{2} and the process noise variance σ_{p}^{2} giving optimum performance were found to be 10⁻¹ and 10⁻⁸ for the EKF and UKF, and 5 × 10⁻¹ and 10⁻⁴ for the R-EKF and R-UKF respectively. This increased variance leads to a lower performance at poor OSNRs.
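The five-parameter Jacobian given earlier in this section can be checked numerically against central finite differences of the model it linearizes. The sketch below assumes the model h(S) = e^{jθ} M(a, b, c, d) [Z_x, Z_y]^T implied by that Jacobian; the function names and test values are hypothetical.

```python
import cmath

def model(state, zx, zy):
    """h(S) = exp(j*theta) * M(a, b, c, d) * [Zx, Zy]."""
    a, b, c, d, theta = state
    ph = cmath.exp(1j * theta)
    return [ph * (complex(a, b) * zx + complex(c, d) * zy),
            ph * (complex(-c, d) * zx + complex(a, -b) * zy)]

def jacobian(state, zx, zy):
    """Analytic 2x5 Jacobian with columns ordered as [a, b, c, d, theta]."""
    theta = state[4]
    ph = cmath.exp(1j * theta)
    out = model(state, zx, zy)
    # The theta column is j times the model output itself.
    return [[ph * zx, 1j * ph * zx, ph * zy, 1j * ph * zy, 1j * out[0]],
            [ph * zy, -1j * ph * zy, -ph * zx, 1j * ph * zx, 1j * out[1]]]

def numeric_jacobian(state, zx, zy, h=1e-6):
    """Central-difference approximation of the same Jacobian."""
    cols = []
    for k in range(5):
        up, dn = list(state), list(state)
        up[k] += h
        dn[k] -= h
        fu, fd = model(up, zx, zy), model(dn, zx, zy)
        cols.append([(fu[r] - fd[r]) / (2 * h) for r in range(2)])
    return [[cols[k][r] for k in range(5)] for r in range(2)]

if __name__ == "__main__":
    s = (0.8, 0.2, 0.5, 0.1, 0.3)
    zx, zy = complex(0.6, -0.4), complex(-0.3, 0.7)
    ja, jn = jacobian(s, zx, zy), numeric_jacobian(s, zx, zy)
    print(max(abs(ja[r][k] - jn[r][k]) for r in range(2) for k in range(5)))
```

A check of this kind is a cheap way to catch sign errors in the partial derivatives before they enter an EKF update.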

4.1 Experimental results

Figure 4 shows the performance of the proposed algorithms in terms of signal quality factor (Q) against received OSNR. At high OSNR for the QPSK case, the UKF and R-UKF both outperform the EKF, showing an increase in Q by 1.7-dB and 1.4-dB respectively at 20-dB OSNR. Similarly, for the 16-QAM case in Fig. 4, the UKF and R-UKF give 1.4-dB and 1-dB improvement over EKF at 20-dB OSNR. The unscented transformation used in UKF and R-UKF leads to more-accurate capture of the a priori and a posteriori estimate error co-variances [12], and hence we get these benefits of UKF and R-UKF over EKF.

Fig. 4 Q-value vs. OSNR in back-to-back configuration for 20 Gbaud QPSK and 20 Gbaud 16-QAM respectively.


Figure 4 also shows the UKF and R-UKF performance compared with the conventional constant-modulus algorithm (CMA) (41 taps) and the Viterbi and Viterbi phase estimation (VVPE) algorithm [1]. The number of taps was chosen to give the optimum Q performance, as discussed before. In this case, the UKF and R-UKF give up to 2-dB and up to 1.5-dB improvements in received signal Q at 20-dB OSNR, respectively. In the 16-QAM case, the VVPE algorithm is not a strong phase estimation tool due to the higher noise susceptibility of the 16-QAM signal. Thus, a multi-modulus algorithm (MMA) [13] aided by a pilot-based maximum likelihood (ML) technique [14] is an alternative method to estimate the constellation symbols [2]. Figure 4 shows that for the 16-QAM case, the EKF and R-EKF give 0.7-dB and 0.6-dB performance gains over the MMA + ML techniques, respectively, at 20-dB OSNR, which was not observed in the QPSK case with the CMA + VVPE algorithm. Additionally, the UKF and R-UKF continue their trends and give 2.2-dB and 1.7-dB improvements over the MMA + ML scheme. These improvements arise from a more intelligent update of the Kalman gain using the a priori and a posteriori estimate error covariances, in contrast to the constant step size of CMA/MMA. The improvement over CMA/MMA is smaller for the EKF since the linearization process degrades the accuracy of the estimate covariances.

While these measurements of the system performance at high OSNR confirm the expected improvements, fiber communication systems are often run near their 'error-free' thresholds, commonly taken as the operating threshold for Reed-Solomon (255, 239) hard-decision forward-error correction (FEC). For the back-to-back QPSK results shown in Fig. 4, at low OSNR, due to the higher covariances needed by the R-UKF, this algorithm shows a 0.6-dB OSNR penalty at the 7% hard FEC threshold compared with the EKF. The implementation penalty at the hard FEC limit is 1.3-dB for the UKF, CMA + VVPE and EKF, 1.9-dB for the R-UKF, and reaches a maximum of 2.4-dB for the R-EKF.

Although the R-UKF shows reduced performance at lower OSNRs compared with the UKF and EKF for the reasons mentioned in the QPSK case, it still shows negligible OSNR penalty at the 7% hard-FEC limit compared with the EKF for the 16-QAM case. Additionally, the UKF shows an improvement of 1-dB in required OSNR at the FEC limit. The implementation OSNR penalty suffered at the 7% hard-FEC limit for a 16-QAM modulated signal is 0.9-dB for the UKF, 1.9-dB for the R-UKF, 2.1-dB for the EKF and 2.7-dB for MMA + ML and the R-EKF. Comparing with the penalty values for QPSK, it can be inferred that the EKF, R-EKF and MMA + ML algorithms are not good alternatives for systems supporting both QPSK and 16-QAM. On the other hand, the UKF, with increased complexity, has a reduced implementation penalty for 16-QAM, and the R-UKF, with lower complexity than the EKF, has a consistent implementation penalty independent of whether QPSK or 16-QAM is being received.

Recent research in optical communication focuses on 400G systems that require higher-order QAM modulation formats and powerful error correcting techniques for making long distance transmission possible at these high data rates. The soft-decision forward error correction (SD-FEC) techniques such as low-density parity check (LDPC) codes and Turbo codes are being widely explored with an intention of achieving the post-FEC bit error rate (BER) of 10⁻¹⁵ [15]. For this, the hard FEC techniques, RS(255,239), i.e. with 7% overhead, require a pre-FEC BER better than 3.8 $\times$ 10⁻³. On the other hand, the SD-FEC techniques can achieve the desired post-FEC BER of 10⁻¹⁵ even with worse pre-FEC BERs of 2.7 $\times$ 10⁻², but require a 20% overhead [16]. At this 20% SD-FEC limit for 16-QAM modulated signal, the UKF, EKF and the MMA + ML algorithms give similar performances, whereas, the R-UKF and R-EKF give OSNR penalties of 0.8-dB and 1.3-dB respectively. Hence, for higher pre-FEC BERs, the reduction in complexity in R-UKF and R-EKF comes at the cost of increase in OSNR penalty compared with other algorithms. So for 16-QAM with a 20% FEC overhead, the comparative performance of the algorithms used here is similar to the case of a 7% FEC overhead for QPSK. Nonetheless, for systems that are designed for better pre-FEC BER, the R-UKF can prove to be a good alternative to the EKF. We note that a given pre-FEC BER as a SD-FEC threshold may not be the most accurate metric [17], but this does provide us with an approximate bound for comparison.

To understand the performance of the test algorithms in a practical transmission scenario, the signal was transmitted over 800 km. The optical signal was passed through 10 spools of optical fiber, each of length 80 km. The span’s launch powers were controlled by EDFAs placed before each spool, with a final EDFA placed as a pre-amplifier before the receiver.

Figure 5 shows the Q performance of each algorithm under test with varying launch power for both QPSK and 16-QAM modulated signals. It shows that each algorithm has the same optimal launch power. At higher powers, in the highly nonlinear transmission regime, they give similar performance since the phase tracking is lost, as expected: none of the algorithms is capable of compensating for the fast state changes caused by nonlinear distortions in the fiber. In the QPSK case, the EKF gave no apparent improvement over CMA + VVPE, whereas in the 16-QAM case, as observed at higher OSNRs, the EKF and R-EKF give a marginal peak improvement over MMA + ML. At the peak Q, we expect the signal to have a high OSNR, leading to the increased performance of the UKF (2.1-dB for QPSK and 2.3-dB for 16-QAM) and R-UKF (1.8-dB for QPSK and 2.1-dB for 16-QAM) implementations over CMA + VVPE/MMA + ML at optimal launch power. For the 16-QAM case, at the 7% hard FEC limit, the UKF showed 2-dB, the R-UKF 1-dB, the EKF 0.7-dB and the R-EKF negligible improvement in the required launch power compared with the MMA + ML algorithm. Thus, the UKF requires the lowest launch power to achieve the required pre-FEC BER of 3.8 × 10⁻³ but comes with higher computational complexity. The R-UKF may prove to be a pragmatic option since it is less complex than the EKF and requires less launch power to reach the FEC threshold. The R-EKF requires the highest power and thus could be a feasible option only when low complexity is of utmost importance.

Fig. 5 Q-value vs. launch power (dBm) in 800 km link configuration for 20 Gbaud QPSK and 16-QAM respectively.


4.2 Simulation results

In addition to transmission impairments and optical noise, the finite linewidth of the lasers used in the systems can vary from device to device. Although tuneable external cavity lasers (as used in our experiments) generally confine their linewidth within 100 kHz, for DWDM applications laser linewidths may exceed this value. Hence, we simulated the back-to-back setup in VPItransmissionMaker v. 9.3 software with the linewidth of both the transmitter’s continuous wave (CW) laser and the receiver’s local oscillator CW laser varied from 100 kHz to 1 MHz as shown in Fig. 6.

Fig. 6 Q (dB) vs linewidth (kHz) for 8 dB OSNR, QPSK; 20 dB OSNR, QPSK; 15 dB OSNR, 16 QAM and 20 dB OSNR, 16 QAM respectively.


Simulations were performed with 20-Gbaud polarization-multiplexed QPSK and 16-QAM modulated signals at 20-dB OSNR and also at OSNRs close to the EKF's respective 7% hard FEC limits (8-dB for QPSK and 15-dB for 16-QAM). The angular rotation rate of polarization evolution used for the simulations is 6.8 Mrad/s. This high polarization rotation velocity was chosen to make a fair comparison with the simulations shown by Marshall et al. in [4]. Additionally, fixing the polarization rotation velocity at a value that stresses the system beyond practical limits tests the robustness of the system.

The degradation in signal Q relative to the performance with 100-kHz linewidths is between 0.1-dB and 0.3-dB for all cases explored at laser linewidths of 600 kHz, between 0.5-dB and 0.7-dB at 800 kHz, and between 0.9-dB and 1.1-dB at 1 MHz. As observed from the back-to-back experimental results, the performance shows a 1-dB variation with a 1-dB change in OSNR around the hard FEC limit. Considering this and the simulation results in Fig. 6, it can be concluded that lasers with 1 MHz linewidths require only a 1-dB increase in OSNR compared with that required for lasers with 100 kHz linewidths. Thus, the algorithms appear to be sufficiently robust for stressed systems with linewidths up to 1 MHz and polarization rotations up to 6.8 Mrad/s.

The observation model in Eq. (1) does not consider the differential group delay (DGD), i.e. the effects of polarization mode dispersion (PMD). To understand the robustness of the system to PMD, simulations were performed with 20-Gbaud modulated signals over 800 km, with the PMD parameter of the fiber varied from 0.028 to 0.084. This is equivalent to sweeping the PMD from 25 ps (less than one symbol duration) to 75 ps (more than one symbol duration) for a 20-Gbaud signal. The results are shown in Fig. 7. A standard SMF-28 fiber tends to give a maximum of 28-ps DGD over an 800 km distance, which corresponds to a negligible performance penalty for both the QPSK and 16-QAM signals. A penalty of 1-dB occurred for a DGD of 65 ps for all cases, corresponding to links of 2000 km of standard SMF-28 fiber. Hence, the algorithms could be implemented in practice. Additionally, the proposed systems were tested with artificially added absolute frequency offsets of up to 1 GHz and showed variations in the performance of the test algorithms within 0.4-dB, verifying the robustness to frequency offsets.
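The relation between the swept PMD parameter and the quoted delays can be reproduced with the usual square-root length scaling of mean DGD. The units of the PMD parameter are not stated in the text; the sketch below assumes ps/√m, an inference made only because it reproduces the quoted 25 ps and 75 ps over 800 km. The function names are hypothetical.

```python
import math

def mean_dgd_ps(pmd_ps_per_sqrt_m, length_m):
    """Mean DGD in ps, assuming DGD = PMD parameter * sqrt(length)."""
    return pmd_ps_per_sqrt_m * math.sqrt(length_m)

def symbol_period_ps(baud):
    """Symbol duration in ps for a given symbol rate in baud."""
    return 1e12 / baud

if __name__ == "__main__":
    length_m = 800e3  # 800 km link
    period = symbol_period_ps(20e9)
    for pmd in (0.028, 0.084):
        dgd = mean_dgd_ps(pmd, length_m)
        print(pmd, round(dgd, 1), "ps,", round(dgd / period, 2), "symbol periods")
```

At 20 Gbaud the symbol period is 50 ps, so the two ends of the sweep sit at about half and one and a half symbol periods respectively.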

Fig. 7 Q (dB) vs. PMD (ps) for 8 dB OSNR, QPSK; 20 dB OSNR, QPSK; 15 dB OSNR, 16 QAM and 20 dB OSNR, 16 QAM respectively.


The intelligent update of the Kalman gain based on statistical information, together with the decision-directed nature of the Kalman filters, allows the Kalman filters to keep working even when the DGD is slightly higher than one symbol. However, we do observe performance degradation, as shown in Fig. 7. If the DGD is increased further, we can expect that the system will at some point completely lose track, resulting in a sharp drop in performance. Also, the FIR filters used for CMA/MMA in this work were symbol-period spaced. A performance comparison of the Kalman filters under test with fractionally-spaced time domain equalizers (FS-TDE) remains to be explored. Additionally, the maximum tolerable residual CD after overlap-add static CD compensation remains to be investigated and can be taken up as future work.

Overall, these results indicate that, because it provides only a marginal performance improvement for higher modulation formats, the EKF is not a good alternative to the conventional CMA + VVPE/MMA + ML algorithms, whereas the larger performance improvement from the UKF comes at the cost of increased complexity. For particular cases where moderate or high OSNRs are available in the system, the R-UKF seems to provide a good alternative to CMA + VVPE/MMA + ML in terms of performance and complexity.

5. Conclusions

We have investigated and compared two types of Kalman filters for joint polarization state tracking and phase noise mitigation: unscented Kalman filters and previously proposed extended Kalman filters. A comparison was also made with conventional blind CMA algorithms for QPSK and MMA for 16-QAM modulation formats. We have shown through experiments that our proposed UKF for joint polarization and phase tracking outperforms the previously proposed EKF algorithm and the conventional CMA + VVPE/MMA + ML algorithms at the cost of increased complexity. The EKF used shows no improvement over CMA + VVPE algorithm when tracking polarization and phase on a QPSK modulated signal throughout the OSNR range investigated and very marginal improvement over MMA + ML algorithm for higher QAM modulation formats at high OSNRs.

We have proposed reduced-complexity versions of the UKF and EKF, the R-UKF and R-EKF, for joint polarization and phase noise tracking. The R-UKF outperforms both the EKF and CMA + VVPE/MMA + ML for moderate and higher OSNRs (i.e. >10-dB for QPSK and >13-dB for 16-QAM), while requiring fewer computations than the EKF. Although the R-EKF is less complex than the EKF, its performance is compromised for target OSNRs (i.e. <18 dB for QPSK and <16 dB for 16-QAM). After transmission over an 800-km optical fiber link, all of the algorithms attain their peak performance at the same launch power; the UKF and R-UKF outperform the other algorithms, whose performances are similar except in the low launch power region, where the EKF shows marginal improvement over CMA + VVPE and the R-EKF. In the case of 16-QAM signals, the UKF requires the lowest launch power at the 7% hard FEC limit, followed by the R-UKF, EKF, R-EKF and MMA + ML.

Overall, the Kalman filters give flexibility against changes in modulation format compared with the conventional systems. A UKF gives the best performance of all the algorithms considered, at the cost of increased complexity. For systems that require lower complexity, an R-UKF proves to be an appropriate choice if its OSNR requirements are met.

Appendix A Unscented Kalman filter algorithm

S_{0} = {[0, 0, ...0]}_{1 \times L}^{T}

P_{0} = {[0]}_{L \times L}

L = 5 for UKF and EKF; L = 3 for R-UKF

A = I_{L \times L}

S_{i}^{-} = A S_{i - 1}^{-}

P_{i}^{-} = A P_{i - 1}^{-} A^{T} + Q

Calculate Sigma points:

{(S_{i})}_{k}^{-} = S_{i}^{-} + \sqrt{(L + λ) {(P_{i}^{-})}_{k}}

where $λ$ = L(10⁻³+1) and ${(P_{i}^{-})}_{k}$ is the k^th column of $P_{i}^{-}$.

{(S_{i})}_{f} = \sum_{k = 0}^{2 L} ω_{k}^{(m)} {(S_{i})}_{k}^{-}

t_{i k} = e^{j {- {(S_{i})}_{k}^{-} (5)}} [\begin{matrix} {{(S_{i})}_{k}^{-} (1)} + j {{(S_{i})}_{k}^{-} (2)} & {{(S_{i})}_{k}^{-} (3)} + j {{(S_{i})}_{k}^{-} (4)} \\ {- {(S_{i})}_{k}^{-} (3)} + j {{(S_{i})}_{k}^{-} (4)} & {{(S_{i})}_{k}^{-} (1)} - j {{(S_{i})}_{k}^{-} (2)} \end{matrix}] [\begin{matrix} Z_{x} \\ Z_{y} \end{matrix}]

{\hat{t}}_{i} = \sum_{k = 0}^{2 L} ω_{k}^{(m)} t_{i k}

P_{t t} = \sum_{k = 0}^{2 L} ω_{k}^{(c)} (t_{i k} - {\hat{t}}_{i}) {(t_{i k} - {\hat{t}}_{i})}^{*} + R

P_{t s} = \sum_{k = 0}^{2 L} ω_{k}^{(c)} ({(S_{i})}_{k}^{-} - {(S_{i})}_{f}) {(t_{i k} - {\hat{t}}_{i})}^{*}

ω_{0}^{(m)} = λ / (L + λ)

ω_{0}^{(c)} = {λ / (L + λ)} + (1 - α^{2} + β)

For a Gaussian distribution of the parameters, β = 2.

Kalman gain: (G_i) = $P_{t s} P_{t t}^{- 1}$

S_{i} = {(S_{i})}_{f} + G_{i} [{\hat{t}}_{i} - \mathrm{decision}({\hat{t}}_{i})]

P_{i} = P_{i}^{-} - G_{i} P_{t t} G_{i}^{*}
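The sigma-point and weight calculations above can be sketched for a real-valued state as follows. This is an illustration under stated assumptions rather than the implementation used in this work: λ is computed with the common scaled-unscented-transform convention λ = α²(L + κ) − L, the remaining weights are taken as 1/(2(L + λ)), the default α = 10⁻³ and κ = 0 are illustrative, the matrix square root is a Cholesky factor, and the covariance must be positive definite (an all-zero covariance would not factorize). All function names are hypothetical.

```python
import math

def cholesky(p):
    """Lower-triangular c with c * c^T = p (p real, symmetric, positive definite)."""
    n = len(p)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(p[i][i] - s)
            else:
                c[i][j] = (p[i][j] - s) / c[j][j]
    return c

def sigma_points(mean, cov, alpha=1e-3, beta=2.0, kappa=0.0):
    """Return the 2L + 1 sigma points with their mean and covariance weights."""
    L = len(mean)
    lam = alpha ** 2 * (L + kappa) - L  # assumed convention for lambda
    root = cholesky([[(L + lam) * v for v in row] for row in cov])
    points = [list(mean)]
    for sign in (1.0, -1.0):
        for k in range(L):
            # Offset the mean by plus or minus the k-th column of the square root.
            points.append([m + sign * root[r][k] for r, m in enumerate(mean)])
    w_m = [lam / (L + lam)] + [1.0 / (2 * (L + lam))] * (2 * L)
    w_c = list(w_m)
    w_c[0] += 1.0 - alpha ** 2 + beta
    return points, w_m, w_c

def weighted_mean(points, w_m):
    """Recombine the sigma points into a mean estimate."""
    return [sum(w * p[r] for w, p in zip(w_m, points))
            for r in range(len(points[0]))]

if __name__ == "__main__":
    mean = [0.1, -0.2, 0.3]
    cov = [[0.04, 0.01, 0.0], [0.01, 0.09, 0.02], [0.0, 0.02, 0.16]]
    pts, w_m, w_c = sigma_points(mean, cov)
    print(len(pts), weighted_mean(pts, w_m))
```

Recombining the unpropagated sigma points with the mean weights returns the original mean, and the covariance weights return the original covariance, which is a useful sanity check before the points are passed through the observation model.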

Appendix B Description of the mathematical observation model

The conventional mathematical model or Jones matrix for any birefringent material is given as [18]

M = [\begin{matrix} e^{j η / 2} \cos^{2} ϑ + e^{- j η / 2} \sin^{2} ϑ & (e^{j η / 2} - e^{- j η / 2}) e^{- j ϕ} \cos ϑ \sin ϑ \\ (e^{j η / 2} - e^{- j η / 2}) e^{j ϕ} \cos ϑ \sin ϑ & e^{- j η / 2} \cos^{2} ϑ + e^{j η / 2} \sin^{2} ϑ \end{matrix}]

where η is the relative phase retardation induced between the fast axis and the slow axis, ϑ is the orientation of the fast axis with respect to the horizontal axis and ϕ is the circularity, i.e. ϕ = 0 for linear retarders and ϕ = ±π/2 for circular retarders. It can be observed that M forms a scaled unitary matrix that agrees with the properties of a Jones matrix.

The elements of the Jones matrix M are complex and can be written in phasor form as Xe^jY or in Cartesian form as a+jb. In our work, we chose to write the Jones matrix in Cartesian form as

M = [\begin{matrix} a + j b & c + j d \\ - c + j d & a - j b \end{matrix}]
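As a quick numerical check of the Cartesian form, the following sketch builds M from (η, ϑ, ϕ) using the Jones matrix given above, reads a, b, c, d off its first row, and rebuilds the matrix from those four real numbers. The function names and the test angles are hypothetical and chosen only for illustration.

```python
import cmath
import math

def jones_matrix(eta, theta, phi):
    """Jones matrix of a general birefringent element (retardation eta,
    fast-axis orientation theta, circularity phi)."""
    ep = cmath.exp(1j * eta / 2)
    em = cmath.exp(-1j * eta / 2)
    cs = math.cos(theta) * math.sin(theta)
    m11 = ep * math.cos(theta) ** 2 + em * math.sin(theta) ** 2
    m12 = (ep - em) * cmath.exp(-1j * phi) * cs
    m21 = (ep - em) * cmath.exp(1j * phi) * cs
    m22 = em * math.cos(theta) ** 2 + ep * math.sin(theta) ** 2
    return [[m11, m12], [m21, m22]]

def cartesian_parameters(M):
    """Read (a, b, c, d) off the first row of M."""
    return M[0][0].real, M[0][0].imag, M[0][1].real, M[0][1].imag

def from_cartesian(a, b, c, d):
    """Rebuild M = [[a + jb, c + jd], [-c + jd, a - jb]]."""
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

# Arbitrary test angles in radians (assumed values).
M = jones_matrix(0.7, 0.3, 1.1)
a, b, c, d = cartesian_parameters(M)
M_rebuilt = from_cartesian(a, b, c, d)
max_error = max(abs(M[r][k] - M_rebuilt[r][k]) for r in range(2) for k in range(2))
norm = a * a + b * b + c * c + d * d
```

For this matrix the second row is fully determined by the first, so four real parameters suffice, and a² + b² + c² + d² equals 1 up to rounding.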

The reason for choosing the Cartesian form is that Kalman filters require the parameters to be estimated to follow a Wiener process, i.e. an unknown parameter z should follow $z_{n + 1} = z_{n} + Δz$, where the subscript denotes the time instance and Δz is normally distributed such that $Δz \sim \mathcal{N}(0, σ_{z}^{2})$. Since the real and imaginary parts of the elements of M follow a Wiener process [4], the Cartesian form is used.

Some researchers may prefer the phasor form to the Cartesian form. However, we show here that the phasor form is not suitable for Kalman filters. Consider the matrix element a+jb with equivalent phasor form Xe^jY, where a and b follow a Wiener process, i.e. a_{n+1} = a_n + ∆a and b_{n+1} = b_n + ∆b. To use Xe^jY in a Kalman filter, X and Y should also follow a Wiener process.

Now, X_{n} = ‖ a_{n} + j b_{n} ‖ = \sqrt{a_{n - 1}^{2} + b_{n - 1}^{2} + {(Δ a)}^{2} + {(Δ b)}^{2} + 2 a_{n - 1} Δ a + 2 b_{n - 1} Δ b}

X_{n} = X_{n - 1} \sqrt{1 + \frac{{(Δ a)}^{2} + {(Δ b)}^{2} + 2 a_{n - 1} Δ a + 2 b_{n - 1} Δ b}{X_{n - 1}^{2}}} (from (7))

Now consider ∆X = X_{n} - X_{n - 1}:

Δ X = X_{n - 1} (\sqrt{1 + \frac{{(Δ a)}^{2} + {(Δ b)}^{2} + 2 a_{n - 1} Δ a + 2 b_{n - 1} Δ b}{X_{n - 1}^{2}}} - 1)

The term (∆a)² + (∆b)² is exponentially distributed and (2a_{n-1}∆a + 2b_{n-1}∆b) is Gaussian distributed, ∆a and ∆b being Gaussian distributed. The sum of an exponentially distributed term and a Gaussian distributed term is not Gaussian. As a result, ∆X is not Gaussian distributed and X does not follow a Wiener process, making the phasor form Xe^jY unsuitable for Kalman filters.

Intuitively, since X is a magnitude and therefore non-negative, for X_{n-1} equal to any positive number r we have ∆X = X_n - X_{n-1} = X_n - r. Thus, ∆X can only take values in the range [-r, ∞). For ∆X to take values over the full range (-∞, ∞), X_{n-1} would have to be infinite, which is not possible in practical systems. This contradicts the properties of the Gaussian distribution, whose support is (-∞, ∞). As a result, again, ∆X is not Gaussian distributed and X does not follow a Wiener process, making the phasor form Xe^jY unsuitable for Kalman filters.
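The argument above can be illustrated with a small Monte Carlo experiment. The sketch below draws Gaussian increments ∆a and ∆b, computes the resulting increment ∆X of the magnitude, and then inspects the lower bound, mean and skewness of ∆X. The function names, step size, sample count and starting point are arbitrary assumptions made for illustration only.

```python
import math
import random

def magnitude_increments(a, b, sigma, n, seed=0):
    """Sample dX = |(a + da) + j(b + db)| - |a + jb| for Gaussian da, db."""
    rng = random.Random(seed)
    x_prev = math.hypot(a, b)
    increments = []
    for _ in range(n):
        da = rng.gauss(0.0, sigma)
        db = rng.gauss(0.0, sigma)
        increments.append(math.hypot(a + da, b + db) - x_prev)
    return increments

def skewness(xs):
    """Sample skewness (third standardized moment); zero for a symmetric law."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return sum((x - mean) ** 3 for x in xs) / (n * var ** 1.5)

# Starting point with magnitude r = 0.5 and unit-variance steps (assumed values).
dX = magnitude_increments(0.3, 0.4, sigma=1.0, n=20000)
lowest = min(dX)              # never falls below -r
mean_dX = sum(dX) / len(dX)   # a zero-mean Gaussian increment would give about 0
skew_dX = skewness(dX)        # a Gaussian increment would give about 0
```

A zero-mean Gaussian increment would be symmetric about zero and unbounded below. Here the samples are bounded below by -r and are skewed towards positive values, which is consistent with the conclusion that X does not follow a Wiener process.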

Relevance to Poincare sphere

Since the a, b, c, d parameters are the real and imaginary parts of the elements of M, equating the two forms of M given above yields $a = \cos (η / 2); b = \sin (η / 2) \cos 2 ϑ; c = \sin (η / 2) \sin ϕ \sin 2 ϑ; d = \sin (η / 2) \cos ϕ \sin 2 ϑ$. Calculating the values of these parameters for different polarizations from the corresponding values of η, ϑ and ϕ, the relevance of the a, b, c, d parameters to the Poincare sphere can be deduced as shown in Fig. 8.

Fig. 8 Poincare sphere showing different polarizations with Stokes parameter and equivalent [a, b, c, d] parameters in the form {(S1, S2, S3);(a, b, c, d)}.


Thus, it can be concluded that c and d denote the X and Y co-ordinates of the Poincare sphere, i.e. the normalized S1 and S2 Stokes parameters in Fig. 8 respectively, whereas the Z co-ordinate of the sphere (normalized S3) is denoted by (2a+b) [19]. Hence, any polarization state on or within the Poincare sphere can be denoted in the form of a, b, c, d parameters and equivalent Jones matrix M can be derived from Eq. (A2).

Funding

Australian Research Council’s (ARC) Centre of Excellence and Laureate Fellowship schemes (CE110001018, FL130100041).

Acknowledgment

We thank VPIphotonics (www.vpiphotonics.com) for their support under the university program.

References and links

1. D. S. Millar and S. J. Savory, “Blind adaptive equalization of polarization-switched QPSK modulation,” Opt. Express 19(9), 8533–8538 (2011).

2. M. G. Taylor, “Phase estimation methods for optical coherent detection using digital signal processing,” J. Lightwave Technol. 27(7), 901–914 (2009).

3. E. A. Wan and R. van der Merwe, “The unscented Kalman filter for nonlinear estimation,” in Proc. IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium Conference (2000), pp. 153–158.

4. T. Marshall, B. Szafraniec, and B. Nebendahl, “Kalman filter carrier and polarization-state tracking,” Opt. Lett. 35(13), 2203–2205 (2010).

5. Y. Yang, G. Cao, K. Zhong, X. Zhou, Y. Yao, A. P. T. Lau, and C. Lu, “Fast polarization-state tracking scheme based on radius-directed linear Kalman filter,” Opt. Express 23(15), 19673–19680 (2015).

6. J. Jokhakar, B. Corcoran, C. Zhu, and A. J. Lowery, “Unscented Kalman filters for polarization state tracking and phase noise mitigation,” in Proc. of Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2016), paper Tu2A.4.

7. S. Kay, Fundamentals of Statistical Signal Processing, Volume 1: Estimation Theory (Wiley, 1993).

8. T. Xu, G. Jacobsen, S. Popov, M. Forzati, J. Mårtensson, M. Mussolin, J. Li, K. Wang, Y. Zhang, and A. T. Friberg, “Frequency-domain chromatic dispersion equalization using overlap-add methods in coherent optical system,” J. Opt. Commun. 32(2), 131–135 (2011).

9. M. Selmi, Y. Jaouen, and P. Ciblat, “Accurate digital frequency offset estimator for coherent PolMux QAM transmission systems,” in Proc. of European Conference and Exhibition on Optical Communication (2009), pp. 1–2.

10. OIF-Tech-Options-400G–01.0 – Technology Options for 400G Implementation (July 2015) (White paper). Available: http://www.oiforum.com/wp-content/uploads/OIF-Tech-Options-400G-01.0.pdf.

11. R. A. Shafik, M. S. Rahman, and A. H. M. R. Islam, “On the extended relationships among EVM, BER and SNR as performance metrics,” in Proc. of International Conference on Electrical and Computer Engineering (2006), pp. 408–411.

12. H. Louchet, K. Kuzmin, and A. Richter, “Improved DSP algorithms for coherent 16-QAM transmission,” in Proc. of European Conference and Exhibition on Optical Communication (2008), pp. 57–58.

13. C. Zhu, A. V. Tran, S. Chen, L. B. Du, T. Anderson, A. J. Lowery, and E. Skafidas, “Frequency-domain blind equalization for long-haul coherent pol-mux 16-QAM system with CD prediction and dual-mode adaptive algorithm,” IEEE Photonics J. 4(5), 1653–1661 (2012).

14. S. Zhang, P. Y. Kam, C. Yu, and J. Chen, “Decision-aided carrier phase estimation for coherent optical communications,” J. Lightwave Technol. 28(11), 1597–1607 (2010).

15. D. Chang, F. Yu, Z. Xiao, N. Stojanovic, F. N. Hauske, Y. Cai, C. Xie, L. Li, X. Xu, and Q. Xiong, “LDPC convolutional codes using layered decoding algorithm for high speed coherent optical transmission,” in Proc. of Optical Fiber Communication Conference, OSA Technical Digest (Optical Society of America, 2012), paper OW1H.4.

16. X. Zhou, L. E. Nelson, P. Magill, R. Isaac, B. Zhu, D. W. Peckham, P. I. Borel, and K. Carlson, “High spectral efficiency 400 Gb/s transmission using PDM time-domain hybrid 32–64 QAM and training-assisted carrier recovery,” J. Lightwave Technol. 31(7), 999–1005 (2013).

17. A. Alvarado, E. Agrell, D. Lavery, R. Maher, and P. Bayvel, “Replacing the soft-decision FEC limit paradigm in the design of optical communication systems,” J. Lightwave Technol. 33(20), 4338–4352 (2015).

18. R. Driggers, Encyclopedia of Optical Engineering, Volume 2 (Marcel Dekker Inc., 2003).

19. H. G. Berry, G. Gabrielse, and A. E. Livingston, “Measurement of the Stokes parameters of light,” Appl. Opt. 16(12), 3200–3205 (1977).

Algorithm	Name	Number of complex multiplications per symbol
UKF	Unscented Kalman filter	140
R-UKF	Reduced - Unscented KF	90
EKF	Extended Kalman filter	95
R-EKF	Reduced - Extended KF	72


Optics Express