Blind, fast and SOP independent polarization recovery for square dual polarization–MQAM formats and optical coherent receivers

Mathieu Chagnon; Mohamed Osman; Xian Xu; Qunbi Zhuge; David V. Plant

doi:10.1364/OE.20.027847

1. Introduction

Optical coherent receivers are key to higher transmission rates, where information is imprinted in multiple dimensions of the optical waveform. Polarization is one of those dimensions, where different complex waveforms can be independently imprinted on two orthogonal SOPs of the signal at the transmitter. However, at the receiver side, the received SOP is completely unknown. It is mapped onto a different orthogonal basis than the one used at the transmitter and is varying with time and distance.

To blindly untangle the polarization of the incoming waveform in an optical coherent receiver, the CMA algorithm is often used [1]. However, CMA has several disadvantages. First, it suffers from the singularity problem where the algorithm converges to a tap-weight setup that produces the same transmitted signal at both equalizer outputs [2,3]. Moreover, the convergence speed of the CMA is strongly dependent on the SOP of the received waveform ,and consequently varies a lot with the choice of the initial FIR tap values [2,4,5,6]. This has become a major problem and is recently being addressed to comply with the fast transition requirements of emerging data rate adaptive optical packet networks [5,6], burst mode coherent receivers and agile optical network architectures where the receiver can be dynamically reconfigured to quickly drop a wavelength and switch to another channel [4].

Recently, new blind PR techniques have been proposed to speed up the convergence time required for blind polarization rotation estimation. A method based on CMA using 25 different initial CMA tap values, all processing in parallel, is presented in [4]. It chooses the test case that provides the lowest CMA error to initialize the central tap of the equalizer. Using such scheme on DP-4QAM format provides a mean convergence time of around 35 ns, with a worst case convergence time of 280 ns [4]. The computational requirement of processing independently and in parallel 25 self-adaptating equalizers is very substantial. In [6], a three-stage CMA enables blind recovery in 200 ns, or 11200 symbols at 56 Gbaud, which is hardly compatible with burst-mode operation [5]. Some proposed techniques to reduce the convergence time of CMA use training symbols or preample headers. As an example, in [5], a preample of 30 header symbols is used to estimate to polarization rotation matrix and in [7], a training symbol based algorithm using 20 or 40 symbols is proposed to estimate the Jones channel matrix. Such techniques, although providing quick rotation estimates, not only add overhead to the system but also need synchronization with the preample header for polarization matrix estimation, and hence are not true blind approaches.

In this paper, we present a new method for blindly estimating the SOP of the optical waveform for coherent receivers applicable to any polarization multiplexed square-MQAM format. The process consists of converting the received waveform into the three dimensional Stokes space and looping over an algorithm looking for desired Stokes states on the Poincare sphere until a unitary polarization derotation matrix can be estimated. This matrix is then used to initialize the center taps of the subsequent decision directed-LMS polarization equalizer. The method realizes much shorter convergence time then recent methods using modified version of CMA to improve convergence speed. This paper consists of three parts. In Section 2, we present the mapping of Jones constellations to Stokes constellations for different DP–MQAM formats. We also summarize a well known model for waveform polarization rotation and polarization mode dispersion (PMD) in lightwave systems [8]. We will use this model to motivate our approach. In Section 3 we introduce the new blind PR method and present in detail its operating algorithm. Section 4 explains the metric of comparison used to confront the convergence time and initial BER of our Stokes space blind method against that of CMA for 3 different modulation format: DP–4QAM, DP–16QAM and DP–64QAM. The experimental test bed is also presented in this section. Finally, Section 5 presents experimental performance comparison for CMA and the proposed method as well as numerical simulation results assessing the new method over varying PMD. Conclusion in found in Section 6.

2. DP-MQAM formats in Stokes space and modeling of SOP in lightwave systems

An optical coherent receiver collects simultaneously 4 electrical waveforms that represent the real and imaginary part of an optical waveform mapped into an orthogonal basis formed by the optical coherent front end. Those 4 waveforms of dimension 1 × N can be converted into a single waveform of dimension 2 × N which consists of a concatenation of N Jones vectors as in Eq. (1), where A_x/y_-Re/Im are the 4 sampled signals and N is the number of samples. The right hand side of Eq. (1) is a different representation of the combination of the 4 waveforms, where the magnitude, the absolute phase, the relative amplitude and sign on $\hat{x}$ and $\hat{y}$ and the relative phase of $\hat{x}$ with respect to $\hat{y}$ are all embedded in E_n, δ_n, θ_n and ϕ_n, respectively.

| A (n T) 〉 = [\begin{matrix} A_{x} \\ A_{y} \end{matrix}] = [\begin{matrix} A_{x - Re} (n T) + i A_{x - Im} (n T) \\ A_{y - Re} (n T) + i A_{y - Im} (n T) \end{matrix}] = E_{n} e^{δ_{n}} [\begin{matrix} \cos (\frac{θ_{n}}{2}) \\ \sin (\frac{θ_{n}}{2}) e^{i ϕ_{n}} \end{matrix}]

The 3-by-1 Stokes representation of Eq. (1) is written as

\vec{A} = [\begin{matrix} A_{1} \\ A_{2} \\ A_{3} \end{matrix}] = E_{n}^{2} [\begin{matrix} \cos (θ_{n}) \\ \sin (θ_{n}) \cos (ϕ_{n}) \\ \sin (θ_{n}) \sin (ϕ_{n}) \end{matrix}]

As can be seen in Eq. (2), the absolute phase information e^iδ is lost in the conversion from Jones to Stokes Space. The new blind Stokes space polarization estimation method we present in this paper will benefit from this absolute phase information loss. The reader can refer to [9] for details concerning conversions from Jones to Stokes space.

Different polarization multiplexed modulation formats can be represented in either the Jones or the Stokes space. Because it exhibits the extra phase information $e^{i δ}$ , Jones space is commonly use to represent constellations. Constellations in Stokes space are rarely shown and we depict here in Fig. 1 such constellations for a) DP–4QAM, b) DP–16QAM and c) DP–64 QAM, all of unitary mean signal power. As can be observed, much fewer states exists in this space; for instance, the 4² = 16 states of DP–4QAM map to only 4 states in Stokes space.

Fig. 1 Mapping of all possible Jones states to Stokes space for theoretical (a) DP-4QAM, (b) DP-16QAM and (c) DP-64QAM. For (d), theoretical DP-64QAM is rotated by a random unitary matrix. The total mean power is always unitary. The gray sphere has a radius of 1.

Download Full Size | PDF

The red rings in Fig. 1 depict the equivalent modulus used in the cost function of the blind CMA polarization derotation algorithm. For every square QAM formats, the modulus has to increase with QAM order [10]. For instance, DP–4QAM, –16QAM and –64QAM formats have moduli of 0.5, 0.66 and ~0.69, respectively. As those moduli apply to both polarizations independently, they are depicted as a single ring of radius double those values.

The Stokes space representations of any square DP–MQAM format share a common feature: the most outer Stokes constellation points are all exactly aligned with the Stokes vectors $� {\vec{A}}_{2}$ , ${\vec{A}}_{2}$ , $� {\vec{A}}_{3}$ , and ${\vec{A}}_{3}$ at increasingly higher powers of 3(√MQAM–1)²/(MQAM–1). Our novel blind Stokes space polarization recovery method uses this theoretical alignment and power increase feature of the most outer Stokes constellation points for estimating the received SOP. As will be detailed in the next section, locating the position of those 4 regions provides an estimate of the unknown unitary rotation imprinted in the received signal.

It is out of the scope of this paper to explain the generation of specific polarization rotation matrices, or “unitary” matrices. For detailed explanations, we refer the reader to the paper from Gordon and Kogelnik [9]. However, as our blind polarization recovery technique is strongly based on those rotations, the following provides a brief explanation on one of the many possible ways to rotate a single Stokes vector from one position to another on the Poincare sphere. The simplest way to move $\hat{A}$ to $\hat{B}$ is by applying on $\hat{A}$ a rotation matrix that spins the Poincare sphere on an axis given by the cross product of $\hat{A}$ and $\hat{B}$ , $\hat{A} \times \hat{B} = \hat{p}$ , and by an angle given by the inverse cosine of the dot product of $\hat{A}$ and $\hat{B}$ , arcos $(\hat{A} \cdot \hat{B}) = ϕ$ . The notation ‘ $\hat{A}$ ’ means the unitary normalized Stokes vector of ‘ $\vec{A}$ ’. There are actually an infinite number of possible unitary rotation matrices that rotate $\hat{A}$ to $\hat{B}$ .

2.1 Modeling of SOP in lightwave systems

As we know, the birefringence in an optical fiber breaks the degeneracy of the 2 eigenmodes that can propagate in a fiber at the same propagation constant. Birefringence gives 2 eigen modes each having its respective propagation constants β_j(ω). Up to the first order, one can describe those 2 modes in the Jones-matrix and Stokes-matrix forms [8,11,12]

\frac{\partial | \tilde{A} (z, ω) 〉}{\partial z} = - \frac{i}{2} (Δ β_{0} + ω Δ β_{1}) U^{�} σ_{1} U | \tilde{A} (z, ω) 〉 = - \frac{i}{2} (Δ β_{0} + ω Δ β_{1}) (\hat{b} \cdot \vec{σ}) | \tilde{A} (z, ω) 〉

\frac{\partial \tilde{\vec{A}} (z, ω)}{\partial z} = - (Δ β_{0} + ω Δ β_{1}) \hat{b} \times \tilde{\vec{A}} (z, ω)

where

| \tilde{A} (z, ω) 〉

is the Fourier transform of | A(z,t)⟩, U^†σ₁U≡

\hat{b} \cdot \vec{σ}

and

\vec{b}

is the 3-by-1 birefringence vector. The term

\vec{σ}

= [σ₁, σ₂, σ₃]^T are the Pauli spin matrices σ_k and are defined in [8,9]. The term Δβ₀ produces a differential phase shift, while the term ωΔβ₁ leads to a temporal delay between the two orthogonal eigenvectors of U, which directly leads to PMD. Said differently, Δβ₀ engenders simple rotations, or “spinning” of all the states, independent of frequency, whereas ωΔβ₁ leads to spinning of the components of the waveform as a function of their frequency, inherently resulting in differential delays. In the frequency domain, polarization dispersion manifests as a frequency dependent state of polarization at the output of the fiber [13]. One can note that the eigenvectors of U^†σ₁U are, in Stokes space, ±

\hat{b}

, such that for constant birefringence, a SOP aligned on ±

\hat{b}

wouldn’t change. However, the birefringence vector

\vec{b}

has varying orientations and amplitudes along the fiber length and has to be a function of z. For the zero^th order, the magnitude of

\vec{b}

is Δβ₀ and for the first order, its magnitude is Δβ₁. The solution of Eq. (3) can be written in the form

| \tilde{A} (z, ω) 〉 = W (z) | \tilde{S} (z, ω) 〉

where W(z) is a function of solely z. If we apply Eq. (5) to Eq. (3), we get the 2 equations

\frac{\partial W (z)}{\partial z} = - \frac{i}{2} Δ β_{0} (\hat{b} (z) \cdot \vec{σ}) W (z)

\frac{\partial | \tilde{S} (z, ω) 〉}{\partial z} = - \frac{i}{2} ω Δ β_{1} W^{- 1} (z) (\hat{b} (z) \cdot \vec{σ}) W (z) | \tilde{S} (z, ω) 〉

If

\vec{b}

was z-independent, the solution to W(z) would simply be

W (z) = \exp (- \frac{i}{2} Δ β_{0} z (\hat{b} \cdot \vec{σ}))

where “exp” is the exponential of the 2-by-2 matrix –i½Δβ₀z $(\hat{b} \cdot \vec{σ})$ . To solve for

| \tilde{S} (z, ω) 〉

, one can show that W(z) and

(\hat{b} \cdot \vec{σ})

commute, i.e., [W(z),

(\hat{b} \cdot \vec{σ})

] = 0. Therefore, the differential equation governing

| \tilde{S} (z, ω) 〉

becomes ∂/∂z $| \tilde{S} (z, ω) 〉$ = –i½ωΔβ_{1 $(\hat{b} \cdot \vec{σ})$ $| \tilde{S} (z, ω) 〉$}, which has the same form as Eq. (6) and consequently the same solution form as Eq. (8), with Δβ₀ changed to ωΔβ₁, adding frequency dependence.

| \tilde{S} (z, ω) 〉

solves as

| \tilde{S} (z, ω) 〉

= T(z,ω)

| \tilde{S} (0, ω) 〉

, with initial waveform

| \tilde{S} (0, ω) 〉

at z = 0. This easily shows mathematically that PMD is simply a polarization rotation that is frequency dependent [13]. If PMD was neglected (Δβ₁ = 0) and constant birefringence was assumed,

\vec{A} (z, t)

would simply precess around

\hat{b}

·as it propagates, at an angular speed proportional to Δβ₀. However, as

\vec{b}

is z-dependent, we can assume without any loss of generality that birefringence is piece-wise constant along the fiber, that is,

\hat{b}

(z_j ≤ z ≤ z_j₊₁) =

\hat{b}

_j and that for every z_j ≤ z ≤□z_j₊₁, both Δβ₀_j and Δβ₁_j vary. The solution of

| \tilde{A} (z, ω) 〉

, in the form of Eq. (5), is then [8]

| \tilde{A} (z, ω) 〉 = W (z_{N - 1}) T (z_{N - 1}, ω) \dots W (z_{1}) T (z_{1}, ω) W (z_{0}) T (z_{0}, ω) | \tilde{S} (0, ω) 〉 = U_{c} (z, ω) | {\tilde{S}}_{o} 〉

where the total fiber length is divided into N shorter pieces. None of the matrices in Eq. (9) commute, so the order of application is paramount. U_c(z,ω) is the composite Jones matrix of the whole fiber link [11] and is also a unitary matrix at each frequency. In SMF fibers, Δβ₁ is normally small . If we neglect PMD, all the T(z_j,ω) matrices become the identity I,

| \tilde{A} (z, ω) 〉

becomes ω-independent and the result of the propagation of | A(z,t)⟩ is simply that its SOP wanders around on the Poincare sphere as it propagates, adding no impairment on the initial waveform

| {\tilde{S}}_{o} 〉

. One can observe that for a static link without PMD, the SOP is constant at a fix distance and only very slowly varying for small first order birefringence.

The polarization recovery technique that we propose in this paper allows us to estimate blindly the cumulated polarization rotation U_c(z,ω) with the hypothesis that the cumulative PMD after distance z is small. Said differently, it estimates the resulting matrix of Eq. (9) with the T’s assumed to be almost the identity I.

3. New blind polarization rotation estimation

The technique we present in this paper allows estimating blindly the polarization rotation of the received signal after the optical front end of an optical coherent receiver. This technique can be used for any square DP–MQAM modulation format. In this section, we will explain this new method of blind polarization rotation recovery.

The very first step is to convert the 4 waveforms sampled by the real-time oscilloscope into a concatenation of Jones vectors as explained in Eq. (1). Then, the data is resampled at T/2, where T is the symbol duration. The third step is to apply the inverse chromatic dispersion (CD) [14]. At this point we make the assumption that we have a 2-by-time matrix of Jones vectors that are simply rotated and noisy versions of the Jones vectors that were transmitted, that is, rotated and noisy versions of Figs. 1(a), 1(b) or 1(c) for DP–4QAM, –16QAM or –64QAM modulation formats, respectively. Mathematically, we assume that we have a series of | A_Rx⟩ = U_ROT| A_Tx⟩ + | n⟩, or ${\vec{A}}_{R x}$ = R_{ROT ${\vec{A}}_{T x}$} + $\vec{n}$ in Stokes space, where $\vec{n}$ (| n⟩) is a (complex) noise source and U_ROT (R_ROT) is an unknown unitary rotation matrix in Jones (Stokes) space. Here, by | A_Rx⟩ we mean the waveform after the three first steps. Figure 1(d) shows a randomly rotated constellation of a DP–64QAM signal. As mentioned in the previous section, U_ROT is almost constant at a fix distance for fiber with small PMD coefficient. Our goal is to estimate U_ROT with the sole knowledge of the time series of | A_Rx⟩. Our technique resides in estimating the location of 3 out of the 4 outer-most states in the Stokes constellation, as depicted in Fig. 1, and to realign those estimated corners to their desired location, i.e. along the ± ${\vec{A}}_{2}$ and ± ${\vec{A}}_{3}$ axes.

This blind polarization rotation estimation operates differently than other blind PR techniques like CMA. The mode of operation of our technique is different and independent of the steady-state mode of operation for polarization tracking. On the contrary, a technique like CMA can be used from system startup and be kept for steady state operation for some modulation formats like DP–4QAM [1]. The CMA algorithm iteratively adapts its coefficients blindly such that they minimize a phase independent cost function [10]. The CMA self-recovering equalizer is more often used as pre-convergence to reduce the effects of channel distortions in polarization before switching to the decision-directed Least-Mean Square (DD-LMS) equalizer which requires an acceptable average of good decisions by the slicer [10,15].

In the technique presented in this paper we also share the same end goal of steady state operation in DD-LMS, but the blind technique implemented to get there is different. The proposed SOP estimation process in Stokes space is not an iteratively adapting process like CMA. The process waits until it obtains enough satisfying SOP states, at which point the rotation matrix U_ROT is estimated. Only then adaptive processing of new incoming data starts, adapting in decision directed mode.

Detailed steps for obtaining U_ROT summarize as follows:

1. Generate Jones vectors as in Eq. (1), resample at T/2, apply matched filter/inverse CD
2. Keep only vectors that have power greater than a power threshold P_th, convert those to Stokes vectors, save their power, normalize them to unity and store them in a buffer, forming a 3-by-N matrix that we call S_3,_N, where N grows by 1 for each new valid entry.
3. For each new entry to the buffer matrix S_3,_N, compute a new projection matrix U_proj by computing U_proj = S_3,_N₊₁^TS_3,_N₊₁, of size N + 1-by-N + 1.
4. For each new U_proj, verify if there is a row that satisfies the three following criteria:
- 4.1. Contains at least C elements greater than cos(γ₁)
- 4.2. Contains at least C elements smaller than cos(π + γ₁)
- 4.3. Contains at least C elements smaller than |cos(π/2 + γ₂/2)|
  The parameters C, γ₁ and γ₂ are to be chosen such that C≥1, 2γ₁ + γ₂≤ π and γ₂≥2γ₁. Operations 2. 3. and 4. are repeated until all criteria in 4. are satisfied. 4.1) tries to find C elements that are all close enough within a maximum angular distance of 2γ₁. 4.2) looks for C elements that are also regrouped angularly within of 2γ₁ but opposite to those in 4.1). Finally, 4.3) is looking for C vectors that are perpendicular within π/2 ± γ₂/2 to those in 4.1) and 4.2) and are likewise gathered within 2γ₁. These operations in 4. are visually presented in Fig. 2 . For our results, we chose C = 4, γ₁ = 20° and γ₂ = 60°.
  
  Fig. 2 Explanation of algorithm in operation 4.: The n^th column of (S)_3-_N is the green dot. By projecting vectors onto each other, we need at least C-1 other vectors close by, C vectors perpendicular and C opposite to estimate U_ROT.
  Download Full Size | PDF
5. When all the conditions in 4. are satisfied, the Stokes vectors saved in S_3,_N are used to estimate U_ROT. The subset of vectors in S_3,_N that made stop the looping algorithm and satisfied 4.1 to 4.3 represent 3 out of the 4 outer-most Stokes constellations corners, introduced in Fig. 1. U_ROT is then estimated in the following way:
- 5.1. Because all vectors stored in S_3,_N have high power, a plane can be fit in a least mean square sense to those three dimensional data points. Figure 3 shows how a plane can be fit to a power discretized noisy stokes constellation. The norm ${\hat{p}}_{h o l e}$ of this plane is estimated as being the eigenvector of the smallest eigenvalue of the covariance matrix C_COV of S_3,_N, where each vector in S_3,_N has to be previously rescaled to its respective power, stored in the power vector in Operation 2. This is also known as the Principal Component Analysis (PCA), or more precisely as the eigenvalue decomposition of the data covariance matrix [16]. From PCA, we know that the i^th eigenvalue represents the variance of the projection of all vectors stacked in S_3,_N onto the i^th eigenvector. Hence our choice for ${\hat{p}}_{h o l e}$ . The first rotation matrix in U_ROT can be generated by the knowledge of ${\hat{p}}_{h o l e}$ and the smallest angle between ${\hat{p}}_{h o l e}$ and ± ${\vec{A}}_{1}$ ; this matrix would rotate ${\hat{p}}_{h o l e}$ back to ${\vec{A}}_{1}$ or $� {\vec{A}}_{1}$ .
  
  Fig. 3 Power filtering representation in Stokes space for several QAM order. Noise loaded such that BER = 3.8 × 10⁻³ and P_th set such that on average, for every 100 Stokes vectors, 1 has powers exceeding P_th. (a) DP-4QAM with SNR of 8.53 dB, P_th = 2.19 (b) DP-16QAM with SNR of 15.2 dB, P_th = 2.01, (c) DP-64QAM with SNR of 21.1 dB, P_th = 2.01.
  Download Full Size | PDF
- 5.2. Once this matrix is generated, we apply it only to the Stokes vectors in S_3,_N that satisfied the three conditions in Operation 4. The resulting Stokes vectors are therefore rotated back to around the { ${\vec{A}}_{2}$ , ${\vec{A}}_{3}$ } plane. From there, a second matrix is estimated to append to the current U_ROT. This matrix has the purpose of bringing the three clusters of data points around ± ${\vec{A}}_{2}$ and ± ${\vec{A}}_{3}$ . Its matrix generator has the spinning axis ${\hat{A}}_{1}$ and an angle given by the mean angular deviation of the 3 clusters with respect to angles nπ/2, n = {0,1,2,3}. To remove the periodicity of nπ/2 in the angle ϕ of Eq. (2) in the 3 clusters, we use the 4th power algorithm [17,18].
- 5.3. The last estimate is about finding the absolute phase e^iδ as in Eq. (1) allowing estimating the initial phase before beginning processing in DD-LMS mode. For this part, from the Stokes vectors used in 5.2 we keep only the ones that fit within the last 10 ns before Operation 4. was successfully completed. As laser linewidths are typically less than 2.5 MHz in coherent fiber optic systems [4], we assume that the received signal’s phase noise can be considered constant within 10 ns. The initial Jones versions of those Stokes vectors have to be used. As the phase e^iδ is present in both the $\hat{x}$ and $\hat{y}$ part of every Jones state, the number of phase values δ_k used to estimate δ is double the number of Stokes vectors used in 5.2.
  Prior to estimating e^iδ, we need to suppress the intermediate frequency offset. To do so, we use all the Jones vectors created in Operation 1., even the power discarded ones, from system startup until the end of Operation 4. From this time series, we can estimate the frequency offset by raising each Jones vector to the fourth power and identify the peak frequency using a Fast Fourier transform [4]. Obtaining an estimate of e^iδ helps the DD equalizer to start in making good decisions.

The polarization derotation estimation matrix U_ROT is mathematically represented as

[\begin{matrix} X_{R x, D e r o t} \\ Y_{R x, D e r o t} \end{matrix}] = \overset{T o t a l e s t i m a t e d d e r o t a t i o n m a t r i x, U_{R O T}^{�}}{\overset{︷}{\underset{P h a s e e s t .}{\underset{︸}{e^{i δ}}} \underset{2^{n d} M a t r i x e s t i m a t e}{\underset{︸}{[\begin{matrix} e^{i ½ φ} & 0 \\ 0 & e^{- i ½ φ} \end{matrix}]}} \underset{1^{s t} M a t r i x e s t i m a t e}{\underset{︸}{[\begin{matrix} \cos (θ) & e^{- i χ} \sin (θ) \\ - e^{i χ} \sin (θ) & \cos (θ) \end{matrix}]}}}} [\begin{matrix} X_{R x, R o t} \\ Y_{R x, R o t} \end{matrix}]

Here the order of application of the 2 matrices is of paramount importance. The first matrix applied is the one that rotates ${\hat{p}}_{h o l e}$ to either ± ${\vec{A}}_{1}$ , the second matrix rotates the three clusters of data points to around ± ${\vec{A}}_{2}$ and ± ${\vec{A}}_{3}$ and the last term e^iδ estimates the absolute phase.

The loop in Operation 4 stops and U_ROT is estimated as soon as three circles in Fig. 2 that are opposite and perpendicular possess C Stokes vectors. Computing U_proj for every new Stokes vector buffered in S_3,N is not a computationally expensive task: U_proj_-N is simply S_3,_N^TS_3,_N. One can compute U_proj_-N from U_proj_-₍_N_–1) by only computing the projection of all the previous vectors stored in S_3,_N_–1 with the new added vector $\vec{S}$ , and append S_3,_N^{T $\vec{S}$} to the previous U_proj_-₍_N_–1). Because of pre-normalization, U_proj is a real symmetric matrix with ones on the diagonal and values between 1 and –1 elsewhere. The number of real multiplication and additions required to compute U_proj_-N of size N-by-N is 3N(N–1)/2 and N(N–1), respectively.

In Operation 5.1., a covariance matrix C_COV is computed and the eigenvector of the smallest eigenvalue is extracted. The C_COV is expressed as $C_{cov} (i, j) = \sum_{n = 1}^{N} (S_{i, n} - {\bar{S}}_{i}) (S_{j, n} - {\bar{S}}_{j})$ , where ‘ $\bar{k}$ ’ is the mean in dimension k. We can show that the total computational complexity of computing C_COV and the smallest eigenvalue with its eigenvector is small. Normal eigenvalue decomposition are numerically done iteratively which is a computationally intensive process. However, C_COV has the advantageous property of being a real symmetric 3-by-3 matrix. The eigenvalues of such matrices can easily be directly calculated, all independently [19]. Moreover, computing the eigenvector once an eigenvalue is known is quite trivial and goes as follows. If ${\vec{v}}_{1}$ is the desired column eigenvector of the eigenvalue λ₁ of matrix C_COV, we can compute B ${\vec{v}}_{1}$ = (C_COV–λ₁I₃) ${\vec{v}}_{1}$ = [ ${\vec{b}}_{i}^{T} {\vec{v}}_{1}$ , ${\vec{b}}_{j}^{T} {\vec{v}}_{1}$ , ${\vec{b}}_{k}^{T} {\vec{v}}_{1}$ ]^T=0. From the equation we know that the eigenvector ${\vec{v}}_{1}$ is perpendicular to ${\vec{b}}_{i}$ , ${\vec{b}}_{j}$ and ${\vec{b}}_{k}$ . As the determinant of B is zero, all $\vec{b}$ ’s lie in the same plane and consequently ${\vec{v}}_{1}$ is simply the cross product of any two $\vec{b}$ ’s, which can subsequently be normalized to unit power such that ${\vec{v}}_{1}^{T} {\vec{v}}_{1}$ = 1.

The 4 most power hungry processes required in finding U_rot are 1) computing the Stokes vectors that satisfy the power criterion P_th, 2) projecting each new Stokes vector onto all previously stored vector, 3) computing C_COV and 4) computing λ₁ and ${\vec{v}}_{1}$ . Without detailed steps, we can show that the entire computational complexity of all those processes require only (3N² + 30N/2 + 46) real multiplications and (N² + 16N + 16) additions, where N is the number of Stokes vectors stacked in S_3,_N. As N is very well below 100 and only 1 symbol every 6 on average are converted to Stokes, those numbers confirm the low complexity.

We model the data obtained after Operation 1. as being ${\vec{A}}_{R x}$ = R_{ROT ${\vec{A}}_{T x}$} + $\vec{n}$ and we keep only ${\vec{A}}_{R x}$ ’s that have power greater than P_th. Figure 3 shows the Stokes states that passed thepower discriminator as green dots and the others as blue dots for a) DP–4QAM, b) DP–16QAM and c) DP–64QAM when the noise level gives a BER of 3.8 × 10⁻³. We can clearly observe that the direction that has the smallest variance would be pointing towards the region where less green dots are found, hence the name ${\hat{p}}_{h o l e}$ . The P_th radius is the red sphere.

For a total mean signal power of 1, one can find a radius P_th that leads to a specific power discrimination ratio, i.e. the total number of Stokes points over the number of Stokes points outside the red sphere. Figure 3 depicts the results for a ratio of 100. Signals with higher Signal to Noise Ratio (SNR) need a smaller P_th to maintain a power discrimination ratio, and vice versa. As propagation and consequently amplification adds noise to the signal, P_th has to be slightly increased with distance in order to maintain a ratio and consequently a good estimate of ${\hat{p}}_{h o l e}$ . Observing the green “caps” on top of Figs. (a), (b) and (c), we understand that signals with higher SNRs exhibit 4 clusters that are more regrouped, providing a better estimate of ${\hat{p}}_{h o l e}$ . As higher order QAM inherently require higher SNR for the same BER, the blind PR technique presented in this paper works very well for higher order QAM modulation formats, which is not true for common blind adaptive PR techniques like CMA, because their inherent cost function is not designed for high order QAM [20].

A major benefit of this blind Stokes space polarization recovery compared to the iteratively adapting CMA algorithm is the avoidance of the singularity problem of CMA [21], where 2 severely rotated inputs can converge to the same output. As we force the SOP to rotate back to the { ${\vec{A}}_{2}$ , ${\vec{A}}_{3}$ } plane, the $\hat{x}$ and $\hat{y}$ outputs presents independent information.

When the derotation matrix U_ROT is estimated, its 4 values as used to initialize the center tap of the Decision Directed-Least Mean Square with built-in Phase Lock Loop (DD-LMS + PLL) equalizer that starts right after U_ROT’s estimation.

As mentioned in Eq. (9), the proposed algorithm compensates for the “zero^th” order PMD, or said differently for the global polarization rotations occurring on the signal during propagation. The assumption that the cumulative first order PMD in U_c(z,ω) at the receiver is small depends on the fiber type and more precisely on its PMD_Q value. Newer fiber types show very little PMD, with maximum PMD_Q of 0.04ps/√km. Even after 6400 km, they exhibit a very small mean DGD of around 3.2 ps. Even at a high baud rate of 28 GBaud, this value represents less than 10% of the symbol duration. Much greater instantaneous DGD can be tolerated for digital optical coherent receivers. As the fiber used on the experimental setup has a maximum PMD_Q of 0.04ps/√km, experimental results do not assess the proposed polarization recovery technique over large PMD with respect to symbol duration. Consequently, we will present numerical simulations for which the algorithm is tested over varying DGD and sampling offsets in subsection 5.1, after experimental results, and show the accuracy and speed of the algorithm over much larger DGD.

4. Comparison metric and experimental test bed

We compare our novel blind polarization recovery algorithm in Stokes space only against the CMA [10] scheme. We reason this not only because CMA is well known and widely used, but also because most other blind algorithms do not converge as well. For example, the radius-directed algorithm [15] relies on decisions based on the amplitude, and it has been found that this algorithm does not converge well in practice [22]. A recently proposed cascaded three-modulus blind equalization algorithm has shown worse convergence properties than the CMA [23] and can only be used after a channel estimate has been obtained. The same holds true for decision-directed algorithms operating on non-converged data, which depend on most decisions being correct. In [20], the convergence rate of both CMA and the Independent Component Analysis (ICA) is improved compared to the standard CMA by applying computationally expensive algorithms to the data. One method runs three independent computations of blind polarization recovery using different matrices initializations, similar to [4]. The other method performs the gradient descent based on all the observed symbols from system startup up to cumulative time k, instead of on single symbol for standard CMA. ICA converges faster but is always more costly than CMA, and even more so for higher order QAM like 16QAM [24]. We will consequently compare our technique against the easily implementable CMA algorithm with the objective of switching to DD-LMS + PLL as quickly as possible for steady-state operation. We claim that the time saved to get to convergence translates into saves in computational expenses, as an extra computational effort has to be spent during the additional time required for blind adaptive techniques to converge. During this time, 2 complex cost function errors have to be computed for every excessive required symbol, where the error of CMA (y_nR– $y_{n}^{*}$ y_ny_n) is more expensive than that of LMS ( ${\hat{y}}_{n}$ –y_n).

Figure 4 depicts in Stokes space the convergence location of three blind equalization algorithm schemes for DP–16QAM signals, namely (a) the CMA, (b) the Radius Directed Equalizer (RDE) [15] and (c) the Multimodulus Algotithm (MMA) [25]. Although considered blind, the MMA algorithm is phase dependent as its cost function minimizes the distancebetween a square instead of a circle(s) as in CMA (RDE). By comparing the figures in Fig. 4 against the constellations in Stokes space of Fig. 1(b) for DP–16QAM, we understand that the “thickness” in the ± ${\hat{A}}_{1}$ dimension as well as the presence of multiple valid rings driven by the cost function have an impact on the speed of convergence of the blind methods for varying initial signal SOP. Even if all those convergence rings align in planes of norm ${\hat{A}}_{1}$ , their multiple presence in several planes, e.g. A₁ = 0 (3 rings), A₁ = ± 0.4 (2 rings) and A₁ = ± 0.8 (1 ring), slows the speed at which a rotated SOP is blindly forced back to the desired A₁ = 0 plane. On the contrary, the single ring of CMA forces the SOP to fall more quickly to the A₁ = 0 plane, while leaving it freely rotating within that plane. A similar conclusion due to the thickness of the MMA convergence region in Fig. 4(c) explains its slower blind adaptation speed compared to CMA. Hence, as highlighted in [22], we observed a reduced robustness in filter pre-convergence using the multi-ring CMA and therefore opted for the classic CMA.

Fig. 4 Representation on the Poincare sphere of 3 different polarization tracking algorithm: (a) CMA for DP-16QAM (R = √0.66), (b) RDE for DP-16QAM with radiu.s to nearest symbols of √0.1, √0.5 and √0.9 (c) MMA with R = √0.41.

Download Full Size | PDF

4.1 Comparison metric

We will use two metrics to assess our method. Firstly, we will compare the convergence time of our technique against that of the CMA having a built-in PLL. We define the convergence time as the number of symbols required by a blind algorithm to switch from blind polarization recovery to DD-LMS + PLL. For our technique, the convergence time is the total number of received symbols before U_ROT can be estimated and used to initialize the taps of the subsequent DD-LMS + PLL. For the blind CMA adaptive technique, the convergence time is the number of symbols required before the average number of bits in error is less than 5% on each polarization, evaluated over a total number of 1000 received bits (500 per polarization). We decided on a BER threshold of 5% as we experimentally observe consistent post-convergence of DD-LMS for any QAM order using this value. Yet this BER is a high bound as it is more than double the current best pre-FEC BER of 2.4% for soft-decision FEC [26]. As an example, for DP–4QAM, –16QAM and –64QAM formats, the observation window, in symbols, onto which the BER is calculated is 250, 125, and 84, respectively. These short window lengths allow switching to LMS as soon as feasible, preventing to stay in CMA mode for a longer, undesired period. When the BER reaches such a value, the 2-by-2 multiple-input-multiple-output (MIMO) filter switches its method of adaptation to DD-LMS. For smooth transition from CMA to DD-LMS, the same PLL is used inside both polarization recovery schemes. To decide if we update the phase of the PLL or not, we use a criterion based on the signal power threshold on each polarization, as presented in [27].

To show in a different way the quickness of our new method, we will also compare the mean number of bits in errors using only the first 20 × 10³ symbols. Again, we compare our Stokes space blind polarization recovery technique against that of CMA. This will show that our matrix estimation in Eq. (10), including the important initial phase estimate e^iδ, yields an initial BER that is much smaller than what obtained using the common CMA algorithm.

4.2 Experimental test bed

We compare performance of our new method against CMA for 3 different modulation formats: DP–4QAM at 28 Gbaud, DP–16QAM at 28 Gbaud and DP–64QAM at 7 Gbaud. This will allow proving experimentally that the technique works for any square MQAM order. The experimental test bed is depicted in Fig. 5 . We use an Arbitrary Waveform Generator (AWG) from MICRAM^© to generate the multi-level signals applied to a Dual-Parallel Mach-Zehnder (DPMZ) modulator that modulates CW light out of an external cavity laser (ECL) that has a linewidth smaller than 100 kHz. Dual-Polarization is emulated using Polarization Beam Splitters/Combiners and an optical delay line. The single carrier optical signal is thenboosted to the desired launch power; we launch at –2.2 dBm for these experiments. The signal then propagates inside an optical recirculating loop which contains 4 spans of 80 km of Corning SMF-28e + LL^© and 4 in-line EDFAs. No optical dispersion compensation modules are present. On the receiver side, the signal gets filtered, amplified and re-filter before hitting the optical coherent receiver front end. The Local Oscillator (LO) laser is the same type as the transmitter. The Dual-Polarization 90° Optical Hybrid is from Kylia^©, the balanced photodiodes are BPDV2020R’s from U²T^© and the real-time oscilloscope is Agilent^©’s DSO-X 93204A sampling at 80 GSa/s with 3dB bandwidth of 33 GHz. The recirculating loop allows propagation by steps of 320 km.

Fig. 5 Experimental Test bed.

Download Full Size | PDF

5. Results

We present in this section the convergence speed and the initial BER for DP–4QAM in Fig. 6 , DP–16QAM in Fig. 7 and DP–64QAM in Fig. 8 . We compare the BER of the first 20 × 10³ symbols when a blind CMA + PLL filter starts operating on the received data, after resampling at T/2 and CD removal, against our Stokes Blind Recovery method. As our technique has to discard or buffer incoming data before estimating the polarization derotation matrix whereas CMA outputs decisions as soon as it starts processing, we consider the discarded and buffered Jones symbols when computing the BER of the Stokes space method, for a fair comparison of an equal number of samples starting at the same time. Those initial data are polarization entangled, off-phase and freely moving in both polarization and phase, therefore are very baddata for the slicer. For rapid blind convergence of the CMA algorithm, we used a relatively large adaptation coefficient of μ = 10⁻³ for all QAM orders. Even when considering those discarded/buffered symbols, the initial BER of the Stokes Space Blind Recovery method gives better performance than CMA. For DP–4QAM format, some traces had their initial SOP at the receiver fortuitously aligned to that of the PBS axis of the coherent receiver. For example, one capture after 1600 km shows that even if the minimum of 250 symbols where needed in CMA + PLL mode before switching to LMS + PLL (from the need to have at least 1000 bits collected to evaluate the BER), the initial BER of the first 20K symbols is still less than that of the Stokes Space Recovery method that required only 38 symbols to switch to LMS + PLL. The reason is that the first 38 symbols had both their phase and polarization tracked in CMA + PLL whereas they are left wandering in the Stokes Space Recovery algorithm. It is important to note that if we would assume that buffered incoming symbols could be post-processed by the polarization/phase derotation matrix of Eq. (10) after its estimation, before feeding them to the slicer, the initial BER of the Stokes Space algorithm would be even smaller, and those rare cases where CMA performed better than Stokes for DP–4QAM would most probably be reversed.

Fig. 6 (a) BER vs Distance of first 20K symbols for DP-4QAM format at 28 Gbaud comparing CMA and Stokes space Recovery. (b) Number of symbols required before switching filter from blind to DD-LMS + PLL.

Download Full Size | PDF

Fig. 7 (a) BER vs Distance of first 20K symbols for DP-16QAM format at 28 Gbaud comparing CMA and Stokes Space Blind Recovery. (b) Number of symbols required before switching from blind to DD-LMS + PLL.

Download Full Size | PDF

Fig. 8 (a) BER vs Distance of first 20K symbols for DP-64QAM format at 7 Gbaud comparing CMA and Stokes Space Blind Recovery. (b) Number of symbols required before switching from blind to DD-LMS + PLL.

Download Full Size | PDF

The sawtooth look of the required number of symbols for CMA in Fig. 6(b) for largely varying noise levels exhibits the strong dependence of the speed of convergence of CMA with respect to the received SOP and no dependence on the SNR. Strongly varying convergence speeds are observed at both large and little noise levels. The Stokes space method, however, shows no dependence on the input SOP, and only a slight dependence on the noise level depicted by the slope. This slope is caused by the voluntary increase of the power discriminator P_th with distance in order to keep a power discrimination ratio and consequently a good estimate of ${\hat{p}}_{h o l e}$ . For DP–4QAM, we let P_th increase as P_th = 1 + LoopNo/40, where LoopNo is the loop count. This means that at 6400 km where we reach BER = 3.8 × 10⁻³, P_th = 1.5 which is well below P_th = 2.18 for a ratio of 100 as in Fig. 3(a). In fact, P_th = 1.5 represents a theoretical power discrimination ratio of ~5.8 for DP–4QAM at BER = 3.8 × 10⁻³. As roughly 200 symbols were needed for U_ROT’s estimation after 6400 km, an average of 35 symbols were stacked in S_3-_N and used for the 1st matrix estimate (Operation 5.1), and only 3C = 12 or a little bit more were kept for the 2nd matrix estimate (Operation 5.2) and the phase estimate (Operation 5.3) of U_ROT. The BER of the first 20x10³ symbols shows the accuracy of our Stokes Space estimate in comparison to the steady state DD-LMS + PLL curves which represents the lowest bound for the initial BER of any method taken to get to steady state.

As demonstrated in the figures, the average initial BER of the Stokes Recovery method can be smaller by an order of magnitude or more compared to that of CMA. This shows the accuracy of our estimation of U_ROT: ${\hat{p}}_{h o l e}$ , the angle ϕ/2 in the 2nd matrices in Eq. (10) and the absolute phase e^iδ are well estimated. As both the PLL and the MIMO filters start with good initial conditions, locking of those processes happens more quickly. For CMA, if the initial polarization entangling is very severe ( ${\hat{p}}_{h o l e}$ pointing around ± ${\hat{A}}_{3}$ or ± ${\hat{A}}_{2}$ ), the algorithm will take more time to adaptively blindly untangle the information on both polarizations [4].

The independent PLLs working on $\hat{x}$ and $\hat{y}$ output polarizations operate on providing an absolute phase to the Jones vectors on a per symbol basis and help locking the SOP on the { ${\vec{A}}_{2}$ , ${\vec{A}}_{3}$ } plane to the ${\hat{A}}_{2}$ and ${\hat{A}}_{3}$ axis, preventing spinning within the plane.

For any square QAM of 4, 16 or 64-ary, at any distance (any noise level), our experimental results show that the number of required symbols needed to estimate U_ROT before switching to DD-LMS never exceeds 830. For DP–16QAM, at a BER of 2.2% after 2240 km, a total of only 829 symbols impinging on the receiver were needed to estimate U_ROT and start operating in DD-LMS. At 28 Gbaud, this is a very short 29.3 ns required for blindpolarization convergence, which is known to be the process taking the most time to converge [4]. For the DP–64QAM format at 7 Gbaud after 640 km at a BER of 1.4%, only 554 symbols were needed, converting into only 79.1 ns. Finally, for DP–4QAM, after 6400 km giving a BER of 0.38%, only 165 symbols were required before successfully switch to DD-LMS + PLL. This converts to a mere 5.9 ns. We can compare those results with recently reported convergence time of blind CMA algorithm applied on DP–4QAM signals, where modification were applied to the CMA in order to decrease its convergence time. In [4] they report a mean convergence time of 40 ns, with a best case of 20 ns and a worst case of 280 ns. In another work [6], the CMA algorithm achieves blind recovery in 200 ns, or 11200 symbols. By comparing our convergence time results, we demonstrate the rapidity of convergence of our novel blind Stokes space polarization recovery technique.

For the DP–16QAM and –64QAM, the initial BER within the first 20 × 10³ samples is always better; around 3 times lower for DP–64QAM and even more for DP–16QAM. In Figs. 6 to 8, (a), we can appreciate how close the curves of the Stokes space method are to the lower bound of steady-state operation, proving the accuracy of the method’s estimation of U_ROT.

5.1 Testing over PMD

We present in this subsection numerical simulations where the proposed polarization derotation method is assessed over varying instantaneous DGDs, assumed constant for the short period required by the algorithm to find the derotation matrix U_ROT. The proposed SOP recovery method is also assessed over different DGD when the sampling instances of the Analog to Digital Converters (ADCs) don’t align with the center of the symbols’ location, for more realistic received waveforms. We call this sampling offset with respect to the symbols’ location the Common Group Delay (CGD). This can also be true for two-fold oversampling as is the case for processing the received waveforms in our simulations. Mathematically, the numerical model is described as follows

[\begin{matrix} \tilde{A} {(ω)}_{x, o u t} \\ \tilde{A} {(ω)}_{y, o u t} \end{matrix}] = \overset{R e c e i v e r}{\overset{︷}{\underset{R x F i l t e r}{\underset{︸}{H_{T x - I S I - c o m p} (ω)}} \underset{C G D}{\underset{︸}{e^{i ω κ}}}}} \overset{L i n k}{\overset{︷}{\underset{R o t a t i o n}{\underset{︸}{U_{r a n d, 2}}} \underset{D G D}{\underset{︸}{U_{r a n d, 1}^{�} [\begin{matrix} e^{½ i ω τ} & 0 \\ 0 & e^{- ½ i ω τ} \end{matrix}] U_{r a n d, 1}}}}} [\begin{matrix} \tilde{A} {(ω)}_{x, i n} \\ \tilde{A} {(ω)}_{y, i n} \end{matrix}]

where τ is the DGD and κ is the CGD. U_rand_,1 and U_rand_,2 are random rotation matrices as defined in [9]. The two orthogonal eigenvectors of U_rand_,1 form the basis that will observe a positive and negative DGD of ½τ, and U_rand_,2 randomly rotates the entire resulting waveform. The waveforms

\tilde{A} {(ω)}_{x, i n}

and

\tilde{A} {(ω)}_{y, i n}

are respective Fourier transforms of A(t)_x,in and A(t)_y,in. Each A(t)_x/y,in is the resulting output of two Pulse Pattern Generators added in quadrature and operating at 28 Gbaud, having a Gaussian impulse response and an output risetime from 20% to 80% of 15 ps. The quadrature addition generates the QPSK signal, on both polarization, hence DP–4QAM format. Due to the risetime, some ISI is present at the transmitter. The required receiver filter to remove transmitter-induced ISI is computed and applied for all captures as the matched filter to apply on the receiver side. Each A(t)_x/y,in are initially oversampled at T/8 in order to accurately generate A(t)_x/y,out. Complex white Gaussian noise is added to A(t)_x/y,in such that after the transmitter ISI annulling filter H_Tx-ISI-comp(ω), the bit error rate is at the threshold of 3.8 × 10⁻³. Once A(t)_x,out and A(t)_y,out are generated at T/8, perfect ADCs picking 1 sample every 4 mimic two-fold oversampling at T/2. Next, the blind PR algorithm presented in Section 3 starts operating until a unitary rotation matrix is estimated and used to initialize the central tap of a subsequent 25-tap DD–LMS filter.

In the first case, CGD is zero and DGD is varied from 0 to 0.9T, where T is the symbol duration. This translates to up to 32 ps of DGD at 28 Gbaud; 10 × the mean DGD after 6400 km for fibers of PMD_Q = 0.04ps/√km or 2 × for fibers of PMD_Q = 0.2ps/√km like the G.652D fiber. In the second study we set the sampling offset to a large value; one third of the symbol duration. For this case, we vary the DGD from 0 to 0.3T, up to 10.7 ps at 28 GBaud.

Using the Monte Carlo method, for every different combinations of DGD and CGD, 400 random DP–4QAM waveforms with random noise sources and random U_rand_,1 and U_rand_,2 were generated to assess the proposed polarization derotation scheme. Then, the proposed blind polarization derotation method starts operating on |A(t)_out⟩, estimates U_rot and uses it as the center tap of the subsequent adaptive DD–LMS filter. The latter then processes data and slowly adapts until removal of all ISI coming from DGD, CGD and remaining polarization cross-talk due to the slightly imperfect evaluation of the U_rot. When all ISI is removed, only white Gaussian noise is present and steady state is reached. The unitary matrix U_rot that our proposed algorithm finds is an estimate of $U_{r a n d, 2}^{�}$ . The results are presented in the following Fig. 9 .

Fig. 9 System performance for varying DGD when the Common Group Delay is (a) 0, and (b) 1/3 T. Insets show BER zoom-in for symbols 0 to 10⁴.

Download Full Size | PDF

We can see the direct impact of DGD on system performance: higher DGD maps as higher bit error rate. For the curves in Fig. 9, the BER was filtered by a first order IIR low-pass filter with parameter α = 0.999, mimicking a BER computed over a window length of 1/(1–α) = 1000 symbols. The initial BER estimate, at index n = 0, is the BER if U_rot was found to be exactly $U_{r a n d, 2}^{�}$ and if no adaptive DD–LMS filter was applied to A(t)_x/y,out. We can see in the inset of Fig. 9 (a) that starting at DGD of 0.7T, the estimated U_rot starts to moderately differs from the optimum matrix $U_{r a n d, 2}^{�}$ , resulting in an initial BER that is slightly worse than if $U_{r a n d, 2}^{�}$ was applied. This is depicted by the subtle increase of the windowed BER. The accuracy of our method is confirmed by this tenuous different in initial performance from U_rot to $U_{r a n d, 2}^{�}$ and by convergence attained via a decision directed scheme. Systematic successful U_rot estimation and post-convergence was observed for DGD as high as 0.9T, but higher DGDs failed to converge. The results when CGD = 1/3T are shown in Fig. 9(b). For this case, the algorithm systematically post-converged to steady state for DGDs up to 0.3T.

The maximum DGD thresholds of 0.9T for CGD = 0 and 0.3T for CGD = 1/3T can readily be explained. The absolute sum of DGD and CGD applied to one axis is limited to be smaller than 0.5T, half the symbol’s duration. As an example, for the first case, a DGD = 0.9T gives a maximum delay on one axis of 0.45T. However when CGD = 1/3T, the maximum allowed DGD is 0.3T giving a maximum delay on one axis of 1/3T + 0.3T/2 = 0.483T. As one would imagine, any non data-aided polarization untangling process would require a total delay on any axis with respect to the central position of symbols to be less than half the symbol duration for successful subsequent blind adaptation. After such period, larger DGD is acceptable and tracking will solely depend on the tracking speed and method. Otherwise, if the DGD is too large during the blind process, taps of one output dimension of the adaptive 2-In-2-Out filter are likely to converge to a state that offsets the output symbols by an integer number of symbols with respect to the other output dimension.

The results of Fig. 9(b) shows the interesting feature that if the sampling instants are off as much as ± 1/3T with respect to the symbols’ center position, an instantaneous DGD of 0.3T is still tolerated for our proposed method to successfully find and derotate the SOP and allow direct operation in decision directed mode. Consequently, DGDs up to 0.3T can support CGDs in the wide range of –1/3T to + 1/3T, very close to the full range of –1/2T to + 1/2T.

Finally, the speed of convergence for various DGD is assessed for the proposed blind polarization untangling method. We show in Fig. 10 the average required number of symbols for varying DGD for both offset sampling cases.

Fig. 10 Required number of symbols to find U_rot over varying DGD, for CGD = 0 and 1/3T.

Download Full Size | PDF

We can observe that with increasing DGD, an increased number of symbols are required before successfully switching to DD–LMS. This is explained by the fix power threshold value P_th used in these simulations. To match the P_th used in experiments for DP–4QAM after 6400 km where BER≈3.8 × 10⁻³, we used P_th = 1.5. Using this fix value, we can show that the power discrimination ratio increases with increasing DGD. This explains the monotonically increasing trend of the required number of symbols over DGD. We observe that the required number of symbols is systematically very low even for a large DGD of 0.9T at around 600 symbols, compared to the possible 10⁴ symbols required by the CMA process that depends on the SOP alignment as shown in Fig. 6(b) when applied to low PMD fiber. Figures 9 and 10 show the accuracy and speed of the proposed polarization untangling process over largely varying DGD

Finally, the required number of symbols obtained numerically for instantaneous DGDs smaller than 0.2T match the value obtain for real propagation shown in Fig. 6(b). Simulations give between 200 and 245 required symbols on average for DGDs between 0 and 7.1 ps where experiments show around 200 symbols for the same modulation format and noise level with mean DGD of 0.04ps/√km × √6400km = 3.2 ps. Consequently, similarity of the experimental results with simulation over comparable range of PMD confirms the mathematical model used and validates results for larger DGD. The proposed method would successfully operate for fibers with higher PMD_Q like the G.652D. From the Maxwellian distribution of DGD with the sole knowledge of DGD_MEAN, we can be compute the probability that the instantaneous DGD be smaller than 0.9T on G.652D fiber after 6400 km with 28 Gbaud signals giving DGD_MEAN = 0.448T: a probability of 1–P(DGD>0.9/0.448 × DGD_MEAN) = 98.4%.

6. Conclusion

We presented a novel method for blind estimation of the SOP and polarization rotation for single carrier channels and coherent receivers. This method can be used either on regular system startup or for fast switching burst mode receivers. We showed experimentally that the algorithm outperforms the convergence time for any DP–square-QAM format by about an order of magnitude compared to the standard CMA algorithm. The numbers of symbols required to blindly derotate the received SOP before the self-adapting equalizer can operate in decision-directed–LMS mode is used to assess convergence. Unlike CMA, this blind Stokes space polarization recovery algorithm is completely independent of the input SOP of the signal. We compared experimental results of convergence time and initial BER when the CMA and when the Stokes space technique were used to blindly untangle the receiver SOP for 3 modulation formats of DP–4QAM, –16QAM and –64QAM. After propagation over SMF 28e + LL fiber, we obtain convergence within 5.9 ns at BER = 0.38% after 6400 km for DP–4QAM, 29.3 ns at BER = 2.2% after 2240 km for DP–16QAM and 79.1 ns at BER = 1.4% after 640 km for DP–64QAM. The initial BER of the first 20 × 10³ symbols when the Stokes space technique is used is always much closer to the steady-state operation than when CMA is used for tap adaptation, for DP–16QAM and –64QAM formats. For DP–4QAM, some fortuitously well aligned input SOP on system startup can give better initial BER when using CMA. However, the Stokes space method outperforms CMA even for DP–4QAM as it is robust to any input polarization alignment. Finally, we simulated PMD testing on our Stokes space polarization recovery technique to assess its robustness over larger PMDs than what provided by the experimental test bed. We demonstrated the accuracy and speed in obtaining the derotation matrix over DGDs as large as 90% of the symbol duration for perfect sampling offset and as large as 30% when the initial central tap of the filter is off by one third of a symbol with respect to the symbols’ center location. The speed of convergence using the blind Stokes method on large PMD is still generously smaller than that of CMA on little PMD.

References and links

1. S. J. Savory, “Digital coherent optical receivers: algorithms and subsystems,” IEEE J. Sel. Top. Quantum Electron. 16(5), 1164–1179 (2010). [CrossRef]

2. L. Liu, Z. Tao, W. Yan, S. Oda, T. Hoshida, and J. C. Rasmussen, “Initial tap setup of constant modulus algorithm for polarization de-multiplexing in optical coherent receivers,” in Proc. OFC2009, paper OMT2.

3. A. Vgenis, C. S. Petrou, C. B. Papadias, I. Roudas, and L. Raptis, “Nonsingular constant modulus equalizer for PDM-QPSK coherent optical receivers,” IEEE Photon. Technol. Lett. 22(1), 45–47 (2010). [CrossRef]

4. R. Maher, D. S. Millar, S. J. Savory, and B. C. Thomsen, “Widely tunable burst mode digital coherent receiver with fast reconfiguration time for 112Gb/s DP-QPSK WDM networks,” J. Lightwave Technol. (submitted).

5. F. Vacondio, O. Rival, Y. Pointurier, C. Simonneau, L. Lorcy, J.-C. Antona, and S. Bigo, “Coherent receiver enabling data rate adaptive optical packet networks,” in Proc. ECOC2011, paper Mo.2.A.4.

6. J. E. Simsarian, J. Gripp, A. H. Gnauck, G. Raybon, and P. J. Winzer, “Fast-tuning 224-Gb/s intradyne receiver for optical packet networks,” in Proc. OFC02012, paper PDPB5.

7. M. Morsy-Osman, M. Chagnon, Q. Zhuge, X. Xu, M. E. Mousa-Pasandi, Z. A. El-Sahn, and D. V. Plant, “Training symbol based channel estimation for ultrafast polarization demultiplexing in coherent single-carrier transmission systems with M-QAM constellations,” in Proc. ECOC2012, paper Mo.1.A.4.

8. G. P. Agrawal, Lightwave Technology: Telecommunication Systems (Wiley-Interscience, 2005).

9. J. P. Gordon and H. Kogelnik, “PMD fundamentals: polarization mode dispersion in optical fibers,” Proc. Natl. Acad. Sci. U.S.A. 97(9), 4541–4550 (2000). [CrossRef] [PubMed]

10. D. Godard, “Self-recovering equalization and carrier tracking in two-dimensional data communication system,” IEEE Trans. Commun. 28(11), 1867–1875 (1980). [CrossRef]

11. S. J. Savory and F. P. Payne, “Pulse propagation in fibers with polarization-mode dispersion,” J. Lightwave Technol. 19(3), 350–357 (2001). [CrossRef]

12. D. Marcuse, C. R. Menyuk, and P. K. A. Wai, “Application of the Manakov-PMD equation to studies of signal propagation in optical fibers with randomly varying birefringence,” J. Lightwave Technol. 15(9), 1735–1746 (1997). [CrossRef]

13. C. D. Poole and J. Nagel, “Polarization Effects in Lightwave Systems” in Optical Fiber Telecommunications, Vol. III A (Academic Press, 1997) Chap. 6.

14. E. Ip and J. M. Kahn, “Digital equalization of chromatic dispersion and polarization mode dispersion,” J. Lightwave Technol. 25(8), 2033–2043 (2007). [CrossRef]

15. I. Fatadin, D. Ives, and S. J. Savory, “Blind equalization and carrier phase recovery in a 16-QAM optical coherent system,” J. Lightwave Technol. 27(15), 3042–3049 (2009). [CrossRef]

16. I. T. Jolliffe, Principal Component Analysis (Springer-Verlag, 2002).

17. D.-S. Ly-Gagnon, S. Tsukamoto, K. Katoh, and K. Kikuchi, “Coherent detection of optical quadrature phase-shift keying signals with carrier phase estimation,” J. Lightwave Technol. 24(1), 12–21 (2006). [CrossRef]

18. A. J. Viterbi and A. M. Viterbi, “Nonlinear estimation of PSK-modulated carrier phase with application to burst digital transmission,” IEEE Trans. Inf. Theory 29(4), 543–551 (1983). [CrossRef]

19. O. K. Smith, “Eigenvalues of a symmetric 3x3 matrix,” Commun. ACM 4(4), 168 (1961). [CrossRef]

20. P. Johannisson, H. Wymeersch, M. Sjödin, A. S. Tan, E. Agrell, P. A. Andrekson, and M. Karlsson, “Convergence comparison of the CMA and ICA for blind polarization demultiplexing,” J. Opt. Commun. Netw. 3(6), 493–501 (2011). [CrossRef]

21. V. N. Rozental, T. F. Portela, D. V. Souto, H. B. Ferreira, and D. A. A. Mello, “Experimental analysis of singularity-avoidance techniques for CMA equalization in DP-QPSK 112-Gb/s optical systems,” Opt. Express 19(19), 18655–18664 (2011). [CrossRef] [PubMed]

22. P. J. Winzer, A. H. Gnauck, C. R. Doerr, M. Magarini, and L. L. Buhl, “Spectrally efficient long-haul optical networking using 112-Gb/s polarization-multiplexed 16-QAM,” J. Lightwave Technol. 28(4), 547–556 (2010). [CrossRef]

23. X. Zhou, and J. Yu, “200-Gb/s PDM-16QAM generation using a new synthesizing method,” in Proc. ECOC2009, paper 10.3.5.

24. P. Johannisson, H. Wymeersch, M. Sjödin, A. S. Tan, E. Agrell, P. Andrekson, and M. Karlsson, “Convergence comparison of CMA and ICA for blind polarization demultiplexing of QPSK and 16-QAM signals,” in Proc. ECOC2009, paper TH.9.A.3.

25. J. Yang, J.-J. Werner, and G. A. Dumont, “The multimodulus blind equalization and its generalized algorithms,” IEEE J. Sel. Areas Comm. 20(5), 997–1015 (2002). [CrossRef]

26. X. Zhou, L. Nelson, R. Isaac, P. Magill, B. Zhu, and D. Peckham, “1200km transmission of 50GHz spaced, 5x504-Gb/s PDM-32-64 hybrid QAM using electrical and optical spectral shaping,” in Proc. OFC2012, paper OM2A.2.

27. M. J. Ready and R. P. Gooch, “Blind equalization based on radius directed adaptation,” IEEE Int. Conf. Acoust. Speech Signal Proc., 3, 1699–1702 (1990).

Blind, fast and SOP independent polarization recovery for square dual polarization–MQAM formats and optical coherent receivers

Abstract

1. Introduction

2. DP-MQAM formats in Stokes space and modeling of SOP in lightwave systems

2.1 Modeling of SOP in lightwave systems

3. New blind polarization rotation estimation

4. Comparison metric and experimental test bed

4.1 Comparison metric

4.2 Experimental test bed

5. Results

5.1 Testing over PMD

6. Conclusion

References and links

Cited By

Figures (10)

Equations (11)

Optics Express