Efficient hierarchical list decoder for massive optical MIMO Transmission

Maxim Greenberg; Moshe Nazarathy; Meir Orenstein

doi:10.1364/OE.16.000718

1. Introduction

Given the large-scale proliferation of 10GbE (10GBASE-SR) optical transmission in datacom applications, interest has recently arisen in extending the 802.3 suite of standards to 100GbE over Multi-Mode Fiber (MMF), currently envisioned to be realized by means of CWDM technology. We have recently proposed an alternative optical Multiple-Input-Multiple-Output (MIMO) transmission architecture for short-range 100GbE over MMF, based on the insight that the random nature of the MMF Channel Matrix (CM) provides a unique opportunity to generate and manipulate random codes, reliably conveying vast amounts of information in parallel over a single MMF interconnect. We propose and detail a novel technique, briefly previewed in [1],[2],[3], for massively parallel transmission over MMF, in effect converting the fiber into an extended random code generator. Although random codes ideally attain channel capacity [4], they have never been considered for the realization of any optical or wireless communication system, apparently because no means were available to share the random codebook realizations between the transmitter (TX) and receiver (RX), and also because of the decoding complexity. Here we show that proper spatial encoding at the input yields random codes with certain structural properties. This allows us to apply hierarchical decoding [5] in conjunction with list decoding [6] in order to obtain an efficient sub-ML decoder with reduced complexity.

Fig. 1. 100 Gbps MIMO transmission system over MMF

The opto-electronic realization of our system is made feasible at this juncture by advances in opto-electronic components, in particular silicon photonics [7], using potentially simpler optics and eliminating the multiple CWDM sources in favor of a single CW laser coupled to a silicon-based array of high-speed PSK waveguide modulators. Such state-of-the-art opto-electronics addresses the TX-side realization, but it is the natural random propagation characteristic of the fiber that enables the generation of the random codewords, while the entire codebook is distributed to the RX side by means of a training sequence procedure to be detailed, run in each coherency interval during which the CM is stable. It should be clarified that the capacity enhancement is not actually achieved by means of a random coding gain effect at the input of the system; rather, it is attained in our case due to multiplexing gain (a MIMO effect), since we manage to simultaneously transmit and successfully decode multiple independent data channels through a single fiber (10 channels in the proposed set-up). The random codes inherently produced by the MMF, along with the subsequent binary decoders, are not temporal but spatial (over the multiple outputs). Their randomness is due to the channel continually varying under thermal and acoustic perturbations. Over different coherence intervals, the guided modes accumulate different phases, which results in different speckle patterns and consequently different spatial code realizations. The meaning of random coding here is that over the ensemble of multiple coherence intervals, we effectively transmit a multitude of random codes (rather than a single randomly drawn code).

In our previous works [1,2,3] it was conjectured that the aggregate data rates achievable with the MIMO architecture were limited by the realization of maximum likelihood (ML) decoding, known to have exponential complexity. For example, in order to decode 10 parallel channels, the RX should perform 2¹⁰=1024 correlations, implemented as bitwise XORs between the received message and all possible codewords, which may be quite challenging using a 10GHz clock. In wireless MIMO communications, one approach to mitigating the complexity of the ML decoder is to utilize a tree-based sphere detection algorithm [8]. Such a solution may be suitable for optical MIMO, albeit only when there is a linear relationship between input and output, e.g. when using incoherent inputs [9,10]. In the case of coherent transmission with quadratic direct detection, which is of interest here, sphere detection is not applicable.

The model described in this paper pertains to the relatively short distances of 30–50m, over which the Inter Symbol Interference (ISI) induced by intermodal dispersion may be neglected [10]. Notably, high-speed (up to 10Gb/s) transmission over longer distances (300m–1km) is essentially ISI-limited. A vast amount of research has been devoted to mitigating ISI by means of diverse equalization techniques [11]. On the other hand, for short distances, conventional single-input single-output (SISO) transmission is unable to support high-speed (tens of Gbps) communication, due to the unavailability of cost-effective transmitters at these rates. The only option seemingly left is parallelization of multiple channels over a single physical fiber, by means of techniques such as CWDM, or the recently considered MIMO.

2. Concept and system architecture

Consider a multimode fiber configured as a MIMO system, randomly coupling n_T modulated sources at its input and terminated in n_R detectors at its output (Fig. 1). The transmitter opto-electronic structure comprises n_T optical input ports synchronously modulated by binary PSK [1,2,3], each at a bitrate T^-1=10Gb/s, injecting their mutually coherent optical outputs into the MMF, where they are randomly coupled to the propagation modes of the MMF. The multiple PSK-modulated optical sources may be realized as silicon-based plasma-effect modulators, a recently emerging technology amenable to compact integration.

In each symbol interval T, a k-bit message, m≡m₁m₂…m_k∊Z₂^k (with Z₂≡{0,1}), out of a space of M=2^k=2¹⁰ possible messages, is initially encoded as an n_T-dimensional binary bitstring $g \equiv g_{1} g_{2} \ldots g_{n_{T}} \in Z_{2}^{n_{T}}$ with n_T≥k+1, selected out of a spatial code G of a certain structure as discussed below. Each input binary codeword is mapped by the PSK modulator array into a complex field vector Ḛ^s, with phase-modulated bipolar elements ±A, respectively launched into the fiber input ports. The complex field vector incident at the n_R detectors is then Ḛ^d=HḚ^s, with H the MMF random CM of size n_R·D×n_T, and D the number of speckle/modal optical Degrees of Freedom (DOF) at each detector. Assuming the detector responsivity ρ≅1 A/W, the photocurrents are $I_{i} = \sum_{n=1}^{D} |E_{n}^{d(i)}|^{2}$, where the field components are labeled by two indices (i,n), pointing to the n-th DOF associated with the i-th detector. It was established in [12],[13] that under the condition of random coupling, the elements of the CM H are independent identically distributed (i.i.d.) complex Gaussian random variables. The electrical current in the i-th detector, I_i, is a Gamma-distributed random variable, with parameters D, the number of DOFs per detector, and λ, the average current of a single DOF (equal to the total received noiseless current divided by the total number of DOFs):

Fig. 2. (a). Equivalent channel model. (b). Bit-flip probability vs n_T for different δ.

Γ_{(λ, D)} (I_{i}) = \frac{I_{i}^{D - 1} \exp (- I_{i} / λ)}{λ^{D} (D - 1)!}
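The Gamma statistics of the photocurrent can be checked numerically. The following is a minimal Monte-Carlo sketch, not the authors' simulation: it assumes unit-variance complex Gaussian channel entries, so that λ = n_T·A², and the values n_T=11, D=4, A=1 as well as the function names are illustrative choices made here.

```python
import math
import random

def simulate_photocurrents(n_T=11, D=4, A=1.0, trials=3000, seed=1):
    """Draw photocurrents I = sum_n |E_n|^2 for one detector with D DOFs.

    Each trial uses a fresh D x n_T block of i.i.d. unit-variance complex
    Gaussian channel entries and a random +/-A input vector (assumptions).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # std of real and imaginary parts, so E|h|^2 = 1
    currents = []
    for _ in range(trials):
        s = [A if rng.random() < 0.5 else -A for _ in range(n_T)]
        current = 0.0
        for _ in range(D):
            re = sum(rng.gauss(0.0, sigma) * s_j for s_j in s)
            im = sum(rng.gauss(0.0, sigma) * s_j for s_j in s)
            current += re * re + im * im
        currents.append(current)
    return currents

def gamma_pdf(x, lam, D):
    """Gamma density with integer shape D and scale lam."""
    return x ** (D - 1) * math.exp(-x / lam) / (lam ** D * math.factorial(D - 1))

n_T, D, A = 11, 4, 1.0
lam = n_T * A ** 2  # mean current of a single DOF under the assumptions above
samples = simulate_photocurrents(n_T, D, A)
sample_mean = sum(samples) / len(samples)
print(f"sample mean {sample_mean:.2f}, Gamma mean D*lam = {D * lam:.2f}")
```

The sample mean should approach D·λ, the mean of the Gamma density, as the number of trials grows.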

Upon optical detection, the components of the received field are absolute-squared, summed over the DOFs and one-bit quantized with thresholds $\bar{I_{i}}$ selected offline:

c_{i} = \frac{sgn [I_{i} - \bar{I_{i}}]}{2} + \frac{1}{2} \in \{0, 1\}

The detected n_R-bit codewords $c = c_{1} c_{2} \ldots c_{n_{R}}$ form realizations of a random code $ℂ \subset Z_{2}^{n_{R}}$ of size M in one-to-one correspondence with the message space Z₂^k at the transmit side, m↔g↔c (to distinguish between the input and output codes we denote the transmitted “encoded messages” by g and the received codewords by c). As the CM H is random, so is the code ℂ[H] generated at the RX side. In order to obtain symmetric codes, such that Pr{c_i=1}=0.5, the thresholds $\bar{I_{i}} = \bar{I}$ should be chosen at the median of the photocurrent distribution, and may be numerically calculated for any values of (λ,D). Including the impact of the thermal noise induced in the receiver, we have synthesized an end-to-end coded vector binary channel,

r = c \oplus ε, c \in ℂ [H]

with ε the error vector induced by the thermal noise. In our study we used a noise spectral density of N=10^-21 A²/Hz. The equivalent model is shown in Fig. 2(a). The MMF acts as a distributed random code generator, and the “noisy channel” is entirely located at the receive side.
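The median threshold that makes the output code symmetric can be obtained numerically. The sketch below is one possible approach, assuming an integer D so that the Gamma distribution has the closed-form Erlang CDF; the bisection routine and its names are illustrative and are not taken from the paper.

```python
import math

def erlang_cdf(x, lam, D):
    """CDF of the Gamma density with integer shape D and scale lam."""
    if x <= 0.0:
        return 0.0
    z = x / lam
    term, partial = 1.0, 1.0  # n = 0 term of sum_{n<D} z^n / n!
    for n in range(1, D):
        term *= z / n
        partial += term
    return 1.0 - math.exp(-z) * partial

def median_threshold(lam, D, tol=1e-12):
    """Bisection for the current at which the CDF equals 0.5."""
    lo, hi = 0.0, lam * D
    while erlang_cdf(hi, lam, D) < 0.5:  # widen until the median is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, lam, D) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative values only: lam = 1 (normalized current), D = 4 DOFs.
threshold = median_threshold(1.0, 4)
print(f"median threshold for lam=1, D=4: {threshold:.4f}")
```

For D=1 the current is exponentially distributed and the routine reduces to the known median λ·ln 2, which gives a quick sanity check.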

The raw error probability P₀₁=P₁₀=P_ε depends on the system parameters (λ,D). The error probability P₀₁, namely the probability that the overall current including noise exceeds the threshold given that the noiseless current is α<Ī, is expressed by means of the Gaussian Q-function [8]

P (err ∣ I_{i} = α) = \int_{\bar{I}}^{\infty} \frac{1}{\sqrt{2 π} σ_{n}} \exp \left( - \frac{{(x - α)}^{2}}{2 σ_{n}^{2}} \right) dx = Q \left( \frac{\bar{I} - α}{σ_{n}} \right)

where σ_n is the noise standard deviation (σ_n² being the noise variance). The average error probability per bit is obtained by averaging over the Gamma-distributed photocurrent:

P_{ε} = P_{01} = E_{α} [P (err ∣ I_{i} = α)] = \int_{0}^{\bar{I}} Q \left( \frac{\bar{I} - α}{σ_{n}} \right) Γ_{(λ, D)} (α) d α .
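The averaging integral above has no simple closed form, but it is easy to evaluate numerically. The following sketch uses a plain midpoint rule; the step count, the function names and the numerical values (normalized λ=1, D=1, σ_n=0.05) are assumptions made for illustration and do not correspond to the system parameters of the paper.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gamma_pdf(x, lam, D):
    """Gamma density with integer shape D and scale lam."""
    return x ** (D - 1) * math.exp(-x / lam) / (lam ** D * math.factorial(D - 1))

def raw_error_probability(threshold, lam, D, sigma_n, steps=20000):
    """Midpoint-rule estimate of int_0^thr Q((thr - a) / sigma_n) * Gamma(a) da."""
    h = threshold / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * h
        total += q_function((threshold - a) / sigma_n) * gamma_pdf(a, lam, D)
    return total * h

# Illustrative, normalized values: D = 1, lam = 1, threshold at the median ln 2.
p_eps = raw_error_probability(threshold=math.log(2.0), lam=1.0, D=1, sigma_n=0.05)
print(f"P_eps = {p_eps:.3e}")
```

As expected from the formula, the result shrinks as σ_n decreases, and for very large σ_n it tends to one half of the probability mass below the threshold.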

In previous works [1,2,3] we have investigated the system performance using Maximum Likelihood (ML) decoding for the effective channel, comprising M=2^k=2¹⁰ bitwise XORs (⊕) and Hamming weight operations (w_H) at the 10GHz clock, terminated in a select-minimum operation. The main bottleneck in a VLSI implementation of such a receiver would be the data distribution to all the XOR units, rather than the XOR operations themselves. In this paper we introduce a sub-optimal decoder, trading off some performance for reduced decoder complexity [14].
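The XOR, Hamming-weight and select-minimum chain of the ML decoder can be written in a few lines. This is a software sketch only, with codewords stored as Python integers; the toy codebook is hypothetical and far smaller than the M=2¹⁰ codebook of the system.

```python
def hamming_distance(a, b):
    """Hamming weight of the bitwise XOR of two codewords stored as integers."""
    return bin(a ^ b).count("1")

def ml_decode(received, codebook):
    """Return the index of the codeword closest to `received` in Hamming distance.

    One XOR and one Hamming-weight evaluation is spent per codeword,
    followed by a select-minimum over all of them.
    """
    return min(range(len(codebook)),
               key=lambda m: hamming_distance(received, codebook[m]))

# Hypothetical 6-bit codebook with four codewords.
codebook = [0b000000, 0b111000, 0b000111, 0b111111]
received = 0b110000  # codeword 1 with a single bit error
print(ml_decode(received, codebook))
```

The cost grows linearly with the codebook size, i.e. exponentially with the number of parallel data channels k, which is what motivates the sub-optimal decoder.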

The raw error rate P_ε at each detector (prior to decoding) depends on D, the total mode count, and the output power following (5); here we assume P_ε=0.01 [3]. Each channel realization is assumed to be stable over an interval of >1ms [15], hence the RX must estimate the full M-element random code within this coherency interval. The speckle pattern on the output facet of the MMF varies as a result of varying modal phases, which are in turn affected by thermal and acoustic perturbations. It should also be emphasized that the response of a Single Input Single Output (SISO) MMF channel would vary much more slowly than that of a MIMO MMF system, since the overall power emitted from the MMF is almost insensitive to modal phase perturbations (it would actually be constant if all the power were collected from the end facet of a lossless fiber). For MIMO MMF transmission this is not the case since, as explained, the MIMO response is essentially phase sensitive (each detector sees interference from multiple modes, effectively sampling the speckle pattern). Hence the variation of the fiber provides temporal randomness, from one coherence interval to the next, in addition to the spatial randomness (i.e. the unpredictable values attained at the multiple detectors in each coherence interval). Altogether we have spatio-temporal randomness.

To enable the RX to estimate the code, the TX launches a training sequence comprising each of the messages m∊Z₂^k repeated, say, 1000 times. The RX majority-decodes the repeated sequences, with proper synchronization [3]. We have established that the error rate and overhead (OH) entailed in the training process are negligible.
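The majority decoding of the training sequence can be sketched as a per-bit vote over the repeated receptions. The sketch below assumes independent bit errors with the raw error rate P_ε=0.01 quoted in the text and 1000 repetitions; the 16-bit codeword, the data layout and the function names are hypothetical, and synchronization is not modeled.

```python
import random

def majority_decode(repeats, n_R):
    """Per-bit majority vote over repeated noisy receptions of one codeword."""
    codeword = 0
    for bit in range(n_R):
        ones = sum((r >> bit) & 1 for r in repeats)
        if 2 * ones > len(repeats):
            codeword |= 1 << bit
    return codeword

def learn_codebook(training, n_R):
    """training[m] holds the receptions recorded while message m was repeated."""
    return [majority_decode(reps, n_R) for reps in training]

# Hypothetical example: one 16-bit codeword observed 1000 times with P_eps = 0.01.
rng = random.Random(0)
n_R, p_eps, repetitions = 16, 0.01, 1000
true_codeword = 0b1011001110001101
noisy = []
for _ in range(repetitions):
    error = sum(1 << b for b in range(n_R) if rng.random() < p_eps)
    noisy.append(true_codeword ^ error)
estimate = majority_decode(noisy, n_R)
print(estimate == true_codeword)
```

With 1000 repetitions and a 1% raw error rate, a wrong majority on any bit is extremely unlikely, which is consistent with the statement that the training error rate is negligible.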

3. Spatial coding performance

In the previous publications we have investigated the influence of input spatial coding on the properties of the random output codes. In the uncoded-input case it was seen that the randomness of the CM was insufficient to render the noiselessly received codewords nearly independent. The correlation between any two received codewords was quantified in terms of the bit-flip probability, namely the probability that the β-th bit flips between the two received codewords, p_BF($\bar{δ}$)≡Pr{[c_g⊕c_g′]_β=1}, expressed as a function of $\bar{δ}$=$\bar{δ}$(n_T)≡d_H(g,g′)/n_T, the Normalized Hamming Distance (NHD) between the encoded messages g, g′. Here it will be more convenient to operate with the un-normalized Hamming Distance (HD), δ=$\bar{δ}$·n_T. Fig. 2(b) displays the resulting p_BF vs n_T for different values of δ. Specifically, it was shown [3] that in the uncoded case with n_T=k+1=11 (including a reference input to remove phase ambiguity), two codewords induced by two messages differing in one bit location were on average in coincidence ~74% of the time. Improved performance was attained by using input codes with increased minimum NHD, which results in p_BF≈0.5, consequently increasing the distances between the codewords at the output end of the fiber.
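The dependence of the bit-flip probability on the input Hamming distance can be explored with a small Monte-Carlo experiment. The sketch below is a simplified model and is not expected to reproduce the figures quoted above: it assumes a single DOF per detector (D=1), unit-amplitude inputs and unit-variance complex Gaussian channel entries, so that the noiseless current is exponential with mean n_T and median n_T·ln 2.

```python
import math
import random

def bit_flip_probability(n_T, delta, trials=4000, seed=0):
    """Estimate Pr{one output bit differs} for two inputs at Hamming distance delta.

    Simplifying assumptions: D = 1, inputs +/-1, i.i.d. unit-variance complex
    Gaussian channel row per trial, threshold at the exponential median.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)
    threshold = n_T * math.log(2.0)
    g = [1.0] * n_T                                   # reference encoded message
    g_prime = [-1.0] * delta + [1.0] * (n_T - delta)  # differs in delta positions
    flips = 0
    for _ in range(trials):
        h = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(n_T)]
        i1 = abs(sum(hj * sj for hj, sj in zip(h, g))) ** 2
        i2 = abs(sum(hj * sj for hj, sj in zip(h, g_prime))) ** 2
        flips += (i1 > threshold) != (i2 > threshold)
    return flips / trials

p1 = bit_flip_probability(11, 1)
p5 = bit_flip_probability(11, 5)
print(f"p_BF(delta=1) = {p1:.3f}, p_BF(delta=5) = {p5:.3f}")
```

In this model the two received fields are strongly correlated for δ=1 and nearly uncorrelated for δ close to n_T/2, so the estimate for δ=1 stays well below the estimate for δ=5, which approaches 0.5.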

Fig. 3. (a). Evolution of the input spatial code G to the output random code ℂ. (b) Resulting bit error probability for different number of detectors, with/without thermal noise.

Here we take another route: by proper spatial coding at the input we impose a certain structure on the output random code. Effectively, the output codewords are divided into subgroups: the codewords belonging to each subgroup are strongly correlated, but there is vanishing correlation between the groups. The decoding procedure then consists of two stages:

(i) Decide to which group the received message belongs by applying the ML decoder to the received message and a representative from each group.
(ii) Once the group is determined, decide which codeword was actually transmitted by applying the ML decoder to all the members within the group.

Such a scheme, known as hierarchical decoding [5], may drastically decrease the total number of correlations (XOR units) required in the decoder.

We have established that the Hamming distance between the encoded messages at the input is a key factor determining the correlation between the associated codewords at the output. Based on this observation we propose the following structure for the input spatial code G: (i) select M_c encoded messages g, in n_T-dimensional space, maximally separated in the Hamming distance sense; (ii) construct a code comprising all encoded messages at unity Hamming distance from the M_c messages above. We denote the first M_c encoded messages as “capitals”, whereas all the other encoded messages are called “towns”; the group of all messages belonging to a single “capital” is referred to as a “province”. Altogether, we obtain a code with M=M_c(n_T+1) messages. Following the logic above, after propagating through the fiber, the input spatial code G will emerge as a random output code ℂ with the following useful properties: on average, the “towns” are strongly correlated with their “capitals”, and there is vanishingly small correlation between different subgroups. A typical scenario is sketched in Fig. 3(a).
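The two construction steps can be illustrated on a scaled-down example. The sketch below assumes a greedy lexicographic search for the capitals, which is one possible way to obtain well-separated messages and is not necessarily the search used by the authors; the parameters n_T=7 and minimum distance 3 are chosen here only to keep the example small.

```python
def hamming_distance(a, b):
    """Hamming distance between two bitstrings stored as integers."""
    return bin(a ^ b).count("1")

def greedy_capitals(n_T, d_min, max_capitals=None):
    """Scan all n_T-bit words in order, keeping those far enough from earlier picks."""
    capitals = []
    for candidate in range(1 << n_T):
        if all(hamming_distance(candidate, c) >= d_min for c in capitals):
            capitals.append(candidate)
            if max_capitals is not None and len(capitals) == max_capitals:
                break
    return capitals

def build_provinces(capitals, n_T):
    """Each province is its capital followed by the n_T towns at unit distance."""
    return [[cap] + [cap ^ (1 << j) for j in range(n_T)] for cap in capitals]

# Scaled-down illustration (assumed parameters, not those of the paper).
n_T = 7
capitals = greedy_capitals(n_T, d_min=3)
provinces = build_provinces(capitals, n_T)
code = [g for province in provinces for g in province]
print(len(capitals), len(code))
```

A minimum distance of at least 3 between capitals guarantees that the provinces do not overlap, so the code size is exactly M_c(n_T+1) as stated in the text.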

4. Error probability of sub-ML decoder and list decoding

The decoding process just introduced is affected by two sources of error: the first-stage error, whereby the “town” is associated with the wrong “capital”, and the second-stage error, whereby the received message is decoded to the wrong “town” in the correct “province”. It is important to emphasize that, unlike in conventional ML decoding, the first-stage error is an inherent feature of the system even in the noiseless channel. It may occur when a certain “town” at the output is closer to a wrong “capital” as a result of the random nature of the channel matrix [Fig. 3(a)]. Theoretically this type of error may be reduced by increasing the number of output detectors n_R. However, in practice the n_R required for a reasonable error probability of P_e=10^-3 may be excessive.

A remedy to this problem may be provided by introducing list decoding in the first stage. The decoder makes a list of L allowed capitals, and the decision is then made out of the L(n_T+1) “towns” belonging to these L capitals. The trade-offs of the method are readily apparent: for L=1 only M_c+(n_T+1) correlations are necessary, however the resulting error probability is relatively poor; at the other extreme, for L=M_c we are back to conventional ML decoding, entirely eliminating the first-stage error probability, but paying with M_c(n_T+1)=M required correlations. The overall error probability is then given by

P_{e} = P_{1} + (1 - P_{1}) P_{2} \approx P_{1} + P_{2}

with P₁ and P₂ the error probabilities of the first and second stages. Moreover, it turns out that P₂ is several orders of magnitude smaller than P₁ for raw error rates P_ε<0.1. Figure 3(b) illustrates the results of Monte-Carlo simulations of the system, presenting the resulting error probability P_e≈P₁ versus n_R for varying list lengths. The underlying input code G consists of M_c=64 “capitals” at minimum Hamming distance δ_min=5 between them, and 1024 codewords in total (Table 1). The code was constructed by direct search over all the 2¹⁵ binary sequences; however, any other code with the same structural properties would be appropriate. It is apparent that the thermal noise resulting in P_ε=0.01 makes only a small contribution to the system error.

n_T (Number of inputs)	k (Number of data ch.)	M (Number of messages)	M_c (Number of capitals)	T^-1 (Bitrate per port)	Total bitrate
15	10	2¹⁰	64	10Gb/s	100Gb/s
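The two-stage procedure with a first-stage list of length L can be sketched as follows. This is a minimal software illustration under stated assumptions: the received-side codebook is assumed to be already learned and grouped by province, codewords are stored as integers, and the toy 8-bit provinces and all function names are hypothetical. The helper correlations_needed simply evaluates the count M_c+L(n_T+1) given in the text.

```python
def hamming_distance(a, b):
    """Hamming weight of the XOR of two codewords stored as integers."""
    return bin(a ^ b).count("1")

def hierarchical_list_decode(received, capitals_out, provinces_out, L=1):
    """Two-stage sub-ML decoding with a first-stage list of length L.

    capitals_out[p]  : received-side codeword of the capital of province p
    provinces_out[p] : received-side codewords of all members of province p
    Returns (province index, member index, correlations used).
    """
    # Stage 1: rank provinces by distance to their capital, keep the best L.
    ranked = sorted(range(len(capitals_out)),
                    key=lambda p: hamming_distance(received, capitals_out[p]))
    shortlist = ranked[:L]
    correlations = len(capitals_out)
    # Stage 2: ML search restricted to members of the shortlisted provinces.
    best = None
    for p in shortlist:
        for t, codeword in enumerate(provinces_out[p]):
            correlations += 1
            d = hamming_distance(received, codeword)
            if best is None or d < best[0]:
                best = (d, p, t)
    return best[1], best[2], correlations

def correlations_needed(M_c, n_T, L):
    """Correlation count M_c + L (n_T + 1) quoted in the text."""
    return M_c + L * (n_T + 1)

# Hypothetical toy output code: two provinces of three 8-bit codewords each.
provinces_out = [[0b00000000, 0b00000001, 0b00000010],
                 [0b11111111, 0b11111110, 0b11111101]]
capitals_out = [province[0] for province in provinces_out]
print(hierarchical_list_decode(0b11111110, capitals_out, provinces_out, L=1))
print(correlations_needed(64, 15, 1), "correlations vs", 64 * 16, "for full ML")
```

With M_c=64 and n_T=15, a list of length L=1 needs 80 correlations instead of 1024. This sketch re-correlates each shortlisted capital in the second stage; reusing the first-stage results would save L correlations.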

5. Summary

It was shown that the random codes naturally induced by the MMF channel matrix statistics, further assisted by spatial input coding with hierarchical structure, as proposed and analyzed here, enable ultra-high bitrate parallel transmission of 100 GbE, with manageable complexity, using n_T independently modulated signals all superimposed into the MMF at the same wavelength. The implementation-efficient sub-ML hierarchical list decoder introduced here provides a sizable reduction in receiver complexity relative to our previous scheme [3]. In conclusion, the proposed optical MIMO system architecture could potentially provide an alternative to CWDM for short-range applications.

References and links

1. M. Greenberg, M. Nazarathy, and M. Orenstein, CMQ4, CLEO, 2007.

2. M. Greenberg, M. Nazarathy, and M. Orenstein, “Massively parallel transmission over multimode fiber applied to 100 Gigabit Ethernet using Gilbert-Varshamov codes,” OECC/IOCC 10B2–3 2007.

3. M. Greenberg, M. Nazarathy, and M. Orenstein, “Multimode fiber as random code generator-application to massively parallel MIMO transmission,” accepted to J. Lightwave Technol.

4. C. E. Shannon, “A mathematical theory of communication,” Bell Sys. Tech. J. 27, 379–423, 623–656, (1948).

5. P. C. Chang, J. May, and R. M. Gray, “Hierarchical vector quantizers with table-lookup encoders,” in Proc. 1985 IEEE Int. Conf. Communications, 3, pp. 1452–1455 (1985).

6. M. Sudan, “List decoding: Introduction 1,” http://theory.lcs.mit.edu/~madhu/papers/ifip-journ.ps.

7. Q. Xu, B. Schmidt, J. Shakya, and M. Lipson, “Cascaded silicon micro-ring modulators for WDM optical interconnection,” Opt. Express 14, 9430–9435 (2006).

8. J. Barry, E. Lee, and D. Messerschmitt, Digital Communication, 3rd ed. (Kluwer Academic Publishers, 2004).

9. Y. Yadin and M. Orenstein, “Parallel optical interconnects over multimode waveguide,” J. Lightwave Technol. 24, 380–386 (2006).

10. M. Greenberg, M. Nazarathy, and M. Orenstein, “Data parallelization by optical MIMO transmission over multi-mode fiber with inter-modal coupling,” J. Lightwave Technol. 25, 1503–1514 (2007).

11. P. Pepeljugoski, J. A. Tiero, A. Risteski, S. K. Reynolds, and L. Schares, “Performance of simulated Annealing Algorithm in equalized multimode fiber links with Linear Equalizers,” J. Lightwave Technol. 24, 4235–4249 (2006).

12. Y. Yadin and M. Orenstein, “Parallel optical interconnects over multimode waveguides using mutually coherent channels and direct detection,” J. Lightwave Technol. 24, 380–386 (2006).

13. R. C. J. Hsu, A. Tarighat, A. Shah, A. H. Sayed, and B. Jalali, “Capacity enhancement in Coherent Optical MIMO (COMIMO) multimode fiber links,” IEEE Commun. Lett. 10, 195–197 (2006).

14. M. Greenberg, M. Nazarathy, and M. Orenstein, “Efficient hierarchical list decoder for 100 gigabit ethernet optical MIMO,” WEE 4, LEOS 2007.

15. J. Kahn, “Compensating multimode fiber dispersion using Adaptive Optics,” in OFC’07, OTuL1 (2007).

Optics Express