
Conditional convolutional GAN-based adaptive demodulator for OAM-SK-FSO communication


Abstract

The perturbation caused by atmospheric turbulence is a significant challenge in orbital angular momentum shift keying-based free space optical communication (OAM-SK-FSO). In this study, we propose an adaptive optical demodulation system based on deep learning techniques. A conditional convolutional GAN (ccGAN) network is applied to recover the distorted intensity pattern and assign it to its specified class. Compared to existing methods based on convolutional neural networks (CNNs), our network demonstrates a powerful capability in recovering the distorted light beam, resulting in a higher recognition accuracy rate under the same conditions. The average recognition accuracy rates are 0.9928, 0.9795 and 0.9490 when the atmospheric refractive index structure constant $C_n^2$ is set to 3 × 10−13, 4.45 × 10−13 and 6 × 10−13 $\textrm{m}^{-2/3}$, respectively. The ccGAN network provides a promising tool for free space optical communication.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Vortex beams (VBs) are special light beams that exhibit a phase singularity, manifesting as an isolated dark spot and a spiral wavefront. In 1992, Allen et al. proposed VBs characterized by the helical phase factor exp(iℓθ), such as the Laguerre-Gaussian (LG) modes, which carry an orbital angular momentum (OAM) of ℓℏ per photon (where ℓ is an integer) [1]. This angular momentum can be significantly greater than the spin angular momentum (SAM) related to the photon spin. In recent years, VBs have found extensive applications in fields including optical tweezers, optical communication, nonlinear optics, microscopy and imaging [2].

With the increasing demand for information capacity, OAM offers a novel degree of freedom in optical communication, in addition to amplitude, frequency, phase and polarization. By combining OAM modes with other beam properties, the data capacity and stability can be significantly enhanced [3]. There are two basic approaches for applying OAM in communication: OAM multiplexing and OAM modulation. The former treats OAM as a new dimension of multiplexing, similar to time-division multiplexing (TDM) and frequency-division multiplexing (FDM). The latter uses OAM as a modulation technique by mapping the encoded bit sequence onto the corresponding OAM modes. This approach is commonly employed in free space optical communication, where it is known as orbital angular momentum shift keying-based free space optical communication (OAM-SK-FSO), and offers distinct advantages such as high photon efficiency, low equipment cost, large bandwidth and reliable information security [4].

Atmospheric turbulence (AT) distorts both the intensity distribution and the spiral phase of propagating OAM beams. It degrades orthogonality, diminishes mode purity, increases the bit-error-rate (BER), and ultimately deteriorates the communication system. Suppressing the AT effect in OAM-SK-FSO has therefore become imperative. Adaptive optics (AO) is commonly employed against the AT effect, including the Shack-Hartmann (SH) algorithm, the Gerchberg-Saxton (GS) algorithm and the stochastic-parallel-gradient-descent (SPGD) algorithm. The SH algorithm requires a wavefront sensor and a Gaussian probe beam (GPB) for aberration detection, plus two wavefront correctors to rectify distorted OAM beams, which increases the system complexity. The GS and SPGD algorithms need neither a sensor nor a GPB; however, they involve extensive calculation and often struggle to find the global minimum due to limited memory capacity [5,6]. With the development of artificial intelligence, recent research has focused on applying deep learning methods in free space optical communication (FSO). In visible light communication (VLC), three artificial neural networks (ANNs) have been used to construct an end-to-end LED-based VLC system comprising an encoder, a channel model and a decoder [7]. Similarly, the conditional generative adversarial network (conditional GAN) has demonstrated high accuracy and excellent generalization capability in channel modeling [8]. Deep learning methods have also been used for AT compensation and OAM mode recognition in OAM-FSO systems. For example, a convolutional neural network (CNN) model has been utilized to simultaneously detect AT and recognize OAM modes [9]. To recognize OAM modes more accurately with a lower bit-error-rate (BER), a turbo code [10] and a joint rank-order adaptive median filter (RAMF) with a very deep super-resolution (VDSR) network [11] have been integrated with the CNN model. A series of GPBs has been introduced into a CNN model to detect AT strength and generate compensation phase screens that restore distorted OAM beams and improve mode purity [12]. Other CNN-series models [13–15] and vision transformers (ViT) [16] have also been applied to this problem. Additionally, direct compensation by predicting the phase screen and Zernike coefficients has been studied [17–20]. Excellent performance can also be achieved by combining a traditional AO system with CNN models [21].

Unfortunately, severe atmospheric turbulence (AT) worsens the performance of the CNN-based OAM-SK-FSO system [9]. Hence, an additional channel coding operation has been introduced to counteract AT, which in turn increases the complexity and computational load of the system [10]. Furthermore, [11] proposes a scheme that requires an initial separation of AT classes, which adds redundancy. Besides, RAMF and VDSR are specific image-processing operations that do not fundamentally enhance communication quality, so this method may not be sufficiently effective in other communication scenarios. In [12], the use of a GPB reduces the utilization rate of light. Moreover, under long-distance and severe AT conditions, it becomes increasingly difficult for a CNN model to accurately compensate phase screens [12–16]. Joint AO-CNN systems also require optical equipment, such as wavefront correctors and controllers, along with specialized optical knowledge, making them difficult to replicate and inconvenient. In summary, OAM-SK-FSO still requires an easy and effective approach for constructing the demodulation system under atmospheric turbulence.

In this study, we propose an advanced adaptive demodulator based on a deep learning method for OAM-SK-FSO. The binary signals are modulated into a series of superimposed OAM modes and then transmitted through a virtual AT channel. A conditional convolutional GAN (ccGAN) network is applied to recover and recognize the distorted OAM intensity patterns. The atmospheric refractive index structure constant $C_n^2$ fluctuates randomly between 10−16 and 10−12 $\textrm{m}^{-2/3}$. The results demonstrate the robust capability of the neural network in recovering distorted OAM beams, thereby significantly enhancing the accuracy of OAM mode classification. Compared with the CNN model, our proposed OAM demodulator exhibits exceptional performance even under severe AT conditions, achieving a recognition accuracy rate of 94.90% at $C_n^2$ = 6 × 10−13 $\textrm{m}^{-2/3}$. The entire demodulator is implemented solely on a computer without any optical devices. Consequently, this approach offers an effective and convenient method for constructing an OAM-SK-FSO system.

2. Principle of OAM-SK-FSO system

The OAM-SK-FSO system configuration is illustrated in Fig. 1, including a transmitter, an AT channel and a demodulator utilizing deep learning techniques. The AT channel is modeled by loading atmospheric phase patterns onto a spatial light modulator (SLM). The received OAM intensity patterns are captured by a CCD and then sent into our network.

Fig. 1. Scheme of OAM-SK-FSO system.

2.1 Transmitter

The raw data is encoded into a series of bit signals in the transmitter, which are then converted from serial to parallel. Subsequently, the electrical signals are converted into superimposed OAM modes through a group of Mach–Zehnder modulators (MZMs), spiral phase plates (SPPs) and beam splitters. The Gaussian beams are modulated into Laguerre-Gaussian (LG) beams by the SPPs. The LG beam is expressed as

$$E(r,z) = \sqrt {\frac{2p!}{\pi (p + |l|)!} \cdot \frac{1}{\omega^2(z)}} \cdot \exp \left[ -\frac{r^2}{\omega^2(z)} \right] \cdot \left( \frac{2r^2}{\omega^2(z)} \right)^{\frac{|l|}{2}} \cdot L_p^{|l|}\!\left( \frac{2r^2}{\omega^2(z)} \right) \cdot \exp (il\theta),$$
where $(r,\theta)$ denotes polar coordinates, p is the radial index (set to 0 in this paper), l is the topological charge, $L_p^{|l|}$ is the generalized Laguerre polynomial, $\omega(z)$ is the beam radius at distance z and $\exp(il\theta)$ is the helical phase. The selected OAM modes must be both robust against atmospheric turbulence and easily distinguishable. We therefore chose four OAM modes with $l$ = 1, -2, 3, -5, consistent with [10] and [11] for ease of later comparison. This selection yields a total of 2^4 = 16 different states, so each group of 4 bits is modulated into one superimposed OAM state. Figure 2 illustrates all states of the superimposed OAM modes; each column represents a different modulated state, and all intensity patterns are well distinguished from one another. A minimal sketch of this mapping is given below.
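
The following sketch illustrates how a 4-bit symbol selects a superposition of LG modes per Eq. (1); the grid size, extent and beam waist are illustrative placeholders, not the parameters of Table 1.

```python
# Illustrative sketch (not the authors' code): map a 4-bit symbol to a
# superimposed LG field per Eq. (1), with p = 0 and assumed grid parameters.
import numpy as np
from scipy.special import genlaguerre, factorial

N = 256                             # assumed sampling grid

def lg_field(l, w=0.02, extent=0.05, p=0):
    """LG_p^l complex field at the waist on an N x N grid, per Eq. (1)."""
    x = np.linspace(-extent, extent, N)
    X, Y = np.meshgrid(x, x)
    r2, theta = X**2 + Y**2, np.arctan2(Y, X)
    amp = (np.sqrt(2 * factorial(p) / (np.pi * factorial(p + abs(l)) * w**2))
           * np.exp(-r2 / w**2)
           * (2 * r2 / w**2) ** (abs(l) / 2)
           * genlaguerre(p, abs(l))(2 * r2 / w**2))
    return amp * np.exp(1j * l * theta)

MODES = [1, -2, 3, -5]              # topological charges used in the paper

def modulate(bits):
    """Superimpose the LG modes whose bit is '1', e.g. '1010' -> l = 1, 3."""
    field = np.zeros((N, N), dtype=complex)
    for b, l in zip(bits, MODES):
        if b == '1':
            field += lg_field(l)
    return field

intensity = np.abs(modulate('1001')) ** 2   # one of the 16 symbol states
```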

Fig. 2. Superimposed OAM modes (4-bit signals with l = 1, -2, 3, -5) under different atmospheric turbulence strengths, with $C_n^2$ = (a) 0, (b) 10−15, (c) 10−14, (d) 10−13 and (e) 10−12 $\textrm{m}^{-2/3}$.

2.2 Atmospheric turbulence channel model

The phenomenon of atmospheric turbulence (AT) is widely observed in free space, disrupting both the intensity and phase distribution of propagating light. Vortex beams, in particular, are highly susceptible to these disturbances owing to their helical phase, leading to ambiguous intensity distributions and mode crosstalk.

The virtual AT channel we developed is based on the Hill-Andrews (HA) model [22], which derives from the phase-screen theory of the atmosphere. The phase screen is given by

$$\phi (x,y) = {\mathbb R}\left\{ {{F^{ - 1}}\left[ {{C_{N \times N}} \cdot \frac{{2\pi }}{{N \cdot \mathrm{\Delta }x}}\sqrt {\Phi ({k_x},{k_y})} } \right]} \right\},$$
where $\mathbb{R}\{\cdot\}$ denotes extraction of the real part and $F^{-1}(\cdot)$ the inverse Fourier transform. ${C_{N \times N}}$ is an N × N array of complex random numbers following a standard normal distribution, N is the sampling number and $\Delta x$ is the grid spacing. $\Phi(k_x, k_y)$ represents the phase spectrum of the atmospheric refractive index, which can be written as
$$\Phi (k_x,k_y) = 2\pi k_0^2 \cdot \Delta z \cdot 0.033\,C_n^2 \left\{ 1 + 1.802\sqrt{\frac{k_x^2 + k_y^2}{(3.3/l_0)^2}} - 0.254\left[ \frac{k_x^2 + k_y^2}{(3.3/l_0)^2} \right]^{\frac{7}{12}} \right\} \cdot \frac{\exp\!\left[ -\frac{k_x^2 + k_y^2}{(3.3/l_0)^2} \right]}{\left( k_x^2 + k_y^2 + \frac{1}{L_0^2} \right)^{\frac{11}{6}}},$$
where the wavenumber $k_0$ represents the spatial frequency (rad/m), $\Delta z$ is the interval between phase screens and $C_n^2$ indicates the atmospheric refractive index structure constant. $l_0$ and $L_0$ are the inner and outer scales of turbulence, respectively. The received light field can be expressed as
$$u(z + \Delta z) = F^{-1}\{ F[ u(z) \cdot \exp (i\phi) ] \cdot H(\Delta z) \},$$
where F(•) is the Fourier transform and H(•) is the Fresnel transfer function.
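
A compact numerical sketch of Eqs. (2)-(4) follows; the wavelength, inner/outer scales and grid spacing are assumed illustrative values rather than the entries of Table 1.

```python
# Minimal split-step AT channel sketch per Eqs. (2)-(4); parameter values
# here are assumptions for illustration.
import numpy as np

def phase_screen(N, dx, Cn2, dz, wavelength=1550e-9, l0=5e-3, L0=20.0):
    """One random phase screen, Eqs. (2)-(3)."""
    k0 = 2 * np.pi / wavelength
    fx = np.fft.fftfreq(N, dx) * 2 * np.pi          # angular spatial frequencies
    kx, ky = np.meshgrid(fx, fx)
    k2, kl2 = kx**2 + ky**2, (3.3 / l0) ** 2
    # Modified HA spectrum, Eq. (3); note the decaying exponential
    Phi = (2 * np.pi * k0**2 * dz * 0.033 * Cn2
           * (1 + 1.802 * np.sqrt(k2 / kl2) - 0.254 * (k2 / kl2) ** (7 / 12))
           * np.exp(-k2 / kl2) / (k2 + 1 / L0**2) ** (11 / 6))
    C = (np.random.randn(N, N) + 1j * np.random.randn(N, N)) / np.sqrt(2)
    return np.real(np.fft.ifft2(C * (2 * np.pi / (N * dx)) * np.sqrt(Phi)))

def propagate(u, screen, dz, dx, wavelength=1550e-9):
    """One split-step hop, Eq. (4): apply the screen, then Fresnel transfer."""
    N = u.shape[0]
    fx = np.fft.fftfreq(N, dx) * 2 * np.pi
    kx, ky = np.meshgrid(fx, fx)
    k = 2 * np.pi / wavelength
    H = np.exp(1j * dz * (k - (kx**2 + ky**2) / (2 * k)))   # Fresnel transfer fn
    return np.fft.ifft2(np.fft.fft2(u * np.exp(1j * screen)) * H)

# Example: distort the superimposed field from the previous sketch
# field = propagate(modulate('1001'),
#                   phase_screen(256, 4e-4, Cn2=1e-13, dz=100.0), 100.0, 4e-4)
```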

The degradation of OAM patterns with increasing AT strength is evident in Fig. 2. In severe cases, the patterns of different OAM modes become too indistinct to be recognized, as in the last row of Fig. 2, which poses a significant challenge for the subsequent demodulation. The parameters of the AT channel and vortex beam are listed in Table 1.

Table 1. Parameters of AT channel and vortex beam

2.3 Conditional convolutional GAN demodulator

The proposed demodulator at the receiver employs an advanced conditional generative adversarial network (conditional GAN). This neural network accurately identifies the light intensity patterns and assigns each to a specific category number, thereby facilitating demodulation.

2.3.1 Conditional GAN

A generative adversarial network (GAN) is a type of generative model that can generate new samples similar to the target distribution through training [23]. Unlike conventional neural networks, GAN is based on the concept of a zero-sum game. A classical GAN consists of two modules: the generator and the discriminator. The generator randomly samples from the latent space and aims to produce outputs that closely resemble the target samples. The discriminator takes either real samples or generator outputs as input and aims to distinguish between them. Through this competition, both modules improve, eventually reaching a Nash equilibrium in which the discriminator cannot determine the source of an input, indicating that the generator's outputs accurately represent the true data distribution.

The data generated by a GAN, however, are random and unpredictable, which prevents the network from generating specific categories in a controlled manner. This limitation stems from the fundamental nature of GANs, which primarily learn the distribution of the entire dataset, such as its probability density function (PDF). Consequently, a GAN may generate samples that bear sufficient resemblance to the target but do not belong to any specific category within it. To address this challenge, the conditional GAN was proposed [24]. Its core concept is to control the generated images by incorporating additional conditional information into both the generator and discriminator inputs. This ensures that the generator can only deceive the discriminator if the data is sufficiently realistic and consistent with the provided conditional information.

Convolutional layers are introduced to replace the fully connected layers of the traditional conditional GAN in order to process image data more effectively; this modification better exploits local correlations within images and reduces computational cost. The proposed conditional convolutional generative adversarial network (ccGAN) is illustrated in Fig. 3. The generator takes distorted OAM intensity patterns and corresponding labels as input and generates recovered intensity patterns that closely resemble the original ones. Meanwhile, the discriminator receives either the generator's output or real data with labels as input and produces a discriminant result ranging from 0 to 1 along with a classification outcome. In other words, the generator can be viewed as an equalizer, and the discriminator as a decoder.

Fig. 3. Overview of ccGAN.

2.3.2 Generator

The generator architecture shown in Fig. 4(a) is inspired by the auto-encoder network [25]. A distorted intensity image is resized to 64 × 64, converted to grayscale and normalized; this preprocessed image is then combined with a label as the input. The overall architecture contains five convolutional layers and four deconvolutional layers. Each convolutional layer employs 3 × 3 kernels, while the deconvolutional layers use 4 × 4 kernels. Batch normalization and activation layers follow each convolutional and deconvolutional layer except the last. We use LeakyReLU as the activation function throughout, except in the final layer, where tanh compresses the output to the range -1 to 1. A minimal PyTorch sketch of this architecture is given below.
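
The following sketch reflects the layer counts and activations stated above; the channel widths, strides and the label-as-extra-channels conditioning are our assumptions, not the authors' exact configuration.

```python
# Generator sketch: five 3x3 conv layers, four 4x4 deconv layers,
# BN + LeakyReLU everywhere except the final tanh layer.
import torch
import torch.nn as nn

NUM_CLASSES = 16  # 2^4 superimposed OAM states

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        chans = [1 + NUM_CLASSES, 32, 64, 128, 256, 512]  # grayscale + label maps
        enc = []
        for i in range(5):                                # five 3x3 conv layers
            stride = 1 if i == 0 else 2                   # 64 -> 64 -> 32 -> 16 -> 8 -> 4
            enc += [nn.Conv2d(chans[i], chans[i + 1], 3, stride, 1),
                    nn.BatchNorm2d(chans[i + 1]),
                    nn.LeakyReLU(0.2, inplace=True)]
        self.encoder = nn.Sequential(*enc)
        dchans = [512, 256, 128, 64, 1]
        dec = []
        for i in range(4):                                # four 4x4 deconv layers
            dec += [nn.ConvTranspose2d(dchans[i], dchans[i + 1], 4, 2, 1)]
            if i < 3:                                     # BN + activation except last
                dec += [nn.BatchNorm2d(dchans[i + 1]), nn.LeakyReLU(0.2, inplace=True)]
        dec += [nn.Tanh()]                                # output compressed to [-1, 1]
        self.decoder = nn.Sequential(*dec)

    def forward(self, img, labels):
        # Broadcast the one-hot label over the 64x64 plane as extra input channels
        onehot = torch.eye(NUM_CLASSES, device=img.device)[labels]
        lmap = onehot[:, :, None, None].expand(-1, -1, 64, 64)
        return self.decoder(self.encoder(torch.cat([img, lmap], dim=1)))
```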

Fig. 4. The architecture of (a) generator and (b) discriminator.

2.3.3 Discriminator

The discriminator shown in Fig. 4(b) is primarily inspired by the design of the auxiliary classifier GAN (ACGAN) [26]. It comprises four convolutional layers and two fully connected layers. As in the generator, the input consists of labels together with pre-processed generated or real light-intensity images. After the convolution blocks, two distinct pathways lead to different outputs: one is a fully connected layer followed by a Sigmoid activation, producing discrimination results in the range 0 to 1; the other is a classification module composed of a fully connected layer and a Softmax layer, producing the final classification. A matching sketch follows.
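
As with the generator, the two output heads below follow the text, while the conv channel widths and the label conditioning are assumptions.

```python
# Discriminator sketch: four conv layers with two output heads, per ACGAN.
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        chans = [1 + NUM_CLASSES, 64, 128, 256, 512]
        blocks = []
        for i in range(4):                                # four conv layers, 64 -> 4
            blocks += [nn.Conv2d(chans[i], chans[i + 1], 3, 2, 1),
                       nn.BatchNorm2d(chans[i + 1]),
                       nn.LeakyReLU(0.2, inplace=True)]
        self.features = nn.Sequential(*blocks, nn.Flatten())
        feat_dim = 512 * 4 * 4
        self.adv_head = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())
        self.cls_head = nn.Sequential(nn.Linear(feat_dim, NUM_CLASSES),
                                      nn.Softmax(dim=1))

    def forward(self, img, labels):
        onehot = torch.eye(NUM_CLASSES, device=img.device)[labels]
        lmap = onehot[:, :, None, None].expand(-1, -1, 64, 64)
        f = self.features(torch.cat([img, lmap], dim=1))
        return self.adv_head(f), self.cls_head(f)         # D(x) in (0,1), class probs
```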

3. Numerical results and analysis

The implementation details of the proposed adaptive demodulation system are discussed in this section, along with an evaluation of the network's performance. Section 3.1 describes how the training data are obtained, and Section 3.2 presents the network training process, including the training method and the performance of the trained network. Section 3.3 evaluates the system by passing multiple gray images through it to test its reliability; the results indicate that the proposed network maintains a recognition accuracy above 94% and a peak signal-to-noise ratio (PSNR) above 27.5 dB even under severe AT conditions. Finally, Section 3.4 discusses how to improve the performance further.

3.1 Collection of data sets

To obtain the training data for the network, we transmit a series of distinct superimposed original OAM modes through a simulated AT channel and collect the corresponding received intensity patterns. The parameters of the AT channel and vortex beam are shown in Table 1. To simulate a practical communication environment and enhance the model's generalization capability, we randomly vary the atmospheric refractive index structure constant $C_n^2$ from 1 × 10−16 to 1 × 10−12 $\textrm{m}^{-2/3}$ during data collection. The training set comprises a total of 57,600 images, with each category containing 1800 transmitted and received patterns, which are available in Dataset 1 [27]. The data generation is implemented in MATLAB R2022b, and the network is trained with Python 3.9.18 and PyTorch 2.1.1 (CPU: Intel Xeon CPU E5-2686 v4; GPU: NVIDIA GeForce RTX 4090). Training takes about 1 hour.

3.2 Network training and results

For network training, we run 200 epochs. During the initial 20 epochs, the distorted intensity distributions and the real labels are used as generator inputs, while original and generated intensity patterns combined with real labels are used as discriminator inputs. At this stage, the generator is not yet robust, so labels guide it in transforming each distorted pattern into its corresponding pure pattern. In practical applications, however, only intensity patterns are received and no labels are available, so after 20 epochs a randomly generated label replaces the true label. This modification forces the model to learn enough information from the transmitted image alone and to generate high-quality images regardless of the given label.

The discriminator benefits from the real labels during the initial 25 epochs, which facilitates faster convergence and enhances the generator's performance. However, such labels may not be available in real scenarios. Extensive simulation results indicate that even with randomly assigned labels, the discriminator can effectively accomplish its classification task when presented with high-quality images. Although the generator's performance fluctuates when the given label changes at the 20th epoch, it readjusts within a few epochs and again generates high-quality recovered intensity distributions. Consequently, we replace real labels with random labels for the discriminator after the 25th epoch. From this point on, the training setup matches real-world conditions. A sketch of this two-stage label schedule is shown below.
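
The sketch below captures only the label-switching logic at epochs 20 and 25; `loader`, `G` and `D` refer to the hypothetical components sketched earlier, and the optimizer steps are elided.

```python
# Two-stage label schedule (epoch thresholds 20/25 from the text);
# the loop structure and data pipeline are assumptions.
import torch

for epoch in range(200):
    for real_img, distorted_img, true_lbl in loader:      # hypothetical DataLoader
        # Generator: real labels for the first 20 epochs, random ones after
        g_lbl = true_lbl if epoch < 20 else torch.randint(0, NUM_CLASSES, true_lbl.shape)
        fake_img = G(distorted_img, g_lbl)
        # Discriminator: real labels for the first 25 epochs, random ones after
        d_lbl = true_lbl if epoch < 25 else torch.randint(0, NUM_CLASSES, true_lbl.shape)
        d_real, c_real = D(real_img, d_lbl)
        d_fake, c_fake = D(fake_img.detach(), d_lbl)
        # ... compute Eqs. (5)-(6), step the D optimizer, then the G optimizer
```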

The loss functions of the generator, $L_G$, and the discriminator, $L_D$, are expressed respectively as

$$L_G = E[-\log P(s = real\,|\,x_{fake}) - \log P(C = c\,|\,x_{fake})],$$
$$L_D = L_S + L_C = \{ E[-\log P(s = real\,|\,x_{real}) - \log P(s = fake\,|\,x_{fake})] + E[-\log P(C = c\,|\,x_{real}) - \log P(C = c\,|\,x_{fake})] \}/2,$$
where $L_D$ consists of the discriminant loss $L_S$ and the classification loss $L_C$. The chosen optimizer is Adam with a learning rate of 0.0002, and the batch size is 16. One way to realize these losses is sketched below.
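
Since the discriminator heads output probabilities, binary cross-entropy and negative log-likelihood on the log-probabilities realize the $-\log P$ terms of Eqs. (5)-(6); this pairing is our reading, not code from the paper.

```python
# ACGAN-style losses matching Eqs. (5)-(6).
import torch
import torch.nn.functional as F

EPS = 1e-8  # numerical guard before taking logs

def d_loss(d_real, d_fake, c_real, c_fake, labels):
    ls = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
         F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))   # -log P(s | x)
    lc = F.nll_loss(torch.log(c_real + EPS), labels) + \
         F.nll_loss(torch.log(c_fake + EPS), labels)                # -log P(C = c | x)
    return (ls + lc) / 2                                            # Eq. (6)

def g_loss(d_fake, c_fake, labels):
    return F.binary_cross_entropy(d_fake, torch.ones_like(d_fake)) + \
           F.nll_loss(torch.log(c_fake + EPS), labels)              # Eq. (5)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)   # Adam, lr = 0.0002 as stated
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
```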

We give three examples to demonstrate the network's performance. Superimposed OAM modes with l = (-5), (-2, 3) and (1, -5) are shown in Fig. 5(a)-(c), respectively. In each subgraph, the first column displays the original OAM beams, the second the distorted beams under different ATs, and the third the beams recovered by our network. Each row represents a different turbulence strength, with $C_n^2$ = 1.00 × 10−14, 1.00 × 10−13, 4.45 × 10−13 and 6.00 × 10−13 $\textrm{m}^{-2/3}$ in order. Evidently, the designed ccGAN network effectively restores the distorted OAM beams to their original appearance, with exceptional performance gains in classification.

Fig. 5. Recovery of OAM modes under different AT conditions. The OAM mode is (a) l = -5, (b) l = -2, 3 and (c) l = 1, -5. Each subgraph consists of three columns: the first shows the original beams, the second the distorted beams under different ATs, and the third the beams recovered by our network. Each row corresponds to a specific turbulence strength, namely $C_n^2$ = 1.00 × 10−14, 1.00 × 10−13, 4.45 × 10−13 and 6.00 × 10−13 $\textrm{m}^{-2/3}$ from top to bottom.

Figure 6 shows the curves recorded during training. Figure 6(a) is the accuracy curve for recognizing both real and generated data, while Fig. 6(b) displays the mean square error (MSE) between generated and real images, also recorded throughout training. The MSE is defined as

$$MSE(a,b) = \frac{1}{{M \times N}}\sum\limits_{x = 0}^{N - 1} {\sum\limits_{y = 0}^{M - 1} {{{[{a(x,y) - b(x,y)} ]}^2},} }$$
where a(x, y) and b(x, y) are the intensities at (x, y) of the original and generated images, respectively. The MSE gradually decreases and the accuracy improves as training progresses, eventually reaching a stable state. The final MSE is 0.00047, while the accuracy rates on real and generated data are 99.99% and 99.03%, respectively. The fluctuation in the generated-data curve around the 25th epoch is attributed to the transition of training labels from real to random.

Fig. 6. (a) The accuracy curves for recognizing both the real and generated data. (b) The mean square error (MSE) between the generated images and real images.

3.3 Performance evaluation of adaptive demodulator

To evaluate the performance of our adaptive demodulator in a realistic scenario, we transmit six different grayscale images. Each pixel of these grayscale images consists of 8 bits; the first 4 bits are modulated into one light beam and the remaining 4 bits into another. We first chose three levels of AT strength, $C_n^2$ = 3.00 × 10−13, 4.45 × 10−13 and 6.00 × 10−13 $\textrm{m}^{-2/3}$, to test the model under extreme conditions. Additionally, to better simulate a real scene, we let $C_n^2$ be selected randomly from 1 × 10−16 to 1 × 10−12 $\textrm{m}^{-2/3}$. The test images are resized to 100 × 100 pixels, thus requiring a total of 20,000 intensity patterns per image for transmission.

Figures 7(a)-(f) show the received images under different ATs for comparison. To enhance credibility, we calculate the recognition accuracy rate for each image and introduce the peak signal-to-noise ratio (PSNR) as an evaluation metric. The PSNR is defined as

$$PSNR = 10\lg \left[ \frac{maxVal^2}{MSE(a,b)} \right],$$
where MSE is defined in Eq. (7) and maxVal is the maximum possible pixel value. The higher the PSNR, the better the image quality. These values are listed in Table 2. Both metrics are sketched below.
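
A direct transcription of Eqs. (7)-(8); setting maxVal = 1.0 assumes images normalized to [0, 1].

```python
# Evaluation metrics per Eqs. (7)-(8).
import numpy as np

def mse(a, b):
    return np.mean((a - b) ** 2)                      # Eq. (7)

def psnr(a, b, max_val=1.0):
    return 10 * np.log10(max_val ** 2 / mse(a, b))    # Eq. (8); higher is better
```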

Table 2. Recognition accuracy rates and PSNRs of the received images under different ATs

Fig. 7. The testing images received by our demodulator under different ATs.

The average recognition accuracy rates under the four conditions are 0.9928, 0.9795, 0.9490 and 0.9783, respectively, while the average PSNRs are 37.08, 32.47, 27.68 and 31.83 dB. Compared to the existing method [11], our approach achieves an increase of up to 9.98% in recognition rate at $C_n^2$ = 4.45 × 10−13 $\textrm{m}^{-2/3}$, with similar gains in PSNR, indicating its effectiveness under severe ATs. Moreover, the improvement in [11] is achieved by combining a network with subsequent image processing, which does not fundamentally enhance communication quality. Furthermore, our network at $C_n^2$ = 6 × 10−13 $\textrm{m}^{-2/3}$ outperforms, in recognition rate, the aforementioned method at $C_n^2$ = 5 × 10−13 $\textrm{m}^{-2/3}$. Additionally, we achieve an average recognition rate of 0.9783 and a PSNR of 31.83 dB under randomly selected atmospheric turbulence, proving the reliability of our method. The effectiveness of the conditional GAN network can be attributed to several inherent characteristics: (1) the mutually beneficial interaction between the opposing networks; (2) the introduction of labels, which accelerates convergence and improves accuracy; (3) the stronger training supervision provided by the discriminator compared to a single loss function. As a result, the proposed system can be deployed without classifying turbulence intensity or calculating phase patterns; it can be treated as a black box requiring no physical devices.

Fig. 8. The crosstalk matrix of OAM modes under $C_n^2 = 6 \times 10^{-13}\ \textrm{m}^{-2/3}$.

3.4 Discussion

The above method is relatively versatile because the problem being solved is at the communication level, specifically reducing the bit-error-rate (BER). In this section, we explore an alternative approach to enhance the performance of the OAM-SK-FSO system. Figure 8 presents the crosstalk matrix of OAM modes at $C_n^2$ = 6 × 10−13 $\textrm{m}^{-2/3}$, which reveals distinct recognition accuracy rates for different superimposed OAM modes. By calculating the probability of each symbol at the source, we can use the OAM modes with higher accuracy to represent the symbols with higher probability. Furthermore, each OAM mode has a different probability of being misassigned to other modes: mode 4, for instance, is more likely to be misclassified as modes 8 and 2, while mode 2 tends to be recognized as modes 4 and 3. Inspired by the Gray code, the mapping between codes and OAM modes can be adjusted to further reduce the cost of identification errors (a toy sketch of this assignment idea follows this paragraph). Another way to improve the recognition accuracy is to choose more diverse intensity patterns: for instance, superimposing more distinct topological charges generates more distinctive OAM intensity patterns that are easier for neural networks to recognize. However, the specific method must consider the characteristics of the source and the channel condition on a case-by-case basis.
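
The assignment step can be illustrated as follows; the symbol probabilities and per-mode accuracies are random placeholders, not measured values from Fig. 8.

```python
# Toy sketch: give high-probability source symbols the most reliably
# recognized OAM states (placeholder statistics, not measured data).
import numpy as np

rng = np.random.default_rng(0)
symbol_prob = rng.dirichlet(np.ones(16))          # hypothetical source statistics
mode_accuracy = rng.uniform(0.90, 1.00, 16)       # placeholder per-mode accuracies

# Most frequent symbol -> most reliably recognized OAM state, and so on
mapping = dict(zip(np.argsort(-symbol_prob), np.argsort(-mode_accuracy)))
```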

Despite achieving a higher recognition accuracy rate under strong atmospheric turbulence compared to other existing methods, our proposed network does have certain limitations.

  • 1) Although the well-trained ccGAN network effectively recovers distorted OAM beams, the randomly assigned labels in the second training stage introduce uncertainty into the network. In the early stage of training we introduce labels to provide additional information for network learning, with good results; however, real labels are unobtainable in practical communication scenarios, necessitating their eventual replacement. This uncertainty can lead to crosstalk between patterns, particularly under strong perturbation, when diverse OAM patterns become highly ambiguous. In other words, it is a compromise to mitigate the impact of ATs, and further exploration is required to find a more rational and robust solution.
  • 2) Compared with other networks, generative adversarial networks (GANs) are inherently difficult to train stably, even when aided by labels. Further research is required into methods for enhancing the stability of GAN-series networks.
  • 3) Deep learning-assisted communication links demonstrate superior performance compared to traditional schemes. However, the lack of real-time interfaces between communication devices and computer software hinders the practical, online implementation of such proposals. Moreover, incorporating neural networks into the handling of communication data adds cost to digital signal processing. Therefore, further research is necessary to explore low-cost ways of connecting communication devices with neural networks.

Indeed, deep learning-based OAM recognition can outperform traditional methods such as loading inverse phase plates, using Dammann gratings or fabricating corresponding devices. However, because deep learning is inherently a statistical regression technique, neural networks tend to accomplish tasks through numerical approximations based on training data rather than physical models, so the network may achieve an excellent recognition rate only for OAM patterns included in the training set, which seems less concise and substantial. Nevertheless, we believe this should not be considered a significant issue in OAM-based FSO communications, for the following reasons:

  • 1) In most communication scenarios, the source information is known in advance, so the categories of OAM modes should remain consistent in a stable communication link. Since the strength of atmospheric turbulence constantly changes from weak to strong, OAM-based FSO communication faces significant challenges; consequently, we believe that prioritizing anti-interference capability over model flexibility and generalization is crucial.
  • 2) To mitigate the limitation that our network requires fixed and known OAM modes during training, and to enhance its effectiveness in practical applications, transfer learning can be used to adapt the architecture; that is, the network's parameters can be fine-tuned by adjusting the inputs and outputs to accommodate different scenarios.

4. Conclusion

In this paper, we propose a novel adaptive optical demodulation system based on a deep learning method for the OAM-SK-FSO system. We employ a conditional convolutional GAN (ccGAN) network to effectively recover the distorted intensity patterns and assign them to their specified classes. To evaluate the performance of our network, we measure the average recognition accuracy rate and PSNR of transmitted gray images. The recognition accuracy rates are 0.9928, 0.9795 and 0.9490, and the PSNR values are 37.08, 32.47 and 27.68 dB, when the atmospheric refractive index structure constant $C_n^2$ is 3.00 × 10−13, 4.45 × 10−13 and 6.00 × 10−13 $\textrm{m}^{-2/3}$, respectively. Furthermore, to simulate real scenarios more accurately, we also apply randomly selected $C_n^2$ values ranging from 1.00 × 10−16 to 1.00 × 10−12 $\textrm{m}^{-2/3}$; under this condition, the achieved accuracy rate is approximately 0.9783 with a corresponding PSNR of 31.83 dB. In conclusion, this approach may facilitate the future development of OAM-SK-FSO communication.

Funding

National Key Research and Development Program of China (2021YFB3601404); National Natural Science Foundation of China (61675238, 61775244, 61975247).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Dataset 1, Ref. [27].

References

1. L. Allen, M. W. Beijersbergen, R. Spreeuw, et al., “Orbital angular momentum of light and the transformation of Laguerre-Gaussian laser modes,” Phys. Rev. A 45(11), 8185–8189 (1992). [CrossRef]  

2. Y. J. Shen, X. J. Wang, Z. W. Xie, et al., “Optical vortices 30 years on: OAM manipulation from topological charge to multiple singularities,” Light Sci. Appl. 8(1), 90 (2019). [CrossRef]

3. G. Gibson, J. Courtial, M. J. Padgett, et al., “Free-space information transfer using light beams carrying orbital angular momentum,” Opt. Express 12(22), 5448–5456 (2004). [CrossRef]  

4. C. H. Kai, P. Huang, F. Shen, et al., “Orbital angular momentum shift keying based optical communication system,” IEEE Photonics J. 9(2), 1–10 (2017). [CrossRef]  

5. S. Fu, S. Zhang, T. Wang, et al., “Pre-turbulence compensation of orbital angular momentum beams based on a probe and the Gerchberg–Saxton algorithm,” Opt. Lett. 41(14), 3185–3188 (2016). [CrossRef]  

6. G. Xie, Y. Ren, H. Huang, et al., “Phase correction for a distorted orbital angular momentum beam using a Zernike polynomials-based stochastic-parallel-gradient-descent algorithm,” Opt. Lett. 40(7), 1197–1200 (2015). [CrossRef]  

7. Z. Li, J. Shi, Y. Zhao, et al., “Deep learning based end-to-end visible light communication with an in-band channel modeling strategy,” Opt. Express 30(16), 28905–28921 (2022). [CrossRef]  

8. W. Chen, M. Zhang, D. Wang, et al., “Deep learning-based channel modeling for free space optical communications,” J. Lightwave Technol. 41(1), 183–198 (2023). [CrossRef]  

9. J. Li, M. Zhang, D. S. Wang, et al., “Joint atmospheric turbulence detection and adaptive demodulation technique using the CNN for the OAM-FSO communication,” Opt. Express 26(8), 10494–10508 (2018). [CrossRef]  

10. Q. H. Tian, Z. Li, K. Hu, et al., “Turbo-coded 16-ary OAM shift keying FSO communication system combining the CNN-based adaptive demodulator,” Opt. Express 26(21), 27849–27864 (2018). [CrossRef]  

11. Z. K. Li, J. B. Su, and X. H. Zhao, “Two-step system for image receiving in OAM-SK-FSO link,” Opt. Express 28(21), 30520–30541 (2020). [CrossRef]  

12. J. M. Liu, P. P. Wang, X. K. Zhang, et al., “Deep learning based atmospheric turbulence compensation for orbital angular momentum beam distortion and communication,” Opt. Express 27(12), 16671–16688 (2019). [CrossRef]  

13. H. Zhou, Z. Pan, M. I. Dedo, et al., “High-efficiency and high-precision identification of transmitting orbital angular momentum modes in atmospheric turbulence based on an improved convolutional neural network,” J. Opt. 23(6), 065701 (2021). [CrossRef]  

14. Z. Li, X. Li, H. Jia, et al., “High-efficiency anti-interference OAM-FSO communication system based on Phase compression and improved CNN,” Opt. Commun. 537, 129120 (2023). [CrossRef]  

15. M. I. Dedo, Z. Wang, K. Guo, et al., “OAM mode recognition based on joint scheme of combining the Gerchberg-Saxton (GS) algorithm and convolutional neural network (CNN),” Opt. Commun. 456, 124696 (2020). [CrossRef]  

16. B. Merabet, B. Liu, Z. Li, et al., “Vision transformers motivating superior OAM mode recognition in optical communications,” Opt. Express 31(23), 38958–38969 (2023). [CrossRef]  

17. Y. Zhai, S. Fu, J. Zhang, et al., “Turbulence aberration correction for vector vortex beams using deep neural networks on experimental data,” Opt. Express 28(5), 7515–7527 (2020). [CrossRef]  

18. Y. Wu, A. Wang, and L. Zhu, “Direct prediction and compensation of atmospheric turbulence for free-space integer and fractional order OAM multiplexed transmission links,” Opt. Express 31(22), 36078–36095 (2023). [CrossRef]  

19. W. Xiong, P. Wang, M. Cheng, et al., “Convolutional neural network based atmospheric turbulence compensation for optical orbital angular momentum multiplexing,” J. Lightwave Technol. 38(7), 1712–1721 (2020). [CrossRef]  

20. H. Zhan, Y. Peng, B. Chen, et al., “Diffractive deep neural network based adaptive optics scheme for vortex beam in oceanic turbulence,” Opt. Express 30(13), 23305–23317 (2022). [CrossRef]  

21. C. D. Lu, Q. H. Tian, X. J. Xin, et al., “Jointly recognizing OAM mode and compensating wavefront distortion using one convolutional neural network,” Opt. Express 28(25), 37936–37945 (2020). [CrossRef]  

22. L. C. Andrews and R. L. Phillips, Laser Beam Propagation Through Random Media, 2nd ed. (SPIE, 2005).

23. I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems 27 (2014).

24. M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv, arXiv:1411.1784 (2014). [CrossRef]  

25. G. E. Hinton, A. Krizhevsky, and S. D. Wang, “Transforming auto-encoders,” in Artificial Neural Networks and Machine Learning–ICANN 2011: 21st International Conference on Artificial Neural Networks, Espoo, Finland, June 14-17, 2011, Proceedings, Part I 21, (Springer, 2011), 44–51.

26. A. Odena, C. Olah, and J. Shlens, “Conditional image synthesis with auxiliary classifier gans,” in International conference on machine learning, (PMLR, 2017), 2642–2651.

27. Z. Han, “Datasets and code for ccGAN based OAM-SK-FSO communication system,” figshare (2023), https://doi.org/10.6084/m9.figshare.24779619

Supplementary Material (1)

Dataset 1. Contains a folder, a Python file and a readme file. The folder 'train_set' contains subfolders 'transmitted' and 'received', which hold the original OAM intensity patterns and the distorted patterns, respectively.



