Abstract
Future wireless networks are planned to service many applications with an Ultra Low-Latency (ULL) requirement. Numerous 6G systems have been proposed, including more traditional electro-magnetic (EM) antenna transmissions and optical wireless communications (OWC). For extremely wide-band operation, the traditional approaches require digital pre-distortion and other processing techniques which, in turn, require more computational resources and processing time, thus increasing latency. Alternatively, OWC has the potential for extremely wide bandwidths in 6G without the need for as much digital signal processing. In order to realise ULL performance, a minimum number of digital signal processing (DSP) blocks is required, as well as an optimal design of each of these. In this letter, we propose a DSP solution for ULL and peak-to-average power ratio (PAPR) reduction for OWC systems. Unitary checkerboard precoding - orthogonal frequency division multiplexing (UCP-OFDM) is chosen as the modulation scheme and has been implemented within a single digital block, avoiding the use of standard OFDM which would otherwise require multiple digital blocks. Experimentally validated results successfully demonstrate a 2.21184 GSps wireless link at distances of up to 2 m in noisy daylight settings. A bit error rate (BER) of 0 at a root mean square (RMS) error vector magnitude (EVM) of 4.09% is achieved. The complete digital line-up of the OWC transmitter chain in this work contains only three core blocks and achieves a ULL of less than 400 ns.
Published by Optica Publishing Group under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
Corrections
25 October 2023: A correction was made to the author listing.
1. Introduction
One of the promises of 6G is ultra low-latency (ULL), which is invaluable for time-critical applications such as high-resolution virtual reality online play and medical devices. It is also essential that the higher throughput of data is received correctly in these scenarios, where the consequences of a system output are paramount. However, 6G systems with antennas can be susceptible to electromagnetic interference (EMI) from neighbouring devices. To resolve this potential issue, optical front-ends, which are EMI immune, can be used in lieu of mmWave antennas [1]. In turn, the vast unlicensed optical spectrum also provides the capability of even wider bandwidths and increased user access than 6G standards have set. The terahertz (THz) spectrum falls within the range of just over 1 THz to 10 THz which encompasses the infrared spectrum range. This is one of the major areas of focus for free-space optical system designers due to the commercial availability of vertical cavity surface emitting lasers (VCSELs), which are suited to transmitting the modulation schemes used in OWC at the required higher data rates.
As free-space optical wireless communications (OWC) become more prominent in mainstream research due to the development of front-end technologies, greater distances at higher data rates have been accomplished. In the early stages of free-space OWC, [2] presented an Ethernet to visible light communications (VLC) link with a maximum range of 3 m at 2 Mbps. In more recent years, [3] presented real-time audio streaming at 3 m. Typically, high quality audio is output at 320 kbps. Despite the progress in these papers, they do not contain data on performance parameters such as error vector magnitude (EVM) or bit error rate (BER). Within [4], a field programmable gate array (FPGA) design operating at 30 Mbps is achieved. In [5], the furthest distance reached with a BER of 0 at 2 Mbps is 40 cm. In this case, however, the main focus is dynamic user access and resource allocation. Another VLC configuration is shown in [6], where 2.2 Gbps is validated at 8 m through an air-water medium with a BER of $3.8\times 10^{-3}$. The latency in this system is also reported at 0.8 $\mu$s. Within [7], a 5.5 Gbps air-water OWC system based on offline orthogonal frequency division multiplexing (OFDM) is shown with a BER of $2.47\times 10^{-3}$ at 26 m using power loading. Many research works have been completed in air-water systems, such as in [8–11]. In all cases, visible light from laser diodes (LD) is utilized. The International Electrotechnical Commission (IEC) 60825-12 specifies that only class 1 and 1M LD are eye-safe for infinite exposure [12]. Within [13], an emulated 20 km FPGA-based communications link is presented; however, this work is completed with a PXIe Software Defined Radio platform and no latency, DSP or FPGA architecture is reported. Unlike previously published works regarding FPGA DSP implementation, this letter presents a ULL solution specific to 6G applications in over-the-air environments.
Although 6G KPIs have yet to be released, this letter works under the assumption of a 10x improvement over previous generations, such as 5G, giving a latency goal of $\sim 0.1$ ms [14].
To achieve these greater data rates, OFDM is frequently used. Despite its advantages, it has often been criticized for its high peak to average power ratio (PAPR). A high PAPR will result in a degradation in the performance of the optical front-ends as higher peaks cause a reduction in the average power when compared to a full-scale output. It also has a net result of causing harmonics and inter-modulation distortion in the received (Rx) signal. Without any post-processing to rectify this characteristic, the PAPR of OFDM, which can lead to power inefficiency and signal distortions from non-linear hardware components, can reach 10-13 dB [15]. Many techniques have been proposed to reduce the high PAPR of OFDM signals such as signal distortion, crest factor reduction, multiple signalling and probabilistic methods and coding [16]. However, these methods often lead to an increase in hardware complexity and impose a trade-off between performance and resource management. As such, unitary checkerboard precoded (UCP)-OFDM [17] is integrated into the FPGA transmitter (Tx) digital signal processing (DSP) to avoid such high PAPR levels associated with typical OFDM-based OWC signal structures and lower resource demands within the FPGA implementation. In contrast to [18], a precoding matrix for UCP-OFDM is utilized to combine OFDM generation and PAPR reduction.
Present obstacles in 6G advancement encompass establishing high-speed digital architectures capable of supporting exceedingly wide bandwidths, alongside the implementation of appropriate modulation schemes and signal design for optical communications. In this letter, a ULL optical Tx DSP is instantiated on an FPGA and its performance is experimentally verified for real-world scenarios in line-of-sight applications for 6G. An extremely low latency is achieved through the combination of precoding and inverse fast Fourier transform (IFFT) stages in the generation of the OFDM signal, as well as the optimization of the finite impulse response (FIR) filter structures. The solution also successfully demonstrates the viability of a novel precoding technique for OFDM to provide low PAPR signals for optical front-ends and, as such, allows the optical front-end to operate in a more efficient manner.
The remainder of the letter is organized as follows: Section 2 provides details on the principles of the proposed low-latency Tx DSP, Section 3 expands on these principles in terms of the digital line-up of the proposed solution, and Section 4 describes the experimental setup and implementation of the 6G OWC system. Finally, Section 5 concludes this letter with discussions and final comments.
2. FPGA instantiation principles
2.1 Constellation mapping
To reduce the overall latency of the transceiver system, 8-Phase Shift Keying (PSK) is utilized, similar to the circularly shaped constellations of typical geometrically modified optical fiber communications [19]. As the constant modulus removes the constraint of amplitude rectification on the OWC Rx side, the block demapping process relies solely on accurate phase boundary definitions, removing computational processes in the Rx. An offset of 22.5$^{\circ }$ also reduces the number of amplitude levels in the constellation from $3$ to $2$.
As 8-PSK exhibits a constant modulus, the bits can be obtained solely from the signed Rx demodulated symbol. By examining the signs of the real and imaginary parts of the signal, the Rx demodulated symbol can be assigned to a quadrant containing 2 candidate points. This ambiguity is removed by deciding where the demodulated symbol lies with respect to the bisector of the corresponding quadrant. The absolute differences between the real and imaginary parts of the fixed constellation points and those of the Rx constellation point are then found and saved. Based on the quadrant, these values are added together and compared to find the smallest distance from a constellation point. The Rx demodulated symbol is then mapped to the nearest constellation point and output for further processing.
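The sign-and-bisector decision described above can be illustrated with a minimal Python sketch. It assumes the I/Q levels of 106 and 256 quoted in Section 3.1; the function names are hypothetical, and the code is a software illustration of the decision rule rather than the RTL used in this work. A brute-force nearest-point search is included only as a cross-check.

```python
LOW, HIGH = 106, 256  # I/Q magnitudes quoted in Section 3.1


def demap_offset_8psk(i_rx: float, q_rx: float) -> tuple[int, int]:
    """Pick the nearest offset 8-PSK point from two sign tests and one comparison."""
    sign_i = 1 if i_rx >= 0 else -1  # sign of the real part selects left/right half
    sign_q = 1 if q_rx >= 0 else -1  # sign of the imaginary part selects upper/lower half
    # Bisector test: |I| >= |Q| means the symbol lies closer to the real axis.
    if abs(i_rx) >= abs(q_rx):
        return sign_i * HIGH, sign_q * LOW
    return sign_i * LOW, sign_q * HIGH


def nearest_brute_force(i_rx: float, q_rx: float) -> tuple[int, int]:
    """Reference demapper: exhaustive squared-distance search over all 8 points."""
    points = [
        (si * a, sq * b)
        for si in (1, -1)
        for sq in (1, -1)
        for a, b in ((HIGH, LOW), (LOW, HIGH))
    ]
    return min(points, key=lambda p: (p[0] - i_rx) ** 2 + (p[1] - q_rx) ** 2)


if __name__ == "__main__":
    for sample in [(200, 50), (-30, 120), (-250, -100), (90, -300)]:
        print(sample, demap_offset_8psk(*sample), nearest_brute_force(*sample))
```

Because all eight points share the same modulus, the decision regions are 45$^{\circ }$ sectors bounded by the axes and the quadrant bisectors, which is why no amplitude comparison against a threshold is needed.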
2.2 Unitary checkerboard precoding
The physical properties of VCSELs and their increasing commercial availability allow OWC to be realised as intensity modulated/direct detection (IM/DD) systems which could compete with traditional RF front-ends in new 6G standards. As light is incoherent, information can only be transmitted and received by the encoding and decoding of light intensity at the optical front-end. As such, only a limited number of modulation and multiplexing techniques can be adapted from traditional RF systems for use in OWC systems. On-off keying, pulse position modulation and m-pulse amplitude modulation techniques proved successful within early OWC prototypes; however, as the demands for higher bandwidths and data rates increased, inter-symbol interference (ISI) became a more prevalent issue. Therefore, OFDM is suggested as a more appropriate modulation scheme. However, as mentioned previously, OFDM signals are often critiqued for their characteristically high PAPR. The PAPR of an OFDM block can be expressed as:
$$\mathrm{PAPR} = \frac{|x_{peak}|^{2}}{x_{RMS}^{2}}$$
where $|\cdot |$ is the absolute value, $x_{peak}$ is the maximum value within the signal, $x$, and $x_{RMS}$ is the root mean square (RMS) value. For each OFDM block, the RMS can be obtained from:
$$x_{RMS} = \sqrt{\frac{1}{N}\sum _{n=0}^{N-1}|x[n]|^{2}}$$
with $N$ samples $x[n]$ in the block.
A lower PAPR equates to a higher power efficiency and access to the front-end's full-scale output range. To overcome this high PAPR issue, UCP-OFDM is proposed within [17], which achieves PAPR levels near those of unmodulated constellation symbols. A precoding matrix provides full control over null sub-carriers and allows for the signal spectrum fluctuations around DC to be eliminated. This is ensured by a spectral mask matrix, given by
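Returning to the PAPR definition at the start of this section, the following Python sketch computes the ratio of peak power to mean power of a sample block in dB. The helper name `papr_db` and the example sample values are illustrative assumptions and are not taken from the experiment.

```python
import math


def papr_db(samples: list[complex]) -> float:
    """Peak-to-average power ratio of one block of samples, in dB."""
    powers = [abs(v) ** 2 for v in samples]
    peak_power = max(powers)               # |x_peak|^2
    mean_power = sum(powers) / len(powers)  # x_RMS^2
    return 10.0 * math.log10(peak_power / mean_power)


if __name__ == "__main__":
    # A constant-modulus block has no peaks above its average power.
    print(papr_db([1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]))
    # A single large sample raises the peak relative to the mean.
    print(papr_db([1, 1, 1, 3]))
```

A block whose samples all share the same magnitude yields 0 dB, which is the reference point against which the PAPR of a precoded or plain OFDM block can be compared.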
3. System architecture
An overview of the system architecture is described in this section. For reference, the format of the OWC symbols which construct the OWC frame is shown in Fig. 1. Please note that R$_{1/2/3/4}$ represent the bit resolution within the FPGA at each module stage, as shown in Fig. 2. This is a key factor in FPGA design that can influence the output accuracy: each stage has the potential to truncate signal values and affect the performance of other dependent modules which inherit information. For the purposes of this letter and the validation of the low-latency FPGA Tx DSP presented, fundamental OWC system parameters are tabulated in Table 1. An Ethernet back-haul is also presumed, such as in [2]. The data is encapsulated wholly and transformed into an OWC compatible signal. The Ethernet frame is segmented into packet lengths of even multiples of the total Ethernet frame length and then passed as UCP-OFDM symbols in parallel through the Tx DSP chain. The peripheral blocks controlling the RFDC IP, available from Xilinx, are not shown.
In Fig. 1, $P$ is the OWC frame preamble, $H$ is the OWC frame header, $s_P$ are the OWC payload symbols, where $P$ is the total number of payload symbols, $e_E$ are the OWC empty symbols, where $E$ is the total number of OWC frame symbols, $L$ represents the cyclic prefix (CP) length and $N$ is the number of sub-carriers per payload OWC symbol. An entire OWC frame consists of $E$ OWC symbols which are, in this case, synonymous with UCP-OFDM symbols. Please note that the preamble and header are constructed in the same manner and can contain any information required about the OWC network within the system. Within this Tx DSP chain, the preamble is used for synchronisation, as well as channel estimation and equalization. The fundamental blocks of the low-latency FPGA-based UCP-OFDM OWC Tx DSP are displayed in Fig. 2.
In all stages of the Tx DSP, the elements of each symbol are processed in parallel, while whole symbols are processed serially. Control registers are used between modules to indicate traffic flow and prevent write/read errors, as indicated by the flags in Fig. 3.
3.1 Block mapping
8-PSK is implemented using FPGA register transfer level (RTL) Verilog coding. For an offset 8-PSK constellation, the levels of I and Q are related by Eq. (5).
In this case, the levels are $\pm 106$ and $\pm 256$. The potential maximum value of any constellation point in decimal is given by Eq. (6).
To this end, for each tribit (consisting of 3 $R_1$ bits), the I and Q values are output to the precoding module, where the real components are held in the first half of the output array (index 1:127) and the imaginary components are held in the second half of the output array (index 129:255). The centre and starting sub-carriers are nulled.
The latency of the system within block mapping is entirely dependent on the incoming data stream from the Ethernet back-haul. As data is mapped on the clocking edge for all $381$ bits at once, the only time delay is from the collection of data within a serial-to-parallel conversion.
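The array layout described in this subsection can be sketched in Python as follows. The sketch assumes that the index ranges 1:127 and 129:255 are inclusive, so that 127 tribits (381 bits) fill a 256-element array with indices 0 and 128 left at zero. The assignment of the three bits to signs and levels in `map_tribit` is a hypothetical choice made for illustration; the mapping used in the RTL is not stated here.

```python
LOW, HIGH = 106, 256   # I/Q levels quoted above
N_TRIBITS = 127        # 381 bits / 3 bits per symbol
ARRAY_LEN = 256        # indices 0..255; 0 and 128 stay nulled


def map_tribit(b2: int, b1: int, b0: int) -> tuple[int, int]:
    """Hypothetical bit assignment: b2 -> sign of I, b1 -> sign of Q, b0 -> which axis is larger."""
    i_mag, q_mag = (HIGH, LOW) if b0 == 0 else (LOW, HIGH)
    return (-i_mag if b2 else i_mag), (-q_mag if b1 else q_mag)


def block_map(bits: list[int]) -> list[int]:
    """Map 381 bits to the 256-element array handed to the precoding stage."""
    if len(bits) != 3 * N_TRIBITS:
        raise ValueError("expected 381 bits")
    out = [0] * ARRAY_LEN
    for k in range(N_TRIBITS):
        i_val, q_val = map_tribit(*bits[3 * k:3 * k + 3])
        out[1 + k] = i_val      # real parts occupy indices 1..127
        out[129 + k] = q_val    # imaginary parts occupy indices 129..255
    return out


if __name__ == "__main__":
    mapped = block_map([0] * 381)
    print(mapped[0], mapped[1], mapped[128], mapped[129])
```

In hardware every tribit is mapped on the same clock edge; the loop here only stands in for that parallel assignment.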
3.2 Precoding
The latency of precoding is minimized due to the combination of DSP processes and parallel processing. UCP-OFDM is implemented in this module through matrix operations, and due to its innate behaviour, the entire process can be simplified down to three stages as it combines precoding and IFFT operations: 2 multiplication modules and 1 addition module. There are two main matrices which create $E$: $V_{r}$ and $U_{r}\Sigma _{r}$, such that
The first stage involves multiplying the OWC symbol elements by $V_r$, which feeds into the second stage where the symbol samples are multiplied by $U_{r}\Sigma _{r}$. The last stage is a simple addition, completing the precoding process. In order to ensure that the symbol elements are clocked in and out of the module correctly, first-in first-out (FIFO) buffers are used to control the flow of data. To decrease the latency of precoding, a finite state machine (FSM) is utilized to control the flow of logic between each stage. Matrix multiplication is also split up into parallel modules which house the different rows of the matrices $V_{r}$ and $U_{r}\Sigma _{r}$. Their respective outputs are then appended onto wires for later stages. The DSP takes advantage of the time delays inherent in matrix operations to progress with each payload symbol as it becomes available from block mapping (i.e., as one payload symbol completes the first multiplication module, the data is passed to the second multiplication module, allowing the next payload symbol concurrent access to the first multiplication module).
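The overlap between the two multiplication modules can be illustrated with a small tick-by-tick model in Python. This is a sketch under simplifying assumptions: each stage is taken to need one tick per symbol, the small integer matrices `first` and `second` merely stand in for $V_r$ and $U_{r}\Sigma _{r}$, and the final addition stage is omitted because its exact form is not specified in the text. All function names are hypothetical.

```python
def matvec(matrix: list[list[int]], vector: list[int]) -> list[int]:
    """Plain matrix-vector product; each row could be a parallel module in hardware."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def two_stage_pipeline(first, second, symbols):
    """Stage 1 applies `first`, stage 2 applies `second`; both may be busy in the same tick."""
    pending = list(symbols)
    in_flight = None            # stage-1 result waiting for stage 2
    outputs, schedule = [], []
    tick = 0
    while pending or in_flight is not None:
        stage2_busy = in_flight is not None
        if stage2_busy:
            outputs.append(matvec(second, in_flight))
        stage1_busy = bool(pending)
        in_flight = matvec(first, pending.pop(0)) if stage1_busy else None
        schedule.append((tick, stage1_busy, stage2_busy))
        tick += 1
    return outputs, schedule


if __name__ == "__main__":
    first = [[1, 2], [0, 1]]
    second = [[2, 0], [1, 1]]
    outs, sched = two_stage_pipeline(first, second, [[1, 1], [2, 0], [0, 3]])
    print(outs)
    for entry in sched:
        print(entry)
```

In this model, three symbols finish in four ticks rather than the six that strictly serial processing of two stages would need, while the outputs are identical to applying the two matrices one after the other.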
UCP-OFDM also has a key sparsity characteristic which reduces the total number of operations required to $2NZ$, where $Z$ is the total number of null sub-carriers. By comparison, typical Radix-2 FFT computations amount to $N\log _2(N)$ operations. The signal designer can choose to deterministically truncate the output data bits, $R_3$, at the output of the third stage, or after each precoding multiplication or addition stage.
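The two operation counts can be compared numerically with a short Python sketch. The values $N = 256$ and $Z = 2$ used in the example are assumptions chosen for illustration (a 256-element array with two nulled sub-carriers, as suggested by Section 3.1), and the helper names are hypothetical.

```python
def ucp_ops(n: int, z: int) -> int:
    """Operation count 2NZ quoted for the sparse UCP-OFDM precoder."""
    return 2 * n * z


def radix2_fft_ops(n: int) -> int:
    """Operation count N*log2(N) quoted for a radix-2 FFT; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return n * (n.bit_length() - 1)


if __name__ == "__main__":
    n, z = 256, 2  # illustrative assumptions, see lead-in
    print(ucp_ops(n, z), radix2_fft_ops(n))
```

Under these two expressions the precoder needs fewer operations whenever $2Z < \log _2(N)$, so the advantage depends on keeping the number of null sub-carriers small relative to the transform size.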
3.3 Up-sampling and cyclic prefix
Before serial-to-parallel conversion, which ensures compatibility with the Advanced eXtensible Interface (AXI)-Stream bus, a CP of length $L$ is added to the elements of each UCP-OFDM symbol. This is completed by copying data straight onto wires. These extended UCP-OFDM symbols are then up-sampled by a factor of $U$ by using FIR filters of the form:
The reduction in the overall latency of the OWC Tx DSP is also achieved through the parallelization of the FIR structures. Each FIR structure is identical and can be reused with the correct respective inputs. To minimize processing times, super-sampling is employed. This is a hybrid method between parallel and serial connections where multiple segments of the data are processed at one time. Using a shifter that cuts the data into slices, the data can be processed at a higher rate of Eq. (9). This is a reduction of $(N+L)-\frac {N+L}{slices}$ clock cycles when compared to serial calculations.
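The clock-cycle saving quoted above can be checked with a few lines of Python. The values $N = 256$, $L = 32$ and 18 slices in the example are hypothetical and chosen only so that the numbers divide evenly; the helper name is also an assumption.

```python
def super_sampling_cycles(n: int, cp_len: int, slices: int) -> tuple[int, int, int]:
    """Return (serial cycles, sliced cycles, cycles saved) for one extended symbol."""
    serial = n + cp_len
    sliced = -(-serial // slices)  # ceiling division: a partial slice still costs a cycle
    return serial, sliced, serial - sliced


if __name__ == "__main__":
    print(super_sampling_cycles(256, 32, 18))
```

When $N+L$ is an exact multiple of the number of slices, the saving returned here equals $(N+L)-\frac {N+L}{slices}$; otherwise the ceiling accounts for the final, partially filled slice.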
As shown in Fig. 2, $FL$ denotes the length of the discrete filter bank, $FB$ is the number of filter banks, $S$ is the signal sample and $h$ is the filter coefficient.
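A software sketch of the CP insertion and FIR up-sampling is given below in Python. It assumes a standard zero-stuffing interpolator and its polyphase decomposition into $U$ independent sub-filters, which is one common way of splitting an interpolation filter into parallel banks; whether this matches the filter-bank structure of Fig. 2 exactly is an assumption. The tap values and function names are hypothetical.

```python
def add_cyclic_prefix(symbol: list, cp_len: int) -> list:
    """Copy the last cp_len elements of the symbol to its front."""
    return symbol[-cp_len:] + symbol


def upsample_direct(samples: list, taps: list, factor: int) -> list:
    """Zero-stuff by `factor`, then filter: y[n] = sum_k h[k] * x_up[n - k]."""
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0] * (factor - 1))
    return [
        sum(h * stuffed[n - k] for k, h in enumerate(taps) if n - k >= 0)
        for n in range(len(stuffed))
    ]


def upsample_polyphase(samples: list, taps: list, factor: int) -> list:
    """Same output from `factor` short sub-filters that never multiply by a stuffed zero."""
    banks = [taps[p::factor] for p in range(factor)]  # bank p holds h[p], h[p+U], ...
    out = [0] * (len(samples) * factor)
    for p, bank in enumerate(banks):
        for m in range(len(samples)):
            out[m * factor + p] = sum(
                h * samples[m - j] for j, h in enumerate(bank) if m - j >= 0
            )
    return out


if __name__ == "__main__":
    extended = add_cyclic_prefix([1, 2, 3, 4], 2)
    taps = [1, 2, 3, 2, 1]  # hypothetical coefficients
    print(extended)
    print(upsample_polyphase(extended, taps, 2))
```

Each sub-filter depends only on the input samples, so the banks can run side by side, which is the property that makes identical, reusable FIR structures attractive for a parallel implementation.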
4. Experimental setup and implementation
In this section, the experimental setup and detailed hardware implementation are presented. Please note that in this case:
Shown in Fig. 2 are the 3 core blocks of this ULL Tx DSP for 6G OWC. There are no delays added to block mapping, as the latency of this module is entirely dependent on the data being clocked into the FPGA. The data then propagates into precoding, where multiple processes are combined. The FIR filters then employ super-sampling and parallelized structures to minimise time delays and overall latency. The experimental test-bench is displayed in Fig. 4 and related to the block diagram in Fig. 5. As described visually, the Ethernet frames are fed into the Tx DSP modules, whose output is subsequently input into the RFDC IP core block available from Xilinx and passed to the digital-to-analog converter (DAC). The Rx signal is then captured by a photo-detector and an oscilloscope with in-built analog-to-digital converters for offline processing. The output data rate for this experimental setup is 2.21184 GSps. This is calculated from the AXI-stream interface, which outputs $18$ serialized parallel data streams at a rate of 122.88 MHz ($18 \times 122.88$ MHz $= 2.21184$ GSps). Considering the output bit rate with DAC resolution, this equates to 35.38944 Gbps. The path loss of the system in this case was $\sim$15 dB, calculated from Eq. (10), and the optical output power was $\sim$250 mW.
$$PL = 10\log _{10}\left(\frac{P_{tx}}{P_{rx}}\right) \qquad (10)$$
where $P_{tx}$ and $P_{rx}$ are the powers of the Tx and Rx signals, respectively.
4.1 Transmitter DSP output validation
In order to validate the function of the Tx DSP, the FPGA output from the RTL is compared to a model representation of the system within MATLAB. These results confirmed the correct waveform with a calculated BER of 0. The error vector magnitude (EVM) is calculated as in [20], according to Eq. (11).
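For illustration, an RMS EVM figure of the kind reported in this letter can be computed from received and reference constellation points with a short Python sketch. It assumes the commonly used definition in which the RMS error is normalised by the RMS power of the reference symbols; the exact normalisation of Eq. (11) may differ, and the function name and sample values are hypothetical.

```python
import math


def evm_rms_percent(received: list[complex], reference: list[complex]) -> float:
    """RMS error vector magnitude, as a percentage of the RMS reference amplitude."""
    if len(received) != len(reference) or not reference:
        raise ValueError("need two equally sized, non-empty symbol lists")
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(error_power / reference_power)


if __name__ == "__main__":
    ideal = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
    noisy = [s + (0.03 + 0.04j) for s in ideal]  # constant error of magnitude 0.05
    print(evm_rms_percent(noisy, ideal))
```

In the example every symbol is displaced by an error vector of magnitude 0.05 relative to unit-amplitude references, so the result is close to 5%.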
4.2 Latency
In the context of this letter, latency refers to the time taken for an OWC symbol to be produced from the input Ethernet binary bits, from start to end of the Tx DSP main module. This is signified on the timing waveforms shown in Fig. 3 by a number of flags: Block_Map_start_flag, Precoding_start_flag and FIR_CP_start_flag. Each flag at logic level "1" represents the start of processing at the specified sub-module: block mapping, precoding, and up-sampling and CP, respectively. The final data out is denoted by m_axis_tdata[143:0]. For this experiment, the global clock frequency is set to 122.88 MHz (or 8.14 ns per clock period). In the worst-case scenario at start-up, the total latency for the Tx DSP system is 716.15 ns. More generally, the Tx DSP initially completes in 88 clock cycles, as the reset flag is asserted and all registers are occupied. For each individual Tx DSP module the latency, in clock cycles at start-up, is:
Once the registers have filled and the process has begun, the steady-state latency for the Tx DSP is 48 clock cycles or, in this case, approximately 390.72 ns ($48 \times 8.14$ ns). This is due to the parallel processing and combination of DSP practices fundamental to UCP-OFDM symbols, as well as the control FSM, which allows for minimum delays when later inputs are accessible. It should be noted that the data rate, bandwidth and overall latency (in terms of seconds) of this DSP design are only limited by the hardware and global clock, depending on the requirements of the system.
5. Conclusions
In this letter, an ultra low-latency FPGA-based optical Tx DSP for 6G is demonstrated using the RFSoC2x2. The low-latency Tx DSP system is achieved by combining the precoding and IFFT operations, as well as by parallelizing the FIR structures within the OWC signal generation, resulting in the minimization of required DSP blocks. Super-sampling and FSMs also contribute to the reduction in latency as the data is processed. UCP-OFDM delivers a lower PAPR of 5.6 dB, in this case using 8-PSK, which allows for angle corrections only at the Rx. This will also reduce future work within a transceiver which can operate at a higher power efficiency, which will be presented in a follow-up paper. The results of an EVM of 4.09% and a BER of 0 within a noisy setting verify the feasibility and effectiveness of this ULL Tx DSP for OWC, which completes in less than 400 ns, half the time taken by state-of-the-art proposals and well within the $\sim 0.1$ ms expected for 6G KPIs.
Funding
This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) and is co-funded under the European Regional Development Fund under Grant Number (13/RC/2077_P2); Open access funding provided by Irish Research eLibrary. This work was also in part funded by Nokia-Bell Labs, Ireland.
Acknowledgments
The authors would like to thank the collaboration with Prof. Harald Haas, Dr. Stefan Videv, and Adrian Sparks, Strathclyde University, and PureLiFi Ltd., Scotland, UK.
Disclosures
The authors declare that there are no conflicts of interest related to this article.
Data availability
Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.
References
1. A. K. Maini, Handbook of Defence Electron. and Optronics: Fundamentals, Technol. and Sys. (Wiley Telecom., 2018).
2. F. Delgado, I. Quintana, J. Rufo, J. Rabadan, C. Quintana, and R. Perez-Jimenez, “Design and Implementation of an Ethernet-VLC Interface for Broadcast Transmissions,” IEEE Commun. Lett. 14(12), 1089–1091 (2010).
3. M. S. Gujar, S. Velankar, and A. Chavan, “Realtime Audio Streaming using visible Light Communication,” in 2016 Int. Conf. on Inventive Comput. Technol. (ICICT), vol. 3 (2016), pp. 1–3.
4. Y. Jia, M. Zhang, Y. Huang, and Y. Zhang, “Design of an FPGA Based Visible Light Communication System,” in 2014 12th Int. Conf. on Opt. Internet 2014 (COIN), (2014), pp. 1–2.
5. M. Zeng, Z. Kaixiong, C. Gong, S. Lou, X. Jin, and Z. Xu, “Design and Demonstration of an Indoor Visible Light Communication Network with Dynamic User Access and Resource Allocation,” in 2017 9th Int. Conf. on Wireless Commun. and Signal Process. (WCSP), (2017), pp. 1–6.
6. Y. Shao, R. Deng, J. He, K. Wu, and L.-K. Chen, “Digital 4K Video Transmission over Realtime Water-Air OWC-OFDM System with Low-Complexity Transmitter-Side DSP,” in 45th European Conf. on Opt. Commun. (ECOC 2019), (2019), pp. 1–4.
7. C. Yifei, M. Kong, T. Ali, J. Wang, R. Sarwar, J. Han, C. Guo, B. Sun, N. Deng, and J. Xu, “26 m/5.5 Gbps Air-Water Optical Wireless Communication based on an OFDM-modulated 520-nm Laser Diode,” Opt. Express 25(13), 14760–14765 (2017).
8. G. Cossu, A. Sturniolo, A. Messa, S. Grechi, D. Costa, A. Bartolini, D. Scaradozzi, A. Caiti, and E. Ciaramella, “Sea-Trial of Optical Ethernet Modems for Underwater Wireless Communications,” J. Lightwave Technol. 36(23), 5371–5380 (2018).
9. P. Wang, C. Li, and Z. Xu, “A Cost-Efficient Real-Time 25 Mb/s System for LED-UOWC: Design, Channel Coding, FPGA Implementation, and Characterization,” J. Lightwave Technol. 36(13), 2627–2637 (2018).
10. W.-S. Tsai, C.-Y. Li, H.-H. Lu, Y.-F. Lu, S.-C. Tu, and Y.-C. Huang, “256 Gb/s Four-Channel SDM-Based PAM4 FSO-UWOC Convergent System,” IEEE Photonics J. 11(2), 1–8 (2019).
11. C.-Y. Li, X.-H. Huang, H.-H. Lu, Y.-C. Huang, Q.-P. Huang, and S.-C. Tu, “A WDM PAM4 FSO-UWOC Integrated System With a Channel Capacity of 100 Gb/s,” J. Lightwave Technol. 38(7), 1766–1776 (2020).
12. International Electrotechnical Commission (IEC), Safety of laser products - Part 12: Safety of Free Space Optical Communication Systems Used for Transmission of Information (2019).
13. H.-B. Jeon, S.-M. Kim, H.-J. Moon, D.-H. Kwon, J.-W. Lee, J.-M. Chung, S.-K. Han, C.-B. Chae, and M.-S. Alouini, “Free-space optical communications for 6g wireless networks: Challenges, opportunities, and prototype validation,” IEEE Commun. Mag. 61(4), 116–121 (2023).
14. B. Aazhang, P. Ahokangas, H. Alves, et al., Key drivers and research challenges for 6G ubiquitous wireless intelligence (white paper) (2019).
15. M. Parker, Digital Signal Processing 101: Everything You Need to Know to Get Started (Elsevier, Oxford, 2017), 2nd ed.
16. M. Bisht and A. Joshi, “Various techniques to reduce PAPR in OFDM systems: A survey,” Int. J. Signal Process. Image Process. Pattern Recognit. 8(11), 195–206 (2015).
17. T. E. Abrudan, S. Kucera, and H. Claussen, “Unitary Checkerboard Precoded OFDM for low-PAPR Optical Wireless Communications,” J. Opt. Commun. Netw. 14(4), 153–164 (2022).
18. Y. Shao, R. Deng, J. He, K. Wu, and L.-K. Chen, “Real-Time 2.2-Gb/s Water-Air OFDM-OWC System With Low-Complexity Transmitter-Side DSP,” J. Lightwave Technol. 38(20), 5668–5675 (2020).
19. J. Cho and P. J. Winzer, “Probabilistic Constellation Shaping for Optical Fiber Communications,” J. Lightwave Technol. 37(6), 1590–1607 (2019).
20. J. Winzer, “Evolved Universal Terrestrial Radio Access (E-UTRA); Base Station (BS) radio transmission and reception,” Tech. Rep. 3GPP TS 36.104, ETSI 3rd Generation Partnership Project (3GPP), France (Sep 2015).