Wideband dynamic behavioral modeling of reflective semiconductor optical amplifiers using a tapped-delay multilayer perceptron

Zhansheng Liu; Manuel Alberto Violas; Nuno Borges Carvalho

doi:10.1364/OE.21.003354

1. Introduction

The demand for large bandwidth and high data rates in wireless communication systems is growing rapidly, driven by the sharp increase in the number of mobile users and the expansion of wireless applications. The integration of optical fiber and wireless communication has become the development trend for future communication systems [1], [2]. Radio over fiber (RoF) transmission is currently finding popular application in hybrid optical-wireless access networks. One of the main applications is the distributed antenna system (DAS). In a DAS, a large number of remote antenna units (RAUs) are geographically distributed over a large area and connected to a central unit (CU) via optical fiber [3], [4] in order to provide enhanced cellular coverage and capacity. The DAS can also improve the spectral efficiency of the cellular system, especially at the cell edge.

Semiconductor optical amplifiers (SOAs) and reflective SOAs (RSOAs) have recently attracted much research interest because of their nonlinear, amplification, and modulation properties [4–9]. In [4], three types of optical link for the uplink direction were presented and compared. The RSOA link can provide a more flexible and dynamically reconfigurable optical network architecture owing to its colorless property.

An accurate mathematical model contributes to the optimal design of devices and to the prediction of their characteristics. In [6], a steady-state physical model of the SOA was developed. The Levenberg-Marquardt algorithm was used to extract the parameters of the physical model. In [7], a simple physical model for pulse propagation in the RSOA was proposed to predict its gain dynamics, the spatial dependence of the pulse shape, and the dynamic chirp. Previously we also developed a multi-section physical model of the bulk RSOA used as an external modulator in radio over fiber systems [9]. The model was implemented using the symbolically defined device (SDD) component in Advanced Design System (ADS). The model parameters were extracted by fitting the measured data with the ADS optimization tools. All of these models require knowledge of physical parameters, which are difficult to measure.

Unlike physical models, behavioral models, also called black-box models, focus on the relation between the input and output signals of devices or systems. Artificial neural network (ANN) based behavioral models have been widely used to model radio frequency (RF) power amplifiers [10], [11], erbium-doped fiber amplifiers (EDFAs) [12] and quantum-dot SOAs (QD-SOAs) [13] because of their generality and accuracy. In [13], the pulse amplification and four-wave mixing (FWM) characteristics of the QD-SOA were investigated using a multilayer perceptron (MLP) model.

In this paper we use a simple tapped-delay MLP (TDMLP) to model the RSOA modulator, which can be used in the RoF uplink. The model has been trained and validated with measured, sampled input and output data of wideband modulated signals. The dynamic nonlinear distortion and memory effects of the RSOA modulator have been investigated based on the TDMLP model.

2. TDMLP based behavioral model for RSOA

2.1 TDMLP model

In order to investigate the memory effect of the RSOA modulator, a single hidden layer TDMLP based model is proposed, as shown in Fig. 1. The transformation from the input layer to the hidden layer is nonlinear, while the transformation from the hidden layer to the output layer is purely linear. $b_{h, j}$ ( $j = 1, 2, ..., M$ ) are the biases in the hidden layer. $f (\cdot)$ is the activation function in the hidden layer, which is introduced in Subsection 2.3. $b_{o, k}$ ( $k = 1, 2$ in this case) are the biases in the output layer.

Fig. 1 TDMLP based model with 2 × (m + 1) input nodes, M hidden nodes and two output nodes; m is the memory depth and z⁻¹ denotes a unit delay.


The output $u_{j} (n)$ of the jth node in the hidden layer is given by

$$u_{j}(n) = f\left(\sum_{i = 0}^{m} \omega_{i, j}\, x(n - i) + b_{h, j}\right)$$

where $\omega_{i, j}$ is the weighting coefficient from the ith node in the input layer to the jth node in the hidden layer, and x can be $x_{I}$ or $x_{Q}$.

The output of the kth node in the output layer is given by

$$y_{k}(n) = \sum_{j = 1}^{M} \omega_{j, k}\, u_{j}(n) + b_{o, k}$$

where $\omega_{j, k}$ is the weighting coefficient from the jth node in the hidden layer to the kth node in the output layer, and $y_{k}$ is $y_{I}$ or $y_{Q}$ in our case.
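The two layer equations can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB implementation: the function name `tdmlp_forward`, the ordering of the taps in the input vector (all I taps followed by all Q taps) and the zero padding before the start of the record are assumptions made here.

```python
import math

def tdmlp_forward(x_i, x_q, n, w_in, b_h, w_out, b_o, m):
    """Evaluate a single-hidden-layer TDMLP at sample index n.

    x_i, x_q : lists of in-phase / quadrature input samples
    w_in     : M rows of 2*(m+1) input-to-hidden weights
    b_h      : M hidden-layer biases
    w_out    : 2 rows of M hidden-to-output weights
    b_o      : 2 output-layer biases
    m        : memory depth (number of unit delays per branch)
    """
    # Tapped-delay input: current sample plus m delayed samples of I and Q.
    # Assumption: samples before the start of the record are taken as zero.
    taps = [x_i[n - d] if n - d >= 0 else 0.0 for d in range(m + 1)]
    taps += [x_q[n - d] if n - d >= 0 else 0.0 for d in range(m + 1)]

    # Hidden layer: weighted sum plus bias, passed through tanh.
    u = [math.tanh(sum(w * x for w, x in zip(row, taps)) + b)
         for row, b in zip(w_in, b_h)]

    # Output layer: purely linear combination of the hidden outputs.
    return [sum(w * uj for w, uj in zip(row, u)) + b
            for row, b in zip(w_out, b_o)]  # [y_I(n), y_Q(n)]
```

With all weights set to zero the function returns the output biases, which is a quick sanity check on the indexing.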

All of the weighting coefficients $\omega_{i, j}$ , $\omega_{j, k}$ and bias parameters $b_{h, j}$ , $b_{o, k}$ need to be determined during the training phase. The number of hidden layers can be increased if needed. It has been shown mathematically, however, that a single hidden layer MLP is capable of uniformly approximating any continuous multivariate function to any desired degree of accuracy [14]. This implies that any failure of a function mapping must arise from an inadequate choice of parameters or an insufficient number of hidden nodes. More than one hidden layer is often used to reduce the total number of nodes in the hidden layers.

2.2 Overfitting

It is difficult to determine the optimal number of hidden nodes. A large number of nodes in the hidden layer leads to poor generalization of the TDMLP, because nets with too much capacity overfit the training data [15]. To avoid overfitting during the training phase, it is necessary to use additional techniques such as regularization, early stopping, or cross-validation [16]. Cross-validation is the most popular method; it achieves generalization by evaluating the performance of the model on a data set different from the one used for training [11].

2.3 Activation function

Efforts to obtain the best performance from an ANN based model also need to consider the neuronal activation scheme. Different choices of activation function result in different network models. In general, the activation function in the hidden layer is a nonlinear function. Anti-symmetric functions often yield faster convergence. The most common choice of activation function in the hidden layer is the anti-symmetric hyperbolic tangent, which is mathematically described as:

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

Among the reasons for this popularity are its boundedness in the interval (−1, 1) and the fast computability of the function and its derivative. It is also differentiable everywhere.
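The cheap derivative follows from the standard identity f'(x) = 1 − tanh²(x), so the slope can be obtained from the already computed output without another exponential. A minimal sketch (the helper name `tanh_and_derivative` is hypothetical):

```python
import math

def tanh_and_derivative(x):
    """Return f(x) = tanh(x) and f'(x) = 1 - tanh(x)**2."""
    f = math.tanh(x)
    # The derivative reuses f, so no extra exponential is evaluated.
    return f, 1.0 - f * f
```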

2.4 Back propagation learning algorithm (BPLA)

The back propagation learning algorithm (BPLA) is a popular method of training ANNs, especially MLP nets. The traditional BPLA, however, converges very slowly when compared with second-order algorithms. The Levenberg-Marquardt (LM) method was designed to approach second-order training speed and accuracy without computing the Hessian matrix [17]. The LM back propagation algorithm is often the fastest method for training moderate-sized feed-forward neural networks. It has been implemented in the MATLAB neural network toolbox.

In order to obtain the optimal weight and bias parameters, one minimizes the average error function, which is given by [16], [17]

$$V(\omega) = \frac{1}{2P} \sum_{p = 1}^{P} \sum_{i = 1}^{N} {[t_{i}^{p}(\omega) - y_{i}^{p}(\omega)]}^{2} = \frac{1}{2P} \sum_{p = 1}^{P} \sum_{i = 1}^{N} {[e_{i}^{p}(\omega)]}^{2}$$

where P is the number of training patterns, N is the number of output nodes (N is equal to 2 in our model), t is the target output, y is the model output, e is the network error, and $\omega$ is the vector of parameters (weights and biases) to be extracted. In our case, t and y are the in-phase parts when i is equal to 1 and the quadrature parts when i is equal to 2.
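As a small numerical illustration of this error function, the sketch below evaluates V for lists of target and model output patterns. The function name `average_error` and the list-of-lists data layout are assumptions made for the example.

```python
def average_error(targets, outputs):
    """V = 1/(2P) * sum over patterns p and outputs i of (t_i^p - y_i^p)^2.

    targets, outputs : lists of P patterns, each a list of N values
                       (N = 2 here: in-phase and quadrature parts).
    """
    P = len(targets)
    total = 0.0
    for t_p, y_p in zip(targets, outputs):
        # Squared error summed over the N output nodes of one pattern.
        total += sum((t - y) ** 2 for t, y in zip(t_p, y_p))
    return total / (2 * P)
```

For two patterns with one error of 0.5 each, the sum of squared errors is 0.5 and V = 0.5 / (2 · 2) = 0.125.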

For the LM algorithm, the weight and bias parameters are updated by [17]

$$\omega_{n + 1} = \omega_{n} - {[J^{T} J + \mu I]}^{-1} J^{T} e$$

where ${(\cdot)}^{T}$ denotes the transpose operation, n is the iteration index, $\omega$ denotes the vector of all training parameters (weights and biases), $J$ is the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases, $e$ is the vector of the network errors, $I$ is the identity matrix and $\mu$ is the learning parameter.
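To make the update rule concrete, the sketch below applies one LM step to a toy problem. Everything specific here is an assumption for illustration: the model is a straight line y = a·x + b rather than the TDMLP, μ is held fixed (practical implementations adapt it between iterations), and the helper names `solve` and `lm_step` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def lm_step(w, J, e, mu):
    """One update w - (J^T J + mu*I)^-1 J^T e."""
    n = len(w)
    # J^T J with mu added on the diagonal.
    A = [[sum(row[r] * row[c] for row in J) + (mu if r == c else 0.0)
          for c in range(n)] for r in range(n)]
    g = [sum(row[r] * ep for row, ep in zip(J, e)) for r in range(n)]  # J^T e
    d = solve(A, g)
    return [wi - di for wi, di in zip(w, d)]

# Toy data generated from t = 2x + 1; parameters w = [a, b] start at zero.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [2.0 * x + 1.0 for x in xs]
w = [0.0, 0.0]
e = [t - (w[0] * x + w[1]) for x, t in zip(xs, ts)]  # errors e = t - y
J = [[-x, -1.0] for x in xs]                         # de/da = -x, de/db = -1
w_new = lm_step(w, J, e, mu=1e-9)
```

Because the toy model is linear in its parameters and μ is nearly zero, a single step lands essentially on the least-squares solution a = 2, b = 1. A larger μ shortens the step and moves it toward the gradient-descent direction.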

3. Experimental and training

3.1 Experimental setup

The experimental test bench was set up as shown in Fig. 2. The test baseband signals were designed in MATLAB and fed into a vector signal generator (VSG) to be up-converted to the RF domain with a carrier frequency of 1 GHz. A commercial broadband amplifier (AMP), ZHL-42W, was used to drive the device under test (DUT). A vector signal analyzer (VSA) was used to capture the baseband input and output signals of the DUT, which were selected by a switch. For instance, connecting the output of the AMP directly to the VSA provides the measurement of the DUT input signal. The VSG and VSA were connected to a computer controller using general purpose interface bus (GPIB) cables. A 10 MHz reference and trigger signals between the VSG and VSA were used to synchronize and trigger the measurement events. Any error in the synchronization between generation and acquisition could lead to poor repeatability of measurements.

Fig. 2 The experimental test bench.


A commercial distributed feedback (DFB) laser, which was biased at 30 mA and had a wavelength of 1550 nm, was used as the seed light of the RSOA. By adjusting a tunable optical attenuator, the input optical power of the RSOA was set to −7 dBm, which made the RSOA operate under saturation [9]. The RSOA was biased at 90 mA. The forward and reverse signals of the RSOA were separated by an optical circulator. A photodiode (PD) with a responsivity of 0.8 A/W was used as a detector to convert the optical signals to electrical ones.

3.2 TDMLP training

The training of the TDMLP involves choosing a set of weights and biases to optimize the accuracy of the mapping. The TDMLP based model can be easily trained by the BPLA in MATLAB. To train the TDMLP, a random 64 quadrature amplitude modulation (QAM) signal at 20 Msymbol/s, filtered with a square root raised cosine (RRC) filter with a roll-off factor of $α = 0.22$ , was created at baseband in MATLAB. The 64 QAM signal, sampled at 80 MHz, was loaded into the VSG memory, up-converted to the RF band at 1 GHz and finally passed through the DUT. In our case the average RF input power of the DUT was set to 13 dBm, which is near the input 1 dB compression point of the RSOA. The complex baseband input and output signals of the DUT were captured by the VSA. After alignment and pre-processing, around 40 000 samples were available to train, validate and test the TDMLP model.

Using the cross-validation technique, the training, validation and testing of the TDMLP were carried out on different segments of the measured input and output signals. The first 10 000 samples were used for training the nets. The next 20 000 samples, which were different from those used for training, were used to validate the model. The remaining 10 000 samples were used to test the capacity of the model to predict the output of the RSOA modulator.
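The segmentation described above amounts to three contiguous slices of the aligned record. A minimal sketch, assuming the samples are held in a Python list in time order (the function name `split_record` is hypothetical):

```python
def split_record(samples, n_train=10_000, n_val=20_000):
    """Split a time-ordered record into contiguous, non-overlapping segments."""
    train = samples[:n_train]                    # first 10 000 samples
    val = samples[n_train:n_train + n_val]       # next 20 000 samples
    test = samples[n_train + n_val:]             # whatever remains
    return train, val, test
```

In practice the input and output records would be sliced with the same indices so that each input sample stays paired with its measured output.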

To evaluate the performance of the TDMLP based model in the time domain, the normalized mean square error (NMSE) in dB is used. It is the total power of the error vector between the measured and modeled outputs, normalized to the measured signal power:

$$\mathrm{NMSE} = 10 \log_{10} \frac{\sum_{n = 1}^{N} \left({[t_{I}(n) - y_{I}(n)]}^{2} + {[t_{Q}(n) - y_{Q}(n)]}^{2}\right)}{\sum_{n = 1}^{N} \left({[t_{I}(n)]}^{2} + {[t_{Q}(n)]}^{2}\right)}$$

where $t_{I}(n)$ and $t_{Q}(n)$ are the in-phase and quadrature parts of the measured output of the RSOA, respectively, $y_{I}(n)$ and $y_{Q}(n)$ are the in-phase and quadrature parts of the output of the model, respectively, and N is here the number of samples.
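The NMSE is straightforward to compute when each sample is stored as a complex number I + jQ, since the squared magnitude of the complex error equals the sum of the squared I and Q errors. A minimal sketch (the function name `nmse_db` and the complex-list layout are assumptions):

```python
import math

def nmse_db(t, y):
    """NMSE in dB between measured samples t(n) and modeled samples y(n).

    t, y : equal-length lists of complex baseband samples (I + jQ).
    """
    err = sum(abs(tn - yn) ** 2 for tn, yn in zip(t, y))  # error power
    ref = sum(abs(tn) ** 2 for tn in t)                    # measured power
    return 10.0 * math.log10(err / ref)
```

As a check, a model output that is 10% low in amplitude on every sample gives an error-to-signal power ratio of 0.01, i.e. −20 dB.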

4. Results and discussion

To determine the optimal values, several instances with different numbers of hidden nodes and memory depths were trained and validated. The performance of the TDMLP model versus memory depth and the number of hidden nodes is shown in Table 1.

Table 1. TDMLP model performance versus memory depth and the number of hidden nodes


The performance improves as the memory depth increases, but degrades when the memory depth is increased further. The performance of the TDMLP model also improves as the number of hidden nodes increases. With still more hidden nodes, however, the performance starts to degrade because the model overfits the training data set, which reduces the generalization of the TDMLP. This issue can be mitigated by enlarging the training data set. In our work we chose 20 hidden nodes and a memory depth of 3 for the following analysis; several configurations perform slightly better, but they require more weighting coefficients.
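The cost of the larger configurations can be estimated by counting trainable parameters. The sketch below assumes a fully connected network in which all 2 × (m + 1) input nodes feed every hidden node, as the caption of Fig. 1 suggests; the function name `tdmlp_parameter_count` is hypothetical.

```python
def tdmlp_parameter_count(m, M, n_out=2):
    """Number of weights and biases for memory depth m and M hidden nodes."""
    n_in = 2 * (m + 1)            # I and Q taps
    hidden = n_in * M + M         # input-to-hidden weights + hidden biases
    output = M * n_out + n_out    # hidden-to-output weights + output biases
    return hidden + output
```

Under this assumption the chosen configuration (m = 3, M = 20) has 222 parameters, while the configuration with m = 4 and M = 30 has 392.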

The training, validation and testing performance with 20 hidden nodes and a memory depth of 3 is shown in Fig. 3. The performance of the TDMLP improves with an increasing number of training epochs. The error on the validation and testing data sets initially decreases as underfitting is reduced, but eventually begins to increase again as overfitting occurs. A solution to ensure the best generalization of the ANN is early stopping: the network is trained on the training data set until the performance on the validation data set starts to deteriorate, at which point training is stopped (in our work we set the maximum number of validation failures to 10).
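The stopping rule can be sketched as a counter of consecutive epochs without improvement in the validation error. This is one common formulation and is an assumption here; the exact bookkeeping used by the MATLAB toolbox may differ, and the function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_errors, max_fail=10):
    """Return (stop_epoch, best_epoch) for per-epoch validation errors.

    Training stops once the validation error has failed to improve on its
    best value for max_fail consecutive epochs.
    """
    best = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, fails = err, epoch, 0  # new best: reset counter
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch           # never triggered
```

The parameters kept after training would be those from `best_epoch`, not those from the epoch at which training stopped.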

Fig. 3 The training performance of the TDMLP based model for the RSOA modulator with memory depth of 3 and hidden nodes of 20.


The time domain waveforms of the in-phase and quadrature parts of the RSOA output are shown in Fig. 4, where only the first 250 samples are shown for clarity. They indicate that the modeled output fits the measured data points well. The agreement is also seen in the frequency domain: the measured and modeled power spectral densities are shown in Fig. 5. After demodulation, the constellations of the transmitted, measured output and TDMLP model output symbols are depicted in Fig. 6. The error vector magnitudes (EVMs) for the measured and modeled output symbols are 5.70% and 5.64%, respectively, calculated with the following formula:

$$\mathrm{EVM} = \sqrt{\frac{\sum_{n = 1}^{N} {| s(n) - r(n) |}^{2}}{\sum_{n = 1}^{N} {| r(n) |}^{2}}}$$

where $r(n)$ is the reference transmitted symbol, $s(n)$ is the measured or modeled output symbol, and N is the number of symbols.
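The EVM formula translates directly into code when the symbols are stored as complex numbers. A minimal sketch (the function name `evm` is hypothetical; the result is a ratio, so multiply by 100 for a percentage):

```python
import math

def evm(s, r):
    """RMS error vector magnitude of symbols s against reference symbols r."""
    num = sum(abs(sn - rn) ** 2 for sn, rn in zip(s, r))  # error power
    den = sum(abs(rn) ** 2 for rn in r)                    # reference power
    return math.sqrt(num / den)
```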

Fig. 4 In-phase and quadrature parts of the measured and modeled outputs in the time domain.


Fig. 5 Normalized power spectral density of measured and modeled outputs.


Fig. 6 Normalized constellation (red ‘ + ’ for the transmitted symbol, blue ‘·’ for the measured output, and green ‘x’ for the TDMLP model output).


The nonlinear distortion and memory effect can be described by plotting the dynamic AM-AM and dynamic AM-PM characteristics, as shown in Fig. 7. The distortion induced by the RSOA modulator, including static nonlinearities and memory effects, is clearly visible. The results of the model match the measurements quite well. The phase changes were up to 7°. The TDMLP based model with memory can accurately predict the dynamic behavior of the RSOA modulator.
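The text does not state how the points of these characteristics were computed. A commonly used definition, assumed here, takes for each sample the input amplitude, the output amplitude (AM-AM) and the phase difference between output and input (AM-PM); with memory effects present, these points form a scattered cloud rather than a single curve. The function name `am_am_am_pm` is hypothetical.

```python
import cmath
import math

def am_am_am_pm(x, y):
    """Per-sample (input amplitude, output amplitude, phase shift in degrees).

    x, y : equal-length lists of complex baseband input / output samples.
    """
    points = []
    for xn, yn in zip(x, y):
        if abs(xn) == 0.0:
            continue  # phase shift is undefined for a zero-amplitude input
        # arg(y * conj(x)) is the output phase minus the input phase.
        phase_deg = math.degrees(cmath.phase(yn * xn.conjugate()))
        points.append((abs(xn), abs(yn), phase_deg))
    return points
```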

Fig. 7 Dynamic AM-AM and AM-PM characteristics of the RSOA modulator.


5. Conclusion

In this paper, a TDMLP based model has been proposed to accurately predict the dynamic nonlinear distortion and memory effects of the RSOA modulator in RoF links. Early stopping based on cross-validation preserves the generalization of the TDMLP based model by avoiding overfitting during the training phase. The NMSE between measurement and simulation reaches −44.33 dB when the number of hidden nodes and the memory depth are set to 20 and 3, respectively. The measurement and simulation results show that a single hidden layer in the TDMLP is enough to model the RSOA modulator. The dynamic AM-AM and dynamic AM-PM conversion characteristics of the RSOA modulator have also been demonstrated. The phase shift at high input power was up to 7°.

Acknowledgments

Z. Liu is sponsored by the Fundação para a Ciência e Tecnologia (FCT) under Ph.D Grant SFRH/BD/68376/2010, whose support is gratefully acknowledged.

References and links

1. J. Capmany and D. Novak, “Microwave photonics combines two worlds,” Nat. Photonics 1(6), 319–330 (2007).

2. A. J. Seeds, “Microwave Photonics,” IEEE Trans. Microw. Theory Tech. 50(3), 877–887 (2002).

3. D. Wake, A. Nkansah, and N. J. Gomes, “Radio over fiber link design for next generation wireless systems,” J. Lightwave Technol. 28(16), 2456–2464 (2010).

4. D. Wake, A. Nkansah, N. J. Gomes, G. de Valicourt, R. Brenot, M. Violas, Z. Liu, F. Ferreira, and S. Pato, “A comparison of radio over fiber link types for the support of wideband radio channels,” J. Lightwave Technol. 28(16), 2416–2422 (2010).

5. T. Durhuus, B. Mikkelsen, C. Joergensen, S. Lykke Danielsen, and K. E. Stubkjaer, “All-optical wavelength conversion by semiconductor optical amplifiers,” J. Lightwave Technol. 14(6), 942–954 (1996).

6. M. J. Connelly, “Wide-band steady-state numerical model and parameter extraction of a tensile-strained bulk semiconductor optical amplifier,” IEEE J. Quantum Electron. 43(1), 47–56 (2007).

7. M. J. Connelly, “Reflective semiconductor optical amplifier pulse propagation model,” IEEE Photon. Technol. Lett. 24(2), 95–97 (2012).

8. N. Nadarajah, K. L. Lee, and A. Nirmalathas, “Upstream access and local area networking in passive optical networks using self-seeded reflective semiconductor optical amplifier,” IEEE Photon. Technol. Lett. 19(19), 1559–1561 (2007).

9. Z. Liu, M. Sadeghi, G. de Valicourt, R. Brenot, and M. Violas, “Experimental validation of a reflective semiconductor optical amplifier model used as a modulator in radio over fiber systems,” IEEE Photon. Technol. Lett. 23(9), 576–578 (2011).

10. Y. Cao and Q. J. Zhang, “A new training approach for robust recurrent neural-network modeling of nonlinear circuits,” IEEE Trans. Microw. Theory Tech. 57(6), 1539–1553 (2009).

11. F. Mkadem and S. Boumaiza, “Physically inspired neural network model for RF power amplifier behavioral modeling and digital predistortion,” IEEE Trans. Microw. Theory Tech. 59(4), 913–923 (2011).

12. N. Vijayakumar and S. N. George, “Design optimization of erbium-doped fiber amplifiers using artificial neural networks,” Opt. Eng. 47(8), 085008 (2008).

13. J. I. Ababneh and O. Qasaimeh, “Simple model for quantum-dot semiconductor optical amplifiers using artificial neural networks,” IEEE Trans. Electron. Dev. 53(7), 1543–1550 (2006).

14. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Netw. 2(5), 359–366 (1989).

15. E. B. Baum and D. Haussler, “What size net gives valid generalization?” Neural Comput. 1(1), 151–160 (1989).

16. S. Haykin, Neural Networks: A Comprehensive Foundation (Prentice-Hall, 1999).

17. M. T. Hagan and M. B. Menhaj, “Training feedforward networks with the Marquardt Algorithm,” IEEE Trans. Neural Netw. 5(6), 989–993 (1994).

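Table 1 data: NMSE (dB) by memory depth (rows) and number of hidden nodes (columns)

| Memory depth | 5 | 10 | 15 | 20 | 25 | 30 | 35 |
|---|---|---|---|---|---|---|---|
| 0 | −28.62 | −28.64 | −28.65 | −28.66 | −28.64 | −28.64 | −28.65 |
| 1 | −30.49 | −36.80 | −37.03 | −37.08 | −37.08 | −37.09 | −37.11 |
| 2 | −31.05 | −41.68 | −41.83 | −41.79 | −42.21 | −42.22 | −42.42 |
| 3 | −41.96 | −43.01 | −43.73 | −44.33 | −44.31 | −43.46 | −44.12 |
| 4 | −42.37 | −43.62 | −44.18 | −44.21 | −44.37 | −44.39 | −44.32 |
| 5 | −41.56 | −44.13 | −44.24 | −43.88 | −44.38 | −44.36 | −44.29 |
| 6 | −41.22 | −44.05 | −44.23 | −43.53 | −43.95 | −44.19 | −44.35 |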


Optics Express