Optica Publishing Group

Spoken digit recognition utilizing a reservoir computing system based on mutually coupled VCSELs under optical injection

Open Access

Abstract

In this work, we propose a reservoir computing (RC) system based on mutually delay-coupled vertical-cavity surface-emitting lasers (MDC-VCSELs) under optical injection for processing a spoken digit recognition task, and numerically investigate its performance. In this system, two MDC-VCSELs serve as the two nonlinear nodes of the reservoir and perform a nonlinear mapping of the input information. Each spoken digit is preprocessed by two different masks to form two masked matrices; in each matrix, every column vector is appended to the preceding one to form a time-dependent series. The two series are then injected into the main polarization of the two VCSELs, respectively. The transient states of the two VCSELs, distributed over the whole coupling loop, are sampled for post-processing. By analyzing the influence of several key parameters on the system performance, we determine the optimized parameter regions for spoken digit recognition with high speed and low word error rate. The simulation results show that, at a processing rate of 1.1×10⁷ words per second, the word error rate (WER) can reach 0.02% for a dataset consisting of 5000 samples.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In today’s information-driven society, various efficient information processing technologies are emerging to deal with complicated tasks that are difficult for conventional computers to handle, such as chaotic time series prediction, speech recognition, and nonlinear channel equalization [1–3]. Among these technologies, reservoir computing (RC), which originated from the recurrent neural network (RNN), is a recently proposed brain-inspired paradigm [1,4–11]. In RC, the input weights and the internal connection weights are randomly fixed, and only the output weights need to be trained [3]. Therefore, compared with an RNN, RC has a faster and simpler training process. Information processing in RC relies on a reservoir layer that performs a nonlinear mapping from a low-dimensional input signal to a higher-dimensional reservoir state [2–3,12]. Conceptually, the nonlinear mapping can be implemented by many interconnected physical devices, called nonlinear nodes. However, using a large number of nonlinear nodes makes practical implementation challenging because of the relatively high cost and the difficulty of collecting the node states [2]. To solve this issue, delay-based RC systems have been proposed, in which a single nonlinear node under time-delayed feedback is taken as the reservoir [13–15], and the nonlinear transient responses of the system, sampled at equal time intervals along the feedback loop, are taken as the virtual node states. Delay-based RC systems can be divided into several categories, such as electronic delay-based RC [1], optoelectronic delay-based RC [14,16–17], and all-optical delay-based RC [7,13–18]. Owing to their easy hardware implementation, delay-based RC systems with a single nonlinear node have become a research hotspot in the RC field, and numerous relevant studies have been reported [19–26].
In particular, all-optical delay-based RC systems, in which a semiconductor laser (SL) is taken as the single nonlinear node, receive extra attention due to their high relaxation oscillation frequency [27]. However, when an SL-based RC with a single nonlinear node processes complicated tasks, a large number of data features must be extracted to achieve good performance, so hundreds or even thousands of virtual nodes are required. Since the minimum time interval between two adjacent virtual nodes is limited by the response time of the nonlinear node, a large number of virtual nodes results in a long data processing time [28]. In other words, it is difficult to achieve a high processing rate while maintaining good performance.

To improve the processing rate of SL-based RCs, some novel system schemes composed of several nonlinear nodes under time-delayed feedback (or coupling) have recently been proposed [3,29]. Theoretically, by combining the virtual node states of several reservoirs, the number of virtual nodes required for each reservoir can be reduced. Since the processing rate is inversely proportional to the number of virtual nodes in each feedback loop, the processing rate of an RC system with several nonlinear nodes can be increased. For instance, in 2018, our group proposed an RC system based on mutually delay-coupled semiconductor lasers (SLs) [3], which achieves a fast processing rate by adopting two time-delayed SLs as the reservoir. In 2020, Sugano et al. proposed a parallel RC system based on multiple SLs with optical feedback [29]. Their results show that this system can strike a balance between system complexity and data processing rate. In the schemes mentioned above, the lasers used are edge-emitting semiconductor lasers (EELs). Compared with EELs, vertical-cavity surface-emitting lasers (VCSELs) have many advantages, such as small size, high energy efficiency, low threshold current, and ease of photonic integration [30–31]. In particular, two orthogonal polarization components, namely the X-polarization component (X-PC) and the Y-polarization component (Y-PC), may coexist in a VCSEL. As a result, a VCSEL-based reservoir can provide much richer nonlinear dynamics than an EEL-based reservoir. In 2019, Guo et al. proposed an RC system based on two mutually delay-coupled VCSELs for performing a Santa-Fe chaotic time series prediction task, with a processing rate of up to 20 Gbps [32].

At present, all-optical delay-based RC systems with several nonlinear nodes under time-delayed feedback (or coupling) are mostly used to process the Santa-Fe chaotic time series prediction task and the nonlinear channel equalization task. However, to our knowledge, the performance of this type of RC system on the spoken digit recognition task has not been reported. In this work, we propose and numerically simulate a delay-based RC system for spoken digit recognition, in which two mutually delay-coupled VCSELs (MDC-VCSELs) are taken as the reservoir. By analyzing the influence of several key parameters on the system performance, we determine the optimized parameter regions. At a processing rate of 1.1×10⁷ words per second, the word error rate can be as low as 0.02% for a dataset consisting of 5000 samples.

2. System and theory models

Figure 1 is the schematic diagram of the proposed system based on two mutually delay-coupled VCSELs (MDC-VCSEL1 and MDC-VCSEL2) for the spoken digit recognition task. We perform the task on the TI-46 speech corpus [33]. Before being injected into the RC system, the original one-dimensional (1D) sound waveform is preprocessed by the Lyon ear model, which converts it into a 2D frequency-time matrix called the cochleagram matrix Mu [34]. Mu is a P×K matrix, where P is the number of frequency channels and K is the number of discrete time steps.

Fig. 1. Schematic diagram of the RC system based on mutually delay-coupled VCSELs (MDC-VCSELs). SL: semiconductor laser; MZM: Mach-Zehnder modulator; VCSEL: vertical-cavity surface-emitting laser; Mu: cochleagram matrix; W1(W2): mask matrix; S1(S2): masked spoken matrix; Q1(Q2): 1D time-dependent series.

As shown in Fig. 1, the system is composed of three parts: the input layer, the reservoir, and the output layer. In the input layer, Mu is converted into a masked input signal S1 (S2) by multiplying it by a mask matrix W1 (W2), which is an Nn×P matrix (Nn is the number of virtual nodes collected from each VCSEL within one coupling time). Each element of W1 (W2) is randomly chosen from the binary set {-1, 1}. As a result, the masked matrix S1 (S2) is an Nn×K matrix. By appending each column vector to the preceding one, a 1D time-dependent series Q1 (Q2) is obtained. Finally, via two Mach-Zehnder modulators (MZMs), Q1 and Q2 are loaded onto the optical wave provided by a semiconductor laser (SL) and then injected into VCSEL1 and VCSEL2, respectively. In the reservoir layer, the coupling time from VCSEL1 to VCSEL2 (from VCSEL2 to VCSEL1) is denoted as τ1 (τ2). For simplicity, we assume τ1 = τ2 = τ. The responses of the MDC-VCSELs sampled at consecutive intervals θ correspond to the states of consecutive virtual nodes. Here, we employ the synchronization scheme, i.e., θ = τ / Nn.
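The masking and serialization steps of the input layer can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions: the function names (`make_mask`, `mask_and_serialize`) and the small matrix sizes are hypothetical, and a random matrix stands in for a real cochleagram.

```python
import numpy as np

def make_mask(n_nodes, n_channels, rng):
    """Random binary mask with entries drawn from {-1, +1}, shape (Nn, P)."""
    return rng.choice([-1.0, 1.0], size=(n_nodes, n_channels))

def mask_and_serialize(cochleagram, mask):
    """Return the masked matrix S = W @ Mu (Nn x K) and the 1D series Q.

    Q is S read column by column, so the first Nn samples of Q are the
    first column of S, the next Nn samples the second column, and so on.
    """
    masked = mask @ cochleagram          # (Nn x P) @ (P x K) -> (Nn x K)
    series = masked.flatten(order="F")   # column-major: columns laid end to end
    return masked, series

# Illustrative sizes only; these values of P, K and Nn are made up for the demo.
rng = np.random.default_rng(0)
P, K, Nn = 8, 5, 4
Mu = rng.random((P, K))                  # stand-in for a cochleagram matrix
W1, W2 = make_mask(Nn, P, rng), make_mask(Nn, P, rng)
S1, Q1 = mask_and_serialize(Mu, W1)
S2, Q2 = mask_and_serialize(Mu, W2)
```

Each sample of Q1 (Q2) would then be held for one node interval θ, so that one column of S1 (S2) occupies exactly one coupling time τ.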

In the output layer, the total output intensity |E|² = |Ex|² + |Ey|² of VCSEL1 (VCSEL2) is extracted at intervals of θ and interpreted as a virtual node state, where |Ex|² and |Ey|² are the output intensities of the X polarization component (X-PC) and the Y polarization component (Y-PC), respectively. Within one coupling time τ, Nn virtual node states are collected in each coupling loop. For a spoken digit with K discrete time steps, two Nn×K node state matrices X1 and X2 are obtained. Vertically concatenating X1 and X2 gives a resulting state matrix X of size 2Nn×K. Since the speech corpus includes 500 spoken digits, there are 500 such state matrices. We choose 450 matrices for training and 50 matrices for testing. The training matrix Xtrain is obtained by horizontally concatenating the 450 training state matrices. We then use Xtrain to train the read-out weight matrix Wr so as to minimize the mean square error ||Wr·Xtrain − MT||². The optimal read-out weight matrix Wr is given by:

$${{\mathbf W}_\textrm{r}} = {{\mathbf M}_\textrm{T}} \cdot {\mathbf X}_\textrm{train}^\textrm{T}{\left( {{{\mathbf X}_\textrm{train}}{\mathbf X}_\textrm{train}^\textrm{T} - \lambda {\mathbf I}} \right)^{ - 1}} \tag{1}$$
where λ is the regression (regularization) parameter that compensates for the possibly ill-posed matrix inversion, and I is the identity matrix, whose dimension matches that of the product of Xtrain and its transpose (2Nn×2Nn). MT is a target matrix with ten rows, in which the entries are set to one in the row corresponding to the correct digit and to zero in all other rows.
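As a rough illustration of the read-out training, the sketch below solves the regularized least-squares problem with NumPy. It is not the authors' code: the function name `train_readout` is hypothetical, and the sketch adds λI to the Gram matrix, which is the usual Tikhonov convention for a positive λ; with the sign as printed in the formula above, this corresponds to a negative λ.

```python
import numpy as np

def train_readout(x_train, m_target, lam):
    """Ridge-regression read-out: Wr = M_T X^T (X X^T + lam * I)^(-1).

    x_train  : (2*Nn, T) horizontally concatenated node-state matrices
    m_target : (10, T) target matrix with ones in the correct-digit row
    lam      : positive ridge parameter (sign convention assumed, see above)
    """
    gram = x_train @ x_train.T + lam * np.eye(x_train.shape[0])
    # Solve Wr @ gram = M_T X^T without forming the inverse explicitly.
    # gram is symmetric, so Wr = solve(gram, (M_T X^T)^T)^T.
    return np.linalg.solve(gram, (m_target @ x_train.T).T).T
```

Using a linear solve instead of an explicit inverse is a design choice of the sketch for numerical robustness; the result is the same matrix when the Gram matrix is well conditioned.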

In the testing process, a computed matrix MCi is obtained for the i-th testing matrix Xtesti. As shown in Fig. 2, MCi is the product of the optimal read-out weight matrix Wr and the testing matrix Xtesti. The elements of each row of MCi are summed, and the row with the highest score gives the identified digit. By comparing the recognition results with the known labels, the number of errors is determined. The WER is the ratio of the number of errors to the total number of test samples. In this work, WER < 1% is taken as the criterion for good performance [14–15]. The WER values given below are the results of 10-fold cross-validation.
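The decision rule and the WER definition can be written compactly. The following is a minimal sketch with hypothetical function names (`classify_digit`, `word_error_rate`); it assumes digit labels 0 to 9 that index the ten rows of the computed matrix.

```python
import numpy as np

def classify_digit(w_r, x_test):
    """Winner-take-all: sum each row of MC = Wr @ Xtest and pick the largest."""
    scores = (w_r @ x_test).sum(axis=1)   # one accumulated score per digit row
    return int(np.argmax(scores))

def word_error_rate(predicted, actual):
    """Fraction of test digits whose predicted label differs from the true one."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return float(np.mean(predicted != actual))
```

For 10-fold cross-validation, the same two functions would be applied to each of the ten train/test splits and the resulting WER values averaged.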

Fig. 2. Example of computed result MCi which appears as “6” for this untrained test digit.

Based on the spin-flip model (SFM) [35–36], the rate equations modeling the nonlinear dynamics of MDC-VCSELs under optical injection can be described by:

$$\begin{aligned} \frac{{dE_{1,2}^x}}{{dt}} &= k(1 + i\alpha )({N_{1,2}}E_{1,2}^x - E_{1,2}^x + i{n_{1,2}}E_{1,2}^y) - ({\gamma _a} + i{\gamma _p})E_{1,2}^x\\ &\quad + {k_d}E_{2,1}^x({t - \tau } ){e^{ - i{\omega _0}\tau }} + {k_{inj}}{\varepsilon _{1,2}}(t )+ F_{1,2}^x(t )\end{aligned} \tag{2}$$
$$\frac{{dE_{1,2}^y}}{{dt}} = k(1 + i\alpha )({N_{1,2}}E_{1,2}^y - E_{1,2}^y - i{n_{1,2}}E_{1,2}^x) + ({\gamma _a} + i{\gamma _p})E_{1,2}^y + F_{1,2}^y(t ) \tag{3}$$
$$\frac{{d{N_{1,2}}}}{{dt}} = - {\gamma _N}\left[ {{N_{1,2}}\left( {1 + {{|{E_{1,2}^x} |}^2} + {{|{E_{1,2}^y} |}^2}} \right) - \mu + i{n_{1,2}}\left( {E_{1,2}^xE_{1,2}^{y\ast } - E_{1,2}^yE_{1,2}^{x\ast }} \right)} \right] \tag{4}$$
$$\frac{{d{n_{1,2}}}}{{dt}} = - {\gamma _s}{n_{1,2}} - {\gamma _N}\left[ {{n_{1,2}}\left( {{{|{E_{1,2}^x} |}^2} + {{|{E_{1,2}^y} |}^2}} \right) + i{N_{1,2}}\left( {E_{1,2}^yE_{1,2}^{x\ast } - E_{1,2}^xE_{1,2}^{y\ast }} \right)} \right] \tag{5}$$
where the superscripts x and y represent the X-PC and Y-PC, and the subscripts 1 and 2 denote VCSEL1 and VCSEL2, respectively. E is the slowly varying complex amplitude of the electric field, N is the total carrier inversion between the conduction and valence bands, and n is the difference between the carrier inversions of the spin-up and spin-down radiation channels. k is the field decay rate, α is the linewidth enhancement factor, γa is the linear dichroism, γs is the spin-flip rate, γp is the linear birefringence, and γN is the decay rate of N. µ is the normalized injection current, which takes the value 1 at threshold. kd is the coupling strength between the two VCSELs, and kinj is the external injection strength. ω0 is the center frequency of the two orthogonal polarization outputs of VCSEL1 (VCSEL2). The parameters of the two VCSELs are identical unless otherwise stated.
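To make the structure of the four rate equations above concrete, the sketch below evaluates their deterministic right-hand sides for one VCSEL, term by term as printed. It is an illustrative sketch only: the function name `sfm_rhs` and the lumped argument `drive_x` are hypothetical, the parameter values are those quoted in the results section, and neither the delay buffer nor a time integrator is included.

```python
import numpy as np

# Parameter values quoted in the results section of the paper (rates in ns^-1).
PARAMS = dict(k=300.0, alpha=3.0, gamma_a=0.1, gamma_p=10.0,
              gamma_N=1.0, gamma_s=50.0, mu=1.1)

def sfm_rhs(Ex, Ey, N, n, drive_x=0.0, p=PARAMS):
    """Deterministic right-hand sides of the four rate equations for one VCSEL.

    drive_x lumps together the delayed coupling term
    kd * Ex_other(t - tau) * exp(-1j * w0 * tau) and the injection term
    kinj * eps(t); the caller is expected to supply it. Noise is omitted.
    """
    gain = p["k"] * (1 + 1j * p["alpha"])
    aniso = p["gamma_a"] + 1j * p["gamma_p"]
    dEx = gain * (N * Ex - Ex + 1j * n * Ey) - aniso * Ex + drive_x
    dEy = gain * (N * Ey - Ey - 1j * n * Ex) + aniso * Ey
    power = abs(Ex) ** 2 + abs(Ey) ** 2
    # i * (a * conj(b) - b * conj(a)) is purely real; .real drops rounding noise.
    cross_N = (1j * (Ex * np.conj(Ey) - Ey * np.conj(Ex))).real
    cross_n = (1j * (Ey * np.conj(Ex) - Ex * np.conj(Ey))).real
    dN = -p["gamma_N"] * (N * (1 + power) - p["mu"] + n * cross_N)
    dn = -p["gamma_s"] * n - p["gamma_N"] * (n * power + N * cross_n)
    return dEx, dEy, dN, dn
```

A full simulation would call this function for both VCSELs at every time step, with `drive_x` built from the other laser's stored field one coupling time earlier, and would add the Langevin noise terms.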

The fourth term on the right-hand side of Eq. (2) is the external injection, which carries the two input series (Q1, Q2). ε1 and ε2 denote the outputs of the MZMs, which are described as [37]:

$${\varepsilon _{1,2}}(t )= \frac{{|{{\varepsilon_0}} |}}{2}\left[ {1 + {e^{i\gamma {Q_{1,2}}(t )}}} \right]{e^{i2\pi \Delta ft}} \tag{6}$$
where |ε0| is the amplitude of the injection field, γ is the input scaling factor, and Δf is the frequency detuning between the SL and the central frequency of VCSEL1 (VCSEL2).
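The modulator expression for ε1,2(t) translates directly into code. The sketch below is a hedged illustration: the function name `mzm_output` is hypothetical, the default values |ε0| = 1 and γ = 0.5 are those quoted in the results section, and time is assumed to be in ns with Δf in GHz so that their product is dimensionless.

```python
import numpy as np

def mzm_output(q, t, eps0=1.0, gamma=0.5, delta_f=0.0):
    """eps(t) = |eps0|/2 * (1 + exp(i*gamma*Q(t))) * exp(i*2*pi*delta_f*t).

    q       : masked input sample(s) Q(t)
    t       : time in ns (delta_f in GHz, so delta_f * t is dimensionless)
    """
    q = np.asarray(q, dtype=float)
    t = np.asarray(t, dtype=float)
    carrier = np.exp(2j * np.pi * delta_f * t)      # detuning phase rotation
    return 0.5 * abs(eps0) * (1 + np.exp(1j * gamma * q)) * carrier
```

The field magnitude equals |ε0|·|cos(γQ/2)|, so the input scaling factor γ sets how strongly the masked signal modulates the injected amplitude.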

The last terms on the right-hand sides of Eq. (2) and Eq. (3) represent the spontaneous emission noise, modeled as Langevin sources [38]:

$$F_{1,2}^x(t )= \sqrt {\frac{{{\beta _{\textrm{sp}}}}}{2}} \left( {\sqrt {{N_{1,2}} + {n_{1,2}}} \,{\xi_1} + \sqrt {{N_{1,2}} - {n_{1,2}}} \,{\xi_2}} \right) \tag{7}$$
$$F_{1,2}^y(t )= - i\sqrt {\frac{{{\beta _{\textrm{sp}}}}}{2}} \left( {\sqrt {{N_{1,2}} + {n_{1,2}}} \,{\xi_1} - \sqrt {{N_{1,2}} - {n_{1,2}}} \,{\xi_2}} \right) \tag{8}$$
where βsp is the spontaneous emission coefficient, and ξ1 and ξ2 are Gaussian white noise sources with zero mean and unit variance.
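The two noise terms can be evaluated as follows. This is a sketch under assumptions: the function names are hypothetical, the noise draws are taken to be complex Gaussian numbers of unit variance (the text only states zero mean and unit variance), and N ≥ |n| is assumed so that both square roots are real. Any scaling with the integration step of a stochastic solver is left to the caller.

```python
import numpy as np

def langevin_terms(N, n, xi1, xi2, beta_sp=1e-6):
    """Spontaneous-emission noise terms Fx and Fy for given draws xi1, xi2."""
    amp = np.sqrt(beta_sp / 2.0)
    plus = np.sqrt(N + n) * xi1       # contribution weighted by sqrt(N + n)
    minus = np.sqrt(N - n) * xi2      # contribution weighted by sqrt(N - n)
    Fx = amp * (plus + minus)
    Fy = -1j * amp * (plus - minus)
    return Fx, Fy

def draw_xi(rng, size=None):
    """Complex Gaussian sample, zero mean and unit variance (assumed form)."""
    real = rng.standard_normal(size)
    imag = rng.standard_normal(size)
    return (real + 1j * imag) / np.sqrt(2.0)
```

Passing the random draws in as arguments keeps `langevin_terms` deterministic, which makes it easy to check against hand-computed values.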

3. Results and discussion

In this section, the performance of the MDC-VCSELs in spoken digit recognition is evaluated in detail. In the simulations, the parameters of the two VCSELs are [39]: α = 3, k = 300 ns⁻¹, γa = 0.1 ns⁻¹, γp = 10 ns⁻¹, γN = 1 ns⁻¹, γs = 50 ns⁻¹, βsp = 10⁻⁶ ns⁻¹, ω0 = 2πc/λ with λ = 850 nm and c = 3×10⁸ m/s, |ε0| = 1, γ = 0.5, Δf = 0 GHz, and µ = 1.1. Here, θ is fixed at 12.5 ps. These parameters are fixed unless otherwise stated.

First, we investigate the dynamics of the reservoir. It has been reported that, under optical injection, for different injection and coupling strengths, the best performance of the RC system appears at the edge of the injection-locked state of the reservoir [40]. The dynamical states as a function of the injection strength kinj and the coupling strength kd are shown in Fig. 3, where IL, P1, P2, MP and CO denote the injection-locked, period-one, period-two, multi-periodic and chaotic states, respectively. The figure shows that, at a given injection strength, the reservoir mostly evolves through the IL, P1, P2, MP, and CO states as the coupling strength kd increases. At small injection strengths, the dynamics are richer, and the evolution from the MP state into the CO state passes through the P1 and P2 states. This analysis of the dynamical states lays the foundation for the subsequent parameter analysis.

Fig. 3. Dynamic states as a function of the injection strength kinj and the coupling strength kd with Δf = 0 GHz.

Second, the variation of the WER with the number of virtual nodes Nn is analyzed in Fig. 4. As Nn increases, the WER first decreases and then remains at a relatively stable level. Specifically, as Nn increases from 50 to 125, the WER gradually decreases. The reason is that a small Nn means a lower-dimensional state space, which makes the training less effective, and therefore the WER is larger. When Nn ≥ 125, the WER is below 1%. However, since the node interval θ is fixed, an increase of Nn is accompanied by an increase of the coupling time τ, which results in a longer coupling loop and a slower processing rate. To balance a high processing rate against a low WER, Nn is fixed at 125 in the following discussion.
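The trade-off between Nn and speed follows from τ = Nn·θ, and a short calculation makes the numbers explicit. In the sketch below, θ, Nn and the processing rate are the values quoted in the text; the number of coupling delays per word is only inferred from them and is not a figure stated by the authors.

```python
THETA_PS = 12.5            # virtual-node interval theta quoted in the text, in ps
N_NODES = 125              # virtual nodes per VCSEL chosen in the text
RATE_WORDS_PER_S = 1.1e7   # processing rate reported in the abstract

def coupling_delay_ns(n_nodes, theta_ps=THETA_PS):
    """tau = Nn * theta, converted from ps to ns."""
    return n_nodes * theta_ps * 1e-3

tau_ns = coupling_delay_ns(N_NODES)        # 125 * 12.5 ps = 1.5625 ns
word_time_ns = 1e9 / RATE_WORDS_PER_S      # about 90.9 ns per word
steps_per_word = word_time_ns / tau_ns     # about 58 delays per word (inferred)
```

Doubling Nn to 250 would double τ to 3.125 ns and, for the same number of time steps per digit, halve the processing rate.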

Fig. 4. WERs of RC systems based on MDC-VCSELs as a function of the number of virtual nodes Nn with kinj = 100 ns⁻¹, kd = 30 ns⁻¹ and Δf = 0 GHz.

Next, to find the appropriate parameter space of kinj and kd, the map of the WER for the RC based on MDC-VCSELs is displayed in Fig. 5. Different colors correspond to different WER values, and the blank region corresponds to WER values larger than 0.06. Good performance is obtained at a relatively large injection strength kinj and a small coupling strength kd, mainly within the band-shaped area bounded by the two red lines. Comparing Fig. 3 with Fig. 5 shows that, when kinj and kd lie in this area, the system operates at the edge of the injection-locked state. That is to say, when kinj and kd are at the edge of the injection-locked state, the reservoir provides the nonlinearity and memory required by the spoken digit recognition task. The minimum WER is 0.04%, obtained at kinj = 210 ns⁻¹ and kd = 70 ns⁻¹.

Fig. 5. WERs of the RC system based on MDC-VCSELs in kinj - kd parameter space for spoken digit recognition task with Δf = 0 GHz.

Furthermore, the influence of the internal parameters of the MDC-VCSELs is studied in Fig. 6. In this test, two key parameters, γa and γp, are selected as examples. We denote by γa1 (γa2) the linear dichroism of VCSEL1 (VCSEL2), and by γp1 (γp2) the linear birefringence of VCSEL1 (VCSEL2). Since γa1 = γa2 (γp1 = γp2) is difficult to satisfy in practice, it is necessary to analyze the case of γa1 ≠ γa2 (γp1 ≠ γp2). The WER as a function of γa2 is shown in Fig. 6(a). Here, γa1 is fixed at 0.1 ns⁻¹ and γa2 is varied within [0.09 ns⁻¹, 0.11 ns⁻¹], so that the difference (γa1 - γa2) stays within ±10% of γa1. The WER as a function of γp2 is shown in Fig. 6(b), where γp1 is fixed at 10 ns⁻¹ and γp2 is varied within [9 ns⁻¹, 11 ns⁻¹], so that the difference (γp1 - γp2) stays within ±10% of γp1. As shown in Fig. 6, the WER fluctuates in both cases, but the fluctuations are small. Therefore, if the difference between γa1 and γa2 (γp1 and γp2) is within 10% of γa1 (γp1), the WER remains at a level similar to that obtained for γa1 = γa2 (γp1 = γp2).

Fig. 6. WERs as a function of (a) γa2 with γa1 = 0.1 ns⁻¹ and (b) γp2 with γp1 = 10 ns⁻¹, under kinj = 210 ns⁻¹, kd = 70 ns⁻¹ and Δf = 0 GHz.

Finally, we investigate the variation of the WER with the frequency detuning Δf. For simplicity, we assume that the two VCSELs have the same frequency, and vary Δf from -20 GHz to 30 GHz. The dependence of the WER on Δf is displayed in Fig. 7. As Δf increases, the WER first fluctuates and then decreases sharply once Δf > -10 GHz. After reaching its minimum, the WER increases with Δf and oscillates within 9 GHz ≤ Δf ≤ 27 GHz. The WER is below 1% over the detuning range [-10 GHz, 27 GHz]. Relatively low WERs are obtained around zero detuning, and the minimum WER of 0.02% is obtained at Δf = -2 GHz.

Fig. 7. WERs as a function of the frequency detuning Δf with kinj = 210 ns⁻¹ and kd = 70 ns⁻¹.

4. Conclusion

In summary, the performance of an RC system based on mutually delay-coupled VCSELs (MDC-VCSELs) under optical injection on the spoken digit recognition task is numerically investigated. In this system, two MDC-VCSELs serve as the two nonlinear nodes of the reservoir and perform a nonlinear mapping of the input information. Each spoken digit is preprocessed by two different masks to form two masked matrices; in each matrix, every column vector is appended to the preceding one to form a time-dependent series. The two series are then injected into the main polarization of the two VCSELs, respectively. The dynamical responses of the two VCSELs are sampled as virtual node states for training and testing. It is found that better recognition performance is achieved when the injection strength (kinj) and the coupling strength (kd) are at the edge of the injection-locked state. Besides, if the mismatch of the internal parameters (γa, γp) between the two VCSELs is within a small range (±10% in the cases tested), the performance is almost unaffected. Moreover, the best performance is obtained at a weak negative frequency detuning (Δf). The simulation results show that, at a processing rate of 1.1×10⁷ words per second, the word error rate can be as low as 0.02% for a dataset consisting of 5000 samples.

Funding

National Natural Science Foundation of China (61775184, 61875167).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nat. Commun. 2(1), 468 (2011).

2. W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: A new framework for neural computation based on perturbations,” Neural Computation 14(11), 2531–2560 (2002).

3. Y. S. Hou, G. Q. Xia, E. Jayaprasath, D. Z. Yue, W. Y. Yang, and Z. M. Wu, “Prediction and classification performance of reservoir computing system using mutually delay-coupled semiconductor lasers,” Opt. Commun. 433(15), 215–220 (2019).

4. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304(5667), 78–80 (2004).

5. D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, “Isolated word recognition with the liquid state machine: a case study,” Inf. Process. Lett. 95(6), 521–528 (2005).

6. C. Du, F. Cai, M. A. Zidan, W. Ma, S. H. Lee, and W. D. Lu, “Reservoir computing using dynamic memristors for temporal information processing,” Nat. Commun. 8(1), 2204 (2017).

7. Q. Vinckier, F. Duport, A. Smerieri, K. Vandoorne, P. Bienstman, M. Haelterman, and S. Massar, “High-performance photonic reservoir computer based on a coherently driven passive cavity,” Optica 2(5), 438–446 (2015).

8. D. Verstraeten, B. Schrauwen, and D. Stroobandt, “Reservoir-based techniques for speech recognition,” in Proceedings of IJCNN06, International Joint Conference on Neural Networks, ed. (Academic), 1050–1053 (2006).

9. M. Lukoševičius, H. Jaeger, and B. Schrauwen, “Reservoir computing trends,” Künstl Intell 26(4), 365–371 (2012).

10. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Computer Science Review 3(3), 127–149 (2009).

11. J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, “Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach,” Phys. Rev. Lett. 120(2), 024102 (2018).

12. H. Jaeger, W. Maass, and J. Principe, “Special issue on echo state networks and liquid state machines,” Neural Networks 20(3), 287–289 (2007).

13. F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, “All-optical reservoir computing,” Opt. Express 20(20), 22783–22795 (2012).

14. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” Sci. Rep. 2(1), 287 (2012).

15. D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nat. Commun. 4(1), 1364 (2013).

16. M. C. Soriano, S. Ortín, D. Brunner, L. Larger, C. R. Mirasso, I. Fischer, and L. Pesquera, “Optoelectronic reservoir computing: Tackling noise-induced performance degradation,” Opt. Express 21(1), 12–20 (2013).

17. Q. C. Zhao, H. X. Yin, and H. G. Zhu, “Simultaneous recognition of two channels of optical packet headers utilizing reservoir computing subject to mutual-coupling optoelectronic feedback,” Optik 157, 951–956 (2018).

18. A. Dejonckheere, F. Duport, A. Smerieri, L. Fang, J. L. Oudar, M. Haelterman, and S. Massar, “All-optical reservoir computer based on saturation of absorption,” Opt. Express 22(9), 10868–10881 (2014).

19. K. Hicke, M. A. Escalona-Moran, D. Brunner, and M. C. Soriano, “Information processing using transient dynamics of semiconductor lasers subject to delayed feedback,” IEEE J. Sel. Top. Quantum Electron. 19(4), 1501610 (2013).

20. R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Fast photonic information processing using semiconductor lasers with delayed optical feedback: Role of phase dynamics,” Opt. Express 22(7), 8672–8686 (2014).

21. J. Nakayama, K. Kanno, and A. Uchida, “Laser dynamical reservoir computing with consistency: An approach of a chaos mask signal,” Opt. Express 24(8), 8679–8692 (2016).

22. Y. S. Hou, G. Q. Xia, W. Y. Yang, D. Wang, E. Jayaprasath, Z. F. Jiang, C. X. Hu, and Z. M. Wu, “Prediction performance of reservoir computing system based on a semiconductor laser subject to double optical feedback and optical injection,” Opt. Express 26(8), 10211–10219 (2018).

23. K. Harkhoe, G. Verschaffelt, A. Katumba, P. Bienstman, and G. Van der Sande, “Demonstrating delay-based reservoir computing using a compact photonic integrated chip,” Opt. Express 28(3), 3086–3096 (2020).

24. D. Z. Yue, Z. M. Wu, Y. S. Hou, B. Cui, Y. H. Jin, M. Dai, and G. Q. Xia, “Performance optimization research of reservoir computing system based on an optical feedback semiconductor laser under electrical information injection,” Opt. Express 27(14), 19931–19939 (2019).

25. X. X. Guo, S. Y. Xiang, Y. H. Zhang, L. Lin, A. J. Wen, and Y. Hao, “High-speed neuromorphic reservoir computing based on a semiconductor nanolaser with optical feedback under electrical modulation,” IEEE J. Sel. Top. Quantum Electron. 26(5), 1–7 (2020).

26. A. Argyris, J. Cantero, M. Galletero, E. Pereda, C. R. Mirasso, I. Fischer, and M. C. Soriano, “Comparison of photonic reservoir computing systems for fiber transmission equalization,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–9 (2020).

27. A. Uchida, Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics and Synchronization (Wiley-VCH, 2012).

28. K. Kitayama, M. Notomi, M. Naruse, K. Inoue, S. Kawakami, and A. Uchida, “Novel frontier of photonics for data processing - Photonic accelerator,” APL Photonics 4(9), 090901 (2019).

29. C. Sugano, K. Kanno, and A. Uchida, “Reservoir computing using multiple lasers with feedback on a photonic integrated circuit,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–9 (2020).

30. J. Vatin, D. Rontani, and M. Sciamanna, “Enhanced performance of a reservoir computer using polarization dynamics in VCSELs,” Opt. Lett. 43(18), 4497–4500 (2018).

31. S. Y. Xiang, W. Pan, B. Luo, L. Yan, X. Zou, N. Jiang, L. Yang, and H. Zhu, “Unpredictability-enhanced chaotic vertical-cavity surface-emitting lasers with variable-polarization optical feedback,” J. Lightwave Technol. 29(14), 2173–2179 (2011).

32. X. X. Guo, S. Y. Xiang, Y. H. Zhang, L. Lin, A. J. Wen, and Y. Hao, “Four-channels reservoir computing based on polarization dynamics in mutually coupled VCSELs system,” Opt. Express 27(16), 23293–23306 (2019).

33. “Texas Instruments-Developed 46-Word Speaker-Dependent Isolated Word Corpus (TI46),” NIST Speech Disc 7-1.1 (1 disc), (1991).

34. R. F. Lyon, “A computational model of filtering, detection, and compression in the cochlea,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 7, 1282–1285 (1982).

35. J. Martin-Regalado, F. Prati, M. San Miguel, and N. B. Abraham, “Polarization properties of vertical-cavity surface-emitting lasers,” IEEE J. Quantum Electron. 33(5), 765–783 (1997).

36. C. Masoller and N. B. Abraham, “Low-frequency fluctuations in vertical-cavity surface-emitting semiconductor lasers with optical feedback,” Phys. Rev. A 59(4), 3021–3031 (1999).

37. R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Simultaneous computation of two independent tasks using reservoir computing based on a single photonic nonlinear node with optical feedback,” IEEE Trans. Neural Netw. Learning Syst. 26(12), 3301–3307 (2015).

38. X. S. Tan, Y. S. Hou, Z. M. Wu, and G. Q. Xia, “Parallel information processing by a reservoir computing system based on a VCSEL subject to double optical feedback and optical injection,” Opt. Express 27(18), 26070–26079 (2019).

39. I. Gatare, M. Sciamanna, A. Locquet, and K. Panajotov, “Influence of polarization mode competition on the synchronization of two unidirectionally coupled vertical-cavity surface-emitting lasers,” Opt. Lett. 32(12), 1629–1631 (2007).

40. F. Hua, N. Fang, and L. T. Wang, “Method of selecting operating point of reservoir computing system based on semiconductor lasers,” Acta Phys. Sin. 68(22), 224205 (2019).


