Spoken digit recognition utilizing a reservoir computing system based on mutually coupled VCSELs under optical injection

ShuLu Tan; ZhengMao Wu; DianZuo Yue; WeiLai Wu; GuangQiong Xia

doi:10.1364/OPTCON.453196

1. Introduction

In today’s information-driven society, a variety of efficient information processing technologies are emerging to handle tasks that are difficult for conventional computers, such as chaotic time series prediction, speech recognition, and nonlinear channel equalization [1–3]. Among these technologies, reservoir computing (RC), which originates from the recurrent neural network (RNN), is a recently proposed brain-inspired paradigm [1,4–11]. In RC, the input weights and the internal connection weights are randomly fixed, and only the output weights need to be trained [3]. Therefore, compared with an RNN, RC has a faster and simpler training process. Information processing in RC relies on a reservoir layer that performs a nonlinear mapping from a low-dimensional input signal to a higher-dimensional reservoir state [2–3,12]. Conceptually, the nonlinear mapping can be implemented with many interconnected physical devices, referred to as nonlinear nodes. However, using a large number of nonlinear nodes is challenging in practice because of the relatively high cost and the difficulty of collecting the node states [2]. To address this issue, delay-based RC systems have been proposed, in which a single nonlinear node under time-delayed feedback serves as the reservoir [13–15]; the nonlinear transient responses of the system are sampled at equal time intervals within the feedback loop and taken as the virtual node states. Delay-based RC systems can be divided into several categories, such as electronic delay-based RC [1], optoelectronic delay-based RC [14,16–17], and all-optical delay-based RC [7,13–18]. Owing to their easy hardware implementation, delay-based RCs built on a single nonlinear node have become a research hotspot in the RC field, and numerous relevant studies have been reported [19–26].
Particularly, all-optical delay-based RC systems, in which a semiconductor laser (SL) serves as the single nonlinear node, have received extra attention because of the high relaxation oscillation frequency of SLs [27]. However, when an SL-based RC with a single nonlinear node is used for complicated tasks, hundreds or even thousands of virtual nodes are required so that enough data features can be extracted to achieve good performance. Since the minimum time interval between two adjacent virtual nodes is limited by the response time of the nonlinear node, a large number of virtual nodes results in a long data processing time [28]. In other words, it is difficult to achieve a high processing rate while maintaining good performance.

To improve the processing rate of SL-based RCs, several novel schemes composed of multiple nonlinear nodes under time-delayed feedback (or coupling) have recently been proposed [3,29]. In principle, by combining the virtual node states of several reservoirs, the number of virtual nodes required for each reservoir can be reduced. Since the processing rate is inversely proportional to the number of virtual nodes in each feedback loop, an RC system with several nonlinear nodes can reach a higher processing rate. For instance, in 2018, our group proposed an RC system based on mutually delay-coupled semiconductor lasers (SLs) [3], which achieves a fast processing rate by adopting two time-delayed SLs as the reservoir. In 2020, Sugano et al. proposed a parallel RC system based on multiple SLs with optical feedback [29]. Their results show that this system can balance system complexity against data processing rate. In the schemes mentioned above, the lasers are edge-emitting semiconductor lasers (EELs). Compared with EELs, vertical-cavity surface-emitting lasers (VCSELs) have many advantages, such as small size, high energy efficiency, low threshold current, and ease of photonic integration [30–31]. In particular, two orthogonal polarization components, namely the X-polarization component (X-PC) and the Y-polarization component (Y-PC), may coexist in a VCSEL. As a result, a VCSEL-based reservoir can provide much richer nonlinear dynamics than an EEL-based reservoir. In 2019, Guo et al. proposed an RC system based on two mutually delay-coupled VCSELs for performing a Santa-Fe chaotic time series prediction task, with a processing rate of up to 20 Gbps [32].

At present, all-optical delay-based RC systems based on several nonlinear nodes under time-delayed feedback (or coupling) have mostly been applied to the Santa-Fe chaotic time series prediction task and the nonlinear channel equalization task. However, to our knowledge, the performance of this type of RC system on the spoken digit recognition task has not been reported. In this work, we propose and numerically simulate a delay-based RC system for spoken digit recognition, in which two mutually delay-coupled VCSELs (MDC-VCSELs) are taken as the reservoir. By analyzing the influence of some key parameters on the system performance, the optimized parameter regions are determined. For a spoken digit recognition task processed at a rate of 1.1×10⁷ words per second, the word error rate (WER) can be as low as 0.02% when a dataset consisting of 5000 samples is adopted.

2. System and theory models

Figure 1 is the schematic diagram of the proposed system based on two mutually delay-coupled VCSELs (MDC-VCSEL1 and MDC-VCSEL2) for the spoken digit recognition task. We perform the spoken digit recognition task on the TI-46 speech corpus [33]. Before being injected into the RC system, the original one-dimensional (1D) sound waveform is preprocessed by the Lyon ear model, which converts it into a 2D frequency-time matrix called the cochleagram matrix M_u [34]. M_u is a P×K matrix, where P is the number of frequency channels and K is the number of discrete time steps.

Fig. 1. Schematic diagram of the RC system based on mutually delay coupled VCSELs (MDC-VCSELs). SL: semiconductor laser; MZM: Mach-Zehnder modulator; VCSEL: vertical-cavity surface-emitting laser; M_u: cochleagram matrix; W₁(W₂): mask matrix; S₁(S₂): masked spoken matrix; Q₁(Q₂): 1D time-dependent series.

As shown in Fig. 1, the system is composed of three parts: the input layer, the reservoir and the output layer. In the input layer, M_u is converted into a masked input signal S₁ (S₂) by multiplying it by a mask matrix W₁ (W₂), which is an N_n×P matrix (N_n is the number of virtual nodes collected from each VCSEL within one coupling time). Each element of W₁ (W₂) is randomly chosen from the binary set {-1, 1}. As a result, the masked matrix S₁ (S₂) is an N_n×K matrix. By appending each column vector to the preceding one, a 1D time-dependent series Q₁ (Q₂) is obtained. Finally, via two Mach-Zehnder modulators (MZMs), Q₁ and Q₂ are loaded onto the optical wave provided by a semiconductor laser (SL) and then injected into VCSEL1 and VCSEL2, respectively. In the reservoir layer, the coupling time from VCSEL1 (VCSEL2) to VCSEL2 (VCSEL1) is denoted as τ₁ (τ₂). For simplicity, we assume τ₁ = τ₂ = τ. The responses of the MDC-VCSELs sampled at consecutive intervals θ correspond to the states of consecutive virtual nodes. Here, we employ the synchronization scheme, i.e., θ = τ / N_n.
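
The masking step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `mask_input`, the cochleagram size (P = 86, K = 60), the random stand-in for M_u and the seed are hypothetical and are not taken from this work; N_n = 125 and θ = 12.5 ps are the values quoted in Section 3.

```python
import numpy as np

def mask_input(M_u, N_n, rng):
    """Mask one cochleagram M_u (P x K) and serialize it column by column."""
    P, K = M_u.shape
    # Mask elements drawn at random from {-1, +1}; W has shape (N_n, P).
    W = rng.choice([-1.0, 1.0], size=(N_n, P))
    S = W @ M_u                  # masked matrix, shape (N_n, K)
    Q = S.flatten(order="F")     # append each column to the preceding one
    return W, S, Q

rng = np.random.default_rng(0)   # hypothetical seed
P, K, N_n = 86, 60, 125          # P and K are illustrative values only
M_u = rng.random((P, K))         # random stand-in for a real cochleagram

W1, S1, Q1 = mask_input(M_u, N_n, rng)   # series intended for VCSEL1
W2, S2, Q2 = mask_input(M_u, N_n, rng)   # series intended for VCSEL2

theta_ps = 12.5                  # virtual-node interval in picoseconds
tau_ps = N_n * theta_ps          # synchronization scheme: theta = tau / N_n
```

In this sketch each column of S₁ (S₂) contributes N_n consecutive entries of Q₁ (Q₂), so under the synchronization scheme one column would occupy one coupling time τ.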

In the output layer, the total output intensity |E|² = |E_x|² + |E_y|² of VCSEL1 (VCSEL2) is sampled at an interval of θ and interpreted as a virtual node state, where |E_x|² and |E_y|² are the output intensities of the X polarization component (X-PC) and the Y polarization component (Y-PC), respectively. Within one coupling time τ, N_n virtual node states are collected from each coupling loop. For a spoken digit with K discrete time steps, two N_n × K node state matrices X₁ and X₂ are obtained. Vertically concatenating X₁ and X₂ yields a state matrix X of size 2N_n×K. Since the speech corpus includes 500 spoken digits, there are 500 such state matrices. We choose 450 matrices for training and 50 matrices for testing. The training matrix X_train is obtained by horizontally concatenating the 450 training state matrices. Then, we use X_train to train the read-out weight matrix W_r so as to minimize the mean square error || W_r·X_train - M_T ||². The optimal read-out weight matrix W_r is obtained by:

(1)$${{\mathbf W}_\textrm{r}} = {{\mathbf M}_\textrm{T}}{\mathbf X}_{\textrm{train}}^\textrm{T}{\left( {{{\mathbf X}_{\textrm{train}}}{\mathbf X}_{\textrm{train}}^\textrm{T} + \lambda {\mathbf I}} \right)^{ - 1}}$$

where λ is the regression (regularization) parameter that keeps the matrix inversion from being ill-posed, and I is the identity matrix with the same dimension as X_train·X_trainᵀ, i.e., 2N_n. M_T is a target matrix with ten rows, in which the values are set to one in the row corresponding to the correct digit and to zero in the remaining rows.
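
A minimal NumPy sketch of this training step is given below. It assumes that the regularization term enters with a positive sign, as in standard ridge regression, and it uses small random matrices in place of real reservoir states. The helper names `target_matrix` and `train_readout`, the toy sizes and the value of λ are all hypothetical.

```python
import numpy as np

def target_matrix(digit, K, n_classes=10):
    """Target block for one spoken digit: ones in the row of the correct digit."""
    M = np.zeros((n_classes, K))
    M[digit, :] = 1.0
    return M

def train_readout(X_train, M_T, lam):
    """Ridge-regression read-out: W_r = M_T X^T (X X^T + lam I)^-1."""
    G = X_train @ X_train.T + lam * np.eye(X_train.shape[0])
    # G is symmetric, so solving G Z = X M_T^T gives Z = W_r^T
    # without forming the inverse explicitly.
    return np.linalg.solve(G, X_train @ M_T.T).T

rng = np.random.default_rng(1)          # hypothetical seed
N_n, K = 4, 6                           # toy sizes, not the values of Section 3
digits = [3, 7, 3]                      # toy labels

# One state matrix per digit: X1 and X2 stacked vertically, shape (2*N_n, K).
states = [np.vstack([rng.random((N_n, K)), rng.random((N_n, K))])
          for _ in digits]
X_train = np.hstack(states)             # horizontal concatenation over digits
M_T = np.hstack([target_matrix(d, K) for d in digits])

W_r = train_readout(X_train, M_T, lam=1e-3)   # shape (10, 2*N_n)
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical convenience chosen for this sketch; it evaluates the same expression as Eq. (1) under the sign assumption stated above.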

In the testing process, a computed matrix M_Cⁱ is obtained for the i-th testing matrix X_testⁱ. As shown in Fig. 2, M_Cⁱ is the product of the optimal read-out weight matrix W_r and the testing matrix X_testⁱ. The elements of each row of M_Cⁱ are summed, and the row with the highest score gives the identified digit. By comparing the recognition results with the known labels, the number of errors is determined. The WER is the ratio of the number of errors to the total number of testing samples. In this work, WER < 1% is taken as the criterion for good performance [14–15]. The WER values given below are the results of 10-fold cross-validation.
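
The decision rule and the WER defined above can be sketched as follows. The function names, the toy read-out matrix and the contiguous fold assignment are assumptions made only for illustration; the text does not specify how the ten folds are drawn.

```python
import numpy as np

def classify(W_r, X_test_i):
    """Return the digit whose row of M_C = W_r X_test has the largest sum."""
    M_C = W_r @ X_test_i                  # computed matrix, shape (10, K)
    return int(np.argmax(M_C.sum(axis=1)))

def word_error_rate(predicted, actual):
    """Fraction of test samples that were misclassified."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return float(np.mean(predicted != actual))

def k_fold_splits(n_samples, n_folds=10):
    """Yield (train, test) index arrays; contiguous folds are an assumption."""
    idx = np.arange(n_samples)
    for fold in np.array_split(idx, n_folds):
        yield np.setdiff1d(idx, fold), fold

# Toy read-out with two state rows: row 6 reads state 0, row 2 reads state 1.
W_toy = np.zeros((10, 2))
W_toy[6, 0] = 1.0
W_toy[2, 1] = 1.0
X_toy = np.array([[1.0, 1.0, 1.0],
                  [0.2, 0.1, 0.3]])
digit = classify(W_toy, X_toy)            # row 6 sums to 3.0, row 2 to 0.6

wer = word_error_rate([6, 2, 1, 0], [6, 2, 3, 0])   # one error in four
splits = list(k_fold_splits(500))         # 450 training / 50 testing per fold
```

With 500 samples and ten folds, each fold in this sketch holds out 50 samples and trains on the other 450, which matches the split sizes stated in the text.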

Fig. 2. Example of computed result M_Cⁱ which appears as “6” for this untrained test digit.

Based on the spin-flip model (SFM) [35–36], the rate equations modeling the nonlinear dynamics of MDC-VCSELs under optical injection can be described by:

(2)$$\begin{aligned} \frac{{dE_{1,2}^x}}{{dt}} &= k(1 + i\alpha )({N_{1,2}}E_{1,2}^x - E_{1,2}^x + i{n_{1,2}}E_{1,2}^y) - ({\gamma _a} + i{\gamma _p})E_{1,2}^x\\ &\quad + {k_d}E_{2,1}^x({t - \tau } ){e^{ - i{\omega _0}\tau }} + {k_{inj}}{\varepsilon _{1,2}}(t )+ F_{1,2}^x(t )\end{aligned}$$

(3)$$\frac{{dE_{1,2}^y}}{{dt}} = k(1 + i\alpha )({N_{1,2}}E_{1,2}^y - E_{1,2}^y - i{n_{1,2}}E_{1,2}^x) + ({\gamma _a} + i{\gamma _p})E_{1,2}^y + F_{1,2}^y(t )$$

(4)$$\frac{{d{N_{1,2}}}}{{dt}} ={-} {\gamma _N}[{{N_{1,2}}({1 + {{ {|{E_{1,2}^x} } |}^2} + {{ {|{E_{1,2}^y} } |}^2}} )- \mu + i{n_{1,2}}({E_{1,2}^xE{{_{1,2}^y}^{\ast }} - E_{1,2}^yE{{_{1,2}^x}^{\ast }}} )} ]$$

(5)$$\frac{{d{n_{1,2}}}}{{dt}} ={-} {\gamma _s}{n_{1,2}} - {\gamma _N}[{{n_{1,2}}({{{ {|{E_{1,2}^x} } |}^2} + {{ {|{E_{1,2}^y} } |}^2}} )+ i{N_{1,2}}({E_{1,2}^yE{{_{1,2}^x}^{\ast }} - E_{1,2}^xE{{_{1,2}^y}^{\ast }}} )} ]$$

where the superscripts x and y represent the X-PC and Y-PC, and the subscripts 1 and 2 refer to VCSEL1 and VCSEL2, respectively. E is the slowly varying complex amplitude of the electric field, N is the total carrier inversion between the conduction and valence bands, and n is the difference between the carrier inversions of the spin-up and spin-down radiation channels. k is the field decay rate, α is the linewidth enhancement factor, γ_a is the linear dichroism, γ_s is the spin-flip rate, γ_p is the linear birefringence, and γ_N is the decay rate of N. µ is the normalized injection current, which equals 1 at threshold. k_d is the coupling strength between the two VCSELs, and k_inj is the external injection strength. ω₀ is the center frequency of the two orthogonal polarization outputs of VCSEL1 (VCSEL2). The parameters of the two VCSELs are identical unless otherwise stated.

The fourth term on the right side of Eq. (2) is the external injection, which carries the two input series (Q₁, Q₂). ε₁ and ε₂ denote the outputs of the MZMs, which are described as [37]:

(6)$${\varepsilon _{1,2}}(t )= \frac{{|{{\varepsilon_0}} |}}{2}[{1 + {e^{i\gamma {Q_{1,2}}(t )}}} ]{e^{i\textrm{2}\pi \Delta ft}}$$

where |ε₀| is the amplitude of the injection field, γ represents the input scaling factor, and Δf denotes the frequency detuning between the SL and the central frequency of VCSEL1 (VCSEL2).
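
As an illustration, Eq. (6) can be evaluated directly with NumPy. The function name `mzm_output` and the unit convention (t in ns, Δf in GHz) are assumptions of this sketch; the default values |ε₀| = 1 and γ = 0.5 are the ones quoted in Section 3.

```python
import numpy as np

def mzm_output(Q, t, eps0=1.0, gamma=0.5, delta_f=0.0):
    """Injected field of Eq. (6); t in ns and delta_f in GHz (assumed units)."""
    phase_mod = 1.0 + np.exp(1j * gamma * Q)          # input-dependent term
    detuning = np.exp(1j * 2.0 * np.pi * delta_f * t) # detuning rotation
    return 0.5 * abs(eps0) * phase_mod * detuning

eps_zero_input = mzm_output(0.0, 0.0)         # Q = 0: full amplitude
eps_pi_phase = mzm_output(np.pi / 0.5, 0.0)   # gamma * Q = pi: extinction
```

Since |1 + exp(iγQ)| / 2 = |cos(γQ/2)|, the amplitude of the injected field follows the masked input through a cosine response, while Δf only rotates its phase.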

The last terms on the right sides of Eq. (2) and Eq. (3) represent the spontaneous emission noise, known as Langevin sources [38]:

(7)$$F_{1,2}^x(t )= \sqrt {\frac{{{\beta _{\textrm{sp}}}}}{2}} \left( {\sqrt {{N_{1,2}} + {n_{1,2}}} {\xi_1} + \sqrt {{N_{1,2}} - {n_{1,2}}} {\xi_2}} \right)$$

(8)$$F_{1,2}^y(t )={-} i\sqrt {\frac{{{\beta _{\textrm{sp}}}}}{2}} \left( {\sqrt {{N_{1,2}} + {n_{1,2}}} {\xi_1} - \sqrt {{N_{1,2}} - {n_{1,2}}} {\xi_2}} \right)$$

where β_sp is the spontaneous emission coefficient, and ξ₁ and ξ₂ are Gaussian white noise sources with zero mean and unit variance.
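
For readers who want to experiment with the model, the right-hand sides of Eqs. (2)–(5) and the Langevin terms of Eqs. (7)–(8) can be transcribed as below. This is a sketch that follows the equations as printed in this section, for one VCSEL at a time; the function names, the `PARAMS` dictionary and the way the delayed coupling and injection are passed in as precomputed complex numbers are assumptions. The parameter values are those listed in Section 3, and no integrator or delay buffer is included.

```python
import numpy as np

# Parameter values quoted in Section 3 (rates in ns^-1).
PARAMS = dict(k=300.0, alpha=3.0, gamma_a=0.1, gamma_p=10.0,
              gamma_N=1.0, gamma_s=50.0, mu=1.1, beta_sp=1e-6)

def sfm_rhs(Ex, Ey, N, n, coupled_x=0.0, injected=0.0, p=PARAMS):
    """Deterministic right-hand sides of Eqs. (2)-(5) for one VCSEL.

    coupled_x : k_d * E^x of the other VCSEL at t - tau, times exp(-i w0 tau)
    injected  : k_inj * eps(t), the MZM output of Eq. (6) scaled by k_inj
    """
    gain = p["k"] * (1 + 1j * p["alpha"])
    aniso = p["gamma_a"] + 1j * p["gamma_p"]
    dEx = gain * (N * Ex - Ex + 1j * n * Ey) - aniso * Ex + coupled_x + injected
    dEy = gain * (N * Ey - Ey - 1j * n * Ex) + aniso * Ey
    intensity = abs(Ex) ** 2 + abs(Ey) ** 2
    dN = -p["gamma_N"] * (N * (1 + intensity) - p["mu"]
                          + 1j * n * (Ex * np.conj(Ey) - Ey * np.conj(Ex)))
    dn = -p["gamma_s"] * n - p["gamma_N"] * (
        n * intensity + 1j * N * (Ey * np.conj(Ex) - Ex * np.conj(Ey)))
    # The bracketed products are purely real, so the imaginary part is dropped.
    return dEx, dEy, float(np.real(dN)), float(np.real(dn))

def langevin(N, n, xi1, xi2, beta_sp=PARAMS["beta_sp"]):
    """Noise terms of Eqs. (7)-(8); assumes N >= |n| so the roots are real."""
    a = np.sqrt(beta_sp / 2.0)
    plus, minus = np.sqrt(N + n), np.sqrt(N - n)
    Fx = a * (plus * xi1 + minus * xi2)
    Fy = -1j * a * (plus * xi1 - minus * xi2)
    return Fx, Fy

# With no field and no carriers, only the pump term drives N.
dEx0, dEy0, dN0, dn0 = sfm_rhs(0j, 0j, 0.0, 0.0)
```

A full simulation would additionally need a time-stepping scheme, a buffer holding the field of the other VCSEL over one coupling time τ, and fresh samples of ξ₁ and ξ₂ at every step; those choices are left open here.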

3. Results and discussion

In this section, the performance of MDC-VCSELs in spoken digit recognition is evaluated in detail. In the simulations, the parameters of the two VCSELs are [39]: α = 3, k = 300 ns^-1, γ_a = 0.1 ns^-1, γ_p = 10 ns^-1, γ_N = 1 ns^-1, γ_s = 50 ns^-1, β_sp = 10⁻⁶ ns^-1, ω₀ = 2πc/λ with λ = 850 nm and c = 3×10⁸ m/s, |ε₀| = 1, γ = 0.5, Δf = 0 GHz, and µ = 1.1. Here, θ is fixed at 12.5 ps. The parameters are fixed unless otherwise stated.

First, we investigate the dynamics of the reservoir. It has been reported that, for a reservoir under optical injection, the best RC performance appears at the edge of the injection-locked state for different injection and coupling strengths [40]. Figure 3 shows the dynamic states as a function of the injection strength k_inj and the coupling strength k_d, where IL, P1, P2, MP and CO denote the injection-locked, period-one, period-two, multi-periodic and chaotic states, respectively. For a given injection strength, as k_d increases the reservoir mostly evolves through the IL, P1, P2, MP and CO states in turn. At small injection strengths the dynamics are richer, and the evolution from the MP state to the CO state passes through P1 and P2 states. This analysis of the dynamical states lays the foundation for the subsequent parameter analysis.

Fig. 3. Dynamic states as a function of the injection strengths k_inj and the coupling strengths k_d with Δf = 0 GHz.

Second, the variation of the WER with the number of virtual nodes N_n is analyzed in Fig. 4. As N_n increases, the WER first decreases and then remains at a relatively stable level. Specifically, the WER gradually decreases as N_n increases from 50 to 125. The reason is that a small N_n means a lower-dimensional state space, which makes the training less effective and therefore leads to a larger WER. When N_n ≥ 125, the WER is below 1%. However, since the node interval θ is fixed, increasing N_n also increases the coupling time τ = N_n·θ, which means a longer coupling loop and a slower processing rate. To balance a high processing rate against a low WER, N_n is fixed at 125 in the following discussion.
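
The trade-off between N_n and the processing rate can be made concrete with a short calculation based on θ = τ / N_n and θ = 12.5 ps. The function names are hypothetical, and the assumption that one word occupies K coupling times, with K = 58, is introduced here only for illustration; it is not a figure stated in this work.

```python
def coupling_time_ns(N_n, theta_ps=12.5):
    """Coupling time tau = N_n * theta, returned in nanoseconds."""
    return N_n * theta_ps / 1000.0

def words_per_second(N_n, K, theta_ps=12.5):
    """Rate if one word occupies K coupling times (K is an assumption)."""
    return 1.0 / (K * coupling_time_ns(N_n, theta_ps) * 1e-9)

tau_125 = coupling_time_ns(125)       # 1.5625 ns for the chosen N_n
tau_250 = coupling_time_ns(250)       # doubling N_n doubles tau
rate = words_per_second(125, K=58)    # K = 58 is a hypothetical word length
```

Under these assumptions the rate comes out on the order of 10⁷ words per second, which is the same order of magnitude as the processing rate quoted in the Introduction and Conclusion.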

Fig. 4. WERs of RC systems based on MDC-VCSELs as a function of the number of virtual nodes N_n with k_inj = 100 ns^-1, k_d = 30 ns^-1 and Δf = 0 GHz.

Next, to find suitable parameter regions of k_inj and k_d, the WER map of the RC system based on MDC-VCSELs is displayed in Fig. 5. Different colors correspond to different WER values, and the blank region corresponds to WER values larger than 0.06. Good performance is obtained at a relatively large injection strength k_inj and a small coupling strength k_d, mostly within the band-shaped area between the two red lines. Comparing Fig. 3 with Fig. 5 shows that, for k_inj and k_d within this area, the system operates at the edge of the injection-locked state. That is, at the edge of injection locking the reservoir provides the nonlinearity and memory required by the spoken digit recognition task. The minimum WER is 0.04%, obtained at k_inj = 210 ns^-1 and k_d = 70 ns^-1.

Fig. 5. WERs of the RC system based on MDC-VCSELs in k_inj - k_d parameter space for spoken digit recognition task with Δf = 0 GHz.

Furthermore, the influence of the internal parameters of the MDC-VCSELs on the RC performance is studied in Fig. 6. In this test, two key parameters, γ_a and γ_p, are selected as examples. We denote by γ_a1 (γ_a2) the linear dichroism of VCSEL1 (VCSEL2), and by γ_p1 (γ_p2) the linear birefringence of VCSEL1 (VCSEL2). Since γ_a1 = γ_a2 (γ_p1 = γ_p2) is difficult to satisfy in a practical implementation, it is necessary to analyze the case of γ_a1 ≠ γ_a2 (γ_p1 ≠ γ_p2). The WERs as a function of γ_a2 are shown in Fig. 6(a). Here, γ_a1 is fixed at 0.1 ns^-1 and γ_a2 is varied within [0.09 ns^-1, 0.11 ns^-1], so that the difference (γ_a1 - γ_a2) stays within ±10% of γ_a1. The WERs as a function of γ_p2 are shown in Fig. 6(b). Here, γ_p1 is fixed at 10 ns^-1 and γ_p2 is varied within [9 ns^-1, 11 ns^-1], so that the difference (γ_p1 - γ_p2) stays within ±10% of γ_p1. As shown in Fig. 6, the WERs in both panels fluctuate, but the amplitudes of the fluctuations are small. Therefore, if the difference between γ_a1 and γ_a2 (γ_p1 and γ_p2) is within 10% of γ_a1 (γ_p1), the WER remains at a level similar to that obtained under γ_a1 = γ_a2 (γ_p1 = γ_p2).

Fig. 6. WERs as a function of (a) γ_a2 with γ_a1 = 0.1 ns^-1 and (b) γ_p2 with γ_p1 = 10 ns^-1, under k_inj = 210 ns^-1, k_d = 70 ns^-1 and Δf = 0 GHz.

Finally, we investigate the variation of the WER with the frequency detuning Δf. For simplicity, we assume that the two VCSELs have the same frequency, and Δf is increased from -20 GHz to 30 GHz. The dependence of the WER on Δf is displayed in Fig. 7. With increasing Δf, the WER first fluctuates and then decreases sharply when Δf > -10 GHz. After reaching its minimum, the WER increases with Δf and oscillates within 9 GHz ≤ Δf ≤ 27 GHz. The WER is below 1% for Δf within [-10 GHz, 27 GHz]. Relatively low WERs are obtained around zero detuning, and the minimum WER of 0.02% is obtained at Δf = -2 GHz.

Fig. 7. WERs as a function of the frequency detuning Δf with k_inj = 210 ns^-1 and k_d = 70 ns^-1.

4. Conclusion

In summary, the performance of an RC system based on mutually delay-coupled VCSELs (MDC-VCSELs) under optical injection for the spoken digit recognition task is numerically investigated. In such a system, the two MDC-VCSELs are taken as two nonlinear nodes of the reservoir to perform a nonlinear mapping of the input information. Each spoken digit is preprocessed by two different masks to form two masked matrices, whose column vectors are appended one after another to form two time-dependent series. These series are then injected into the main polarization of the two VCSELs, respectively. The dynamical responses of the two VCSELs are sampled as virtual node states for training and testing. It is found that better performance in the spoken digit recognition task is achieved when the injection strength (k_inj) and the coupling strength (k_d) place the system at the edge of the injection-locked state. In addition, if the mismatch of the internal parameters (γ_a, γ_p) between the two VCSELs is within a small range, the performance is almost unaffected. Moreover, the best processing ability is achieved under weak negative frequency detuning (Δf). The simulation results show that, for a spoken digit recognition task processed at a rate of 1.1×10⁷ words per second, the word error rate can be as low as 0.02% when a dataset consisting of 5000 samples is adopted.

Funding

National Natural Science Foundation of China (61775184, 61875167).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nat. Commun. 2(1), 468 (2011).

2. W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: A new framework for neural computation based on perturbations,” Neural Computation 14(11), 2531–2560 (2002).

3. Y. S. Hou, G. Q. Xia, E. Jayaprasath, D. Z. Yue, W. Y. Yang, and Z. M. Wu, “Prediction and classification performance of reservoir computing system using mutually delay-coupled semiconductor lasers,” Opt. Commun. 433(15), 215–220 (2019).

4. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304(5667), 78–80 (2004).

5. D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, “Isolated word recognition with the liquid state machine: a case study,” Inf. Process. Lett. 95(6), 521–528 (2005).

6. C. Du, F. Cai, M. A. Zidan, W. Ma, S. H. Lee, and W. D. Lu, “Reservoir computing using dynamic memristors for temporal information processing,” Nat. Commun. 8(1), 2204 (2017).

7. Q. Vinckier, F. Duport, A. Smerieri, K. Vandoorne, P. Bienstman, M. Haelterman, and S. Massar, “High-performance photonic reservoir computer based on a coherently driven passive cavity,” Optica 2(5), 438–446 (2015).

8. D. Verstraeten, B. Schrauwen, and D. Stroobandt, “Reservoir-based techniques for speech recognition,” in Proceedings of IJCNN06, International Joint Conference on Neural Networks, 1050–1053 (2006).

9. M. Lukoševičius, H. Jaeger, and B. Schrauwen, “Reservoir computing trends,” Künstl Intell 26(4), 365–371 (2012).

10. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Computer Science Review 3(3), 127–149 (2009).

11. J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, “Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach,” Phys. Rev. Lett. 120(2), 024102 (2018).

12. H. Jaeger, W. Maass, and J. Principe, “Special issue on echo state networks and liquid state machines,” Neural Networks 20(3), 287–289 (2007).

13. F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, “All-optical reservoir computing,” Opt. Express 20(20), 22783–22795 (2012).

14. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” Sci. Rep. 2(1), 287 (2012).

15. D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nat. Commun. 4(1), 1364 (2013).

16. M. C. Soriano, S. Ortín, D. Brunner, L. Larger, C. R. Mirasso, I. Fischer, and L. Pesquera, “Optoelectronic reservoir computing: Tackling noise-induced performance degradation,” Opt. Express 21(1), 12–20 (2013).

17. Q. C. Zhao, H. X. Yin, and H. G. Zhu, “Simultaneous recognition of two channels of optical packet headers utilizing reservoir computing subject to mutual-coupling optoelectronic feedback,” Optik 157, 951–956 (2018).

18. A. Dejonckheere, F. Duport, A. Smerieri, L. Fang, J. L. Oudar, M. Haelterman, and S. Massar, “All-optical reservoir computer based on saturation of absorption,” Opt. Express 22(9), 10868–10881 (2014).

19. K. Hicke, M. A. Escalona-Moran, D. Brunner, and M. C. Soriano, “Information processing using transient dynamics of semiconductor lasers subject to delayed feedback,” IEEE J. Sel. Top. Quantum Electron. 19(4), 1501610 (2013).

20. R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Fast photonic information processing using semiconductor lasers with delayed optical feedback: Role of phase dynamics,” Opt. Express 22(7), 8672–8686 (2014).

21. J. Nakayama, K. Kanno, and A. Uchida, “Laser dynamical reservoir computing with consistency: An approach of a chaos mask signal,” Opt. Express 24(8), 8679–8692 (2016).

22. Y. S. Hou, G. Q. Xia, W. Y. Yang, D. Wang, E. Jayaprasath, Z. F. Jiang, C. X. Hu, and Z. M. Wu, “Prediction performance of reservoir computing system based on a semiconductor laser subject to double optical feedback and optical injection,” Opt. Express 26(8), 10211–10219 (2018).

23. K. Harkhoe, G. Verschaffelt, A. Katumba, P. Bienstman, and G. Van der Sande, “Demonstrating delay-based reservoir computing using a compact photonic integrated chip,” Opt. Express 28(3), 3086–3096 (2020).

24. D. Z. Yue, Z. M. Wu, Y. S. Hou, B. Cui, Y. H. Jin, M. Dai, and G. Q. Xia, “Performance optimization research of reservoir computing system based on an optical feedback semiconductor laser under electrical information injection,” Opt. Express 27(14), 19931–19939 (2019).

25. X. X. Guo, S. Y. Xiang, Y. H. Zhang, L. Lin, A. J. Wen, and Y. Hao, “High-speed neuromorphic reservoir computing based on a semiconductor nanolaser with optical feedback under electrical modulation,” IEEE J. Sel. Top. Quantum Electron. 26(5), 1–7 (2020).

26. A. Argyris, J. Cantero, M. Galletero, E. Pereda, C. R. Mirasso, I. Fischer, and M. C. Soriano, “Comparison of photonic reservoir computing systems for fiber transmission equalization,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–9 (2020).

27. A. Uchida, Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics and Synchronization (Wiley-VCH, 2012).

28. K. Kitayama, M. Notomi, M. Naruse, K. Inoue, S. Kawakami, and A. Uchida, “Novel frontier of photonics for data processing - Photonic accelerator,” APL Photonics 4(9), 090901 (2019).

29. C. Sugano, K. Kanno, and A. Uchida, “Reservoir computing using multiple lasers with feedback on a photonic integrated circuit,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–9 (2020).

30. J. Vatin, D. Rontani, and M. Sciamanna, “Enhanced performance of a reservoir computer using polarization dynamics in VCSELs,” Opt. Lett. 43(18), 4497–4500 (2018).

31. S. Y. Xiang, W. Pan, B. Luo, L. Yan, X. Zou, N. Jiang, L. Yang, and H. Zhu, “Unpredictability-enhanced chaotic vertical-cavity surface-emitting lasers with variable-polarization optical feedback,” J. Lightwave Technol. 29(14), 2173–2179 (2011).

32. X. X. Guo, S. Y. Xiang, Y. H. Zhang, L. Lin, A. J. Wen, and Y. Hao, “Four-channels reservoir computing based on polarization dynamics in mutually coupled VCSELs system,” Opt. Express 27(16), 23293–23306 (2019).

33. “Texas Instruments-Developed 46-Word Speaker-Dependent Isolated Word Corpus (TI46),” NIST Speech Disc 7-1.1 (1 disc), (1991).

34. R. F. Lyon, “A computational model of filtering, detection, and compression in the cochlea,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 7, 1282–1285 (1982).

35. J. Martin-Regalado, F. Prati, M. San Miguel, and N. B. Abraham, “Polarization properties of vertical-cavity surface-emitting lasers,” IEEE J. Quantum Electron. 33(5), 765–783 (1997).

36. C. Masoller and N. B. Abraham, “Low-frequency fluctuations in vertical-cavity surface-emitting semiconductor lasers with optical feedback,” Phys. Rev. A 59(4), 3021–3031 (1999).

37. R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Simultaneous computation of two independent tasks using reservoir computing based on a single photonic nonlinear node with optical feedback,” IEEE Trans. Neural Netw. Learning Syst. 26(12), 3301–3307 (2015).

38. X. S. Tan, Y. S. Hou, Z. M. Wu, and G. Q. Xia, “Parallel information processing by a reservoir computing system based on a VCSEL subject to double optical feedback and optical injection,” Opt. Express 27(18), 26070–26079 (2019).

39. I. Gatare, M. Sciamanna, A. Locquet, and K. Panajotov, “Influence of polarization mode competition on the synchronization of two unidirectionally coupled vertical-cavity surface-emitting lasers,” Opt. Lett. 32(12), 1629–1631 (2007).

40. F. Hua, N. Fang, and L. T. Wang, “Method of selecting operating point of reservoir computing system based on semiconductor lasers,” Acta Phys. Sin. 68(22), 224205 (2019).

Optics Continuum