Ultra-compact polarimeter based on a plasmonic spiral assisting by machine learning

Boxuan Zhou; Yu Yu; Wenhao Wu; Xinliang Zhang

doi:10.1364/OSAC.2.003343

1. Introduction

The state of polarization (SOP) is one of the basic properties of a monochromatic electromagnetic wave, along with phase and frequency, and it has been studied and exploited in fields ranging from astronomy [1] and remote sensing [2] to material analysis [3]. To quantify the SOP, the relative Stokes vector [4] with three Stokes parameters has been introduced. Conventionally, a series of polarizers and wave plates has to be arranged properly and several rounds of measurement are needed to obtain the Stokes parameters, which makes the system inconvenient to use. Furthermore, such systems are usually complex and bulky.

In order to detect the SOP with only one chip, different types of metastructures have been proposed [5–8]. Typically, incident light with different SOPs is guided into different directions. Other designs use polarization-sensitive nanostructures that generate unidirectional surface plasmon polariton (SPP) waves [9,10]. However, the devices mentioned above are usually spatially inefficient or require complex nanostructures, making them difficult to design and fabricate. Without resorting to complex structures, the plasmonic spiral lens consisting of only an Archimedes’ spiral has attracted great interest for years, mainly for its ability to distinguish circular polarizations [11–15]. With a left-handed spiral plasmonic lens, a bright focal spot appears at the center of the transmission pattern under right-handed circular polarization (RCP) illumination, while a dark center is observed under left-handed circular polarization (LCP) illumination. However, this property cannot be used to directly detect the SOP of elliptically polarized light.

To tackle this issue, machine learning is introduced here. Recently, machine learning has been widely applied in nanophotonics, for example using geometric deep learning to facilitate the design process [16]. In this paper, a machine learning algorithm is used to uncover the correlation between near-field intensity patterns and input SOPs. Specifically, a 4-turn Archimedes’ spiral slot is designed on a 600 nm thick Au film with a footprint of 10 × 10 µm$^2$. By using near-field intensity images with known incident SOPs to train a three-layer neural network (NN), this spiral structure can detect any SOP in one shot, unlike previous similar schemes that work only for circular polarization. Although the device is designed for a wavelength of 980 nm, it remains applicable over a broad band from 900 to 1100 nm, showing a high tolerance to wavelength shift. Simulated results show that the training process converges in only about 250 epochs and that the detection error on the testing-set is 1.23e-3, measured by the mean-squared error (MSE) between the predicted SOPs (represented by Stokes parameters) and the real values.

2. Theories and methods

In this part, the theoretical SPP field distribution of the designed spiral structure is derived in order to verify that it is suitable for machine-learning-based SOP detection. The architecture of the NN and the data generation process are also discussed.

2.1 Theoretical models

A left-handed 4-turn Archimedes’ spiral slot is shown in Fig. 1(a). The slot penetrates a 600 nm thick Au film on a glass substrate. The whole footprint is 10 µm in length and 10 µm in width, and the structure is illuminated at normal incidence by a plane wave at 980 nm. The slit width of each turn of the spiral is chosen to be 250 nm. As shown in Fig. 1(b), the spiral is described in cylindrical coordinates as:

(1)$${r_n}(\varphi )= {r_{n0}} + \frac{\varphi }{{2\pi }}{\lambda _{spp}},\; \textrm{for}\; 0 < {\varphi } < 2{\pi }\textrm{, }\textrm{n} = 1,2,3,4\; ,$$

Fig. 1. (a) The Au spiral structure on a glass substrate illuminated normally by different polarization states. (b) A left hand 4-turn spiral slot penetrated through an Au film. (c) Single turn spiral.

where ${r_n}(\varphi )$ is the distance from the point (${r_n}$, $\varphi$) on the $n$th turn of the spiral slot to the center point and ${r_{n0}}$ is the starting distance of the $n$th turn from the center. In particular, the starting radius ${r_{10}}$ is set to 1 µm and the SPP wavelength ${\lambda _{spp}}$ is 920 nm. It should be noted that the pitch of the spiral slot is set equal to ${\lambda _{spp}}$, i.e., 920 nm. According to Ref. [13], a spiral with multiple turns can enhance the plasmonic field and significantly increase its sensitivity to the incident SOP compared with a single-turn spiral. On the other hand, because the proposed polarimeter is meant to be ultra-compact, with a footprint of only 10 µm × 10 µm, a 4-turn spiral structure is selected.
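As a rough illustration of Eq. (1), the sketch below evaluates the slot centerline for the four turns using the values quoted above (starting radius 1 µm, pitch 920 nm). The text does not give ${r_{n0}}$ for n > 1, so the sketch assumes that each turn starts where the previous one ends; the function names `spiral_radius` and `slot_centerline` are illustrative only, and the sense of rotation that corresponds to a left-handed spiral is not addressed.

```python
import math

LAMBDA_SPP_UM = 0.920  # SPP wavelength = spiral pitch (from the text)
R10_UM = 1.0           # starting radius of the first turn (from the text)
N_TURNS = 4            # number of turns (from the text)


def spiral_radius(n: int, phi: float) -> float:
    """Radius r_n(phi) in micrometres for turn n (1..4), following Eq. (1).

    Assumption (not stated in the text): r_n0 = r_10 + (n - 1) * lambda_spp,
    so that each turn starts where the previous one ends.
    """
    if not 1 <= n <= N_TURNS:
        raise ValueError("n must be between 1 and N_TURNS")
    if not 0.0 <= phi <= 2.0 * math.pi:
        raise ValueError("phi must lie in [0, 2*pi]")
    r_n0 = R10_UM + (n - 1) * LAMBDA_SPP_UM
    return r_n0 + phi / (2.0 * math.pi) * LAMBDA_SPP_UM


def slot_centerline(n: int, samples: int = 181):
    """Cartesian (x, y) points along turn n, e.g. for plotting or meshing."""
    points = []
    for k in range(samples):
        phi = 2.0 * math.pi * (k / (samples - 1))
        r = spiral_radius(n, phi)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points


outer_radius = spiral_radius(N_TURNS, 2.0 * math.pi)
print(f"outermost centerline radius: {outer_radius:.2f} um")
```

Under this assumption the outermost centerline radius is 1 + 4 × 0.92 = 4.68 µm, which stays inside the 5 µm half-width of the 10 µm × 10 µm footprint.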

To show theoretically that such a spiral structure is sensitive to the SOP, in other words that its transmitted intensity pattern changes with the incident SOP, the theoretical field distribution is derived. For brevity, we only treat a single-turn spiral, as shown in Fig. 1(c), since the major difference between multi-turn and single-turn spirals is only the enhancement of the plasmonic field [13]. According to Ref. [11], the near-field distribution of this spiral structure illuminated by LCP is given as:

(2)$$\overrightarrow {{E_{spp}}} ({R,\theta } )= \overrightarrow {{e_z}} 2\pi {E_{0z}}{r_0}{e^{ - {k_z}z}}{e^{i{k_r}{r_0}}}{J_0}({{k_r}R} )$$

Similarly, the near-field distribution of the same structure illuminated by RCP is derived as:

(3)$$\overrightarrow {{E_{spp}}} ({R,\theta } )= \overrightarrow {{e_z}} 2\pi {E_{0z}}{r_0}{e^{ - {k_z}z}}{e^{i{k_r}{r_0}}}{e^{2i\theta }}{J_2}({{k_r}R} )$$

Here, the near-field distribution for a general SOP can be calculated as a superposition of the two fields above, because every SOP can be written as a superposition of two orthogonal circularly polarized components, as indicated in Eq. (4). Specifically, in Jones vector form:

(4)$$\left[ {\begin{array}{c} A\\ B \end{array}} \right] = \textrm{C}{e^{i{\varphi _1}}}\left[ {\begin{array}{c} {\frac{{\sqrt 2 }}{2}}\\ { - \frac{{\sqrt 2 }}{2}i} \end{array}} \right] + \textrm{C}{e^{i{\varphi _2}}}\left[ {\begin{array}{c} {\frac{{\sqrt 2 }}{2}}\\ {\frac{{\sqrt 2 }}{2}i} \end{array}} \right]$$

where $\left[ {\begin{array}{c} {\frac{{\sqrt 2 }}{2}}\\ { - \frac{{\sqrt 2 }}{2}i} \end{array}} \right]$ denotes LCP, $\left[ {\begin{array}{c} {\frac{{\sqrt 2 }}{2}}\\ {\frac{{\sqrt 2 }}{2}i} \end{array}} \right]$ denotes RCP and $\left[ {\begin{array}{c} A\\ B \end{array}} \right]$ can be any SOP. Therefore, for any input SOP, the field distribution in the cylindrical coordinates shown in Fig. 1(c) is given by:

(5)$$\begin{aligned}{{\vec{E}}_{spp}} &= {e^{i{\varphi _2}}}C^{\prime}\overrightarrow {{e_z}} {r_0}{e^{ - {k_z}Z}}{e^{i{K_{SPP}}{r_0}}}{J_0}({{K_{SPP}}R} )+ {e^{i{\varphi _1}}}C^{\prime}\overrightarrow {{e_z}} {r_0}{e^{ - {k_z}Z}}{e^{i{K_{SPP}}{r_0}}}{e^{2i\theta }}{J_2}({{K_{SPP}}R} )\\ &= C^{\prime}\overrightarrow {{e_z}} {r_0}{e^{ - {k_z}Z}}{e^{i{K_{SPP}}{r_0}}}[{e^{i{\varphi _2}}}{J_0}({{K_{SPP}}R} )+ {e^{i{\varphi _1}}}{e^{2i\theta }}{J_2}({{K_{SPP}}R} )], \end{aligned}$$

where $C^{\prime}$ is a constant, ${J_0}$ is the zeroth-order Bessel function and ${J_2}$ is the second-order Bessel function. However, only the intensity distribution $\textrm{I}({\textrm{R},{\theta }} )$, described in Eq. (6), rather than the electric field distribution, is available in practical applications. In this sense, the phase information of the field is lost and the SOP cannot be read directly from the measurement.

(6)$$\textrm{I} = {\vec{E}_{spp}} \cdot \vec{E}_{spp}^{\,\ast }$$
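To make Eqs. (5) and (6) more concrete, the following sketch evaluates the bracketed term of Eq. (5) and its squared magnitude on a single point (R, θ). It is only an illustration under stated assumptions: the common prefactor is dropped, the SPP wavenumber is taken as 2π/λ_spp with losses neglected, and the Bessel functions are computed from Bessel's integral so that only the Python standard library is needed. The names `bessel_j` and `relative_intensity` are hypothetical helpers.

```python
import cmath
import math

LAMBDA_SPP_UM = 0.920                  # SPP wavelength from the text
K_SPP = 2.0 * math.pi / LAMBDA_SPP_UM  # assumed lossless SPP wavenumber, rad/um


def bessel_j(order: int, x: float, steps: int = 400) -> float:
    """J_order(x) from Bessel's integral (1/pi) * int_0^pi cos(order*t - x*sin(t)) dt,
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(order * t - x * math.sin(t))
    return total / steps


def relative_intensity(radius_um: float, theta: float, phi1: float, phi2: float) -> float:
    """Squared magnitude of the bracket in Eq. (5):
    |exp(i*phi2) * J0(k*R) + exp(i*phi1) * exp(2i*theta) * J2(k*R)|^2."""
    j0 = bessel_j(0, K_SPP * radius_um)
    j2 = bessel_j(2, K_SPP * radius_um)
    field = cmath.exp(1j * phi2) * j0 + cmath.exp(1j * (phi1 + 2.0 * theta)) * j2
    return abs(field) ** 2


# Same radius, two azimuths: the pattern is not rotationally symmetric.
print(relative_intensity(0.3, 0.0, 0.0, 0.0))
print(relative_intensity(0.3, math.pi / 2.0, 0.0, 0.0))
```

In this expression θ enters only through the combination 2θ + φ1 − φ2, so a change in the relative phase of the two circular components rotates the intensity pattern about the center. This is one way to see why the intensity image still carries SOP information even though the phase of the field itself is not recorded.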

To retrieve SOPs from intensity patterns alone, it is necessary to find a function that maps the intensity distribution $\textrm{I}$ to the incident relative Stokes vector S = [s1, s2, s3]. Unlike traditional regression approaches, machine learning is a flexible technique that applies to different problems regardless of regression type and is better suited to processing large amounts of data. Therefore, an NN-based machine learning approach has clear advantages for finding the unknown mapping function here.
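The two SOP representations used in this section, the circular decomposition of Eq. (4) and the relative Stokes vector, can be related with a few lines of code. The sketch below is an illustration only: it allows general complex weights for the two circular components, and the sign convention for s3 (chosen so that the Jones vector labelled RCP in the text gives s3 = +1) is an assumption, since the text does not state one. The helper names are hypothetical.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)
LCP = (INV_SQRT2, -1j * INV_SQRT2)  # Jones vector labelled LCP in the text
RCP = (INV_SQRT2, 1j * INV_SQRT2)   # Jones vector labelled RCP in the text


def circular_components(jones):
    """Complex weights (c_l, c_r) such that jones = c_l * LCP + c_r * RCP."""
    a, b = complex(jones[0]), complex(jones[1])
    c_l = INV_SQRT2 * (a + 1j * b)  # inner product of LCP with the input
    c_r = INV_SQRT2 * (a - 1j * b)  # inner product of RCP with the input
    return c_l, c_r


def relative_stokes(jones):
    """Relative Stokes vector (s1, s2, s3), normalised by the total intensity.

    Assumed sign convention: s3 = +1 for the Jones vector the text calls RCP.
    """
    a, b = complex(jones[0]), complex(jones[1])
    s0 = abs(a) ** 2 + abs(b) ** 2
    if s0 == 0.0:
        raise ValueError("the Jones vector must be non-zero")
    cross = a * b.conjugate()
    s1 = (abs(a) ** 2 - abs(b) ** 2) / s0
    s2 = 2.0 * cross.real / s0
    s3 = -2.0 * cross.imag / s0
    return s1, s2, s3


# An elliptical state: its circular weights have unequal magnitudes.
elliptical = (1.0, 0.5j)
print(circular_components(elliptical))
print(relative_stokes(elliptical))
```

With this convention, s3 equals the normalised difference between the powers in the two circular components, so a label vector [s1, s2, s3] and a pair of complex circular weights describe the same fully polarized state up to a global phase.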

2.2 Data set generation and network architecture

As shown in Figs. 2(a) to 2(d), the data set contains pixelated near-field intensity patterns with known SOPs as labels. Specifically, 700 near-field intensity patterns of the spiral device under arbitrarily polarized illumination are collected using the finite-difference time-domain (FDTD) method (commercial FDTD simulation software). Of these, 300 samples are randomly selected as the training-set, 100 as the validation-set and 300 as the testing-set. In the simulation, a plane wave at 980 nm is chosen as the light source and the calculation region is 15 µm × 15 µm × 5 µm, surrounded by a perfectly matched layer (PML) that serves as the absorbing boundary. The monitor, which has exactly the same footprint as the spiral, is located 100 nm above the Au film. Then, the center region (2 µm × 2 µm) of each near-field intensity pattern, consisting of 2500 pixels, is extracted. Finally, every pixel value of each sample is normalized into the range [0, 2], and the pattern is flattened into a one-dimensional vector. The relative Stokes vector is used to label each sample.

To reduce the complexity of design and computation, a feed-forward network with a single hidden layer, as shown in Fig. 2(e), is developed with the Neural Network Toolbox in MATLAB. It consists of three layers, namely the input layer, the hidden layer and the output layer. The activation function of the input and hidden layers is sigmoid and that of the output layer is linear. There are 2500 nodes in the input layer, which are fed with the data set described above. As the near-field intensity pattern of this spiral device is highly sensitive to the incident SOP, a feed-forward NN with high detection accuracy can already be obtained using a training-set of only 300 samples, and thus the computation required for training is acceptable even with a large number of input nodes. However, it should be noted that when a huge amount of training data is required and the number of input features is also large, appropriate dimensionality reduction (DR) of the input, such as an autoencoder architecture [16–18], becomes necessary to reduce the computational complexity. The last layer has 3 nodes that output the predicted relative Stokes parameters. As for the hidden layer, the detection accuracy of NNs with different numbers of hidden nodes is evaluated on the testing-set, as discussed later.
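The data preparation steps described above (random 300/100/300 split, cropping the central 2500 pixels, normalizing and flattening) might be sketched as follows. This is not the authors' pipeline: the per-sample min-max scaling, the 50 × 50 crop size implied by 2500 pixels, the 250 × 250 monitor image in the example and all function names are assumptions made for illustration.

```python
import random


def split_indices(n_total=700, n_train=300, n_val=100, n_test=300, seed=0):
    """Randomly partition sample indices into training, validation and testing sets."""
    if n_train + n_val + n_test != n_total:
        raise ValueError("split sizes must add up to n_total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def crop_center(image, size):
    """Return the central size x size window of a 2-D list of pixel values."""
    rows, cols = len(image), len(image[0])
    top, left = (rows - size) // 2, (cols - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def normalize_and_flatten(patch, low, high):
    """Min-max scale the pixels of one sample into [low, high], then flatten row by row."""
    flat = [value for row in patch for value in row]
    v_min, v_max = min(flat), max(flat)
    if v_max == v_min:
        return [low for _ in flat]  # constant image: nothing to scale
    scale = (high - low) / (v_max - v_min)
    return [low + (value - v_min) * scale for value in flat]


# Synthetic stand-in for one simulated monitor image (hypothetical 250 x 250 grid).
image = [[float(r * 250 + c) for c in range(250)] for r in range(250)]
LOW, HIGH = 0.0, 2.0  # normalization range quoted in the text
features = normalize_and_flatten(crop_center(image, 50), LOW, HIGH)
train_idx, val_idx, test_idx = split_indices()
print(len(features), len(train_idx), len(val_idx), len(test_idx))
```

Fixing the random seed keeps the three subsets reproducible, which matters when the testing-set is later reused to compare networks with different hidden-layer sizes.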

Fig. 2. (a) to (d) Simulated near-field intensity distributions for horizontal, 45°, left-handed circular and right-handed circular polarization. Intensity patterns obtained under arbitrary SOPs are used as the training-set. (e) NN architecture with three layers, including one hidden layer. The input layer has 2500 nodes that correspond to the 2500 pixels of each intensity image, and the 3 nodes of the output layer correspond to the 3 Stokes parameters.
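The architecture in Fig. 2(e) can be written out as a plain forward pass. The sketch below is a standard-library illustration, not the MATLAB toolbox network used in the paper: the weights are random and untrained, the input layer simply passes the pixel values on (the sketch applies the sigmoid in the hidden layer only, which is one reading of the description), and the hidden-layer size of 22 is the value reported in Section 3. All names are hypothetical.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 2500, 22, 3


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    limit = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def forward(pixels, hidden_layer, output_layer):
    """Map a flattened intensity image to three predicted Stokes parameters."""
    w1, b1 = hidden_layer
    w2, b2 = output_layer
    hidden = [sigmoid(sum(w * p for w, p in zip(row, pixels)) + b)
              for row, b in zip(w1, b1)]
    # Linear output layer: weighted sum of hidden activations plus bias.
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


rng = random.Random(0)
hidden_layer = init_layer(N_INPUT, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_OUTPUT, rng)
pixels = [rng.random() for _ in range(N_INPUT)]  # stand-in for one normalized sample
prediction = forward(pixels, hidden_layer, output_layer)
n_params = N_INPUT * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUT + N_OUTPUT
print(len(prediction), n_params)
```

With 22 hidden nodes this network has 55,091 trainable parameters, almost all of them in the first layer, which is why input dimensionality dominates the training cost.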

3. Results and discussion

Here, the NN is trained and the size of its hidden layer is optimized. Furthermore, the detection accuracy of the Stokes parameters and the operation bandwidth are discussed to demonstrate the reliability of our method. First, the network is trained on the training-set with the scaled conjugate gradient algorithm [19], which is typically used for fast training. With an early stopping technique, the validation-set is used to measure the generalization error of the network at each training iteration, so that training halts when the generalization error stops decreasing. A testing-set consisting of 300 randomly selected samples (intensity patterns of the spiral device under different polarized illumination) is used to test the robustness and reliability of NNs with different numbers of nodes in the hidden layer. Specifically, to quantify the estimation error and reflect the performance of the trained NN, the MSE over the N samples (N = 300) in the testing-set is computed as shown in Eq. (7),

(7)$$\textrm{MSE} = \frac{1}{N}\sum\limits_{i = 1}^{N} \sum\limits_{j = 1}^{M} {({{P^i}[j ]- {L^i}[j ]} )^2}$$

where ${P^i}$ and ${L^i}$ denote the predicted and labeled relative Stokes vectors with M parameters (M = 3) of the $i$th sample, and a low MSE means a high detection accuracy. As shown in Fig. 3(a), the MSE remains low when the number of hidden nodes is over 15, but increases greatly when the hidden layer has only a small number of nodes. The number of hidden nodes is optimized to be 22. It should also be noted that the early stopping technique used here can reduce the amount of computation significantly. In this case, the training process finishes at the 252nd iteration, when the MSE of the validation-set reaches its minimum value of 9.23e-4, as illustrated in Fig. 3(b). The MSE between the predicted SOPs (represented by Stokes parameters) and the real values ends up at 8.11e-4 for the training-set, while that of the testing-set also reaches a low value of 1.23e-3. It is reasonable that the MSE of the testing-set is slightly higher than that of the training-set because the NN is optimized only on the training-set. Although some detection error remains, the overall accuracy of the feed-forward NN is satisfactory and might be further improved by collecting more training samples and optimizing the machine learning algorithm.
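The error metric of Eq. (7) is short enough to state directly in code. The sketch below follows the equation as written, summing the squared differences over the three Stokes parameters of each sample and dividing by the number of samples N only; the function name `stokes_mse` and the toy vectors are illustrative.

```python
def stokes_mse(predicted, labeled):
    """Eq. (7): sum of squared component errors per sample, averaged over N samples."""
    if not predicted or len(predicted) != len(labeled):
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for p_vec, l_vec in zip(predicted, labeled):
        if len(p_vec) != len(l_vec):
            raise ValueError("each prediction must match its label in length")
        total += sum((p - l) ** 2 for p, l in zip(p_vec, l_vec))
    return total / len(predicted)


# Two toy samples: the first is predicted exactly, the second is off in s2 and s3.
predicted = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labeled = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(stokes_mse(predicted, labeled))
```

Because the sum over the M = 3 components is not divided by M, this value is three times larger than a per-component mean-squared error computed on the same data, which is worth keeping in mind when comparing the reported numbers with other tools.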

Fig. 3. (a) The MSE of testing-set for NNs with different number of nodes in the hidden layer. (b) Training process represented by MSE of the training-set and validation-set for every training iteration.
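The early stopping rule behind Fig. 3(b) can be sketched as a small function that scans the validation error per iteration and stops once it has failed to improve for a given number of consecutive iterations. The `patience` parameter and the example error sequence are hypothetical; the text does not state which stopping threshold was used.

```python
def early_stopping(validation_errors, patience):
    """Return (best_iteration, stop_iteration) for a sequence of validation errors.

    Training stops at the first iteration that lies `patience` steps past the
    best validation error seen so far; indices are zero-based.
    """
    if not validation_errors:
        raise ValueError("validation_errors must not be empty")
    if patience < 1:
        raise ValueError("patience must be at least 1")
    best_iteration, best_error = 0, float("inf")
    for iteration, error in enumerate(validation_errors):
        if error < best_error:
            best_iteration, best_error = iteration, error
        elif iteration - best_iteration >= patience:
            return best_iteration, iteration
    return best_iteration, len(validation_errors) - 1


# Made-up validation curve: improves until index 2, then drifts upward.
curve = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.3]
print(early_stopping(curve, patience=3))
```

The weights kept for testing would be those from `best_iteration`, not from the iteration at which training actually halts.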

The final results of the Stokes parameter detection are given in Figs. 4(a) and 4(b), where the predicted ${[{{S_1},{S_2},{S_3}} ]^T}$ of the 300 samples in the testing-set are compared with the real values. For clarity, Fig. 4(b) is plotted on the Poincaré sphere. In all cases, the predicted results agree well with the real Stokes parameters. Although some detection error remains, it can be reduced by further optimizing the machine learning algorithm or collecting more training samples.

Fig. 4. (a) Measurement of the state of polarization ${[{{S_1},{S_2},{S_3}} ]^T}$ of 300 selected polarizations using the spiral polarimeter (orange) with their real Stokes parameters (blue). (b) Comparison between measured Stokes parameters (orange) using our Stokes analyzer and their real values (blue) on a Poincaré sphere.

Notably, even though the plasmonic spiral structure is designed for a wavelength of 980 nm, it is still applicable at other wavelengths. Concretely, we calculate the correlation coefficient (R) [20] between the intensity distribution matrix at λ = 980 nm and those at other wavelengths, as shown in Fig. 5. When R is close to 1, the intensity patterns at the other wavelength are similar to those at 980 nm. Conversely, as the wavelength moves away from the design wavelength of 980 nm, R approaches 0 and the device can no longer be used to detect SOPs because it becomes insensitive to the SOP at such wavelengths. It turns out that the same algorithm for polarization analysis can be shared within a broad wavelength range from 900 to 1100 nm.
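A minimal sketch of the comparison described above, assuming that R is the Pearson correlation coefficient of Ref. [20] computed over all pixels of two intensity matrices of equal size. The function name and the small example matrices are illustrative only and do not reproduce the simulated data.

```python
import math


def correlation_coefficient(pattern_a, pattern_b):
    """Pearson correlation between two equally sized 2-D intensity matrices."""
    flat_a = [v for row in pattern_a for v in row]
    flat_b = [v for row in pattern_b for v in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("patterns must be non-empty and of equal size")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(flat_a, flat_b))
    var_a = sum((a - mean_a) ** 2 for a in flat_a)
    var_b = sum((b - mean_b) ** 2 for b in flat_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant pattern")
    return cov / math.sqrt(var_a * var_b)


reference = [[1.0, 2.0], [3.0, 4.0]]     # stand-in for the 980 nm pattern
rescaled = [[3.0, 5.0], [7.0, 9.0]]      # same shape, different brightness
inverted = [[4.0, 3.0], [2.0, 1.0]]      # contrast-reversed pattern
print(correlation_coefficient(reference, rescaled))
print(correlation_coefficient(reference, inverted))
```

Because the Pearson coefficient ignores an overall offset and scale, a pattern that only becomes brighter or dimmer at another wavelength still gives R close to 1; R drops only when the spatial distribution itself changes.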

Fig. 5. Correlation coefficients (R) between the intensity distribution matrix at a wavelength of 980 nm and those at other wavelengths.

4. Conclusion

In summary, we propose a plasmonic spiral device, assisted by a machine learning algorithm, to retrieve the Stokes parameters of light with an arbitrary incident SOP. The whole chip has a footprint of 10 × 10 µm$^2$, which is extremely space-efficient compared with traditional polarimeters. After showing analytically that the proposed structure is strongly polarization sensitive, we use a machine learning algorithm to find the hidden correlation between near-field intensity patterns and Stokes parameters. We successfully use such a simple spiral structure to detect any SOP, and a bandwidth larger than 200 nm is found for the proposed scheme.

Funding

National Natural Science Foundation of China (61775073, 61911530161, 61922034); Program for HUST Academic Frontier Youth Team (2018QYTD08).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

References

1. A. Schwope and J. Tinbergen, Astronomical Polarimetry (Cambridge University Press, 1996).

2. G. Vasile, E. Trouvé, J.-S. Lee, and V. Buzuloiu, “Intensity-driven adaptive-neighborhood technique for polarimetric and interferometric SAR parameters estimation,” IEEE Trans. Geosci. Electron. 44(6), 1609–1621 (2006).

3. S. A. Hall, M.-A. Hoyle, J. S. Post, and D. K. Hore, “Combined stokes vector and Mueller matrix polarimetry for materials characterization,” Anal. Chem. 85(15), 7613–7619 (2013).

4. H. G. Berry, G. Gabrielse, and A. Livingston, “Measurement of the Stokes parameters of light,” Appl. Opt. 16(12), 3200–3205 (1977).

5. D. Wen, F. Yue, S. Kumar, Y. Ma, M. Chen, X. Ren, P. E. Kremer, B. D. Gerardot, M. R. Taghizadeh, and G. S. Buller, “Metasurface for characterization of the polarization state of light,” Opt. Express 23(8), 10272–10281 (2015).

6. A. Pors, M. G. Nielsen, and S. I. Bozhevolnyi, “Plasmonic metagratings for simultaneous determination of Stokes parameters,” Optica 2(8), 716–723 (2015).

7. S. Wei, Z. Yang, and M. Zhao, “Design of ultracompact polarimeters based on dielectric metasurfaces,” Opt. Lett. 42(8), 1580–1583 (2017).

8. W. Wu, Y. Yu, W. Liu, and X. Zhang, “Fully integrated CMOS compatible polarization analyzer,” Nanophotonics 8(3), 467–474 (2019).

9. K. Lee, H. Yun, S. E. Mun, G. Y. Lee, J. Sung, and B. Lee, “Ultracompact broadband plasmonic polarimeter,” Laser Photonics Rev. 12(3), 1700297 (2018).

10. J. B. Mueller, K. Leosson, and F. Capasso, “Ultracompact metasurface in-line polarimeter,” Optica 3(1), 42–47 (2016).

11. S. Yang, W. Chen, R. L. Nelson, and Q. Zhan, “Miniature circular polarization analyzer with spiral plasmonic lens,” Opt. Lett. 34(20), 3047–3049 (2009).

12. W. Chen, R. L. Nelson, and Q. Zhan, “Efficient miniature circular polarization analyzer design using hybrid spiral plasmonic lens,” Opt. Lett. 37(9), 1442–1444 (2012).

13. J. Miao, Y. Wang, C. Guo, Y. Tian, S. Guo, Q. Liu, and Z. Zhou, “Plasmonic lens with multiple-turn spiral nano-structures,” Plasmonics 6(2), 235–239 (2011).

14. J. Zhang, Z. Guo, R. Li, W. Wang, A. Zhang, J. Liu, S. Qu, and J. Gao, “Circular polarization analyzer based on the combined coaxial Archimedes’ spiral structure,” Plasmonics 10(6), 1255–1261 (2015).

15. Q. Zhang, P. Li, Y. Li, X. Ren, and S. Teng, “A universal plasmonic polarization state analyzer,” Plasmonics 13(4), 1129–1134 (2018).

16. Y. Kiarashinejad, M. Zandehshahvar, S. Abdollahramezani, O. Hemmatyar, R. Pourabolghasem, and A. Adibi, “Knowledge Discovery In Nanophotonics Using Geometric Deep Learning.” arXiv preprint arXiv:1909.07330 (2019).

17. Y. Kiarashinejad, S. Abdollahramezani, and A. Adibi, “Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures.” arXiv preprint arXiv:1902.03865 (2019).

18. Y. Kiarashinejad, S. Abdollahramezani, M. Zandehshahvar, O. Hemmatyar, and A. Adibi, “Deep Learning Reveals Underlying Physics of Light–Matter Interactions in Nanophotonic Devices,” Adv. Theory Simul. 2(9), 1900088–0390 (2019).

19. M. F. Møller, “A scaled conjugate gradient algorithm for fast supervised learning,” Neural Netw. 6(4), 525–533 (1993).

20. J. Lee Rodgers and W. A. Nicewander, “Thirteen ways to look at the correlation coefficient,” Am. Stat. 42(1), 59–66 (1988).

OSA Continuum