Nanophotonic inverse design with deep neural networks based on knowledge transfer using imbalanced datasets

Abstract

Deep neural networks (DNNs) have been used as a new method for nanophotonic inverse design. However, DNNs need a huge dataset for training if materials must be selected from a material library during the inverse design. This puts the DNN method in a dilemma: either poor performance with a small training dataset, or loss of its short-design-time advantage, because collecting a large amount of data is time consuming. In this work, we propose a multi-scenario training method for the DNN model using imbalanced datasets. The imbalanced datasets used by our method are nearly four times smaller than those required by other training methods. We believe that as the material library grows, the advantage of the imbalanced datasets will become more obvious. Using the high-precision predictive DNN model obtained by this new method, different multilayer nanoparticles and multilayer nanofilms have been designed with a hybrid optimization algorithm combining a genetic algorithm and gradient descent optimization. The advantage of our method is that it can freely select discrete materials from the material library and simultaneously find the inverse design of the discrete material types and continuous structural parameters of nanophotonic devices.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Nanophotonic inverse design is an important way to design photonic devices for a desired performance [1]. Usually, a designer first picks some free parameters based on nanophotonic theory and then optimizes these parameters using different algorithms [2–5]. Although this inverse design approach has been used to design many nanophotonic devices [3], it is inefficient and searches only a small parameter space of the devices. Deep neural networks (DNNs), owing to their remarkable success in many realms of science and engineering, have emerged as a new tool in nanophotonic inverse design [1,3]. DNNs are a data-driven method with advantages in searching the full parameter space of fabricable nanophotonic devices [6,7]. They have been used in the inverse design of various nanophotonic structures, including multilayer nanostructures [6–8], metamaterials [9–12], photonic crystals [13,14] and other photonic devices [15–17]. In these studies, DNNs serve as approximate predictors of light-matter interaction phenomena, and using them instead of expensive numerical simulation of the Maxwell equations can greatly reduce the overall design time [6,11]. However, training DNNs requires a large training dataset, usually on the scale of $10^4$ samples. Most studies only consider nanostructures made of one or two specific materials, because selecting materials from a material library demands a huge training dataset [6–8]. For example, if we select an ordered pair of two materials out of 16, there are $A(16,2) = 16 \times 15 = 240$ arrangements, so we need $A(16,2) \times 10^4 = 2.4 \times 10^6$ training samples. As the material library grows, the required training dataset increases dramatically. In this case DNNs face a dilemma: poor performance with a small training dataset, or loss of the short-design-time advantage, because collecting a large amount of data is time consuming.

To alleviate the poor performance of direct learning from a small dataset, knowledge transfer learning has been used to effectively improve the performance of target DNN models in different optical scenarios [18–20]. Qu et al. demonstrated that, whether in similar optical scenarios or in very different ones, DNNs with knowledge transfer learning give more accurate predictions than direct learning when using the same small training datasets [18]. However, they only considered two materials for these related tasks, and the problem of the large datasets required by combinations of materials drawn from a material library remains unsolved. Another great challenge is that when materials are selected from a library, the material is a discrete parameter indexed by a number in the design space, while other structural parameters such as thickness are continuous values. Simultaneous inverse design of the discrete material types and continuous structural parameters of nanophotonic devices is difficult for DNNs [21].

To address the above challenges, we propose training a multi-scenario DNN model with imbalanced datasets: small datasets (on the scale of $10^3$) for some optical scenarios and normal datasets (on the scale of $10^4$) for the others. We can therefore keep the overall training dataset at a suitable size, especially for the many scenarios formed by combinations of materials selected from the material library. Unlike training the model directly with small datasets, multi-scenario training improves the accuracy of the model even for the scenarios trained with small datasets. Using the high-precision predictive DNN model obtained by multi-scenario training, we develop a hybrid optimization algorithm for the inverse design of multilayer nanoparticles and multilayer nanofilms, which can freely select discrete materials from the material library and simultaneously find the inverse design of continuous and discrete optical parameters.

2. DNNs models and datasets

In this work, we consider two completely different optical scenarios: 3-layer nanoparticles and 12-layer nanofilms (as shown in Fig. 1). In both cases the structures consist of alternating layers of different materials. For each of these two cases, we use 4 different metals (Ag, Au, Cu, and Al) and 4 different dielectric materials (TiO2, SiC, Si3N4, and SiO2) to compose 16 types of optical scenarios (hereafter referred to as similar multi-scenarios). To distinguish these similar multi-scenarios, we use different combinations of the discrete values 0 and 1 to represent the materials of the different scenarios, as shown in Table 1. The optical properties of the metals are characterized by the Lorentz-Drude model [22,23], and the wavelength-dependent refractive indices of TiO2 and SiO2 are taken from Poudel et al. [24], SiC from Wang et al. [25], and Si3N4 from Luke et al. [26].

Fig. 1. Schematic diagram for DNNs used to approximate the extinction (transmittance) spectra of multilayer nanoparticles (nanofilms). The discrete inputs represent material type for different scenarios, and the continuous inputs are structural parameters of the nanostructures.

Table 1. Binary symbols for 16 scenarios with different materials
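Since Table 1 is not reproduced inline here, the following minimal Python sketch only illustrates the encoding idea, under the assumption that each of the 4 metals and each of the 4 dielectrics is indexed by a 2-bit code; the material ordering and bit assignment below are illustrative assumptions, not necessarily those of Table 1.

```python
# Illustrative sketch of the Table 1 idea: each of the 4 metals and 4
# dielectrics gets a 2-bit code, yielding 16 scenario labels that are fed
# to the DNN as discrete inputs. The ordering below is assumed.
METALS = ["Ag", "Au", "Cu", "Al"]
DIELECTRICS = ["TiO2", "SiC", "Si3N4", "SiO2"]

def scenario_code(metal, dielectric):
    """Return the 4-bit scenario label as a list of 0/1 values."""
    m1 = METALS.index(metal)             # discrete material index m1
    m2 = DIELECTRICS.index(dielectric)   # discrete material index m2
    return [m1 >> 1, m1 & 1, m2 >> 1, m2 & 1]

print(scenario_code("Au", "SiO2"))  # -> [0, 1, 1, 1]
```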

We first generated 176000 samples for these 16 similar multi-scenarios, 11000 samples per scenario, via the transfer matrix method (for the 3-layer nanoparticles) [6] and the rigorous coupled wave analysis (RCWA) method (for the 12-layer nanofilms) [27]. In the transfer matrix method, the field is decomposed into transverse electric (TE) and transverse magnetic (TM) channels, and the total scattering cross section is the sum over all channels of the TE and TM polarizations; a channel order below 30 is usually sufficient for the spectrum to converge. The maximum number of Fourier expansion orders in the RCWA is set to 80, which ensures convergence of the spectrum. We randomly selected 1000 samples from each scenario as the test dataset, and the remaining samples were randomly divided into training (90%) and validation (10%) datasets. The detailed training process of the DNNs is described in Supplement 1. Inspired by the work of Sitzmann et al. [28], we compared several activation functions and chose the sine activation function, which performed best during DNN training, as shown in Supplement 1, Fig. A.1. We further compared several network frameworks of different sizes and depths; Network 2 (refer to Supplement 1) is selected for the rest of this paper. The cost function is expressed as follows:

$$Error = \frac{1}{N}\sum_{i = 1}^{N} \left( \left| R_i - R_i^{\prime} \right| \right)^2,$$
where $N$ is the batch size of the network, set to 100 here. The Adam optimizer [29] is used to update the parameters of the network by minimizing the difference between $R$ (the actual spectrum from the training dataset) and $R^{\prime}$ (the spectrum predicted by the DNN). The hidden layers of the DNN model form a six-layer fully connected network with 250, 500, 500, 500, 500, and 250 neurons, respectively; a sketch of this setup is given at the end of this section. Next, we demonstrate the training of this neural network framework with imbalanced datasets.
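As a concrete illustration, the following is a minimal PyTorch sketch of the predictive network and one training step, assuming the architecture and batch size described above; the learning rate, input dimension, and number of spectrum points are illustrative assumptions not specified in the text.

```python
import torch
import torch.nn as nn

class Sine(nn.Module):
    """Sine activation, following Sitzmann et al. [28]."""
    def forward(self, x):
        return torch.sin(x)

def make_dnn(n_inputs: int, n_outputs: int) -> nn.Sequential:
    # Six fully connected hidden layers with 250-500-500-500-500-250 neurons;
    # inputs are the discrete material bits plus the continuous thicknesses,
    # outputs are the sampled spectrum points.
    widths = [n_inputs, 250, 500, 500, 500, 500, 250]
    layers = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(w_in, w_out), Sine()]
    layers.append(nn.Linear(250, n_outputs))  # linear output layer: predicted R'
    return nn.Sequential(*layers)

model = make_dnn(n_inputs=7, n_outputs=200)  # dimensions are illustrative
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is an assumption

def train_step(x, r):
    """One Adam update on a batch x (100 samples) against target spectra r."""
    optimizer.zero_grad()
    loss = ((model(x) - r).abs() ** 2).mean()  # Eq. (1), averaged over the batch
    loss.backward()
    optimizer.step()
    return loss.item()
```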

3. Training with imbalanced datasets

The average test errors of the DNN model are shown in Fig. 2. There are three average error levels. STL900 is the average error over the 16 similar multi-scenarios when a DNN is trained with 900 samples of each scenario; this error is clearly the highest. MTL900 is the average error over the 16 similar multi-scenarios when one DNN is trained with 16 × 900 samples from all the scenarios. Note that STL900 uses 16 different DNN models, one per scenario, while MTL900 uses a single DNN model shared by all scenarios. This multi-scenario training reduces the error of the model, but because the dataset is still relatively small, the average error remains relatively large. If we increase the training datasets to the scale of $10^4$, the average error decreases significantly, as shown by STL9000, where each DNN is trained with 9000 samples of its scenario. To reduce the overall size of the training data, we trained the DNN model with imbalanced datasets: we randomly select one, two or three scenarios from the 16 similar multi-scenarios, take 9000 samples from each selected scenario, and take 900 samples from each of the remaining scenarios (a sketch of this assembly is given below). The resulting average errors are shown by the three error bars in Fig. 2. These errors can be reduced to a level close to STL9000, which needs 16 × 9000 samples in total. Compared with the samples required by STL9000, the maximum number of samples in the imbalanced datasets (3 × 9000 + 13 × 900 = 38700) is nearly four times smaller. We believe that as the material library grows, the advantage of the imbalanced datasets will become more obvious.
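A minimal Python sketch of assembling one such imbalanced training set; the random-sampling details are our assumption, and `samples_by_scenario` is a hypothetical container of the pre-generated data.

```python
import random

def imbalanced_dataset(samples_by_scenario, k=3, n_rich=9000, n_small=900, seed=0):
    """Take n_rich samples from k randomly chosen scenarios, n_small from the rest.

    `samples_by_scenario` is a hypothetical dict mapping each of the 16
    scenario labels to its pre-generated (parameters, spectrum) pairs.
    """
    rng = random.Random(seed)
    rich = set(rng.sample(list(samples_by_scenario), k))
    dataset = []
    for scenario, samples in samples_by_scenario.items():
        n = n_rich if scenario in rich else n_small
        dataset += rng.sample(samples, n)
    rng.shuffle(dataset)  # mix scenarios so every batch sees several of them
    return dataset

# For k = 3: 3 * 9000 + 13 * 900 = 38700 samples, versus 16 * 9000 = 144000
# for balanced STL9000 training, i.e. nearly four times fewer.
```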

Fig. 2. Comparison of average test errors of DNNs model for (a) multilayer nanoparticles and (b) multilayer nanofilms. The DNNs model is trained with different datasets. STL900 is the result of the DNNs trained with 900 samples of each scenario. MTL900 is the average error when the DNNs is trained with 16 × 900 samples of all the scenarios. STL9000 takes 9000 samples from each scenario. The three error bars are the result of the DNNs trained with the imbalanced datasets.

The performance improvement of the DNN model trained with the imbalanced datasets comes from the multi-scenario training. The multiple scenarios interact with each other, which reduces the risk of the DNN getting stuck in the local optimal solution of a single task and biases it toward learning a shared representation useful for all scenarios. Multi-scenario training can be viewed as a way of achieving inductive transfer between scenarios [30]: for each scenario, the other scenarios act as its inductive bias, which effectively improves the generalization accuracy of the learned DNN model for that scenario.

4. Transfer learning

The above results show effective inductive transfer among the 16 similar multi-scenarios. Next, we demonstrate that this transfer also works between two completely different optical scenarios by using network-based knowledge transfer learning [19], where the source scenario is the 3-layer nanoparticles and the target scenario is the 12-layer nanofilms. The schematic diagram of network-based knowledge transfer learning is shown in Fig. 3. The TransferNet and the BaseNet have the same network structure, but the initial weights and biases of some hidden layers in the TransferNet are transferred from the BaseNet. The BaseNet is trained with the imbalanced multi-scenario method described above on the source scenarios. The TransferNet is then trained on the target scenarios: part of the initial weights and biases of its hidden layers is copied from the BaseNet, the remaining hidden layers are randomly initialized, and the entire TransferNet is fine-tuned with the back-propagation algorithm on the target-scenario datasets.

Fig. 3. Schematic diagram of network-based knowledge transfer learning. The top row is the base network (BaseNet), which is trained with the imbalanced multi-scenario datasets method on the source scenarios. The bottom row is the transfer network (TransferNet), part of whose weights and biases are copied directly from the BaseNet; the weights and biases of the remaining layers are randomly initialized, and the entire network is updated by fine-tuning.

In general, the first several layers of a deep neural network are feature extraction layers that extract high-level features of the input training data, while the subsequent layers identify information specific to the task [18–20]. Intuitively, it makes sense that the first few layers extract high-level representations that carry over between tasks rather than specific knowledge about the relationship between input attributes and optical responses. Therefore, we only transfer layers 2 to 4 of the BaseNet and do not consider other combinations of hidden layers in this work; a minimal sketch of this transfer step is given below. The first layer is not chosen because the input attributes have different dimensions for the 3-layer nanoparticles and the 12-layer nanofilms. The results of transferring knowledge from the BaseNet to the TransferNet, compared with direct learning, are shown in Fig. 4. Model 1, Model 2 and Model 3 refer to the models with the smallest average test error among the BaseNet models trained with one, two and three 9000-sample scenarios, respectively (refer to Fig. 2).
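A minimal sketch of the transfer step, reusing the make_dnn helper from the training sketch in Section 2; the input/output dimensions are again illustrative assumptions.

```python
import torch  # make_dnn is defined in the Section 2 sketch above

# Copy hidden layers 2-4 from the BaseNet into the TransferNet; all other
# layers keep their random initialization, and the whole TransferNet is then
# fine-tuned by back-propagation on the target-scenario data.
base_net = make_dnn(n_inputs=7, n_outputs=200)       # source: 3-layer nanoparticles
transfer_net = make_dnn(n_inputs=16, n_outputs=200)  # target: 12-layer nanofilms

# In the Sequential built by make_dnn, hidden layer k is the Linear module at
# index 2*(k-1). Layers 2-4 have identical shapes in both networks
# (250->500, 500->500, 500->500), so their parameters copy directly;
# layer 1 cannot be copied because the input dimensions differ.
for k in (2, 3, 4):
    idx = 2 * (k - 1)
    transfer_net[idx].load_state_dict(base_net[idx].state_dict())

# Fine-tune all parameters (the text fine-tunes the entire TransferNet):
optimizer = torch.optim.Adam(transfer_net.parameters(), lr=1e-4)
```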

Fig. 4. Comparison of average test errors for transfer learning with different BaseNets versus direct learning with different amounts of training data. The number of training samples for transfer learning is 144000, and the initial weights and biases of hidden layers 2 to 4 of the TransferNet are copied from the corresponding layers of the BaseNet.

Compared to direct learning, the average test error of the TransferNet is reduced when the initial parameters of layers 2 to 4 are copied from the BaseNets. Transfer learning performs differently depending on the source model. Nevertheless, the best TransferNet performance is equivalent to direct learning with between 288000 and 432000 training samples. This is very promising for reducing the dependence of neural networks on training data, especially when data collection is complex and expensive. Although we cannot know in advance which transfer will perform best, we can select the best model from the test results. Here, we choose the model with the minimum average test error (Transfer from Model 2 in Fig. 4) for the inverse design of multilayer nanofilms.

5. Nanophotonic inverse design

Once we have a high-precision predictive DNN model, we can use it for the inverse design of nanophotonic structures. Recall, however, that the material is a discrete parameter indexed by ${m_1}$ and ${m_2}$ (Table 1), while the thickness is a continuous value that varies within an interval. Simultaneous inverse design of discrete material parameters and continuous thickness parameters of nanophotonic structures is often difficult, because discrete material parameters pose a classification problem while continuous thickness parameters pose a regression problem [21,31–33]. So et al. circumvented this problem with a loss function that combines weighted classification and regression losses [21]. In this work, we propose a hybrid optimization algorithm combining a metaheuristic genetic algorithm [34,35] with a gradient descent optimization algorithm [36,37] for the inverse design of multilayer nanoparticles and multilayer nanofilms. This method allows us to freely select discrete materials from the material library for optical inverse design.

To find at least one near-optimal solution when solutions exist, and to simultaneously handle continuous and discrete optical parameters, a genetic algorithm with binary coding can be used effectively [38]. However, binary-coded genetic algorithms (BCGAs) converge slowly in local search and have difficulty reaching a high-precision solution. As shown in Fig. 5, after the binary string of the thickness is decoded, we can use the hybrid optimization method for simultaneous inverse design of discrete and continuous optical parameters: first, the BCGA finds a suitable solution whose neighborhood may contain a near-optimal solution, and then the gradient optimization algorithm exploits this neighborhood for rapid convergence. A compact sketch of the two stages is given below.
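The following Python sketch shows the two stages under the encoding of Fig. 5, assuming `model` is the trained predictive DNN from Section 2 and `target` is the desired spectrum; `run_bcga` is a hypothetical GA driver whose selection, crossover, and mutation details are elided, and the learning rate and iteration count are assumptions.

```python
import torch

N_LAYERS, BITS = 3, 5  # 3-layer nanoparticles; 5 bits per thickness, 0-90 nm (see below)

def decode(genome):
    """Split a binary genome into discrete material bits and thicknesses in nm."""
    mats = genome[:4]  # discrete material bits (m1, m2)
    ths = [int("".join(map(str, genome[4 + i*BITS : 4 + (i+1)*BITS])), 2)
           / (2**BITS - 1) * 90.0  # 5-bit code mapped to the 0..90 nm range
           for i in range(N_LAYERS)]
    return mats, ths

def fitness(genome):
    mats, ths = decode(genome)
    x = torch.tensor(mats + ths, dtype=torch.float32)
    return -((model(x) - target) ** 2).mean().item()  # negative spectral error

# Stage 1: the binary-coded GA locates a promising neighborhood.
best = run_bcga(fitness, genome_length=4 + N_LAYERS * BITS)  # hypothetical helper

# Stage 2: gradient descent refines the continuous thicknesses through the
# differentiable DNN, with the discrete material bits frozen at the GA result.
mats, ths = decode(best)
d = torch.tensor(ths, requires_grad=True)
opt = torch.optim.Adam([d], lr=0.5)  # learning rate is an assumption
for _ in range(500):
    opt.zero_grad()
    x = torch.cat([torch.tensor(mats, dtype=torch.float32), d])
    loss = ((model(x) - target) ** 2).mean()
    loss.backward()
    opt.step()
    with torch.no_grad():
        d.clamp_(0.0, 90.0)  # keep thicknesses in the designed range
```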

Fig. 5. The encoding of the material and thickness parameters in the genetic operations, and the corresponding values recognized by the DNN.

The detailed execution process of the hybrid optimization algorithm is given in Supplement 1. Here, we use 5 bits to represent each thickness parameter (${d_n}$), ranging from 0 to 90 nm. We randomly selected some physically realizable spectra from the test dataset of the multilayer nanoparticles for inverse design; the results are shown in Fig. 6. The designed spectra are calculated by the transfer matrix method [6], and the corresponding design parameters are given in the caption below each figure. The trained model even handles all-dielectric or all-metal nanospheres, as shown in Fig. 6(c) and (d). When the binary parameter space is expanded from the 3-layer nanoparticles to the 12-layer nanofilms, the hybrid optimization algorithm remains robust and effective; the results are shown in Fig. 7, with design parameters summarized in Table 2. See Supplement 1 for more examples.

Fig. 6. Inverse design results for target spectra randomly selected from the test datasets of multilayer nanoparticles. (c) and (d) are special cases of all-metal and all-dielectric nanospheres.

Fig. 7. Inverse design results for target spectra randomly selected from test datasets of multilayer nanofilms.

Table 2. Design parameters for the examples in Fig. 7

Further, we demonstrate the effectiveness of the proposed hybrid optimization method for arbitrarily shaped target spectra. Unlike a physically realizable target spectrum, a specific target spectrum may not correspond to any set of design parameters within a given range of material and thickness parameters. Our proposed method can nevertheless find a set of design parameters whose spectrum is close to the target spectrum, as shown in Fig. 8. The target spectra are generated using formulas with different parameters (refer to Supplement 1). For more examples, including target spectra of many different shapes, please refer to Supplement 1.

Fig. 8. Inverse design results for arbitrary Lorentzian-shaped target spectra. Design parameters of the 3-layer nanoparticles are given in the caption below each figure.

6. Conclusion

In summary, we propose a multi-scenario training method for DNN models using imbalanced datasets. Compared with the conventional training method, the imbalanced datasets used by our method are nearly four times smaller. We further demonstrated that this method also applies to two completely different optical scenarios through network-based knowledge transfer learning. With the high-precision predictive DNN model obtained by this training method, we performed the inverse design of different multilayer nanoparticles and multilayer nanofilms using a hybrid optimization algorithm combining a genetic algorithm and gradient descent optimization, which can freely select discrete materials from the material library and simultaneously find the inverse design of continuous and discrete optical parameters. Our method shows great potential for inverse design problems as the material library grows.

Funding

Natural Science Foundation of Guangdong Province (2017A030310064); National Natural Science Foundation of China (61307080).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. S. Molesky, Z. Lin, A.Y. Piggott, W. Jin, J. Vucković, and A.W. Rodriguez, “Inverse Design in Nanophotonics,” Nature Photon. 12(11), 659–670 (2018). [CrossRef]  

2. P.R. Wiecha, A. Arbouet, C. Girard, and O.L. Muskens, “Deep Learning in Nano-Photonics: Inverse Design and Beyond,” Photon. Res. 9(5), B182–B200 (2021). [CrossRef]  

3. K. Yao, R. Unni, and Y. Zheng, “Intelligent Nanophotonics: Merging Photonics and Artificial Intelligence at the Nanoscale,” Nanophotonics 8(3), 339–366 (2019). [CrossRef]

4. F. Yang, A. Hwang, C. Doiron, and G.V. Naik, “Non-Hermitian Metasurfaces for the Best of Plasmonics and Dielectrics,” Opt. Mater. Express 11(7), 2326–2334 (2021). [CrossRef]

5. A. Michaels and E. Yablonovitch, “Inverse Design of Near Unity Efficiency Perfectly Vertical Grating Couplers,” Opt. Express 26(4), 4766–4779 (2018). [CrossRef]

6. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B.G. DeLacy, J.D. Joannopoulos, M. Tegmark, and M. Soljačić, “Nanophotonic Particle Simulation and Inverse Design Using Artificial Neural Networks,” Sci. Adv. 4(6), eaar4206 (2018). [CrossRef]  

7. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training Deep Neural Networks for the Inverse Design of Nanophotonic Structures,” ACS Photonics 5, 1365–1369 (2018). [CrossRef]  

8. R. Unni, K. Yao, and Y. Zheng, “Deep Convolutional Mixture Density Network for Inverse Design of Layered Photonic Structures,” ACS Photonics 7(10), 2703–2712 (2020). [CrossRef]  

9. W. Ma, F. Cheng, and Y. Liu, “Deep-Learning-Enabled On-Demand Design of Chiral Metamaterials,” ACS Nano 12(6), 6326–6334 (2018). [CrossRef]  

10. Y. Kiarashinejad, S. Abdollahramezani, and A. Adibi, “Deep Learning Approach Based on Dimensionality Reduction for Designing Electromagnetic Nanostructures,” Npj Comput Mater 6(1), 12 (2020). [CrossRef]  

11. S. An, B. Zheng, M.Y. Shalaginov, H. Tang, H. Li, L. Zhou, J. Ding, A.M. Agarwal, C. Rivero-Baleine, M. Kang, K.A. Richardson, T. Gu, J. Hu, C. Fowler, and H. Zhang, “Deep Learning Modeling Approach for Metasurfaces with High Degrees of Freedom,” Opt. Express 28(21), 31932–31942 (2020). [CrossRef]  

12. J. Ma, Y. Huang, M. Pu, D. Xu, J. Luo, Y. Guo, and X. Luo, “Inverse Design of Broadband Metasurface Absorber Based on Convolutional Autoencoder Network and Inverse Design Network,” J. Phys. D: Appl. Phys. 53(46), 464002 (2020). [CrossRef]  

13. R. Singh, A. Agarwal, and B.W. Anthony, “Mapping the Design Space of Photonic Topological States via Deep Learning,” Opt. Express 28(19), 27893–27902 (2020). [CrossRef]

14. C.-X. Liu, G.-L. Yu, and G.-Y. Zhao, “Neural Networks for Inverse Design of Phononic Crystals,” AIP Advances 9(8), 085223 (2019). [CrossRef]  

15. M.H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, and K. Parsons, “Deep Neural Network Inverse Design of Integrated Photonic Power Splitters,” Sci Rep 9(1), 1368 (2019). [CrossRef]  

16. K. Kojima, Y. Tang, T. Koike-Akino, Y. Wang, D. Jha, K. Parsons, M.H. Tahersima, F. Sang, J. Klamkin, and M. Qi, “Inverse Design of Nanophotonic Devices Using Deep Neural Networks,” in: Asia Communications and Photonics Conference/International Conference on Information Photonics and Optical Communications 2020 (ACP/IPOC), OSA, Beijing, 2020: pp. 1–3.

17. T. Zhang, J. Wang, Q. Liu, J. Zhou, J. Dai, X. Han, Y. Zhou, and K. Xu, “Efficient Spectrum Prediction and Inverse Design for Plasmonic Waveguide Systems Based on Artificial Neural Networks,” Photon. Res. 7(3), 368–380 (2019). [CrossRef]  

18. Y. Qu, L. Jing, Y. Shen, M. Qiu, and M. Soljačić, “Migrating Knowledge between Physical Scenarios Based on Artificial Neural Networks,” ACS Photonics 6(5), 1168–1174 (2019). [CrossRef]  

19. C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A Survey on Deep Transfer Learning,” in V. Kůrková, Y. Manolopoulos, B. Hammer, L. Iliadis, and I. Maglogiannis (eds.), Artificial Neural Networks and Machine Learning – ICANN 2018, vol. 11141, Springer International Publishing, Cham, 2018, pp. 270–279. [CrossRef]

20. D. Xu, Y. Luo, J. Luo, M. Pu, Y. Zhang, Y. Ha, and X. Luo, “Efficient Design of a Dielectric Metasurface with Transfer Learning and Genetic Algorithm,” Opt. Mater. Express 11, 1852–1862 (2021). [CrossRef]  

21. S. So, J. Mun, and J. Rho, “Simultaneous Inverse Design of Materials and Structures via Deep Learning: Demonstration of Dipole Resonance Engineering Using Core–Shell Nanoparticles,” ACS Appl. Mater. Interfaces 11, 24264–24268 (2019). [CrossRef]

22. A.D. Rakić, A.B. Djurišić, J.M. Elazar, and M.L. Majewski, “Optical Properties of Metallic Films for Vertical-Cavity Optoelectronic Devices,” Appl. Opt. 37(22), 5271–5283 (1998). [CrossRef]  

23. B. Ung and Y. Sheng, “Interference of Surface Waves in a Metallic Nanoslit,” Opt. Express 15(3), 1182–1190 (2007). [CrossRef]  

24. K.N. Poudel and W.M. Robertson, “Maximum Length Sequence Dielectric Multilayer Reflector,” OSA Continuum 1(2), 358–372 (2018). [CrossRef]  

25. S. Wang, M. Zhan, G. Wang, H. Xuan, W. Zhang, C. Liu, C. Xu, Y. Liu, Z. Wei, and X. Chen, “4H-SiC: A New Nonlinear Material for Midinfrared Lasers,” Laser Photonics Rev. 7(5), 831–838 (2013). [CrossRef]  

26. K. Luke, Y. Okawachi, M.R.E. Lamont, A.L. Gaeta, and M. Lipson, “Broadband Mid-Infrared Frequency Comb Generation in a Si3N4 Microresonator,” Opt. Lett. 40(21), 4823–4826 (2015). [CrossRef]  

27. V. Liu and S. Fan, “S4: A Free Electromagnetic Solver for Layered Periodic Structures,” Comput. Phys. Commun. 183(10), 2233–2244 (2012). [CrossRef]  

28. V. Sitzmann, J. Martel, A. Bergman, D. Lindell, and G. Wetzstein, “Implicit Neural Representations with Periodic Activation Functions,” in: H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (eds.), Advances in Neural Information Processing Systems, Curran Associates, Inc., 2020: pp. 7462–7473. https://proceedings.neurips.cc/paper/2020/file/53c04118df112c13a8c34b38343b9c10-Paper.pdf.

29. D.P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” in: 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.

30. D.L. Silver, R. Poirier, and D. Currie, “Inductive Transfer with Context-Sensitive Neural Networks,” Mach Learn 73, 313–336 (2008). [CrossRef]  

31. S. So, D. Lee, T. Badloe, and J. Rho, “Inverse Design of Ultra-Narrowband Selective Thermal Emitters Designed by Artificial Neural Networks,” Opt. Mater. Express 11(7), 1863–1873 (2021). [CrossRef]  

32. E. Vahidzadeh and K. Shankar, “Artificial Neural Network-Based Prediction of the Optical Properties of Spherical Core–Shell Plasmonic Metastructures,” Nanomaterials 11(3), 633 (2021). [CrossRef]  

33. A.-K.S.O. Hassan, A.S.A. Mohamed, M.M.T. Maghrabi, and N.H. Rafat, “Optimal Design of One-Dimensional Photonic Crystal Filters Using Minimax Optimization Approach,” Appl. Opt. 54(6), 1399–1409 (2015). [CrossRef]  

34. E. Kerrinckx, L. Bigot, M. Douay, and Y. Quiquempois, “Photonic Crystal Fiber Design by Means of a Genetic Algorithm,” Opt. Express 12(9), 1990–1994 (2004). [CrossRef]  

35. V. Egorov, M. Eitan, and J. Scheuer, “Genetically Optimized All-Dielectric Metasurfaces,” Opt. Express 25(3), 2583–2593 (2017). [CrossRef]  

36. T.W. Hughes, M. Minkov, I.A.D. Williamson, and S. Fan, “Adjoint Method and Inverse Design for Nonlinear Nanophotonic Devices,” ACS Photonics 5(12), 4781 (2018). [CrossRef]  

37. J.S. Jensen and O. Sigmund, “Topology Optimization for Nano-Photonics,” Laser & Photon. Rev. 5(2), 308–321 (2011). [CrossRef]  

38. Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer Berlin Heidelberg, 2013. https://books.google.com.hk/books?id=JmyrCAAAQBAJ.
