Optica Publishing Group

Event recognition method based on feature synthesizing for a zero-shot intelligent distributed optical fiber sensor

Open Access

Abstract

The phase-sensitive optical time domain reflectometer (Φ-OTDR) is an emergent distributed optical sensing system with the advantages of high localization accuracy and high sensitivity. It has been widely used for intrusion identification, pipeline monitoring, underground tunnel monitoring, etc. Deep learning-based classification methods work well for Φ-OTDR event recognition tasks when sufficient samples are available; however, a lack of training samples is sometimes a serious problem for these data-driven algorithms. This paper proposes a novel feature synthesizing approach to solve this problem. A mixed class approach and a reinforcement learning-based guided training method are proposed to realize high-quality feature synthesis. Experimental results on an eight-class event classification task, including one unknown class, show that the proposed method achieves an average classification accuracy of 42% on the unknown class while identifying its event type, together with a 74% average overall classification accuracy. These figures are 29% and 7% higher, respectively, than those of the ordinary instance synthesizing method. Moreover, this is the first time that a Φ-OTDR system recognizes a specific event and reports its event type without collecting its data samples in advance.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

The phase-sensitive optical time-domain reflectometer (Φ-OTDR) was first proposed in 1993 [1]. Based on a reflectometer structure and highly coherent probe light, the Φ-OTDR sensing system offers distributed sensing, a long sensing range, high sensitivity, acoustic waveform detection, and precise positioning, making it suitable for many field applications, such as intrusion monitoring for large infrastructures, pipeline monitoring, underwater seismic signal detection, seismic wave prediction, and monitoring of underground tunnels and submarine power cables [2–7]. In addition, owing to the characteristics of fiber optic materials, Φ-OTDR is immune to electromagnetic interference, low in cost, and well suited to long-distance measurement. The long sensing range demands strong event recognition ability. Traditional event recognition methods are based on manual feature extraction, which usually works well in laboratory tests but lacks generalization and robustness in field applications. With the development of artificial intelligence, event recognition methods based on machine learning and deep learning have been introduced for Φ-OTDR and show good identification ability and robustness in field tests. Sun et al. [8] extracted feature vectors from morphological features to solve a classification task with three types of vibration events. Jia et al. [9] proposed a vibration event recognition method based on a nearest-neighbor classification support vector machine. However, these machine learning and deep learning classifiers are data-driven and require a training dataset that includes all event types. It is difficult to obtain such an all-covering training dataset with a large number of samples in field applications. Sometimes, we only know the abstract concept of an event but cannot obtain its data samples in advance.
For example, it is very hard, expensive, or dangerous to collect data samples of a landslide, an earthquake, or a bomb explosion for long-range pipeline monitoring. And sometimes, limited by experimental conditions or costs, the event types available in advance for training are limited. If a certain event type has no data samples in the training dataset, these data-driven recognition methods cannot recognize that event type in tests. A general view of an outdoor intrusion recognition system with an unknown event type is shown in Fig. 1.

Fig. 1. Diagram of outdoor intrusion recognition system.

Zero-shot learning (ZSL) is an effective solution for situations where no sample instances of a class are available. The goal of ZSL is to classify unknown classes by learning mapping relations from known categories to unknown classes [10,11]. ZSL has been applied in several domains and typically reaches about 30%–50% classification accuracy on unknown classes. Zhang et al. [12] proposed a general probabilistic method that learns a joint latent similarity embedding for both source and target domains, which reached 29.15% classification accuracy on the Caltech-UCSD-Birds dataset. Kodirov et al. [13] proposed a method that uses the target labels’ projection in the semantic space to regularize the learned projection, reaching 40.6% classification accuracy on the Caltech-UCSD-Birds dataset. Al-Halah et al. [14] proposed an unsupervised ZSL method that predicts class-attribute associations, reaching 37% classification accuracy on the APascal-aYahoo dataset. Guo et al. [15] proposed an approach that turns the ZSL problem into a conventional supervised learning problem by synthesizing samples for the unseen classes, reaching 55.75% on the Caltech-UCSD-Birds dataset.

However, in the field of fiber optic sensing applications there is no suitable semantic space that adequately describes the sensing events. Some work on new-event recognition and the open-set problem for distributed optical fiber sensors has been reported. Wu et al. [16] proposed a multi-task learning network for simultaneous event recognition and localization, which can also recognize new events after parameter fine-tuning; however, it still needs data samples of the new event to retrain the network. Rizzo et al. [17] proposed a network combining a feature extractor, a region proposal network, and a detection head to detect and recognize known and unknown events in OTDR. Zhou et al. [18] proposed a residual-learning convolutional neural network to recognize unknown events. These works [17,18] do not need unknown data samples in the training stage, but they treat all events outside the training set as a single unknown class. Thus, they can only tell that an event does not belong to a known class, not what the new event is; this is also a common disadvantage of clustering algorithms. Knowing the type of a new event is important for deciding whether it deserves further attention. Inspired by ZSL, we propose a novel method that constructs training samples of unknown classes using a mixed class approach (MC) and a reinforcement learning-based guided training method (RLGTM), solving the problem of missing unknown-class samples during training. The proposed method can then recognize the event type of the missing class. The experimental results show that, in a task of classifying eight event classes with one unknown class, the proposed method identifies the unknown event type with an average accuracy of 42%. It is the first time that a distributed fiber sensing system recognizes unknown (non-training) event types and reports their event type.

2. Methodology and mathematical model

2.1 Methodology

We first introduce the concept of the proposed method and then show its details. Traditional deep learning networks are capable of recognizing all input classes by extracting specific features and distinguishing among them. They use a feature extraction network (FEN) to extract the signal features. However, some of these features are unnecessary for synthesizing unknown-class samples. Therefore, we would like a FEN that focuses only on extracting features related to the unknown class while blurring the features that do not belong to it. This makes the extracted features purer, containing only the target features required for synthesizing unknown-class samples. However, it is difficult to make such a FEN converge with traditional training methods. Therefore, we propose RLGTM to train and obtain these FENs. The primary goal of reinforcement learning is to optimize actions through rewards associated with state-action pairs. In this paper, each data sample is considered an agent state, and predicting the sample label is considered an action. A positive reward is given when the correct action (correct label) is assigned to a state, and a negative reward is added when a wrong action (wrong label) is taken. The objective of maximizing the cumulative rewards drives the strategy model to converge. By iteratively exploring and exploiting these interactions, the system acquires an optimal policy corresponding to different states [19]. With the help of reinforcement learning, the model converges toward the desired goal through a unique reward and exploration mechanism.

To better extract the target features, the MC method is proposed. Neural networks can be understood as extracting features of the input, but which features are extracted cannot be directly controlled. When humans try to recognize an object, they remember the features that recur; likewise, a feature that repeats more frequently in a large amount of data is more likely to be learned. Thus, in the MC method, several classes are mixed into one mixed class. This highlights the common features among these classes while weakening the marginal features of the mixed sub-classes (MSCs). If the MSCs and the unknown class share the same common features, these extracted common features can be used to synthesize fake unknown-class samples in the feature domain. To enhance the realism of the synthesized samples, it is better to choose semantically similar event classes as the MSCs.

The overview of the proposed method is shown in Fig. 2. It consists of five steps: (1) signal collection; (2) semantic space development and MC selection; (3) FEN training through RLGTM; (4) feature synthesis and classifier training; (5) offline test. A vibration near the sensing fiber causes a probe light phase change in the fiber, and the phase change is converted into light intensity fluctuations in the Rayleigh backscattered traces (RBTs). By rearranging the RBTs, a temporal-spatial data matrix is obtained, whose horizontal rows stand for the spatial domain and whose vertical columns stand for the temporal domain. The local temporal-spatial data matrix containing the vibration information is processed into a grayscale image and used as the event data sample [20,21]. The semantic space is developed by describing the events from different aspects, and the MCs are selected based on event similarity in the semantic space. Several FENs are trained by the RLGTM approach to extract different features of the unknown class. The data samples of the unknown class can then be synthesized in the feature domain from the features of known-class samples, based on the transformation relationship obtained from the semantic space. A FEN obtained by RLGTM can only identify its MC; thus, another special FEN (SFEN) is constructed by the normal training method to recognize the other known classes. During unknown-class sample synthesis, one sample from each MC group is selected and input into the corresponding FEN. A sample from a known class passes through both the RLGTM-trained FENs and the SFEN to form the final sample feature (FSF) of the known classes. Because there are no unknown-class samples in the training stage, the output feature of one randomly selected FEN is copied once more and concatenated after the synthesized feature to form the FSF of the unknown class. The FSFs of both the unknown and known classes are then used to train a classifier.
Finally, the trained FENs and classifier are tested offline. In the test stage, each data sample passes through all the FENs and the SFEN to form the FSF for classification.

Fig. 2. The overview of the proposed method for Φ-OTDR event recognition.

2.2 Semantic space and MC selection

In this section, we introduce the mathematical relationship between known and unknown classes and how to choose proper MCs. The training set is denoted as $S = \{{{X_S},{L_S}} \}$ with $K$ types of known classes, where ${X_S}$ denotes the samples and ${L_S} = \{{{l_k}|k = 1,2, \cdots ,K} \}$ denotes the labels; ${l_k}$ is the label of the $k$th class. The test set is denoted as $T = \{{{X_t},{L_t}} \}$, where ${X_t}$ denotes the samples of both known and unknown classes and ${L_t} = \{{{l_{{k_{test}}}}|{k_{test}} = 1,2, \cdots ,{K_{test}}} \}$ denotes the labels, with ${L_t} \setminus {L_S} \ne \emptyset$ and ${K_{test}} > K$. The goal is to predict $l_{{k_{test}}}^{Test} \in {L_t}$. For each known and unknown class, a semantic vector ${D_k} = [d_k^1,d_k^2, \cdots ,d_k^n],\ k = 1,2, \cdots ,{K_{test}}$, is defined by its real physical (vibration) process. The distance between each known class and the unknown class in the semantic space is calculated by the Euclidean distance, which is expressed as,

$$dis({{D_k},{D_{unknown}}}) = \sqrt {\sum\limits_{i = 1}^n {{{({d_k^i - d_{unknown}^i} )}^2}} }$$

Then, the $m$ known classes with the shortest distances to the unknown class are selected. In order to extract the features of the unknown class from different aspects, several MCs are created. Each MC consists of ${v_i}$ MSCs (${v_i} \le m$) selected from the nearest $m$ known classes, where the subscript $i$ is the index of the MC. The semantic vector ${D_{M{C_i}}}$ and the data samples of $M{C_i}$ can be expressed as,

$${D_{M{C_i}}} = \frac{1}{{{v_i}}}({{D_{{k_1}}} + {D_{{k_2}}} + \cdots + {D_{{k_{{v_i}}}}}} )$$
$${X_{M{C_i}}} = \bigcup\limits_{j = 1}^{{v_i}} {{X_{{k_j}}}}$$
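The selection described by Eqs. (1)–(2) can be sketched as follows. The attribute values below are illustrative placeholders, not the values in Table 2, and the class set is truncated for brevity:

```python
import numpy as np

# Hypothetical 7-attribute semantic vectors; illustrative stand-ins only.
semantic = {
    "BE": np.array([0., 1, 1, 1, 1, 1, 1]),
    "WK": np.array([1., 1, 0, 1, 1, 1, 1]),
    "DG": np.array([0., 1, 1, 0, 1, 1, 1]),
    "CA": np.array([0., 0, 0, 0, 0, 0, 0]),
}
d_unknown = np.array([1., 1, 1, 1, 1, 1, 1])   # unknown class (e.g. JP)

# Eq. (1): Euclidean distance from each known class to the unknown class.
dist = {k: float(np.linalg.norm(v - d_unknown)) for k, v in semantic.items()}

# Keep the m nearest known classes as candidate MSCs.
m = 3
nearest = sorted(dist, key=dist.get)[:m]

# Eq. (2): the semantic vector of a mixed class is the mean of its MSC vectors.
def mc_vector(names):
    return np.mean([semantic[n] for n in names], axis=0)

d_mc1 = mc_vector(["BE", "WK"])
```

With these placeholder vectors, the three nearest classes are BE, WK, and DG, matching the MSC choice made later in Section 3.2.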

2.3 FEN training with RLGTM

One FEN is trained to extract the common features in one MC. The training set ${S_{MC}}$ of a FEN includes the MC and the other known classes. In order to effectively extract the common features in the MC, the RLGTM is proposed. The goal of RLGTM is to recognize only the MC while blurring all other classes as much as possible. The basic process of RLGTM is as follows. Let one data sample be ${x_t}$; the action ${a_t}$ predicts the classification result of ${x_t}$ after the FEN. The instant reward ${r_t}$ is then obtained based on the action ${a_t}$: a positive reward if ${a_t}$ returns a correct classification and a negative reward if it returns a wrong one. The process then moves to the next data sample ${x_{t + 1}}$, repeats until all of the training data have been used, and saves the experience gathered during these interactions. By inspecting the stored experience, the FEN can update itself. A deep Q-learning network (DQN) is the deep reinforcement learning strategy used here to guide the evolutionary direction of the FEN. The value $Q$ denotes the future rewards the network can receive in the current state. The DQN algorithm consists of two networks, namely the target network and the current network. In order to widen the random search scope at the start and avoid falling into a local optimum, an $\varepsilon$-greedy strategy is applied to select the action ${a_t}$: it is obtained from the target network with probability $1 - \varepsilon$, or set randomly with probability $\varepsilon$, where $\varepsilon \in [0,1]$ and $\varepsilon$ decreases after each action. The target network chooses the action ${a_t}$ with the maximum ${Q_{target}}$ and receives the immediate reward ${r_t}$.
The tuple $experience = \{{{x_t},{a_t},{r_t},{Q_{target}}} \}$ is stored. Several experiences are randomly selected from the stored experience database to calculate the estimated long-term Q value ${q_{estimate}} = (1 - \beta ){Q_{target}} + \beta {r_t}$, where $\beta \in (0,1)$ is a trade-off parameter that balances the instant reward and the future reward. The current network is a copy of the target network; it also chooses an action ${a_t}$ and obtains a future reward ${Q_{current}}$. The current network updates itself by comparing the one-step future reward ${Q_{current}}$ with the estimated long-term value ${q_{estimate}}$. The above process repeats a specified number of times, after which the parameters of the current network are passed to the target network. The flow of the DQN is shown in Algorithm 1 and in Fig. 3. $D$ is the memory storing the experience; $\alpha$ is the learning rate, which gradually decreases from 0.1 to 0.001 during training; and the update_time is set to 128 in the following experiment. The final current network after training is used as the FEN for subsequent processing.
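As a minimal illustration of the loop above, the sketch below replaces the FEN with a toy tabular Q function: states are data-sample indices, actions are predicted labels, and a positive/negative reward is given for a correct/wrong label. All sizes, rewards, and hyperparameters are illustrative assumptions, not the paper's settings:

```python
import random
import numpy as np

# Toy stand-in for Algorithm 1: Q-tables replace the target/current networks.
n_states, n_actions = 4, 3
true_label = [0, 1, 2, 1]                    # "correct action" per sample
q_target = np.zeros((n_states, n_actions))   # target network (as a Q-table)
q_current = q_target.copy()                  # current network

eps, beta, alpha = 1.0, 0.5, 0.1             # epsilon-greedy / trade-off / lr
memory = []                                  # experience replay buffer D
rng = random.Random(0)

for episode in range(200):
    for s in range(n_states):
        # epsilon-greedy: random action with prob eps, else greedy on target net
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = int(np.argmax(q_target[s]))
        r = 1.0 if a == true_label[s] else -1.0   # reward for (in)correct label
        memory.append((s, a, r, q_target[s, a]))  # store the experience

        # replay one stored experience and form the estimated long-term value
        s_m, a_m, r_m, qt_m = rng.choice(memory)
        q_estimate = (1 - beta) * qt_m + beta * r_m
        # move the current network toward the estimate
        q_current[s_m, a_m] += alpha * (q_estimate - q_current[s_m, a_m])
    eps = max(0.05, eps * 0.98)              # shrink the exploration scope
    if episode % 8 == 0:                     # periodically sync target <- current
        q_target = q_current.copy()

policy = [int(np.argmax(q_current[s])) for s in range(n_states)]
```

After training, the greedy policy assigns each sample its correct label: correct state-action pairs accumulate positive estimates while wrong ones stay negative, mirroring how RLGTM steers the FEN toward recognizing only the MC.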

Algorithm 1. Feature network training process based on DQN

Fig. 3. The flow of DQN training process.

2.4 Synthesizing samples for unknown classes

Similar samples with shorter feature distances are more likely to belong to one class; this is one of the key ideas of clustering algorithms [22]. The same idea also holds in the semantic space, where the distance between classes represents their similarity [23,24]. The similarity weight ${w_i}$ in the semantic space between the unknown class and $M{C_i}$ can be expressed as,

$${w_i} = \exp \left( { - \frac{{||{D_{M{C_i}}} - {D_{unknown}}|{|^2}}}{{{\xi^2}}}} \right)$$
where $\xi$ is the average distance in the semantic space between the MCs and the unknown class.

The output of the second-to-last layer in a FEN is extracted as the sample feature. The synthesized feature is constructed from the output features ${F_{M{C_i}}}$ of all FENs. The MC samples are fed into the FENs, and the corresponding feature outputs ${F_{M{C_i}}}$ are summed, weighted by the similarity weights ${w_i}$, to synthesize the feature of the unknown class,

$${F_t} = \sum\limits_{i = 1}^N {{w_i} \cdot {F_{M{C_i}}}}$$
where ${F_t}$ denotes the synthesized feature of the unknown class and $N$ is the total number of MCs. The output feature of one randomly selected FEN is then copied once more and concatenated after the synthesized feature to form the FSF of the unknown class.
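Eqs. (4)–(5) can be sketched as below. The MC semantic vectors and FEN outputs are illustrative stand-ins, and normalizing the weights to unit sum is our assumption (consistent with the 0.26/0.37/0.37 weights reported in Section 3.3, which sum to 1):

```python
import numpy as np

# Illustrative MC semantic vectors and unknown-class vector (not Table 2 values).
d_unknown = np.array([1., 1, 1, 1, 1, 1, 1])
d_mc = np.array([[0.5, 1, 0.5, 1.0, 1, 1, 1],    # D_MC1
                 [0.5, 1, 0.5, 0.5, 1, 1, 1],    # D_MC2
                 [0.0, 1, 1.0, 0.5, 1, 1, 1]])   # D_MC3

dist = np.linalg.norm(d_mc - d_unknown, axis=1)  # MC-to-unknown distances
xi = dist.mean()                                 # average distance (xi)
w = np.exp(-dist**2 / xi**2)                     # Eq. (4): similarity weights
w = w / w.sum()                                  # unit-sum normalization (assumption)

# Eq. (5): weighted sum of the 1x10 FEN output features F_MCi.
f_mc = np.random.default_rng(0).normal(size=(3, 10))  # stand-in FEN outputs
f_t = w @ f_mc                                   # synthesized unknown-class feature
```

The synthesized vector `f_t` is the feature-domain surrogate of the unknown class used to train the final classifier.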

3. Experiment and analysis

3.1 Data preparation

The data are collected by a home-made Φ-OTDR system, whose structure is shown in Fig. 4. An ultra-narrow linewidth laser (NLL) with a linewidth of 3 kHz is used as the light source. The continuous light is converted into pulses by an acousto-optic modulator (AOM) with a 200 MHz driving frequency. An erbium-doped fiber amplifier (EDFA) is applied to compensate for the insertion loss. The probe pulses are injected into the sensing fiber through a circulator. The sensing fiber is a G652 single-mode fiber, about 1 km long, buried 5 cm below the ground surface. As the probe pulse propagates along the sensing fiber, Rayleigh backscattered light is formed and routed back to the photodetector (PD). The RBT carries the optical phase change information along the sensing fiber. When an event occurs near the sensing fiber, the optical phase changes with the vibration, which results in a change in the light intensity of the RBT. The RBTs are then recorded by a data acquisition card (DAC) with a sampling frequency of 50 MHz and processed on a personal computer (PC). Two pulse widths, 100 ns and 200 ns, are applied. The RBTs within 1 s and within a 50 m range on both sides of the vibration location are converted into a data matrix; the 50 m range is long enough to cover the whole vibration range in this experiment. The horizontal rows of the data matrix represent the spatial domain and the vertical columns represent the temporal domain. Each matrix is converted into a grayscale image and resized to 256 × 256, which is regarded as a spatial-temporal image (STI) sample. Eight types of events are conducted and recorded; the details of these events are shown in Table 1. In order to test the recognition ability without training samples, the JP event is set as the unknown class, which means there are no JP samples in the training set. A typical STI for each category is shown in Fig. 5.
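The conversion from data matrix to STI sample can be sketched as follows. The min-max intensity normalization and the nearest-neighbor resize are our assumptions; the paper does not state which normalization or resizing method is used:

```python
import numpy as np

def to_sti(matrix, size=256):
    """Turn a temporal-spatial data matrix (rows: space, cols: time) into a
    size x size grayscale spatial-temporal image (STI) sample."""
    m = np.asarray(matrix, dtype=float)
    # min-max normalize the intensity fluctuation to the 0-255 grayscale range
    m = (m - m.min()) / (m.max() - m.min() + 1e-12) * 255.0
    # nearest-neighbor index mapping to a size x size image (assumed method)
    rows = np.linspace(0, m.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, m.shape[1] - 1, size).round().astype(int)
    return m[np.ix_(rows, cols)].astype(np.uint8)

# e.g. 100 spatial points x 1000 time samples around the vibration location
sti = to_sti(np.random.default_rng(1).normal(size=(100, 1000)))
```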

Fig. 4. Diagram of distributed optical fiber sensing system.

Fig. 5. Typical spatial-temporal images of each vibration event.

Table 1. Distribution of the data

3.2 Mixed class selection

In this part, suitable MCs are selected based on whether they share specific attributes in the semantic space. The semantic space, comprising several attributes, is manually defined based on the physical process of each vibration. The semantic vector consists of seven values (attributes) covering different aspects: whether the contact material of the vibration source is soft; whether the contact between the vibration source and the ground is a short pulse; whether the vibration itself is strong; whether the acoustic waveform of the vibration is pulse-like; whether there is an aftershock during the whole vibration time; whether the impact period of the vibration is short; and whether the impact of the main vibration weakens rapidly. These attributes are directly observed from how the vibrations happen and how they act on the optical fiber sensor, and their selection is based on how a human judges waveform types. For example, the impact material of WK and JP is sneakers, so their soft-material attribute is set to 1; the impact material of DG and BT is shovels, so their soft-material attribute is set to 0. As the physical process of the unknown class is known, its semantic vector can also be defined. The semantic attributes are shown in Table 2. Based on the semantic vectors, the three classes closest to the unknown class are taken according to Eq. (1). These three classes, BE, WK, and DG, are used as the MSCs. Every two MSCs are mixed together to obtain three MCs. MC1 consists of BE and WK, which share the attributes of soft material, pulse-like waveform, and aftershock. MC2 consists of WK and DG, which share the attributes of aftershock, short impact time, and rapid weakening. MC3 consists of BE and DG, which share the attributes of strong vibration and aftershock.
These common attributes of the MCs are all important attributes of the unknown class JP. The semantic vector of each MC is then obtained by Eq. (2).

Table 2. Semantic attributes for each class

3.3 Feature extraction and sample synthesis

Different FENs are trained to extract the desired specific features and are tested in this section. Four datasets (each including training and test sets) are constructed to train and test FEN1, FEN2, FEN3, and the SFEN, respectively; their details are shown in Table 3. In order to reduce the amount of computation, a ResNet18 network pre-trained on the ImageNet dataset is used as the basic network for the FENs. The structure of ResNet18 is shown in Table 4. FEN1, FEN2, and FEN3 are trained by MC-RLGTM on dataset1, dataset2, and dataset3, respectively. FEN1 is trained to extract the features related to soft material, pulse-like waveform, and aftershock; FEN2, the features related to aftershock, short impact time, and rapid weakening; FEN3, the features related to strong vibration and aftershock. Figure 6(a) shows the test results of FEN1 after training. The results show that FEN1 can only identify MC1, while its recognition accuracy for the other classes is essentially a blind guess. In this way, FEN1 extracts the common feature of MC1, which represents soft material, pulse-like waveform, and aftershock. FEN4 (the SFEN) is trained by the normal training method on dataset4 to classify the known classes. Figure 6(b) shows the test results of FEN4, which can recognize all known classes.

 figure: Fig. 6.

Fig. 6. Test results of FEN1 (a) and FEN4 (b).

Download Full Size | PDF

Table 3. Composition of each dataset

Table 4. The structure of Resnet18

The output of the second-to-last layer of each FEN is extracted as the feature of the input data sample, with a size of 1 × 10. The FSFs used for final classifier training are synthesized from the output features of FEN1–4. There are two ways to synthesize the FSF, for the known classes and the unknown class, respectively. Way 1, for a known class: the data sample is input into FEN1–4 simultaneously. The output features of FEN1–3 are first combined by weighted summation; the summation weights are the similarity weights of each MC obtained by Eq. (4), which are 0.26, 0.37, and 0.37. The output of FEN4 and the weighted sum of FEN1–3 are then concatenated to form the FSF, with a size of 1 × 20. Way 2, for the unknown class: as there are no unknown-class data samples in the training set, three samples from MC1, MC2, and MC3 are picked from the training set to create the FSF of the unknown class. These three samples pass through FEN1, FEN2, and FEN3, respectively, producing three feature vectors, which are then summed with the same weights as in Way 1. At this point, the vector contains the characteristics of the unknown class and can be used to train the final classifier. However, to keep the feature length of the unknown class equal to that of the known classes, the output of FEN1 is copied once more and concatenated to form a 1 × 20 vector. The FSFs of both known and unknown classes are visualized by t-distributed stochastic neighbor embedding (t-SNE) [25], as shown in Fig. 7. It can be seen that the FSFs belonging to one class cluster together. Note that the WK class forms two groups in Fig. 7, but this does not mean there are two classes; the phenomenon is due to the non-convex dimension reduction of t-SNE. A support vector machine (SVM) classifier is then trained with the FSFs of both known and unknown classes for the final classification.
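The two synthesis routes can be sketched as below, with random 1 × 10 vectors standing in for the real FEN outputs; only the 0.26/0.37/0.37 weights come from the text, everything else is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.26, 0.37, 0.37])         # similarity weights from Eq. (4)

# Way 1 (known class): one sample passes through FEN1-4; stand-in 1x10 outputs.
f_fen = rng.normal(size=(4, 10))          # rows: FEN1, FEN2, FEN3, FEN4 outputs
common = w @ f_fen[:3]                    # weighted sum of FEN1-3 features
fsf_known = np.concatenate([common, f_fen[3]])   # 1x20 final sample feature

# Way 2 (unknown class): one sample from each of MC1-3 through its own FEN.
f_mc = rng.normal(size=(3, 10))           # stand-in FEN1-3 outputs
f_t = w @ f_mc                            # synthesized unknown-class feature
fsf_unknown = np.concatenate([f_t, f_mc[0]])     # pad with a copied FEN output
```

Both routes yield 1 × 20 vectors of equal length, so known and synthesized unknown FSFs can train the same SVM classifier.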

Fig. 7. t-SNE visualization of FSF used for final classifier training (including JP-synthesized unknown classes). Points with the same color belong to the same class.

3.4 Comparison test

In this part, the effectiveness of the MC and RLGTM methods is validated through ablation experiments. Three other training approaches are carried out for comparison. Method 1 uses the normal network training method proposed in [15] with the normal training set (without MC and RLGTM); Method 2 uses the normal network training method with MC-constructed training sets (MC-only); Method 3 uses RLGTM for network training with the normal training set (RLGTM-only). Method 1 is the basic automatic feature selection method and serves as the baseline; Methods 2 and 3 are the ablation comparisons. The training sets of Method 2 and of the proposed approach, which both use the MC method, are set as in Table 3.

In order to enhance the reliability and credibility of the comparison, ten-fold cross-validation [5] is applied, in which each data sample takes its turn as part of the test set. Both the overall classification accuracy over all event types and the classification accuracy of the unknown event type are recorded, and the average classification accuracy over the ten tests is used as the evaluation metric.
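A minimal sketch of the ten-fold split is given below (the per-fold classifier training and evaluation are omitted; the shuffling seed is arbitrary):

```python
import numpy as np

# Minimal ten-fold split: every sample lands in the test set exactly once.
def ten_fold_indices(n_samples, seed=0):
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, 10)          # ten (nearly) equal folds
    for k in range(10):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        yield train, test                    # train/evaluate the pipeline here

splits = list(ten_fold_indices(95))
```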

The results are shown in Fig. 8 and Fig. 9. Figure 8 shows the average classification accuracies for all classes and for the unknown class separately, and Fig. 9 shows the details of the ten tests. The error bars in Fig. 8 show the standard deviation of the average classification accuracy. The overall accuracy is about 70% for all methods; however, the classification accuracies of the unknown class (AoU) differ. The AoU is only 13% for Method 1, which is merely the level of a random guess. Method 2 (MC-only) achieves 21% AoU, 8% higher than Method 1, which shows that the MC method helps the FEN focus on the common features in the MC. Method 3 (RLGTM-only) also achieves 21% AoU, which shows that the RLGTM training method helps the FEN evolve in a specific direction and reduces unwanted features. With both MC and RLGTM purifying the feature output of the FENs, the proposed MC-RLGTM finally improves the AoU to 42%, a 29% improvement over Method 1. The proposed method also shows larger AoU fluctuation when the training set changes, which indicates that the performance of common-feature extraction is more sensitive to the data diversity of the training set.

Fig. 8. The average overall and unknown class classification accuracy of different methods.

Fig. 9. Details of ten-fold cross validation for different training methods.

It should be noted that there are no unknown-class data samples in the training set; only the relationship between the abstract concepts of known and unknown classes is used. Other ZSL studies report unknown-class recognition accuracies of only about 30%–50% [10–15]. Thus, a 42% recognition rate for the unknown class is a relatively good and acceptable performance under such a harsh condition. The method can also be used in conjunction with other recognition methods; for example, it can be applied to models that use annotated test data to further improve recognition accuracy [26,27]. By raising the initial recognition rate of unknown classes, it may effectively improve the performance of these models.

4. Conclusion

This paper proposes a new feature synthesizing framework based on the MC-RLGTM approach for the Φ-OTDR classification task in the no-sample case. This is the first time that a distributed optical fiber sensing system recognizes a specific unknown event and reports its event type without collecting its data samples in advance. With the help of the MC and RLGTM approaches, the FEN evolves in a specific direction, and feature elements shared by the MC and the unknown class can be extracted. The ablation comparisons validate the effectiveness of MC and RLGTM. The experimental results show that the proposed method achieves a 42% classification accuracy for the unknown class, a 29% improvement over the existing instance synthesizing method. This technique may be further combined with other semi-supervised or data augmentation methods, and will promote and expand the application of distributed optical fiber sensors.

Funding

National Natural Science Foundation of China (61801283); Basic and Applied Basic Research Foundation of Guangdong Province (2021A1515012001); Research Project of Tianjin Education Commission (2018KJ230).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data will be made available upon reasonable request.

References

1. C. E. Lee and H. F. Taylor, “Apparatus and method for fiber optic intrusion sensing,” U.S. Patent 5,194,847 (Mar. 16, 1993).

2. Y. Wang, H. Yuan, X. Liu, et al., “A comprehensive study of optical fiber acoustic sensing,” IEEE Access 7, 85821–85837 (2019). [CrossRef]  

3. F. Wang, Z. Shao, Z. Hu, et al., “Micromachined fiber optic Fabry-Perot underwater acoustic probe,” Proc. SPIE 9283, 928308 (2014). [CrossRef]  

4. H. Wu, S. Xiao, X. Li, et al., “Separation and determination of the disturbing signals in phase-sensitive optical time domain reflectometry (Φ-OTDR),” J. Lightwave Technol. 33(15), 3156–3162 (2015). [CrossRef]  

5. Y. Shi, Y. Zhang, S. Dai, et al., “Footsteps detection and identification based on distributed optical fiber sensor and double-YOLO model,” Opt. Express 31(25), 41391–41405 (2023). [CrossRef]  

6. Y. Hu, Z. Meng, M. Zabihi, et al., “Performance enhancement methods for the distributed acoustic sensors based on frequency division multiplexing,” Electronics 8(6), 617 (2019). [CrossRef]  

7. A. Lv and J. Li, “On-line monitoring system of 35 kV 3-core submarine power cable based on ϕ-OTDR,” Sens. Actuators, A 273(15), 134–139 (2018). [CrossRef]

8. Q. Sun, H. Feng, X. Yan, et al., “Recognition of a Phase-Sensitivity OTDR Sensing System Based on Morphologic Feature Extraction,” Sensors 15(7), 15179–15197 (2015). [CrossRef]  

9. H. Jia, S. Liang, S. Lou, et al., “A k-nearest Neighbor Algorithm based Near Category Support Vector Machine Method for Event Identification of ϕ-OTDR,” IEEE Sens. J. 19(10), 3683–3689 (2019). [CrossRef]  

10. W. Wang, V. W. Zheng, H. Yu, et al., “A Survey of Zero-Shot Learning: Settings, Methods, and Applications,” ACM Trans. Intell. Syst. Technol. 10(2), 37 (2019). [CrossRef]

11. Z. Akata, F. Perronnin, Z. Harchaoui, et al., “Label-embedding for attribute-based classification,” in Conference on Computer Vision and Pattern Recognition, 819–826 (IEEE, 2013).

12. Z. Zhang and V. Saligrama, “Zero-Shot Learning via Semantic Similarity Embedding,” in International Conference on Computer Vision, 4166–4174 (IEEE, 2015).

13. E. Kodirov, T. Xiang, Z. Fu, et al., “Unsupervised Domain Adaptation for Zero-Shot Learning,” in International Conference on Computer Vision, 2452–2460 (IEEE, 2015).

14. Z. Al-Halah, M. Tapaswi, and R. Stiefelhagen, “Recovering the Missing Link: Predicting Class-Attribute Associations for Unsupervised Zero-Shot Learning,” in Conference on Computer Vision and Pattern Recognition, 5975–5984 (IEEE, 2016).

15. Y. Guo, G. Ding, J. Han, et al., “Synthesizing samples for zero-shot learning,” in 26th International Joint Conference on Artificial Intelligence, 1774–1780 (2017).

16. H. Wu, Y. Wang, X. Liu, et al., “Smart Fiber-Optic Distributed Acoustic Sensing (sDAS) With Multi-Task Learning for Time-Efficient Ground Listening Applications,” IEEE Internet of Things Journal 1-1 (2023).

17. A. M. Rizzo, L. Magri, D. Rutigliano, et al., “Known and unknown event detection in OTDR traces by deep learning networks,” Neural Computing and Applications 34(22), 19655–19673 (2022). [CrossRef]  

18. Z. Zhou, W. Jiao, X. Hu, et al., “Open-Set Event Recognition Model Using 1-D RL-CNN With OpenMax Algorithm for Distributed Optical Fiber Vibration Sensing System,” IEEE Sens. J. 23(12), 12817–12827 (2023). [CrossRef]  

19. J. Lin, Z. Ma, R. Gomez, et al., “A Review on Interactive Reinforcement Learning From Human Social Feedback,” IEEE Access 8, 120757–120765 (2020). [CrossRef]  

20. Y. Shi, Y. Wang, L. Zhao, et al., “An event recognition method for φ-OTDR sensing system based on deep learning,” Sensors 19(15), 3421 (2019). [CrossRef]  

21. Y. Shi, S. Dai, X. Liu, et al., “Event recognition method based on dual-augmentation for a Φ-OTDR system with a few training samples,” Opt. Express 30(17), 31232–31243 (2022). [CrossRef]  

22. S. Aghabozorgi, A. S. Shirkhorshidi, and T. Y. Wah, “Time-series clustering - A decade review,” Information Systems 53, 16–38 (2015). [CrossRef]  

23. Y. Guo, G. Ding, J. Han, et al., “Zero-shot recognition via direct classifier learning with transferred samples and pseudo labels,” in 31st Conference on Artificial Intelligence, 4061–4067 (AAAI, 2017).

24. Y. Guo, G. Ding, J. Han, et al., “Zero-shot learning with transferred samples,” IEEE Trans. on Image Process. 26(7), 3277–3290 (2017). [CrossRef]  

25. L. van der Maaten and G. E. Hinton, “Visualizing Data Using t-SNE,” Journal of Machine Learning Research 9(11), 2579–2605 (2008).

26. Y. Shi, J. Chen, S. Dai, et al., “Φ-OTDR Event Recognition System Based on Valuable Data Selection,” J. Lightwave Technol. 42(2), 961–969 (2024). [CrossRef]  

27. K. Sohn, D. Berthelot, N. Carlini, et al., “FixMatch: Simplifying semi-supervised learning with consistency and confidence,” Advances in neural information processing systems 33, 596 (2020).

Figures (9)

Fig. 1. Diagram of outdoor intrusion recognition system.
Fig. 2. The overview of the proposed method for Φ-OTDR event recognition.
Fig. 3. The flow of DQN training process.
Fig. 4. Diagram of distributed optical fiber sensing system.
Fig. 5. Typical spatial-temporal images of each vibration event.
Fig. 6. Test results of FEN1 (a) and FEN4 (b).
Fig. 7. t-SNE visualization of FSF used for final classifier training (including JP-synthesized unknown classes). Points with the same color belong to the same class.
Fig. 8. The average overall and unknown class classification accuracy of different methods.
Fig. 9. Details of ten-fold cross validation for different training methods.

Tables (5)

Algorithm 1. Feature network training process based on DQN
Table 1. Distribution of the data
Table 2. Semantic attributes for each class
Table 3. Composition of each dataset
Table 4. The structure of Resnet18

Equations (5)

$$\mathrm{dis}(D_k, D_{unknown}) = \sqrt{\sum_{i=1}^{n} \left( d_k^i - d_{unknown}^i \right)^2}$$
$$D_{MC}^i = \frac{1}{v_i}\left( D_{k_1} + D_{k_2} + \cdots + D_{k_{v_i}} \right)$$
$$X_{MC}^i = \bigcup_{i=1}^{v_i} X_{k_i}$$
$$w_i = \exp\left( -\frac{\left\| D_{MC}^i - D_{unknown} \right\|^2}{\xi^2} \right)$$
$$F_t = \sum_{i=1}^{N} w_i F_{MC}^i$$
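The five equations above can be sketched in code: the Euclidean distance between semantic attribute vectors, the mixed-class (MC) attribute vector as the mean of its member classes, the Gaussian similarity weight, and the weighted synthesis of the unknown-class feature. This is a minimal illustration, not the paper's implementation: the attribute vectors, the bandwidth `xi`, the member-class choice, and the MC features are all placeholder assumptions.

```python
import numpy as np

# Hypothetical semantic attribute vectors (stand-ins for Table 2 entries).
D_known = {
    "digging":  np.array([1.0, 0.0, 1.0, 0.0]),
    "walking":  np.array([0.0, 1.0, 1.0, 0.0]),
    "knocking": np.array([1.0, 1.0, 0.0, 1.0]),
}
D_unknown = np.array([1.0, 1.0, 1.0, 0.0])  # attributes of the unknown class

def dis(d_k, d_unknown):
    # Eq. (1): Euclidean distance between two attribute vectors
    return np.sqrt(np.sum((d_k - d_unknown) ** 2))

# Eq. (2): MC attribute vector = mean of its v_i member-class attributes.
# Eq. (3) would pool the member classes' instance sets the same way.
members = ["digging", "walking"]          # assumed MC membership
D_MC = np.mean([D_known[k] for k in members], axis=0)

# Eq. (4): Gaussian weight; closer MCs (in attribute space) weigh more.
xi = 1.0                                  # bandwidth, chosen arbitrarily here
w = np.exp(-np.linalg.norm(D_MC - D_unknown) ** 2 / xi ** 2)

# Eq. (5): synthesized unknown-class feature = weighted sum of MC features.
F_MC = [np.ones(8) * i for i in range(1, 4)]   # placeholder MC feature vectors
weights = np.array([0.5, 0.3, 0.2])            # placeholder normalized weights
F_t = np.sum([wi * f for wi, f in zip(weights, F_MC)], axis=0)
```

In this sketch the weights are normalized by hand; in practice each $w_i$ would come from Eq. (4) evaluated per mixed class before the sum in Eq. (5).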