Intrusion identification using GMM-HMM for perimeter monitoring based on ultra-weak FBG arrays

Fang Liu; Haiwen Zhang; Xiaorui Li; Zhengying Li; Honghai Wang

doi:10.1364/OE.452418

1. Introduction

Perimeter security technology is one of the most important technical means of ensuring the safety of vital areas such as military bases, airports, and pipelines. However, traditional technologies, including video surveillance [1], leaky cable detection [2] and microwave detection [3], can hardly meet the growing security requirements, because they are difficult to deploy for large-scale monitoring and suffer from obvious shortcomings such as monitoring blind spots and susceptibility to electromagnetic interference. For long-distance monitoring, fiber-optic distributed acoustic sensing (DAS) is a novel technology that is now widely used [4]. Bai et al. [5] reported a DAS-based method for external intrusion identification on pipelines, in which the intrusion events are classified and located. Compared to DAS, ultra-weak fiber Bragg grating (UWFBG) arrays are a new way to achieve high spatial resolution [6], fast sensing response [7] and dynamic measurement [8] in long-distance monitoring. Xin et al. [9] studied subway tunnel safety based on UWFBG arrays and distinguished surface intrusions with a high identification rate. In fact, the report [8] showed that UWFBG arrays perform better than DAS and are more suitable for distributed vibration sensing.

Usually, vibration signal analysis is employed to identify intrusions in long-distance monitoring. Most research on vibration signal analysis focuses on feature extraction and pattern recognition. However, the features extracted from the signals of different intrusions may be similar, which limits the performance of intrusion identification. To improve identification, studies on pattern recognition therefore play the more important role. The common machine learning methods used for perimeter security are presented in [10–12], including random forest (RF), relevance vector machine (RVM) and support vector machine (SVM). Moreover, deep learning has become a hotspot in recent research. Lyu et al. verified that a hybrid convolutional neural network (CNN) can recognize intrusion events while meeting the requirements of real-time monitoring [13]. Li et al. [14] divided intrusion signals into multiple fixed-size frames and then established a classification model with a convolutional long short-term memory network (ConvLSTM) to detect high-speed railway intrusion.

Although the aforementioned studies have improved intrusion identification, the identification methods still have some limitations. To achieve a high identification rate, common machine learning methods [10–12] require the model input to be an effective extraction of the structural information of the vibration signals, but the high complexity of the vibration signals makes feature extraction difficult. Meanwhile, deep learning methods have attracted great attention because they learn hierarchical features automatically. The previous research [14] reveals that intrusions can be identified by analyzing the time dependencies of vibration signals, but the identification performance is restrained by the scale of the dataset. Obtaining a sufficient dataset in field experiments is very costly, which limits the application of deep learning methods in actual engineering [10].

In this paper, an intrusion identification scheme that combines a Gaussian mixture model (GMM) [15–17] with a hidden Markov model (HMM) [18,19] is proposed for perimeter monitoring. The GMM is a multi-dimensional probability density function composed of multiple single Gaussian distributions, which is well suited to expressing the distribution density of samples. The HMM has attracted much attention in recent research on sequential data because its inherent state transition characteristics help to analyze the time dependencies of a signal. The GMM is combined with the HMM to improve the performance of intrusion identification by analyzing the time dependencies, which are composed of the temporal and spatial properties of the vibration signals in the relevant sensors during the intrusion procedure. Since feature extraction alone cannot sufficiently express the structural information of an intrusion procedure, the simultaneous analysis of the time dependencies and the features can achieve a higher identification rate than the feature extraction alone used in common machine learning methods. The proposed GMM-HMM scheme includes the following steps: First, the relevant sensors in the intrusion procedure are analyzed to acquire the time dependencies. Then, the features are extracted from the time domain and the frequency domain. Last, the time dependencies and the features are simultaneously analyzed by the GMM-HMM to identify the intrusion. The experimental results verify that the scheme can effectively discriminate three intrusions (walking, knocking and climbing) and two non-intrusions (heavy truck passing and wind blowing).

The rest of the paper is organized as follows: Section 2 introduces the experimental system based on UWFBG arrays. Section 3 describes the acquisition of time dependencies and the principle of the GMM-HMM identification scheme. Section 4 presents the experimental procedure and compares the experimental results of several schemes. Finally, Section 5 concludes the paper.

2. Experimental system

The sensor network in this paper consists of two large-scale UWFBG arrays which are equipped with armored protection. The ultra-weak fiber Bragg gratings (UWFBGs) are fabricated by the on-line writing system, where the UV laser emitted by a 248-nm excimer laser irradiates onto the drawing optical fiber through the phase mask with a constant period [20]. Therefore, the transmission loss of the sensing network is extremely low because there is no fusion loss during the writing.

As shown in Fig. 1(b), the monitoring center consists of the demodulator and the computer. The demodulator collects the vibration response with a sampling rate of 1 kHz and transmits the demodulated data to the computer for analysis. In the demodulator, the light source is a narrow linewidth laser (NLL) with a central wavelength of 1550 nm. The continuous-wave (CW) light from the NLL is modulated into nanosecond pulses by a semiconductor optical amplifier (SOA). Then, the pulsed light is amplified by an erbium-doped fiber amplifier (EDFA) and directed to the corresponding UWFBG arrays by a circulator. The reflected pulses from the UWFBGs pass through a $3 \times 3$ coupler phase demodulation unit, which consists of an unbalanced Mach-Zehnder interferometer. The phase demodulation unit restores the amplitude of the time-domain vibration signal by demodulating the phase variation introduced to the light by the optical length variation in the 5-m optical fiber between two adjacent UWFBGs. In addition, optical time domain reflectometry (OTDR) is employed to locate where the vibration occurs [21].

Fig. 1. (a) The UWFBG arrays setting. (b) The monitoring center.


As shown in Fig. 1(a), two UWFBG arrays, defined as UWFBG arrays I and UWFBG arrays II, are deployed 30 cm underground and on the fence, respectively. According to the requirements of actual engineering, unapproved people need to be warned when they approach the boundary. Therefore, UWFBG arrays I is applied to detect the vibration response generated by heavy truck passing or human footsteps on the ground surface. UWFBG arrays II only detects the vibration responses generated on the fence, such as knocking, climbing and wind blowing.

3. Principle of identification scheme

3.1 Time dependencies acquiring

To develop a scheme that can effectively acquire time dependencies, the information of the intrusion procedure, with its temporal-spatial properties, needs to be accurately obtained. In UWFBG arrays, the temporal properties are indicated by a single sensor as time goes by, while the spatial properties are indicated by the relevance of sensors. Therefore, the vibration signal with time dependencies is defined as the combination of signal frames from the relevant sensors. Two important steps need to be considered in the combination. One is determining the signal frame, and the other is combining the signal frames according to the spatial relevance of the sensors.

The data characteristics of vibration response in UWFBG arrays are as follows.

(1) The vibration response generated by an intrusion disturbance can be simultaneously detected by several consecutive adjacent sensors. According to this characteristic, a sensor sequence can be obtained for an intrusion disturbance.
(2) The intensity of the vibration signals in the sensors decreases as the distance from the intrusion disturbance increases. According to this characteristic, the sensor with the maximum-intensity vibration signal can be determined in the obtained sensor sequence, and it is defined as the center sensor. Then, the signal frame of the center sensor is acquired and defined as the sample frame sequence, which is expressed by

(1)$${f_{i,t}} = [{{y_{i{t_1}}},{y_{i{t_2}}}, \ldots ,{y_{i{t_L}}}} ], $$

where ${y_{i{t_k}}}$ is the sampling value of the i-th sensor at the ${t_k}$-th moment of the t-th time frame, and L is the length of the sample frame sequence, which is determined by the sampling rate of the demodulator in the monitoring system. The center sensor is not fixed, because the location of the intrusion disturbance can change during the intrusion procedure. Therefore, the temporal-spatial properties of the intrusion procedure can be acquired by combining the sample frame sequences of the center sensor at different time frames. However, when multiple intrusion procedures exist simultaneously, the combinations are easily confused because the spatial relevance among the center sensors at different time frames is unclear. As shown in Fig. 2, there are three intrusion procedures in the sensor data stream. In the combination for the latest frames (sample frame sequences), the blue arrows mark correct combinations, and the red arrows mark wrong combinations, in which the combined frames do not belong to the same intrusion procedure.
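The selection of the center sensor and its sample frame sequence in Eq. (1) can be illustrated with a minimal sketch. It assumes that the "intensity" of a frame is measured by its energy (sum of squared samples), which the paper does not specify; the function name `center_sensor_frame` and the synthetic data are hypothetical.

```python
import numpy as np

def center_sensor_frame(data, t, L=1000):
    """Return (center sensor index, sample frame sequence) for time frame t.

    data: array of shape (num_sensors, num_samples), one row per sensor.
    L:    frame length in samples (1000 samples = 1 s at 1 kHz).
    """
    frames = data[:, t * L:(t + 1) * L]        # frame t of every sensor
    energy = np.sum(frames ** 2, axis=1)       # assumed intensity measure
    i = int(np.argmax(energy))                 # strongest sensor = center sensor
    return i, frames[i]

# Synthetic data: 8 sensors, 2 frames; a disturbance appears in frame 1,
# strongest at sensor 5 and weaker at its neighbours 4 and 6.
rng = np.random.default_rng(0)
data = 0.01 * rng.standard_normal((8, 2000))
tone = np.sin(2 * np.pi * 20 * np.arange(1000) / 1000.0)
data[4:7, 1000:] += np.array([[0.5], [1.0], [0.5]]) * tone

center, frame = center_sensor_frame(data, t=1)
print(center, frame.shape)
```

Repeating this selection for successive time frames yields the sequence of center sensors whose frames are later combined.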

In this paper, two assumptions are employed to deal with the confusion problem.

(1) Homogeneity assumption of the sample frame sequence: the sample frame sequence at time frame t is only related to the sample frame sequence at the previous time frame $t - 1$. Therefore, it is not necessary to consider the whole combination path; only the spatial position of the former sequence ${f_{{i_{t - 1}},t - 1}}$ is considered. This is given by

(2)$$P({{f_{{i_t},t}}|{{f_{{i_{t - 1}},t - 1}},{f_{{i_{t - 2}},t - 2}}, \ldots ,{f_{{i_{t - T + 1}},t - T + 1}}} } )= P({{f_{{i_t},t}}|{{f_{{i_{t - 1}},t - 1}}} } ), $$

where T is the length of the combination.
(2) Pertinence assumption of the sample frame sequences: the relevance between two sensors decreases as the spatial distance between them increases. Therefore, the relevance between ${f_{{i_t},t}}$ and ${f_{{i_{t - 1}},t - 1}}$ can be expressed by

(3)$$P\left( {{f_{{i_t},t}}|{f_{{i_{t - 1}},t - 1}}} \right) = N(\textrm{distance}({i_t},{i_{t - 1}})),$$

where $N(x )$ is the normal distribution, and distance$({{i_t},{i_{t - 1}}} )$ is the distance between the ${i_t}$-th center sensor and the ${i_{t - 1}}$-th center sensor.

Fig. 2. Combination of sample frame sequences in multiple intrusion procedures.


By analyzing the relevant sensors, including the ${i_t}$-th center sensor and the ${i_{t - 1}}$-th center sensor, the spatial relevance of the sensors is determined and the confusion problem is solved. The combination scheme is then put forward, and the details are provided in Algorithm 1. Finally, when the intrusion disturbance of an intrusion procedure disappears, the combination can no longer obtain the latest ${f_{{i_t},t}}$. At that point, the combination corresponding to this intrusion procedure is complete, and it is defined as the observation sample sequence.

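Algorithm 1. Combination of sample frame sequences (image oe-30-10-17307-i001 in the original article; not reproduced here).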
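Because Algorithm 1 is only available as an image, the following is a hypothetical sketch of how the two assumptions in Eqs. (2) and (3) could drive a combination step; it is not the authors' algorithm. It assumes a greedy assignment, a normal relevance model with a standard deviation of 10 m, and a relevance gate below which a new combination is started. The names `relevance`, `extend_combinations`, `SIGMA_M` and `MIN_RELEVANCE` are illustrative.

```python
import math

SPACING_M = 5.0        # sensor spacing reported for the arrays
SIGMA_M = 10.0         # assumed spread of the normal relevance model
MIN_RELEVANCE = 0.01   # assumed gate: below this, start a new combination

def relevance(i_t, i_prev):
    """Normal density of the distance between two center sensors (Eq. (3))."""
    d = abs(i_t - i_prev) * SPACING_M
    return math.exp(-0.5 * (d / SIGMA_M) ** 2) / (SIGMA_M * math.sqrt(2 * math.pi))

def extend_combinations(combinations, centers):
    """Attach each center sensor of the latest frame to one combination.

    combinations: list of lists of center-sensor indices, one per procedure.
    centers:      center-sensor indices found in the latest time frame.
    Only the last element of each combination is inspected (Eq. (2)).
    """
    used = set()                               # combinations already extended
    for c in centers:
        best, best_p = None, MIN_RELEVANCE
        for k, combo in enumerate(combinations):
            if k in used:
                continue
            p = relevance(c, combo[-1])
            if p > best_p:
                best, best_p = k, p
        if best is None:                       # nothing close enough
            combinations.append([c])
            used.add(len(combinations) - 1)
        else:
            combinations[best].append(c)
            used.add(best)
    return combinations

# Three simultaneous procedures; in the last frame the third one has stopped
# and a new disturbance appears near sensor 30.
combos = [[10], [50], [90]]
for latest in ([11, 49, 92], [12, 48, 30]):
    combos = extend_combinations(combos, latest)
print(combos)
```

In this sketch a combination that receives no new center sensor simply stops growing, which corresponds to the completed observation sample sequence described above.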

3.2 Hidden Markov model (HMM)

The HMM describes a doubly stochastic process and is effective in processing and identifying time sequences. In this paper, one of the stochastic processes is the transition among multiple states in the vibration signals, and the other is the generation of observation symbols by each state. For convenience, an HMM can be succinctly represented by a triplet $\boldsymbol{\mathrm{\lambda}} = ({\boldsymbol{\mathrm{\pi}},{\mathbf A},{\mathbf B}} )$, which is characterized by the following elements:

(1) $\boldsymbol{\mathrm{\pi}}$ is the initial state probability distribution vector,

(4)$$\boldsymbol{\mathrm{\pi}} = \{{{\pi_1},{\pi_2}, \ldots ,{\pi_N}} \}, $$

where N is the number of states in the model. The set of states is defined by ${\mathbf s} = \{{{\textrm{s}_1},{\textrm{s}_2}, \ldots ,{\textrm{s}_N}} \}$. From the set ${\mathbf v} = \{{{v_1},{v_2}, \cdots ,{v_M}} \}$ of observation symbols per state, we can obtain a state sequence ${\mathbf q}$ and the corresponding observation sequence ${\mathbf x}$. They are

(5)$${\mathbf q} = \{{{q_1},{q_2}, \ldots ,{q_T}} \}, $$

(6)$${\mathbf x} = \{{{x_1},{x_2}, \ldots ,{x_T}} \}, $$

where T is the sequence length.
(2) ${\mathbf A}$ is the state transition probability distribution matrix,

(7)$${\mathbf A} = \{{{a_{ij}}|{1 \le i} \le N,1 \le j \le N} \}, $$

where ${a_{ij}} = P\{{{q_{t + 1}} = {\textrm{s}_j}\,|\,{q_t} = {\textrm{s}_i}} \}$ represents the probability that state ${\textrm{s}_i}$ at time frame t transfers to state ${\textrm{s}_j}$ at time frame $t + 1$.
(3) ${\mathbf B}$ is the observation symbol probability distribution matrix,

(8)$${\mathbf B} = \{{{b_j}(k )|{1 \le j \le N,1 \le k \le M} } \}, $$

where ${b_j}(k )= P\textrm{(}{v_k}\; \textrm{at}\; t\textrm{|}{q_t} = {\textrm{s}_j})$ is the probability that state ${\textrm{s}_j}$ produces the observation symbol ${v_k}$ at time frame t. An example of a simple HMM composed of two states (${\textrm{s}_1}$, ${\textrm{s}_2}$) is shown in Fig. 3, in which circles denote states and arrows denote state transitions, and the state sequence ${\mathbf q}$ is produced during the transitions. In the Markov process, only the sequence ${\mathbf x}$ can be directly observed; it is the observation sample sequence in Section 3.1, and the observation symbol ${x_t}$ is the sequence ${f_{{i_t},t}}$. Therefore, to acquire the matrix ${\mathbf B}$, the number N of states corresponding to each intrusion needs to be obtained, which is analyzed in Section 4.2.
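The triplet $\boldsymbol{\mathrm{\lambda}} = ({\boldsymbol{\mathrm{\pi}},{\mathbf A},{\mathbf B}} )$ can be made concrete with a toy two-state HMM with discrete observation symbols. The numerical values below are arbitrary illustrations, not parameters from the paper, and `sample_hmm` is a hypothetical helper that draws a state sequence ${\mathbf q}$ and an observation sequence ${\mathbf x}$.

```python
import numpy as np

pi = np.array([0.6, 0.4])                 # initial state probabilities
A = np.array([[0.7, 0.3],                 # A[i, j] = P(q_{t+1} = s_j | q_t = s_i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],                 # B[j, k] = P(v_k | q_t = s_j)
              [0.3, 0.7]])

def sample_hmm(pi, A, B, T, rng):
    """Draw a hidden state sequence q and an observation sequence x of length T."""
    q = np.empty(T, dtype=int)
    x = np.empty(T, dtype=int)
    q[0] = rng.choice(len(pi), p=pi)
    x[0] = rng.choice(B.shape[1], p=B[q[0]])
    for t in range(1, T):
        q[t] = rng.choice(len(pi), p=A[q[t - 1]])   # state transition
        x[t] = rng.choice(B.shape[1], p=B[q[t]])    # symbol emitted by the state
    return q, x

rng = np.random.default_rng(0)
q, x = sample_hmm(pi, A, B, T=10, rng=rng)
print("states      :", q)
print("observations:", x)
```

Only `x` would be visible in practice; `q` is hidden and has to be inferred from the model.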

3.3 Construction and identification of GMM-HMM

Instead of the conventional Baum-Welch algorithm, a GMM is employed in the HMM to estimate the observation symbol probability in this paper, so that the spatial distribution characteristics are fully considered. A GMM is a probabilistic model that assumes all the data are generated from a superposition of a finite number of Gaussian distributions. By adjusting the means, covariances and coefficients of a sufficient number of Gaussian distributions, almost any sample distribution density can be approximated to arbitrary accuracy [22]. Assuming the number of Gaussian distributions is K, the GMM can be described as

(9)$$p({\mathbf v}) = \sum\limits_{k = 1}^K {{c_k}} N({{\mathbf v}|{{\mu_k}} ,{\Sigma _k}} ), $$

where ${c_k}$ is the weight of the k-th Gaussian distribution, ${\mathbf v} = \{{{v_1},{v_2}, \cdots ,{v_M}} \}$ is the set of observation symbols per state, and $N({{\mathbf v}\textrm{|}{\mu_k},{\Sigma _k}} )$ represents the probability density function of the k-th Gaussian distribution,

(10)$$N({\mathbf v}|{\mu_k},{\Sigma _k}) = \frac{{\exp \left\{ { - \frac{1}{2}{{({{\mathbf v} - {\mu_k}} )}^T}\Sigma _k^{ - 1}({{\mathbf v} - {\mu_k}} )} \right\}}}{{{{(2\pi )}^{D/2}}{{|{{\Sigma _k}} |}^{1/2}}}}, $$

where ${\Sigma _k}$ is covariance matrix, ${\mu _k}$ is mean vector, and D is the data dimensions of set ${\mathbf v}$.
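Equations (9) and (10) can be evaluated directly. The following minimal sketch assumes full covariance matrices and evaluates a single D-dimensional vector; `gaussian_pdf` and `gmm_pdf` are illustrative names, and the component parameters are arbitrary.

```python
import numpy as np

def gaussian_pdf(v, mu, sigma):
    """Multivariate normal density of Eq. (10) for a single vector v."""
    D = mu.shape[0]
    diff = v - mu
    expo = -0.5 * diff @ np.linalg.solve(sigma, diff)      # Mahalanobis term
    norm = (2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(sigma))
    return np.exp(expo) / norm

def gmm_pdf(v, c, mus, sigmas):
    """Weighted sum of K Gaussian densities, Eq. (9)."""
    return sum(ck * gaussian_pdf(v, mk, sk) for ck, mk, sk in zip(c, mus, sigmas))

# Two-component mixture in two dimensions (arbitrary illustrative values).
c = [0.3, 0.7]
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
sigmas = [np.eye(2), 2.0 * np.eye(2)]
print(gmm_pdf(np.array([1.0, 1.0]), c, mus, sigmas))
```

For long sequences or high dimensions, working with log-densities is numerically safer than multiplying raw densities.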

Fig. 3. A simple HMM composed of two states.


Before the GMM is estimated, the K-means clustering algorithm is applied to initialize the parameters $({{\mu_k},{c_k},{\Sigma _k}} )$. Then, the probability density of the set ${\mathbf v}$ can be formulated as the log-likelihood function

(11)$$\ln p({\mathbf v}|{{\mathbf c},\boldsymbol{\mathrm{\mu}},\boldsymbol{\mathrm{\Sigma}}} ) = \sum\limits_{m = 1}^M {\ln \left\{ {\sum\limits_{k = 1}^K {{c_k}N({v_m}|{{\mu_k},{\Sigma _k}} )} } \right\}}. $$

To construct a well-suited GMM, the expectation maximization (EM) method, which consists of an expectation step (E step) and a maximization step (M step), is employed to maximize the likelihood function by adding a Lagrange multiplier [23]. The E step calculates the auxiliary function ${\hat{\gamma }_{mk}}$, which estimates the probability that the sample frame sequence ${v_m}$ belongs to the k-th Gaussian distribution, and the M step updates the GMM parameters by maximizing the function. The auxiliary function is given by

(12)$${\hat{\gamma }_{mk}} = \frac{{{c_k}N({v_m}|{{\mu_k},{\Sigma _k}} )}}{{\sum\limits_{j = 1}^K {{c_j}N({v_m}|{{\mu_j},{\Sigma _j}} )} }}. $$
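The E step of Eq. (12) and the standard closed-form M-step updates can be sketched as follows. This is a minimal illustration on synthetic data: the crude initialization replaces the K-means initialization used in the paper, the small ridge `reg` added to the covariances is an assumption for numerical stability, and all function names are hypothetical.

```python
import numpy as np

def log_gauss(V, mu, sigma):
    """Log of Eq. (10) for every row of V (shape (M, D))."""
    D = V.shape[1]
    diff = V - mu
    maha = np.sum(diff * np.linalg.solve(sigma, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (maha + D * np.log(2 * np.pi) + logdet)

def e_step(V, c, mus, sigmas):
    """Responsibilities of Eq. (12) and the log-likelihood of Eq. (11)."""
    log_p = np.stack([np.log(c[k]) + log_gauss(V, mus[k], sigmas[k])
                      for k in range(len(c))], axis=1)       # shape (M, K)
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    return np.exp(log_p - log_norm), float(log_norm.sum())

def m_step(V, gamma, reg=1e-6):
    """Closed-form updates of weights, means and covariances."""
    Nk = gamma.sum(axis=0)
    c = Nk / V.shape[0]
    mus = (gamma.T @ V) / Nk[:, None]
    sigmas = []
    for k in range(gamma.shape[1]):
        diff = V - mus[k]
        cov = (gamma[:, k, None] * diff).T @ diff / Nk[k]
        sigmas.append(cov + reg * np.eye(V.shape[1]))        # small ridge (assumption)
    return c, mus, np.array(sigmas)

# Synthetic 2-D data from two clusters (not data from the paper).
rng = np.random.default_rng(0)
V = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(5.0, 1.0, (100, 2))])

c = np.array([0.5, 0.5])
mus = np.array([V[0], V[-1]])            # crude initialization instead of K-means
sigmas = np.array([np.eye(2), np.eye(2)])
history = []
for _ in range(20):
    gamma, loglik = e_step(V, c, mus, sigmas)
    history.append(loglik)
    c, mus, sigmas = m_step(V, gamma)
print(history[0], history[-1])
```

The log-likelihood recorded in `history` should not decrease from one iteration to the next, which is a convenient sanity check for an EM implementation.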

Therefore, the observation symbol probability in Eq. (8) can be expressed by

(13)$${b_j}({v_m}) = \sum\limits_{k = 1}^K {{c_{jk}}N({v_m}|{{\mu_{jk}},{\Sigma _{jk}}} )}, $$

where the matrix ${\mathbf B}$ is obtained by training N GMMs, one for each of the N states, and is then supplied to the HMM for analysis.

Given an observation sample sequence ${\mathbf x} = \{{{x_1},{x_2}, \ldots ,{x_T}} \}$, training the HMM means finding the optimum model parameter $\boldsymbol{\mathrm{\lambda}} = ({\boldsymbol{\mathrm{\pi}},{\mathbf A},{\mathbf B}} )$ that maximizes $P({{\mathbf x}\textrm{|}\boldsymbol{\mathrm{\lambda}}} )$ [24]. The log-likelihood function of the observation sequence and the state sequence can be written as

(14)$$\ln p({\mathbf x},{\mathbf q}|\boldsymbol{\mathrm{\lambda}} ) = \ln {\pi _{{q_1}}} + \sum\limits_{t = 1}^{T - 1} {\ln {a_{{q_t}{q_{t + 1}}}}} + \sum\limits_{t = 1}^T {\ln {b_{{q_t}}}({x_t})}, $$

where ${\pi _{{q_1}}}$ is the initial state probability, and ${a_{{q_t}{q_{t + 1}}}}$ is the transition probability. Additionally, ${\theta _t}(i )= p({{q_t} = {s_i}\textrm{|}{\mathbf x},\boldsymbol{\mathrm{\lambda}}} )$ and ${\xi _t}({i,j} )= p({{q_t} = {s_i},{q_{t + 1}} = {s_j}\textrm{|}{\mathbf x},\boldsymbol{\mathrm{\lambda}}} )$, which are computed by the forward-backward procedure, help to solve the evaluation problem $P({{\mathbf x}\textrm{|}\boldsymbol{\mathrm{\lambda}}} )$. Here ${\theta _t}(i )$ represents the probability of being in state ${s_i}$ at time frame t given the sequence ${\mathbf x}$ and the model $\boldsymbol{\mathrm{\lambda}}$, and ${\xi _t}({i,j} )$ represents the probability of being in state ${s_i}$ at time frame t and in state ${s_j}$ at time frame $t + 1$ given the sequence ${\mathbf x}$ and the model $\boldsymbol{\mathrm{\lambda}}$. Therefore, the HMM parameters $({\boldsymbol{\mathrm{\pi}},{\mathbf A}} )$ can be obtained,

(15)$${\hat{\pi }_i} = \frac{{p({\mathbf x},{q_1} = {s_i}|{\bar{\boldsymbol{\mathrm{\lambda}}}} )}}{{p({\mathbf x}|{\bar{\boldsymbol{\mathrm{\lambda}}}} )}}, $$

(16)$${\hat{a}_{ij}} = \frac{{\sum\limits_{t = 1}^{T - 1} {p({\mathbf x},{q_t} = {s_i},{q_{t + 1}} = {s_j}|{\bar{\boldsymbol{\mathrm{\lambda}}}} )} }}{{\sum\limits_{t = 1}^{T - 1} {p({\mathbf x},{q_t} = {s_i}|{\bar{\boldsymbol{\mathrm{\lambda}}}} )} }}, $$

where the parameters $({\boldsymbol{\mathrm{\pi}},{\mathbf A}} )$ calculated iteratively and the parameter ${\mathbf B}$ evaluated by the GMMs jointly make up the GMM-HMM.
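One re-estimation pass for $({\boldsymbol{\mathrm{\pi}},{\mathbf A}} )$ in Eqs. (15) and (16) can be sketched with a scaled forward-backward procedure. The sketch assumes that the emission likelihoods `b[t, j]`, i.e. ${b_j}({x_t})$, have already been evaluated (for example from per-state GMMs); the numerical values and the function names are illustrative only.

```python
import numpy as np

def forward_backward(pi, A, b):
    """Scaled forward-backward pass.

    b[t, j] = b_j(x_t). Returns theta (T, N), xi (T-1, N, N) and ln P(x | lambda).
    """
    T, N = b.shape
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    scale = np.zeros(T)
    alpha[0] = pi * b[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):                       # forward recursion
        alpha[t] = (alpha[t - 1] @ A) * b[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):              # backward recursion
        beta[t] = A @ (b[t + 1] * beta[t + 1]) / scale[t + 1]
    theta = alpha * beta                        # P(q_t = s_i | x, lambda)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (b[1:] * beta[1:])[:, None, :] / scale[1:, None, None])
    return theta, xi, float(np.log(scale).sum())

def reestimate(theta, xi):
    """Updates corresponding to Eqs. (15) and (16)."""
    pi_hat = theta[0]
    A_hat = xi.sum(axis=0) / theta[:-1].sum(axis=0)[:, None]
    return pi_hat, A_hat

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
b = np.array([[0.9, 0.3], [0.1, 0.7], [0.9, 0.3]])   # illustrative likelihoods
theta, xi, loglik = forward_backward(pi, A, b)
pi_hat, A_hat = reestimate(theta, xi)
print(loglik, pi_hat, A_hat, sep="\n")
```

Iterating `forward_backward` and `reestimate` until the log-likelihood stops increasing gives the trained $({\boldsymbol{\mathrm{\pi}},{\mathbf A}} )$ for one event class.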

After the GMM-HMM is constructed, intrusions are identified by using the Viterbi algorithm [25] to find the “best” state transition path of the test observation sample sequence ${\mathbf o} = \{{{o_1},{o_2}, \ldots ,{o_T}} \}$ under the previously trained GMM-HMM. Depending on the difficulty of GMM-HMM training, the input to the model can be the raw sequence or the feature sequence extracted from the raw sequence. The principles of model construction and identification are shown in Fig. 4, and the primary notations used in this section are listed in Table 1.
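A log-domain Viterbi search can be sketched as below. The decision rule in `identify`, which picks the trained model with the highest best-path score, is a common choice and an assumption here, since the paper does not spell out the rule. The two toy models use discrete emissions with arbitrary values and hypothetical names; with a GMM-HMM the rows of `log_b` would come from the per-state GMMs instead.

```python
import numpy as np

def viterbi(log_pi, log_A, log_b):
    """Most likely state path and its log-probability.

    log_b[t, j] = ln b_j(o_t), e.g. evaluated from the per-state GMMs.
    """
    T, N = log_b.shape
    delta = log_pi + log_b[0]                 # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)         # back-pointers
    for t in range(1, T):
        cand = delta[:, None] + log_A         # cand[i, j]: come from i, go to j
        psi[t] = np.argmax(cand, axis=0)
        delta = cand[psi[t], np.arange(N)] + log_b[t]
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):             # backtracking
        path[t - 1] = psi[t, path[t]]
    return path, float(np.max(delta))

def identify(models, obs):
    """Pick the model whose best path explains obs with the highest score."""
    scores = {}
    for name, (pi, A, B) in models.items():
        log_b = np.log(B[:, obs]).T           # discrete emissions for this toy case
        scores[name] = viterbi(np.log(pi), np.log(A), log_b)[1]
    return max(scores, key=scores.get), scores

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
models = {
    "model_a": (pi, A, np.array([[0.9, 0.1], [0.3, 0.7]])),
    "model_b": (pi, A, np.array([[0.5, 0.5], [0.5, 0.5]])),
}
obs = np.array([0, 0, 1, 1])
label, scores = identify(models, obs)
path, score = viterbi(np.log(pi), np.log(A), np.log(models["model_a"][2][:, obs]).T)
print(label, path, score)
```

Working in the log domain avoids numerical underflow when the observation sequence is long.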

Fig. 4. Construction and identification of GMM-HMM.


Table 1. Notations


4. Results and discussion

4.1 Experimental scenario

The experimental area is Ezhou Huahu Airport, and each UWFBG array has a total length of about 2 km and a spatial resolution of 5 m. The vibration responses are detected by the two UWFBG arrays shown in Fig. 1(a), and the vibration signals are obtained by using the demodulator shown in Fig. 1(b). The sampling rate of the system is 1 kHz, and L is set to 1000. Figure 5 shows four abnormal events collected in the field experiments. The signals corresponding to the five events (including wind blowing) are acquired in the following ways: the signals collected from UWFBG arrays I are generated by heavy truck passing and human footsteps on the ground surface; the signals collected from UWFBG arrays II are generated by knocking, climbing and wind blowing on the fence.

Fig. 5. Four common abnormal events collected by two different UWFBG arrays. (a) Walking. (b) Heavy truck passing. (c) Knocking. (d) Climbing.


4.2 Signal state analysis

The signals and corresponding time-frequency spectra of two common abnormal events in UWFBG arrays I are shown in Fig. 6, in which the time-frequency spectrum is obtained by calculating and normalizing the spectrum of each sequence. The walking signal contains discrete pulse responses, and the signal between the discrete pulse responses is quiet. Therefore, the walking signal is split into the transition of two states, discrete pulse and quiet. In contrast, the signal of heavy truck passing has continuous pulse responses, and the response intensity shows three trends over time: increasing, invariant and decreasing. Therefore, the heavy truck passing signal is regarded as the transition of three states.

Fig. 6. Signal and corresponding time-frequency spectrum of two common abnormal events in UWFBG arrays I. (a) Walking. (b) Heavy truck passing.


The signals and corresponding time-frequency spectra of three common abnormal events in UWFBG arrays II are shown in Fig. 7. Compared to Fig. 6, the three vibration signals are more complex. The knocking signal differs from the climbing signal in that it changes periodically; within one period, however, the response intensity of both signals follows roughly the same trends, namely increasing, decreasing and invariant. Therefore, the two signals are decomposed into the transition of three states. Additionally, Fig. 7(c) shows that the trends of the response intensity in the wind blowing signal cannot be directly acquired. Therefore, the K-means clustering algorithm is employed to obtain the transition of states in the wind blowing signal, with the silhouette coefficient (contour coefficient) used as the evaluation criterion. As shown in Fig. 8, the clustering performs best when the number of clusters is two. Thus, the wind blowing signal is treated as the transition of two states.
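Choosing the number of states by clustering can be sketched as follows, assuming that the "contour coefficient" refers to the usual silhouette coefficient. The sketch uses a plain Lloyd-style K-means and synthetic two-dimensional points in place of the wind blowing features, so both the data and the helper names `kmeans` and `silhouette` are illustrative.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd iterations with random initial centers; returns labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def silhouette(X, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() < 2 or len(clusters) < 2:
            continue                              # silhouette defined as 0 here
        a = dist[i, own].sum() / (own.sum() - 1)  # mean distance inside own cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

# Synthetic stand-in for per-frame features: two well-separated groups.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (60, 2)), rng.normal(10.0, 1.0, (60, 2))])

scores = {k: silhouette(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The number of clusters with the highest score is then taken as the number of states N for that event.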

Fig. 7. Signal and corresponding time-frequency spectrum of three common abnormal events in UWFBG arrays II. (a) Knocking. (b) Climbing. (c) Wind blowing.


Fig. 8. Clustering evaluation for wind blowing signal.


Finally, the numbers N of states in Eq. (4) corresponding to walking, heavy truck passing, knocking, climbing and wind blowing are obtained, which are two, three, three, three and two, respectively.

4.3 Data preprocessing

To fully represent the signal characteristics, we extract features from the time domain and the frequency domain. Six time-domain features are collected: zero-crossing rate, waveform index, pulse factor, kurtosis, peak and variance. Nine frequency-domain features are collected. They include not only the center frequency, main frequency and mean square frequency, but also mixed features of the frequency spectrum: the frequency band of 0–300 Hz is selected and evenly divided into six sub-bands, and six features are obtained by separately calculating the kurtosis of each sub-band. Finally, a 15-dimensional feature vector is acquired by normalizing and standardizing the 15 features collected from the time domain and the frequency domain.
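The 15 features can be sketched as below. The paper names the features but does not give formulas, so the definitions used here are common conventions and should be read as assumptions: waveform index = RMS / mean absolute value, pulse factor = peak / mean absolute value, center frequency = power-weighted mean frequency, main frequency = frequency of the largest spectral peak, mean square frequency = power-weighted mean of squared frequency, and band kurtosis = kurtosis of the power spectrum inside each 50-Hz band. The final normalization and standardization across the dataset are omitted.

```python
import numpy as np

def kurtosis(x):
    """Fourth standardized moment (non-excess kurtosis)."""
    x = np.asarray(x, dtype=float)
    var = x.var()
    return float(np.mean((x - x.mean()) ** 4) / var ** 2) if var > 0 else 0.0

def time_features(x):
    """Zero-crossing rate, waveform index, pulse factor, kurtosis, peak, variance."""
    abs_mean = np.mean(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    zcr = np.mean(np.signbit(x[:-1]) != np.signbit(x[1:]))
    return np.array([zcr, rms / abs_mean, peak / abs_mean, kurtosis(x), peak, x.var()])

def freq_features(x, fs=1000.0, f_max=300.0, bands=6):
    """Center, main and mean square frequency plus kurtosis of six sub-bands."""
    spec = np.abs(np.fft.rfft(x)) ** 2                 # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    total = spec.sum()
    center = (freqs * spec).sum() / total
    main = freqs[np.argmax(spec)]
    msf = (freqs ** 2 * spec).sum() / total
    edges = np.linspace(0.0, f_max, bands + 1)         # 0, 50, ..., 300 Hz
    band_k = [kurtosis(spec[(freqs >= lo) & (freqs < hi)])
              for lo, hi in zip(edges[:-1], edges[1:])]
    return np.array([center, main, msf, *band_k])

def feature_vector(x, fs=1000.0):
    x = np.asarray(x, dtype=float)
    return np.concatenate([time_features(x), freq_features(x, fs)])

# One synthetic 1-s frame: a 50-Hz tone with a little noise (not measured data).
rng = np.random.default_rng(0)
n = np.arange(1000)
x = np.sin(2 * np.pi * 50 * n / 1000.0) + 0.05 * rng.standard_normal(1000)
fv = feature_vector(x)
print(fv.shape, fv[7])
```

Applying `feature_vector` to each of the T sample frame sequences of an observation sample sequence gives a sequence of T feature vectors.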

For the two events (walking and heavy truck passing) from UWFBG arrays I, 64 groups of observation sample sequences are selected for each event and randomly divided into two parts: 45 groups as training observation sample sequences and 19 groups as testing observation sample sequences. For the three events (knocking, climbing and wind blowing) from UWFBG arrays II, 85 groups of observation sample sequences are selected for each event and randomly divided into two parts: 60 groups as training observation sample sequences and 25 groups as testing observation sample sequences.
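The random division into training and testing groups can be sketched in a few lines. The seed and the helper name `split_groups` are arbitrary; the paper does not state how the random split was generated.

```python
import random

def split_groups(n_groups, n_train, seed=0):
    """Randomly split group indices 0..n_groups-1 into training and testing sets."""
    idx = list(range(n_groups))
    random.Random(seed).shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])

train_i, test_i = split_groups(64, 45)     # each event of UWFBG arrays I
train_ii, test_ii = split_groups(85, 60)   # each event of UWFBG arrays II
print(len(train_i), len(test_i), len(train_ii), len(test_ii))
```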

Before model training, the number K of Gaussian distributions in Eq. (9) needs to be chosen. Because the GMM is applied to express the distribution density of the observation symbols per state, K is determined by the size of the observation symbol set ${\mathbf v}$ in Eq. (10). In the observation sample sequences, the size of the set ${\mathbf v}$ is about 350 in each state, so the GMM suffers from overfitting and low efficiency when K is too high ($K > 8$). In addition, our calculations show that the obtained distribution information is insufficient when K is too small (1–3). Thus, K is set to six, four, four, six and eight for walking, heavy truck passing, knocking, climbing and wind blowing, respectively.

4.4 Evaluating the performance of classification models

To show the performance of the GMM-HMM, several classification models are used for comparison, including RF, decision tree (DT), SVM, RVM, CNN and CNN-LSTM. The classifiers based on machine learning methods adopt the same feature extraction as in Section 4.3, so their input is a 15-dimensional feature vector extracted from an observation sample sequence ${\mathbf x} = \{{{x_1},{x_2}, \ldots ,{x_T}} \}$. For the GMM-HMM, a 15-dimensional feature vector is extracted from each sequence ${x_k}$, so that a $15 \times T$-dimensional feature sequence is obtained from the sequence ${\mathbf x}$ and used as the input. The inputs of CNN and CNN-LSTM are the raw sequence ${\mathbf x}$.

The ${F_1}$ score is selected as the performance metric for the seven classifiers. The ${F_1}$ score considers both the precision P and the recall R, and is calculated by

(17)$${F_1} = \frac{{2PR}}{{P + R}}. $$
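Equation (17) can be computed from the counts of true positives, false positives and false negatives of one event class. The counts in the example are hypothetical and are not taken from the tables of this paper.

```python
def f1_score(tp, fp, fn):
    """F1 score of Eq. (17) from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one class: 24 correct detections,
# 1 false alarm and 1 missed event.
print(f1_score(tp=24, fp=1, fn=1))
```

The guards against empty denominators return 0 when a class is never predicted or never present, which is one common convention.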

Table 2 shows the identification rates of the five abnormal events using the GMM-HMM and the other six classifiers. The GMM-HMM has the highest identification rates for walking, knocking and wind blowing, which are 100%, 96% and 100%, respectively. The CNN-LSTM has the highest identification rates for heavy truck passing and climbing, both of which are 100%.

Table 2. Identification rates in five abnormal events for seven classifiers


Figure 9 shows the ${F_1}$ scores in UWFBG arrays I and II for the seven classifiers. Most classifiers perform well in UWFBG arrays I, but the machine learning methods (RVM, SVM, DT and RF) and the CNN perform worse in UWFBG arrays II. In contrast, CNN-LSTM and GMM-HMM perform well in UWFBG arrays II because both can analyze the time dependencies of vibration signals.

Fig. 9. ${F_1}$ scores in UWFBG arrays I and II for seven classifiers.


Table 3 details the identification performance of the seven classifiers, including the average identification rates (AIR). The identification performance is best when the GMM-HMM is adopted, with an average ${F_1}$ score of 0.973 and an AIR of 98.2%. Compared to the other six classifiers, the GMM-HMM is more effective for intrusion identification in perimeter monitoring.

Table 3. Comparison of identification performance for seven classifiers


5. Conclusion

On the basis of the remarkable advantages of UWFBG arrays, such as high spatial resolution, fast sensing response and dynamic measurement, an intrusion identification scheme based on UWFBG arrays is proposed to address the challenge of achieving a high identification rate in long-distance perimeter monitoring. By analyzing the relevant sensors of the UWFBG arrays, the vibration signals with time dependencies are obtained, and features are extracted from them as the input of the GMM-HMM. Then, intrusions are identified by the simultaneous analysis of the time dependencies and the features. The GMM-HMM effectively identifies intrusions and reduces the heavy cost of collecting a large dataset. The experimental results show that the proposed scheme can discriminate three intrusions and two non-intrusions with an average identification rate of 98.2%, which is better than the other six classifiers. These results indicate that the proposed scheme can meet the requirements of most practical applications in long-distance perimeter monitoring.

Funding

National Natural Science Foundation of China (61735013).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. C. Hsu and H. Szu, “Smart sensing surveillance video system,” in Sensing and Analysis Technologies for Biomedical and Cognitive Applications, vol. 9871, p. 98710W (2016).

2. H. Xu, J. Qiao, J. Zhang, H. Han, J. Li, L. Liu, and B. Wang, “A High-Resolution Leaky Coaxial Cable Sensor Using a Wideband Chaotic Signal,” Sensors 18(12), 4154 (2018).

3. Q. Wang, H. Yigitler, R. Jantti, and X. Huang, “Localizing Multiple Objects using Radio Tomographic Imaging Technology,” IEEE Trans. Veh. Technol. 65(5), 3641–3656 (2016).

4. Y. Muanenda, “Recent Advances in Distributed Acoustic Sensing Based on Phase-Sensitive Optical Time Domain Reflectometry,” J. Sens. 2018, 1–16 (2018).

5. Y. Bai, J. Xing, F. Xie, S. Liu, and J. Li, “Detection and identification of external intrusion signals from 33 km optical fiber sensing system based on deep learning,” Opt. Fiber Technol. 53, 1 (2019).

6. X. Gui, Z. Li, F. Wang, Y. Wang, C. Wang, S. Zeng, and H. Yu, “Distributed sensing technology of high-spatial resolution based on dense ultra-short FBG arrays with large multiplexing capacity,” Opt. Express 25(23), 28112–28122 (2017).

7. C. Wang, Y. Shang, X. Liu, C. Wang, H. Yu, D. Jiang, and G. Peng, “Distributed OTDR-interferometric sensing network with identical ultra-weak fiber Bragg gratings,” Opt. Express 23(22), 29038–29046 (2015).

8. Q. Nan, S. Li, Y. Yao, Z. Li, H. Wang, L. Wang, and L. Sun, “A Novel Monitoring Approach for Train Tracking and Incursion Detection in Underground Structures Based on Ultra-Weak FBG Sensing Arrays,” Sensors 19(12), 2666 (2019).

9. L. Xin, Z. Li, X. Gui, X. Fu, M. Fan, J. Wang, and H. Wang, “Surface intrusion event identification for subway tunnels using ultra-weak FBG arrays based fiber sensing,” Opt. Express 28(5), 6794–6805 (2020).

10. X. Huang, B. Wang, K. Liu, and T. Liu, “An Event Recognition Scheme Aiming to Improve Both Accuracy and Efficiency in Optical Fiber Perimeter Security System,” J. Lightwave Technol. 38(20), 5783–5790 (2020).

11. F. Liu, D. Ma, S. Li, W. Gan, and Z. Li, “Classifying tunnel anomalies based on ultraweak FBGs signal and transductive RVM combined with Gaussian mixture model,” IEEE Sens. J. 20(11), 6012–6019 (2020).

12. P. Ma, K. Liu, J. Jiang, Z. Li, P. Li, and T. Liu, “Probabilistic Event Discrimination Algorithm for Fiber Optic Perimeter Security Systems,” J. Lightwave Technol. 36(11), 2069–2075 (2018).

13. C. Lyu, Z. Huo, X. Cheng, J. Jiang, A. Alimasi, and H. Liu, “Distributed Optical Fiber Sensing Intrusion Pattern Recognition Based on GAF and CNN,” J. Lightwave Technol. 38(15), 4174–4182 (2020).

14. Z. Li, J. Zhang, M. Wang, Y. Zhong, and F. Peng, “Fiber distributed acoustic sensing using convolutional long short-term memory network: a field test on high-speed railway intrusion detection,” Opt. Express 28(3), 2925–2938 (2020).

15. C. Arellano and R. Dahyot, “Robust ellipse detection with Gaussian mixture models,” Pattern Recognit. 58, 12–26 (2016).

16. T. Reitmaier and B. Sick, “The responsibility weighted Mahalanobis kernel for semi-supervised training of support vector machines for classification,” Inf. Sci. 323, 179–198 (2015).

17. B. Kulis, S. Basu, I. Dhillon, and R. Mooney, “Semi-supervised graph clustering: a kernel approach,” Mach. Learn. 74(1), 1–22 (2009).

18. J. Lichtenauer, E. Hendriks, and M. Reinders, “Sign Language Recognition by Combining Statistical DTW and Independent Classification,” IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 2040–2046 (2008).

19. G. Dahl, D. Yu, L. Deng, and A. Acero, “Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition,” IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012).

20. H. Guo, F. Liu, Y. Yuan, H. Yu, and M. Yang, “Ultra-weak FBG and its refractive index distribution in the drawing optical fiber,” Opt. Express 23(4), 4829–4838 (2015).

21. Y. Tong, Z. Li, J. Wang, H. Wang, and H. Yu, “High-speed Mach-Zehnder-OTDR distributed optical fiber vibration sensor using medium-coherence laser,” Photonic Sens. 8(3), 203–212 (2018).

22. A. C. Faul and M. E. Tipping, “Analysis of sparse Bayesian learning,” in 15th Annual Conference on Neural Information Processing Systems (NIPS), vol.14, pp. 383–389 (2002).

23. Z. Ju and H. Liu, “Fuzzy Gaussian Mixture Models,” Pattern Recognit. 45(3), 1146–1158 (2012). [CrossRef]

24. X. Li, M. Parizeau, and R. Plamondon, “Training hidden Markov models with multiple observations - A combinatorial method,” IEEE Trans. Pattern Anal. Mach. Intell. 22(4), 371–377 (2000). [CrossRef]

25. S. Eddy, “Profile hidden Markov models,” Bioinformatics. 14(9), 755–763 (1998). [CrossRef]

Parameter	Definition	Parameter	Definition
L	The length of sample frame sequence	$f_{i_{t}, t}$	Sample frame sequence
i	Sensor	K	The number of Gaussian distributions
t	Time frame	$q$	State sequence
N	The length of $s$	$s$	The set of states
T	The length of $x$	$x$	Training observation sample sequence
D	The data dimensions of $v$	$v$	The set of observation symbols per state
M	The length of $v$	$o$	Testing observation sample sequence
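
To make the roles of the symbols above concrete, the following is a minimal sketch of how a testing observation sequence $o$ could be scored against per-event GMM-HMMs with the standard forward algorithm and assigned to the best-scoring event. It is a generic illustration, not the implementation used in this work. The dictionary layout (`pi`, `A`, `states`, `weights`, `means`, `vars`), the diagonal covariances, the helper `toy_model`, and the toy parameters are all assumptions introduced only for the example.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian_diag(frame, mean, var):
    """Log density of a D-dimensional Gaussian with diagonal covariance (assumed)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
        for x, mu, v in zip(frame, mean, var)
    )

def log_emission(frame, state):
    """Log of the K-component Gaussian mixture density of one state for one frame."""
    return logsumexp([
        math.log(w) + log_gaussian_diag(frame, mu, var)
        for w, mu, var in zip(state["weights"], state["means"], state["vars"])
    ])

def forward_log_likelihood(o, model):
    """Log P(o | model) by the forward algorithm in the log domain.

    Assumes all initial and transition probabilities are strictly positive.
    """
    n = len(model["states"])  # N, the number of hidden states
    alpha = [
        math.log(model["pi"][j]) + log_emission(o[0], model["states"][j])
        for j in range(n)
    ]
    for frame in o[1:]:
        alpha = [
            logsumexp([alpha[i] + math.log(model["A"][i][j]) for i in range(n)])
            + log_emission(frame, model["states"][j])
            for j in range(n)
        ]
    return logsumexp(alpha)

def identify(o, models):
    """Return the event label whose model gives the highest log-likelihood."""
    return max(models, key=lambda label: forward_log_likelihood(o, models[label]))

def toy_model(mean):
    """Hypothetical 2-state model with N = 2, K = 1, D = 1 (illustration only)."""
    state = {"weights": [1.0], "means": [[mean]], "vars": [[1.0]]}
    return {"pi": [0.5, 0.5], "A": [[0.5, 0.5], [0.5, 0.5]], "states": [state, state]}

models = {"event_a": toy_model(0.0), "event_b": toy_model(5.0)}
label = identify([[4.8], [5.1], [5.3]], models)
```

In this toy setting the frames lie near 5, so `label` is `"event_b"`. In practice the model parameters would be estimated from the training sequences $x$, which is outside the scope of this sketch.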

Classifier	Walking (UWFBG arrays I)	Heavy truck passing (UWFBG arrays I)	Climbing (UWFBG arrays II)	Knocking (UWFBG arrays II)	Wind blowing (UWFBG arrays II)
RVM	95.2%	100%	86.4%	92.9%	88.9%
SVM	100%	94.4%	90.9%	89.3%	85.2%
DT	90.5%	100%	81.8%	82.1%	81.5%
RF	100%	94.4%	86.4%	85.7%	85.2%
CNN	100%	100%	76.3%	71.1%	98.8%
CNN-LSTM	88%	100%	100%	94.7%	97.5%
GMM-HMM	100%	94.7%	96%	96%	100%

Performance	RVM	SVM	DT	RF	CNN	CNN-LSTM	GMM-HMM
$F_{1}$ score	0.920	0.911	0.852	0.902	0.891	0.957	0.973
AIR	91.1%	92.2%	84.2%	88.8%	91.4%	93.5%	98.2%
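
As a reading aid for the two performance measures above, the sketch below computes per-class precision, recall, and $F_{1}$ score from a confusion matrix, together with a macro-averaged $F_{1}$ and an overall identification rate. The averaging conventions are assumptions, since this section does not state how the reported $F_{1}$ score and AIR were aggregated. The matrix `toy` is invented for illustration and is not data from the experiments.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other classes labelled k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics

def macro_f1(confusion):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    metrics = per_class_metrics(confusion)
    return sum(m["f1"] for m in metrics) / len(metrics)

def overall_rate(confusion):
    """Fraction of all samples that were identified correctly."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total

# Hypothetical 3-class confusion matrix, for illustration only.
toy = [
    [9, 1, 0],
    [0, 10, 0],
    [2, 0, 8],
]
toy_macro_f1 = macro_f1(toy)
toy_rate = overall_rate(toy)
```

Note that an overall rate weights each class by its number of samples, so it generally differs from the plain mean of the per-class identification rates.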

Optics Express