Optica Publishing Group

Automatic baseline correction method for the open-path Fourier transform infrared spectra by using simple iterative averaging

Open Access

Abstract

Before open-path Fourier transform infrared (OP-FTIR) spectra can be used for further qualitative or quantitative analysis, the baseline must be removed. An automatic baseline correction method, the iterative averaging (IA) method, is presented. When applied to experimental FTIR spectra and simulated data, the method estimates the baseline accurately and is more precise than the other methods tested. It addresses a key requirement of real-time on-line OP-FTIR spectral analysis and improves the capability and adaptability of unsupervised on-line systems.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Spectral analysis has been applied ever more broadly as spectroscopic technology has developed. Our laboratory-built open-path Fourier transform infrared (OP-FTIR) spectrometer is mainly used for monitoring atmospheric trace gases. However, the measured spectra have a wandering baseline because the background of the system, such as the light source and the atmospheric environment, changes constantly. The baseline must therefore be removed before further spectral analysis. Any acquired spectrum, whether measured by an infrared spectrometer or another type of spectrometer, often contains undesirable elements such as noise and background in addition to the desired signal [1]. Many baseline correction methods [2–13] have been proposed according to the characteristics of the baseline in different kinds of spectra. However, most of them have one or more tuning parameters, which limits their use in unsupervised on-line spectrometers. These methods therefore cannot meet the demands of an OP-FTIR system that requires on-line spectral analysis.

Many common baseline correction methods have been proposed, such as the wavelet transform [2,8], polynomial fitting [3], morphological weighted penalized least squares [4], adaptive iteratively reweighted penalized least squares [5], automatic iterative moving average [6] and asymmetric least squares [7,12]. The wavelet-transform method performs a discrete wavelet transform to separate the spectrum into an "approximation" part and a "detail" part. Shao and Griffiths [8] proposed an algorithm in which the spectrum is recursively decomposed into approximation and detail components. The baseline obtained in this way can be somewhat distorted, and the optimal wavelet basis, decomposition level, wavelet coefficients and threshold are difficult to choose. The polynomial fitting algorithm assumes that the baseline can be approximated by a low-order polynomial, and it is not suitable for spectra with a low signal-to-noise ratio. The automatic iterative moving average (AIMA) method is not suitable when the spectral peaks overlap, and the baseline it produces is not smooth. Asymmetric least squares (ALS) and adaptive iteratively reweighted penalized least squares (airPLS) are both based on the Whittaker smoother. Both require parameter optimization to obtain a good baseline estimate, which limits their application to on-line spectral analysis. The morphological weighted penalized least squares (MPLS) algorithm is derived from mathematical morphology, a theory based mainly on classical set operations. The traditional rubber-band method [9] first divides the spectrum into segments and then estimates the baseline by linear or spline interpolation through the lowest point of each segment. Its performance depends on the experience of the user.

Analyzing OP-FTIR spectra on-line is difficult because the background changes constantly. We therefore propose an automatic iterative averaging (IA) method based on the moving average [14] and on the fact that the background of the spectra varies slowly. The main step of the IA method is to reduce the peaks of the spectrum repeatedly until the termination condition is satisfied. When applied to experimental and simulated data, the IA method estimates the baseline automatically and compares well with other methods such as rubber-band, AIMA, airPLS and MPLS. It is implemented in MATLAB on Windows 10.

2. Methods

2.1 Principle

The term 'baseline' is defined in several different ways in the literature. The spectral data Y are commonly regarded as the sum of three components: baseline, b; signal, s; and noise, n [15]. Once the OP-FTIR spectra have been normalized and transformed into absorbance, the simplest description is purely additive: Y = b + s + n. Baseline correction then amounts to subtracting the estimated baseline from the spectral data, leaving only the pure signal and some noise. In general, the noise or baseline may vary in a linear or nonlinear way with the intensity of the signal. The IA baseline correction method relies on the fact that the background of the spectra varies slowly. We implemented the method on the basis of the moving average and defined a termination condition for it. The main procedure of the IA method is as follows.

First, we obtain a baseline estimate by repeatedly reducing the peaks of the spectrum. The intensities of the spectrum are treated as an equally spaced array X = [X1, X2, X3, …, XN], and we start by updating the intensities using:

$$X_{i+1} = \min\left(X_{i+1},\ \frac{X_i + X_{i+2}}{2}\right)$$

where i = 1, 2, 3, …, N−1. Each following iteration updates the intensities in the same way but leaves out the first and last updates of the previous one. The iterations terminate when the index i of the first update reaches floor(N/2), where N is the number of intensities. This gives a new array X′ in which the peaks of the spectrum have been reduced. We then calculate the value Sabs as follows:

$$S_{\mathrm{abs}} = \sum_{i=1}^{N} \left| X_i - X_i' \right|$$

where i = 1, 2, 3, …, N. We repeat the above steps until ΔSabs/Sabs is less than a threshold, which gives a good estimate of the baseline of the FTIR spectra. The threshold, which should be adjusted when the method is used for other kinds of spectra such as Raman spectra, is determined in the next section. This completes the IA algorithm. The procedure is summarized in the flow chart shown in Fig. 1.

Fig. 1 The flow chart of IA algorithm.
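
The procedure of Section 2.1 can be sketched in Python roughly as follows. This is a minimal illustration and not the authors' MATLAB implementation. Several details are assumptions, because the text leaves them open: the update is applied in place from left to right, the index i is limited so that X(i+2) exists, Sabs compares the original spectrum with the current estimate, and ΔSabs is the change in Sabs between two successive sweeps. The names `ia_sweep`, `ia_baseline` and `max_iter` are hypothetical.

```python
import numpy as np


def ia_sweep(x):
    """One sweep: min-with-neighbour-mean passes over a shrinking window."""
    x = np.asarray(x, dtype=float).copy()
    n = x.size
    # The first updated index moves inward until it reaches floor(N/2).
    for start in range(n // 2):
        stop = n - 2 - start  # keeps i + 2 inside the array (assumption)
        for i in range(start, stop):
            x[i + 1] = min(x[i + 1], 0.5 * (x[i] + x[i + 2]))
    return x


def ia_baseline(y, threshold=0.0021, max_iter=200):
    """Repeat sweeps until the relative change of S_abs drops below threshold."""
    y = np.asarray(y, dtype=float)
    est = y.copy()
    s_prev = None
    for _ in range(max_iter):  # max_iter is a safety cap, not from the paper
        est = ia_sweep(est)
        s_abs = np.sum(np.abs(y - est))  # S_abs vs. the original (assumption)
        if s_abs == 0.0:
            break  # nothing was removed, e.g. a peak-free spectrum
        if s_prev is not None and abs(s_abs - s_prev) / s_abs < threshold:
            break
        s_prev = s_abs
    return est


# Toy check on synthetic data (values are illustrative, not from the paper).
t = np.linspace(0.0, 1.0, 201)
true_baseline = 0.1 + 0.2 * t
spectrum = true_baseline + np.exp(-0.5 * ((t - 0.5) / 0.02) ** 2)
estimate = ia_baseline(spectrum)
corrected = spectrum - estimate
```

Because each update takes a minimum, the estimate can never rise above the input spectrum, and the two end points are never modified. The default threshold of 0.0021 is the value the paper selects for FTIR spectra in Section 2.2.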

2.2 Thresholding

To create raw spectra with a known signal and baseline, we add a low-order polynomial baseline to the standard spectrum of ammonia taken from the spectral database established by Pacific Northwest National Laboratory. The polynomial is y = 0.1 + 0.2*x + 0.03*x^2 + 0.02*x^3, where x represents the wavenumber, ranging from 700 to 1900 cm−1. We then applied the IA method to the raw spectra to estimate the baseline, with the threshold set to a series of values from 0.000001 to 0.1 divided into 20000 segments. Next, we calculated the root mean square error (RMSE) between the standard spectra and the corrected spectra for each threshold, together with the computation time. The relationship between the threshold, RMSE and computation time is shown in Fig. 2. A suitable threshold for the termination condition is one for which the RMSE is small and the computation time is acceptable when the method is applied to FTIR spectra. Based on the results in Fig. 2, we chose 0.0021 as the threshold. In the next section, we apply the IA method to simulated and experimental data to verify its performance.

Fig. 2 Relationship between the RMSE, threshold and computation time.
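
The threshold scan of Section 2.2 can be sketched as a small harness that records the RMSE and the elapsed time for each candidate threshold. Everything here is illustrative: `scan_thresholds` and `toy_estimator` are hypothetical names, the estimator is a simplified stand-in rather than the IA implementation used in the paper, the ammonia spectrum is replaced by a synthetic Gaussian peak, and the polynomial is evaluated on a normalized axis, which is an assumption.

```python
import time

import numpy as np


def rmse(a, b):
    """Root mean square error between two equally sized arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


def toy_estimator(y, threshold, max_iter=2000):
    """Hypothetical stand-in: vectorised min-with-neighbour-mean passes."""
    y = np.asarray(y, dtype=float)
    est = y.copy()
    s_prev = None
    for _ in range(max_iter):
        est[1:-1] = np.minimum(est[1:-1], 0.5 * (est[:-2] + est[2:]))
        s_abs = np.sum(np.abs(y - est))
        if s_abs == 0.0:
            break
        if s_prev is not None and abs(s_abs - s_prev) / s_abs < threshold:
            break
        s_prev = s_abs
    return est


def scan_thresholds(estimator, raw, reference, thresholds):
    """Return one (threshold, rmse, seconds) row per candidate threshold."""
    rows = []
    for th in thresholds:
        t0 = time.perf_counter()
        baseline_est = estimator(raw, th)
        seconds = time.perf_counter() - t0
        rows.append((float(th), rmse(raw - baseline_est, reference), seconds))
    return rows


# Threshold grid from the text: 1e-6 to 0.1 in 20000 segments (20001 points).
grid = np.linspace(1e-6, 0.1, 20001)

# Synthetic stand-in for the ammonia spectrum (illustrative only).
x = np.linspace(0.0, 1.0, 301)  # normalised axis, an assumption
baseline = 0.1 + 0.2 * x + 0.03 * x**2 + 0.02 * x**3
signal = np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
rows = scan_thresholds(toy_estimator, baseline + signal, signal, grid[::5000])
```

Only every 5000th grid point is scanned here to keep the example fast. Plotting the RMSE and time columns of `rows` against the threshold would give a figure of the same kind as Fig. 2, from which a threshold with small RMSE and acceptable run time can be read off.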

3. Experimental results and discussion

3.1 Simulated data

As the standard simulated spectrum, we use the part of the standard ammonia spectrum from 913 to 941 cm−1, in which the peaks overlap, shown in Fig. 3(a). We then add a baseline, expressed by the polynomial y = 0.5 + 0.08*x1 + 0.02*x1^2 + 0.002*x1^3 + 0.001*x1^4, to the standard spectrum to generate the simulated raw spectrum shown in Fig. 3(b). We estimated the baseline of the simulated raw spectrum using the AIMA, IA, rubber-band, airPLS and MPLS methods; the results are shown in Fig. 3(c). The data in Fig. 3 are available in Dataset 1, named simulated data [16]. Figure 3(c) shows that the baseline estimated by the IA method is more precise than the others. The AIMA and rubber-band methods cannot estimate the baseline well, especially where the peaks overlap. We also chose three peaks of different heights in the raw spectrum to evaluate the baselines corrected by these methods. The results are shown in Table 1, where peaks 1, 2 and 3 in Fig. 3(a) are the reference peaks with standard heights, and their positions are marked in the figures. The RMSE is the error between the corrected peaks and the standard peaks. The RMSE of the IA method is close to that of airPLS and about half that of AIMA. The RMSE values in the table indicate that the IA method outperforms the others. We therefore conclude that the IA baseline correction method gives better results than these methods and estimates the baseline of FTIR spectra well.

Fig. 3 (a) Standard spectra; (b) Simulated raw spectra; (c) Baseline corrected by these methods.

Table 1. Three different peaks of the spectra corrected by these baseline correction methods

3.2 Experimental data

The experimental spectra of SF6 were measured with our laboratory-built OP-FTIR spectrometer at a resolution of 1 cm−1, with one scan per spectrum. We measured the background spectra continuously before pumping SF6 gas into the cell, and then measured the spectra of SF6. This gives the true background spectrum for the SF6 measurement, which we regard as the standard background spectrum. The background spectrum and the raw spectrum, a part of the measured spectrum from 890 to 980 cm−1, are shown in Fig. 4(a).

Fig. 4 (a) Raw spectra and background spectra. (b) Baseline corrected by these methods.

Next, we applied the IA baseline correction method to the experimental data to verify its performance and compared it with the other methods. The baselines estimated by these methods are shown in Fig. 4(b), each in a different color. The data in Fig. 4, named experimental data, are available in Dataset 1 [16]. In Fig. 4(b) the baseline estimated by the IA method appears closest to the standard background spectrum, but the best estimate is not easy to identify from the figure alone. We therefore calculated the RMSE between the standard background spectrum and each estimated baseline. The RMSE values of AIMA, IA, rubber-band, airPLS and MPLS are 0.0721, 0.0049, 0.0558, 0.0105 and 0.0188, respectively. These values show that the IA method corrects the baseline of the OP-FTIR spectra precisely. The RMSE of airPLS is about twice that of the IA method, and the RMSE of AIMA is the largest of all.
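
The comparison can be made explicit with a few lines of arithmetic on the RMSE values reported above. The snippet only restates the published numbers as ratios relative to IA; the variable names are illustrative.

```python
# RMSE values reported in Section 3.2 for the experimental SF6 spectra.
rmse_values = {
    "AIMA": 0.0721,
    "IA": 0.0049,
    "Rubber-band": 0.0558,
    "airPLS": 0.0105,
    "MPLS": 0.0188,
}

# Ratio of each method's RMSE to that of IA.
ratios = {name: value / rmse_values["IA"] for name, value in rmse_values.items()}

best = min(rmse_values, key=rmse_values.get)    # method with the smallest RMSE
worst = max(rmse_values, key=rmse_values.get)   # method with the largest RMSE

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:12s} RMSE = {rmse_values[name]:.4f}  ({ratio:.1f} x IA)")
```

The airPLS ratio is 0.0105 / 0.0049, roughly 2.1, which is the "about twice" quoted in the text, while AIMA is more than an order of magnitude above IA.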

3.3 Results and discussion

The baselines estimated by these methods for the simulated and experimental data are shown in Figs. 3(c) and 4(b). From these figures and the RMSE calculations, we conclude that the IA baseline correction method gives the best estimate of the baseline of the OP-FTIR spectra. The rubber-band method is strongly influenced by the number of segments into which the spectrum is divided, and airPLS and MPLS are directly influenced by their parameters. Moreover, the AIMA method cannot estimate the baseline well for spectra in which the peaks overlap.

4. Conclusions

Spectral analysis is largely hampered by the baseline or trend problem because the background changes constantly. The baseline must be removed precisely to improve the precision of the spectral analysis. The IA baseline correction method is proposed on the basis of the moving average. Simulated and experimental data are used to test the IA baseline correction algorithm. The results on both show the superiority of the IA method over the other baseline correction methods when applied to OP-FTIR spectra. The method is robust, has only one parameter, and can process OP-FTIR spectra on-line automatically without human intervention. The IA method addresses a key requirement of real-time on-line OP-FTIR spectral analysis and improves the capability and adaptability of unsupervised on-line analysis with spectral instruments.

Funding

Key Research of Frontier Science Programs of the Chinese Academy of Science (QYZDY-SSW-DQC016), the National Natural Science Foundation of China (41405029) and the National Key Research and Development Program of China (2016YFC0201000 and 2016YFC0803000).

Acknowledgment

The authors would like to thank the anonymous reviewers for their insightful comments and constructive suggestions.

References and links

1. G. Schulze, A. Jirasek, M. M. L. Yu, A. Lim, R. F. B. Turner, and M. W. Blades, “Investigation of Selected Baseline Removal Techniques as Candidates for Automated Implementation,” Appl. Spectrosc. 59(5), 545–574 (2005).

2. C. G. Bertinetto and T. Vuorinen, “Automatic Baseline Recognition for the Correction of Large Sets of Spectra Using Continuous Wavelet Transform and Iterative Fitting,” Appl. Spectrosc. 68(2), 155–164 (2014).

3. T. Lan, Y. Fang, W. Xiong, and C. Kong, “Automatic baseline correction of infrared spectra,” Chin. Opt. Lett. 5(10), 613–616 (2007).

4. Z. Li, D. J. Zhan, J. J. Wang, J. Huang, Q. S. Xu, Z. M. Zhang, Y. B. Zheng, Y. Z. Liang, and H. Wang, “Morphological weighted penalized least squares for background correction,” Analyst (Lond.) 138(16), 4483–4492 (2013).

5. Z. M. Zhang, S. Chen, and Y. Z. Liang, “Baseline correction using adaptive iteratively reweighted penalized least squares,” Analyst (Lond.) 135(5), 1138–1146 (2010).

6. B. D. Prakash and Y. C. Wei, “A fully automated iterative moving averaging (AIMA) technique for baseline correction,” Analyst (Lond.) 136(15), 3130–3135 (2011).

7. J. Peng, S. Peng, A. Jiang, J. Wei, C. Li, and J. Tan, “Asymmetric least squares for multiple spectra baseline correction,” Anal. Chim. Acta 683(1), 63–68 (2010).

8. L. Shao and P. R. Griffiths, “Automatic Baseline Correction by Wavelet Transform for Quantitative Open-Path Fourier Transform Infrared Spectroscopy,” Environ. Sci. Technol. 41(20), 7054–7059 (2007).

9. M. Pirzer and J. Sawatzki, Method and device for correcting a spectrum, U.S. patent 7359815 (2008).

10. C. Rowlands and S. Elliott, “Automated algorithm for baseline subtraction in spectra,” J. Raman Spectrosc. 42(3), 363–369 (2011).

11. W. Dietrich, C. H. Rudel, and M. Neumann, “Fast and Precise Automatic Baseline Correction of One- and Two-Dimensional NMR Spectra,” J. Magn. Reson. 91(1), 1–11 (1991).

12. P. H. Eilers, “A perfect smoother,” Anal. Chem. 75(14), 3631–3636 (2003).

13. X. Liu, Z. Zhang, Y. Liang, P. F. M. Sousa, Y. Yun, and L. Yu, “Baseline correction of high resolution spectral profile data based on exponential smoothing,” Chemom. Intell. Lab. Syst. 139, 97–108 (2014).

14. S. W. Smith, Digital Signal Processing (Elsevier, 2003), pp. 277–284.

15. K. H. Liland, T. Almøy, and B. H. Mevik, “Optimal choice of baseline correction for multivariate calibration of spectra,” Appl. Spectrosc. 64(9), 1007–1016 (2010).

16. X. C. Shen, “Simulated data and experimental data,” figshare (2018) [retrieved 14 January 2018], https://doi.org/10.6084/m9.figshare.5786667.v1

Supplementary Material (1)

Dataset 1: Simulated data and experimental data in the revised manuscript
