Optica Publishing Group

Raman spectra recovery using a second derivative technique and range independent baseline correction algorithm

Open Access

Abstract

We report on a computational technique that recovers Raman peaks embedded in spectra heavily contaminated by fluorescence. The method uses a second derivative technique to identify the most intense Raman peak, and a modified Savitzky-Golay algorithm to filter and recover the embedded Raman peaks iteratively. This technique improves on existing background removal algorithms in both performance and user objectivity.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Raman spectroscopy is arguably one of the most widely used techniques for analysing the chemical composition of compounds, as reflected in the extensive literature on the technique [1–4]. Raman analysis is noninvasive and requires little or no sample preparation for molecular identification or discrimination [5].

Nonetheless, like most other spectroscopic techniques, Raman spectroscopy has a major drawback: spectral baselines interfere with the Raman signals [6–9]. The baselines appear as large background features, several orders of magnitude stronger than the small Raman peaks they submerge. These backgrounds are attributed to intrinsic fluorescence emitted by the sample [10]. Both fluorescence and Raman emissions are due to electronic transition processes that occur almost simultaneously in molecules. However, sophisticated instrumentation can distinguish and observe the Raman emissions while suppressing or altogether avoiding the fluorescence emissions. Instrument-based approaches used to mitigate the effect of fluorescence include polarisation Raman spectroscopy (PRS), shifted excitation Raman difference spectroscopy (SERDS), and UV or NIR Raman spectroscopy [6–13]. Each of these techniques, however, brings its own cost and space implications.

Consequently, current studies in Raman spectroscopy focus on computational processing methods as cost-effective alternatives to instrument-based methods. Computational processing methods include manual estimation by visual inspection, iterative moving average techniques, complete matrix methods, frequency domain filtering methods such as Fourier and wavelet transforms, shifted spectra techniques, and polynomial fitting [9–16]. The particular situations in which each of these methods is useful are discussed in [9]. Polynomial fitting is predominantly used because of its efficiency and simplicity [15]. In this approach, a polynomial of an appropriate order is selected to estimate and remove the spectral baseline while leaving the Raman signals intact. However, polynomial fitting requires some expertise in choosing the polynomial order for different situations. Choosing the polynomial order makes the process less repeatable and somewhat subjective.

Several techniques based on polynomial fitting have been developed in an attempt to overcome these limitations [13,14,16]. A recent one is the range independent background subtraction algorithm (RIA) [17]. This algorithm estimates the spectral baseline by filtering an input spectrum using a zero-order Savitzky-Golay filter. The filtering process is iterated until all the high-frequency Raman peaks are smoothed out, leaving the low-frequency baseline to be subtracted from the input signal. Modifications to RIA have been suggested in the Raman peak recognition method based automated fluorescence subtraction algorithm (RIA-SG-RPR) [18]. Neither RIA nor RIA-SG-RPR entirely avoids human interference, since both rely on user inputs for execution. In RIA, height and width parameters for two artificial Gaussian peaks are subjectively determined and included in the algorithm. In RIA-SG-RPR, the user must specify the Signal to Fluorescence Ratio (SFR) of the input spectrum before the algorithm will function. The SFR parameter is estimated approximately by visual inspection.

We therefore propose a background subtraction algorithm that addresses the shortcomings of RIA and RIA-SG-RPR. Our proposed algorithm, named the second derivative technique and range independent algorithm (SDT-RIA), tackles the problem of user intervention using a second derivative technique. This method objectively computes the magnitude of the most intense Raman peak of any input spectrum, and this magnitude is used as the convergence criterion. The criterion allows a fully automated implementation of the modified iterative filtering process. We have evaluated SDT-RIA with simulated data by comparing our results with RIA and RIA-SG-RPR. We have also used SDT-RIA to recover Raman signals from experimentally measured data obtained from highly fluorescent samples to demonstrate its applicability in real situations.

2. Methods

SDT-RIA first determines the magnitude of the most intense Raman peak in the input spectral data using the second derivative approach before continuing with the iteration process. The approach works as follows. The measured signal $S$ collected at a frequency $v$ can generally be expressed as

$$S(v) = L + R(v) \tag{1}$$
where $S(v)$ is a combination of the fluorescence background $L$ and the Raman signal $R(v)$. The fluorescence background assumes a fixed position even when the excitation frequency changes, whereas $R(v)$ shifts relative to the excitation frequency. Since $L$ does not change with excitation frequency, it behaves as a constant term in the equation. $R(v)$ represents high-frequency peaks, generally described by a Lorentzian profile [19] given as
$$R(v) = \frac{{{a_0}}}{{1 + {{(\frac{{v - {a_1}}}{{{a_2}}})}^2}}} \tag{2}$$
where ${a_0}$, ${a_1}$ and ${a_2}$ represent the amplitude, position, and width of the Lorentzian peak profile, respectively. In this parameterisation $R({a_1} \pm {a_2}) = {a_0}/2$, so ${a_2}$ is the half width at half maximum and the FWHM is $2{a_2}$. Differentiating Eq. (1) twice with respect to $v$ removes the constant term $L$, and Eq. (1) becomes
$$S^{\prime\prime}(v) = R^{\prime\prime}(v) \tag{3}$$

Substituting Eq. (2) into Eq. (3) leads to the following expression

$$S^{\prime\prime}(v) = \frac{{8{a_0}{{(v - {a_1})}^2}}}{{{a_2}^4{{(1 + {{(\frac{{v - {a_1}}}{{{a_2}}})}^2})}^3}}} - \frac{{2{a_0}}}{{{a_2}^2{{(1 + {{(\frac{{v - {a_1}}}{{{a_2}}})}^2})}^2}}} \tag{4}$$

The unwanted fluorescence background represented by $L$ is eliminated in Eq. (4), leaving $S^{\prime\prime}(v)$ consisting solely of the pure Raman signals in second derivative form.
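
The claim that a constant background drops out on differentiation can be checked numerically. The following is a minimal sketch rather than the authors' code: it compares a finite-difference second derivative of $L + R(v)$ with the analytic second derivative of the Lorentzian, written here with the $(v - a_1)^2$ numerator that direct differentiation of Eq. (2) gives. All parameter values and function names are hypothetical and chosen only for illustration.

```python
def lorentzian(v, a0, a1, a2):
    """Lorentzian peak of Eq. (2): amplitude a0, centre a1, width parameter a2."""
    u = (v - a1) / a2
    return a0 / (1.0 + u * u)


def lorentzian_second_derivative(v, a0, a1, a2):
    """Analytic d^2R/dv^2, obtained by differentiating the Lorentzian twice."""
    q = 1.0 + ((v - a1) / a2) ** 2
    first = 8.0 * a0 * (v - a1) ** 2 / (a2 ** 4 * q ** 3)
    second = 2.0 * a0 / (a2 ** 2 * q ** 2)
    return first - second


def central_second_difference(f, v, h=1e-2):
    """Three-point finite-difference estimate of f''(v)."""
    return (f(v + h) - 2.0 * f(v) + f(v - h)) / (h * h)


# Hypothetical values, for illustration only (not taken from the paper).
A0, A1, A2, L_CONST = 100.0, 1000.0, 8.0, 5000.0


def measured(v):
    """Constant background plus one Lorentzian band, as in Eq. (1)."""
    return L_CONST + lorentzian(v, A0, A1, A2)


for v in (990.0, 1000.0, 1004.0, 1020.0):
    numeric = central_second_difference(measured, v)
    analytic = lorentzian_second_derivative(v, A0, A1, A2)
    print(f"v={v:7.1f}  numeric={numeric:+.5f}  analytic={analytic:+.5f}")
```

The two columns should agree closely even though `measured` contains the background and the analytic expression does not, which is the content of Eq. (3).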

The position of the most intense Raman peak in the measured spectrum $S(v)$ corresponds to the position of the minimum of the second derivative spectrum $S^{\prime\prime}(v)$. Equation (4) is fitted to the second derivative spectrum around this minimum using the Levenberg-Marquardt nonlinear least-squares routine [11]. The peak amplitude determined by the fit serves as an intrinsic convergence criterion for the iteration procedure that follows. Providing this value is the central purpose of the second derivative technique introduced in this work.
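
The peak-location step can be illustrated with a small sketch. It is not the authors' implementation: instead of a Levenberg-Marquardt fit, it uses the closed-form relation $S^{\prime\prime}(a_1) = -2a_0/a_2^2$, which holds at the centre of an isolated Lorentzian, and it assumes the width $a_2$ is already known. The synthetic bands (given equal widths so that the deepest second-derivative minimum coincides with the tallest band), the background level, and all names are hypothetical.

```python
def lorentzian(v, a0, a1, a2):
    """Lorentzian peak: amplitude a0, centre a1, width parameter a2."""
    return a0 / (1.0 + ((v - a1) / a2) ** 2)


def second_difference(y, dv):
    """Central second difference of a sampled spectrum; end points are left at 0."""
    d2 = [0.0] * len(y)
    for i in range(1, len(y) - 1):
        d2[i] = (y[i + 1] - 2.0 * y[i] + y[i - 1]) / (dv * dv)
    return d2


# Hypothetical synthetic spectrum: two bands of equal width on a large
# constant background (values chosen only for illustration).
dv = 1.0
axis = [400.0 + dv * i for i in range(1201)]
bands = [(60.0, 800.0, 10.0), (100.0, 1200.0, 10.0)]  # (a0, a1, a2)
background = 5000.0
spectrum = [background + sum(lorentzian(v, *b) for b in bands) for v in axis]

d2 = second_difference(spectrum, dv)

# The most intense band sits where the second derivative is most negative.
i_min = min(range(len(d2)), key=d2.__getitem__)
peak_position = axis[i_min]

# Simplification: invert S''(a1) = -2*a0/a2**2 with an assumed, known width
# instead of fitting the full second-derivative profile.
assumed_width = 10.0
amplitude_estimate = -d2[i_min] * assumed_width ** 2 / 2.0

print(peak_position, amplitude_estimate)
```

The amplitude estimate is slightly low because the finite sampling step blunts the second difference. A least-squares fit of the full profile, as described above, avoids having to assume the width.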

SDT-RIA then performs the underlying iterative smoothing procedure [17,18]. In this process, the measured spectrum $S(v)$ is smoothed iteratively using a zero-order Savitzky-Golay filter. The smoothing gradually eliminates the high-frequency Raman peaks, leaving the fluorescence background $L$, which is subtracted from $S(v)$ to obtain the pure Raman signal $R(v)$. Figures 1 and 2 show the flow chart of the SDT-RIA process and its pictorial depiction, respectively. Executable code for SDT-RIA is included in Supplement 1.
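
The iterative smoothing idea can be sketched as follows. A zero-order Savitzky-Golay filter is a moving average, so the sketch repeatedly averages the running baseline estimate. Two details are assumptions made for illustration and are not taken from the paper: the pointwise minimum used to keep the estimate below the data, and the stopping rule, which ends the loop once the recovered height at the located peak reaches a fraction `tolerance` of the target amplitude. The authors' actual procedure is the one in Supplement 1. The test spectrum and all names are hypothetical.

```python
def lorentzian(v, a0, a1, a2):
    """Lorentzian peak: amplitude a0, centre a1, width parameter a2."""
    return a0 / (1.0 + ((v - a1) / a2) ** 2)


def moving_average(y, half_width):
    """Zero-order Savitzky-Golay filter, i.e. a moving average.

    Near the ends the window shrinks symmetrically, so a straight line
    is left unchanged by the filter.
    """
    n = len(y)
    out = []
    for i in range(n):
        h = min(half_width, i, n - 1 - i)
        window = y[i - h:i + h + 1]
        out.append(sum(window) / len(window))
    return out


def estimate_baseline(spectrum, peak_index, target_amplitude,
                      half_width=25, tolerance=0.95, max_iterations=1000):
    """Iteratively smooth and clip until the peak height is recovered.

    Assumptions (illustrative): pointwise-minimum clipping and the
    tolerance-based stopping rule.
    """
    baseline = list(spectrum)
    iterations = 0
    while iterations < max_iterations:
        smoothed = moving_average(baseline, half_width)
        baseline = [min(b, s) for b, s in zip(baseline, smoothed)]
        iterations += 1
        height = spectrum[peak_index] - baseline[peak_index]
        if height >= tolerance * target_amplitude:
            break
    return baseline, iterations


# Hypothetical test spectrum: one band on a sloping linear background.
axis = [float(i) for i in range(601)]
true_baseline = [2000.0 + 1.5 * v for v in axis]
spectrum = [b + lorentzian(v, 100.0, 300.0, 8.0)
            for v, b in zip(axis, true_baseline)]

baseline, iterations = estimate_baseline(spectrum, peak_index=300,
                                         target_amplitude=100.0)
recovered = [s - b for s, b in zip(spectrum, baseline)]
print(iterations, recovered[300])
```

In this sketch `target_amplitude` plays the role of the value supplied by the second derivative fit, which is what removes the need for a user-chosen stopping parameter.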

Fig. 1. Flowchart of the Second Derivative Technique and Range Independent Baseline Correction Algorithm (SDT-RIA) for Raman spectral recovery

Fig. 2. Raman signal recovery processes using SDT-RIA, (a) is the simulated Raman spectra and its second derivative, (b) shows the most intense peak fitted in the second derivative spectrum, (c) shows how SDT-RIA estimates the background, and (d) shows how accurately the recovered signal compares with the original spectrum having no baseline.

3. Results and discussion

3.1 SDT-RIA simulation with different types of baselines

To develop and validate our proposed algorithm, SDT-RIA, the band parameters of 12 Lorentzian bands and a fifth-order polynomial described in the literature were used to generate a synthetic Raman spectrum for analysis [17].

We first analysed the performance of SDT-RIA for the different types of baselines usually encountered in spectroscopy. For this purpose, Gaussian and sigmoidal baseline functions and the fifth-order polynomial function were used for the analysis.
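
The three baseline families can be written down as simple functions. This is an illustrative sketch only: the functional forms are the standard ones, but the parameter names and the values used in the short check at the end are hypothetical and are not the band or polynomial parameters of [17].

```python
import math


def polynomial_baseline(v, coefficients):
    """Evaluate sum(c_k * v**k) with Horner's rule.

    coefficients are ordered from the constant term upwards, so a
    fifth-order polynomial takes six coefficients.
    """
    result = 0.0
    for c in reversed(coefficients):
        result = result * v + c
    return result


def gaussian_baseline(v, height, centre, sigma):
    """Broad Gaussian hump."""
    return height * math.exp(-0.5 * ((v - centre) / sigma) ** 2)


def sigmoidal_baseline(v, height, midpoint, steepness):
    """Logistic step rising from 0 to height around midpoint."""
    return height / (1.0 + math.exp(-(v - midpoint) / steepness))


# Hypothetical parameters, for illustration only.
axis = [400.0 + 2.0 * i for i in range(601)]
gauss = [gaussian_baseline(v, 5000.0, 1000.0, 400.0) for v in axis]
sigmoid = [sigmoidal_baseline(v, 5000.0, 1000.0, 150.0) for v in axis]
poly = [polynomial_baseline(v / 1000.0, [1.0, 0.5, -0.2, 0.1, 0.0, 0.05])
        for v in axis]
```

A synthetic test spectrum is then the pointwise sum of one of these baselines and the Lorentzian bands.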

The nature of each of the three types of baselines is shown in Fig. 3. The Raman peaks are completely recovered, with almost no apparent distortions or artefacts, in all three situations. This full recovery demonstrates the effectiveness of SDT-RIA in estimating spectral baselines and recovering Raman signals irrespective of the nature of the baseline.

Fig. 3. Raman signals recovered (red) from (a) fifth-degree polynomial (b) Gaussian and (c) sigmoidal baselines using SDT-RIA

3.2 SDT-RIA simulation with different baseline sizes

We also investigated how the magnitude of the Raman signal relative to the magnitude of the fluorescence background affects the performance of SDT-RIA. This relation is known as the Signal to Fluorescence Ratio (SFR) [18,19]. We expect that any good background subtraction algorithm should accurately recover Raman signals embedded in a spectrum irrespective of the SFR.

In Fig. 4, we illustrate how SDT-RIA successfully recovers Raman signals with varying SFRs. SFR values of 0.05 and below represent some of the most challenging situations experienced in actual measurements. However, as clearly shown in Figs. 4(c) and 4(d), our proposed algorithm performs exceptionally well even in such difficult circumstances.
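
To make the role of the SFR concrete, the sketch below rescales a baseline so that a synthetic spectrum has a requested SFR. The text does not give an exact formula, so the definition used here, the maximum Raman intensity divided by the maximum baseline intensity, is an assumption, and the function name and sample values are hypothetical.

```python
def scale_baseline_to_sfr(raman, baseline, sfr):
    """Rescale baseline so that max(raman) / max(scaled baseline) equals sfr.

    Assumed definition of SFR: ratio of the two maxima.
    """
    factor = max(raman) / (sfr * max(baseline))
    return [factor * b for b in baseline]


# Hypothetical values, for illustration only.
raman = [0.0, 10.0, 0.0]
baseline = [100.0, 200.0, 300.0]
scaled = scale_baseline_to_sfr(raman, baseline, sfr=0.05)
spectrum = [r + b for r, b in zip(raman, scaled)]
```

Under this definition an SFR of 0.005 means the background maximum is 200 times the tallest Raman peak.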

Fig. 4. Raman signals recovered (red) with SDT-RIA from polynomial baselines having SFR values of (a) 1, (b) 0.5, (c) 0.05 and (d) 0.005

3.3 Comparison of SDT-RIA with RIA and RIA-SG-RPR

The performance of SDT-RIA was assessed based on the weighted correlation coefficient (WCC) and compared with RIA and RIA-SG-RPR (see Table 1). The high WCC values obtained in most cases by SDT-RIA indicate our algorithm can recover Raman signals that match the generated Raman spectra very well. This indication implies that the recovered spectra reveal the “molecular fingerprint”, which is crucial in Raman spectroscopy analysis [20].
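
A weighted correlation between a recovered spectrum and a reference spectrum can be computed as in the sketch below. This is a generic weighted Pearson correlation. The exact weighting scheme of the WCC defined in [20] is not reproduced here, and the choice in the example of using the reference intensities as weights is an assumption made only for illustration.

```python
import math


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical spectra, for illustration only.
reference = [0.0, 1.0, 5.0, 1.0, 0.0, 2.0, 0.0]
recovered = [2.0 * r + 3.0 for r in reference]   # same shape, offset and scaled
weights = reference                               # assumed weighting choice

score = weighted_correlation(reference, recovered, weights)
```

Because correlation ignores a constant offset and a positive scale factor, a recovered spectrum with the right shape scores close to 1 even if its absolute intensity differs from the reference.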

Table 1. Comparison of the performance of RIA, RIA-SG-RPR, and SDT-RIA for different baseline types and sizes.

Among the three methods, SDT-RIA performed best in recovering Raman signals from spectral baselines with SFR greater than 0.05. SDT-RIA also performed notably better for spectral baselines of the Gaussian and polynomial types. SDT-RIA already eliminates the subjectivity associated with the two other methods, and we are hopeful that further optimisation will make it robust enough to efficiently recover signals from any type or size of background.

3.4 SDT-RIA recovery behaviour for spectra with varying signal to noise ratio

Ideally, a high SNR is preferred in Raman spectroscopy, primarily because the signals are weak. However, noise is usually encountered in actual experimental situations. Since SDT-RIA uses the second derivative approach to detect peaks, and because derivatives generally amplify noise, it was worthwhile assessing the proposed method on the synthetically generated Raman spectra with artificially added noise. We therefore investigated the recovery capability of SDT-RIA for noisy signals, as shown in Fig. 5. Gaussian white noise with signal-to-noise ratios (SNR) of 10, 1 and 0.01 was generated and added to the spectra (Figs. 5(a) and 5(b)).
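
Adding Gaussian white noise at a prescribed SNR can be sketched as follows. The SNR definition is an assumption, since the text does not state one: here it is taken as the maximum signal intensity divided by the noise standard deviation. The function name, the seed, and the flat test signal are hypothetical.

```python
import random


def add_white_noise(signal, snr, seed=0):
    """Return signal plus zero-mean Gaussian white noise.

    Assumed definition: snr = max(signal) / noise standard deviation.
    A fixed seed keeps the example reproducible.
    """
    rng = random.Random(seed)
    sigma = max(signal) / snr
    return [s + rng.gauss(0.0, sigma) for s in signal]


# Hypothetical flat test signal, used only to check the noise level.
clean = [100.0] * 5000
noisy = add_white_noise(clean, snr=10.0)

residual = [n - c for n, c in zip(noisy, clean)]
mean = sum(residual) / len(residual)
measured_sigma = (sum((r - mean) ** 2 for r in residual)
                  / (len(residual) - 1)) ** 0.5
```

With this definition an SNR of 0.01 corresponds to noise whose standard deviation is 100 times the tallest peak.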

Fig. 5. Raman signal recovery process on noisy spectra: (a) shows the simulated Raman spectra with artificial noise of different SNR (10, 1, 0.01), (b) magnifies a section of the spectra for visual observation of noise levels, (c) shows the second derivative computed for each of the noisy spectra, (d) shows the SDT-RIA process on the derivative spectrum with SNR of 0.01 (S4), where the spectrum is first smoothed for easy identification of the peak with maximum amplitude before fitting, followed by (e) baseline estimation, and (f) compares the recovered spectrum of S4 to the original simulated spectrum, which had no baseline

Figure 5(c) shows that computing the second derivative of noisy spectra leads to higher noise levels. The derivative spectra were heavily obscured at all SNR levels, with the noise reaching amplitudes comparable to those of the genuine spectral peaks. Therefore, to enable the correct detection of the most intense peak, the derivative spectra had to be smoothed before fitting (Fig. 5(d)). The choice of the number of smoothing iterations is an optimisation problem. However, we observed that three iterations of Savitzky-Golay smoothing effectively reduced the noise at all three SNR levels, thus enabling the Raman signals to be correctly observed and recovered (as shown for SNR 0.01 in Figs. 5(e) and 5(f)). The recovered spectra retain their noise, since the smoothing is performed only on the derivative spectra and not on the original data. SDT-RIA is, therefore, capable of analysing noisy spectra, as is usually required for actual experimental data.
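
The size of this noise amplification can be estimated with a short worked calculation. As a sketch, assume (these assumptions are ours, for illustration) that the noise samples $n_i$ are independent with variance $\sigma^2$, that the spectrum is sampled with a uniform step $\Delta v$, and that the second derivative is computed with the three-point central difference. The noise variance in the derivative spectrum is then

$$\mathrm{Var}\left[\frac{n_{i+1} - 2n_i + n_{i-1}}{\Delta v^2}\right] = \frac{(1 + 4 + 1)\,\sigma^2}{\Delta v^4} = \frac{6\sigma^2}{\Delta v^4}$$

From Eq. (2), an isolated Lorentzian peak has $R^{\prime\prime}(a_1) = -2a_0/a_2^2$ at its centre, so the peak-to-noise ratio in the derivative domain is approximately

$$\frac{2a_0/a_2^2}{\sqrt{6}\,\sigma/\Delta v^2} = \frac{a_0}{\sigma}\cdot\frac{2}{\sqrt{6}}\left(\frac{\Delta v}{a_2}\right)^2$$

For a hypothetical peak with $a_2 = 10\,\Delta v$, the factor multiplying $a_0/\sigma$ is about $0.008$, so the peak-to-noise ratio drops by roughly two orders of magnitude on differentiation. This is consistent with the need to smooth the derivative spectrum before its minimum can be located reliably.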

4. Applications

We applied our proposed algorithm to correct the baselines of experimentally collected data. The first set comprised rock minerals (Adamite and Tourmaline) whose spectra were obtained from a Raman spectra database [21] (Figs. 6 and 7). Adamite has a spectral background with features comparable to the Gaussian and sigmoidal functions. The Raman spectrum of Tourmaline represents a rather complex situation, with a high noise level and a relatively low SFR. Our results in Figs. 6(d) and 7(d) show many similarities between the signals recovered by SDT-RIA and those retrieved by experts in the database [21].

Fig. 6. Raman signal of Adamite recovered with SDT-RIA and compared with that corrected by experts as obtained from the spectral database (Ref. [21])

Fig. 7. Raman signal of Tourmaline recovered with SDT-RIA and compared with that corrected by experts as obtained from the spectral database (Ref. [21])

The second application was a honey sample measured with 785 nm laser excitation. As a biological sample, honey is highly fluorescent. Here, too, the Raman features are almost lost in the intense fluorescence. Figure 8 shows the results of applying SDT-RIA to the spectra, alongside results obtained by experts using other widely accepted baseline correction methods, including AsLS [22], rubber-band [23] and the SNIP algorithm [24]. These examples demonstrate how SDT-RIA can be successfully applied in practical situations.

Fig. 8. Raman signals of 785 nm excited Raman spectra recovered with SDT-RIA and compared with that corrected by experts using standard baseline correction methods (AsLS, Rubber-band and SNIP algorithm)

5. Conclusion

This article proposed a novel Raman baseline correction algorithm, SDT-RIA, which addresses the shortcomings of existing polynomial-based baseline correction techniques. The algorithm takes advantage of the second derivative technique, which identifies the most intense Raman peak in the input spectral data and computes its amplitude. This value is then incorporated into the modified iterative filtering procedure as the convergence criterion. The analyses have shown that SDT-RIA can accurately estimate the baseline and subsequently recover the embedded Raman signals in fluorescence-contaminated spectra, independent of the baseline type and size. Our comparison of SDT-RIA with RIA and RIA-SG-RPR showed that all three methods are roughly comparable. However, SDT-RIA most often produced the best quantitative estimate as judged by the weighted correlation coefficient (WCC). This estimate is exceptionally accurate for baselines of the polynomial and Gaussian types. Furthermore, SDT-RIA improves on the existing RIA-SG-RPR with respect to fully automated implementation, since it operates objectively with virtually no need for user intervention.

Acknowledgements

The authors thank ICTP and IPPS of ISP for financial and instrumental support. We are also grateful to TWAS for the Research fellowship, which made it possible for us to obtain the Raman spectra of the honey samples experimentally. We appreciate the help of Mr Eric Boateng Osei and Dr Anuradha Ramoji for assisting us with standard baseline correction methods for our comparison.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available upon request.

Supplemental document

See Supplement 1 for supporting content.

References

1. L. A. Nafie, “Recent advances in linear and nonlinear Raman spectroscopy. Part IV,” J. Raman Spectrosc. 41(12), 1566–1586 (2010).

2. E. V. Efremov, F. Ariese, and C. Gooijer, “Achievements in resonance Raman spectroscopy,” Anal. Chim. Acta 606(2), 119–134 (2008).

3. M. A. d. S. Martins, D. G. Ribeiro, E. A. Pereira dos Santos, A. A. Martin, A. Fontes, and H. d. S. Martinho, “Shifted-excitation Raman difference spectroscopy for in vitro and in vivo biological samples analysis,” Biomed. Opt. Express 1(2), 617 (2010).

4. R. S. Das and Y. K. Agrawal, “Raman spectroscopy: Recent advancements, techniques and applications,” Vib. Spectrosc. 57(2), 163–176 (2011).

5. E. Smith and G. Dent, Modern Raman Spectroscopy - A Practical Approach (John Wiley & Sons, Ltd, 2004).

6. K. Sowoidnich and H.-D. Kronfeldt, “Fluorescence Rejection by Shifted Excitation Raman Difference Spectroscopy at Multiple Wavelengths for the Investigation of Biological Samples,” ISRN Spectrosc. 2012, 1–11 (2012).

7. D. Wei, S. Chen, and Q. Liu, “Review of Fluorescence Suppression Techniques in Raman Spectroscopy,” Appl. Spectrosc. Rev. 50(5), 387–406 (2015).

8. N. A. Macleod and P. Matousek, “Deep Noninvasive Raman Spectroscopy of Turbid Media,” Appl. Spectrosc. 62(11), 291A–304A (2008).

9. G. Schulze, A. Jirasek, M. M. L. Yu, A. Lim, R. F. B. Turner, and M. W. Blades, “Investigation of Selected Baseline Removal Techniques as Candidates for Automated Implementation,” Appl. Spectrosc. 59(5), 545–574 (2005).

10. P. J. Cadusch, M. M. Hlaing, S. A. Wade, S. L. McArthur, and P. R. Stoddart, “Improved methods for fluorescence background subtraction from Raman spectra,” J. Raman Spectrosc. 44(11), 1587–1595 (2013).

11. P. A. Mosier-Boss, S. H. Lieberman, and R. Newbery, “Fluorescence Rejection in Raman Spectroscopy by Shifted-Spectra, Edge Detection, and FFT Filtering Techniques,” Appl. Spectrosc. 49(5), 630–638 (1995).

12. S. Guo, T. Bocklitz, and J. Popp, “Optimization of Raman-spectrum baseline correction in biological application,” Analyst 141(8), 2396–2404 (2016).

13. X. Liu, Z. Zhang, Y. Liang, P. F. M. Sousa, Y. Yun, and L. Yu, “Baseline correction of high resolution spectral profile data based on exponential smoothing,” Chemom. Intell. Lab. Syst. 139, 97–108 (2014).

14. K. H. Liland, A. Kohler, and N. K. Afseth, “Model-based pre-processing in Raman spectroscopy of biological samples,” J. Raman Spectrosc. 47(6), 643–650 (2016).

15. C. A. Lieber and A. Mahadevan-Jansen, “Automated Method for Subtraction of Fluorescence from Biological Raman Spectra,” Appl. Spectrosc. 57(11), 1363–1367 (2003).

16. J. Zhao, M. M. Carrabba, and F. S. Allen, “Automated Fluorescence Rejection Using Shifted Excitation Raman Difference Spectroscopy,” Appl. Spectrosc. 56(7), 834–845 (2002).

17. H. Krishna, S. K. Majumder, and P. K. Gupta, “Range-independent background subtraction algorithm for recovery of Raman spectra of biological tissue,” J. Raman Spectrosc. 43(12), 1884–1894 (2012).

18. K. Chen, H. Wei, H. Zhang, T. Wu, and Y. Li, “A Raman peak recognition method based automated fluorescence subtraction algorithm for retrieval of Raman spectra of highly fluorescent samples,” Anal. Methods 7(6), 2770–2778 (2015).

19. F. Rosi, M. Paolantoni, C. Clementi, B. Doherty, C. Miliani, B. G. Brunetti, and A. Sgamellotti, “Subtracted shifted Raman spectroscopy of organic dyes and lakes,” J. Raman Spectrosc. 41(2), 452–458 (2009).

20. P. R. Griffiths and L. Shao, “Self-Weighted Correlation Coefficients and Their Application to Measure Spectral Similarity,” Appl. Spectrosc. 63(8), 916–919 (2009).

21. T. Ens-Lyon, “Handbook of Raman spectra,” http://www.geologie-lyon.fr/Raman/.

22. P. H. C. Eilers, “A perfect smoother,” Anal. Chem. 75(14), 3631–3636 (2003).

23. M. Pirzer and J. Sawatzki, “Method and Device for Correcting a Spectrum,” U.S. patent US7359815B2 (2006).

24. C. G. Ryan, E. Clayton, W. L. Griffin, S. H. Sie, and D. R. Cousens, “SNIP, a statistics-sensitive background treatment for the quantitative analysis of PIXE spectra in geoscience applications,” Nucl. Instrum. Methods Phys. Res., Sect. B 34(3), 396–402 (1988).

Supplementary Material (1)

Supplement 1: MATLAB executable code for the SDT-RIA
