Simplified model of spectral absorption by non-algal particles and dissolved organic materials in aquatic environments

B. B. Cael; Emmanuel Boss

doi:10.1364/OE.25.025486

1. Introduction

The total absorption spectrum a(λ) (in units of m⁻¹, with λ denoting wavelength, in units of nm) in aquatic environments is frequently decomposed as the sum of four components (e.g. [1])

a(λ) = a_{w}(λ) + a_{ϕ}(λ) + a_{d}(λ) + a_{g}(λ)    (1)

where a_w(λ) is the absorption by water, a_ϕ(λ) is the absorption by phytoplankton pigments, a_d(λ) is the absorption by non-algal particles (NAP), and a_g(λ) is the absorption by colored dissolved organic materials (CDOM). Here we are interested in the last two components: absorption by NAP and CDOM.

a_d(λ) and a_g(λ) are routinely approximated by exponential functions

a_{d}(λ) = A_{d}(λ_{o}) \exp[-s_{d}(λ - λ_{o})], \quad a_{g}(λ) = A_{g}(λ_{o}) \exp[-s_{g}(λ - λ_{o})]    (2)

for the purpose of radiative transfer computations [2]; by extension, their sum a_dg(λ) = a_d(λ) + a_g(λ) is routinely approximated by a double exponential function (the sum of two exponentials). In applications where it is not possible to separate the signals, such as in semi-analytical inversions of ocean-color remote sensing [3–6], they are typically lumped into a single exponential function due to the similar spectral shapes of a_d(λ) and a_g(λ). (Other models are also used; cf. Ref. [7], their Table 3.)
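
As a minimal illustration of the two exponential forms above, the following Python sketch evaluates a single and a double exponential on a wavelength grid. The function names, the amplitudes of 0.5 m⁻¹, and the 300–700 nm grid are illustrative assumptions; the slopes are the mean values from Ref. [12] quoted in Section 2.

```python
import numpy as np

def single_exponential(wl, amp, slope, wl0=300.0):
    """amp * exp(-slope * (wl - wl0)); amp in m^-1, slope in nm^-1, wl in nm."""
    return amp * np.exp(-slope * (np.asarray(wl, dtype=float) - wl0))

def double_exponential(wl, amp_d, slope_d, amp_g, slope_g, wl0=300.0):
    """Sum of a NAP-like and a CDOM-like exponential sharing one reference wavelength."""
    return (single_exponential(wl, amp_d, slope_d, wl0)
            + single_exponential(wl, amp_g, slope_g, wl0))

wavelengths = np.arange(300.0, 701.0, 1.0)  # 300-700 nm at 1 nm steps
# Amplitudes are hypothetical; slopes are the means quoted from Ref. [12].
a_dg = double_exponential(wavelengths, 0.5, 0.0123, 0.5, 0.0176)
```

At the reference wavelength the exponentials equal their amplitudes, so `a_dg[0]` is simply the sum of the two amplitudes.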

The first approach treats NAP and CDOM as distinct and independently varying, while the second treats NAP-plus-CDOM as a single pool. Operationally, CDOM is defined as material which passes through a given filter (most often with pore size 0.2 µm) and NAP is defined as that which does not (other than soluble pigments, which are separated into a_ϕ), forming a continuum of material in terms of size [8]. This suggests it would be useful to model the absorption by NAP and CDOM as the absorption of a continuum of heterogeneous material.

The objective of this paper is to propose a model for a_dg(λ) that considers NAP and CDOM as a continuum, and to demonstrate that this model is a superior one for a_dg(λ) and therefore may be useful in remote sensing signal inversion. We propose modeling a_dg(λ) with the stretched exponential (a.k.a. Kohlrausch function [9]):

a_{dg}(λ) = A \exp\{-[s(λ - λ_{o})]^{β}\}, \quad β \in [0, 1].    (3)

The stretched exponential function can be considered the sum of many simple exponential functions with different exponents s_p, where the wider the distribution of exponents, the smaller the value of β [10]. It reduces to a single exponential when the exponents being summed over are identical, i.e. in the limit β = 1. This is why β is limited to the range [0, 1]: β = 1 is the limiting case where the distribution of s_p values is a delta function, and β = 0 is the limiting case of an infinitely wide distribution of s_p values. Therefore, modeling a_dg(λ) by a stretched exponential function treats NAP and CDOM as a collection of many sub-pools of material (e.g. many size classes), each of which has an exponential absorption spectrum with an exponent s_p that may differ between sub-pools.
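
The limiting behaviour described above can be checked numerically. The sketch below is a minimal Python implementation of the stretched exponential; the function name and the parameter values (A = 1 m⁻¹, s = 0.0165 nm⁻¹, β = 0.8) are illustrative assumptions, not values prescribed by the model.

```python
import numpy as np

def stretched_exponential(wl, amp, slope, beta, wl0=300.0):
    """amp * exp(-[slope * (wl - wl0)]**beta) for wl >= wl0 and 0 <= beta <= 1."""
    x = np.asarray(wl, dtype=float) - wl0
    if np.any(x < 0):
        # A negative base raised to a non-integer power is not real-valued.
        raise ValueError("all wavelengths must be >= wl0")
    return amp * np.exp(-(slope * x) ** beta)

wavelengths = np.arange(300.0, 701.0, 1.0)
single = 1.0 * np.exp(-0.0165 * (wavelengths - 300.0))

# beta = 1 recovers the single exponential.
beta_one = stretched_exponential(wavelengths, 1.0, 0.0165, 1.0)
# beta < 1 decays more slowly once slope * (wl - wl0) exceeds 1.
stretched = stretched_exponential(wavelengths, 1.0, 0.0165, 0.8)

print(np.allclose(single, beta_one))
print(stretched[-1] > single[-1])
```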

The parameter β then corresponds to the optical heterogeneity of NAP-plus-CDOM, where smaller values of β indicate more variance between the spectral shapes of absorption by the sub-pools. This implies that β may contain useful biogeochemical information (cf. Ref. [11], their Fig. 3); waters with a more heterogeneous NAP-plus-CDOM pool should have smaller β values. In terms of NAP and CDOM, smaller β values should indicate similar magnitudes and dissimilar spectral shapes of NAP and CDOM absorption.

2. Stretched exponential fit to double exponential

We first demonstrate that for any double exponential curve with plausible values for the parameters (A_d, s_d, A_g, s_g), there exists a stretched exponential curve that closely fits that double exponential. Therefore using the stretched exponential as a model for a_dg(λ) yields no loss of flexibility in modeling absorption curves, despite having one fewer parameter.

We fit a stretched exponential to a suite of artificial a_dg(λ) data derived from double exponential curves:

a_{i,j,k}(λ) = A_{n} \exp[-s_{i}(λ - λ_{o})] + A_{n} R_{j} \exp[-s_{k}(λ - λ_{o})]    (4)

where s_i and s_k correspond to the exponents for NAP and CDOM respectively, R_j corresponds to the ratio of the amplitudes A_g/A_d, and A_n is a normalization constant set so that the maximum of a_{i,j,k}(λ) is equal to 1 m⁻¹. We vary s_i from 0.010 to 0.026 nm⁻¹ and s_k from 0.008 to 0.018 nm⁻¹, each in increments of 0.001 nm⁻¹ (corresponding to the ranges reported in Ref. [12]), and vary R_j = A_g(λ_o)/A_d(λ_o) from 1/16 to 16 by factors of two (i.e. ranging from a_d ≫ a_g to a_d ≪ a_g). We generate artificial a_{i,j,k}(λ) data for each combination of (s_i, R_j, s_k) at 1 nm resolution from 300–700 nm, totaling 17 × 11 × 9 = 1,683 curves.
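
The parameter grid described above can be generated in a few lines. This is a sketch in Python with NumPy rather than the MATLAB code used for the paper; the variable names are assumptions, and the normalization is done by dividing each curve by its own maximum, which plays the role of A_n.

```python
import numpy as np

wl0 = 300.0
wavelengths = np.arange(300.0, 701.0, 1.0)   # 1 nm resolution, 401 points
x = wavelengths - wl0

slopes_i = np.linspace(0.010, 0.026, 17)     # s_i in nm^-1, steps of 0.001
slopes_k = np.linspace(0.008, 0.018, 11)     # s_k in nm^-1, steps of 0.001
ratios = 2.0 ** np.arange(-4, 5)             # R_j = 1/16, 1/8, ..., 16

# Broadcast to an array indexed as [i, j, k, wavelength].
s_i = slopes_i[:, None, None, None]
r_j = ratios[None, :, None, None]
s_k = slopes_k[None, None, :, None]
raw = np.exp(-s_i * x) + r_j * np.exp(-s_k * x)

# Dividing by the maximum sets max[a_ijk] = 1 for every curve.
curves = raw / raw.max(axis=-1, keepdims=True)
n_curves = curves.shape[0] * curves.shape[1] * curves.shape[2]
print(n_curves)
```

Because both terms decay with wavelength, each maximum sits at λ = λ_o, where the unnormalized curve equals 1 + R_j.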

We use two metrics to assess the quality of fit of the stretched exponential function to the double exponential function: the coefficient of determination (r²) and the root-mean-square error (RMSE; the square root of the sum of squared residuals divided by the degrees of freedom of the regression). The fits (as well as those in Section 3) were performed in MATLAB (R2014b; all code used for all analyses herein will be made available at http://cael.space should this manuscript be accepted) using nonlinear least squares, with the constraint β ∈ [0, 1] according to the definition of the stretched exponential function. We set max[a_{i,j,k}(λ)] = 1 in all cases so that the RMSEs are comparable across different parameter combinations. This normalization is otherwise immaterial; only the ratio R_j = A_g(λ_o)/A_d(λ_o) is relevant for the fitting procedure, as the stretched exponential can be rescaled by any factor C by multiplying its amplitude by C. In Equation 4, we use λ_o = 300 nm, but this value is also immaterial for the same reason; for a single or double exponential, the choice of λ_o is equivalent to a rescaling of the amplitude. However, the stretched exponential is not well-defined for wavelengths λ < λ_o, because a negative quantity would then be raised to the non-integer power β. For simplicity, throughout the manuscript we have therefore chosen λ_o equal to the lowest wavelength present in any given analysis, e.g. λ_o = 412 nm when analyzing the data from Ref. [13], and λ_o = 300 nm when analyzing the data from Ref. [14].
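
The fitting procedure and the two metrics can be sketched as follows. This is not the authors' MATLAB code: it is a Python approximation that assumes SciPy's `curve_fit` with box bounds as the nonlinear least-squares solver, and the initial guess, the helper name `fit_stretched`, and the equal-amplitude test curve are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

WL0 = 300.0  # reference wavelength (lowest wavelength in the data), nm

def stretched_exponential(wl, amp, slope, beta):
    return amp * np.exp(-(slope * (wl - WL0)) ** beta)

def fit_stretched(wl, a_dg):
    """Bounded nonlinear least-squares fit; returns (params, r2, rmse)."""
    p0 = (a_dg.max(), 0.015, 0.9)                       # assumed starting point
    bounds = ([0.0, 0.0, 0.0], [np.inf, np.inf, 1.0])   # enforces 0 <= beta <= 1
    params, _ = curve_fit(stretched_exponential, wl, a_dg, p0=p0, bounds=bounds)
    residuals = a_dg - stretched_exponential(wl, *params)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((a_dg - a_dg.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # RMSE uses the degrees of freedom: data points minus fitted parameters.
    rmse = np.sqrt(ss_res / (wl.size - params.size))
    return params, r2, rmse

wl = np.arange(300.0, 701.0, 1.0)
# Hypothetical double exponential with equal amplitudes (R = 1 is an assumption).
target = 0.5 * np.exp(-0.0123 * (wl - WL0)) + 0.5 * np.exp(-0.0176 * (wl - WL0))
params, r2, rmse = fit_stretched(wl, target)
print(params, r2, rmse)
```

Dividing by the degrees of freedom rather than by the number of points is what lets the RMSE penalize models with more free parameters.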

Figure 1 shows an example of a stretched exponential fit to a double exponential, using the means reported in Ref. [12]: s_d = 0.0123 nm⁻¹ and s_g = 0.0176 nm⁻¹. The stretched exponential fit (A = 1.005 m⁻¹, s = 0.0165 nm⁻¹, β = 0.974) is extremely good, with r² > 0.9999 and RMSE = 8.2 × 10⁻⁴; all deviations are within 0.005 m⁻¹, the largest occurring at the endpoint λ = 300 nm. Note that 0.005 m⁻¹ is typically within the measurement uncertainty of the instruments used to measure absorption, such as the WET Labs ac-s [15].

Fig. 1 a) Example of a stretched exponential (dashed line) fit to a double exponential (solid line). b) Residual difference (solid line) between the two curves in Fig. 1(a).

Table 1 shows summary statistics from fitting the 1,683 a_{i,j,k}(λ) curves. In each case, the fit was similar to that in Fig. 1, with r² ≥ 0.9985. The largest residual was always at the endpoint λ = 300 nm, as in Fig. 1. This demonstrates that for any plausible combination of parameters, a double exponential can be extremely well-fit by a stretched exponential, even though the stretched exponential has one fewer parameter. By extension, any a_dg(λ) data that can be well-described by a double exponential should also be well-described by a stretched exponential.

Table 1. Summary statistics for stretched exponential fits to the sum of two exponential functions. Across fits to all parameter combinations, the median, minimum, and maximum r² and RMSE are reported, as well as the median, minimum, and maximum of s and β.

3. Fits to in situ data

The above statistical exercise illustrates the capacity of the stretched exponential to fit a double exponential, but this means little if it does not translate into improvements in fitting a_dg(λ) data. We next demonstrate that the stretched exponential is indeed a superior model in fitting a_dg(λ) data, at both low and high spectral resolution.

We fit a single, stretched, and double exponential to two compilations of in situ a_dg data. We also fit a power-law function of wavelength, which has been shown to be a better model for multispectral CDOM absorption data [7]. For each a_dg(λ) curve, we fit all four functions and compare the stretched exponential to the other three by their RMSE and by their r². Note that the RMSE is arguably a better metric by which to compare these models because it incorporates a regression’s degrees of freedom, i.e. it suitably penalizes models for having additional fitting parameters. For each data compilation, we say one function outperformed another if it most often had the lower RMSE (or higher r²).

The first data compilation was developed to inter-compare different remote-sensing inversions [13]. We generate n = 656 curves of a_dg(λ) by adding the measured a_d(λ) and a_g(λ) data in this compilation. As these data are measured at five wavelengths (λ = 412, 443, 490, 510, 555 nm), they correspond to the low-resolution limit, where the degrees of freedom are very small; the number of free parameters in the regression is close to the number of measured wavelengths, and the RMSE will therefore harshly penalize the addition of free parameters.

The second data compilation is from the field program BIOSOPE [14]. We generate n = 260 curves of a_dg(λ) by adding the measured a_d(λ) and a_g(λ) data in this compilation. As these data are measured every 2 nm from 300–700 nm, i.e. at 201 wavelengths, they correspond to the high-resolution limit, where the degrees of freedom are large, and the RMSE and r² are therefore likely to behave similarly.
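
The effect of the degrees of freedom on the RMSE in the two resolution limits can be made concrete with a small calculation. In this Python sketch the sum of squared residuals is a hypothetical value held fixed across models, and the parameter count for the power law (amplitude and exponent) is an assumption, since its functional form is not written out here.

```python
import math

# Number of free parameters per model. The power-law count is an assumption.
N_PARAMS = {"single": 2, "power law": 2, "stretched": 3, "double": 4}

def rmse(ss_res, n_points, n_params):
    """Square root of the sum of squared residuals over the degrees of freedom."""
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    return math.sqrt(ss_res / dof)

ss_res = 1.0e-6  # hypothetical sum of squared residuals, in (m^-1)^2

low_res = {name: rmse(ss_res, 5, p) for name, p in N_PARAMS.items()}     # 5 bands
high_res = {name: rmse(ss_res, 201, p) for name, p in N_PARAMS.items()}  # 201 bands

# Penalty for the double relative to the single exponential at equal residuals.
penalty_low = low_res["double"] / low_res["single"]     # sqrt(3 / 1)
penalty_high = high_res["double"] / high_res["single"]  # sqrt(199 / 197)
print(penalty_low, penalty_high)
```

With five bands the double exponential retains a single degree of freedom, so an identical misfit costs it a factor of √3 in RMSE relative to the single exponential; with 201 bands the same factor is within one percent of unity.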

We found that the stretched exponential outperformed each of the three other functions in both cases. Table 2 shows the number of cases for each data compilation where the stretched exponential had a better (lower) RMSE and better (higher) r². The stretched exponential outperformed the single exponential, double exponential, and power law, in terms of both RMSE and r², for both datasets. In the low-resolution dataset, in 128 cases the regression found β = 1 for the stretched exponential, reducing it to the single exponential and thereby making their r² and RMSE the same (hence the stretched exponential having a higher r² in only 528 cases).

Table 2. Summary statistics for exponential fits to in situ data. First column corresponds to data compilation; ‘High-resolution’ data are those compiled in Ref. [14], and ‘Low-resolution’ data are those compiled in Ref. [13]. Second column corresponds to which function the stretched exponential is being compared: single exponential, double exponential, and power law. Third column reports the number of cases where the stretched exponential had a higher r². Fourth column reports the number of cases where the stretched exponential had a lower RMSE.

4. Discussion

The results considered herein were insensitive to the spectral range considered; we repeated the analysis in Section 2 for the ranges 300–750 nm, 400–700 nm, and 400–750 nm, which yielded no substantive difference. While no ocean color satellite radiometer to date measures below 400 nm, future ones will: NASA’s Plankton, Aerosols, Cloud and Ecosystem (PACE) mission is planned to have a UV band extending at least to 350 nm, so different spectral ranges are of interest.

Interestingly, the regression estimated different values for β in the two data compilations; see Fig. 2. For the high-resolution data compilation, the β estimates were in the range β ∈ [0.56, 0.96]; for the low-resolution data compilation, a majority of β estimates were > 0.96. These data compilations differ not only in terms of their spectral resolution and range, but also in terms of the water types they sample – the high-resolution data are taken from clear ocean waters [14], whereas the low-resolution data are taken from a range of water types. To examine whether spectral resolution and range can explain this difference in the estimated β values, we subsampled the high-resolution data at the same wavelengths as the low-resolution data and repeated the regressions on the subsampled data. The subsampling did not appreciably affect the β estimates, resulting in an average difference of 0.02 and a maximum difference of 0.07 (n.b. the stretched exponential outperformed the other models on the subsampled data). This suggests that these differences in β may result from optical differences between water samples, corroborating the possibility mentioned in the Introduction that β may contain useful biogeochemical information (ideally when comparing data with good and similar spectral resolution and range). β and s are also anti-correlated throughout all analyses described in this paper, suggesting that a reduced-parameter expression with β as a function of s, or vice versa, could be employed; further research is required to determine the global distributions of β and s, as well as the relationship between these two parameters.
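
The subsampling step can be sketched as below. Two of the five low-resolution bands (443 and 555 nm) do not fall on the 2 nm high-resolution grid, and the text does not say how this was handled; linear interpolation is used here purely as an assumption. The example spectrum is a hypothetical stand-in for measured data.

```python
import numpy as np

high_res_wl = np.arange(300.0, 701.0, 2.0)                  # 201 wavelengths
low_res_wl = np.array([412.0, 443.0, 490.0, 510.0, 555.0])  # five bands

def subsample_to_bands(spectrum):
    """Linearly interpolate a high-resolution spectrum onto the five bands."""
    return np.interp(low_res_wl, high_res_wl, spectrum)

# Hypothetical stretched exponential spectrum standing in for a measurement.
example = 0.1 * np.exp(-(0.015 * (high_res_wl - 300.0)) ** 0.8)
example_bands = subsample_to_bands(example)
print(example_bands)
```

When the subsampled spectra are refit, λ_o would be set to 412 nm, the lowest wavelength present, consistent with the convention stated in Section 2.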

Fig. 2 Histogram of β estimates from the regression to the high-resolution (BIOSOPE) data and to the low-resolution (IOCCG) data.

5. Summary

We found, using published in situ data compilations of CDOM and NAP absorption spectra, that the stretched exponential function (Equation 3) performed better as an analytical approximation of NAP-plus-CDOM absorption than a single or double exponential or a power law at both high and low spectral resolution. We also found that, for any plausible parameter combination, a double exponential representation of a_dg can be very well fit by a stretched exponential. These results favor the stretched exponential model for parameterizing NAP-plus-CDOM absorption, especially in applications where their signals cannot be separated.

We expect that the strongest potential application for this new model is its incorporation into semi-analytical inversions, such as the model of Ref. [6]. We expect it to provide improved inversions, particularly when incorporating data in the UV. We hope our findings here might encourage other researchers to test this model against their own and other community data, to further evaluate its utility in other applications and study regions.

Funding

National Science Foundation (NSF) (OCE-1315201); National Science Foundation Graduate Research Fellowship Program (NSF GRFP) (2388357); National Aeronautics and Space Administration (NASA) Ocean Biology and Biogeochemistry program (NNX15AC08G).

Acknowledgments

It is a pleasure to thank Hannah Williams for copyediting assistance.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

References and links

1. C. S. Roesler, M. J. Perry, and K. L. Carder, “Modeling in situ phytoplankton absorption from total absorption spectra in productive inland marine waters,” Limnol. Oceanogr. 34(8), 1510–1523 (1989).

2. C. D. Mobley, Light and Water: Radiative Transfer in Natural Waters (Academic, 1994).

3. C. S. Roesler and M. J. Perry, “In situ phytoplankton absorption, fluorescence emission, and particulate backscattering spectra determined from reflectance,” J. Geophys. Res. 100, 13279–13294 (1995).

4. S. A. Garver and D. A. Siegel, “Inherent optical property inversion of ocean color spectra and its biogeochemical interpretation 1. Time series from the Sargasso Sea,” J. Geophys. Res. 102, 18607–18625 (1997).

5. Z. P. Lee, K. L. Carder, and R. Arnone, “Deriving inherent optical properties from water color: a multi-band quasianalytical algorithm for optically deep waters,” Appl. Opt. 41, 5755–5772 (2002).

6. P. J. Werdell, B. A. Franz, S. W. Bailey, G. C. Feldman, E. Boss, V. E. Brando, M. Dowell, T. Hirata, S. J. Lavender, Z. P. Lee, H. Loisel, S. Maritorena, F. Mélin, T. S. Moore, T. J. Smyth, D. Antoine, E. Devred, O. Hembise Fanton d’Andon, and A. Mangin, “Generalized ocean color inversion model for retrieving marine inherent optical properties,” Appl. Opt. 52(10), 2019–2037 (2013).

7. M. S. Twardowski, E. Boss, J. M. Sullivan, and P. L. Donaghay, “Modeling the spectral shape of absorption by chromophoric dissolved organic matter,” Marine Chemistry 89(1), 69–88 (2004).

8. K. S. Shifrin, Physical Optics of Ocean Water (Springer Science & Business Media, 1988).

9. R. Kohlrausch, “Theorie des elektrischen Rückstandes in der Leidner Flasche,” Annalen der Physik und Chemie 91, 56–82 (1854).

10. D. C. Johnston, “Stretched exponential relaxation arising from a continuous sum of exponential decays,” Phys. Rev. B 74, 184430 (2006).

11. K. L. Carder, R. G. Steward, G. R. Harvey, and P. B. Ortner, “Marine humic and fulvic acids: Their effects on remote sensing of ocean chlorophyll,” Limnol. Oceanogr. 34(1), 68–81 (1989).

12. M. Babin, D. Stramski, G. M. Ferrari, H. Claustre, A. Bricaud, G. Obolensky, and N. Hoepffner, “Variations in the light absorption coefficients of phytoplankton, nonalgal particles, and dissolved organic matter in coastal waters around Europe,” J. Geophys. Res. 108(C7), 3211 (2003).

13. Z. P. Lee, ed., “Reports of the International Ocean-Colour Coordinating Group, No. 5, Remote sensing of inherent optical properties: fundamentals, tests of algorithms, and applications,” IOCCG, Dartmouth, NS, Canada (2006).

14. A. Bricaud, M. Babin, H. Claustre, J. Ras, and F. Tieche, “Light absorption properties and absorption budget of South East Pacific waters,” J. Geophys. Res. 115, C08009 (2010).

15. “ac-s In Situ Spectrophotometer Datasheet,” Sea-Bird Scientific (WET Labs), Philomath, OR, USA (2016).

Table 1.
Quantity	Median (Range)
r²	1.0000 (0.9985 – 1)
RMSE	3.3 × 10⁻⁴ (1.4 × 10⁻⁵ – 8.0 × 10⁻³)
β	0.986 (0.755 – 1)
s [nm⁻¹]	0.0165 (0.0082 – 0.0510)

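Table 2.
a_dg data	Stretched vs.	Better r²	Better RMSE
High-resolution (n = 260)	Single	260	260
High-resolution (n = 260)	Double	145	145
High-resolution (n = 260)	Power	257	257
Low-resolution (n = 656)	Single	528	367
Low-resolution (n = 656)	Double	610	656
Low-resolution (n = 656)	Power	439	352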

Optics Express