
Shot noise reduction in radiographic and tomographic multi-channel imaging with self-supervised deep learning


Abstract

Shot noise is a critical issue in radiographic and tomographic imaging, especially when additional constraints lead to a significant reduction of the signal-to-noise ratio. This paper presents a method for improving the quality of noisy multi-channel imaging datasets, such as data from time or energy-resolved imaging, by exploiting structural similarities between channels. To achieve that, we broaden the application domain of the Noise2Noise self-supervised denoising approach. The method draws pairs of samples from a data distribution with identical signals but uncorrelated noise. It is applicable to multi-channel datasets if adjacent channels provide images with similar enough information but independent noise. We demonstrate the applicability and performance of the method via three case studies, namely spectroscopic X-ray tomography, energy-dispersive neutron tomography, and in vivo X-ray cine-radiography.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Many imaging modalities rely on the penetration ability of radiation to make the interior of an object visible. Physical interactions between radiation and matter, such as absorption, scattering, or phase shifts, can be used to obtain contrast inside the object of interest. Commonly, either radiography (a single projection view) or tomography (multiple views with subsequent volumetric image reconstruction) is acquired. As particle or photon emission and detection are stochastic processes, and source flux and detector efficiency are often limited, longer exposure times improve image quality. However, there are many scenarios where sufficient exposure cannot be achieved. An obvious example is in vivo imaging, where the radiation dose ultimately limits the amount of information acquired [1]. Another example is spectroscopic imaging with a polychromatic beam, where the detected intensity or particle counts are distributed across multiple energy bins. This leads to significant noise per energy channel or requires a dramatic increase of exposure times, hence limiting the experiment throughput [2]. Both imaging modes can be generalized as multi-channel images. This paper addresses cases in which the channels of multi-channel images share a sufficient amount of common structural information but are affected by independent noise samples.

Given the aforementioned physical constraints, we often need to rely on image processing techniques to improve image quality and extract valuable data. The group of methods for improving image quality affected by noise is called denoising. As multi-channel images naturally share a lot of structural information, a denoising approach exploiting the data redundancy in the channel domain is an intuitive way to enhance image quality. As in other domains, methods based on machine learning (ML) have revolutionized denoising [3]. In this paper, we demonstrate an ML approach to improve the quality of underexposed images in challenging applications such as spectroscopic k-edge X-ray tomography, in vivo X-ray radiography, and energy-dispersive Bragg-edge neutron tomography. The method is based on the recent Noise2Noise (N2N) self-supervised denoising approach [4]. It has been demonstrated that N2N clearly outperforms comparable methods [5]; therefore, we limit the scope of this paper to the N2N approach. The main assumption enabling the N2N method is formulated as follows: Consider two images $I_1$ and $I_2$ that share the same structural information $S$ but are affected by independent and identically distributed (iid) instances of noise $\sigma _1$ and $\sigma _2$. If a model is trained to predict $I_1$ given $I_2$ as input, the best possible prediction is $S$, because $\sigma _2$ is conditionally independent of $\sigma _1$ given $S$. N2N assumes that $I_1$ and $I_2$ have an identical signal; however, in the multi-channel case this condition is commonly not satisfied. Therefore, we broaden the applicability of N2N by relaxing the original constraints. Additionally, we demonstrate how to mitigate possible shortcomings of these relaxations.

This paper is organized as follows. First, we outline the related work to put our work in context. Then, the N2N method is described, followed by three rigorous case studies. Finally, a discussion of our findings in the context of multi-channel imaging is provided.

2. Related work

Artifacts are inherent to digitally acquired images, as the acquisition function is subject to many uncertainties and, in general, is not accurately known. The acquisition function includes source or detector heterogeneity, optical distortion by optical elements or diffraction during wave-field propagation, and inherent noise driven by the stochastic nature of particle emission, detection, and interactions. As optical distortion is a misplacement of information, a distortion map can be estimated and applied to compensate for it. In a number of cases, simple flat- and dark-field corrections are applicable. Flat and dark fields refer to measurements of the detector response with and without source illumination, respectively. Finally, denoising is used to compensate for the inherent stochastic noise. Hence, the denoising problem is to restore the (deterministic) signal $S$ from a noisy observation $I$ [6]:

$$I = S + \sigma(S)$$
where $\sigma (S)$ is the inherent noise of the imaging device. This noise depends not only on the stochasticity of the particles (neutrons, photons, etc.) and the electronics but also on the distortions and on the transformations applied to correct the image. All of this makes a closed-form estimation of the noise distribution problematic.
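As a concrete illustration of the corrections mentioned above, here is a minimal sketch of the standard flat- and dark-field normalization applied before any denoising step (the array names and clipping constant are illustrative):

import numpy as np

def flat_dark_correct(raw, flat, dark, eps=1e-6):
    """Standard flat/dark-field normalization of a projection.

    raw  : detector image with the sample in the beam
    flat : detector response with illumination but without the sample
    dark : detector response without illumination
    """
    # Subtract the detector offset and normalize by the beam profile.
    transmission = (raw - dark) / np.clip(flat - dark, eps, None)
    # Clip to keep values positive for a subsequent -log transform.
    return np.clip(transmission, eps, None)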

The existing image denoising approaches can be roughly categorized into two large groups: classical image processing and ML approaches. Typically, classical image processing approaches work in a single-image manner and incorporate expert beliefs about the nature of noise. ML approaches, on the other hand, employ the idea of fitting a data-driven model entirely without or with minimal expert knowledge about the nature of the data.

2.1 Classical image processing

The basic spatial filtering methods are mean, median, or Gaussian kernel filters [6]. For each pixel, these filters select a new value, based on the weighted values of the neighboring pixels. These filters are fast, robust, well-understood, and work fairly well in many situations. The main drawback of these classical filters is their tendency to blur sharp edges.
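These filters are readily available, e.g., in SciPy; the snippet below shows how they would typically be applied (the kernel sizes are arbitrary placeholders):

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
noisy = rng.poisson(50, size=(256, 256)).astype(float)  # synthetic shot-noise image

mean_filtered = ndimage.uniform_filter(noisy, size=3)        # local mean
median_filtered = ndimage.median_filter(noisy, size=3)       # local median
gaussian_filtered = ndimage.gaussian_filter(noisy, sigma=1)  # Gaussian kernel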

More advanced spatial filtering approaches, e.g., non-local means (NLM), use more information from the whole image [7]. Instead of averaging the direct neighbourhood of a pixel, NLM averages over a large region, weighted by the similarity between the “donor” and the “recipient” pixel. The process of revisiting multiple locations in the image, comparing their surroundings, and computing the average can take minutes for one image. In return, this method is capable of producing sharper denoised images [8].

Alternatively, denoising can be formulated as an optimization problem and regularization can be used to incorporate prior knowledge about the image properties [9]. These methods are very powerful but in many cases require deep mathematical knowledge and handcrafted regularizers, making their application to experimental data challenging. One of the most successful regularizers is Total Variation (TV), which encourages piece-wise constant image regions with sharp boundaries [10]. In summary, classical methods require fine-tuning of parameters by an expert to balance smoothing and denoising; hence, there is a significant risk of information loss if they are applied incorrectly. A more comprehensive overview of classical denoising methods can be found elsewhere [8].
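For illustration, both TV and NLM denoising are available off the shelf in scikit-image; the regularization weight and patch sizes below are arbitrary placeholders that would need tuning per dataset:

import numpy as np
from skimage.restoration import denoise_tv_chambolle, denoise_nl_means

rng = np.random.default_rng(0)
noisy = rng.poisson(50, size=(256, 256)).astype(float) / 50.0

# TV denoising: a larger weight enforces stronger piece-wise constant smoothing.
tv_denoised = denoise_tv_chambolle(noisy, weight=0.1)

# Non-local means: patch and search sizes trade sharpness against runtime.
nlm_denoised = denoise_nl_means(noisy, patch_size=5, patch_distance=6, h=0.1)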

2.2 Machine learning approaches

The evolution of classical methods may be seen as a series of steps taken to increase the amount of information used to correct a single pixel value. In this respect, ML-based approaches appear as a natural further step: a model, trained to correct the noise, implicitly incorporates knowledge about the whole dataset.

Early ML-based image denoising approaches worked in a supervised manner, i.e., a model was trained on a set of noisy images to predict a noise-free image (target). Recently, the authors of the N2N method demonstrated that there is no need for a noise-free target: if one uses a pair of noisy images (affected by iid instances of noise) as the input and the target for training, the model will predict the noise-free image [4]. The underlying intuition is that independent instances of noise are uncorrelated and cannot be predicted, hence the model is forced to extract features. Even though N2N does not explicitly require a set of noise-free images, the authors of [4] synthetically formed noisy pairs by adding noise to noise-free images.

There have been several attempts to extend the N2N method for denoising problems where pairs of images are not naturally available. Noise2Self [11] and Noise2Void [12] generate the required pair of images by taking random pixels in the noisy image and disturbing them with yet another noise distribution. In this way, multiple training pairs can be constructed from a single noisy image. Noise2Stack [13] was designed for three-dimensional tomographic data and is based on an assumption that tomographic data is typically smooth. Therefore, slice-to-slice changes are assumed to be significantly smaller than the slice-wise variability caused by noise, hence, neighbouring slices can be used for training.

Alternatively, constrained autoencoders can be used to denoise images [14]. During training, autoencoders use the same image as both input and target and attempt to compress (encode) the input image into a lower-dimensional representation. The denoising properties of this approach rely on the assumption that the noise, due to its stochastic nature, is harder to encode than the signal. To further prevent the model from memorizing the noise, one can limit the computational capacity of the model, lower the dimensionality of the learned representation, or inject synthetic noise into it [14]. However, autoencoders are inefficient if the noise is spatially correlated and can be easily memorized by the model. The Hierarchical DivNoising (HDN) method addresses this issue by training a variational autoencoder with a noise model imposed over the output [5]. The authors proposed a way to find the particular components of the model that encode information about the noise, so that those components can be removed. Even though these methods provide a valuable alternative to the N2N approach, the authors highlight that N2N is a hard-to-beat baseline [5]. Therefore, we limit the scope of the paper to the N2N approach.

3. Model training

The N2N method assumes that a pair of images contains the same signal and iid noise. Our adaptation of the method to multi-channel image data takes its inspiration from the denoising of synthetic aperture radar (SAR) images [15]. In SAR imaging, both the phase and the amplitude of the received microwaves are measured in each pixel; commonly, the phase information is ignored. However, the authors demonstrated that the amplitude and the phase contain complementary information and can be used as a basis for N2N denoising. We hypothesize that in multi-channel imaging, adjacent time frames or energy levels share sufficiently similar signals and have noise samples that are close to iid. Therefore, we generate the required image pairs based on this hypothesis. To help the model capture complex spatial structures of the signal, we also feed it multiple adjacent energies or time frames as input whenever this does not result in oversmoothing.

Following [4,11,12], we use fully convolutional neural networks as the model architecture. We employ a U-Net with a ResNet-50 backbone (as implemented in [16]) and rely on the Adam optimizer with a $3\times 10^{-4}$ learning rate, without scheduling. We use $f_\theta$ and the term model interchangeably, where $\theta$ denotes the trainable parameters of the neural network. Training is therefore the process of minimizing the proposed loss function (defined for each specific experiment) by changing the parameters $\theta$. We augment each image pair with random crops, shifts, scaling, rotations, distortions, and different types of blur. We acknowledge that there is room for quality improvement via larger models, modern architectures, better optimization procedures, or more aggressive augmentations. A sensitivity study of the training parameters is a topic of future investigation.
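A minimal training sketch of this setup, assuming the segmentation-models-pytorch package of [16], a two-channel input as used in the case studies below, and an otherwise untuned batch layout:

import torch
import segmentation_models_pytorch as smp

# U-Net with a ResNet-50 encoder: two adjacent channels in, one denoised channel out.
model = smp.Unet(encoder_name="resnet50", encoder_weights=None, in_channels=2, classes=1)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = torch.nn.L1Loss()

def training_step(inputs, target):
    """One Noise2Noise step: predict one noisy channel from its neighbours."""
    # inputs: (B, 2, H, W) adjacent channels; target: (B, 1, H, W) channel in between
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), target)
    loss.backward()
    optimizer.step()
    return loss.item()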

4. Experiments

4.1 Simulated spectral X-ray tomography

As a first case study, we discuss the applicability of N2N to energy-dispersive X-ray tomography, which is of interest for biomedical imaging [2]. The polychromatic emission of laboratory X-ray tube sources is suitable to provide sufficient photon flux. However, the broadband spectrum also leads to disadvantages in quantitative analysis. In the conventional absorption mode, each detector pixel integrates all photons irrespective of their energy. Since attenuation is a function of photon energy, conventional tomographic reconstruction might exhibit so-called beam-hardening artifacts [17]. However, acquisition with an energy-dispersive X-ray detector allows segmenting materials that can be inseparable in polychromatic absorption contrast. These are materials with similar mean polychromatic absorption, but with spectra showing sharp discontinuities at energies equal to the binding energies of the core-electron states, the so-called absorption edges (K, L, M edges). The energy spectrum in each reconstructed voxel can be used to identify the corresponding material. Highly energy-dispersive (so-called hyperspectral) X-ray detectors still have a limited total pixel count but offer an energy resolution of about 1 keV [18], allowing even neighboring chemical elements to be distinguished. However, a high spectral energy resolution entails long exposure times since the acquired counts are distributed over multiple bins. Therefore, a reliable state-of-the-art denoising approach might help to improve the experimental throughput. In this study, to ensure strictly controlled conditions, we simulated the tomographic acquisition.

4.1.1 Data

We generated a volumetric phantom by combining several three-dimensional point clouds: two Swiss rolls, two moon crescents, and an s-curve. All point clouds were generated with the Scikit-Learn library [19]; to convert the 2D point clouds to 3D, the third axis was added by randomly sampling from a uniform distribution. To convert the point clouds to a raster volume, we selected the size (in voxels) of each point. To resolve ambiguous cases (when several materials appeared in the same voxel), we selected a priority order and always assigned the material with the highest priority to the ambiguous voxel. The spatial size of the phantom was set to $512 \times 512 \times 512$. In their rough structure, all slices of the dataset are the same; however, the surface texture varies because of the random nature of the point clouds.
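A minimal sketch of this phantom construction with scikit-learn's generators; the sample counts, scaling, and the omission of the per-point dilation are our simplifications:

import numpy as np
from sklearn.datasets import make_swiss_roll, make_s_curve, make_moons

rng = np.random.default_rng(0)
size = 512

def to_voxels(points):
    """Scale a 3D point cloud into a size^3 grid and rasterize it as a boolean mask."""
    points = points - points.min(axis=0)
    points = points / points.max() * (size - 1)
    mask = np.zeros((size, size, size), dtype=bool)
    idx = np.round(points).astype(int)
    mask[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return mask

swiss, _ = make_swiss_roll(n_samples=20000, random_state=0)           # already 3D
s_curve, _ = make_s_curve(n_samples=20000, random_state=0)            # already 3D
moons2d, _ = make_moons(n_samples=20000, noise=0.02, random_state=0)  # 2D
# Add a third axis to the 2D cloud by uniform random sampling, as described above.
moons = np.column_stack([moons2d, rng.uniform(0, 1, len(moons2d))])

# Assign material labels; materials later in the list have higher priority and
# overwrite earlier ones in ambiguous voxels.
phantom = np.zeros((size, size, size), dtype=np.uint8)
for label, cloud in enumerate((swiss, s_curve, moons), start=1):
    phantom[to_voxels(cloud)] = label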

We assigned the simulated objects the energy-dependent mass attenuation coefficients (MAC) of europium (${}_{63}{\mathbf {Eu}}$, k-edge = 48.5 keV), gadolinium (${}_{64}{\mathbf {Gd}}$, k-edge = 50.2 keV), ytterbium (${}_{70}{\mathbf {Yb}}$, k-edge = 61.3 keV), lutetium (${}_{71}{\mathbf {Lu}}$, k-edge = 63.3 keV), and uranium (${}_{92}{\mathbf {U}}$, k-edge = 115.6 keV). The background was assigned the MAC of air. This particular choice of materials was inspired by the study of the separability of k-edge nanoparticles presented in [20]. Two pairs of materials have neighbouring atomic numbers, hence very close k-edges, and are barely distinguishable in a noisy image; uranium was added to have a k-edge in the noisiest part of the spectrum, to test the ability of the method to locate a k-edge under extreme noise conditions.

We used the MATLAB package PhotonAttenuation to generate the energy-dependent MAC of the selected materials [21]. A spectrum profile of a Boone/Fewell source with a tube potential of 150 kV (no kV ripple or filtration) was generated using the MATLAB package spektr 3.0 [22]. The obtained source spectrum was normalized and scaled to have a maximum value of $175 \times 10^3$ photons / $\mathrm {mm^2}$ to imitate a short-exposure acquisition. The MAC of the selected materials and the source spectrum are shown in Fig. 1.


Fig. 1. The materials and source characteristics used to simulate spectral X-ray tomography. For materials, energy-dependent mass attenuation coefficients (MAC) are presented, and for the simulated Boone/Fewell source, we present the source profile. We selected two pairs of materials with close k-edges that are hard to resolve and one material with the k-edge in the low-flux zone of the source.


We generated 135 energy bins between 15 and 150 keV with a 1 keV step. For each bin, we simulated 120 equally-spaced parallel-beam CT projections over 180$^{\circ }$. We used the conventional FBP algorithm (as implemented in [23]) to reconstruct the tomographic data. Examples of reconstructed slices for 40 keV (high flux) and 140 keV (low flux) are shown in Fig. 2(b) (right). As expected, at 140 keV the reconstructed slice is uninterpretable.
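The per-bin simulation and reconstruction can be sketched as follows. The authors reconstruct with the FBP implementation of [23]; this sketch substitutes scikit-image's parallel-beam radon/iradon for brevity and models the photon statistics with the Beer-Lambert law and Poisson noise:

import numpy as np
from skimage.transform import radon, iradon

angles = np.linspace(0.0, 180.0, 120, endpoint=False)

def simulate_bin(mu_slice, incident_photons, pixel_size_mm=1.0):
    """Noisy parallel-beam sinogram of one 2D slice for one energy bin.

    mu_slice         : linear attenuation map at this energy [1/mm], assumed
                       near zero outside the inscribed circle
    incident_photons : expected photon count per detector pixel in this bin
    """
    line_integrals = radon(mu_slice, theta=angles, circle=True) * pixel_size_mm
    expected_counts = incident_photons * np.exp(-line_integrals)
    noisy_counts = np.random.poisson(expected_counts)
    # Back to line integrals; clip to avoid log(0) for fully absorbed rays.
    return -np.log(np.clip(noisy_counts, 1, None) / incident_photons)

def reconstruct_bin(sinogram):
    """Conventional filtered back-projection for one energy bin."""
    return iradon(sinogram, theta=angles, filter_name="ramp", circle=True)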


Fig. 2. Qualitative examination of the denoising of the simulated energy-resolved X-ray CT of a specially devised phantom. On the left, we present noisy (left) and denoised (right) transverse slices near peak (top) and low (bottom) flux. On the right, we present a comparison of the theoretical, noisy, and denoised spectra for different materials. For each material, we selected one representative pixel. Note how denoising is able to recover information even in extremely noisy cases, both spatially (b) and spectrally (see the slight uranium k-edge in plot (c)).


4.1.2 Training and processing

The model $f_{\theta }$ was trained by optimizing

$$\mathbb{E}_{i,j} \Vert f_{\theta}(x_{i,j-1}, x_{i,j+1}) - x_{i,j} \Vert_1 \xrightarrow[\theta]{} \min,$$
where $x_{i,j}$ is a projection acquired at transmission angle $i$ and in energy bin $j$. We randomly split the whole set of projection angles into a training set and a validation set with a ratio of $80/20$. We do not select a test set since, in our experiments, we deliberately overfit to the exact dataset and do not require generalization. Note that the energy level $j$, which the model is required to predict, must not be fed into the model to avoid the trivial solution; only the adjacent levels $j-1$ and $j+1$ are used. This forms a gap of one energy level in the inputs.
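A minimal PyTorch dataset sketch of this pair construction (the array layout and class name are our own illustration):

import torch
from torch.utils.data import Dataset

class GapPairDataset(Dataset):
    """Yields ((x[j-1], x[j+1]), x[j]) pairs from a stack of shape (angles, energies, H, W)."""

    def __init__(self, projections, angle_indices):
        self.projections = projections      # numpy array, shape (A, E, H, W)
        self.angle_indices = angle_indices  # training or validation angles
        self.n_energy = projections.shape[1]

    def __len__(self):
        # Every interior energy bin of every selected angle is one training sample.
        return len(self.angle_indices) * (self.n_energy - 2)

    def __getitem__(self, k):
        a = self.angle_indices[k // (self.n_energy - 2)]
        j = 1 + k % (self.n_energy - 2)
        neighbours = self.projections[a, [j - 1, j + 1]]  # input: bins j-1 and j+1
        target = self.projections[a, j:j + 1]             # target: bin j, left out of the input
        return torch.from_numpy(neighbours).float(), torch.from_numpy(target).float()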

During inference, we feed the model with adjacent energy bins without the gap used in training, to avoid blur in the spectral domain:

$$\tilde{x}_{i,j-0.5} = f_{\theta}(x_{i,j-1}, x_{i,j}).$$

Since the model now receives two directly adjacent bins, its prediction corresponds to an energy level halfway between the two input levels (index $j-0.5$ in the equation above). This half-bin shift is important to account for, yet easy to compensate.
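A sketch of this inference loop follows; relabeling the energy axis by half a bin is our way of illustrating the compensation, since the exact procedure is not prescribed above:

import numpy as np
import torch

@torch.no_grad()
def denoise_stack(model, projections, energies_keV):
    """Denoise a (E, H, W) projection stack bin by bin without the training gap."""
    model.eval()
    outputs, centers = [], []
    for j in range(1, projections.shape[0]):
        pair = torch.from_numpy(projections[j - 1:j + 1]).float().unsqueeze(0)
        outputs.append(model(pair).squeeze().numpy())
        # Each prediction corresponds to the midpoint of its two input bins.
        centers.append(0.5 * (energies_keV[j - 1] + energies_keV[j]))
    return np.stack(outputs), np.array(centers)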

As before, the denoised tomographic datasets were reconstructed with the conventional FBP algorithm. Here, each energy bin was reconstructed separately resulting in 135 volumes. To obtain the spatial distribution of individual materials in the sample, we performed material decomposition as described in [24]. The employed decomposition relies on the assumption that each voxel is a unit volume and each material occupies a volume fraction in this unit volume (the fraction can be 0). Under this assumption, a voxel-wise sum of all material maps is equal to 1 in each voxel.
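A minimal per-voxel sketch of such a volume-fraction decomposition, using non-negative least squares with the sum-to-one constraint enforced softly through an extra weighted equation; this is our simplification, not necessarily the exact solver of [24]:

import numpy as np
from scipy.optimize import nnls

def decompose_voxel(spectrum, material_macs, weight=10.0):
    """Estimate non-negative volume fractions for one voxel.

    spectrum      : measured attenuation per energy bin, shape (E,)
    material_macs : reference attenuation spectra of the candidate materials, shape (E, M)
    """
    # Append a weighted row of ones so that the fractions approximately sum to one.
    A = np.vstack([material_macs, weight * np.ones((1, material_macs.shape[1]))])
    b = np.concatenate([spectrum, [weight]])
    fractions, _ = nnls(A, b)
    return fractions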

4.1.3 Results discussion

Figures 2(a) and 2(b) show two-dimensional slices for selected (individual) energy bins. N2N provides a drastic quality improvement of the reconstruction. For 40 keV (high source flux, Fig. 2(a)), the reconstructed slice appears to be almost noise-free; the slice shows sharply defined objects, and all original structures become clearly visible. Although no signal seems to be visible in the 140 keV slice (Fig. 2(b)) prior to denoising, N2N is able to partially recover the structures in the slice.

For one representative voxel per material component, the noisy and denoised energy spectra are plotted in Fig. 2(c) along with the theoretical MAC. The voxel positions within the materials were chosen arbitrarily. Noise reduction results in sharp and accurately positioned k-edges, aiding further material decomposition. Even the slight uranium k-edge is visible in the denoised spectrum.

To quantitatively evaluate the denoising results, we perform the material decomposition. Since the sum of the volume fractions obtained through material decomposition is bound to 1 in each voxel, we can treat the estimated volume fractions as probabilities. Hence, material decomposition can be considered a classification problem and the related quality assessment metrics can be applied to quantitatively assess the results. The comparison results are shown in Figs. 3(a) and 3(b). In the top row, we show the binarized material decomposition error (black corresponds to an erroneous material prediction). The confusion matrices between the predicted and true materials for each pixel are presented in the bottom row (perfect classification results in the identity confusion matrix). The high level of noise in the simulated data causes misclassifications between close materials (e.g., lutetium and ytterbium). As visible in the top row, these errors are distributed evenly throughout the sample. Hypothetically, this could be compensated by enforcing an assumption of material homogeneity; however, this assumption might cause severe errors close to material interfaces. The errors in the denoised volume are mostly concentrated around the borders (see the top row) and mainly correspond to misclassification as air due to slight blurring (see the bottom row). Overall, the confusion matrix for the denoised dataset is considerably closer to the identity matrix.


Fig. 3. Quantitative examination of the denoising of the simulated energy-resolved X-ray CT of a specially devised phantom. We study the quality of denoising through the lens of further material decomposition. On the top row, we present the binarized material decomposition error. Pixels that are black were assigned the wrong material. On the bottom row, we present the confusion matrix of the material decomposition for different materials, where ideal decomposition should yield the identity matrix. We note that materials with close k-edges are frequently confused before the denoising, and after the denoising, the confusion mainly comes from spatial smoothing.


We also present the Area Under Precision-Recall Curve (AUPRC), measured for each material, in Table 1. The AUPRC for ideal classification is 1. The AUPRC results additionally highlight the improvement after denoising: N2N provides a boost of more than $10\%$ in mean AUPRC for the downstream material decomposition. To assess the quality loss caused by the reconstruction itself (without any effect of denoising), we generated another set of projections with very high flux (all other parameters remained constant). Material decomposition for this volume shows a mean AUPRC of $0.999$, with the lowest precision of $0.996$ for air. We conclude that the reconstruction losses are negligible in this experiment.
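Treating the estimated volume fractions as per-material scores, the AUPRC values can be computed, e.g., with scikit-learn (variable names are illustrative):

import numpy as np
from sklearn.metrics import average_precision_score

def material_auprc(true_labels, fraction_maps):
    """Per-material area under the precision-recall curve.

    true_labels   : ground-truth material index per voxel, shape (N,)
    fraction_maps : estimated volume fractions, shape (N, M)
    """
    return {m: average_precision_score(true_labels == m, fraction_maps[:, m])
            for m in range(fraction_maps.shape[1])}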


Table 1. Quantitative examination of the denoising of the simulated energy-resolved X-ray CT of a specially devised phantom. We numerically compare the material decomposition quality before and after denoising. The denoising provides a prominent quality boost for material decomposition.

4.2 Neutron imaging

As a second case study, we discuss the applicability of N2N to energy-dispersive Bragg-edge neutron tomography. Neutron imaging provides a complementary contrast to conventional X-ray imaging. Neutrons mainly interact with atomic nuclei; in this way, a neutron beam passing through an object can capture information about the internal material structure. The energy spectrum of the transmission of a polychromatic thermalized neutron beam passing through a predominantly polycrystalline material contains sudden, sharp edges at wavelengths equal to twice the interplanar distances of the scattering planes, depending on the crystalline properties of the sample material [25]. Energy-dispersive images can be acquired efficiently by combining a pulsed neutron spallation source and a suitable time-sensitive detector using the time-of-flight (ToF) method, which exploits the energy-dependent neutron velocity for spectral information (more energetic neutrons have higher velocities and reach the detector earlier than less energetic, slower ones). By measuring the time of arrival of the neutrons at the detector and knowing the flight path length, their energies and the corresponding wavelengths can be determined. For ToF methods, high energy resolution requires long flight distances and many time bins in the detector. Hence, only a few pulses per second can be measured, and the acquired counts are shared between multiple bins [26]. More details on this acquisition mode can be found elsewhere, for both the measurement setup [27] and applications [28,29]. Neutron facilities are expensive and the demand for neutron beamtime exceeds the supply capacity [30]. Therefore, there is a high interest in efficient image denoising techniques to reduce exposure time and subsequently increase experiment throughput.
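For reference, the wavelength follows from the de Broglie relation $\lambda = h t / (m_n L)$, with $t$ the time of flight and $L$ the flight path length; the numbers in the example below are illustrative, not the IMAT beamline parameters:

from scipy.constants import h, m_n  # Planck constant, neutron mass

def tof_to_wavelength_angstrom(tof_seconds, flight_path_m):
    """Convert a neutron time of flight to a wavelength via the de Broglie relation."""
    return h * tof_seconds / (m_n * flight_path_m) * 1e10  # metres -> angstroms

# Example: a 30 ms flight over a 56 m path corresponds to roughly 2.1 Å.
print(tof_to_wavelength_angstrom(30e-3, 56.0))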

4.2.1 Data

In this study, we employ the dataset [31] acquired at the Imaging and Materials Science & Engineering (IMAT) beamline operating at the ISIS spallation neutron source (Rutherford Appleton Laboratory, UK) [32,33]. More details on acquisition parameters and preprocessing can be found elsewhere [24]; here we only briefly summarize details relevant to this study.

The sample contains six aluminium tubes: five filled with metallic powders (copper (Cu), aluminium (Al), zinc (Zn), iron (Fe), and nickel (Ni)) and one left empty. The neutron detector has $512 \times 512$ pixels with a 0.055 mm pixel size. A set of spectral projections was acquired at 120 equally-spaced angular positions over a 180$^{\circ }$ rotation with 15 min exposure. Additionally, 8 flat-field images (4 before and 4 after the acquisition) were acquired with the same exposure.

A typical problem of spectral measurements is that the noise statistics vary quite drastically across the spectrum. The beam spectrum at the IMAT beamline has a crude bell shape with a peak around 3 Å [32]. Additionally, the time-sensitive detector suffers from dead time, i.e., count losses and, hence, additional signal distortions [34]. To alleviate the count loss problem, the time (wavelength) domain is split into several independent measurement intervals (4 in this case) and a special correction technique is applied to the measured data [34]. Each interval has an individual bin width; for this study, the following bin widths were used: $0.7184 \cdot 10^{-3}$ Å, $1.4368 \cdot 10^{-3}$ Å, $\cdot 10^{-3}$ Å and $2.8737 \cdot 10^{-3}$ Å. To benchmark N2N, we generated three additional datasets by rebinning the dataset in the original resolution (2840 energy bins split into 4 measurement intervals with (1141, 814, 424, 464) bins each). The rebinning was performed individually in each interval by summing every (4, 2, 2, 1), (8, 4, 4, 2), and (16, 8, 8, 4) bins, resulting in datasets with 1366, 681, and 339 wavelength bins, respectively.
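The rebinning itself amounts to summing groups of adjacent wavelength bins within each measurement interval; the handling of leftover bins in this sketch is our assumption:

import numpy as np

def rebin_interval(stack, group):
    """Sum every `group` adjacent wavelength bins of a (bins, H, W) interval."""
    n = (stack.shape[0] // group) * group  # drop a possible remainder
    return stack[:n].reshape(-1, group, *stack.shape[1:]).sum(axis=1)

# Per-interval grouping factors from the text, e.g., for the coarsest rebinning:
# rebinned = [rebin_interval(interval, g) for interval, g in zip(intervals, (16, 8, 8, 4))]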

As a proxy for the noisiness of the data as a function of wavelength, we plot in Fig. 4 the standard deviation of pixel values for one projection angle at different wavelengths along the spectrum. Vertical dashed lines separate the independent intervals. Note that the standard deviation increases drastically with increasing flux (the effect of count losses becomes more apparent).


Fig. 4. Limitations of time-sensitive detectors require splitting the whole wavelength domain into several measurement intervals (4 in our case, brackets depicted with dashed lines). Each interval has an individual wavelength bin width. In this plot, we show how standard deviation (as a proxy characteristic of noisiness) changes with the change in wavelength. Additionally, to benchmark the method, we generated three additional datasets by rebinning the spectral dimension of the original dataset.


4.2.2 Training and analysis details

In this experiment, we compare the effect of noise reduction applied to the projections (N2N(P)) before reconstruction with that of applying it to the already reconstructed slices (N2N(S)). In both cases, we trained a model $f_{\theta }$ by performing essentially the same loss optimization procedure as in the previous case study

$$\mathbb{E}_{i,j} \Vert f_{\theta}(x_{i,j-1}, x_{i,j+1}) - x_{i,j} \Vert_1 \xrightarrow[\theta]{} \min,$$
where now $x_{i,j}$ can represent either the projection for an angle $i$ and an energy channel $j$, or a reconstructed slice number $i$ and an energy channel $j$. We used $i$ to randomly split the dataset into the training and validation subsets in the $80/20$ ratio.

The combination of the N2N denoising approach and the conventional FBP reconstruction was compared with the advanced iterative reconstruction routine proposed in [24]. The latter relies on expert expectations of how the reconstructed image should look, which become increasingly difficult to formulate as the sample under investigation becomes more complex. As in this case the reconstructed samples are expected to appear as solids, i.e., homogeneous regions, the authors assumed a piece-wise constant signal in the spatial domain. This prior knowledge is enforced through TV regularization [35,36]. The signal in the spectral domain is expected to be piece-wise smooth, based on theoretical predictions for the materials employed in this study [37]. In this case, regularization is achieved through a Total Generalized Variation (TGV) prior [38]. Hence, we refer to the iterative reconstruction method as TV-TGV. As before, the reconstruction was implemented in CIL [39]. Code to reproduce the results is available from [40].

4.2.3 Results discussion

We begin with a visual comparison of the different denoising approaches in the spectral domain (Fig. 5). We perform the comparison for the 339-channel dataset, as the same binning was used for the case study in [24]. The theoretical predictions provide the ground truth for the comparison. As in the previous case study, without denoising the conventional FBP reconstruction results are uninterpretable. The N2N performance is comparable to that of the TV-TGV reconstruction. TV-TGV provides smoother spectra at the cost of spectral and spatial resolution loss. In contrast, the N2N results appear sharper spatially but noisier spectrally for low-attenuating materials. Hence, we conclude that there is a certain threshold noise level that N2N can handle efficiently.


Fig. 5. Qualitative comparison of denoising techniques in the spectral domain for the neutron imaging dataset. For each material, we selected one representative voxel and present the theoretical and empirical spectra. Left: results of TV-TGV, N2N applied to slices, and N2N applied to projections, with the SSIM measured against the theoretical spectrum (given in brackets in the legend); right: the spectra before denoising. All results are presented for the dataset with 339 energy bins. We note that N2N provides sharper edges but noisier predictions.


Figure 6(a) compares, in the transverse plane, slices reconstructed from the white-beam data (sum of all energies) with slices reconstructed from a selected single energy channel for TV-TGV, N2N(S), and N2N(P). While for Fe and Ni both N2N(P) and N2N(S) perform comparably, for Cu and Al their performance differs. The attenuation of Al is drastically lower than that of the other materials, which could lead to inconsistent predictions of the model for the projections when another material occludes the Al cylinder. This problem is not relevant for N2N(S). The Cu powder has a larger mean particle size than the other powders (comparable to the voxel size); hence, stronger spatial structures are visible in the cross-section. This structure changes randomly along the sample height. Therefore, the N2N(S) model has less information about the structure and might fail to recover it correctly.


Fig. 6. Qualitative (top) and quantitative (bottom) comparisons of the denoising methods for the neutron imaging case are presented. For the quantitative comparison, we plot the dependence of the structural similarity index (separately in the spectral and spatial domains) on the number of energy channels used. Since the number of channels was changed through binning, a lower number of channels corresponds to a lower amount of noise in the initial image. We note that, spatially, Noise2Noise provides superior denoising.


As a reference revealing the structures, we use an FBP slice averaged across all energy levels, sacrificing spectral information for spatial information. We also report the structural similarity index (SSIM) between the single-energy slices and the reference slice [41]. Both N2N approaches provide a sharper, more detailed image than TV-TGV. Interestingly, while N2N(S) provides a visually better, sharper image, this image has a lower SSIM compared to N2N(P). We hypothesize that this is caused by the unintentional reduction of streak artifacts (highlighted in the top left callout in the N2N(P) slice). Streak artifacts are very common in tomographic imaging and are caused by insufficient angular sampling [42].

We next explore the denoising quality in the spatial (Fig. 6(c)) and spectral (Fig. 6(b)) domains for increasing noise levels in the input data. We control the noise level by changing the binning: the smaller the binning step, the lower the SNR. We use the white-beam slice reconstructed with FBP and the theoretical predictions for the SSIM calculations in the spatial and spectral domains, respectively. While the iterative reconstruction provides the best results in the spectral domain, it provides the worst results in the spatial domain. The excellent TV-TGV performance heavily capitalizes on the fact that the cylinders are homogeneous inside. In terms of SSIM, N2N(P) outperforms N2N(S) because N2N(S) additionally reduces the streak artifacts caused by angular undersampling; hence, the discrepancy between the reference image and the denoised one grows.
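Both figures of merit can be computed with scikit-image's SSIM implementation; the sketch below assumes the default window settings and spectra long enough for the one-dimensional case:

import numpy as np
from skimage.metrics import structural_similarity as ssim

def spatial_ssim(denoised_slice, white_beam_slice):
    """SSIM of a single-channel slice against the white-beam FBP reference."""
    data_range = white_beam_slice.max() - white_beam_slice.min()
    return ssim(denoised_slice, white_beam_slice, data_range=data_range)

def spectral_ssim(denoised_spectrum, theoretical_spectrum):
    """SSIM of a one-dimensional voxel spectrum against the theoretical prediction."""
    data_range = theoretical_spectrum.max() - theoretical_spectrum.min()
    return ssim(denoised_spectrum, theoretical_spectrum, data_range=data_range)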

Another important observation is that N2N remains computationally feasible for a higher number of channels. The training time of the model stays almost the same, around 20 hours on average for the full volume, measured on a $4 \times$ A5000 machine. After training, the model can denoise one projection/slice at a rate of 20-30 energy channels per second. In contrast, TV-TGV reconstruction of a single slice with 339 channels takes several hours to complete, and the reconstruction time increases with the number of channels or the number of slices.

4.3 In vivo cine-radiography

The third case study considers an N2N application to cine-radiography. Cine-radiography and digital real-time radioscopy (fluoroscopy) are realizations of time-resolved X-ray imaging techniques that rely on X-ray projection imaging to study technological or biological processes. In particular, for in vivo or other dose-sensitive applications, the applicable dose and the detection efficiency of the imaging system limit the acquisition times, constraining the total observation time or the achievable SNR.

For this case study, we employed propagation-based phase contrast imaging (PB-PCI), which is particularly well suited for X-ray imaging of very weakly absorbing soft tissue in biological specimens in the sub-micron to few-$\mathrm {\mu }$m resolution range [43]. The X-ray wavefield experiences a locally varying phase shift when traversing the specimen, which turns into measurable intensity contrast as a result of free-space wavefield propagation. The object information can be reconstructed from the detected interference pattern by algorithmic treatment (so-called phase retrieval, or PR for short [44]). Here, we applied a convolution with a dedicated low-pass filter in the spatial domain. This so-called Paganin filter [45] heavily affects the noise distribution. On the one hand, it significantly reduces high-frequency noise and hence increases the peak signal-to-noise ratio (PSNR). On the other hand, low-frequency noise becomes more prominent, causing so-called “cloudy” artifacts [46]. For a single image, the effect of low-frequency noise might be less disturbing. However, in a time-resolved cine-radiographic sequence, this effect leads to highly disturbing flickering, since the position of these “clouds” changes randomly from frame to frame, which affects the interpretability of the images by experts.
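For context, a minimal sketch of the single-material Paganin filter in its standard Fourier-domain form [45]; the parameter values are placeholders and this is not necessarily the authors' exact implementation:

import numpy as np

def paganin_filter(transmission, pixel_size, dist, delta, mu):
    """Single-material Paganin phase retrieval (low-pass filter in Fourier space).

    transmission : flat-field normalized intensity I/I0 after free-space propagation
    pixel_size   : detector pixel size [m]
    dist         : propagation distance z [m]
    delta, mu    : refractive index decrement and linear attenuation coefficient [1/m]
    Returns the projected thickness map.
    """
    ny, nx = transmission.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=pixel_size)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=pixel_size)
    k2 = kx[None, :] ** 2 + ky[:, None] ** 2
    # Suppress high frequencies, then take -log to recover the projected thickness.
    filtered = np.fft.ifft2(np.fft.fft2(transmission) / (1.0 + dist * delta / mu * k2))
    return -np.log(np.clip(filtered.real, 1e-6, None)) / mu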

4.3.1 Data

In this case study, we used a batch of in vivo cine-radiographic data from a behavioral study visualizing the morphodynamics of parasitoid chalcid wasps emerging from their host eggs [47]. The full dataset contained $138$ videos, imaged with $15$ fps ($0.066$ s exposure time per frame) with lengths between $81$ and $7142$ frames per image series. The total number of frames is $263,875$.

We identified a sequence of 100 frames in which the wasp was completely still. From these, we calculated an average frame and used it as a low-noise reference image for the quantitative metric calculations. The average PSNR value before phase retrieval is $25.2$ with a standard deviation of $0.02$. After the Paganin phase retrieval, the PSNR increases to $35.9$ with a standard deviation of $1.2$.

4.3.2 Training and analysis details

Because of the sample's highly dynamic motion, we cannot use more than one frame as the model input per pass. We train the model $f_{\theta }$ by optimizing the loss

$$\mathbb{E}_{i,j} \Vert f_{\theta}(x_{i,j-1}) - x_{i,j} \Vert_1 \xrightarrow[\theta]{} \min,$$
where $x_{i,j}$ stands for frame number $j$ from image sequence number $i$. We randomly divided all frames into training and validation sets in the $80/20$ ratio according to the index $i$. In addition, we noticed that in some cases the temporal resolution was not high enough to smoothly capture fast movements, because the structure positions changed significantly between adjacent frames. We therefore introduced additional filtering to alleviate potential blur caused by large morphodynamical changes between neighboring frames: during training, we discarded image pairs whose SSIM was below a manually optimized threshold.
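This pair filtering can be sketched as a simple pre-pass over consecutive frames; the SSIM threshold shown is only a placeholder for the manually optimized value:

from skimage.metrics import structural_similarity as ssim

def select_training_pairs(frames, threshold=0.8):
    """Keep only (frame[j-1] -> frame[j]) pairs whose SSIM exceeds the threshold."""
    kept = []
    for j in range(1, len(frames)):
        data_range = frames[j].max() - frames[j].min()
        if ssim(frames[j - 1], frames[j], data_range=data_range) >= threshold:
            kept.append((j - 1, j))
    return kept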

4.3.3 Results discussion

We applied the N2N denoising once before and once after phase retrieval. Table 2 summarizes PSNR and SSIM for both cases (the average of 100 frames without motion was used as a reference for metrics calculation). Applying denoising before the phase retrieval results in significant improvement in PSNR and SSIM. The benefits are maintained even after phase retrieval.


Table 2. Quantitative comparison of the denoising done before and after phase retrieval for the chalcid wasp cine-radiography. We averaged 100 motion-free frames to use as the reference (noise-free) image for these calculations. We report mean values and 95% confidence intervals. The denoising before phase retrieval provides a slight improvement in measures both before and after phase retrieval.

To qualitatively assess the benefits of denoising performed before the phase retrieval, we show exemplary frames in Fig. 7. Note that after denoising and before phase retrieval, the complex structures of the insect leg and the interference fringes become more visible (Fig. 7(b)). We also visually compare how the noise changes between consecutive frames without (Fig. 7(c)) and with (Fig. 7(d)) denoising. We note that the noise not only becomes less sharp, without blurring the sample (Fig. 7(a)), but also changes less abruptly between consecutive frames. This makes it easier to evaluate the morphodynamics or, conversely, would allow the dose to be reduced even further. While denoising makes the images smoother, there is no drastic blur, and even relatively small details (e.g., legs or antennae) are preserved.


Fig. 7. Qualitative examination of the denoising performed for the chalcid wasp cine-radiography. We show the same cropped region before and after denoising. We also compare results before and after phase retrieval. Denoising is done only before phase retrieval. In (b) we show an enlarged view of the leg and wing of the wasp before PR (a callout from the red rectangle in (a)). In (c) and (d) we show consecutive frames of the noise without the sample, before and after denoising. All four frames are plotted with the same value range. This demonstrates that not only does the noise become less prominent, but the evolution of the cloudy noise also becomes less drastic after denoising.


5. Discussion

In this paper, we proposed and tested a way to relax the data constraints of the N2N method. We found that despite the relaxation of constraints, this method significantly improved the quality of images across different imaging modalities while avoiding over-smoothing in both the energy and time domains.

In the case of spectral X-ray tomography (Section 4.1), our method demonstrated dramatic improvement in image quality. It not only reduced noise but also helped in clearly visualizing the actual structures, even in areas where signal was seemingly imperceptible prior to denoising. It enhanced the k-edges, facilitating further material decomposition, and improved the mean Area Under Precision-Recall Curve (AUPRC) for material decomposition by more than 10%.

For neutron imaging (Section 4.2), N2N performed comparably to the TV-TGV reconstruction, providing spatially sharper images that appear noisier spectrally for low-attenuating materials. We noted a thought-provoking trade-off in this modality, where TV-TGV provided smoother results at the cost of spectral and spatial resolution loss. Additionally, N2N exhibited a remarkable ability to cope with high noise levels and efficient processing of a high number of channels, showcasing its potential for applications that require high-resolution imaging.

It is worth noting that, in the case of spectral CT, N2N can be applied both to projection images and to tomographic slices after reconstruction. Corrections in the projection domain are challenging as they might introduce or exacerbate inconsistencies between projections (a consistent sinogram must satisfy strong restrictions expressed by the Helgason–Ludwig consistency condition [48]). However, our empirical studies did not show any noticeable artifacts due to this inconsistency.

In the domain of in vivo cine-radiography (Section 4.3), applying N2N denoising before the phase retrieval significantly improved the PSNR and SSIM metrics. It not only reduced sharp noise without blurring the image but also minimized sudden changes between consecutive frames, making the assessment of morphodynamics easier or potentially allowing for further dose reduction. However, the method was not beneficial when applied after the phase retrieval, which suggests that it is still sensitive to the noise distribution.

While we recognized the method’s vulnerability to significantly dissimilar training image pairs, we have suggested potential solutions, such as discarding images based on image similarity metrics. Despite this vulnerability, the method demonstrated unforeseen benefits, such as significantly reducing the appearance of ring and undersampling artifacts in some cases, which is a topic of future research.

Overall, our results demonstrated the significant potential of the N2N method for improving image quality in multi-channel imaging, suggesting that this could be a promising direction for future research in the field. We also acknowledged the limitations and vulnerabilities of the method and proposed potential solutions, setting a clear agenda for future work to improve and refine this approach.

6. Conclusion

In this paper, we explored the applicability of the N2N method to the denoising of time- or energy-resolved radiographic image sequences and related 4D tomographic reconstructions. N2N is a distribution-agnostic method: it does not explicitly assume any particular noise or signal properties. The only requirement originally proposed was the ability to sample pairs of images that share a common signal but have independent and identically distributed noise. In this paper, we have demonstrated that this requirement, while not exactly met by multi-channel data, can be relaxed to successfully apply the method.

The presented case studies showed that this method offers a robust and efficient alternative to conventional denoising methods and regularized iterative reconstruction methods. The N2N method does not require fine-tuning of parameters or handcrafted regularization terms for any new dataset. Therefore, its application can be heavily automated. Finally, the N2N method relies on a rather intuitive assumption, hence it can be easily explained to non-experts in the ML domain and smoothly introduced into their measurement practice.

Funding

Bundesministerium für Bildung und Forschung (HIGH-LIFE, SMART-MORPH).

Acknowledgments

This work made use of computational support by CoSeC, the Computational Science Centre for Research Communities, through the Collaborative Computational Project in Tomographic Imaging (CCPi).

Disclosures

The authors declare no conflicts of interest.

Data availability

The neutron tomography dataset is openly available at [49]. The code to reproduce the dataset, training, and evaluation is openly available at [50].

References

1. J. Moosmann, A. Ershov, V. Altapova, T. Baumbach, M. S. Prasad, C. LaBonne, X. Xiao, J. Kashef, and R. Hofmann, “X-ray phase-contrast in vivo microtomography probes new aspects of Xenopus gastrulation,” Nature 497(7449), 374–377 (2013). [CrossRef]  

2. R. Warr, E. Ametova, R. J. Cernik, G. Fardell, S. Handschuh, J. S. Jørgensen, E. Papoutsellis, E. Pasca, and P. J. Withers, “Enhanced hyperspectral tomography for bioimaging by spatiospectral reconstruction,” Sci. Rep. 11(1), 20818 (2021). [CrossRef]  

3. A. E. Ilesanmi and T. O. Ilesanmi, “Methods for image denoising using convolutional neural network: a review,” Complex Intell. Syst. 7(5), 2179–2198 (2021). [CrossRef]  

4. J. Lehtinen, J. Munkberg, J. Hasselgren, S. Laine, T. Karras, M. Aittala, and T. Aila, “Noise2Noise: Learning image restoration without clean data,” 35th International Conference on Machine Learning, ICML 2018, 7, 4620–4631 (2018).

5. M. Prakash, M. Delbracio, P. Milanfar, and F. Jug, “Interpretable Unsupervised Diversity Denoising and Artefact Removal,” International Conference on Learning Representations (2022).

6. R. C. Gonzalez and R. E. Woods, Digital Image Processing (Prentice Hall, 2008).

7. A. Buades, B. Coll, and J. M. Morel, “A non-local algorithm for image denoising,” in Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. II (IEEE, 2005), pp. 60–65.

8. L. Fan, F. Zhang, H. Fan, and C. Zhang, “Brief review of image denoising techniques,” Vis. Comput. Ind. Biomed. Art 2(1), 7 (2019). [CrossRef]  

9. S. Gu and R. Timofte, “A brief review of image denoising algorithms and beyond,” Inpainting and Denoising Challenges pp. 1–21 (2019).

10. P. Rodríguez, “Total variation regularization algorithms for images corrupted with different noise models: a review,” J. Electr. Comput. Eng. 2013, 1–18 (2013). [CrossRef]  

11. J. Batson and L. Royer, “Noise2Self: Blind denoising by self-supervision,” 36th International Conference on Machine Learning, ICML 2019, 2019-June, 826–835 (2019).

12. A. Krull, T. O. Buchholz, and F. Jug, “Noise2Void - Learning denoising from single noisy images,” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June, 2124–2132 (2019).

13. M. Papkov, K. Roberts, L. A. Madissoon, J. Shilts, O. Bayraktar, D. Fishman, K. Palo, and L. Parts, “Noise2Stack: Improving Image Restoration by Learning from Volumetric Data,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12964 LNCS, 99–108 (2021).

14. P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” in Proceedings of the 25th International Conference on Machine Learning, (ACM Press, New York, New York, USA, 2008), pp. 1096–1103.

15. E. Dalsasso, L. Denis, and F. Tupin, “As if by Magic: Self-Supervised Training of Deep Despeckling Networks with MERLIN,” IEEE Trans. Geosci. Remote Sensing 60, 1–13 (2022). [CrossRef]  

16. P. Iakubovskii, “Segmentation Models Pytorch,” (2019).

17. G. Davis, N. Jain, and J. Elliott, “A modelling approach to beam hardening correction,” in Developments in X-ray Tomography VI, vol. 7078 (SPIE, 2008), pp. 423–432.

18. C. Egan, S. Jacques, M. Wilson, M. Veale, P. Seller, A. Beale, R. Pattrick, P. Withers, and R. Cernik, “3d chemical imaging in the laboratory by hyperspectral x-ray computed tomography,” Sci. Rep. 5(1), 15979 (2015). [CrossRef]  

19. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, A. Müller, J. Nothman, G. Louppe, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research 12, 2825–2830 (2011).

20. M. Getzin, J. J. Garfield, D. S. Rundle, U. Kruger, A. P. Butler, M. Gkikas, and G. Wang, “Increased separability of k-edge nanoparticles by photon-counting detectors for spectral micro-ct,” J. X-Ray Sci. Technol. 26(5), 707–726 (2018). [CrossRef]  

21. J. Tuszynski, “PhotonAttenuation: software for modeling of photons passing through different materials,” (2006).

22. J. Punnoose, J. Xu, A. Sisniega, W. Zbijewski, and J. Siewerdsen, “Spektr 3.0—a computational tool for x-ray spectrum modeling and analysis,” Med. Phys. 43(8Part1), 4711–4717 (2016). [CrossRef]  

23. J. S. Jørgensen, E. Ametova, G. Burca, G. Fardell, E. Papoutsellis, E. Pasca, K. Thielemans, M. Turner, R. Warr, W. R. Lionheart, and P. J. Withers, “Core Imaging Library – Part I: a versatile Python framework for tomographic imaging,” Phil. Trans. R. Soc. A 379(2204), 20200192 (2021). [CrossRef]  

24. E. Ametova, G. Burca, S. Chilingaryan, G. Fardell, J. S. Jørgensen, E. Papoutsellis, E. Pasca, R. Warr, M. Turner, W. R. Lionheart, and P. J. Withers, “Crystalline phase discriminating neutron tomography using advanced reconstruction methods,” J. Phys. D: Appl. Phys. 54(32), 325502 (2021). [CrossRef]  

25. DOE Fundamentals Handbook, “Nuclear Physics and Reactor Theory,” Technical Report (1993).

26. J. R. Santisteban, L. Edwards, A. Steuwer, and P. J. Withers, “Time-of-flight neutron transmission diffraction,” J. Appl. Crystallogr. 34(3), 289–297 (2001). [CrossRef]  

27. W. Kockelmann, G. Frei, E. H. Lehmann, P. Vontobel, and J. R. Santisteban, “Energy-selective neutron transmission imaging at a pulsed source,” Nucl. Instrum. Methods Phys. Res., Sect. A 578(2), 421–434 (2007). [CrossRef]  

28. J. R. Santisteban, L. Edwards, M. E. Fizpatrick, A. Steuwer, and P. J. Withers, “Engineering applications of Bragg-edge neutron transmission,” Appl. Phys. A 74(0), s1433–s1436 (2002). [CrossRef]  

29. M. Strobl, I. Manke, N. Kardjilov, A. Hilger, M. Dawson, and J. Banhart, “Advances in neutron radiography and tomography,” J. Phys. D: Appl. Phys. 42(24), 243001 (2009). [CrossRef]  

30. P. M. Bentley, “Instrument suite cost optimisation in a science megaproject,” J. Phys. Commun. 4(4), 045014 (2020). [CrossRef]  

31. J. Jørgensen, E. Ametova, G. Burca, G. Fardell, E. Papoutsellis, E. Pasca, A. Liptak, D. Kazantsev, W. Lionheart, and M. Turner, “Neutron TOF imaging phantom data to quantify hyperspectral reconstruction algorithms,” STFC ISIS Neutron and Muon Source (2019).

32. G. Burca, W. Kockelmann, J. James, and M. E. Fitzpatrick, “Modelling of an imaging beamline at the ISIS pulsed neutron source,” J. Instrum. 8(10), P10001 (2013). [CrossRef]  

33. W. Kockelmann, T. Minniti, D. E. Pooley, et al., “Time-of-flight neutron imaging on IMAT@ISIS: A new user facility for materials science,” J. Imaging 4(3), 47 (2018). [CrossRef]  

34. A. S. Tremsin, J. V. Vallerga, J. B. McPhate, O. H. Siegmund, and R. Raffanti, “High resolution photon counting with mcp-timepix quad parallel readout operating at > 1 khz frame rates,” IEEE Trans. Nucl. Sci. 60(2), 578–585 (2013). [CrossRef]  

35. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Phys. D 60(1-4), 259–268 (1992). [CrossRef]  

36. E. Y. Sidky, C.-M. Kao, and X. Pan, “Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT,” J. X-Ray Sci. Technol. 14, 119–139 (2006).

37. M. Boin, “NXS: a program library for neutron cross section calculations,” J. Appl. Crystallogr. 45(3), 603–607 (2012). [CrossRef]  

38. K. Bredies, K. Kunisch, and T. Pock, “Total generalized variation,” SIAM J. Imaging Sci. 3(3), 492–526 (2010). [CrossRef]  

39. E. Papoutsellis, E. Ametova, C. Delplancke, G. Fardell, J. S. Jørgensen, E. Pasca, M. Turner, R. Warr, W. R. B. Lionheart, and P. J. Withers, “Core Imaging Library - Part II: multichannel reconstruction for dynamic and spectral tomography,” Phil. Trans. R. Soc. A. 379(2204), 20200193 (2021). [CrossRef]  

40. E. Ametova, G. Burca, S. Chilingaryan, G. Fardell, J. S. Jørgensen, E. Papoutsellis, E. Pasca, R. Warr, M. Turner, W. R. B. Lionheart, and P. J. Withers, “Code to reproduce results of “Crystalline phase discriminating neutron tomography using advanced reconstruction methods”,” (2021).

41. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. on Image Process. 13(4), 600–612 (2004). [CrossRef]  

42. A. C. Kak and M. Slaney, Principles of computerized tomographic imaging (SIAM, 2001).

43. R. Fitzgerald, “Phase-Sensitive X-Ray Imaging,” Phys. Today 53(7), 23–26 (2000). [CrossRef]  

44. L. M. Lohse, A. L. Robisch, M. Töpperwien, S. Maretzke, M. Krenkel, J. Hagemann, and T. Salditt, “A phase-retrieval toolbox for X-ray holography and tomography,” J. Synchrotron Radiat. 27(3), 852–859 (2020). [CrossRef]  

45. D. Paganin, S. C. Mayo, T. E. Gureyev, P. R. Miller, and S. W. Wilkins, “Simultaneous phase and amplitude extraction from a single defocused image of a homogeneous object,” J. Microsc. 206(1), 33–40 (2002). [CrossRef]  

46. D. Paganin, A. Barty, P. McMahon, and K. A. Nugent, “Quantitative phase-amplitude microscopy. iii. the effects of noise,” J. Microsc. 214(1), 51–61 (2004). [CrossRef]  

47. R. Spiecker, P. Pfeiffer, A. Biswal, et al., “Bragg magnifier based dose-efficient in vivo X-ray imaging at micrometer resolution,” submitted for publication (2023).

48. S. Helgason, “The radon transform on euclidean spaces, compact two-point homogeneous spaces and grassmann manifolds,” Acta Math. 113(0), 153–180 (1965). [CrossRef]  

49. J. Jorgensen, “The neutron tomography dataset,” ISIS Neutron and Muon Source Data Journal (2022), https://doi.org/10.5286/ISIS.E.RB1820541.

50. Y. Zharov, “The code to reproduce the dataset, training, and evaluation,” GitHub (2023) https://github.com/DL4XRayTomoImaging-KIT/training-repo/tree/noise2noise.
