Novel model for infrared and visible image fusion based on ℓ2 norm


Abstract

Since the infrared (IR) image and the visible (VI) image reflect different contents of the same scene, it is not appropriate to fuse them with the same representation and similar features. Gradient transfer fusion (GTF) based on the ${\ell _1}$ norm addresses this issue well. This paper demonstrates that the ${\ell _2}$ norm can also address it well, based on our proposed fusion model. We formulate the fusion task as an ${\ell _2}$ norm optimization problem, in which the first term, measured by the ${\ell _2}$ norm, constrains the fused image to have pixel intensities similar to those of the IR image, and the second term, also computed with the ${\ell _2}$ norm, forces the fused image to have a gradient distribution similar to that of the VI image. As the fused image obtained by directly optimizing the ${\ell _2}$ norm is smooth, we introduce two weights into our objective function to address this issue, inspired by the weighted least squares filtering (WLSF) framework. Different from ${\ell _1}$ norm-based methods such as GTF, our method yields an explicit mathematical mapping from the source images to the fusion result because the ${\ell _2}$ norm is differentiable, which is both effective and efficient. This explicit mapping not only distinguishes our method from current fusion methods but also makes its computational cost lower than that of most fusion methods. Experimental results demonstrate that our method outperforms GTF and most state-of-the-art fusion methods in terms of visual quality and evaluation metrics; our fusion results look like IR images with abundant VI appearance information.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Many areas, such as remote sensing, the modern military, and medical imaging, have benefited from multi-sensor data fusion technology. Infrared (IR) images are captured by IR sensors that are sensitive to thermal radiation; they therefore clearly present the thermal radiation distribution of the surveyed region, but they also suffer from low spatial resolution and a lack of detail. In contrast, owing to the reflected-light capture mechanism of visible (VI) sensors, VI images usually have clear backgrounds and details. However, targets are indistinguishable in VI images when the surveyed region is under low-light conditions or the targets themselves are not visible. IR and VI image fusion integrates the complementary information from the source images into a single informative image, which facilitates subsequent applications [1].

A large number of fusion methods have been proposed in the IR and VI image fusion field, and we classify them into three categories: multi-scale transform (MST)-based methods [2–17], deep learning (DL)-based methods [18–26], and other methods [27–33]. Although MST-based fusion methods usually achieve satisfactory results in most cases because the multi-scale processing mechanism is basically consistent with human visual perception [34], their fused results for IR and VI images are not satisfactory because these two images are manifestations of two different phenomena [27]. Specifically, the thermal radiation in IR images is mainly characterized by pixel intensities, while the appearance in VI images is mainly reflected by gradients. To solve this problem, Ma et al. proposed a novel fusion method, termed gradient transfer fusion (GTF) [27]. Their method formulates the IR and VI image fusion task as a minimization problem, aiming to simultaneously preserve the thermal radiation information in IR images and the gradient information in VI images. Indeed, GTF exhibits an obvious advantage over several state-of-the-art fusion methods in terms of keeping highlighted IR targets and large-scale VI details. However, small-scale details are lost in its fusion results, which is attributed to two reasons: one is that the ${\ell _1}$ norm is employed to encourage sparseness of the gradients, and the other is that GTF ignores the pixel intensities in VI images. Afterwards, Ma et al. also proposed a fusion method using a generative adversarial network, named FusionGAN [19]. In FusionGAN, the generator composites an image with IR thermal radiation information together with VI gradient information, and the discriminator pushes the generated image to contain more VI details. Owing to the discriminator, the fusion results obtained by FusionGAN tend to have more details than those obtained by GTF. However, as adversarial training is uncertain and unstable, detail loss still exists in the fusion results of FusionGAN. In addition, the fusion results of FusionGAN tend to be smooth and fuzzy, especially at the boundaries of targets, which is caused by optimizing the ${\ell _2}$ norm.

To achieve better fusion results than GTF and FusionGAN, we propose a novel fusion model based on the ${\ell _2}$ norm, in which the first term of our objective function constrains the fused image and the IR image to have similar pixel intensities, and the second term forces the fused image and the VI image to have a similar gradient distribution. Note that both terms in our objective function are computed using the ${\ell _2}$ norm rather than the ${\ell _1}$ norm used in GTF. To address the smoothing problem caused by optimizing the ${\ell _2}$ norm, we introduce two weights into our objective function to make our fusion results clearer, especially the boundaries of the targets in IR images, which is inspired by the weighted least squares filtering (WLSF) framework [36].

To demonstrate the advantages of our proposed method visually, a fusion example is shown in Fig. 1. The IR image and the VI image are shown in Figs. 1(a) and 1(b), respectively. The bunker is highlighted in the IR image, and the background and details are clear in the VI image. Figure 1(c) shows the fusion result obtained by a recently proposed method based on a traditional strategy, named ADF [35]. Figures 1(d)–1(f) show the fusion results obtained by GTF, FusionGAN, and our method, respectively. From Fig. 1, we can see that ADF mainly focuses on extracting detailed information from the source images and ignores the thermal radiation information in the IR image, making its fusion result unsuitable for target detection and localization. GTF and FusionGAN retain the thermal radiation information in the IR image well. Compared with GTF, the main advantage of our method is that it retains more VI details. Compared with FusionGAN, our method retains more VI details, and the target boundaries in our fusion result are clearer.

Fig. 1. (a) and (b) are the IR image and the VI image, respectively. (c)–(f) are the fusion results obtained by ADF [35], GTF [27], FusionGAN [19], and our method.

The main contributions of this paper are summarized as follows. A novel fusion model for IR and VI image fusion based on the ${\ell _2}$ norm is proposed. Different from ${\ell _1}$ norm-based fusion models, our fusion model can be easily solved, and the direct mapping from the source images to the fused image can be obtained because the ${\ell _2}$ norm is differentiable. To the best of our knowledge, methods that provide an explicit mathematical mapping between the source images and the fusion result have not been studied yet. In addition, our method has higher computational efficiency than most fusion methods because the fused image can be computed directly from this mapping. Experimental results demonstrate that our method shows obvious advantages over several state-of-the-art methods in achieving fusion results with highlighted IR targets and abundant VI details.

The remaining parts of this paper are outlined as follows. Related work is presented in Section 2. Section 3 elaborates on the proposed fusion method. Experiments are conducted in Section 4. We give some conclusions in Section 5.

2. Related work

Current IR and VI image fusion methods can be divided into three categories: multi-scale transform (MST)-based methods [2–17], deep learning (DL)-based methods [18–26], and other methods [27–33].

MST-based methods mainly consist of three steps: decomposition, fusion, and reconstruction. To obtain perceptually better fusion results than conventional MST-based methods, Zhou et al. proposed a novel hybrid multi-scale decomposition-based method [9]. As a new field in machine learning research, DL has been introduced into image fusion because of its strong feature extraction capability. For example, Ma et al. proposed a fusion method based on a generative adversarial network (GAN), named FusionGAN [19]. DL-based methods are likely to yield better fusion results in the near future, but they require powerful hardware because of their computational complexity. In addition, for IR and VI image fusion, training data are limited, and it is hard to define the ground truth of the fusion results. Other fusion methods for IR and VI images are mainly based on total variation [27], latent low-rank representation [32], and so on. These methods address IR and VI image fusion from different novel perspectives. For example, Ma et al. proposed a fusion method, termed gradient transfer fusion (GTF) [27], which formulates the fusion task as a minimization problem. As image registration is a prerequisite for image fusion, many image registration techniques [37–40] have been proposed to ensure fusion quality. More details about IR and VI fusion methods and image registration techniques can be found in the recently published survey [1].

Motivated by GTF and the differentiability of the ${\ell _2}$ norm, we propose a novel fusion model based on the ${\ell _2}$ norm. Compared with ${\ell _1}$ norm-based fusion methods such as GTF, our ${\ell _2}$ norm-based method has two advantages. One is that, since the ${\ell _2}$ norm is differentiable, our fusion model can be easily solved and the direct mapping from the source images to the fused image can be obtained. The other is that our method achieves a better balance between the first term and the second term, and thus retains more texture information from the VI image while preserving almost the same thermal radiation information from the IR image. Specifically, GTF uses only a fixed balance parameter $\lambda$ to balance the two terms in its objective function; in our fusion model, in addition to $\lambda$, we introduce two weights to further balance the two terms, based on the consideration that significant gradient information in the IR image should be retained in the fused image while the remaining gradient information should be transferred from the visible image. The role of the weights in our objective function is analyzed in Section 3.3 and experimentally validated in Section 4.2. Note that without these weights, the fusion performance of our method is inferior to that of GTF, mainly because in that case the ${\ell _1}$ norm is a more reasonable choice than the ${\ell _2}$ norm, as was also explained in the GTF work [27].

3. Methodology

In this section, we first present our fusion model based on the ${\ell _2}$ norm and its solution, then summarize our fusion method, and finally analyze the role of the weights used in our fusion model.

3.1 Fusion model and solution

3.1.1 Fusion model

Given a pair of IR and VI images, our fusion goal is to retain both the IR thermal radiation information and the VI appearance information. For convenience, we assume that the IR image, the VI image, and the fused image are all grayscale images of size $m \times n$, and their column-vector forms are denoted by ${\boldsymbol u}, \, {\boldsymbol v}$, and ${\boldsymbol x} \in {{\boldsymbol R}^{mn \times 1}}$, respectively.

The thermal radiation information in the IR image is mainly represented by pixel intensities. As our goal is to preserve the IR thermal radiation information, the following objective function should be as small as possible:

$$\left\Vert{{\boldsymbol x} - {\boldsymbol u}} \right\Vert_2^2,$$
where ${\left\Vert\cdot \right\Vert_2}$ denotes the ${\ell _2}$ norm. The appearance information in the VI image is mainly characterized by its gradients. As our goal is to preserve the VI appearance information, a straightforward choice is to make the following objective function as small as possible:
$$\left\Vert{\sqrt {{{({{\nabla_x}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2} + {{({{\nabla_y}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2}} } \right\Vert_2^2,$$
where ${\nabla _x}$ and ${\nabla _y}$ denote the x-direction and y-direction gradient operators, respectively. Directly combining Eq. (1) and Eq. (2), we formulate the fusion task as the following minimization problem:
$$\mathop {\arg \min }\limits_{\boldsymbol x} \left( {\left\Vert{{\boldsymbol x} - {\boldsymbol u}} \right\Vert_2^2 + \lambda \left\Vert{\sqrt {{{({{\nabla_x}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2} + {{({{\nabla_y}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2}} } \right\Vert_2^2} \right),$$
where $\lambda$ is a trade-off parameter. As Eq. (3) is an ${\ell _2}$ norm optimization problem, the fused image ${\boldsymbol x}$ obtained by directly optimizing it tends to be smooth; in particular, the boundaries of the target are blurry. Inspired by the weighted least squares filtering (WLSF) framework [36], we address this problem by adding two weights ${a_x}({\boldsymbol u} )$ and ${a_y}({\boldsymbol u} )$ to the second term, and we finally formulate the fusion task as follows:
$$\mathop {\arg \min }\limits_{\boldsymbol x} \left( {\left\Vert{{\boldsymbol x} - {\boldsymbol u}} \right\Vert_2^2 + \lambda \left\Vert{\sqrt {{a_x}({\boldsymbol u} )\cdot {{({{\nabla_x}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2} + {a_y}({\boldsymbol u} )\cdot {{({{\nabla_y}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2}} } \right\Vert_2^2} \right),$$
where ${\cdot}$ denotes element-wise multiplication, and ${a_x}({\boldsymbol u} )$ and ${a_y}({\boldsymbol u} )$ are defined as follows:
$${a_x}({\boldsymbol u} ) = {\left( {{{\left|{\frac{{\partial l}}{{\partial x}}} \right|}^\alpha } + \varepsilon } \right)^{ - 1}},\;\;\;{a_y}({\boldsymbol u} ) = {\left( {{{\left|{\frac{{\partial l}}{{\partial y}}} \right|}^\alpha } + \varepsilon } \right)^{ - 1}},$$
where $l = \log {\boldsymbol u}$ with $\log$ denoting the natural logarithm, and $\alpha$ and $\varepsilon$ are two parameters, set to 1.2 and 0.0001, respectively. In the experimental section, we demonstrate the effectiveness of the two weights ${a_x}({\boldsymbol u} )$ and ${a_y}({\boldsymbol u} )$.
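For concreteness, the following is a minimal sketch of the weight computation in Eq. (5), assuming NumPy, forward differences of the log-image, and the stated values $\alpha = 1.2$ and $\varepsilon = 0.0001$; the small offset added before taking the logarithm and the function name are illustrative choices, not taken from the original implementation.

import numpy as np

def edge_aware_weights(u, alpha=1.2, eps=1e-4):
    # Weights a_x(u) and a_y(u) of Eq. (5): small at strong IR edges, large in flat regions.
    l = np.log(u.astype(np.float64) + 1e-6)        # log-image; small offset avoids log(0)
    dl_dx = np.diff(l, axis=1, append=l[:, -1:])   # forward difference along x (columns)
    dl_dy = np.diff(l, axis=0, append=l[-1:, :])   # forward difference along y (rows)
    a_x = 1.0 / (np.abs(dl_dx) ** alpha + eps)
    a_y = 1.0 / (np.abs(dl_dy) ** alpha + eps)
    return a_x, a_y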

3.1.2 Model solution

Let $F({\boldsymbol x} ) = \left\Vert{{\boldsymbol x} - {\boldsymbol u}} \right\Vert_2^2 + \lambda \left\Vert{\sqrt {{a_x}({\boldsymbol u} )\cdot {{({{\nabla_x}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2} + {a_y}({\boldsymbol u} )\cdot {{({{\nabla_y}({{\boldsymbol x} - {\boldsymbol v}} )} )}^2}} } \right\Vert_2^2$. Setting $\frac{{\partial F({\boldsymbol x} )}}{{\partial {\boldsymbol x}}} = 0$ yields the closed-form solution of Eq. (4):

$$\;{\boldsymbol x} = {({{\boldsymbol I} + \lambda {\boldsymbol L}} )^{ - 1}}({{\boldsymbol u} + \lambda {\boldsymbol Lv}} ),$$
where ${\boldsymbol I}$ is the identity matrix and ${\boldsymbol L} = {\boldsymbol D}_x^\textrm{T}{{\boldsymbol A}_x}{\boldsymbol D}_x + {\boldsymbol D}_y^\textrm{T}{{\boldsymbol A}_y}{\boldsymbol D}_y$. ${\boldsymbol D}_x$ and ${\boldsymbol D}_y$ are the forward difference operators, and ${\boldsymbol D}_x^\textrm{T}$ and ${\boldsymbol D}_y^\textrm{T}$ are the corresponding backward difference operators. ${{\boldsymbol A}_x}$ and ${{\boldsymbol A}_y}$ are diagonal matrices containing the weights ${a_x}({\boldsymbol u} )$ and ${a_y}({\boldsymbol u} )$, respectively. Equation (6) is the explicit mathematical mapping from the source images to the fused image, and it can be evaluated directly and efficiently.
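As a sketch of how Eq. (6) can be evaluated in practice, the following builds ${\boldsymbol L}$ from sparse difference operators and solves the linear system with SciPy; the row-major flattening, the boundary handling of the difference operators, and the function name are illustrative assumptions rather than the authors' implementation.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def fuse_l2(u, v, a_x, a_y, lam=0.2):
    # Evaluate Eq. (6): x = (I + lam*L)^(-1) (u + lam*L*v),
    # with L = Dx^T Ax Dx + Dy^T Ay Dy built from forward differences.
    m, n = u.shape
    N = m * n

    # Forward-difference operator along x on the row-major flattened image
    # (rows belonging to the last image column are zeroed).
    dx_main = -np.ones(N); dx_main[np.arange(n - 1, N, n)] = 0.0
    dx_sup = np.ones(N - 1); dx_sup[np.arange(n - 1, N - 1, n)] = 0.0
    Dx = sp.diags([dx_main, dx_sup], [0, 1], format="csr")

    # Forward-difference operator along y (rows of the last image row are zeroed).
    dy_main = -np.ones(N); dy_main[N - n:] = 0.0
    Dy = sp.diags([dy_main, np.ones(N - n)], [0, n], format="csr")

    Ax = sp.diags(a_x.ravel())                  # diagonal weight matrices from Eq. (5)
    Ay = sp.diags(a_y.ravel())
    L = Dx.T @ Ax @ Dx + Dy.T @ Ay @ Dy

    uf = u.ravel().astype(np.float64)
    vf = v.ravel().astype(np.float64)
    x = spsolve((sp.identity(N, format="csr") + lam * L).tocsc(), uf + lam * (L @ vf))
    return x.reshape(m, n)

Since ${\boldsymbol I} + \lambda{\boldsymbol L}$ is symmetric positive definite for $\lambda > 0$ and positive weights, the system could equally be solved with a Cholesky factorization or conjugate gradients.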

3.2 Procedure of our fusion method

Our fusion method based on ${\ell _2}$ norm is summarized in Algorithm 1.

[Algorithm 1: the proposed ${\ell _2}$ norm-based fusion method]
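To make the listed steps concrete, here is a short end-to-end usage sketch following the procedure described above, reusing the edge_aware_weights and fuse_l2 sketches from Sections 3.1.1 and 3.1.2; the file names, the imageio-based I/O, and the final clipping to [0, 255] are illustrative assumptions.

import numpy as np
import imageio.v3 as iio

u = iio.imread("ir.png").astype(np.float64)    # IR image (grayscale, registered)
v = iio.imread("vi.png").astype(np.float64)    # VI image (grayscale, registered)

a_x, a_y = edge_aware_weights(u, alpha=1.2, eps=1e-4)   # weights of Eq. (5)
x = fuse_l2(u, v, a_x, a_y, lam=0.2)                    # closed-form fusion of Eq. (6)

iio.imwrite("fused.png", np.clip(x, 0, 255).astype(np.uint8))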

3.3 Weights analysis

As stated in Section 3.1.1, the weights ${a_x}({\boldsymbol u} )$ and ${a_y}({\boldsymbol u} )$ are introduced to address the smoothing problem of the fused image ${\boldsymbol x}$; in this subsection we briefly explain why. When a pixel $p$ in the IR image ${\boldsymbol u}$ lies on a significant edge, especially a target boundary, the weights computed by Eq. (5) tend to be small. The gradient penalty at $p$ is then weak, so minimizing Eq. (4) is dominated by its first term, and the significant edge at $p$ in the IR image ${\boldsymbol u}$ is well preserved in the fused image ${\boldsymbol x}$. When a pixel $p$ in the IR image ${\boldsymbol u}$ lies in a smooth region, the weights computed by Eq. (5) tend to be large. Minimizing Eq. (4) then forces the second term to be as small as possible, so the gradient at $p$ in the fused image ${\boldsymbol x}$ becomes as similar as possible to that in the VI image ${\boldsymbol v}$.
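A small numeric illustration of this behavior (the gradient magnitudes below are made-up example values, with $\alpha = 1.2$ and $\varepsilon = 0.0001$ as in Eq. (5)):

import numpy as np

alpha, eps = 1.2, 1e-4
weight = lambda g: 1.0 / (abs(g) ** alpha + eps)   # Eq. (5) for a log-gradient magnitude g

# Strong IR edge, e.g. an intensity jump from 50 to 200: log-gradient = log(4) ~ 1.39
print(weight(np.log(200 / 50)))   # ~0.68  -> gradient penalty relaxed, x follows u locally
# Nearly flat IR region, e.g. a log-gradient of 0.01
print(weight(0.01))               # ~245   -> gradient penalty dominant, grad x follows grad v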

4. Experiments

In this section, we first describe the experimental settings, then validate the effectiveness of the weights used in our fusion model, and finally compare our method with several state-of-the-art fusion methods on eighteen widely used IR and VI image pairs.

4.1 Experimental settings

4.1.1 Dataset

Eighteen image pairs that are popular in the IR and VI fusion domain, shown in Fig. 2, are adopted in our experiments. Most of them were downloaded from the TNO dataset [41], and the remaining three pairs, Figs. 2(i), 2(l), and 2(m), were provided by Zhang, the first author of [28].

Fig. 2. The source images used in our experiments.

4.1.2 Compared methods

Eight IR and VI image fusion methods are compared with our method: wavelet (WAV) [42], fourth-order partial differential equation (FPDE) [43], anisotropic diffusion fusion (ADF) [35], convolutional sparse representation (CSR) [44], latent low-rank representation (LATLRR) [32], deep learning framework (DLF) [25], GTF [27], and FusionGAN [19]. WAV is a traditional MST-based fusion method, and the remaining seven methods were proposed within the last three years. Their codes are publicly available, and all of their parameters are kept unchanged. The trade-off parameter $\lambda$ in our method is set to 0.2.

4.1.3 Evaluation metrics

As there is no evidence that any single image fusion evaluation metric is better than the others, six evaluation metrics are selected to comprehensively evaluate the different methods: entropy (EN), standard deviation (SD), spatial frequency (SF), local mutual information (LMI) [45], structural similarity index measure (SSIM) [46], and visual information fidelity (VIF) [47]. EN measures the information contained in an image. SD mainly measures the contrast of an image. SF measures the detail and texture of an image. LMI measures the local information transferred from the source images. SSIM measures the structural similarity between the source images and the fused image. VIF is a human perception-based metric. For all metrics, a larger value indicates better fusion performance. The codes for all the metrics are publicly available, and we keep all of their parameters unchanged.
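For reference, minimal sketches of the three simplest metrics (EN, SD, and SF) under their commonly used definitions are given below; LMI, SSIM, and VIF should be computed with the reference implementations of [45–47], and the snippets here are illustrative, not the exact code used in our experiments.

import numpy as np

def entropy(img):
    # Shannon entropy (EN) of an 8-bit grayscale image.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def standard_deviation(img):
    # Standard deviation (SD), a simple contrast measure.
    return float(np.std(img.astype(np.float64)))

def spatial_frequency(img):
    # Spatial frequency (SF): root of mean squared horizontal and vertical first differences.
    img = img.astype(np.float64)
    rf2 = np.mean(np.diff(img, axis=1) ** 2)   # squared row frequency
    cf2 = np.mean(np.diff(img, axis=0) ** 2)   # squared column frequency
    return float(np.sqrt(rf2 + cf2))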

4.2 Validation of weights

In this subsection, we verify the effectiveness of the weights introduced into Eq. (4) to address the smoothing problem of the fused image. Figures 3(a) and 3(b) show the IR image and the VI image, respectively; the fusion results obtained by our method without weights are shown in Figs. 3(c)–3(e), and the fusion results obtained by our method with weights are shown in Figs. 3(f)–3(h). From Figs. 3(c)–3(e), we can see that as $\lambda$ increases, although more and more appearance information appears in the fusion results obtained without weights, the boundaries of the bunker become increasingly blurred. In contrast, in Figs. 3(f)–3(h), as $\lambda$ increases and more appearance information is retained, the boundaries of the bunker remain clear. This demonstrates that the smoothing problem can be effectively addressed by introducing the weights into Eq. (4).

Fig. 3. (a) and (b) are the IR image and the VI image, respectively. (c)–(e) are the fusion results obtained by our method without weights. (f)–(h) are the fusion results obtained by our method with weights.

4.3 Validation of fusion performance

4.3.1 Visual quality

Figures 4–7 show the fusion results of all methods on four pairs of IR and VI images, i.e., “Bunker”, “Lake”, “Kaptein_1654”, and “Kaptein_1123”, respectively. From Figs. 4–7, we can see that all the compared methods except GTF and FusionGAN fail to keep the thermal radiation information in the IR images well, and the targets in their fusion results, such as the bunker in Fig. 4, the lake in Fig. 5, and the humans in Figs. 6 and 7, are not highlighted. Although the thermal radiation information in the fusion results obtained by GTF and FusionGAN is kept well, their fusion results contain fewer details than ours. In addition, our method also performs better than FusionGAN in retaining edges (see the parts labeled by red rectangles in Fig. 4). In summary, our fusion results look like IR images with abundant appearance details. Furthermore, the fusion results obtained by our method on all the image pairs are shown in Fig. 8.

Fig. 4. (a) and (b) are the IR image and the VI image, respectively. (c)–(k) are the fusion results obtained by different fusion methods on the “Bunker” image pair.

Fig. 5. (a) and (b) are the IR image and the VI image, respectively. (c)–(k) are the fusion results obtained by different fusion methods on the “Lake” image pair.

Fig. 6. (a) and (b) are the IR image and the VI image, respectively. (c)–(k) are the fusion results obtained by different fusion methods on the “Kaptein_1654” image pair.

Fig. 7. (a) and (b) are the IR image and the VI image, respectively. (c)–(k) are the fusion results obtained by different fusion methods on the “Kaptein_1123” image pair.

Fig. 8. Fusion results obtained by our method on all the image pairs.

4.3.2 Fusion metrics

Figure 9 shows the quantitative results of the six metrics on all the image pairs, and the values in each legend are the average metric values of the fusion methods. It can be seen from Fig. 9 that our method achieves the best values on most image pairs for the first four metrics (EN, SD, SF, and LMI), and its average value is the best for each of them, which demonstrates that our fusion results are more informative and their details are clearer. Because all three methods (GTF, FusionGAN, and ours) tend to produce fused images that look like IR images with abundant VI texture information, they tend to lose some information from the IR and VI images, especially the luminance information in the VI images. SSIM and VIF measure the structural similarity and the visual information fidelity between the source images and the fused image, respectively, so these two metric values of the three methods may not be high, as can be seen from the quantitative results in Fig. 9. From Fig. 9, we can also see that, compared with GTF, our method has an obvious advantage in terms of VIF. This is because our fusion results contain more textures than those obtained by GTF, leading to better visual performance.

Fig. 9. Quantitative results of the six metrics on all the image pairs; the values in each legend are the average metric value of each fusion method.

Table 1 shows the average running time of each method on all the image pairs. This experiment is conducted on a desktop computer with a 3.60 GHz CPU and 8.00 GB of memory, running MATLAB 2016a on Windows 10. The running time of FusionGAN is not listed in Table 1 because all the other methods are implemented in MATLAB while FusionGAN is implemented in Python. The results in Table 1 show that our method has good computational efficiency, second only to WAV and ADF.

Table 1. The average running time of each method on all the image pairs (unit: second)

5. Conclusions

In this paper, we formulate the infrared (IR) and visible (VI) image fusion task as an ${\ell _2}$ norm optimization problem, aiming to obtain a fused image that looks like the IR image with abundant VI appearance details. To address the smoothing problem caused by optimizing the ${\ell _2}$ norm, two weights are introduced, and their effectiveness has been analyzed and experimentally verified. Our fusion method yields an explicit mathematical mapping from the source images to the fusion result, which not only distinguishes it from current fusion methods but also makes its computational cost lower than that of most fusion methods. Qualitative results show that our fusion results look like IR images with abundant appearance details, and quantitative results demonstrate that our fusion results are more informative with clearer details.

Funding

National Natural Science Foundation of China (61263040); China Scholarship Council (201608360166); Nanchang Hangkong University (YC2018019); Education Department of Jiangxi Province (GJJ170602).

Disclosures

The authors declare no conflicts of interest.

References

1. J. Ma, Y. Ma, and C. Li, “Infrared and visible image fusion methods and applications: A survey,” Inf. Fusion 45, 153–178 (2019). [CrossRef]  

2. S. Li, X. Kang, and J. Hu, “Image fusion with guided filtering,” IEEE Trans. Image Process. 22(7), 2864–2875 (2013). [CrossRef]  

3. X. Yan, H. Qin, J. Li, H. Zhou, J. Zong, and Q. Zeng, “Infrared and visible image fusion using multiscale directional nonlocal means filter,” Appl. Opt. 54(13), 4299–4308 (2015). [CrossRef]  

4. S. Zhenfeng, L. Jun, and C. Qimin, “Fusion of infrared and visible images based on focus measure operators in the curvelet domain,” Appl. Opt. 51(12), 1910–1921 (2012). [CrossRef]  

5. Z. Zhou, M. Dong, X. Xie, and Z. Gao, “Fusion of infrared and visible images for night-vision context enhancement,” Appl. Opt. 55(23), 6480–6490 (2016). [CrossRef]  

6. X. Yan, H. Qin, J. Li, H. Zhou, and J. G. Zong, “Infrared and visible image fusion with spectral graph wavelet transform,” J. Opt. Soc. Am. A 32(9), 1643–1652 (2015). [CrossRef]  

7. X. Zhang, Y. Ma, F. Fan, Y. Zhang, and J. Huang, “Infrared and visible image fusion via saliency analysis and local edge-preserving multi-scale decomposition,” J. Opt. Soc. Am. A 34(8), 1400–1410 (2017). [CrossRef]  

8. Z. Fu, X. Wang, J. Xu, N. Zhou, and Y. Zhao, “Infrared and visible images fusion based on RPCA and NSCT,” Infrared Phys. Technol. 77, 114–123 (2016). [CrossRef]  

9. Z. Zhou, B. Wang, S. Li, and M. Dong, “Perceptual fusion of infrared and visible images through a hybrid multi-scale decomposition with Gaussian and bilateral filters,” Inf. Fusion 30, 15–26 (2016). [CrossRef]  

10. J. Ma, Z. Zhou, B. Wang, and H. Zong, “Infrared and visible image fusion based on visual saliency map and weighted least square optimization,” Infrared Phys. Technol. 82, 8–17 (2017). [CrossRef]  

11. X. Luo, Z. Zhang, B. Zhang, and X. Wu, “Image Fusion With Contextual Statistical Similarity and Nonsubsampled Shearlet Transform,” IEEE Sens. J. 17(6), 1760–1771 (2017). [CrossRef]  

12. D. P. Bavirisetti and R. Dhuli, “Two-scale image fusion of visible and infrared images using saliency detection,” Infrared Phys. Technol. 76, 52–64 (2016). [CrossRef]  

13. B. Cheng, L. Jin, and G. Li, “General fusion method for infrared and visual images via latent low-rank representation and local non-subsampled shearlet transform,” Infrared Phys. Technol. 92, 68–77 (2018). [CrossRef]  

14. V. P. S. Naidu, “Image Fusion Technique using Multi-resolution Singular Value Decomposition,” Def. Sci. J. 61(5), 479–484 (2011). [CrossRef]  

15. W. Tan, H. Zhou, J. Song, H. Li, Y. Yu, and J. Du, “Infrared and visible image perceptive fusion through multi-level Gaussian curvature filtering image decomposition,” Appl. Opt. 58(12), 3064–3073 (2019). [CrossRef]  

16. J. Zhu, W. Jin, L. Li, Z. Han, and X. Wang, “Fusion of the low-light-level visible and infrared images for night-vision context enhancement,” Chin. Opt. Lett. 16(1), 013501 (2018). [CrossRef]  

17. J. Chen, X. Li, L. Luo, X. Mei, and J. Ma, “Infrared and visible image fusion based on target-enhanced multiscale transform decomposition,” Inf. Sci. 508, 64–78 (2020). [CrossRef]  

18. Y. Liu, X. Chen, J. Cheng, H. Peng, and Z. Wang, “Infrared and visible image fusion with convolutional neural networks,” Int. J. Wavelets, Multiresolut. Inf. Process. 16(03), 1850018 (2018). [CrossRef]  

19. J. Ma, W. Yu, P. Liang, C. Li, and J. Jiang, “FusionGAN: A generative adversarial network for infrared and visible image fusion,” Inf. Fusion 48, 11–26 (2019). [CrossRef]  

20. Y. Zhang, Y. Liu, P. Sun, H. Yan, X. Zhao, and L. Zhang, “IFCNN: A General Image Fusion Framework Based on Convolutional Neural Network,” Inf. Fusion 54, 99–118 (2020). [CrossRef]  

21. J. Ma, P. Liang, W. Yu, C. Chen, X. Guo, J. Wu, and J. Jiang, “Infrared and visible image fusion via detail preserving adversarial learning,” Inf. Fusion 54, 85–98 (2020). [CrossRef]  

22. X. Ren, F. Meng, T. Hu, Z. Liu, and C. Wang, “Infrared-Visible Image Fusion Based on Convolutional Neural Networks (CNN),” proceedings of International Conference on Intelligent Science and Big Data Engineering (Springer, 2018), pp. 301–307.

23. H. Li and X.-J. Wu, “DenseFuse: A Fusion Approach to Infrared and Visible Images,” IEEE Trans. Image Process. 28(5), 2614–2623 (2019). [CrossRef]  

24. H. Li, X.-J. Wu, and T. S. Durrani, “Infrared and Visible Image Fusion with ResNet and zero-phase component analysis,” arXiv preprint arXiv:1806.07119 (2018).

25. H. Li, X.-J. Wu, and J. Kittler, “Infrared and Visible Image Fusion using a Deep Learning Framework,” in 2018 24th International Conference on Pattern Recognition (ICPR) (IEEE, 2018).

26. H. Xu, P. Liang, W. Yu, J. Jiang, and J. Ma, “Learning a Generative Model for Fusing Infrared and Visible Images via Conditional Generative Adversarial Network with Dual Discriminators,” proceedings of Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (2019), pp. 3954–3960.

27. J. Ma, C. Chen, C. Li, and J. Huang, “Infrared and visible image fusion via gradient transfer and total variation minimization,” Inf. Fusion 31, 100–109 (2016). [CrossRef]  

28. Y. Zhang, L. Zhang, X. Bai, and L. Zhang, “Infrared and visual image fusion through infrared feature extraction and visual information preservation,” Infrared Phys. Technol. 83, 227–237 (2017). [CrossRef]  

29. H. Guo, Y. Ma, X. Mei, and J. Ma, “Infrared and visible image fusion based on total variation and augmented Lagrangian,” J. Opt. Soc. Am. A 34(11), 1961–1968 (2017). [CrossRef]  

30. B. Cheng, L. Jin, and G. Li, “Infrared and low-light-level image fusion based on ℓ2-energy minimization and mixed-ℓ1-gradient regularization,” Infrared Phys. Technol. 96, 163–173 (2019). [CrossRef]  

31. C. H. Liu, Y. Qi, and W. R. Ding, “Infrared and visible image fusion method based on saliency detection in sparse domain,” Infrared Phys. Technol. 83, 94–102 (2017). [CrossRef]  

32. H. Li and X.-J. Wu, “Infrared and visible image fusion using Latent Low-Rank Representation,” arXiv preprint arXiv:1804.08992 (2018).

33. H. Li and X.-J. Wu, “Infrared and visible image fusion using a novel deep decomposition method,” arXiv preprint arXiv:1811.02291 (2018).

34. G. Piella, “A general framework for multiresolution image fusion: from pixels to regions,” Inf. Fusion 4(4), 259–280 (2003). [CrossRef]  

35. D. P. Bavirisetti and R. Dhuli, “Fusion of infrared and visible sensor images based on anisotropic diffusion and Karhunen-Loeve transform,” IEEE Sens. J. 16(1), 203–209 (2016). [CrossRef]  

36. Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, “Edge-preserving decompositions for multi-scale tone and detail manipulation,” ACM Trans. Graph. 27(3), 1 (2008). [CrossRef]  

37. J. Ma, W. Qiu, Z. Ji, M. Yong, and Z. Tu, “Robust L2E Estimation of Transformation for Non-Rigid Registration,” IEEE Trans. Signal Process. 63(5), 1115–1129 (2015). [CrossRef]  

38. J. Ma, J. Jiang, C. Liu, and Y. Li, “Feature Guided Gaussian Mixture Model with Semi-Supervised EM and Local Geometric Constraint for Retinal Image Registration,” Inf. Sci. 417, 128–142 (2017). [CrossRef]  

39. J. Ma, H. Zhou, Z. Ji, G. Yuan, J. Jiang, and J. Tian, “Robust Feature Matching for Remote Sensing Image Registration via Locally Linear Transforming,” IEEE Trans. Geosci. Remote Sens. 53(12), 6469–6481 (2015). [CrossRef]  

40. J. Ma, J. Zhao, Y. Ma, and J. Tian, “Non-rigid visible and infrared face registration via regularized Gaussian fields criterion,” Pattern Recognit. 48(3), 772–784 (2015). [CrossRef]  

41. A. Toet, “TNO Image Fusion Dataset,” (April, 2015), http://figshare.com/articles/TNO_Image_Fusion_Dataset/1008029.

42. H. Li, B. Manjunath, and S. K. Mitra, “Multisensor image fusion using the wavelet transform,” Graph. Models Image Process. 57(3), 235–245 (1995). [CrossRef]  

43. D. P. Bavirisetti, G. Xiao, and G. Liu, “Multi-sensor image fusion based on fourth order partial differential equations,” in International Conference on Information Fusion (ICIF) (IEEE, 2017), pp. 1–9.

44. Y. Liu, X. Chen, R. K. Ward, and Z. J. Wang, “Image fusion with convolutional sparse representation,” IEEE Signal Process. Lett. 23(12), 1882–1886 (2016). [CrossRef]  

45. M. Hossny, S. Nahavandi, D. Creighton, and A. Bhatti, “Image fusion performance metric based on mutual information and entropy driven quadtree decomposition,” Electron. Lett. 46(18), 1266–1268 (2010). [CrossRef]  

46. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process. 13(4), 600–612 (2004). [CrossRef]  

47. Y. Han, Y. Cai, Y. Cao, and X. Xu, “A new image fusion performance metric based on visual information fidelity,” Inf. Fusion 14(2), 127–135 (2013). [CrossRef]  
