1. Introduction
In the process of image acquisition, compression, transmission, and so on, distortion is inevitably introduced, resulting in degradation of visual quality. Image quality has a direct impact on human visual perception and information acquisition, which makes image quality assessment (IQA) important in the field of image processing and analysis [1]. IQA methods are divided into subjective and objective evaluations. Subjective methods obtain image quality through human observation and scoring, which can produce accurate and reliable results. However, subjective methods are complicated and time-consuming, and their results can be affected by the external environment, which limits their application scenarios. Therefore, objective evaluation has received great attention in recent years. According to the availability of reference images, objective evaluation can be categorized into full-reference (FR), reduced-reference (RR), and no-reference (NR) methods [2]. Since it is difficult to acquire all or part of the information of the reference image in many practical environments, NR IQA methods have the greatest practical value and have gained increasing attention in recent years. Blur is an important type of image distortion, usually caused by defocusing, target motion, camera jitter, compression, etc., so the evaluation of image blur/sharpness is critical. In this paper, we focus on blur distortion and study NR or blind image blur/sharpness quality assessment.
In recent years, great progress has been made in research on no-reference image blur assessment methods based on the human visual system (HVS). These methods can be divided into two types according to whether the extracted features need to be trained to generate a quality score. In the first type, a relatively small number of spatial or frequency domain features related to the degree of blur are extracted and used directly as quality scores after simple additive or multiplicative fusion. In contrast, the second type extracts a large number of spatial and frequency domain features and employs machine learning to train the mapping from image features to quality score.
Considering that blur changes the structure of an image, several methods have been proposed that extract edge, gradient, local contrast, and other features representing the spatial structure, and they have obtained high accuracy. Among edge-based methods, Marziliano et al. first used the Sobel operator to detect edges in an image, and calculated the distance from the beginning to the end pixel of each edge as the edge width [3]. The quality score is defined as the average width of all local edges. This method achieved good evaluation results for different blurred versions of a single image, but did not perform well on images with diverse content. To address this problem, Ferzli et al. integrated just noticeable blur (JNB) into a probability summation model, and proposed an image blur metric based on just noticeable blur (JNBM) [4]. The edge blocks of the image are first selected, then the local contrast and edge width of the blocks are calculated. Finally, a probabilistic summation model is applied to simulate the perceived blur of the local image blocks and generate the quality score. Narvekar et al. improved JNBM in 2011; their implementation estimated the blur detection probability of each edge in the image, and calculated the cumulative probability of blur detection (CPBD) [5] to predict blur quality.
As an effective descriptor of image structure, the gradient is commonly used for image blur evaluation. Li et al. argued that blur affects the moments of an image; their implementation [6] used Chebyshev moments as effective shape descriptors to measure image blur. The gradient of the image is first calculated to characterize shape, and the Chebyshev moments of the gradient map are then calculated block-wise. Finally, the non-DC moment energy normalized by variance is applied for blind image blur evaluation (BIBLE). Later, Li et al. proposed a sparse-representation-based no-reference image sharpness assessment method (SPARISH) [7]. First, natural images are used to train an over-complete visual dictionary, and image blocks are represented by their sparse coefficients over this dictionary. Considering that blur reduces high-frequency details, the block energy is calculated from the sparse coefficients, and image sharpness is represented by the variance-normalized energy of a group of high-frequency blocks. Zhan et al. argued that the gradient not only reflects structure, but also directly represents the blur of an image. They verified through experiments that the HVS tends to judge the sharpness of an image by the maximum gradient, and used the variability of gradients to represent variations of image content [8]. The sharpness score is calculated as the weighted product of the maximum gradient and the variability of gradients (MGVG). Since only the gradient is considered, this method has low computational complexity and performs well on synthetically blurred images.
Bahrami et al. [9] found that the maximum local variation (MLV) depends on both image content and blurring: its distribution resembles a Gaussian distribution in regions of slightly varying texture, and a Laplacian distribution in regions with strong edges and flat content. The sharpness of the image is measured by the standard deviation of the weighted MLV distribution. Gu et al. argued that blur increases the similarity of local autoregressive (AR) model parameters and proposed an AR-model-based image sharpness metric (ARISM) [10]. An 8th-order AR model is first established by training each pixel with its 8-neighborhood. Then the energy and contrast differences of the AR parameters are calculated. Finally, a percentage fusion strategy is used to predict the overall sharpness score. Hosseini et al. designed HVS-related filters (HVS-MaxPol) to extract blur-sensitive features [11]; the high-order central moment of the filtered image is calculated as the sharpness score. This method utilizes HVS characteristics effectively and achieves high accuracy on both synthetically blurred and real blurred image databases.
Considering that blur reduces the high-frequency components of an image, some scholars also evaluate image blur by frequency domain characteristics. Based on the fact that blur destroys the local phase coherence (LPC) structure of an image, Hassen et al. proposed an LPC-based no-reference image sharpness index (LPC-SI) in the complex wavelet transform domain [12]; the blur of the image is estimated from the strength of the LPC. Gvozden et al. observed the impact of blur on the local contrast of images, and proposed a blind image sharpness evaluation algorithm based on local contrast statistics (BISHARP) [13]. The local root mean square is first calculated to generate a contrast map, which is then converted to the wavelet domain. The sharpness score of the image is estimated from the sorted high-frequency wavelet coefficients. Kerouh et al. transformed the edge image into the frequency domain with the DCT [14], modeled the histogram of DCT coefficients with an exponential probability density function (PDF), and measured the sharpness of the image by the steepness of the PDF. Frequency domain methods can achieve promising accuracy, but the transformation from the spatial to the frequency domain usually entails high computational complexity.
The first type of methods calculates the quality score directly from image features without training, so their evaluation results do not depend on a training set. They usually have simple computational processes, and the contributions of the extracted features to blur are obvious. They perform well on synthetically blurred images, but are not suitable for evaluating real blurred images, mainly because the extracted features are not sufficient to describe the characteristics of real blurred images. In contrast, the second type of learning-based methods extracts multiple features in the spatial or frequency domain, and generates quality scores through learning algorithms such as support vector regression (SVR). Moorthy et al. modeled the distribution of frequency domain coefficients after a pyramid wavelet transform [15], and extracted a total of 88 statistical features for no-reference distortion type identification and quality evaluation (DIIVINE). Mittal et al. modeled the distribution of spatially local normalized luminance, and extracted 18 distribution features for blind image spatial quality evaluation (BRISQUE) [16]. Li et al. proposed a general NR image quality assessment metric using a gradient-induced dictionary (GRID) [17]; image features are extracted using the dictionary with Euclidean-norm coding and max-pooling. After feature extraction, [15, 16, 17] all use SVR to learn the mapping from feature space to quality score. They are general-purpose evaluation methods, but also perform well in the evaluation of blur distortion. Aiming specifically at blur evaluation, Yue et al. calculated local binary patterns (LBP) in the spatial domain [18], and the histogram of LBP was used as the feature for SVR to learn the quality score. Li et al. proposed a no-reference and robust image sharpness evaluation method (RISE) that learns both spatial and DCT domain features [19]. Considering the multi-scale perception characteristics of the HVS, RISE extracts 11 spatial and frequency features, and uses SVR to learn the quality score. RISE performs quite well, especially on real blurred databases.
With the great progress of neural network (NN) technology, some scholars have tried to apply neural networks to general IQA [20, 21, 22, 23]. However, few works apply NNs directly to blur evaluation, and NNs for IQA usually have a small number of layers, mainly because IQA databases are usually small and the sample images are not sufficient to train deep NNs. Therefore, NN-based IQA methods, especially those for blur evaluation, still need further research. At present, it is more suitable to evaluate blur quality with machine learning methods based on carefully designed features.
To sum up, no-reference image blur evaluation methods can obtain better performance by learning effective features to build a quality model. However, most previous learning-based blur evaluation methods rely on overall statistical features of the image, which are suited to describing uniformly blurred images, and thus achieve considerable performance on synthetically blurred databases. They ignore the fact that sensitivity to blur differs between different image contents, while blur in real scenes is mostly non-uniform. If the impact of different local contents on blur can be distinguished during feature extraction, the evaluation performance is expected to improve further. Therefore, inspired by [19], we propose a no-reference image blur assessment method based on multi-scale spatial local features. The sensitivity of different content areas to blur is taken into account by a local feature extraction and image block clustering strategy, which enriches the multi-scale similarity between the original and Gaussian scale images. Moreover, we observe that not only the change in blur between the original and the Gaussian scale images, but also the change between images across scales can effectively evaluate the blur of an image. We therefore define the energy ratio between Gaussian scale images, which not only utilizes the multi-scale characteristics of the HVS, but also extends the feature analysis to the inter-scale level.
Three major contributions of this paper are listed as follows.
1) The proposed method separately considers the sensitivity of different regions of the image to blur, and combines the multi-scale characteristics of the HVS to extract contrast features between the original and the Gaussian scale images for three types of regions.
2) By analyzing the change in blur between images across scales, the method adds the contrast between Gaussian scale images, namely the inter-scale energy ratio, as effective features for blur evaluation.
3) Taking the viewing distance of human eyes into consideration, the original image is down-sampled, and the distribution parameters of the local maximum gradient map at multiple resolutions are calculated to estimate the overall blur statistically.
The rest of this paper is organized as follows. Section 2 introduces the block diagram of the proposed method along with the implementation steps in detail. Section 3 describes the databases and presents the results and analysis of the performance evaluation. Finally, the paper is concluded in Section 4.
2. Image Blur Assessment Based on Multi-scale Spatial Local Features
2.1 Analysis of Image Characteristics in Gaussian Scale Space
The HVS's perception of image quality has multi-scale characteristics. Lindeberg's work [24] showed that the representation of an image in Gaussian scale space can be obtained through a multi-scale transformation based on Gaussian convolution, which is often used to extract the essential features of the image. For an image I(x,y), its scale space images L(x,y,σ) can be obtained by convolution with a variable-scale Gaussian convolution function G(x,y,σ):
\(L(x, y, \sigma)=I(x, y) * G(x, y, \sigma)\) (1)
where (x,y) denotes the coordinates of a pixel, σ is the scale parameter of the Gaussian convolution function, and ∗ denotes the convolution operation. The two-dimensional Gaussian convolution function G(x,y,σ) is defined as:
\(G(x, y, \sigma)=\frac{1}{2 \pi \sigma^{2}} e^{-\frac{\left(x^{2}+y^{2}\right)}{2 \sigma^{2}}}\) (2)
In Gaussian scale space, the spatial size and resolution of the image remain unchanged, while detail information of the original image is gradually suppressed as the scale increases. Scale space simulates the imaging of a scene on the retina at different distances. Fig. 1 shows two original images with different blur scores and their scale space versions. The sizes of the Gaussian filters are 3×3, 9×9, 15×15 and 21×21, with corresponding standard deviations of 2, 4, 6 and 8. Five scale space images are constructed, denoted as scales 0-4, where scale 0 is the original image. As shown in Fig. 1, the scale space images are blurred versions of the original image, and their extent of blur is determined by the size of the Gaussian kernel. Furthermore, the scale space images constructed for original images of different blur quality exhibit different characteristics. The original "butterfly" image in the first row has high visual quality; as the Gaussian kernel parameters increase, its Gaussian scale space images become obviously blurred compared with the original. In contrast, the second row shows the Gaussian scale space images of the low-quality "airplane" image, which are very similar to the original, indicating that the sharpness of an image can be measured by the overall similarity between the original and its Gaussian scale space images [19].
Fig. 1. Two original images and their Gaussian scale space images. Original image "butterfly" in the first row, DMOS=21.4194, original image "airplane" in the second row, DMOS=43.2646
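For illustration, a minimal Python sketch of this scale-space construction under Eqs. (1)-(2) is given below, using the kernel sizes (3, 9, 15, 21) and standard deviations (2, 4, 6, 8) stated above. Normalizing the kernel to unit sum is our assumption, adopted so that smoothing preserves mean intensity; it is not stated in the text.

```python
import numpy as np
from scipy.ndimage import convolve

def gaussian_kernel(size, sigma):
    """2-D Gaussian kernel from Eq. (2), normalized to unit sum."""
    half = size // 2
    x, y = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return g / g.sum()

def build_scale_space(image):
    """Return [scale 0 (original), scales 1..4], as in Fig. 1."""
    scales = [image.astype(np.float64)]
    for size, sigma in zip((3, 9, 15, 21), (2, 4, 6, 8)):
        scales.append(convolve(scales[0], gaussian_kernel(size, sigma),
                               mode='nearest'))
    return scales
```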
Further observation of Fig. 1 reveals that the original image "butterfly" in Fig. 1 (a) has a largely blurred background, but is still considered to be of high quality. This is because different image contents have different sensitivity or masking effects with respect to blur distortion, and blur distortion in an image is usually non-uniform. The judgment of the HVS on the quality of a blurred image is likewise not uniformly distributed spatially, but relies more on the sharpest parts of the image. In addition, the regions with different degrees of blur in Fig. 1 (b)-(e) show different characteristics after repeated Gaussian smoothing. Fig. 2 shows image blocks containing smooth, edge and texture information taken from the original images in Fig. 1 and the corresponding scale space blocks convolved by Gaussian kernels. As can be seen from Fig. 2, the original and Gaussian-convolved blocks containing smooth information are very similar, with almost no visible change. The blur of edge blocks increases gradually after Gaussian convolution, and the difference between them is the most obvious. Texture blocks vary greatly when the Gaussian convolution strength is large, and the impact of Gaussian convolution on them lies between that on smooth blocks and edge blocks. Inspired by this, this work divides the original image into three kinds of regions: smooth, edge and texture, and the class-wise similarity between the original and the corresponding Gaussian scale space image blocks is calculated as a feature of blur degree to evaluate image quality. Compared with the whole-image similarity measurement in scale space [19], our method better describes how local image content is affected by blur, and the experimental results confirm that our method is significantly improved over [19].
Fig. 2. Smooth, edge, and texture blocks of two original images, and the corresponding Gaussian scale image blocks. (a) Blocks of the image "butterfly", (b) blocks of the image "airplane". Smooth blocks in the first row, edge blocks in the second row and texture blocks in the third row
2.2 Proposed Block Diagram
The block diagram of the proposed model is shown in Fig. 3. According to the sensitivity of different image content to blur, the image is first classified block-wise into three categories: smooth, edge and texture. Then contrast features between the original and Gaussian scale space images, along with features between images across scales, are calculated based on the classification results. In addition, the local maximum gradient distribution characteristics at multiple resolutions are extracted to simulate the effect of viewing distance on blur. Finally, all features are combined and SVR is employed to learn the quality score.
Fig. 3. Block diagram of the proposed method
2.3 Local Feature Extraction and Image Block Classification
For a blurred image, we first convert it to gray scale. The image is then divided into non-overlapping blocks of size n×n. The Sobel edge operator is applied to extract edge pixels, with the edge detection threshold set to 4 by experiments. The gradient G of an image block is calculated as:
\(G=\sqrt{G_{b h}^{2}+G_{b v}^{2}}\) (3)
\(G_{b h}=\left[\begin{array}{rrr} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{array}\right] * I, G_{b v}=\left[\begin{array}{ccc} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{array}\right] * I\) (4)
where Gbh and Gbv are the horizontal and vertical gradient maps of the image block.
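As a minimal sketch, the block gradient of Eqs. (3)-(4) can be computed as follows; the boundary handling (`mode='nearest'`) is our choice, as the paper does not specify it.

```python
import numpy as np
from scipy.ndimage import convolve

# Sobel kernels from Eq. (4).
SOBEL_H = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_V = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)

def block_gradient(block):
    """Horizontal/vertical gradient maps and magnitude G (Eq. (3))."""
    g_h = convolve(block.astype(np.float64), SOBEL_H, mode='nearest')
    g_v = convolve(block.astype(np.float64), SOBEL_V, mode='nearest')
    return g_h, g_v, np.sqrt(g_h**2 + g_v**2)
```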
In order to divide the image into smooth, edge and texture regions, we extract 4 local features from each gray-scale image block. Regions with clear edges or complex textures contain more information and higher energy, while blurred or smooth regions are the opposite. Therefore, the gradient energies (Ebh, Ebv) in the horizontal and vertical directions of the image block are first calculated as local features, defined as:
\(E_{b h}=\sum_{x=1}^{n} \sum_{y=1}^{n} G_{b h}(x, y)^{2}, E_{b v}=\sum_{x=1}^{n} \sum_{y=1}^{n} G_{b v}(x, y)^{2}\) (5)
where (x,y) is the pixel coordinate. In addition, each n×n block is scanned and the number of edge pixels Mb in the block is counted. Moreover, the standard deviation σb of the image block is extracted to evaluate the change of content in gray scale.
Finally, the 4-dimensional local features of each image block are combined into xb = {Ebh, Ebv, Mb, σb}. The feature set of the whole image is (x1, x2, ..., xB), where B denotes the number of image blocks. The K-means clustering algorithm is employed to classify the image blocks. Fig. 4 shows the results of four images after feature clustering. As shown in Fig. 4, the clustering algorithm performs well in differentiating the smooth, edge and texture regions of the image. The blurry background, wings and tiny flowers on the left of Fig. 4 (a) are classified as smooth, edge and texture regions, respectively, while the sky, the plane outline and text, and the ground in Fig. 4 (b) are classified as smooth, edge and texture regions, respectively. The clouds in Fig. 4 (c) are classified as texture regions. The classification results are consistent with human judgment, which verifies the effectiveness of our image block classification strategy.
Fig. 4. Classification results of images, smooth blocks are depicted in grey, edge blocks in orange and the texture blocks in blue, block size is 8×8
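A minimal sketch of the feature extraction and clustering of this subsection follows. It assumes `gray` is the gray-scale image and `edge_map` is a binary Sobel edge map thresholded as described above (both hypothetical inputs), and it reuses `block_gradient` from the sketch in Section 2.3.

```python
import numpy as np
from sklearn.cluster import KMeans

def block_features(gray, edge_map, n=8):
    """4-D feature x_b = {E_bh, E_bv, M_b, sigma_b} per n-by-n block."""
    feats = []
    for i in range(0, gray.shape[0] - n + 1, n):
        for j in range(0, gray.shape[1] - n + 1, n):
            block = gray[i:i + n, j:j + n].astype(np.float64)
            g_h, g_v, _ = block_gradient(block)
            feats.append([np.sum(g_h**2),     # E_bh, Eq. (5)
                          np.sum(g_v**2),     # E_bv, Eq. (5)
                          np.count_nonzero(edge_map[i:i + n, j:j + n]),  # M_b
                          block.std()])       # sigma_b
    return np.asarray(feats)

# Three clusters correspond to smooth, edge and texture regions:
# labels = KMeans(n_clusters=3, n_init=10).fit_predict(block_features(gray, edge_map))
```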
2.4 Local Contrast Features between Original and Gaussian Scale Space Images
Gradient and singular value features are commonly used in IQA and are effective for describing the structure and sharpness of an image. In this paper, contrast features between the original and the corresponding Gaussian scale space image blocks are calculated at the block level to express the sensitivity of different image contents to blur distortion.
The gradients G of the original block and the corresponding Gaussian scale blocks are first calculated by the Sobel operator. The gradient similarity between the original and the corresponding Gaussian scale space image blocks is defined as:
\(G S_{q}(k)=\frac{1}{n \times n} \sum_{x=1}^{n} \sum_{y=1}^{n} \frac{2 G_{0}(x, y) G_{q}(x, y)+T_{1}}{G_{0}(x, y)^{2}+G_{q}(x, y)^{2}+T_{1}}\) (6)
where k is the block index, q ∈ {1,2,3,4} represents the scale of the image, and T1 is a constant that keeps the ratio stable, set to \(10^{-7}\). According to the classification results in Section 2.3, the average gradient similarity of each class between the original and the corresponding Gaussian scale space image blocks is calculated as:
\(G S_{q}^{p}=\frac{1}{N_{p}} \sum G S_{q}(k), k \in R_{p}\) (7)
\(G S_{q}^{e}=\frac{1}{N_{e}} \sum G S_{q}(k), k \in R_{e}\) (8)
\(G S_{q}^{t}=\frac{1}{N_{t}} \sum G S_{q}(k), k \in R_{t}\) (9)
where Rp, Re and Rt are the set of smooth, edge and texture blocks respectively, and Np, Ne and Nt are the number of smooth, edge and texture blocks.
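To make the block-level computation concrete, a minimal Python sketch of Eq. (6) and the class averages of Eqs. (7)-(9) follows; the `labels` array is assumed to come from the K-means step of Section 2.3.

```python
import numpy as np

T1 = 1e-7  # stabilizing constant from Eq. (6)

def gradient_similarity(g0, gq):
    """Mean gradient similarity of one block pair, Eq. (6).
    g0, gq: gradient maps of the scale-0 and scale-q blocks."""
    return np.mean((2.0 * g0 * gq + T1) / (g0**2 + gq**2 + T1))

def class_average(values, labels, cls):
    """Average of per-block similarities over one class, Eqs. (7)-(9)."""
    values = np.asarray(values)
    return values[np.asarray(labels) == cls].mean()
```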
The average singular value similarity between the original and the corresponding Gaussian scale space image blocks is then extracted to estimate the change of structure. An image block Ib is decomposed as Ib = UΣV^T, where U and V are orthogonal matrices called the left and right singular matrices of Ib, and Σ is a diagonal matrix whose diagonal entries si are the singular values. The singular value vector of Ib is s = (s1, s2, ..., sr), where r denotes the rank of the matrix Ib. The singular value similarity between the original block and the corresponding Gaussian scale space blocks is defined as:
\(S S_{q}(k)=\frac{2 s_{0} s_{q}+T_{2}}{s_{0}^{2}+s_{q}^{2}+T_{2}}\) (10)
where q ∈ {1,2,3,4} represents the scale of the image. The constant T2 prevents the result from being unstable as the denominator approaches 0, and is set to \(10^{-7}\). According to the classification results of the image blocks, the average singular value similarity of each class between the original and the corresponding Gaussian scale space image blocks is calculated as:
\(S S_{q}^{p}=\frac{1}{N_{p}} \sum S S_{q}(k), k \in R_{p}\) (11)
\(S S_{q}^{e}=\frac{1}{N_{e}} \sum S S_{q}(k), k \in R_{e}\) (12)
\(S S_{q}^{t}=\frac{1}{N_{t}} \sum S S_{q}(k), k \in R_{t}\) (13)
In this section, the class-wise average gradient and singular value similarities between the original and the Gaussian scale space image blocks are calculated to measure the blur of different content in the original image. A total of 24 local average features are extracted: \(\left\{G S_{q}^{p}, G S_{q}^{e}, G S_{q}^{t}, S S_{q}^{p}, S S_{q}^{e}, S S_{q}^{t}\right\}\), q ∈ {1,2,3,4}.
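A minimal sketch of the singular value similarity is given below. Treating the products in Eq. (10) element-wise over the singular value vectors and averaging is our reading of the formula, not a detail stated in the text.

```python
import numpy as np

T2 = 1e-7  # stabilizing constant from Eq. (10)

def singular_similarity(block0, blockq):
    """Singular value similarity of one block pair, our reading of Eq. (10)."""
    s0 = np.linalg.svd(block0.astype(np.float64), compute_uv=False)
    sq = np.linalg.svd(blockq.astype(np.float64), compute_uv=False)
    # Element-wise combination of the singular value vectors, then averaged.
    return np.mean((2.0 * s0 * sq + T2) / (s0**2 + sq**2 + T2))
```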
2.5 Energy Ratio Features of the Images between Scales
Besides the similarity between the original image and the corresponding Gaussian scale images, we also extract the similarity between images across scales. As shown in Fig. 1, scale space images are blurred versions not only of the original image, but also of the other, lower-scale images. The blur of image blocks does not increase uniformly with the Gaussian kernel. To capture the change between scale space images, we define a contrast feature between different Gaussian scales, namely the energy ratio between each scale image and its higher-scale versions.
Energy represents the information content of an image. It has been demonstrated that blur mainly manifests as the spread of edges and the decrease of energy in high-frequency areas. As the Gaussian convolution scale increases, the edges and details of the image are gradually blurred, resulting in gradual attenuation of information and energy. Inspired by this, we propose to evaluate the change of blur across scale images by the energy, defined as the sum of the squared gradients:
\(E_{b}=\sum_{x=1}^{n} \sum_{y=1}^{n} G(x, y)^{2}\) (14)
The HVS tends to judge the sharpness of an image by its sharpest region, while blur mainly affects the edge and texture areas of the image, with limited impact on smooth areas. We therefore first calculate the energy of each block in the image, sort the blocks in descending order of energy, take the top p% high-energy blocks, and calculate the average energy of these blocks, which reflects the sharp regions:
\(\bar{E}=\frac{\sum_{i=1}^{Z} E(i)}{Z}\) (15)
where Z = \(\lfloor B \times p \%\rfloor\) is the number of top p% high-energy blocks of the image, B is the total number of blocks in the image, and \(\lfloor \cdot \rfloor\) denotes the floor operation.
In this section, the energy ratio Rq between the scale-q image and the images beyond scale q is calculated. It is defined as the ratio of the difference to the sum of the average energy of the top p% high-energy blocks at scale q and the average energy of this part in the images beyond scale q:
\(R_{q}=\frac{\left(\bar{E}_{q}-\frac{1}{4-q} \sum_{i>q}^{4} \bar{E}_{i}\right)+T_{3}}{\left(\bar{E}_{q}+\frac{1}{4-q} \sum_{i>q}^{4} \bar{E}_{i}\right)+T_{3}}\) (16)
where q = 0,1,2,3. The constant T3 avoids instability when the denominator approaches 0, and is set to \(10^{-7}\). The larger Rq is, the greater the energy change of the high-energy blocks of the scale-q image under Gaussian convolution, indicating that the details of the scale-q image are more affected by the convolution and that the image at scale q is sharper. Fig. 5 depicts the process of calculating the energy ratio between Gaussian scales. For each scale image, the energy ratio is calculated with respect to all higher-scale images, so that not only the impact of Gaussian blur on the original image, but also its impact on each scale space image is considered.
Fig. 5. Energy ratio between scales
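A minimal sketch of Eqs. (14)-(16) follows. Block energies per scale are assumed precomputed; following the wording "this part", the top-p% blocks are selected at scale q and the same block positions are averaged in the higher-scale images, which is our reading of the text.

```python
import numpy as np

T3 = 1e-7  # stabilizing constant from Eq. (16)

def energy_ratios(E, p=40):
    """R_q of Eq. (16). E: array of shape (5, B), block energies (Eq. (14))
    at scales 0..4, B blocks per image."""
    E = np.asarray(E, dtype=np.float64)
    z = max(int(np.floor(E.shape[1] * p / 100.0)), 1)
    ratios = []
    for q in range(4):
        top = np.argsort(E[q])[::-1][:z]   # top-p% blocks at scale q
        e_q = E[q, top].mean()             # average energy, Eq. (15)
        e_hi = E[q + 1:, top].mean()       # mean over all scales i > q
        ratios.append((e_q - e_hi + T3) / (e_q + e_hi + T3))
    return ratios
```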
Fig. 6 shows the energy ratios of the scale space images (b), (c), (d) and (e) of "butterfly" and "airplane" with respect to their corresponding higher-scale images. The energy ratios of the "butterfly" image are much higher than those of "airplane", showing that "butterfly" is sharper than "airplane". Moreover, as the scale q increases, the energy ratio decreases in accordance with the degradation of image quality, which indicates that the energy ratios of high-energy blocks between Gaussian scales are effective features for expressing the blur of an image.
Fig. 6. Energy ratio between Gaussian scale images
2.6 Distribution Characteristics of Multi-resolution Local Maximum Gradient
Blurring smooths the details of an image and increases the correlation between adjacent pixels, thus changing the statistical relationship between pixels. In this section, the local maximum gradient (LMG) of a pixel is defined as the maximum gradient along four directions: horizontal, vertical, main diagonal and sub-diagonal. The histogram of LMG is plotted to analyze the statistical relationship of neighboring pixels in images of different blur quality. Fig. 7 depicts the LMG distribution histograms of the images "butterfly" and "airplane". For clarity, the natural log of the probability density is taken as the y-coordinate. As shown in Fig. 7, the LMG distributions of images with different blur differ obviously. Moreover, the LMG distribution is similar to that of the maximum local variation (MLV) in [9]: the LMG of texture regions approximates a Gaussian distribution, while the LMG of edge and smooth regions approximates a hyper-Laplacian distribution. The distribution of LMG can be well described by the generalized Gaussian distribution (GGD) [25], which has also been used to model the distribution of frequency and spatial coefficients in general no-reference IQA [15, 16]. Consequently, a zero-mean GGD is applied to fit the LMG distribution of the image, defined as:
\(f\left(x ; \alpha, \sigma^{2}\right)=\frac{\alpha}{2 \beta \Gamma(1 / \alpha)} \exp \left(-\left(\frac{|x|}{\beta}\right)^{\alpha}\right)\) (17)
where \(\beta=\sigma \sqrt{\Gamma(1 / \alpha) / \Gamma(3 / \alpha)}, \Gamma(\cdot)\) is the gamma function, \(\Gamma(x)=\int_{0}^{\infty} t^{x-1} e^{-t} d t, \alpha\) controls the shape of distribution while σ2 controls the variance.
Fig. 7. LMG distribution of two original images and their down-sampled versions. The first row depicts the distribution of “butterfly,” the second row depicts the distribution of “airplane”
In addition, as the viewing distance increases, the perceived resolution of the image decreases and the image appears sharper. For a reduced-resolution image, the actual scene represented by two adjacent pixels is farther apart and the local maximum gradient (LMG) becomes larger. We down-sample the original image by factors of 2 and 4 to construct its multi-resolution representation. Fig. 7 plots the LMG distributions of the original images "butterfly" and "airplane" and the corresponding reduced-resolution images. The LMG distribution extends in the positive direction of the horizontal axis to different degrees, which means the image gradient increases. We calculate the GGD parameters (α, σ2) of the LMG for each resolution, and extract 6 distribution parameters per image as a group of features. α and σ2 are estimated using the moment matching method [26].
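A minimal sketch of the LMG map and the GGD fit of Eq. (17) is given below. Taking the directional gradient as the absolute difference to the adjacent pixel in each of the four directions is our reading of the LMG definition, and the grid-search moment matching is the standard approach, not a detail specified in the paper.

```python
import numpy as np
from scipy.special import gamma

def lmg_map(img):
    """LMG: max absolute difference to the adjacent pixel along the
    horizontal, vertical, main-diagonal and sub-diagonal directions."""
    img = img.astype(np.float64)
    c = img[1:-1, 1:-1]
    diffs = np.stack([np.abs(c - img[1:-1, :-2]),   # horizontal
                      np.abs(c - img[:-2, 1:-1]),   # vertical
                      np.abs(c - img[:-2, :-2]),    # main diagonal
                      np.abs(c - img[:-2, 2:])])    # sub-diagonal
    return diffs.max(axis=0)

def fit_ggd(x):
    """Moment-matching estimate of (alpha, sigma^2) for the GGD of Eq. (17)."""
    x = np.ravel(x)
    grid = np.arange(0.2, 10.0, 0.001)
    # Theoretical moment ratio of a GGD as a function of the shape alpha.
    r = gamma(2.0 / grid)**2 / (gamma(1.0 / grid) * gamma(3.0 / grid))
    rho = np.mean(np.abs(x))**2 / np.mean(x**2)   # empirical moment ratio
    alpha = grid[np.argmin(np.abs(r - rho))]
    return alpha, np.mean(x**2)
```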
2.7 Model Training and Blur Evaluation
To sum up, a total of 34 features are extracted for each image, including 24 multi-scale local average similarity features, 4 inter-scale energy ratio features and 6 multi-resolution LMG distribution parameters. These features are integrated and SVR is employed to learn the mapping from features to quality score, yielding the quality prediction model. During testing, the 34-dimensional features are extracted from the test image and the trained SVR model is used to predict its blur quality score. The LIBSVM package [27] is used for SVR training and testing, with the RBF kernel.
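A sketch of the training stage follows. The paper uses the LIBSVM package directly; scikit-learn's `SVR` wraps libsvm with the same RBF kernel, so it serves here as an equivalent stand-in. `X_train`, `y_train` and `X_test` are assumed given, and the feature standardization step is our addition, not mentioned in the paper.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# X_train: (num_images, 34) feature matrix; y_train: subjective scores.
model = make_pipeline(StandardScaler(), SVR(kernel='rbf'))
model.fit(X_train, y_train)
predicted_quality = model.predict(X_test)
```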
3. Experimental Results and Analysis
3.1 Experiment Settings
We evaluate the performance of the proposed method on five public IQA databases: four synthetically blurred databases, LIVE [28], TID2008 [29], TID2013 [30] and CSIQ [31], and one real blurred database, BID [32]. Images in the synthetically blurred databases are generated by Gaussian low-pass filtering of the reference images; their blur distortion is of a single type and is distributed uniformly over the images. In contrast, the blurred images in BID are taken by consumer devices and exhibit various distortion types, including defocus, simple motion, complex motion, and mixed blur, which are more likely to occur in real scenes. The LIVE, TID2008, TID2013, CSIQ and BID databases contain 145, 100, 125, 150 and 586 blurred images, respectively. Fig. 8 shows several example images from the LIVE and BID databases. The first row shows images from the LIVE database, in which Gaussian blur is introduced artificially and distributed uniformly over the whole image. By contrast, the blur types of images from the BID database are varied and their shooting scenes are complicated, such as intense movement between the camera and the scene in (d), defocusing of the camera in (e), and slight camera shake with target movement in the night scene in (f). As observed in Fig. 8, real blurred images are much more complicated, making their quality more difficult to evaluate.
Fig. 8. Example images from LIVE and BID databases
We use three criteria to measure the performance of image blur assessment methods: the Spearman rank-order correlation coefficient (SROCC), the Pearson linear correlation coefficient (PLCC) and the root mean square error (RMSE). SROCC measures the monotonicity of prediction, while PLCC and RMSE evaluate its accuracy. Better predictions correspond to larger SROCC and PLCC (with a maximum value of 1) and smaller RMSE. Before calculating the three criteria, a 5-parameter logistic function [33] is used to perform a non-linear fitting of the objective scores:
\(q(x)=\beta_{1}\left(\frac{1}{2}-\frac{1}{1+e^{\beta_{2}\left(x-\beta_{3}\right)}}\right)+\beta_{4} x+\beta_{5}\) (18)
where β1, β2, β3, β4 and β5 are the fitting parameters.
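A minimal sketch of the logistic mapping of Eq. (18) and the three criteria is given below; the initial parameter guesses are ours, not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr

def logistic5(x, b1, b2, b3, b4, b5):
    """5-parameter logistic function of Eq. (18)."""
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def evaluate(objective, subjective):
    """SROCC, PLCC and RMSE after the non-linear fitting of Eq. (18)."""
    p0 = [np.max(subjective), 0.1, np.mean(objective), 0.1, 0.1]
    params, _ = curve_fit(logistic5, objective, subjective, p0=p0,
                          maxfev=10000)
    fitted = logistic5(np.asarray(objective), *params)
    plcc = pearsonr(fitted, subjective)[0]
    srocc = spearmanr(objective, subjective)[0]
    rmse = np.sqrt(np.mean((fitted - np.asarray(subjective))**2))
    return srocc, plcc, rmse
```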
3.2 Comparison with Image Blur Evaluation Methods
In this section, we compare the proposed method with nine state-of-the-art no-reference blur assessment methods: JNB [4], CPBD [5], LPC-SI [12], MLV [9], ARISM [10], SPARISH [7], RISE [19], BISHARP [13] and MGVG [8]. In the experiment, 80% of the images of each database are randomly selected for training and the remaining 20% for testing. To avoid errors caused by the selection of training-test splits, the reported results are the medians over 1000 trials. The three performance criteria of the different methods on the five blur image databases are shown in Table 1, where the best two results are marked in boldface. Our method achieves the best performance on four of the five databases. The PLCC on both the LIVE and TID2013 databases is higher than 0.97, and the SROCC is higher than 0.96, clearly better than the other methods. MGVG obtains the best performance on the CSIQ database, with a PLCC of 0.9669. However, the PLCC of MGVG on the BID database is only 0.1870 and its SROCC is 0.1793, indicating that MGVG is not suitable for evaluating the quality of real blurred images. We notice that the performance of RISE and our method on the BID database is obviously better than the others, showing that methods that directly use features as quality scores cannot evaluate real blurred images effectively. Compared with RISE, our method makes a great improvement on real blurred images. Specifically, the PLCC and SROCC of our method reach 0.6337 and 0.6208 respectively, while those of RISE are only 0.6017 and 0.5839, indicating that our method extracts the features of real blurred images more comprehensively and thus evaluates their quality better.
Table 1. Results of performance evaluation on five databases
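For reproducibility, a sketch of the evaluation protocol described above (random 80/20 split, 1000 trials, median of each criterion) is given below; it reuses the hypothetical `model` pipeline from Section 2.7 and `evaluate` from Section 3.1.

```python
import numpy as np

def median_protocol(X, y, n_trials=1000, train_frac=0.8, seed=0):
    """Median (SROCC, PLCC, RMSE) over random train-test splits."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_trials):
        idx = rng.permutation(len(y))
        split = int(train_frac * len(y))
        tr, te = idx[:split], idx[split:]
        model.fit(X[tr], y[tr])                 # SVR pipeline from Sec. 2.7
        results.append(evaluate(model.predict(X[te]), y[te]))
    return np.median(np.asarray(results), axis=0)
```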
In addition, scatter plots are used as an intuitive performance evaluation. Fig. 9 shows the scatter plots of our method against RISE on the five databases, where the data points describe the correspondence between the subjective and objective scores of the images in each database, and the black lines are the fitted curves of all points. The scatter plots of RISE on the five databases are relatively dispersed, especially for the high-quality images of the CSIQ database and the low-quality images of the BID database. Compared with RISE, the scatter plots of our method are more concentrated around the fitting curve, and the fitting curve is more linear, which indicates that our method has better prediction accuracy and monotonicity than RISE. Consequently, our method is superior to RISE for both synthetically blurred and real blurred images.
Fig. 9. Comparison of scatter plots on five databases. (a) Scatter plots of RISE. (b) Scatter plots of our method
3.3 Contributions of Components
Our multi-scale spatial local feature based image blur evaluation method extracts three kinds of features to train the quality model: class-wise local contrast features between the original and the Gaussian scale space images, energy ratio features between Gaussian scales, and multi-resolution LMG distribution features. To illustrate the contributions of these features to the quality model, each kind of feature and the integrated features are trained separately in this section. As before, 80% of the images of each database are randomly selected for training and the remaining 20% for testing; the train-test process is repeated 1000 times and the median is taken as the experimental result. As shown in Table 2, the model trained with the integrated features outperforms the models trained with any single type of feature on all five databases. Among the single types, the class-wise local contrast features perform best, with PLCC on LIVE, TID2008 and TID2013 reaching 0.9648, 0.9532 and 0.9654 respectively, while the energy ratio features and LMG distribution features perform less well. In addition, the PLCC and SROCC of the models trained with single-type features on the BID database are all below 0.6, while the model trained with the integrated features reaches a PLCC of 0.6337 and an SROCC of 0.6208. Real image blur is often determined by multiple factors, such as defocusing and relative motion, so it is necessary to extract more diversified features to establish a reliable quality model. Each kind of feature thus contributes to the quality model, and the best results are obtained by training with the integrated features, which indicates the complementarity of the three kinds of features for quality prediction.
Table 2. Contributions of components
3.4 Impact of Parameters
The performance of the proposed method is related to the size of the image blocks and the percentage of selected top p% high-energy blocks. Therefore, the block size n and the high-energy block percentage p% are adjusted to compare the performance of the models and find the best setting. The block size affects the classification of image contents: smaller blocks mean more feature extraction operations and higher computational complexity, while larger blocks may contain both sharp and blurred contents, leading to inaccurate classification. The high-energy block percentage p% determines the number of image blocks used to extract the energy ratio features. In this section, the impact of these two parameters on the blur evaluation model is tested on the five databases. As before, 80% of the images in each database are selected for training and the remaining 20% for testing, and the median of 1000 trials is taken as the experimental result.
To identify how the block size affects the performance of the quality model, Table 3 compares the PLCC and SROCC of the quality model at block sizes n = 4, 8 and 16, as well as the direct average and the average weighted by the number of images. Both the direct and the weighted average PLCC and SROCC achieve the best results when n = 8. Therefore, considering both evaluation performance and computational complexity, we choose a block size of 8×8. To analyze the impact of p on the quality model, we vary p from 10 to 100 in steps of 10 with the block size fixed at 8×8. Experiments were carried out on the five databases and the PLCC and SROCC results are compared in Fig. 10. The results show that the performance of the quality model is stable under different values of p. PLCC and SROCC on several databases are relatively larger when p is 40, so p is set to 40 in our implementation.
Table 3. Impact of image block size
Fig. 10. Impact of percentage of high energy blocks
3.5 Impact of Training Proportion
In this section, we adjust the proportion of images used for model training and testing to analyze the impact of the amount of training data on the performance of the quality model. We randomly select 80%, 70%, 60% and 50% of the images in each database for training and use the rest for testing; the median of 1000 trials is reported. Table 4 shows the performance of models trained with different proportions of images on the five databases. The performance of the quality model decreases slightly as the proportion of training images decreases. However, even when only 50% of the images are used for training, our method still performs well on all five databases. In particular, the PLCC and SROCC on the LIVE and TID2013 databases are above 0.94, and those on the BID database are above 0.56, still better than most previous methods, indicating that the proposed method retains considerable prediction ability when the quality model is trained with a small number of samples.
Table 4. Impact of training proportion
4. Conclusion
In this paper, we proposed a no-reference image blur assessment method based on multi-scale spatial local features. The image is divided into smooth, edge and texture regions according to the different sensitivity of local content to blur distortion; not only is the similarity between the original image and its Gaussian scale space images compared, but the energy variation between the Gaussian scale space images is also considered. Moreover, the distribution characteristics of the local maximum gradient at multiple resolutions are calculated to simulate the effect of viewing distance on image blur quality. Experimental results on four synthetically blurred databases and one real blurred database show that our method achieves higher prediction accuracy than previous state-of-the-art no-reference image blur assessment methods, especially on the real blurred database. In addition, the contributions of each kind of feature and of the integrated features to the quality model are demonstrated by extensive experiments.
References
- A. C. Bovik and Z. Wang, "Modern image quality assessment," Synthesis Lectures on Image, Video, and Multimedia Processing, vol. 2, no. 1, pp. 1-156, January, 2006. https://doi.org/10.2200/S00010ED1V01Y200508IVM003
- S. Athar and Z. Wang, "A comprehensive performance evaluation of image quality assessment algorithms," IEEE Access, vol. 7, pp. 140030-140070, September, 2019. https://doi.org/10.1109/ACCESS.2019.2943319
- P. Marziliano, F. Dufaux, S. Winkler and T. Ebrahimi, "Perceptual blur and ringing metrics: application to JPEG2000," Signal Processing Image Communication, vol. 19, no. 2, pp. 162-173, February, 2004.
- R. Ferzli and L. J. Karam, "A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB)," IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, March, 2009. https://doi.org/10.1109/TIP.2008.2011760
- N. D. Narvekar and L. J. Karam, "A no-reference blur metric based on the cumulative probability of blur detection (CPBD)," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2678-2683, March, 2011. https://doi.org/10.1109/TIP.2011.2131660
- L. Li, W. Lin, X. Wang, G. Yang, K. Bahrami and A. C. Kot, "No-reference image blur assessment based on discrete orthogonal moments," IEEE Transactions on Cybernetics, vol. 46, no. 1, pp. 39-50, January, 2016. https://doi.org/10.1109/TCYB.2015.2392129
- L. Li, D. Wu, J. Wu, H. Li, W. Lin and A. C. Kot, "Image sharpness assessment by sparse representation," IEEE Transactions on Multimedia, vol. 18, no. 6, pp. 1085-1097, June, 2016. https://doi.org/10.1109/TMM.2016.2545398
- Y. Zhan and R. Zhang, "No-reference image sharpness assessment based on maximum gradient and variability of gradients," IEEE Transactions on Multimedia, vol. 20, no. 7, pp.1796-1808, July, 2018. https://doi.org/10.1109/tmm.2017.2780770
- K. Bahrami and A. C. Kot, "A fast approach for no-reference image sharpness assessment based on maximum local variation," IEEE Signal Processing Letters, vol. 21, no. 6, pp.751-755, June, 2014. https://doi.org/10.1109/LSP.2014.2314487
- K. Gu, G. Zhai, W. Lin, X. Yang and W. Zhang, "No-reference image sharpness assessment in autoregressive parameter space," IEEE Transactions on Image Processing, vol. 24, no. 10, pp. 3218-3231, October, 2015. https://doi.org/10.1109/TIP.2015.2439035
- M. Hosseini, Y. Zhang, and K. Plataniotis, "Encoding visual sensitivity by MaxPol convolution filters for image sharpness assessment," IEEE Transactions on Image Processing, vol. 28, no. 9, pp. 4510-4525, September, 2019. https://doi.org/10.1109/tip.2019.2906582
- R. Hassen, Z. Wang and M. M. A. Salama, "Image sharpness assessment based on local phase coherence," IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2798-2810, July, 2013. https://doi.org/10.1109/TIP.2013.2251643
- G. Gvozden, S. Grgic and M. Grgic, "Blind image sharpness assessment based on local contrast map statistics," Journal of Visual Communication and Image Representation, vol. 50, pp. 145-158, January, 2018. https://doi.org/10.1016/j.jvcir.2017.11.017
- F. Kerouh, D. Ziou and A. Serir, "Histogram modeling-based no reference blur quality measure," Signal Processing Image Communication, vol. 60, pp. 22-28, September, 2017. https://doi.org/10.1016/j.image.2017.08.014
- A. K. Moorthy and A. C. Bovik, "Blind image quality assessment: from natural scene statistics to perceptual quality," IEEE Transactions on Image Processing, vol. 20, no. 12, pp. 3350-3364, December, 2011. https://doi.org/10.1109/TIP.2011.2147325
- A. Mittal, A. K. Moorthy and A. C. Bovik, "No-reference image quality assessment in the spatial domain," IEEE Transactions on Image Processing, vol. 21, no. 12, pp. 4695-4708, December, 2012. https://doi.org/10.1109/TIP.2012.2214050
- L. Li, D. Wu, J. Wu, J. Qian, B. Chen, "No-reference image quality assessment with a gradient-induced dictionary," KSII Transactions on Internet and Information Systems, vol. 10, no. 1, pp. 288-307, January, 2016. https://doi.org/10.3837/tiis.2016.01.017
- G. Yue, C. Hou, K. Gu and N. Ling, "No reference image blurriness assessment with local binary patterns," Journal of Visual Communication and Image Representation, vol. 49, pp. 382-391, September, 2017. https://doi.org/10.1016/j.jvcir.2017.09.011
- L. Li, W. Xia, W. Lin, Y. Fang and S. Wang, "No-reference and robust image sharpness evaluation based on multiscale spatial and spectral features," IEEE Transactions on Multimedia, vol. 19, no. 5, pp. 1030-1040, May, 2017. https://doi.org/10.1109/TMM.2016.2640762
- S. Bosse, D. Maniry, K. Muller, K. Robert, T. Wiegand and W. Samek, "Deep neural networks for no-reference and full-reference image quality assessment," IEEE Transactions on Image Processing, vol. 27, no. 1, pp. 206-219, January, 2018. https://doi.org/10.1109/TIP.2017.2760518
- K. Ma, W. Liu, K. Zhang, Z. Duanmu, Z. Wang and W. Zuo, "End-to-end blind image quality assessment using deep neural networks," IEEE Transactions on Image Processing, vol. 27, no. 3, pp. 1202-1213, March, 2018. https://doi.org/10.1109/TIP.2017.2774045
- Y. Ma, X. Cai, F. Sun and S. Hao, "No-Reference image quality assessment based on multi-task generative adversarial network," IEEE Access, vol. 7, pp. 146893-146902, September, 2019. https://doi.org/10.1109/ACCESS.2019.2942625
- W. Zhang, K. Ma, J. Yan, D. Deng and Z. Wang, "Blind image quality assessment using a deep bilinear convolutional neural network," IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 1, pp. 36-47, January, 2020. https://doi.org/10.1109/tcsvt.2018.2886771
- T. Lindeberg, "Scale-space theory: a basic tool for analyzing structures at different scales," Journal of Applied Statistics, vol. 21, no. 2, pp. 225-270, September, 1994. https://doi.org/10.1080/757582976
- K. Sharifi and A. Leon-garcia, "Estimation of shape parameter for generalized Gaussian distributions in sub-band decompositions of video," IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, no. 1, pp. 52-56, February, 1995. https://doi.org/10.1109/76.350779
- N. E. Lasmar, Y. Stitou, and Y. Berthoumieu, "Multiscale skewed heavy tailed model for texture analysis," in Proc. of 16th IEEE International Conference on Image Processing, pp. 2281-2284, November 7-10, 2009.
- C. C. Chang and C. J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, pp. 1-27, May, 2011. https://doi.org/10.1145/1961189.1961199
- H. R. Sheikh, Z. Wang, L. Cormack, A.C. Bovik, "LIVE image quality assessment database release 2," 2006.
- N. Ponomarenko, V. Lukin, A. Zelensky, K. Egiazarian, M. Carli and F. Battisti, "TID2008-a database for evaluation of full-reference visual quality assessment metrics," Advances of Modern Radioelectronics, vol. 10, no. 4, pp. 30-45, February, 2009.
- N. Ponomarenko, L. Jin, O. Ieremeiev, V. Lukin, K. Egiazarian, J. Astola, B. Vozel, K. Chehdi, M. Carli, F. Battisti and C.-C. J. Kuo, "Image database TID2013: Peculiarities, results and perspectives," Signal Processing Image Communication, vol. 30, pp. 57-77, January, 2015. https://doi.org/10.1016/j.image.2014.10.009
- E. C. Larson and D. M. Chandler, "Categorical subjective image quality CSIQ database," 2009.
- A. Ciancio, A. L. N. T. Targino da Costa, E. A. B. da Silva, A. Said, R. Samadani and P. Obrador, "No-reference blur assessment of digital pictures based on multifeature classifiers," IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 64-75, January, 2011. https://doi.org/10.1109/TIP.2010.2053549
- H. R. Sheikh, M. F. Sabir and A. C. Bovik, "A statistical evaluation of recent full reference image quality assessment algorithms," IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3440-3451, November, 2006.