Vehicle Face Recognition Algorithm Based on Weighted Nonnegative Matrix Factorization with Double Regularization Terms

  • Shi, Chunhe (College of Information Science and Engineering, Northeastern University) ;
  • Wu, Chengdong (Faculty of Robot Science and Engineering, Northeastern University)
  • Received : 2019.09.18
  • Accepted : 2020.03.11
  • Published : 2020.05.31

Abstract

In order to judge whether the vehicles captured in different surveillance images represent the same vehicle, we propose a novel vehicle face recognition algorithm based on an improved Nonnegative Matrix Factorization (NMF). Unlike the whole-vehicle images used by traditional vehicle recognition algorithms, a vehicle face image generally contains fewer effective features, which makes recognition more difficult. The innovations mainly include the following two aspects: 1) we propose the idea that the vehicle type can be determined by a few key regions of the vehicle face, such as the logo and grille; 2) by adding weight, sparseness and classification property constraints to the NMF model, we can acquire effective feature bases that represent the key regions of the vehicle face image. Experimental results show that the proposed algorithm not only achieves a high correct recognition rate, but also is strongly robust to non-cooperative factors such as illumination variation.

1. Introduction

With the rapid growth in the number of vehicles, fake plate vehicles have become more common, and a large number of surveillance cameras have been installed on roads to monitor such violations [1]. However, it is very difficult to process so many videos in time with traditional manual methods, so it is of great significance to design an automatic fake plate vehicle detection algorithm for surveillance video. Constrained by the installation position and angle of a surveillance camera, often only the vehicle face area can be captured, as shown in Fig. 1.

Fig. 1. The captured vehicle image

Obviously, the vehicle face area contains fewer features compared with the whole vehicle, which complicates multi-category vehicle recognition, so we need to pay more attention to the characteristics of several key regions of the vehicle face image, such as the logo, grille, lights and rearview mirrors. In other words, the vehicle can be recognized by these key regions, as shown in Fig. 2. From the above analysis, it is very important to establish basis images of these key regions, so that each vehicle image can be represented by a linear superposition of the basis images. In addition, we hope that, while a suitable set of basis images is acquired, the new features obtained by the decomposition will be conducive to the correct recognition of vehicle face images. Therefore, the major innovation of this paper is that, for multi-category vehicle face images with limited annotations, feature bases representing the key regions of the vehicle face are acquired with an improved NMF model that meets the requirements on both the basis images and the new features.

Fig. 2. The relation between the whole vehicle face and some key regions

The remainder of this paper is organized as follows. In Section 2, some related work is reviewed. The original feature extraction from vehicle face images is described in Section 3. In Section 4, an improved NMF method for vehicle face recognition is proposed. Section 5 solves the proposed NMF objective function with a projected gradient method. The effectiveness of the proposed algorithm is demonstrated experimentally in Section 6. Finally, the conclusion is drawn in Section 7.

2. Related Work

The development of vehicle recognition technology has gone through two key stages, based on traditional hand-crafted feature extraction and on deep learning respectively.

(1) Hand-crafted feature extraction and classification. As an important global feature, color information has been widely used in vehicle recognition. For instance, Baek [2] and Kim [3] extracted color histograms as the vehicle feature from the RGB and HSV color spaces respectively. However, the vehicle color in a captured image is susceptible to light, so the specular-free image and the weighted-light-influence image were introduced by Hu [4], which make the extracted color features more robust to illumination variation. Besides color information, the texture, edge and shape of the vehicle were also used as important global features by Chen [5], Negri [6] and Zhang [7]. With the increasing number of vehicle types, some vehicles of different types show only minor visual differences, so it is necessary to describe vehicle details by extracting local features. Lam [8] proposed a multi-scale spatial model to describe local vehicle texture; also using multi-scale theory, Psyllos [9] realized vehicle logo recognition based on the Scale-Invariant Feature Transform (SIFT) feature. For non-cooperative factors such as partial occlusion and attitude or angle variation, the Deformable Part Model (DPM) and part-based feature descriptors were adopted by Li [10] and Zhang [11], which make the recognition algorithms robust to these factors. In addition, in order to represent the vehicle structure better, Leotta [12] and Yebes [13] modeled the vehicle in three-dimensional space.

(2) Vehicle recognition based on deep learning. The purpose of deep learning is to analyze the image layer by layer, from shallow to deep, by simulating the brain, thereby improving recognition accuracy [14]. In recent years, deep learning has been widely used in vehicle recognition. For instance, Zhang [15] and Hu [16] achieved accurate vehicle body color recognition by combining Convolutional Neural Network (CNN) models with spatial pyramid models; Liu [17] and Hu [18] achieved vehicle recognition based on Fast Region-based Convolutional Neural Networks (Fast R-CNN) and Boltzmann machines respectively; in addition, the Deep Belief Network (DBN) model was adopted by Wu [19] and Wang [20] to realize vehicle classification. In brief, the deep learning models mentioned above have all achieved certain results to varying degrees.

From the above analysis, we can see that most researchers have focused on recognition based on the whole vehicle image, and there are relatively few studies on vehicle face recognition at present. Therefore, it is of great significance to propose an effective vehicle face recognition algorithm.

3. Original Feature Extraction

Influenced by illumination variation, the same vehicle may show some color difference in different captured images, as shown in Fig. 3, which reduces the effectiveness of color feature-based recognition algorithms. In addition, when the number of annotated samples is limited, deep learning-based algorithms also struggle to achieve good results. Therefore, we pay more attention to the local regions with significant features, such as the logo, grille, lights and rearview mirrors.

Fig. 3. Vehicle color variation under different light intensities

According to image processing knowledge, regions are surrounded by edges, and edges are formed by high-frequency pixels with similar gradient directions, so it is critical for original feature extraction to represent the frequency information of these pixels accurately [21]. Because the Histogram of Oriented Gradients (HOG) takes both the frequency and direction features into account, it is reasonable to use HOG as the original feature of the vehicle face image.

First, the vehicle face area is segmented from the captured image based on the Faster R-CNN model [22] and normalized to N × N pixels in size, as shown in Fig. 4;

Fig. 4. Image preprocessing

Then, the preprocessed image is divided into blocks, each of which is M × M pixels in size, and adjacent blocks overlap by T pixels. As a result, the number of blocks k can be obtained by Eq. 1.

\(k=\left(\left\lfloor\frac{N-M}{M-T}\right\rfloor+1\right) \times\left(\left\lfloor\frac{N-M}{M-T}\right\rfloor+1\right)\)       (1)

In addition, supposing the number of angle intervals is t, the original feature vector dimension will be n = k × t.
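As a quick sanity check of Eq. 1, the following Python sketch (our own illustration) computes k and n; the numeric values plugged in are the empirical parameters given later in Section 6.2.

```python
def hog_dims(N, M, T, t):
    """Blocks per image (Eq. 1) and original feature dimension n = k * t."""
    blocks_per_side = (N - M) // (M - T) + 1   # floor division implements the floor in Eq. 1
    k = blocks_per_side ** 2
    return k, k * t

# With N = 256, M = 32, T = 8 and t = 8 (Section 6.2):
k, n = hog_dims(256, 32, 8, 8)   # k = 100 blocks, n = 800 dimensions
```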

4. Feature Dimension Reduction Based on Improved NMF

It is very important for vehicle recognition to acquire effective feature bases that represent the key regions of the vehicle face. Common dimension reduction methods include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), etc. [23-24], where the elements of the decomposition matrices can be positive or negative. The negative elements are acceptable mathematically, but are hard to explain when building basis images because they lack physical meaning [25]. For example, we know that a face image can be constructed as a linear superposition of basis images, where neither the pixel values nor the weight coefficients in the factorization matrices should be negative. Therefore, it is more appropriate to adopt an NMF-based dimension reduction method. The idea of NMF is that a given nonnegative matrix can be approximately represented by the product of two nonnegative matrices, as in Eq. 2 [26],

\(\boldsymbol{Y}_{n \times m} \approx \boldsymbol{U}_{n \times r} \boldsymbol{V}_{r \times m}, \quad \text { s.t. } u_{k i} \geq 0, v_{i j} \geq 0\)       (2)

where the column vectors of Y are the original feature vectors, the column vectors of U and V represent the basis vectors and the weighted coefficient vectors respectively, and the decomposition error should be small enough, as in Eq. 3.

\(\boldsymbol{U}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{V}\|_{2}\)       (3)
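For reference, plain NMF under Eq. 2 and Eq. 3 can be solved with the standard Lee-Seung multiplicative updates, which the improved updates in Section 5 generalize; a minimal sketch (the random initialization and the small epsilon guard against division by zero are our choices) is:

```python
import numpy as np

def nmf(Y, r, n_iter=500, eps=1e-10):
    """Approximate Y (n x m, nonnegative) by U (n x r) @ V (r x m), per Eq. 2-3."""
    n, m = Y.shape
    rng = np.random.default_rng(0)
    U, V = rng.random((n, r)), rng.random((r, m))
    for _ in range(n_iter):
        U *= (Y @ V.T) / (U @ V @ V.T + eps)   # multiplicative updates keep U, V nonnegative
        V *= (U.T @ Y) / (U.T @ U @ V + eps)
    return U, V
```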

In order to make the decomposition more conducive to accurate recognition, it is necessary to add some appropriate constraint conditions to the decomposition besides the nonnegative constraint. Here, we can consider this problem from the following three aspects.

(1) Weighted constraint. The decomposed basis vectors can be considered to represent different regions of the vehicle face, and these regions differ in importance during recognition, so it is reasonable to add a weighted constraint to the basis vectors, improving the objective function Eq. 3 to Eq. 4,

\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{U, V, Z}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}\)       (4)

where Z represents the weight matrix.

(2) Sparse constraint. Generally, only a small number of basis vectors, which are considered to represent the key regions of the vehicle face, play important roles in recognition, so it is necessary to add a sparse constraint to the weight matrix Z, and the objective function Eq. 4 can be improved to Eq. 5.

\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{0}\right\}\)       (5)

According to compressed sensing theory [27], solving the matrix 0-norm is an NP-hard problem, so we use the 2-norm instead of the 0-norm when enforcing the sparsity of the matrix, and Eq. 5 is further improved to Eq. 6,

\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}\right\}\)       (6)

where α is a balance parameter.

(3) Classification property constraint. According to pattern recognition theory, the features of samples with the same label should be as similar as possible [28-29]. Therefore, we add within-class similarity and inter-class distinction measures to the objective function, and the final objective function is shown as Eq. 7,

\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}+\frac{\beta}{2}\left(f_{i}(\boldsymbol{V})-f_{e}(\boldsymbol{V})\right)\right\}\)       (7)

where β is another balance parameter besides α, and \(f_i(\boldsymbol{V})\) and \(f_e(\boldsymbol{V})\) represent the within-class similarity and inter-class distinction measures respectively. Next, we give the detailed functional forms of \(f_i(\boldsymbol{V})\) and \(f_e(\boldsymbol{V})\).

The within-class similarity function fi(V):

The auxiliary matrix A is required as shown in Eq.8,

\(\boldsymbol{A}=\left[\begin{array}{llll} \boldsymbol{A}_{1} & & & \\ & \boldsymbol{A}_{2} & & \\ & & \ddots & \\ & & & \boldsymbol{A}_{c} \end{array}\right]_{m \times m}\)       (8)

\(\boldsymbol{A}_{i}=\left[\begin{array}{cccc} \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \\ \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \end{array}\right]_{d_{i} \times d_{i}}, \quad i=1,2, \ldots, c\)       (9)

\(\boldsymbol{V} \boldsymbol{A}=\left[\begin{array}{lllllll} \overline{\boldsymbol{V}}_{1} & \overline{\boldsymbol{V}}_{1} & \cdots & \overline{\boldsymbol{V}}_{1} & \overline{\boldsymbol{V}}_{2} & \cdots & \overline{\boldsymbol{V}}_{c} \end{array}\right]_{r \times m}\)       (10)

where c and di are respectively the number of labels and the number of samples with label i in Y, and \(\overline{\boldsymbol{V}}_{i}\) represents the average feature vector of the samples with label i.

\(f_{i}(\boldsymbol{V})=\|\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A}\|_{2}\)       (11)

The inter-class distinction function fe(V) :

The auxiliary matrix B is required as shown in Eq.12,

\(\boldsymbol{B}=\left[\begin{array}{cccc} \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \\ \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \end{array}\right]_{m \times m}\)       (12)

\(\boldsymbol{V B}=\left[\begin{array}{llll} \overline{\boldsymbol{V}} & \overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}} \end{array}\right]\)       (13)

where \(\overline{\boldsymbol{V}}\) is the average feature vector of all samples.

\(\boldsymbol{V A}-\boldsymbol{V B}=\left[\begin{array}{lllllll} \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \overline{\boldsymbol{V}}_{2}-\overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}}_{c}-\overline{\boldsymbol{V}} \end{array}\right]_{r \times m}\)       (14)

\(f_{e}(\boldsymbol{V})=\|\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B}\|_{2}\)       (15)

In summary, the objective function can be written in the form of Eq.16.

\(\begin{aligned} \boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*} &=\underset{U, Z, V}{\arg \min } J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V}) \\ &=\underset{U, Z, V}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}+\frac{\beta}{2}\left(\|\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A}\|_{2}-\|\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B}\|_{2}\right) \end{aligned}\)       (16)
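To make the auxiliary matrices concrete, the following sketch (our own illustration; it assumes the columns of Y carry an integer label vector) builds A and B and evaluates the objective J of Eq. 16, with the norms taken as squared Frobenius norms as in the trace form of Eq. 17:

```python
import numpy as np

def build_A(labels):
    """Block-diagonal A of Eq. 8-9: a 1/d_i block for each group of same-label samples."""
    labels = np.asarray(labels)
    m = len(labels)
    A = np.zeros((m, m))
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        A[np.ix_(idx, idx)] = 1.0 / len(idx)
    return A

def build_B(m):
    """B of Eq. 12: every entry 1/m, so VB repeats the global mean feature vector."""
    return np.full((m, m), 1.0 / m)

def objective(Y, U, Z, V, A, B, alpha, beta):
    """J(U, Z, V) of Eq. 16."""
    fit = 0.5 * np.sum((Y - U @ Z @ V) ** 2)
    sparse = 0.5 * alpha * np.sum(Z ** 2)
    cls = 0.5 * beta * (np.sum((V - V @ A) ** 2) - np.sum((V @ A - V @ B) ** 2))
    return fit + sparse + cls
```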

5. Objective Function Solution Based on Projected Gradient Method

In order to take the partial derivatives conveniently, the function \(J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})\) can be rewritten as Eq. 17,

\(\begin{aligned} J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})=& \frac{1}{2} \operatorname{tr}\left[(\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V})^{T}(\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V})\right]+\frac{\alpha}{2} \operatorname{tr} \boldsymbol{Z}^{T} \boldsymbol{Z}+\\ & \frac{\beta}{2}\left\{\operatorname{tr}\left[(\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A})^{T}(\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A})\right]-\operatorname{tr}\left[(\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B})^{T}(\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B})\right]\right\} \end{aligned}\)       (17)

The partial derivatives are shown in Eq.18, Eq.19 and Eq.20.

\(\frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{U}}=-\boldsymbol{Y} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}+\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\)       (18)

\(\frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{Z}}=-\boldsymbol{U}^{T} \boldsymbol{Y} \boldsymbol{V}^{T}+\boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T}+\alpha \boldsymbol{Z}\)       (19)

\(\begin{aligned} \frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{V}} &=-\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{Y}+\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}+\beta \boldsymbol{V}-\beta \boldsymbol{V} \boldsymbol{A}^{T}-\beta \boldsymbol{V} \boldsymbol{A}+\beta \boldsymbol{V} \boldsymbol{A} \boldsymbol{B}^{T}+\\ & \beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{A}^{T}-\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{B}^{T} \end{aligned}\)       (20)

Then, the multiplicative update rules can finally be deduced, as shown in Eq. 21, Eq. 22 and Eq. 23.

\(u_{i j} \leftarrow u_{i j} \frac{\left(\boldsymbol{Y} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\right)_{i j}}{\left(\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\right)_{i j}}\)       (21)

\(z_{i j} \leftarrow z_{i j} \frac{\left(\boldsymbol{U}^{T} \boldsymbol{Y} \boldsymbol{V}^{T}\right)_{i j}}{\left(\boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T}+\alpha \boldsymbol{Z}\right)_{i j}}\)       (22)

\(v_{i j} \leftarrow v_{i j} \frac{\left(\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{Y}+\beta \boldsymbol{V} \boldsymbol{A}+\beta \boldsymbol{V} \boldsymbol{A}^{T}+\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{B}^{T}\right)_{i j}}{\left(\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}+\beta \boldsymbol{V}+\beta \boldsymbol{V} \boldsymbol{A} \boldsymbol{B}^{T}+\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{A}^{T}\right)_{i j}}\)       (23)
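Continuing the sketch above, the three update rules of Eq. 21, Eq. 22 and Eq. 23 translate directly into NumPy; the small epsilon in each denominator is our addition to avoid division by zero:

```python
def update_step(Y, U, Z, V, A, B, alpha, beta, eps=1e-10):
    """One round of the multiplicative updates of Eq. 21-23 (updates U, Z, V in place)."""
    U *= (Y @ V.T @ Z.T) / (U @ Z @ V @ V.T @ Z.T + eps)                        # Eq. 21
    Z *= (U.T @ Y @ V.T) / (U.T @ U @ Z @ V @ V.T + alpha * Z + eps)            # Eq. 22
    V *= (Z.T @ U.T @ Y + beta * (V @ A + V @ A.T + V @ B @ B.T)) / (
         Z.T @ U.T @ U @ Z @ V + beta * (V + V @ A @ B.T + V @ B @ A.T) + eps)  # Eq. 23
    return U, Z, V
```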

After the iteration rules are determined, the training and recognition methods of the proposed NMF model can be summarized as shown in Method 1 and Method 2 respectively.

Method 1

Input: All original feature vectors and their labels in Y, and the two balance parameters α and β.

Step.1: Initialize U(0), Z(0), V(0), the maximum number of iterations nmax, and the error threshold e. In addition, the count variable t is set to 0.

Step.2: t = t + 1.

Step.3: Solve J(U(t), Z(t), V(t)).

if J(U(t), Z(t), V(t)) < e or t > nmax

goto Step.5

else

goto Step.4

Step.4: Update U, Z and V.

\(u_{i j}^{(t+1)} \leftarrow u_{i j}^{(t)} \frac{\left(\boldsymbol{Y}\left(\boldsymbol{V}^{(t)}\right)^{T} \boldsymbol{Z}^{(t) T}\right)_{i j}}{\left(\boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}\left(\boldsymbol{V}^{(t)}\right)^{T}\left(\boldsymbol{Z}^{(t)}\right)^{T}\right)_{i j}}\)

\(z_{i j}^{(t+1)} \leftarrow z_{i j}^{(t)} \frac{\left(\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{Y}\left(\boldsymbol{V}^{(t)}\right)^{T}\right)_{i j}}{\left(\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}\left(\boldsymbol{V}^{(t)}\right)^{T}+\alpha \boldsymbol{Z}^{(t)}\right)_{i j}}\)

\(v_{i j}^{(t+1)} \leftarrow v_{i j}^{(t)} \frac{\left(\left(\boldsymbol{Z}^{(t)}\right)^{T}\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{Y}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A}^{T}+\beta \boldsymbol{V}^{(t)} \boldsymbol{B} \boldsymbol{B}^{T}\right)_{i j}}{\left(\left(\boldsymbol{Z}^{(t)}\right)^{T}\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}+\beta \boldsymbol{V}^{(t)}+\beta \boldsymbol{V}^{(t)} \boldsymbol{B} \boldsymbol{A}^{T}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A} \boldsymbol{B}^{T}\right)_{i j}}\)

goto Step.2

Step.5: Acquire U, Z and V.

End training.
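Putting the sketches together, Method 1 reduces to the loop below; the random initialization is our choice, we model the weight matrix Z as a full r × r matrix (the dimensions implied by Eq. 4), nmax = 25000 follows Section 6.2, and the error threshold e is a placeholder:

```python
def train(Y, labels, r, alpha, beta, n_max=25000, e=1e-4):
    """Method 1: iterate Eq. 21-23 until J < e or the iteration budget is spent."""
    n, m = Y.shape
    rng = np.random.default_rng(0)
    U, Z, V = rng.random((n, r)), rng.random((r, r)), rng.random((r, m))
    A, B = build_A(labels), build_B(m)
    for _ in range(n_max):
        if objective(Y, U, Z, V, A, B, alpha, beta) < e:
            break
        U, Z, V = update_step(Y, U, Z, V, A, B, alpha, beta)
    return U, Z, V
```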

Method 2

Input: The original feature vector Yw of the unknown vehicle face image, and the similarity threshold ξ.

Step 1: Solve \(\boldsymbol{V}_{w}=\left(\boldsymbol{Z}^{* T} \boldsymbol{U}^{* T} \boldsymbol{U}^{*} \boldsymbol{Z}^{*}\right)^{-1} \boldsymbol{Z}^{* T} \boldsymbol{U}^{* T} \boldsymbol{Y}_{w}\) .

Step 2: if \(D\left(\boldsymbol{V}_{w}, \overline{\boldsymbol{V}}_{i}\right)=\max \left\{D\left(\boldsymbol{V}_{w}, \overline{\boldsymbol{V}}_{j}\right), j=1,2, \ldots, c\right\}\) and \(D\left(\boldsymbol{V}_{w}, \overline{\boldsymbol{V}}_{i}\right) \geq \xi\)

The label of Yw is i;

else

There is no matching result of Yw in the data set;

where \(D\left(\boldsymbol{V}_{i}, \boldsymbol{V}_{j}\right)=\frac{\left\langle\boldsymbol{V}_{i}, \boldsymbol{V}_{j}\right\rangle}{\left\|\boldsymbol{V}_{i}\right\| \cdot\left\|\boldsymbol{V}_{j}\right\|}\) is the cosine similarity.

According to [25], the iteration has been proved to be convergent.
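Method 2 thus amounts to a least-squares projection followed by cosine-similarity matching; a sketch under the same assumptions as above (class_means holds the per-label average feature vectors computed from the training V, and the default ξ follows Section 6.2):

```python
def recognize(Yw, U, Z, class_means, xi=0.86):
    """Method 2: project the unknown sample (Step 1) and match by cosine similarity."""
    UZ = U @ Z
    Vw = np.linalg.solve(UZ.T @ UZ, UZ.T @ Yw)               # Step 1, via normal equations
    sims = [float(np.dot(Vw, Vi) / (np.linalg.norm(Vw) * np.linalg.norm(Vi) + 1e-10))
            for Vi in class_means]
    best = int(np.argmax(sims))
    return best if sims[best] >= xi else None                # None: no match in the data set
```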

6. Experiment

6.1 Data Set

At present, there is no large-scale public data set of vehicle face images, so we build a new one, in which all vehicle face images were taken from 22 surveillance cameras distributed on different roads. The number of captured images is 103028, of which 80197 are effective; some samples are shown in Fig. 5.

Fig. 5. Partial samples in data set

Among the effective samples, there are 4136 pairs of vehicle face images in which both images capture the same vehicle, so we select these pairs as the positive test samples. In addition, 5000 pairs of vehicle face images in which the two images capture different vehicles are selected as the negative test samples.

6.2 Parameters Setting

In the proposed algorithm, some parameters are determined according to experience, and the others are acquired based on experimental results.

(1) The experience-based parameters.

According to research experience with other recognition problems, the empirical values of some parameters in the proposed algorithm can be given directly: in Eq. 1, N = 256, M = 32 and T = 8; in Eq. 2, the number of training samples m = 600; in addition, the number of HOG angle intervals t and the maximum number of iterations nmax are set to 8 and 25000 respectively.

(2) The experiment-based parameters.

Besides the above empirical parameters, there are four parameters r, α, β and ξ that need to be determined experimentally; that is, by comparing recognition performance under different parameter values, we obtain the values that yield the best recognition effect.

From the above analysis, it is reasonable to determine the parameters r, α and β according to Eq.24,

\(r^{*}, \alpha^{*}, \beta^{*}=\underset{r, \alpha, \beta}{\arg \min }\left(F_{\text {far }}(r, \alpha, \beta)+F_{f r r}(r, \alpha, \beta)\right)\)       (24)

where r/n ∈ {0.2, 0.3, ..., 0.7}, α, β ∈ {10, 1, 0.1, 0.01}, and Ffar and Ffrr represent the False Accept Rate (FAR) and False Reject Rate (FRR) respectively; the experimental results are shown in Fig. 6.

Fig. 6. Comparison of recognition performances under different parameters

From Fig. 6, when r = 0.4n, α = 0.1, and β = 1, the best recognition performance can be achieved.
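The selection in Eq. 24 is an exhaustive grid search over the listed candidate values; a sketch might look like the following, where evaluate_far_frr is a hypothetical helper that trains the model with the given parameters and measures FAR and FRR on the test pairs:

```python
def select_parameters(Y, labels, n, test_pairs):
    """Grid search of Eq. 24: minimize FAR + FRR over (r, alpha, beta)."""
    best, best_score = None, float("inf")
    for ratio in (0.2, 0.3, 0.4, 0.5, 0.6, 0.7):
        for alpha in (10, 1, 0.1, 0.01):
            for beta in (10, 1, 0.1, 0.01):
                r = int(ratio * n)
                # hypothetical helper: train on Y/labels, score the test pairs
                far, frr = evaluate_far_frr(Y, labels, r, alpha, beta, test_pairs)
                if far + frr < best_score:
                    best, best_score = (r, alpha, beta), far + frr
    return best
```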

Unlike the parameters r, α and β, the similarity threshold ξ can be determined according to Eq. 25, which selects the threshold maximizing the margin between acceptance of genuine pairs and acceptance of impostor pairs,

\(\xi^{*}=\underset{\xi}{\arg \max }\left(G_{g a r}(\xi)-F_{f a r}(\xi)\right)\)       (25)

where Ggar represents Genuine Accept Rate (GAR), and the experimental result is shown in Fig. 7.

Fig. 7. The GAR and FAR curves under different thresholds

It can be seen that when ξ = 0.86, the best classification result can be acquired.
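Eq. 25 can be evaluated directly on the cosine similarity scores of the positive (genuine) and negative (impostor) test pairs; a sketch:

```python
import numpy as np

def select_threshold(genuine_scores, impostor_scores, candidates):
    """Eq. 25: pick the threshold xi maximizing GAR(xi) - FAR(xi)."""
    genuine = np.asarray(genuine_scores)
    impostor = np.asarray(impostor_scores)
    best_xi, best_gap = None, -np.inf
    for xi in candidates:
        gar = np.mean(genuine >= xi)    # fraction of genuine pairs accepted
        far = np.mean(impostor >= xi)   # fraction of impostor pairs accepted
        if gar - far > best_gap:
            best_xi, best_gap = xi, gar - far
    return best_xi
```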

6.3 Comparison and Analysis of Algorithms

After determining all parameters of the algorithm, we compare the proposed algorithm, via FAR-FRR curves, with several dimension reduction methods, namely PCA [30], LDA [24], Sparse NMF (SNMF) [31], Discriminant NMF (DNMF) [25] and t-SNE [32], as well as with some existing vehicle recognition algorithms based on color features [3], SIFT features [9], 3D models [13] and CNNs [15,17]; the comparison results are shown in Fig. 8.

Fig. 8. The performance comparison result of different algorithms

From Fig. 8, we can see that the proposed algorithm outperforms the other algorithms, for the following main reasons:

(1) PCA belongs to unsupervised learning; although it reduces feature dimensions effectively, its contribution to classification is limited, so the recognition effect is unsatisfactory. Different from PCA, LDA belongs to supervised learning and realizes dimension reduction according to feature differences, so recognition based on LDA is better than that based on PCA. As analyzed in Section 4, NMF has been used more and more widely because of its physical meaning, and according to pattern recognition theory, sparse and discriminant constraints are usually added to the NMF model, which further improves the recognition effect. In addition, t-SNE is a nonlinear dimension reduction method that retains the local features of vehicle face images well, so it also improves the recognition effect. In the proposed algorithm, we add a classification property constraint to the NMF model according to the special characteristics of vehicle face images, which makes the features after dimension reduction more conducive to recognition.

(2) Under different illumination conditions, vehicles of the same color show a certain degree of color difference in the captured images, which weakens the effectiveness of color-based recognition algorithms; in other words, those algorithms are weakly robust to illumination variation. Key points such as SIFT points are another important feature besides color, but some vehicle face images contain relatively few feature points, which reduces the feature effectiveness and brings difficulties to recognition. Different from the above manually extracted features, more effective features can be extracted automatically by deep learning methods such as CNNs. However, only a limited number of samples in the data set are annotated, so over-fitting may easily arise during model training, which greatly limits the generality of the model. Finally, since only the vehicle face region is captured, the features extracted based on 3D models are not very effective.

According to the above experimental results and analysis, the proposed algorithm achieves better recognition results in terms of both accuracy and robustness.

7. Conclusion

Based on the proposed idea that the vehicle type can be determined by a few key regions of the vehicle face image, we acquire a set of effective feature bases through the improved NMF model and achieve correct recognition of vehicle faces, that is, accurate detection of fake plate vehicles. Therefore, the proposed algorithm is of great significance both in theoretical research and in practical application. However, although good results have been achieved, some problems remain to be solved to further improve the generality of the algorithm: the scale of the data set needs to be expanded, and vehicles of the same type with different license plates need to be differentiated precisely.

References

  1. M. Swathy, P. S. Nirmala and P. C. Geethu, "Survey on vehicle detection and tracking techniques in video surveillance," International Journal of Computer Applications, vol. 160, no. 7, pp. 22-25, 2017. https://doi.org/10.5120/ijca2017913086
  2. N. Baek, S. M. Park, K. J. Kim and S. B. Park, "Vehicle color classification based on the support vector machine method," in Proc. of International Conference on Intelligent Computing, pp. 1133-1139, August 21-24, 2007.
  3. K. J. Kim, S. M. Park and Y. J. Choi, "Deciding the number of color histogram bins for vehicle color recognition," in Proc. of Asia-Pacific Services Computing Conference, pp. 134-138, December 9-12, 2008.
  4. W. Hu, J. Yang, L. Bai and L. X. Yao, "A new approach for vehicle color recognition based on specular-free image," in Proc. of the Sixth International Conference on Machine Vision, pp. 90671Q, December 24, 2013.
  5. P. Chen, X. Bai and W. Liu, "Vehicle color recognition on urban road by feature context," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2340-2346, 2014. https://doi.org/10.1109/TITS.2014.2308897
  6. P. Negri, X. Clady, M. Milgram and R. Poulenard, "An oriented-contour point based voting algorithm for vehicle type classification," in Proc. of International Conference on Pattern Recognition, pp. 574-577, August 20-24, 2006.
  7. B. Zhang, "Reliable classification of vehicle types based on cascade classifier ensembles," IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 1, pp. 322-332, 2013. https://doi.org/10.1109/TITS.2012.2213814
  8. W. W. L. Lam, C. C. C. Pang and N. H. C. Yung, "Vehicle-Component Identification Based on Multiscale Textural Couriers," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 4, pp. 681-694, 2007. https://doi.org/10.1109/TITS.2007.908144
  9. A. P. Psyllos, C. N. E. Anagnostopoulos and E. Kayafas, "Vehicle Logo Recognition Using a SIFT-Based Enhanced Matching Scheme," IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 2, pp. 322-328, 2010. https://doi.org/10.1109/TITS.2010.2042714
  10. L. J. Li, H. Su and E. P. Xing, "Object bank: a high-level image representation for scene classification and semantic feature sparsification," Advances in Neural Information Processing Systems, pp. 1378-1386, 2010.
  11. N. Zhang, R. Farrell, F. Iandola and T. Darrell, "Deformable part descriptors for fine-grained recognition and attribute prediction," in Proc. of 2013 IEEE International Conference on Computer Vision, pp. 729-736, December 1-8, 2013.
  12. M. J. Leotta and J. L. Mundy, "Vehicle surveillance with a generic, adaptive, 3D vehicle model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 7, pp. 1457-1469, 2011. https://doi.org/10.1109/TPAMI.2010.217
  13. J. J. Yebes, L. M. Bergasa and M. Garca-garrido, "Visual object recognition with 3D-aware features in KITTI urban scenes," Sensors, vol. 15, no. 4, pp. 9228-9250, 2015. https://doi.org/10.3390/s150409228
  14. Y. Lecun, Y. Bengio and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015. https://doi.org/10.1038/nature14539
  15. Q. Zhang, Z. Li, J. F. Li, J. Zhang, H. Zhang and X. G. Li, "Vehicle color recognition using Multiple-Layer Feature Representations of lightweight convolutional neural network," Signal Processing, vol. 147, pp. 146-153, 2018. https://doi.org/10.1016/j.sigpro.2018.01.021
  16. C. Hu, X. Bai, L. Qi and P. Chen, "Vehicle Color Recognition With Spatial Pyramid Deep Learning," IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 5, pp. 1-10, 2015. https://doi.org/10.1109/TITS.2015.2393752
  17. M. Liu, C. Yu, H. F. Ling and J. Lei, "Hierarchical joint CNN-based models for fine-grained cars recognition," in Proc. of International Conference on Cloud Computing and Security, pp. 337-347, July 29-31, 2016.
  18. A. Hu, H. Li, F. Zhang and W. Zhang, "Deep Boltzmann machines based vehicle recognition," in Proc. of The 26th Chinese Control and Decision Conference, pp. 3033-3038, May 31-June 2, 2014.
  19. Y. Y. Wu and C. M. Tsai, "Pedestrian, bike, motorcycle, and vehicle classification via deep learning: deep belief network and small training set," in Proc. of 2016 International Conference on Applied System Innovation, pp. 1-4, May 26-31, 2016.
  20. H. Wang, Y. F. Cai and L. Chen, "A Vehicle Detection Algorithm Based on Deep Belief Network," The Scientific World Journal, vol. 2014, pp. 1-7, 2014.
  21. Z. H. Liu, Z. H. Lai, W. H. Ou, K. B. Zhang and R. J. Zheng, "Structured optimal graph based sparse feature extraction for semi-supervised learning," Signal Processing, vol. 170, 107456, 2020. https://doi.org/10.1016/j.sigpro.2020.107456
  22. S. Q. Ren, K. M. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39. no. 6, pp. 1137-1149, 2017. https://doi.org/10.1109/TPAMI.2016.2577031
  23. A. Sharma, K. K. Paliwal and G. C. Onwubolu, "Class-dependent PCA, MDC and LDA: A combined classifier for pattern classification," Pattern Recognition, vol. 39, no. 7, pp. 1215-1229, 2006. https://doi.org/10.1016/j.patcog.2006.02.001
  24. Z. H. Liu, J. J. Wang, G. Liu and L. Zhang, "Discriminative low-rank preserving projection for dimensionality reduction," Applied Soft Computing, vol. 85, 105768, 2019. https://doi.org/10.1016/j.asoc.2019.105768
  25. J. Sun, X. B. Cai and F. M. Sun, "Dual graph-regularized Constrained Nonnegative Matrix Factorization for Image Clustering," KSII Transactions on Internet and Information Systems, vol. 11, no.5, pp. 2607-2627, 2017. https://doi.org/10.3837/tiis.2017.05.017
  26. M. H. Wan, Z. H. Lai, Z. Ming and G. W. Yang, "An improve face representation and recognition method based on graph regularized non-negative matrix factorization," Multimedia Tools and Applications, vol. 78, no. 15, pp. 22109-22126, 2019. https://doi.org/10.1007/s11042-019-7454-2
  27. D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006. https://doi.org/10.1109/TIT.2006.871582
  28. M. H. Wan, M. Li, G. W. Yang, S. Gai and Z. Jin, "Feature extraction using two-dimensional maximum embedding difference," Information Sciences, vol. 274, pp. 55-69, 2014. https://doi.org/10.1016/j.ins.2014.02.145
  29. M. H. Wan, Z. H. Lai, G. W. Yang, Z. J. Yang, F. L. Zhang and H. Zheng, "Local graph embedding based on maximum margin criterion via fuzzy set," Fuzzy Sets and Systems, vol. 318, pp. 120-131, 2017. https://doi.org/10.1016/j.fss.2016.06.001
  30. Y. Tang, C. Z. Zhang, R. S. Gu, P. Li and B. Yang, "Vehicle detection and recognition for intelligent traffic surveillance system," Multimedia tools and applications, vol. 76, no. 4, pp. 5817-5832, 2017. https://doi.org/10.1007/s11042-015-2520-x
  31. F. Esposito, N. Gillis and D. Del Buono, "Orthogonal joint sparse NMF for microarray data analysis," Journal of Mathematical Biology, vol. 79, no. 4, pp. 223-247, 2019. https://doi.org/10.1007/s00285-019-01355-2
  32. L. Hajderanj, I. Weheliye and D. Chen, "A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification," in Proc. of the 2019 8th International Conference on Software and Information Engineering, pp. 232-236, April 9-12, 2019.
