1. Introduction
Face recognition has received considerable attention in pattern recognition and machine learning because of its wide range of applications [1,2]. Over the past several decades, a large number of face recognition methods have been proposed and the field has made much progress. However, face recognition remains a great challenge for two main reasons. First, face image matrices are usually transformed into high-dimensional vectors. This matrix-to-vector transformation causes a rapid increase in dimensionality, the so-called “curse of dimensionality” [3]. Compared with the dimensionality of the data, the number of training samples is much smaller. This high-dimensional Small Sample Size (SSS) problem leads to overfitting or unstable solutions in face recognition [4]. Second, the face images of one individual may contain variations in illumination, posture and expression, so data uncertainty may severely affect recognition performance [5]. In general, the training set does not include all of these variations, and insufficient training samples lead to low accuracy [6-8]. Furthermore, if the variations make the difference between individuals much smaller than the difference within the same individual, performance drops sharply. Adding virtual face images to the training set can partially remedy this difficulty [9].
To overcome the “curse of dimensionality”, feature extraction is applied in face recognition. Principal Component Analysis (PCA) [7] and Linear Discriminant Analysis (LDA) [8] are two of the best-known dimensionality reduction techniques. PCA constructs a series of orthogonal vectors that pursue the best representation of the data, while LDA seeks to separate different classes in a low-dimensional subspace. Researchers have continued to improve PCA and LDA since they were proposed [9-12,51]. Two-dimensional extension and kernel extension are two important directions. Two-Dimensional PCA (2DPCA) [13] and Two-Dimensional LDA (2DLDA) [14] not only preserve the structural information embedded in face images, but also avoid the SSS problem and reduce the computational time. As an improvement of LDA, Uncorrelated Linear Discriminant Analysis (ULDA) [15] has attracted much attention. Although these PCA- and LDA-based methods have been applied successfully, they are linear transformations and are sensitive to noise. High-dimensional data such as face images are not always linearly distributed in the space. To extract non-linear characteristics, kernel techniques have been applied to PCA [16,17] and LDA [18]. Furthermore, much work has been done in recent years to improve the performance of kernel PCA in face recognition. For example, Fang et al. proposed a kernel representation-based nearest neighbor classifier [19]. Lu et al. exploited the symmetry of face images and proposed symmetrical principal component analysis (SPCA) to improve the accuracy [20]. Heo et al. used fuzzy memberships to extend kernel PCA [21]. Reference [22] used a few of the data points to determine which data points can be used, reducing the computation time and memory of kernel PCA.
Besides face recognition, kernel PCA has been used for decentralized fault detection [23,24], denoising [25], identification of severe weather outbreaks [26], similarity-invariant shape recognition [27] and microarray gene data analysis [28].
In recent years, representation-based classification methods, such as sparse representation based classification (SRC) [29] and collaborative representation based classification (CRC) [30], have received much attention in the face recognition field. SRC searches for a “sparse” linear combination of training samples to represent the test sample. In other words, SRC chooses a subset of the training set to represent the test sample through ℓ1 regularization. Experimental results show that SRC outperforms many other algorithms and is robust to occlusion, illumination and noise. However, ℓ1 regularization incurs a high computational cost because it has no simple closed-form solution. Subsequently, SRC with ℓ2,1 regularization [31] and ℓ2 regularization [32,33] was proposed to reduce the computational cost. Compared with conventional SRC, these methods achieve satisfactory and robust face recognition results. Though conventional SRC usually leads to sparser solutions, SRC with ℓ2,1 regularization can obtain group sparsity [34,35]. Moreover, SRC with ℓ2,1 regularization is implemented using ℓ1 and ℓ2 regularization simultaneously [35-38], which differs from conventional SRC. Much research has been done to improve SRC. Xu et al. proposed a supervised sparse representation method with a heuristic strategy [39]. Jiang et al. proposed a sparse method for face recognition based on semi-supervised discriminant analysis [40]. Reference [41] divided all the training samples into several blocks and then determined whether each block is occluded using a linear regression technique. A two-phase test sample sparse representation method was proposed to reduce the high computational cost and improve face recognition performance [42]. Lia et al. proposed a new decision rule for sparse representation. Wang et al. extended SRC to kernel space and proposed multi-kernel learning for sparse representation [43].
Reference [6] introduced a kernel distance to determine the N nearest neighbors of the test sample from the training set in order to realize the “sparseness”. Why does sparse representation achieve high accuracy and robustness in face recognition? Zhang et al. [30] attribute it to the fact that sparse representation draws on all the training samples to represent the test sample. In other words, SRC exploits the similarity of face images to reduce the unreasonable representation residual. This observation leads to collaborative representation based classification (CRC). CRC can play a role similar to that of sparse ℓ0 regularization in enhancing the discrimination of the representation.
However, if a test sample has the same difference vector with respect to the training samples of two classes, SRC cannot distinguish between these classes. To prevent this problem, some researchers have proposed methods that combine sparse representation and kernel techniques to perform classification. In this paper, we propose an improved approach for face recognition based on sparse representation and PCA in kernel space. The proposed method estimates the distances between the test sample and all the training samples based on the collaborative representation idea in kernel space. We select the S training samples that have the S smallest distances. After that, KPCA is used to reduce the dimensionality and extract the most relevant information for classification. The proposed method implements sparse representation under ℓ2 regularization and extracts the most relevant information twice to improve robustness. To test the performance of the proposed method, we compare it with several state-of-the-art methods, including SRC, CRC, KCRC and TPTSR, on the ORL, GT and UMIST face databases. The experimental results show that our method is more effective and robust.
The rest of this paper is organized as follows: Section 2 reviews related work; Section 3 presents the proposed method in detail; Section 4 reports extensive experiments that demonstrate the performance of our method; and Section 5 concludes the paper.
2. Related works
In this section, we review some important works related to ours, including SRC, TPTSR, CRC and Kernel CRC. We suppose that there are c individuals with ni training samples from the ith individual, i=1,2, ⋯ ,c. The training set is X=[x1,x2, ⋯ ,xn], where n is the total number of training samples, Xi is the training subset that contains the training samples from the ith individual, and y is the test sample.
2.1 Sparse representation
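The main assumption of sparse representation is that the training samples from one individual lie on a subspace. A test sample y from the cth class can therefore be represented approximately by the linear span of Xc:
$$y \approx X_c \alpha_c \qquad (1)$$
where αc is the vector of reconstruction coefficients. Eq. (1) can be rewritten in terms of the training samples from all individuals as
$$y \approx X \alpha_0 \qquad (2)$$
where α0=[0T, ⋯, (αc)T, ⋯, 0T]T. Since c is unknown, SRC aims to solve the following ℓ0 minimization problem:
$$\hat{\alpha}_0 = \arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad y = X\alpha \qquad (3)$$
where ||‧||0 denotes the ℓ0 norm, which is the number of non-zero entries in the vector.
Unfortunately, this problem is NP-hard, and even an approximate solution is difficult to obtain. According to compressive sensing [44,45], if the solution α0 is sparse enough, the solution of the ℓ0 minimization problem is approximately equal to that of the ℓ1 minimization problem:
$$\hat{\alpha}_1 = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad y = X\alpha \qquad (4)$$
To account for occlusion, Eq. (4) can be expressed as:
$$\hat{\alpha}_1 = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|y - X\alpha\|_2 \le \varepsilon \qquad (5)$$
where ε>0 is a given tolerance.
After that, the representation residual of each class can be obtained by:
$$r_i(y) = \|y - X_i \hat{\alpha}_{1,i}\|_2, \quad i = 1,2,\cdots,c \qquad (6)$$
where α̂1,i is the coefficient sub-vector associated with Xi. Finally, the test sample is identified by:
$$\text{identity}(y) = \arg\min_i r_i(y) \qquad (7)$$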
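As a minimal illustration of the SRC procedure in this subsection, the following Python sketch solves an unconstrained Lagrangian form of the ℓ1 problem with a plain iterative soft-thresholding (ISTA) loop and then applies the class-wise residual rule. The function names, the penalty `lam`, the iteration count and the toy data are assumptions made for illustration; the constrained problem above is normally handled by a dedicated ℓ1 solver, which this sketch does not reproduce.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def src_classify(X, labels, y, lam=0.01, n_iter=500):
    """Sketch of SRC: l1-regularised coding followed by class-wise residuals.

    X      : (d, n) matrix whose columns are training samples
    labels : (n,) class label of each column
    y      : (d,) test sample
    Assumption: the constrained l1 problem is replaced by the Lagrangian form
    0.5 * ||y - X a||_2^2 + lam * ||a||_1 and solved with ISTA.
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    alpha = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ alpha - y)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    classes = np.unique(labels)
    # Residual of y when only the coefficients of one class are kept.
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ alpha[labels == c])
        for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

# Toy data (hypothetical): two classes with two 2-D samples each.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.9, 0.1])                     # lies between the two class-0 samples
pred, residuals = src_classify(X, labels, y)
print(pred, residuals)
```

The loop makes the cost argument of the Introduction concrete: unlike the ℓ2 case, the coefficients are obtained iteratively rather than from a single linear solve.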
2.2 Collaborative Representation
To reduce the unreasonable representation residual, CRC uses all the training samples to represent the test sample via a linear combination. Compared with SRC, CRC achieves not only high performance but also low computational cost. CRC aims to solve the following ℓ2 minimization problem:
$$\hat{\xi} = \arg\min_{\xi} \|y - X\xi\|_2^2 + \lambda \|\xi\|_2^2 \qquad (8)$$
where λ is the regularization parameter. λ makes the least squares solution much more stable and introduces a certain amount of “sparsity” into ξ̂.
The coefficients are estimated by regularized least squares:
$$\hat{\xi} = (X^T X + \lambda I)^{-1} X^T y \qquad (9)$$
The representation residual ei can be calculated as follows:
$$e_i = \Big\| y - \sum_{p=1}^{n_i} \xi_{p,i} X_{p,i} \Big\|_2 \qquad (10)$$
where ξp,i and Xp,i represent the pth coefficient and the pth training sample from the ith class respectively.
The test sample is assigned to the class with the minimum residual:
$$\text{identity}(y) = \arg\min_i e_i \qquad (11)$$
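The closed-form nature of CRC can be sketched in a few lines of Python. The code below is a minimal illustration under stated assumptions: the function name `crc_classify`, the value of `lam` and the toy data are hypothetical, and the residual is the plain class-wise reconstruction error described above, without any additional normalization.

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.01):
    """Sketch of CRC: ridge-regularised coding over all training samples.

    X      : (d, n) matrix whose columns are training samples
    labels : (n,) class label of each column
    y      : (d,) test sample
    """
    n = X.shape[1]
    # Regularised least squares: (X^T X + lam I) xi = X^T y
    xi = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    classes = np.unique(labels)
    # Class-wise residual using only the coefficients of that class.
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ xi[labels == c])
        for c in classes
    ])
    return classes[np.argmin(residuals)], residuals, xi

# Toy data (hypothetical): two classes with two 2-D samples each.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.9, 0.1])
pred, residuals, xi = crc_classify(X, labels, y)
print(pred, residuals)
```

Because the coefficients come from a single linear solve, the cost is dominated by one n×n system, which is the reason CRC is cheaper than ℓ1-based coding.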
2.3 Kernel Collaborative Representation
The kernel method is an effective technique for extracting nonlinear features and has been applied in computer vision and pattern recognition in recent years [46,47]. With a nonlinear mapping φ, the original data space R is mapped into a higher-dimensional feature space F:
$$\varphi: R \to F, \quad x \mapsto \varphi(x) \qquad (12)$$
Denote the mapped training samples as ϕ=[φ(x1), φ(x2),⋯,φ(xn)].
The objective function of Kernel CRC can be written as:
$$\hat{\xi} = \arg\min_{\xi} \|\varphi(y) - \phi\xi\|_2^2 + \lambda\|\xi\|_2^2 \qquad (13)$$
where λ is the regularization parameter. This problem has the analytical solution
$$\hat{\xi} = (\phi^T\phi + \lambda I)^{-1} \phi^T \varphi(y) \qquad (14)$$
Suppose that there is a kernel k(·,·) induced by the feature mapping function φ, so that k(xi, xj)=φT(xi)φ(xj) represents a nonlinear similarity between two vectors xi and xj.
If we denote
$$K = \phi^T\phi, \quad K_y = \phi^T\varphi(y) \qquad (15)$$
we can use Eq. (16) to calculate the coefficients:
$$\hat{\xi} = (K + \lambda I)^{-1} K_y \qquad (16)$$
Then we can calculate the representation residual for each class by
$$e_i = \Big\| \varphi(y) - \sum_{p=1}^{n_i} \xi_{p,i}\, \varphi(x_{p,i}) \Big\|_2 \qquad (17)$$
where ni is the number of training samples from the ith class, and ξp,i and φ(xp,i) represent the pth coefficient and the pth training sample from the ith class in kernel space respectively.
Finally, we assign the test sample to the class that yields the minimum representation residual:
$$\text{identity}(y) = \arg\min_i e_i \qquad (18)$$
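The kernel version can be sketched in the same way, since the residual in kernel space expands into kernel evaluations only and φ never has to be computed explicitly. The following Python code is an illustration under assumptions: the Gaussian kernel, its width `sigma`, the value of `lam` and the toy data are choices made here for the example and are not prescribed by this section.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel between the columns of A (d, m) and B (d, n); returns (m, n)."""
    sq = (A ** 2).sum(axis=0)[:, None] + (B ** 2).sum(axis=0)[None, :] - 2.0 * A.T @ B
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def kcrc_classify(X, labels, y, lam=0.01, sigma=0.5):
    """Sketch of kernel CRC with an (assumed) Gaussian kernel."""
    K = rbf_kernel(X, X, sigma)                      # n x n kernel matrix
    Ky = rbf_kernel(X, y[:, None], sigma).ravel()    # kernel vector k(x_i, y)
    xi = np.linalg.solve(K + lam * np.eye(K.shape[0]), Ky)
    kyy = 1.0                                        # k(y, y) for the Gaussian kernel
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        m = labels == c
        # ||phi(y) - sum_p xi_p phi(x_p)||^2 written with kernel values only
        sq = kyy - 2.0 * xi[m] @ Ky[m] + xi[m] @ K[np.ix_(m, m)] @ xi[m]
        residuals.append(np.sqrt(max(sq, 0.0)))
    residuals = np.array(residuals)
    return classes[np.argmin(residuals)], residuals

# Toy data (hypothetical): two classes with two 2-D samples each.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.9, 0.1])
pred, residuals = kcrc_classify(X, labels, y)
print(pred, residuals)
```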
2.4 Two-phase test samples sparse representation
The main assumption of TPTSR is that the test sample and its M “nearest neighbors” are probably from the same class. First, TPTSR searches for the M “nearest neighbors” of the test sample via CRC. Second, TPTSR represents the test sample via a linear combination of its M “nearest neighbors”. Finally, the class of the test sample is determined by the representation residual. Compared with CRC and SRC, TPTSR ignores the training samples that are far from the test sample.
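The two phases of TPTSR can be sketched as follows. This is a minimal Python illustration, not the reference implementation: the per-sample deviation ||y − a_i x_i||² used to rank the neighbors, the regularization value `lam`, the function names and the toy data are assumptions made for the example.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Regularised least-squares coefficients of y over the columns of X."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def tptsr_classify(X, labels, y, M, lam=0.01):
    """Sketch of the two-phase test sample representation.

    Phase 1 ranks every training sample by the (assumed) deviation
    ||y - a_i x_i||^2 and keeps the M best ones; phase 2 re-represents y
    with these M samples only and classifies by class-wise residual.
    """
    # Phase 1: collaborative representation over all training samples.
    a = ridge_coefficients(X, y, lam)
    deviation = np.sum((y[:, None] - X * a) ** 2, axis=0)
    neighbours = np.sort(np.argsort(deviation)[:M])
    # Phase 2: representation restricted to the M nearest neighbours.
    Xm, lm = X[:, neighbours], labels[neighbours]
    b = ridge_coefficients(Xm, y, lam)
    classes = np.unique(lm)
    residuals = np.array([
        np.linalg.norm(y - Xm[:, lm == c] @ b[lm == c]) for c in classes
    ])
    return classes[np.argmin(residuals)], neighbours

# Toy data (hypothetical): two classes with two 2-D samples each.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.9, 0.1])
pred, neighbours = tptsr_classify(X, labels, y, M=2)
print(pred, neighbours)
```

Classes that contribute no sample to the M neighbors are never considered in phase 2, which is how far-away training samples are ignored.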
SRC pursues sparse ℓ0 regularization, whereas CRC enhances the discrimination of the representation through ℓ2 regularization. KCRC extends CRC to kernel space to improve the nonlinear classification ability. TPTSR searches for the M “nearest neighbors” of the test sample for CRC. We combine the merits of TPTSR and KPCA to propose an improved kernel principal component analysis method based on sparse representation, which improves the accuracy and robustness of face recognition.
3. The proposed method
3.1 Presentation of the proposed method
In this section, we present the improved kernel principal component analysis method for face recognition. We first search for the S nearest training samples of the test sample in kernel space. Then features are extracted using KPCA. Finally, these features are used to identify the class label of the test sample. The specific steps are as follows:
Step 1 Search S nearest training samples in kernel space
We first assume that all the samples are mapped into a new space by the mapping φ. We also suppose that, in kernel space, the test sample y can be approximately represented by a combination of all the training samples:
$$\varphi(y) \approx \sum_{i=1}^{n} w_i \varphi(x_i) \qquad (19)$$
where wi is the representation coefficient of the ith training sample. Using the regularized least squares algorithm, we can obtain the following solution:
$$W = (K + \gamma I)^{-1} K_y \qquad (20)$$
where W = [w1, w2, ⋯, wn]T, K = (ki,j) is the n×n kernel matrix with ki,j = k(xi,xj) = (φ(xi),φ(xj)), Ky = XTφ(y) = (k(y,x1),k(y,x2), ⋯ ,k(y,xn))T, I is the identity matrix and γ is a small positive constant that makes the least squares solution much more stable. (φ(xi), φ(xj)) stands for the inner product of φ(xi) and φ(xj), and X=[φ(x1) ...φ(xn)]. In order to search for the S nearest training samples of the test sample, we define the distance qi as follows:
From Eq. (21) we know that the smaller qi is, the higher the relevance between the test sample and the ith training sample. So we are able to select the S nearest training samples based on qi. The S samples with the smallest distances are selected and denoted as X′; the number of selected samples that belong to the ith individual is counted as well.
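Step 1 can be sketched in Python as below. Several points are assumptions rather than statements of the method: the distance is taken to be the TPTSR-style deviation carried over to kernel space, q_i = ||φ(y) − w_i φ(x_i)||² = k(y,y) − 2 w_i k(y,x_i) + w_i² k(x_i,x_i), which may differ from the exact form of Eq. (21); the Gaussian kernel, its width `sigma`, the value of `gamma`, the rounding used to obtain S from a ratio `s`, and the toy data are likewise illustrative choices.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel between the columns of A (d, m) and B (d, n); returns (m, n)."""
    sq = (A ** 2).sum(axis=0)[:, None] + (B ** 2).sum(axis=0)[None, :] - 2.0 * A.T @ B
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def select_nearest_in_kernel_space(X, y, s=0.6, gamma=0.01, sigma=0.5):
    """Sketch of Step 1: pick the S training samples closest to y in kernel space."""
    n = X.shape[1]
    K = rbf_kernel(X, X, sigma)
    Ky = rbf_kernel(X, y[:, None], sigma).ravel()
    # Representation coefficients W = (K + gamma I)^{-1} K_y
    W = np.linalg.solve(K + gamma * np.eye(n), Ky)
    kyy = 1.0                                   # k(y, y) for the Gaussian kernel
    # Assumed distance: q_i = ||phi(y) - w_i phi(x_i)||^2 via kernel values
    q = kyy - 2.0 * W * Ky + W ** 2 * np.diag(K)
    S = max(1, int(round(s * n)))               # assumed rounding rule for S
    selected = np.sort(np.argsort(q)[:S])
    return selected, q

# Toy data (hypothetical): two classes with two 2-D samples each.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
y = np.array([0.9, 0.1])
selected, q = select_nearest_in_kernel_space(X, y)
print(selected, q)
```

On this toy set the two samples nearest to y in input space are also the ones retained, so the selection behaves like a kernelized nearest-neighbor filter.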
Step 2 Feature extraction using KPCA.
In this step, we extract features from X′. Similarly to Step 1, we have
Then the eigenvalues λ and eigenvectors v of the covariance matrix C are given by
We select the d eigenvectors that correspond to the d largest eigenvalues. azi denotes the feature extracted from the ith selected training sample, which means [az1,az2, ⋯ ,azS]T=K′C. Similarly, the feature of the test sample φ(y), denoted as bφ(y), is obtained by
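For readers who want to experiment with Step 2, the following Python sketch implements standard kernel PCA with an energy-based choice of d. It is an illustration under assumptions: the kernel matrix is centered in feature space (a common convention that this section does not state explicitly), the Gaussian kernel and its width are arbitrary, and the names `kpca_fit` and `kpca_transform` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel between the columns of A (d, m) and B (d, n); returns (m, n)."""
    sq = (A ** 2).sum(axis=0)[:, None] + (B ** 2).sum(axis=0)[None, :] - 2.0 * A.T @ B
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def kpca_fit(X, sigma=0.5, energy=0.95):
    """Sketch of KPCA on the columns of X, keeping `energy` of the eigenvalue sum."""
    n = X.shape[1]
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                # centred kernel matrix (assumption)
    vals, vecs = np.linalg.eigh(Kc)               # ascending eigenvalues
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > 1e-10                           # drop numerically zero directions
    vals, vecs = vals[keep], vecs[:, keep]
    ratio = np.cumsum(vals) / vals.sum()
    d = min(int(np.searchsorted(ratio, energy)) + 1, len(vals))
    A = vecs[:, :d] / np.sqrt(vals[:d])           # unit-norm axes in feature space
    return {"X": X, "K": K, "A": A, "sigma": sigma}

def kpca_transform(model, Y):
    """Project the columns of Y onto the retained kernel principal axes."""
    X, K, A, sigma = model["X"], model["K"], model["A"], model["sigma"]
    n, m = X.shape[1], Y.shape[1]
    Ky = rbf_kernel(Y, X, sigma)                  # (m, n)
    one_n = np.full((n, n), 1.0 / n)
    one_m = np.full((m, n), 1.0 / n)
    Kyc = Ky - one_m @ K - Ky @ one_n + one_m @ K @ one_n
    return Kyc @ A                                # (m, d) feature matrix

# Toy data (hypothetical) standing in for the selected samples and the test sample.
X = np.array([[1.0, 0.7, 0.0, 0.3],
              [0.0, 0.3, 1.0, 0.7]])
y = np.array([0.9, 0.1])
model = kpca_fit(X)
F_train = kpca_transform(model, X)                # one row of features per training sample
b = kpca_transform(model, y[:, None])             # features of the test sample
print(F_train.shape, b.shape)
```

The rows of `F_train` play the role of the training features and `b` that of the test feature; both are then passed to the classification step.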
Step 3 Classification
In this step, we calculate the distance between bφ(y) and the features azi of each class i, where i=1,2, ⋯ ,c. The test sample is then assigned to the class with the smallest distance.
First, the S nearest training samples are used to represent the test sample sparsely. The representation coefficients are estimated by:
where the regularization parameter makes the least squares solution much more stable. After that, the residual of each class can be calculated by
where the residual of the ith class combines the coefficient of the jth sample of the ith class with the jth feature of the ith class from Â.
Finally, the classification rule, which favors the class with the minimum residual, can be expressed as
3.2 Analysis
Our method has several advantages. First, like conventional kernel methods, the proposed method is able to increase the separability of the samples in kernel space. This means that the distances between different classes are enlarged when the kernel trick is used. The theoretical and experimental results in [12] show that mapping the samples into a higher-dimensional space is beneficial to correct classification. Second, the proposed method extracts the most relevant features for the test sample twice: (1) We select the nearest training samples globally in kernel space. All the samples, including the test sample, are mapped into the kernel space, and the collaborative representation idea is used to select the nearest samples in the new feature space. (2) KPCA is applied to extract the dominant features from the test sample and its S nearest training samples. The sparse representation of the test sample is carried out with ℓ2 regularization, so the extracted features are more suitable for classification.
4. Experimental Classification Results and Analysis
In this section, we compare our method with several state-of-the-art face recognition methods, including SRC, CRC, KCRC and TPTSR, on the ORL [48], GT [49] and UMIST [50] face databases. In our experiments, all the face images were normalized to 32×32 pixels. The first t (t=2,3,4,5, ⋯) samples per individual are used for training and the remaining samples are used for testing. In our method, KPCA is used to reduce the dimensionality while retaining 95% of the energy.
4.1 Experiments on ORL face database
The ORL face database consists of a total of 400 images of 40 people, with 10 images per person. For some individuals, the images were taken at different times. The variations in this database include open or closed eyes, smiling or not smiling, and wearing glasses or no glasses. The face images were also taken with a tolerance for tilting and rotation of the face of up to 20°, and there are variations in scale of up to about 10%. Fig. 1 shows some samples from the ORL face database.
Fig. 1. All the images from one individual in the ORL face database.
In this experiment, we choose the first t (t=2,3,4,5,6) samples of each individual to form the training set and use the remaining samples for testing. The total number of training samples varies with t, so we define the integer S through a sparseness coefficient s: S equals s multiplied by the total number of training samples. In all experiments, s is set to 0.6. The comparative results are summarized in Table 1.
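The split protocol and the relation between s and S can be made concrete with a short Python sketch. The label layout (samples of each individual stored in their original order), the function names and the rounding rule that turns s times the training-set size into an integer are assumptions made for illustration.

```python
import numpy as np

def first_t_split(labels, t):
    """Indices of the first t samples of every individual (train) and the rest (test)."""
    labels = np.asarray(labels)
    train_mask = np.zeros(labels.shape[0], dtype=bool)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)       # samples of individual c, in stored order
        train_mask[idx[:t]] = True
    return np.flatnonzero(train_mask), np.flatnonzero(~train_mask)

def neighbours_from_ratio(s, n_train):
    """Integer S obtained from the sparseness coefficient s (assumed rounding rule)."""
    return max(1, int(round(s * n_train)))

# ORL-like layout: 40 individuals with 10 images each, stored consecutively.
labels = np.repeat(np.arange(40), 10)
train_idx, test_idx = first_t_split(labels, t=5)
S = neighbours_from_ratio(0.6, len(train_idx))
print(len(train_idx), len(test_idx), S)
```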
Table 1. Comparative results of different methods on the ORL face database
From Table 1, we can see that the two kernel-based methods, KCRC and the proposed method, achieve the highest accuracies. When t=2 and t=6, KCRC and our method have the same accuracy. When t=3 and t=4, our method is better than all the other methods, including KCRC. In general, the proposed method is effective.
To investigate how the accuracy depends on the sparseness coefficient s and on the dimensionality, we evaluate the accuracy while fixing s or the dimensionality, for different numbers of training samples. Fig. 2 shows how the accuracy varies with the sparseness coefficient s (s ranges from 0.1 to 1.0 in steps of 0.1) when KPCA retains 100% of the energy. Fig. 3 shows the relationship between accuracy and dimensionality with the sparseness coefficient fixed (s=0.6). Table 2 shows the highest accuracy and the corresponding sparseness coefficient s for each number of training samples.
Fig. 2. The accuracies for different sparseness coefficients s (from 0.1 to 1.0) on the ORL face database.
Fig. 3. The accuracies for different dimensions on the ORL face database.
Table 2. The highest accuracy and the corresponding sparseness coefficient (s) on the ORL face database
From Fig. 2 we can see that the accuracies oscillate as the sparseness coefficient s changes. The accuracy does not reach its highest level when all the training samples are used (s=1). Although face recognition is a small sample size problem, some training samples are not suitable for classification.
We can see from Table 2 that the highest accuracy requires a different sparseness coefficient for each number of training samples. Also, the accuracies for t=2 (s=0.7) and t=6 (s=0.9) in Table 2 are higher than those in Table 1. This means that both the dimension and the sparseness coefficient s are important for the accuracy. To investigate the relationship between accuracy and dimension, we calculate the accuracies for different dimensions with the sparseness coefficient fixed (s=0.6). Fig. 3 and Table 3 show the experimental results.
Table 3. The highest accuracy and the corresponding dimension on the ORL face database
From Fig. 3 we can observe that the accuracies increase sharply in all cases when the dimensionality is small. When the dimensionality is large enough, the accuracies oscillate as the dimensionality increases.
Table 3 shows that the highest accuracy does not occur at the largest dimensionality. From the experimental results in Section 4.1, we can see that higher accuracies can be obtained if a suitable sparseness coefficient and dimensionality are set.
4.2 Experiments on GT face database
The Georgia Tech (GT) face database contains images of 50 people taken in two or three sessions. The pictures show frontal and/or tilted faces with different facial expressions, lighting conditions and scales. Each image is manually labeled to determine the position of the face in the image. Fig. 4 shows all the images of one subject in the GT face database.
Fig. 4. All the images of one individual in the Georgia Tech face image dataset.
In this experiment, we choose the first t (t=2,3,4, ⋯ ,14) samples per individual to form the training set, and the remaining samples are used for testing. The compared methods include SRC, CRC, KCRC and TPTSR. The experimental results are shown in Table 4.
Table 4. Comparative results of different methods on the GT face database
From Table 4 we can observe that the kernel-based methods improve considerably on the other methods. Although CRC obtains the lowest accuracy, KCRC is able to obtain higher accuracies than SRC. On the GT face database, TPTSR, which uses ℓ2 regularization, is better than SRC, which uses ℓ1 regularization. We attribute this improvement to the strategy of selecting the M nearest neighbors of the test sample. The proposed method not only inherits this merit, but also extends it to kernel space. From Fig. 4 we can observe that the face images exhibit large variations, which make feature extraction and classification difficult. This is why some methods, such as CRC and SRC, cannot achieve high accuracy in Table 4. The proposed method achieves higher accuracies than all the other methods in all cases. Compared with KCRC, our method shows a clear improvement: the average accuracies of KCRC and our method are 68.52% and 72.51% respectively, which corresponds to a relative improvement of over 5.8%. The experimental results show that the proposed method is effective and more robust than the others.
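The improvement quoted for the GT database can be checked with a few lines of Python. The two averages are the ones reported in the text; the variable names are arbitrary, and the snippet distinguishes the absolute gain in percentage points from the relative gain with respect to KCRC.

```python
# Average accuracies (in %) reported for the GT face database.
kcrc_avg = 68.52
ours_avg = 72.51

absolute_gain = ours_avg - kcrc_avg                # gain in percentage points
relative_gain = absolute_gain / kcrc_avg * 100.0   # gain relative to KCRC, in %

print(f"{absolute_gain:.2f} points, {relative_gain:.2f}% relative")
```

The gain is therefore about 4 percentage points in absolute terms, and the figure of over 5.8% refers to the relative gain.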
To investigate the relationship between accuracy and sparseness coefficient and the relationship between accuracy and dimensionality, we calculate the accuracies for different sparseness coefficients with the dimensionality fixed, and for different dimensionalities with the sparseness coefficient fixed. The experimental results are shown in Fig. 5 and Table 5, and in Fig. 6 and Table 6, respectively.
Fig. 5. The accuracies for different sparseness coefficients s (from 0.1 to 1.0) on the GT face database. (a), (b) and (c) are the results for different numbers of training samples.
Table 5. The highest accuracy and the corresponding sparseness coefficient (s) on the GT face database
Fig. 6. The accuracies for different dimensions on the GT face database. (a), (b) and (c) are the results for different numbers of training samples.
Table 6. The highest accuracy and the corresponding dimensionality on the GT face database
From Fig. 5 we can see that almost all the accuracies reach their lowest values when all the training samples are used for classification. We can observe from Table 5 that, in most situations, the highest accuracies occur at a high sparseness level on the GT face database.
Fig. 6 and Table 6 show the relationship between accuracy and dimensionality. We calculate the accuracies for different dimensions with the sparseness coefficient fixed (s=0.6).
Fig. 6 and Table 6 show that the accuracies improve considerably as the dimensionality increases while the dimensionality is small. After that, the growth levels off when the dimensionality is large enough. However, the highest accuracy does not occur at the largest dimensionality in any case.
4.3 Experiments on UMIST face database
The UMIST face database consists of 564 images of 20 individuals. Each individual is shown in a range of postures from profile to frontal views. In our experiments, we select the first 19 face images of each individual, 380 face images in total, to form a subset. Fig. 7 shows some of the face images used in our experiments.
Fig. 7. Some face images from the UMIST database used in our experiments.
As in the above experiments, we select the first t (t=2,3, ⋯ ,12) samples of each individual for the training set and use the remaining samples for testing. Table 7 gives the comparative results of different methods on the UMIST face database.
Table 7. Comparative results of different methods on the UMIST face database
From Table 7 we can see that KCRC and our method, which are both based on kernel space, achieve the highest performance, whereas TPTSR obtains much lower accuracies than the other methods. It seems that the large posture variations in the UMIST face database lead to a lack of suitable training samples for TPTSR. Although our method uses a similar strategy to obtain the sparse representation under ℓ2 regularization, it achieves much higher accuracies than TPTSR. A possible reason is that large posture variations are more separable in kernel space. Compared with KCRC, the proposed method achieves higher performance except when t=3,5,9. Although the proposed method does not obtain higher accuracies than KCRC in all cases, its average accuracy is higher than that of KCRC. In general, the proposed method is effective and robust.
As in Section 4.1 and Section 4.2, we investigate the relationship between accuracy and sparseness coefficient, and the relationship between accuracy and dimensionality. The experimental results are shown in Fig. 8 and Table 8, and in Fig. 9 and Table 9.
Fig. 8. The accuracies for different sparseness coefficients s (from 0.1 to 1.0) on the UMIST face database. (a), (b), (c) and (d) are the results for different numbers of training samples.
Table 8. The highest accuracy and the corresponding sparseness coefficient (s) on the UMIST face database
Fig. 9. The accuracies for different dimensions on the UMIST face database. (a), (b), (c) and (d) are the results for different numbers of training samples.
Table 9. The highest accuracy and the corresponding dimension on the UMIST face database
Fig. 8 shows a different behavior from that in Section 4.1 and Section 4.2. In most situations, high accuracies require more training samples. In other words, the accuracies are higher when s is larger on the UMIST face database.
From Table 8 we can see that the highest accuracies are obtained when s is large enough, except when t=2. From Fig. 7 we can see that the face images from UMIST vary widely in posture, from profile to frontal views. These large variations may aggravate the SSS problem, so more training samples are needed to represent the test samples well.
Fig. 9 and Table 9 show the relationship between accuracy and dimensionality.
Fig. 9 shows that the accuracy improves more quickly when the dimension is small. When the dimensionality is large enough, the accuracy becomes stable. We can also see that the highest accuracy does not appear at the highest dimension for any value of t. In other words, we do not need all the information in the training samples for classification, even though face recognition is a classical SSS problem.
From Table 9 we can see that a much smaller dimensionality than on the ORL and GT face databases is needed to obtain the highest accuracy.
5. Conclusions
In this paper, an improved kernel principal component analysis method based on sparse representation is proposed to improve face recognition performance. Our method implements sparse representation under ℓ2 regularization based on collaborative representation in kernel space. To improve robustness, the proposed method extracts the most relevant features for the test sample twice. Comparative experiments are conducted on the ORL, GT and UMIST face databases. We also investigate the relationship between accuracy and sparseness coefficient with the dimensionality fixed, and the relationship between accuracy and dimensionality with the sparseness coefficient fixed. The compared methods include several state-of-the-art face recognition methods: SRC, CRC, KCRC and TPTSR. The experimental results verify the effectiveness and robustness of the proposed method. From the experimental results, we observe that both the sparseness coefficient and the dimensionality are important for accuracy. In future work, we will try to determine the concrete relationship between these two parameters and further improve the accuracy.
References
- W. Zhao, R. Chellappa, A. Rosenfeld and P. Phillips, “Face recognition: a literature survey,” ACM Computing Serveys, vol. 3, no. 4, pp. 399-458, 2003. Article (CrossRef Link). https://doi.org/10.1145/954339.954342
- A. Samal and P. A. Iyengar, “Automatic recognition and analysis of human faces and facial expression: A survey,” Pattern Recognition, vol. 25, no.1, pp. 65-77, 1999. Article (CrossRef Link). https://doi.org/10.1016/0031-3203(92)90007-6
- W. Huang and H. Yin, “On nonlinear dimensionality reduction for face recognition,” Image and Vision Computing, vol. 30, no.4-5, pp. 355-366, 2012. Article (CrossRef Link). https://doi.org/10.1016/j.imavis.2012.03.004
- P.C. Hsieh and P.C. Tung, “Shadow compensation based on facial symmetry and image average for robust face recognition,” Neurocomputing, vol. 73, no. 13-15, pp. 2708-2717, 2010. Article (CrossRef Link). https://doi.org/10.1016/j.neucom.2010.04.015
- Y. Xu, X. Fang, X. Li, J. Yang, J. You, H. Liu and S. Teng, “Data uncertainty in face recognition,” IEEE Transactions on Cybernetics, vol. 44, no. 10, pp. 1950-1961, 2014. Article (CrossRef Link). https://doi.org/10.1109/TCYB.2014.2300175
- J. Lu, K.N. Plataniotis and A.N. Venetsanopoulos, “Regularized discriminant analysis for the small sample size problem in face recognition,” Pattern Recognition Letters, vol. 24, no. 16, pp. 3079-3087, 2003. Article (CrossRef Link). https://doi.org/10.1016/S0167-8655(03)00167-3
- Y. Xu, X. Li, J. Yang and D. Zhang, “Integrate the original face image and its mirror image for face recognition,” Neurocomputing, vol. 131, pp. 191-199, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.neucom.2013.10.025
- W. Huang, X. Wang, Y. Ma, Y. Jiang, Y. Zhu and Z. Jin, “Robust Kernel Collaborative Representation for Face Recognition,” Optical Engineering, vol. 54, no.5, pp. 053103-1-053103-10, 2015. Article (CrossRef Link). https://doi.org/10.1117/1.OE.54.5.053103
- Y. Xu, X. Li, J. Yang, Z. Lai and D. Zhang, “Integrating conventional and inverse representation for face recognition,” IEEE Transactions on Cybernetics, vol. 44, no. 10, pp. 1738-1746, 2014. Article (CrossRef Link). https://doi.org/10.1109/TCYB.2013.2293391
- M.Turk and A. Pentland, “Eigenfaces for Recognition,” Journal of Cognitive Neuroscience, vol.3, no. 1, pp. 71-86, 1991. Article (CrossRef Link). https://doi.org/10.1162/jocn.1991.3.1.71
- D.L. Swets and J. Weng, “Using discriminant eigenfeatures for image retrieval,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.18, no. 8, pp. 831-836, 1996. Article (CrossRef Link). https://doi.org/10.1109/34.531802
- K.R. Muller, “An introduction to kernel-based learning algorithms,” IEEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 181-201, 2001. Article (CrossRef Link). https://doi.org/10.1109/72.914517
- Y. Jian, Z. David, A.F. Frangi and Y. Jing-Yu, “Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no.1, pp. 131-137, 2004. Article (CrossRef Link). https://doi.org/10.1109/TPAMI.2004.1261097
- J. Yang, D. Zhang, X. Yong and J.Y. Yang, “Two-dimensional discriminant transform for face recognition,” Pattern Recognition, vol. 38, no. 7, pp. 1125-1129, 2005. Article (CrossRef Link). https://doi.org/10.1016/j.patcog.2004.11.019
- Z. Jin, J.Y. Yang, Z.S. Hu and Z. Lou, “Face recognition based on uncorrelated discriminant transformation,” Pattern Recognition, vol. 34, no. 7, pp. 1405-1416, 2001. Article (CrossRef Link). https://doi.org/10.1016/S0031-3203(00)00084-4
- A.M. Jadea, B. Srikanth, V.K. Jayaraman, B.D. Kulkarni, J.P. Jog and L. Priya, “Feature extraction and denoising using kernel PCA,” Chemical Engineering Science, vol. 58, no.19, pp. 4441-4448, 2003. Article (CrossRef Link). https://doi.org/10.1016/S0009-2509(03)00340-3
- H. Hoffmann, “Kernel PCA for novelty detection,” Pattern Recognition, vol. 40, no. 3, pp. 863-874, 2007. Article (CrossRef Link). https://doi.org/10.1016/j.patcog.2006.07.009
- F. Ye, Z. Shi and Z. Shi, "A Comparative Study of PCA, LDA and Kernel LDA for Image Classification," in: ISUVR '09 International Symposium on, pp. 51-54, 2009. Article (CrossRef Link).
- X. Fang, Y. Lu, Z. Li, L. Yu and Y. Chen, “Kernel representation-based nearest neighbor classifier,” Optik, vol. 125, pp. 2320-2326, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.ijleo.2013.10.074
- C. Lu, C. Zhang, T. Zhang and W. Zhuang, “Kernel based symmetrical principal component analysis for face classification,” Neurocomputing, vol. 70, pp. 904-91, 2007. Article (CrossRef Link). https://doi.org/10.1016/j.neucom.2006.10.019
- G. Heo, P. Gader and H. Frigui, “RKF-PCA: Robust kernel fuzzy PCA,” Neural Networks, vol. 22, pp. 642-650, 2009. Article (CrossRef Link). https://doi.org/10.1016/j.neunet.2009.06.013
- R. Zhang, W. Wang and Y. Ma, “Approximations of the standard principal components analysis and kernel PCA,” Expert Systems with Applications, vol. 37, pp. 6531-6537, 2010. Article (CrossRef Link). https://doi.org/10.1016/j.eswa.2010.02.133
- Y.W. Zhang, H. Zhou and S.J. Qin, “Decentralized Fault Diagnosis of Large-scale Processes Using Multiblock Kernel Principal Component Analysis,” Acta Automatica Sinica, vol. 36, no. 4, pp. 593-597, 2010. Article (CrossRef Link). https://doi.org/10.3724/SP.J.1004.2010.00593
- S.W. Choi, C. Lee, J.M. Lee, J.H. Park and I. Lee, “Fault detection and identification of nonlinear processes based on kernel PCA,” Chemometrics and Intelligent Laboratory Systems, vol. 75, pp. 55-67, 2005. Article (CrossRef Link). https://doi.org/10.1016/j.chemolab.2004.05.001
- T.J. Hansen, T.J. Abrahamsen and L.K. Hansen, “Denoising by semi-supervised kernel PCA preimaging,” Pattern Recognition Letters, vol. 49, pp. 114-120, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.patrec.2014.06.015
- A.E. Mercer, M.B. Richman and L.M. Leslie, “Identification of severe weather outbreaks using kernel principal component analysis,” Procedia Computer Science, vol. 6, pp. 231-236, 2011. Article (CrossRef Link). https://doi.org/10.1016/j.procs.2011.08.043
- H. Sahbi, “Kernel PCA for similarity invariant shape recognition,” Neurocomputing, vol. 70, pp. 3034-3045, 2007. Article (CrossRef Link). https://doi.org/10.1016/j.neucom.2006.06.007
- X. Li and L. Shu, “Kernel based nonlinear dimensionality reduction for microarray gene expression data analysis,” Expert Systems with Applications, vol. 36, pp. 7644-7650, 2009. Article (CrossRef Link). https://doi.org/10.1016/j.eswa.2008.09.070
- J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry and Y. Ma, “Robust face recognition via sparse representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, pp. 210-227, 2009. Article (CrossRef Link). https://doi.org/10.1109/TPAMI.2008.79
- L. Zhang, M. Yang and X. Feng, "Sparse Representation or Collaborative Representation: Which Helps Face Recognition?" in Proc. of IEEE International Conference on Computer Vision, pp. 471-478, 2011. Article (CrossRef Link).
- C.X. Ren, D.Q. Dai and H. Yan, "Robust classification using ℓ2,1-norm based regression model," Pattern Recognition, vol. 45, pp. 2708-2718, 2012. Article (CrossRef Link). https://doi.org/10.1016/j.patcog.2012.01.003
- Y. Xu, Q. Zhu, Z. Fan, D. Zhang, D. Zhang, J. Mi and Z. Lai, “Using the idea of the sparse representation to perform coarse to fine face recognition,” Information Sciences, vol. 238, pp. 138-148, 2013. Article (CrossRef Link). https://doi.org/10.1016/j.ins.2013.02.051
- P. Zhu, L. Zhang, Q. Hu, and S.C.K. Shiu, "Multi-scale Patch based Collaborative Representation for Face Recognition with Margin Distribution Optimization," Lecture Notes in Computer Science, vol. 7572, pp. 822-835, 2012. Article (CrossRef Link).
- Y. Xu, D. Zhang, J. Yang and J.Y. Yang, “A two-phase test sample sparse representation method for use with face recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 9, pp. 1255-1262, 2011. Article (CrossRef Link). https://doi.org/10.1109/TCSVT.2011.2138790
- F. Dornaika, Y. Traboulsi and A. Assoum, “Adaptive two phase sparse representation classifier for face recognition,” Advanced Concepts for Intelligent Vision Systems, pp. 182-191, 2013. Article (CrossRef Link).
- X. Shi, Y. Yang, Z. Guo and Z. Lai, “Face recognition by sparse discriminant analysis via joint L2,1-norm minimization,” Pattern Recognition, vol. 47, no. 7, pp. 2447-2453, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.patcog.2014.01.007
- C. Hou, F. Nie, D. Yi and Y. Wu, "Feature Selection via Joint Embedding Learning and Sparse Regression," in Proc. of the Twenty-Second International Joint Conference on Artificial Intelligence, pp. 1324-1329, 2012. Article (CrossRef Link).
- S. Ghosh, “On the grouped selection and model complexity of the adaptive elastic net,” Statistics and Computing, vol. 21, no. 3, pp. 451-462, 2011. Article (CrossRef Link). https://doi.org/10.1007/s11222-010-9181-4
- Y. Xu, W. Zuo and Z. Fan, “Supervised sparse representation method with a heuristic strategy and face recognition experiments,” Neurocomputing, vol. 79, no. 1, pp. 125-131, 2012. Article (CrossRef Link). https://doi.org/10.1016/j.neucom.2011.10.013
- J. Jiang, H. Gan, L. Jiang, C. Gao and N. Sang, “Semi-supervised Discriminant Analysis and Sparse Representation-based self-training for Face Recognition,” Optik, vol. 125, no. 9, pp. 2170-2174, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.ijleo.2013.10.043
- A. Eftekhari, M. Forouzandfar, H. A. Moghaddam and J. Alirezaie, “Block-wise 2D kernel PCA/LDA for face recognition,” Information Processing Letters, vol. 110, pp. 761-766, 2010. Article (CrossRef Link). https://doi.org/10.1016/j.ipl.2010.06.006
- Y. Xu, X. Zhu, Z. Li, G. Liu, Y. Lu and H. Liu, “Using the original and ‘symmetrical face’ training samples to perform representation based two-step face recognition,” Pattern Recognition, vol. 46, no. 4, pp. 1151-1158, 2013. Article (CrossRef Link). https://doi.org/10.1016/j.patcog.2012.11.003
- J. Wang, H. Bensmail and X. Gao, “Feature selection and multi-kernel learning for sparse representation on a manifold,” Neural Networks, vol. 51, pp. 9-16, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.neunet.2013.11.009
- R.G. Baraniuk, “Compressive Sensing [Lecture Notes],” IEEE Signal Processing Magazine, vol. 24, no. 4, pp. 118-121, 2007. Article (CrossRef Link). https://doi.org/10.1109/MSP.2007.4286571
- J. Zhang, C. Zhao, D. Zhao and W. Gao, "Image compressive sensing recovery using adaptively learned sparsifying basis via L0 minimization," Signal Processing, vol. 103, pp. 114-126, 2014. Article (CrossRef Link). https://doi.org/10.1016/j.sigpro.2013.09.025
- G. Shenghua, T.W. Hung and C.L. Tien, “Sparse Representation with Kernels,” IEEE Transactions on Image Processing, vol. 22, no. 2, pp. 423-434, 2013. Article (CrossRef Link). https://doi.org/10.1109/TIP.2012.2215620
- W. Yang, Z. Wang, J. Yin, C. Sun and K. Ricanek, “Image classification using kernel collaborative representation with regularized least square,” Applied Mathematics and Computation, vol. 222, pp. 13-28, 2013. Article (CrossRef Link). https://doi.org/10.1016/j.amc.2013.07.024
- http://www.cam-orl.co.uk
- http://www.anefian.com/face_reco.htm
- http://www.sheffield.ac.uk/eee/research/iel/research/face
- H.K. Palo and M.N. Mohanty, “Classification of Emotions of Angry and Disgust,” Smart Computing Review, vol. 5, no. 3, pp. 151-158, 2015. Article (CrossRef Link).