AN ITERATIVE METHOD FOR ORTHOGONAL PROJECTIONS OF GENERALIZED INVERSES

  • Srivastava, Shwetabh (Department of Mathematics, IIT Kharagpur) ;
  • Gupta, D.K. (Department of Mathematics, IIT Kharagpur)
  • Received : 2012.12.01
  • Accepted : 2013.06.05
  • Published : 2014.01.30

Abstract

This paper describes an iterative method for the orthogonal projections $AA^+$ and $A^+A$ of an arbitrary matrix A, where $A^+$ represents the Moore-Penrose inverse. Convergence analysis along with the first and second order error estimates of the method is investigated. Three numerical examples are worked out to show the efficacy of our work. The first example is on a full rank matrix, whereas the other two are on full rank and rank deficient randomly generated matrices. The results obtained by the method are compared with those obtained by another iterative method. The performance measures in terms of mean CPU time (MCT) and the error bounds for computing orthogonal projections are listed in tables. If $Z_k$, k = 0,1,2,..., represents the k-th iterate obtained by our method, then the sequence of traces {trace($Z_k$)} is a monotonically increasing sequence converging to the rank of A. Also, the sequence of traces {trace($I-Z_k$)} is a monotonically decreasing sequence converging to the nullity of $A^*$.


1. Introduction

Let ℂ^{m×n} and ℂ_r^{m×n} denote the set of all complex m × n matrices and the set of all complex m × n matrices of rank r, respectively. For A ∈ ℂ^{m×n}, let I, A†, A∗, R(A), N(A) and rank(A) represent the identity matrix of appropriate order, the Moore-Penrose inverse, the conjugate transpose, the range space, the null space and the rank of A, respectively. Penrose [4] has shown that AA† and A†A are Hermitian idempotents and are thus known as the orthogonal projections associated with A. He has further shown that AA† is a projection on R(A) along N(A∗) and A†A is a projection on R(A∗) along N(A). Many applications in statistics, prediction theory, control analysis and numerical analysis often require the computation of generalized inverses and their associated orthogonal projections. Accordingly, it is important both practically and theoretically to find efficient algorithms for computing the Moore-Penrose inverse and its associated orthogonal projections of a given arbitrary matrix. For A ∈ ℂ_r^{m×n}, we denote the eigenvalues of AA∗ by

λ1(AA∗) ≥ λ2(AA∗) ≥ · · · ≥ λr(AA∗) > λr+1(AA∗) = · · · = λm(AA∗) = 0.    (1)

The Moore-Penrose inverse has been extensively studied by many researchers [6,4,10,9,5,12,1], and many direct and iterative methods along with their convergence analysis and estimation of errors have been proposed in the literature. It is the unique matrix X satisfying the following four Penrose equations:

(i) AXA = A,  (ii) XAX = X,  (iii) (AX)∗ = AX,  (iv) (XA)∗ = XA.    (2)
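For illustration, the defining equations (2) can be checked numerically. The following minimal NumPy sketch (the random test matrix and the use of the built-in pseudoinverse are illustrative choices, not part of the paper) verifies all four Penrose equations:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))        # real case: conjugate transpose is .T
X = np.linalg.pinv(A)                  # Moore-Penrose inverse A^+
assert np.allclose(A @ X @ A, A)       # (i)   AXA = A
assert np.allclose(X @ A @ X, X)       # (ii)  XAX = X
assert np.allclose((A @ X).T, A @ X)   # (iii) (AX)* = AX
assert np.allclose((X @ A).T, X @ A)   # (iv)  (XA)* = XA
```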

An iterative method for computing A† is given, for k = 0, 1, 2, . . ., by

Yk+1 = Yk(2I − AYk).    (3)

Starting with Y0 = αA∗, it generates a sequence {Yk} converging to A† if α satisfies

0 < α < 2/λ1(AA∗).

This method is a variant of the well known quadratically convergent Schultz method. In [2], its relation to the iterative method

Xk+1 = αA∗ + (I − αA∗A)Xk

for X0 = αA∗ is shown to be Yk = X_{2^k−1} for k = 0, 1, 2, . . ., that is, each step of (3) doubles the number of first-order steps. Another iterative method for computing the Moore-Penrose inverse, based on the Penrose equations XAX = X and (XA)∗ = XA, given by Petković et al. [10] starting with X0 = βA∗, is

Xk+1 = (1 + β)Xk − βA∗Xk∗Xk,

where β is an appropriate real number. If L is the desired limit matrix and Xk is the k-th estimate of L, then the convergence properties of the examined iterative method can be studied with the aid of the error matrix Ek = Xk − L. If an iterative method is expressible as a simple matrix formula, Ek+1 is a sum of several terms,

Ek+1 = error0 + error1 + error2 + · · · ,

where errorj collects the terms of order j in Ek. All suitable algorithms have a zero-order term equal to 0. Hence, the first-order term determines the terminal convergence properties.

There is very little work done on the computation of orthogonal projections, as their computation is a very difficult task. They are important in applied fields of natural science, such as the solution of various systems of linear equations, eigenvalue problems, linear least squares problems, and the determination of the rank and the nullity of rectangular matrices. Golub and Kahan [7] have observed that the correct determination of the rank of A is a critical factor in these methods, even more so in the direct methods for computing A†. Ben-Israel and Cohen [3] developed the following iterative method for computing AA† based on {Yk} obtained from (3). For k = 0, 1, 2, . . . define

Mk+1 = Mk(2I − Mk)    (4)

for M0 = γAA∗, where Mk = AYk and γ satisfies

0 < γ < 2/λ1(AA∗).

They have also established that the sequence of traces {trace(Mk)} is a monotonically increasing sequence converging to the rank(A) and the sequence of traces {trace(I −Mk)} is a monotonically decreasing sequence converging to the nullity of A∗.
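As a point of reference for the comparisons in Section 4, a minimal NumPy sketch of iteration (4) as reconstructed above is given below; the function name, the stopping rule and the default choice γ = 1/λ1(AA∗) are illustrative assumptions, not prescriptions from [3]:

```python
import numpy as np

def bic_projection(A, gamma=None, tol=1e-7, max_iter=1000):
    """Sketch of the Ben-Israel-Cohen iteration (4): M_{k+1} = M_k(2I - M_k)
    with M_0 = gamma*A*A^*, converging to the orthogonal projection AA^+."""
    AAs = A @ A.conj().T
    if gamma is None:
        # any 0 < gamma < 2/lambda_1(AA^*) is admissible; 1/lambda_1 is safe
        gamma = 1.0 / np.linalg.norm(AAs, 2)
    M = gamma * AAs
    I = np.eye(A.shape[0])
    for _ in range(max_iter):
        M_new = M @ (2 * I - M)
        if np.linalg.norm(M_new - M, 'fro') <= tol:
            return M_new
        M = M_new
    return M
```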

This paper describes an iterative method for the orthogonal projections AA† and A†A of an arbitrary matrix A, where A† represents the Moore-Penrose inverse. Convergence analysis along with the first and second order error estimates of the method is investigated. Three numerical examples are worked out to show the efficacy of our work. The first example is on a full rank matrix, whereas the other two are on full rank and rank deficient randomly generated matrices. The results obtained by the method are compared with those obtained by another iterative method. The performance measures in terms of mean CPU time (MCT) and the error bounds for computing orthogonal projections are listed in tables. If Zk, k = 0, 1, 2, . . ., represents the k-th iterate obtained by our method, then the sequence of traces {trace(Zk)} is a monotonically increasing sequence converging to the rank of A. Also, the sequence of traces {trace(I − Zk)} is a monotonically decreasing sequence converging to the nullity of A∗, where I and A∗ denote the identity matrix of appropriate order and the conjugate transpose of A, respectively.

This paper is organized in five sections. The first section is the introduction. In Section 2, the iterative method for computing the orthogonal projections of a generalized inverse of an arbitrary complex matrix A is described. A convergence theorem is established along with the first and second order error terms in Section 3. It is also shown there that the sequence of traces {trace(Zk)} is a monotonically increasing sequence converging to rank(A) and the sequence of traces {trace(I − Zk)} is a monotonically decreasing sequence converging to the nullity of A∗. In Section 4, three numerical examples are worked out to show the efficacy of our work. One example is on a full rank matrix and the other two are on randomly generated full rank and rank deficient matrices. The results obtained by the method are compared with those obtained by another iterative method. The performance measures in terms of mean CPU time (MCT) and the error bounds for computing orthogonal projections are listed in tables. Finally, conclusions are included in Section 5.

 

2. An iterative method for AA†

In this section, we shall extend the iterative method of [10] to compute the orthogonal projections of generalized inverses. For this purpose, we first describe the iteration given in [10] for computing A† and its convergence properties. Assume that A ∈ ℂ^{m×n}; then X = A† ∈ ℂ^{n×m}. Using (ii) and (iv) of equation (2), we get

X = XAX = (XA)∗X = A∗X∗X.

Hence, for arbitrary β ∈ ℝ, this gives

βX = βA∗X∗X,

or

X = (1 + β)X − βA∗X∗X.

This leads to the following iterative method given in [10]:

Xk+1 = (1 + β)Xk − βA∗Xk∗Xk    (5)

with X0 = βA∗ for an appropriate real number β. This can also be written as

Xk+1 = Xk + β(Xk − A∗Xk∗Xk).

Now, to get our iterative method for computing AA†, we pre-multiply (5) with X0 = βA∗ by A and take Zk = AXk. Since every iterate of (5) has the form Xk = p(A∗A)A∗ for a real polynomial p, so that (XkA)∗ = XkA and AA∗Xk∗Xk = Zk², this gives

Zk+1 = (1 + β)Zk − βZk²    (6)

with Z0 = βAA∗, for some appropriate real number β. The following lemmas will be useful for establishing the convergence analysis of (6) with Z0 = βAA∗ in Section 3.
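Before turning to those lemmas, here is a minimal NumPy sketch of iteration (6) as reconstructed above; the function name, the default β = 1 and the fallback choice of β are illustrative assumptions (the fallback simply enforces the condition 0 < βλ1(AA∗) < 2 derived in Lemma 3.4 below):

```python
import numpy as np

def projection_AAplus(A, beta=1.0, tol=1e-7, max_iter=1000):
    """Sketch of iteration (6): Z_{k+1} = (1+beta)Z_k - beta*Z_k^2
    with Z_0 = beta*A*A^*, converging to the orthogonal projection AA^+."""
    AAs = A @ A.conj().T
    lam1 = np.linalg.norm(AAs, 2)       # largest eigenvalue of AA^*
    if beta * lam1 >= 2:
        beta = 1.0 / lam1               # fall back to an admissible beta
    Z = beta * AAs
    for _ in range(max_iter):
        Z_new = (1 + beta) * Z - beta * (Z @ Z)
        if np.linalg.norm(Z_new - Z, 'fro') <= tol:
            return Z_new
        Z = Z_new
    return Z
```

With β = 1 (and A scaled so that λ1(AA∗) < 2) the sketch exhibits the quadratic convergence established in Theorem 3.1 below.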

Lemma 2.1. AA†Zk = Zk, for all k ≥ 0.

Proof. Using mathematical induction, for k = 0, we get

AA†Z0 = βAA†AA∗ = βAA∗ = Z0.

Let it hold for some k, i.e., AA†Zk = Zk. It is easy to show that it also holds for k + 1, since

AA†Zk+1 = (1 + β)AA†Zk − βAA†Zk² = (1 + β)Zk − βZk² = Zk+1.

Lemma 2.2. ZkAA† = Zk, for all k ≥ 0.

Proof. Using mathematical induction, for k = 0, we get

Z0AA† = βAA∗AA† = βA(A∗AA†) = βAA∗ = Z0,

since A∗AA† = (AA†A)∗ = A∗. Let it hold for some k, i.e., ZkAA† = Zk. It is easy to show that it also holds for k + 1, since

Zk+1AA† = (1 + β)ZkAA† − βZk²AA† = (1 + β)Zk − βZk² = Zk+1.

 

3. Convergence analysis

In this section, we shall establish the convergence analysis of the iterative method (6) with Z0 = βAA∗ described in Section 2 for computing AA†.

Theorem 3.1. Iterative method (6) with Z0 = βAA∗ converges to Z = AA† under the assumption

∥Z0 − Z∥ = ∥βAA∗ − AA†∥ < 1.    (7)

For 0 < β < 1, the method has a linear convergence, while for β = 1, its convergence is quadratic. The first-order and the second-order terms corresponding to the error estimation of (6) are equal to

error1 = (1 − β)Ek and error2 = −βEk²,

respectively.

Proof. We shall first prove that

∥Zk − Z∥ → 0

as k → ∞. From (6), we get

Zk+1 − Z = (1 + β)Zk − βZk² − Z = (1 + β)(Z + Ek) − β(Z + Ek)² − Z.

Using

Z² = Z and ZEk = EkZ = Ek,

which follow from Z = AA† and Lemma 2.1 and Lemma 2.2, we obtain

(Z + Ek)² = Z + 2Ek + Ek².

The sequence of error matrices {Ek} therefore satisfies

Ek+1 = (1 − β)Ek − βEk².    (8)

Now, we will show that sk = ∥Ek∥ → 0 as k → ∞. By using mathematical induction we shall first prove that sk < 1 for all k = 0, 1, . . .. It holds true for k = 0, since s0 = ∥Z0 − Z∥ < 1 by assumption (7). Assume that it holds true for some k, i.e., sk < 1. To show that it also holds true for k + 1, we take norm on (8) to get

sk+1 ≤ (1 − β)sk + βsk² < (1 − β)sk + βsk = sk < 1.    (9)

Thus, it holds true for all k = 0, 1, 2, . . .. Hence, {sk} is a decreasing sequence bounded below by 0. This implies that sk converges to some s, 0 ≤ s < 1, as k → ∞. Taking limit as k → ∞ on (9), we get

s ≤ (1 − β)s + βs²;

this implies that either s = 0 or s ≥ 1, and hence we conclude that s = 0. This completes the proof that sk → 0 when k → ∞. Thus, Zk → Z when k → ∞. This proves the first part of the theorem. Putting Zk = Z + Ek in (6), it is not difficult to verify that the error matrix Ek+1 given by

Ek+1 = (1 + β)(Z + Ek) − β(Z + Ek)² − Z

implies

Ek+1 = (1 + β)Ek − β(ZEk + EkZ) − βEk².

Using Lemma 2.1 and Lemma 2.2, this gives

Ek+1 = (1 − β)Ek − βEk² = error1 + error2.

Clearly, error1 vanishes if and only if β = 1, while error2 is always non-zero. Hence, the method has linear convergence for β ≠ 1 and quadratic for β = 1. This completes the proof of the theorem.
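The two regimes can be observed numerically. In the following sketch (the test matrix, its scaling, and the iteration count are arbitrary illustrative choices), β = 1 is used, so by (8) each step squares the error ∥Ek∥ exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
A = 0.2 * rng.standard_normal((8, 5))   # scaled so that lambda_1(AA^T) < 2
Z_exact = A @ np.linalg.pinv(A)         # the projection AA^+
Z = A @ A.T                             # Z_0 = beta*AA^* with beta = 1
for k in range(10):
    # spectral-norm error; with beta = 1 it squares at every step
    print(k, np.linalg.norm(Z - Z_exact, 2))
    Z = 2 * Z - Z @ Z                   # iteration (6) with beta = 1
```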

The convergence condition can easily be verified by deriving an equivalent condition which does not involve the unknown Z. To obtain this, we use the following lemma.

Lemma 3.2 ([8]). Let M ∈ ℂ^{n×n} and ϵ > 0 be given. There is at least one matrix norm ∥·∥ such that

ρ(M) ≤ ∥M∥ ≤ ρ(M) + ϵ,

where ρ(M) = max{|λ1(M)|, . . . , |λn(M)|} denotes the spectral radius of M.

This shows that the convergence condition ∥βAA∗ − Z∥ < 1 is equivalent to ρ(βAA∗ − Z) < 1. The following lemma shows one property of the spectral radius function ρ.

Lemma 3.3 ([11]). If P ∈ ℂ^{n×n} and S ∈ ℂ^{n×n} are such that P = P² and PS = SP, then every nonzero eigenvalue of PS is an eigenvalue of S, i.e.,

σ(PS) ⊆ σ(S) ∪ {0}.

Lemma 3.4. Let the eigenvalues of the matrix AA∗ satisfy (1). Then ρ(βAA∗ − Z) < 1 is satisfied if

0 < β < 2/λ1(AA∗).

Proof. Let P = Z and S = βAA∗ − I. Since P² = P, PS = SP and

PS = Z(βAA∗ − I) = βAA∗ − Z,

from Lemma 3.3 we get

σ(βAA∗ − Z) ⊆ σ(βAA∗ − I) ∪ {0}.

Thus, the eigenvalues of the matrix βAA∗ − Z are βλi(AA∗) − 1, for i = 1, 2, . . . , r, together with zero. Hence ρ(βAA∗ − Z) = max1≤i≤r |βλi(AA∗) − 1|, which is less than 1 exactly when 0 < β < 2/λ1(AA∗).
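In practice the admissible range of β can be computed directly from A, and the conclusion of Lemma 3.4 can be checked numerically. The following sketch does both (the function names are illustrative, and the check uses the built-in pseudoinverse to form Z):

```python
import numpy as np

def admissible_beta_interval(A):
    """Open interval (0, 2/lambda_1(AA^*)) from Lemma 3.4."""
    lam1 = np.linalg.norm(A @ A.conj().T, 2)   # lambda_1(AA^*)
    return 0.0, 2.0 / lam1

def spectral_radius_ok(A, beta):
    """Directly check rho(beta*AA^* - AA^+) < 1."""
    Z = A @ np.linalg.pinv(A)                  # orthogonal projection AA^+
    eigs = np.linalg.eigvals(beta * (A @ A.conj().T) - Z)
    return np.max(np.abs(eigs)) < 1.0
```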

Theorem 3.5. The iterative method given by (6) with Z0 = βAA∗ satisfies

β∥Ek∥(1 − ∥Ek∥) ≤ dk ≤ β∥Ek∥(1 + ∥Ek∥),

where dk = ∥Ek+1 − Ek∥ = ∥Zk+1 − Zk∥; in particular, the true error satisfies ∥Ek∥ ≤ dk/(β(1 − s0)) with s0 = ∥Z0 − Z∥.

Proof. From (8), we get

Ek+1 − Ek = −β(Ek + Ek²).

This gives

dk = ∥Ek+1 − Ek∥ ≤ β(∥Ek∥ + ∥Ek∥²).    (10)

Taking limit as k → +∞ on (10), we get dk → 0, since ∥Ek∥ → 0 from Theorem 3.1. Similarly, using Zk+1 − Zk = Ek+1 − Ek, we get

dk = ∥Zk+1 − Zk∥,

a quantity computable from successive iterates. This leads to

dk = β∥Ek + Ek²∥ ≥ β(∥Ek∥ − ∥Ek∥²).

Also, since ∥Ek∥ ≤ s0 < 1,

β∥Ek∥(1 − s0) ≤ β∥Ek∥(1 − ∥Ek∥) ≤ dk

implies

∥Ek∥ ≤ dk/(β(1 − s0)).

The following Lemma shows one additional property of the sequence {Zk}.

Lemma 3.6. The sequence {Zk} generated by (6) with Z0 = βAA∗ satisfies R(Zk) = R(AA∗) and N(Zk) = N(AA∗) for k ≥ 0.

Proof. This lemma can be proved by mathematical induction. It trivially holds for k = 0. Let y ∈ N(Zk). From (6), we have

Zk+1y = (1 + β)Zky − βZk²y = 0.

This gives y ∈ N(Zk+1) and N(Zk) ⊆ N(Zk+1). Proceeding similarly, it can be shown that R(Zk) ⊇ R(Zk+1). From mathematical induction, this gives N(Zk) ⊇ N(Z0) = N(AA∗) and R(Zk) ⊆ R(Z0) = R(AA∗). To prove equality, let us consider N = ∪k≥0 N(Zk). If y ∈ N then y ∈ N(Zk0) for some k0 ∈ ℕ0, where ℕ0 is the set of natural numbers. This leads to Zky = 0 for all k ≥ k0. Using Theorem 3.1, this gives

Zy = limk→∞ Zky = 0.

Thus, y ∈ N(Z) = N(AA†) = N(A∗). This implies N ⊆ N(A∗). From

N(A∗) = N(AA∗) = N(Z0) ⊆ N(Zk) ⊆ N ⊆ N(A∗),

we get N(Zk) = N(AA∗). Also, from

rank(Zk) = m − dim N(Zk) = m − dim N(AA∗) = rank(AA∗)

and R(Zk) ⊆ R(AA∗), we get R(Zk) = R(AA∗).

 

4. Numerical examples

In this section, we shall work out three numerical examples to show the efficacy of our work. All these examples are run on an Intel Core 2 Duo processor running at 2.80 GHz using MATLAB 7.4 (R2009b). The first example is on a full rank matrix, whereas the other two are on full rank and rank deficient randomly generated matrices. The results obtained by our method are compared with those obtained by the method (4) for computing the orthogonal projections. The performance measures in terms of mean CPU time (MCT) and the error bounds for computing orthogonal projections are listed in tables. Each randomly generated case is tested 50 times. Figures are plotted for {trace(Zk)} and {trace(I − Zk)}, where Zk, k = 0, 1, 2, . . ., represents the k-th iterate obtained by our method, with the x-axis representing the number of iterations and the y-axis representing these sequences. The termination criterion used is max{∥Zk+1 − Zk∥F} ≤ ϵ, where ∥·∥F stands for the Frobenius norm of a matrix. The value of the parameter ϵ is taken as 10⁻⁷.
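A minimal NumPy sketch of this experimental setup is given below; the harness, like the iteration it runs, follows the reconstruction of (6), and the timing via time.process_time and the bookkeeping of the trace sequences are illustrative choices. Averaging the returned time over the 50 random trials gives the MCT measure reported in the tables:

```python
import time
import numpy as np

def run_experiment(A, beta, eps=1e-7, max_iter=500):
    """Run iteration (6), recording trace(Z_k) and trace(I - Z_k);
    stop when ||Z_{k+1} - Z_k||_F <= eps."""
    m = A.shape[0]
    start = time.process_time()
    Z = beta * (A @ A.conj().T)
    traces = [(np.trace(Z).real, m - np.trace(Z).real)]
    for _ in range(max_iter):
        Z_new = (1 + beta) * Z - beta * (Z @ Z)
        traces.append((np.trace(Z_new).real, m - np.trace(Z_new).real))
        done = np.linalg.norm(Z_new - Z, 'fro') <= eps
        Z = Z_new
        if done:
            break
    return Z, traces, time.process_time() - start
```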

Example 4.1. Consider the (5 × 4) matrix A with rank(A) = 4 and nullity of A∗ equal to 1, given by

The eigenvalues of the matrix AA∗ are

The convergence criterion for (6) in terms of the eigenvalues of AA∗ (Lemma 3.4) is given by

Thus, the sequence of iterates {Zk} generated by (6) converges to the orthogonal projection AA† given by

The mean CPU time (MCT) and the error bounds of our method and the method (4) for computing the orthogonal projection are listed in TABLE 1. The trace(Zk) and trace(I − Zk) are plotted against the number of iterations in FIGURE 1 and FIGURE 2. As expected, the sequence {trace(Zk)} is a monotonically increasing sequence converging to rank(A) and the sequence {trace(I − Zk)} is a monotonically decreasing sequence converging to the nullity of A∗.

TABLE 1. Mean CPU time (MCT) and error bounds for different values of β

FIGURE 1. trace(Zk) for Example 1

FIGURE 2. trace(I − Zk) for Example 1

Example 4.2. Consider a (30 × 30) matrix A whose elements are generated randomly from [−0.2, 0.2], with rank(A) = 30 and nullity of A∗ equal to 0. The mean CPU time (MCT) and the error bounds of our method and the method (4) for computing the orthogonal projection are listed in TABLE 2. The trace(Zk) and trace(I − Zk) are plotted against the number of iterations in FIGURE 3 and FIGURE 4. As expected, the sequence {trace(Zk)} is a monotonically increasing sequence converging to rank(A) and the sequence {trace(I − Zk)} is a monotonically decreasing sequence converging to the nullity of A∗.

TABLE 2. Mean CPU time (MCT) and error bounds

FIGURE 3. trace(Zk) for Example 2

FIGURE 4. trace(I − Zk) for Example 2

Example 4.3. Consider a rank deficient (100 × 50) matrix A whose elements are generated randomly from [−0.2, 0.2], with rank(A) = 50 and nullity of A∗ equal to 50. The mean CPU time (MCT) and the error bounds of our method and the method (4) for computing the orthogonal projection are listed in TABLE 3. The trace(Zk) and trace(I − Zk) are plotted against the number of iterations in FIGURE 5 and FIGURE 6. As expected, the sequence {trace(Zk)} is a monotonically increasing sequence converging to rank(A) and the sequence {trace(I − Zk)} is a monotonically decreasing sequence converging to the nullity of A∗.

TABLE 3. Mean CPU time (MCT) and error bounds

FIGURE 5. trace(Zk) for Example 3

FIGURE 6. trace(I − Zk) for Example 3

 

5. Conclusions

An iterative method for computing the orthogonal projections of an arbitrary matrix A is developed. Convergence analysis along with the first and second order error estimates of the method is investigated. Three numerical examples are worked out to show the efficacy of our work. The first example is on a full rank matrix, whereas the other two are on full rank and rank deficient randomly generated matrices. The results obtained by the method are compared with those obtained by another iterative method. The performance measures in terms of mean CPU time (MCT) and the error bounds for computing orthogonal projections are listed in tables. It is also observed that the sequence of traces {trace(Zk)} is monotonically increasing and converges to the rank of A, whereas the sequence of traces {trace(I − Zk)} is monotonically decreasing and converges to the nullity of A∗.

References

  1. E. Andruchow, G. Corach and M. Mbekhta, On the geometry of generalized inverses, Math. Nachr. 278 (2005), 756-770. https://doi.org/10.1002/mana.200310270
  2. A. Ben-Israel and A. Charnes, Contributions to the theory of generalized inverses, J. Soc. Indust. Appl. Math. 11 (1963), 667-669. https://doi.org/10.1137/0111051
  3. A. Ben-Israel and D. Cohen, On iterative computation of generalized inverses and associated projections, SIAM J. Numer. Anal. 3 (1966), 410-419. https://doi.org/10.1137/0703035
  4. A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, Springer-Verlag, New York, 2003.
  5. Haibin Chen and Yiju Wang, A family of higher-order convergent iterative methods for computing the Moore-Penrose inverse, Appl. Math. Comput. 218 (2011), 4012-4016. https://doi.org/10.1016/j.amc.2011.05.066
  6. W. Guo and T. Huang, Method of elementary transformation to compute Moore-Penrose inverse, Appl. Math. Comput. 216 (2010), 1614-1617. https://doi.org/10.1016/j.amc.2010.03.016
  7. G. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, in Milestones in Matrix Computation: The Selected Works of Gene H. Golub with Commentaries (2007).
  8. R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 2012.
  9. Vasilios N. Katsikis, Dimitrios Pappas and Athanassios Petralias, An improved method for the computation of the Moore-Penrose inverse matrix, Appl. Math. Comput. 217 (2011), 9828-9834. https://doi.org/10.1016/j.amc.2011.04.080
  10. Marko D. Petković and Predrag S. Stanimirović, Iterative method for computing Moore-Penrose inverse based on Penrose equations, J. Comput. Appl. Math. 235 (2011), 1604-1613. https://doi.org/10.1016/j.cam.2010.08.042
  11. P. S. Stanimirović and D. S. Cvetković-Ilić, Successive matrix squaring algorithm for computing outer inverses, Appl. Math. Comput. 203 (2008), 19-29. https://doi.org/10.1016/j.amc.2008.04.037
  12. Weiguo Li, Juan Li and Tiantian Qiao, A family of iterative methods for computing Moore-Penrose inverse of a matrix, Linear Algebra Appl. 438 (2013), 47-56. https://doi.org/10.1016/j.laa.2012.08.004