Next: Computational Routines Up: The Implicitly Restarted Arnoldi Previous: Eigenvector/Null-Space Purification

Stopping Criterion

This section considers the important question of determining when a length m Arnoldi factorization has computed approximate eigenvalues of acceptable accuracy.

Let where and Then

suggests that the Ritz pair is a good approximation to an eigenpair of ${\bf A}$ if the last component of an eigenvector for is small. If the upper Hessenberg matrix is unreduced (has no zero subdiagonal elements) then standard results imply that However, this quantity can be quite small even if all of the subdiagonal element of are far from zero. Usually, this is how convergence takes place, but it is also possible for to become small. If the quantity is small enough, then all m eigenvalues of are likely to be good approximations to m eigenvalues of In the Hermitian case, this estimate on the residual can be turned into a precise statement about the accuracy of the Ritz value as an approximation to the eigenvalue of A that is nearest to . However, an analogous statement in the non-Hermitian case is not possible without further information concerning non-normality and defectiveness.

We shall develop a crude but effective assessment of accuracy based upon this estimate. Far more sophisticated analysis is available for the symmetric problem in [36] and in [40] for the non-symmetric case. It is easily shown that

Thus, the Ritz pair is exact eigenpair for a nearby problem whenever or is small (relative to ). The advantage of using the Ritz estimate is to avoid explicit formation of the direct residual when accessing the numerical accuracy of an approximate eigenpair.

In the Hermitian case a small residual implies an accurate answer. However, in the non-Hermitian case, a small does not necessarily imply that the Ritz pair is an accurate approximation to an eigenpair of The following theorem indicates what accuracy might be expected of a Ritz value as an approximation to an eigenvalue of

Theorem 14349

Suppose that is a simple eigenvalue of ${\bf A}$ nearest the eigenvalue of Denote the left and right eigenvectors for by and , respectively, each of unit length. Then

Proof: See Wilkinson [48, p. 68].

The number is the cosine of the angle between and and it determines the conditioning of . Thus, if and are nearly orthogonal, the eigenvalue is highly sensitive to perturbations in ${\bf A}$ but if they are nearly parallel then is insensitive to perturbations. For Hermitian matrices so that and will be an excellent approximation to .However, if the left and right eigenvectors are nearly orthogonal, then even if , where is machine precision, may contain few digits, if any, of accuracy. Roughly, as a rule of thumb, if and then the leading t-d decimal digits of will agree with those of .

The question of how close the Ritz vector is to is complicated by the fact that an eigenvector is not a unique quantity. Any scaling of an eigenvector by a nonzero complex number is also an eigenvector. For this reason, it is better to estimate the positive angle between an eigenvector and its approximation.

Theorem 14386

Suppose that is a Schur form for ${\bf A}$ and let be a simple eigenvalue of ${\bf A}$ nearest the eigenvalue of with corresponding eigenvectors and respectively. If is the positive angle between and then

Proof: See 5 in [5].

The definition of the quantity sep in this theorem is

and the norm is the Frobenius norm.

This is a more refined indicator of eigenvector sensitivity that accounts for non-normality as well as clustering of eigenvalues. Varah [46] shows that

where the latter bound is only defined for nonzero .Thus, the conditioning of the eigenvector problem depends upon both the distance to the other eigenvalues of ${\bf A}$ and the sensitivity of Varah also notes that both upper bounds may be significant over estimates. Note that when ${\bf A}$ is symmetric, and it may be shown that the first bound is an equality. Multiple eigenvalues or clusters of eigenvalues cause further complications. The above estimates may be extended to these cases through the conditioning of invariant subspaces and angles between invariant subspaces.

The conclusion we must draw is that the eigenvalues of a non-symmetric matrix may be very sensitive to perturbations such as those introduced by round-off error. This sensitivity is intricately tied to the departure from normality of the given matrix. The classic example of a matrix with an extremely ill-conditioned eigensystem is where is bi-diagonal with the number on the diagonal and all ones on the super-diagonal. The eigenvalues of this perturbed Jordan matrix satisfy . This is quite contrary to the behavior of eigenvalues of normal matrices. If the matrix ${\bf A}$ is near to a matrix with such an ill-conditioned eigensystem, then we can say little about the accuracy of the computed eigenvalues and eigenvectors. The best we can say is that we have computed the exact eigenvalues and eigenvectors of a nearby matrix (or matrix pencil).

We must be content with a stopping criterion that assures small backward error. This strategy is used in ARPACK, where the Ritz pair is considered converged when

is satisfied where is machine precision. d. Since , this implies that ( 4.6.2) is satisfied with and this test is invariant to scaling of ${\bf A}$ through multiplication by a nonzero scalar.

The backward error is defined as the smallest, in norm, perturbation such that the Ritz pair is an eigenpair for .The recent study [25] presents a thorough discussion of the many issues involved in determining stopping criteria for the non-symmetric eigenvalue problem. In ARPACK we are more stringent than just asking for a small backward error relative to . We instead ask for small backward error relative to the projected matrix .

Next: Computational Routines Up: The Implicitly Restarted Arnoldi Previous: Eigenvector/Null-Space Purification

Chao Yang
11/7/1997