Matrix sign function

In mathematics, the matrix sign function is a matrix function on square matrices analogous to the complex sign function.^[1] It was introduced by J. D. Roberts in 1971, in a Cambridge University technical report, as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations; the report was later published in a journal in 1980.^[2]^[3]

Definition

The matrix sign function is a generalization of the complex signum function $csgn (z) = \begin{cases} 1 & \text{if } \operatorname{Re}(z) > 0, \\ - 1 & \text{if } \operatorname{Re}(z) < 0, \end{cases}$ to the matrix-valued analogue $csgn (A)$ . Although the scalar sign function is not analytic on the whole complex plane, the matrix function is well defined for every matrix that has no eigenvalue on the imaginary axis; see, for example, the Jordan-form-based definition of a matrix function (where the derivatives are all zero).

Properties

In the following theorems, $A \in ℂ^{n \times n}$ is assumed to have no eigenvalue on the imaginary axis, so that $csgn (A)$ is defined.

Theorem: Let $A \in ℂ^{n \times n}$ , then $csgn (A)^{2} = I$ .^[1]

Theorem: Let $A \in ℂ^{n \times n}$ , then $csgn (A)$ is diagonalizable and has eigenvalues that are $\pm 1$ .^[1]

Theorem: Let $A \in ℂ^{n \times n}$ , then $(I + csgn (A)) / 2$ is a projector onto the invariant subspace associated with the eigenvalues in the right-half plane, and analogously for $(I - csgn (A)) / 2$ and the left-half plane.^[1]

Theorem: Let $A \in ℂ^{n \times n}$ , and let $A = P [\begin{matrix} J_{+} & 0 \\ 0 & J_{-} \end{matrix}] P^{- 1}$ be a Jordan decomposition such that $J_{+}$ corresponds to the eigenvalues with positive real part and $J_{-}$ to the eigenvalues with negative real part. Then $csgn (A) = P [\begin{matrix} I_{+} & 0 \\ 0 & - I_{-} \end{matrix}] P^{- 1}$ , where $I_{+}$ and $I_{-}$ are identity matrices of sizes corresponding to $J_{+}$ and $J_{-}$ , respectively.^[1]
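The properties above can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy and a diagonalizable input (so that an eigendecomposition can stand in for the Jordan decomposition); the function name `sign_via_eig` and the test matrix are illustrative choices, not part of the cited sources.

```python
import numpy as np

def sign_via_eig(A):
    """csgn(A) for a diagonalizable A with no eigenvalue on the imaginary axis.

    Assumption: A is diagonalizable, so A = P diag(w) P^{-1}.
    """
    w, P = np.linalg.eig(A)
    if np.any(np.isclose(w.real, 0.0)):
        raise ValueError("eigenvalue on (or too close to) the imaginary axis")
    # Replace each eigenvalue by the sign of its real part: P diag(s) P^{-1}.
    return (P * np.sign(w.real)) @ np.linalg.inv(P)

A = np.array([[2.0, 1.0],
              [0.0, -3.0]])
S = sign_via_eig(A)

# (I + S) / 2 projects onto the invariant subspace of the eigenvalue 2.
proj_right = (np.eye(2) + S) / 2

print(S)                                          # sign matrix
print(np.allclose(S @ S, np.eye(2)))              # involution property
print(np.allclose(proj_right @ proj_right, proj_right))  # idempotent
```

For this matrix the eigenvalues are 2 and -3, and the result agrees with the closed form $(2 A + I) / 5$ obtained by interpolating the sign on the two eigenvalues.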

Computational methods

The function can be computed with generic methods for matrix functions, but there are also specialized methods.

Newton iteration

The Newton iteration can be derived by observing that $csgn (x) = \sqrt{x^{2}} / x$ , which in terms of matrices can be written as $csgn (A) = A^{- 1} \sqrt{A^{2}}$ , where we use the matrix square root. If we apply the Babylonian method to compute the square root of the matrix $A^{2}$ , that is, the iteration $X_{k + 1} = \frac{1}{2} (X_{k} + A^{2} X_{k}^{- 1})$ , and define the new iterate $Z_{k} = A^{- 1} X_{k}$ , we arrive at the iteration $Z_{k + 1} = \frac{1}{2} (Z_{k} + Z_{k}^{- 1})$ , where typically $Z_{0} = A$ . Here we use that the iterates are functions of $A$ and therefore commute with $A$ . Convergence is global, and locally it is quadratic.^[1]^[2] Each step of the Newton iteration requires the explicit inverse of the iterate $Z_{k}$ .
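The iteration $Z_{k + 1} = \frac{1}{2} (Z_{k} + Z_{k}^{- 1})$ with $Z_{0} = A$ can be sketched in a few lines. This is a minimal, unscaled version assuming NumPy; the function name `sign_newton`, the relative-change stopping rule, and the default tolerances are assumptions made for illustration.

```python
import numpy as np

def sign_newton(A, tol=1e-12, max_iter=100):
    """Approximate csgn(A) with the Newton iteration Z <- (Z + Z^{-1}) / 2."""
    A = np.asarray(A)
    # Work in floating point; complex input stays complex.
    Z = A.astype(np.result_type(A.dtype, np.float64))
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))
        # Assumed stopping rule: small relative change between iterates.
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

A = np.array([[2.0, 1.0],
              [0.0, -3.0]])
S = sign_newton(A)
print(S)
```

The sketch omits the scaling strategies that practical implementations commonly use to speed up the early iterations, and it fails with a singular-matrix error or non-convergence if `A` has an eigenvalue on the imaginary axis.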

Newton–Schulz iteration

To avoid the explicit inverse used in the Newton iteration, the inverse can be approximated with one step of the Newton iteration for the inverse, $Z_{k}^{- 1} \approx Z_{k} (2 I - Z_{k}^{2})$ , derived by Schulz in 1933.^[4] Substituting this approximation into the previous method, the new method becomes $Z_{k + 1} = \frac{1}{2} Z_{k} (3 I - Z_{k}^{2})$ . Convergence is still quadratic, but only local (it is guaranteed for $‖ I - A^{2} ‖ < 1$ ).^[1]
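A minimal sketch of the inverse-free iteration $Z_{k + 1} = \frac{1}{2} Z_{k} (3 I - Z_{k}^{2})$ follows, assuming NumPy. The function name `sign_newton_schulz`, the stopping rule, and the example matrix (chosen so that $‖ I - A^{2} ‖ < 1$ holds) are illustrative assumptions.

```python
import numpy as np

def sign_newton_schulz(A, tol=1e-12, max_iter=200):
    """Approximate csgn(A) using only matrix products.

    Convergence is only local; it is guaranteed when ||I - A^2|| < 1.
    """
    A = np.asarray(A)
    Z = A.astype(np.result_type(A.dtype, np.float64))
    I = np.eye(Z.shape[0], dtype=Z.dtype)
    for _ in range(max_iter):
        Z_next = 0.5 * Z @ (3.0 * I - Z @ Z)
        # Assumed stopping rule: small relative change between iterates.
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton-Schulz iteration did not converge")

A = np.array([[0.9, 0.2],
              [0.0, -1.1]])
print(np.linalg.norm(np.eye(2) - A @ A, 2) < 1.0)  # sufficient condition holds
S = sign_newton_schulz(A)
print(S)
```

Because each step uses only matrix multiplications, this variant is attractive where products are much cheaper than inversions, at the price of needing a starting matrix that is already close to an involution.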

Applications

Solutions of Sylvester equations

Theorem:^[2]^[3] Let $A, B, C \in ℝ^{n \times n}$ and assume that $A$ and $B$ are stable. Then the unique solution to the Sylvester equation, $A X + X B = C$ , is given by $X$ such that $[\begin{matrix} - I & 2 X \\ 0 & I \end{matrix}] = csgn ([\begin{matrix} A & - C \\ 0 & - B \end{matrix}]) .$

Proof sketch: The result follows from the similarity transform $[\begin{matrix} A & - C \\ 0 & - B \end{matrix}] = [\begin{matrix} I & X \\ 0 & I \end{matrix}] [\begin{matrix} A & 0 \\ 0 & - B \end{matrix}] {[\begin{matrix} I & X \\ 0 & I \end{matrix}]}^{- 1},$ since $csgn ([\begin{matrix} A & - C \\ 0 & - B \end{matrix}]) = [\begin{matrix} I & X \\ 0 & I \end{matrix}] [\begin{matrix} - I & 0 \\ 0 & I \end{matrix}] [\begin{matrix} I & - X \\ 0 & I \end{matrix}],$ because the stability of $A$ and $B$ gives $csgn (A) = - I$ and $csgn (- B) = I$ . Multiplying out the right-hand side yields $[\begin{matrix} - I & 2 X \\ 0 & I \end{matrix}]$ .

The theorem also applies to the Lyapunov equation, the special case $B = A^{T}$ . In that case, the structure of the problem allows the Newton iteration to be simplified so that it involves only inverses of $A$ and $A^{T}$ .
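The theorem translates directly into a small solver: build the block matrix, compute its sign, and read $X$ off the upper-right block. The sketch below assumes NumPy and uses a plain Newton iteration for the sign; the names `sign_newton` and `solve_sylvester_sign` and the example data are hypothetical and only serve as illustration.

```python
import numpy as np

def sign_newton(M, tol=1e-12, max_iter=100):
    """Unscaled Newton iteration Z <- (Z + Z^{-1}) / 2 for csgn(M)."""
    Z = np.asarray(M, dtype=float)
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

def solve_sylvester_sign(A, B, C):
    """Solve A X + X B = C for stable real n-by-n A and B via the sign function."""
    n = A.shape[0]
    M = np.block([[A, -C],
                  [np.zeros((n, n)), -B]])
    S = sign_newton(M)
    # csgn(M) = [[-I, 2X], [0, I]], so X is half of the upper-right block.
    return 0.5 * S[:n, n:]

# Both example matrices are triangular with negative diagonal, hence stable.
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
B = np.array([[-3.0, 0.0],
              [1.0, -1.0]])
C = np.array([[1.0, 2.0],
              [3.0, 4.0]])

X = solve_sylvester_sign(A, B, C)
residual = A @ X + X @ B - C
print(np.linalg.norm(residual))
```

The residual norm gives a direct check that the computed block solves the equation; for production use, a dedicated Sylvester solver is usually preferable.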

Solutions of algebraic Riccati equations

There is a similar result for the algebraic Riccati equation, $A^{H} P + P A - P F P + Q = 0$ .^[1]^[2] Define $V, W \in ℂ^{2 n \times n}$ as $[\begin{matrix} V & W \end{matrix}] = csgn ([\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}]) - [\begin{matrix} I & 0 \\ 0 & I \end{matrix}] .$ Assume that $F, Q \in ℂ^{n \times n}$ are Hermitian and that there exists a unique stabilizing solution $P$ , in the sense that $A - F P$ is stable. Then that solution is determined by the over-determined, but consistent, linear system $V P = - W .$

Proof sketch: The similarity transform $[\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}] = [\begin{matrix} P & - I \\ I & 0 \end{matrix}] [\begin{matrix} - (A - F P) & - F \\ 0 & (A - F P)^{H} \end{matrix}] {[\begin{matrix} P & - I \\ I & 0 \end{matrix}]}^{- 1},$ which uses the Riccati equation and the fact that $F$ and $P$ are Hermitian, together with the stability of $A - F P$ , implies that $(csgn ([\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}]) - [\begin{matrix} I & 0 \\ 0 & I \end{matrix}]) [\begin{matrix} P & - I \\ I & 0 \end{matrix}] = [\begin{matrix} P & - I \\ I & 0 \end{matrix}] [\begin{matrix} 0 & Y \\ 0 & - 2 I \end{matrix}],$ for some matrix $Y \in ℂ^{n \times n}$ . The first block column of this identity reads $V P + W = 0$ .
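As an illustration of the result above, the following sketch forms the $2 n \times 2 n$ block matrix, computes its sign with a plain Newton iteration, and solves $V P = - W$ in the least-squares sense. It assumes NumPy; the names `sign_newton` and `solve_riccati_sign` are hypothetical, and the example is a double-integrator problem with $F = b b^{T}$ , $b = (0, 1)^{T}$ , and $Q = I$ , chosen because its stabilizing solution is easy to verify by substitution.

```python
import numpy as np

def sign_newton(M, tol=1e-12, max_iter=100):
    """Unscaled Newton iteration Z <- (Z + Z^{-1}) / 2 for csgn(M)."""
    M = np.asarray(M)
    Z = M.astype(np.result_type(M.dtype, np.float64))
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

def solve_riccati_sign(A, F, Q):
    """Stabilizing solution of A^H P + P A - P F P + Q = 0 via the sign function."""
    n = A.shape[0]
    H = np.block([[A.conj().T, Q],
                  [F, -A]])
    D = sign_newton(H) - np.eye(2 * n)
    V, W = D[:, :n], D[:, n:]
    # V P = -W is over-determined but consistent; solve it by least squares.
    P, *_ = np.linalg.lstsq(V, -W, rcond=None)
    return P

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
F = np.array([[0.0, 0.0],
              [0.0, 1.0]])
Q = np.eye(2)

P = solve_riccati_sign(A, F, Q)
residual = A.conj().T @ P + P @ A - P @ F @ P + Q
print(P)
print(np.linalg.norm(residual))
print(np.linalg.eigvals(A - F @ P).real)  # negative real parts: A - F P is stable
```

For this example the stabilizing solution is $P = [\begin{matrix} \sqrt{3} & 1 \\ 1 & \sqrt{3} \end{matrix}]$ , which can be confirmed by inserting it into the equation.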

Computation of the matrix square root

The Denman–Beavers iteration for the square root of a matrix can be derived from the Newton iteration for the matrix sign function by noticing that $A - P I P = 0$ is a degenerate algebraic Riccati equation^[3] and that, by definition, a solution $P$ is a square root of $A$ .
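In its usual coupled form, the Denman–Beavers iteration starts from $Y_{0} = A$ , $Z_{0} = I$ and updates $Y_{k + 1} = \frac{1}{2} (Y_{k} + Z_{k}^{- 1})$ , $Z_{k + 1} = \frac{1}{2} (Z_{k} + Y_{k}^{- 1})$ , which is the Newton sign iteration applied to $[\begin{matrix} 0 & A \\ I & 0 \end{matrix}]$ written block by block; $Y_{k}$ tends to $A^{1 / 2}$ and $Z_{k}$ to $A^{- 1 / 2}$ . The sketch below assumes NumPy and a matrix with no eigenvalues on the closed negative real axis; the function name and the stopping rule are illustrative assumptions.

```python
import numpy as np

def sqrtm_denman_beavers(A, tol=1e-12, max_iter=100):
    """Return (Y, Z) with Y ~ A^{1/2} and Z ~ A^{-1/2}."""
    A = np.asarray(A)
    Y = A.astype(np.result_type(A.dtype, np.float64))
    Z = np.eye(Y.shape[0], dtype=Y.dtype)
    for _ in range(max_iter):
        # Both updates use the previous iterates (computed before reassignment).
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        # Assumed stopping rule: small relative change in Y.
        converged = (np.linalg.norm(Y_next - Y, 1)
                     <= tol * np.linalg.norm(Y_next, 1))
        Y, Z = Y_next, Z_next
        if converged:
            return Y, Z
    raise RuntimeError("Denman-Beavers iteration did not converge")

A = np.array([[4.0, 1.0],
              [0.0, 9.0]])
Y, Z = sqrtm_denman_beavers(A)
print(Y)                      # approximate principal square root of A
print(np.allclose(Y @ Y, A))  # check the defining property
```

For this upper-triangular example the principal square root has diagonal entries 2 and 3 and off-diagonal entry 0.2, which follows from matching the off-diagonal entry of $Y^{2} = A$ .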

References

[1] Higham, Nicholas J. (2008). Functions of Matrices: Theory and Computation. Philadelphia, PA: Society for Industrial and Applied Mathematics. ISBN 978-0-89871-777-8. OCLC 693957820.
[2] Roberts, J. D. (October 1980). "Linear model reduction and solution of the algebraic Riccati equation by use of the sign function". International Journal of Control. 32 (4): 677–687. doi:10.1080/00207178008922881. ISSN 0020-7179.
[3] Denman, Eugene D.; Beavers, Alex N. (1976). "The matrix sign function and computations in systems". Applied Mathematics and Computation. 2 (1): 63–94. doi:10.1016/0096-3003(76)90020-5. ISSN 0096-3003.
[4] Schulz, Günther (1933). "Iterative Berechung der reziproken Matrix". ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik. 13 (1): 57–59. Bibcode:1933ZaMM...13...57S. doi:10.1002/zamm.19330130111. ISSN 1521-4001.
