Permutation matrix

In mathematics, particularly in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry of 1 in each row and each column with all other entries 0.[1]: 26 An $n \times n$ permutation matrix can represent a permutation of $n$ elements. Pre-multiplying an $n$-row matrix $M$ by a permutation matrix $P$, forming $PM$, permutes the rows of $M$, while post-multiplying an $n$-column matrix $M$, forming $MP$, permutes the columns of $M$. Every permutation matrix $P$ is orthogonal, with its inverse equal to its transpose: $P^{-1} = P^{\mathsf{T}}$.[1]: 26 Indeed, permutation matrices can be characterized as the orthogonal matrices whose entries are all non-negative.[2]

The two permutation/matrix correspondences

There are two natural one-to-one correspondences between permutations and permutation matrices, one of which works along the rows of the matrix, the other along its columns. Here is an example, starting with a permutation π in two-line form at the upper left:

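$$
\begin{array}{cc}
\pi\colon \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix} &
R_\pi\colon \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix} \\[3ex]
C_\pi\colon \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} &
\pi^{-1}\colon \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix}
\end{array}
$$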

The row-based correspondence takes the permutation $\pi$ to the matrix $R_\pi$ at the upper right. The first row of $R_\pi$ has its 1 in the third column because $\pi(1) = 3$. More generally, we have $R_\pi = (r_{ij})$ where $r_{ij} = 1$ when $j = \pi(i)$ and $r_{ij} = 0$ otherwise. The column-based correspondence takes $\pi$ to the matrix $C_\pi$ at the lower left. The first column of $C_\pi$ has its 1 in the third row because $\pi(1) = 3$. More generally, we have $C_\pi = (c_{ij})$ where $c_{ij}$ is 1 when $i = \pi(j)$ and 0 otherwise. Since the two recipes differ only by swapping $i$ with $j$, the matrix $C_\pi$ is the transpose of $R_\pi$; and, since $R_\pi$ is a permutation matrix, we have $C_\pi = R_\pi^{\mathsf{T}} = R_\pi^{-1}$. Tracing the other two sides of the big square, we have $R_{\pi^{-1}} = C_\pi = R_\pi^{-1}$ and $C_{\pi^{-1}} = R_\pi$.[3]
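The two correspondences can be checked mechanically. The following is a minimal sketch in plain Python, assuming a permutation is stored as a list of its 1-based images (so the example permutation is `[3, 2, 4, 1]`); the helper names `row_matrix`, `col_matrix`, `transpose`, and `inverse_perm` are illustrative choices, not standard library functions.

```python
def row_matrix(perm):
    """R_pi: entry (i, j) is 1 exactly when j = pi(i)."""
    n = len(perm)
    return [[1 if perm[i] == j + 1 else 0 for j in range(n)] for i in range(n)]


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def inverse_perm(perm):
    """Inverse permutation: if pi(i) = k, then the inverse sends k to i."""
    inv = [0] * len(perm)
    for i, image in enumerate(perm, start=1):
        inv[image - 1] = i
    return inv


pi = [3, 2, 4, 1]          # the example permutation in one-line form
R = row_matrix(pi)
C = col_matrix(pi)
pi_inv = inverse_perm(pi)  # [4, 2, 1, 3]
```

With these definitions, `C` equals `transpose(R)`, `row_matrix(pi_inv)` equals `C`, and `col_matrix(pi_inv)` equals `R`, matching the four corners of the square.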

Permutation matrices permute rows or columns

Multiplying a matrix $M$ by either $R_\pi$ or $C_\pi$ on either the left or the right will permute either the rows or columns of $M$ by either $\pi$ or $\pi^{-1}$. The details are a bit tricky. To begin with, when we permute the entries of a vector $(v_1, \ldots, v_n)$ by some permutation $\pi$, we move the $i$th entry $v_i$ of the input vector into the $\pi(i)$th slot of the output vector. Which entry then ends up in, say, the first slot of the output? Answer: The entry $v_j$ for which $\pi(j) = 1$, and hence $j = \pi^{-1}(1)$. Arguing similarly about each of the slots, we find that the output vector is

$$(v_{\pi^{-1}(1)}, v_{\pi^{-1}(2)}, \ldots, v_{\pi^{-1}(n)}),$$

even though we are permuting by $\pi$, not by $\pi^{-1}$. Thus, in order to permute the entries by $\pi$, we must permute the indices by $\pi^{-1}$.[1]: 25 (Permuting the entries by $\pi$ is sometimes called taking the alibi viewpoint, while permuting the indices by $\pi$ would take the alias viewpoint.[4]) Now, suppose that we pre-multiply some $n$-row matrix $M = (m_{i,j})$ by the permutation matrix $C_\pi$. By the rule for matrix multiplication, the $(i,j)$th entry in the product $C_\pi M$ is

$$\sum_{k=1}^{n} c_{i,k} \, m_{k,j},$$

where $c_{i,k}$ is 0 except when $i = \pi(k)$, when it is 1. Thus, the only term in the sum that survives is the term in which $k = \pi^{-1}(i)$, and the sum reduces to $m_{\pi^{-1}(i),j}$. Since we have permuted the row index by $\pi^{-1}$, we have permuted the rows of $M$ themselves by $\pi$.[1]: 25 A similar argument shows that post-multiplying an $n$-column matrix $M$ by $R_\pi$ permutes its columns by $\pi$. The other two options are pre-multiplying by $R_\pi$ or post-multiplying by $C_\pi$, and they permute the rows or columns respectively by $\pi^{-1}$, instead of by $\pi$.
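The four cases are easy to mix up, so a small numerical check may help. The sketch below is plain Python under the assumption that a permutation is a list of 1-based images; `permute_entries` is a hypothetical helper implementing the alibi viewpoint (entry $i$ moves to slot $\pi(i)$), and `matmul` is a naive list-of-lists product written only for illustration.

```python
def row_matrix(perm):
    """R_pi: entry (i, j) is 1 exactly when j = pi(i)."""
    n = len(perm)
    return [[1 if perm[i] == j + 1 else 0 for j in range(n)] for i in range(n)]


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Naive matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def permute_entries(items, perm):
    """Alibi viewpoint: move the i-th item into slot pi(i)."""
    out = [None] * len(items)
    for i, item in enumerate(items):
        out[perm[i] - 1] = item
    return out


pi = [3, 2, 4, 1]
pi_inv = [4, 2, 1, 3]

# Entry i moves to slot pi(i), so slot 1 receives the entry from slot pi^{-1}(1) = 4.
moved = permute_entries(["a", "b", "c", "d"], pi)

M = [[11, 12], [21, 22], [31, 32], [41, 42]]  # 4 rows, labelled by row number
N = [[1, 2, 3, 4], [5, 6, 7, 8]]              # 4 columns

rows_by_pi = matmul(col_matrix(pi), M)       # C_pi M permutes rows by pi
cols_by_pi = matmul(N, row_matrix(pi))       # N R_pi permutes columns by pi
rows_by_pi_inv = matmul(row_matrix(pi), M)   # R_pi M permutes rows by pi^{-1}
```

Here `rows_by_pi` agrees with `permute_entries(M, pi)`, while `rows_by_pi_inv` agrees with `permute_entries(M, pi_inv)`.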

The transpose is also the inverse

A related argument proves that, as we claimed above, the transpose of any permutation matrix $P$ also acts as its inverse, which implies that $P$ is invertible. (Artin leaves that proof as an exercise,[1]: 26 which we here solve.) If $P = (p_{i,j})$, then the $(i,j)$th entry of its transpose $P^{\mathsf{T}}$ is $p_{j,i}$. The $(i,j)$th entry of the product $PP^{\mathsf{T}}$ is then

$$\sum_{k=1}^{n} p_{i,k} \, p_{j,k}.$$

Whenever $i \neq j$, the $k$th term in this sum is the product of two different entries in the $k$th column of $P$; so all terms are 0, and the sum is 0. When $i = j$, we are summing the squares of the entries in the $i$th row of $P$, so the sum is 1. The product $PP^{\mathsf{T}}$ is thus the identity matrix. A symmetric argument shows the same for $P^{\mathsf{T}}P$, implying that $P$ is invertible with $P^{-1} = P^{\mathsf{T}}$.
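The identity can also be confirmed by brute force for a small size. The sketch below is plain Python and checks every $4 \times 4$ permutation matrix; the helper names are illustrative, and the column-based construction is used only as a convenient way to enumerate all such matrices.

```python
from itertools import permutations


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Naive matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


n = 4
# Check P P^T = I = P^T P for all n! permutation matrices of size n.
transpose_is_inverse = all(
    matmul(P, transpose(P)) == identity(n) == matmul(transpose(P), P)
    for P in (col_matrix(p) for p in permutations(range(1, n + 1)))
)
```

A finite check of this kind is no substitute for the proof above, but it is a quick sanity test on the index bookkeeping.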

Multiplying permutation matrices

Given two permutations of $n$ elements $\sigma$ and $\tau$, the product of the corresponding column-based permutation matrices $C_\sigma$ and $C_\tau$ is given,[1]: 25 as you might expect, by $C_\sigma C_\tau = C_{\sigma \circ \tau}$, where the composed permutation $\sigma \circ \tau$ applies first $\tau$ and then $\sigma$, working from right to left: $(\sigma \circ \tau)(k) = \sigma(\tau(k))$. This follows because pre-multiplying some matrix by $C_\tau$ and then pre-multiplying the resulting product by $C_\sigma$ gives the same result as pre-multiplying just once by the combined $C_{\sigma \circ \tau}$. For the row-based matrices, there is a twist: The product of $R_\sigma$ and $R_\tau$ is given by

$$R_\sigma R_\tau = R_{\tau \circ \sigma},$$

with $\sigma$ applied before $\tau$ in the composed permutation. This happens because we must post-multiply to avoid inversions under the row-based option, so we would post-multiply first by $R_\sigma$ and then by $R_\tau$. Some people, when applying a function to an argument, write the function after the argument (postfix notation), rather than before it. When doing linear algebra, they work with linear spaces of row vectors, and they apply a linear map to an argument by using the map's matrix to post-multiply the argument's row vector. They often use a left-to-right composition operator, which we here denote using a semicolon; so the composition $\sigma ; \tau$ is defined either by

$$(\sigma ; \tau)(k) = \tau(\sigma(k)),$$

or, more elegantly, by

$$(k)(\sigma ; \tau) = ((k)\sigma)\tau,$$

with $\sigma$ applied first. That notation gives us a simpler rule for multiplying row-based permutation matrices:

$$R_\sigma R_\tau = R_{\sigma ; \tau}.$$
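Both product rules can be verified on a pair of permutations that do not commute. The sketch below is plain Python with permutations stored as lists of 1-based images; `compose` is a hypothetical helper for right-to-left composition, and the particular $\sigma$ and $\tau$ are arbitrary choices made for this illustration.

```python
def row_matrix(perm):
    """R_pi: entry (i, j) is 1 exactly when j = pi(i)."""
    n = len(perm)
    return [[1 if perm[i] == j + 1 else 0 for j in range(n)] for i in range(n)]


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Naive matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def compose(sigma, tau):
    """Right-to-left composition: (sigma o tau)(k) = sigma(tau(k))."""
    return [sigma[tau[k] - 1] for k in range(len(tau))]


sigma = [2, 3, 1, 4]
tau = [3, 2, 4, 1]

sigma_after_tau = compose(sigma, tau)  # sigma o tau, which is also tau ; sigma
tau_after_sigma = compose(tau, sigma)  # tau o sigma, which is also sigma ; tau

col_product = matmul(col_matrix(sigma), col_matrix(tau))
row_product = matmul(row_matrix(sigma), row_matrix(tau))
```

In this example `col_product` equals `col_matrix(sigma_after_tau)`, whereas `row_product` equals `row_matrix(tau_after_sigma)`, and the two composed permutations differ.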

Matrix group

When $\pi$ is the identity permutation, which has $\pi(i) = i$ for all $i$, both $C_\pi$ and $R_\pi$ are the identity matrix. There are $n!$ permutation matrices, since there are $n!$ permutations and the map $C\colon \pi \mapsto C_\pi$ is a one-to-one correspondence between permutations and permutation matrices. (The map $R$ is another such correspondence.) By the formulas above, those $n \times n$ permutation matrices form a group of order $n!$ under matrix multiplication, with the identity matrix as its identity element, a group that we denote $\mathcal{P}_n$. The group $\mathcal{P}_n$ is a subgroup of the general linear group $\mathrm{GL}_n(\mathbb{R})$ of invertible $n \times n$ matrices of real numbers. Indeed, for any field $F$, the group $\mathcal{P}_n$ is also a subgroup of the group $\mathrm{GL}_n(F)$, where the matrix entries belong to $F$. (Every field contains 0 and 1 with $0 + 0 = 0$, $0 + 1 = 1$, $0 \cdot 0 = 0$, $0 \cdot 1 = 0$, and $1 \cdot 1 = 1$; and that's all we need to multiply permutation matrices. Different fields disagree about whether $1 + 1 = 0$, but that sum doesn't arise.)

Let $S_n$ denote the symmetric group, or group of permutations, on $\{1, 2, \ldots, n\}$ where the group operation is the standard, right-to-left composition "$\circ$"; and let $S_n^{\mathrm{op}}$ denote the opposite group, which uses the left-to-right composition ";". The map $C\colon S_n \to \mathrm{GL}_n(\mathbb{R})$ that takes $\pi$ to its column-based matrix $C_\pi$ is a faithful representation, and similarly for the map $R\colon S_n^{\mathrm{op}} \to \mathrm{GL}_n(\mathbb{R})$ that takes $\pi$ to $R_\pi$.
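For small $n$ the group axioms can be checked exhaustively. The following is a minimal sketch in plain Python for $n = 3$, assuming the column-based construction to enumerate the matrices; the variable names are illustrative only.

```python
from itertools import permutations


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Naive matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


n = 3
group = [col_matrix(p) for p in permutations(range(1, n + 1))]
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

order = len(group)                                            # n! = 6 for n = 3
has_identity = identity in group
is_closed = all(matmul(a, b) in group for a in group for b in group)
has_inverses = all(transpose(a) in group for a in group)      # inverse = transpose
```

Every product computed here uses only the sums and products of 0 and 1 listed above, which is why the same check would go through over any field.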

Doubly stochastic matrices

Every permutation matrix is doubly stochastic. The set of all doubly stochastic matrices is called the Birkhoff polytope, and the permutation matrices play a special role in that polytope. The Birkhoff–von Neumann theorem says that every doubly stochastic real matrix is a convex combination of permutation matrices of the same order, with the permutation matrices being precisely the extreme points (the vertices) of the Birkhoff polytope. The Birkhoff polytope is thus the convex hull of the permutation matrices.[5]
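As a small illustration of the easy direction only (a convex combination of permutation matrices is doubly stochastic), the sketch below averages two $3 \times 3$ permutation matrices using exact rational arithmetic. It does not implement a decomposition algorithm for the converse direction of the theorem; the matrices `P1` and `P2` and the helper names are arbitrary choices made for this example.

```python
from fractions import Fraction


def convex_combination(weights, matrices):
    """Entrywise weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def is_doubly_stochastic(m):
    """Non-negative entries, and every row and every column sums to 1."""
    nonnegative = all(x >= 0 for row in m for x in row)
    rows_ok = all(sum(row) == 1 for row in m)
    cols_ok = all(sum(col) == 1 for col in zip(*m))
    return nonnegative and rows_ok and cols_ok


P1 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
P2 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
half = Fraction(1, 2)

# The weights are non-negative and sum to 1, so D lies on the segment from P1 to P2.
D = convex_combination([half, half], [P1, P2])
```

The resulting `D` has rows $(1/2, 1/2, 0)$, $(1/2, 0, 1/2)$, and $(0, 1/2, 1/2)$, and `is_doubly_stochastic(D)` holds, as it does for `P1` and `P2` themselves.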

Linear-algebraic properties

Just as each permutation is associated with two permutation matrices, each permutation matrix is associated with two permutations, as we can see by relabeling the example in the big square above starting with the matrix P at the upper right:

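$$
\begin{array}{cc}
\rho_P\colon \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix} &
P\colon \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix} \\[3ex]
P^{-1}\colon \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} &
\kappa_P\colon \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix}
\end{array}
$$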

So we are here denoting the inverse of $C$ as $\kappa$ and the inverse of $R$ as $\rho$. We can then compute the linear-algebraic properties of $P$ from some combinatorial properties that are shared by the two permutations $\kappa_P$ and $\rho_P = \kappa_P^{-1}$. A point is fixed by $\kappa_P$ just when it is fixed by $\rho_P$, and the trace of $P$ is the number of such shared fixed points.[1]: 322 If the integer $k$ is one of them, then the standard basis vector $e_k$ is an eigenvector of $P$.[1]: 118

To calculate the complex eigenvalues of $P$, write the permutation $\kappa_P$ as a composition of disjoint cycles, say $\kappa_P = c_1 c_2 \cdots c_t$. (Permutations of disjoint subsets commute, so it doesn't matter here whether we are composing right-to-left or left-to-right.) For $1 \le i \le t$, let the length of the cycle $c_i$ be $\ell_i$, and let $L_i$ be the set of complex solutions of $x^{\ell_i} = 1$, those solutions being the $\ell_i$th roots of unity. The multiset union of the $L_i$ is then the multiset of eigenvalues of $P$. Since writing $\rho_P$ as a product of cycles would give the same number of cycles of the same lengths, analyzing $\rho_P$ would give the same result. The multiplicity of any eigenvalue $v$ is the number of $i$ for which $L_i$ contains $v$.[6] (Since any permutation matrix is normal and any normal matrix is diagonalizable over the complex numbers,[1]: 259 the algebraic and geometric multiplicities of an eigenvalue $v$ are the same.)

From group theory we know that any permutation may be written as a composition of transpositions. Therefore, any permutation matrix factors as a product of row-switching elementary matrices, each of which has determinant $-1$. Thus, the determinant of the permutation matrix $P$ is the sign of the permutation $\kappa_P$, which is also the sign of $\rho_P$.
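These recipes translate directly into a short computation. The sketch below is plain Python and works from the permutation $\kappa_P = (4, 2, 1, 3)$ of the example, stored as a list of 1-based images. The helper names are illustrative, and `det` is a naive cofactor expansion included only to cross-check the sign on small matrices.

```python
import cmath


def col_matrix(perm):
    """C_pi: entry (i, j) is 1 exactly when i = pi(j)."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def cycle_lengths(perm):
    """Lengths of the disjoint cycles of perm, fixed points included."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        length, k = 0, start
        while not seen[k]:
            seen[k] = True
            k = perm[k] - 1
            length += 1
        if length:
            lengths.append(length)
    return lengths


def sign(perm):
    """A cycle of length l is a product of l - 1 transpositions."""
    return (-1) ** sum(l - 1 for l in cycle_lengths(perm))


def eigenvalues(perm):
    """For each cycle of length l, all l-th roots of unity (with repetition)."""
    return [cmath.exp(2j * cmath.pi * m / l)
            for l in cycle_lengths(perm) for m in range(l)]


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


kappa = [4, 2, 1, 3]
P = col_matrix(kappa)

trace = sum(P[i][i] for i in range(len(P)))                   # number of fixed points
lengths = cycle_lengths(kappa)                                # one 3-cycle, one fixed point
multiplicity_of_one = sum(1 for v in eigenvalues(kappa) if abs(v - 1) < 1e-9)
```

For this example the cycle lengths are 3 and 1, so the eigenvalue 1 occurs with multiplicity 2 (once per cycle), and `sign(kappa)` agrees with `det(P)`.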

Restricted forms

  • Costas array, a permutation matrix in which the displacement vectors between the entries are all distinct
  • n-queens puzzle, a permutation matrix in which there is at most one entry in each diagonal and antidiagonal
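Both restrictions can be phrased as simple tests on the positions of the 1 entries. The sketch below is plain Python and assumes the matrix is given by a permutation in one-line form, with the 1 of row $r$ in column `perm[r - 1]`; the function names are hypothetical, and the n-queens test checks the stronger condition that every occupied diagonal and antidiagonal holds exactly one entry.

```python
from itertools import combinations


def is_costas(perm):
    """True when the displacement vectors between all pairs of 1 entries are distinct."""
    points = list(enumerate(perm, start=1))  # (row, column) of each 1 entry
    vectors = [(r2 - r1, c2 - c1)
               for (r1, c1), (r2, c2) in combinations(points, 2)]
    return len(vectors) == len(set(vectors))


def is_queens_solution(perm):
    """True when no two 1 entries share a diagonal or an antidiagonal."""
    points = list(enumerate(perm, start=1))
    diagonals = {r - c for r, c in points}      # constant along each diagonal
    antidiagonals = {r + c for r, c in points}  # constant along each antidiagonal
    return len(diagonals) == len(perm) == len(antidiagonals)


costas_example = is_costas([1, 2, 4, 3])
queens_example = is_queens_solution([2, 4, 1, 3])
```

The permutation `[1, 2, 4, 3]` passes the Costas test but not the n-queens test, while `[2, 4, 1, 3]` passes the n-queens test but not the Costas test.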

References

  1. Artin, Michael (1991). Algebra. Prentice Hall. pp. 24–26, 118, 259, 322. ISBN 0-13-004763-5. OCLC 24364036.
  2. Zavlanos, Michael M.; Pappas, George J. (November 2008). "A dynamical systems approach to weighted graph matching". Automatica. 44 (11): 2817–2824. CiteSeerX 10.1.1.128.6870. doi:10.1016/j.automatica.2008.04.009. S2CID 834305. Retrieved 21 August 2022. Let $O_n$ denote the set of $n \times n$ orthogonal matrices and $N_n$ denote the set of $n \times n$ element-wise non-negative matrices. Then, $P_n = O_n \cap N_n$, where $P_n$ is the set of $n \times n$ permutation matrices.
  3. This terminology is not standard. Most authors use just one of the two correspondences, choosing which to be consistent with their other conventions. For example, Artin uses the column-based correspondence. We have here invented two names in order to discuss both options.
  4. Conway, John H.; Burgiel, Heidi; Goodman-Strauss, Chaim (2008). The Symmetries of Things. A K Peters/CRC Press. p. 179. doi:10.1201/b21368. ISBN 978-0-429-06306-0. OCLC 946786108. A permutation, say, of the names of a number of people, can be thought of as moving either the names or the people. The alias viewpoint regards the permutation as assigning a new name or alias to each person (from the Latin alias = otherwise). Alternatively, from the alibi viewpoint we move the people to the places corresponding to their new names (from the Latin alibi = in another place.)
  5. Brualdi 2006, p. 19
  6. Najnudel & Nikeghbali 2013, p. 4