FastICA

FastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology.[1][2] Like most ICA algorithms, FastICA seeks an orthogonal rotation of prewhitened data, through a fixed-point iteration scheme, that maximizes a measure of non-Gaussianity of the rotated components. Non-Gaussianity serves as a proxy for statistical independence, which is a very strong condition and would require infinite data to verify exactly. FastICA can also be derived as an approximate Newton iteration.

Algorithm

Prewhitening the data

Let $X := (x_{ij}) \in \mathbb{R}^{N \times M}$ denote the input data matrix, where $M$ is the number of columns, corresponding to the number of samples of mixed signals, and $N$ is the number of rows, corresponding to the number of independent source signals. The input data matrix $X$ must be prewhitened, that is, centered and whitened, before applying the FastICA algorithm to it.

  • Centering the data entails demeaning each component of the input data $X$, that is,
$$x_{ij} \leftarrow x_{ij} - \frac{1}{M} \sum_{j'=1}^{M} x_{ij'}$$
for each $i = 1, \ldots, N$ and $j = 1, \ldots, M$. After centering, each row of $X$ has an expected value of $0$.
  • Whitening the data requires a linear transformation $L \colon \mathbb{R}^{N \times M} \to \mathbb{R}^{N \times M}$ of the centered data so that the components of $L(X)$ are uncorrelated and have variance one. More precisely, if $X$ is a centered data matrix, the covariance of $L_x := L(X)$ is the $(N \times N)$-dimensional identity matrix, that is,
$$E\{L_x L_x^T\} = I_N$$
A common method for whitening is to perform an eigenvalue decomposition of the covariance matrix of the centered data $X$, $E\{X X^T\} = E D E^T$, where $E$ is the matrix of eigenvectors and $D$ is the diagonal matrix of eigenvalues. The whitened data matrix is then defined by
$$X \leftarrow D^{-1/2} E^T X.$$
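
As an illustration, the prewhitening step can be sketched in a few lines of NumPy. This is a minimal sketch, not a reference implementation: the function name prewhiten is an assumption, the covariance is estimated by the sample average $\frac{1}{M} X X^T$, and the covariance matrix is assumed to have full rank so that $D^{-1/2}$ exists.

    import numpy as np

    def prewhiten(X):
        """Center and whiten X (N mixed signals x M samples)."""
        # Centering: subtract the mean of each row (each mixed signal)
        Xc = X - X.mean(axis=1, keepdims=True)
        # Sample covariance E{X X^T} of the centered data (assumed full rank)
        cov = Xc @ Xc.T / Xc.shape[1]
        # Eigendecomposition E D E^T of the covariance matrix
        d, E = np.linalg.eigh(cov)
        # Whitening transform: X <- D^{-1/2} E^T X
        return np.diag(d ** -0.5) @ E.T @ Xc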

Single component extraction

The iterative algorithm finds the direction for the weight vector $w \in \mathbb{R}^N$ that maximizes a measure of non-Gaussianity of the projection $w^T X$, with $X \in \mathbb{R}^{N \times M}$ denoting a prewhitened data matrix as described above. Note that $w$ is a column vector. To measure non-Gaussianity, FastICA relies on a nonquadratic nonlinear function $f(u)$, its first derivative $g(u)$, and its second derivative $g'(u)$. Hyvärinen states that the functions

$$f(u) = \log \cosh(u), \quad g(u) = \tanh(u), \quad \text{and} \quad g'(u) = 1 - \tanh^2(u),$$

are useful for general purposes, while

$$f(u) = -e^{-u^2/2}, \quad g(u) = u\,e^{-u^2/2}, \quad \text{and} \quad g'(u) = (1 - u^2)\,e^{-u^2/2}$$

may be highly robust.[1] The steps for extracting the weight vector $w$ for a single component in FastICA are the following:

  1. Randomize the initial weight vector $w$.
  2. Let $w^+ \leftarrow E\{X g(w^T X)^T\} - E\{g'(w^T X)\}\, w$, where $E\{\cdot\}$ denotes averaging over all column vectors of the matrix $X$.
  3. Let $w \leftarrow w^+ / \lVert w^+ \rVert$.
  4. If not converged, go back to step 2.
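
For concreteness, here is a minimal NumPy sketch of this single-unit iteration using the log cosh contrast above, so that $g = \tanh$ and $g'(u) = 1 - \tanh^2(u)$. The convergence test (the new and old $w$ agree up to sign within a tolerance) and the parameter values max_iter and tol are assumptions, not part of the source.

    import numpy as np

    def fastica_one_unit(X, max_iter=200, tol=1e-6):
        """Extract one weight vector w from a prewhitened (N x M) matrix X."""
        N, M = X.shape
        rng = np.random.default_rng()
        # Step 1: random initial weight vector, normalized to unit length
        w = rng.standard_normal(N)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            wx = np.tanh(w @ X)                      # g(w^T X), shape (M,)
            # Step 2: w+ <- E{X g(w^T X)^T} - E{g'(w^T X)} w, averaging over columns
            w_new = (X @ wx) / M - (1.0 - wx ** 2).mean() * w
            # Step 3: renormalize
            w_new /= np.linalg.norm(w_new)
            # Step 4: converged once w stops changing (up to sign)
            if abs(w_new @ w) > 1 - tol:
                return w_new
            w = w_new
        return w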

Multiple component extraction

The single-unit iterative algorithm estimates only one weight vector, which extracts a single component. Estimating additional components that are mutually "independent" requires repeating the algorithm to obtain linearly independent projection vectors; note that the notion of independence here refers to maximizing non-Gaussianity in the estimated components. Hyvärinen provides several ways of extracting multiple components, the simplest being the following. Here, $\mathbf{1}_M$ is a column vector of 1's of dimension $M$.

Algorithm FastICA

Input: $C$, the number of desired components
Input: $X \in \mathbb{R}^{N \times M}$, a prewhitened matrix, where each column represents an $N$-dimensional sample, with $C \leq N$
Output: $W \in \mathbb{R}^{N \times C}$, an un-mixing matrix where each column projects $X$ onto an independent component
Output: $S \in \mathbb{R}^{C \times M}$, the independent components matrix, with each of the $M$ columns representing a $C$-dimensional sample

for p in 1 to C:
    $w_p \leftarrow$ random vector of length $N$
    while $w_p$ changes:
        $w_p \leftarrow \frac{1}{M} X g(w_p^T X)^T - \frac{1}{M} g'(w_p^T X)\, \mathbf{1}_M w_p$
        $w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^T w_j) w_j$
        $w_p \leftarrow w_p / \lVert w_p \rVert$
output $W \leftarrow [w_1, \ldots, w_C]$
output $S \leftarrow W^T X$
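
The pseudocode translates almost line for line into NumPy. The following is a sketch under the same assumptions as before ($g = \tanh$, convergence measured up to sign); the helper name fastica and the small synthetic demo at the end are illustrative, not from the source, and the demo reuses the prewhiten sketch given earlier.

    import numpy as np

    def fastica(X, C, max_iter=200, tol=1e-6):
        """Estimate C components from a prewhitened (N x M) matrix X.

        Returns the un-mixing matrix W (N x C) and the components S = W^T X.
        """
        N, M = X.shape
        rng = np.random.default_rng()
        W = np.zeros((N, C))
        for p in range(C):
            w = rng.standard_normal(N)
            w /= np.linalg.norm(w)
            for _ in range(max_iter):
                wx = np.tanh(w @ X)
                # Fixed-point update: (1/M) X g(w^T X)^T - (1/M) g'(w^T X) 1_M w
                w_new = (X @ wx) / M - (1.0 - wx ** 2).mean() * w
                # Gram-Schmidt step: decorrelate from the vectors found so far
                w_new -= W[:, :p] @ (W[:, :p].T @ w_new)
                w_new /= np.linalg.norm(w_new)
                converged = abs(w_new @ w) > 1 - tol
                w = w_new
                if converged:
                    break
            W[:, p] = w
        return W, W.T @ X

    # Hypothetical demo: unmix two synthetic non-Gaussian sources
    t = np.linspace(0, 8, 2000)
    S_true = np.vstack([np.sin(3 * t), np.sign(np.cos(5 * t))])
    X_mixed = np.array([[1.0, 0.5], [0.5, 1.0]]) @ S_true
    W, S_est = fastica(prewhiten(X_mixed), C=2)

As usual in ICA, the rows of the recovered $S$ match the true sources only up to permutation, sign, and scaling.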

References

  1. Hyvärinen, A.; Oja, E. (2000). "Independent component analysis: Algorithms and applications" (PDF). Neural Networks. 13 (4–5): 411–430. CiteSeerX 10.1.1.79.7003. doi:10.1016/S0893-6080(00)00026-5. PMID 10946390.
  2. Hyvärinen, A. (1999). "Fast and robust fixed-point algorithms for independent component analysis" (PDF). IEEE Transactions on Neural Networks. 10 (3): 626–634. CiteSeerX 10.1.1.297.8229. doi:10.1109/72.761722. PMID 18252563.
