Data Mining - NMF

Alternating least squares method:
If the two matrices W and H in the factorization A ≈ WH are solved for simultaneously, the problem is nonlinear. So first initialize the matrix W, then solve for the matrix H under the nonnegativity constraint, then use the resulting H to solve for W in turn, and so on, until the error tolerance is met. This procedure is called the alternating least squares method.
Because the factors W and H obtained this way are not unique and may contain negative elements, the result of each iteration also needs to be normalized.
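The loop described above can be sketched in NumPy roughly as follows. This is only an illustration under stated assumptions, not the code used in the original experiment: W starts from random nonnegative values, negative entries are set to zero after each least squares solve, the normalization step scales the columns of W to unit 2-norm (with the rows of H rescaled to compensate), and the iteration stops when the error barely changes. The names `nmf_als`, `n_iter` and `tol` are made up for this example.

```python
import numpy as np

def nmf_als(A, k, n_iter=200, tol=1e-6, seed=0):
    """Rank-k NMF sketch of a nonnegative matrix A by alternating least squares."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))              # assumed: random nonnegative start for W
    prev_err = np.inf
    for _ in range(n_iter):
        # Fix W, solve min ||A - W H||_F for H, then zero out negative entries.
        H = np.linalg.lstsq(W, A, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # Fix H, solve min ||A^T - H^T W^T||_F for W, then zero out negatives.
        W = np.linalg.lstsq(H.T, A.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
        # Normalize: unit 2-norm columns in W, rows of H rescaled so W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0)
        norms[norms == 0] = 1.0         # leave all-zero columns untouched
        W = W / norms
        H = H * norms[:, None]
        err = np.linalg.norm(A - W @ H) # Frobenius norm of the residual
        if abs(prev_err - err) <= tol * max(err, 1.0):
            break
        prev_err = err
    return W, H

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
W, H = nmf_als(A, k=2)
print("residual:", np.linalg.norm(A - W @ H))
```

Setting negative entries to zero is a simple way to keep the factors nonnegative, but it is not the same as solving each subproblem exactly under the nonnegativity constraint, so the error is not guaranteed to decrease at every step.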
Active Set Methods:
In the Frobenius norm, the matrix factorization can be viewed as a set of least squares problems in the 2-norm, one for each column of the matrix A.
Drawback of the active set method: it is too slow for NMF.
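To make the column-by-column view concrete: for a fixed W, the squared Frobenius error ||A - WH||_F^2 equals the sum over the columns j of ||a_j - W h_j||_2^2, so each column h_j of H can be found from its own nonnegative least squares problem. The sketch below uses `scipy.optimize.nnls`, which SciPy documents as an active set solver; the helper name `solve_H_columnwise` and the small matrices are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import nnls

def solve_H_columnwise(A, W):
    """For fixed W, solve min ||a_j - W h_j||_2 subject to h_j >= 0, one column at a time."""
    k, n = W.shape[1], A.shape[1]
    H = np.zeros((k, n))
    residuals = np.zeros(n)
    for j in range(n):
        # nnls returns the nonnegative solution and the 2-norm of its residual.
        H[:, j], residuals[j] = nnls(W, A[:, j])
    return H, residuals

W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 3.0, 0.0]])
H, residuals = solve_H_columnwise(A, W)

# For the third column, unconstrained least squares would give (2/3, -1/3);
# the nonnegativity constraint moves the solution to (0.5, 0).
print(H)

# The squared Frobenius error is the sum of the squared column residuals.
frob_sq = np.linalg.norm(A - W @ H) ** 2
print(np.isclose(frob_sq, np.sum(residuals ** 2)))
```

One solver call per column, repeated in every outer iteration, is part of why this approach becomes slow on large matrices.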
NMF can be used for classification.
As mentioned earlier, we need to initialize the matrix W. How should this matrix be chosen?
One option is an initialization computed from the SVD; it was tested in a MATLAB experiment.
Theory: First, compute the SVD of A. The first left singular vector u1 is used as the first column of the matrix W, and the first right singular vector v1 (transposed) is used as the first row of the matrix H. Since the matrix A is nonnegative and irreducible, these two singular vectors can be guaranteed to be nonnegative.
Next, the second left singular vector and the second right singular vector of A are used to construct a new matrix C (their outer product). The negative elements of C are set to zero, the SVD of C is computed, and its first left singular vector and first right singular vector (transposed) are used as the second column of W and the second row of H, respectively.
The third left singular vector and the third right singular vector are then used in the same way, and so on.
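The construction above can be sketched in NumPy as follows. This is a hedged illustration rather than the MATLAB code from the experiment: the function name `svd_init` is hypothetical, C is assumed to be the outer product of the j-th singular vector pair, no scaling by the singular values is applied, and `np.abs` is used only to remove the arbitrary overall sign that an SVD routine may attach to a singular vector (this assumes the leading singular value is simple).

```python
import numpy as np

def svd_init(A, k):
    """SVD-based starting guess: W is m x k, H is k x n, both nonnegative."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    m, n = A.shape
    W = np.zeros((m, k))
    H = np.zeros((k, n))
    # First pair: for a nonnegative irreducible A these vectors have a single sign.
    W[:, 0] = np.abs(U[:, 0])
    H[0, :] = np.abs(Vt[0, :])
    for j in range(1, k):
        C = np.outer(U[:, j], Vt[j, :])   # C built from the j-th singular vector pair
        C = np.maximum(C, 0.0)            # set the negative elements of C to zero
        Uc, sc, Vct = np.linalg.svd(C, full_matrices=False)
        W[:, j] = np.abs(Uc[:, 0])        # first left singular vector of C
        H[j, :] = np.abs(Vct[0, :])       # first right singular vector of C (transposed)
    return W, H

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
W0, H0 = svd_init(A, k=2)
print(W0)
print(H0)
```

The resulting W0 can then replace a random starting matrix in an alternating least squares iteration.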

Test results:
The first four columns of the matrix H (each column corresponds to a document) are well represented by the initially selected Google keywords.
In contrast, the third column (also corresponding to a document) is not well represented by the initially selected Google keywords, because (0.5251 0) not only contains a zero element, but its nonzero element is also not large.

Origin blog.csdn.net/qq_43448491/article/details/103068951