Mathematics for Machine Learning - Eigenvectors and Matrix Diagonalization


  1. Linear Algebra
    vectors, vector spaces; matrices, linear transformations;
    eigenvalues, eigenvectors; singular values, singular value decomposition
  2. Probability and Statistics
    random events; conditional probability, total probability, Bayesian probability;
    statistics, common distributions; basic principles
  3. Optimization Theory
    limits, derivatives; linear approximation, Taylor expansion;
    convex functions, Jensen's inequality; least squares method; gradients, gradient descent

Matrices and linear transformations

A matrix describes a linear transformation. A linear transformation keeps straight lines straight and keeps parallel lines parallel, but it does not move the origin.
\[\pmb v=\begin{bmatrix}x\\y\\z\\\end{bmatrix}=\begin{bmatrix}x\\0\\0\\\end{bmatrix}+\begin{bmatrix}0\\y\\0\\\end{bmatrix}+\begin{bmatrix}0\\0\\z\\\end{bmatrix}\]
\[\pmb v=\begin{bmatrix}x\\y\\z\\\end{bmatrix}=x\times\begin{bmatrix}1\\0\\0\\\end{bmatrix}+y\times\begin{bmatrix}0\\1\\0\\\end{bmatrix}+z\times\begin{bmatrix}0\\0\\1\\\end{bmatrix}\]
The coordinates of a vector can be read as displacements parallel to the respective axes.
If the rows of a matrix are interpreted as the basis vectors of a coordinate system, then multiplying by the matrix is equivalent to performing a coordinate conversion. If \(aM=b\), we say that \(M\) converts \(a\) into \(b\).
From this point of view, the terms "conversion" and "multiplication" are equivalent.
Frankly, a matrix is not mysterious; it is just a compact way of expressing the mathematical operations needed for a coordinate conversion. Furthermore, using the matrix operations of linear algebra, simple transformations can be combined to derive more complex ones.
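As a small illustration (not from the original post), the following NumPy sketch uses the row-vector convention described above: the rows of \(M\) are the basis vectors of the new coordinate system, and \(aM=b\) performs the coordinate conversion. The matrix and vector are made-up example values.

```python
import numpy as np

# Rows of M are the basis vectors of the target coordinate system
# (made-up example: a 90-degree rotation about the z-axis).
M = np.array([[0.0, 1.0, 0.0],    # where the x-axis basis vector ends up
              [-1.0, 0.0, 0.0],   # where the y-axis basis vector ends up
              [0.0, 0.0, 1.0]])   # where the z-axis basis vector ends up

a = np.array([2.0, 3.0, 1.0])     # coordinates in the original system

# With the row-vector convention, a M = b converts the coordinates of a.
b = a @ M
print(b)  # [-3.  2.  1.]
```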
We can also scale in an arbitrary direction, independent of the coordinate system. Let \(\vec n\) be a unit vector parallel to the scaling direction and \(k\) the scaling factor; the scaling is performed relative to the line (2D) or plane (3D) that passes through the origin and is parallel to \(\vec n\). Decompose \(\vec v\) into components parallel and perpendicular to \(\vec n\):

\(\vec v=\vec v_{\|}+\vec v_{\perp}\)
\(\vec v_{\|}=(\vec v\cdot\vec n)\vec n\)
\(\vec v_{\perp}=\vec v-\vec v_{\|}=\vec v-(\vec v\cdot\vec n)\vec n\)

\(\vec v^{'}=\vec v_{\|}^{'}+\vec v_{\perp}^{'}\)
\(\vec v_{\perp}^{'}=\vec v_{\perp}=\vec v-(\vec v\cdot\vec n)\vec n\)
\(\vec v_{\|}^{'}=k\times\vec v_{\|}=k\times(\vec v\cdot\vec n)\vec n\)
\(\vec v^{'}=\vec v_{\perp}^{'}+\vec v_{\|}^{'}=\vec v-(\vec v\cdot\vec n)\vec n+k(\vec v\cdot\vec n)\vec n=\vec v+(k-1)(\vec v\cdot\vec n)\vec n\)
(To be completed)
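A minimal NumPy sketch (not from the original post) of the formula just derived, \(\vec v^{'}=\vec v+(k-1)(\vec v\cdot\vec n)\vec n\); the vectors and scale factor below are made-up values, and \(\vec n\) is normalized inside the helper.

```python
import numpy as np

def scale_along_direction(v, n, k):
    """Scale v by factor k along the direction n: v' = v + (k-1)(v.n)n."""
    n = n / np.linalg.norm(n)              # the formula assumes n is a unit vector
    return v + (k - 1.0) * np.dot(v, n) * n

v = np.array([3.0, 4.0, 0.0])              # made-up vector
n = np.array([1.0, 0.0, 0.0])              # scale along the x-axis
print(scale_along_direction(v, n, 2.0))    # [6. 4. 0.] -- only the parallel component doubles
```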

Eigenvalues and eigenvectors

Definition: Let \(\pmb A\) be an \(n\)-order square matrix. If there exist a scalar \(\lambda\) and a nonzero vector \(\vec v\) such that \(\pmb A\vec v=\lambda\vec v\), then \(\lambda\) is called an eigenvalue of \(\pmb A\), and \(\vec v\) is called an eigenvector of \(\pmb A\) corresponding to \(\lambda\).
An eigenvalue may be 0, but an eigenvector must not be the zero vector.

\(\pmb A\vec x=\lambda\vec x~~~~\vec x\neq0\)

\((\pmb A-\lambda\pmb E)\vec x=0\)

\(|\pmb A-\lambda\pmb E|=0\)
where \(\lambda\) and \(\vec x\) are the unknowns we need to find

  • \(\pmb A\vec x\) represents a linear transformation of the vector; \(\lambda\vec x\) represents a scaling of the vector
  • The meaning of an eigenvector is a vector on which the transformation acts only as a stretch
  • The corresponding eigenvalue measures the stretch factor
  • Intuitively, the eigenvalue is the speed of the motion and the eigenvector is the direction of the motion

NOTE: Only square matrices have eigenvalues and eigenvectors
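A short numerical check of the definition \(\pmb A\vec v=\lambda\vec v\) (the matrix here is just an illustrative example, not one from this post). Note that np.linalg.eig only accepts square matrices, in line with the note above.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                     # illustrative square matrix

eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are eigenvectors

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # A only stretches its eigenvector v by the factor lam.
    print(lam, np.allclose(A @ v, lam * v))    # prints each eigenvalue and True
```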

***
Example:
\[\pmb A=\begin{bmatrix}4&0\\3&-5\\\end{bmatrix}\]
Eigenvalues:
\(|\pmb A-\lambda\pmb E|=\begin{vmatrix}4-\lambda&0\\3&-5-\lambda\\\end{vmatrix}=(4-\lambda)(-5-\lambda)=0\)
gives: \(\lambda_{1}=-5,~\lambda_{2}=4\)
For the eigenvalue \(\lambda_{1}=-5\), compute the eigenvector \(\pmb X_{1}\):
\(\begin{bmatrix}9&0\\3&0\\\end{bmatrix}\cdot\vec x=0~~~\pmb X_{1}=\begin{bmatrix}0\\1\\\end{bmatrix}\)
For the eigenvalue \(\lambda_{2}=4\), compute the eigenvector \(\pmb X_{2}\):
\(\begin{bmatrix}0&0\\3&-9\\\end{bmatrix}\cdot\vec x=0~~~\pmb X_{2}=\begin{bmatrix}3\\1\\\end{bmatrix}\)
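As a quick sanity check (not part of the original post), the hand-computed eigenpairs of this example can be verified numerically:

```python
import numpy as np

A = np.array([[4.0, 0.0],
              [3.0, -5.0]])

# Hand-computed eigenpairs from the example above: (-5, [0, 1]) and (4, [3, 1]).
print(np.allclose(A @ np.array([0.0, 1.0]), -5.0 * np.array([0.0, 1.0])))  # True
print(np.allclose(A @ np.array([3.0, 1.0]),  4.0 * np.array([3.0, 1.0])))  # True
```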

Example:
\[\pmb A=\begin{bmatrix}4&-2\\3&-1\\\end{bmatrix}\]
Eigenvalues:
\(|\pmb A-\lambda\pmb E|=\begin{vmatrix}4-\lambda&-2\\3&-1-\lambda\\\end{vmatrix}=(4-\lambda)(-1-\lambda)+6=0\)
gives: \(\lambda_{1}=1,~\lambda_{2}=2\)
For the eigenvalue \(\lambda_{1}=1\), compute the eigenvector \(\pmb X_{1}\):
\(\begin{bmatrix}3&-2\\3&-2\\\end{bmatrix}\cdot\vec x=0~~~\pmb X_{1}=\begin{bmatrix}2\\3\\\end{bmatrix}\)
For the eigenvalue \(\lambda_{2}=2\), compute the eigenvector \(\pmb X_{2}\):
\(\begin{bmatrix}2&-2\\3&-3\\\end{bmatrix}\cdot\vec x=0~~~\pmb X_{2}=\begin{bmatrix}1\\1\\\end{bmatrix}\)
Another way to calculate: first, express \(\vec x\) as a linear combination of the eigenvectors \(\begin{bmatrix}1\\1\\\end{bmatrix}\) and \(\begin{bmatrix}2\\3\\\end{bmatrix}\), namely:
\[\vec x=\begin{bmatrix}1\\2\\\end{bmatrix}=-1\cdot\begin{bmatrix}1\\1\\\end{bmatrix}+1\cdot\begin{bmatrix}2\\3\\\end{bmatrix}\]
Then multiply each coefficient by the corresponding eigenvalue to obtain:
\[\vec y=-1\cdot2\cdot\begin{bmatrix}1\\1\\\end{bmatrix}+1\cdot1\cdot\begin{bmatrix}2\\3\\\end{bmatrix}=\begin{bmatrix}0\\1\\\end{bmatrix}\]
This is identical to \(\vec y=\pmb A\vec x=\begin{bmatrix}0\\1\\\end{bmatrix}\). It shows that the linear transformation of \(\vec x\) by \(\pmb A\) can be expressed through the eigenvalues and eigenvectors of \(\pmb A\) together with a linear combination of \(\vec x\); in other words, for a linear transformation, the eigenvalues and eigenvectors can stand in for the matrix.
As a map, the matrix actually scales its eigenvectors, and the scaling factor of each eigenvector is its eigenvalue.
Express \(\vec x\) as a linear combination of the eigenvectors (which act as a basis) to obtain the corresponding weights; then multiply each weight by its eigenvalue. This scaling is the essence of the mapping.
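A small NumPy sketch (not from the original post) of this alternative calculation for the example above: solve for the weights of \(\vec x\) in the eigenvector basis, multiply each weight by its eigenvalue, and recombine; the result matches \(\pmb A\vec x\).

```python
import numpy as np

A = np.array([[4.0, -2.0],
              [3.0, -1.0]])
x = np.array([1.0, 2.0])

# Eigenvectors as columns, with their eigenvalues, taken from the example above.
S = np.array([[1.0, 2.0],
              [1.0, 3.0]])            # columns: [1, 1] (lambda = 2) and [2, 3] (lambda = 1)
lams = np.array([2.0, 1.0])

weights = np.linalg.solve(S, x)       # coordinates of x in the eigenvector basis: [-1, 1]
y = S @ (lams * weights)              # scale each coordinate by its eigenvalue, then recombine

print(weights)                        # [-1.  1.]
print(np.allclose(y, A @ x))          # True: same result as applying A directly
```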
***

Finding eigenvalues

Singular matrix

Similar matrices

Definition: Let \(\pmb A\) and \(\pmb B\) be \(n\)-order square matrices. If there exists an invertible matrix \(\pmb P\) such that \(\pmb P^{-1}\cdot\pmb A\cdot\pmb P=\pmb B\), then \(\pmb A\) and \(\pmb B\) are said to be similar.

Diagonalization


Definition and proof
Definition: Suppose an \(n\times n\) square matrix \(\pmb A\) has \(n\) linearly independent eigenvectors \(v_1,v_2,\cdots,v_n\). Form the matrix \(\pmb S\) whose columns are these eigenvectors. Then \(\pmb S^{-1}\pmb A\pmb S=\pmb\Lambda\), where \(\pmb\Lambda\) is the diagonal matrix composed of the eigenvalues corresponding to the columns of \(\pmb S\), i.e.:
\[\pmb S^{-1}\pmb A\pmb S=\pmb\Lambda=\begin{bmatrix}\lambda_1\\&\ddots\\&&\lambda_n\\\end{bmatrix}\]
Proof:
\(\pmb A\pmb S=\pmb A\begin{bmatrix}v_1&v_2&v_3&\cdots&v_n\end{bmatrix}=\begin{bmatrix}\lambda_1v_1&\lambda_2v_2&\lambda_3v_3&\cdots&\lambda_nv_n\end{bmatrix}=\pmb S\begin{bmatrix}\lambda_1\\&\ddots\\&&\lambda_n\end{bmatrix}=\pmb S\pmb{\Lambda}\)

\(\pmb S^{-1}\pmb A\pmb S=\pmb S^{-1}\pmb S\pmb{\Lambda}=\pmb{\Lambda}\)

\(\pmb A=\pmb S\pmb{\Lambda}\pmb S^{-1}\) (matrix diagonalization)
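A minimal NumPy sketch of the identity above; the matrix is an arbitrary diagonalizable example (not from this post). \(\pmb S\) is built from the eigenvectors and \(\pmb\Lambda\) from the eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                     # arbitrary diagonalizable example

lams, S = np.linalg.eig(A)                     # columns of S are the eigenvectors of A
Lambda = np.diag(lams)                         # diagonal matrix of the eigenvalues
S_inv = np.linalg.inv(S)

print(np.allclose(S_inv @ A @ S, Lambda))      # True: S^{-1} A S = Lambda
print(np.allclose(A, S @ Lambda @ S_inv))      # True: A = S Lambda S^{-1}
```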
***
Example:
\[\pmb A=\begin{bmatrix}-3&2\\-10&6\\\end{bmatrix}\]
Diagonalize \(\pmb A\).

Solution:
\(\pmb A-\lambda\pmb E=\begin{bmatrix}-3-\lambda&2\\-10&6-\lambda\end{bmatrix}\)

\((-3-\lambda)(6-\lambda)+20=0\)

\(\lambda_1=1,~~~\lambda_2=2\)

The eigenvector \(v_1\) corresponding to \(\lambda_1\):

\(\begin{bmatrix}-4&2\\-10&5\end{bmatrix}\cdot v_1=0, ~~~v_1=\begin{bmatrix}1\\2\end{bmatrix}\)

The eigenvector \(v_2\) corresponding to \(\lambda_2\):

\(\begin{bmatrix}-5&2\\-10&4\end{bmatrix}\cdot v_2=0, ~~~v_2=\begin{bmatrix}2\\5\end{bmatrix}\)

\(\pmb P=\begin{bmatrix}\vec v_1&\vec v_2\end{bmatrix}=\begin{bmatrix}1&2\\2&5\\\end{bmatrix}\)

\(\pmb P^{-1}=\begin{bmatrix}5&-2\\-2&1\\\end{bmatrix}\) (for a second-order matrix: swap the entries on the main diagonal, negate the off-diagonal entries, and divide by \(\det(\pmb P)\); here \(\det(\pmb P)=1\))
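A quick numerical check of this example (not in the original post): with the \(\pmb P\) and \(\pmb P^{-1}\) above, \(\pmb P^{-1}\pmb A\pmb P\) is the diagonal matrix of the eigenvalues.

```python
import numpy as np

A = np.array([[-3.0, 2.0],
              [-10.0, 6.0]])
P = np.array([[1.0, 2.0],
              [2.0, 5.0]])
P_inv = np.array([[5.0, -2.0],
                  [-2.0, 1.0]])               # swap diagonal, negate off-diagonal, det(P) = 1

print(P_inv @ A @ P)                          # [[1. 0.] [0. 2.]] = diag(lambda_1, lambda_2)
print(np.allclose(P @ P_inv, np.eye(2)))      # True
```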

