Kalman Filter: Concept Introduction and Detailed Formula Derivation

Kalman filter

The Kalman filter is an efficient recursive filter (autoregressive filter) that can estimate the state of a dynamic system from a series of incomplete and noisy measurements. At each time step, Kalman filtering considers the joint distribution over the variables given the measurements taken at different times, and then produces an estimate of the unknown variables; this makes it more accurate than estimation methods based on a single measurement alone. The filter is named after Rudolf Kalman, one of its primary developers.

The Kalman filter algorithm is a two-step procedure. In the prediction step, the filter produces an estimate of the current state, together with its uncertainty. Once the next measurement is observed (necessarily corrupted by some amount of error, including random noise), the estimate is updated via a weighted average, in which measurements with higher certainty receive more weight. The algorithm is recursive and can run in a real-time control system: it requires only the current measurement, the previously calculated state, and its uncertainty matrix, and no other past information.
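To make the weighted-average idea concrete, here is a minimal one-dimensional sketch (the function name `fuse` and the numbers are illustrative assumptions, not part of any standard API):

```python
def fuse(estimate, est_var, measurement, meas_var):
    """Combine a prior estimate with a noisy measurement as a
    precision-weighted average: the more certain value gets more weight."""
    gain = est_var / (est_var + meas_var)   # weight given to the measurement
    new_estimate = estimate + gain * (measurement - estimate)
    new_var = (1.0 - gain) * est_var        # fused result is more certain than either input
    return new_estimate, new_var

# Prior: 10.0 with variance 4; measurement: 12.0 with variance 1.
# The measurement is more certain, so the result lands closer to 12.
print(fuse(10.0, 4.0, 12.0, 1.0))  # -> approx (11.6, 0.8)
```

Here `gain` plays exactly the role that the Kalman gain $K_k$ plays in the matrix form derived below.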

Applications

A typical example of Kalman filtering is predicting the position coordinates and velocity of an object from a finite sequence of noisy (and possibly biased) observations of its position. It appears in many engineering applications, such as radar and machine vision. Kalman filtering is also an important topic in control theory and control systems engineering.

For example, with radar, one is interested in tracking a target, but the measured values of the target's position, velocity, and acceleration are noisy at every instant. Kalman filtering uses the dynamics of the target to try to remove the influence of the noise and obtain a good estimate of the target's position. This estimate can refer to the current position (filtering), a future position (prediction), or a past position (interpolation or smoothing).

Basic Dynamic System Model

Kalman filtering is built on linear algebra and hidden Markov models. Its underlying dynamical system can be represented by a Markov chain built on a linear operator perturbed by Gaussian noise (i.e., normally distributed noise). The state of the system is represented by a vector of real numbers. At each discrete time increment, the linear operator acts on the current state to generate the new state, mixing in some noise and, optionally, some known control information from a controller of the system. At the same time, another linear operator, also corrupted by noise, produces the visible outputs from these hidden states.

To use the Kalman filter to estimate the internal state of a process from a series of noisy observations, the process must be modeled within the Kalman filter framework. That is, for each step $k$, one defines the matrices $F_k$, $H_k$, $Q_k$, $R_k$, and sometimes also $B_k$, as follows.

The Kalman filter model assumes that the true state at time $k$ evolves from the state at time $k-1$ according to the state equation:

$$x_k = F_k x_{k-1} + B_k u_k + w_k$$

where

  • $F_k$ is the state transition model (a matrix) applied to the previous state $x_{k-1}$.
  • $B_k$ is the control-input model applied to the control vector $u_k$.
  • $w_k$ is the process noise, assumed to follow a zero-mean multivariate normal distribution with covariance matrix $Q_k$:

$$w_k \sim N(0, Q_k)$$

where $\mathrm{cov}\{w(k)\} = Q(k)$ denotes the covariance matrix.

At time $k$, a measurement $z_k$ of the true state $x_k$ satisfies the observation equation:

$$z_k = H_k x_k + v_k$$

where $H_k$ is the observation model, which maps the true state space into the observation space, and $v_k$ is the observation noise, assumed to follow a zero-mean normal distribution with covariance matrix $R_k$:

$$v_k \sim N(0, R_k)$$

where $\mathrm{cov}\{v(k)\} = R(k)$ denotes the covariance matrix.

The initial state and the noise terms at each step, $\{x_0, w_1, \dots, w_k, v_1, \dots, v_k\}$, are all assumed to be mutually independent.
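As a concrete illustration, the sketch below instantiates these matrices for the one-dimensional constant-velocity tracking example from the Applications section. The state is $[\text{position}, \text{velocity}]^T$, only the position is measured, and the values of `dt`, `Q`, and `R` are assumptions made for the sketch, not values prescribed by the model:

```python
import numpy as np

dt = 1.0  # sampling interval (assumed)

F = np.array([[1.0, dt],
              [0.0, 1.0]])       # state transition model F_k: constant-velocity motion
B = np.array([[0.5 * dt**2],
              [dt]])             # control-input model B_k (u_k = commanded acceleration)
H = np.array([[1.0, 0.0]])       # observation model H_k: only position is observed
Q = 0.01 * np.eye(2)             # process noise covariance Q_k (assumed value)
R = np.array([[1.0]])            # observation noise covariance R_k (assumed value)
```

These definitions are reused in the prediction and update sketches below.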

Algorithm Logic

Kalman filtering is a recursive estimator: the estimate of the current state can be computed from just the estimated state at the previous time step and the current observation, so no history of observations or estimates needs to be stored. The Kalman filter also differs from most filters in that it is a pure time-domain filter; it does not need to be designed in the frequency domain and then converted to the time domain, as low-pass filters and other frequency-domain filters are.

Kalman filtering consists of two phases: prediction and update. It requires the state equation and observation equation of the system to be constructed, and the initial state of the system to be known. In the prediction phase, the filter uses the estimate from the previous state to produce an estimate of the current state. In the update phase, the filter refines this prediction using the observation of the current state, obtaining a new, more accurate estimate.

The state of the Kalman filter is represented by the following variables:

  • $\hat{X}_{k|k} = E(X_k \mid Y_1, Y_2, \dots, Y_k)$: the estimate of the state at time $k$, given the observations up to and including time $k$.
  • $\hat{X}_{k|k-1} = E(X_k \mid Y_1, Y_2, \dots, Y_{k-1})$: the prediction of the state at time $k$, given the observations up to time $k-1$.
  • $P_{k|k}$: the posterior estimate error covariance matrix, which measures the accuracy of the state estimate.

The prediction and update (prediction-correction) process is then:

$$\hat{X}_{k-1|k-1} \xrightarrow{\text{Prediction}} \hat{X}_{k|k-1} \xrightarrow{\text{Correction}} \hat{X}_{k|k}$$

1. Prediction: $\hat{X}_{k-1|k-1} \Rightarrow \hat{X}_{k|k-1}$

In the prediction step, the state at the current time is predicted from the state and control input at the previous time. This predicted value is only an estimate, because it does not yet take the current observation into account. The error covariance matrix of the prediction is computed from the error covariance matrix at the previous time and the process noise covariance matrix.
$$\begin{cases} \hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k \\ P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k \end{cases}$$
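These two formulas translate directly into NumPy; the sketch below reuses the `F`, `B`, `Q` matrices defined earlier:

```python
def predict(x_est, P_est, F, B, u, Q):
    """Prediction step:
    x_hat(k|k-1) = F_k x_hat(k-1|k-1) + B_k u_k
    P(k|k-1)     = F_k P(k-1|k-1) F_k^T + Q_k"""
    x_pred = F @ x_est + B @ u
    P_pred = F @ P_est @ F.T + Q
    return x_pred, P_pred
```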

2. Update: $\hat{X}_{k|k-1} \Rightarrow \hat{X}_{k|k}$

In the update step, the estimate of the current state is computed from the current observation and the predicted value. This estimate is more accurate because it now incorporates the current observation. The error covariance matrix of the state estimate is computed from the error covariance matrix obtained in the prediction step, the observation noise covariance matrix, and the Kalman gain.
$$\begin{cases} K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1} \\ \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H_k \hat{x}_{k|k-1}) \\ P_{k|k} = (I - K_k H_k) P_{k|k-1} \end{cases}$$
The five formulas above are the core of Kalman filtering. A more concise and intuitive form of the update step first computes the following three quantities:
$$\begin{cases} \hat{y}_k = z_k - H_k \hat{x}_{k|k-1} & \text{(measurement residual)} \\ S_k = H_k P_{k|k-1} H_k^T + R_k & \text{(measurement residual covariance)} \\ K_k = P_{k|k-1} H_k^T S_k^{-1} & \text{(optimal Kalman gain)} \end{cases}$$
These are then used to update the filter variables:
$$\begin{cases} \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \hat{y}_k & \text{(updated state estimate)} \\ P_{k|k} = (I - K_k H_k) P_{k|k-1} & \text{(updated covariance estimate)} \end{cases}$$
The formula above for $P_{k|k}$ is valid only with the optimal Kalman gain. For other gains the formula is more complicated; see the derivation below.
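The residual form above translates almost line for line into code. This sketch uses the optimal gain, so the simplified covariance update is valid (see the caveat just mentioned):

```python
import numpy as np

def update(x_pred, P_pred, z, H, R):
    """Update step, written in the residual form."""
    y = z - H @ x_pred                      # measurement residual y_k
    S = H @ P_pred @ H.T + R                # measurement residual covariance S_k
    K = P_pred @ H.T @ np.linalg.inv(S)     # optimal Kalman gain K_k
    x_est = x_pred + K @ y                  # updated state estimate
    P_est = (np.eye(len(x_pred)) - K @ H) @ P_pred  # valid only for the optimal K
    return x_est, P_est
```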

3. Kalman filtering process

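The complete process chains the two steps: each cycle runs a prediction followed by an update. The sketch below wires the `predict` and `update` functions from the previous sections, together with the `F`, `B`, `H`, `Q`, `R` matrices defined earlier, into a full filtering loop on simulated data; the initial guesses and the 20-step horizon are arbitrary assumptions:

```python
rng = np.random.default_rng(0)

x_true = np.array([[0.0], [1.0]])   # true initial state: position 0, velocity 1
x_est  = np.array([[0.0], [0.0]])   # initial state guess (assumed)
P_est  = np.eye(2)                  # initial error covariance (assumed)
u      = np.array([[0.0]])          # no control input in this run

for k in range(20):
    # Simulate the system: evolve the true state and take a noisy measurement.
    w = rng.multivariate_normal(np.zeros(2), Q).reshape(2, 1)
    x_true = F @ x_true + w
    z = H @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), size=(1, 1))

    # One full prediction-correction cycle.
    x_pred, P_pred = predict(x_est, P_est, F, B, u, Q)
    x_est,  P_est  = update(x_pred, P_pred, z, H, R)

print("estimate:", x_est.ravel(), "true state:", x_true.ravel())
```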

Formula Derivation

1. Symbol Definition

State equation
$$x_k = F_k x_{k-1} + B_k u_k + w_k$$

where $u_k$ is the input signal, $x_{k-1}$ is the state variable, and $w_k$ is the process noise, with $\mathrm{cov}\{w(k)\} = Q(k)$.

Observation equation
$$z_k = H_k x_k + v_k$$

where $z_k$ is the observed variable and $v_k$ is the observation noise, with $\mathrm{cov}\{v(k)\} = R(k)$.

State estimation error
$$\tilde{x}_{k|k-1} = x_k - \hat{x}_{k|k-1}$$

where $\tilde{x}_{k|k-1}$ denotes the error of the estimate of $x_k$ made at time $k-1$. At time $k-1$, any quantity at time $k$ can only be estimated (a prediction of a future value).

State estimation equation
$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k$$
Observation estimation error
$$\tilde{e}_k = z_k - \hat{z}_{k|k-1}$$
Observation estimation equation
$$\hat{z}_{k|k-1} = H_k \hat{x}_{k|k-1}$$
Based on the above formulas, two important error covariance matrices can be defined:

State error covariance matrix
$$P_{k|k-1} = \mathrm{cov}\{\tilde{x}_{k|k-1}\}$$

Observation error covariance matrix
$$S_k = \mathrm{cov}\{\tilde{e}_k\}$$

The purpose of Kalman filtering is to obtain an iterative estimation expression that is continuously corrected by the error, of the specific form:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \tilde{e}_k$$

That is, the prediction is corrected using the error, where $K_k$ is the Kalman gain. We therefore need to derive iterative expressions for the quantities involved and to find the Kalman gain.

2. Derivation

Deriving the error covariance matrix

State error covariance matrix:

$$\begin{aligned} P_{k|k-1} &= \mathrm{cov}\{x_k - \hat{x}_{k|k-1}\} \\ &= \mathrm{cov}\{F_k x_{k-1} + B_k u_k + w_k - F_k \hat{x}_{k-1|k-1} - B_k u_k\} \\ &= \mathrm{cov}\{F_k (x_{k-1} - \hat{x}_{k-1|k-1}) + w_k\} \\ &= \mathrm{cov}\{F_k (x_{k-1} - \hat{x}_{k-1|k-1})\} + \mathrm{cov}\{w_k\} \\ &= F_k \,\mathrm{cov}\{x_{k-1} - \hat{x}_{k-1|k-1}\}\, F_k^T + Q_k \\ &= F_k P_{k-1|k-1} F_k^T + Q_k \end{aligned}$$

The process noise $w_k$ is independent of the estimation error, so the covariances add. This gives the recursion from $P_{k-1|k-1}$ to $P_{k|k-1}$, the first link in the chain

$$P_{k-1|k-1} \rightarrow P_{k|k-1} \rightarrow P_{k|k}$$
Similarly, the recursive form of the observation error covariance matrix is obtained:

$$S_k = H_k P_{k|k-1} H_k^T + R_k$$

Note that $S_k$ has no direct connection to $S_{k-1}$, which is why it is not written in the form $S_{k|k-1}$.

Therefore, the recursion $P_{k|k-1} \rightarrow P_{k|k}$ is still missing.

Deriving the posterior covariance matrix

By definition, the error covariance $P_{k|k}$ is as follows; substituting the definitions above and simplifying line by line:
$$\begin{aligned} P_{k|k} &= \mathrm{cov}\{\tilde{x}_{k|k}\} \\ &= \mathrm{cov}\{x_k - \hat{x}_{k|k}\} \\ &= \mathrm{cov}\{x_k - (\hat{x}_{k|k-1} + K_k \tilde{e}_k)\} \\ &= \mathrm{cov}\{x_k - (\hat{x}_{k|k-1} + K_k (z_k - \hat{z}_{k|k-1}))\} \\ &= \mathrm{cov}\{x_k - (\hat{x}_{k|k-1} + K_k (z_k - H_k \hat{x}_{k|k-1}))\} \\ &= \mathrm{cov}\{x_k - (\hat{x}_{k|k-1} + K_k (H_k x_k + v_k - H_k \hat{x}_{k|k-1}))\} \\ &= \mathrm{cov}\{(x_k - \hat{x}_{k|k-1}) - K_k H_k (x_k - \hat{x}_{k|k-1}) - K_k v_k\} \\ &= \mathrm{cov}\{(I - K_k H_k)(x_k - \hat{x}_{k|k-1}) - K_k v_k\} \\ &= \mathrm{cov}\{(I - K_k H_k)(x_k - \hat{x}_{k|k-1})\} + \mathrm{cov}\{K_k v_k\} \\ &= (I - K_k H_k)\,\mathrm{cov}\{x_k - \hat{x}_{k|k-1}\}\,(I - K_k H_k)^T + K_k \,\mathrm{cov}\{v_k\}\, K_k^T \\ &= (I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T \end{aligned}$$
Here the measurement noise $v_k$ is uncorrelated with the other terms, so its covariance can be separated out. This formula holds for any Kalman gain $K_k$; when $K_k$ is the optimal Kalman gain, it can be simplified further.

Derivation of Optimal Kalman Gain

The Kalman filter is a minimum mean-square-error estimator. The posterior state estimation error is $x_k - \hat{x}_{k|k}$.

We minimize the expected squared magnitude of this vector, $E[\|x_k - \hat{x}_{k|k}\|^2]$, which is equivalent to minimizing the trace of the posterior estimate covariance matrix $P_{k|k}$. Expanding and combining terms in the equation above:
$$\begin{aligned} P_{k|k} &= P_{k|k-1} - K_k H_k P_{k|k-1} - P_{k|k-1} H_k^T K_k^T + K_k H_k P_{k|k-1} H_k^T K_k^T + K_k R_k K_k^T \\ &= P_{k|k-1} - K_k H_k P_{k|k-1} - P_{k|k-1} H_k^T K_k^T + K_k (H_k P_{k|k-1} H_k^T + R_k) K_k^T \\ &= P_{k|k-1} - K_k H_k P_{k|k-1} - P_{k|k-1} H_k^T K_k^T + K_k S_k K_k^T \end{aligned}$$
Setting the matrix derivative to zero yields the minimum of the trace of $P_{k|k}$:

$$\frac{\mathrm{d}\,\mathrm{tr}(P_{k|k})}{\mathrm{d}K_k} = -2(H_k P_{k|k-1})^T + 2 K_k S_k = 0$$
A commonly used matrix-calculus identity is needed here [Reference 1][Reference 2][Reference 3]:

$$\frac{\mathrm{d}\,\mathrm{tr}(BAC)}{\mathrm{d}A} = B^T C^T$$
The Kalman gain can then be solved for:

$$K_k = P_{k|k-1} H_k^T S_k^{-1}$$
This gain is called the optimal Kalman gain and when used results in the minimum mean square error.
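The trace identity used in this step is easy to sanity-check numerically; since $\mathrm{tr}(BAC)$ is linear in $A$, the finite difference is exact up to floating-point error. The sketch below (names purely illustrative) compares the analytic derivative $B^T C^T$ with a finite-difference approximation on random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
B, A, C = (rng.standard_normal((3, 3)) for _ in range(3))

analytic = B.T @ C.T          # claimed value of d tr(BAC) / dA
numeric = np.zeros_like(A)
eps = 1e-6
for i in range(3):
    for j in range(3):
        dA = np.zeros_like(A)
        dA[i, j] = eps        # perturb one entry of A at a time
        numeric[i, j] = (np.trace(B @ (A + dA) @ C) - np.trace(B @ A @ C)) / eps

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```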

Simplification of Posterior Error Covariance Formula

Multiplying both sides of the optimal Kalman gain formula on the right by $S_k K_k^T$ gives

$$K_k S_k K_k^T = P_{k|k-1} H_k^T K_k^T$$
Substituting this into the posterior error covariance expansion gives

$$\begin{aligned} P_{k|k} &= P_{k|k-1} - K_k H_k P_{k|k-1} - P_{k|k-1} H_k^T K_k^T + K_k S_k K_k^T \\ &= P_{k|k-1} - K_k H_k P_{k|k-1} \\ &= (I - K_k H_k) P_{k|k-1} \end{aligned}$$
In practice this formula is widely used because it is simple to compute, but note that it is valid only when the optimal Kalman gain is used. If low arithmetic precision makes numerical stability an issue, or if a suboptimal Kalman gain is deliberately used, this simplification cannot be applied; the full posterior error covariance formula derived above must be used instead.
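As a short numerical illustration of this caveat (a sketch, using the same $H$ and $R$ as the constant-velocity example and an arbitrary prior covariance): with the optimal gain the simplified formula agrees with the general formula derived above, while for a deliberately suboptimal gain the two diverge.

```python
import numpy as np

def posterior_cov_general(P_pred, K, H, R):
    """General posterior covariance, valid for ANY gain K (derived above)."""
    I = np.eye(P_pred.shape[0])
    return (I - K @ H) @ P_pred @ (I - K @ H).T + K @ R @ K.T

H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
P_pred = np.array([[2.0, 0.5],
                   [0.5, 1.0]])                     # example prior covariance (assumed)
S = H @ P_pred @ H.T + R
K_opt = P_pred @ H.T @ np.linalg.inv(S)             # optimal Kalman gain

simplified = (np.eye(2) - K_opt @ H) @ P_pred
print(np.allclose(simplified, posterior_cov_general(P_pred, K_opt, H, R)))  # True

K_bad = 0.5 * K_opt                                 # deliberately suboptimal gain
print(np.allclose((np.eye(2) - K_bad @ H) @ P_pred,
                  posterior_cov_general(P_pred, K_bad, H, R)))              # False
```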

References

An Introduction to the Kalman Filter

Probably the most clearly explained Kalman filter

【EKF】Kalman filter principle

Kalman Filter - Wikipedia

The most complete formula derivation of Kalman filter

The Trace of a Matrix and Proofs of Related Properties

Trace - Wikipedia

[Machine Learning] Summary and Detailed Explanation: The Trace of a Matrix and the Derivative of the Trace with Respect to a Matrix

Source: blog.csdn.net/qq_37214693/article/details/130927283