Linear Algebra: Matrices & Matrix Computations

Understanding the matrix itself:

  •  A matrix encodes a change of basis: each column records where a basis vector of V lands in W, so the matrix as a whole is a linear map from V to W: https://www.zhihu.com/question/22218306/answer/88697757

 

Identity matrix (I)

Elementary matrix: the result of applying one of the three elementary row operations to I (swapping two rows, multiplying a row by a nonzero scalar, or adding a multiple of one row to another).
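As a quick illustration (a numpy sketch of my own, not from the original notes), here are the three elementary matrices and the effect of left-multiplying by them:

```python
import numpy as np

I = np.eye(3)

# Three elementary matrices, one per elementary row operation:
E_swap = I[[1, 0, 2]]                     # swap rows 0 and 1
E_scale = I.copy(); E_scale[1, 1] = 5.0   # multiply row 1 by 5
E_add = I.copy();   E_add[2, 0] = 2.0     # add 2 x (row 0) to row 2

A = np.arange(9.0).reshape(3, 3)
# Left-multiplying by an elementary matrix performs that row operation on A:
print(E_swap @ A)    # rows 0 and 1 of A swapped
print(E_scale @ A)   # row 1 of A multiplied by 5
print(E_add @ A)     # row 2 of A replaced by row2 + 2*row0
```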

Unitary matrix: an n×n complex square matrix U whose n column vectors form an orthonormal basis of the n-dimensional complex space is a unitary matrix (Unitary Matrix). Clearly, unitary matrices are the generalization of orthogonal matrices to the complex field. A necessary and sufficient condition: U is unitary iff the conjugate transpose of U times U equals the identity matrix.

Conjugate transpose: over the reals, identical to the transpose; over the complex numbers, transpose and then take the complex conjugate of every entry.

Orthogonal matrix: a unitary matrix over the real field.
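A quick numerical check of the defining property (a numpy sketch of my own; QR of a random complex matrix is just one convenient way to obtain a unitary matrix):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a unitary matrix as the Q factor of a random complex matrix
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(M)

# Defining property: conjugate transpose of U times U equals the identity
print(np.allclose(U.conj().T @ U, np.eye(3)))   # True

# Real special case: an orthogonal matrix satisfies Q^T Q = I
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
print(np.allclose(Q.T @ Q, np.eye(3)))          # True
```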

 

Symmetric matrix (Symmetric Matrix): a matrix whose entries are equal when mirrored across the main diagonal; in linear algebra, a square matrix that equals its own transpose.

Jordan normal form:

 

Covariance matrix

Positive definite matrix, positive semidefinite matrix:

Eigenvalue decomposition

Singular value decomposition

Determinant

Similar matrices

 

Intuition for matrix multiplication

– The bus model (change of basis or motion): https://www.zhihu.com/question/21351965

 

 

ref: https://zh.wikipedia.org/wiki/%E7%9F%A9%E9%98%B5%E5%88%86%E8%A7%A3

Understanding matrix multiplication
Av:
– Vector-space view: take A (m×n, m rows and n columns) and treat each column A[:,0], A[:,1], …, A[:,n-1] as a basis vector; then v (an n-dimensional vector) supplies the weights for a linear combination of those n columns.
– Linear-transformation view: treat each row A[0,:], A[1,:], …, A[m-1,:] as a vector; each row acts on v (a dot product) and produces one coordinate of the result. (See the sketch below.)

– A wrong reading, "switching to the row basis": treating the rows of A as a set of basis vectors and reading A[i,:]·v as the length of the projection of v onto that basis vector. This is incorrect: the rows are generally not orthonormal, so the dot product is not a projection length.
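A small numpy check of the two equivalent views (my own sketch); both reproduce A @ v:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])            # m = 2, n = 3
v = np.array([1., 0., 2.])

# Column view: Av is a linear combination of the columns of A, weighted by v
col_view = v[0] * A[:, 0] + v[1] * A[:, 1] + v[2] * A[:, 2]

# Row view: each output coordinate is the dot product of one row of A with v
row_view = np.array([A[0, :] @ v, A[1, :] @ v])

print(np.allclose(A @ v, col_view))   # True
print(np.allclose(A @ v, row_view))   # True
```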

A special case:
MB:
– Viewing each column of M as a basis vector: if B is a diagonal matrix, then MB stretches or shrinks each of those basis vectors individually, forming another space.
– Linear-transformation view: M supplies m operations (rows) and B supplies k inputs (columns), so every operation paired with every input yields one result, giving an m×k output. (See the sketch below.)
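A one-liner check of the diagonal special case (my own sketch): right-multiplying by a diagonal matrix rescales the columns.

```python
import numpy as np

M = np.array([[1., 2.],
              [3., 4.]])
d = np.array([10., 0.5])

print(M @ np.diag(d))                       # column 0 scaled by 10, column 1 by 0.5
print(np.allclose(M @ np.diag(d), M * d))   # True: broadcasting does the same thing
```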

Orthogonal matrices

Understanding eigenvalue decomposition as motion: https://www.zhihu.com/question/21874816/answer/181864044

Eigenvalues: wiki
Eigenvectors: wiki

QR decomposition: wiki
Eigendecomposition: wiki
Singular value decomposition: wiki

ref:

– Zhihu: how to understand linear algebra intuitively

– https://yjango.gitbooks.io/superorganism/content/xian_xing_dai_shu.html

Kalman Filter

ref: http://blog.csdn.net/lanbing510/article/details/40936343
ref: https://www.zhihu.com/question/22422121

ref: http://bilgin.esme.org/BitsAndBytes/KalmanFilterforDummies
When I started doing my homework for the Optimal Filtering for Signal Processing class, I said to myself: "How hard can it be?" Soon I realized that this was a fatal mistake.

The whole thing was like a nightmare. Nothing made sense. The equations were composed of ridiculously complex superscripted and subscripted variables, combined with transposed matrices and other untransposed stuff, all of it totally unknowable to most of us.

And then, instead of aiming for the homework, I decided to first concentrate fully on the Kalman filter itself. This article is the result of a couple of days' work and reflects the slow learning curve of a "mathematically challenged" person.

If you’re humble enough to admit that you don’t understand this stuff completely, you’ll find this material very enlightening.

So, enjoy it!

Kalman Filter – too complicated?

A Quick Insight
As I mentioned earlier, it’s nearly impossible to grasp the full meaning of Kalman Filter by starting from definitions and complicated equations (at least for us mere mortals).

For most cases, the state matrices drop out and we obtain the below equation, which is much easier to start with.

$$\hat{x}_k = \hat{x}_{k-1} + K_k\,(z_k - \hat{x}_{k-1})$$
Remember, the subscript k denotes the state. Here we can treat it as a discrete time index, e.g. k = 1 means 1 ms, k = 2 means 2 ms.

Our purpose is to find $\hat{x}_k$, the estimate of the signal x at state k, and we wish to find it for each successive k.

Also, $z_k$ is the measurement value. Keep in mind that we are not perfectly sure of these values; otherwise we wouldn't need to do any of this. $K_k$ is called the "Kalman gain" (which is the key point of all this), and $\hat{x}_{k-1}$ is the estimate of the signal at the previous state.

The only unknown component in this equation is the Kalman gain $K_k$, because we have the measurement values and we already have the previously estimated signal. You should calculate this Kalman gain for each successive state. This is not easy, of course, but we have all the tools to do it.

On the other hand, suppose we fix the Kalman gain at 0.5. What do we get? $\hat{x}_k = 0.5\,\hat{x}_{k-1} + 0.5\,z_k$, a simple average of the previous estimate and the new measurement. In other words, we should find smarter Kalman gain coefficients for each state. The bottom line is:


The Kalman filter finds the optimal averaging factor for each successive state, and it also, in a sense, remembers a little bit about the past states.

Isn’t this amazing?

Step-by-Step Guide
Here’s a simple step-by-step guide for a quick start to Kalman filtering.

STEP 1 – Build a Model
It's the most important step. First of all, you must be sure that the Kalman filtering conditions fit your problem.

As we recall, the two equations of the Kalman filter are as follows:

$$x_k = A\,x_{k-1} + B\,u_k + w_{k-1}$$

$$z_k = H\,x_k + v_k$$

This means that each $x_k$ (our signal value) may be evaluated by using a linear stochastic equation (the first one). Any $x_k$ is a linear combination of its previous value, plus a control signal $u_k$ and a process noise (which may be hard to conceptualize). Remember that, most of the time, there is no control signal $u_k$.

The second equation tells us that any measurement value (whose accuracy we are not sure of) is a linear combination of the signal value and the measurement noise. Both noises are considered to be Gaussian.

The process noise and measurement noise are statistically independent.

The entities A, B, and H are, in the general case, matrices. But in most of our signal processing problems we use models where these entities are just numeric values. As an additional convenience, while these values may change between states, most of the time we can assume they are constant.

If we are pretty sure that our system fits this model (most systems do, by the way), the only thing left is to estimate the mean and standard deviation of the noise terms $w_{k-1}$ and $v_k$. We know that, in real life, no signal is purely Gaussian, but we may assume it with some approximation.

This is not a big problem, because we'll see that the Kalman filtering algorithm tries to converge to correct estimates even if the Gaussian noise parameters are poorly estimated.

The only thing to keep in mind is : “The better you estimate the noise parameters, the better estimates you get.”

STEP 2 – Start the Process
If you succeeded in fitting your model into the Kalman filter framework, the next step is to determine the necessary parameters and your initial values.

We have two distinct sets of equations: Time Update (prediction) and Measurement Update (correction). Both sets are applied at each k-th state.

Time Update (prediction):

$$\hat{x}_k^- = A\,\hat{x}_{k-1} + B\,u_k$$

$$P_k^- = A\,P_{k-1}\,A^T + Q$$

Measurement Update (correction):

$$K_k = P_k^-\,H^T\,(H\,P_k^-\,H^T + R)^{-1}$$

$$\hat{x}_k = \hat{x}_k^- + K_k\,(z_k - H\,\hat{x}_k^-)$$

$$P_k = (I - K_k\,H)\,P_k^-$$
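These equations translate almost line-for-line into code. Here is a minimal numpy sketch of one predict/correct cycle (the function name `kalman_step` and the variable names are my own, not from the article):

```python
import numpy as np

def kalman_step(x_prev, P_prev, z, A, B, u, H, Q, R):
    """One Kalman iteration: time update (predict), then measurement update (correct)."""
    # Time update (prediction)
    x_prior = A @ x_prev + B @ u                 # prior state estimate
    P_prior = A @ P_prev @ A.T + Q               # prior error covariance
    # Measurement update (correction)
    S = H @ P_prior @ H.T + R                    # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)         # Kalman gain
    x_post = x_prior + K @ (z - H @ x_prior)     # posterior state estimate
    P_post = (np.eye(P_prior.shape[0]) - K @ H) @ P_prior  # posterior covariance
    return x_post, P_post
```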
We did the modeling in STEP 1, so we know the matrices A, B, and H. Most probably they will be numerical constants, and quite possibly they will all be equal to 1.

I suggest you rewrite these equations and see how much simpler they become. (If you're too lazy to do it, I'll do it for you in the Example below.)

The remaining painful part is determining R and Q. R is rather simple to find, because in general we are quite sure about the noise in the environment. But finding Q is not so obvious, and at this stage I can't give you a specific method.

To start the process, we need to know the initial estimate $\hat{x}_0$ and the initial error covariance $P_0$.

STEP 3 – Iterate
After we've gathered all the information we need and started the process, we can iterate through the estimates. Keep in mind that the previous estimates become the input for the current state.

(Figure: the Kalman filter iteration process.)

Here, $\hat{x}_k^-$ is the prior estimate, which in a way means the rough estimate before the measurement-update correction, and $P_k^-$ is the prior error covariance. We use these prior values in our Measurement Update equations.

In the Measurement Update equations we find $\hat{x}_k$, the estimate of x at time k (the very thing we wish to find). We also find $P_k$, which is necessary for the k+1 (future) estimate, together with $\hat{x}_k$.

The Kalman gain $K_k$ we evaluate is not needed for the next iteration step; it's a hidden, mysterious, and most important part of this set of equations.

The values we evaluate at the Measurement Update stage are also called posterior values, which also makes sense.

A Simple Example
Now let's try to estimate a scalar random constant, such as a "voltage reading" from a source. Let's assume it has a constant value of a volts, but of course we get noisy readings above and below a volts. We assume the standard deviation of the measurement noise is 0.1 V.

Now let’s build our model:
$$x_k = x_{k-1} + w_k$$

$$z_k = x_k + v_k$$

As I promised earlier, we reduced the equations to a very simple form:
– Above all, we have a one-dimensional signal problem, so every entity in our model is a numeric value, not a matrix.
– We have no control signal $u_k$, so it's out of the game.
– As the signal is a constant value, the constant A is just 1, because we already know that the next value will be the same as the previous one. We are lucky to have a constant value in this example, but even if it varied in some other linear way, we could often still take A to be 1.
– The value H = 1, because we know that the measurement is composed of the state value plus some noise. You'll rarely encounter real-life cases where H differs from 1.
And finally, let’s assume that we have the following measurement values:

| TIME (ms) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|-----------|------|------|------|------|------|------|------|------|------|------|
| VALUE (V) | 0.39 | 0.50 | 0.48 | 0.29 | 0.25 | 0.32 | 0.34 | 0.48 | 0.41 | 0.45 |
OK, we have to start from somewhere, such as k = 0. We should find or assume some initial state, so let's throw out some initial values: $\hat{x}_0 = 0$ and $P_0 = 1$. Why didn't we choose, say, $P_0 = 0$? Simple: that would mean there is no noise in the environment, and this assumption would force all subsequent estimates $\hat{x}_k$ to remain zero (stuck at the initial state). So we choose $P_0$ to be something other than zero.

Let’s write the Time Update and Measurement Update equations.

Time Update (prediction):

$$\hat{x}_k^- = \hat{x}_{k-1}$$

$$P_k^- = P_{k-1}$$

Measurement Update (correction):

$$K_k = P_k^- \,/\, (P_k^- + R)$$

$$\hat{x}_k = \hat{x}_k^- + K_k\,(z_k - \hat{x}_k^-)$$

$$P_k = (1 - K_k)\,P_k^-$$

(Here A = H = 1 and, consistent with the worked numbers below, Q is taken to be 0, so the prior covariance equals the previous posterior; R = 0.1.)
Now let's calculate the $\hat{x}_k$ values for each iteration.

For k = 1: the measurement is $z_1 = 0.390$, with $\hat{x}_0 = 0$ and $P_0 = 1$. The time update gives the prior estimate $\hat{x}_1^- = \hat{x}_0 = 0$ and the prior error covariance $P_1^- = P_0 = 1$. The measurement update then gives:

$$K_1 = 1 / (1 + 0.1) = 0.909$$

$$\hat{x}_1 = 0 + 0.909 \cdot (0.390 - 0) = 0.355$$

$$P_1 = (1 - 0.909) \cdot 1 = 0.091$$

For k = 2: the measurement is $z_2 = 0.500$, with prior estimate $\hat{x}_2^- = 0.355$ and prior error covariance $P_2^- = 0.091$:

$$K_2 = 0.091 / (0.091 + 0.1) = 0.476$$

$$\hat{x}_2 = 0.355 + 0.476 \cdot (0.500 - 0.355) = 0.424$$

$$P_2 = (1 - 0.476) \cdot 0.091 = 0.048$$

The full run:

| k | $z_k$ | $\hat{x}_{k-1}$ | $P_{k-1}$ | $\hat{x}_k$ | $P_k$ |
|---|-------|-----------------|-----------|-------------|-------|
| 1 | 0.390 | 0.000 | 1.000 | 0.355 | 0.091 |
| 2 | 0.500 | 0.355 | 0.091 | 0.424 | 0.048 |
| 3 | 0.480 | 0.424 | 0.048 | 0.442 | 0.032 |
| 4 | 0.290 | 0.442 | 0.032 | 0.405 | 0.024 |
| 5 | 0.250 | 0.405 | 0.024 | 0.375 | 0.020 |
| 6 | 0.320 | 0.375 | 0.020 | 0.365 | 0.016 |
| 7 | 0.340 | 0.365 | 0.016 | 0.362 | 0.014 |
| 8 | 0.480 | 0.362 | 0.014 | 0.377 | 0.012 |
| 9 | 0.410 | 0.377 | 0.012 | 0.380 | 0.011 |
| 10 | 0.450 | 0.380 | 0.011 | 0.387 | 0.010 |
Here I displayed the first two state iterations in detail; the others follow the same pattern. I completed the remaining numerical values with a computer program, which is the appropriate way to do it. If you try to write it as an algorithm, you'll discover that the Kalman filter is very easy to implement.
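As a sanity check, here is a short Python sketch (my own, not the article's) that reproduces the table above; R = 0.1 and Q = 0 are read off the worked numbers (e.g. $K_1 = 1/(1 + 0.1)$):

```python
# Scalar Kalman filter for the constant-voltage example.
# A = H = 1, no control input; R = 0.1 and Q = 0 match the table above.
z = [0.39, 0.50, 0.48, 0.29, 0.25, 0.32, 0.34, 0.48, 0.41, 0.45]
R, Q = 0.1, 0.0
x_hat, P = 0.0, 1.0                        # initial guesses: x_hat_0 = 0, P_0 = 1

for k, zk in enumerate(z, start=1):
    # Time update: with A = 1 the prior is just the previous estimate
    x_prior, P_prior = x_hat, P + Q
    # Measurement update
    K = P_prior / (P_prior + R)            # Kalman gain
    x_hat = x_prior + K * (zk - x_prior)   # posterior estimate
    P = (1 - K) * P_prior                  # posterior error covariance
    print(f"k={k}: K={K:.3f}, x_hat={x_hat:.3f}, P={P:.3f}")
```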

The convergence chart (see the figure note below) shows that the Kalman filter algorithm converges to the true voltage value. I displayed the first 10 iterations here, and we can clearly see the signs of convergence; within 50 or so iterations it converges even better.

To converge in fewer steps, you should:

– Model the system more elegantly
– Estimate the noise more precisely
OK, we're done. The only thing left to do is collect the $\hat{x}_k$ values we've calculated. That's it!

(Figure: convergence over iterations. The Kalman filter algorithm converges to the truth over a few iterations.)


White Gaussian Noise

ref: http://www.cnblogs.com/YoungHit/archive/2012/03/09/2388230.html

This post is a gentle introduction to white Gaussian noise (WGN).

Baidu Baike explains it as "white Gaussian noise: the amplitude distribution follows a Gaussian distribution, and the power spectral density is uniformly distributed," which sounds rather opaque. Below, examples make it concrete and intuitive.

White noise, like white light, is the superposition of all colors; lights of different colors differ essentially in frequency (red light has a long wavelength and low frequency; correspondingly, violet light has a short wavelength and high frequency). On the power spectrum (frequency on the horizontal axis, squared signal amplitude as power), white noise is approximately constant: the noise is rich in frequencies, with components across the whole spectrum from low to high, where low frequency means the signal is constant or slowly varying and high frequency means abrupt changes.

From the properties of the Fourier transform, a signal of finite extent in time is unbounded in frequency, and vice versa. So a signal unbounded in the frequency domain corresponds, in the time domain, to a multiple of the impulse function (this can also be derived from the formula: the inverse Fourier transform of a constant power spectral density $S(f) = N_0/2$ is the autocorrelation $R(\tau) = (N_0/2)\,\delta(\tau)$). That is, at any point on the time axis the noise is isolated and unrelated to the noise at other points: its amplitude can be anything, unaffected by the amplitudes before and after it. In short, the noise amplitude at any moment is random (this is really what "the power spectral density is uniformly distributed" means; the former describes it from the time domain, the latter from the frequency domain). This is where the concept of power spectral density (PSD) comes in: it describes, from the frequency-domain viewpoint, how the signal's power is distributed over frequency, with frequency on the horizontal axis and power on the vertical axis.

Since the white-noise signal is "random," what, conversely, does "correlated" mean? As the name suggests, correlated means that a noise sample at one moment is not isolated but related to the amplitudes at other moments. There are many kinds of correlation. For example, the amplitude at this moment may be larger than at the previous moment, and larger still at the next, i.e. the signal amplitudes are arranged in increasing order along the time axis. Likewise, amplitudes going from large to small, or alternating large and small, are all "correlated" rather than "random."

Having explained "white noise," let's talk about the "Gaussian distribution." The Gaussian distribution is also called the normal distribution. The shape of its probability density curve is determined by two parameters: the mean and the variance. Roughly speaking, the mean sets the curve's axis of symmetry, and the variance sets how fat or thin the curve is, i.e. how tightly it hugs the center line. The probability density describes how often signal values occur as a function of their amplitude, with amplitude on the horizontal axis and frequency of occurrence on the vertical axis. So, from the probability-density viewpoint, the amplitude distribution of white Gaussian noise follows a Gaussian distribution.

Having described both "white noise" and "Gaussian noise," we can return to the explanation at the start: white Gaussian noise has an amplitude distribution that follows a Gaussian distribution and a power spectral density that is uniformly distributed. Its meaning is now clear: the first half describes the "Gaussian" part from the value (amplitude) domain, and the second half describes the "white" part from the frequency domain.

Below, MATLAB programs demonstrate white Gaussian noise, to build an intuitive feel for it.

Program 1 (white Gaussian noise):
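The MATLAB listing was not preserved in these notes; below is a minimal Python sketch (numpy/matplotlib, my own reconstruction of what the program presumably showed): generate WGN and plot a crude PSD estimate.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=4096)     # white Gaussian noise

# Periodogram: squared FFT magnitude as a crude PSD estimate
psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
freq = np.fft.rfftfreq(len(x), d=1.0)

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x);         ax1.set_title("white Gaussian noise (time domain)")
ax2.plot(freq, psd); ax2.set_title("power spectral density (roughly flat)")
plt.tight_layout()
plt.show()
```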

As the figure shows, the power spectral density of white Gaussian noise is uniformly distributed.

If we sort the noise samples from smallest to largest, turning the random noise into correlated noise, the power spectral density is no longer uniformly distributed.

Program 2 (not white Gaussian noise):
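Again, the original MATLAB code is missing; here is a Python sketch of the same idea (my own reconstruction), sorting the samples to make them correlated:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.sort(rng.normal(size=4096))    # sorting turns random noise into correlated noise

psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
freq = np.fft.rfftfreq(len(x), d=1.0)

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x);         ax1.set_title("sorted noise (time domain)")
ax2.plot(freq, psd); ax2.set_title("PSD: power piles up at low frequencies")
plt.tight_layout()
plt.show()
```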

Next, let's look at the characteristics of white Gaussian noise through its statistics and amplitude distribution.

Program 3 (white Gaussian noise):
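The original listing is missing here as well; a Python sketch (my own reconstruction) that prints the sample statistics and compares the amplitude histogram against the N(0, 1) density:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
print(f"mean = {x.mean():.4f}, std = {x.std():.4f}")   # statistics: close to 0 and 1

# Normalized histogram of the samples vs. the theoretical Gaussian density
plt.hist(x, bins=100, density=True, alpha=0.6, label="histogram of samples")
t = np.linspace(-4, 4, 400)
plt.plot(t, np.exp(-t**2 / 2) / np.sqrt(2 * np.pi), label="N(0, 1) density")
plt.legend()
plt.title("amplitude distribution of white Gaussian noise")
plt.show()
```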

The histogram's vertical axis shows counts while the probability density's vertical axis shows relative frequency, yet the two distribution curves have essentially the same shape. This figure therefore illustrates that the amplitude distribution of white Gaussian noise follows a Gaussian distribution.