

Speaker's affiliation: Nova Southeastern University, USA

Conclusion

In this article we showed that the covariance matrix of observed data is directly related to a linear transformation of white, uncorrelated data. This linear transformation is completely defined by the eigenvectors and eigenvalues of the data. While the eigenvectors represent the rotation matrix, the eigenvalues correspond to the square of the scaling factor in each dimension.


Lecture: Eigenvalue inequalities of the product of a Hermitian matrix and a positive definite matrix

Introduction

In this article, we provide an intuitive, geometric interpretation of the covariance matrix, by exploring the relation between linear transformations and the resulting data covariance. Most textbooks explain the shape of data based on the concept of covariance matrices. Instead, we take a backwards approach and explain the concept of covariance matrices based on the shape of data.

In a previous article, we discussed the concept of variance, and provided a derivation and proof of the well-known formula to estimate the sample variance. Figure 1 was used in that article to show that the standard deviation, as the square root of the variance, provides a measure of how much the data is spread across the feature space.


Figure 1. Gaussian density function. For normally distributed data, 68% of the samples fall within the interval defined by the mean plus and minus the standard deviation.

We showed that an unbiased estimator of the sample variance can be obtained by:

(1) $\sigma_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$
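
As a quick sketch (the numbers are an invented toy sample, not from the article), equation (1) corresponds to NumPy's `ddof=1` option, which divides by N-1 instead of N:

```python
import numpy as np

# Toy 1-D sample (invented numbers)
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

n = x.size
var_manual = ((x - x.mean()) ** 2).sum() / (n - 1)  # equation (1), unbiased estimator
var_numpy = np.var(x, ddof=1)                       # same estimator: ddof=1 divides by N-1
```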

However, variance can only be used to explain the spread of the data in the directions parallel to the axes of the feature space. Consider the 2D feature space shown by figure 2:


Figure 2. The diagonal spread of the data is captured by the covariance.

For this data, we could calculate the variance $\sigma(x,x)$ in the x-direction and the variance $\sigma(y,y)$ in the y-direction. However, the horizontal and vertical spread of the data do not explain the clear diagonal correlation. Figure 2 clearly shows that, on average, if the x-value of a data point increases, then the y-value also increases, resulting in a positive correlation. This correlation can be captured by extending the notion of variance to what is called the ‘covariance’ of the data:

(2) $\sigma(x,y) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$
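
A minimal NumPy check of equation (2), using an invented correlated sample in which y increases with x on average, so the covariance should come out positive:

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented correlated sample: y depends on x, plus some noise
x = rng.normal(size=500)
y = 0.8 * x + 0.3 * rng.normal(size=500)

cov_xy = ((x - x.mean()) * (y - y.mean())).sum() / (x.size - 1)  # equation (2)
```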

For 2D data, we thus obtain $\sigma(x,x)$, $\sigma(y,y)$, $\sigma(x,y)$ and $\sigma(y,x)$. These four values can be summarized in a matrix, called the covariance matrix:

(3) $\Sigma = \begin{bmatrix} \sigma(x,x) & \sigma(x,y) \\ \sigma(y,x) & \sigma(y,y) \end{bmatrix}$

If x is positively correlated with y, y is also positively correlated with x. In other words, we can state that $\sigma(x,y) = \sigma(y,x)$. Therefore, the covariance matrix is always a symmetric matrix with the variances on its diagonal and the covariances off-diagonal. Two-dimensional normally distributed data is explained completely by its mean and its $2 \times 2$ covariance matrix. Similarly, a $3 \times 3$ covariance matrix is used to capture the spread of three-dimensional data, and an $N \times N$ covariance matrix captures the spread of N-dimensional data.
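
The symmetry of the covariance matrix is easy to verify numerically; the data below is an assumed toy example, and `np.cov` assembles the full matrix of equation (3) in one call:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy correlated 2-D data (invented)
x = rng.normal(size=300)
y = 0.5 * x + rng.normal(size=300)

sigma = np.cov(x, y)   # 2x2 covariance matrix: variances on the diagonal,
                       # covariances off-diagonal, as in equation (3)
```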

Figure 3 illustrates how the overall shape of the data defines the covariance matrix:


Figure 3. The covariance matrix defines the shape of the data. Diagonal spread is captured by the covariance, while axis-aligned spread is captured by the variance.

School of Mathematics and Statistics

About the speaker: Fuzhen Zhang studied and subsequently worked at Shenyang Normal University and Beijing Normal University. He received his Ph.D. in mathematics from the University of California, Santa Barbara (UCSB) in 1993, and is currently a professor of mathematics at Nova Southeastern University in Florida, USA. His research areas are linear and multilinear algebra and matrix analysis (as well as operator theory, functional analysis, and combinatorics). He has served as a reviewer for the American Mathematical Society's Mathematical Reviews, as an editorial board member of IMAGE, the bulletin of the International Linear Algebra Society, and as a panelist for the US National Science Foundation. He currently serves on the editorial boards, or as guest editor, of several international mathematics journals, and has been an invited referee for nearly 40 SCI journals. He co-founded the Florida Chinese-American federation (CASEC), serving as its president (1994-1996), and has also served as principal of the South Florida Modern Chinese School, among other posts. He received the Distinguished Professor of the Year award from the College of Arts and Sciences of Nova Southeastern University in 2013, and in 2016 was awarded the title of "Overseas Distinguished Teacher" by the city of Shanghai.

Eigendecomposition of a covariance matrix

In the next section, we will discuss how the covariance matrix can be interpreted as a linear operator that transforms white data into the data we observed. However, before diving into the technical details, it is important to gain an intuitive understanding of how eigenvectors and eigenvalues uniquely define the covariance matrix, and therefore the shape of our data.

As we saw in figure 3, the covariance matrix defines both the spread (variance), and the orientation (covariance) of our data. So, if we would like to represent the covariance matrix with a vector and its magnitude, we should simply try to find the vector that points into the direction of the largest spread of the data, and whose magnitude equals the spread (variance) in this direction.

If we define this vector as $\vec{v}$, then the projection of our data $D$ onto this vector is obtained as $\vec{v}^{\intercal} D$, and the variance of the projected data is $\vec{v}^{\intercal} \Sigma \vec{v}$. Since we are looking for the vector $\vec{v}$ that points into the direction of the largest variance, we should choose its components such that the variance $\vec{v}^{\intercal} \Sigma \vec{v}$ of the projected data is as large as possible. Maximizing any function of the form $\vec{v}^{\intercal} \Sigma \vec{v}$ with respect to $\vec{v}$, where $\vec{v}$ is a normalized unit vector, can be formulated as a so-called Rayleigh quotient. The maximum of such a Rayleigh quotient is obtained by setting $\vec{v}$ equal to the largest eigenvector of matrix $\Sigma$.

In other words, the largest eigenvector of the covariance matrix always points into the direction of the largest variance of the data, and the magnitude of this vector equals the corresponding eigenvalue. The second largest eigenvector is always orthogonal to the largest eigenvector, and points into the direction of the second largest spread of the data.
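
This can be checked numerically. The sketch below (toy data, my own construction, not from the article) builds an elongated cloud whose long axis lies along the 45-degree diagonal, and confirms that the largest eigenvector of the sample covariance matrix points in that direction:

```python
import numpy as np

rng = np.random.default_rng(2)
# Invented elongated cloud: strong spread along the 45-degree diagonal
white = rng.standard_normal((2, 1000))
S = np.diag([3.0, 0.5])                      # stretch x, shrink y
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
data = R @ S @ white

evals, evecs = np.linalg.eigh(np.cov(data))  # eigh returns eigenvalues in ascending order
v_max = evecs[:, -1]                         # eigenvector of the largest eigenvalue
# v_max should point (up to sign) along (1, 1) / sqrt(2)
```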

Now let’s have a look at some examples. In an earlier article we saw that a linear transformation matrix $T$ is completely defined by its eigenvectors and eigenvalues. Applied to the covariance matrix, this means that:

(4) $\Sigma \vec{v} = \lambda \vec{v}$

where $\vec{v}$ is an eigenvector of $\Sigma$, and $\lambda$ is the corresponding eigenvalue.

If the covariance matrix of our data is a diagonal matrix, such that the covariances are zero, then this means that the variances must be equal to the eigenvalues $\lambda$. This is illustrated by figure 4, where the eigenvectors are shown in green and magenta, and where the eigenvalues clearly equal the variance components of the covariance matrix.


Figure 4. Eigenvectors of a covariance matrix

However, if the covariance matrix is not diagonal, such that the covariances are not zero, then the situation is a little more complicated. The eigenvalues still represent the variance magnitude in the direction of the largest spread of the data, and the variance components of the covariance matrix still represent the variance magnitude in the direction of the x-axis and y-axis. But since the data is not axis-aligned, these values are no longer the same, as shown by figure 5.


Figure 5. Eigenvalues versus variance

By comparing figure 5 with figure 4, it becomes clear that the eigenvalues represent the variance of the data along the eigenvector directions, whereas the variance components of the covariance matrix represent the spread along the axes. If there are no covariances, then both values are equal.
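
A small numerical illustration of this point (the matrices are invented examples): with zero covariance the eigenvalues equal the variances, while with nonzero covariance and the same variances they do not:

```python
import numpy as np

# Invented matrices: same variances, with and without covariance
sigma_diag = np.array([[4.0, 0.0],
                       [0.0, 1.0]])
sigma_full = np.array([[4.0, 1.5],
                       [1.5, 1.0]])

ev_diag = np.linalg.eigvalsh(sigma_diag)  # eigenvalues equal the variances: 1 and 4
ev_full = np.linalg.eigvalsh(sigma_full)  # eigenvalues differ from the variances
```

Note that the sum of the eigenvalues (the trace) is the same in both cases; only their split across directions changes.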


Speaker: Fuzhen Zhang

Covariance matrix as a linear transformation

Now let’s forget about covariance matrices for a moment. Each of the examples in figure 3 can simply be considered to be a linearly transformed instance of figure 6:


Figure 6. Data with unit covariance matrix is called white data.

Let the data shown by figure 6 be $D$, then each of the examples shown by figure 3 can be obtained by linearly transforming $D$:

(5) $D' = T \, D$

where $T$ is a transformation matrix consisting of a rotation matrix $R$ and a scaling matrix $S$:

(6) $T = R \, S$

These matrices are defined as:

(7) $R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$

where $\theta$ is the rotation angle, and:

(8) $S = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$

where $s_x$ and $s_y$ are the scaling factors in the x direction and the y direction respectively.
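
Equations (6)-(8) translate directly into code; the angle and scale factors below are arbitrary choices for illustration:

```python
import numpy as np

theta = np.deg2rad(30)            # rotation angle (arbitrary choice)
sx, sy = 4.0, 1.0                 # scaling factors (arbitrary choice)

# Rotation matrix R, equation (7)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# Scaling matrix S, equation (8)
S = np.diag([sx, sy])
# Transformation matrix T = R S, equation (6)
T = R @ S
```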

In the following paragraphs, we will discuss the relation between the covariance matrix $\Sigma$, and the linear transformation matrix $T$.

Let’s start with unscaled (scale equals 1) and unrotated data. In statistics this is often referred to as ‘white data’ because its samples are drawn from a standard normal distribution and therefore correspond to white (uncorrelated) noise:


Figure 7. White data is data with a unit covariance matrix.

The covariance matrix of this ‘white’ data equals the identity matrix, such that the variances and standard deviations equal 1 and the covariance equals zero:

(9) $\Sigma = \begin{bmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
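
A quick sanity check of equation (9) on simulated white data (the sample size is an arbitrary choice; the sample covariance only approaches the identity):

```python
import numpy as np

rng = np.random.default_rng(3)
white = rng.standard_normal((2, 100_000))  # samples from a standard normal: white data
sigma_hat = np.cov(white)                  # should be close to the identity, equation (9)
```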

Now let’s scale the data in the x-direction with a factor 4:

(10) $D' = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} D$

The data $D'$ now looks as follows:


Figure 8. Variance in the x-direction results in a horizontal scaling.

The covariance matrix $\Sigma'$ of $D'$ is now:

(11) $\Sigma' = \begin{bmatrix} 16 & 0 \\ 0 & 1 \end{bmatrix}$

Thus, the covariance matrix $\Sigma'$ of the resulting data $D'$ is related to the linear transformation $T$ that is applied to the original data as follows: $D' = T \, D$, where

(12) $T = \sqrt{\Sigma'} = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$
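
The scaling experiment of equations (10)-(12) can be reproduced as follows (simulated data, so the estimates are only approximate):

```python
import numpy as np

rng = np.random.default_rng(4)
white = rng.standard_normal((2, 100_000))   # white data D, as in figure 7

S = np.diag([4.0, 1.0])                     # scale x by 4, equation (10)
scaled = S @ white                          # D' = S D

sigma_prime = np.cov(scaled)                # should approach diag(16, 1), equation (11)
# Recover T = sqrt(Sigma'): the element-wise square root is valid here only
# because the off-diagonal covariances are (approximately) zero, equation (12)
T = np.sqrt(np.diag(np.diag(sigma_prime)))
```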

However, although equation (12) holds when the data is scaled in the x and y direction, the question arises whether it also holds when a rotation is applied. To investigate the relation between the linear transformation matrix $T$ and the covariance matrix $\Sigma$ in the general case, we will therefore try to decompose the covariance matrix into the product of rotation and scaling matrices.

As we saw earlier, we can represent the covariance matrix by its eigenvectors and eigenvalues:

(13) $\Sigma \vec{v} = \lambda \vec{v}$

where $\vec{v}$ is an eigenvector of $\Sigma$, and $\lambda$ is the corresponding eigenvalue.

Equation (13) holds for each eigenvector-eigenvalue pair of matrix $\Sigma$. In the 2D case, we obtain two eigenvectors and two eigenvalues. The system of two equations defined by equation (13) can be represented efficiently using matrix notation:

(14) $\Sigma \, V = V \, L$

where $V$ is the matrix whose columns are the eigenvectors of $\Sigma$ and $L$ is the diagonal matrix whose non-zero elements are the corresponding eigenvalues.

This means that we can represent the covariance matrix as a function of its eigenvectors and eigenvalues:

(15) $\Sigma = V \, L \, V^{-1}$

Equation (15) is called the eigendecomposition of the covariance matrix and can be obtained using a Singular Value Decomposition algorithm. Whereas the eigenvectors represent the directions of the largest variance of the data, the eigenvalues represent the magnitude of this variance in those directions. In other words, $V$ represents a rotation matrix, while $\sqrt{L}$ represents a scaling matrix. The covariance matrix can thus be decomposed further as:

(16) $\Sigma = R \, S \, S \, R^{-1}$

where $R = V$ is a rotation matrix and $S = \sqrt{L}$ is a scaling matrix.
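
Equations (15) and (16) can be verified numerically; the covariance matrix below is an invented positive definite example:

```python
import numpy as np

# Invented symmetric positive definite covariance matrix
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

evals, V = np.linalg.eigh(sigma)         # columns of V are eigenvectors
L = np.diag(evals)                       # diagonal eigenvalue matrix

recon_15 = V @ L @ np.linalg.inv(V)      # equation (15): Sigma = V L V^-1
S = np.sqrt(L)                           # scaling part, S = sqrt(L)
recon_16 = V @ S @ S @ np.linalg.inv(V)  # equation (16) with R = V
```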

In equation (6) we defined a linear transformation $T = R \, S$. Since $S$ is a diagonal scaling matrix, $S = S^{\intercal}$. Furthermore, since $R$ is an orthogonal matrix, $R^{-1} = R^{\intercal}$. Therefore, $T^{\intercal} = (R \, S)^{\intercal} = S^{\intercal} R^{\intercal} = S \, R^{-1}$. The covariance matrix can thus be written as:

(17) $\Sigma = R \, S \, S \, R^{-1} = R \, S \, (R \, S)^{\intercal} = T \, T^{\intercal}$

In other words, if we apply the linear transformation defined by $T = R \, S$ to the original white data $D$ shown by figure 7, we obtain the rotated and scaled data $D'$ with covariance matrix $\Sigma' = T \, T^{\intercal}$. This is illustrated by figure 10:


Figure 10. The covariance matrix represents a linear transformation of the original data.

The colored arrows in figure 10 represent the eigenvectors. The largest eigenvector, i.e. the eigenvector with the largest corresponding eigenvalue, always points in the direction of the largest variance of the data and thereby defines its orientation. Subsequent eigenvectors are always orthogonal to the largest eigenvector due to the orthogonality of rotation matrices.
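
Putting everything together, the following sketch (with an assumed, invented target covariance matrix) colors white data via $T = R \, S$ and checks that the sample covariance of the transformed data approaches $\Sigma$, as equation (17) predicts:

```python
import numpy as np

rng = np.random.default_rng(5)
# Assumed target covariance matrix (invented numbers)
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

evals, V = np.linalg.eigh(sigma)
T = V @ np.diag(np.sqrt(evals))            # T = R S with R = V, S = sqrt(L)

white = rng.standard_normal((2, 200_000))  # white data D
colored = T @ white                        # D' = T D

sigma_hat = np.cov(colored)                # sample covariance of the transformed data
```

This is exactly how correlated Gaussian samples are generated in practice: draw white noise, then apply a matrix square root of the desired covariance.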

All faculty of the Algebra Teaching and Research Section

Organizer: School of Mathematics and Finance


Venue: Room 201, Guangqiong Science Building, Central Campus

We also construct a simultaneous decomposition for a set of seven general matrices over an arbitrary division ring F with compatible sizes. As applications of the simultaneous matrix decomposition, we give some solvability conditions, general solutions, as well as the range of ranks of the general solutions to some generalized Sylvester matrix equations over an arbitrary division ring F.

Time: 16:00, June 16, 2018

D512

Lecture abstract: The eigenvalue problem for matrices is of central importance in matrix theory, matrix computation, linear algebra, and related areas. Traditionally, eigenvalue inequalities in partial sums involve two Hermitian matrices for sums and a pair of positive semidefinite matrices for products. We show some eigenvalue inequalities for the product of a Hermitian matrix and a positive definite (Hermitian) matrix, and use the results to study perturbation problems of generalized eigenvalues. (Joint work with Bo-Yan Xi.)

Office of Scientific Research; Publicity Department

Time: 4:30 p.m.

November 18, 2016


He has led more than 20 research projects, including international cooperation projects, National Natural Science Foundation of China grants, and Ministry of Education doctoral program foundation projects. He has given plenary and invited talks at major international conferences, and has repeatedly served as chair, organizing committee member, and program committee member of international conferences. He has been invited for research visits and collaborations at the National University of Singapore, Delft University of Technology, Nanyang Technological University, Loughborough University (UK), Nova Southeastern University, the University of Central Florida, The Hong Kong Polytechnic University, and Macau University of Science and Technology, among others. He has trained 5 postdoctoral researchers, 17 doctoral students, and 34 master's students, three of whom won Shanghai outstanding graduate achievement awards (two outstanding doctoral dissertations and one outstanding master's thesis).

