{"id":8303,"date":"2021-06-09T12:13:37","date_gmt":"2021-06-09T12:13:37","guid":{"rendered":"https:\/\/wealthrevelation.com\/data-science\/2021\/06\/09\/rethinking-linear-algebra-part-two-ellipsoids-in-data-science\/"},"modified":"2021-06-09T12:13:37","modified_gmt":"2021-06-09T12:13:37","slug":"rethinking-linear-algebra-part-two-ellipsoids-in-data-science","status":"publish","type":"post","link":"https:\/\/wealthrevelation.com\/data-science\/2021\/06\/09\/rethinking-linear-algebra-part-two-ellipsoids-in-data-science\/","title":{"rendered":"Rethinking linear algebra part two: ellipsoids in data science"},"content":{"rendered":"<div>\n<h3>1 Our expedition of eigenvectors still continues<\/h3>\n<p>This article is still going to be about eigenvectors and PCA, and this article still will not cover <strong>LDA (linear discriminant analysis)<\/strong>. Hereby I would like you to have more organic links of the data science ideas with eigenvectors.<\/p>\n<p>In <a href=\"https:\/\/data-science-blog.com\/blog\/2020\/10\/27\/10360\/\">the second article<\/a>, we have covered the following points:<\/p>\n<ul>\n<li>You can visualize linear transformations with matrices by calculating displacement vectors, and they usually look like vectors swirling.<\/li>\n<li>Diagonalization is finding a direction in which the displacement vectors do not swirl, and that is equal to finding new axis\/basis where you can describe its linear transformations more straightforwardly. 
But we have to consider diagonalizability of the matrices.<\/li>\n<li>In linear dimension reduction such as PCA or LDA, we mainly use types of matrices called positive definite or positive semidefinite matrices.<\/li>\n<\/ul>\n<p>In <a href=\"https:\/\/data-science-blog.com\/blog\/2020\/11\/23\/pca-and-lda\/\">the last article<\/a> we have seen the following points:<\/p>\n<ul>\n<li>PCA is an algorithm for calculating orthogonal axes along which data \u201cswell\u201d the most.<\/li>\n<li>PCA is equivalent to calculating a new orthonormal basis for the data where the covariance between components is zero.<\/li>\n<li>You can reduce the dimension of the data in the new coordinate system by ignoring the axes corresponding to small eigenvalues.<\/li>\n<li>Covariance matrices enable linear transformation of rotation and expansion and contraction of vectors.<\/li>\n<\/ul>\n<p>I emphasized that the axes are more important than the surface of the high dimensional ellipsoids, but in this article let\u2019s focus more on the surface of ellipsoids, or I would rather say general <em><strong>quadratic curves<\/strong><\/em>. After also seeing how to draw ellipsoids on data, you would see the following points about PCA or eigenvectors.<\/p>\n<ul>\n<li>Covariance matrices are real symmetric matrices, and they are also positive semidefinite. That means you can always diagonalize covariance matrices, and their eigenvalues are all greater than or equal to 0.<\/li>\n<li>PCA is equivalent to finding the axes of quadratic curves along which the gradients are largest. 
The values of the quadratic curves increase the most in those directions, and that means the directions describe a great deal of information about the data distribution.<\/li>\n<li>Intuitively, dimension reduction by PCA is equal to fitting a high dimensional ellipsoid on data and cutting off the axes corresponding to small eigenvalues.<\/li>\n<\/ul>\n<p>Even if you already understand PCA to some extent, I hope this article provides you with deeper insight into PCA, and at least after reading this article, I think you would be more or less able to visually control eigenvectors and ellipsoids with the NumPy and Matplotlib libraries.<\/p>\n<p>*Let me first introduce some mathematical facts and how I denote them throughout this article in advance. If you are allergic to mathematics, take it easy or please go back to my former articles.<\/p>\n<p>*In the last article, I denoted the covariance of data as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-520cb534cd5b6bed768a61515b57cb7e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"S\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">, based on <a href=\"http:\/\/users.isr.ist.utl.pt\/~wurmd\/Livros\/school\/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf\">Pattern Recognition and Machine Learning by C. M. Bishop. <\/a><\/p>\n<p>*Sooner or later you are going to see that I am explaining basically the same ideas from different points of view, using the topic of PCA. However I believe they are all important when you learn linear algebra for data science or machine learning. Even if you have not learnt linear algebra, or if you have to teach linear algebra, I recommend that you first review the idea of diagonalization, as covered in the second article. 
And you should be conscious that, in the context of machine learning or data science, only a very limited type of matrices are important, which I have been explaining throughout this article.<\/p>\n<h3>2 Rotation or projection?<\/h3>\n<p>In this section I am going to talk about basic stuff found in most textbooks on linear algebra. In the last article, I mentioned that if <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is a real symmetric matrix, you can diagonalize <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> with a rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-80bc2c40b839a6e1b89142bea18f1c1b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U = (boldsymbol{u}_1 : cdots : boldsymbol{u}_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"128\">, such that <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-1cd3d4b0cdcf0ae16d59697ef1aa2764_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^{-1}AU\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"58\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b776e90d5a3a1bdd02ccda9290228ca6_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= U^{T}AU\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"70\"> <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4ccea4c94e8713b858bf4d59d6021905_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=Lambda\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"31\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d6c04dca68ac6a974a1c0aaf8686f2f1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Lambda = diag(lambda_{1}, dots , lambda_{D})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"163\">. I also explained that PCA is a case where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-352eb684151958c618a66fc7b2405814_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A=Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"49\">, that is, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is the covariance matrix of certain data. <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\"> is known to be positive semidefinite and real symmetric. 
Thus you can always diagonalize <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\"> and none of its eigenvalues can be lower than 0.<\/p>\n<p>I think we first need to clarify the difference between rotation and projection. In order to visualize the ideas, let\u2019s consider a case of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-cba6ce00f89c488d55cb4fea14df3127_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"D=3\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"48\">. Assume that you have got an orthonormal rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6d950a5450cece8258091565aceaf4b3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U = (boldsymbol{u}_1 : boldsymbol{u}_2 : boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"117\"> which diagonalizes <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">. 
In the last article I said diagonalization is equivalent to finding new orthogonal axes formed by eigenvectors, and in the case of this section you get a new orthonormal basis <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a2f1968d901786eb96d326c40c8d9e53_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{u}_1, boldsymbol{u}_2, boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"87\">, which is shown in red in the figure below. Projecting a point <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6d9837adc588a5fee942c0a4a903a35f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x} = (x, y, z)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"92\"> on the new orthonormal basis is simple: you just have to multiply <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> by <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6bfec27ce583970e303355d356b3523a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"23\">. 
Let <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6e59f638ef35bb3ed3dcd53b5c779e1b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"35\"> be <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-812bb1e207cba6fec7acc3f87dd4a032_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(x', y', z')^T\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"81\">, and then <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-bb9ac1c08226cbe43a5647aebb615602_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"left( begin{array}{c} x' \\ y' \\ z' end{array} right) = U^Tboldsymbol{x} = left( begin{array}{c} boldsymbol{u}_1^{T}boldsymbol{x} \\ boldsymbol{u}_2^{T}boldsymbol{x} \\ boldsymbol{u}_3^{T}boldsymbol{x} end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"64\" width=\"218\">. 
You can see that <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-adca1d9bfb92a9c5f3c2cc75064a1730_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"x', y', z'\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"57\"> are the projections of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> on <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-51e3df2ff8230ccb1a2562264d8b5124_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_1, boldsymbol{u}_2, boldsymbol{u}_3\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"74\"> respectively, and the left side of the figure below shows the idea. When you replace the original orthonormal basis <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7f4d998fa9174f57ef17266437b904bb_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{e}_1, boldsymbol{e}_2, boldsymbol{e}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"80\"> with <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a2f1968d901786eb96d326c40c8d9e53_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{u}_1, boldsymbol{u}_2, boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"87\"> as in the right side of the figure below, you can comprehend the projection as a rotation from <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4564abd446b0af50f0e829b98d1dffc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(x, y, z)\" 
title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"56\"> to <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4d5e872243130fe0a56e69d7e16853da_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(x', y', z')\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"70\"> by a rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6bfec27ce583970e303355d356b3523a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"23\">.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5664 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/projection-1030x533.png\" alt=\"\" width=\"1030\" height=\"533\"><\/h3>\n<p>Next, let\u2019s see what rotation is. In case of rotation, you should imagine that you rotate the point <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> in the same coordinate system, rather than projecting to other coordinate system. You can rotate <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> by multiplying it with <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">. 
This rotation looks like the figure below.<\/p>\n<p>In the initial position, the edges of the cube are aligned with the three orthogonal black axes <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-de6bccbb4ca202d135d57a1e5a99f16f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{e}_1,  boldsymbol{e}_2 , boldsymbol{e}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"80\">, with one corner of the cube located at the origin point of those axes. The purple dot denotes the corner of the cube directly opposite the origin corner. The cube is rotated in three dimensions, with the origin corner staying fixed in place. After the rotation with a pivot at the origin, the edges of the cube are now aligned with a new set of orthogonal axes <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-533c458f3f1d59e0cc65789a7a79072c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{u}_1,  boldsymbol{u}_2 , boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"87\">, shown in red. You might understand that more clearly with an equation: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7451f94fc3f8679e6bfc0e21ae062842_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Uboldsymbol{x} = (boldsymbol{u}_1 : boldsymbol{u}_2 : boldsymbol{u}_3) left( begin{array}{c} x \\ y \\ z end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"64\" width=\"185\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f1475a8d2c78c6dfe7db3790b6fd126a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= xboldsymbol{u}_1 + yboldsymbol{u}_2 + zboldsymbol{u}_3\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"149\">. 
In short, this rotation means you keep the relative position of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\">, that is, its coordinates <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4564abd446b0af50f0e829b98d1dffc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(x, y, z)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"56\">, in the new orthonormal basis. In this article, let me call this a \u201ccube rotation.\u201d<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5665 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/rotation-1030x661.png\" alt=\"\" width=\"1030\" height=\"661\"><\/h3>\n<p>The discussion above can be generalized to spaces with dimensions higher than 3. 
When <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-bff117037b63b68339cc0e01df715cbd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U in mathbb{R}^{D times D}\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"81\"> is an orthonormal matrix and <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7e1851549fe597ca346cbb0544d3c620_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x} in mathbb{R}^D\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"57\"> is a vector, you can project <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> to <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b85109f93ee7aeb908c623380bc1aa97_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}' = U^T boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"76\"> or rotate it to <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-07454f578acce521cd6bd56a7acc9e55_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}'' = U boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"69\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-bebab45053e7b5e1de4bc7474502a454_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}' = (x_{1}', dots, x_{D}')^T\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"144\"> and <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f83adcdc35be52057a3973b37405b78f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}'' = (x_{1}'', dots, x_{D}'')^T\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"148\">. In other words <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-fac4d583cacd7aef8b343fd6258bf18c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x} = U boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"65\">, which means you can rotate back <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-e196bbdd88bbd136155e9ac3749223bb_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"16\"> to the original point <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> with the rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">.<\/p>\n<p>I think you at least saw that rotation and projection are basically the same, and that is only a matter of how you look at the coordinate systems. 
But I would say the idea of projection is more important throughout this article.<\/p>\n<p>Let\u2019s consider a function <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-8f447733f205c7809f9c05af06268144_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A) = boldsymbol{x}^T A boldsymbol{x} = (boldsymbol{x}, A boldsymbol{x})\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"209\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9b0926dc2ebc0c21c5d2890092ff8114_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Ain mathbb{R}^{Dtimes D}\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"81\"> is a real symmetric matrix. The contours of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7547a43f4094b8e8f49df96da70202e9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"56\"> are quadratic curves centered at the origin, and it is known that you can express them in a much simpler way using eigenvectors. 
When you project this function on eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">, that is when you substitute <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-92049055680d2f524c38d61ab05d584d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"29\"> for <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\">, you get <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-1ade69c1a05bdd013980d554cca1b945_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f = (boldsymbol{x}, A boldsymbol{x})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"92\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f177afb230528fddcb603c9531d4ec02_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=(U boldsymbol{x}', AU boldsymbol{x}')\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"113\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f9565ec54522786e0b225a8d20b096ea_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= (boldsymbol{x}')^T U^TAU boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"127\"> <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-33eb34613e75fd938e7d7d461066af3c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= (boldsymbol{x}')^T Lambda boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"87\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-31c953a2569539448e5c0bfc24902f9c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= lambda_1 ({x'}_1)^2 + cdots + lambda_D ({x'}_D)^2\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"215\">. You can always diagonalize real symmetric matrices, so the formula implies that the shapes of quadratic curves largely depend on eigenvectors. We are going to see this in detail in the next section.<\/p>\n<p>*<img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-1af946f94a8963a3241eaacb146e9107_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x}, boldsymbol{y})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"43\"> denotes an inner product of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> and <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-cdb36e8e57e3426c880e96fdf4e1696c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{y}\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"11\">.<\/p>\n<p>*We are going to see details of the shapes of quadratic \u201ccurves\u201d or \u201cfunctions\u201d in the next section.<\/p>\n<p>To be exact, you cannot naively multiply <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> or <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6bfec27ce583970e303355d356b3523a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"23\"> for rotation. Let\u2019s take a part of data I showed in the last article as an example. In the figure below, I projected data on the basis <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-533c458f3f1d59e0cc65789a7a79072c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{u}_1,  boldsymbol{u}_2 , boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"87\">.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5659 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/rotation_demo-1030x492.png\" alt=\"\" width=\"1030\" height=\"492\"><\/h3>\n<p>You might have noticed that you cannot do a \u201ccube rotation\u201d in this case. If you make the coordinate system <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a2f1968d901786eb96d326c40c8d9e53_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{u}_1, boldsymbol{u}_2, boldsymbol{u}_3)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"87\"> with your left hand, like you might have done in science classes in school to learn Fleming\u2019s rule, you would soon realize that the coordinate systems in the figure above do not match. 
You need to flip the direction of one axis to match them.<\/p>\n<h3><img loading=\"lazy\" class=\"size-medium wp-image-5658 aligncenter\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/fleming-300x248.png\" alt=\"\" width=\"300\" height=\"248\"><\/h3>\n<p>Mathematically, you have to consider the determinant of the rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">. You can do a \u201ccube rotation\u201d when <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6207be5558779773e9f23e554a536ba5_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"det(U)=1\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"83\">, and in the case above <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-0a0ff96f666f117c391a8f5d1490bc74_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"det(U)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"51\"> was <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7b34c01098c83fa602de54e9d74d63a9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"-1\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"21\">, and you needed to flip one axis to make the determinant <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4868771cbc422b5818f85500909ce433_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"1\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"7\">. In the example in the figure below, you can match the basis. 
This can also be generalized to higher dimensions, but that is beyond the scope of this article series. If you are really interested, you should prepare some coffee and snacks and textbooks on linear algebra, and some weekends.<\/p>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-5660 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/rotation_demo_2-1030x492.png\" alt=\"\" width=\"1030\" height=\"492\"><\/p>\n<p>When you want to make general ellipsoids in a 3d space on Matplotlib, you can take advantage of rotation matrices. You first make a simple ellipsoid symmetric about the x, y, and z axes using polar coordinates, and you can rotate the whole ellipsoid with rotation matrices. I made some simple modules for drawing ellipsoids. If you put in a rotation matrix which diagonalizes the covariance matrix of data and a list of three radii <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2579026450d5e42aff00089e435bb18c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"sqrt{lambda_1}, sqrt{lambda_2}, sqrt{lambda_3}\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"114\">, you can rotate the original ellipsoid so that it fits the data well.<\/p>\n<h3><img loading=\"lazy\" class=\"wp-image-5651 aligncenter\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/fitting_ellipsoid_3-1030x447.png\" alt=\"\" width=\"922\" height=\"400\"><\/h3>\n<h3>3 Types of quadratic curves<\/h3>\n<p>*This article might look like mathematical writing, but I would say this is more about computer science. Please tolerate some inaccuracy in terms of mathematics. I gave priority to visualizing necessary mathematical ideas in my article series. If you are not sure about details, please let me know.<\/p>\n<p>In linear dimension reduction, or at least in this article series, you mainly have to consider ellipsoids. 
However, ellipsoids are just one type of quadratic curve. In the last article, I mentioned that when the center of a D-dimensional ellipsoid is the origin of a normal coordinate system, the formula of the surface of the ellipsoid is as follows: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4146491c930d2f0139a2532d07eb6001_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x}, Aboldsymbol{x})=1\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"89\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> satisfies certain conditions. To be concrete, when <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4146491c930d2f0139a2532d07eb6001_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x}, Aboldsymbol{x})=1\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"89\"> is the surface of an ellipsoid, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> has to be diagonalizable and positive definite.<\/p>\n<p>*Real symmetric matrices are diagonalizable, and positive definite matrices have only positive eigenvalues. 
Covariance matrices <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">, whose displacement vectors I visualized in the last two articles, are known to be real symmetric matrices and positive semi-definite. However, the surface of an ellipsoid which fits the data is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a5ef6982652d8cf089e2b3c42758fc80_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}^T Sigma ^{-1} boldsymbol{x} = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"135\">, not <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-bd27ef23abd08d9db3159a316c7d6914_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}^T Sigma boldsymbol{x} = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"116\">.<\/p>\n<p>*You have to keep in mind that the <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> are all deviations from the mean.<\/p>\n<p>*You do not have to think too much about what the \u201csemi\u201d in the term \u201cpositive semi-definite\u201d means for now.<\/p>\n<p>As you might imagine, this is just one simple case of a richer variety of graphs. Let\u2019s consider a 3-dimensional space. 
Any quadratic curves in this space can be denoted as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d414c2c15df6538d0eafdd1465c78c4e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"ax^2 + by^2 + cz^2 + dxy + eyz + fxz + px + qy + rz + s = 0\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"452\">, where at least one of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-e958191364ec0e5fc253e52f103b10ea_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"a, b, c, d, e, f, p, q, r, s\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"156\"> is not <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a5e437be25f29374d30f66cd46adf81c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"0\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\">.\u00a0 Let <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> be <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-357d139e21833ec5073e248b9888b0e2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(x, y, z)^T\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"67\">, then the quadratic curves can be simply denoted with a <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-469ae8f080aab807501eefc47ed0069b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"3times 3\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"40\"> matrix <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> and a 3-dimensional vector <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-83cee32bf0518564ebc36bbfdfca1acc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{b}\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\"> as follows: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9fa6d768dedf1c3e4f164b8813fecf9b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}^T Aboldsymbol{x} + 2boldsymbol{b}^Tboldsymbol{x} + s = 0\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"172\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6311f62e55486784a4ba83a6db11f8a9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A = left( begin{array}{ccc} a &amp; frac{d}{2} &amp; frac{f}{2} \\ frac{d}{2} &amp; b &amp; frac{e}{2} \\ frac{f}{2} &amp; frac{e}{2} &amp; c end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"69\" width=\"147\">, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9688b4eadc8a55245ee087247ff36d57_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{b} = left( begin{array}{c} frac{p}{2} \\ frac{q}{2} \\ frac{r}{2} end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"65\" width=\"86\">. 
General quadratic curves are roughly classified into the 9 types below.<\/p>\n<h3><img loading=\"lazy\" class=\"wp-image-5663 aligncenter\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_types_1.png\" alt=\"\" width=\"748\" height=\"835\"><\/h3>\n<p>You can shift these quadratic curves so that their center points come to the origin, without rotation, and the resulting curves are as follows. The curves can be all denoted as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ff875a7f7b8b4c3bce02927d7c71ff58_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}^T Aboldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"46\">.<\/p>\n<h3><img loading=\"lazy\" class=\"wp-image-5662 aligncenter\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_types_2.png\" alt=\"\" width=\"789\" height=\"807\"><\/h3>\n<p>As you can see, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is a real symmetric matrix. 
As I have mentioned repeatedly, when all the elements of a <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6dfee5f597d82f3211914582719a86ad_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"D times D\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"52\"> symmetric matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> are real values and its eigenvalues are <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-04c501184ae5ce9ff8cd5468151c464e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_{i} (i=1, dots , D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"122\">, there exists an orthogonal matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> such that <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-bd5fb4b305900d8eb49f7b72a7eab05e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^{-1}AU = Lambda\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"94\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d6c04dca68ac6a974a1c0aaf8686f2f1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Lambda = diag(lambda_{1}, dots , lambda_{D})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"163\">. 
Hence, you can diagonalize the <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6311f62e55486784a4ba83a6db11f8a9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A = left( begin{array}{ccc} a &amp; frac{d}{2} &amp; frac{f}{2} \\ frac{d}{2} &amp; b &amp; frac{e}{2} \\ frac{f}{2} &amp; frac{e}{2} &amp; c end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"69\" width=\"147\"> with an orthogonal matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">. Let <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> be an orthogonal matrix such that <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-21bf58bbd9e16529d710c96fdba86148_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T A U = left( begin{array}{ccc} alpha  &amp; 0 &amp; 0 \\ 0 &amp; beta &amp; 0 \\ 0 &amp; 0 &amp; gamma end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"64\" width=\"182\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-54a3d042cafeda16b85bc200102a9061_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=left( begin{array}{ccc} lambda_1  &amp; 0 &amp; 0 \\ 0 &amp; lambda_2 &amp; 0 \\ 0 &amp; 0 &amp; lambda_3 end{array} right)\" title=\"Rendered by QuickLaTeX.com\" height=\"64\" width=\"148\">. 
After you apply the rotation by <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> to the curves (a)\u2019 ~ (i)\u2019, those curves are placed symmetrically about the x, y, and z axes, and their center points still lie at the origin. The resulting curves look like the ones below. Or rather, I should say you projected (a)\u2019 ~ (i)\u2019 onto their eigenvectors.<\/p>\n<h3><img loading=\"lazy\" class=\"wp-image-5661 aligncenter\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_types_3.png\" alt=\"\" width=\"781\" height=\"857\"><\/h3>\n<p>In this article, (a)\u201d, (g)\u201d, (h)\u201d, and (i)\u201d are the important ones. The general equations for these curves are as follows:<\/p>\n<ul>\n<li>(a)\u201d: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c52f41ab45b2897c6f6c414f184e8fed_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"frac{x^2}{l^2} + frac{y^2}{m^2} + frac{z^2}{n^2} = 1\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"133\"><\/li>\n<li>(g)\u201d: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-091b4c05d4034db49d5c0229984ded98_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z = frac{x^2}{l^2} + frac{y^2}{m^2}\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"94\"><\/li>\n<li>(h)\u201d: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c2b87dcaabad22ffcc32e960dd692e1b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z = frac{x^2}{l^2} - frac{y^2}{m^2}\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"94\"><\/li>\n<li>(i)\u201d: <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-01d94be51a7a4a16622f138d55af2c31_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z = frac{x^2}{l^2}\" title=\"Rendered by QuickLaTeX.com\" height=\"26\" width=\"50\"><\/li>\n<\/ul>\n<p>, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-fdcd05027a1742ae0bad2055eda35068_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"l, m, n in mathbb{R}^+\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"92\">.<\/p>\n<p>Even if this section has been puzzling to you, you just have to keep one point in your mind: we have been discussing general quadratic curves, but in PCA, you only need to consider a case where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is a covariance matrix, that is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-352eb684151958c618a66fc7b2405814_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A=Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"49\">. PCA corresponds to the case where you shift and rotate the curve (a) into (a)\u201d. Subtracting the mean of data from each point of data corresponds to shifting quadratic curve (a) to (a)\u2019. 
Calculating eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> corresponds to calculating a rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> such that the curve (a)\u2019 comes to (a)\u201d after applying the rotation, or projecting curves on eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">. 
Importantly, we are only discussing the covariance of certain data, not the distribution of the data itself.<\/p>\n<p>*In case you are interested in a little more of the mathematical side: it is known that if you rotate all the points <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> on the curve <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9fa6d768dedf1c3e4f164b8813fecf9b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}^T Aboldsymbol{x} + 2boldsymbol{b}^Tboldsymbol{x} + s = 0\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"172\"> with a suitable rotation matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-650eb7688af6737ac325425b5c9a5982_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"P\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"14\">, those points <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> are mapped onto a new quadratic curve <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6027e95bd8dcaa13d302ec350c8a0871_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"alpha x^2 + beta y^2 + gamma z^2 + lambda x + mu y + nu z + rho = 0\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"315\">. 
That means the rotation of the original quadratic curve with <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-650eb7688af6737ac325425b5c9a5982_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"P\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"14\"> (or rather rotating axes) enables getting rid of the terms <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-121d4d6a94ff89b7d8fcb37ffe857f56_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"xy, yz, zx\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"73\">. Also it is known that when <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-123cebdbe52294dedeaf76fbf84583b1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"alpha ' neq 0\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"49\">, with proper translations and rotations, the quadratic curve <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6027e95bd8dcaa13d302ec350c8a0871_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"alpha x^2 + beta y^2 + gamma z^2 + lambda x + mu y + nu z + rho = 0\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"315\"> can be mapped into one of the types of quadratic curves in the figure below, depending on coefficients of the original quadratic curve. And the discussion so far can be generalized to higher dimensional spaces, but that is beyond the scope of this article series. 
Please consult a decent textbook on linear algebra for further details.<\/p>\n<h3>4 Eigenvectors are gradients and sometimes variances.<\/h3>\n<p>In the second section I explained that you can express quadratic functions <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25a9e1f360163f498d7adc4d06576cbc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A) = boldsymbol{x}^T A boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"127\"> in a very simple way by projecting <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> onto the eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">.<\/p>\n<p>You can comprehend what I have explained in another way: eigenvectors, to be exact eigenvectors of real symmetric matrices <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">, are gradients. And in the case of PCA, that is when <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-352eb684151958c618a66fc7b2405814_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A=Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"49\">, the eigenvalues are also variances. 
Before explaining what that means, let me go over a few very common facts in mathematics. If you have variables <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-3b0f2a5ecc3c73b15859eb75bed9571d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}in mathbb{R}^D\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"57\">, I think you can comprehend functions <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-15df5cd2e74ae468576ac007f9aa22be_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldysmbol{x})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"34\"> in two ways. One is as normal \u201cfunctions\u201d <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-289572d10c8a043ed959b1d2b8d6cd72_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"35\">, and the other is as \u201ccurves\u201d <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f245ecf0f9d0310f8d34ff33cac62d57_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}) = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"106\">. 
\u201cFunctions\u201d take an input <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> and give out an output <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-289572d10c8a043ed959b1d2b8d6cd72_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"35\">, just like the normal functions you would imagine. \u201cCurves\u201d are rather sets of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7e1851549fe597ca346cbb0544d3c620_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x} in mathbb{R}^D\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"57\"> such that <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f245ecf0f9d0310f8d34ff33cac62d57_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}) = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"106\">.<\/p>\n<p>*Please assume that the terms \u201cfunctions\u201d and \u201ccurves\u201d are my own wording. 
I use them just in case I fail to use functions and curves properly.<\/p>\n<p>The quadratic curves in the figure above are all \u201ccurves\u201d in my terms, which can be denoted as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-84577d8f29939d06cae5a8a9a9c06d5b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_3, boldsymbol{b}_3)=const\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"155\"> or <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b86d549fe33d6209459bd5258e415380_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_3)=const\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"130\">. However, if you replace <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"9\"> in (g)\u201d, (h)\u201d, and (i)\u201d with <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9c09a708375fde2676da319bcdfe8b24_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"10\">, you can interpret the \u201ccurves\u201d as \u201cfunctions\u201d which are denoted as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-857c9f0d06e34054f0bea95ddf5d7222_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_2)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"64\">. 
This might sound too obvious to you, but my point is that you can visualize how the values of \u201cfunctions\u201d change only when the inputs are 2-dimensional.<\/p>\n<p>When a symmetric <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7694442663723c5d73e85d88fb0bf9b9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"2times 2\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"39\"> real matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-47bcf8f5bd85f586a25ce111d1d119b0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A_2\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"20\"> has two eigenvalues <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-667cbba6c3f717994e0cdb5024316c13_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_1, lambda_2\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\">, the distribution of quadratic curves can be roughly classified into the following three types.<\/p>\n<p>The equations of (g)\u201d, (h)\u201d, and (i)\u201d correspond to each type of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f3aa3d5e40e1543b50c2d86b8d055f41_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f=(boldsymbol{x}; A_2)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"88\">, and their curves look like the three graphs below.<\/p>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-5650 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_comparison-1030x391.png\" alt=\"\" width=\"1030\" height=\"391\"><\/p>\n<p>And in fact, when you start from the origin and go in the direction of an eigenvector <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-40f970bfd679e355af6e6f7b93e620c3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_i\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"17\">, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-130188abd4690d701177358e4ad96950_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_i\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\"> is the gradient of the direction. You can see that more clearly when you restrict the distribution of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-f3aa3d5e40e1543b50c2d86b8d055f41_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f=(boldsymbol{x}; A_2)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"88\"> to a unit circle. Like in the figure below, in case <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-77da3c2811b97db49edd081312e952a7_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_1 = 7, lambda_2 = 3\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"109\">, which is classified to (g), the distribution looks like the left side, and if you restrict the distribution in the unit circle, the distribution looks like a bowl like the middle and the right side. 
When you move in the direction of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-68acde4d863cf1925f56a07b4fd9d956_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_1\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"18\">, you can climb the bowl as high as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d8bad01e37d6a93d8d77801a4429e98e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_1\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"16\">, and in the direction of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-1d75a0594c8832098abb7d5c34fd50cd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_2\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\"> as high as <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b5ec09b14d85337fe173890755f5259_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_2\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"17\">.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5653 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_curve_gradient-1030x388.png\" alt=\"\" width=\"1030\" height=\"388\"><\/h3>\n<p>In the case of (h), the same facts hold, but in this case you can also descend the curve.<\/p>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-5654 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_curve_gradient_2-1030x391.png\" alt=\"\" width=\"1030\" height=\"391\"><\/p>\n<p>*You might have seen the curve above in the context of optimization with stochastic gradient descent. 
The origin of the curve above is a notorious saddle point, where the gradient is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a5e437be25f29374d30f66cd46adf81c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"0\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\"> in every direction, yet the point is neither a local maximum nor a local minimum. Optimization can get stuck at such a point.<\/p>\n<p>In the case of PCA in particular, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is a covariance matrix, thus <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-352eb684151958c618a66fc7b2405814_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A=Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"49\">. The eigenvalues of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\"> are all greater than or equal to <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a5e437be25f29374d30f66cd46adf81c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"0\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\">. 
And it is known that in this case <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-130188abd4690d701177358e4ad96950_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_i\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\"> is the variance of data projected on its corresponding eigenvector <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-40f970bfd679e355af6e6f7b93e620c3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_i\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"17\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-90e748c34e43b71c56c54bf32c9ea6fd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(i=0, dots , D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"105\">. Hence, if you project <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-53e11b921a189dfe27e8988d6cb05a00_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; Sigma)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"56\">, quadratic curves formed by a covariance matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">, on eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">, you get <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-53e11b921a189dfe27e8988d6cb05a00_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; Sigma)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"56\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b3263462f2de04ded4b18d8d8c2efd10_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= ({x'}_1 : dots : {x'}_D) (lambda_1 {x'}_1 : dots : lambda_D {x'}_D)^t\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"260\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2417df32d788582c01544fd72ae17624_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=lambda_1 ({x'}_1)^2 + cdots + lambda_D ({x'}_D)^2\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"215\">. This shows that you can re-weight <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2140f24ea3c31f9d77b5cc96c5f4edb8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"({x'}_1 : dots : {x'}_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"96\">, the coordinates of the data projected on the eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">, with <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b70547b84409ce6390fae0e9521babf3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_1, dots, lambda_D\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"78\">, which are the variances of <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2140f24ea3c31f9d77b5cc96c5f4edb8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"({x'}_1 : dots : {x'}_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"96\">. As I mentioned in the example of exam score data in the last article, the bigger a variance <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-130188abd4690d701177358e4ad96950_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_i\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\"> is, the more the feature described by <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-40f970bfd679e355af6e6f7b93e620c3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_i\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"17\"> varies from sample to sample. In other words, you can ignore eigenvectors corresponding to small eigenvalues.<\/p>\n<p>That is a good hint as to why principal components corresponding to large eigenvalues contain much information about the data distribution. 
You can also interpret PCA as \u201cclimbing\u201d the bowl of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c91aff2c7ea0f4230027150813b44841_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"69\">, as I visualized for the type (g) curve in the figure above.<\/p>\n<p>*But as I have repeatedly mentioned, the ellipsoid which fits the data well is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-946530cccde47c2600d08d7866149d06_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; Sigma ^{-1})\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"74\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-430ae6cac36c70e75ab2fad46a38daad_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=(boldsymbol{x}')^T diag(frac{1}{lambda_1}, dots, frac{1}{lambda_D})boldsymbol{x}'\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"204\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-204b15666bde7f4bf2dcdac449983aa0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= frac{({x'}_{1})^2}{lambda_1} + cdots + frac{({x'}_{D})^2}{lambda_D} = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"28\" width=\"237\">.<\/p>\n<p>*Be careful: even if you slice a type (h) curve <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c91aff2c7ea0f4230027150813b44841_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"69\"> with a plane <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d43e7728fdadc6bf85994f3c3cd4bd5a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z=const.\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"79\">, the resulting cross section does not fit the original data well, because the equation of the cross section is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ae40abb3803829eaae65e70e9f599ec8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_1 ({x'}_1)^2 + cdots + lambda_D ({x'}_D)^2 = const.\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"267\">. The figure below is an example of slicing the same <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-857c9f0d06e34054f0bea95ddf5d7222_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"f(boldsymbol{x}; A_2)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"64\"> as the one above with the plane <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-74790055f00b0c3de5373bd351d017fd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"z=1\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"41\">, together with the resulting cross section.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5652 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/quad_curve_slice-1030x389.png\" alt=\"\" width=\"1030\" height=\"389\"><\/h3>\n<p>As we have seen, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-130188abd4690d701177358e4ad96950_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"lambda_i\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\">, the eigenvalues of the covariance matrix of the data, are the variances of the data when 
projected on its eigenvectors. At the same time, when you fit an ellipsoid on the data, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-0ef71ef2387690d93b04940fb96aeb2c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"sqrt{lambda_i}\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"30\"> is the radius of the ellipsoid corresponding to <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-40f970bfd679e355af6e6f7b93e620c3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_i\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"17\">. Thus ignoring data projected on eigenvectors corresponding to small eigenvalues is equivalent to cutting off the axes of the ellipsoid with small radii.<\/p>\n<p>I have explained PCA in three different ways over three articles.<\/p>\n<ul>\n<li>The second article: I focused on what kind of linear transformations covariance matrices <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\"> enable, by visualizing displacement vectors. 
Those vectors appear to swirl and extend in the directions of the eigenvectors of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-14fb1e14301ad034b94e3db3ff52c0c9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\">.<\/li>\n<li>The third article: We directly looked for the directions in which a data distribution \u201cswells\u201d the most, and found that data swell the most in the directions of the eigenvectors.<\/li>\n<li>In this article, we have seen that PCA corresponds to just one case of quadratic functions, where the matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> is a covariance matrix. When you go in the directions of the eigenvectors corresponding to big eigenvalues, the quadratic function increases the most. That also means data samples have bigger variances when projected on those eigenvectors. 
Thus you can cut off eigenvectors corresponding to small eigenvalues because they retain little information about the data, and that is equivalent to fitting an ellipsoid on the data and cutting off the axes with small radii.<\/li>\n<\/ul>\n<p>*Let <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-25b206f25506e6d6f46be832f7119ffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> be a covariance matrix. You can diagonalize it with an orthogonal matrix <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> as follows: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d8c3912785e38a01fcaca2fd098cf472_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^{T}AU = Lambda\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"87\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-501bcde6261dab3102bcda3cfcaa9938_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Lambda = diag(lambda_1, dots, lambda_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"163\">. Thus <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a3c475d76182abc04bd900f7489a180f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"A = U Lambda U^{T}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"87\">. 
<img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-6bfec27ce583970e303355d356b3523a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U^T\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"23\"> acts first, as a rotation of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\">, and then multiplying by <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ce918bb0b12ecb0692e455ec07a7a279_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Lambda\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"12\"> means you multiply each element of the rotated <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-a4997d1a0a6554f7c4b2e41d93ee7fe4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{x}\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> by the corresponding eigenvalue. At the end <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2b60fc262803f27ba3717d8ec4eb656d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\"> applies the reverse rotation.<\/p>\n<p>If you get data like the left side of the figure below, most explanations of PCA would just fit an oval to this data distribution. 
However, after reading this article series so far, you would have learned to see PCA from different viewpoints, as on the right side of the figure below.<\/p>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-5655 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/intuitive_PCA_en_jp-1030x522.png\" alt=\"\" width=\"1030\" height=\"522\"><\/p>\n<p>\u00a0<\/p>\n<h3>5 Ellipsoids in Gaussian distributions<\/h3>\n<p>I have explained that if the covariance of a data distribution is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-694cdbc8f78038d0eb53cc86377fb95c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{Sigma}\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\">, the ellipsoid which fits the distribution the best is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-33acb4c8333323c7fcde7ababd98781d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"bigl((boldsymbol{x} - boldsymbol{mu}), boldsymbol{Sigma}^{-1}(boldsymbol{x} - boldsymbol{mu})bigr) = 1\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"206\">. 
You might have seen the part <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ecb4b80e12e3277fd9eed4fb38eaa50d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"bigl((boldsymbol{x} - boldsymbol{mu}), boldsymbol{Sigma}^{-1}(boldsymbol{x} - boldsymbol{mu})bigr) =\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"193\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d8ec7771244d6ddd2bb0c8e92538ce3f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x} - boldsymbol{mu}) boldsymbol{Sigma}^{-1}(boldsymbol{x} - boldsymbol{mu})\" title=\"Rendered by QuickLaTeX.com\" height=\"21\" width=\"151\"> somewhere else. It is the exponent of general Gaussian distributions: <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7e6ea8a18ce5dcc590844cd983c91d81_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"mathcal{N}(boldsymbol{x} | boldsymbol{mu}, boldsymbol{Sigma}) =\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"102\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-735ecc6849bfd08dd31b43ca82ead8df_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"frac{1}{(2pi)^{D\/2}} frac{1}{|boldsymbol{Sigma}|} exp{ -frac{1}{2}(boldsymbol{x} - boldsymbol{mu}) boldsymbol{Sigma}^{-1}(boldsymbol{x} - boldsymbol{mu}) }\" title=\"Rendered by QuickLaTeX.com\" height=\"28\" width=\"295\">.\u00a0 It is known that the eigenvalues of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-925b28bc06ff9531a47f28b28563fefa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma ^{-1}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"30\"> are <img loading=\"lazy\" 
src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ae3a62d59e26c150e158b80eacb6d5f0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"frac{1}{lambda_1}, dots, frac{1}{lambda_D}\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"77\">, and the corresponding eigenvectors are again <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-4754415955abf085a27afcb7ce531fa2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{u}_1, dots, boldsymbol{u}_D\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"82\">, respectively. Hence, just as we have seen, if you project <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-54b5a18aa4aa28f690d8c9b4f152dd96_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x} - boldsymbol{mu})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"58\"> on each eigenvector of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-925b28bc06ff9531a47f28b28563fefa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"Sigma ^{-1}\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"30\">, you can rewrite the exponent of the Gaussian distribution.<\/p>\n<p>Let <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-54b5a18aa4aa28f690d8c9b4f152dd96_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x} - boldsymbol{mu})\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"58\"> be <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-cdb36e8e57e3426c880e96fdf4e1696c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"boldsymbol{y}\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"11\"> and <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-85dadfc64db9ca18bad07be3a7b20ddd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U ^{-1} boldsymbol{y}= U^{T} boldsymbol{y}\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"102\"> be <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b8dce53cc81c876bbd5c89ae6e9b2a31_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"boldsymbol{y}'\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"15\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c6455b5513eba3ffcb4a59462ebe7404_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"U=(boldsymbol{u}_1 : dots : boldsymbol{u}_D)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"128\">. 
Just as we have seen, <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d8ec7771244d6ddd2bb0c8e92538ce3f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"(boldsymbol{x} - boldsymbol{mu}) boldsymbol{Sigma}^{-1}(boldsymbol{x} - boldsymbol{mu})\" title=\"Rendered by QuickLaTeX.com\" height=\"21\" width=\"151\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7154adc0f68743d15c4747530c9dfffa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=boldsymbol{y}^TSigma^{-1} boldsymbol{y}\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"83\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-405cd1452c5807ff94e6e602be3c2d47_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=(Uboldsymbol{y}')^T Sigma^{-1} Uboldsymbol{y}'\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"133\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-12b605815af3031b80adfd7f71b1554c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=((boldsymbol{y}')^T U^T Sigma^{-1} Uboldsymbol{y}'\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"151\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-60296f8661370cba36a8df06c5b18089_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= (boldsymbol{y}')^T diag(frac{1}{lambda_1}, dots, frac{1}{lambda_D}) boldsymbol{y}'\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"202\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-283f5d210aaf02c34edccb6932d06e85_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"= frac{({y'}_{1})^2}{lambda_1} + cdots + frac{({y'}_{D})^2}{lambda_D}\" title=\"Rendered by 
QuickLaTeX.com\" height=\"28\" width=\"164\">. Hence <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-7e6ea8a18ce5dcc590844cd983c91d81_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"mathcal{N}(boldsymbol{x} | boldsymbol{mu}, boldsymbol{Sigma}) =\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"102\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-de71ac90716756a2faa989d108d6b445_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"frac{1}{(2pi)^{D\/2}} frac{1}{|boldsymbol{Sigma}|} exp{ -frac{1}{2}(boldsymbol{y}) boldsymbol{Sigma}^{-1}(boldsymbol{y}) } =  frac{1}{(2pi)^{D\/2}} frac{1}{|boldsymbol{Sigma}|} exp{ -frac{1}{2}(frac{({y'}_{1})^2}{lambda_1} + cdots + frac{({y'}_{D})^2}{lambda_D} ) }\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"556\"> <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-0d3fcb9492d63fa6694d5b8da06a32d1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"=frac{1}{(2pi)^{1\/2}} frac{1}{|boldsymbol{Sigma}|} expbiggl( -frac{1}{2} frac{({y'}_{1})^2}{lambda_1} biggl) cdots frac{1}{(2pi)^{1\/2}} frac{1}{|boldsymbol{Sigma}|} expbiggl( -frac{1}{2}frac{({y'}_{D})^2}{lambda_D} biggl)\" title=\"Rendered by QuickLaTeX.com\" height=\"43\" width=\"422\">.<\/p>\n<p>*To be mathematically exact about changing variables of normal distributions, you have to consider, for example, Jacobian matrices.<\/p>\n<p>The results above demonstrate that, by projecting data on the eigenvectors of its covariance matrix, you can factorize the original multi-dimensional Gaussian distribution into a product of Gaussian distributions which are independent of each other. However, at the same time, that is a potential limitation of approximating data with PCA. 
This idea is going to be more important when you think about more probabilistic ways to handle PCA, which are more robust to a lack of data.<\/p>\n<p>I have explained PCA over 3 articles from various viewpoints. If you have been patient enough to read my article series, I think you have gained some deeper insight into not only PCA but also linear algebra, and that should be helpful when you learn or teach data science. I hope my code also helps you. In fact, these are not the only topics about PCA. There are a lot of important PCA-like algorithms.<\/p>\n<p>In fact, our expedition of ellipsoids, or PCA, still continues, just as the Star Wars series still continues. In particular, to explain an algorithm named probabilistic PCA, I need to explain the \u201cBayesian world\u201d of machine learning. Most machine learning algorithms covered by major introductory textbooks tend to be too deterministic and dependent on the size of data. Many of those algorithms have another \u201cparallel world,\u201d where you can handle inaccuracy in better ways. I hope I can also write about them, and I might prepare another trilogy for such PCA. But I will not disappoint you the way \u201cThe Phantom Menace\u201d did.<\/p>\n<h3>Appendix: making a model of a bunch of grapes with ellipsoid berries<\/h3>\n<p>If you can control quadratic curves, reshaping and rotating them, you can make a model of a grape or olive bunch in Matplotlib. I made a program for making a model of a bunch of berries in Matplotlib using the module to draw ellipsoids which I introduced earlier. 
You can check the code on <a href=\"https:\/\/github.com\/YasuThompson\/dimension_reduction_codes\/blob\/main\/multi_view_grape_bunch.ipynb\">this page<\/a>.<\/p>\n<p>*I have no idea how many people on this earth are in need of making such models.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5656\" src=\"https:\/\/data-science-blog.com\/wp-content\/uploads\/2021\/12\/single_view_grape_bunch-1030x1030.png\" alt=\"\" width=\"713\" height=\"713\"><\/h3>\n<p>I made some modules so that you can see the grape bunch from several angles. This might look very simple to you, but the locations of the berries are organized carefully so that they look like they are placed around a stem and are not too close to each other.<\/p>\n<h3><img loading=\"lazy\" class=\"aligncenter wp-image-5657 size-large\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2021\/06\/multi_view_grape_bunch-2-e1617150713353-1030x593.png\" alt=\"\" width=\"1030\" height=\"593\"><\/h3>\n<p>\u00a0<\/p>\n<p>The programming code I created for this article is completely available <a href=\"https:\/\/github.com\/YasuThompson\/dimension_reduction_codes\">here<\/a>.<\/p>\n<p>[References]<\/p>\n<p>[1]C. M. Bishop, \u201cPattern Recognition and Machine Learning,\u201d (2006), Springer, pp. 78-83, 559-577<\/p>\n<p>[2]\u300c\u7406\u5de5\u7cfb\u65b0\u8ab2\u7a0b\u3000\u7dda\u5f62\u4ee3\u6570\u3000\u57fa\u790e\u304b\u3089\u5fdc\u7528\u307e\u3067\u300d, \u57f9\u98a8\u9928\u3001(2017)<\/p>\n<p>[3]\u300c\u3053\u308c\u306a\u3089\u5206\u304b\u308b\u3000\u6700\u9069\u5316\u6570\u5b66\u3000\u57fa\u790e\u539f\u7406\u304b\u3089\u8a08\u7b97\u624b\u6cd5\u307e\u3067\u300d, \u91d1\u8c37\u5065\u4e00\u8457\u3001\u5171\u7acb\u51fa\u7248, (2019), pp. 
17-49<\/p>\n<p>[4]\u300c\u3053\u308c\u306a\u3089\u5206\u304b\u308b\u3000\u5fdc\u7528\u6570\u5b66\u6559\u5ba4\u3000\u6700\u5c0f\u4e8c\u4e57\u6cd5\u304b\u3089\u30a6\u30a7\u30fc\u30d6\u30ec\u30c3\u30c8\u307e\u3067\u300d, \u91d1\u8c37\u5065\u4e00\u8457\u3001\u5171\u7acb\u51fa\u7248, (2019), pp.165-208<\/p>\n<p>[5] \u300c\u30b5\u30dc\u30c6\u30f3\u30d1\u30a4\u30bd\u30f3 \u300d<br \/>https:\/\/sabopy.com\/<\/p>\n<p>\u00a0<\/p>\n<div id=\"author-bio-box\">\n<h3><a href=\"https:\/\/data-science-blog.com\/en\/blog\/author\/yasuto\/\" title=\"All posts by Yasuto Tamura\" rel=\"author\">Yasuto Tamura<\/a><\/h3>\n<div class=\"bio-gravatar\"><img loading=\"lazy\" data-del=\"avatar\" alt=\"\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2020\/03\/yasuto-tamura.png\" class=\"avatar pp-user-avatar avatar-70 photo \" height=\"70\" width=\"70\"><\/div>\n<p><a target=\"_blank\" rel=\"nofollow noopener noreferrer\" href=\"http:\/\/www.datanomiq.de\" class=\"bio-icon bio-icon-website\"><\/a><a target=\"_blank\" rel=\"nofollow noopener noreferrer\" href=\"https:\/\/www.linkedin.com\/in\/yasuto-tamura-7689b418b\/\" class=\"bio-icon bio-icon-linkedin\"><\/a><\/p>\n<p class=\"bio-description\">Data Science Intern at <a href=\"http:\/\/www.datanomiq.io\">DATANOMIQ<\/a>.<br \/>\nMajoring in computer science. Currently studying mathematical sides of deep learning, such as densely connected layers, CNN, RNN, autoencoders, and making study materials on them. 
Also started aiming at Bayesian deep learning algorithms.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/data-science-blog.com\/en\/blog\/2021\/06\/06\/rethinking-linear-algebra-part-two-ellipsoids-in-data-science-and-mahalanobis-distance\/<\/p>\n","protected":false},"author":0,"featured_media":8304,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8303"}],"collection":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/comments?post=8303"}],"version-history":[{"count":0,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8303\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media\/8304"}],"wp:attachment":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media?parent=8303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/categories?post=8303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/tags?post=8303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}