{"id":8050,"date":"2021-01-11T03:55:56","date_gmt":"2021-01-11T03:55:56","guid":{"rendered":"https:\/\/wealthrevelation.com\/data-science\/2021\/01\/11\/how-principal-component-analysis-pca-works\/"},"modified":"2021-01-11T03:55:56","modified_gmt":"2021-01-11T03:55:56","slug":"how-principal-component-analysis-pca-works","status":"publish","type":"post","link":"https:\/\/wealthrevelation.com\/data-science\/2021\/01\/11\/how-principal-component-analysis-pca-works\/","title":{"rendered":"How Principal Component Analysis, PCA Works"},"content":{"rendered":"<div id=\"tve_editor\" data-post-id=\"8443\">\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d5dd4b3f\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/1-Principal-Component-Analysis.png?resize=622%2C373&amp;ssl=1\" class=\"tve_image wp-image-8445\" alt=\"Principal Component Analysis\" data-id=\"8445\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Principal Component Analysis\" loading=\"lazy\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8445\" alt=\"Principal Component Analysis\" data-id=\"8445\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Principal Component Analysis\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/1-Principal-Component-Analysis.png?resize=622%2C373&amp;ssl=1\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element tve-froala fr-box fr-basic\" data-css=\"tve-u-176d5dd4b49\">\n<p dir=\"ltr\">Whoever tried to build <a href=\"https:\/\/dataaspirant.com\/category\/machine-learning-2\/\" target=\"_blank\" rel=\"noopener\"><strong>machine learning 
models<\/strong><\/a> with many features will already have glimpsed the concept of principal component analysis, or <strong>PCA<\/strong> for short.<\/p>\n<p dir=\"ltr\">Including more features when building machine learning models can actually worsen performance. An increase in the number of features will not always improve <a href=\"https:\/\/dataaspirant.com\/six-popular-classification-evaluation-metrics-in-machine-learning\/\" target=\"_blank\" class=\"tve-froala\" rel=\"noopener\"><strong>classification accuracy<\/strong><\/a>.\u00a0<\/p>\n<p dir=\"ltr\">When enough features are <strong>not present<\/strong> in the data, the model is likely to <strong>underfit<\/strong>, and when the data contains <strong>too many<\/strong> features, it is likely to <strong>overfit<\/strong>. This phenomenon is known as the <strong>curse of dimensionality<\/strong>.\u00a0<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\" data-css=\"tve-u-176d5dd4b88\">\n<p dir=\"ltr\">Therefore, we apply dimensionality reduction by selecting the <strong>optimal set<\/strong> of lower dimensionality features in order to <a href=\"https:\/\/dataaspirant.com\/six-popular-classification-evaluation-metrics-in-machine-learning\/\" target=\"_blank\" rel=\"noopener\"><strong>improve classification accuracy<\/strong><\/a>.<\/p>\n<p dir=\"ltr\">We will walk through the techniques used to perform dimensionality reduction shortly.<\/p>\n<p dir=\"ltr\">If you are not sure about PCA (principal component analysis) and the need for dimensionality reduction, don&#8217;t worry. You are in the right place; in this article, we are going to cover everything.<\/p>\n<p dir=\"ltr\">Before we dive further, below are the topics you are going to learn in this article. 
Only if you read the complete article \ud83d\ude42<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\" data-css=\"tve-u-176d5dd4b8b\">\n<p dir=\"ltr\">Let\u2019s start the discussion with the curse of dimensionality and its impact on <a href=\"https:\/\/dataaspirant.com\/for-beginners\/\" target=\"_blank\" rel=\"noopener\"><strong>building machine learning models<\/strong><\/a>.<\/p>\n<h2 id=\"t-1609906201469\" class=\"\">Curse of Dimensionality<\/h2>\n<p dir=\"ltr\">Curse of Dimensionality can be defined as:<\/p>\n<blockquote class=\"\"><p>The set of problems that arise when we work with high-dimensional data.<\/p><\/blockquote>\n<p dir=\"ltr\">The dimension of a dataset is directly related to the <strong>number of features<\/strong> that are present in a dataset.\u00a0<\/p>\n<p dir=\"ltr\">High-dimensional data can be defined as a dataset having a large number of attributes, generally of the order of a hundred or more.<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d5ed3969\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/2-Curse-Of-Dimensionality.png?resize=622%2C373&amp;ssl=1\" class=\"tve_image wp-image-8455\" alt=\"Curse Of Dimensionality\" data-id=\"8455\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Curse Of Dimensionality\" loading=\"lazy\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8455\" alt=\"Curse Of Dimensionality\" data-id=\"8455\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Curse Of Dimensionality\" loading=\"lazy\" src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/2-Curse-Of-Dimensionality.png?resize=622%2C373&amp;ssl=1\" data-width=\"622\" data-height=\"373\" 
data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">The difficulties that arise with high dimensional data arise during analysis and visualization of the data to identify patterns. Others manifest when we train the machine learning models.\u00a0<\/p>\n<p dir=\"ltr\">The curse of dimensionality can be defined in other words as:<\/p>\n<blockquote class=\"\"><p>The rise of difficulties due to the presence of high dimensional data when we train the machine learning models.<\/p><\/blockquote>\n<p dir=\"ltr\">The popular aspects of the curse of dimensionality are <\/p>\n<ul class=\"\">\n<li class=\"\">distance concentration<\/li>\n<li class=\"\">data sparsity<\/li>\n<\/ul>\n<p dir=\"ltr\">Before we learn about data sparsity and distance concentration, let\u2019s understand the curse of dimensionality with an example.<\/p>\n<h3 id=\"t-1609906201470\" class=\"\">Understanding the Curse of Dimensionality with regression Example<\/h3>\n<p dir=\"ltr\">We know that as the number of features or dimensions grows in a dataset, the available data which we need to generalize grows exponentially and becomes sparse.\u00a0<\/p>\n<p dir=\"ltr\">So, in high dimensional data The objects appear to be dissimilar and sparse, preventing common data organization strategies from being efficient.<\/p>\n<p dir=\"ltr\">Let\u2019s see how high dimensional data is a curse with the help of the following example.<\/p>\n<p dir=\"ltr\">Consider that we have two points i-e, 0, and 1 in a line, which are a <strong>unit distance<\/strong> away from each other.\u00a0<\/p>\n<p dir=\"ltr\">We introduce another axis again at a unit distance. 
So, the points are <strong>(0,0)<\/strong> and <strong>(1,1)<\/strong>.<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>Simulating this scenario for a growing number of dimensions, we get the following output:<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d5fbb3aa\"><span class=\"tve_image_frame\"><img class=\"tve_image wp-image-8460\" alt=\"Curse Of Dimensionality Example\" data-id=\"8460\" width=\"622\" height=\"373\" title=\"Curse Of Dimensionality Example\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/3-Curse-Of-Dimensionality-Example.png?resize=622%2C373&amp;ssl=1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element tve-froala fr-box fr-basic\">\n<p dir=\"ltr\">In one dimension, with the points uniformly distributed, about 1% of the points are outliers. 
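<\/p>\n<p dir=\"ltr\">These figures can be checked with a short Monte Carlo sketch (our own illustration, not code from the original article): sample points uniformly in the d-dimensional unit cube and call a point an outlier when any of its coordinates lies within 1% of the boundary.<\/p>\n

```python
import numpy as np

def outlier_fraction(n_dims, n_points=100000, border=0.01, seed=0):
    # Sample points uniformly in the d-dimensional unit hypercube.
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))
    # A point counts as an outlier when any coordinate falls within
    # `border` of the cube boundary (near 0 or near 1).
    near_edge = (points < border) | (points > 1 - border)
    return near_edge.any(axis=1).mean()

for d in (1, 50, 100):
    print(d, round(outlier_fraction(d), 3))
```

\n<p dir=\"ltr\">The exact percentages depend on how wide the border region is taken to be; the point is the trend, which climbs towards 100% as the number of dimensions grows.<\/p>\n<p dir=\"ltr\">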
In <strong>50<\/strong> dimensions, there will be almost <strong>60%<\/strong> of the outlier points.\u00a0<\/p>\n<p dir=\"ltr\">Similarly, in 100 dimensions, almost 90% of the points will be outliers.<\/p>\n<h3 id=\"t-1609906201471\" class=\"\">Data Sparsity<\/h3>\n<p dir=\"ltr\"><a href=\"https:\/\/dataaspirant.com\/supervised-and-unsupervised-learning\/\" target=\"_blank\" class=\"tve-froala\" rel=\"noopener\"><strong>Supervised machine learning models<\/strong><\/a> are trained to accurately predict the outcome for a given input data sample.\u00a0<\/p>\n<p dir=\"ltr\">During training, some part of the data is used to fit the model, and the rest is used to <a href=\"https:\/\/dataaspirant.com\/six-popular-classification-evaluation-metrics-in-machine-learning\/\" target=\"_blank\" rel=\"noopener\"><strong>evaluate how the model performs<\/strong><\/a> on unseen data.\u00a0<\/p>\n<p dir=\"ltr\">This evaluation step helps us understand whether the model generalizes or not.\u00a0<\/p>\n<p dir=\"ltr\">There are standard techniques for splitting the dataset into train and test sets.<\/p>\n<p dir=\"ltr\">Model generalization can be defined as the ability of the model to accurately predict the outcome for <strong>unseen<\/strong> input data.\u00a0<\/p>\n<p dir=\"ltr\">It is mandatory that the unseen input data come from the same distribution as the one used to train the model. 
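<\/p>\n<p dir=\"ltr\">As a minimal stand-in for the splitting tutorials referenced in the original post, a holdout split can be sketched in plain NumPy (scikit-learn provides a ready-made train_test_split that does the same job):<\/p>\n

```python
import numpy as np

def train_test_split(X, y, test_size=0.2, seed=42):
    # Shuffle the row indices, then carve off a `test_size` share for testing.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_size)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(X, y)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

\n<p dir=\"ltr\">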
<\/p>\n<p dir=\"ltr\">The accuracy of the generalized model\u2019s prediction on the unseen data should be very close to its accuracy on the training data.<\/p>\n<p dir=\"ltr\">The efficient way to build a generalized model is by capturing a variety of possible combinations of the values of predictor variables and their corresponding targets.<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d5fd0b7a\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/4-Data-Sparsity.png?resize=582%2C410&amp;ssl=1\" class=\"tve_image wp-image-8464\" alt=\"Data Sparsity\" data-id=\"8464\" width=\"582\" data-init-width=\"1024\" height=\"410\" data-init-height=\"721\" title=\"Data Sparsity\" loading=\"lazy\" data-width=\"582\" data-height=\"410\" data-css=\"tve-u-176d8939573\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8464\" alt=\"Data Sparsity\" data-id=\"8464\" width=\"582\" data-init-width=\"1024\" height=\"410\" data-init-height=\"721\" title=\"Data Sparsity\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/4-Data-Sparsity.png?resize=582%2C410&amp;ssl=1\" data-width=\"582\" data-height=\"410\" data-css=\"tve-u-176d8939573\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">For example.\u00a0<\/p>\n<p dir=\"ltr\">If we have to predict a target that is dependent on <strong>two<\/strong> attributes, i-e, age group and gender. 
Then, ideally, we have to capture the targets for all possible combinations of values of the two mentioned attributes.\u00a0<\/p>\n<p dir=\"ltr\">The model generalizes well if the data used to train it lets it learn the mapping between the attribute values and the target.\u00a0<\/p>\n<p dir=\"ltr\">The model would predict the target accurately as long as the future unseen data comes from the same distribution (combination of values).<\/p>\n<p dir=\"ltr\"><strong>Age group levels<\/strong><\/p>\n<ul class=\"\">\n<li>Children (0-14 Years)<\/li>\n<li>Youth (15-24 Years)<\/li>\n<li>Adult (25-60 Years)<\/li>\n<li>Senior (61 and over)<\/li>\n<\/ul>\n<p dir=\"ltr\"><strong>Gender Levels<\/strong><\/p>\n<ul class=\"\">\n<li>Female<\/li>\n<li>Male<\/li>\n<\/ul>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element tve-froala fr-box fr-basic\">\n<p dir=\"ltr\">In the above example, we considered the dependence of the target value on gender and age group only.\u00a0<\/p>\n<p dir=\"ltr\">If we consider the dependence of the target value on a third attribute, let\u2019s say body type, then the number of training samples required to cover all the combinations increases phenomenally.\u00a0<\/p>\n<p dir=\"ltr\">In the above figure, it is shown that for two variables, we have eight training samples (4 age groups \u00d7 2 genders). 
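<\/p>\n<p dir=\"ltr\">The combinatorics are easy to verify with itertools (the attribute values below are illustrative):<\/p>\n

```python
from itertools import product

age_groups = ['children', 'youth', 'adult', 'senior']
genders = ['female', 'male']
body_types = ['slim', 'average', 'heavy']  # hypothetical third attribute

# Every combination of attribute values needs at least one training sample.
print(len(list(product(age_groups, genders))))              # 8
print(len(list(product(age_groups, genders, body_types))))  # 24
```

\n<p dir=\"ltr\">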
So, for three variables, we need 24 samples, and so on.<\/p>\n<h3 id=\"t-1609906201472\" class=\"\">Distance Concentration<\/h3>\n<p dir=\"ltr\">Distance concentration can be defined as:<\/p>\n<blockquote class=\"\"><p>The problem of convergence of all pairwise distances to the same value as the data dimensionality increases.<\/p><\/blockquote>\n<p dir=\"ltr\">Some machine learning models, such as <a href=\"https:\/\/dataaspirant.com\/hierarchical-clustering-algorithm\/\" target=\"_blank\" rel=\"noopener\"><strong>clustering<\/strong><\/a> or <a href=\"https:\/\/dataaspirant.com\/k-nearest-neighbor-algorithm-implementaion-python-scratch\/\" target=\"_blank\" class=\"tve-froala\" rel=\"noopener\"><strong>nearest neighbor<\/strong><\/a> methods, make use of distance-based metrics to identify the proximity of the samples.<\/p>\n<p dir=\"ltr\">The concept of similarity or proximity of the samples may not be qualitatively relevant in higher dimensions due to <a href=\"https:\/\/dataaspirant.com\/five-most-popular-similarity-measures-implementation-in-python\/\" target=\"_blank\" rel=\"noopener\"><strong>distance concentration<\/strong><\/a>.<\/p>\n<h3 id=\"t-1609906201473\" class=\"\">Implications of the Curse of Dimensionality<\/h3>\n<p dir=\"ltr\">The curse of dimensionality has the following implications:<\/p>\n<ul class=\"\">\n<li>Due to the large number of features, <a href=\"https:\/\/dataaspirant.com\/optimization-algorithms-deep-learning\/\" target=\"_blank\" rel=\"noopener\"><strong>optimization problems<\/strong><\/a> become infeasible.<\/li>\n<li>The probability of observing any particular point keeps falling, simply because of the sheer number of possible points in an n-dimensional space.<\/li>\n<\/ul>\n<h2 id=\"t-1609906201474\" class=\"\">Mitigating Curse of Dimensionality<\/h2>\n<p dir=\"ltr\">To overcome the problems associated with high dimensional data, techniques termed \u2018dimensionality reduction techniques\u2019 are 
applied.<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6010ed9\"><span class=\"tve_image_frame\"><img class=\"tve_image wp-image-8469\" alt=\"Dimensionality reduction techniques\" data-id=\"8469\" width=\"582\" height=\"402\" title=\"Dimensionality reduction techniques\" loading=\"lazy\" data-css=\"tve-u-176d894d264\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/5-Dimensionality-reduction-techniques.png?resize=582%2C402&amp;ssl=1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">The dimensionality reduction techniques fall into one of two categories: feature selection and feature extraction.<\/p>\n<h2 id=\"t-1609906201475\" class=\"\">Feature selection Methods<\/h2>\n<p dir=\"ltr\">In <a href=\"https:\/\/dataaspirant.com\/feature-selection-methods-machine-learning\/\" target=\"_blank\" rel=\"noopener\"><strong>feature selection techniques<\/strong><\/a>, we test the attributes on the basis of their worth, and then they are selected or eliminated.\u00a0<\/p>\n<p dir=\"ltr\">Following are some of the commonly used feature selection techniques:<\/p>\n<h3 id=\"t-1609906201476\" class=\"\">Low Variance filter<\/h3>\n<p dir=\"ltr\" id=\"t-1609906201477\">The process flow of this technique is as 
under:<\/p>\n<ul class=\"\">\n<li class=\"\">The variance of all the attributes in a dataset is compared.\u00a0<\/li>\n<li class=\"\">The attributes having sufficiently low variance are discarded.<\/li>\n<li class=\"\">Attributes that do not possess much variance are close to constant and thus contribute nothing to the model\u2019s predictability.<\/li>\n<\/ul>\n<h3 id=\"t-1609906201478\" class=\"\">High Correlation filter<\/h3>\n<p dir=\"ltr\">In this technique, the steps are as under:<\/p>\n<ul class=\"\">\n<li class=\"\">The pairwise correlation between attributes is determined.\u00a0<\/li>\n<li class=\"\">For each pair with a significantly high correlation, one of the attributes is eliminated and the other retained.<\/li>\n<li class=\"\">The variability of the eliminated attribute is still captured through the retained attribute.<\/li>\n<\/ul>\n<h3 id=\"t-1609906201479\" class=\"\">Multicollinearity<\/h3>\n<p dir=\"ltr\"><a href=\"https:\/\/dataaspirant.com\/assumptions-of-linear-regression-algorithm\/\" target=\"_blank\" rel=\"noopener\"><strong>Multicollinearity<\/strong><\/a> occurs when a high degree of correlation exists between two or more independent variables in a <a href=\"https:\/\/dataaspirant.com\/linear-regression\/\" target=\"_blank\" rel=\"noopener\"><strong>regression model. 
<\/strong><\/a><\/p>\n<p dir=\"ltr\">It means that one independent variable can be determined or predicted from another independent variable.<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6029881\"><span class=\"tve_image_frame\"><img class=\"tve_image wp-image-8473\" alt=\"Multicollinearity\" data-id=\"8473\" width=\"604\" height=\"325\" title=\"Multicollinearity\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/6-Multicollinearity.png?resize=604%2C325&amp;ssl=1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">The Variance Inflation Factor (VIF) is a well-known technique used to detect multicollinearity. 
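<\/p>\n<p dir=\"ltr\">For feature j, VIF_j = 1 \/ (1 \u2212 R\u00b2_j), where R\u00b2_j comes from regressing feature j on all the other features. Below is a minimal NumPy sketch of that definition (statsmodels also ships a ready-made variance_inflation_factor):<\/p>\n

```python
import numpy as np

def vif(X):
    # VIF_j = 1 / (1 - R^2_j), where R^2_j is the coefficient of
    # determination from regressing column j on all the other columns.
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept
    out = []
    for j in range(1, A.shape[1]):
        y = A[:, j]
        others = np.delete(A, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
# The third column is almost a copy of the first, so both get huge VIFs.
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])
print(np.round(vif(X), 1))
```

\n<p dir=\"ltr\">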
Attributes having high VIF values, usually greater than 10, are discarded.<\/p>\n<h3 id=\"t-1609906201480\" class=\"\">Feature Ranking<\/h3>\n<p dir=\"ltr\">The attributes can be ranked by <a href=\"https:\/\/dataaspirant.com\/how-decision-tree-algorithm-works\/\" target=\"_blank\" rel=\"noopener\"><strong>decision tree models<\/strong><\/a> such as CART (<a href=\"https:\/\/dataaspirant.com\/classification-and-prediction\/\" target=\"_blank\" rel=\"noopener\"><strong>Classification and Regression<\/strong><\/a> Trees) based on their importance or contribution to the model\u2019s predictability.<\/p>\n<p dir=\"ltr\">The lower-ranked variables in high dimensional data could be eliminated to <strong>reduce<\/strong> the dimensions.<\/p>\n<h3 id=\"t-1609906201481\" class=\"\">Forward selection<\/h3>\n<p dir=\"ltr\">When a <a href=\"https:\/\/dataaspirant.com\/linear-regression-implementation-in-python\/\" target=\"_blank\" rel=\"noopener\"><strong>multiple linear regression model<\/strong><\/a> is built with high dimensional data, only one attribute is selected at the beginning to build the regression model.\u00a0<\/p>\n<p>Afterward, the remaining attributes are added one by one, and their worth is tested using \u2018Adjusted-R2\u2019 values. 
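<\/p>\n<p>Greedy forward selection with Adjusted-R\u00b2 (computed as 1 \u2212 (1 \u2212 R\u00b2)(n \u2212 1)\/(n \u2212 p \u2212 1)) can be sketched as follows; this is an illustration in plain NumPy, not code from the original article:<\/p>\n

```python
import numpy as np

def adjusted_r2(X, y):
    # Ordinary least squares with an intercept, then
    # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def forward_select(X, y):
    selected, remaining, best = [], list(range(X.shape[1])), -np.inf
    while remaining:
        scores = {j: adjusted_r2(X[:, selected + [j]], y) for j in remaining}
        j = max(scores, key=scores.get)
        if scores[j] <= best:   # no noticeable improvement -> stop
            break
        best = scores[j]
        selected.append(j)
        remaining.remove(j)
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 2 * X[:, 0] - 3 * X[:, 3] + 0.1 * rng.normal(size=300)
print(forward_select(X, y))  # the informative columns 0 and 3 come first
```

\n<p>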
<\/p>\n<div class=\"\">If there is a noticeable improvement in <a href=\"https:\/\/dataaspirant.com\/difference-between-r-squared-and-adjusted-r-squared\/\" target=\"_blank\" rel=\"noopener\"><strong>Adjusted-R2 values<\/strong><\/a>, then the variable is retained; else, it is discarded.<\/div>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d603ac44\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/7-Feature-Selection-Method.png?resize=622%2C311&amp;ssl=1\" class=\"tve_image wp-image-8476\" alt=\"Feature Selection Method\" data-id=\"8476\" width=\"622\" data-init-width=\"1024\" height=\"311\" data-init-height=\"513\" title=\"Feature Selection Method\" loading=\"lazy\" data-width=\"622\" data-height=\"311\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8476\" alt=\"Feature Selection Method\" data-id=\"8476\" width=\"622\" data-init-width=\"1024\" height=\"311\" data-init-height=\"513\" title=\"Feature Selection Method\" loading=\"lazy\" src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/7-Feature-Selection-Method.png?resize=622%2C311&amp;ssl=1\" data-width=\"622\" data-height=\"311\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<h2 id=\"t-1609906201482\" class=\"\">Feature Extraction Methods<\/h2>\n<p dir=\"ltr\">There are a number of feature extraction techniques in which the combination of high dimensional attributes is done into <strong>low dimensional<\/strong> components (PCA or ICA). 
\u00a0<\/p>\n<p dir=\"ltr\">Commonly used feature extraction techniques include:<\/p>\n<ul class=\"\">\n<li>Independent Component Analysis<\/li>\n<li>Principal Component Analysis<\/li>\n<li>Autoencoder<\/li>\n<li>Partial Least Squares<\/li>\n<\/ul>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d604d3bb\"><span class=\"tve_image_frame\"><img class=\"tve_image wp-image-8479\" alt=\"Feature Extraction Method\" data-id=\"8479\" width=\"545\" height=\"448\" title=\"Feature Extraction Method\" loading=\"lazy\" data-css=\"tve-u-176d8953d6e\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/8-Feature-Extraction-Method.png?resize=545%2C448&amp;ssl=1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">We will be discussing Principal Component Analysis in detail.<\/p>\n<h3 id=\"t-1609906201483\" class=\"\">Principal Component Analysis (PCA)<\/h3>\n<p dir=\"ltr\"><strong>Karl Pearson<\/strong> invented Principal Component Analysis in <strong>1901<\/strong> as an analog of the principal axis theorem; Harold Hotelling later developed and named it independently in the 1930s.<\/p>\n<p dir=\"ltr\">Principal Component Analysis or PCA can be defined as:<\/p>\n<blockquote class=\"\"><p>A dimensionality-reduction technique in which transformation of high dimensional 
correlated data is performed into a lower-dimensional set of uncorrelated components, also referred to as principal components.<\/p><\/blockquote>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d605f8fa\"><span class=\"tve_image_frame\"><img class=\"tve_image wp-image-8482\" alt=\"Principal Component Analysis With Dimensions\" data-id=\"8482\" width=\"411\" height=\"460\" title=\"Principal Component Analysis With Dimensions\" loading=\"lazy\" data-css=\"tve-u-176d8958e34\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/9-Principal-Component-Analysis-With-Dimensions.png?resize=411%2C460&amp;ssl=1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">The <strong>lower-dimensional<\/strong> principal components capture the majority of the information in the high dimensional dataset.<\/p>\n<p dir=\"ltr\">An \u2018n\u2019-dimensional dataset is transformed into \u2018n\u2019 principal components. 
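<\/p>\n<p dir=\"ltr\">The whole pipeline (standardize, covariance matrix, eigendecomposition, keep the components that explain a target share of the variance, project) fits in a few lines of NumPy. This is an illustrative sketch, not code from the original article:<\/p>\n

```python
import numpy as np

def pca(X, var_to_keep=0.95):
    # 1. Standardize each feature to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. Covariance matrix of the standardized data.
    C = np.cov(Z, rowvar=False)
    # 3. Eigendecomposition; eigh suits symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]      # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 4. Keep just enough components to explain `var_to_keep` of the variance.
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, var_to_keep)) + 1
    # 5. Project the standardized data onto the retained eigenvectors.
    return Z @ eigvecs[:, :k]

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# Four correlated features built from two underlying signals.
X = np.column_stack([base[:, 0], base[:, 0] + 0.05 * rng.normal(size=200),
                     base[:, 1], base[:, 1] - 0.05 * rng.normal(size=200)])
print(pca(X).shape)  # two components already explain 95% of the variance
```

\n<p dir=\"ltr\">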
Then a subset of these <strong>\u2018n\u2019<\/strong> principal components is selected, based on the percentage of variance in the data intended to be captured through the principal components.<\/p>\n<p dir=\"ltr\">We can also describe Principal Component Analysis (PCA) as an exploratory approach to reduce the dataset\u2019s dimensionality to 2D or 3D, used in exploratory data analysis and for making predictive models.\u00a0<\/p>\n<p dir=\"ltr\">Principal Component Analysis can be described as a linear transformation of the data set that defines a new coordinate system as under:<\/p>\n<ul class=\"\">\n<li>On the <strong>first axis<\/strong>, the highest variance by any projection of the data set comes to lie.<\/li>\n<li>Similarly, the <strong>second biggest<\/strong> variance lies on the second axis, and so on.<\/li>\n<\/ul>\n<h3 id=\"t-1609906201484\" class=\"\">Purpose of Principal Component Analysis<\/h3>\n<p dir=\"ltr\">Principal component analysis (PCA) is used for the following purposes:<\/p>\n<ul class=\"\">\n<li>To visualize high dimensionality data.<\/li>\n<li>To introduce improvements in classification.<\/li>\n<li>To obtain a compact description.\u00a0<\/li>\n<li>To capture as much variance in the data as possible.<\/li>\n<li>To decrease the number of dimensions in the dataset.<\/li>\n<li>To search for patterns in datasets of high dimensionality.<\/li>\n<li>To discard noise.<\/li>\n<\/ul>\n<h2 id=\"t-1609906201485\" class=\"\">How Principal Component Analysis (PCA) Works<\/h2>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d607433d\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/10-How-Principal-Component-Analysis-Works.png?resize=622%2C373&amp;ssl=1\" class=\"tve_image wp-image-8486\" alt=\"How Principal Component Analysis 
Works\" data-id=\"8486\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"How Principal Component Analysis Works\" loading=\"lazy\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8486\" alt=\"How Principal Component Analysis Works\" data-id=\"8486\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"How Principal Component Analysis Works\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/10-How-Principal-Component-Analysis-Works.png?resize=622%2C373&amp;ssl=1\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">In short, principal component analysis (PCA) can be defined as:<\/p>\n<blockquote class=\"\"><p>Transforming and reshaping a large number of variables into a smaller number of unrelated variables known as principal components (PCs), developed to capture as much of the variance in the dataset as possible.<\/p><\/blockquote>\n<h3 id=\"t-1609906201486\" class=\"\">Objectives of PCA<\/h3>\n<h3 id=\"t-1609906201487\" class=\"\">The following are the main mathematical objectives of PCA:<\/h3>\n<ul class=\"\">\n<li>Finding an orthonormal basis for the data<\/li>\n<li>Sorting the dimensions in the order of importance<\/li>\n<li>Discarding the low significant dimensions<\/li>\n<li>Focusing on uncorrelated and Gaussian components<\/li>\n<\/ul>\n<h3 id=\"t-1609906201488\" class=\"\">Steps involved in PCA<\/h3>\n<p dir=\"ltr\">The following are the main steps involved in Principal Component Analysis.<\/p>\n<ul class=\"\">\n<li class=\" dir=\">Standardization of the PCA.<\/li>\n<li class=\" dir=\">Calculation of the covariance matrix.<\/li>\n<li class=\" dir=\">Finding the eigenvalues and eigenvectors for the covariance matrix.<\/li>\n<li class=\" dir=\">Plotting the vectors on the scaled data.<\/li>\n<\/ul>\n<h3 
id=\"t-1609906201489\" class=\"\">Problem depicting PCA requirement<\/h3>\n<p dir=\"ltr\">Let\u2019s suppose that there are <strong>100 students<\/strong> in a class having \u201ck\u201d different features like <\/p>\n<ul class=\"\">\n<li class=\"\">age,\u00a0<\/li>\n<li class=\"\">height, <\/li>\n<li class=\"\">hair color, <\/li>\n<li class=\"\">weight, <\/li>\n<li class=\"\">grade, and many more.<\/li>\n<\/ul>\n<p dir=\"ltr\">It is possible that most of the features may not be useful in describing the student. For this reason, it is mandatory to critically find those <strong>valuable features<\/strong> that characterize the person.<\/p>\n<p dir=\"ltr\">The analysis based on <strong>observing different features<\/strong> of a student:<\/p>\n<ul class=\"\">\n<li>Every student has data in the form of a vector that defines the length of k i-e; characteristic features like\n<ul>\n<li>height, <\/li>\n<li>weight, <\/li>\n<li>hair_color, <\/li>\n<li>grade or 181, 68, black, 99.<\/li>\n<\/ul>\n<\/li>\n<li>Each column represents one student vector. Therefore, n = 100. Here, n represents the number of features of a student.<\/li>\n<li>It creates a k*n matrix.<\/li>\n<li>Each student lies in a k-dimensional vector space.<\/li>\n<\/ul>\n<h2 id=\"t-1609906201490\" class=\"\">Principal Component Analysis Features<\/h2>\n<p dir=\"ltr\">Some of the features of PCA listed below are considered while the rest of them are ignored.<\/p>\n<h3 id=\"t-1609906201491\" class=\"\">PCA Ignored Features<\/h3>\n<ul class=\"\">\n<li class=\" dir=\">Linearly dependent or collinear features. e.g., height and leg size.<\/li>\n<li class=\" dir=\">Constant features. e.g., Number of teeth.<\/li>\n<li class=\" dir=\">Noisy features which are constant. 
e.g., hair thickness<\/li>\n<\/ul>\n<h3 id=\"t-1609906201492\" class=\"\">PCA Key Features to Keep<\/h3>\n<ul class=\"\">\n<li class=\" dir=\">Low-covariance or non-collinear features<\/li>\n<li class=\" dir=\">Features with high variance. <\/li>\n<\/ul>\n<h2 id=\"t-1609906201493\" class=\"\">Math Behind Principal Component Analysis<\/h2>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d608e454\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/11-Math-Behind-PCA.png?resize=622%2C373&amp;ssl=1\" class=\"tve_image wp-image-8490\" alt=\"Math Behind PCA\" data-id=\"8490\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Math Behind PCA\" loading=\"lazy\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8490\" alt=\"Math Behind PCA\" data-id=\"8490\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"Math Behind PCA\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/11-Math-Behind-PCA.png?resize=622%2C373&amp;ssl=1\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">It is important to understand the mathematical logic involved before applying PCA. 
Eigenvalues and eigenvectors play essential roles in PCA.<\/p>\n<h3 id=\"t-1609906201494\" class=\"\">Eigenvectors and eigenvalues<\/h3>\n<p dir=\"ltr\">The core of PCA is described by the eigenvectors and eigenvalues of a covariance (or correlation) matrix.\u00a0\u00a0<\/p>\n<p dir=\"ltr\">Eigenvectors determine the <strong>direction<\/strong> of the new attribute space, and the eigenvalues determine their magnitude.\u00a0<\/p>\n<p dir=\"ltr\">Let\u2019s consider a simple example depicting the calculation of eigenvalues and eigenvectors.<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>Let X represent a <strong>square matrix<\/strong>. The function <strong>scipy.linalg.eig<\/strong> computes the <strong>eigenvalues<\/strong> and eigenvectors of a square matrix.<\/p>\n<p>The matrix <strong>X<\/strong> looks like this:<\/p>\n<p><strong><span data-css=\"tve-u-176d69b42fb\">[[1, 0],\u00a0<\/span><\/strong><\/p>\n<p><strong><span data-css=\"tve-u-176d69b4303\">[0, -2]]<\/span><\/strong><\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>The function <strong>la.eig<\/strong> returns a tuple (eigvals, eigvecs), where eigvals is a 1D NumPy array of complex numbers giving the eigenvalues of X. <\/p>\n<p>Then <strong>eigvecs<\/strong> is a <strong>2D<\/strong> NumPy array with the corresponding eigenvectors in its columns.<\/p>\n<p>The eigenvalues of the matrix X are:<\/p>\n<p data-css=\"tve-u-176d6a0f885\"><strong>[1. + 0.j -2. + 0.j]<\/strong><\/p>\n<p>The corresponding eigenvectors are:<\/p>\n<p data-css=\"tve-u-176d6a13558\"><strong>[[1. 0.], [0. 1.]]<\/strong><\/p>\n<p>The main objective of PCA is to <strong>reduce the dimensionality<\/strong> of data by projecting it into a smaller subspace, whose axes are formed by the eigenvectors.<\/p>\n<p>All the eigenvectors have a <strong>length of 1<\/strong>; they define only the new axes\u2019 directions. 
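A runnable sketch of this example, assuming `scipy.linalg` is imported as `la` (as the text implies):

```python
import numpy as np
from scipy import linalg as la  # the text refers to this module as "la"

X = np.array([[1, 0],
              [0, -2]])

# la.eig returns (eigvals, eigvecs): the eigenvalues as a 1D complex array,
# and the corresponding eigenvectors as the columns of a 2D array.
eigvals, eigvecs = la.eig(X)
print(eigvals)   # eigenvalues 1.+0.j and -2.+0.j
print(eigvecs)   # columns [1, 0] and [0, 1]
```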
The eigenvectors with the highest eigenvalues are the ones that carry the most information about our<strong> data distribution<\/strong>.<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<h3 id=\"t-1609906201495\" class=\"\">Covariance Matrix<\/h3>\n<p dir=\"ltr\">The <strong>classic PCA<\/strong> approach determines the covariance matrix, where each element depicts the covariance between two attributes.\u00a0<\/p>\n<p dir=\"ltr\">The covariance relation between two attributes is shown below:<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d615133f\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/12-Covariance-Matrik.png?resize=622%2C148&amp;ssl=1\" class=\"tve_image wp-image-8494\" alt=\"Covariance Matrix\" data-id=\"8494\" width=\"622\" data-init-width=\"1112\" height=\"148\" data-init-height=\"264\" title=\"Covariance Matrix\" loading=\"lazy\" data-width=\"622\" data-height=\"148\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8494\" alt=\"Covariance Matrix\" data-id=\"8494\" width=\"622\" data-init-width=\"1112\" height=\"148\" data-init-height=\"264\" title=\"Covariance Matrix\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/12-Covariance-Matrik.png?resize=622%2C148&amp;ssl=1\" data-width=\"622\" data-height=\"148\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p dir=\"ltr\">At first, the data matrix is created, and then it is converted to the covariance matrix. 
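A minimal NumPy sketch of these two steps (data matrix, then covariance matrix, then eigendecomposition); the random data here is purely illustrative:

```python
import numpy as np

# Illustrative data matrix: 100 samples (rows) x 3 attributes (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))

# Step 1: build the covariance matrix (rowvar=False: columns are variables).
cov = np.cov(data, rowvar=False)

# Step 2: eigendecomposition. eigh suits symmetric matrices and returns
# real eigenvalues in ascending order, so we re-sort them descending.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
```

The eigenvalues, sorted in descending order, give each component's share of the variance; the matching eigenvector columns are the principal directions.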
\u00a0Eigenvalues and eigenvectors can also be calculated using the correlation matrix.<\/p>\n<h2 id=\"t-1609906201496\" class=\"\">Applications of PCA<\/h2>\n<p dir=\"ltr\">The typical applications of PCA are as follows:<\/p>\n<ul class=\"\">\n<li>\n<p dir=\"ltr\"><strong>Data Visualization:<\/strong> PCA makes data easy to explore by bringing out strong patterns in the relevant dataset.<\/p>\n<\/li>\n<li>\n<p dir=\"ltr\"><strong>Data Compression:<\/strong> The size of the given data can be reduced by decreasing the number of eigenvectors used to reconstruct the original data matrix.<\/p>\n<\/li>\n<li>\n<p dir=\"ltr\"><strong>Noise Reduction:<\/strong> PCA cannot eliminate noise; it can only reduce it. Reconstructing the data from the top components decreases the influence of the noise as much as possible.<\/p>\n<\/li>\n<li>\n<p dir=\"ltr\"><strong>Image Compression:<\/strong> Principal component analysis reduces the dimensions of an image and projects them back to reconstruct an image that retains most of its quality.<\/p>\n<\/li>\n<li>\n<p dir=\"ltr\"><strong>Face Recognition:<\/strong> EigenFaces is an approach built on PCA, which performs <a href=\"https:\/\/dataaspirant.com\/gender-wise-face-recognition-with-opencv\/\" target=\"_blank\" rel=\"noopener\"><strong>face recognition<\/strong><\/a> and reduces the statistical complexity of face image recognition.<\/p>\n<\/li>\n<\/ul>\n<h2 id=\"t-1609906201497\" class=\"\">Principal Component Analysis Implementation in Python<\/h2>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d61665b7\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/13-PCA-Implementation-In-Python.png?resize=622%2C373&amp;ssl=1\" class=\"tve_image wp-image-8497\" alt=\"PCA Implementation In Python\" data-id=\"8497\" width=\"622\" 
data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"PCA Implementation In Python\" loading=\"lazy\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8497\" alt=\"PCA Implementation In Python\" data-id=\"8497\" width=\"622\" data-init-width=\"750\" height=\"373\" data-init-height=\"450\" title=\"PCA Implementation In Python\" loading=\"lazy\" src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/13-PCA-Implementation-In-Python.png?resize=622%2C373&amp;ssl=1\" data-width=\"622\" data-height=\"373\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6a47b0c\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/14-Iris-dataset.png?resize=622%2C376&amp;ssl=1\" class=\"tve_image wp-image-8510\" alt=\"Iris dataset\" data-id=\"8510\" width=\"622\" data-init-width=\"1024\" height=\"376\" data-init-height=\"620\" title=\"Iris dataset\" loading=\"lazy\" data-width=\"622\" data-height=\"376\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8510\" alt=\"Iris dataset\" data-id=\"8510\" width=\"622\" data-init-width=\"1024\" height=\"376\" data-init-height=\"620\" title=\"Iris dataset\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/14-Iris-dataset.png?resize=622%2C376&amp;ssl=1\" data-width=\"622\" data-height=\"376\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>Next we are getting the value of <strong>a and b<\/strong>. Now, Let&#8217;s implementing PCA with the covariance matrix.<\/p>\n<\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>\u00a0Now, standardizing <strong>a<\/strong>, we get, PCA with <strong>two<\/strong> components. 
To check the eigenvectors and eigenvalues, we print them.<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6ded7d2\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/15-Eigen-Vectors.png?resize=622%2C130&amp;ssl=1\" class=\"tve_image wp-image-8521\" alt=\"Eigen Vectors\" data-id=\"8521\" width=\"622\" data-init-width=\"998\" height=\"130\" data-init-height=\"208\" title=\"Eigen Vectors\" loading=\"lazy\" data-width=\"622\" data-height=\"130\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8521\" alt=\"Eigen Vectors\" data-id=\"8521\" width=\"622\" data-init-width=\"998\" height=\"130\" data-init-height=\"208\" title=\"Eigen Vectors\" loading=\"lazy\" src=\"https:\/\/i1.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/15-Eigen-Vectors.png?resize=622%2C130&amp;ssl=1\" data-width=\"622\" data-height=\"130\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6dfc15d\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/16-Eigen-Values.png?resize=622%2C72&amp;ssl=1\" class=\"tve_image wp-image-8524\" alt=\"Eigen Values\" data-id=\"8524\" width=\"622\" data-init-width=\"932\" height=\"72\" data-init-height=\"108\" title=\"Eigen Values\" loading=\"lazy\" data-width=\"622\" data-height=\"72\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8524\" alt=\"Eigen Values\" data-id=\"8524\" width=\"622\" data-init-width=\"932\" height=\"72\" data-init-height=\"108\" title=\"Eigen Values\" loading=\"lazy\" src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/16-Eigen-Values.png?resize=622%2C72&amp;ssl=1\" 
data-width=\"622\" data-height=\"72\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6e093dc\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/17-PCA-Sorted-Components.png?resize=614%2C136&amp;ssl=1\" class=\"tve_image wp-image-8527\" alt=\"PCA Sorted Components\" data-id=\"8527\" width=\"614\" data-init-width=\"614\" height=\"136\" data-init-height=\"136\" title=\"PCA Sorted Components\" loading=\"lazy\" data-width=\"614\" data-height=\"136\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8527\" alt=\"PCA Sorted Components\" data-id=\"8527\" width=\"614\" data-init-width=\"614\" height=\"136\" data-init-height=\"136\" title=\"PCA Sorted Components\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/17-PCA-Sorted-Components.png?resize=614%2C136&amp;ssl=1\" data-width=\"614\" data-height=\"136\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element\">\n<p>Plotting <strong>PCA<\/strong> with several components;<\/p>\n<\/div>\n<div class=\"thrv_wrapper tve_image_caption\" data-css=\"tve-u-176d6e9c2ad\"><span class=\"tve_image_frame\"><img src=\"https:\/\/i2.wp.com\/dataaspirant.com\/wp-content\/plugins\/lazy-load\/images\/1x1.trans.gif?ssl=1\" data-lazy-src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/18-PCA-output.png?resize=486%2C370&amp;ssl=1\" class=\"tve_image wp-image-8531\" alt=\"PCA output\" data-id=\"8531\" width=\"486\" data-init-width=\"892\" height=\"370\" data-init-height=\"680\" title=\"PCA output\" loading=\"lazy\" data-width=\"486\" data-height=\"370\" data-css=\"tve-u-176d8962428\" data-recalc-dims=\"1\"><img class=\"tve_image wp-image-8531\" alt=\"PCA output\" data-id=\"8531\" width=\"486\" 
data-init-width=\"892\" height=\"370\" data-init-height=\"680\" title=\"PCA output\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/dataaspirant.com\/wp-content\/uploads\/2021\/01\/18-PCA-output.png?resize=486%2C370&amp;ssl=1\" data-width=\"486\" data-height=\"370\" data-css=\"tve-u-176d8962428\" data-recalc-dims=\"1\"><\/span><\/div>\n<div class=\"thrv_wrapper thrv_text_element tve-froala fr-box fr-basic\">\n<h2 class=\"\" id=\"t-1609930641812\">Conclusion<\/h2>\n<p dir=\"ltr\">We know that massive datasets are increasingly widespread in all sorts of disciplines. Therefore, to interpret such datasets, the <strong>dimensionality is decreased<\/strong> so that the highly related data can be preserved.<\/p>\n<p dir=\"ltr\">PCA solves the issue of <strong>eigenvectors and eigenvalues<\/strong>. We make use of PCA to remove collinearity during the training phase of <a href=\"https:\/\/dataaspirant.com\/neural-network-basics\/\" target=\"_blank\" class=\"tve-froala\" rel=\"noopener\"><strong>neural networks<\/strong><\/a> and <strong><a href=\"https:\/\/dataaspirant.com\/linear-regression-implementation-in-python\/\" target=\"_blank\" class=\"tve-froala\" rel=\"noopener\">linear regression<\/a><\/strong>.\u00a0<\/p>\n<p dir=\"ltr\">Furthermore, we can use PCA to avoid <a href=\"https:\/\/dataaspirant.com\/assumptions-of-linear-regression-algorithm\/\" target=\"_blank\" rel=\"noopener\"><strong>multicollinearity<\/strong><\/a> and to decrease the number of variables.\u00a0<\/p>\n<p dir=\"ltr\">PCA can be termed as a linear combination of the p features, and taking these linear combinations of the measurements under consideration is mandatory.\u00a0<\/p>\n<p dir=\"ltr\">So that the number of plots necessary for visual analysis can be reduced while retaining most of the information present in the data. 
In machine learning, feature reduction is an essential preprocessing step.\u00a0<\/p>\n<p dir=\"ltr\">Therefore, PCA is an effective step of preprocessing for compression and noise removal in the data. It finds a new set of variables smaller than the original set of variables and thus reduces a dataset\u2019s dimensionality.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/dataaspirant.com\/principal-component-analysis-pca\/<\/p>\n","protected":false},"author":0,"featured_media":8051,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8050"}],"collection":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/comments?post=8050"}],"version-history":[{"count":0,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8050\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media\/8051"}],"wp:attachment":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media?parent=8050"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/categories?post=8050"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/tags?post=8050"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}