Dimensionality Reduction | Towards AI

Machine Learning: Dimensionality Reduction via Principal Component Analysis

In machine learning, a predictive model is built from a dataset containing features (predictors) together with either discrete class labels (for a classification problem, such as logistic regression) or continuous outcomes (for a regression problem, such as linear regression). The trained model is then used to make predictions on unseen data. The predictive power of a model depends greatly on the quality and size of the training dataset.

Generally, the larger the dataset the better. However, there is always a tradeoff between the size of the dataset and the computational time needed for training. Some very large datasets contain many redundant or unimportant features, so dimensionality reduction techniques can be used to keep only a limited number of relevant features for training.

Principal Component Analysis (PCA) is a statistical method used for feature extraction. It is well suited to high-dimensional data whose features are correlated. The basic idea of PCA is to transform the original space of features into the space of principal components, as shown below:

**Figure 1: The PCA algorithm transforms the old feature space into a new one so as to remove feature correlation. Picture adapted from “Python Machine Learning” by Sebastian Raschka.**
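The transformation can also be written compactly. The following is a sketch under the usual textbook conventions; the symbols $X$, $\Sigma$, $W$, $\Lambda$ and $Z$ are introduced here for illustration and do not appear in the original article. Let $X$ be the $n \times d$ data matrix with each feature (column) centered to zero mean:

$$
\Sigma = \frac{1}{n-1} X^\top X, \qquad \Sigma W = W \Lambda, \qquad Z = X W
$$

Here the columns of $W$ are orthonormal eigenvectors of the covariance matrix $\Sigma$, and $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_d)$ holds the eigenvalues, ordered so that $\lambda_1 \ge \dots \ge \lambda_d \ge 0$. The covariance of the transformed data is then

$$
\operatorname{Cov}(Z) = \frac{1}{n-1} Z^\top Z = W^\top \Sigma W = \Lambda
$$

which is diagonal: the principal components are uncorrelated, and the variance of the $j$-th component is $\lambda_j$.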

A PCA transformation achieves the following:

a) It reduces the number of features used in the final model by keeping only the components that account for the majority of the variance in the dataset.

b) It removes the correlation between features.
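The two effects listed above can be sketched in a few lines. The article's own implementation is in R; the block below is an independent Python/NumPy illustration, not the author's code. The function name `pca` and the synthetic data are assumptions made for this example, and the data is a stand-in, not the iris dataset.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x features) onto its top principal components.

    Hypothetical helper for illustration; returns the projected data and
    all eigenvalues of the covariance matrix in descending order.
    """
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :n_components]            # keep the leading directions
    return Xc @ W, eigvals

# Synthetic stand-in data (NOT iris): 4 features driven by 2 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(150, 2))
mixing = np.array([[1.0, 0.8, 0.1, 0.0],
                   [0.0, 0.3, 1.0, 0.9]])
X = latent @ mixing + 0.05 * rng.normal(size=(150, 4))

Z, eigvals = pca(X, n_components=2)
print(Z.shape)                    # (150, 2): four features reduced to two
print(eigvals / eigvals.sum())    # share of variance per component
```

Because the synthetic features are built from only two latent factors, almost all of the variance should land in the first two components, which is the situation in which dropping the remaining ones costs little.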

How does PCA work?

To illustrate how PCA works, we show an example by examining the iris dataset.

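The code can be found on [GitHub](https://github.com/bot13956/principal_component_analysis_iris_dataset/blob/master/PCA_irisdataset.R).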

Let us look at the covariance matrix:

**This table shows strong correlations between original features in the Iris dataset.**

**This figure is a matrix plot that shows scatter plots, density plots, and correlation coefficients between original features. Notice the strong correlations between original features.**

Let us now examine the covariance matrix of the transformed features:

**This table shows zero correlations between transformed features.**

**This figure is a matrix plot that shows scatter plots, density plots, and correlation coefficients between principal components. We see that the correlation between features has been removed.**
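The before-and-after comparison can be reproduced numerically. The sketch below is in Python/NumPy rather than the article's R and uses synthetic, deliberately correlated data as a stand-in for iris. The helper `max_off_diagonal` is a hypothetical name introduced for this example.

```python
import numpy as np

# Synthetic stand-in data (NOT iris): three noisy copies of one signal,
# so the original features are strongly correlated by construction
rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)])

corr_before = np.corrcoef(X, rowvar=False)

# Full PCA transform: project centered data onto all eigenvectors
Xc = X - X.mean(axis=0)
_, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
Z = Xc @ eigvecs

corr_after = np.corrcoef(Z, rowvar=False)

def max_off_diagonal(m):
    """Largest absolute entry of m outside the main diagonal."""
    return np.abs(m - np.diag(np.diag(m))).max()

print(max_off_diagonal(corr_before))  # close to 1: strong correlation
print(max_off_diagonal(corr_after))   # zero up to floating-point error
```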

Here is a summary of helpful indicators from a PCA calculation:

**Summary of importance of components.**

Based on this summary, we see that the first three principal components together account for 99.5 percent of the variance. The fourth principal component, PC4, can therefore be dropped from the final model, since it contributes only the remaining 0.5 percent.
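The same decision can be made programmatically from the cumulative proportion of variance. The following Python/NumPy sketch is not the article's R code. The function `components_needed`, the 95 percent threshold, and the eigenvalues are all hypothetical values chosen for illustration, not the iris results reported above.

```python
import numpy as np

def components_needed(eigvals, threshold):
    """Smallest k such that the top-k components explain >= threshold
    of the total variance. Also returns the cumulative proportions."""
    ratios = np.sort(eigvals)[::-1] / np.sum(eigvals)
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(eigvals)), cumulative

# Hypothetical eigenvalues (NOT the article's iris output)
eigvals = np.array([6.0, 3.0, 0.9, 0.1])

k, cumulative = components_needed(eigvals, threshold=0.95)
print(cumulative)  # cumulative proportion of variance per component
print(k)           # 3
```

With these made-up eigenvalues the first two components explain about 90 percent of the variance and the first three about 99 percent, so a 95 percent threshold keeps three components.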

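In summary, we have shown how the PCA algorithm can be implemented in R, using the iris dataset for illustrative purposes. You can download the code from [GitHub](https://github.com/bot13956/principal_component_analysis_iris_dataset/blob/master/PCA_irisdataset.R).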

Thanks for reading.