PCA: The Principal Component of Machine Learning

Harsh Mishra
5 min read · Dec 4, 2021

Hello everyone, in this blog we will learn about PCA (Principal Component Analysis) in detail. First we will build a strong geometric intuition for it, then we will see how to apply it step by step, and finally we will walk through a Python implementation.

All the code written for this blog is on my GitHub: https://github.com/HarshMishra2002/pca

First of all, PCA is a feature extraction technique. It reduces the dimensionality of your data and rescues you from the curse of dimensionality.

According to Wikipedia:

PCA was invented in 1901 by Karl Pearson, as an analogue of the principal axis theorem in mechanics; it was later independently developed and named by Harold Hotelling in the 1930s.

This shows how old the technique actually is, yet even today it is the first-choice option for feature extraction, especially when you are dealing with very high dimensional data.

GEOMETRIC INTUITION:

First we need to understand how feature selection works. It depends entirely on the variance of the data. For example:

If we look at the graph above, we can clearly see that d >> d1, i.e. the variance of the data along the x-axis is much greater than along the y-axis, so we keep the feature on the x-axis and drop the one on the y-axis. But this feature selection technique fails when the data on both axes has similar variance.
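The idea above can be sketched in a few lines of NumPy. This is a minimal illustration with hypothetical synthetic data (not the blog's dataset): one feature is spread out, the other barely varies, so variance-based selection keeps the first.

```python
import numpy as np

# Toy 2D dataset: the x feature has a much wider spread than the y feature
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.normal(0, 5.0, size=100),   # x-axis feature: standard deviation 5
    rng.normal(0, 0.5, size=100),   # y-axis feature: standard deviation 0.5
])

variances = X.var(axis=0)
keep = int(np.argmax(variances))

print(variances)  # the first entry is far larger than the second
print(keep)       # 0 -> keep the x-axis feature, drop the y-axis one
```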

For example:

Now in this case d = d1, so the variance along each axis is the same. It is not possible to use the above technique to select one feature and reject the other. This is where PCA comes in.

What we do is rotate the original axes so that we achieve maximum variance along one axis. These new axes are called principal components. We then project the original data onto the principal components to get new data points, and these form our new dataset, which retains the critical information of the original data. In PCA we create new features instead of simply keeping some and dropping others.

Here you can see that after rotating the original axes we get new axes, Principal Axis 1 and Principal Axis 2, and for the new data points (obtained by projecting the old ones onto the principal components) the variances now differ: d_new >> d1_new.
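The rotation trick can be demonstrated numerically. In this sketch (synthetic data, hypothetical 45-degree rotation chosen to match the diagonal data cloud) the variance is split evenly across the original axes, but after rotating the axes almost all of it lands on the first new axis:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2D data lying along the diagonal: both original axes
# carry roughly the same variance (the d = d1 situation)
t = rng.normal(0, 3.0, size=200)
X = np.column_stack([t + rng.normal(0, 0.5, 200),
                     t + rng.normal(0, 0.5, 200)])
X = X - X.mean(axis=0)  # mean-center

# Rotate the axes by 45 degrees: the new first axis points along the cloud
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X @ R

print(X.var(axis=0))      # original axes: similar variance on both
print(X_rot.var(axis=0))  # rotated axes: d_new >> d1_new
```

PCA finds this rotation automatically instead of us guessing the angle, which is exactly what the eigenvector steps below compute.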

Now that we have the geometric intuition, let's see how we actually find these principal components using Python.

Step By Step Solution with Python Implementation:

Step 0: Getting the data

Here I have 3D data, i.e. 3 input features.
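Any numeric table with three columns will do for following along. Here is a hypothetical stand-in dataset (the blog's actual data lives in the GitHub repo linked above); `feature2` is deliberately correlated with `feature1` so that PCA has structure to find:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 150
f1 = rng.normal(10, 2.0, n)
f2 = 0.8 * f1 + rng.normal(0, 1.0, n)   # correlated with feature1
f3 = rng.normal(5, 0.5, n)

# 3D data: 150 samples, 3 input features
df = pd.DataFrame({"feature1": f1, "feature2": f2, "feature3": f3})
print(df.shape)  # (150, 3)
```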

Step 1: Scaling data (Mean Centering)

We scale the data, i.e. we bring all the columns into a similar range of values. Here we will use the StandardScaler class from scikit-learn, which also mean-centers each column.
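A minimal scaling sketch, using a small made-up array in place of the real dataset. After `fit_transform`, every column has mean 0 and unit variance:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data: columns live on very different scales
X = np.array([[10.0, 200.0, 1.0],
              [12.0, 180.0, 3.0],
              [11.0, 220.0, 2.0],
              [13.0, 190.0, 4.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Each column is now mean-centered with unit variance
print(X_scaled.mean(axis=0).round(6))  # [0. 0. 0.]
print(X_scaled.std(axis=0).round(6))   # [1. 1. 1.]
```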

Step 2: Finding the covariance matrix

The drawback of variance is that it only describes the spread of a single column along a single axis. When it comes to finding the relationship between two variables, we use something called covariance.

A covariance matrix is a square matrix giving the covariance between each pair of elements of a given random vector.
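With NumPy this is one call. A sketch on standardized synthetic data (standing in for the scaled data from Step 1); note `rowvar=False` tells `np.cov` that the columns, not the rows, are the variables:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized data from Step 1

# rowvar=False: each column is a feature, each row a sample
cov = np.cov(X, rowvar=False)

print(cov.shape)                 # (3, 3) -- one entry per feature pair
print(np.allclose(cov, cov.T))   # True: covariance matrices are symmetric
```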

Step 3: Finding Eigenvectors and Eigenvalues

Eigenvectors are those vectors whose direction does not change when we apply a transformation to them; only their magnitude changes.

Eigenvalues measure how much an eigenvector is stretched or shrunk by the transformation.

The transformation being discussed here is the one represented by the covariance matrix. Basically, we find the eigenvectors and eigenvalues of the covariance matrix we computed above.

Each eigenvector has one eigenvalue; the eigenvector with the highest eigenvalue is our PC 1 (first principal component), the next highest gives PC 2, and so on.
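A sketch of this step on synthetic correlated data. `np.linalg.eigh` is the appropriate routine because a covariance matrix is symmetric; it returns eigenvalues in ascending order, so we re-sort them descending to get PC 1 first:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
X[:, 1] += 2 * X[:, 0]           # make two features correlated
X = X - X.mean(axis=0)           # mean-center

cov = np.cov(X, rowvar=False)

# eigh is meant for symmetric matrices such as a covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort descending so column 0 of eigvecs is PC 1, column 1 is PC 2, ...
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals)  # largest first: PC 1 captures the most variance
```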

Step 4: Selecting the PCs and transforming the old data accordingly

Here I will choose the first two PCs, i.e. I am converting my 3D data into 2D data. Then we transform the old data by projecting it onto the new axes (the PCs).
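Putting the steps together: project the mean-centered data onto the top two eigenvectors to get the 2D dataset. This sketch uses synthetic data and cross-checks the manual result against scikit-learn's `PCA` (the two can differ only in the sign of each component):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
X[:, 2] += X[:, 0] + X[:, 1]          # correlated third feature
Xc = X - X.mean(axis=0)               # mean-center

# Manual route: top-2 eigenvectors of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:2]]             # 3x2 projection matrix (PC 1, PC 2)
X_2d = Xc @ W                         # the transformed 2D dataset

# Cross-check against scikit-learn (component signs may be flipped)
X_sk = PCA(n_components=2).fit_transform(X)

print(X_2d.shape)                                   # (100, 2)
print(np.allclose(np.abs(X_2d), np.abs(X_sk)))      # True, up to sign flips
```

In practice you would reach straight for `sklearn.decomposition.PCA`, but walking through the eigendecomposition once makes it clear what that class is doing under the hood.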

And that brings us to the end of our very first, basic PCA implementation. That is it for today, guys.

I hope you learned something new and enjoyed this blog. If you liked it, share it with your friends. Take care and keep learning.

You could also reach me through my Linkedin account- https://www.linkedin.com/in/harsh-mishra-4b79031b3/

