I am not an expert, but even to a novice these four statements would sound sarcastic. Let’s take an example.

Suppose you are working on a dataset that has 800 features and 1,000 records. You find out that you need to reduce the features because most of them are redundant (assuming someone told you so), but keep in mind that you skipped all the mathematics and statistics, so you know neither. Since you only skimmed data cleaning, you are not even sure how to clean the data. So how will you approach this situation?

OK, so you google “Reduce features in dataset”, or somewhere along your machine learning journey you find out that there is something called PCA that does this job for you.

So the next step is to google “PCA sklearn”, check the documentation, and apply the following:

from sklearn.decomposition import PCA

X = PCA(0.99).fit_transform(X)

Great, now your feature set is reduced to 60 and your training results look good. Now my question to you is:
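To make the scenario concrete, here is a minimal sketch with synthetic data (the latent-signal setup and all numbers are illustrative, not from the original dataset). Passing a float below 1 to PCA tells scikit-learn to keep just enough components to explain that fraction of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the scenario above: 1,000 records, 800 features,
# most of them redundant (linear mixes of a few latent signals).
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 50))           # 50 "true" underlying signals
mixing = rng.normal(size=(50, 800))            # spread across 800 columns
X = latent @ mixing + 0.01 * rng.normal(size=(1000, 800))

pca = PCA(0.99)                                # keep 99% of the variance
X_reduced = pca.fit_transform(X)
print(pca.n_components_, X_reduced.shape)      # far fewer than 800 columns
```

Because the 800 columns are built from only 50 latent signals plus a little noise, PCA needs only a few dozen components to reach 99% of the variance.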

Do you want to be the type of developer who just gets things done? Do you want to be a developer who only knows how to do something, rather than understanding how it works behind the scenes?

If you answered ‘yes’ to both questions, then you are a perfect fit for companies that just want to get things done rather than master them, and ‘Software Engineer’ is a good post for you.

If you answered ‘no’ to both questions, then you are a perfect fit for companies that are expanding their research divisions at the moment, and ‘Research Engineer’ is a good post for you.

In case you want to approach the PCA problem from a research point of view, you would use exactly the same code but with different thinking behind the scenes. You would visualise the relationships between features, combine features into new ones, calculate the correlations, compute the eigenvalues and eigenvectors, and finally select the principal components that contribute 99% of the variance.
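The research-minded steps above can be sketched directly in NumPy. This is a minimal from-scratch version of the same idea on synthetic data (the dataset and its dimensions are illustrative assumptions): centre the data, build the covariance matrix, take its eigendecomposition, and keep the components whose cumulative variance reaches 99%:

```python
import numpy as np

# Illustrative redundant data: 40 features built from 10 underlying signals.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 40))

# 1. Centre the data (PCA works on deviations from the mean).
Xc = X - X.mean(axis=0)

# 2. Covariance matrix of the features.
cov = np.cov(Xc, rowvar=False)

# 3. Eigenvalues and eigenvectors of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)         # returned in ascending order
order = np.argsort(eigvals)[::-1]              # re-sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4. Keep enough components to explain 99% of the variance.
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.99)) + 1

# 5. Project the data onto those components.
X_reduced = Xc @ eigvecs[:, :k]
print(k, X_reduced.shape)
```

This is what `PCA(0.99).fit_transform(X)` does internally (scikit-learn uses an SVD rather than an explicit covariance matrix, but the selected subspace is the same). Here the data has rank 10 by construction, so at most 10 components are needed.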