In short, Factor Analysis summarizes your large data-set so that relationships and patterns can be easily interpreted and understood.

Five steps to Factor Analysis:

Correlation Matrix

Factor Extraction

How to decide the number of factors?





Factor Loadings

Subjects: Algebra, Chemistry, Geometry, Physics, Game theory, Number theory, Set Theory, Probability, Biology

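| Subjects      | Factor-1 | Factor-2 |
|---------------|----------|----------|
| Algebra       | 0.788    | 0.542    |
| Chemistry     | 0.368    | 0.912    |
| Geometry      | 0.729    | 0.367    |
| Physics       | 0.541    | 0.875    |
| Game theory   | 0.891    | 0.333    |
| Number theory | 0.795    | 0.412    |
| Set Theory    | 0.832    | 0.390    |
| Probability   | 0.955    | 0.324    |
| Biology       | 0.289    | 0.816    |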

- Chemistry, Physics and Biology have high Factor Loadings in Factor-2.

- The items of Factor-1 share a common latent relationship and can be labelled 'Mathematics'; similarly, Factor-2 can be labelled 'Science'.
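The grouping described above can be reproduced with a few lines of Python. This is only a minimal sketch: the loadings are copied from the table, while the helper name `group_by_dominant_factor` and the rule "assign each subject to the factor with the largest absolute loading" are illustrative assumptions, not a formal part of Factor Analysis.

```python
# Loadings copied from the table above: (Factor-1, Factor-2) per subject.
loadings = {
    "Algebra": (0.788, 0.542),
    "Chemistry": (0.368, 0.912),
    "Geometry": (0.729, 0.367),
    "Physics": (0.541, 0.875),
    "Game theory": (0.891, 0.333),
    "Number theory": (0.795, 0.412),
    "Set Theory": (0.832, 0.390),
    "Probability": (0.955, 0.324),
    "Biology": (0.289, 0.816),
}


def group_by_dominant_factor(loadings):
    """Assign each variable to the factor on which its absolute loading is largest.

    Hypothetical helper, used here only to illustrate how the table is read.
    """
    groups = {}
    for name, values in loadings.items():
        best = max(range(len(values)), key=lambda i: abs(values[i]))
        groups.setdefault(f"Factor-{best + 1}", []).append(name)
    return groups


groups = group_by_dominant_factor(loadings)
for factor, subjects in sorted(groups.items()):
    print(factor, "->", ", ".join(subjects))
```

With these numbers, the mathematics subjects fall under Factor-1 and Chemistry, Physics and Biology fall under Factor-2, which matches the 'Mathematics' and 'Science' labels.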



Factor Rotation

- Once the Initial Factor Loadings have been calculated, the factors are rotated.

- Rotation is the process of adjusting the factor axes in order to achieve a simpler and more interpretable factor solution.

- Rotation creates a simpler factor structure and makes the factors more clearly distinguishable.

- Orthogonal Rotation - It assumes that factors are not correlated.

- Oblique Rotation - Unlike Orthogonal Rotation, it allows for factor correlation.
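To make the idea of an orthogonal rotation more concrete, here is a rough sketch of varimax, one common orthogonal rotation. The article does not say which rotation was used, so varimax is a swapped-in example; the function name `varimax`, its parameters, and the use of the loading table as plain numeric input are all assumptions made for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Return (rotated_loadings, rotation_matrix) using the varimax criterion.

    Illustrative SVD-based iteration, not a reference implementation.
    """
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    R = np.eye(k)   # start from the unrotated axes
    d = 0.0
    for _ in range(max_iter):
        B = A @ R
        # Gradient of the varimax criterion with respect to the rotation.
        G = A.T @ (B**3 - B @ np.diag(np.sum(B**2, axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        R = u @ vt              # nearest orthogonal matrix to the gradient
        d_new = np.sum(s)
        if d_new < d * (1 + tol):
            break               # criterion stopped improving
        d = d_new
    return A @ R, R


# Loadings from the table in the Factor Loadings section, used only as input.
L = np.array([
    [0.788, 0.542], [0.368, 0.912], [0.729, 0.367],
    [0.541, 0.875], [0.891, 0.333], [0.795, 0.412],
    [0.832, 0.390], [0.955, 0.324], [0.289, 0.816],
])
rotated, R = varimax(L)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's sum of squared loadings stays the same; only the way it is split across the factors changes. An oblique rotation would drop that orthogonality constraint.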



Factor Scores

- Factor Scores are the estimated values of the factors.

- They are used to prioritize and rank the factors.

- With the help of Factor Scores, you can easily decide which factors are more important and which ones you need to focus on.

- In most cases, you look for Factor Scores with an absolute value (positive or negative) of at least 0.7.

- Initially, the obtained Factor Scores can be low, but after some iterations they can reach a high value.
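One common way to estimate Factor Scores is the regression method, which multiplies the standardized data by the inverse correlation matrix and the loadings. The sketch below assumes that method. It also assumes a principal-component style extraction (which is not one of the extraction methods listed in this article) purely to obtain some loadings, and it runs on made-up data, so every name and number in it is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500

# Hypothetical data: two latent factors driving six observed variables.
true_loadings = np.array([
    [0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
    [0.1, 0.8], [0.0, 0.9], [0.2, 0.7],
])
latent = rng.standard_normal((n_obs, 2))
noise = 0.4 * rng.standard_normal((n_obs, 6))
X = latent @ true_loadings.T + noise

# Standardize the variables and build their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(Z, rowvar=False)

# Assumed extraction: keep the two largest eigenpairs of the correlation matrix.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Regression-method scores: Z @ inv(corr) @ loadings, solved without an explicit inverse.
weights = np.linalg.solve(corr, loadings)
scores = Z @ weights

print(scores.shape)   # one row per observation, one column per factor
```

Each row of `scores` is one observation's estimated position on the two factors.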



Deciding questions before using Factor Analysis

- Are there any outliers in the data? Factor Analysis assumes that there are none.

- Is there any multi-collinearity between the variables? Factor Analysis requires that there is no perfect multi-collinearity between the variables.

- What is the minimum number of factors that can explain the variation in the data-set?

- How well do these factors describe all the data?
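The first two questions can be checked quickly before running Factor Analysis. The sketch below is one possible way to do it, under stated assumptions: outliers are flagged with a simple z-score rule (threshold 3), and perfect multi-collinearity is detected when the correlation matrix is numerically singular. The helper name `precheck`, both thresholds and the demo data are hypothetical.

```python
import numpy as np


def precheck(X, z_threshold=3.0, eig_tol=1e-8):
    """Flag rows with extreme values and test for perfect multi-collinearity."""
    X = np.asarray(X, dtype=float)
    z = (X - X.mean(axis=0)) / X.std(axis=0)
    outlier_rows = np.where((np.abs(z) > z_threshold).any(axis=1))[0]

    corr = np.corrcoef(X, rowvar=False)
    # A (near-)zero eigenvalue means one variable is a linear combination of others.
    smallest_eig = np.linalg.eigvalsh(corr).min()
    return {
        "outlier_rows": outlier_rows.tolist(),
        "perfect_multicollinearity": bool(smallest_eig < eig_tol),
    }


# Made-up data: the fourth column is an exact sum of the first two,
# and one value is pushed far outside the normal range.
rng = np.random.default_rng(1)
base = rng.standard_normal((200, 3))
X = np.column_stack([base, base[:, 0] + base[:, 1]])
X[5, 2] = 25.0

report = precheck(X)
print(report)
```

If either check fires, it is usually worth cleaning the data or dropping the redundant variable before extracting factors.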







Let's say your data-set contains 200 variables. Can you imagine how cumbersome it would be to analyse the data-set using all 200 variables? Using Factor Analysis, you can reduce a large number of variables to a smaller set of variables (factors) that is capable of explaining the observed variance in the larger set.

The five steps are:

- Create a correlation matrix for all the variables

- Factor Extraction

- Calculate Initial Factor Loadings

- Factor Rotation

- Calculation of Factor Scores

Correlation Matrix:

- It searches for variables that are strongly correlated with each other.

- If the correlations between variables are relatively small, it is very unlikely that they share a common factor.

Factor Extraction:

- It aims to extract factors that account for as much variation in the observed variables as possible.

- The main purpose of Factor Analysis is to identify combinations of variables; those combinations are called factors.

- Different Factor Extraction methods:

-- Maximum Likelihood

-- Principal Axis Factoring

-- Unweighted Least Squares

-- Generalized Least Squares

-- Image Factoring

How to decide the number of factors:

- Look at the Factor Correlation: if the correlation between two factors is too high (> 0.7), the factors are likely to be very similar, and in that case you merge the two related factors.

- Easily explainable? Are you able to easily interpret and explain the items associated with each factor?

- The more items a factor contains, the more likely it is to be considered for further analysis.

Factor Loadings:

- A Factor Loading represents the correlation between the factor and the variable.

- It tells you how much a factor explains a variable.

- Factor Loadings close to:

-- -1 or 1 => indicate that the factor strongly influences the variable

-- 0 => indicate that the factor has a weak influence on the variable

- For example, let's say we have nine variables, i.e. the nine subjects listed under Factor Loadings above.

- Algebra, Geometry, Game theory, Number theory, Set Theory and Probability have high Factor Loadings in Factor-1.
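The correlation-matrix step can be tried out on a toy data-set. The sketch below builds the correlation matrix with NumPy and lists the pairs of variables that are strongly correlated, which are the candidates for sharing a common factor. The data, the variable names and the 0.7 cut-off for "strongly correlated" are assumptions chosen for illustration.

```python
import numpy as np


def strongly_correlated_pairs(X, names, threshold=0.7):
    """Return the correlation matrix and the variable pairs with |r| >= threshold."""
    corr = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], round(float(corr[i, j]), 3)))
    return corr, pairs


# Made-up data: a1 and a2 depend on one latent factor, b1 and b2 on another.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
noise = 0.3 * rng.standard_normal((500, 4))
X = np.column_stack([
    0.9 * factors[:, 0], 0.9 * factors[:, 0],
    0.9 * factors[:, 1], 0.9 * factors[:, 1],
]) + noise
names = ["a1", "a2", "b1", "b2"]

corr, pairs = strongly_correlated_pairs(X, names)
print(np.round(corr, 2))
print(pairs)
```

Here only the pairs that share a latent factor pass the cut-off, while the cross-pairs have correlations close to zero and are unlikely to share a factor.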