Validating Principal Components Analysis

From Survey Analysis

We can classify principal components analysis solutions as follows:

  • A natural factor (i.e., component) is one that is believed to be a valid estimate of some real physical or psychological phenomenon. The variables that are highly correlated with this factor can be viewed as measurements of it, with differences between the variables reflecting noise.
  • An arbitrary factor is one that provides a useful way of looking at the world. That is, the factor is a helpful summary of the data.
  • A spurious factor is one where the correlations among the individual variables are attributable to some other cause that is not evident when the factors are named. A common example in marketing-related surveys is where factor analysis identifies a price-quality dimension that has high positive (negative) loadings for quality and high negative (positive) loadings for price. A conclusion to draw from such an analysis may be that consumers see price and quality as being linked. However, it is possible that all of the brands in the market with high prices simply have low quality, and vice versa. Another common spurious factor occurs in survey data, where factors representing response styles are identified but misinterpreted as representing involvement.

Depending upon the purpose of the principal components analysis, validation involves determining that the factors are either natural or arbitrary.



Theory is the prime means of validating factors. The table below shows tidied rotated factor loadings for data collected for a business-to-business relationship marketing study in the US.[1] The seven variables measure different aspects of the relationship between car tire dealers and retailers, as perceived by the retailers. A preliminary analysis suggests that the first factor relates to how much the tire dealer likes the distributor (NB: the negative loading of -.8 for “has behaved opportunistically” means that the higher a respondent’s value on this variable, the lower their score on the first factor, so the factor can be thought of as being positively correlated with “has not behaved opportunistically”). The second factor is essentially relationship costs.

Variables Factor 1 Factor 2
Relationship termination costs .9
Relationship benefits .6
Shared values .6 .4
Quality of communications .7 .3
Has behaved opportunistically -.8
Commitment to stay in the relationship .7 .5
Trustworthiness of relationship partner .9

Although the descriptions of the two factors (liking and relationship costs) are summaries of the loadings, if we think about the variables in the analysis it seems likely that the factors are spurious. It seems inevitable that some of the variables in the analysis are causally related to some of the other variables. For example, where there are high relationship termination costs, a firm is likely to be more committed to the relationship. Similarly, the more a firm trusts another firm, the more committed to the relationship we would expect it to be.[2] Consequently, it is inappropriate to assume that natural factors underlie the variables (and, thus, if the objective was to identify natural factors, this analysis is shown not to be valid).

Strength of relationship with other variables

All else being equal, components that are more strongly correlated with variables excluded from the principal components analysis are more likely to be valid.
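
One way to make this check concrete is to correlate each component's scores with a variable that was left out of the analysis. The sketch below is illustrative only: the component scores and the excluded variable are made-up numbers rather than data from the study cited above, and `pearson` is a small helper written for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical component scores for five respondents (not real data).
component_1 = [-1.2, -0.5, 0.1, 0.4, 1.2]
component_2 = [0.3, -0.9, 1.1, -0.6, 0.1]

# Hypothetical variable that was NOT included in the analysis.
excluded_variable = [1, 2, 3, 3, 5]

r1 = pearson(component_1, excluded_variable)
r2 = pearson(component_2, excluded_variable)

print(f"Component 1 vs excluded variable: r = {r1:.2f}")
print(f"Component 2 vs excluded variable: r = {r2:.2f}")
```

In this made-up data the first component tracks the excluded variable much more closely than the second, so, all else being equal, it would be regarded as the more plausible of the two.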


While there is no generally agreed rule about how parsimony should be taken into account when evaluating principal components analysis, one proposal is that a good solution is one where two-thirds of the variance is explained by one-third or less of the number of possible factors.[3]
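
The two-thirds/one-third proposal can be checked from the eigenvalues of a solution. The sketch below is one possible reading of the rule, not a standard routine: `parsimony_check` is a hypothetical helper, and both sets of eigenvalues are invented for a seven-variable analysis and are not taken from the study cited above.

```python
def parsimony_check(eigenvalues):
    """Apply the two-thirds / one-third rule of thumb.

    Returns (components_allowed, share_explained, passes), where
    components_allowed is one-third of the possible components
    (rounded down) and share_explained is the proportion of total
    variance explained by that many of the largest components.
    """
    ordered = sorted(eigenvalues, reverse=True)
    components_allowed = len(ordered) // 3
    share_explained = sum(ordered[:components_allowed]) / sum(ordered)
    return components_allowed, share_explained, share_explained >= 2 / 3


# Hypothetical eigenvalues from a correlation matrix of seven variables
# (they sum to 7, the number of variables).
concentrated = [3.4, 1.5, 0.7, 0.5, 0.4, 0.3, 0.2]
diffuse = [2.0, 1.5, 1.2, 0.9, 0.7, 0.4, 0.3]

for label, values in [("concentrated", concentrated), ("diffuse", diffuse)]:
    allowed, share, passes = parsimony_check(values)
    print(f"{label}: {allowed} components explain {share:.0%}; passes = {passes}")
```

With seven variables, one-third or less of the possible factors means at most two components. In the first invented solution those two explain 70% of the variance and so satisfy the proposal; in the second they explain only 50% and do not.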


  1. Morgan, R. M. and S. D. Hunt (1994). "The Commitment-Trust Theory of Relationship Marketing." Journal of Marketing 58(July): 20-38.
  2. Morgan, R. M. and S. D. Hunt (1994). "The Commitment-Trust Theory of Relationship Marketing." Journal of Marketing 58(July): 20-38.
  3. Lehmann, D. R. (1989). Market Research and Analysis. Homewood, Illinois, Irwin.

