According to DeVellis (2000), confirmatory factor analysis (CFA) is a rigorous approach to assessing scale dimensionality. Although CFA is similar to EFA in some respects, there are fundamental differences between the two approaches (Hurley et al., 1997). In EFA, the statistical method determines the number of factors, whereas CFA allows a researcher to see how well the proposed structure (the number of latent variables and the items assigned to each of them) matches the actual data (Gorsuch, 1997). Thus, EFA allows the determination of the underlying factor structure, while in CFA that structure is specified a priori and then tested (Byrne, 2005). CFA is used as a confirmatory test of measurement theory. ‘A measurement theory specifies how measured variables logically and systematically represent constructs involved in a theoretical model. In other words, measurement theory specifies a series of relationships that suggest how measured variables represent a latent construct that is not measured directly’ (Hair et al., 2006 p. 774).
In order to run CFA, a researcher must ensure that the tested model is not under-identified, that is, that the number of parameters to be estimated does not exceed the number of unique item variances and covariances. It is commonly accepted that a confirmatory factor analysis model is identified if there are at least three observed items for each factor (Bollen, 1989b; Kelloway, 1998). Once the model is correctly specified, CFA provides an empirical test of the relationships among items and constructs represented by the measurement theory (Hair et al., 2006).
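The counting rule behind identification can be illustrated with a minimal sketch. It is not taken from the cited sources. It assumes a simple CFA in which every item loads on exactly one factor, error terms are uncorrelated, all factors are allowed to correlate, and each factor variance is fixed to 1 to set the scale. The function names are hypothetical. Note that the counting rule is a necessary condition only; a model can pass it and still not be identified.

```python
def cfa_degrees_of_freedom(items_per_factor):
    """Return (unique_moments, free_parameters, df) for a simple CFA.

    Illustrative assumptions: each item loads on one factor only, errors
    are uncorrelated, all factors correlate freely, and every factor
    variance is fixed to 1.
    """
    p = sum(items_per_factor)          # number of observed items
    k = len(items_per_factor)          # number of latent factors
    unique_moments = p * (p + 1) // 2  # item variances and covariances
    # p loadings + p error variances + k(k-1)/2 factor covariances
    free_parameters = p + p + k * (k - 1) // 2
    return unique_moments, free_parameters, unique_moments - free_parameters


def identification_status(df):
    """Classify a model by the sign of its degrees of freedom."""
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"


# One factor measured by two, three and four items, then two factors
# with three items each (all hypothetical models).
for model in ([2], [3], [4], [3, 3]):
    moments, params, df = cfa_degrees_of_freedom(model)
    print(model, moments, params, df, identification_status(df))
```

Under these assumptions a single factor with two items has negative degrees of freedom, whereas three items give exactly zero, which is consistent with the three-items-per-factor guideline.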
5.9.3.1 Assessing the Fit
In conducting CFA, the three most commonly used fitting criteria are ordinary least squares (OLS), generalised least squares (GLS) and maximum likelihood (ML) (Diamantopoulos and Siguaw, 2000). ML is the most widely used estimation method (followed by GLS), as it is known to produce consistent and reliable results (Anderson and Gerbing, 1988; Bollen, 1989b; Hu and Bentler, 1998).
A variety of fit indices are available to assess the fit of the proposed model to the data. They rely on two traditions in the assessment of model fit: the assessment of absolute fit and the assessment of comparative (incremental) fit (Bollen and Long, 1993).
Absolute fit indices assess the ability of the model to reproduce the actual covariance matrix, whereas comparative fit indices compare two or more competing models to determine which one provides the better fit to the data (Kelloway, 1998). In assessing absolute fit, the chi-square statistic (χ2) is the most straightforward test of overall model fit (Diamantopoulos and Siguaw, 2000). It shows the discrepancy between the available data and the proposed model. The chi-square test evaluates the null hypothesis that the only reason for the deviation of the estimated variance-covariance matrix from the sample variance-covariance matrix is sampling error (Baumgartner and Homburg, 1996). χ2 tends to increase as the sample size or the number of observed variables increases (Hu and Bentler, 1999). If χ2 is non-
and Heatherton, 1994). However, there are other absolute fit indices that researchers also rely on (see Table 5.12).
Table 5.12: Absolute Fit Indices
Absolute fit index                                 Source
Chi-square                                         Baumgartner and Homburg (1996)
Chi-square/df ratio                                Marsh, Balla, and McDonald (1988)
Goodness-of-Fit Index (GFI)                        Jöreskog and Sörbom (1984)
Adjusted Goodness-of-Fit Index (AGFI)              Jöreskog and Sörbom (1984)
McDonald’s Fit Index (MFI)                         McDonald (1989)
Root Mean Square Residual (RMR)                    Jöreskog and Sörbom (1981)
Root Mean-Square Error of Approximation (RMSEA)    Steiger (1990)
The most commonly used indices are discussed below.
The normed χ2 is the ratio of χ2 to the degrees of freedom (df) (Marsh, Balla, and McDonald, 1988). It takes into account the sensitivity of the chi-square test to sample size and the fact that the test is based on the assumption that the model fits the population perfectly. If the ratio is 3:1 or less, the model is considered to have a good fit with the data (Bentler and Chou, 1987).
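As a brief illustration of the ratio just described, the following sketch computes the normed χ2 and compares it with the 3:1 guideline. The function name and the input values are hypothetical and serve only as an example.

```python
def normed_chi_square(chi_square, df):
    """Return the ratio of the model chi-square to its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df


# Hypothetical model: chi-square of 85.0 on 40 degrees of freedom.
ratio = normed_chi_square(85.0, 40)
meets_guideline = ratio <= 3.0  # the 3:1 guideline cited in the text
print(ratio, meets_guideline)
```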
The goodness-of-fit index (GFI) is based ‘on a ratio of the sum of the square
discrepancies to the observed variances’ (Kelloway, 1998 p. 27). The range of GFI values is from 0 to 1, where values greater than 0.9 are usually considered to indicate a good fit (Kelloway, 1998).
The adjusted goodness-of-fit index (AGFI) is the GFI adjusted for the degrees of freedom (Jöreskog and Sörbom, 1984). It also ranges from 0 to 1, with values exceeding 0.9 indicating a good fit (Hair et al., 2006).
The root mean square error of approximation (RMSEA) index is based on the analysis of residuals. ‘Residuals refer to individual differences between observed covariance terms and the fitted covariance terms’ (Hair et al., 2006 p. 796). The smaller the residuals, the better the fit to the data. Thus, RMSEA shows how well the model fits the population covariance matrix (Baumgartner and Homburg, 1996). Steiger (1990) suggests that values below 0.05 indicate a very good fit to the data.
Comparative fit indices, in contrast to absolute fit indices, compare the proposed model with a baseline model which a priori provides a poor fit to the data (Bagozzi and Baumgartner, 1994). The most common baseline is the ‘null’ or ‘independent’ model, which specifies no relationships between the variables (Kelloway, 1998). The most commonly used comparative fit indices include NFI, NNFI and CFI, among others (see Table 5.13).
Table 5.13: Comparative Fit Indices
Comparative fit index                      Source
Normed Fit Index (NFI)                     Bentler and Bonett (1980)
Incremental Fit Index (IFI)                Bollen (1989a)
Nonnormed Fit Index (NNFI)                 Tucker and Lewis (1973)
Comparative Fit Index (CFI)                Bentler (1990)
Parsimony Comparative Fit Index (PCFI)     Mulaik et al. (1989)
Relative Noncentrality Index (RNI)         McDonald and Marsh (1990)
The NFI indicates the percentage improvement of the hypothesised model over the baseline model. The NFI ranges from 0 to 1, with values over 0.9 indicating a good fit to the data (Bentler and Bonett, 1980). If the value of NFI is 0.9, it means that the hypothesised model fits the data 90% better than an ‘independent’ baseline model (Mulaik et al., 1999).
The non-normed fit index (NNFI) is similar to NFI, but it is adjusted for the number of degrees of freedom. Like NFI (and most of the other indices), NNFI generally takes values between 0 and 1, although, being non-normed, it can fall slightly outside this range; values over 0.9 are considered to be good.
Bentler (1990) proposed an improved version of the NFI, the comparative fit index (CFI). Similar to NFI, CFI indicates the percentage improvement of the hypothesised model over the baseline model; however, it is relatively insensitive to model complexity. It also ranges from 0 to 1, with values exceeding 0.9 usually considered to be good (Hu and Bentler, 1999).
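The chi-square-based indices discussed in this section can be sketched as follows. The formulas are the commonly published ones rather than quotations from the cited sources, and software packages differ in detail (for example, some use N rather than N - 1 in RMSEA), so the code should be read as an illustration under those assumptions. The function name and all input values are hypothetical.

```python
import math


def fit_indices(chi2_model, df_model, chi2_null, df_null, n):
    """Compute RMSEA, NFI, NNFI and CFI from chi-square statistics.

    chi2_null and df_null refer to the 'null' (independence) baseline
    model; n is the sample size. RMSEA here uses N - 1 in the
    denominator, which is one of two common conventions.
    """
    excess_model = max(chi2_model - df_model, 0.0)
    excess_null = max(chi2_null - df_null, 0.0)

    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    nfi = (chi2_null - chi2_model) / chi2_null
    nnfi = ((chi2_null / df_null - chi2_model / df_model)
            / (chi2_null / df_null - 1.0))
    denominator = max(excess_model, excess_null)
    cfi = 1.0 if denominator == 0 else 1.0 - excess_model / denominator
    return {"RMSEA": rmsea, "NFI": nfi, "NNFI": nnfi, "CFI": cfi}


# Hypothetical results: model chi-square 85.0 (df = 40), null model
# chi-square 900.0 (df = 55), sample size 300.
indices = fit_indices(85.0, 40, 900.0, 55, 300)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical inputs the three comparative indices exceed the 0.9 guideline, while RMSEA is above the 0.05 value suggested by Steiger (1990), which illustrates why several indices are usually reported together.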
5.9.3.2 Model Respecification
parsimony (i.e. model simplicity). Respecification can be achieved either by deleting nonsignificant paths from the model or by adding new paths to the model (cf. Chin, Peterson and Brown, 2008). Even if it is accepted that almost every model requires some respecification, a researcher has to be careful to retain theoretical integrity and consistency (Shook et al., 2004). Theory trimming (removal of paths from the model) is a more common approach to model respecification than adding paths (Pedhazur, 1982).
There are several parameters that a researcher has to examine in order to make a sound decision about item removal.
First, as in EFA, the estimated loadings (path estimates) in CFA help to identify whether there is a potential problem with the measurement theory. For an item to perform adequately, its loading has to be both statistically significant and high (at least 0.5, but ideally 0.7) (Brown, 2006). In the LISREL output file, factor loadings can be found in the LX (lambda-x) matrix (loadings of the variables on the common factors) (Sharma, 1996).
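The loading criteria above can be expressed as a simple screening routine. This is a minimal sketch: the item names, loadings and t-values are hypothetical, and the critical t-value of 1.96 (two-tailed test at the 5% level) is an assumption added for illustration, since the text does not state a significance level.

```python
def screen_loadings(loadings, t_values, minimum=0.5, ideal=0.7, t_critical=1.96):
    """Label each item according to the size and significance of its loading.

    loadings: standardised loadings by item name.
    t_values: t-values of the same loadings.
    t_critical: assumed cut-off for significance (not from the text).
    """
    report = {}
    for item, loading in loadings.items():
        significant = abs(t_values[item]) >= t_critical
        if not significant or abs(loading) < minimum:
            report[item] = "candidate for removal"
        elif abs(loading) < ideal:
            report[item] = "acceptable"
        else:
            report[item] = "good"
    return report


# Hypothetical standardised loadings and t-values for four items.
loadings = {"item1": 0.82, "item2": 0.61, "item3": 0.38, "item4": 0.55}
t_values = {"item1": 12.4, "item2": 8.9, "item3": 4.1, "item4": 1.2}
print(screen_loadings(loadings, t_values))
```

Here item3 is flagged because its loading is below 0.5 and item4 because its loading is not significant, although any such flag remains subject to theoretical considerations.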
Second, it is necessary to assess the individual differences between the observed covariance terms and the fitted covariance terms, which are represented by residuals and standardised residuals (Hu and Bentler, 1995). The better the data fit the measurement theory, the smaller the standardised residuals. Relatively large standardised residuals indicate a high degree of error and mark the item as a candidate for removal from the scale (Bentler, 2007). In the LISREL output file, residuals and standardised residuals can be found in the TD (theta-delta) matrices.
Third, LISREL provides information about the parameters which were not estimated. Modification indices are ‘calculated for every possible relationship that is not free to be estimated. It shows how much the overall model χ2 value would be reduced by freeing this single path’ (Hair et al., 2006 p. 797). This means that if the path is freed, the overall fit of the model will improve (Worthington and Whittaker, 2006).
However, before deciding to remove a particular item, a researcher has to take all theoretical considerations into account rather than rely solely on any single statistical criterion.
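The use of modification indices described above can be sketched as a ranking of fixed paths by their expected reduction in χ2. The threshold of 3.84 is the 5% critical value of χ2 with one degree of freedom; it is a conventional choice and an assumption here, as the text does not prescribe a cut-off. The path labels and index values are hypothetical, and the output is only a list of candidates to be judged against theory.

```python
CHI_SQUARE_CRITICAL_1DF = 3.84  # 5% critical value of chi-square, df = 1


def rank_modification_indices(mod_indices, threshold=CHI_SQUARE_CRITICAL_1DF):
    """Return the fixed paths whose modification index exceeds the threshold.

    mod_indices maps a (item, factor) path to its modification index,
    i.e. the expected drop in the model chi-square if the path is freed.
    Results are sorted from the largest index to the smallest.
    """
    flagged = [(path, mi) for path, mi in mod_indices.items() if mi > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical modification indices for three cross-loadings.
mod_indices = {
    ("item3", "factor2"): 14.7,
    ("item5", "factor1"): 2.1,
    ("item6", "factor1"): 6.3,
}
for path, mi in rank_modification_indices(mod_indices):
    print(path, mi)
```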