Relevant events following the signing of the peace agreement, September 2016



In this study, the major statistical assumptions underlying SEM suggested by Kaplan (2000) have been adopted. These assumptions are: a sufficient sample, multivariate normality, a data set free of systematic missing data, and correct model specification. In addition to the data preparation presented in Section 8.3.1, these assumptions are discussed in the following sections.

8.3.2.3.1 Sample Sufficiency

The statistical characteristics of the different estimators depend on large samples (MacCallum and Austin, 2000). The issue of sample size is crucial because it determines how strongly departures from normality affect the results. Hair et al. (2010) stated that, “[N]ormality can have serious effects in small samples (fewer than 50 cases)”. However, Hair et al. (2010) added, “but the impact effectively diminishes when sample sizes reach 200 cases or more”.

The sample size that is sufficient for an SEM approach is debated in the literature. This study follows the recommendation of Hair et al. (2010) and applies the analysis to more than 200 responses. In line with Hu et al. (2003) and Lee (2009), it considers only complete cases for analysis and removes partially completed responses.
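The complete-case rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of a response as a dictionary with `None` for unanswered items, and the function names `complete_cases` and `meets_minimum`, are hypothetical and do not come from the study.

```python
def complete_cases(responses):
    """Keep only responses in which every item was answered (no None values)."""
    return [r for r in responses if all(v is not None for v in r.values())]


def meets_minimum(responses, minimum=200):
    """Check the number of complete cases against a minimum sample size.

    The default of 200 follows the "200 cases or more" guideline quoted
    from Hair et al. (2010); treat it as an adjustable assumption.
    """
    return len(complete_cases(responses)) >= minimum


# Hypothetical survey data: the second response is only partially completed.
survey = [
    {"q1": 5, "q2": 3},
    {"q1": None, "q2": 4},
    {"q1": 2, "q2": 1},
]
usable = complete_cases(survey)
```

Removing partial responses before counting matters here because the 200-case guideline applies to the cases actually analysed, not to the number of questionnaires returned.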

8.3.2.3.2 Normality in SEM

A normal distribution of data is a statistical term describing “a symmetrical, bell-shaped curve, which has the greatest frequency of scores in the middle with smaller frequencies towards the extremes” (Pallant, 2010). The concept of normal data distribution is extremely important in statistics (Bowers, 1996). In multivariate statistical analysis, normality is deemed the most fundamental assumption (Hair et al., 2010).

The importance of the normal distribution lies in its bearing on the validity of all resulting statistical tests (Hair et al., 2010): if the departure from the normal distribution is sufficiently large, the resulting statistical tests are invalid (Hair et al., 2010). In other words, if the normality assumption is not met, the findings of the statistical analysis may be distorted.

Normality can be assessed by examining outliers, skewness and kurtosis (Ullman, 2006). Statistical software provides further techniques for assessing normality, such as histograms and normal Q-Q plots in SPSS (Pallant, 2010). Two main types of normality need to be checked, depending on the statistical method: univariate normality, which is assumed by univariate statistical methods, and multivariate normality, which is assumed by multivariate statistical methods (Hair et al., 2010). It is often helpful to examine indexes of both univariate and multivariate normality (Ullman, 2006).

The severity of violating the assumption of normal distribution depends on two factors: the sample size and the shape of the non-normal distribution (Hair et al., 2010). Hair et al. (2010) commented on the negligible impact of non-normal distributions in larger samples: “[W]hat might be considered as unacceptable at small sizes will have a negligible effect at larger sample sizes”. Hair et al. (2010) stated that the impact of non-normality is effectively reduced when the sample size reaches 200 responses or more.

There is no consensus in the literature on what constitutes a large sample size. Some sources regard a sample of more than 1000 responses as large; others set the threshold at more than 2000 responses. In fact, the degree of non-normality and the sample size largely determine the estimation method to be used. For example, when the data are not normally distributed, Maximum Likelihood (ML) estimation is very sensitive to the sample size if it is below 1000 responses.

Assessing Univariate Normality

Univariate normality refers to the normal distribution of a single variable (Hair et al., 2010). It is easy to test, and researchers should always examine the normality of all metric variables included in the statistical analysis (Hair et al., 2010). Checking univariate normality is a prerequisite to examining multivariate normality (DeCarlo, 1997). Univariate normality can be assessed by obtaining values of skewness and kurtosis (Pallant, 2010).
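As an illustration of how skewness and kurtosis values are obtained for a single variable, the following is a minimal Python sketch. It is not the AMOS procedure used in this study: the function name `skewness_kurtosis` is hypothetical, the moment-based formulas are the standard textbook ones, and the z-statistics use the common large-sample approximations with standard errors sqrt(6/n) and sqrt(24/n).

```python
import math


def skewness_kurtosis(values):
    """Return moment-based skewness, excess kurtosis and approximate z-values."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (divisor n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("variable is constant; skewness and kurtosis are undefined")
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return {
        "skewness": skew,
        "kurtosis": excess_kurt,
        # Large-sample approximations; assumed here, not taken from the study.
        "z_skewness": skew / math.sqrt(6.0 / n),
        "z_kurtosis": excess_kurt / math.sqrt(24.0 / n),
    }


# Hypothetical item scores with a long right tail.
result = skewness_kurtosis([1, 1, 1, 1, 10])
```

A positive skewness, as in this example, indicates a longer right tail; values near zero on both indexes are consistent with a normal distribution.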

Hair et al. (2010) argued that univariate normality does not necessarily guarantee multivariate normality. However, if all individual metric variables meet the univariate normality requirement, then departures from multivariate normality are unimportant (Hair et al., 2010). Other authors take a different view: Ullman (2006) stated that, when assessing normality, it is helpful to examine both univariate and multivariate normality.

Appendix C.10 presents the normality assessment generated using AMOS 20. The results reveal that the skewness and kurtosis of the individual metric variables fall within the recommended thresholds (Hair et al., 2010), except for some variables whose skewness is slightly above the recommended threshold.

Transformation is a potential remedy for non-normality. It has a large effect and substantially reduces the univariate kurtosis and skewness when the univariate non-normality is severe (Gao et al., 2008). However, when the univariate non-normality is slight or moderate, as it is here, transformation has only a minor effect (Gao et al., 2008). Therefore, it has been decided in this study not to transform the individual variables and to leave their values intact. This is consistent with the recommendation by Gao et al. (2008) that, “the role of transformation needs to be assessed on a case-by-case basis”.

Assessing Multivariate Normality

One of the main concerns about the data in SEM is whether the sample follows a multivariate normal distribution (Gao et al., 2008). Thus, it is important to check that this criterion has been met before undertaking any analysis of the data (Byrne, 2013). Multivariate kurtosis is a particular problem for SEM analysis (Byrne, 2013): it is the situation in which the multivariate distribution of the measured variables has peaks and tails that differ from those characteristic of a multivariate normal distribution (Byrne, 2013).
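One widely used index of multivariate kurtosis is Mardia's coefficient. The source does not state which index was computed, so the formulas below are offered only as a standard textbook illustration; the symbols ($n$ cases, $p$ measured variables, $x_i$ the vector of scores for case $i$, $\bar{x}$ the mean vector, $S$ the sample covariance matrix with divisor $n$) are notation assumed here.

```latex
b_{2,p} = \frac{1}{n} \sum_{i=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \right]^{2},
\qquad
z = \frac{b_{2,p} - p(p+2)}{\sqrt{8\,p\,(p+2)/n}}
```

Under multivariate normality, $b_{2,p}$ is close to $p(p+2)$ in large samples and $z$ is approximately standard normal, so a large positive $z$ signals tails heavier than those of a multivariate normal distribution.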

Bollen (2006) argued that the assumption of multivariate normality should not be applied to exogenous measured variables. However, the AMOS user’s guide asserts the necessity of assessing multivariate normality for both exogenous and endogenous variables (Arbuckle, 2011). This study follows the recommendation of the AMOS user’s guide.

Although univariate normality is necessary, it is not a sufficient condition for multivariate normality (DeCarlo, 1997). West et al. (1995) stated that the multivariate distribution of data can still be non-normal even when each individual variable is univariate normal. In general, when conducting SEM, it is a critical assumption that the data are multivariate normally distributed (Byrne, 2013). However, there is no consensus in the literature regarding the need for multivariate normality when univariate normality is attained. Opinions differ on what is sufficient to conduct SEM using an ML estimator (i.e. univariate normality only, or both univariate and multivariate normality).

Despite the effects of violating the normality assumption, MLE has proven fairly robust (Hair et al., 2010). Moreover, researchers using AMOS can avoid any reliance on multivariate normality by applying the bootstrapping option available within the software (e.g. Seddon and Kiew, 2007). This is in line with Hoyle (2012) and Byrne (2013).
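The idea behind bootstrapping can be sketched as follows: cases are resampled with replacement, the estimate is recomputed on each resample, and the spread of those estimates serves as a standard error that does not rely on a normality assumption. This is a generic nonparametric bootstrap in Python, not the AMOS implementation; the function names `bootstrap_se` and `slope`, the number of resamples and the example data are all assumptions made for illustration, with a simple regression slope standing in for a path coefficient.

```python
import random
import statistics


def bootstrap_se(rows, statistic, n_boot=1000, seed=0):
    """Bootstrap standard error of `statistic`, resampling rows with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


def slope(rows):
    """Least-squares slope of y on x for rows of (x, y) pairs."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: y is roughly 2x with alternating +1 / -1 deviations.
data = [(x, 2 * x + (-1) ** x) for x in range(30)]
se_slope = bootstrap_se(data, slope, n_boot=500, seed=1)
```

Because the standard error comes from the empirical distribution of the resampled estimates, it remains usable when the distributional assumptions behind the analytic ML standard errors are in doubt.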

8.3.2.3.3 Model Estimation Techniques

The plausibility of the normality assumption and the sample size are important factors in selecting an appropriate estimation method (Ullman, 2006). Popular estimation methods are Maximum Likelihood (ML), Unweighted Least Squares (ULS), Generalized Least Squares (GLS), and Asymptotically Distribution Free (ADF) (Barber, 1983).

The choice of estimation method in relation to sample size is a debated topic. Ullman (2006) stated that ML or GLS estimators are good choices when the sample is medium (over 120) or large. Weiner et al. (1983) argued that GLS performs slightly better when the sample contains fewer than 500 cases. By contrast, under ideal conditions, MLE provides stable and valid results with a minimum sample size of 50 cases (Hair et al., 2010).

Increasing the sample size produces greater stability and more information (Hair et al., 2010). Hair et al. (2010) stated “[G]iven less than ideal conditions, one study recommends a sample size of 200 to provide a sound basis for estimation”. Once the researcher has collected more than the absolute minimum sample size, larger samples increase stability and reduce variability in the solutions (Hair et al., 2010).

Maximum likelihood estimation (MLE) has become the default approach in most SEM software and continues to be widely used by researchers (Hair et al., 2010). According to Weiner et al. (1983), “[M]aximum likelihood is usually the default method in most programs because it yields the most precise (smallest variance) estimates when the data are normal”. However, its potential sensitivity to non-normality creates a need for alternative estimation approaches (Hair et al., 2010).
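For reference, ML estimation in SEM is usually described as minimising a discrepancy between the sample covariance matrix and the covariance matrix implied by the model. The standard textbook form is given below; the notation ($S$ for the sample covariance matrix, $\Sigma(\theta)$ for the model-implied covariance matrix with parameters $\theta$, $p$ for the number of measured variables, $N$ for the sample size) is assumed here and does not appear in the original text.

```latex
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
               + \operatorname{tr}\!\left( S\,\Sigma(\theta)^{-1} \right)
               - \ln\lvert S \rvert - p,
\qquad
\chi^{2} \approx (N - 1)\, F_{ML}(\hat{\theta})
```

The function equals zero when $\Sigma(\theta) = S$. The chi-square approximation rests on multivariate normality and a sufficiently large $N$, which is why the normality and sample size issues discussed above bear directly on the choice of ML.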

ADF has received particular attention because it is insensitive to non-normality (Hair et al., 2010; Byrne, 2013). However, its use is limited because it requires large sample sizes (Hair et al., 2010; Byrne, 2013). West et al. (1995) recommended an extremely large sample (1000-5000) as the basis for an ADF analysis. Otherwise, ADF performs very poorly and can lead to severely distorted standard errors and estimated values (Curran et al., 1996).

ML is the most common estimation procedure in SEM (Hair et al., 2010; Ullman, 2006). Hair et al. (2010) stated that ML is “proven fairly robust to violations of the normality assumption”. Researchers have compared ML with other estimation techniques and found that it produces reliable results under a variety of circumstances (Hair et al., 2010). In light of the preceding discussion, the SEM analysis in the present study is based on ML estimation.
