Before running any statistical analysis in SPSS, the data were screened for completeness and for data-entry errors. This was done by checking the descriptive statistics (frequencies, minimum, maximum, and mean) for each variable; the errors identified were corrected by referring to the original questionnaires.
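The screening step described above can be illustrated with a minimal sketch. It is not the SPSS procedure used in this research; the function name `screen_variable`, the 0 to 3 scoring range, and the sample values are hypothetical and serve only to show how minimum, maximum, and mean expose keying errors.

```python
from statistics import mean

def screen_variable(values, valid_min, valid_max):
    """Summarise one variable and flag entries outside its valid range."""
    present = [v for v in values if v is not None]
    # Keep the position of each suspicious entry so it can be checked
    # against the original questionnaire.
    out_of_range = [
        (i, v) for i, v in enumerate(values)
        if v is not None and not (valid_min <= v <= valid_max)
    ]
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "out_of_range": out_of_range,
    }

# Hypothetical item scored 0-3; the 33 stands in for a keying error
# and None stands in for an unanswered question.
item = [0, 1, 3, 2, 33, None, 1]
report = screen_variable(item, valid_min=0, valid_max=3)
print(report)
```

A maximum above the highest possible score, as in this example, is the kind of signal that would prompt a look at the paper questionnaire.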
Missing data can occur in any research and can affect the validity and generalizability of the results (Loprinzi et al., 2013; McKnight et al., 2007). There is no consensus on what percentage of missing data becomes problematic; however, less than 5% missing data is generally considered unproblematic, whereas more than 20% missing data is considered to be of concern (Peng et al., 2006) and should be reported and handled properly, as it could lead to biased and invalid results (Schlomer et al., 2010). Missing data can be handled in two ways: a) by deleting the cases that have missing values; and b) by applying a statistical treatment to the data. However, Hair et al. (2006) stressed the importance of identifying the patterns in missing data before taking either of these actions.
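The percentage guidelines cited above can be expressed as a short sketch. The function names and the example variable are hypothetical, and the label for the range between 5% and 20% is an assumption, since the cited sources give no agreed rule for it.

```python
def missing_percentage(values):
    """Percentage of entries that are missing (represented here as None)."""
    return 100.0 * sum(v is None for v in values) / len(values)

def classify_missingness(pct):
    """Label a missing-data percentage using the 5% and 20% guidelines."""
    if pct < 5:
        return "generally unproblematic"
    if pct > 20:
        return "of concern"
    # Assumption: no consensus exists for this middle range.
    return "no consensus; inspect the pattern"

# Hypothetical variable with 3 missing answers out of 100.
answers = [1] * 97 + [None] * 3
pct = missing_percentage(answers)
label = classify_missingness(pct)
print(pct, label)
```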
Since the present research was of a sensitive nature and participants were explicitly given the right to skip any question they did not want to answer, missing data were likely. The data were therefore first checked for missing values. Missing values on some items of the measures under consideration were less than 3%, and a chi-square (χ2) test was applied to identify any patterns in the missing data in each sub-sample.
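One way such a chi-square check can be set up is sketched below, assuming a 2 x 2 table of missing versus answered counts in two sub-samples. The counts are hypothetical, and the exact layout of the test run in SPSS for this research is not stated here, so this is an illustration rather than a reproduction.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns the statistic and its p-value for 1 degree of freedom.
    Assumes every expected count is greater than zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 df, P(chi2 > x) equals P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts: rows are two sub-samples,
# columns are (missing, answered) on one item.
counts = [[12, 488], [15, 531]]
statistic, p_value = chi_square_2x2(counts)
print(round(statistic, 3), round(p_value, 3))
```

A p-value above the chosen significance level would indicate no detectable association between missingness and sub-sample for that item.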
The results were nonsignificant, suggesting that the values were missing at random (MAR) and could generally be ignored. However, it was decided to impute the missing values using the “mean of nearby points” method so that the data can be analyzed in future with Structural Equation Modelling (SEM) in AMOS, which requires missing values to be treated. After this, a test of normality was run.
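The idea behind “mean of nearby points” imputation can be sketched as follows. This is a simplified illustration: the function name, the `span` of two valid neighbours on each side, and the handling of gaps near the ends of the series are assumptions and may differ from what SPSS does in edge cases.

```python
def impute_mean_of_nearby(values, span=2):
    """Replace each None with the mean of the nearest valid neighbours.

    Up to `span` valid values before and `span` valid values after the
    gap are averaged. Only original (non-imputed) values are used.
    """
    imputed = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        before = [x for x in reversed(values[:i]) if x is not None][:span]
        after = [x for x in values[i + 1:] if x is not None][:span]
        neighbours = before + after
        if neighbours:  # leave the gap if no valid neighbour exists
            imputed[i] = sum(neighbours) / len(neighbours)
    return imputed

# Hypothetical item responses in case order; None marks a skipped answer.
responses = [2, 3, None, 1, 4, None, 2]
filled = impute_mean_of_nearby(responses)
print(filled)
```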
1.1. Test of Normality
Before running inferential statistics, a test of normality was run for all variables in the study to determine whether the sample distribution approximates a normal, symmetrical distribution, which is needed for accurate estimates of standard deviations and standard errors (Field, 2009). For this, skewness and kurtosis were compared against their standard errors.
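The comparison of skewness and kurtosis against their standard errors can be sketched with the commonly used bias-adjusted sample formulas. The function names and the example scores are hypothetical; the formulas are the standard ones for sample skewness, excess kurtosis, and their standard errors, which depend only on the sample size.

```python
import math

def se_skewness(n):
    """Standard error of sample skewness for sample size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of sample excess kurtosis for sample size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def _z_scores(xs):
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))  # sample SD
    return [(x - m) / s for x in xs]

def sample_skewness(xs):
    """Bias-adjusted sample skewness."""
    n = len(xs)
    return n / ((n - 1) * (n - 2)) * sum(z ** 3 for z in _z_scores(xs))

def sample_excess_kurtosis(xs):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(xs)
    fourth = sum(z ** 4 for z in _z_scores(xs))
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical, positively skewed scores.
scores = [0, 1, 1, 2, 2, 3, 4, 6, 9, 15]
n = len(scores)
skew = sample_skewness(scores)
kurt = sample_excess_kurtosis(scores)
# Ratios of each statistic to its standard error.
z_skew = skew / se_skewness(n)
z_kurt = kurt / se_kurtosis(n)
print(round(skew, 2), round(z_skew, 2), round(kurt, 2), round(z_kurt, 2))
```

Because the standard errors shrink as the sample grows, even modest skewness yields a large ratio in a sample of about a thousand cases, which is why the ratio alone is a strict criterion in large samples.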
Table 5
Descriptives of CEDV and IPVAS along with their Subscales for Skewness and Kurtosis (N = 1046)
                                                      95% CI
Variables      M      SD    Min  Max    Range  Med.   LL     UL     Skew  Kur
CEDV (total)   18.20  9.79  0    57.27  57.27  16     17.6   18.79  .93   .89
Viol.          4.83   3.86  0    29     29     4      4.6    5.06   1.86  5.09
Involve        2.58   3.04  0    17     17     2      2.4    2.76   1.50  2.20
R. Fac         2.59   2.15  0    11     11     2      2.5    2.72   1.12  .97
Com. Exp       6.51   3.23  0    21     21     6      6.3    6.72   .45   .23
O. Victim      1.69   1.55  0    12     12     1      1.6    1.78   1.24  2.51
IPVAS (total)  31.7   5.35  18   48     30     32     31.4   32.03  .10   -.36
Control        11.87  2.62  5    20     15     12     11.7   12.03  .23   .36
Abuse          11.87  3.31  8    28     20     14.0   14.1   14.46  .31   -.07
Violence       5.57   2.09  4    16     12     4.38   5.4    5.69   1.46  1.95
Exposure; Involve=Involvement; R.Fac=Risk Factors; Com.Exp=Community Exposure; O.Victim=Other Victimization; Skew= Skewness; Kur= Kurtosis.
S.E. for Skewness (CEDV) = .07; S.E. for Kurtosis = .15; S.E. for Skewness (IPVAS) = .07; S.E. for Kurtosis = .15.
Results in Table 5 show that skewness and kurtosis for the CEDV and IPVAS totals are less than 1; these values exceed their respective S.E. but are acceptable. Therefore, normality as per Tukey’s statistics is achieved for IPVAS but not for CEDV. For the CEDV subscales, skewness and kurtosis are less than 1 only for ‘community and media exposure’; the other subscales are positively skewed. For the IPVAS subscales, these values exceed 1 only for the ‘violence’ subscale, which shows positively skewed data. For the ‘abuse’ and ‘control’ subscales, skewness and kurtosis indicate a normal distribution of the data. As outliers are not evident in these statistics, graphical plots were consulted to see the shape of the data, including box-and-whisker plots, histograms, normal Q-Q plots, and detrended normal Q-Q plots. These showed that the data do not meet the assumption of normality for CEDV. However, given the nature of the phenomenon measured by CEDV and some previous research findings (Rigterink, 2013; Cunningham & Baker, 2004), it was not unexpected that the data would be positively skewed.
The question then arose whether parametric tests and multivariate analysis could be applied to these data. Kim (2013) argued that, for determining substantial non-normality in a sample greater than 300, an absolute skewness larger than 2 or an absolute kurtosis larger than 7 may be used as reference values. By these reference values, the present data fall within the range of a normal distribution. Also, according to Field (2009):
“We also know from the central limit theorem that in big samples the sampling distribution tends to be normal anyway – regardless of the shape of the data we actually collected (and remember that the sampling distribution will tend to be normal regardless of the population distribution in samples of 30 or more). As our sample gets bigger then, we can be more confident that the sampling distribution is normally distributed” (p.134).
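The point made in this quotation can be illustrated with a small simulation. It is a sketch under stated assumptions: an exponential population stands in for positively skewed scores, and the seed, the number of replications, and the function names are arbitrary choices, not part of this research. Note that the simulation concerns the sampling distribution of the mean, not the shape of the raw scores themselves.

```python
import random

def moment_skewness(xs):
    """Simple moment-based skewness (third moment over variance^1.5)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(2013)  # fixed seed so the run is repeatable
replications = 2000

# Raw scores drawn from a positively skewed (exponential) population.
raw_scores = [rng.expovariate(1.0) for _ in range(replications)]

def sample_means(sample_size):
    """Means of repeated samples of the given size from the same population."""
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(replications)
    ]

skew_raw = moment_skewness(raw_scores)
skew_means_30 = moment_skewness(sample_means(30))
skew_means_1046 = moment_skewness(sample_means(1046))
print(round(skew_raw, 2), round(skew_means_30, 2), round(skew_means_1046, 2))
```

In runs of this kind the raw scores remain strongly skewed, while the distribution of sample means becomes progressively more symmetrical as the sample size increases.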
However, I contacted Field by email (see Appendix M) and sent the details of my data distribution to get his opinion. According to his reply, because the data set is large, the data for CEDV and IPVAS are symmetrical anyway, regardless of their shape, and do not need to be fixed to achieve normality. Afterwards, statistical analyses were run on the data to meet the objectives and test the assumptions of the research.
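The reference values from Kim (2013) cited in this section can be checked against the skewness and kurtosis values reported in Table 5. The sketch below copies those values from the table; the dictionary name and function name are hypothetical.

```python
# (skewness, kurtosis) pairs as reported in Table 5.
TABLE_5 = {
    "CEDV (total)": (0.93, 0.89),
    "Viol.": (1.86, 5.09),
    "Involve": (1.50, 2.20),
    "R. Fac": (1.12, 0.97),
    "Com. Exp": (0.45, 0.23),
    "O. Victim": (1.24, 2.51),
    "IPVAS (total)": (0.10, -0.36),
    "Control": (0.23, 0.36),
    "Abuse": (0.31, -0.07),
    "Violence": (1.46, 1.95),
}

def substantially_non_normal(skew, kurt, skew_limit=2.0, kurt_limit=7.0):
    """Apply the large-sample (N > 300) reference values of Kim (2013)."""
    return abs(skew) > skew_limit or abs(kurt) > kurt_limit

flagged = [
    name for name, (skew, kurt) in TABLE_5.items()
    if substantially_non_normal(skew, kurt)
]
print(flagged)  # → []
```

No variable is flagged: the largest absolute skewness in the table is 1.86 and the largest absolute kurtosis is 5.09, both for the ‘Viol.’ subscale.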