
The first step in conducting EFA is to examine the correlations between the variables (Comrey and Lee, 1992). For a study to be analysed using factor analysis, the correlations between variables should exceed 0.3; moreover, for the sake of validity, communality values should be higher than 0.4 (Costello and Osborne, 2005). In running the EFA, the following methods and approaches were applied using SPSS (IBM, 2016):

• Correlation Matrix: coefficients, significance levels, Kaiser-Meyer-Olkin (KMO) and Bartlett’s test of sphericity.

• Extraction: Principal Component Analysis, based on eigenvalues greater than 1 (Kaiser, 1960).

• Rotation Method: Varimax

• Method for Factor Scores: Regression
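
The first screening step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the SPSS procedure used in the study; the function names (`pearson`, `correlation_matrix`, `weakly_correlated`) and the toy responses are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix for a dict {variable name: list of responses}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def weakly_correlated(corr, threshold=0.3):
    """Variables whose absolute correlation with every other variable is <= threshold."""
    return [a for a in corr
            if all(abs(r) <= threshold for b, r in corr[a].items() if b != a)]

# Hypothetical Likert-style responses (NOT data from the study).
toy = {
    "documentation": [1, 2, 3, 4, 5, 4],
    "metadata":      [2, 2, 3, 5, 5, 4],
    "language":      [3, 1, 3, 2, 3, 3],
}
corr = correlation_matrix(toy)
print(weakly_correlated(corr))
```

A variable flagged by `weakly_correlated` never exceeds the 0.3 threshold with any other variable, which makes it a candidate for closer inspection before factoring.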

The Kaiser-Meyer-Olkin Measure of Sampling Adequacy in KMO and Bartlett's test had a high value of 0.842 with a significance of 0.000 (χ²(465, N = 31) = 2248.047, p < .05), which meant

consistent. The very high Cronbach’s Alpha (.909) also confirmed the internal consistency of the survey items. Moreover, the correlation matrix generated for the 31 metrics showed some strong correlations between them, and the communality values for all of the variables were higher than 0.5.
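
For reference, the usual textbook forms of Bartlett's test of sphericity and Cronbach's Alpha are given below. The symbols are notation introduced here rather than taken from the study: n is the number of respondents, p the number of variables, R the correlation matrix, k the number of items, σ_i² the variance of item i and σ_T² the variance of the total score.

```latex
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln\lvert R \rvert ,
\qquad
df = \frac{p(p - 1)}{2} = \frac{31 \times 30}{2} = 465

\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right)
```

With p = 31 metrics, the degrees of freedom work out to the 465 reported for Bartlett's test.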

After computing the correlations between the variables, the second step was to determine the number of factors that should be extracted. The eigenvalue rule and the scree plot are two of the most widely used nonstatistical guidelines in the literature for choosing the number of factors to extract (DeVellis, 2012). An eigenvalue indicates the amount of variance explained (Suhr, 2006) and information captured (DeVellis, 2012) by each factor. This study followed Kaiser's eigenvalue rule; therefore, factors with an eigenvalue of less than 1 were eliminated (Kaiser, 1960). A scree plot was also used to provide a graphical view of the number of factors that should be extracted (Thompson, 2004, p. 33). As seen in Figure 5-1, around 69% of the total variance could be explained by nine factors with an eigenvalue of more than one.
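
Kaiser's rule and the share of variance explained can be illustrated with a short sketch. The eigenvalues below are hypothetical and chosen only for illustration; they are not the eigenvalues obtained in this study, and the function names are likewise assumptions.

```python
def kaiser_retained(eigenvalues, cutoff=1.0):
    """Return the eigenvalues kept under Kaiser's rule (eigenvalue > cutoff)."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev > cutoff]

def variance_explained(eigenvalues, retained):
    """Share of total variance captured by the retained components.

    For a correlation matrix the eigenvalues sum to the number of
    variables, so the total variance is simply sum(eigenvalues).
    """
    return sum(retained) / sum(eigenvalues)

# Hypothetical eigenvalues of a 6-variable correlation matrix (they sum to 6).
eigenvalues = [2.4, 1.5, 1.1, 0.5, 0.3, 0.2]
kept = kaiser_retained(eigenvalues)
print(len(kept), round(variance_explained(eigenvalues, kept), 3))
```

In this toy case three components pass the cutoff; the same calculation applied to the study's eigenvalues would yield the nine retained components and their cumulative variance.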

The nine components identified in the previous step were then rotated using the Varimax method. Rotation helps to clarify which factor each variable belongs to and makes factor naming easier (Seiler, 2004, p. 177). In this study, rotation helped to determine the relationship between each variable, which was a quality metric for ontology evaluation, and each of the resulting factors.
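
To make the idea of rotation concrete, the following is a minimal sketch of a raw varimax rotation using pairwise planar rotations. It is not the SPSS implementation: it omits the Kaiser normalisation that statistical packages commonly apply, and the function names and the small unrotated loading matrix are hypothetical.

```python
import math

def varimax(loadings, sweeps=50, tol=1e-10):
    """Raw varimax rotation via pairwise planar rotations.

    `loadings` is a list of rows, one per variable, each holding one
    loading per factor. Returns a new, rotated list of rows.
    """
    L = [list(row) for row in loadings]
    p, k = len(L), len(L[0])
    for _ in range(sweeps):
        largest_angle = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                u = [row[i] ** 2 - row[j] ** 2 for row in L]
                v = [2 * row[i] * row[j] for row in L]
                A, B = sum(u), sum(v)
                C = sum(a * a - b * b for a, b in zip(u, v))
                D = 2 * sum(a * b for a, b in zip(u, v))
                num = D - 2 * A * B / p
                den = C - (A * A - B * B) / p
                # Angle that maximises the varimax criterion for this pair.
                phi = math.atan2(num, den) / 4
                c, s = math.cos(phi), math.sin(phi)
                for row in L:
                    x, y = row[i], row[j]
                    row[i], row[j] = x * c + y * s, -x * s + y * c
                largest_angle = max(largest_angle, abs(phi))
        if largest_angle < tol:
            break
    return L

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean = sum(sq) / p
        total += sum((value - mean) ** 2 for value in sq) / p
    return total

# Hypothetical unrotated loadings for four variables on two factors.
unrotated = [[0.6, 0.6], [0.7, 0.5], [0.6, -0.6], [0.5, -0.7]]
rotated = varimax(unrotated)
for row in rotated:
    print([round(value, 3) for value in row])
```

The rotation is orthogonal, so each variable's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across factors becomes easier to read.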

Table 5-3 presents what are known as “loadings”, which refer to the extent to which different variables are related to the hypothetical factor(s) (Comrey and Lee, 1992, p. 5); higher loadings are usually preferred in the literature. Comrey and Lee (1992, p. 243) have proposed the following guidelines for interpreting loadings: 0.32 = poor, 0.45 = fair, 0.55 = good, 0.63 = very good, and 0.71 = excellent. Some variables might also have negative loadings on some factors, which means that they are negatively correlated with the factor construct (ibid.). In this study, loadings of less than 0.4 were eliminated from the rotated component matrix.
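
The interpretation guidelines and the 0.4 suppression threshold can be expressed as a small helper. This is an illustrative sketch that assumes each cut-off is a lower bound for its label; the function names are hypothetical, and the third example item is invented to show suppression.

```python
def rate_loading(loading):
    """Label a loading using the Comrey and Lee (1992) cut-offs quoted above."""
    size = abs(loading)  # the sign only indicates the direction of the relation
    for cutoff, label in [(0.71, "excellent"), (0.63, "very good"),
                          (0.55, "good"), (0.45, "fair"), (0.32, "poor")]:
        if size >= cutoff:
            return label
    return "below the poor cut-off"

def suppress(loadings, minimum=0.4):
    """Drop loadings whose absolute value is under `minimum`."""
    return {item: value for item, value in loadings.items()
            if abs(value) >= minimum}

row = {
    "QM1_2_Content": 0.853,     # value reported in Table 5-3
    "QM2_6_Size": -0.440,       # value reported in Table 5-3
    "hypothetical_item": 0.25,  # invented, only to show suppression
}
for item, value in suppress(row).items():
    print(item, value, rate_loading(value))
```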

As seen in Table 5-3, the first component identified in the factor analysis included all the popularity-related metrics used in the survey. Metrics related to the trust and reputation of the developer team or organisation loaded highly on the second reliable component. The third component included metrics that referred to the responsiveness of the developer team and the maintenance process. The rest of the components grouped different metrics related to the internal aspects of ontologies, their accessibility, and additional information about them.

Table 5-3 Loadings-9 Factors

Factor    Item                                                      Loading
Factor 1  QM4_1_Number_Of_Times_Ontology_Been_Reused                 .713
          QM4_2_Popularity_On_Web_Website_Views                      .771
          QM4_3_Popularity_In_Community_Among_Colleagues             .665
          QM4_4_Popularity_Ontology_Social_Media                     .700
          QM4_6_Reviews_Rating_Of_Ontology                           .550
          QM2_4_Reuse_Import                                         .440
Factor 2  QM3_2_Knowing_Trusting_Ontology_Developers                 .832
          QM3_3_knowing_trusting_ontology_development_organization   .709
          QM3_4_Flexibility_Ontology_&_Developer_Team                .559
          QM4_5_Reputation_Developer_Team_Institute                  .595
Factor 3  QM2_7_Number_of_update_maintenance                         .832
          QM2_8_Frequency_Update_Maintenance                         .820
          QM2_9_Funds_Availability_Update_Maintenance                .635
          QM3_1_Active_Responsive_Community                          .509
Factor 4  QM3_5_Extra_Info_Usage_Individuals_Organisations           .690
          QM3_6_Extra_Info_Usage_Projects                            .760
          QM3_7_Extra_Info_Usage_Purpose                             .793
Factor 5  QM1_3_Structure                                            .626
          QM1_4_Semantic_Richness_&_Correctness                      .768
          QM1_5_Syntactic_Correctness                                .620
          QM1_6_Consistency                                          .520
Factor 6  QM2_10_Accessibility                                       .840
          QM3_8_Availability_Wikis_Forums_MailingLists_SupportTeam   .533
Factor 7  QM2_2_Documentation                                        .645
          QM2_3_Availability_of_metadata                             .725
          QM2_5_Language                                             .432
Factor 8  QM1_1_Scope                                               -.530
          QM2_1_Methodology                                          .511
          QM2_6_Size                                                -.440
          QM2_11_Availability_Publication                            .571
Factor 9  QM1_2_Content                                              .853
          QM1_3_Structure                                            .447

After identifying and extracting factors, the next step was to interpret them; researchers should consider different reliability issues before starting the interpretation process. They should also ask questions like “what is the potential value of this factor?” or “do the variables that define the factor reveal all its major aspects?” (Comrey and Lee, 1992). The reliability of a factor can also be assessed by the number of variables that load on it and the absolute value of the loadings. Some have recommended that each desirable factor should include at least three variables (Yong and Pearce, 2013).

Stevens (2009, p.333) suggested that “components with four or more loadings above 0.6 in absolute value are reliable, regardless of sample size”. He also argued that a factor can be considered reliable if “the average of the four largest loadings is > 0.60 or the average of the three largest loadings is > .80”. Reliability determination gets more complicated when the absolute value of loadings is lower.
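
The rules quoted above can be written as a simple check. The sketch below assumes that a factor counts as reliable if any one of the three quoted conditions holds; that combination, and the function name, are assumptions made for illustration. The loadings are those reported in Table 5-3.

```python
def stevens_reliable(loadings):
    """Apply the Stevens (2009) rules quoted above to one factor's loadings."""
    sizes = sorted((abs(x) for x in loadings), reverse=True)
    four_above = sum(1 for s in sizes if s > 0.6) >= 4
    mean_top4 = len(sizes) >= 4 and sum(sizes[:4]) / 4 > 0.60
    mean_top3 = len(sizes) >= 3 and sum(sizes[:3]) / 3 > 0.80
    return four_above or mean_top4 or mean_top3

factor_1 = [0.713, 0.771, 0.665, 0.700, 0.550, 0.440]
factor_2 = [0.832, 0.709, 0.559, 0.595]
factor_6 = [0.840, 0.533]

for name, values in [("Factor 1", factor_1), ("Factor 2", factor_2),
                     ("Factor 6", factor_6)]:
    print(name, stevens_reliable(values))
```

Taking absolute values first means that a factor with strong negative loadings is treated the same way as one with strong positive loadings.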

As seen in Table 5-3, the loadings of the variables identified in this study are generally high and can mostly be rated as good or very good, and sometimes even excellent (Comrey and Lee, 1992). However, the researcher had to deal with the problem of not having enough variables loaded on some of the factors. For example, two variables, namely “accessibility” and “availability of online mailing list and support team”, loaded highly on component 6; however, having only two variables loaded on this component made it very difficult, if not impossible, to interpret, and it did not meet the minimum reliability requirements.

According to the literature, researchers should be cautious when interpreting factors that are based on a few variables with low loadings (Comrey and Lee, 1992). To address the harmful effects of under-extracting or over-extracting factors, Costello and Osborne (2005) have suggested using the scree plot, manually setting the number of factors to retain, and conducting multiple factor analyses until the “best fit” is identified; best fit has been defined as a model with no factor with fewer than three variables, with a minimum loading value of .3 for the items, and with no or few items that cross-load on different factors (ibid.).
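
The “best fit” criteria can be checked mechanically for any candidate solution. The sketch below is an illustration only; the function name and the structure of the report are assumptions, while the loadings are taken from Factors 5, 6 and 9 of Table 5-3.

```python
def best_fit_report(solution, min_items=3, min_loading=0.3):
    """Check a solution {factor: {item: loading}} against the best-fit criteria."""
    # Factors with fewer than `min_items` variables.
    thin = [f for f, items in solution.items() if len(items) < min_items]
    # Items whose absolute loading falls under `min_loading`.
    weak = [(f, item) for f, items in solution.items()
            for item, value in items.items() if abs(value) < min_loading]
    # Items that appear under more than one factor (cross-loading).
    seen = {}
    for f, items in solution.items():
        for item in items:
            seen.setdefault(item, []).append(f)
    cross = {item: fs for item, fs in seen.items() if len(fs) > 1}
    return {"thin_factors": thin, "weak_items": weak,
            "cross_loading_items": cross}

solution = {
    "Factor 5": {"QM1_3_Structure": 0.626,
                 "QM1_4_Semantic_Richness_&_Correctness": 0.768,
                 "QM1_5_Syntactic_Correctness": 0.620,
                 "QM1_6_Consistency": 0.520},
    "Factor 6": {"QM2_10_Accessibility": 0.840,
                 "QM3_8_Availability_Wikis_Forums_MailingLists_SupportTeam": 0.533},
    "Factor 9": {"QM1_2_Content": 0.853, "QM1_3_Structure": 0.447},
}
report = best_fit_report(solution)
print(report)
```

A solution reaches “best fit” in this sense when `thin_factors` and `weak_items` are empty and `cross_loading_items` is empty or very short.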

In this study, following Kaiser’s eigenvalue rule (Kaiser, 1960) led to nine factors, some of which had reliability issues, e.g., they did not have enough variables loaded on them. Cross-loading, the situation in which a variable loads on two or more factors (Yong and Pearce, 2013), was the other issue; size, for example, loaded fairly on components 1 and 8, with values of 0.488 and -0.440 respectively. However, it was difficult to link size to the popularity factor (component 1), as this factor is more about how ontologies are used in the community or how many times they are used. Thus, it was decided to add it to factor 8. To address some of the discussed issues, it was decided to keep all the variables, even though some of them did not form any reliable factor, and to re-conduct the factor analysis (Costello and Osborne, 2005).
