4. ANALYSIS AND INTERPRETATION OF RESULTS
4.1. SURVEY ADMINISTERED TO THE STUDENTS BEFORE THE
Structural equation modelling (SEM) is largely a confirmatory technique used to test a pre-specified model that is based on theory and previous research. How well the model fits the data being tested is assessed through several fit indices. Although confirmation is often the primary use of SEM, it can also be used in an exploratory manner to develop a model that fits the data or to modify a pre-specified model so that it fits the data better (Byrne, 2012). SEM provides many useful statistical tools. Firstly, SEM makes it possible to include in the model not only measured or observed variables (conventionally denoted as rectangles) but also unobserved or latent variables (typically denoted as ellipses). Latent variables are constructs that cannot be measured directly, such as anxiety, self-esteem, and motivation (K. A. Bollen, 2002; Byrne, 2012). These constructs are often measured by validated scales. Within SEM, the components of a scale may be treated as indicators of the latent variable; these indicators are measured variables that should be moderately positively correlated with each other to have internal consistency (K. Bollen & Lennox, 1991). They should be moderately correlated because they measure the same construct; therefore, as one indicator increases so should the others. Similarly, one may use several questions that do not make up a scale as indicators to represent a construct and then test how well the items measure the construct. These indicators allow researchers to estimate a latent variable but do not allow an exact prediction (K. A. Bollen, 2002). The section of the model examining the relationship between the indicators and the latent variable is tested using either confirmatory or exploratory factor analysis and is termed the measurement model.
In this thesis, anxiety is measured using the OASIS scale, and the five questions that comprise the OASIS serve as indicators to represent the construct of anxiety. The factorial validity of these indicators for anxiety in Ontarian bisexuals will be tested in the measurement model by using confirmatory factor analysis (CFA). Similarly, the factorial validity of the PCL-C will be tested using CFA. Since this thesis is examining the effect of biphobia on PTSD in an exploratory manner, subsequent models will not include PTSD as a latent variable but as a measured item (PCL-C) due to sample size limitations (i.e., with PTSD as a latent variable there are too many free parameters to estimate accurately with a sample size of 405). In addition, this exploratory PCL-C CFA will have a smaller ratio of participants to free parameters (~8:1) than the OASIS CFA (~26:1). This ratio is below the most often suggested sample size to parameter ratio for accurate estimation (10:1) but above the minimum required ratio (5:1) (see discussion below). The exposures, biphobia from the gay community and biphobia from the straight community, will be included as measured items or scales as opposed to latent variables. This is for two reasons. Firstly, the sample size is not large enough to analyze the two subscales as latent variables because of the large number of items measuring each construct. Secondly, the scale used to measure them, the ABES, was developed for use in bisexual populations and has been validated in two bisexual populations with high internal reliability (Brewster & Moradi, 2010). This is in contrast to the OASIS and PCL-C, which have not been specifically validated in bisexual populations. Additionally, the ABES demonstrated high internal reliability in the Risk & Resilience Study.
A second advantage of SEM is that it accounts for measurement error (both random and systematic). It also allows for residual error, or error resulting from predicting dependent (endogenous) variables from independent (exogenous) variables because it is unlikely that the exogenous variable completely predicts the endogenous variable (Byrne, 2012). As a result, it has been stated that SEMs are less-restrictive regression equations (Ditlevsen, Christensen, Lynch, Damsgaard, & Keiding, 2005). Byrne (2012) explains that SEMs estimate these errors whereas regressions assume that errors in the independent variable are non-existent conditional on an observed value. Mplus 7.11 (L. K. Muthén & Muthén, 2013) includes the variances for the exogenous latent variables and assumes that the exogenous variables are not associated with the residual error and that there is no covariance between the measurement errors, both of which are important assumptions for SEM (Byrne, 2012).
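The consequence of ignoring measurement error in an independent variable can be illustrated with a small simulation. The sketch below is not part of the thesis analysis: the sample size, true slope, and error variance are hypothetical values chosen for illustration, and ordinary least squares stands in for a regression that treats the observed predictor as error-free. Under these assumptions the slope estimated from the error-laden predictor shrinks toward zero by roughly the predictor's reliability (here 1 / (1 + 1) = 0.5).

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, true_slope=0.5, error_sd=1.0, seed=1):
    """Compare the slope using the true predictor with the slope using
    a predictor observed with random measurement error.

    All numeric defaults are hypothetical illustration values.
    """
    rng = random.Random(seed)
    # True (error-free) predictor with variance 1.
    true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Outcome depends on the true predictor plus residual error.
    y = [true_slope * x + rng.gauss(0.0, 1.0) for x in true_x]
    # Observed predictor = true predictor + random measurement error.
    observed_x = [x + rng.gauss(0.0, error_sd) for x in true_x]
    return ols_slope(true_x, y), ols_slope(observed_x, y)


slope_true_x, slope_observed_x = simulate_attenuation()
print(f"slope using true predictor:     {slope_true_x:.3f}")
print(f"slope using observed predictor: {slope_observed_x:.3f}")
```

With these settings the first slope should be close to the true value of 0.5 and the second close to 0.25, which is the kind of bias that modelling the measurement error explicitly is intended to avoid.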
Generally, SEMs have been described as a series of regression equations (Multivariate Data Analysis, 2010). Byrne (2012) explains that each equation summarizes the impact of all variables (latent and observed) on one variable. As a result, the coefficients calculated for one-way directional arrows can be interpreted as regression coefficients, and the coefficients calculated for two-way non-directional arrows as correlation coefficients (Gallion & Scheperle, 2008). These correlation and regression coefficients comprise two of the three types of model parameters. The third type is the variances of the exogenous variables (MacCallum, 1995). To estimate these parameters accurately, a somewhat arbitrary sample size of ten participants per parameter has been recommended, but a ratio of five participants per parameter has also been suggested as adequate (Bentler & Chou, 1987). This thesis has a sufficient sample size (n=405) to test the specified models. The part of the model that examines the relationships between latent variables or between latent and observed variables (excluding indicator variables) is termed the structural model. To obtain the estimates, iterative methods such as maximum likelihood are used until the model converges (Hoyle, 1995). For clustered samples, Mplus 7.11 uses maximum likelihood with robust standard errors (MLR), which are calculated using a sandwich estimator (Asparouhov & Muthén, 2005).
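The participants-per-parameter guideline is simple arithmetic, and a short sketch makes the thresholds concrete. The function names below are hypothetical helpers, and the count of 50 free parameters is an illustrative assumption rather than a figure reported in this thesis; only the sample size of 405 and the 10:1 and 5:1 ratios come from the text.

```python
def participants_per_parameter(n_participants, n_free_parameters):
    """Ratio of sample size to number of freely estimated parameters."""
    if n_free_parameters <= 0:
        raise ValueError("n_free_parameters must be positive")
    return n_participants / n_free_parameters


def classify_ratio(ratio, recommended=10.0, minimum=5.0):
    """Compare a ratio with the 10:1 and 5:1 guidelines."""
    if ratio >= recommended:
        return "meets the 10:1 recommendation"
    if ratio >= minimum:
        return "between the 5:1 minimum and the 10:1 recommendation"
    return "below the 5:1 minimum"


def max_parameters(n_participants, participants_per_param):
    """Largest number of free parameters a guideline supports."""
    return n_participants // participants_per_param


n = 405  # sample size stated in the text

# Hypothetical model with 50 free parameters (illustration only).
ratio = participants_per_parameter(n, 50)
print(round(ratio, 1), classify_ratio(ratio))

# Parameter budgets implied by each guideline at n = 405.
print(max_parameters(n, 10))  # 40
print(max_parameters(n, 5))   # 81
```

Read this way, a sample of 405 supports up to 40 free parameters under the 10:1 guideline and up to 81 under the 5:1 guideline.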
An additional requirement that must be met in order to test and interpret the model is that the model be over-identified. Byrne (2012) and MacCallum (1995) explain that over-identification occurs when there are more data points than parameters to estimate, resulting in positive degrees of freedom, which allow the model to be rejected. Only in this case is the model considered meaningful. Conversely, a just-identified model perfectly matches the data (i.e., there is a unique solution for the parameter estimates); its plausibility cannot be determined because there are no degrees of freedom and the model can never be rejected (Byrne, 2012; MacCallum, 1995). This occurs when the number of data points equals the number of parameters to estimate (Byrne, 2012). If the model is under-identified (i.e., cannot be estimated), the model parameters cannot be interpreted; this occurs when the number of parameters to estimate exceeds the number of data points (Byrne, 2012; MacCallum, 1995). In an under-identified model, different estimates can define the same model; in other words, the estimates are arbitrary and cannot be evaluated due to lack of constancy (Byrne, 2012). Byrne (2012) explains that it is equivalent to trying to determine a unique value for X and Y when given only X+Y=15. Byrne (2012) and MacCallum (1995) explain that there are two necessary conditions for over-identification: establishing scales for the latent variables and ensuring that the number of unknown parameters is not larger than the number of measured variable variances and covariances (data points), both of which have been established in this thesis. The latent variable scale is automatically established in Mplus 7.11 (L. K. Muthén & Muthén, 2013) by fixing the factor loading of one of the indicator variables to one (Byrne, 2012).
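The counting rule behind identification can be sketched in a few lines. The example assumes a single-factor CFA with uncorrelated residuals, one loading fixed to one to set the latent scale, and a covariance structure only (no means or intercepts); the function names are hypothetical, and the sketch does not reproduce the exact models estimated in this thesis. With p observed variables there are p(p+1)/2 variances and covariances available as data points.

```python
def data_points(n_observed):
    """Number of distinct variances and covariances among p variables."""
    return n_observed * (n_observed + 1) // 2


def one_factor_cfa_free_parameters(n_indicators):
    """Free parameters in a single-factor CFA (covariance structure only).

    Assumes one loading is fixed to 1 and residuals are uncorrelated:
    (p - 1) free loadings + p residual variances + 1 factor variance.
    """
    return (n_indicators - 1) + n_indicators + 1


def identification_status(n_data_points, n_free_parameters):
    """Classify a model by its degrees of freedom (counting rule only)."""
    df = n_data_points - n_free_parameters
    if df > 0:
        return df, "over-identified"
    if df == 0:
        return df, "just-identified"
    return df, "under-identified"


for p in (5, 3, 2):
    points = data_points(p)
    free = one_factor_cfa_free_parameters(p)
    df, status = identification_status(points, free)
    print(f"{p} indicators: {points} data points, {free} free parameters, "
          f"df = {df} ({status})")
```

Under these assumptions a five-indicator factor has 15 data points and 10 free parameters (5 degrees of freedom), a three-indicator factor is just-identified, and a two-indicator factor on its own is under-identified. A positive count is a necessary condition only; it does not by itself guarantee that every parameter is identified.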
Since this thesis is using data collected through RDS, it is important to consider clustering and weighting the data. Stapleton (2006) describes the importance of taking clustering into consideration, explaining that SEM conventionally assumes the data were obtained from simple random sampling; therefore, analyzing clustered data without accounting for the clustering will underestimate the standard errors, may lead to improper rejection of the model, and may lead to estimates that seem to be statistically significant but are not. Weighting is important to consider when there is an unequal probability of selection, as there is when using RDS. Mplus 7.11 (L. K. Muthén & Muthén, 2013) uses pseudomaximum likelihood methods, which can be used with models that include latent variables. Therefore, this thesis will take into consideration clustering and weighting when estimating the model parameters.
Finally, moderation will be tested on the additive scale by using interaction terms formed by multiplying the moderator by each independent variable (biphobia from the gay community and biphobia from the straight community) (Klein & Moosbrugger, 2000; L. K. Muthén & Muthén, 2012). Gender identity, discrimination based on race/ethnicity, and discrimination based on ability will be tested first to determine if they are moderators. If they are not found to be moderators, they will be included in the models as potential confounders. Following this, models including the main potential moderators of interest (LGBTQ community identification and involvement, positive bisexual identity, and volunteering/advocacy/activism) will be tested while controlling for confounding.
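A product-term interaction can be sketched as follows. The sketch assumes a linear model of the form outcome = b0 + b1*exposure + b2*moderator + b3*(exposure*moderator); the function names, data values, and coefficients are hypothetical illustrations, not estimates from this thesis. Under that model the effect of the exposure at a given moderator value is b1 + b3*moderator, so a non-zero b3 indicates moderation.

```python
def product_terms(exposure, moderator):
    """Element-wise product of exposure and moderator scores."""
    if len(exposure) != len(moderator):
        raise ValueError("exposure and moderator must have equal length")
    return [x * m for x, m in zip(exposure, moderator)]


def simple_slope(b_exposure, b_interaction, moderator_value):
    """Effect of the exposure on the outcome at a given moderator value."""
    return b_exposure + b_interaction * moderator_value


# Hypothetical scores for three participants (illustration only).
exposure_scores = [1.0, 2.0, 3.0]
moderator_scores = [0.0, 1.0, 1.0]
print(product_terms(exposure_scores, moderator_scores))  # [0.0, 2.0, 3.0]

# Hypothetical coefficients: b1 = 0.30 (exposure), b3 = 0.10 (interaction).
for m in (0, 1):
    slope = simple_slope(0.30, 0.10, m)
    print(f"moderator = {m}: exposure effect = {slope:.2f}")
```

The product term is entered alongside the exposure and the moderator themselves, so that b3 captures only the part of the exposure effect that changes with the moderator.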
CHAPTER 5: RESULTS