Because the IAT is a procedural paradigm for measuring implicit cognition rather than a single measure of a specific construct, there is no single incarnation of the IAT to be validated. Because the IAT can be adapted to measure different constructs (e.g., racial attitudes, gender attitudes, food preferences), two IATs may have little in common other than the basic structure of the task. Although the IAT paradigm has produced many reliable and valid tasks, this does not mean that any single IAT is necessarily a good measure of its target construct. Given that the present study aims to develop a new IAT to assess individual differences in stereotypes of empathy in scientists, it is especially important to evaluate whether the newly developed IAT meets the relevant psychometric criteria. The following sections review the psychometric properties most commonly examined for the IAT in general.

3.3.2.1. Reliability

Reliability concerns the repeatability or reproducibility of an assessment. To be useful, a measure needs to be consistent, producing more or less the same result for a person each time it is used (Coaley, 2010). High reliability means that a measure gives similar results on different occasions.

3.3.2.1.1. Internal consistency

Because the IAT is based on response latencies measured in milliseconds, error variance is easily introduced: a sneeze, a car horn, or even an eyeblink can add unwanted variance in response latency (Lane et al., 2007). Indeed, evidence has shown that the internal consistency of measures based on reaction time is generally lower than that of measures based on self-report (Buchner & Wippich, 2000). Even so, IATs have rapidly gained popularity, partly because of their superior reliability over other latency-based implicit measures such as the Go/No-Go Association Task (average split-half reliability r = .20; Nosek & Banaji, 2001) or the priming method (e.g., split-half reliability r = .06; Bosson, Swann, & Pennebaker, 2000). The split-half method tests internal consistency by creating two parallel subtests from the items within one test, computing a composite score for each subtest, and correlating the two composite scores (Furr & Bacharach, 2008). As a rule of thumb, an estimate of .60 to .70 indicates acceptable reliability, and .80 or higher indicates good reliability. The split-half reliability of existing IAT measures tends to range from .70 to .90 (Schmukle & Egloff, 2005).
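The split-half procedure described above can be illustrated with a minimal Python sketch. The function names and the data are hypothetical, the odd/even split is only one of several possible ways to form the two halves, and the Spearman-Brown step-up (2r / (1 + r)) is a common addition that is not part of the description above.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(scores_by_person):
    """Return (half-test correlation, Spearman-Brown corrected estimate).

    scores_by_person: one list of item (or trial) scores per participant.
    Assumption: items are split into odd- and even-numbered positions.
    """
    # Composite score for each half, per participant.
    first_half = [mean(scores[0::2]) for scores in scores_by_person]
    second_half = [mean(scores[1::2]) for scores in scores_by_person]
    r = pearson_r(first_half, second_half)
    # Spearman-Brown step-up to estimate full-length reliability.
    return r, 2 * r / (1 + r)


# Hypothetical data: four participants, two items each.
data = [[1, 2], [2, 1], [3, 4], [4, 3]]
r_half, r_full = split_half_reliability(data)
print(round(r_half, 2), round(r_full, 2))
```

The corrected value is undefined when the two halves correlate at exactly -1, a case the sketch does not guard against.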

The split-half approach treats the two halves of a test as parallel subtests and bases the reliability of the complete test on the association between them. From the "item-level" perspective, Cronbach's alpha takes the logic of internal consistency a step further by conceiving of each item as a subtest. Consequently, the associations among the items can be used to estimate the reliability of the complete test (Furr & Bacharach, 2008). In one study examining the internal consistency of a number of implicit measures, the IAT showed satisfactory internal consistency (Cronbach's alpha = .78), which is relatively rare for other implicit measures such as the evaluative priming task (Cunningham, Preacher, & Banaji, 2001). A meta-analysis of 50 studies using IATs also reported an acceptable average internal consistency, with Cronbach's alpha = .79 (Hofmann et al., 2005). Based on these findings, the newly developed IAT in the present study is expected to show satisfactory internal consistency, at or above the rule-of-thumb value of .70.
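As an illustration of the item-level logic, Cronbach's alpha can be computed from the item variances and the variance of the total score using the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The sketch below is a minimal one with hypothetical data and a hypothetical function name; it is not the scoring code used in any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(scores_by_person):
    """Cronbach's alpha for a participants-by-items table of scores.

    scores_by_person: one list of item scores per participant,
    all lists having the same length k (k >= 2).
    """
    k = len(scores_by_person[0])
    # Transpose so that each tuple holds one item's scores across participants.
    items = list(zip(*scores_by_person))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores_by_person])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four participants, two items each.
data = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(data), 2))
```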

3.3.2.1.2. Test-retest reliability

In contrast, the test-retest reliability (i.e., the consistency of measurement at different time points; Furr & Bacharach, 2008) of the IAT has been found to be less satisfactory, ranging from .25 to .69 with a mean estimate of about .50 (Lane et al., 2007). When the same measure is used in the same population at two different time points, it is expected to produce similar results, which should correlate strongly and positively with each other. For example, a measure producing exactly the same output on every occasion would show perfect test-retest reliability of r = 1. For self-report questionnaires, a satisfactory test-retest value is usually above .70.

Although the IAT shows somewhat low test-retest reliability compared to self-report measures, it is important to note that other implicit measures also demonstrate relatively weak test-retest reliability compared to explicit measures, and it is not uncommon for implicit measures to show test-retest reliability below .50 (Bosson et al., 2000). The IAT even showed a higher average test-retest reliability (.69) than other implicit measures, which ranged from -.05 (e.g., Stroop task) to .63 (e.g., Initials birthday preference task) and averaged .03 (Bosson et al., 2000). Given that the IAT usually shows low test-retest reliability, and given the limited time scope of a PhD project, the test-retest reliability of the newly developed IAT is not examined in the present study.

3.3.2.2. Validity

Construct validity concerns the evidence which shows that the test really is a measure of what it claims to measure (Coaley, 2010). In statistical terms, construct validity represents the extent to which a measure's variance is linked with the variance of its underlying construct (Barrett, Phillips, & Alexander, 1981). There is no single correlation for this, and validation requires a review of a range of correlations and of whether they match what would be expected. For example, the simplest method is to examine the correlation of scores with other accepted measures of the same thing (Coaley, 2010). Therefore, construct validity depends on all the evidence gathered to show that a test does relate to its construct. The following sections review a series of validity criteria as well as construct-unrelated variance that may threaten the validity of the IAT.

3.3.2.2.1. Known-groups validity

A valid measure needs to be able to reliably differentiate between members of different groups, based on prior knowledge or predictions about them. It is important for an IAT to successfully discriminate between groups in order to be regarded as a measure of personal attitudes rather than of shared stereotypes embedded in the culture that one lives in (Nosek, Greenwald, et al., 2007). For example, consistent with prevailing gender role expectations, women were found to implicitly associate the self with arts rather than math more strongly than men did (Nosek et al., 2002b). Such a finding can be interpreted as evidence that the IAT is able to predict known group differences.

Moreover, existing evidence has shown that the IAT is sensitive to more subtle differences in the societal evaluation of different groups despite the ambiguity of ingroup preference. Using the IAT method, African Americans showed reduced ingroup preference as compared to European Americans (Ashburn-Nardo et al., 2003), and overweight and poor people even showed an outgroup preference instead (Rudman, Feinberg, & Fairchild, 2002). Successful discrimination between group members even extends to groups that are defined by behaviours rather than demographics. For example, smokers showed more positive implicit attitudes toward, and stronger identification with, smoking than non-smokers (Houwer, Custers, & Clercq, 2006). In the present study, it is hypothesized that the IAT can differentiate between science majors and humanities majors by showing different implicit attitudes toward empathy in scientists.

3.3.2.2.2. Relationship with explicit measures

Convergent-discriminant validity is one subordinate type of validity, based on the idea that a measure should correlate with other measures of the same construct but not with measures of other constructs (Campbell & Fiske, 1959). One important criterion for evaluating the convergent-discriminant validity of the IAT is its correlation with explicit measures, in other words, self-report measures (Greenwald, Nosek, & Banaji, 2003; Nosek, Greenwald, et al., 2007). The relationship between implicit and explicit attitudes has received a great deal of attention, which has produced mixed evidence on the originally proposed question: "Do implicit and explicit attitudes relate to one another?"

Some of the initial research efforts with the IAT emphasized the distinctiveness of implicit and explicit cognitions, finding weak relations between IAT and self-report measures (Greenwald et al., 1998). However, as research on the relationship between implicit and explicit measures has accumulated, more recent studies have produced mixed evidence regarding implicit-explicit correlations. As reported by Lane et al. (2007), across 17 IATs that were available on public websites, correlations between implicit and explicit measures ranged from r = .13 to r = .75 (median r = .22). Laboratory studies have shown similar variability, with some studies revealing slight or moderate (but generally positive) correlations between IAT and self-reports of the same construct (Bosson et al., 2000; Egloff & Schmukle, 2002), and other studies showing strong and robust correlations between IAT and self-reports (e.g., Cunningham et al., 2001; Jellison, McConnell, & Gabriel, 2004; McConnell & Leibold, 2001). A meta-analysis of IAT studies found that across 126 studies, implicit-explicit correspondence ranged from r = -.25 to r = .60, with an average implicit-explicit correlation of .19 (Hofmann et al., 2005).

Even when IATs and explicit measures do correlate, evidence shows that implicit and explicit attitudes are still distinct constructs (Wilson et al., 2000). Using structural equation modelling, across 57 different pairs of attitude objects, Nosek and Smyth (2007) found that implicit and explicit attitudes were better fit by a model in which they loaded onto two separate factors, rather than a single, latent factor even when implicit and explicit attitudes were highly correlated with one another.

Given that the extent to which implicit and explicit attitudes are correlated varies widely across studies, the more appropriate question for future research to answer is "Under what conditions, and for what kind of people, are implicit and explicit measures related?" (Olson & Fazio, 2004). The search for the convergent-discriminant validity of the IAT may also be more successful when large samples and advanced statistical techniques, such as meta-analysis or latent variable modelling, are used. The present study nevertheless examines the relationship between implicit and explicit measures of the stereotypes of empathy in scientists. Following the idea that implicit and explicit stereotypes are distinct constructs, we hypothesize that the implicitly measured stereotypes of empathy in scientists will correlate only weakly with the explicitly measured self-report stereotypes. However, it is important to bear in mind that simple correlations between one IAT and self-report measures should not be interpreted as robust evidence for the convergent-discriminant validity of the IAT.

3.3.2.2.3. Predictive validity

Given that a long-standing interest of psychologists is to understand how attitudes predict behaviour, it is not surprising that much research effort has been devoted to examining whether attitudes captured by IATs are related to meaningful behaviours. The ability to successfully predict behaviours is also an important psychometric property of a measure. As presented before, researchers have proposed three types of theoretical models of implicit and explicit stereotypes (i.e., double-dissociation, additive and interactive; see Section 3.2.1.3), and each model entails a distinct hypothesis about the predictive validity of the IAT.

According to the double-dissociation model, IATs and self-report measures should predict spontaneous and controlled behaviours respectively. Indeed, there is evidence showing that IATs are capable of successfully predicting less controlled behaviours when participants are under a high cognitive load (Hofmann, Friese, & Strack, 2009) or under the influence of alcohol (Hofmann & Friese, 2008). Furthermore, Greenwald, Poehlman, Uhlmann and Banaji (2009) meta-analysed 184 independent samples and found that both implicit and explicit stereotypes were related to a range of behaviours, but implicit bias measured by IATs was superior to explicit bias measured by self-reports in predicting physiological responses and non-verbal discriminant behaviours, whereas explicit measures were superior to IATs in predicting more deliberate behaviours such as political candidate choices and brand preferences.

As suggested by the additive model, the IAT and the explicit measures should be seen as different measures of the same attitudes. From this point of view, attitudes can be compared to icebergs, with explicit attitudes residing above the surface of conscious control and implicit attitudes residing below it (Karpinski & Hilton, 2001). If we follow this assumption of a single cognitive system but two different measures, both implicit and explicit measures of the same attitudes should provide a distinctive prediction of behaviours. However, no empirical evidence to date has been found supporting this hypothesis (Perugini, 2005).

According to the interactive model, implicit and explicit processes should work together in a multiplicative way to influence behaviours, possibly when individuals feel ambivalent towards something. For example, a study by Frost, Ko, and James (2007) revealed that individuals who were most likely to be passive aggressive when interacting with others showed high implicit aggression when completing a conditional reasoning task but scored low on a self-report aggression questionnaire. Similarly, Jordan and his colleagues (2003) found that individuals who displayed the highest level of narcissistic behaviours scored low in implicit self-esteem measured by an IAT but high in explicit self-esteem measured by a self-report questionnaire.

To sum up, existing evidence has shown that IATs can predict a range of meaningful behaviours. It is possible to articulate three predictive models that reflect the relations between implicit and explicit attitudes and behaviours. Empirical studies have found evidence for both the double-dissociation model and interactive model, but not the additive model. The present study will also examine the predictive power of the newly developed IAT for stereotypes of empathy in scientists by looking at its relationship with career aspirations in science.

3.3.2.2.4. Construct-unrelated variance

Several studies have revealed a number of construct-unrelated variables that may influence the validity of the IAT, including the order of the compatible and incompatible tasks, cognitive fluency, and prior experience with the IAT.

Firstly, the most commonly observed extraneous factor is the order of the compatible and incompatible tasks. In the IAT, the compatible task is the task in which participants are required to pair items in a stereotypical way, whereas the incompatible task is the task in which participants are required to pair items in a counter-stereotypical way. Regardless of the content of the tasks, performance on the preceding pairing task tends to interfere with performance on the subsequent pairing task. IAT effects are slightly biased toward indicating that the associations drawn upon in the first-performed task are stronger than those drawn upon in the later-performed task (Back, Schmukle, & Egloff, 2005; Klauer, 2005). This extraneous effect will be controlled in the present study by counterbalancing the task order and by adding more practice trials (see later Section 4.3.4).
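Counterbalancing of task order can be sketched as follows. This is a minimal illustration only: the function name, the condition labels, and the use of a seeded shuffle are assumptions, not the assignment procedure actually used in the present study.

```python
import random


def assign_task_order(participant_ids, seed=0):
    """Randomly assign half of the participants to each task order.

    Returns a dict mapping participant id to "compatible_first" or
    "incompatible_first". With an odd number of participants, the two
    groups differ in size by one.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return {
        pid: "compatible_first" if position < half else "incompatible_first"
        for position, pid in enumerate(ids)
    }


# Hypothetical sample of ten participants.
orders = assign_task_order(range(1, 11))
print(sorted(orders.items()))
```

Because each order is completed by about half of the sample, any advantage for the first-performed task is spread evenly across the compatible and incompatible pairings rather than systematically inflating one of them.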

The second extraneous influence is individual differences in average response latency, also referred to as cognitive fluency (i.e., the ease with which information is processed; Mierke & Klauer, 2003). Participants who generally react more slowly tend to show larger IAT effects (representing stronger implicit bias) than those who react more quickly (McFarland & Crouch, 2002). This extraneous effect can be reduced by applying an advanced scoring algorithm (Greenwald et al., 2003; see later Section 5.2.5).
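To illustrate why a standardized score can reduce the influence of overall response speed, the sketch below assumes that the scoring algorithm referred to is the D measure, in which the difference between mean latencies in the incompatible and compatible tasks is divided by the standard deviation of all latencies from both tasks. It is a deliberately simplified version: error penalties, the separate treatment of practice and test blocks, and participant exclusion rules are omitted, and the function name and data are hypothetical.

```python
from statistics import mean, stdev

# Assumption: very slow trials (above 10,000 ms) are dropped before scoring.
MAX_LATENCY_MS = 10_000


def d_score(compatible_ms, incompatible_ms):
    """Simplified D score for one participant.

    Mean latency difference (incompatible minus compatible) divided by
    the standard deviation of all retained latencies from both tasks.
    """
    compatible = [t for t in compatible_ms if t <= MAX_LATENCY_MS]
    incompatible = [t for t in incompatible_ms if t <= MAX_LATENCY_MS]
    inclusive_sd = stdev(compatible + incompatible)
    return (mean(incompatible) - mean(compatible)) / inclusive_sd


# Hypothetical participants: the second responds twice as slowly throughout.
fast = d_score([600, 700, 800], [800, 900, 1000])
slow = d_score([1200, 1400, 1600], [1600, 1800, 2000])
print(round(fast, 3), round(slow, 3))
```

In this example the raw latency difference of the slower participant is twice as large (400 ms versus 200 ms), yet the two standardized scores are equal, because the slower participant's latencies are also twice as variable.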

Moreover, existing evidence also suggests that effect magnitudes with the IAT tend to decline for participants who have prior experience taking an IAT (Greenwald & Nosek, 2001). The advanced scoring algorithm has also been proposed as an effective way to reduce the influence of this factor (Greenwald et al., 2003). The present study will apply this algorithm and test its ability to control the extraneous order variance (see later Section 5.3.6) and prior experience variance (see later Section 5.3.7).
