
For our first simulation study we choose to revisit Dataset 2 introduced in Chapter 2. Recall that this dataset contains n = 50 rankings, the first 40 of which are informative rankings and the remaining 10 (labelled 41–50) are uninformative/random permutations. Each ranking within this dataset is a complete ranking of K = 20 entities.

Before we can perform Bayesian inference we must first choose a suitable prior distribution. As in our previous analyses of these data we let each ordering of the entities be equally likely a priori, that is, we let ak = 1 for all k, with the resulting prior distribution over the skill parameters being λsk ∼ Ga(1, 1) independently. Specifying a (prior) choice for the concentration parameter of the Dirichlet process is somewhat difficult and so we place a prior distribution over α. We choose aα = bα = 1 so that α ∼ Ga(1, 1), which gives a fairly weak prior distribution over the number of ranker clusters. Note that here the modal prior number of ranker clusters is 1 (with probability 0.19), which seems reasonable given the nature of this dataset; see Table 3.1 for the full prior distribution over the number of clusters.

We also need to choose prior probabilities that each ranker is informative, that is, specify pi = Pr(wi = 1) for each ranker i. Here we consider three analyses, each defined by particular choices of the pi. Analyses 1 and 2 take the equivalent specification to those studies considered within Section 2.7, that is, for Analysis 1 we let pi = 0.5 (each ranker is as likely to be informative as uninformative) and in Analysis 2 we take pi = 0.8 (the true proportion within these data). For the final analysis (Analysis 3) we assume the standard Plackett–Luce model, which is achieved by taking pi = 1. This choice is used to assess how robust our analysis is to assuming all rankers are informative when, in fact, there are uninformative rankers in the dataset. Intuitively we might think that the Dirichlet process mixture model would be flexible enough to cluster together the informative rankings and form a separate cluster to house the uninformative rankings. However, as we shall see, this turns out not to be the case. Analyses 1 and 2 allow us to compare the performance of our DP mixture model with that of the (single component) homogeneous model considered in Chapter 2.
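The prior distribution over the number of ranker clusters quoted above can be checked numerically. The following is a minimal sketch rather than the code used for the thesis: it assumes the standard Dirichlet process result Pr(N = k | α) ∝ |s(n, k)| α^k, where |s(n, k)| is an unsigned Stirling number of the first kind, and integrates over α ∼ Ga(1, 1) with a simple trapezoidal rule. The function name, the grid limit alpha_max and the number of grid steps are illustrative choices.

```python
import math

def prior_num_clusters(n, alpha_max=40.0, steps=4000):
    """Prior Pr(N = k), k = 1..n, for a DP with alpha ~ Ga(1, 1) (assumed form)."""
    # Coefficients of prod_{j=1}^{n-1} (alpha + j), lowest power first.
    # coef[k - 1] equals the unsigned Stirling number |s(n, k)|.
    coef = [1.0]
    for j in range(1, n):
        new = [0.0] * (len(coef) + 1)
        for m, c in enumerate(coef):
            new[m] += c * j       # contribution from the constant term j
            new[m + 1] += c       # contribution from the alpha term
        coef = new

    h = alpha_max / steps
    probs = [0.0] * n
    total_weight = 0.0
    for s in range(steps + 1):
        alpha = s * h
        # Trapezoidal weight times the Ga(1, 1) density exp(-alpha).
        weight = math.exp(-alpha) * (0.5 if s in (0, steps) else 1.0)
        # Conditional distribution Pr(N = k | alpha), normalised numerically.
        terms, power = [], 1.0
        for c in coef:
            terms.append(c * power)
            power *= alpha
        norm = sum(terms)
        for k, t in enumerate(terms):
            probs[k] += weight * t / norm
        total_weight += weight
    return [p / total_weight for p in probs]

# n = 50 rankers, as in Dataset 2.
prior = prior_num_clusters(50)
modal_k = max(range(len(prior)), key=prior.__getitem__) + 1
```

Under these assumptions the mode is at a single cluster, with prior[0] close to the value 0.19 reported in the text.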

Posterior analysis

To generate realisations from the posterior distribution (for each analysis) we implemented the sampling algorithm outlined in Section 3.5.5 with m = 2 auxiliary variables. Each Markov chain was initialised at a random draw from the prior distribution. To obtain 10K (almost) un-autocorrelated realisations from the posterior distribution we needed to thin the output by factors of 60, 20 and 5 for Analyses 1–3 respectively. We therefore ran the scheme for 600K, 200K and 50K iterations for each respective analysis and also allowed each chain a burn-in period of 10K iterations after initialisation – these samples were discarded. The computational time required to perform inference was (approximately) 126, 33 and 11 seconds for each analysis. The mixing of the MCMC chains was assessed by inspecting trace plots of the log complete data likelihood; see Figure 3.3. This is convenient not only because our state space is vast but also because the dimension of the posterior distribution can change at each iteration (depending upon the number of unique ranker groups). It is therefore not realistic to inspect trace plots of individual parameters within the Markov chain, particularly as cluster labels can swap arbitrarily. Convergence was assessed by initialising numerous chains at differing starting values and verifying that the resulting posterior distributions were equivalent (up to stochastic noise).
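The run lengths quoted above follow from the thinning factors: retaining 10K draws at thinning factors of 60, 20 and 5 requires 600K, 200K and 50K iterations. A minimal sketch of the bookkeeping is given below; it assumes that the 10K burn-in iterations are run in addition to the quoted run lengths, and the function name thin_chain is purely illustrative.

```python
def thin_chain(draws, burn_in, thin):
    """Discard the first `burn_in` draws, then keep every `thin`-th draw."""
    return draws[burn_in + thin - 1::thin]

BURN_IN = 10_000
RUN_LENGTHS = {1: 600_000, 2: 200_000, 3: 50_000}  # iterations after burn-in
THIN = {1: 60, 2: 20, 3: 5}                        # thinning factor per analysis

# A range of iteration indices stands in for the stored MCMC output.
retained = {
    analysis: len(thin_chain(range(BURN_IN + RUN_LENGTHS[analysis]),
                             BURN_IN, THIN[analysis]))
    for analysis in RUN_LENGTHS
}
```

Each entry of retained is the number of realisations kept for the corresponding analysis.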

Figure 3.3: Trace plots of the log complete data likelihood (vertical axis) against iteration (horizontal axis) for Analyses 1 and 2: pi = 0.5, 0.8 (top left and right respectively) and Analysis 3: pi = 1 (bottom).

We begin by determining the posterior distribution formed under Analysis 3 (pi = 1), that is, assuming a Dirichlet process mixture of standard Plackett–Luce models. Our intuition a priori led us to believe that our mixture model (outlined in Section 3.5.1) might allow the formation of a cluster housing the informative rankers and a separate cluster to house the uninformative rankers. The marginal posterior distribution for the number of ranker groups (Table 3.1) gives Pr(Nr = 2|D) = 0.01 and so there is little posterior support for this suggestion. Perhaps surprisingly the posterior modal number of ranker groups here is six. However, we note that there is a large amount of uncertainty on this. In contrast, for Analyses 1 and 2, we see significant posterior support for a single homogeneous group with Pr(Nr = 1|D) = 0.86 and 0.75 respectively. Clearly allowing for uncertainty on ranker reliability results in a significant change in posterior beliefs about the ranker groups contained within these data; this is a feature of the model which will be discussed in more detail later.

                                    i
Analysis      1      2      3      4      5      6      7      8      9    ≥10       E     SD
1          0.86*  0.12   0.02   0.00   0.00   0.00   0.00   0.00   0.00   0.00    1.16   0.42
2          0.75*  0.19   0.05   0.01   0.00   0.00   0.00   0.00   0.00   0.00    1.32   0.62
3          0.00   0.01   0.04   0.11   0.20   0.24*  0.20   0.12   0.06   0.02    6.12   1.68
Prior      0.19*  0.17   0.15   0.12   0.10   0.07   0.06   0.04   0.03   0.07    4.18   3.00

Table 3.1: Posterior probabilities of the number of ranker clusters, Pr(Nr = i|D), for each of the three analyses. The expectation (E) and standard deviation (SD) of the marginal posterior distribution are also shown along with the prior distribution. The modal values are marked with an asterisk.
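The expectation and standard deviation columns of Table 3.1 can be recovered, approximately, from the tabulated probabilities. The sketch below is illustrative only: it works with the rounded two-decimal values and treats the "≥ 10" column as exactly 10, so it will slightly understate the spread whenever that column carries mass (as it does for Analysis 3).

```python
import math

def summarise(probs):
    """Mean and standard deviation of a distribution on 1, 2, ..., len(probs)."""
    support = range(1, len(probs) + 1)
    mean = sum(k * p for k, p in zip(support, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(support, probs))
    return mean, math.sqrt(var)

# Rounded probabilities copied from Table 3.1 (last entry is the ">= 10" column).
analysis_1 = [0.86, 0.12, 0.02, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
analysis_3 = [0.00, 0.01, 0.04, 0.11, 0.20, 0.24, 0.20, 0.12, 0.06, 0.02]

mean_1, sd_1 = summarise(analysis_1)
mean_3, sd_3 = summarise(analysis_3)
```

The means agree with the tabulated expectations of 1.16 and 6.12; the standard deviation for Analysis 3 comes out a little below the tabulated 1.68 because of the truncation at 10.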

The marginal posterior distribution for the number of ranker clusters provides a useful insight into the posterior distribution; however, it does not tell the full story. A more in-depth summary of the posterior distribution can be obtained if we consider the underlying grouping structure of the rankers. The posterior distribution of the allocation of rankers to ranker groups is, of course, quite complex. A common way to summarise ranker heterogeneity is through a single summary allocation to each ranker group, such as the maximum a posteriori (MAP) allocation or the improvements to the MAP allocation proposed by Dahl (2006) and Lau and Green (2007). However, these summaries can be misleading unless the posterior probability of the modal number of groups is fairly large. Note that for the Analysis 3 posterior distribution this is certainly not the case. Instead we prefer to summarise ranker heterogeneity using dissimilarity probabilities ∆ij = Pr(cri ≠ crj|D), that is, the posterior probability that two rankers (i and j) are not allocated to the same cluster. The allocation of rankers to groups could then be determined by thresholding these probabilities. However, this too can suffer from inconsistent allocations of, say, ranker triples, particularly when their dissimilarity probabilities are near the threshold. Therefore, following Medvedovic and Sivaganesan (2002), we use a standard summary method from cluster analysis, namely a dendrogram calculated from the dissimilarity probabilities ∆ij. Note that we consider dendrograms formed using the complete linkage method, also known as furthest neighbour clustering. This method tends to produce more densely packed clusters and does not suffer from “chaining”; see Everitt et al. (2011) for further details on linkage methods. Of course many other methods could also be used to summarise the heterogeneity between rankers; see, for example, Rastelli and Friel (2017) and the references therein.

Figure 3.4: Complete linkage dendrograms based on the dissimilarities between each pair of rankers for Analyses 1–3 from top to bottom respectively.
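To make the dissimilarity summary concrete, the sketch below estimates ∆ij from a set of sampled allocation vectors and then performs a naive complete linkage agglomeration. It is an illustration under stated assumptions, not the implementation behind Figure 3.4: the toy allocation draws are invented, the function names are hypothetical, and in practice a library routine for hierarchical clustering would normally be used.

```python
from itertools import combinations

def dissimilarity(allocations):
    """Estimate Delta_ij as the fraction of draws in which i and j differ.

    `allocations` is a list of draws; each draw holds one cluster label per ranker.
    """
    n = len(allocations[0])
    delta = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        diff = sum(draw[i] != draw[j] for draw in allocations) / len(allocations)
        delta[i][j] = delta[j][i] = diff
    return delta

def complete_linkage(delta):
    """Naive agglomeration; returns the merges as (cluster_a, cluster_b, height)."""
    def distance(a, b):
        # Complete linkage: the largest pairwise dissimilarity between clusters.
        return max(delta[i][j] for i in a for j in b)

    clusters = [(i,) for i in range(len(delta))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        merges.append((a, b, distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Four toy draws for four rankers; labels are arbitrary within each draw.
draws = [[0, 0, 0, 1], [0, 0, 1, 2], [0, 0, 0, 0], [1, 1, 1, 0]]
delta = dissimilarity(draws)
merges = complete_linkage(delta)
heights = [height for _, _, height in merges]
```

Because ∆ij depends only on whether two labels agree within a draw, the summary is unaffected by arbitrary relabelling of clusters between iterations, which is one reason it is convenient here.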

Figure 3.4 depicts the dendrograms computed from the dissimilarity matrices for each of the analyses considered. The allocation of rankers to ranker groups for Analyses 1 and 2 is somewhat trivial given Pr(Nr = 1|D) = 0.86 and 0.75 respectively. The corresponding dendrograms confirm that all rankers are often grouped together, as is evident from the low values of dissimilarity at which rankers join the main cluster. However, it is encouraging to see that the uninformative rankers (with the exception of ranker 42) are last to join the main cluster. For Analysis 3 the allocation of rankers to groups is not quite as straightforward. The corresponding dendrogram shows that there is a large cluster containing those rankers numbered {1, 2, ..., 40, 42}. This conclusion can be drawn since ∆ij ≤ 0.30 ⟹ (1 − ∆ij) ≥ 0.70 for i ≠ j ∈ {1, 2, ..., 40, 42}, that is, any pair of rankers within this set are clustered together at least 70% of the time. Given the large proportion of the time that these rankers are co-clustered, it is reasonable to conclude that they have similar beliefs about the entities. The remaining rankers, those numbered 41, 43, ..., 50, typically have a dissimilarity greater than 0.5. It is clear from the left-hand side of the dendrogram that there is no clear grouping structure between any of these rankers. This is perhaps not surprising given that their associated rankings are random permutations and are therefore likely to express contradictory preferences.

We now return to the point we noted earlier, namely that allowing for uncertainty on ranker ability changes posterior beliefs about the number of ranker groups. After investigating the posterior distribution for each analysis, perhaps this result is not as surprising as it first seems. In Analysis 3 the standard Plackett–Luce model does not have the flexibility to down-weight the contribution uninformative rankers make to the overall likelihood, and this leads instead to the formation of additional clusters to house those rankers which are not consistent with others (the uninformative rankings); see Figure 3.4. Moreover, these rankers do not even form a single homogeneous cluster due to the high variation in random permutations (as mentioned previously). On the other hand, in Analyses 1 and 2 (mixture of Weighted Plackett–Luce models) the model is able to down-weight the uninformative rankers; see Figure 3.5. Recall that when wi = 0 the likelihood of ranking i is constant, Pr(Xi = xi|λ, wi = 0) = 1/P(Ki, ni), and does not depend on λ. Thus a ranker who is deemed to be uninformative is free to join a cluster regardless of their beliefs about the entities as the likelihood is unaffected. Indeed, such a ranker will typically join the largest “active” cluster; this follows from the rich get richer notion underpinning the Dirichlet process (as mentioned in Section 3.3.1). Consequently it is not surprising that the uninformative rankings (41–50) join the main cluster, that is, the cluster housing the informative rankers under Analyses 1 and 2.
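The constancy of the likelihood when wi = 0 is easy to verify numerically. The sketch below assumes a Weighted Plackett–Luce likelihood in which each skill parameter is raised to the power wi at every stage of the ranking; this functional form is an assumption made for illustration (only the wi = 0 case is stated in this section), and the function name and toy skill values are hypothetical. P(K, n) is taken to be the number of ordered selections of n entities from K.

```python
import math

def wpl_likelihood(ranking, skills, w):
    """Probability of a (possibly top-n) ranking under an assumed WPL form.

    `ranking` lists the ranked entities, best first; `skills` maps every one
    of the K entities to its skill parameter; `w` is the ranker weight (0 or 1).
    """
    remaining = set(skills)
    prob = 1.0
    for entity in ranking:
        denom = sum(skills[e] ** w for e in remaining)
        prob *= skills[entity] ** w / denom
        remaining.remove(entity)
    return prob

skills_a = {1: 20.0, 2: 5.0, 3: 1.0, 4: 0.5}   # toy values, K = 4
skills_b = {1: 0.1, 2: 0.2, 3: 30.0, 4: 7.0}   # very different beliefs
top_two = [1, 2]                               # a top-2 ranking, n = 2

uninformative_a = wpl_likelihood(top_two, skills_a, w=0)
uninformative_b = wpl_likelihood(top_two, skills_b, w=0)
informative_a = wpl_likelihood(top_two, skills_a, w=1)
informative_b = wpl_likelihood(top_two, skills_b, w=1)
```

With w = 0 both calls return 1/P(4, 2) = 1/12 whatever the skill parameters, whereas with w = 1 the likelihood depends strongly on them, which is why an uninformative ranker exerts no pull on the cluster they join.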

Figure 3.5: Pr(wi = 1|D), the posterior probability that ranking i is informative, plotted against ranking number (1–50) under each analysis (Analysis 1: pi = 0.5, Analysis 2: pi = 0.8). Rankings which are random permutations (41–50) are shown in red.

We conclude this section with a brief comparison of Analyses 1–3 and those where we assumed all rankers were homogeneous in their beliefs about the entities in Chapter 2. There are significant similarities between the posterior distributions of the ranker weights under Analyses 1 and 2 in both this section and Section 2.7; see Figures 3.5 and 2.4. This is perhaps not surprising given the ranker weights are not cluster-specific and we have significant posterior support for a single ranker cluster in these analyses. Note that when there is only a single ranker cluster, the analyses presented here are analogous to those in Section 2.7. The aggregate rankings formed under Analyses 1 and 2 here are very similar to those under the corresponding homogeneous analyses considered in Section 2.7; see Tables 2.3 and 3.2. Note that here the aggregate ranking is determined by ordering the entities by the mean of the (fully) marginal posterior distribution for each entity (marginalised over ranker clusters). Recall that, for the homogeneous analyses in Chapter 2, we observed that the aggregate rankings for the analysis of Dataset 2 under the Weighted Plackett–Luce model were equivalent to those formed by analysing Dataset 1 under the standard Plackett–Luce model. This is the case as the WPL model is able to correctly identify the uninformative rankers and down-weight them. The same conclusion can be drawn here; see Figure 3.5 and Table 3.2. Unsurprisingly, for Analysis 3 (pi = 1; the standard Plackett–Luce model), the aggregate ranking is affected by the misleading prior information that states that the uninformative rankers are informative. This was also observed when considering the homogeneous analysis under the standard Plackett–Luce model; see Section 2.4.
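The aggregate ranking described above amounts to sorting the entities by their marginal posterior means. A minimal sketch follows, using the posterior means of the five top-ranked entities reported in Table 3.2 for Analyses 1 and 3; the function name is hypothetical and the restriction to five entities is purely for brevity.

```python
def aggregate_ranking(posterior_means):
    """Order entities from largest to smallest marginal posterior mean."""
    return sorted(posterior_means, key=posterior_means.get, reverse=True)

# Entity -> posterior mean, copied from Table 3.2 (top five entities only).
analysis_1_means = {1: 26.27, 2: 24.94, 3: 30.91, 4: 20.73, 5: 25.73}
analysis_3_means = {1: 6.36, 2: 6.41, 3: 7.20, 4: 5.21, 5: 6.10}

ranking_1 = aggregate_ranking(analysis_1_means)
ranking_3 = aggregate_ranking(analysis_3_means)
```

The two orderings differ in the positions of entities 1, 2 and 5, reflecting the effect on Analysis 3 of treating the uninformative rankers as informative.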

The posterior distribution from Analysis 3 clearly suggests that there is significant heterogeneity between rankers' beliefs; see Table 3.1. Summarising such heterogeneous data through an overall aggregate ranking is perhaps not sensible. The differences in preferences between the ranker groups are easily seen through the within-cluster aggregate rankings. Such an aggregate is formed by first conditioning on an appropriate number of ranker groups and then ordering the marginal posterior means of the skill parameters within each group. For Analysis 3, conditioning on 6 ranker groups (the posterior mode), the within-cluster aggregate ranking for ranker cluster 1 (that which typically houses informative rankers) is very similar to the overall aggregate under the other analyses. The remaining within-cluster aggregates (those for ranker clusters 2–6) show little coherence with the true entity preference order and instead appear to be random permutations of the entities. This is perhaps not surprising as these clusters typically house the uninformative rankers.

                          Heterogeneous WPL                          Homogeneous PL
                pi = 0.5       pi = 0.8       pi = 1
               Dataset 2      Dataset 2      Dataset 2      Dataset 1      Dataset 2
  x      λ     x̂agg     λ̄     x̂agg     λ̄     x̂agg     λ̄    x̂agg1    λ̄1    x̂agg2    λ̄2
  1  20.00       3   30.91      3   25.94      3    7.20      3   27.47      3   11.67
  2  19.00       1   26.27      1   22.70      2    6.41      1   25.83      1   11.44
  3  18.00       5   25.73      5   22.28      1    6.36      5   23.90      2   11.29
  4  17.00       2   24.94      2   21.59      5    6.10      2   22.66      4    9.84
  5  16.00       4   20.73      4   17.58      4    5.21      4   18.11      9    9.54
  6  15.00       6   18.97      6   16.41      9    5.06      6   17.98      5    9.41
  7  14.00       8   18.72      8   16.35      8    5.02      8   17.31      8    9.11
  8  13.00       7   16.94      7   14.85      6    4.82      7   16.12      6    8.55
  9  12.00       9   16.10      9   14.59      7    4.35      9   15.91      7    8.10
 10  11.00      10   14.78     10   13.47     10    4.14     10   14.50     10    7.48
 11  10.00      11   12.14     11   10.58     11    3.38     11   11.84     11    6.36
 12   9.00      12   10.20     12    9.20     12    2.86     12    9.95     12    5.42
 13   8.00      13    8.68     13    7.74     13    2.72     13    9.00     13    5.02
 14   7.00      14    7.28     14    6.60     16    2.25     14    7.17     14    4.17
 15   6.00      16    7.28     16    6.48     14    2.19     16    7.08     16    3.98
 16   5.00      15    5.17     15    4.77     15    1.90     15    4.99     15    3.47
 17   4.00      17    4.57     17    4.21     17    1.61     17    4.58     17    3.10
 18   3.00      18    2.85     18    2.67     19    1.57     18    2.81     19    2.16
 19   2.00      19    2.62     19    2.59     18    1.34     19    2.68     18    2.08
 20   1.00      20    1.00     20    1.00     20    1.00     20    1.00     20    1.00

Table 3.2: Aggregate rankings under the infinite mixture of Weighted Plackett–Luce model for the analysis of Dataset 2 (for Analyses 1–3; pi = 0.5, 0.8, 1) along with the corresponding posterior means. The results from Table 2.2 (homogeneous standard Plackett–Luce analyses) are also given to facilitate comparison.
