2.6 Empirical Studies
2.6.3 Limitations on the validity of empirical work
Having determined that the intermediary levels of the hierarchical results are unreliable, we now turn our attention to the final level. The top level of the hierarchy is the result of the modularity optimisation process and is therefore closest to the best partition of the network, at least within the modularity framework. We can assess the quality of this partition in a number of different ways. Some of these measures are aspatial and apply generally to networks, while others take account of the unique nature of spatially embedded networks.
Modularity
The modularity score itself is a measure of partition quality. This value tells us how different the partition is from a random assignment of nodes to communities. Modularity cannot exceed 1, and a value near 0 indicates a partition no better than a random assignment, so the score of 0.702 found at the final level of the hierarchy is considered high and therefore indicative of definite non-random structure. This single value is not very informative on its own, however, so we look at other measures.
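As an illustration of how such a score can be computed, the following is a minimal sketch that assumes the standard Newman-Girvan definition of modularity for an undirected, possibly weighted network. The text does not state which variant was optimised, so the function name `modularity`, the list-of-lists adjacency format and the toy network are all assumptions made for this example.

```python
def modularity(adj, communities):
    """Newman-Girvan modularity of an undirected (weighted) network.

    adj: symmetric matrix as a list of lists, adj[i][j] = tie weight.
    communities: list giving the community label of each node.
    Assumed definition: Q = (1/2m) * sum_ij [A_ij - k_i k_j / 2m] * delta(c_i, c_j).
    """
    n = len(adj)
    degree = [sum(row) for row in adj]   # k_i, the total tie weight of node i
    two_m = sum(degree)                  # 2m, every tie counted from both ends
    if two_m == 0:
        return 0.0                       # no ties: treat the score as zero
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical toy network: two triangles joined by a single tie.
toy_adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_labels = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print(modularity(toy_adj, toy_labels))   # one community per triangle
    print(modularity(toy_adj, [0] * 6))      # all nodes in a single community
```

Under this definition, placing every node in a single community gives a score of zero, which is why the score is read relative to a random baseline rather than as an absolute quantity.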
Spatial contiguity
We assess the spatial contiguity of the partition through simple visual analysis by plotting each tower at its spatial coordinates with a colour determined by its community assignment. We see that the majority of towers are adjacent to towers of the same colour with only very few spatial outliers. These are likely to be towers with very low ‘populations’ and therefore statistically noisy.
Cohesiveness
We can measure the cohesiveness of each community by calculating the fraction of in-community links for each tower of each community. The fraction of in-community links for node $i$ is given by

$$\frac{\sum_j A_{ij}\,\delta(c_i, c_j)}{\sum_j A_{ij}}$$

where $\delta(c_i, c_j)$ is 1 if $i$ and $j$ are in the same community and 0 otherwise. We calculate this value for every node (tower) and find the median to be 0.62 with a lower quartile of 0.51 and upper quartile of 0.71. We show boxplots for each community separately in Figure 6.19 to assess the variation between communities.
Figure 6.19: Proportion of intra-community links for each community.
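The per-tower calculation described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function name `in_community_fraction`, the list-of-lists adjacency format and the toy network are assumptions, and the quartiles come from Python's `statistics.quantiles`, whose default convention may differ slightly from the one behind the reported values.

```python
from statistics import median, quantiles


def in_community_fraction(adj, communities):
    """Fraction of each node's tie weight that stays inside its own community.

    adj: symmetric matrix as a list of lists, adj[i][j] = tie weight.
    communities: list giving the community label of each node.
    Returns one value per node, or None for a node with no ties at all.
    """
    fractions = []
    for i, row in enumerate(adj):
        total = sum(row)                     # denominator: sum_j A_ij
        if total == 0:
            fractions.append(None)           # fraction undefined for isolated nodes
            continue
        inside = sum(                        # numerator: sum_j A_ij * delta(c_i, c_j)
            weight for j, weight in enumerate(row)
            if communities[j] == communities[i]
        )
        fractions.append(inside / total)
    return fractions


# Hypothetical toy network: two triangles joined by a single tie.
toy_adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_labels = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    values = in_community_fraction(toy_adj, toy_labels)
    defined = [v for v in values if v is not None]
    lower, _, upper = quantiles(defined, n=4)   # quartile cut points
    print("median:", median(defined))
    print("lower quartile:", lower, "upper quartile:", upper)
```

In the toy network only the two nodes that carry the bridging tie fall below 1, which mirrors the boundary effect discussed for the real communities.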
We see significant differences in the medians and interquartile ranges between communities, with community 1 having a median of less than 0.5 whereas community 3 has a median greater than 0.8. We also see large ranges within some communities, along with a number of very low outliers. We should point out here that a value of less than 0.5 means that fewer than half of the ties from that particular tower are with other towers in the same community. We map these values in Figure 6.20 to check for any spatial patterns and find that there is indeed clear spatial autocorrelation.
We see that the nodes with the highest values are those that are densely packed at the spatial centre of the community, while those nodes closer to the community boundaries generally have lower values. In some ways this is to be expected, as the number of same-community nodes within close spatial proximity is lower for nodes close to the borders, so more of their links are with nodes outside the community. On the other hand, we expect there to be a difference in the interaction probabilities between nodes within the same community and those in other communities, even if the distance is the same. If this is not the case then the communities do not actually tell us anything useful.
Figure 6.20: Proportion of intra-community links for each tower. Community boundaries are shown with solid lines.
Interaction probability
In our next measure we test whether there is any difference in the interaction probabilities for nodes in the same community and those in separate communities, over a range of distances. As we discussed previously in the context of spatial interaction, the concept of interaction probability is not easy to define, but here again we may instead use the ratio of the number of ties to the smaller of the two tower populations. For a given distance we take the median of this ratio over all pairs and refer to it as the normalised median interaction level.
We calculate the normalised median interaction level in 5 km bins separately for intra-community and inter-community ties. We plot this for all pairs of nodes in the network (including pairs with no ties) and see in Figure 6.21 that the value for intra-community ties is consistently higher than for inter-community ties. While this appears promising, it is still a global measure, so we also plot the same values for each community separately in Figure 6.22. Once again we see the same pattern, with the intra-community ties having consistently higher values for all communities. These results suggest that the communities do capture the discontinuous effect of space and are not merely manifestations of a continuous spatial process.
Figure 6.21: Normalised median interaction levels for inter-community (black) and intra-community links (red).
Figure 6.22: Normalised median interaction levels for inter-community (black) and intra-community links (red) disaggregated by community.
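The binned measure described above can be sketched as follows. This is an illustrative reading of the definition, not the original analysis code: the function name `normalised_median_interaction`, the input layout, the use of planar coordinates in kilometres with Euclidean distance, and the toy data are all assumptions.

```python
from collections import defaultdict
from math import hypot
from statistics import median


def normalised_median_interaction(coords, populations, ties, communities, bin_km=5.0):
    """Median of ties / min(population) per distance bin, split by tie type.

    coords: list of (x, y) tower positions, assumed to be planar and in km.
    populations: list of tower 'populations'.
    ties: symmetric matrix, ties[i][j] = number of ties between towers i and j.
    communities: list giving the community label of each tower.
    Returns {"intra": {bin_start_km: median}, "inter": {bin_start_km: median}}.
    """
    ratios = {"intra": defaultdict(list), "inter": defaultdict(list)}
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):            # every unordered pair, ties or not
            smaller = min(populations[i], populations[j])
            if smaller <= 0:
                continue                     # ratio undefined for an empty tower
            dist = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            bin_start = int(dist // bin_km) * bin_km
            kind = "intra" if communities[i] == communities[j] else "inter"
            ratios[kind][bin_start].append(ties[i][j] / smaller)
    return {
        kind: {start: median(values) for start, values in sorted(bins.items())}
        for kind, bins in ratios.items()
    }


# Hypothetical toy data: two pairs of nearby towers, 6 to 8 km apart.
toy_coords = [(0, 0), (1, 0), (7, 0), (8, 0)]
toy_populations = [10, 20, 10, 40]
toy_communities = [0, 0, 1, 1]
toy_ties = [
    [0, 5, 1, 0],
    [5, 0, 2, 0],
    [1, 2, 0, 4],
    [0, 0, 4, 0],
]

if __name__ == "__main__":
    levels = normalised_median_interaction(
        toy_coords, toy_populations, toy_ties, toy_communities
    )
    for kind, bins in levels.items():
        print(kind, bins)
```

Pairs with no ties contribute a ratio of zero rather than being dropped, which matches the statement that all pairs of nodes are included. Restricting the pair loop to towers of a single community would give the disaggregated version.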
In order to show that this test is robust, we perform the same analysis on two simulated network partitions. First we use a random assignment of nodes to 14 communities (the same number found at the top level by the method). Again we calculate the normalised median interaction level for each distance bin and plot them in Figure 6.23. We can see that the levels are nearly identical for intra-community and inter-community ties in this case, as we would expect given that the community assignments are entirely random. For the second test we partition the nodes of the network using k-means on their spatial coordinates with k = 14. This produces spatially contiguous clusters which are more similar to the communities found than the random assignment but still lack any structure related to the social interactions. We see in Figures 6.24 and 6.25 that the intra-community values are sometimes higher and sometimes lower than the inter-community ones. This can be explained by the fact that some of the clusters found by spatial clustering are actually quite similar to the communities found through network partitioning, while others are radically different (Figure 6.26). In general, though, the test seems to be quite useful at identifying the difference between the soft divides of purely spatial clusters and the harder divides of social communities.
Figure 6.23: Normalised median interaction levels for inter-community (black) and intra-community links (red) for a random partition.
Figure 6.24: Normalised median interaction levels for inter-community (black) and intra-community links (red) for a spatial k-means partition.
Figure 6.25: Normalised median interaction levels for inter-community (black) and intra-community links (red) for a spatial k-means partition disaggregated by community.
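The two baseline partitions used in the robustness check can be generated along the following lines. This is a minimal sketch under stated assumptions: the text does not say which k-means implementation or initialisation was used, so the example uses plain Lloyd's algorithm with randomly sampled starting centres, and the function names, the `seed` parameter and the toy coordinates are hypothetical.

```python
import random
from math import hypot


def random_partition(n_nodes, k, seed=0):
    """Assign each node to one of k communities uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(n_nodes)]


def kmeans_partition(coords, k, iterations=100, seed=0):
    """Cluster planar coordinates with Lloyd's algorithm (assumed variant).

    coords: list of (x, y) tuples. Returns one cluster label per point.
    """
    rng = random.Random(seed)
    centres = rng.sample(coords, k)          # start from k distinct input points
    labels = None
    for _step in range(iterations):
        # Assignment step: each point joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: hypot(x - centres[c][0], y - centres[c][1]))
            for x, y in coords
        ]
        if new_labels == labels:
            break                            # assignments stable: converged
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(coords, labels) if lab == c]
            if members:                      # an empty cluster keeps its old centre
                centres[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels


# Hypothetical toy coordinates: two well-separated groups of three towers.
toy_coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

if __name__ == "__main__":
    print(random_partition(len(toy_coords), k=2))
    print(kmeans_partition(toy_coords, k=2))
```

Either label list can then stand in for the detected community assignment when recomputing the normalised median interaction level, with k = 14 to match the number of communities reported above.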