Facebook was launched at Harvard University in February 2004. It quickly spread to other universities in the Ivy League. Initially, the site only allowed users to register with a university email address. In 2006, however, it opened registration to the public. It is now among the top social network sites both in the US and worldwide. Facebook introduced the concept of networks, which represent the companies, organisations or cities that users belong to. Users can join up to two networks and may change a network only once every three months. Some networks, such as those of companies and universities, can only be joined with a valid email address from the institution.
For example, the network of the University of Southampton can only be joined if the email address ends with “soton.ac.uk”. The network was established in September 2006 and, at the time of writing, has 24,518 members. More detailed information about Facebook and the demographics of the network of the University of Southampton can be found in Chapter 6.
Figure 4.10: Degree distribution for myspace.com. Taken from [4]
Algorithm 1 Retrieve the social network of the University of Southampton on Facebook
Input: A random University user V on Facebook
Output: The social network S of the University of Southampton
ADAPTED-BREADTH-FIRST(V)
Log in to Facebook
Enqueue the root node V and mark it as visited
while the queue is not empty do
    Dequeue a node U
    Retrieve the UID of U and add U to S
    for all children of U that have not been visited do
        Mark the child as visited and enqueue it
    end for
    Sleep(10)
end while
We attempted to contact Facebook for access to the data of Facebook users in the University network, but received no reply. We therefore decided to collect the data with a crawler that imitates a normal user browsing the Facebook website, a technique known as web scraping. The procedure is shown in Algorithm 1. We randomly select a node of the network. Any node directly linked to this node is then included in our data collection. The process is iterated until all nodes that can be reached from the starting node have been added to the final sample network. We collected a sample of 15,005 people in December 2007, 19,604 in October 2008 and 22,553 in April 2009. It should be noted that users may change their privacy settings so that even other users in the same University network cannot access their list of friends. This problem, however, can sometimes be circumvented through those of their friends who do make their own friend lists visible. Some statistics of the data can be found in Figure 4.11.
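The traversal described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the crawler used for the study: `fetch_friends` is a hypothetical callback that stands in for the page-scraping step and returns `None` when a friend list is hidden by privacy settings, and the pause between requests is a parameter (set to zero here so the toy example runs instantly). The sketch also shows how a user with a hidden friend list can still acquire edges through friends whose lists are visible.

```python
import time
from collections import deque


def crawl_network(seed, fetch_friends, delay_seconds=0.0):
    """Breadth-first traversal from `seed`.

    `fetch_friends(user)` is a hypothetical callback: it returns an iterable
    of friend identifiers, or None if the friend list is not visible.
    Returns an undirected adjacency map {user: set of friends}.
    """
    adjacency = {seed: set()}
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        friends = fetch_friends(user)
        if friends is None:
            # Hidden friend list: the user stays in the sample and may still
            # gain edges when visible friends list them.
            continue
        for friend in friends:
            if friend not in adjacency:
                adjacency[friend] = set()
                queue.append(friend)
            # Record the friendship in both directions.
            adjacency[user].add(friend)
            adjacency[friend].add(user)
        if delay_seconds:
            time.sleep(delay_seconds)  # throttle requests between page loads
    return adjacency


# Toy data (invented for illustration): "b" hides their friend list.
toy_friend_lists = {
    "a": ["b", "c"],
    "b": None,
    "c": ["a", "b", "d"],
    "d": ["c"],
}
network = crawl_network("a", toy_friend_lists.get)
for user in sorted(network):
    print(user, sorted(network[user]))
```

In the toy data, the edge between "b" and "c" is recovered from the friend list of "c" even though "b" exposes nothing.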
Figure 4.11: Summary of data sets from the network of University of Southampton on Facebook
We begin our analysis of friendship inflation by looking at the growth of the average number of friends on Facebook. A first look at the data reveals a steady growth in the average number of friends of Facebook users in the network of the University of Southampton.
The number increases from 63 in December 2007 to 67 in October 2008 and finally to 73 in April 2009, as shown in the left graph in Figure 4.12. Given Facebook’s popularity in the University, it is not surprising that this number keeps increasing. We then investigate the initial network: in the 2008 and 2009 data collections we look only at the 15,005 users who were already identified in our 2007 data set. The right graph in Figure 4.12 indicates that, for these users, the number increases from 63 in December 2007 to 66 in October 2008 and finally to 72 in April 2009. Thus, the growth in the average number of friends in the initial network is similar to that in the growing network as a whole. We conclude that this growth comes not only from the early adopters of Facebook but also from the users who signed up in the following years, presumably first-year university students.
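The comparison between the full samples and the initial cohort amounts to averaging friend counts over two different sets of users. The sketch below illustrates the computation on invented toy data; the function name `average_degree` and the dictionary layout are assumptions made for illustration. It also assumes that a cohort member's friend count in a later snapshot includes all of their friends at that time, which the text does not state explicitly.

```python
from statistics import mean


def average_degree(friend_counts, users=None):
    """Mean number of friends in one snapshot.

    `friend_counts` maps user id -> number of friends in that snapshot.
    If `users` is given, the mean is restricted to those users; cohort
    members missing from the snapshot are skipped.
    """
    if users is None:
        users = friend_counts.keys()
    return mean(friend_counts[u] for u in users if u in friend_counts)


# Invented toy snapshots, not the study data.
snapshot_2007 = {"a": 2, "b": 4}
snapshot_2008 = {"a": 3, "b": 5, "c": 10}

initial_cohort = set(snapshot_2007)
print(average_degree(snapshot_2008))                  # whole 2008 sample
print(average_degree(snapshot_2008, initial_cohort))  # initial users only
```

If the restricted average grows at a similar rate to the overall average, as reported for the Southampton data, the growth cannot be attributed to newcomers or to early adopters alone.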
Figure 4.12: Steady growth of average number of friends of Facebook users in the University of Southampton Network.
Next, we compare the degree distributions of the three data sets. Figure 4.13 plots the complete degree distributions on log-log axes. The black line represents the degree distribution of the 2007 sample, the red line that of the 2008 sample, and the green line that of the 2009 sample. All the sample networks exhibit a power-law pattern in their degree distributions. However, in the scaling region 50≤k≤500, the probability p_k is larger in both the 2008 and 2009 samples than in the 2007 sample, suggesting a monotonic increase in the number of friends for the vast majority of users, both active and less active. It also implies that the degree distribution is not scale-free; instead, it demonstrates multi-scaling behaviour. In the region 0≤k≤100, the exponents α of all the samples are fairly similar, but beyond k=100, α becomes larger for the 2008 and 2009 samples. There is also slight friendship inflation between the 2008 and 2009 samples.
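To make the notions of p_k and the exponent α concrete, the following sketch computes an empirical degree distribution and estimates α over a chosen range of k by a least-squares fit of log p_k against log k. This is only one simple estimator, chosen here for brevity; the text does not say how the exponents were obtained, and the function names are illustrative assumptions. Fitting separate ranges (for example below and above k=100) is one way to look for the multi-scaling behaviour described above.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: p_k}, the fraction of users with exactly k friends."""
    degrees = list(degrees)
    counts = Counter(degrees)
    return {k: c / len(degrees) for k, c in counts.items()}


def fit_exponent(p_k, k_min, k_max):
    """Estimate alpha in p_k ~ k^(-alpha) on k_min <= k <= k_max.

    Uses an ordinary least-squares line through (log k, log p_k);
    alpha is the negative of the fitted slope.
    """
    points = [
        (math.log(k), math.log(p))
        for k, p in p_k.items()
        if k_min <= k <= k_max and k > 0 and p > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two points in the fitting range")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    return -s_xy / s_xx


# Synthetic check: an exact power law with alpha = 2 (unnormalised).
synthetic = {k: k ** -2.0 for k in range(1, 200)}
print(fit_exponent(synthetic, 1, 100))
print(fit_exponent(synthetic, 100, 199))
```

On real data the two ranges would generally give different values, and the gap between them indicates how far the distribution departs from a single power law.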
A closer examination of the degree distributions of the three samples, shown in Figure 4.14, Figure 4.15 and Figure 4.16, reveals the lack of clear cutoffs discussed in the previous section. In particular, the degree distributions of the 2007 and 2008 samples flatten beyond k=500, and that of the 2009 sample exhibits a similar pattern when k≥700. The absence of definite cutoffs implies that Facebook users are able to befriend more people at low cost by leveraging the technique of static links. To see how online social networks can empower active users in the friend-making process, we select the people who have more than 150 friends. The number 150, known as Dunbar’s number, is the supposed cognitive limit on the number of individuals with whom any one person can maintain stable social relationships. As shown in Figure 4.17, 1,273 people, or 8.5% of the sample population, have more than 150 friends in the 2007 sample. This number increases to 1,869 in the second sample and 2,768 in the third. The proportion of active users in the sample population increases as well, climbing to 9.5% in the 2008 sample and 12.3% in the 2009 sample. These statistics indicate that the degree distributions of highly active users do not follow scale-free behaviour: active users become increasingly involved in the friend-making process.
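The proportions quoted above follow directly from the counts and the sample sizes. The short sketch below recomputes them and shows how the count of users above Dunbar's number could be obtained from a list of friend counts; the helper name `active_share` is an illustrative assumption.

```python
DUNBAR_NUMBER = 150


def active_share(degrees, threshold=DUNBAR_NUMBER):
    """Return (count, fraction) of users with more than `threshold` friends."""
    degrees = list(degrees)
    active = sum(1 for d in degrees if d > threshold)
    return active, active / len(degrees)


# Counts and sample sizes as reported in the text.
reported = {
    2007: (1273, 15005),
    2008: (1869, 19604),
    2009: (2768, 22553),
}
for year, (active, total) in reported.items():
    print(year, f"{100 * active / total:.1f}%")
```

The three values round to 8.5%, 9.5% and 12.3%, matching the figures in the text.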
Figure 4.13: Comparison of Degree Distribution of the Three Data Sets.
The final metric we investigate is assortativity. In our study, we focus on the connections between university members. Connections within the university represent a restricted set of the relationships of Facebook users. These relationships usually reflect real-world connections, as the users live on the same campus and in the same city. As indicated in Figure 4.11, the assortativity in all three samples is a relatively large positive value. However, we do observe a decline from 0.32 in the 2007 sample to 0.2 in the 2008 sample, which implies that the degree correlation has weakened.
The change is consistent with the prediction of the theory of friendship inflation. Readers may notice that the value went up to 0.34 in 2009, which is larger than in either 2007 or 2008. We argue that this is due to the influx of new members who bring real-world connections to Facebook, which masks the reduced assortativity in the existing social network.
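For readers who wish to reproduce this kind of measurement, the sketch below computes degree assortativity as the Pearson correlation between the degrees found at the two ends of each edge. This assumes that the assortativity reported here is the standard degree correlation coefficient, which the text suggests but does not state; the function name is illustrative, and the input is assumed to be a simple undirected graph with each friendship listed once.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of an edge.

    `edges` is an iterable of (u, v) pairs, one per undirected friendship.
    Returns a value in [-1, 1], or NaN if all end-point degrees are equal.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("the graph has no edges")
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    pairs = []
    for u, v in edges:
        pairs.append((degree[u], degree[v]))
        pairs.append((degree[v], degree[u]))
    mean_deg = sum(x for x, _ in pairs) / len(pairs)
    cov = sum((x - mean_deg) * (y - mean_deg) for x, y in pairs)
    var = sum((x - mean_deg) ** 2 for x, _ in pairs)
    if var == 0:
        return float("nan")
    return cov / var


# A star: the hub (degree 3) links only to leaves (degree 1).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(degree_assortativity(star))

# A triangle plus a separate 4-clique: every edge joins equal degrees.
triangle = [(1, 2), (2, 3), (1, 3)]
clique = [(4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
print(degree_assortativity(triangle + clique))
```

The star is maximally disassortative and the second graph maximally assortative, so the two examples bracket the range within which the reported values of 0.32, 0.2 and 0.34 fall.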