To address RQ2, we developed a semi-automatic friend-grouping visualization tool called FreeBu#1, which is short for Friend tree Bubbles (the "#1" distinguishes it from the series of later versions, see Appendix C).6 This section describes the method that we use to recommend an initial grouping of the user’s OSN friends, while enabling the user to explore and adapt this grouping according to his/her needs, and eventually publish the results to his/her OSN account. Because of the prevalence of Facebook among today’s OSNs, we have implemented this tool based on Facebook data. The method is, however, applicable to any other OSN that provides access to its social graph and profile data.
6 Note that because of Facebook API changes, as documented in https://developers.facebook.com/docs/apps/changelog, the three versions of FreeBu, namely FreeBu#1 (Chapter 3), FreeBu#2 (Chapter 4) and FreeBu#3 (Chapter 6), are no longer functional. To avoid applications like this being dependent on a particular OSN’s API, we redesigned FreeBu so that it now functions as a generic tool for graph data visualization and exploration, as detailed in Chapter 10.
3.5.1 Data
We base our PFA tool on the data retrieved via the Facebook Graph API7 with the user’s access token. We help the user group his/her Facebook friends by first recommending an initial grouping structure, then assigning appropriate labels to the groups, and finally letting him/her further adjust the grouping (see the following sections on Computational Model and User Interface). The grouping is constructed based on the user’s friend graph, in which each node is a friend of the user, and two friends are linked if they are also friends with each other. Note that such links are unweighted. We generate labels for the groups from the collected attribute-based data.
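The friend graph described above can be sketched as a plain adjacency map. This is a minimal illustration, not the tool's actual code: the function name `build_friend_graph`, the input format (a list of friend identifiers plus a list of mutual-friendship pairs) and the sample names are all assumptions made for the example.

```python
def build_friend_graph(friend_ids, mutual_pairs):
    """Return an undirected, unweighted adjacency map over the user's friends.

    friend_ids   -- identifiers of the user's friends (hypothetical input format)
    mutual_pairs -- (a, b) pairs meaning "a and b are friends with each other"
    """
    friends = set(friend_ids)
    graph = {f: set() for f in friends}
    for a, b in mutual_pairs:
        # Keep only links between two distinct friends of the user; the user
        # (the ego) is not a node of the friend graph.
        if a in friends and b in friends and a != b:
            # Sets make the links unweighted: a repeated pair adds nothing.
            graph[a].add(b)
            graph[b].add(a)
    return graph


friends = ["ann", "bob", "cat", "dan"]
pairs = [("ann", "bob"), ("bob", "cat"), ("bob", "ann"), ("dan", "eve")]
graph = build_friend_graph(friends, pairs)
```

In this example `dan` stays in the graph as an isolated node, because his only listed link points to someone who is not a friend of the user.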
Education-related data includes a list of schools, where each school has its name and type, e.g. high school or graduate school, with possibly more information such as the year and the concentrations if the user has filled this in. Work-related data includes a list of work-objects; each object contains the name and the location of the employer, the position of the user, and the starting and ending time of the job. Language-related data includes a list of names of the languages that the user speaks. For hobby-related data, we collect the “likes” of a user, which may contain anything from sports to TV shows, from a public figure to a book, etc.
It is important to let the user form sensible groups on his/her own, by providing an overview of the available OSN data on a meta-level: for example, which data attributes are available on the OSN, and how people are distributed over these attributes. Eventually, we let the user decide what data is most relevant and what grouping structure is closest to what he/she has in mind.
3.5.2 Choosing A Community Detection Method
Based on our user study results in Section 3.4.2 and the discussion in Section 3.4.3, we adopt a graph-based community detection algorithm, more specifically the Louvain method [26], to extract communities from the user’s friend graph. It is a heuristic method based on modularity optimization, and it was shown [26] to be efficient and to produce communities of good quality. A community is characterized by modularity, which measures the density of links inside communities as compared to links between communities [134]. An area with more mutually connected friends is therefore more likely to be identified as a community. The Louvain method outputs flat communities.
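To make the notion of modularity concrete, the following sketch computes the standard modularity score Q of a given partition of an unweighted, undirected graph. It only evaluates a partition; it is not the Louvain method itself, which searches for a partition with high Q. The function name `modularity`, the adjacency-map input format and the toy graph are assumptions made for this illustration.

```python
def modularity(graph, communities):
    """Modularity Q of a partition of an unweighted, undirected graph.

    graph       -- dict mapping each node to the set of its neighbours
    communities -- iterable of node collections that partition the graph
    """
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # total number of links
    if m == 0:
        return 0.0
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal link is seen from both endpoints, hence the division by 2.
        internal = sum(1 for u in members for v in graph[u] if v in members) / 2
        degree_sum = sum(len(graph[u]) for u in members)
        # Fraction of links inside the community minus the fraction expected
        # if links were placed at random with the same node degrees.
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Toy friend graph: two triangles joined by the single link c-d.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
q_split = modularity(toy, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(toy, [set(toy)])
```

Splitting the toy graph into its two triangles gives Q = 5/14 (about 0.357), whereas putting everyone in one community gives Q = 0, which matches the intuition that densely interconnected areas form communities.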
Note that in Chapter 5 we detail the comparison between the modularity-based community detection method and a method that takes both the graph and the node attributes into account. This comparison further showed that the modularity-based method is indeed suitable for detecting communities in ego-networks.
3.5.3 Label Derivation
To support the exploration of the visualization, and help the user identify the characteristics of different groups, it is critical to derive informative labels for communities. The label of a group should highlight the attributes of the people in it. We adopt the F-measure to determine the labels for the communities.
F-measure is a standard measure combining precision and recall (Equation 3.1).
As the labeling experiments in [123] indicate, F-measure comes out as one of the best label-selection measures for communities detected with the user’s Facebook data – the labels with high F-measure scores are generally considered suitable by the users.
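\[
\text{F-measure} = \frac{2 \cdot \mathrm{Precision}(C, A) \cdot \mathrm{Recall}(C, A)}{\mathrm{Precision}(C, A) + \mathrm{Recall}(C, A)} \tag{3.1}
\]
with
\[
\mathrm{Precision}(C, A) = \frac{|C \cap A|}{|A|} \tag{3.2}
\]
\[
\mathrm{Recall}(C, A) = \frac{|C \cap A|}{|C|} \tag{3.3}
\]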
C denotes the set of people within the same community c, and A denotes the set of people with the same attribute-value a, e.g. a certain name of a university. Precision(C, A) measures the proportion of people with the attribute-value a in the community c to the whole population with attribute-value a. Recall(C, A) measures the proportion of people with the attribute-value a in the community c to the whole population of the community c.
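The label scoring of Equations 3.1 to 3.3 can be sketched as follows. This is an illustrative sketch under assumptions, not the tool's implementation: the function names `f_measure` and `rank_labels`, the mapping from attribute-values to the sets of people holding them, and the sample data are all hypothetical.

```python
def f_measure(community, attribute_holders):
    """F-measure of attribute set A as a label for community C (Equation 3.1)."""
    c, a = set(community), set(attribute_holders)
    overlap = len(c & a)
    if overlap == 0:
        return 0.0  # no shared people; also avoids dividing by zero
    precision = overlap / len(a)  # Equation 3.2: |C ∩ A| / |A|
    recall = overlap / len(c)     # Equation 3.3: |C ∩ A| / |C|
    return 2 * precision * recall / (precision + recall)


def rank_labels(community, holders_by_value):
    """Return (score, attribute-value) pairs with score > 0, best first."""
    scored = [(f_measure(community, holders), value)
              for value, holders in holders_by_value.items()]
    scored = [(score, value) for score, value in scored if score > 0]
    # Sort by descending score; ties are broken alphabetically by value.
    return sorted(scored, key=lambda sv: (-sv[0], sv[1]))


community = {"ann", "bob", "cat", "dan"}
holders_by_value = {
    "University X": {"ann", "bob", "cat"},
    "Dutch": {"ann", "bob", "cat", "dan", "eve", "fay", "gus", "hal"},
    "Chess": {"eve"},
}
ranking = rank_labels(community, holders_by_value)
```

Here "University X" scores 6/7 (precision 1, recall 3/4) and outranks "Dutch", which covers the whole community but is shared by many people outside it (precision 1/2, recall 1, score 2/3). This is the behaviour the F-measure is meant to reward: labels that are both common inside the community and specific to it.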
For each community, a list of labels is generated based on all the data attributes described in Section 3.5.1, and then sorted according to every label’s F-measure score. The user can determine the number of labels appearing on
3.5.4 Visualization Interface
We adopt the star-tree form to represent the grouping structure. As shown in Figure 3.3, the nodes of the tree are represented by circles, and each pair of parent-child nodes is connected by a straight line. The root of the tree (the blue circle in the middle) is the user him/herself, the red circles represent the different communities detected by the algorithm, and the leaves (the green circles surrounding the red ones) represent the user’s friends on Facebook. We scale the sizes of the community circles based on the number of people within each community: a larger circle corresponds to more people.
The labels are shown on top of the community circles if a community contains more than one person. The user can click on a bubble (a community or a person) to zoom in and concentrate on a particular part of the tree. The labels are typically school names, school years and work places. The number in front of a label indicates the number of people in the corresponding circle. The user can adjust the number of labels shown by sliding the threshold bar. The user can turn the labels for the red and green bubbles on or off via the “rlabels” and “glabels” buttons. The switch for the tip (an info-box) appearing on top of a bubble is the “speech” button. The user can also press the “shuffle” button to rearrange the layout of the bubbles.
Initially, we provide the user with a one-layer grouping. The user can modify it by adding or removing (sub)groups via the “add” and “delete” buttons, so that he/she is able to construct the grouping hierarchically, as shown in Figure 3.4. The user can specify the name of a newly added group in the text box below the “add” button (Figure 3.3). The user can also change the members of the groups by dragging and dropping friend nodes from one red circle to another, as shown in Figure 3.5. Once the user has finished modifying the groups, he/she can publish the grouping to his/her Facebook account as “Facebook friend lists” by pressing the “publish” button. Note, however, that Facebook friend lists are flat groupings. Thus only a one-layer grouping is composed and published.
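One possible way to compose a one-layer grouping from a hierarchical one is sketched below. The text does not specify how the flat grouping is derived, so the rule used here (each top-level group becomes one list containing its own members plus the members of all its nested subgroups) is an assumption, as are the dictionary layout and the function names `collect_members` and `to_flat_lists`.

```python
def collect_members(group):
    """Gather the members of a group and, recursively, of all its subgroups."""
    members = list(group.get("members", []))
    for sub in group.get("subgroups", []):
        members.extend(collect_members(sub))
    return members


def to_flat_lists(top_level_groups):
    """Map each top-level group name to one flat, duplicate-free member list.

    Assumed rule (not stated in the text): nested subgroups are merged into
    their top-level ancestor.
    """
    flat = {}
    for group in top_level_groups:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        flat[group["name"]] = list(dict.fromkeys(collect_members(group)))
    return flat


groups = [
    {"name": "School", "members": ["ann"], "subgroups": [
        {"name": "Class of 2010", "members": ["bob", "cat"], "subgroups": []},
    ]},
    {"name": "Work", "members": ["dan"]},
]
flat_lists = to_flat_lists(groups)
```

Under this rule the subgroup "Class of 2010" disappears as a separate list and its members are published as part of "School". An alternative design would publish every (sub)group as its own flat list.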