Because actors are treated as interdependent and observations are made on pairs of individuals, the smallest unit of analysis in network analysis is the dyad: two actors and the ties connecting them. Larger units constructed from dyads include triads, subgroups, groups, and the whole network. In every case, the unit of analysis consists of a set of actors and the ties among them (Wasserman & Faust, 1994).
Network data can be represented as graphs or matrices. A graph is an intuitive way to represent connections, and many network concepts, such as distance and path, are built from graph-theoretic concepts. However, because data in graph form are not well suited to analytical procedures, matrices are used as input for most software tools.⁴
Mathematically, any graph can be transformed into a matrix.
An important concept in the social network context is ‘mode.’ The number of modes in a matrix is the number of distinct sets of entities represented in it. For instance, in a two-dimensional matrix (i.e., a two-way matrix), if the rows and the columns denote different kinds of entities, the matrix is called a two-mode matrix, whereas if the rows and columns refer to the same kind of entity, it is a one-mode matrix.
A traditional social science dataset is usually represented as a case (person) by variable (attribute) matrix, which is a two-way, two-mode matrix. In contrast, a typical data matrix in social network analysis is a square case by case (actor by actor) matrix, called an ‘adjacency’ matrix, which records a social relation (or other dyadic attribute) among a set of actors. Since the rows and the columns are composed of the same set of actors, an adjacency matrix is a two-way, one-mode matrix.
Each cell of an adjacency matrix contains a value denoting the presence or absence of the relation between the corresponding row actor and column actor. Characteristics of the relation, such as its direction and strength, are also represented. By convention, if the relation is directed, senders are recorded in the rows and receivers in the columns; in other words, the dataset is recorded such that a row actor does something to a column actor. The strength of the relation is recorded by the value of the cell: 1 or 0 represents the mere presence or absence of the relation, and a number greater than 1 can appear when the strength of the relation is available.
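The row-sender, column-receiver convention can be illustrated with a minimal sketch. The actor names, the tie list, and the helper `adjacency_matrix` are hypothetical and chosen only for illustration; they do not come from any particular software package.

```python
def adjacency_matrix(actors, ties):
    """Build a square actor-by-actor matrix from (sender, receiver, strength) triples."""
    index = {actor: i for i, actor in enumerate(actors)}
    n = len(actors)
    matrix = [[0] * n for _ in range(n)]
    for sender, receiver, strength in ties:
        # Rows hold senders, columns hold receivers.
        matrix[index[sender]][index[receiver]] = strength
    return matrix


# Hypothetical data: Ann and Bob name each other; Ann names Cat with strength 3.
actors = ["Ann", "Bob", "Cat"]
ties = [("Ann", "Bob", 1), ("Bob", "Ann", 1), ("Ann", "Cat", 3)]
A = adjacency_matrix(actors, ties)
# A == [[0, 1, 3],
#       [1, 0, 0],
#       [0, 0, 0]]
```

Because the Ann-to-Cat tie is not reciprocated, the cell in Ann's row and Cat's column is 3 while the cell in Cat's row and Ann's column stays 0, so the matrix of a directed relation need not be symmetric.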
Even though a one-mode matrix is considered the canonical dataset in social network analysis, certainly not all network data are represented as one-mode matrices. A two-mode matrix called an ‘incidence matrix’ is commonly used to represent a special kind of network, an affiliation network. Affiliation networks will be discussed in detail later in this paper.
⁴ … visualizing networks. An up-to-date list of software tools is available at the INSNA webpage. See http://www.insna.org/INSNA/soft_inf.html. Scott (2000) provides a brief review of network analysis packages in the appendix. Recently, an extensive …
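A two-mode incidence matrix for an affiliation network can be sketched in the same way. Here the rows are actors and the columns are the events or groups they belong to; the actor names, event names, and the helper `incidence_matrix` are assumptions made for this example only.

```python
def incidence_matrix(actors, events, memberships):
    """Build an actor-by-event (two-mode) matrix from (actor, event) pairs."""
    row = {actor: i for i, actor in enumerate(actors)}
    col = {event: j for j, event in enumerate(events)}
    matrix = [[0] * len(events) for _ in actors]
    for actor, event in memberships:
        matrix[row[actor]][col[event]] = 1  # 1 = actor is affiliated with event
    return matrix


# Hypothetical affiliation data.
actors = ["Ann", "Bob", "Cat"]
events = ["Board", "Club"]
memberships = [("Ann", "Board"), ("Bob", "Board"), ("Bob", "Club"), ("Cat", "Club")]
B = incidence_matrix(actors, events, memberships)
# B == [[1, 0],
#       [1, 1],
#       [0, 1]]
```

Unlike an adjacency matrix, this matrix is generally not square, because the rows and the columns refer to different sets of entities.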
2.3.3.2 Measures
Brass (1995) classifies network measures into three categories according to the corresponding unit of analysis and provides a brief definition of each measure.
1) Typical social network measures applied to ties: indirect links, frequency, stability, multiplexity, strength, direction, symmetry (reciprocity)
2) Typical social network measures applied to individual actors: degree, in-degree, out-degree, range (diversity), closeness, betweenness, centrality, prestige. In addition to these measures, concepts for describing roles of actors are also included in this category: star, liaison, bridge, gatekeeper, isolate.
3) Typical social network measures applied to describe entire networks: inclusiveness, component, connectivity (reachability), connectedness, density, centralization, symmetry, transitivity.
Among these measures, we will take a look at the most broadly used measures for each category, and how these measures are used in some network studies.
2.3.3.2.1 Strength – a measure applied to ties
The strength of a tie is a general notion that describes the nature of the relationship. It can be operationalized in a number of ways depending on the particular context (Marsden & Campbell, 1984). In general, the strength is thought of as a “combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie” (Granovetter, 1973, p. 1361).
2.3.3.2.2 Centrality – a measure applied to actors
Centrality is one of the most important concepts in network analysis. Most empirical studies use some kind of centrality analysis to identify the most important or visible actors within the network (Everett & Borgatti, 2005). Conceptually, centrality measures are used to find out who is central or important in a given network or subgroup. A wide variety of specific measures have been proposed. They can be categorized broadly into four groups: degree, closeness, betweenness, and power (Wasserman & Faust, 1994; Faust, 1997). Freeman (1979) proposed a categorization consisting of the first three groups and provided exemplary measures for each. The eigenvector-based measure proposed by Bonacich (1972) stands apart from these three and constitutes the fourth category.
Degree centrality is perhaps the most intuitive notion of centrality: the actor with the most ties is considered most important. However, the raw number of ties, the degree of an actor, can be misleading. Depending on the maximum possible degree (determined by the number of actors in the network) or the overall degree distribution in the network, the same degree can tell quite a different story about the importance of an actor. To address this problem, some kind of normalization of the degree measure is often suggested. A more important criticism is that degree centrality counts only direct ties and does not take into account indirect ties or the paths they create among actors. The other categories of centrality measures were proposed to deal with this problem, each with a different conceptualization of what constitutes the ‘importance’ of an actor.
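The effect of normalization can be seen in a short sketch. It assumes an undirected, binary adjacency matrix and divides each degree by n - 1, the maximum possible degree in a network of n actors; this is one common normalization, not the only one, and the function name `degree_centrality` is illustrative.

```python
def degree_centrality(adjacency, normalized=True):
    """Degree of each actor in an undirected adjacency matrix (self-ties ignored)."""
    n = len(adjacency)
    degrees = [
        sum(1 for j, value in enumerate(row) if value and j != i)
        for i, row in enumerate(adjacency)
    ]
    if not normalized or n < 2:
        return degrees
    # Divide by the maximum possible degree so scores are comparable across networks.
    return [d / (n - 1) for d in degrees]


# Hypothetical star network: actor 0 is tied to actors 1, 2, and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
raw = degree_centrality(star, normalized=False)   # [3, 1, 1, 1]
scaled = degree_centrality(star)                  # actor 0 scores 1.0
```

A raw degree of 3 is the maximum in this four-actor network (normalized score 1.0), whereas the same degree in a ten-actor network would normalize to only 3/9.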
With closeness centrality, the importance of an actor is determined by its distance to all other actors. The idea is that an actor who is relatively close (in other words, at a short distance) to all other actors through direct or indirect ties can interact with them efficiently and thus can be more influential, regardless of how many direct ties the actor has.
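One way to make this idea concrete is sketched below. It assumes an undirected, unweighted network stored as a dictionary of neighbor sets, measures distance by breadth-first search, and scores each actor as (n - 1) divided by the sum of its distances to all others. That formula is one standard choice and is an assumption here, as are the function names; actors that cannot reach everyone are simply given 0 in this sketch.

```python
from collections import deque


def bfs_distances(neighbors, source):
    """Shortest path lengths (in ties) from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def closeness_centrality(neighbors):
    """(n - 1) / total distance to all other actors; 0.0 if some actor is unreachable."""
    n = len(neighbors)
    scores = {}
    for actor in neighbors:
        dist = bfs_distances(neighbors, actor)
        total = sum(dist.values())
        scores[actor] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return scores


# Hypothetical chain: A - B - C.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
closeness = closeness_centrality(chain)
# B is one step from everyone (score 1.0); A and C score 2/3.
```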
Betweenness centrality introduces the concept of ‘control.’ For example, if an actor lies on a path between actor A and actor B in a communication network, information A sends may or may not get to B, depending on whether the actor between them passes on the information or not. In that sense, the actor between the other actors can control the flow of information. The betweenness centrality of actor i counts the shortest paths (called geodesics) between all pairs of other actors j and k that actor i lies on.
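The counting of geodesics can be sketched as follows, assuming an undirected, unweighted network stored as a dictionary of neighbor sets. When a pair j, k has several geodesics, this sketch credits actor i with the fraction of them that pass through i, which is the usual refinement of the simple count; the function names are illustrative, and the brute-force loop is meant for small networks only.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(neighbors, source):
    """Distance to, and number of geodesics to, every actor reachable from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                # Every geodesic to node extends to a geodesic to nxt.
                count[nxt] += count[node]
    return dist, count


def betweenness_centrality(neighbors):
    """Sum over pairs (j, k) of the share of j-k geodesics that pass through i."""
    info = {s: shortest_path_counts(neighbors, s) for s in neighbors}
    scores = {actor: 0.0 for actor in neighbors}
    for j, k in combinations(neighbors, 2):
        dist_j, count_j = info[j]
        if k not in dist_j:
            continue  # j and k are not connected by any path
        dist_k, count_k = info[k]
        for i in neighbors:
            if i in (j, k) or i not in dist_j or i not in dist_k:
                continue
            # i lies on a j-k geodesic exactly when the two legs add up to d(j, k).
            if dist_j[i] + dist_k[i] == dist_j[k]:
                scores[i] += count_j[i] * count_k[i] / count_j[k]
    return scores


# Hypothetical chain A - B - C: only B sits between the other two.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
between = betweenness_centrality(chain)
# between == {"A": 0.0, "B": 1.0, "C": 0.0}
```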
Power centrality takes account not only of ties or paths in which an actor is involved but also of other actors connected to the actor. An actor is considered to be important if he/she has ties to other central actors.
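The idea that an actor is important when tied to other important actors can be illustrated with the eigenvector approach: each actor's score is repeatedly replaced by the sum of its neighbors' scores until the scores stabilize. The sketch below assumes an undirected, connected network given as an adjacency matrix. Adding each actor's own score at every step is a convergence device assumed here (it leaves the final ranking unchanged), and scaling so the top score is 1 is an arbitrary choice.

```python
def eigenvector_centrality(adjacency, iterations=1000, tol=1e-10):
    """Power iteration on A + I; returns scores scaled so the largest is 1."""
    n = len(adjacency)
    scores = [1.0] * n
    for _ in range(iterations):
        # New score = own score + sum of scores of the actors one is tied to.
        updated = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        top = max(updated)
        updated = [value / top for value in updated]
        if max(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical chain of three actors: 0 - 1 - 2.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
power = eigenvector_centrality(chain)
# The middle actor gets the top score; the two ends tie with each other.
```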
2.3.3.2.3 Connectivity – a measure applied to networks
Connectivity is a graph-theoretic concept. The connectivity of a graph is defined by the reachability between pairs of nodes. Two nodes in a graph are said to be reachable if there is a ‘path’ connecting them. Network connectivity measures the extent to which actors in the network are connected to one another.
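Reachability can be checked with a simple search, as in the following sketch for an undirected network stored as a dictionary of neighbor sets. The summary figure computed at the end, the share of ordered actor pairs that are mutually reachable, is only one illustrative way to express "the extent to which actors are connected" and is an assumption of this example, as are the function names.

```python
from collections import deque


def reachable_set(neighbors, source):
    """All actors that can be reached from source along some path (source included)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_reachable(neighbors, a, b):
    """True if a path connects a and b."""
    return b in reachable_set(neighbors, a)


def reachable_pair_fraction(neighbors):
    """Share of ordered pairs (a, b), a != b, where b is reachable from a."""
    n = len(neighbors)
    if n < 2:
        return 0.0
    pairs = sum(len(reachable_set(neighbors, a)) - 1 for a in neighbors)
    return pairs / (n * (n - 1))


# Hypothetical network: two separate pairs and one isolate.
net = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}, "E": set()}
fraction = reachable_pair_fraction(net)
# 4 of the 20 ordered pairs are reachable, so fraction is 0.2.
```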
Connectivity, together with density, is used to measure the cohesion or cohesiveness of the network and thus to detect subgroups within a network. The cohesive subgroup is one of the most important themes in network research because it provides a definition of the fundamental sociological concept of group in network analytic terms. In social network analysis, groups emerge through a pattern of connections. Dense connections within a relatively bounded set of actors define cohesive groups (variously called cliques, components, circles, etc.) within a network. Technically, cohesive subgroup analysis in effect partitions the network into clusters. An array of techniques has been developed to detect patterns of connections and identify groups, including N-Clique, N-Clan, K-Plex, etc. (Scott, 2000; Wasserman & Faust, 1994).
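As a small illustration of subgroup detection, the sketch below enumerates cliques in the strictest sense: maximal sets of actors in which every pair is directly tied. It uses the basic Bron-Kerbosch recursion, which is a technique chosen for this example rather than one named in the sources cited above, and it does not implement the relaxed N-Clique, N-Clan, or K-Plex definitions. The network is assumed to be undirected and stored as a dictionary of neighbor sets; the `min_size` default of 3 follows a common convention and is adjustable.

```python
def maximal_cliques(neighbors, min_size=3):
    """Enumerate maximal cliques with at least min_size members (Bron-Kerbosch)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can be added, and the set is not contained in a larger clique.
            if len(current) >= min_size:
                cliques.append(sorted(current))
            return
        for node in list(candidates):
            expand(
                current | {node},
                candidates & neighbors[node],
                excluded & neighbors[node],
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(neighbors), set())
    return sorted(cliques)


# Hypothetical network: a triangle A-B-C with D attached to C only.
net = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
groups = maximal_cliques(net)
# groups == [["A", "B", "C"]]
```

Lowering `min_size` to 2 would also report the C-D pair, which shows how sensitive the resulting subgroups are to the chosen definition.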