The code developed for producing the results of this thesis has been written in Python 2.7 [Ros95]. The library used for constructing and analyzing multilayer networks is Pymnet [Kiv17], and the software for conducting graph isomorphism tests is Bliss [JK07] (and its Python wrapper PyBliss).
Literature Review on Neighborhood Attack
In this chapter, we present the main literature on neighborhood anonymization and privacy in social networks (Section 3.1). We also survey the methods that have been developed to prevent the neighborhood attack (Section 3.3). Moreover, we illustrate the existing methods for anonymizing multilayer networks and edge-labelled graphs (Section 3.4). The focus of this thesis is the neighborhood attack on multiplex networks (a particular type of multilayer network). Although no methods currently exist to address this problem, an overview of both neighborhood attacks on classical single-layer social networks and other types of attacks on multilayer networks (or on edge-labelled graphs, which are in certain respects similar to multiplex networks) helps to clarify the current state of knowledge on this problem.
3.1 Neighborhoods and privacy
An increasing amount of data can be modelled as networks, for instance social media connections or phone calls. Since networks can be interesting objects for their structure alone, without attributes associated with nodes, one might think that a naïve anonymization approach is enough for sharing them. However, some structural features can make a node unique and thus re-identifiable in a network, such as the number of its connections (degree) or the structure of its neighborhood. For this reason, privacy definitions have been adapted to networks, and the data are modified to reach, for example, k-degree anonymity [LT08] or k-neighborhood anonymity [ZP08]. k-neighborhood anonymity aims to prevent the neighborhood attack, that is, the re-identification of a node based on its neighborhood.
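As a minimal sketch of why degree alone can expose a node, the following Python 3 snippet counts how many nodes share each degree value. The function name `degree_anonymity`, the adjacency-dictionary representation, and the toy graph are illustrative assumptions; this is not the anonymization procedure of [LT08], only a check of the property that k-degree anonymity requires.

```python
from collections import Counter


def degree_anonymity(adj):
    """Return (k, unique_nodes) for an undirected graph.

    adj maps each node to the set of its neighbors (hypothetical format).
    k is the size of the smallest group of nodes sharing a degree, so the
    graph is k-degree anonymous for that k. unique_nodes lists the nodes
    whose degree is shared with no other node.
    """
    degrees = {v: len(nbrs) for v, nbrs in adj.items()}
    counts = Counter(degrees.values())
    k = min(counts.values())
    unique_nodes = sorted(v for v, d in degrees.items() if counts[d] == 1)
    return k, unique_nodes


# Toy graph, invented for illustration only.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
k, unique_nodes = degree_anonymity(adj)
print(k, unique_nodes)
```

In this toy graph, nodes "a" and "d" each have a degree that no other node has, so an attacker who knows only the degree of the target can single them out, whereas "b" and "c" are indistinguishable from each other.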
Another approach to anonymizing networks that is based on modifying neighborhoods is neighborhood randomization [FW15]. This approach consists in changing the endpoint of an edge within the local neighborhood of a node. Neighborhood randomization provides link privacy and is a random perturbation method, more specifically a link perturbation method. Alternative link perturbation methods, which do not aim to protect neighborhoods, randomly add a certain number of edges to the network while deleting the same number [YW09], or perform random edge switching.
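The simplest of the link perturbation schemes mentioned above, deleting a number of existing edges at random and adding the same number of random new edges, can be sketched as follows in Python 3. The function name `perturb_edges` and its parameters are hypothetical, and the sketch is neither the exact procedure of [YW09] nor the neighborhood randomization of [FW15], which restricts the rewiring to the local neighborhood of a node.

```python
import random
from itertools import combinations


def perturb_edges(nodes, edges, m, seed=None):
    """Delete m random edges and add m random non-edges (undirected graph).

    nodes: list of comparable node labels; edges: iterable of node pairs.
    Returns the perturbed edge set as a set of frozensets.
    """
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    # Candidate new edges: node pairs that are not connected originally.
    non_edges = [frozenset(p) for p in combinations(nodes, 2)
                 if frozenset(p) not in edge_set]
    if m > len(edge_set) or m > len(non_edges):
        raise ValueError("m is too large for this graph")
    # Sort before sampling so that a fixed seed gives a reproducible result.
    removed = rng.sample(sorted(edge_set, key=sorted), m)
    added = rng.sample(non_edges, m)
    return (edge_set - set(removed)) | set(added)


# Toy path graph on six nodes, invented for illustration only.
nodes = list(range(6))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
perturbed = perturb_edges(nodes, edges, m=2, seed=0)
```

Because the added edges are drawn from the pairs that were not connected in the original graph, the number of edges is preserved and exactly m of the original edges are missing from the output.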
An alternative to the classical neighborhood attack is the neighborhood-pair attack [NA13], which consists in the re-identification of a node based on the structure of the neighborhoods of two connected vertices. Since the attacker's knowledge is broader than in the classical neighborhood attack scenario (where only the structure of the target node's neighborhood is known), this attack carries a higher re-identification risk than the classical neighborhood attack.
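To make the classical scenario concrete, the sketch below (Python 3) computes, for each node, a simple signature of its 1-hop neighborhood subgraph: the number of nodes, the number of edges, and the sorted degree sequence inside the subgraph. This is an illustrative assumption rather than a full graph isomorphism test: two neighborhoods with different signatures cannot be isomorphic, but equal signatures do not guarantee isomorphism, so the function only returns nodes that are certainly unique. The function names and the toy graph are hypothetical.

```python
from collections import Counter


def neighborhood_signature(adj, v):
    """Isomorphism-invariant summary of the 1-hop neighborhood subgraph of v."""
    members = adj[v] | {v}
    # Degree of every member counted inside the induced subgraph only.
    inner_degrees = sorted(len(adj[u] & members) for u in members)
    n_edges = sum(inner_degrees) // 2
    return (len(members), n_edges, tuple(inner_degrees))


def surely_unique_nodes(adj):
    """Nodes whose neighborhood signature is shared with no other node."""
    sigs = {v: neighborhood_signature(adj, v) for v in adj}
    counts = Counter(sigs.values())
    return sorted(v for v, s in sigs.items() if counts[s] == 1)


# Toy undirected graph, invented for illustration only.
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0, 4},
    4: {3},
}
print(surely_unique_nodes(adj))
```

In this toy graph, nodes 1, 2 and 3 all have degree 2, yet node 3 is singled out by its neighborhood: its two neighbors are not connected to each other, whereas nodes 1 and 2 each sit in a triangle.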
[Hay+07] and [Hay+08] study and formalize the risks of some structural attacks on social networks, in particular those in which the attacker's knowledge consists of the degree and the neighborhood subgraph of a node at various levels (or hops). For example, the attacker may know only the degree of a node or, additionally, the degrees of its neighbors, and so on. Likewise, the neighborhood graphs can be 1-hop or of higher order. Specifically, the first paper [Hay+07] studies the re-identification risk for both degree attacks and subgraph (or neighborhood) attacks at various levels on some single-layer real-world datasets. The second paper [Hay+08] studies degree attacks at various levels in real-world data, synthetic data (such as power law, tree, or grid topology graphs), and Erdős-Rényi (ER) random graphs. For the latter, the authors distinguish three cases, corresponding to different regimes and edge probabilities: sparse, dense, and super-dense. They conclude that in the dense and super-dense regimes a node is easily identifiable. In the dense regime, a node is identifiable when the attacker's knowledge includes at least the degrees of the neighbors, besides the degree of the node itself. For networks in the sparse regime, as most real-world networks are, the re-identification probability depends on the network size. We also study how uniqueness changes in ER graphs and other graph models in Chapter 5. However, unlike those previous works, we also take into account the full 1-hop neighborhood subgraph of a node.
[NS09] presents a de-anonymization method. This work also illustrates that re-identification is easier if the attacker has, besides the knowledge of certain subgraphs, even partial additional information from another social graph, and shows how different percentages of node and edge overlap affect the attack's success probability. In Chapter 5 we also study the effect of edge overlap in multilayer networks, but more rigorously, and focusing only on neighborhoods. We formally define the concept of multiplex neighborhoods (in Chapter 4) and systematically analyze how different values of edge overlap, average degree, and network size affect the fraction of nodes that are easily re-identifiable in different network models.
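The degree-based levels of attacker knowledge studied in [Hay+08] can be illustrated with a small experiment on an ER random graph. The sketch below (Python 3, standard library only) measures the fraction of nodes that are unique when the attacker knows only the degree of the target, and when the attacker also knows the degrees of its neighbors. All function names, the graph size, the edge probability, and the seed are assumptions chosen for illustration; the sketch reproduces neither the experimental setup of [Hay+08] nor that of Chapter 5.

```python
import random
from collections import Counter
from itertools import combinations


def er_graph(n, p, seed=None):
    """Sample an undirected ER graph G(n, p) as a dict node -> neighbor set."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_only(adj, v):
    """Attacker knows only the degree of the target."""
    return len(adj[v])


def degree_and_neighbor_degrees(adj, v):
    """Attacker also knows the multiset of the neighbors' degrees."""
    return (len(adj[v]), tuple(sorted(len(adj[u]) for u in adj[v])))


def unique_fraction(adj, knowledge):
    """Fraction of nodes whose knowledge signature matches no other node."""
    sigs = {v: knowledge(adj, v) for v in adj}
    counts = Counter(sigs.values())
    return sum(1 for s in sigs.values() if counts[s] == 1) / len(adj)


adj = er_graph(n=200, p=0.05, seed=1)
frac_degree = unique_fraction(adj, degree_only)
frac_neighbors = unique_fraction(adj, degree_and_neighbor_degrees)
print(frac_degree, frac_neighbors)
```

Since the second signature contains the first, every node that is unique by degree is also unique under the richer knowledge, so the second fraction can never be smaller than the first. Varying `n` and `p` gives a rough feel for how the regimes differ.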
In general, networks are harder to anonymize than other types of data (such as vectors of values) because, besides the possible attributes of nodes, there are also links, which can reveal information about relations between nodes. Anonymizing multilayer networks can be even more difficult, since there are multiple types of links. Few methods address the anonymization of networks with multiple types of links, and some of them are discussed in Section 3.4.