• No se han encontrado resultados

Años 60 – 90 del siglo XX

1.5. El nacimiento de la escuela soviética de interpretación de lenguas

gat2vec requires various parameters to create Gs and Ga, and learn an

embedding. The sensitivity of the choice of parameters on structural net- works have been investigated in previous works, such as [26, 28]. We in- vestigate the parameter sensitivity of γa, λa, and th on attributed graph.

Furthermore, we analyze the joint parameter sensitivity of number of walks (γs, γa), and walk length (λs, λa) on Gs and Ga. In these experiments, we

evaluate on Micro-F1 under the same evaluation process given in Sec- tion 5.3.4. All other parameters are set to default values unless mentioned.

Impact of γa & λa

Figures 5.6(a,b) shows the effects of varying the number of walks per vertex and the walk length on bipartite attribute graphs. We empirically observe that these parameters have direct proportionality relationship with the sparsity. In a very dense graph, a small number of short walks are adequate to explore the neighborhood to capture the contexts. dblpis highly dense, while CiteSeer is very sparse, as shown in Figure 5.8. Therefore, for

5.3. EXPERIMENTS 58 DBLP Blogcatalog CiteSeer 0.2 0.4 0.6 0.2 0.4 0.6 65 0.2 0.4 0.6 67 69 71 50 51 52 78.4 78.8 79.2 79.6 TR Micr o-F1 #Walks 5 10 15 20 (a) DBLP Blogcatalog CiteSeer 0.2 0.4 0.6 0.2 0.4 0.6 0.2 0.4 0.6 67 69 71 73 49 50 51 52 78.4 78.8 79.2 79.6 TR Micr o-F1 Walk Length 40 80 120 160 200 240 (b)

Figure 5.6: Ga Parameter Sensitivity on: (a) Number of Walks(γa), (b) Walk Length(λa)

values of γa and λa. Increasing the number of walks and the walk length

has detrimental effect on the learning representation as it explores the large parts of the graph, which in turn generates contexts that are noisy i.e., include less significant vertices. In case of CiteSeer, we see an opposite trend: precise representations are learned for higher values of γa and λa.

This is motivated by the sparsity of the dataset, which is composed of 10 different disciplines, thus requiring a larger number of longer walks to explore the graph adequately and generate the appropriate contexts.

Impact of γs, γa, & λs, λa

Here we analyze the joint parameter sensitivity on the dblp dataset. In Figure 5.7(a) we increase the number of walks (γs, γa) jointly on both Gs

59 CHAPTER 5. JOINT LEARNING ON ATTRIBUTED GRAPHS dblp 0.1 0.2 0.3 0.4 0.5 78.50 78.75 79.00 79.25 79.50 TR Micro−F1 #Walks 5 10 15 20 25 (a) dblp 0.1 0.2 0.3 0.4 0.5 78.4 78.8 79.2 79.6 TR Micro−F1 Walk Length 40 80 120 160 200 240 (b)

Figure 5.7: Ga Joint Parameter Sensitivity on: (a) Number of Walks(γa), (b) Walk

Length(λa)

andGa, while keeping walk length to 10. In second case 5.7(b), walk lengths

are jointly increased, while the number of walks is kept constant to 10. As discussed in previous case, increasing the number of walks or walk length gathers more contextual information, thus, learning more precise represen- tations. But, an excessive number of walks and large walk lengths are detri- mental as it generates noisy contexts which lead to poor representation, as shown in case γs = γa = 25 (Figure 5.7a), and λs = λa = 240 (Figure

5.7b).

5.3. EXPERIMENTS 60

Impact of th

We have shown the usefulness of attributes (Table 5.2–5.4) for learning high quality embeddings. We further stress upon the importance of attributes by varying the parameter th. It is pertinent to mention that increasing the value of th implies a decrease in the number of attributes. From Figure 5.9, it is evident that as we decrease the number of attributes, the accuracy of embeddings decreases. At th = 1, the performance is lower due to the noise introduced by including all possible words as attributes.

● ● ● ● ● ● ● 78.4 78.8 79.2 79.6 0 5 10 15 20 25 th Micro−F1 DBLP

Chapter 6

A Simple Approach to Learn

Representation on Attributed

Graphs

Nodes in an attributed graph tend to have homophilic relationships, i.e. tend to connect to nodes that are similar to themselves in terms of at- tributes. By exploiting attribute information and learning from both the network structure and the attributes at the same time, better representa- tions can be obtained. But this comes at a cost, due to the increase in the computational complexity needed to model the various relationships, structural and attribute, between nodes. Moreover, for some graphs the number of attributes may be very large (in order of millions), that increases the overall model complexity.

Similar to network structure, attributes are also highly non-linear and sparse [27, 33]. Various approaches have been proposed which handle the network structure and attribute nonlinearity, and sparsity [49, 50].

dane[49] handles them by employing a deep neural network model on both

the network structure and attributes to handle non-linearity and sparsity. This method preserves semantic proximities between nodes by sampling nodes that have high similarity in attributes and optimizing their represen-

62 tation together with the first- and higher-order proximities in the network structure. Unfortunately, the cost of this sampling procedure is quadratic in the number of nodes, thus not scalable to large graphs. Another ap- proach called Attributed Network Representation Learning (anrl) [50] uses a neighbor enhancement autoencoder which optimizes the learning on attributes and the neighborhood context. The neighborhood context is generated through random walks whose complexity is too high for large graphs, both in terms of time and space.

Therefore, handling attributes in conjunction with structure is more challenging as it involves capturing non-linearity and sparsity, while at the same time preserving the structural properties of the graph.

To address these problems, this chapter proposes Sage2Vec (Simple Attributed Graph Embedding), a model that preserves the second-order proximities in the structure, and also handles the network structure and attribute non-linearity, and sparsity. The motivation behind this work is to build a simple model in terms of number of parameters and to show that a very simple approach can achieve better results than complex state-of-the- art baselines in a large percentage of scenarios. We employ an autoencoder model with an enhanced decoder that takes only structural information as input but optimizes on both structure and attribute information. By using attribute information for optimization reduces the overall model complex- ity and thus, the model can be scaled to larger graphs.

The rest of the chapter is organized as follows: Section 6.1 provides the formal definition of the problem. In Section 6.2, we describe our pro- posed model and the experimental setup, which is later evaluated on several datasets in Section 6.3.

63 CHAPTER 6. A SIMPLE APPROACH TO LEARN REPRESENTATION ON ATTRIBUTED GRAPHS

6.1

Problem Definition

Problem Given an attributed graphG, we aim at learning a low-dimensional network representation Φ : V → Rd, where d |V| is the dimension of

the learned representation, such that the mapping function Φ preserves the structural proximity and attribute information.

The quality of this mapping will be evaluated against several tasks such as node classification on various real-world datasets, as described in Sec- tion 6.3.

Documento similar