• No se han encontrado resultados

UNIÓN EUROPEA

INICIATIVAS COMUNITARIAS

3.2 El proceso de negociación del marco financiero plurianual 2007-

Personalized tag recommendation, as a special case of collaborative filtering, is a recent topic in recommender systems. The two main directions for these systems are content-based approaches and graph-based approaches.

Content-based methods, which usually encode users’ preferences from textual information (e.g., web pages, academic papers, tags), can predict tags for new users and new items. One state-of-the-art content-based tag recommendation system [84] utilized several tag sources including item content and user history to build both profiles for users and tags.

Graph-based approaches, which usually have stronger assumptions than content- based ones (e.g., requiring every user, every item and every tag to occur in at least p posts), can provide better performance. Earlier work like FolkRank, introduced by Hotho et al. [62], is an adaptation of PageRank that can generate high quality recommendations which are shown empirically to be better than other previously proposed collaborative filtering models [64]. Guan et al. [51] proposed a framework based on a graph Laplacian to model interrelated multi-type objects involved in the tagging system. Recently, factorization models (also considered as graph-based

approaches) show very successful evaluation results on personalized tag recommen- dation problems [112, 114, 130].

Non-personalized tag recommenders—i.e., for a given item they recommend to all users the same tags—have also attracted a lot of attention (e.g., [58, 127]). Garg et al. [42] propose a personalized interactive tag suggestion system which suggests tags based on the ones that a user entered most recently. They employ a naive Bayes classifier which is only based on tag co-occurrences.

An important factor not considered by any of the above methods is the temporal dynamics of users’ short-term interests. Recent research also shows that users are much more likely to use their recently used tags. Zhang et al. [165] investigate the recurrence dynamics of social tagging. Time information is also important to recommend high-quality tags to users. In recommender systems and collaborative filtering, temporal information has already shown its success. Ding et al. [35] simply add a time weight on collaborative filtering through a decay function. Koren [73] demonstrates that users’ interests evolve and presents a model which can track the temporal behavior to better recommend items for users. Xiang et al. [146] model long-term and short-term preferences by creating a session node on user-item graph and then use temporal personalized random walk to recommend items for users. Modeling long-term and short-term interests is also related to the problem of concept drift which needs to find the balance between temporal effects and long- term trends [120, 144]. Lebanon et al. [77] introduce a local likelihood model for concept drift which weights the local likelihood by using a kernel function. Another similar method is positional language model which is proposed by Lv and Zhai [89]. Both models are proximity-based methods. In this kind of method, the smoothing actually models the lifetime of users’ short-term interests.

Chapter 3

Structural Link Analysis and

Prediction in Microblogs

In the above chapter, we study the personalized tagging prediction and temporal dynamics in social tagging systems. In this chapter, we will investigate another popular system—microblogging system. We first analyze the link structure in Twit- ter, and then by analyzing data collected over time, we compare most popular and recent methods and principles for link prediction. Finally, we propose novel struc- tural methods to calculate the probability of a link being created by examining the current user’s local network structure. Our experiments on both static and dynamic data sets show that our methods noticeably outperform the state-of-the-art.

3.1

Introduction

The use of online social networks and social media in general has surged in recent years. In this chapter, we focus on the understanding of the use of one particular type of social service—that of the microblogging network. In microblog services such

as Twitter, Yammer and Google Buzz, participants form an explicit social network by “following” (subscribing to) another user and thus automatically receive the (short) messages generated by the target user. Unlike some online social networks such as Facebook, LinkedIn or Myspace, a followed user has the option but not the requirement to similarly follow back. Thus, relationships in these social networks may be asymmetric, leading to three kinds of link relationships between users A and B. If A follows B, we say that A is a follower of B, and that B is a friend of A. If A and B both follow each other, we consider them mutual friends or reciprocal friends.

Thus, user B in a microblog service can generate messages, which are generally public and searchable, and any followers of B, such as A, will automatically receive those messages along with messages generated by all other users that A follows. The combination of multiple message intentions and asymmetry of connections has led some to call microblogging services such as Twitter “hybrid networks” [76, 117]. They are hybrid not just because they can carry multiple types of messages, but also because participants create links for multiple reasons—to be social (e.g., to connect online to existing offline social contacts) or to link to information sources. With multiple types of users, it may be difficult to understand how microblogging networks grow and evolve. In a hybrid social-information network, there are two viewpoints to consider. In an information network, the link prediction problem is like the recommendation problem, which is to recommend an information source to an information consumer. In a social network, the problem is to recommend friends to the users, as introduced by Liben-Nowell and Kleinberg [80]. If we can predict the next link that a user will likely create, we will 1) have a model of the user’s interests that may be of value in recommending new links (e.g., as in Twitter’s recently introduced “Who to follow” friend suggestions, and many third- party suggestion services) and in detecting communities; 2) be closer to modeling

the network’s overall growth processes; and, 3) be able to simplify the task of adding that link when the user wishes to do so.

In this chapter, we analyze link structures in Twitter to predict future links. Our contributions are as follows.

• We analyze real Twitter data collected over time to answer the question of from where the new links come. We additionally compare three sampling methods for the link prediction task.

• We are the first to experimentally compare many popular link prediction meth- ods in a microblogging network. Furthermore, we also compare with matrix factorization—the most popular method of recommender systems.

• We propose a novel structure-based link prediction method. Empirical results on ego-centric networks of Twitter users show that our methods can outper- form state-of-the-art methods on this task.

Documento similar