

Previous work on entity linking has predominantly examined the problem with respect to English documents and an English KB such as Wikipedia or Freebase (Bunescu and Paşca, 2006; Mihalcea and Csomai, 2007; Ratinov et al., 2011; Hoffart et al., 2011; Cheng and Roth, 2013; Shen et al., 2015, inter alia). All such entity linking systems involve two steps, candidate generation and context-sensitive inference, which together deal with the ambiguity and variability of entities in text. I describe them below.

Candidate Generation

Considering all entities in the knowledge base as possible disambiguations for a mention m is impractical, as the KB K can contain millions of entities. Therefore, it is prudent to filter out irrelevant entities that the mention is unlikely to refer to. Candidate generation identifies a small set C(m) of plausible entities that a given mention m can link to. Given m, candidate generation outputs a list of candidate entities C(m) = {e_1, e_2, ..., e_K} of size at most K, each associated with a prior probability Pr_prior(e_i | m) indicating the probability of m referring to e_i, given only m's surface.

To do this, a dictionary mapping mention surfaces (strings) to the entities they can refer to is first compiled. For instance, such a dictionary will map chicago to the set of entities {Chicago, University_of_Chicago, Chicago_Cubs, ...}. One approach to compiling such a dictionary is to crawl a large collection of hyperlinked documents and compute the frequency with which an anchor text (or mention surface) links to a page in Wikipedia, yielding a dictionary such as the one shown in Table 23. From the frequencies, one can estimate the conditional probability Pr_prior. For instance, if the string chicago appears hyperlinked 228690 times in the document collection, and links to Chicago and University_of_Chicago 62939 and 17186 times respectively, then the probabilities of a mention [chicago] referring to Chicago and University_of_Chicago are 0.275 and 0.075 respectively. In the literature, this collection of documents can either be hyperlinked text from the Web (Spitkovsky and Chang, 2012), or simply the articles in Wikipedia itself (Ratinov et al., 2011).

String    Entity (e)                   Pr_prior(e|m)   counts/total
chicago   Chicago                      0.275           62939/228690
chicago   University_of_Chicago        0.075           17186/228690
chicago   Chicago_Cubs                 0.045           10332/228690
chicago   Chicago_Tribune              0.041           9457/228690
chicago   Chicago_White_Sox            0.040           9184/228690
chicago   Chicago_Bears                0.039           9018/228690
chicago   Art_Institute_of_Chicago     0.015           3538/228690
chicago   Chicago_(band)               0.005           1225/228690
chicago   Loyola_University_Chicago    0.005           1057/228690

Table 23: Part of a dictionary compiled from Wikipedia hyperlinks showing the entities that the string chicago can refer to, along with the respective prior probabilities and counts.
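The dictionary construction and candidate lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any of the cited systems: the function names `build_prior_dictionary` and `generate_candidates`, the lowercasing of surfaces, and the toy link counts are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_prior_dictionary(links):
    """Estimate Pr_prior(e | m) from (anchor_text, target_entity) pairs.

    Surfaces are lowercased here; that normalization is an assumption.
    """
    counts = defaultdict(Counter)
    for surface, entity in links:
        counts[surface.lower()][entity] += 1

    priors = {}
    for surface, entity_counts in counts.items():
        total = sum(entity_counts.values())
        # Relative frequency: links to this entity / all links with this surface.
        priors[surface] = {e: c / total for e, c in entity_counts.items()}
    return priors


def generate_candidates(priors, mention, k=3):
    """Return at most k (entity, prior) pairs for a mention, highest prior first."""
    candidates = priors.get(mention.lower(), {})
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy hyperlink data (invented counts, for illustration only).
toy_links = (
    [("Chicago", "Chicago")] * 6
    + [("chicago", "University_of_Chicago")] * 3
    + [("Chicago", "Chicago_Cubs")] * 1
)
priors = build_prior_dictionary(toy_links)
top2 = generate_candidates(priors, "chicago", k=2)

# The arithmetic from the text: counts / total for the string chicago.
prior_chicago = round(62939 / 228690, 3)
prior_uchicago = round(17186 / 228690, 3)
```

With the toy data, `top2` keeps only the two most frequent targets of the surface chicago, which is the truncation to at most K candidates described above.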

Context-Sensitive Inference

Having identified a small set of plausible entities, the context-sensitive inference step predicts one entity ê ∈ C(m) as the disambiguation for mention m. For this, a compatibility score ϕ(m→e) of grounding the mention m to a candidate entity e is computed using features extracted from the context of m and information about e expressed in the KB K. The compatibility score is trained using grounded mentions, i.e., mentions annotated with the entity they refer to, such as the ones shown in Figure 15, as supervision. An inference problem is then solved to find the best assignment for m. The different approaches to context-sensitive inference vary in these two steps: the choice of features in the compatibility score (hand-crafted or learnt) and the type of inference performed (local or global).

Choice of Features in ϕ(m→e). The score ϕ(m→e) can be computed using hand-crafted or learnt features. Ratinov et al. (2011) use hand-crafted features comparing the content of entity e's page and mention m's context. One such feature, for instance, is the cosine similarity of the TF-IDF vectors of e's page and m's context. All these manually defined features are compiled into a feature vector f(e, m), such that ϕ(m→e) = wᵀf(e, m), where w is a learnt weight vector.² On the other hand, approaches like Gupta et al. (2017) and Sil et al. (2018) learn feature representations e for the entity and g for the mention's context, so that ϕ(m→e) = eᵀg. The representations e and g are learnt end-to-end by optimizing a loss defined for the disambiguation task using back-propagation (Rumelhart et al., 1986).
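To make the hand-crafted variant concrete, the following sketch computes a TF-IDF cosine similarity between a mention context and two entity descriptions, and combines it with a second feature in a linear score wᵀf(e, m). Everything specific here is an assumption for illustration: the entity descriptions, the prior values, the weight vector, the use of the prior as a second feature, and the smoothed IDF formula are not taken from Ratinov et al. (2011).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1; this variant is an assumption.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def compatibility(weights, features):
    """Linear score: dot product of a weight vector and a feature vector."""
    return sum(w * f for w, f in zip(weights, features))


# Mention context from Figure 15 and two invented entity descriptions.
context = "everton won against liverpool in an fa cup match".split()
pages = {
    "Liverpool_F.C.": "liverpool football club plays in the premier league and fa cup".split(),
    "Liverpool": "liverpool is a city and port in north west england".split(),
}
hypothetical_prior = {"Liverpool_F.C.": 0.3, "Liverpool": 0.6}  # invented values
w = [1.0, 0.2]  # hypothetical weights; in practice these would be learnt

names = list(pages)
vectors = tfidf_vectors([context] + [pages[name] for name in names])
context_vec, page_vecs = vectors[0], dict(zip(names, vectors[1:]))

phi = {
    name: compatibility(w, [cosine(context_vec, page_vecs[name]), hypothetical_prior[name]])
    for name in names
}
```

In this toy setting the football club shares more context terms (fa, cup) with the mention than the city does, so its score is higher even though its assumed prior is lower. A learnt-representation model would replace the hand-built feature vector with a dot product between an entity embedding and a context embedding.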

Local Inference. Local inference resolves each mention m ∈ D in isolation, with the assignment to mention m_i having no influence on the assignment to mention m_j, where i ≠ j.

\[
\hat{e}_i = \arg\max_{e_i \in C(m_i)} \phi(m_i \to e_i) \quad \forall m_i \in D \tag{7.1}
\]

Local inference is a popular approach (Mihalcea and Csomai, 2007; Durrett and Klein, 2014; Lazic et al., 2015; Francis-Landau et al., 2016), owing to its simplicity.
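Local inference amounts to an independent argmax per mention, as the short sketch below shows. The function name `local_inference`, the mention strings, the candidate sets, and the scores are hypothetical; the scores stand in for a trained ϕ.

```python
def local_inference(candidates, phi):
    """Resolve each mention independently by maximizing phi over its candidates."""
    return {
        mention: max(cands, key=lambda entity: phi(mention, entity))
        for mention, cands in candidates.items()
    }


# Invented compatibility scores standing in for a trained model.
scores = {
    ("Liverpool", "Liverpool"): 0.6,
    ("Liverpool", "Liverpool_F.C."): 0.4,
    ("Gerrard", "Steven_Gerrard"): 0.7,
    ("Gerrard", "Anthony_Gerrard"): 0.3,
}
candidates = {
    "Liverpool": ["Liverpool", "Liverpool_F.C."],
    "Gerrard": ["Steven_Gerrard", "Anthony_Gerrard"],
}

prediction = local_inference(candidates, lambda m, e: scores[(m, e)])
```

Because each mention is scored alone, the decision for Liverpool here ignores the fact that a footballer was chosen for Gerrard.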

Global Inference. The local inference approach does not take into account relationships between the candidate entities of different mentions in the same document. For instance, if Steven_Gerrard is a candidate for mention m_i ∈ D and Liverpool_F.C. is a candidate for mention m_j ∈ D, then assigning m_i → Steven_Gerrard, m_j → Liverpool_F.C. is coherent because these two entities are related.³ This correlation between entities is expressed through a pairwise coherence score ψ(m_i→e_i, m_j→e_j) that is used along with the learnt compatibility score ϕ(m→e) to formulate a global inference problem,

\[
(\hat{e}_1, \hat{e}_2, \cdots, \hat{e}_n) = \arg\max_{e_i \in C(m_i)} \sum_{i} \phi(m_i \to e_i) + \sum_{i \neq j} \psi(m_i \to e_i, m_j \to e_j) \quad \forall m_i \in D \tag{7.2}
\]

² A non-linear kernel (e.g., a polynomial kernel) can also be used.

[Tamil mention context containing the mention [Liverpool]; an English translation is given in the caption of Figure 15.]

Everton won against [Liverpool] in an FA Cup match.

Figure 15: Tamil and English mention contexts containing [mentions] of the entity Liverpool_F.C. from the respective Wikipedias. Tamil Wikipedia only has 9 mentions referring to Liverpool_F.C., whereas English Wikipedia has 5303 such mentions. Clearly, there is a need to augment the limited contextual evidence in low-resource languages (like Tamil) with evidence from high-resource languages (like English). The Tamil sentence translates to “Suarez plays for [Liverpool] and Uruguay.”

This global inference problem is NP-hard (Cucerzan, 2007), so approximate inference approaches have been proposed that decompose it into smaller, tractable inference problems using a technique like message passing (Globerson et al., 2016; Ganea and Hofmann, 2017). The pairwise coherence score ψ itself can either be a hand-crafted relational score between entities imposed only at inference time (Cheng and Roth, 2013) or learnt end-to-end with the rest of the model (Ganea and Hofmann, 2017).
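For intuition about why exact global inference is expensive, the sketch below solves the objective by brute force, enumerating every joint assignment. It is not one of the approximate message-passing methods cited above. The function name `global_inference`, the scores, and the simplification that ψ depends only on the two entities (not on the mentions) are assumptions for illustration.

```python
from itertools import product


def global_inference(candidates, phi, psi):
    """Exhaustively maximize sum_i phi(m_i, e_i) + sum_{i != j} psi(e_i, e_j).

    The number of joint assignments is the product of the candidate set sizes,
    so this is only feasible for very small documents.
    """
    mentions = list(candidates)
    best_assignment, best_score = None, float("-inf")
    for entities in product(*(candidates[m] for m in mentions)):
        score = sum(phi(m, e) for m, e in zip(mentions, entities))
        # Sum over ordered pairs i != j, so each unordered pair is counted twice.
        score += sum(
            psi(entities[i], entities[j])
            for i in range(len(entities))
            for j in range(len(entities))
            if i != j
        )
        if score > best_score:
            best_assignment, best_score = dict(zip(mentions, entities)), score
    return best_assignment, best_score


# Invented scores standing in for trained phi and psi.
phi_scores = {
    ("Liverpool", "Liverpool"): 0.6,
    ("Liverpool", "Liverpool_F.C."): 0.4,
    ("Gerrard", "Steven_Gerrard"): 0.7,
    ("Gerrard", "Anthony_Gerrard"): 0.3,
}
related = {frozenset({"Steven_Gerrard", "Liverpool_F.C."}): 0.5}
doc_candidates = {
    "Liverpool": ["Liverpool", "Liverpool_F.C."],
    "Gerrard": ["Steven_Gerrard", "Anthony_Gerrard"],
}

assignment, total = global_inference(
    doc_candidates,
    phi=lambda m, e: phi_scores[(m, e)],
    psi=lambda a, b: related.get(frozenset({a, b}), 0.0),
)
```

With these toy scores the coherence term outweighs the higher local score of the city, so the joint solution links Liverpool to Liverpool_F.C. A document with n mentions and K candidates each requires scoring K^n assignments under this exhaustive scheme, which is what motivates the approximate methods.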
