Relation extraction is a branch of information extraction that identifies semantic relations between extracted entities. It has been studied extensively and applied to various types of text, such as plain text [Agichtein and Gravano, 2000], news articles [Doddington et al., 2004], Wikipedia pages [Suchanek et al., 2007] and research articles [Krallinger et al., 2011].

The relations of interest may be binary (i.e., relations between two entities) or multi-way (i.e., relations among more than two entities, also known as “events”). Two recent examples are the slot-filling task in TAC ’11 [Entity Linking, 2011], which targets 26 binary relations for persons (e.g., country of birth and member of) and 16 for organizations (e.g., members and countries of headquarters), and the GENIA event extraction task in BioNLP ’11 [Kim et al., 2011], which aims to recognize 9 types of bio-molecular events (e.g., binding and localization), possibly involving multiple proteins/entities at multiple sites.

The approaches for extracting binary relations can be broadly classified into the following two categories:

Rule-based approaches: The rule-based approaches for binary relation extraction are similar to the ones for entity extraction described in Section 3.2, except that the patterns are defined around two entities and the actions report the corresponding relations for the matched patterns. Examples of these approaches can be found in [Jayram et al., 2006; Shen et al., 2007; Krishnamurthy et al., 2008]. Please refer to Section 3.2.1 for a review of the strengths, weaknesses, and issues of these approaches.
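To make the idea of a pattern defined around two entities concrete, a rule of this kind might be sketched as follows. This is only an illustration and is not taken from any of the cited systems: it assumes that entities have already been marked inline with hypothetical tags (<PER>...</PER>, <ORG>...</ORG>), and the relation name member_of and the trigger phrase are invented for the example.

```python
import re

# Hypothetical rule: a PER entity, a fixed trigger phrase, then an ORG entity.
# The inline tag format is an assumption made for this sketch.
MEMBER_OF = re.compile(
    r"<PER>(?P<person>[^<]+)</PER>\s+(?:is|was)\s+a\s+member\s+of\s+"
    r"(?:the\s+)?<ORG>(?P<org>[^<]+)</ORG>"
)

def extract_member_of(tagged_sentence):
    """Return (relation, person, organization) triples for every rule match."""
    return [
        ("member_of", m.group("person"), m.group("org"))
        for m in MEMBER_OF.finditer(tagged_sentence)
    ]

sentence = "<PER>Alice Smith</PER> was a member of the <ORG>Example Society</ORG>."
print(extract_member_of(sentence))
```

The pattern plays the role of the rule condition and the returned triple plays the role of the action; a practical system would typically use many such rules and richer token-level or syntactic predicates.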

Statistical approaches: [...] pair of entities using statistical models. There are two groups of methods, which differ in how pairs of entities are modeled. The first group of methods models each pair of entities individually as a vector of features. The strength of this approach is that various types of features, such as lexical, syntactic and semantic features [Kambhatla, 2004; GuoDong et al., 2005], can easily be cast into a unified framework and employed to comprehensively describe the entities and the context between and surrounding them. As shown by [Jiang and Zhai, 2007], who systematically explore several types of features including entity attributes (e.g., entity types), n-grams, constituency-based parse tree features (e.g., grammar productions) and dependency parse tree features (e.g., dependency relations and paths), good performance can be readily achieved using only the basic features of each type. Nevertheless, because many statistical models assume that features are independent and each feature can take only a single value, it is difficult for these methods to capture structured information such as parse trees.
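As a rough illustration of how one pair of entities can be turned into a flat feature vector, consider the sketch below. The function name, the dictionary keys (start, end, type) and the feature names are assumptions made for this example. Only entity-type and lexical features are shown; parse-based features would need a parser and are omitted.

```python
def pair_features(tokens, e1, e2):
    """Build a sparse feature dict for one candidate entity pair.

    tokens: list of words. e1, e2: dicts with hypothetical keys
    'start', 'end' (token offsets, end exclusive) and 'type'.
    """
    first, second = sorted((e1, e2), key=lambda e: e["start"])
    between = tokens[first["end"]:second["start"]]
    features = {
        # Entity attribute feature: the ordered pair of entity types.
        "type_pair=%s-%s" % (first["type"], second["type"]): 1.0,
        "num_tokens_between": float(len(between)),
    }
    # Lexical features: unigrams and bigrams between the two entities.
    for word in between:
        features["between_word=" + word.lower()] = 1.0
    for a, b in zip(between, between[1:]):
        features["between_bigram=%s_%s" % (a.lower(), b.lower())] = 1.0
    return features

tokens = "Marie Curie was born in Warsaw".split()
e1 = {"start": 0, "end": 2, "type": "PER"}
e2 = {"start": 5, "end": 6, "type": "LOC"}
feats = pair_features(tokens, e1, e2)
print(sorted(feats))
```

Each feature is a single named value, which is what makes this representation easy to feed to a standard classifier and also what makes tree-shaped information awkward to express in it.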

The second group of methods defines similarity between pairs of entities using a kernel function. With kernel-based classifiers such as SVMs, an unseen instance is classified by determining whether it is more similar to the instances that hold the given relation than to those that do not. In early work, the kernel functions commonly make use of structured syntactic information, such as constituency-based [Zelenko et al., 2002; Zhou et al., 2007] and dependency-based [Culotta and Sorensen, 2004; Bunescu and Mooney, 2005] parse trees. Correspondingly, the similarity scores are usually computed with graph algorithms, such as counting the number of common subtrees [Zhou et al., 2007] and measuring the number of common properties on the shortest path between pairs of entities [Bunescu and Mooney, 2005]. These methods therefore handle structured information naturally. To allow more types and forms of information to be incorporated, recent research has also developed more complex kernel functions.
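The idea of counting common properties along a path can be sketched with a simplified kernel in the spirit of the shortest-path kernel of [Bunescu and Mooney, 2005]. The sketch assumes each instance has already been reduced to its dependency path between the two entities, represented as a list of feature sets, one per path position. The feature sets below are written by hand for illustration rather than produced by a parser.

```python
def path_kernel(path_x, path_y):
    """Similarity of two dependency paths, each a list of feature sets.

    Returns 0 when the paths differ in length; otherwise the product of
    the number of features shared at each position.
    """
    if len(path_x) != len(path_y):
        return 0
    score = 1
    for feats_x, feats_y in zip(path_x, path_y):
        score *= len(feats_x & feats_y)
    return score

# Hand-written paths: word, part-of-speech tag, coarse tag, entity type,
# with arrows marking the direction of each dependency edge.
x = [{"protesters", "NNS", "Noun", "PERSON"}, {"->"},
     {"seized", "VBD", "Verb"}, {"<-"},
     {"stations", "NNS", "Noun", "FACILITY"}]
y = [{"troops", "NNS", "Noun", "PERSON"}, {"->"},
     {"raided", "VBD", "Verb"}, {"<-"},
     {"churches", "NNS", "Noun", "FACILITY"}]

print(path_kernel(x, y))  # 3 * 1 * 2 * 1 * 3 = 18
```

Because the score is a product, a single position with no shared features makes the whole similarity zero, which keeps the kernel strict about the shape of the path.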

[...] entity kernel and a tree kernel through polynomial expansion, while the context-sensitive convolution tree kernel in [Zhou et al., 2010] is specifically designed for a rich semantic relation tree structure which integrates both syntactic and semantic information. While these complex kernels are able to outperform the feature-based modeling methods, they require substantial effort to engineer, and it is unclear how applicable they are to relation extraction problems in different settings or in other domains.

Similar to entity extraction, many approaches in these two categories are supervised and their effectiveness depends on the availability of an annotated corpus of suitable size. To alleviate this need and tap into the large amount of unlabeled data from large text collections or the Web, non-supervised approaches have also been an active area of research in relation extraction. For example, two early rule-based systems, DIPRE [Brin, 1999] and Snowball [Agichtein and Gravano, 2000], start with a seed collection of entity pairs for the relation to be extracted. They then search unlabeled text sources (e.g., the Web) for sentences containing the entity pairs. Afterwards, they learn new rules from the retrieved sentences and use the learned rules to extract new entity pairs from the text sources. These entity pairs are then added to the seed collection and the process repeats until some termination condition is met. Later systems, such as KnowItAll [Etzioni et al., 2005] and TextRunner [Banko et al., 2007], make use of generic patterns to extract candidate entity pairs. These candidate pairs are then selected using domain-independent heuristics (e.g., pointwise mutual information derived from search engine hit counts) or unsupervised classifiers (e.g., a classifier that heuristically labels its own training data). In the end, the selected pairs can be used to derive extraction patterns or provide statistics for estimating whether an entity pair is a correct instance. In the case where a relation database exists, distant supervision can be performed by harvesting training data using the entity pairs from the database [Mintz et al., 2009].
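The seed-and-repeat loop used by the early bootstrapping systems can be sketched as follows. This is a heavily simplified toy and not a reimplementation of DIPRE or Snowball: a pattern is just the literal string found between two known entities, entities are approximated as runs of capitalized words, and there is no pattern or pair confidence scoring. The function name, the corpus and the seed pair are all made up for illustration.

```python
import re

# Crude stand-in for an entity recognizer: one or more capitalized words.
ENTITY = r"([A-Z]\w+(?: [A-Z]\w+)*)"

def bootstrap(sentences, seeds, max_rounds=3):
    """Grow a set of (e1, e2) pairs from seed pairs over plain sentences."""
    pairs = set(seeds)
    patterns = set()
    for _ in range(max_rounds):
        # Step 1: learn patterns from sentences that mention a known pair.
        for s in sentences:
            for e1, e2 in pairs:
                m = re.search(re.escape(e1) + r"(.+?)" + re.escape(e2), s)
                if m:
                    patterns.add(m.group(1))
        # Step 2: apply the learned patterns to extract new pairs.
        new_pairs = set()
        for s in sentences:
            for middle in patterns:
                for m in re.finditer(ENTITY + re.escape(middle) + ENTITY, s):
                    new_pairs.add((m.group(1), m.group(2)))
        if new_pairs <= pairs:  # termination: nothing new was found
            break
        pairs |= new_pairs
    return pairs, patterns

sentences = [
    "Charles Dickens is the author of Great Expectations.",
    "Jane Austen is the author of Emma.",
    "Jane Austen wrote Emma in 1815.",
    "Herman Melville wrote Moby Dick.",
]
seeds = {("Charles Dickens", "Great Expectations")}
pairs, patterns = bootstrap(sentences, seeds)
print(pairs)
print(patterns)
```

In this toy corpus the seed pair yields a first pattern, that pattern yields a second pair, and the second pair yields a second pattern that reaches a third pair. Without some form of scoring, a single overly general pattern could introduce many wrong pairs, which is one motivation for the selection heuristics mentioned above.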

Moving beyond binary relations, rule-based approaches are more popular because they handle multi-way relations naturally by defining patterns over multiple entities and reporting that the relations of interest exist among the entities [...] events using 50 generic event extraction patterns supported by lexico-syntactic information. These patterns can be learnt automatically (e.g., [Piskorski et al., 2007]). As a way to consolidate texts that contain similar events for better rule learning and relation extraction, clustering can be applied as a preprocessing step [Piskorski et al., 2008; Liu et al., 2008].

In contrast, in statistical approaches, multi-way relations need to be decomposed into multiple binary relation classifications whose results are then combined. [McDonald et al., 2005] propose to factorize the complex relations into a set of binary relations and train one classifier to extract all pairs of related entities. Based on the output of this classifier, a graph can be constructed with nodes representing entities and edges representing whether the entities are related. The original multi-way relation can then be recovered by finding the maximal cliques in this graph. The main advantage of this method is that it allows statistical approaches for binary relation classification, which have been studied extensively, to be applied to multi-way relations.
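The reconstruction step can be illustrated with a small sketch. It assumes that a binary classifier (not shown) has already decided which pairs of entities are related, builds the corresponding undirected graph, and enumerates maximal cliques with the basic Bron-Kerbosch recursion. The entity names and function names are hypothetical, and this is not the implementation used in the cited work.

```python
def build_graph(related_pairs):
    """Adjacency sets for an undirected graph from accepted entity pairs."""
    adj = {}
    for a, b in related_pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already explored nodes.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

# Pairs a hypothetical binary classifier judged to be related.
accepted = [("John", "CEO"), ("John", "Acme"), ("CEO", "Acme"), ("Mary", "Acme")]
cliques = maximal_cliques(build_graph(accepted))
print(cliques)
```

Here the three mutually related entities John, CEO and Acme form one candidate ternary relation, while Mary and Acme remain a binary one. Since a misclassified pair adds or removes an edge, errors of the binary classifier directly change which cliques are recovered.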

Research on domain-specific relation extraction has been carried out predominantly in the biomedical domain, for tasks such as gene-drug relation, protein-protein interaction and bio-molecular event extraction. In general, both rule-based [Hakenberg et al., 2008] and statistical approaches [Riedel and McCallum, 2011; Tikk et al., 2010] have been adopted to a similar extent, although the results in [Kim et al., 2011] give some evidence that the latter lead to better-performing systems. Various domain-specific sources can be utilized in the extraction process. For example, medical information databases (e.g., PharmGKB) can be used to perform distant supervision [Buyko et al., 2012], while lexica of problem-specific trigger words (i.e., words that usually express interactions) can be used to avoid extracting relations from irrelevant sentences [Bobic et al., 2012].

All the existing works in relation extraction focus on extracting relations between two textual entities. As far as we know, no prior work has examined the extraction of relations between text entities and domain-specific constructs. Therefore, we carry out our own corpus study in math to get a better understanding of how concepts and constructs can be related and then formulate the problem of Text-to-Construct Linking accordingly.

Title                  URL
Absolute value         http://en.wikipedia.org/wiki/Absolute_value
Bayes’ theorem         http://en.wikipedia.org/wiki/Bayes'_theorem
Complex number         http://en.wikipedia.org/wiki/Complex_number
Fraction               http://en.wikipedia.org/wiki/Fraction_(mathematics)
Fourier transform      http://en.wikipedia.org/wiki/Fourier_transform
Function               http://en.wikipedia.org/wiki/Function_(mathematics)
Modular arithmetic     http://en.wikipedia.org/wiki/Modular_arithmetic
Polynomial             http://en.wikipedia.org/wiki/Polynomial
Pythagorean theorem    http://en.wikipedia.org/wiki/Pythagorean_theorem
Trigonometry           http://en.wikipedia.org/wiki/Trigonometry
