
The idea of automated reasoning predates AI itself and can be traced to ancient Greece. Aristotle’s syllogisms paved the way for the formalism of deductive reasoning. It was carried forward by philosophers such as Al-Kindi, Al-Farabi, and Avicenna (Davidson, 1992), before culminating in modern mathematics and logic.

Within AI research, McCarthy (1963) pioneered the use of logic for automating reasoning for language problems, which over time branched into other classes of reasoning (Holland et al., 1989; Evans et al., 1993).

A form of reasoning closely related to what we study here is abduction (Peirce, 1883; Hobbs et al., 1993), which is the process of finding the best minimal explanation from a set of observations (see Figure 8). Unlike in deductive reasoning, in abductive reasoning the premises do not guarantee the conclusion. Informally speaking, abduction is inferring cause from effect (the reverse direction from deductive reasoning). The two reasoning systems in Chapters 3 and 4 can be interpreted as abductive systems.

We define the notation to make the exposition slightly more formal. Let ⊢ denote entailment and ⊥ denote contradiction. Formally, (logical) abductive reasoning is defined as follows:

Given background knowledge B and observations O, find a hypothesis H such that B ∪ H ⊬ ⊥ (consistency with the given background) and B ∪ H ⊢ O (explaining the observations).

In practical settings, this purely logical definition has many limitations: (a) There could be multiple hypotheses H that explain a particular set of observations given the background knowledge. The best hypothesis has to be selected based on some measure of goodness

Figure 8: Brief definitions for popular reasoning classes and their examples.

elements, i.e. there are degrees of certainty (rather than binary assignments) associated with observations and background knowledge. Hence the decision about consistency and explainability has to be made with respect to this fuzzy measure. (c) The inference problem in its general form is computationally intractable; assumptions often have to be made to obtain tractable inference (e.g., restricting the representation to Horn clauses).
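The logical definition above can be made concrete with a minimal sketch, assuming a propositional Horn-clause setting. This is an illustration only, not the systems of Chapters 3 and 4. The function names `forward_chain` and `abduce`, the atom "FALSE" standing in for ⊥, and the toy rules are all hypothetical. Entailment is checked by forward chaining, and hypotheses are enumerated by brute force over subsets of candidate atoms, which is exponential in the number of candidates and so also illustrates limitation (c).

```python
from itertools import combinations


def forward_chain(facts, rules):
    """Return all atoms derivable from `facts` using Horn `rules`.

    Each rule is a pair (body, head): if every atom in `body` holds, so does `head`.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def abduce(rules, facts, abducibles, observations, contradiction="FALSE"):
    """Enumerate subset-minimal hypotheses H drawn from `abducibles` such that
    B ∪ H does not derive the contradiction atom and B ∪ H derives every observation.
    """
    explanations = []
    # Visiting smaller subsets first guarantees that anything we keep is minimal.
    for size in range(len(abducibles) + 1):
        for candidate in combinations(sorted(abducibles), size):
            hypothesis = frozenset(candidate)
            if any(found <= hypothesis for found in explanations):
                continue  # a smaller explanation is already contained in this one
            closure = forward_chain(set(facts) | hypothesis, rules)
            if contradiction in closure:
                continue  # inconsistent with the background knowledge
            if set(observations) <= closure:
                explanations.append(hypothesis)
    return explanations


# Toy background knowledge (assumed for illustration only).
RULES = [
    (frozenset({"rain"}), "wet_grass"),
    (frozenset({"sprinkler"}), "wet_grass"),
    (frozenset({"rain"}), "wet_road"),
    (frozenset({"rain", "clear_sky"}), "FALSE"),
]
ABDUCIBLES = {"rain", "sprinkler", "clear_sky"}

if __name__ == "__main__":
    print(abduce(RULES, set(), ABDUCIBLES, {"wet_grass"}))
    print(abduce(RULES, {"clear_sky"}, ABDUCIBLES, {"wet_grass"}))
```

With only "wet_grass" observed, both "rain" and "sprinkler" qualify as explanations, which is limitation (a): the purely logical definition does not say which hypothesis is best. Adding "clear_sky" to the background rules out "rain" through the consistency condition.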

2.5.2. Incorporating “uncertainty” in reasoning

Over the years, a wide variety of soft alternatives have emerged for reasoning algorithms by incorporating uncertainty into symbolic models. This resulted in theories like fuzzy logic (Zadeh, 1975), probabilistic Bayesian networks (Pearl, 1988; Dechter, 2013), and soft abduction (Hobbs et al., 1988; Selman and Levesque, 1990; Poole, 1990). In Bayesian networks, the (uncertain) background knowledge is encoded in a graphical structure and, upon receiving observations, the probabilistic explanation is derived by maximizing a posterior probability distribution. These models are essentially based on propositional logic and cannot handle quantifiers (Kate and Mooney, 2009). Weighted abduction combines the weights of relevance/plausibility with first-order logic rules (Hobbs et al., 1988). However, unlike

ical basis and does not lend itself to a complete probabilistic analysis. Our framework in Chapters 3 and 4 is also a way to perform abductive reasoning under uncertainty. Our proposal differs from the previous models in a few ways: (i) Unlike Bayesian networks, our framework is not limited to propositional rules; in fact, first-order relations are used in the design of TableILP (more details in Chapter 3). (ii) Unlike many previous works, we do not make representational assumptions to simplify inference (such as limiting to Horn clauses, or certain independence assumptions). In fact, the inference might be NP-hard, but with the existence of industrial ILP solvers this is not an issue in practice. Our work is inspired by a prior line of work on inference on structured representations to reason on (and with) language; see Chang et al. (2008, 2010); ?, 2012), among others.
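The Bayesian-network view of explanation mentioned earlier in this section (derive the explanation by maximizing a posterior distribution) can be illustrated with a small sketch. It assumes a three-variable network in which rain and sprinkler are independent causes of wet grass. All probabilities and the names `joint` and `map_explanation` are hypothetical, chosen only for illustration, and exhaustive enumeration is used in place of any real inference algorithm.

```python
from itertools import product

# Hypothetical parameters: priors over the two causes and the
# conditional table P(wet | rain, sprinkler).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability P(rain, sprinkler, wet) under the assumed network."""
    p_causes = (P_RAIN if rain else 1 - P_RAIN) * (
        P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    )
    p_wet = P_WET[(rain, sprinkler)]
    return p_causes * (p_wet if wet else 1 - p_wet)


def map_explanation(wet):
    """Return the most probable (rain, sprinkler) assignment given the
    observation `wet`, together with its posterior probability."""
    scores = {
        (rain, sprinkler): joint(rain, sprinkler, wet)
        for rain, sprinkler in product([True, False], repeat=2)
    }
    evidence = sum(scores.values())  # P(wet), the normalizing constant
    best = max(scores, key=scores.get)
    return best, scores[best] / evidence


if __name__ == "__main__":
    print(map_explanation(wet=True))
```

Every variable here is a proposition with a fixed identity, which reflects the point made in the text: a model of this kind has no way to state a quantified rule over arbitrary entities.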

2.5.3. Macro-reading vs micro-reading

With the increased availability of information (especially through the internet), macro-reading systems have emerged with the aim of leveraging a large variety of resources and exploiting the redundancy of information (Mitchell et al., 2009). Even if a system does not understand one text, there might be many other texts that convey a similar meaning. Such systems derive significant leverage from relatively shallow statistical methods with surprisingly strong performance (Clark et al., 2016). Today’s Internet search engines, for instance, can successfully retrieve factoid-style answers to many natural language queries by efficiently searching the Web. Information Retrieval (IR) systems work under the assumption that answers to many questions of interest are often explicitly stated somewhere (Kwok et al., 2001), and all one needs, in principle, is access to a sufficiently large corpus. Similarly, statistical correlation-based methods, such as those using Pointwise Mutual Information or PMI (Church and Hanks, 1989), work under the assumption that many questions can be answered by looking for words that tend to co-occur with the question words in a large corpus. While both of these approaches help identify correct answers, they are not suitable for questions requiring language understanding and reasoning, such as chaining together multiple facts in a piece of evidence given to the system, without reliance on redundancy. The focus of this thesis is micro-reading, as it directly addresses NLU; that being said, whenever possible, we use macro-reading systems as our baselines.
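As a rough illustration of the PMI-based scoring described above, the sketch below computes PMI(x, y) = log2 [ p(x, y) / (p(x) p(y)) ] from document-level co-occurrence counts and ranks answer candidates by their average PMI with the question words. It is a simplified sketch and not the baseline used in this thesis. The toy corpus, the function names `pmi_table` and `score_answer`, and the choice to score unseen pairs as 0 are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Compute document-level PMI for every pair of words that co-occur."""
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = set(doc.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    table = {}
    for pair, joint_count in pair_counts.items():
        x, y = tuple(pair)
        p_xy = joint_count / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        table[pair] = math.log2(p_xy / (p_x * p_y))
    return table


def score_answer(question_words, answer_words, table):
    """Average PMI between question and answer words.

    Pairs never seen together get 0.0 here (an assumption; their true PMI
    would be negative infinity).
    """
    pairs = [(q, a) for q in question_words for a in answer_words if q != a]
    if not pairs:
        return 0.0
    return sum(table.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


# Toy corpus (assumed for illustration only).
DOCS = [
    "plants absorb sunlight",
    "plants absorb water",
    "cars burn fuel",
    "engines burn fuel",
]

if __name__ == "__main__":
    table = pmi_table(DOCS)
    for candidate in ("sunlight", "fuel"):
        print(candidate, score_answer(["plants"], [candidate], table))
```

The score depends only on which words appear together, so a question whose answer requires chaining two separate facts gains nothing from it unless the corpus happens to state the combined fact redundantly.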

2.5.4. Reasoning on “structured” representations

With increasing knowledge resources and diversity of the available knowledge representations, numerous QA systems have been developed to operate over large-scale explicit knowledge representations. These approaches perform reasoning over structured (discrete) abstractions. For instance, Chang et al. (2010) address RTE (and other tasks) via inference on structured representations, Banarescu et al. (2013) use AMR annotators (Wang et al., 2015), Unger et al. (2012) use RDF knowledge (Yang et al., 2017), Zettlemoyer and Collins (2005); Clarke et al. (2010); Goldwasser and Roth (2014); Krishnamurthy et al. (2016) use semantic parsers to answer a given question, and Do et al. (2011, 2012) employ constrained inference for temporal/causal reasoning. The framework we study in Chapter 3 is a reasoning algorithm functioning over tabular knowledge (frames) of basic science concepts.

An important limitation of IR-based systems is their inability to connect distant pieces of information together. However, many other realistic domains (such as science questions or biology articles) have answers that are not explicitly stated in text and instead require combining facts together. Khot et al. (2017) create an inference system capable of combining Open IE tuples (Banko et al., 2007). Jansen et al. (2017) propose reasoning by aggregating sentential information from multiple knowledge bases. Socher et al. (2013); McCallum et al. (2017) propose frameworks for chaining relations to infer new (unseen) relations. Our work in Chapter 3 chains information across multiple tables. The reasoning framework in Chapter 4 investigates reasoning over multiple pieces of raw text. The QA dataset we propose in Chapter 5 also encourages the use of information from different segments of the

2.5.5. Models utilizing massive annotated data

A highlight of the past two decades is the advent of statistical techniques in NLP (Hirschman et al., 1999). Since then, a wide variety of supervised-learning algorithms have shown strong performance on different datasets.

The increasingly large amount of data available for recent benchmarks makes it possible to train neural models (see “Connectionism”; Section 2.4.2) (Seo et al., 2016; Parikh et al., 2016; Wang et al., 2018; Liu et al., 2018; Hu et al., 2018). Moreover, an additional technical shift was the use of distributional representations of words (word vectors or embeddings) extracted from large-scale text corpora (Mikolov et al., 2013; Pennington et al., 2014) (see Section 2.4.3).

Despite the decade-long excitement about supervised-learning algorithms, the main progress, especially in the past few years, has mostly been due to the re-emergence of unsupervised representations (Peters et al., 2018; Devlin et al., 2018).
