
Formal Concept Analysis (FCA) was introduced by Rudolf Wille [57] in the 1980s. It is a conceptual framework for structuring, analyzing, minimizing, visualizing and revealing knowledge hidden in data, applying data mining techniques to a binary relation between the elements of two sets: objects and attributes. In recent decades, FCA techniques have been applied successfully in diverse research areas such as data mining, social network analysis, marketing and medical diagnosis.

The starting point is the relationship between a set of objects and its properties.

Definition 1.2.1. A formal context is a triplet K = ⟨G, M, I⟩ which consists of two non-empty sets G and M and a binary relation I between G and M. The elements of G are called the objects and the elements of M

1.2. FORMAL CONCEPT ANALYSIS 19

are called the attributes of the context. For g ∈ G and m ∈ M, we write ⟨g, m⟩ ∈ I or gIm if the object g has the attribute m.

The following example will be cited often in the rest of the work. It is taken from [26], where the authors consider data about 130 countries. For the sake of readability, we consider only a part of this dataset, restricted to 8 countries.

Example 1.2.2. The data record whether or not these countries belong to Gr77 (Group of 77), NA (Non-Aligned), LLDC (Least Developed Countries), MASC (Most Seriously Affected Countries), OPEC (Organization of the Petroleum Exporting Countries) and ACP (African, Caribbean and Pacific Countries). The triplet K0 = ⟨G, M, I⟩ is a formal context where the set of objects and the set of attributes are, respectively, the following:

G = {Afghanistan, Algeria, Benin, Botswana, Cameroon, Gabon, Haiti, Kiribati}

M = {Gr77, NA, LLDC, MASC, OPEC, ACP}.

Table 1.1 depicts the binary relation I of the formal context K0.

I            Gr77  NA  LLDC  MASC  OPEC  ACP
Afghanistan   ×    ×    ×     ×     .     .
Algeria       ×    ×    .     .     ×     .
Benin         ×    ×    ×     ×     .     ×
Botswana      ×    ×    ×     .     .     ×
Cameroon      ×    ×    .     ×     .     ×
Gabon         ×    ×    .     .     ×     ×
Haiti         ×    .    ×     ×     .     ×
Kiribati      .    .    ×     .     .     ×

Table 1.1: Membership of countries in supranational groups.

Thus, ⟨Cameroon, Gr77⟩ ∈ I means “the object Cameroon has the attribute Group of 77” or “Cameroon belongs to Group of 77”. However, ⟨Cameroon, OPEC⟩ ∉ I, i.e., “Cameroon does not belong to OPEC”.
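As a minimal sketch, the context K0 can be stored as a mapping from each object to its set of attributes. The rows below are a transcription of Table 1.1, checked against the examples of this section, and should be read as an assumption of the sketch; the names `ROWS`, `CONTEXT` and `has_attribute` are illustrative only.

```python
# Each object of G is mapped to the attributes of M it has.
# Assumption: the rows transcribe Table 1.1.
ROWS = {
    "Afghanistan": "Gr77 NA LLDC MASC",
    "Algeria":     "Gr77 NA OPEC",
    "Benin":       "Gr77 NA LLDC MASC ACP",
    "Botswana":    "Gr77 NA LLDC ACP",
    "Cameroon":    "Gr77 NA MASC ACP",
    "Gabon":       "Gr77 NA OPEC ACP",
    "Haiti":       "Gr77 LLDC MASC ACP",
    "Kiribati":    "LLDC ACP",
}
CONTEXT = {g: frozenset(attrs.split()) for g, attrs in ROWS.items()}

G = frozenset(CONTEXT)                                   # objects
M = frozenset({"Gr77", "NA", "LLDC", "MASC", "OPEC", "ACP"})  # attributes
I = {(g, m) for g, attrs in CONTEXT.items() for m in attrs}   # incidence relation


def has_attribute(g, m):
    """Return True when (g, m) belongs to I, i.e. g I m."""
    return (g, m) in I


print(has_attribute("Cameroon", "Gr77"))  # True
print(has_attribute("Cameroon", "OPEC"))  # False
```

Storing one attribute set per object keeps row lookups cheap; the explicit set `I` of pairs is built only to mirror the definition.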

Knowledge extracted from the formal context needs to be represented. There are two main ways of representing it: either by the so-called concept lattice or by means of a set of attribute implications. Both ways are equivalent: they represent the same knowledge, and either of them can be built from the other without the need of the formal context.

1.2.1 Formal Concepts

Formal concepts are pairs consisting of a subset of objects and a subset of attributes that are related to each other in the context. Before introducing them, we start by defining the derivation or concept-forming operators.

Definition 1.2.3. Given a formal context K = ⟨G, M, I⟩, two mappings (−)↑ : 2^G → 2^M and (−)↓ : 2^M → 2^G, named concept-forming operators, are defined as follows:

A↑ = {m ∈ M | ⟨g, m⟩ ∈ I for all g ∈ A}
B↓ = {g ∈ G | ⟨g, m⟩ ∈ I for all m ∈ B}

for any A ⊆ G and B ⊆ M.

The set A↑ is the set of all the common attributes shared by all the objects of A, and the set B↓ is the set of objects sharing all the attributes of B.

Example 1.2.4. From the context K0 depicted in Table 1.1, one can obtain:

{Algeria}↑ = {Gr77, NA, OPEC}
{Afghanistan, Algeria}↑ = {Gr77, NA}
G↑ = ∅
∅↑ = M
{OPEC}↓ = {Algeria, Gabon}
{OPEC, ACP}↓ = {Gabon}
M↓ = ∅
∅↓ = G
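The two concept-forming operators translate almost literally into set comprehensions. The following sketch assumes the incidence data transcribed from Table 1.1; the function names `up` and `down` are illustrative choices for (−)↑ and (−)↓.

```python
# Assumption: the rows transcribe Table 1.1.
ROWS = {
    "Afghanistan": "Gr77 NA LLDC MASC",
    "Algeria":     "Gr77 NA OPEC",
    "Benin":       "Gr77 NA LLDC MASC ACP",
    "Botswana":    "Gr77 NA LLDC ACP",
    "Cameroon":    "Gr77 NA MASC ACP",
    "Gabon":       "Gr77 NA OPEC ACP",
    "Haiti":       "Gr77 LLDC MASC ACP",
    "Kiribati":    "LLDC ACP",
}
CONTEXT = {g: frozenset(attrs.split()) for g, attrs in ROWS.items()}
G = frozenset(CONTEXT)
M = frozenset({"Gr77", "NA", "LLDC", "MASC", "OPEC", "ACP"})


def up(A):
    """A↑: attributes shared by every object of A (all of M when A is empty)."""
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in A))


def down(B):
    """B↓: objects having every attribute of B (all of G when B is empty)."""
    return frozenset(g for g in G if frozenset(B) <= CONTEXT[g])


print(sorted(up({"Algeria"})))       # ['Gr77', 'NA', 'OPEC']
print(sorted(down({"OPEC", "ACP"}))) # ['Gabon']
```

Note that `all(...)` over an empty set is true, which is what yields ∅↑ = M and ∅↓ = G.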

One of the fundamental results of FCA is the following theorem that relates it to the notions described in the previous section.

Theorem 1.2.5. Let K = ⟨G, M, I⟩ be a formal context. The pair of the concept-forming operators ⟨(−)↑, (−)↓⟩ is a Galois connection between G and M.

Therefore, from the above theorem and Theorems 1.1.6 and 1.1.7, the following corollary is directly obtained.


Corollary 1.2.6. In any context ⟨G, M, I⟩, the following properties hold:

(i) A1 ⊆ A2 implies A2↑ ⊆ A1↑ for all A1, A2 ⊆ G.
(ii) B1 ⊆ B2 implies B2↓ ⊆ B1↓ for all B1, B2 ⊆ M.
(iii) A ⊆ A↑↓ and B ⊆ B↓↑ for all A ⊆ G and B ⊆ M.
(iv) A↑ = A↑↓↑ and B↓ = B↓↑↓ for all A ⊆ G and B ⊆ M.

In addition, both compositions of the two concept-forming operators, (−)↑↓ : 2^G → 2^G and (−)↓↑ : 2^M → 2^M, are closure operators.
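Since K0 is small (2^8 subsets of objects and 2^6 subsets of attributes), the properties of Corollary 1.2.6 can be checked exhaustively. This is only a sanity check on one context, not a proof; the incidence data is assumed to transcribe Table 1.1, and all function and variable names are illustrative.

```python
from itertools import combinations

# Assumption: the rows transcribe Table 1.1.
ROWS = {
    "Afghanistan": "Gr77 NA LLDC MASC",
    "Algeria":     "Gr77 NA OPEC",
    "Benin":       "Gr77 NA LLDC MASC ACP",
    "Botswana":    "Gr77 NA LLDC ACP",
    "Cameroon":    "Gr77 NA MASC ACP",
    "Gabon":       "Gr77 NA OPEC ACP",
    "Haiti":       "Gr77 LLDC MASC ACP",
    "Kiribati":    "LLDC ACP",
}
CONTEXT = {g: frozenset(attrs.split()) for g, attrs in ROWS.items()}
G = frozenset(CONTEXT)
M = frozenset().union(*CONTEXT.values())


def up(A):
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in A))


def down(B):
    return frozenset(g for g in G if B <= CONTEXT[g])


def subsets(S):
    """All subsets of S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


sub_G, sub_M = subsets(G), subsets(M)
UP = {A: up(A) for A in sub_G}      # tabulated (−)↑
DOWN = {B: down(B) for B in sub_M}  # tabulated (−)↓

# (i) and (ii): both operators reverse inclusion.
antitone = (all(UP[A2] <= UP[A1] for A1 in sub_G for A2 in sub_G if A1 <= A2)
            and all(DOWN[B2] <= DOWN[B1] for B1 in sub_M for B2 in sub_M if B1 <= B2))
# (iii): every set is contained in its double derivation.
extensive = (all(A <= DOWN[UP[A]] for A in sub_G)
             and all(B <= UP[DOWN[B]] for B in sub_M))
# (iv): a third derivation adds nothing, so the compositions are idempotent.
stable = (all(UP[A] == UP[DOWN[UP[A]]] for A in sub_G)
          and all(DOWN[B] == DOWN[UP[DOWN[B]]] for B in sub_M))

print(antitone, extensive, stable)  # True True True
```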

Moreover, the closed sets of these two mappings, that is, the fixpoints of the closure operators, define the so-called formal concepts. As we shall see, the formal concept is a key notion in FCA: it formalizes the idea of a concept in the model and allows us to characterize a set of objects by means of the attributes they share, and vice versa.

Definition 1.2.7. Let K = ⟨G, M, I⟩ be a formal context and A ⊆ G, B ⊆ M. The pair ⟨A, B⟩ is called a formal concept if A↑ = B and B↓ = A.

The set of objects, A, is said to be the extent and the set of attributes, B, the intent of the concept ⟨A, B⟩.

In other words, ⟨A, B⟩ is a formal concept if A contains all the objects sharing the attributes in B and, analogously, B contains all the attributes shared by the objects in A. The set of all concepts of the context K, denoted by B(K), constitutes a lattice that is called the concept lattice, where the order relation is defined as follows:

⟨A1, B1⟩ ≤ ⟨A2, B2⟩ if and only if A1 ⊆ A2 or, equivalently, B2 ⊆ B1.

Example 1.2.8. There are 26 formal concepts associated with the formal context K0 introduced in Table 1.1. One concept is, for instance, the pair ⟨{Benin, Botswana, Haiti, Kiribati}, {LLDC, ACP}⟩ of the least developed countries in a certain region: the intent gives their common properties and the extent lists the countries they characterize. Note that the sets {LLDC, ACP} and {Benin, Botswana, Haiti, Kiribati} are closed sets:

{Benin, Botswana, Haiti, Kiribati}↑↓ = {Benin, Botswana, Haiti, Kiribati}

{LLDC, ACP}↓↑ = {LLDC, ACP}
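Definition 1.2.7 and the order on concepts can be tested directly with the two operators. The sketch below assumes the incidence data transcribed from Table 1.1; `is_concept`, `concept_of` and `leq` are illustrative names, and `concept_of` relies on property (iv) of Corollary 1.2.6 to produce the concept ⟨A↑↓, A↑⟩ generated by a set of objects A.

```python
# Assumption: the rows transcribe Table 1.1.
ROWS = {
    "Afghanistan": "Gr77 NA LLDC MASC",
    "Algeria":     "Gr77 NA OPEC",
    "Benin":       "Gr77 NA LLDC MASC ACP",
    "Botswana":    "Gr77 NA LLDC ACP",
    "Cameroon":    "Gr77 NA MASC ACP",
    "Gabon":       "Gr77 NA OPEC ACP",
    "Haiti":       "Gr77 LLDC MASC ACP",
    "Kiribati":    "LLDC ACP",
}
CONTEXT = {g: frozenset(attrs.split()) for g, attrs in ROWS.items()}
G = frozenset(CONTEXT)
M = frozenset().union(*CONTEXT.values())


def up(A):
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in A))


def down(B):
    return frozenset(g for g in G if frozenset(B) <= CONTEXT[g])


def is_concept(A, B):
    """True when A↑ = B and B↓ = A."""
    return up(A) == frozenset(B) and down(B) == frozenset(A)


def concept_of(A):
    """Concept generated by the objects in A: (A↑↓, A↑)."""
    intent = up(A)
    return down(intent), intent


def leq(c1, c2):
    """Concept order: compare extents (equivalently, intents reversed)."""
    return c1[0] <= c2[0]


ldc_acp = concept_of({"Benin", "Botswana", "Haiti", "Kiribati"})
benin = concept_of({"Benin"})

print(sorted(ldc_acp[0]), sorted(ldc_acp[1]))
print(is_concept(*ldc_acp))                   # True
print(is_concept({"Benin"}, {"LLDC", "ACP"})) # False: the extent is too small
print(leq(benin, ldc_acp))                    # True
```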

There is an alternative way to define the formal concepts. They can be defined as maximal rectangles in the cross-table of the formal context.

Definition 1.2.9. A rectangle in K = ⟨G, M, I⟩ is a pair ⟨A, B⟩ such that A × B ⊆ I. For rectangles ⟨A1, B1⟩ and ⟨A2, B2⟩, put ⟨A1, B1⟩ ⊑ ⟨A2, B2⟩ if and only if A1 ⊆ A2 and B1 ⊆ B2.

Theorem 1.2.10. A pair ⟨A, B⟩ is a formal concept of K = ⟨G, M, I⟩ if and only if ⟨A, B⟩ is a maximal rectangle in K w.r.t. ⊑.
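On a context as small as K0, the equivalence stated in Theorem 1.2.10 can be confirmed by brute force over all pairs of subsets. The sketch assumes the incidence data transcribed from Table 1.1 and uses illustrative names. Maximality is tested by trying to add a single object or a single attribute, which suffices because every pair contained in a rectangle is itself a rectangle.

```python
from itertools import combinations

# Assumption: the rows transcribe Table 1.1.
ROWS = {
    "Afghanistan": "Gr77 NA LLDC MASC",
    "Algeria":     "Gr77 NA OPEC",
    "Benin":       "Gr77 NA LLDC MASC ACP",
    "Botswana":    "Gr77 NA LLDC ACP",
    "Cameroon":    "Gr77 NA MASC ACP",
    "Gabon":       "Gr77 NA OPEC ACP",
    "Haiti":       "Gr77 LLDC MASC ACP",
    "Kiribati":    "LLDC ACP",
}
CONTEXT = {g: frozenset(attrs.split()) for g, attrs in ROWS.items()}
G = frozenset(CONTEXT)
M = frozenset().union(*CONTEXT.values())


def up(A):
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in A))


def down(B):
    return frozenset(g for g in G if B <= CONTEXT[g])


def is_concept(A, B):
    return up(A) == B and down(B) == A


def is_rectangle(A, B):
    """A × B ⊆ I: every object of A has every attribute of B."""
    return all(m in CONTEXT[g] for g in A for m in B)


def is_maximal_rectangle(A, B):
    """A rectangle that cannot be extended by one more object or attribute."""
    return (is_rectangle(A, B)
            and not any(is_rectangle(A | {g}, B) for g in G - A)
            and not any(is_rectangle(A, B | {m}) for m in M - B))


def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Compare both characterizations on every pair (A, B).
agree = all(is_concept(A, B) == is_maximal_rectangle(A, B)
            for A in subsets(G) for B in subsets(M))
print(agree)  # True

small = (frozenset({"Benin", "Botswana"}), frozenset({"Gr77", "NA", "ACP"}))
print(is_rectangle(*small), is_maximal_rectangle(*small))  # True False
```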

Example 1.2.11. In the formal context K0 from Table 1.1, the pair

⟨{Benin, Botswana}, {Gr77, NA, ACP}⟩

is a rectangle in K0, but it is not a formal concept because it is not maximal with respect to ⊑. The pair

⟨{Benin, Botswana, Cameroon, Gabon}, {Gr77, NA, ACP}⟩

is a maximal rectangle (i.e., a formal concept) containing it (see Table 1.2).

As already mentioned, FCA offers a different view of the notions introduced in the previous section. On the one hand, Galois connections and Moore families (which can be seen as concept lattices) allow a graphical representation. On the other hand, closure operators lead to the notion of implicational systems, which facilitate reasoning by means of logic. The following section is devoted to this notion.
