
Graphical models are used across a very broad spectrum of problems, from social science problems such as identifying communities [51, 65, 130, 166, 175], to image segmentation [20, 84], cell biology [33], modeling the world wide web [24, 31, 33, 62], and many more. We use an anisotropic model which is suitable, for example, for cosmological models [83, 101], modeling outbreaks of disease [99], and image recognition [178]. For problems of this type there is often no data generating model available. For example, in [84] the authors use a graphical model to identify features in a picture, where there is no physical model to appeal to. This is one reason graphical models are a very popular choice of modeling technique. The problems that motivate the graphical modeling methodology can be seen more generally as clustering or classification problems.

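The problem is: given data $\Psi_n = \{\xi_i\}_{i=1}^n \subset X$, where $X \subset \mathbb{R}^d$, find $\mu : \Psi_n \to \mathbb{R}$ that classifies the data, where $\mu(\xi_i) = 0$ means that $\xi_i$ is associated with the cluster labeled 0 and $\mu(\xi_i) = 1$ means that $\xi_i$ is associated with the cluster labeled 1.

Figure 6.1: An example graph. For the classifier estimates see Figure 6.2.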

For a finite number of observations we allow a soft classification; however, the scaling is chosen such that in the data-rich limit classifiers are binary valued. The motivation for our approach is to validate approximating the hard classification problem by a soft classification problem. The soft classification problem is in general numerically easier [68] and therefore more appealing to the practitioner. However, one also wants to be precise about which class a data point belongs to. Minimizers of the Ginzburg-Landau functional are used as a classification tool [163] because they allow for phase transitions: this permits a soft classification whilst penalizing states that are not close to a hard classification. A consequence of our proofs is an insight into the ratio of data points that receive a hard classification, i.e. the asymptotic behavior of $\{\xi_i : \mu(\xi_i) \in \{0,1\}\}$.

Another important application of this work is in designing classifiers. By not assuming that the model is isotropic we gain flexibility: one can choose some features to be more important than others. The next subsection contains a simple example showing how this design choice can affect the classification. In particular, one can use the methodology to map infinite dimensional data onto a finite dimensional space in order to classify the infinite dimensional data set.

Assessing the validity of such an approach is of high importance, especially as one cannot intuitively link the model to the data generating process. When one can make such a connection, the model can be heuristically motivated. Without such a connection one needs to do more to justify the approach.

The primary results of this chapter show that $\mu^{(n)}$ converges to a minimizer of a limiting model. We also give some preliminary results characterizing the rate of convergence in a simplified example. We believe these results hold in more generality than stated here, and extending them is the objective of ongoing work.

Our approach is motivated by [5, 69, 163]. Classifiers are constructed as the solution of a variational problem, which is common in statistical problems, e.g. maximum likelihood and maximum a posteriori problems. In particular, minimizers of the Ginzburg-Landau functional, a phase transition model popular in materials science and image segmentation, are used as classifiers. Classifiers $\mu^{(n)} : \Psi_n \to \mathbb{R}$ are constructed as follows. Let $V : \mathbb{R} \to [0,\infty)$ be a potential such that states taking the value 0 or 1 are favored, for example $V(t) = t^2(t-1)^2$. A graph is constructed by taking the vertices as the set $\Psi_n$ and weighting edges

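$$W_{ij} = \eta_\epsilon(\xi_i - \xi_j)$$

for $\eta_\epsilon : \mathbb{R}^d \to [0,\infty)$, and we say that there is an edge between $\xi_i$ and $\xi_j$ if $W_{ij} > 0$; for example see Figure 6.1. Assume that $\eta_\epsilon : \mathbb{R}^d \to [0,\infty)$ is of the form

$$\eta_\epsilon(x) = \frac{1}{\epsilon^d}\,\eta(x/\epsilon) \tag{6.1}$$

with scaling parameter $\epsilon \in (0,\infty)$ and $\eta : \mathbb{R}^d \to [0,\infty)$. For example one (isotropic) choice of $\eta$ is $\eta(x) = 1$ if $|x| < 1$ and $\eta(x) = 0$ if $|x| \ge 1$. For a function $\mu$ on $\Psi_n$ the graph energy $E_n(\mu) \in [0,\infty]$ is defined by

$$E_n(\mu) = \frac{1}{\epsilon}\frac{1}{n}\sum_{i=1}^{n} V(\mu(\xi_i)) + \frac{1}{\epsilon}\frac{1}{n^2}\sum_{i,j} W_{ij}\,|\mu(\xi_i) - \mu(\xi_j)|. \tag{6.2}$$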

Our classifier for the clustering problem is then given as a minimizer of (6.2).
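The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the numerical method used in this work: the function names (`potential`, `eta_indicator`, `weight`, `graph_energy`), the toy data, and the value of the scaling parameter are all hypothetical, and the prefactors of (6.2) are read as $\frac{1}{\epsilon}\frac{1}{n}$ and $\frac{1}{\epsilon}\frac{1}{n^2}$. The sketch evaluates the graph energy for three candidate labelings rather than minimizing it.

```python
import math


def potential(t):
    """Double-well potential V(t) = t^2 (t - 1)^2; it vanishes only at t = 0 and t = 1."""
    return t ** 2 * (t - 1) ** 2


def eta_indicator(x):
    """Isotropic kernel from the text: 1 if |x| < 1, otherwise 0."""
    return 1.0 if math.sqrt(sum(c * c for c in x)) < 1.0 else 0.0


def weight(xi, xj, eps, eta=eta_indicator):
    """W_ij = eta_eps(xi - xj), with eta_eps(x) = eps^(-d) * eta(x / eps) as in (6.1)."""
    d = len(xi)
    scaled = [(a - b) / eps for a, b in zip(xi, xj)]
    return eta(scaled) / eps ** d


def graph_energy(points, mu, eps, eta=eta_indicator):
    """Graph energy (6.2): potential term plus weighted pairwise label differences."""
    n = len(points)
    well = sum(potential(m) for m in mu) / (eps * n)
    interaction = sum(
        weight(points[i], points[j], eps, eta) * abs(mu[i] - mu[j])
        for i in range(n)
        for j in range(n)
    ) / (eps * n ** 2)
    return well + interaction


# Hypothetical toy data: two well-separated groups on the real line (d = 1).
points = [(0.0,), (0.1,), (0.2,), (2.0,), (2.1,), (2.2,)]
eps = 0.5  # arbitrary choice of the scaling parameter

hard = [0, 0, 0, 1, 1, 1]   # labels follow the two groups
soft = [0.5] * 6            # every point left undecided
mixed = [0, 1, 0, 1, 0, 1]  # labels cut through both groups

for name, mu in [("hard", hard), ("soft", soft), ("mixed", mixed)]:
    print(name, graph_energy(points, mu, eps))
```

On this toy data the labeling that follows the two groups has the smallest energy of the three: the undecided labeling pays only through the potential term, and the labeling that cuts through both groups pays only through the interaction term.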

This is similar to the approach taken in [69] where they consider pairwise interactions only. In particular one can define the graph total variation by

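$$\mathrm{GTV}_n(\mu) := \frac{1}{\epsilon}\frac{1}{n^2}\sum_{i,j} W_{ij}\,|\mu(\xi_i) - \mu(\xi_j)|. \tag{6.3}$$

In the special case that $\mu(\xi_i) \in \{0,1\}$ this reduces to the graph cut of $\Psi_n$, i.e. if $\mu^{-1}(0) = A_0$ and $\mu^{-1}(1) = A_1$ then

$$\mathrm{GTV}_n(\mu) = \frac{1}{\epsilon}\frac{1}{n^2}\sum_{\xi_i \in A_0}\sum_{\xi_j \in A_1} W_{ij}.$$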
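For a binary labeling, only pairs of points carrying different labels contribute to the graph total variation, so it measures how much edge weight the labeling cuts. The following sketch illustrates this on hypothetical one-dimensional data. The helper names (`indicator_kernel`, `weights`, `graph_total_variation`), the data, and the choice of scaling parameter are assumptions made for illustration, and the prefactor in (6.3) is read as $\frac{1}{\epsilon}\frac{1}{n^2}$ with the sum running over all ordered pairs $(i, j)$.

```python
def indicator_kernel(r):
    """Isotropic kernel on the real line: 1 if |r| < 1, otherwise 0."""
    return 1.0 if abs(r) < 1.0 else 0.0


def weights(points, eps):
    """W_ij = eps^(-1) * eta((xi - xj) / eps) for points on the real line (d = 1)."""
    return [[indicator_kernel((p - q) / eps) / eps for q in points] for p in points]


def graph_total_variation(W, mu, eps):
    """Graph total variation (6.3), summed over all ordered pairs (i, j)."""
    n = len(mu)
    total = sum(W[i][j] * abs(mu[i] - mu[j]) for i in range(n) for j in range(n))
    return total / (eps * n ** 2)


# Hypothetical data: six equally spaced points; with eps = 0.3 only
# neighboring points are joined by an edge.
points = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
eps = 0.3
W = weights(points, eps)

constant = [0, 0, 0, 0, 0, 0]     # nothing is cut
split = [0, 0, 0, 1, 1, 1]        # a single edge is cut
alternating = [0, 1, 0, 1, 0, 1]  # every edge is cut

for name, mu in [("constant", constant), ("split", split), ("alternating", alternating)]:
    print(name, graph_total_variation(W, mu, eps))
```

The constant labeling has zero total variation, and the labeling that cuts every edge has a larger total variation than the one that cuts a single edge.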

We wish to allow for soft clustering; however, the total variation term alone is not enough to do this informatively. The clustering approach is made more robust by including a first order term which penalizes associating a data point with more than one cluster. See, for example, Figure 6.2 for a comparison. It is not trivial that the convergence results in [69] survive the addition of a penalty term.
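To see concretely why the total variation term alone is uninformative, consider the constant state $\mu \equiv 1/2$, which associates every data point equally with both clusters. The short calculation below is a sketch that assumes the example potential $V(t) = t^2(t-1)^2$ and reads the prefactors in (6.2) and (6.3) as $\frac{1}{\epsilon}\frac{1}{n}$ and $\frac{1}{\epsilon}\frac{1}{n^2}$.

```latex
\begin{align*}
\mathrm{GTV}_n(\mu)
  &= \frac{1}{\epsilon}\frac{1}{n^2}\sum_{i,j} W_{ij}\left|\tfrac12 - \tfrac12\right| = 0, \\
E_n(\mu)
  &= \frac{1}{\epsilon}\frac{1}{n}\sum_{i=1}^{n} V\!\left(\tfrac12\right) + \mathrm{GTV}_n(\mu)
   = \frac{1}{\epsilon}\cdot\frac{1}{16} > 0 .
\end{align*}
```

The total variation cannot distinguish this undecided state from the constant hard labelings $\mu \equiv 0$ and $\mu \equiv 1$, whereas the potential term charges it $1/(16\epsilon)$, a cost that grows as $\epsilon$ decreases.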

Finding minimizers of $E_n$ is also an important problem but is not addressed in this thesis. We instead refer to [29, 30] for numerical methods.