CHAPTER III: METHODOLOGICAL FRAMEWORK
3.4 Procedure for obtaining the information.
Annealed Importance Sampling
Annealed importance sampling (AIS) is a sequential Monte Carlo method that allows the normalizing constant of a distribution which cannot be normalized analytically to be estimated without bias, using a simulated annealing heuristic. This is accomplished by starting at a distribution with a known normalization and gradually transforming it into the distribution of interest through a chain of Markov transitions. The transition operator $T_k(x', x)$ represents the probability density of moving from state $x$ to state $x'$. Any suitable MCMC transition operator can be used, provided it is compatible with the chosen sequence of intermediate probability distributions. One general way to define this sequence is to set:
$$P_k(x) \propto P_A^*(x)^{1-\beta_k} \, P_B^*(x)^{\beta_k}, \tag{4.53}$$
where $0 = \beta_0 < \beta_1 < \cdots < \beta_K = 1$ is the annealing temperature schedule chosen by the user. Annealed Importance Sampling produces a sample of points $x^{(1)}, x^{(2)}, \ldots, x^{(N)}$ and their weights $w^{(1)}, w^{(2)}, \ldots, w^{(N)}$, each obtained from a sequence of points $x_1, \ldots, x_K$ as follows:
Algorithm 7 Annealed Importance Sampling: One Run
1: Generate $x_1, x_2, \ldots, x_K$ as follows:
2: – Sample $x_1$ from $P_A = P_0$
– Sample $x_2$ given $x_1$ using $T_1$
– ...
– Sample $x_K$ given $x_{K-1}$ using $T_{K-1}$
3: Set $x^{(i)} = x_K$
4: Set $w^{(i)} = \dfrac{P_1^*(x_1)}{P_0^*(x_1)} \, \dfrac{P_2^*(x_2)}{P_1^*(x_2)} \cdots \dfrac{P_K^*(x_K)}{P_{K-1}^*(x_K)}$
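A single run of the procedure above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the one-dimensional Gaussian choices for the start and target densities, the random-walk Metropolis operator, and all function names (`log_p_a`, `log_p_b`, `log_p_k`, `metropolis_step`, `ais_run`) are assumptions made for the example.

```python
import math
import random

def log_p_a(x):
    # Assumed start distribution P_A*: unnormalized standard normal.
    return -0.5 * x * x

def log_p_b(x, mu=1.0, sigma=0.5):
    # Assumed target P_B*: unnormalized normal with mean mu and std sigma.
    return -0.5 * ((x - mu) / sigma) ** 2

def log_p_k(x, beta):
    # Geometric path: log P_k*(x) = (1 - beta) log P_A*(x) + beta log P_B*(x).
    return (1.0 - beta) * log_p_a(x) + beta * log_p_b(x)

def metropolis_step(x, beta, rng, step=0.5):
    # One random-walk Metropolis update targeting P_k at inverse temperature beta.
    proposal = x + rng.gauss(0.0, step)
    log_accept = log_p_k(proposal, beta) - log_p_k(x, beta)
    if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
        return proposal
    return x

def ais_run(betas, rng):
    # betas = [beta_0, ..., beta_K] with beta_0 = 0 and beta_K = 1.
    K = len(betas) - 1
    x = rng.gauss(0.0, 1.0)  # x_1 sampled exactly from P_0 = P_A
    log_w = 0.0
    for k in range(1, K + 1):
        # Accumulate log of P_k*(x_k) / P_{k-1}*(x_k).
        log_w += log_p_k(x, betas[k]) - log_p_k(x, betas[k - 1])
        if k < K:
            # Sample x_{k+1} given x_k using T_k.
            x = metropolis_step(x, betas[k], rng)
    return x, log_w

rng = random.Random(0)
betas = [k / 200 for k in range(201)]
x_i, log_w_i = ais_run(betas, rng)
```

Only unnormalized densities appear in the weight, so no intermediate normalizing constant is ever computed. Averaging `exp(log_w)` over many runs estimates the ratio of normalizing constants of the target and start distributions, which for these illustrative Gaussians equals `sigma = 0.5`.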
The above procedure produces a single independent point $x^{(i)}$ for use in estimating expectations. Note that since the transitions from each step to the next take place through Metropolis-Hastings, there is no need to calculate the normalizing constants of any intermediate distributions. The final result $\hat{I}_j$ of each annealing run depends heavily on the starting state, which is sampled randomly from the prior; the procedure therefore needs to be repeated several times ($M$) for the result to converge to the true value. Neal (2005) shows that, for a sufficiently large number of intermediate distributions $k$, the variance of $r_{\mathrm{AIS}}$ is proportional to $1/(Mk)$, where $M$ is the number of annealing runs. In order to avoid possible overflow problems, the calculations are done on a logarithmic scale, i.e.
$$\log P(x \mid \mathcal{M}) \simeq \log \frac{1}{M} \sum_{j=1}^{M} \hat{I}_j, \quad \text{where} \quad \hat{I}_j = \exp\left[ \sum_{i=1}^{n} (\beta_i - \beta_{i-1}) \log P(x \mid \mathcal{M}; \theta_j^{\beta_i}) \right]$$
Here $\mathcal{M}$ denotes the model, to distinguish it from the number of runs $M$.
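The log-scale averaging can be carried out with the standard log-sum-exp trick, which never exponentiates a large magnitude directly. The sketch below is illustrative only: the helper names `log_run_weight` and `log_mean_exp`, and the input layout (one list of log-likelihood values per run), are assumptions rather than part of the method as described.

```python
import math

def log_run_weight(betas, log_liks):
    """log I_j for one run: sum_i (beta_i - beta_{i-1}) * log-likelihood at step i.

    betas has n + 1 entries (beta_0 .. beta_n); log_liks has n entries,
    where log_liks[i - 1] is the log-likelihood evaluated at step i.
    """
    if len(log_liks) != len(betas) - 1:
        raise ValueError("expected one log-likelihood per annealing step")
    return sum((betas[i] - betas[i - 1]) * log_liks[i - 1]
               for i in range(1, len(betas)))

def log_mean_exp(log_values):
    """Numerically stable log((1/M) * sum_j exp(log_values[j]))."""
    m = max(log_values)
    if m == float("-inf"):
        return m  # every run has zero weight
    total = sum(math.exp(v - m) for v in log_values)
    return m + math.log(total / len(log_values))

# Hypothetical log I_j values from M = 3 runs; exponentiating them directly
# would underflow to zero.
log_evidence = log_mean_exp([-1050.2, -1049.7, -1051.3])
```

Subtracting the maximum before exponentiating keeps every term in the sum between 0 and 1, so the estimate stays finite even when the individual $\hat{I}_j$ are far outside floating-point range.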
The major advantage of the AIS algorithm from a computational point of view is that the Markov chains are not required to converge to their stationary distributions. The samples need only be drawn approximately from a series of intermediate distributions, which form a path in the probability density space from the prior to the posterior. Thus, the algorithm provides a method to approximate marginal likelihoods even in cases when one cannot find distributions that guarantee convergence. This makes AIS very useful, since convergence assessment can be problematic, especially in nonlinear problems, where several thousand samples may have to be rejected before convergence is achieved.
Although the annealing run allows much freer movement in the state space by making small transitions from each step to the next using the Metropolis-Hastings algorithm, AIS is slow and is still an approximation: there is no guarantee that every mode is visited and hence that the precise result is obtained. In cases where multi-modality is an issue and the modes are far apart relative to these small transition steps, the chain is unlikely to move from one mode to another. In such scenarios, one can increase the temperature to permit uphill moves more frequently; this allows the approximate sampling from the sequence of intermediate distributions to cover more regions of the state space.
4.8 Summary
This chapter discusses various probabilistic models of visual scene analysis that take some functional inspiration from the mammalian visual system and provide a useful basis from which to derive a Fisher score space for the classification task. In order to maximize the likelihood of the visual data, these models face the problem of sampling from the joint probability distribution of the data and the features. We discuss several sampling algorithms in this context and show how these models can be trained to maximize the likelihood of the observed data.
Experiments and Results
This chapter explains the design of the experiments carried out to investigate the problem of visual scene classification, which is typically solved via generative models. We discuss the experimental framework used to assess the classification potential of these generative models and the proposed Fisher kernel based approach used to take on the same recognition challenge.
5.1 Data Sets
In developing visual models of objects and scenes, benchmark data sets play an important role in testing detection and classification performance. Current benchmark data sets for evaluating object classification systems claim to provide image variability in terms of the object or scene's appearance, shape, size, orientation, viewpoint and the noise that is naturally present in the real world. Despite their wide use and applicability, these computer vision data sets have been criticised as inadequate test beds for algorithms that aim to achieve human-comparable speed and accuracy in recognition (Pinto et al., 2008; Ponce et al., 2006; Torralba & Efros, 2011). Assuming these standard data sets are adequate for calibrating the recognition performance of artificial algorithms, we continue to use them in order to gauge the performance of our technique against state-of-the-art methods.
The following different texture, character and object recognition data sets have been used in this work. Note that in all these data sets, an object may be part of the scene or the scene itself used for the classification task.
Texture Data sets:
– UIUC (Lazebnik et al., 2005)
– CUReT (Dana et al., 1999a)
– Brodatz (Valkealahti & Oja, 1998)
– Berkeley (Martin et al., 2001)
– Emphysema (Sørensen et al., 2010)
The first three data sets contain texture patterns that form the primitives of natural scene images. Natural images portray different visual textures with contrasting properties such as regularity versus randomness and uniformity versus distortion in the same image. This collage of pixel variation is a result of the uncontrolled illumination conditions in real life with a variety of objects appearing at different scales and viewpoints. The Berkeley data set is a natural scene image database that is usually used for image segmentation tasks. The Emphysema data set contains computed tomography (CT) slices of the lung tissues showing different textures for medical image analysis.
Character and Object Recognition Data sets:
– MNIST (Lecun et al., 1998)
– USPS (Hull, 1994)
– Alphanumeric
– ETH-80
– Caltech-101
The first three data sets are character and digit recognition data sets, whereas the last two are object recognition data sets widely used to evaluate computational models and algorithms for scene recognition. A more detailed discussion of the specifications of the data sets is given in the forthcoming sections.