The above procedure of estimating Dt will fail if the Decision matrix ∆s is empty.
(section 6.2.2), an empty ∆s will occur if there are no matches that concurrently
satisfy the two criteria: some matches may have a high Gc but a small N%test, and
vice versa. In this case, Dt cannot be estimated by the procedure described in
section 6.2.2, and the test scene is deemed to be ambiguous. It is highly likely that Gcand is an unreliable match, since pairwise matching
against the entire database does not produce a single good match in terms of both Gc and
N%test and is thus inconclusive. The proposed solution rests on the assumption
that a good positive match is unlikely to produce such an ambiguous case, so the match should be rejected as unreliable if possible.
Once an ambiguous case is detected, the scene decision module sets Dt =
Gcand directly. From (6.10), all matches would then be accepted if no further modification
were made. Instead, Dmin is raised to a higher value, denoted D∗min,
which makes it more likely that unreliable matches are rejected (6.18). Fig. 6.4 illustrates how changing the value of Dmin to D∗min helps in rejecting an unreliable Gcand.
This procedure is justified on the basis that, since Dt is effectively unable to
decide whether Gcand should be accepted, one can only heuristically decrease the tolerance
for false matches in this unreliable case. This is done by directly raising the value of Dmin to D∗min, so that Gcand is unlikely to be accepted.
Figure 6.4: For the case of ambiguous scenes, Dt cannot be computed for scene
decision. Instead, modifying the value of Dmin to a higher D∗min value allows such
ambiguous scenes to be rejected.
should be accepted. Gcand must then be larger than D∗min for acceptance, which is
still possible if the true positive has an N%cand or Gcand that is very near to (but lower
than) the threshold criteria for constructing ∆s. In this work, D∗min is set to 0.05.
An example of a positive ambiguous test scene, which demonstrates how this procedure works, is shown in Appendix C.3.2.
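The handling of the ambiguous case can be sketched as follows. This is only an illustration under stated assumptions: the `Match` container, the function names, and the two criterion thresholds passed to `is_ambiguous` are hypothetical, and the exact forms of (6.10) and (6.18) are not reproduced here. Only the value D∗min = 0.05 and the rule that Gcand must exceed D∗min come from the text.

```python
from dataclasses import dataclass
from typing import Sequence

D_MIN_STAR = 0.05  # value of D*min reported in the text


@dataclass
class Match:
    """One pairwise match against a reference scene (hypothetical container)."""
    gc: float      # Global Configuration Coefficient of the match
    n_test: float  # N%test of the match


def is_ambiguous(matches: Sequence[Match],
                 gc_criterion: float,
                 n_criterion: float) -> bool:
    """True when no match satisfies both criteria at once,
    i.e. the decision matrix would be empty.
    The thresholds and the use of >= are assumptions."""
    return not any(m.gc >= gc_criterion and m.n_test >= n_criterion
                   for m in matches)


def accept_ambiguous_candidate(g_cand: float,
                               d_min_star: float = D_MIN_STAR) -> bool:
    """In the ambiguous case, accept Gcand only if it exceeds D*min."""
    return g_cand > d_min_star


# One match has a high Gc but a small N%test, the other the reverse,
# so neither satisfies both (made-up) criteria concurrently.
matches = [Match(gc=0.30, n_test=5.0), Match(gc=0.02, n_test=40.0)]
print(is_ambiguous(matches, gc_criterion=0.10, n_criterion=20.0))  # True
print(accept_ambiguous_candidate(0.03))  # False: below D*min
print(accept_ambiguous_candidate(0.08))  # True: above D*min
```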
6.3 Final remarks
This chapter completes the description of the proposed SRS with the scene decision module. A novel similarity measure is introduced, known as the Global Configuration Coefficient, Gc, which combines both 2D pixel correlation information (N%test)
as well as rank correlation measures of the spatial configuration in (Sρ, Kτ). Gc is
input test scene and a reference image database. This framework describes how the initial candidate match, Gcand, is extracted and validated by estimating an adaptive
decision threshold, Dt (6.10). Modifications to the procedure for ambiguous scenes
are also considered in (6.18). Finally, examples showing how Dt can produce a
reasonable threshold are illustrated in Appendix C for a variety of common cases of Gcand.
The next chapter describes the experimental setup and the tests used to validate the performance and effectiveness of the proposed SRS in a variety of environments under various image distortions. A detailed discussion of the experimental results follows, which attempts to highlight the contribution of the various components to the recognition accuracy of the proposed SRS.
Chapter 7
Experimental Results and Discussion
This chapter presents the experiments conducted to verify the proposed SRS. The experimental setup is first introduced in section 7.1, where the four different image databases used are described. The experimental procedure is subsequently presented in section 7.2, where various measures of recognition accuracy are introduced so as to evaluate the performance of the proposed SRS. In order to highlight the performance of the proposed SRS, several comparative studies with various similarly designed SRSs are described in section 7.3. The results of the experiments are summarised in section 7.4, and various interesting examples from the image databases are highlighted. Finally, an analysis and discussion of the experimental results are presented in section 7.5.
7.1 Experimental setup
In this section, the four image databases used in all the experiments are described. These databases contain images taken from four different environments (the name used for each in this thesis is given in parentheses): indoors (IND), a sandy shore (UBIN), a tropical rainforest (NS) and a mangrove forest (SBWR). For this thesis, the distinction between reference and test scenes is based on how the image scenes are used in the experiments: scenes that make up the reference database, Dref, are reference scenes, while scenes used for testing the recognition accuracy of
the proposed SRS are test scenes. In order to validate the robustness and discriminatory power of the proposed SRS (section 1.3), the scenes in the databases often contain significant image distortions. A summary of the four databases is shown in Table 7.1, where the number of scenes used in each environment is shown as a triplet (Nref, Npos, Nneg), which are respectively the numbers of reference, positive
and negative scenes used in the particular database. Some typical example scenes from the four databases are shown in Fig. 7.1. The following sections describe the databases in greater detail. More examples of the reference and test scenes used in the experiments are shown in Appendix D.3.
Table 7.1: The four databases used in the experiments.

Database   (Nref, Npos, Nneg)   Type
IND        (18, 25, 21)         Indoor
UBIN       (20, 63, 69)         Outdoor coastal
NS         (20, 41, 52)         Outdoor varied
SBWR       (15, 15, 16)         Outdoor enclosed
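As a small worked example, the triplets of Table 7.1 can be held in a lookup table and the number of test scenes per database derived as Npos + Nneg (the relation used for Ntest in section 7.2). The variable and function names are illustrative only; the counts are copied from the table.

```python
# (Nref, Npos, Nneg) triplets copied from Table 7.1.
DATABASES = {
    "IND": (18, 25, 21),
    "UBIN": (20, 63, 69),
    "NS": (20, 41, 52),
    "SBWR": (15, 15, 16),
}


def n_test(name: str) -> int:
    """Number of test scenes in a database: Npos + Nneg."""
    _, n_pos, n_neg = DATABASES[name]
    return n_pos + n_neg


for name in DATABASES:
    print(name, n_test(name))
# IND 46
# UBIN 132
# NS 93
# SBWR 31
```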
7.1.1 Database IND
This database consists of indoor scenes taken under typical lighting conditions. Included is a set of artificial scenes with simple features that are configured differently in space, so as to test the usefulness of rank correlations in detecting changes in the ordinal configuration of ambiguous scenes sharing the same features (Fig. 7.1(IND: top)). Another set of images contains scenes from a typical office/factory with significant clutter and people moving around (Fig. 7.1(IND: bottom)). This database verifies the robustness of the proposed SRS against various image distortions due to viewpoint changes and human movements. The database also tests the ability of the proposed SRS to discriminate ambiguous scenes containing numerous similar features that confuse other methods (e.g. [3, 76, 87]).
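To illustrate why rank correlations are sensitive to ordinal configuration, the sketch below computes Spearman's rho and Kendall's tau on the positions of the same set of features in two scenes. It uses the standard tie-free textbook formulas and made-up one-dimensional feature positions; it is not the way the proposed SRS derives (Sρ, Kτ), which is described elsewhere in the thesis.

```python
from itertools import combinations
from typing import Sequence


def ranks(values: Sequence[float]) -> list:
    """Rank of each value (0 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


# Hypothetical horizontal positions of four matched features.
reference = [10, 40, 80, 120]
same_layout = [12, 45, 77, 130]       # same ordering, slightly shifted
mirrored_layout = [120, 80, 40, 10]   # same features, reversed ordering

print(spearman_rho(reference, same_layout), kendall_tau(reference, same_layout))
# 1.0 1.0
print(spearman_rho(reference, mirrored_layout), kendall_tau(reference, mirrored_layout))
# -1.0 -1.0
```

Both scenes contain the same features, so a purely appearance-based comparison cannot separate them, whereas the sign of the rank correlation does.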
7.1.2 Database UBIN
This database consists of outdoor images taken predominantly along a sandy shore and among the surrounding vegetation of an island. It is the nesting habitat of many species of tropical sand-digging wasps (section 3.5.2 and Fig. 3.20), where one can see them making foraging trips to and from their nests in an unerring manner. The scenes are taken on two different days a month apart, at around the same time of day but under very different weather conditions. The reference scenes are taken on a clear sunny day, while a portion of the test scenes are taken under very hazy (dim) conditions. Furthermore, the test scenes have also undergone significant changes due to natural erosion and the dynamic nature of a coastal environment. For example, the reference scenes are taken at low tide while the test scenes are taken at high tide, which makes this database very challenging (Fig. 7.1(UBIN: top)). Human intervention can also cause scenes taken from similar places to appear very different: leaves being swept up, as well as the addition or removal of man-made structures in the scene (Fig. 7.1(UBIN: bottom)), further complicate recognition in this database. This database verifies the robustness of the proposed SRS against such changes in a simple and open coastal environment with relatively sparse vegetation. The skyline is also particularly evident in such an environment, which is exploited to aid scene recognition.

Figure 7.1: Various challenging test (left) and reference (right) scenes of the four databases, with two rows (top (t), bottom (b)) shown per database. IND: ambiguous scenes (t) and viewpoint changes with significant clutter (b). UBIN: clear vs. hazy overcast sky (t), with differences in tides and shadows vs. leaves swept up (b). NS: non-uniform illumination (t) and changes in scene content due to rain and tree fall (b). SBWR: numerous occlusions due to dense vegetation. See text for a detailed description of each database.
7.1.3 Database NS
The NS database consists of scenes with lush green vegetation taken at a primary swamp forest in a nature reserve. The test scenes are varied in structure, from enclosed forests to semi-open clearings such as streams and ponds (Fig. 7.2).
Figure 7.2: The NS database consists of three environments: Enclosed forest (left), streams and ponds (middle) and semi-open clearings (right).
noontime on a clear day, the second set is taken three weeks later, between the late afternoon and the evening, also on a clear day, while the third set is taken at around noontime on a hazy, cloudy day one week after the second set. As the first two sets are taken on clear days at very different times, changes in illumination caused by the movement of the sun are particularly evident. The effects of shadows and of the non-uniform lighting in the environment, due mainly to the foliage, can be quite drastic and are particularly challenging (Fig. 7.1(NS: top)). Finally, because of the separation in time between the three sets of test scenes, changes due to the dynamic nature of the environment add to the difficulty of recognising the scenes (Fig. 7.1(NS: bottom)).
7.1.4 Database SBWR
In contrast to the ‘openness’ of the UBIN database, SBWR contains relatively complex scenes taken from an enclosed tropical mangrove forest. As the mangrove environment is dominated by a few plant species, this database contains much similar-looking vegetation, and is characterised by dense foliage and numerous occlusions (Fig. 7.1(SBWR)). The difficulty of recognition is compounded because the reference scenes are purposely taken at random points in the forest, with no distinct landmarks that could be used by human observers, unlike the other two databases of natural scenes.
The design of the reference database is also slightly different from that of the other three databases. Many of the reference scenes are represented by two or three snapshots of the same scene at the beginning, middle and end of a TBL arc. This is motivated by the increased complexity of the environment, whose representation requires several slightly displaced snapshots of the same scene, as these can look remarkably different (Fig. 5.8). Furthermore, several authors have hypothesised that the views at the endpoints of the TBL arcs are remembered by insects, as they contain useful information for scene recognition (see the conclusion of [25] on the purpose of TBL flights, and section 5.2.1). The reference database constructed in this case thus models this hypothesis.
This database tests the proposed SRS’s tolerance to such natural scenes with many occlusions and clutter, common in an enclosed forest.
7.2 Experimental procedure
The experimental procedure evaluates the performance of the proposed SRS by computing the recognition accuracy in terms of the positive acceptance, Pacc, and positive rejection, Prej, rates (in %) when positive and negative test scenes from the four databases are presented respectively to the proposed SRS. The entire procedure mimics a typical scene recognition situation (section 6.2): a reference database Dref of Nref reference scenes is constructed, and scene recognition is performed with Ntest test scenes against the database, where Ntest = Npos + Nneg. The entire evaluation procedure is summarised in the following steps:
1. The Scene matrix cells, Ms (section 5.3), of all of the images (both reference and test scenes) in the database are first extracted from the raw input images and saved.
2. The reference image database, Dref, is constructed. Ideally this step should be performed automatically by a separate algorithm that decides which scenes are distinct enough to be used as reference images. This can be achieved in a practical navigation system during the learning phase, when the agent explores its environment for the first time. For this work, one assumes that this has been done, and the Nref reference images are chosen manually to produce Dref.
3. The rest of the Ms are then grouped into two test sets containing Npos positive or Nneg negative scenes, depending on whether or not Dref contains a known positive match for the test scene.
4. The two test sets containing Npos and Nneg scenes are then used to obtain the positive acceptance and positive rejection accuracies (Pacc, Prej) respectively. This is done by presenting a test scene matrix cell, Mtests, to the scene decision module as described in section 6.2 so as to obtain the final decision, Df on
may yield different results because RANSAC is used in matching the salient-SURF keypoints between the test and reference scenes, similar to the method used to extract ordinal depth by TBL (section 6.1.2). The accuracy of the proposed SRS for the same test scene may thus vary over several trials. In order to arrive at a reasonable estimate of the recognition accuracy, the scene decision with the same test scene is repeated Niter times. At each iteration, a correct decision is given one point while an
incorrect decision is given zero points. For positive scenes, a correct decision occurs when the SRS correctly matches the reference scene in the database. For negative scenes, a correct decision is a rejection of the scene, since no reliable match can be found. Correctness is determined by comparing Df with a known
record of the correct responses that the SRS should give if there are no errors. In this work, Niter is fixed at 20 for all of the experiments.
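The repeated-trial scoring described above can be sketched as follows. The decision module is represented by a hypothetical callable `decide` that returns the identifier of the matched reference scene, or `None` for a rejection; `make_noisy_decider` is a made-up stand-in that merely imitates the run-to-run variability attributed to RANSAC. Only the one-point-per-correct-decision rule and Niter = 20 are taken from the text.

```python
import random
from typing import Callable, Optional

N_ITER = 20  # number of repetitions per test scene, as fixed in the text


def score_scene(decide: Callable[[], Optional[str]],
                expected: Optional[str],
                n_iter: int = N_ITER) -> int:
    """Repeat the scene decision n_iter times and count correct decisions.

    expected is the known correct response: a reference scene identifier
    for a positive scene, or None (rejection) for a negative scene.
    """
    return sum(1 for _ in range(n_iter) if decide() == expected)


def make_noisy_decider(correct: Optional[str],
                       wrong: Optional[str],
                       p_correct: float,
                       rng: random.Random) -> Callable[[], Optional[str]]:
    """Hypothetical stand-in for a decision module whose output varies by run."""
    def decide() -> Optional[str]:
        return correct if rng.random() < p_correct else wrong
    return decide


rng = random.Random(0)
positive_scene = make_noisy_decider("ref_07", None, p_correct=0.9, rng=rng)
points = score_scene(positive_scene, expected="ref_07")
print(points)  # an integer between 0 and 20
```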
The recognition accuracies (Pacc, Prej) are given as the percentage of correct
decisions over all the (Npos, Nneg) test scenes, with each scene iterated Niter
times:
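\[
P_i = \frac{\sum B}{N_j \times N_{iter}} \times 100, \qquad (i, j) \in \{(acc, pos),\ (rej, neg)\}, \qquad
B = \begin{cases} 1 & \text{if } D_f \text{ is correct} \\ 0 & \text{if } D_f \text{ is incorrect} \end{cases}
\tag{7.1}
\]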
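A minimal sketch of (7.1), assuming the points earned by each test scene over its Niter iterations have already been collected. The function name and the example scores are hypothetical; the scene count of 25 is Npos of the IND database in Table 7.1.

```python
from typing import Sequence


def recognition_accuracy(points_per_scene: Sequence[int],
                         n_iter: int = 20) -> float:
    """Percentage of correct decisions over all scenes and iterations (7.1).

    points_per_scene holds, for each of the Nj test scenes, the number of
    correct decisions (sum of B) obtained over n_iter iterations.
    """
    n_scenes = len(points_per_scene)
    if n_scenes == 0:
        raise ValueError("at least one test scene is required")
    return 100.0 * sum(points_per_scene) / (n_scenes * n_iter)


# Made-up scores for 25 positive scenes: 23 always correct, 2 correct
# in half of the iterations, giving 480 points out of 25 * 20 = 500.
p_acc = recognition_accuracy([20] * 23 + [10] * 2)
print(p_acc)  # 96.0
```

The same function gives Prej when it is fed the scores of the Nneg negative scenes.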