2. Formalize mathematically the task of determining where in input space to sample the next data point. We express the above mentioned optimality criterion
5.2 Building Hierarchical Architectures from Simple De- tectors
5.2.3 Finding Sets of Sub-Patterns from an Articulate Target
The discussion in Section 5.2.2 assumes a reasonable scheme for nding and grouping to- gether sub-patterns in an image that belong to the same articulate target object. Unfor- tunately, we believe existing techniques for performing such a task in general are still very much ad hoc and unreliable at best. We conclude this thesis by referring to two current areas of research that may lead to feasible and robust search paradigms for image sub-
patterns. Our discussion here will, however, only be speculative and brief, since current research trends in both these areas still appear open and highly exploratory in nature.
Grouping
Grouping [57] [49] is a process that identies sets of features in a cluttered image likely to have arisen from a single object. It serves mainly as a pre-processing stage that speeds up object recognition by reducing the number of dierent image feature combinations the recognition stage has to consider while testing for the target. Lowe [57] rst demonstrated the idea on an early computational recognition system that makes explicit use of group- ing. Jacobs [49] later extended the idea to a geometric model-based object recognition domain. Typical grouping schemes operate on simple low-level image features that their object recognition systems use, such as intensity edges, junctions and corners. Often, these schemes rely heavily on prior assumptions about feature congurations that make them likely target candidates. Such assumptions may be based on domain specic knowledge, like common features that tend to co-exist in the target, or \general" observations like certain image feature congurations being more \salient" in real scenes.
In the hierarchical pattern detection scheme we are considering here, one can treat the sub-pattern components of an articulate target class as highly specic and sophisticated model features, similar in spirit to the simpler features used by current object recognition and grouping systems. We have argued earlier that one can reliably detect these sophisti- cated features using our proposed object and pattern detection approach from Chapter 2.
We believe one can also develop similar grouping based techniques to identify salient sets of these sophisticated image features that are likely parts from the same target.
Knowledge-based Methods for Directing Search
Over the past twenty to thirty years, knowledge-based systems or expert systems have been used on a wide range of decision problems, including problems in medical diagnosis [20], molecular structure prediction [56] and software maintenance [81] [82]. In computer vision, researchers have also used rule-based expert systems with rigid object models to direct search for additional image features, based on structures that have already been detected in the scene [18] [17]. The idea works as follows: when the recognition module detects and identies an image feature as some part of the target, the knowledge-based search module
uses the object model to predict new image locations where the recognizer may nd other target features. When a sucient number of target features have been found and \grouped"
together in this fashion, the recognizer declares the target present in the scene. We believe one can use the same general techniques developed in this area to also constrain search for the sub-pattern components of an articulate target class.
Like grouping, knowledge-based systems also rely heavily on domain specic knowledge, such as common geometric congurations between target components, to predict new image locations for nding additional target features. Recently, researchers have developed a graphical representation for learning structured domain knowledge with statistical data, known as Bayesian Networks [46] [68]. Bayesian Nets are especially suitable for learning and encoding uncertain domain specic knowledge in expert systems. One can construct a Bayesian Net that learns the variable geometric structure of an articulate target class as follows: First, we encode existing information about the articulate target class as a set of graphical nodes and directed arcs. Each graphical node could represent the distribution of some relative location or orientation variable between a pair of sub-pattern components. The directed arcs specify dependencies between the relative position and orientation variables.
Next, we use a database of real target examples to update the Bayesian Net, which includes the probability distribution of each node variable and values along each directed arc. Hence, even if our initial domain knowledge about the target class structure may be unreliable or incomplete, one can still improve it through the statistical learning process.
In summary, learning with Bayesian Nets combines the advantages of exploiting existing domain knowledge as in a classical knowledge-based system, and the ability to rene existing knowledge with statistical data as in a traditional connectionist learning-based approach.
Research in this area is new and still evolving, and a detailed discussion on applying Bayesian Nets to hierarchical pattern detection is beyond the scope of this thesis. The interested reader should refer to the following papers and others for a more comprehensive guide to the literature [46] [68] [21] [44].
Appendix A
The Active Learning Procedure
This appendix derives the active example selection procedures for polynomial approximators and Gaussian radial basis functions presented in Chapter 4. Specically, we show the steps leading to Equations 4.11 and 4.17, i.e., the total output uncertainty cost functions
U(^gn+1jDn;xn+1~ ) for polynomial approximators and Gaussian RBFs respectively.