Since commercial systems can be expensive to acquire (and specific information about their underlying algorithms and techniques is often withheld for commercial reasons), research has benefited more from freely available systems provided by academia. Some of the most influential systems, presented below, are Photobook, PicHunter, MARS and Blobworld.
Photobook
Photobook [336] was developed by the MIT Media Laboratory11 and represents one of the first academic prototypes for VIR. Its implementation includes retrieval mechanisms for two-dimensional shapes, texture images and face recognition; the technology for the latter was also used by Viisage Technology12 in their FaceID package, which was employed in several US police departments.
10 http://www.virage.com/products/vir-irw.html
11 http://www.media.mit.edu/
Although Photobook also supports colour features, its main focus is on texture and shape. The texture features are computed as the sum of the three orthogonal Wold components: periodicity, directionality and randomness. The shape description is based on the extraction of the boundary, which is then described by corners and curvature points. Queries are created by selecting still images from a grid (QBE) or by entering an annotation filter (TBIR). Shape similarity is primarily calculated using the deformation effort; other similarity measures include Euclidean, Mahalanobis, vector space angle, histogram, Fourier peak and wavelet tree distances.
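Three of the similarity measures listed above (Euclidean, Mahalanobis and vector space angle) can be illustrated on plain feature vectors. The following is a minimal sketch under stated assumptions, not Photobook's implementation: the function names are hypothetical, feature vectors are plain lists of floats, and the inverse covariance matrix for the Mahalanobis distance is assumed to be supplied by the caller.

```python
import math


def euclidean(x, y):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance; inv_cov is the inverse covariance matrix
    (a list of rows), assumed to be precomputed from the feature data."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(0.0, quad))  # guard against tiny negative round-off


def vector_angle(x, y):
    """Angle in radians between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] so that round-off cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```

With an identity matrix as `inv_cov`, the Mahalanobis distance reduces to the Euclidean distance; a non-identity matrix rescales each feature by its variance and accounts for correlations between features.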
PicHunter
PicHunter [70, 71, 72, 73] is another example of a freely available image retrieval system and was developed by the NEC Research Institute (which is now a part of NEC Laboratories America13 after a merger in 2002). Its functionality is based on the assumption that a user is looking for one specific image in the database; it therefore represents one of the first applications of target testing searches.
The content descriptors are mainly based on hidden alphanumeric representations as well as colour features in the HSV and RGB spaces, which are represented as colour histograms and correlograms (both HSV) and as CCV (RGB). Queries are specified using QBE, and the similarity between individual features (i.e. colour vectors) is calculated using the L1 distance.
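The L1 comparison of colour vectors described above can be sketched as follows. This is an illustrative example only, not PicHunter's code: the function names and the dictionary-based database are hypothetical, and histograms are assumed to be normalised lists of equal length.

```python
def l1_distance(h1, h2):
    """L1 (city-block) distance between two equal-length colour vectors."""
    if len(h1) != len(h2):
        raise ValueError("vectors must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_l1(query, database):
    """Return image names ordered from most to least similar to the query.

    database is a hypothetical mapping from image name to colour vector.
    """
    return sorted(database, key=lambda name: l1_distance(query, database[name]))
```

For two normalised histograms the L1 distance lies between 0 (identical) and 2 (no overlapping bins).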
MARS
MARS [326, 370] stands for Multimedia Analysis and Retrieval System(s) and describes a series of systems first developed by the Department of Computer Science at the University of Illinois at Urbana-Champaign14 and further improved at the Department of Information and Computer Science at the University of California, Irvine15.
12 http://www.viisage.com/
13 http://www.nec-labs.com/
14 http://www-db.ics.uci.edu/pages/research/mars.shtml
In MARS, colour is represented in a two-dimensional histogram over the HS coordinates of the HSV colour space (the V component is neglected because it can be influenced by lighting conditions); the texture features coarseness and directionality are also stored in histograms, while the contrast of the texture is stored in a scalar; the boundary of the shape is described using Fourier descriptors (FD).
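A two-dimensional histogram over the H and S coordinates can be sketched as follows. This is a minimal illustration, not the MARS implementation: the bin counts and the function name are assumptions, and pixels are assumed to be 8-bit RGB triples that are converted with the standard-library `colorsys` module.

```python
import colorsys


def hs_histogram(pixels, h_bins=8, s_bins=4):
    """Normalised 2-D histogram over hue and saturation; V is discarded.

    pixels: non-empty iterable of (r, g, b) tuples with values in 0..255.
    The bin counts are illustrative defaults, not values from the source.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("at least one pixel is required")
    hist = [[0.0] * s_bins for _ in range(h_bins)]
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h and s lie in [0, 1]; clamp so that a value of 1.0 falls in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        hist[hi][si] += 1.0
    total = float(len(pixels))
    return [[count / total for count in row] for row in hist]
```

Because V is dropped, a bright red pixel and a darker red pixel with the same hue and saturation fall into the same bin, which reflects the motivation given above for ignoring the lighting-dependent component.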
Queries allow Boolean operators and can comprise any combination of the low-level features colour, texture and shape (which can be chosen from a palette) as well as textual descriptions in the form of keywords. Colour histograms are compared using histogram intersection, texture similarity is computed as the weighted sum of the Euclidean (L2) distances, and shape similarity as the weighted sum of the standard deviations of the magnitude and phase angles of the FD coefficients.
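Histogram intersection, the colour measure named above, can be sketched in a few lines. The example assumes normalised histograms that have been flattened into lists of equal length; the function name is hypothetical and the code is not taken from MARS.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalised histograms: the sum of bin-wise minima.

    Returns 1.0 for identical normalised histograms and 0.0 when no bin
    is populated in both.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Unlike a distance, histogram intersection is a similarity: larger values indicate a closer match.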
Blobworld
Blobworld [41] was developed by the Computer Science Division of the University of California at Berkeley16 and was one of the first retrieval systems to use image regions for the query process. Several updated versions with significant changes were published over time [42, 43] until the research project finally ended in 2004.
Figure 2.5: Blobworld: a real and segmented image of a wolf.
Before the feature extraction process is started, the image is first segmented into regions. The first versions [40, 41, 43] used 6 features for segmentation, a colour histogram based on the HSV space to store the colours and ellipses to symbolise the regions, whereas later versions, such as the one described in [42], used 8 features for segmentation, the CIE L*a*b* space and the real boundaries of the regions, as illustrated in Figure 2.5. For each region, mean contrast and anisotropy are used as texture features, while approximate area, eccentricity and orientation quantify the corresponding shapes.
15 http://www.ics.uci.edu/
The query interface allows the user to select a category (to limit the search space) and the regions (blobs) of an initial image. The importance of the selected blob, as well as the importance of the colour, texture, location and shape within that blob, can be indicated and forms the basis for retrieval. The colour histograms are matched using QFD, and the Euclidean (L2) distance quantifies the similarity of the texture descriptors and of the centroids. R* trees are used for indexing.
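Reading QFD as the quadratic form distance, the colour matching step can be sketched as below. This is an illustration under assumptions, not Blobworld's code: the function name is hypothetical, and the bin-similarity matrix is supplied by the caller and assumed to be symmetric and positive semi-definite.

```python
import math


def quadratic_form_distance(h, g, sim):
    """Quadratic form distance sqrt((h - g)^T A (h - g)).

    h, g: colour histograms of equal length.
    sim:  matrix A as a list of rows, where A[i][j] expresses how
          perceptually similar bins i and j are (1.0 on the diagonal).
    """
    d = [a - b for a, b in zip(h, g)]
    n = len(d)
    quad = sum(d[i] * sim[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(0.0, quad))  # guard against tiny negative round-off


# Hypothetical 3-bin example: bins 0 and 1 hold similar colours, bin 2 does not.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
near = quadratic_form_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], A)
far = quadratic_form_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], A)
```

In this example `near` is smaller than `far`, although both pairs of histograms have no bin in common: the cross-bin terms in `A` let perceptually similar colours count as partial matches, which a bin-by-bin measure such as the L1 or L2 distance cannot do.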
Other Academic Systems
Other academic systems that are referred to in this thesis include the following:
PicToSeek [129, 130] defines colour and shape invariants as features in content-based queries to guarantee invariance to camera viewpoint, illumination conditions and the geometry of the objects.
NeTra [257] is another example of a system that uses image segmentation. First, an image is divided into regions of homogeneous colour, and then colour, texture, shape and spatial location are extracted from those regions.
VisualSEEk [410, 411] employs a similar approach and also decomposes each image into regions of dominant colours. Again, feature properties and spatial relations are retained for each region.
ASSERT is described in [398] and is specifically targeted towards retrieval of high-resolution computed tomography images of the lung.