• No se han encontrado resultados

Capítulo I: Revisión bibliográfica

1.5. Procesos de malteado

For Hand Posture Recognition, the feature for posture classification is the data representation of the hand shapes. There are two main types of features for HPR, contour and texture features. Contour features are descriptors that represent the exterior contour of the target hand region. The Fourier descriptor is one of the most widely used contour features. The basic idea is making the contour an one dimensional vector, similar to the chain code. Then Fourier Transform is applied to the vector. The Fourier coefficients can then be treated as the feature of the contour [120, 121]. Moments are descriptors that represent various contour properties, such as the sum of horizontal and vertical directed variance. Other contour features include Wavelet descriptors [122], shape signatures such as average distance between pixels on the contour, gradient shape features [123] and the centroid of the contour, etc.

Texture features are essentially various representations on the gradient pat- terns. Histogram of Gradient (HoG) [124] as one of the popular texture features, is a histogram presentation of the local gradients. Scale-Invariant Feature Transform (SIFT) [125] is another popular texture feature. SIFT contains selected key points with coordinates and gradient descriptors. The key points are local extreme values in the Laplace of Gaussian (LoG) pyramid. The descriptors summarise the orien- tation and intensity of local gradients. The two important advantages of SIFT are its invariance against rotation and scale. Hence, the SIFT key points represent the edges and ridge-like textures regardless of the orientation and scale of the target

object. Moreover, it can also tolerant certain level of view point changing. Speeded- Up Robust Features (SURF) was originally presented by Herbert Bay et al [126] based on the idea of SIFT. It also inherited in-plane rotation and scale invariance, which makes it desirable for Hand Posture Recognition. However, with high preci- sion of texture matching, the 64 dimensional descriptor of SURF requires intense computations for both key point extraction and matching. Calonder et al. [127] proved that this problem can be fixed by directly building a short binary descriptor with independent bits. That is called the Binary Robust Independent Elementary Feature (BRIEF) [127]. This binary descriptor uses Hamming distance as matching criteria, instead of Euclidean distance between the descriptors. That can largely fasten the texture matching process. However, the descriptors are not rotation and scale invariant. Rublee et al. [128] proposed an improved binary descriptor which is rotation invariant, called Oriented Fast and Rotated BRIEF (ORB). There is an additional advantage of ORB which is its robustness to noises. Leutenegger et al. [129] also introduced a binary descriptor that is both rotation and scale invariant. It is called Binary Robust Invariant Scalable Keypoints (BRISK). The main charac- teristic of BRISK is that, in each scale pyramid octave, a corner detection method called Features from Accelerated Segment Test (FAST) [130], is used to detect the key points, instead of simply locating local extreme values as in SURF and SIFT. That makes the key point selection more efficient in BRISK, and ensures that the amount of key points in BRISK descriptor is lower than SIFT and SURF. How- ever, the robustness of BRISK descriptor can be affected by out-of-plane rotation or rapid texture changes. That is a vital drawback for applications such as HPR. Fast Retina Keypoint (FREAK) proposed by Alahi et al. [131] is the latest devel- opment on gradient based key point texture features. It simulates the principle of human retina visualisation. A cascade of binary descriptors is used instead of a sin- gle descriptor. The cascade structure is constructed by comparing image intensities over a sampling pattern based on the human retinal ganglion cells distribution, in

which the method picks more key points on the central area. Although Alahi et al. [131] reported better performance over SURF, its robustness against illumination and view point changes in HPR applications still remains untested. For the tracking method in Chap 4, the only texture feature used is SURF. The reason is two-fold. Firstly, using short binary descriptors including BRISK and ORB sacrifices certain invariance properties and accuracies [127, 128]. Secondly, SURF has certain level of tolerance for view-point variance, which BRISK and ORB do not have.

For HGR applications, the aforementioned gradient based texture key point descriptors can also be used for texture matching based tracking. In Chapter 4, a novel texture matching tracker will be introduced. Instead of shape descriptors, descriptors of dynamic hand trajectories are the features for gesture classification in HGR problems. Since for HGR in uncontrolled environments, the method should not require certain hand shape to be presented by the user, the spatial features of hand shapes are not considered in the HGR methods introduced in Chap 4-6. After the hand tracking method has located the position of the hand candidate in the frames, temporal features are extracted to represent the trajectories. Unlike contour and texture spatial features, there are only a few commonly used trajectory features. They can be categorised into two types, local and global features. Local trajectory features include hand speed, location and movement direction [8, 3]. The hand velocity, orientation of movement and coordinates displacements of the hand between adjacent frames can be used to describe the elementary trajectory segments. Global features are shape descriptors that are extracted from the complete hand trajectories. All contour and texture features mentioned before can potentially be used as global trajectory features. Because the same as hand postures, trajectories can be seen as stationary images.

Documento similar