Whilst Cullinan et al. (2015) and Matzner et al. (2015), mentioned above, have attempted to use motion features to differentiate between small numbers of species, all other existing works concerned with the automated classification of birds use appearance features derived from a single image of an individual bird. These approaches can be further subdivided into those that make use of information about the physical structure of individual birds (referred to as part-based) and those which do not. Non-part-based methods use colour and shape features of the entire bird, without considering the relative positions of its parts, to determine its species (Marini et al., 2013; Wah et al., 2011a,b). For example, the work by Marini et al. uses colour features extracted from the bird with a Support Vector Machine (SVM) classifier. Again, most of these methods have only been used to differentiate between relatively small numbers of species, and they struggle to maintain performance as the number of species increases. Marini et al. showed that when using colour features alone on the Caltech-UCSD Birds-200-2011 dataset, accuracy falls from approximately 85% when selecting between 2 species to 20% when differentiating between 17 species, and to just 7% with 200 species. SIFT and colour features (Wah et al., 2011b) work well for the classification of bird species but, again, were only tested on a small number of classes. The work presented in Welinder et al. (2010) classified bird species using size and colour histograms with a bin size of 10, but the classification rate was also low, which was attributed to the fine-grained nature of the dataset.
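To make the idea of whole-bird colour features concrete, the following is a minimal sketch rather than a reproduction of any cited method. It assumes that "a bin size of 10" means 10 bins per RGB channel, and it substitutes a nearest-neighbour rule based on histogram intersection for the SVM used by Marini et al., so that the example needs no external libraries. The function names and the toy pixel data are hypothetical.

```python
def colour_histogram(pixels, bins=10):
    """Concatenated per-channel histogram of (r, g, b) pixels with values 0..255.

    Assumption: `bins` histogram bins per channel. Each channel block is
    normalised to sum to 1, so the feature does not depend on image size.
    """
    if not pixels:
        raise ValueError("at least one pixel is required")
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            index = min(value * bins // 256, bins - 1)
            hist[channel * bins + index] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(a, b):
    """Similarity in [0, 1]; 1 means the two histograms are identical."""
    return sum(min(x, y) for x, y in zip(a, b)) / 3.0


def classify(query_pixels, labelled_examples, bins=10):
    """Return the label of the most similar example (nearest neighbour).

    This is a stand-in for a trained SVM, used here only for illustration.
    """
    query = colour_histogram(query_pixels, bins)
    best_label, _ = max(
        ((label, histogram_intersection(query, colour_histogram(px, bins)))
         for label, px in labelled_examples),
        key=lambda item: item[1],
    )
    return best_label


# Toy data: two hypothetical "species" that differ only in dominant colour.
examples = [
    ("reddish", [(200, 30, 30)] * 4),
    ("bluish", [(20, 40, 210)] * 4),
]
query = [(190, 35, 25), (210, 20, 40)]
print(classify(query, examples))
```

Because the feature discards all spatial layout, two species with similar overall colouring produce nearly identical histograms, which is consistent with the loss of accuracy reported above as the number of species grows.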

Part-based methods use colour and/or shape features which are associated with specific parts of the bird's body (Wah et al., 2011a; Duan et al., 2012; Berg and Belhumeur, 2013; Huang et al., 2013; Branson et al., 2014; Wah et al., 2011b; Berg et al., 2014). This general approach can help differentiate between species with high visual correlation. For example, Collared and Turtle Doves (Streptopelia decaocto and Streptopelia turtur, respectively) are species whose distinguishing features lie around the neck and the eyes: Turtle Doves exhibit a black and white striped neck patch and a bold red eye ring, neither of which is visible on Collared Doves. For such species, these methods have achieved better classification accuracy, but almost all require some manual input (so are not fully automated) and also require good-quality images in which body parts are present and identifiable. Typically, the manual inputs are annotations which identify the bird's parts prior to feature extraction. This is time-consuming and labour-intensive, placing practical limits on the amount of data which can be processed. For example, Branson et al. proposed a human-in-the-loop approach, which is predicated on the idea that a human and a computerised system working together classify more efficiently than either alone. The human operator annotates the parts and answers multiple-choice questions, and the algorithm uses this information to assist with the classification; they were able to achieve a true positive rate of 93%. Berg et al. developed an online application called Birdsnap, which can be used to classify various US bird species; this also requires some manual input to annotate the bird's parts prior to segmentation and classification.

Krause et al. (2015) and Gavves et al. (2013, 2015) developed annotation-free part-based methods, which automatically identify body parts using co-segmentation and alignment. Their results compared favourably to other state-of-the-art methods: Krause et al. used figure/ground segmentation to determine pose and localise the bird's parts, while Gavves et al. fitted an ellipse to the segmented object to align it and then determine sub-parts. Results based on the CUB-2011 dataset show true positive rates of 62%, 82% and 67% for the methods in Gavves et al. (2013), Krause et al. (2015) and Gavves et al. (2015), respectively. The drawback of these methods is that they make use of the GrabCut segmentation method (Rother et al., 2004b) and so still require some manual input: decomposing soft-bodied objects with arbitrary poses remains a challenging problem in computer vision. Another annotation-free method was proposed by Zhang et al. (2015), which detects parts using CNN feature representations. The main difference between this and Krause et al. is the method by which the parts are selected. Whereas Krause et al. align the co-segmented objects before labelling parts, Zhang et al. (2015) use CNN feature representations to detect parts automatically. The correct classification rate reported by Zhang et al. was 75% on the CUB-2011 dataset, which was not a significant improvement on the other methods reported.
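As an illustration of the ellipse-based alignment step, the sketch below fits an ellipse to a binary segmentation mask using second-order image moments. This is a standard moment-based fit and is an assumption made for illustration only; the exact fitting procedure used by Gavves et al. is not described here. The function name and the toy mask are hypothetical.

```python
import math


def fit_ellipse(mask):
    """Fit an ellipse to the foreground of a binary mask (list of rows of 0/1).

    Returns (cx, cy, semi_major, semi_minor, angle). The angle is in radians,
    measured from the x-axis towards the y-axis (image coordinates, y down).
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Second-order central moments (the covariance of the pixel coordinates).
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n

    # Orientation of the principal axis and eigenvalues of the covariance.
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    spread = math.sqrt((mu20 - mu02) ** 2 + 4.0 * mu11 ** 2)
    lam_major = (mu20 + mu02 + spread) / 2.0
    lam_minor = max((mu20 + mu02 - spread) / 2.0, 0.0)

    # A uniformly filled ellipse with semi-axis a has variance a^2 / 4 along it.
    return cx, cy, 2.0 * math.sqrt(lam_major), 2.0 * math.sqrt(lam_minor), angle


# Toy mask: a horizontal bar, 9 pixels wide and 3 pixels high.
bar = [[1] * 9 for _ in range(3)]
cx, cy, semi_major, semi_minor, angle = fit_ellipse(bar)
print(cx, cy, angle)
```

Rotating each segmented object by the negative of the fitted angle brings the principal axes into a common orientation, after which sub-parts can be defined relative to the ellipse. The fit is only as good as the mask it is given, which is why the dependence on segmentation quality noted above matters.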

The objective of this research is to develop a system which is capable of identifying birds in flight and which can be deployed in the field. Almost all of the part-based methods mentioned above are dependent on manual annotations and, whilst many have been successful, this constraint makes them inherently unsuitable for wide-scale deployment in the field. Furthermore, they all require high-quality images. The datasets used have consisted of highly detailed, high-resolution images which exceed the quality that would be captured automatically from flying birds in real-world settings. In addition, it is asserted that birds further from the camera would be less easily classified using appearance features alone (for example, colour features are attenuated). This motivates our approach of combining colour and motion features. Of the other approaches mentioned, it was considered that Marini et al. is the most appropriate for this problem domain, being fully automated, non-part-based (so more likely to be robust to reduced image quality) and reporting relatively good results. This method has therefore been used as a benchmark for the work presented in this thesis.