The inputs of the PWC are the time-frequency contour files (one for each recording) extracted by the automatic whistle and moan detector. To train the classifier, each recording needed to be associated with one visually identified species, which was done by linking recordings to sightings. This selection process was carried out in several stages, summarised in a schematic diagram (Figure 4-2, i to v) and described in the following paragraphs. The main stages were to (i) select the visual detections of interest, (ii) extract the acoustic data of interest, (iii) link visual and acoustic detections, (iv) train the classifier, and (v) test it. This process was carried out separately for the French and Spanish datasets.
4.3.1.a Selection of visual detections
During the survey, seven whistling species were visually detected: bottlenose dolphin (Tursiops truncatus), common dolphin (Delphinus delphis), striped dolphin (Stenella coeruleoalba), killer whale (Orcinus orca), long-finned pilot whale (Globicephala melas), short-finned pilot whale (G. macrorhynchus), and Risso’s dolphin (Grampus griseus).
Part I Classification Chapter 4: Classification of data from a less reliable training dataset
Common and striped dolphins were often observed together in large mixed groups; in this situation the visual observers identified the groups as common and striped (C&S).
The CODA visual survey protocol required observers to give the degree of confidence (High, Medium, Low) of species identification for each sighting (CODA, 2009). For quality assurance purposes, all primary and tracker sightings with high or assumed high (blank in the database) identification confidence were selected (Figure 4-2, i.a).
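The confidence filter described above can be sketched in a few lines of Python. This is only an illustration: the `Sighting` record, its field names and the example values are hypothetical and do not reflect the actual CODA database layout; only the rule itself (primary or tracker sightings, confidence High or blank) comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    # Hypothetical record layout, not the real CODA database schema.
    species: str      # species code, e.g. "COD", "STD" or "C&S"
    platform: str     # "primary" or "tracker"
    confidence: str   # "High", "Medium", "Low", or "" when left blank


def is_selected(s: Sighting) -> bool:
    """Keep primary/tracker sightings whose confidence is High or blank."""
    on_platform = s.platform.strip().lower() in {"primary", "tracker"}
    # A blank confidence field is treated as "assumed high", as in the text.
    high_or_blank = s.confidence.strip().lower() in {"high", ""}
    return on_platform and high_or_blank


sightings = [
    Sighting("COD", "primary", "High"),
    Sighting("STD", "tracker", ""),        # blank: assumed high
    Sighting("C&S", "primary", "Medium"),  # rejected
]
selected = [s for s in sightings if is_selected(s)]
```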
4.3.1.b Link between visual and acoustic detection
4.3.1.b.i Time at hydrophones (Figure 4-2 iii.a)
As mentioned in the description of the data, the visual observers looked for animals ahead of the vessel, whereas the hydrophones, from which the acoustic data were extracted, were towed up to 400 m behind the vessel. Because of this distance between the visual and acoustic platforms, the probability of detecting the same animal visually and acoustically at the same moment was low. The following method was therefore adopted for linking visual and acoustic detections. For each visual detection, the time at which the hydrophones were abeam of the sighting, i.e. at its perpendicular distance (this variable will be called “abeam time”, TAb), was estimated using the formula below:
$T_{Ab} = \frac{R \cos(\hat{A}) + 400}{5.14} + T_V$ (4-1)
where Â was the angle between the heading of the vessel and the bearing to the animal, and R was the radial distance estimated by the visual observer. The distance between the visual team and the hydrophones (400 m) was then added. This total distance was divided by the vessel speed (5.14 metres per second) and the result was added to the time of visual observation (TV).
It was assumed that the animal did not move significantly between the visual detection and the time the hydrophones were abeam of the animals.
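As a worked illustration, the abeam-time calculation could be written as the short Python sketch below. The function and variable names are hypothetical. The sketch assumes that the along-track distance to the animal is the radial distance multiplied by the cosine of the angle, which is consistent with the verbal description of equation (4-1); the 400 m offset and the 5.14 m/s vessel speed are taken from the text.

```python
import math
from datetime import datetime, timedelta

HYDROPHONE_OFFSET_M = 400.0  # distance from visual team to hydrophones (from the text)
VESSEL_SPEED_MS = 5.14       # vessel speed in metres per second (from the text)


def abeam_time(t_visual: datetime, radial_distance_m: float, angle_deg: float) -> datetime:
    """Estimate when the towed hydrophones are abeam of a sighting.

    Assumption: the along-track distance is R * cos(angle), and the animal
    does not move between the sighting and the abeam time.
    """
    along_track_m = radial_distance_m * math.cos(math.radians(angle_deg))
    delay_s = (along_track_m + HYDROPHONE_OFFSET_M) / VESSEL_SPEED_MS
    return t_visual + timedelta(seconds=delay_s)


# Hypothetical sighting: 1000 m dead ahead of the vessel.
t_v = datetime(2007, 7, 1, 12, 0, 0)
t_ab = abeam_time(t_v, radial_distance_m=1000.0, angle_deg=0.0)
delay = (t_ab - t_v).total_seconds()  # (1000 + 400) / 5.14 seconds
```

For a sighting directly abeam of the observers (angle of 90 degrees) the delay reduces to the time needed to cover the 400 m tow distance.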
4.3.1.b.ii Acoustic selection (Figure 4-2 iii.b)
Each visual detection (Primary and Tracker) of a species of interest with a high confidence level of identification was associated with the acoustic recording corresponding to the “abeam time” of the detection. To avoid missing any vocalisations while also avoiding recordings that contained two different species, several conservative rules were applied to the choice of recordings:
• the recordings immediately before and after the “abeam time” corresponding to the visual detection were selected;
• if more than one species was observed within a selected recording, the recording was not selected for the analysis;
• if an adjacent recording contained a visual detection of a different species, that adjacent recording was not selected;
• the last two rules were not applied to the common (COD), striped (STD) and common/striped (C&S) detections. During the visual survey, an initial sighting would be made, followed by consecutive re-sightings during which the confidence of species identification increased. Common and striped dolphins were regularly observed in large mixed groups (C&S, common AND striped), and within these mixed groups smaller, single-species subgroups were observed (common OR striped), so that consecutive re-sightings separated by 5 minutes or less would alternate between groups consisting entirely of common dolphins and groups consisting entirely of striped dolphins. For this reason, if any of these three groups (C&S, COD or STD) were sighted within the same or adjacent recordings, these recordings were selected and identified as CSD detections.
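The selection rules above can be sketched as follows. This is one possible reading of the rules, not the procedure actually used: recordings are assumed to be indexed consecutively, each sighting is assumed to have already been assigned to the recording containing its abeam time, and the species names other than COD, STD, C&S and CSD are placeholder labels. Dropping an empty recording that is adjacent to two different species is an extra assumption not stated in the text.

```python
from collections import defaultdict

# COD, STD and C&S are pooled into CSD, so the exclusion rules never
# separate them (fourth rule in the list above).
POOLED = {"COD": "CSD", "STD": "CSD", "C&S": "CSD"}


def select_recordings(detections, n_recordings):
    """detections: iterable of (recording_index, species_code) at abeam time.

    Returns {recording_index: label} for the recordings kept for training.
    """
    species_in = defaultdict(set)
    for idx, species in detections:
        species_in[idx].add(POOLED.get(species, species))

    labels = {}
    ambiguous = set()

    def assign(i, label):
        # Assumption: a recording claimed by two different species is dropped.
        if labels.get(i, label) != label:
            ambiguous.add(i)
        labels[i] = label

    for idx, present in species_in.items():
        if len(present) != 1:
            continue  # rule 2: more than one species in the recording
        (label,) = present
        assign(idx, label)
        for adj in (idx - 1, idx + 1):  # rule 1: recordings before and after
            if not 0 <= adj < n_recordings:
                continue
            # rule 3: skip an adjacent recording holding a different species
            if species_in.get(adj, set()) <= {label}:
                assign(adj, label)

    return {i: lab for i, lab in labels.items() if i not in ambiguous}


# Hypothetical example with placeholder labels for the non-CSD species.
example = [(2, "COD"), (3, "STD"), (3, "C&S"), (8, "bottlenose"), (9, "risso")]
kept = select_recordings(example, n_recordings=12)
```

In this example, recordings 1 to 4 are labelled CSD, while recordings 8 and 9 each keep only their own species and the outer neighbour (7 and 10 respectively).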
4.3.2. Creation of the classifiers
Four classifiers were trained and tested using the CODA data: two with the French dataset and two with the Spanish dataset. For each dataset, a first classifier (called the 2Sp French classifier and the 3Sp Spanish classifier, respectively) was trained with all the detections from COD, STD and C&S pooled into a single classification group (CSD). This conservative setup allowed for the possible misidentification of these species by the visual teams. Each dataset was then used to train a second classifier in which the COD, STD and C&S detections each formed a separate classification group. These were called the 4Sp French classifier and the 5Sp Spanish classifier.
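The two labelling schemes can be illustrated with a small Python sketch. The helper names are hypothetical, and `"OTHER"` stands in for whichever additional species a dataset contains; only the pooling of COD, STD and C&S into CSD comes from the text.

```python
# Pooling used by the 2Sp/3Sp classifiers (from the text).
POOLED_CSD = {"COD": "CSD", "STD": "CSD", "C&S": "CSD"}


def training_label(species_code: str, pooled: bool) -> str:
    """Return the classification group for one detection."""
    if pooled:
        return POOLED_CSD.get(species_code, species_code)
    return species_code


def class_groups(species_codes, pooled: bool):
    """List the distinct classification groups a classifier would be trained on."""
    return sorted({training_label(code, pooled) for code in species_codes})


# Hypothetical dataset with one species in addition to COD, STD and C&S.
codes = ["COD", "STD", "C&S", "OTHER"]
pooled_groups = class_groups(codes, pooled=True)      # 2 groups
separate_groups = class_groups(codes, pooled=False)   # 4 groups
```

With one additional species the pooled scheme gives two groups and the separated scheme gives four, which matches the 2Sp and 4Sp naming; two additional species would give three and five groups.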
Finally, a last classifier, called the North Atlantic classifier, was trained using the data of Gillespie et al. (2013). This classifier was trained with the same species groups as the 3Sp classifier and with the optimal fragment and section lengths measured by Gillespie et al. (2013). It was built from data recorded in different areas of the North Atlantic Ocean, generally from a small sailing research vessel in the vicinity of groups of dolphins, or while underway with dolphins close to the vessel (Gillespie et al., 2013).
The training followed the method developed in the previous chapter (chapter 3 2.1 p 43). To identify the optimal fragment and section lengths, the quality coefficient (Q) was calculated on the pooled French and Spanish datasets. Fragment lengths ranging from 5 to 15 bins (27 ms to 80 ms) and section lengths ranging from 10 to 30 fragments were tested.
Each classifier was represented by the confusion matrix obtained when 80% of the training data were used to train it. To estimate the precision of the classification probabilities that would be obtained if 100% of the training data were used, each classifier was trained with different proportions of the training data, as described in chapter 2. However, the final version of the classifier, which was used to classify new data, was trained using 100% of the training data.