
The proposed algorithm was used to locate and learn beats for 1-minute excerpts of the 100 songs in the General Dataset. The general-purpose beat trackers described in Section 5.1 (Ellis, Dixon, and Davies) were run on the same data. This allowed the system to be evaluated on longer excerpts of music, testing the robustness of the algorithm, while the range of genres makes it possible to examine which types of music the proposed algorithm tracks well and which it does not.

[Figure 7.10 plot: "Accuracy on General Dataset (1m Clips), Room Noise". Axes: Accuracy (Information Gain) and Number of Excerpts. Legend: Proposed, Ellis, Dixon, Davies.]

Figure 7.10: Results of the proposed and the comparison algorithms trained and run on 60-second excerpts from the General Dataset recorded in the presence of room noise, in Information Gain.

The overall results of this experiment are shown in Figures 7.9 and 7.10. Although performance is lower than on the Dance Dataset, the system continues to outperform the off-the-shelf algorithms. Some of this music is very challenging to track, particularly classical and jazz, which lack many of the elements that made the Dance Dataset more tractable, such as consistent, steady, and prominent beats. Despite these difficulties, the proposed algorithm remains superior to the others on average. This demonstrates the overall utility of the system: it tracks challenging music in noisy environments more accurately than the comparison algorithms.

Additionally, the results broken down by genre are shown in Tables 7.5 and 7.6. These tables make it clear that the proposed algorithm tracks some genres better than others. Hip-Hop and Soul do particularly well, with F-Measure scores of 84.5 and 72.6 and Information Gain scores of 81.6 and 67.4, respectively. The music selected for these genres tends to include prominent, steady beats without too many other strong sound sources to detract from them. For much of the selected music of these genres, the beats are the loudest and most prominent parts, which allows the proposed system to more easily separate the beats from everything else and successfully perform beat tracking. The system still does relatively well for other genres that tend to have steady beats, such as Dance, Metal, Pop, and Rock. It performs worst on Classical and Jazz, which often have irregular beats, and on Country/Western and Folk, which tend to have softer beats than the other genres.

Table 7.5: Average results of the proposed and the comparison algorithms trained and run on 60-second excerpts from the General Dataset recorded in the presence of room noise. Results are in F-Measure.

Genre            Proposed  Ellis  Dixon  Davies
Classical            44.1   37.8   34.0    39.3
Country/Western      49.7   26.1   27.9    41.7
Dance                64.0   62.6   58.6    72.8
Folk                 40.4   34.2   47.0    57.3
Hip-Hop              84.5   40.8   32.0    56.7
Jazz                 37.1   34.5   30.6    45.2
Metal                65.5   43.8   46.4    48.9
Pop                  61.8   41.3   40.5    60.7
Rock                 66.5   46.8   53.2    59.4
Soul                 72.6   46.4   40.0    52.7
Total                58.6   41.4   41.0    53.5
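The F-Measure values in Table 7.5 summarize how well estimated beat times line up with annotated ones. As a rough illustration only, and not the evaluation code used for these experiments, the sketch below computes a beat-tracking F-Measure on a 0 to 100 scale by matching each annotated beat to at most one estimated beat within a tolerance window. The function name `beat_f_measure`, the greedy matching strategy, and the 70 ms tolerance are assumptions chosen for the example; the text here does not specify them.

```python
def beat_f_measure(reference, estimated, tolerance=0.07):
    """Illustrative beat-tracking F-Measure on a 0-100 scale.

    reference, estimated: beat times in seconds.
    tolerance: assumed matching window in seconds (hypothetical default).
    """
    reference = sorted(reference)
    estimated = sorted(estimated)
    matched = 0
    i = j = 0
    # Walk both sorted lists, pairing each reference beat with at most
    # one estimated beat that lies within the tolerance window.
    while i < len(reference) and j < len(estimated):
        diff = estimated[j] - reference[i]
        if abs(diff) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # estimate is too early to match this reference beat
        else:
            i += 1  # reference beat has no estimate close enough
    if matched == 0:
        return 0.0
    precision = matched / len(estimated)
    recall = matched / len(reference)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical beat times: three of the four estimates fall within 70 ms.
annotated = [0.5, 1.0, 1.5, 2.0]
detected = [0.52, 1.0, 1.3, 2.05]
score = beat_f_measure(annotated, detected)
```

With these made-up inputs, one estimate misses its beat by 200 ms, so precision and recall are both 3/4.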

It is worth noting that the proposed system does not win for every genre. In terms of F-Measure, for example, the Davies tracker outperforms it for the Dance genre. A likely explanation is that the feature used in the Davies system, the complex spectral difference, was especially strongly affected by the beats in those songs, which tend to have a particularly bombastic quality. This feature remained an accurate indicator of the presence of beats even after noise was added, because the beats were still loud and the feature is so heavily influenced by those particular beats. However, the proposed system does win for a majority of the genres and has the highest average accuracy overall. The proposed system also wins more consistently in terms of Information Gain, which tends to be more meaningful for lower scores (while F-Measure is more meaningful at higher scores). Since the scores achieved by the system are generally substantially less than 100, Information Gain is likely the more meaningful metric in this situation, and by that standard the proposed system achieves superior results for eight of the ten genres, with Folk and Jazz the exceptions.

Table 7.6: Average results of the proposed and the comparison algorithms trained and run on 60-second excerpts from the General Dataset recorded in the presence of room noise. Results are in Information Gain.

Genre            Proposed  Ellis  Dixon  Davies
Classical            23.2   13.7   13.9     3.1
Country/Western      34.1    9.3    8.3    14.6
Dance                54.5   47.1   44.6    42.3
Folk                 14.2    8.5   23.9    21.4
Hip-Hop              81.6   44.0   28.8    29.7
Jazz                 14.0   18.3    6.9    18.0
Metal                54.1   35.3   24.9    22.0
Pop                  53.1   31.7   33.6    32.2
Rock                 54.1   33.8   30.7    32.0
Soul                 67.4   26.2   22.0    21.6
Total                45.0   26.8   23.7    23.7
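The per-genre comparison above can be checked directly against the numbers in Tables 7.5 and 7.6. The short sketch below is one way to do that tally: it copies the table values and counts, for each metric, how many genres each tracker wins. The names `TRACKERS`, `F_MEASURE`, `INFO_GAIN`, and `count_wins` are hypothetical helpers introduced for this illustration, not part of the original system.

```python
# Column order used in Tables 7.5 and 7.6.
TRACKERS = ("Proposed", "Ellis", "Dixon", "Davies")

# Per-genre scores copied from Table 7.5 (F-Measure).
F_MEASURE = {
    "Classical": (44.1, 37.8, 34.0, 39.3),
    "Country/Western": (49.7, 26.1, 27.9, 41.7),
    "Dance": (64.0, 62.6, 58.6, 72.8),
    "Folk": (40.4, 34.2, 47.0, 57.3),
    "Hip-Hop": (84.5, 40.8, 32.0, 56.7),
    "Jazz": (37.1, 34.5, 30.6, 45.2),
    "Metal": (65.5, 43.8, 46.4, 48.9),
    "Pop": (61.8, 41.3, 40.5, 60.7),
    "Rock": (66.5, 46.8, 53.2, 59.4),
    "Soul": (72.6, 46.4, 40.0, 52.7),
}

# Per-genre scores copied from Table 7.6 (Information Gain).
INFO_GAIN = {
    "Classical": (23.2, 13.7, 13.9, 3.1),
    "Country/Western": (34.1, 9.3, 8.3, 14.6),
    "Dance": (54.5, 47.1, 44.6, 42.3),
    "Folk": (14.2, 8.5, 23.9, 21.4),
    "Hip-Hop": (81.6, 44.0, 28.8, 29.7),
    "Jazz": (14.0, 18.3, 6.9, 18.0),
    "Metal": (54.1, 35.3, 24.9, 22.0),
    "Pop": (53.1, 31.7, 33.6, 32.2),
    "Rock": (54.1, 33.8, 30.7, 32.0),
    "Soul": (67.4, 26.2, 22.0, 21.6),
}


def count_wins(table):
    """Count how many genres each tracker has the highest score in."""
    wins = {name: 0 for name in TRACKERS}
    for scores in table.values():
        best = max(range(len(TRACKERS)), key=lambda k: scores[k])
        wins[TRACKERS[best]] += 1
    return wins


f_wins = count_wins(F_MEASURE)
ig_wins = count_wins(INFO_GAIN)
print("F-Measure wins:", f_wins)
print("Information Gain wins:", ig_wins)
```

On the table values, the proposed system has the top F-Measure in seven of the ten genres (Davies leads in Dance, Folk, and Jazz) and the top Information Gain in eight (Dixon leads in Folk, Ellis in Jazz).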

The results of this experiment show that the proposed system produces state-of-the-art results on an audio corpus drawn from a wide variety of genres. It is thus useful in cases where the exact genre of the music that will be played is not known, as might be the case in a bar or club where the music played through the speakers can vary dramatically from song to song. This also indicates that the system is robust enough to learn and locate different types of beats, instead of being restricted to finding only beats that heavily influence a particular hand-crafted feature. In short, these results demonstrate that the system is robust both to noise and to the variations that occur between songs from several different genres. Furthermore, for genres such as Hip-Hop and Soul, these results demonstrate that the system is highly accurate and can already track such music well even in these noisy conditions.

[Figure 7.11 plot: "Accuracy on General Dataset (1m Clips), Bar Noise". Axes: Accuracy (F-Measure) and Number of Excerpts. Legend: Proposed, Ellis, Dixon, Davies.]

Figure 7.11: Results of the proposed and the comparison algorithms trained and run on 60-second excerpts from the General Dataset recorded in the presence of bar noise, in F-Measure.
