
The big change in TRECVID 2003 from previous years was the domain of the video used for evaluation: 120 hours of television news content from CNN and ABC from 1998 and 13 hours of C-SPAN video were used for the development and evaluation corpus [Browne et al, 2003]. One of the reasons that television news content was chosen is its highly structured nature, which makes story/scene detection feasible.

Apart from shot boundary detection, TRECVID 2003 featured a news story (scene) detection task and introduced a number of new segmentation features (see Table 1-5). A number of domain-specific features, such as news subject monologue and weather news, were included in 2003, whereas the TREC 2002 features were not domain-specific.

Table 1-5: TREC 2003 Features

Num  Feature              Num  Feature
1    Outdoors             10   Aircraft
2    News Subject Face    11   News Subject Monologue
3    People               12   Non-studio Setting
4    Building             13   Sporting Event
5    Road                 14   Weather News
6    Vegetation           15   Zoom in
7    Animal               16   Physical violence
8    Female Speech        17   Person X
9    Car / Truck / Bus

The retrieval results showed that these additional features did boost the overall system performance for a few of the participating TRECVID groups [TRECVID 2003], despite the fact that average feature performance was similar to the previous year (Figure 1-8 & Figure 1-7). The domain-specific nature of some of the features and the quality of the evaluation content might be factors in the performance benefit at the time of writing. None of the features from the previous year were specific to the domain, whereas in 2003 there were five, namely “sporting event”, “weather news”, “non-studio setting”, “news subject monologue” and “news subject face”.

Table 1-6: TRECVID 2003 Query Topics

1    Aerial views *                 13   Basketball Matches
2    Baseball Matches               14   Yasser Arafat
3    Aircraft Taking off            15   Helicopters in the Air
4    Tomb of the Unknown Soldier    16   Missile Launch
5    Mercedes Benz logo             17   Tanks
6    Person diving                  18   Train coming towards camera
7    Fire                           19   Snowy Mountain top
8    Osama Bin Laden                20   Traffic
9    Egyptian Sphinx                21   People on the Street
10   Congressman Mark Souder *      22   Actor Morgan Freeman *
11   Cup of coffee *                23   Video of Cats
12   The Pope                       24   The White House

* Indicates no text found in ASR

1 Results taken from TRECVID: An Introduction [TRECVID 2003]

Another factor in how the use of features helped improve retrieval performance (especially low-level features) was the nature of particular topics and/or their associated examples. Figure 1-9 illustrates two interesting topic examples that offer improved TREC 2003 retrieval performance when using features [Browne et al, 2003]. The first topic example, “Basketball Matches”, illustrates topic examples that are taken from a video source similar to the content we expect to retrieve from. The second example, “Aircraft Taking off”, is a good topic for low-level features like the colour histogram because the examples contain a large amount of a dominant colour; in these examples it is the blue sky background, which takes up a significant portion of the image.

Figure 1-9: Friendly Image Topic examples from TREC 2003 (columns: topic number, topic example images)
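The observation about dominant colour can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the function name `dominant_colour_share`, the choice of 4 bins per RGB channel, and the pixel representation (a sequence of 8-bit RGB tuples) are all hypothetical and are not taken from any TRECVID system. It reports which quantised colour bin covers the largest share of an image; an example image dominated by sky would score highly.

```python
from collections import Counter
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values, an assumed representation


def dominant_colour_share(
    pixels: Sequence[Pixel], bins_per_channel: int = 4
) -> Tuple[Tuple[int, ...], float]:
    """Return the most populated quantised colour bin and its share of the image.

    Each channel value (0-255) is mapped to one of `bins_per_channel` bins,
    so similar shades (for example, different blues of a sky) fall together.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    counts = Counter(
        tuple(channel * bins_per_channel // 256 for channel in pixel)
        for pixel in pixels
    )
    dominant_bin, count = counts.most_common(1)[0]
    return dominant_bin, count / len(pixels)
```

A topic example whose dominant bin covers most of the frame is a plausible candidate for histogram-based matching; the threshold at which this becomes useful is not stated in the text and would have to be chosen empirically.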

Some TRECVID query topics and their examples did pose a difficult challenge for retrieval. The following are the main reasons why some topics prove more challenging for retrieval than others:

1: The ASR audio did not contain the topic keywords (see Table 1-6). Naturally, without matching keywords, text-based search offers limited performance. In the case of the query “Congressman Mark Souder”, one solution that was tried was text-based searching on Congressman and Senator to narrow down the search, and then browsing through the results.

2: The TREC features were not useful for the particular query. If there had been a person detection feature for “Congressman Mark Souder”, then the topic search would have been far easier; unfortunately, none of the remaining features were suitable for that specific topic.

3: The Query topic examples are visually dissimilar from the search content. Retrieval systems that use low-level features and incorporate the topic examples into search would not have much success. The main reason is that the topic examples and valid results from the search content are too different in terms of colour and shape to be ranked highly and retrieved (see Figure 1-10).
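The workaround described in the first point, falling back to broader terms when the topic keywords are absent from the ASR text, can be sketched as follows. This is a minimal illustration, not the search engine used by any participating group: the function names, the term-count scoring and the transcript format (a mapping from a shot identifier to its ASR text) are all assumptions made for the example.

```python
import re
from typing import Dict, List, Sequence, Tuple


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_transcripts(
    query_terms: Sequence[str], transcripts: Dict[str, str]
) -> List[Tuple[str, int]]:
    """Rank shots by how many transcript tokens match a query term.

    Shots with no matching token are omitted, which is why a topic whose
    keywords never occur in the ASR text returns an empty list.
    """
    terms = {term.lower() for term in query_terms}
    results = []
    for shot_id, text in transcripts.items():
        score = sum(1 for token in tokenize(text) if token in terms)
        if score > 0:
            results.append((shot_id, score))
    # Highest score first; ties broken by shot identifier for stable output.
    return sorted(results, key=lambda item: (-item[1], item[0]))


def search_with_fallback(
    topic_terms: Sequence[str],
    fallback_terms: Sequence[str],
    transcripts: Dict[str, str],
) -> Tuple[List[Tuple[str, int]], bool]:
    """Try the topic terms first; if nothing matches, use broader terms.

    Returns the ranked results and a flag saying whether the fallback was used.
    """
    results = search_transcripts(topic_terms, transcripts)
    if results:
        return results, False
    return search_transcripts(fallback_terms, transcripts), True
```

With a topic term such as "souder" that never appears in the transcripts, the fallback terms "congressman" and "senator" still return a shortlist, which a user would then have to browse manually, as the text describes.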

Figure 1-10: Unfriendly Image Topic examples from TREC 2003 (columns: topic number, topic examples)

As we can see from Figure 1-10, the examples from topic 10 are all publicity material. Content from C-SPAN or TV news is unlikely to have matching visual content; it is more likely to contain outside footage of the Congressman being interviewed, or footage of him as one of a group of people at a meeting or gathering. The second example, topic 22, has two stills taken from a movie and a promotional photo, and once again it is unlikely that matching video will be found. However, there is a strong possibility of finding trees and greenery, as the two example results for topic 22 shown in Figure 1-11 illustrate.

[Screenshot of the search interface for task 22: a QUERY panel for entering terms and example images, which are used together for searching, and a SEARCH RESULT panel listing ranked shots from ABC News and CNN News broadcasts.]

Figure 1-11: Screen shot of the DCU TRECVID 2003 System

The screenshot in Figure 1-11 shows an image-only search for “Actor Morgan Freeman” using one of the topic examples, where low-level histogram-based features were used to generate the ranking. As we can see from the ranked results on the right (centre column), baseball features quite highly, and the greenish background from the topic example is responsible for this. Note that the ranked results also feature people.
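The effect described here, where a green background in the query image pulls green baseball footage to the top of the ranking, follows directly from how histogram matching works. The sketch below is a generic illustration and not the DCU system's implementation: the text does not say which similarity measure was used, so histogram intersection is an assumption, as are the function names, the 4 bins per channel and the representation of a keyframe as a sequence of 8-bit RGB tuples.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values, an assumed representation


def colour_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalised joint RGB histogram (entries sum to 1)."""
    if not pixels:
        raise ValueError("image has no pixels")
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in the range 0..n-1.
        index = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        hist[index] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1]: the colour mass that the two histograms share."""
    return sum(min(x, y) for x, y in zip(a, b))


def rank_keyframes(
    query_pixels: Sequence[Pixel],
    keyframes: Dict[str, Sequence[Pixel]],
    bins_per_channel: int = 4,
) -> List[Tuple[str, float]]:
    """Rank keyframes by colour similarity to the query image, best first."""
    query_hist = colour_histogram(query_pixels, bins_per_channel)
    scored = [
        (shot_id, histogram_intersection(query_hist, colour_histogram(pixels, bins_per_channel)))
        for shot_id, pixels in keyframes.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because the score depends only on colour distribution, a query that is mostly green matches any mostly green keyframe regardless of what it depicts, which is consistent with the baseball shots appearing in the results for this topic.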

1.7.5 The Future of TRECVID Video Evaluation

TRECVID 2004 continues with evaluation of television news content using a larger corpus of television news and updated query topics. Further into the future, other domains like television sports (soccer and tennis, for example) and other genres might be evaluated. The feature extraction task will continue to have concepts added and removed depending on the corpus domain under evaluation. Possible additional features that could be included are:

1. River
3. Boats
4. Ski Detection
5. Music Genre
6. Beach
8. Football sports Field detection
9. Swimming detection
10. Laugh detection