
Justification of the strategy based on needs

In document VERSIÓN 4 2 DE DICIEMBRE DE 2010 (página 155-172)


The ubiquitous expansion of multimedia data has driven the need for automatic systems and tools for content-based multimedia analysis [28]. This research field has been very active in recent times because of its commercial interest and the entertainment value it can offer to large audiences. In particular, automatic indexing and video summarisation of broadcast sports video have been very active fields. In [28], the authors argue that, with an abundance of media available, viewers prefer to retrieve the key events in a sporting match rather than watch the complete event from start to finish. Numerous approaches to shot classification and highlight extraction for specific sports have been developed, based on a combination of low-level visual/auditory features and sports genre-specific rules [29]. Other approaches use ball and/or player tracking techniques to detect semantic events caused by player-ball interactions.

Much of the published research on event detection in sporting video is oriented towards a general audience, where automatically indexed events are channelled to viewers. In the context of communication, automatic video annotation is of particular interest since video transcoding can overcome communications bottlenecks. Another application of sports video analysis is home video, where video summaries or user annotations are required to support search tools in large personal archives. However, sporting professionals such as soccer coaches are more interested in tactical events and useful statistics, which can be inferred from detected events [178]. Zhe et al. [178] explain that manual annotation systems, which provide manual editing tools along with event retrieval interfaces, are no longer useful given the abundance of information which coaches amass over time. Coaches prefer comprehensive statistics, which can be used to infer tactical patterns and help to improve performance during or after a match. To manage the ongoing video annotation operations, coaching teams frequently employ video editors to capture, annotate and organise information, which is then used to build tactical analysis and useful statistics. These video editing duties are extremely time-consuming, and the possibility of using automatic multimedia event retrieval technologies is accelerating research in this area, especially through visual sensing technology.

Managing visual data is becoming a bigger problem as the amount of content produced increases. To simplify the problem, it is very helpful to identify semantic indexes which describe events within the video. Manual annotation is simply too tedious and time-consuming, making automated video indexing techniques a necessity [11].

High-Level Event Analysis

With large volumes of sports video being captured by both broadcasters and amateurs, recent years have seen an increase in technologies which can provide high-level analysis. In soccer, camera image streams are used to answer high-level analysis questions such as: What attacking plays characterise each team? What are the main characteristics of a specific team player? What team roles do these players have? Are they capable of their assigned team role? Can they accomplish their given duties? How does a particular team attack and create an opportunity to score? What are the main skills of each player? What kind of team formation is being used? In a number of team-based field sports, there is an appetite for real-time analysis of events among referee associations, the sports press and supporters. Automatic video analysis tools have the potential to detect erroneous refereeing decisions by monitoring video sequences, preventing misinterpretations caused by occlusion, viewpoint error or simply an overwhelming number of events taking place concurrently.

To detect and track the ball in soccer videos, Yu et al. [173] use a trajectory-based algorithm. Ball candidates are first selected from feature objects (the goalmouth and ellipse). A Kalman filter is then used to generate candidate trajectories from the ball candidates. A confidence measure then decides which of the candidate trajectories is the correct ball trajectory. High-level events such as ball touching or ball passing are then inferred, and these are used in turn for team ball possession analysis.

Figure 6.1: The first goal scored: (a) a long range view of the score, (b) camera zoom to player, (c) crowd response, (d) a replay, (e) another replay, and (f) long range view of the resumption of play. [51]
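As an illustration of the filtering step only, the following is a minimal sketch of a constant-velocity Kalman filter applied independently to each image axis. It is not the formulation of Yu et al. [173]: the motion model, the noise values `q` and `r`, and the names `Kalman1D` and `filter_trajectory` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter for one image axis (hypothetical helper)."""
    pos: float
    vel: float = 0.0
    # Entries of the 2x2 state covariance P = [[p00, p01], [p10, p11]].
    p00: float = 100.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0
    q: float = 0.01  # process noise variance (assumed value)
    r: float = 1.0   # measurement noise variance (assumed value)

    def predict(self, dt: float = 1.0) -> float:
        """Advance the state by dt frames and return the predicted position."""
        self.pos += self.vel * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = diag(q, q).
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z: float) -> None:
        """Correct the state with a measured position z (H = [1, 0])."""
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # gain for position
        k1 = self.p10 / s              # gain for velocity
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P <- (I - K H) P, computed from the pre-update covariance.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def filter_trajectory(points):
    """Filter a list of (x, y) ball candidates, one per frame.

    Returns the filtered positions and the final (vx, vy) estimate.
    """
    x0, y0 = points[0]
    fx, fy = Kalman1D(x0), Kalman1D(y0)
    smoothed = [(x0, y0)]
    for x, y in points[1:]:
        fx.predict()
        fy.predict()
        fx.update(x)
        fy.update(y)
        smoothed.append((fx.pos, fy.pos))
    return smoothed, (fx.vel, fy.vel)
```

In a fuller system, the predicted position from `predict` would typically be used to decide which ball candidate in the next frame extends a given trajectory; that association step and the confidence measure are not specified in the text above and are left out here.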

In soccer, multi-modal information analysis is commonly used to automatically detect high-level events. In [51], the authors use a rule-based and model-based system which exploits heuristic rules to detect the goal event in soccer. Their approach relies on the observation that a goal event is accompanied by close-ups of the emotional scorer and of the goalkeeper, occurring close to shots of the crowd's reaction to the goal. This is immediately followed by several slow-motion replays of the goal from different camera angles. Between the long range shot which first shows the goal and the long shot of the resulting kick-off, the authors in [51] define a cinematic template that needs to satisfy the following rules:

• The break for a goal lasts between 30 and 120 seconds;

• There must be one or more close-up/out of field shots: these may range from a player close-up to a view of the crowd;

• There must be a minimum of one slow-motion replay, as a replay of the goal is always played several times.
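The three rules above can be sketched as a simple check over the shots that fall between the two long shots. This is only an illustrative sketch: the `Shot` record, its shot-type labels and the function name `matches_goal_template` are hypothetical and do not come from [51], and the code assumes that shot boundaries, shot types and slow-motion flags have already been detected upstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    """One shot inside a candidate break (hypothetical record)."""
    kind: str                  # assumed labels: "long", "medium", "close_up", "out_of_field"
    duration: float            # length of the shot in seconds
    slow_motion: bool = False  # True if the shot is a slow-motion replay


def matches_goal_template(break_shots: List[Shot],
                          min_break: float = 30.0,
                          max_break: float = 120.0) -> bool:
    """Check the three template rules on the shots between two long shots."""
    total = sum(shot.duration for shot in break_shots)
    # Rule 1: the break lasts between 30 and 120 seconds.
    duration_ok = min_break <= total <= max_break
    # Rule 2: at least one close-up or out-of-field shot.
    has_close_up = any(shot.kind in ("close_up", "out_of_field")
                       for shot in break_shots)
    # Rule 3: at least one slow-motion replay.
    has_replay = any(shot.slow_motion for shot in break_shots)
    return duration_ok and has_close_up and has_replay
```

For example, a break made up of a player close-up, a crowd shot and two slow-motion replays whose durations add up to 54 seconds satisfies all three rules, whereas the same break without any replay does not.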

Figure 6.1 provides an illustration of the template for the first goal in the Spain1 sequence of the well-known MPEG-7 data set. In this example the break duration is 54 seconds. The method for detecting goals first detects slow-motion replay shots, and then detects the long shot views that mark the beginning and end of the goal sequence. Another commonly sought high-level event in soccer is referee detection. The approach presented in [51] exploits the fact that the referee's clothing is always distinguishable from the clothing worn by both teams. The authors use a dominant color region detection algorithm, which is applied when there is a medium or out of field/close-up shot.
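The idea behind the referee cue can be sketched with a coarse colour histogram: find the dominant colour of a candidate region and check that it matches neither team's kit nor the pitch. This is a simplified illustration and not the dominant color region detection algorithm of [51]; the bin count, the function names `dominant_colour` and `looks_like_referee`, and the representation of a region as a list of RGB tuples are assumptions.

```python
from collections import Counter


def quantise(colour, bins=4):
    """Map an (r, g, b) colour with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    r, g, b = colour
    return (r // step, g // step, b // step)


def dominant_colour(pixels, bins=4):
    """Return the most frequent coarse colour bin among the region's pixels."""
    counts = Counter(quantise(p, bins) for p in pixels)
    return counts.most_common(1)[0][0]


def looks_like_referee(pixels, known_colours, bins=4):
    """True if the region's dominant colour differs from all known colours.

    known_colours is assumed to hold one representative RGB value per team
    kit, plus the pitch colour if the region may include grass.
    """
    known_bins = {quantise(c, bins) for c in known_colours}
    return dominant_colour(pixels, bins) not in known_bins
```

With only four bins per channel the comparison is deliberately tolerant of lighting changes, at the cost of confusing kits whose colours fall into the same bin; a practical system would need a finer colour model.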
