
Tasks to be carried out for the expansion of Cuba's GSM network

In document El estándar GSM y su empleo en Cuba (pages 49-92)

Chapter 3. Proposal for expanding Cuba's GSM network

3.2. Tasks to be carried out for the expansion of Cuba's GSM network

A large number of computer graphics applications produce images or videos as output, and the quality of the proposed algorithms and reproduction techniques is often compared using state-of-the-art objective quality metrics. While objective quality metrics, especially the full-reference perceptual metrics described previously, are often accurate in their quality prediction and have been shown to correlate well with subjective experiments, their primary restriction is the training set of distortions [SSB06b]. The accuracy of such metrics decreases with the growing variety of distortions [PLZ∗09]. Therefore, the final judgment of quality required to convincingly prove superior performance needs to be corroborated by user studies with the help of potential users or reviewers. Given the range of distortions present in computer graphics applications, it is unlikely that user studies will be completely replaced by objective metrics [MTM12]. However, such user studies, although more convincing than objective evaluation, are typically more tedious, tend to produce noisy results when conducted inappropriately, and yield results whose interpretation is non-trivial [MTM12].

This section introduces the reader to some of the basic subjective evaluation techniques required to conduct an effective evaluation of image/video content in an increasingly large number of applications. Additionally, this section provides an overview of some of the techniques used to design such experiments and to conduct appropriate statistical analysis, such that the resultant data can be confidently accepted. Some of the techniques described in this section have been used in Chapters 5 and 6 to conduct subjective evaluation of HDR and LDR video content.

Subjective evaluation of image/video content is typically conducted with the help of rating, ranking and pairwise comparison based experiments. This section briefly describes each technique.

4.3.1 Rating based experiments

[Figure 4.8 panels: a five-point Likert scale (discrete) with the categories bad, poor, fair, good and excellent; a continuous scale over [1, 5] running from bad to excellent.]

Figure 4.8: Schematic diagram of a Likert (discrete) and a continuous scale.

Rating based experiments can be broadly classified into two groups, i.e. single and double stimulus categorical rating. In a single stimulus rating, an image or video is displayed for a short duration and the observers are requested to rate the quality of the displayed content on a scale of [1 - N], where higher is better. Typically, in many applications the rating scale is designed such that the scale S ∈ [1, 5], where the categories are bad, poor, fair, good and excellent [MTM12] (see Figure 4.8). In some cases, the continuous rating scale is favoured over the five-point Likert-type scale in order to avoid quantisation artefacts [Ass03] (see Figure 4.8). Additionally, such rating techniques also contain a hidden reference (the reference stimulus, i.e. the image/video sequence against which other stimuli are tested), which is randomly presented to the observer to avoid bias.
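The quantisation effect mentioned above can be illustrated with a minimal sketch. The function name `to_likert` and the nearest-category mapping are assumptions made for illustration only; they are not taken from [MTM12] or [Ass03].

```python
import math

# Category labels of the five-point scale shown in Figure 4.8.
LABELS = {1: "bad", 2: "poor", 3: "fair", 4: "good", 5: "excellent"}


def to_likert(score: float) -> int:
    """Map a continuous rating in [1, 5] to the nearest discrete category.

    Hypothetical helper: nearest-category rounding is an assumption.
    """
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must lie in [1, 5]")
    # floor(x + 0.5) rounds halves upwards; round() would round halves to even.
    return int(math.floor(score + 0.5))


# Two clearly different continuous ratings collapse into the same category.
a, b = 3.6, 4.4
print(LABELS[to_likert(a)], LABELS[to_likert(b)])
```

Under this mapping, ratings of 3.6 and 4.4 both become "good", so a difference of 0.8 on the continuous scale is lost on the discrete one.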

In a double stimulus rating based experiment, the reference and target image/video content are presented to the observer at the same time, typically by means of a dual display. In the case of a single display, the contents are presented one after the other in random order. The primary advantage of rating based experiments is the short time required to conduct them. The observers can either execute the task using a controlling GUI or mark their rating preferences on a score sheet, which can later be digitised for analysis. For a single stimulus experiment, the number of trials required is n + 1 for n conditions, where one extra trial is required for the hidden reference.
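The n + 1 trial count can be made concrete with a short sketch that builds a randomised presentation order for one observer. The function name `single_stimulus_trials`, the `"reference"` label and the condition names are hypothetical; how the reference is actually hidden in the experiments of Chapters 5 and 6 is not specified here.

```python
import random


def single_stimulus_trials(conditions, reference="reference", seed=None):
    """Return a shuffled trial list: n conditions plus one hidden reference.

    The reference is placed at a random position so that the observer
    cannot tell it apart from the processed stimuli.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    trials = list(conditions) + [reference]
    rng.shuffle(trials)
    return trials


conditions = ["codec_A", "codec_B", "codec_C"]  # hypothetical condition names
trials = single_stimulus_trials(conditions, seed=1)
print(len(trials), trials)  # n + 1 = 4 trials
```

Using a different seed per observer is one simple way to vary the presentation order across participants.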

4.3.2 Ranking based experiments

Ranking based experiments are a more deterministic technique to subjectively evaluate image/video quality. Here the participants are tasked to rank a series of candidate stimuli (image/video content) against a known reference, where the basis of ranking is the closeness or resemblance of the candidate stimuli to the reference. Similar to rating based experiments, ranking based experiments might or might not contain a hidden reference. Also, single/double stimulus ranking based experiments can be conducted, where the candidate stimulus is always shown to the participants along with the reference stimulus. Typically, in such an experiment, the candidate stimuli are ordered from [1 - N], where lower is better and N is the number of candidate stimuli.
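As a minimal sketch of how ranks from several participants might be combined, the code below averages the rank each stimulus received and orders the stimuli by that average. Averaging ordinal ranks is an assumption made for illustration, not the analysis used in this work; the names `mean_ranks` and `overall_order` and the sample data are hypothetical.

```python
from statistics import mean


def mean_ranks(rankings):
    """Average the rank of each stimulus over all participants.

    rankings: list of dicts mapping stimulus -> rank, where 1 is the
    stimulus judged closest to the reference (lower is better).
    """
    stimuli = rankings[0].keys()
    return {s: mean(r[s] for r in rankings) for s in stimuli}


def overall_order(rankings):
    """Order stimuli by mean rank; ties are broken alphabetically."""
    avg = mean_ranks(rankings)
    return sorted(avg, key=lambda s: (avg[s], s))


# Hypothetical ranks given by three participants to three stimuli.
rankings = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 1, "C": 3},
    {"A": 1, "B": 3, "C": 2},
]
print(overall_order(rankings))
```

Whether the resulting differences in mean rank are significant still has to be established with an appropriate statistical test.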

The primary disadvantage of ranking based experiments is the time required to conduct them, as the participants have to compare all the candidate stimuli before ordering them according to their preference. Ranking can also be conducted indirectly using forced choice pairwise comparisons, as explained later in this section. In Chapters 5 and 6, both rating and ranking based experiments are conducted in order to obtain the user experience of HDR videos and a ranking of HDR video compression algorithms.

4.3.3 Pairwise comparison based experiments

Pairwise comparisons can also be broadly classified into two groups. The first group, ordering by forced choice, requires the participants to choose, according to their preference, between a pair of candidate stimuli with similar content but processed with different conditions [MTM12]. Observers are forced to choose one candidate at random when they perceive no difference between the candidates. There are several advantages of pairwise comparison techniques, such as fewer problems with the obtained subjective data as compared to rating [BADC11] and the existence of standard statistical techniques to determine the significance of inferred ranks as compared to ranking [BADC11]. However, the main disadvantage of pairwise comparison techniques is the time required to conduct such experiments. Although there is no time limit, more trials are required to compare each pair of conditions; the number of trials is 0.5 × (n · (n − 1)), where n is the number of conditions (for example, n = 8 conditions require 28 trials). Although a full comparison is ideal, the number of trials can be limited using a balanced incomplete block design as described in [MTM12, GT61] or using a sorting algorithm to choose the comparison pairs [SF01].
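The trial count of the full design, and the idea of letting a sorting algorithm choose the pairs, can be sketched as follows. This is an illustration under stated assumptions: Python's built-in sort stands in for whichever sorting algorithm [SF01] uses, the observer is simulated by a hypothetical, perfectly consistent `quality` table, and the names `full_design` and `sort_based_order` are invented for this example.

```python
from functools import cmp_to_key
from itertools import combinations


def full_design(conditions):
    """All unordered pairs: 0.5 * n * (n - 1) trials for n conditions."""
    return list(combinations(conditions, 2))


def sort_based_order(conditions, prefer):
    """Order conditions by asking only the pairs the sort requests.

    prefer(a, b) must return whichever of a or b the observer prefers.
    Returns the ordering (most preferred first) and the pairs shown.
    """
    asked = []

    def cmp(a, b):
        asked.append((a, b))  # each comparison is one forced-choice trial
        return -1 if prefer(a, b) == a else 1

    ordered = sorted(conditions, key=cmp_to_key(cmp))
    return ordered, asked


# Simulated observer with a hidden, consistent preference (assumption).
conditions = [f"c{i}" for i in range(8)]
quality = dict(zip(conditions, [3, 7, 1, 8, 2, 6, 4, 5]))
prefer = lambda a, b: a if quality[a] > quality[b] else b

ordered, asked = sort_based_order(conditions, prefer)
print(len(full_design(conditions)), len(asked), ordered)
```

Real observers are noisy and can give inconsistent answers, in which case a sort that trusts every single comparison may produce an unreliable order; the sketch only shows why fewer trials are needed.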

Although the forced choice comparison determines the order of viewing preference, it does not quantify the difference between the stimuli presented. The second group of pairwise comparison techniques is classified as pairwise similarity judgements, where the participants are asked not only to order the stimuli according to their preference but also to indicate the difference between each pair of stimuli presented, on a continuous scale similar to rating. If the observer perceives no difference, the marker can be set to ‘0’. Such experiments are more deterministic and informative, albeit at the cost of experiment time. Further details about the comparison methods are available in [MTM12].
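The difference between the two groups of pairwise techniques can be illustrated with a small sketch. Each judgement is encoded here as a tuple (a, b, d), where d > 0 means a is preferred over b by magnitude d, d < 0 means b is preferred, and d = 0 means no perceived difference. This encoding, the simple averaging of signed differences, and the function names are assumptions for illustration, not the scaling method of [MTM12].

```python
from collections import Counter, defaultdict


def wins_from_sign(judgements):
    """Forced-choice view: keep only who was preferred in each pair."""
    wins = Counter()
    for a, b, d in judgements:
        wins[a] += 0  # make sure every condition appears, even with 0 wins
        wins[b] += 0
        if d > 0:
            wins[a] += 1
        elif d < 0:
            wins[b] += 1
    return dict(wins)


def scores_from_similarity(judgements):
    """Similarity view: mean signed difference of each condition."""
    total = defaultdict(float)
    count = defaultdict(int)
    for a, b, d in judgements:
        total[a] += d
        total[b] -= d
        count[a] += 1
        count[b] += 1
    return {c: total[c] / count[c] for c in total}


# Hypothetical judgements from one observer over three conditions.
judgements = [("A", "B", 1.0), ("A", "C", 3.0), ("B", "C", 2.0)]
print(wins_from_sign(judgements))
print(scores_from_similarity(judgements))
```

The win counts only give the order A, B, C, whereas the mean signed differences additionally indicate how far apart the conditions are judged to be.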
