Objective quality evaluation aims to estimate a user's perception of a service automatically and reliably. Its goal is to correlate well with subjective quality evaluation methods.
The main purposes of objective quality evaluation for measuring QoE standards are: 1) Characterizing the meaning of user opinions related to specific applications; 2) Defining a method for obtaining reliable user opinions; 3) Defining a method for predicting user opinions.
Three methods are often used for objective evaluation:
1) Full reference, in which both the reference and the processed data are available for detailed objective-subjective comparison;
2) No reference, in which only the processed data is used for objective-subjective comparison;
3) Reduced reference, in which only some features extracted from the reference data are available, together with the processed data, to derive and compare the objective-subjective correlation.
The full reference method offers the highest accuracy, but it increases the non-data load. The no reference method may give low accuracy because network conditions may affect its quality estimation; however, it adds nothing to the networking load. The reduced reference method promises a benefit over the other two because it combines their advantages: higher accuracy than no reference, with less non-data load than full reference.
2.2.2 Current status
Several methodologies using objective evaluation have been standardized in ITU Documents, for voice [21-23, 43], audio [44], and multimedia services [45, 46] as shown in Figure 2-4.
Figure 2-4: ITU's standards on QoE using objective quality evaluation methods
(Figure content: objective quality evaluation, divided into three branches. Speech signal: Voice over IP E-Model, ITU-T G.107; speech codecs, ITU-T P.861; speech and speech codec quality assessment, ITU-T P.862; speech quality assessment, ITU-T P.563. Audio signal: perceived audio quality, ITU-R BS.1387-1. Multimedia services: perceptual visual quality measurement, ITU-T J.246 (reduced reference); multimedia video quality measurement, ITU-T J.247 (full reference).)
An important transmission rating model for measuring QoE for voice, widely known and adopted, is the E-Model (ITU-T G.107) [21]. This model estimates two-way conversational quality as perceived by users in the roles of listener and talker. It requires extensive knowledge of the system components to estimate user satisfaction, and it enables a determination of whether a user will be satisfied with end-to-end transmission performance. In ITU-T G.107, a parameter known as the R-factor is used as a measure of quality and is defined by Equation (2.1).
$$R = R_0 - I_s - I_d - I_e + A \qquad (2.1)$$
where:
$R$: Transmission rating factor
$R_0$: Basic signal-to-noise ratio
$I_s$: Simultaneous impairment factor
$I_d$: Delay impairment factor
$I_e$: Equipment impairment factor
$A$: Advantage factor for expectation.
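Equation (2.1) is a plain additive combination: the three impairment factors are subtracted from the basic signal-to-noise ratio and the advantage factor is added back. The following minimal sketch illustrates that arithmetic only. The function name `r_factor` and the sample input values are hypothetical and are not G.107 default values; the clamping to the 0 to 100 range is an assumption made here to reflect the stated range of R.

```python
def r_factor(r0, i_s, i_d, i_e, a=0.0):
    """Combine the terms of Equation (2.1): R = R0 - Is - Id - Ie + A.

    All arguments are plain numbers on the R scale. The result is
    clamped to 0..100 (an assumption of this sketch, chosen to match
    the range the text gives for R).
    """
    r = r0 - i_s - i_d - i_e + a
    return max(0.0, min(100.0, r))


# Hypothetical inputs chosen only to illustrate the arithmetic.
example_r = r_factor(r0=94.0, i_s=1.5, i_d=10.0, i_e=11.0, a=0.0)
```

With these illustrative inputs the impairments total 22.5, so `example_r` is 71.5. In a real E-Model calculation each term is itself derived from many transmission parameters, which this sketch does not attempt to model.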
According to G.107 [21], the R factor always takes a numerical value between 0 and 100. A value of R greater than 60 is considered acceptable, while a value of R over 94.5 is unattainable for current VoIP services. Table 2-1, an extract of Table B.1 of the G.107 recommendation, shows how the R value and user satisfaction are related.
Table 2-1: Guidelines showing the relationship between R and user satisfaction (Table B.1 of G.107 [21])
R value   MOS    Good or Better (%)   Poor or Worse (%)   User Satisfaction
90        4.34   97                   Nearly 0            Very satisfied
80        4.03   89                   Nearly 0            Satisfied
70        3.60   73                   6                   Some users dissatisfied
60        3.10   50                   17                  Many users dissatisfied
50        2.58   27                   38                  Nearly all users dissatisfied
The second column of Table 2-1 presents the MOS scores. For a VoIP conversational situation, these values are computed from the R-factor, which is scaled to a range of values from 1 to 5.
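The scaling from R to MOS can be sketched with the cubic mapping that Annex B of G.107 gives for conversational quality. The coefficients below are quoted from memory of that annex and should be checked against the recommendation before use; the function name `r_to_mos` is hypothetical. In this form the mapping saturates at 4.5 rather than reaching 5.

```python
def r_to_mos(r):
    """Map an R value to an estimated conversational MOS.

    Assumed form of the G.107 Annex B mapping:
      MOS = 1                                   for R <= 0
      MOS = 1 + 0.035*R + R*(R-60)*(100-R)*7e-6 for 0 < R < 100
      MOS = 4.5                                 for R >= 100
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# R values listed in Table 2-1, mapped to MOS for comparison.
table_r_values = [90, 80, 70, 60, 50]
estimated_mos = [round(r_to_mos(r), 2) for r in table_r_values]
```

The values this produces agree with the MOS column of Table 2-1 to within about 0.01, which suggests the table was generated with a mapping of this kind.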
The relationship between R values and MOS is displayed in more detail in Figure 2-5, which uses a five-point MOS scale. For example, at an R value of about 50, nearly all users are dissatisfied with the voice quality, and the MOS lies in the range from 2 to 3.
Figure 2-5: MOS and R values (Figure B.2 of G.107 [21])
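The satisfaction categories of Table 2-1 can be read as a simple lookup. The sketch below assumes that each tabulated R value is the lower limit of its band, which is an interpretation made here rather than something the extract states; the names `SATISFACTION_BANDS` and `user_satisfaction` are hypothetical, as is the label returned for values below 50.

```python
# (lower limit of R, label) pairs taken from Table 2-1, highest band first.
SATISFACTION_BANDS = [
    (90, "Very satisfied"),
    (80, "Satisfied"),
    (70, "Some users dissatisfied"),
    (60, "Many users dissatisfied"),
    (50, "Nearly all users dissatisfied"),
]


def user_satisfaction(r):
    """Return the Table 2-1 satisfaction label for an R value.

    Assumes each tabulated R value is the lower bound of its band.
    Values below 50 are outside the extract, so a placeholder label
    (an assumption of this sketch) is returned for them.
    """
    for lower_limit, label in SATISFACTION_BANDS:
        if r >= lower_limit:
            return label
    return "Below tabulated range"
```

Under this reading, an R value of 71.5 falls in the "Some users dissatisfied" band, and any value above the acceptability threshold of 60 maps to one of the top four bands.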
It could be said that this model offers an approach to modelling QoE by deriving application layer performance metrics from network related performance parameters. However, because it is based on impairment values, it is complex and needs extensive knowledge of human perception; moreover, it does not support web traffic.
An alternative objective model is referred to as the Perceptual Speech Quality Measure (PSQM-P.861) [22]. This method is used specifically for speech coding and is linked to human auditory perception. Unfortunately, the PSQM model is suitable only for speech codecs and not for networked situations. Moreover, it gives poorer results when correlated with subjective opinions in some normal situations where there is background noise or packet loss. This recommendation has been recognized as having certain limitations in the specific area of application, and thus it was replaced by P.862 [23].
Perceptual Evaluation of Speech Quality (PESQ, P.862) [23] injects a reference signal into the system under test and compares the degraded output with that reference input. PESQ can measure one-way voice quality and demands no knowledge of the system under test, but it does require extensive knowledge of human perception.
The Video Quality Experts Group (VQEG) was formed in 1997 to address video quality issues. Its current goals are to advance the field of video quality assessment and to investigate new subjective assessment methods in which subjective ratings are recorded and then used to predict objective quality metrics [47, 48]. The method for objective measurement of perceived audio quality is recommended in ITU-R BS.1387-1 [44]. Objective perceptual video quality measurement with an available full reference is recommended in ITU-T J.247 [46], which defines four full reference models. ITU-T J.247 also recommends selecting an objective perceptual video quality measurement method appropriate to the application, such as testing a codec or testing a transmission chain. ITU-T J.246 [45] defines three reduced reference models to measure perceptual visual quality for multimedia services over digital cable television networks. For example, the edge peak signal-to-noise ratio (EPSNR) reduced reference model calculates the mean squared error from the degradation of edge pixels.
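The reduced reference idea behind an edge-based peak signal-to-noise ratio can be illustrated with a deliberately simplified sketch. This is not the algorithm specified in ITU-T J.246: the forward-difference edge detector, the threshold, and the function names `extract_edge_features` and `edge_psnr` are all assumptions chosen for brevity. The sender extracts only the edge pixels of the reference frame, and the receiver computes the mean squared error over those pixels alone.

```python
import math


def extract_edge_features(reference, threshold=50):
    """Sender side: return [(row, col, value)] for strong-edge pixels.

    `reference` is a 2-D list of grey levels. A pixel counts as an edge
    when the sum of its absolute forward differences reaches `threshold`
    (a crude detector assumed for this sketch).
    """
    features = []
    rows, cols = len(reference), len(reference[0])
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = reference[r][c + 1] - reference[r][c]
            gy = reference[r + 1][c] - reference[r][c]
            if abs(gx) + abs(gy) >= threshold:
                features.append((r, c, reference[r][c]))
    return features


def edge_psnr(features, processed, peak=255.0):
    """Receiver side: PSNR in dB computed over the edge pixels only."""
    if not features:
        raise ValueError("no edge pixels were extracted")
    squared_errors = [(value - processed[r][c]) ** 2 for r, c, value in features]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical on every edge pixel
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 4x4 frames: a vertical edge, slightly degraded in the processed copy.
reference_frame = [[0, 0, 200, 200] for _ in range(4)]
processed_frame = [[0, 10, 200, 200] for _ in range(4)]
edge_features = extract_edge_features(reference_frame)
score_db = edge_psnr(edge_features, processed_frame)
```

Only the short `edge_features` list has to travel alongside the video, rather than the whole reference frame, which is the reduction in non-data load that motivates reduced reference methods.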
The models described in Sections 2.1 and 2.2 are examples of both subjective and objective models for QoE assessment. However, these measurement models are suitable only for specific kinds of traffic involving audio signals. Thus, they cannot be applied in situations that involve web traffic in its many application forms. Moreover, they rely heavily on scores and user opinions, which are expensive in time and money to obtain.
2.3 Objective-Subjective correlation for measuring QoE