The quality of an algorithm is usually measured by its time complexity and memory requirements. In the evaluation of complex systems such as the SPA tracking system, additional quality criteria concerning the verification of the validity of the system are required. The tracking algorithm may fail to find the correct position of the target, which is considered a tracking error. Measures such as the number of errors or the rate of error occurrence should therefore be used to evaluate the tracking algorithms.
7.3.1 Performance Measurements
Errors from the tracking system occur when players move their extremities or when they move vertically (for example when jumping). The tracking algorithm tries to track part of the player’s body or the whole body. For example, the template matching tracking implemented in this work is a head-based tracking. The tracker may fail to match the head exactly. This happens due to collisions between players, which may lead to partial or complete occlusion of the tracked part. Another source of error is an inaccurate estimation of the background mask. In that case the tracked part of the player may be treated as background and masked out of the matching results.
To keep the tracking error-free, manual intervention (correction) by the user is required. The number of manual corrections is a measure of the quality of the tracker. The correction rate is the number of corrections made in a specific amount of time (for example, in one minute).
In the evaluation of the tracking algorithms in this work, the number of corrections per minute is used. This error measure is called E. The number of errors during the whole video sequence (a basketball quarter or a handball half) is denoted Ne.
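As a minimal sketch of how E and Ne relate, assume the corrections are logged as timestamps in seconds. The log format and the function name `correction_rate` are illustrative assumptions and are not taken from the SPA software.

```python
def correction_rate(correction_times_s, duration_s):
    """Return (Ne, E): total number of corrections and corrections per minute.

    correction_times_s: hypothetical log of correction timestamps in seconds.
    duration_s: length of the evaluated video sequence in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    n_e = len(correction_times_s)      # Ne: errors in the whole sequence
    e = n_e / (duration_s / 60.0)      # E: corrections per minute
    return n_e, e


# Hypothetical example: 12 corrections during a 10-minute quarter.
times = [30.0 * (i + 1) for i in range(12)]
n_e, e = correction_rate(times, duration_s=600.0)
print(n_e, e)  # 12 corrections, i.e. 1.2 corrections per minute
```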
User intervention in the tracking process affects the frame processing rate.
The time needed for error correction is important for estimating the total time of processing one quarter or one game. It depends on the graphical user interface of the SPA and on how easy it is for the user to correct an error. To correct an error, the user has to pause the tracking, possibly step back several frames in the video to where the error occurred, drag the tracker to the correct position of the target, and then let the tracking resume. Another important measure of tracking is therefore the time needed to correct one tracking error, Te.
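The role of Te in the total processing time can be illustrated with a simple additive estimate. This is a sketch under stated assumptions: the model (automatic tracking time plus Ne times Te), the function name and the parameter `tracking_fps` are illustrative. The text above does not give a formula, and the sketch ignores the time spent re-tracking frames after stepping back.

```python
def total_processing_time_s(n_frames, tracking_fps, n_e, t_e_s):
    """Estimate the total processing time in seconds.

    Assumed model: time for automatic tracking of all frames
    plus Ne corrections that each take Te seconds.
    """
    if tracking_fps <= 0:
        raise ValueError("tracking_fps must be positive")
    automatic_s = n_frames / tracking_fps   # time spent by the tracker itself
    correction_s = n_e * t_e_s              # time spent by the user on corrections
    return automatic_s + correction_s


# Hypothetical example: 15000 frames processed at 10 frames per second,
# with 12 corrections of 20 seconds each.
total = total_processing_time_s(15000, 10.0, 12, 20.0)
print(total)  # 1500 s of tracking + 240 s of corrections = 1740.0 s
```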
To evaluate the tracking algorithms, a benchmark (ground truth) dataset is needed. The benchmark dataset described in Section 7.2.2 was created using the manual tracking capability of the SPA software.
7.3.2 Definition of Quality Criteria
One main use of video tracking in the sports domain is the evaluation of player performance by coaches or by scientists in coaching and sport science. According to Bös [13], the main quality criteria in classical test theory are objectivity, reliability and validity. Objectivity is the degree to which the test results are independent of the investigator (examiner). A test is completely objective when different investigators obtain the same results with the same subjects (test persons). Bös [13], based on Clarke [23], used a correlation coefficient as a measure of objectivity and called it the objectivity coefficient.
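A correlation-based objectivity coefficient can be sketched as follows. The sketch assumes the Pearson product-moment correlation, whereas the text above only states that a correlation coefficient is used. The function name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long result series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical distances (in metres) obtained by two investigators
# who evaluated the same five players independently.
investigator_a = [812.0, 954.5, 1020.3, 760.8, 899.1]
investigator_b = [815.2, 950.1, 1024.0, 758.3, 903.6]
objectivity = pearson_r(investigator_a, investigator_b)
print(round(objectivity, 4))
```

A value close to 1 indicates that the results depend little on the investigator.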
Reliability is defined by Lienert et al. [64] as follows: “The degree of reliability is a reliability coefficient which indicates to what extent under the same conditions and using the same subjects the obtained results of the test will be the same. In other words, to what extent the test results can be reproduced.”
Lienert et al. define the validity of a test as the degree of accuracy with which the test actually measures the feature that it claims to measure. A test is totally valid if the produced result is exactly what occurred in reality.
Applied to tracking, a tracking system would be valid if it produced exactly the paths that the subjects actually ran.
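One way to quantify how far a tracked path deviates from the path actually run is the mean Euclidean distance between tracked and reference positions, frame by frame. This is a minimal sketch and not the evaluation procedure of this work. It assumes both paths are sampled at the same frames and given in field coordinates in metres, and the function name is hypothetical.

```python
import math


def mean_position_error(tracked, reference):
    """Mean Euclidean distance between corresponding positions of two paths."""
    if len(tracked) != len(reference) or not tracked:
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(math.dist(p, q) for p, q in zip(tracked, reference))
    return total / len(tracked)


# Hypothetical example: reference path along a line, tracked path with small offsets.
reference_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
tracked_path = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.4)]
error = mean_position_error(tracked_path, reference_path)
print(round(error, 3))
```

An error of zero corresponds to the totally valid case described above.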
7.3.3 Definition of Evaluation Hypotheses
To perform the tests of validity, reliability and objectivity, the following hypotheses have been set:
• Hypotheses to prove the validity of the system:
– H1 The system produces the exact positions of the player on the playing field.
∗ H1.1 Player positions on the field are correct when the players are not moving (standing still).
∗ H1.2 Player positions on the field are correct when the players move on the spot.
– H2 The system determines the exact running paths and velocity gradients.
∗ H2.1 No distance is produced when players stand still on the spot.
∗ H2.2 No distance is produced when players perform sport-specific actions on the spot without moving.
∗ H2.3 The exact distances and speeds are computed when players perform a longitudinal shuttle run.
∗ H2.4 The exact distances and speeds are computed when players perform a transverse shuttle run.
∗ H2.5 The exact distances and speeds are computed when players perform a square run.
∗ H2.6 The exact distances and speeds are computed when players perform a circular run.
∗ H2.7 The exact distances are computed when players sprint at maximum speed.
∗ H2.8 The exact speeds are computed when players sprint at maximum speed.
∗ H2.9 The exact distances and paths are computed when players perform a longitudinal zigzag run.
∗ H2.10 The exact distances and paths are computed when players perform a transverse zigzag run.
• H3 Hypotheses to prove the reliability:
– H3.1 The system produces the same distance when the same square-run test is repeated.
– H3.2 The system produces the same distance when the same circular-run test is repeated.
• H4 Hypothesis to prove the objectivity: Two independent investigators obtain the same results when evaluating the same data sets.
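The distance-related hypotheses can be illustrated with a short sketch that computes the distance covered from a sequence of field positions and the spread of that distance over repeated runs. It assumes positions in metres, a naive summation of frame-to-frame displacements, and a relative spread defined as (max - min) / mean. These choices and all names are illustrative and are not the evaluation procedure of this work.

```python
import math


def path_length(positions):
    """Distance covered: sum of Euclidean steps between consecutive positions."""
    return sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))


def relative_spread(values):
    """Spread of repeated measurements, (max - min) / mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("relative spread is undefined for a zero mean")
    return (max(values) - min(values)) / mean


# H2.5-style check: a hypothetical square run with 5 m sides should give 20 m.
square_run = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 5.0), (0.0, 0.0)]
square_distance = path_length(square_run)

# H2.1-style check: a player standing still, with small hypothetical tracker jitter.
standing = [(3.0, 4.0), (3.01, 4.0), (3.0, 4.01), (3.0, 4.0)]
standing_distance = path_length(standing)

# H3.1-style check: hypothetical distances from three repetitions of the same run.
repeated = [20.0, 19.8, 20.2]
spread = relative_spread(repeated)
print(square_distance, standing_distance, spread)
```

With this naive summation, any jitter of the tracker while the player stands still accumulates as distance, which is the effect that H2.1 and H2.2 examine.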
Figure 7.18: Template size analysis for the template matching tracking.