Historically, the prediction of sports performance has been a concept usually reserved for those associated with the betting culture. In reality, however, everyone involved in sport subconsciously processes information to predict sports performance. Performance prediction could be described as the ability to draw conclusions about the outcome of future performance from the combined interaction of previously gathered information, knowledge or data. Players and coaches often make predictions about forthcoming opponents based upon previous encounters and known traits. Is it not reasonable to assume, therefore, that with valid and reliable information, and using the correct techniques, the accurate prediction of performance should be possible? From a sports science perspective, the most common approaches to performance prediction use large amounts of data and apply statistical techniques.
Human predictions, however, are entirely derived from one’s underpinning knowledge and subjective bias, although the ‘experts’ are able to accommodate a greater understanding of, and allowance for, the element of chance and uncertainty, unlike computerized models of performance prediction that are entirely statistically driven, such as Multiple Linear Regression (MLR).
Figure 5.5 Example of a perturbation that was ‘smoothed out’ at time = 2.6:08
an overview of the development of notational analysis
In an evaluation of human and computer-based prediction models for the 2003 Rugby World Cup (Table 5.1), O’Donoghue and Williams (2004) identified that the best human predictor performed better than any computer-based model, although the mean score of all the human predictions fell below most of the computer-based models that were used, most likely a result of the subjectivity within human prediction. Interestingly, in a similar study during the 2002 Soccer World Cup (O’Donoghue et al. 2003), although the best computer-based models outperformed the human predictors once again, their overall effectiveness in predicting results was far inferior to that achieved for the 2003 Rugby World Cup. Ironically, only the human-based focus group was able to predict four of the eight quarter-finalists, and no method predicted more than one of the semi-finalists, whereas all of the computer-based models for the 2003 Rugby World Cup predicted seven of the eight quarter-finalists and three of the four semi-finalists.
However, this is understandable considering the inherent differences between soccer and rugby union. Very few upsets occur within rugby union and, with only one drawn match in Rugby World Cup play since 1987 (O’Donoghue and Williams 2004), results generally go to form. Research by Garganta and Gonçalves (1997) led to the notion that, among team sports, soccer presents one of the lowest success rates in the ratio of goals scored to the number of attacking actions performed, which increases the likelihood of drawn matches and upsets. Consequently, the accurate prediction of soccer matches is far more difficult than that of rugby union matches, in which the number of points scored is far greater, a notion shared by O’Donoghue and Williams (2004) and demonstrated by O’Donoghue et al. (2003).
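The link between low scoring and unpredictability can be illustrated numerically. The sketch below is not taken from the studies cited; it assumes, purely for illustration, that each side's scoring events follow independent Poisson distributions, and the scoring rates are invented. The favourite is expected to score 1.5 times as often as the underdog in both settings, so only the overall volume of scoring differs.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k scoring events when `mean` are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def outcome_probabilities(mean_strong: float, mean_weak: float, max_events: int = 60):
    """Return (favourite wins, draw, upset) for two independent Poisson scores."""
    win = draw = upset = 0.0
    for strong in range(max_events + 1):
        p_strong = poisson_pmf(strong, mean_strong)
        for weak in range(max_events + 1):
            p = p_strong * poisson_pmf(weak, mean_weak)
            if strong > weak:
                win += p
            elif strong == weak:
                draw += p
            else:
                upset += p
    return win, draw, upset


# Invented scoring rates (assumption): same 1.5:1 ratio, different volume.
low_scoring = outcome_probabilities(1.5, 1.0)   # few scoring events per match
high_scoring = outcome_probabilities(7.5, 5.0)  # many scoring events per match

for label, (win, draw, upset) in (("low-scoring", low_scoring),
                                  ("high-scoring", high_scoring)):
    print(f"{label}: favourite wins {win:.2f}, draw {draw:.2f}, upset {upset:.2f}")
```

Under these assumptions the favourite wins noticeably more often, and draws are rarer, when more scoring events occur, even though the relative strength of the two sides is unchanged. Real scoring is unlikely to be exactly Poisson, so the figures indicate a tendency rather than an estimate for any actual competition.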
Recommending the types of data that should be used is difficult and ultimately depends upon what is actually available. Historically, research in soccer and similar team sports that has attempted to identify the characteristics of a successful team has used game-related performance indicators rather than factors such as distance travelled (Hughes et al. 1988; Yamanaka et al. 1993). Process-orientated data, such as pass completion, shots on target and entries into the attacking third, would create a more accurate picture of a team’s abilities, one that relates directly to the dynamic processes involved in soccer. Although large databases of such information are available, the validity of the data in terms of defining successful performance is questionable and unsubstantiated. For performance prediction to move forward, and not only within soccer, it is imperative that issues such as this are addressed, along with the continued development of valid and reliable methods of performance prediction.
Table 5.1 Marks awarded for each prediction

Method                                                    Marks awarded
Best individual human                                     39.00   4.00   2.00   1.00   0.00  46.00
Mean human prediction                                     35.63   3.45   1.17   0.26   0.31  40.66
Expert Focus Group                                        37.00   4.00   1.00   0.00   1.00  43.00
Computer-based methods
Multiple linear regression (satisfies assumptions)        38.00   4.00   1.00   0.00   0.00  43.00
Multiple linear regression (violates assumptions)         38.00   4.00   1.00   0.00   1.00  44.00
Binary logistic regression                                37.00   4.00   1.00   0.00   1.00  43.00
Neural networks with numeric input                        37.00   4.00   1.00   0.00   1.00  43.00
Neural network with binary input – 4 middle layer nodes   34.00   3.00   1.00   0.00   1.00  39.00
Neural network with binary input – 8 middle layer nodes   34.00   4.00   1.00   0.00   1.00  40.00
Neural network with binary input – 16 middle layer nodes  36.00   4.00   1.00   0.00   1.00  42.00
Neural network with binary input – 32 middle layer nodes  37.00   4.00   1.00   0.00   1.00  43.00
Simulation program                                        38.50   4.00   1.00   0.00   1.00  44.50

When MLR is used as a prediction tool, a number of limitations must be accepted. The method is not based upon any ‘artificial learning’ process in order to generate predictions, and it predicts each game on its own merit without consideration for other factors. The simulation package used by O’Donoghue and Williams (2004) favoured Brazil to win the 2002 World Cup, rather than France, who were the strongest team in the tournament. The model took into consideration the probabilities of qualifying from the group stages and then progressing through the knock-out rounds against differently ranked opposition, and predicted that Brazil had a greater chance of winning the fixture against France, should they have actually qualified from the group stages, because of the events preceding the potential tie. However, MLR would simply identify that France had both a superior rank and a far shorter travelling distance than Brazil and would predict France to win the tie. This simplistic approach is the fundamental drawback of MLR, and indeed of other algorithmic methods such as binary logistic regression, although the results of the research papers quoted showed that even the most simplistic approach can be relatively effective.
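The match-by-match character of an MLR prediction can be sketched as follows. This is a minimal illustration rather than the model used in the studies cited: the predictors (difference in world rank and difference in travelling distance) follow the description above, but the match data, the choice of points margin as the outcome variable, and the function names are all assumptions made for the example.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 gives the intercept
    n = len(design[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict_margin(coefficients, rank_diff, distance_diff):
    """Predicted points margin for one fixture, judged purely on its own merit."""
    intercept, b_rank, b_distance = coefficients
    return intercept + b_rank * rank_diff + b_distance * distance_diff


# Invented past matches (assumption): (rank advantage in places,
# extra distance travelled in thousands of km) -> points margin.
past_matches = [(-3, -2), (-1, -2), (1, -2), (3, -2),
                (-3, 2), (-1, 2), (1, 2), (3, 2)]
margins = [-4, -1, 8, 13, -10, -1, 3, 8]

coefficients = fit_linear_regression(past_matches, margins)

# Hypothetical fixture: the better-ranked side also travels less.
predicted = predict_margin(coefficients, rank_diff=1, distance_diff=-2)
print("coefficients:", coefficients)
print("predicted margin:", predicted)
```

Because the fitted equation sees only the two inputs for the fixture in question, the side with the better rank and the shorter journey is predicted to win, whatever happened in earlier rounds. A simulation approach, by contrast, would carry the probabilities of reaching that fixture forward from the preceding stages.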