
Consider a regression model relating a response variable Y to a predictor X. The least-squares value of the slope b can be viewed as an estimate of the slope that applies in the “true relationship” between Y and X. Hence, the margin of error of an estimate can be used to give a range of values for this true value, as discussed in Chapter 4.

For instance, in the model relating runs scored in a season to a team’s OPS, the slope estimate is 2026; the margin of error of this estimate is 100. Therefore, our best guess for the value of the true slope in the regression model with runs scored as the response variable and OPS as the predictor is 2026, and we are reasonably certain that the true value lies in the range 1926 to 2126. Stated another way, based on the data at hand, the value of the slope is 2026; if we were able to observe unlimited data on runs scored and OPS, we are reasonably certain that the value of the slope would be in the range (1926, 2126).

The margin of error is useful for assessing the accuracy of the least-squares estimates and for understanding the range of values that are consistent with the observed data. Note that if Y and X follow a regression model, Y = a + bX + ε, and the true value of the slope is 0, then Y and X are not related. Hence, it is often of interest to determine if the slope estimate is statistically significantly different from 0; in this case, we say simply that the slope is statistically significant. This can be done by comparing the estimate to the margin of error, as discussed in Chapter 4. If the estimate is larger (in magnitude) than the margin of error, then it is statistically significant; otherwise, we cannot exclude the possibility that the true slope is 0. This does not mean that the true slope is exactly 0; it just means that, based on our data, there is little or no statistical evidence that the slope is not 0.

In the runs scored/OPS example, the slope estimate is 2026 and the margin of error is 100, so that the slope is clearly statistically significant. Consider an analysis with runs scored as the response variable but with stolen bases, denoted by S, as the predictor (Dataset 6.1). The regression equation is

$$\hat{Y} = 712.9 + 0.235\,S.$$

The margin of error of the slope estimate is 0.420. Because 0.235 is less than 0.420, we conclude that the slope estimate is not statistically significant; that is, the data are consistent with a model in which runs scored and stolen bases are not related. Practically speaking, there is not a strong relationship between runs scored and stolen bases, and we would not expect to obtain accurate predictions of a team’s runs scored using only stolen bases, a fact that is obvious to most baseball fans.
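The comparison rule used in these two examples can be sketched in a few lines of Python. The function names `interval` and `is_significant` are illustrative choices, not taken from the text; the numbers are the slope estimates and margins of error quoted above. Checking whether the estimate exceeds the margin of error in magnitude amounts to checking whether 0 falls outside the range given by the estimate plus or minus the margin of error.

```python
def interval(estimate, margin_of_error):
    """Range of values consistent with the data: estimate +/- margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)


def is_significant(estimate, margin_of_error):
    """True when the estimate is larger in magnitude than its margin of error."""
    return abs(estimate) > margin_of_error


# Slope estimates and margins of error quoted in the text.
examples = {
    "runs scored on OPS": (2026, 100),
    "runs scored on stolen bases": (0.235, 0.420),
}

for name, (estimate, moe) in examples.items():
    low, high = interval(estimate, moe)
    # Significance is the same as 0 lying outside the interval.
    zero_outside = not (low <= 0 <= high)
    print(name, (low, high), is_significant(estimate, moe), zero_outside)
```

For the OPS slope the range excludes 0, so the slope is significant; for the stolen-base slope the range includes 0, so it is not.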

In determining the statistical significance of a slope estimate, it is important to keep in mind that any conclusions are dependent on the range of predictor variables used in the analysis. For instance, consider NBA (National Basketball Association) centers in the 2010–2011 season who played in at least 70 games (Dataset 6.2). Let Y denote the player’s rebounds per 48 minutes played and let X denote the player’s height in inches. The regression equation relating Y to X is given by

$$\hat{Y} = 19.1 - 0.075\,X.$$

The margin of error of the slope estimate is 0.569. It follows that the slope estimate is not statistically significant; hence, there is no evidence of a relationship between height and rebounding. Of course, this does not mean that a player of any height can be an effective rebounder in the NBA. All the players in the analysis are between 6’9” and 7’5” tall. What the analysis tells us is that, within that range, for NBA centers playing at least 70 games, height is not a useful predictor of rebounding success.

Another factor affecting the statistical significance of a slope estimate is the sample size. If a slope is not statistically significant, it means that there is insufficient statistical evidence to conclude that it is not 0. This could be because the true slope is 0, or close to 0, or it could be that, although the true slope is not zero, there is insufficient data to detect a relationship between the response and predictor variables.

Consider another analysis of the runs scored of MLB teams, using triples T as the predictor. Using data from just the 2011 season (Dataset 6.1), the estimated regression line is

$$\hat{Y} = 612 + 2.73\,T.$$

The margin of error of the slope estimate is 3.780, so the estimated slope is not statistically significant. Now, suppose that we repeat the analysis using the data from the 2007–2011 seasons. The estimated regression line is now

$$\hat{Y} = 686 + 1.65\,T.$$

The margin of error of this slope estimate is 1.508, so the slope estimate is statistically significant. Therefore, teams that hit more triples tend to score more runs, with about 1.65 more runs per triple.
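A rough check of how sample size drives this change can be sketched as follows. It rests on two assumptions that are not stated in the text: that each season contributes 30 team observations, and that the margin of error shrinks approximately in proportion to one over the square root of the sample size. The variable names are illustrative.

```python
import math

# Values quoted in the text for the regression of runs scored on triples.
slope_2011, moe_2011 = 2.73, 3.780            # 2011 season only
slope_2007_2011, moe_2007_2011 = 1.65, 1.508  # 2007-2011 seasons

# Assumption: 30 teams per season, so five seasons give five times the data.
n_one_season = 30
n_five_seasons = 5 * n_one_season

# Assumption: margin of error scales roughly like 1 / sqrt(sample size).
predicted_moe = moe_2011 * math.sqrt(n_one_season / n_five_seasons)

print(f"predicted margin of error with five seasons: {predicted_moe:.3f}")
print(f"margin of error reported in the text:        {moe_2007_2011:.3f}")
print("2011 only significant:", abs(slope_2011) > moe_2011)
print("2007-2011 significant:", abs(slope_2007_2011) > moe_2007_2011)
```

The predicted value is only a ballpark figure: the square-root rule ignores season-to-season differences in the spread of triples and in the scatter of teams about the line.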

With a larger sample size, the margin of error of an estimate is smaller; hence, more estimates are significant, generally speaking. One consequence of this fact is that, if the sample size is very large, an estimate might be statistically significant but not practically important. For instance, consider MLB pitchers in the 2011 season who pitched at least 36 innings; there are 376 such players. Let Y denote a pitcher’s WHIP (walks and hits per inning pitched) and let A denote the pitcher’s age in years on June 30, 2011 (Dataset 6.3). The estimated regression equation relating Y to A is given by

$$\hat{Y} = 1.466 - 0.00543\,A.$$

The margin of error of the slope estimate is 0.00526. It follows that the slope estimate is statistically significant: Older pitchers tend to give up fewer walks and hits per inning pitched.

Note, however, that the magnitude of the effect is quite small. Over 90% of the pitchers studied are between 23 and 36 years old. Using the regression equation, the difference in the average WHIP for 23-year-old pitchers and 36-year-old pitchers is about 0.07, a very small difference. Stated another way, if we are interested in understanding the factors that contribute to a pitcher’s WHIP, age is not one that would normally be considered. Therefore, although from a statistical point of view age and WHIP are related, the relationship is not practically important.
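The figure of about 0.07 follows directly from the estimated regression equation, as this short Python sketch shows. The function name `predicted_whip` is an illustrative choice; the coefficients and ages are those given in the text.

```python
# Estimated regression equation from the text: WHIP = 1.466 - 0.00543 * age.
INTERCEPT = 1.466
SLOPE = -0.00543


def predicted_whip(age):
    """Average WHIP predicted by the fitted line for a pitcher of the given age."""
    return INTERCEPT + SLOPE * age


whip_23 = predicted_whip(23)
whip_36 = predicted_whip(36)
difference = whip_23 - whip_36  # equals -SLOPE * (36 - 23)

print(f"predicted WHIP at 23: {whip_23:.3f}")
print(f"predicted WHIP at 36: {whip_36:.3f}")
print(f"difference: {difference:.3f}")
```

The intercept cancels in the difference, so only the slope and the 13-year age gap matter.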

6.4 THE RELATIONSHIP BETWEEN WINS ABOVE REPLACEMENT AND TEAM WINS

An important contribution of sabermetrics to baseball is the development of methods of measuring the contribution of a player to his team. One of the most useful types of these measures is “wins above replacement” (WAR). Consider the case of a position player. WAR combines the player’s contributions in batting, base running, and fielding into a single statistic. The units of WAR are “wins,” so that a player with a WAR value of 5, for example, has contributed 5 more wins to his team than a “replacement player” would have. Roughly speaking, a replacement player is the type of player a team might expect to play if a starting player is injured, without expending additional resources (e.g., trading for another team’s starting player). Note that, because some positions are easier to play than others, the properties of a replacement player depend on the player’s position. For pitchers, WAR is based on the pitcher’s contribution to “team defense”; it often uses “fielding-independent” metrics, measures of a pitcher’s performance that adjust for the contribution of fielding to pitching statistics.

Many different implementations of this idea have been proposed, leading to several different definitions of WAR (or a similarly named statistic). Here, we use the version of WAR calculated by FanGraphs.com; a detailed description of the calculation used is given on that site. For each player on a team, a value of WAR can be determined, roughly measuring how many wins that player contributed. In 2013, the MLB leader in WAR was Mike Trout, with a value of 10.4. Among pitchers, the leader was Clayton Kershaw, with a value of 6.4. Near the bottom of the WAR list was Paul Konerko, with a value of −1.8, suggesting that the White Sox would have been better off using a generic replacement player in place of Konerko.

One interpretation of the WAR statistic is as a way to distribute a team’s wins, above those that would be achieved by a team of replacement players, among the team’s players. Because the properties of such a “replacement team” are the same for each MLB team, this suggests that a team’s actual wins should be closely related to its WAR, the sum of the WAR values for each player on the team. More specifically, if W represents a team’s actual wins and X denotes the team’s WAR, we expect that W = R + X, where R represents the number of wins expected from a replacement team; note that it is generally thought that R is about 50.

To investigate this relationship, we can conduct a linear regression analysis with team wins as the response variable and team WAR as the predictor variable. Using data from the 2009–2013 MLB seasons, the estimated regression equation is

$$\hat{W} = 49.7 + 0.940\,X.$$

The margin of error of the intercept estimate is about 2.8, so that the results are consistent with the value of wins by a replacement team in the range 47 to 52.5. The margin of error of the slope estimate is 0.08, so that the results are consistent with a slope of 1, as expected based on the theory of WAR. However, because 1 is near the end of the range 0.940 ± 0.08, the results suggest that WAR might be slightly overestimating the number of wins attributed to each player.
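The same range calculation can be used to compare an estimate with a theoretical value other than 0. The Python sketch below checks the fitted intercept and slope against the values suggested by the theory of WAR (R of about 50 and a slope of 1); the helper names `interval` and `is_consistent` are illustrative, not from the text.

```python
def interval(estimate, margin_of_error):
    """Range of values consistent with the data: estimate +/- margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)


def is_consistent(value, estimate, margin_of_error):
    """True when a hypothesized value lies inside estimate +/- margin of error."""
    low, high = interval(estimate, margin_of_error)
    return low <= value <= high


# Estimates and margins of error quoted in the text (2009-2013 seasons).
intercept_range = interval(49.7, 2.8)
slope_range = interval(0.940, 0.08)

print("intercept range:", intercept_range)
print("slope range:", slope_range)
print("R = 50 consistent with the data:", is_consistent(50, 49.7, 2.8))
print("slope = 1 consistent with the data:", is_consistent(1.0, 0.940, 0.08))
```

A slope of 1 sits only about 0.02 below the upper end of the slope range, which is why the text describes it as near the end of that range.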

The value of R² for the regression is 78.1%, indicating that about 78% of the variation in team wins can be explained by the team’s WAR values; that is, team WAR is a good predictor of team wins. To evaluate the magnitude of this R² value, we can compute the R² value for the regression of team wins on other possible predictors. For instance, if we use the difference of a team’s hits and walks and its hits and walks allowed as the predictor, we obtain R² = 71.7%. Consider a team’s total bases plus its hits and walks, similar to what would be used in calculating its OPS. If we use the difference between that value and the total bases plus hits and walks that a team has allowed as the predictor, then R² = 79.0%, slightly higher than what we obtained using team WAR as the predictor. Therefore, although team WAR is a useful predictor of team wins, it is not necessarily better than other measures of team performance.
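For readers who want to reproduce this kind of comparison, the following is a minimal Python sketch of how R² can be computed for a regression with a single predictor. The function name `r_squared` is an illustrative choice, and the four data points are made up purely to show the call; they are not taken from the datasets used in the text.

```python
def r_squared(x, y):
    """R-squared of the least-squares line of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line.
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of the response about its mean.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


# Hypothetical values for illustration only (not real team data).
made_up_team_war = [20, 30, 40, 50]
made_up_team_wins = [70, 78, 92, 98]

made_up_r2 = r_squared(made_up_team_war, made_up_team_wins)
print(f"R-squared for the made-up data: {made_up_r2:.1%}")
```

Running the same function with each candidate predictor in turn, against the same response, gives R² values that can be compared directly.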

This result is not surprising because the ability of a team to hit well and to prevent its opponent from hitting well is generally considered to be the most important factor in a team’s success. Note that this result does not suggest that WAR should be replaced by a simpler measure, such as total bases plus hits and walks. The primary motivation of WAR is not to predict team wins but to evaluate individual players. The results presented here are designed to check if WAR is calibrated correctly (it appears to be) and if it is closely related to team wins, as we would expect (it is).

6.5 REGRESSION TO THE MEAN: WHY THE BEST TEND TO GET WORSE AND
