Disruptive Behaviors - Theoretical Framework


How many times have you and your friends debated whether a player is worth the money his team pays him? Economists are not content to debate the monetary value of a player over lunch. They estimate a player’s value using a sophisticated statistical technique known as regression.²² While we cannot make you an expert in a few short pages, by the end of this appendix you should have an appreciation of the concept of a regression, a general idea of how economists use regressions, and a basic grasp of how to interpret regression output.

Figure 2A.12 An Increase in Income Allows for Some Leisure. Consumption plotted against Leisure (0 to 24). As people become more productive, they can devote an increasing amount of time to leisure.

22. See G. S. Thomas, “Surhoff Proves to Be ’99’s Best Investment,” Street & Smith’s SportsBusiness Journal, October 25–31, 1999, p. 1, for an article that uses a technique such as this.

Suppose you want to figure out how much Alex Ovechkin, a star forward with the Washington Capitals, is worth to his team. Presumably, the Capitals pay him based on some measure of performance. (We discuss the precise measure in Chapter 8.) In a very simple world, teams may base the salaries of all players other than goalies on the number of goals they score:

Salary = f (Goals)

In the equation above, a player’s salary is a dependent variable, because its value depends on (is determined by) the number of goals a player scores. Because the number of goals does not depend on another variable in the equation above, we call it an independent variable. If the relationship between goals scored and a player’s salary (the “functional form” of f(x) in the equation above) were a straight line like that in Figure 2B.1, you would be able to compute how much Alex Ovechkin was worth based on the number of goals he scored. Since you know that Ovechkin scored 32 goals in the 2010–2011 season, you could tell your friends how much he is worth to the team.

Unfortunately, life is not so simple. Salaries and goals scored do not line up perfectly along a straight line. Instead, the relationship is likely to be scattered around the line, as shown in Figure 2B.2. The points corresponding to players’ goals and salaries may be scattered about the line for two reasons. First, there may be some error in measuring the variables involved. For example, Ovechkin’s official salary may not include a bonus he received for making the NHL All-Star Team. If so, the official statistics understate his full compensation, and the point corresponding to his goals and salary lies below the line.

Second, a player’s salary and goals scored may not lie on the line because of some factor for which we have failed to account. For example, Ovechkin also had 53 assists, plays that led to goals scored by his teammates. If teams reward players for both goals and assists, then Ovechkin’s goal–salary combination may lie above the line in Figure 2B.2 because his salary also reflects a factor that our goal–salary relationship ignores.

Figure 2B.1 The True Relationship between Goals and Salary. Salary plotted against Goals. If we knew the true relationship between goals and salary, we would know what each hockey player’s salary should be.

Making matters more difficult still, in real life, we do not observe the line in Figure 2B.2. All we see is the scatter of points. From this scatter of points, we must estimate the relationship between goals and salaries before making a statement about a given player.

Economists who want to know the relationship between goals scored and salary in the NHL must first estimate the true relationship from the scatter of points that appear in Figure 2B.3. They do so through a process known as ordinary least squares (OLS). The name ordinary least squares indicates that we choose the line that minimizes the sum of the squared distances between the points and the line. If e_i is the distance (measured as a vertical line) between each point (i = 1, ..., n) scattered around the proposed line and the line itself, OLS minimizes the sum S, where

$$ S = \sum_{i=1}^{n} e_i^2 $$
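To make the minimization concrete, the sketch below computes S for any proposed line and finds the line that minimizes it using the standard closed-form solution for a one-variable regression. The goals and salary figures are made up for illustration, and the function names are our own, not part of any statistics package.

```python
def sum_squared_errors(xs, ys, intercept, slope):
    """Return S: the sum of squared vertical distances e_i from the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def ols_fit(xs, ys):
    """Closed-form simple OLS: the intercept and slope that minimize S."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data (not from the NHL sample): goals and salary in $ thousands.
goals = [5, 10, 20, 30, 40]
salaries = [1500, 2500, 3000, 4500, 4800]

intercept, slope = ols_fit(goals, salaries)
best_s = sum_squared_errors(goals, salaries, intercept, slope)

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, S = {best_s:.2f}")
```

Shifting the intercept or tilting the slope away from the fitted values can only raise S, which is what "least squares" means.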

Figure 2B.2 Observations Scattered Around the True Relationship. Salary plotted against Goals. A player’s salary might be higher or lower than the true relationship predicts.

Figure 2B.3 OLS Fits a Line to the Scattering of Points. Salary plotted against Goals. In reality, we see only the scattered points and must estimate the true relationship.

While we do not bother with all the theory behind OLS estimation, it helps to see why economists prefer it to two alternative estimation methods. One alternative is to minimize the total error (Σe_i), in effect adding the signed distances of the points from the proposed line. Figure 2B.4 shows this method fails to distinguish between lines A and B, even though line A clearly gives the better fit. The problem is that the error for line B is also zero because the negative error offsets the positive error.

We can solve the problem of offsetting positive and negative misses by either squaring the errors or taking their absolute value (Σ|e_i|). The two methods, however, are not identical. If we added the absolute value of the error terms, we would conclude that either line C or line D in Figure 2B.5 fits the data equally well. By squaring the errors, OLS places greater weight on the large miss made by line D.

OLS thus fits our intuitive notion that a line with several small misses fits the data better than a line with a few very large ones.
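The comparison of the three criteria can be sketched numerically. The residuals below are invented to mimic the situations in Figures 2B.4 and 2B.5 (they are not read off the figures): lines A and B both have errors that sum to zero, while lines C and D have the same total absolute error but very different squared errors.

```python
def total_error(errors):
    """Sum of signed errors: positive and negative misses cancel."""
    return sum(errors)


def total_absolute_error(errors):
    """Sum of absolute errors: every unit of miss counts equally."""
    return sum(abs(e) for e in errors)


def total_squared_error(errors):
    """Sum of squared errors (the OLS criterion): big misses count more."""
    return sum(e ** 2 for e in errors)


# Hypothetical vertical misses of four candidate lines.
line_a = [1, -1, 1, -1]      # small misses that offset
line_b = [10, -10, 10, -10]  # large misses that also offset
line_c = [2, -2, 2, -2]      # several small misses
line_d = [0, 0, 0, 8]        # one very large miss

for name, errors in [("A", line_a), ("B", line_b), ("C", line_c), ("D", line_d)]:
    print(
        name,
        total_error(errors),
        total_absolute_error(errors),
        total_squared_error(errors),
    )
```

Only the squared-error column separates A from B and also C from D, which is the property the text attributes to OLS.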

Figure 2B.4 Minimizing the Sum of Errors May Yield a Poor Fit. Salary plotted against Goals, with lines A and B. By this standard, the two lines fit the points equally well.

Figure 2B.5 A Few Big Misses Are Worse than Many Small Ones. Salary plotted against Goals, with lines C and D. In general, economists prefer many small misses to a few large ones.

Economists call the OLS estimate of the line relating salary to goals a simple regression, because it assumes that there is a simple explanation for why some players make more than others: They score more goals. Here is the output from one such simple regression:²³

Salary = 1,291,215 + 92,297 * Goals

In this equation, the coefficient 1,291,215 is the intercept term. It is the salary a player receives if he does not score any goals (Goals = 0). The coefficient 92,297 represents the slope term. It shows the impact that scoring an extra goal has on salary. It says that each goal scored adds a little over $92,000 to a player’s salary. This model thus predicts that Alex Ovechkin will make about 1,291,215 + 92,297 × 32 = $4,244,719.
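The prediction is simple arithmetic on the estimated coefficients, as this short sketch shows. The function name is our own; the coefficients and the 32-goal figure come from the text.

```python
# Coefficients from the simple regression reported in the text.
INTERCEPT = 1_291_215    # predicted salary with zero goals
SLOPE_PER_GOAL = 92_297  # predicted salary change per additional goal


def predicted_salary(goals):
    """Predicted salary from the simple goals-only regression."""
    return INTERCEPT + SLOPE_PER_GOAL * goals


ovechkin = predicted_salary(32)
print(f"${ovechkin:,}")  # $4,244,719
```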

23. The sample in this case is 2010–2011 NHL players who played at least 20 games and had averaged at least 5 minutes per game on the ice.

Figure 2B.6 We Have More Confidence When the Points Are Close to the Line. Panels (a) and (b), each plotting Salary against Goals.

We cannot, however, be certain that a player’s salary will actually rise by about $92,297 per goal scored. Figure 2B.6 shows two different sets of points that both lead to the same slope term. While the estimate is the same for each, we are far more confident of our results in Figure 2B.6a. Statisticians measure their confidence in their estimates with a variable called the standard error. We shall not derive the formula for the standard error; we simply say that the closer the standard error of a coefficient is to zero, the more confident we are that our estimate accurately reflects the true value. A good rule of thumb is to look for a standard error that is no more than half the size of the coefficient. Computer programs generally compute the ratio of the coefficient to the standard error, a value called the t-statistic. Since we want the standard error to be no more than about half the value of the coefficient, we look for a t-statistic that is greater than 2.0. Most economics papers report the t-values in parentheses below the coefficients like this:

Salary = 1,291,215 + 92,297 * Goals
          (13.30)     (13.83)

In this case, both t-values are much greater than 2.0, so we can be confident that the true values of both the constant and the slope terms are not zero and that goals actually do have an impact on salary.
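Because the t-statistic is just the coefficient divided by its standard error, we can back out the standard errors implied by the reported t-values and check them against the rule of thumb. The standard errors computed here are derived from the reported figures, not taken from the regression output itself, and the function names are our own.

```python
def t_statistic(coefficient, standard_error):
    """Ratio of a coefficient to its standard error."""
    return coefficient / standard_error


def passes_rule_of_thumb(t_value, threshold=2.0):
    """True when the standard error is less than about half the coefficient."""
    return abs(t_value) > threshold


# (coefficient, reported t-value) pairs from the simple regression in the text.
reported = {"Intercept": (1_291_215, 13.30), "Goals": (92_297, 13.83)}

# Standard errors implied by t = coefficient / standard error.
implied_se = {name: coef / t for name, (coef, t) in reported.items()}

for name, (coef, t) in reported.items():
    print(f"{name}: se is about {implied_se[name]:,.0f}, passes = {passes_rule_of_thumb(t)}")
```

The implied standard error on goals is well under half of 92,297, which is why the t-value so comfortably exceeds 2.0.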

Multiple Regression and Dummy Variables

As noted earlier, we can probably make our measurement more accurate by including other variables that affect a player’s salary. In fact, failing to include a key variable such as assists may cause our coefficient on goals to be off target, a problem statisticians call bias. We call a regression that has several explanatory variables a multiple regression, reflecting the fact that a dependent variable (in our case, salaries) may be affected by multiple factors. The results of a multiple regression look very much like those of a simple regression. In this case, we find

Salary = 755,677 + 18,436 * Goals + 74,820 * Assists
         (7.74)    (2.03)           (11.35)

The interpretation of the coefficients becomes a bit more complex in a multiple regression. Now the coefficient on goals, 18,436, reflects the impact of an additional goal on a player’s salary holding the number of assists constant. It allows us to say that if two players have the same number of assists (and any other factor one might include) but one of the players has 10 more goals than the other, we expect the player with more goals to earn about $184,000 per year more than the other.
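The "holding assists constant" comparison can be checked directly. In this sketch the two players and their goal and assist totals are hypothetical; only the coefficients come from the regression reported above.

```python
# Coefficients from the goals-and-assists regression reported in the text.
COEF = {"intercept": 755_677, "goals": 18_436, "assists": 74_820}


def predicted_salary(goals, assists):
    """Predicted salary from the two-variable regression."""
    return COEF["intercept"] + COEF["goals"] * goals + COEF["assists"] * assists


# Two hypothetical players with identical assists but a 10-goal difference.
same_assists = 30
player_a = predicted_salary(goals=20, assists=same_assists)
player_b = predicted_salary(goals=10, assists=same_assists)

gap = player_a - player_b
print(f"${gap:,}")  # $184,360
```

The gap does not depend on the number of assists chosen, as long as it is the same for both players.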

While standard errors and t-statistics give a good idea as to how well specific variables explain the data, they do not tell how good a job the regression as a whole does. Fortunately, most regression packages provide several overall measures of the quality of the regression. The most intuitive measure of a regression’s “quality of fit” is its R². The value of the R² tells us how much of the variation in the dependent variable can be explained by the explanatory variables in the regression. In the above regressions, for example, the R² rises from 0.21 to 0.34 when we add assists as an explanatory variable. This tells us that goals alone explain about 21 percent of the variation in salary, while goals and assists combined explain about 34 percent. Using both goals and assists improves the regression because the R² of the second regression is closer to 1, meaning it comes closer to explaining 100 percent of the variation in salary.
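As a rough illustration of what R² measures, the sketch below uses the standard definition: one minus the ratio of unexplained variation (squared residuals) to total variation around the mean. The data and fitted values are hypothetical and far smaller than the NHL sample; the fitted values are those of the OLS line 2.2 + 0.6x for x = 1, ..., 5.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    unexplained = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - unexplained / total


# Hypothetical observations and the fitted values from an OLS line.
actual = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]

print(f"R^2 = {r_squared(actual, fitted):.2f}")  # R^2 = 0.60
```

A regression whose fitted values matched every observation exactly would have no unexplained variation and an R² of 1.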

One additional variable we might want to include in our multiple regression is the player’s position. Neither goals nor assists are as important for defensemen, whose primary responsibility is to prevent scoring by the other team, as they are to an offensive player. We cannot add a player’s position, however, in the same way that we would add the number of goals or assists he has. A player’s position, like a worker’s sex or race, is a qualitative variable; it does not have an obvious numerical value. To include position in our regression, we must first create a dummy variable. Dummy variables assign numerical values to qualitative variables.

In this case, we let the dummy variable equal zero if the player was not a defenseman and one if the player was a defenseman. This changes the regression to

Salary = 418,312 + 45,715 * Goals + 64,470 * Assists + 694,374 * Defenseman
         (3.55)    (4.35)           (9.46)             (4.92)

Since the variable Defenseman equals zero for all players who do not play defense, the coefficient has no impact for them. We can think of the coefficient as the impact of playing defense, ceteris paribus: the impact of playing defense for a player who scores a given number of goals and who has a given number of assists. These results suggest that defensemen are paid a premium of almost $700,000. This does not mean that defensemen are more valuable to hockey teams. Defensemen are less likely to score goals or have assists than wings or centers. An offensive player who scores about seven more goals and has six more assists than a defensive player more than makes up the difference of the dummy variable. As expected, adding a player’s position improves the quality of the regression: the R² rises to 0.36.
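The dummy variable and the "seven goals and six assists" comparison can be sketched as follows. The position labels and the two players' statistics are hypothetical; the coefficients are those of the regression with the Defenseman dummy.

```python
# Coefficients from the regression that includes the Defenseman dummy.
COEF = {"intercept": 418_312, "goals": 45_715, "assists": 64_470, "defenseman": 694_374}


def defenseman_dummy(position):
    """Turn a qualitative position into a 0/1 variable (labels are assumed)."""
    return 1 if position == "defenseman" else 0


def predicted_salary(goals, assists, position):
    """Predicted salary from the regression with the position dummy."""
    return (
        COEF["intercept"]
        + COEF["goals"] * goals
        + COEF["assists"] * assists
        + COEF["defenseman"] * defenseman_dummy(position)
    )


# Hypothetical players: the forward has 7 more goals and 6 more assists.
defender = predicted_salary(goals=5, assists=15, position="defenseman")
forward = predicted_salary(goals=12, assists=21, position="center")

print(f"defender: ${defender:,}")  # defender: $2,308,311
print(f"forward:  ${forward:,}")   # forward:  $2,320,762
```

The extra goals and assists are worth 7 × 45,715 + 6 × 64,470 = $706,825, slightly more than the $694,374 premium, so the forward's predicted salary edges out the defender's.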
