
4.5.1 Data Preparation and Variables

We collected the following data for the 2015 College Football season (September 1st to December 31st): tweets from @cfpteamtix,[11] detailed game information (i.e., day/time, location, rivalry, conference game, final score, betting market odds), and rankings (i.e., AP Poll and Selection Committee rankings).[12]

The regular season lasted 13 weeks (from September 3rd to November 29th). After it ended, the conference championship games were played within the following week. On December 6th, the Selection Committee announced the semi-finals (i.e., Clemson vs. Oklahoma and Alabama vs. Michigan State), which were to be played on

[11] Twitter’s Advanced Search feature does not allow us to reach all tweets of an account, although it goes back to the date the account was opened. Therefore, we use twimemachine.com, which displays all tweets up to a certain point in the past, to periodically scrape the @cfpteamtix feed.

[12] As mentioned earlier, the Selection Committee announces rankings starting on November 3rd. We

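Table 4.3 Number of Tweets and Related Seat Capacity by Team, Zone, and Tweet Type

| Team | Transactions Zone100 Tweets | Transactions Zone100 Seats | Transactions Zone400 Tweets | Transactions Zone400 Seats | Offers Zone100 Tweets | Offers Zone100 Seats | Offers Zone400 Tweets | Offers Zone400 Seats |
|---|---|---|---|---|---|---|---|---|
| Alabama | 78 | 193 | 70 | 166 | 53 | 188 | 37 | 123 |
| Baylor | 37 | 86 | 40 | 117 | 21 | 51 | 19 | 60 |
| Clemson | 94 | 249 | 93 | 252 | 55 | 164 | 38 | 87 |
| Iowa | 30 | 90 | 38 | 104 | 5 | 17 | 19 | 52 |
| Louisiana St | 82 | 233 | 76 | 247 | 23 | 58 | 24 | 89 |
| Michigan St | 4 | 4 | 69 | 214 | 15 | 48 | 23 | 77 |
| Mississippi | 18 | 67 | 29 | 99 | 16 | 43 | 18 | 75 |
| Notre Dame | 76 | 208 | 88 | 235 | 49 | 154 | 59 | 188 |
| Ohio State | 57 | 150 | 66 | 184 | 53 | 174 | 47 | 135 |
| Oklahoma | 45 | 132 | 39 | 124 | 3 | 7 | 16 | 59 |
| TOTAL | 521 | 1412 | 608 | 1742 | 293 | 904 | 300 | 945 |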
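The column totals of Table 4.3, and the aggregate figures quoted in the text (tweet counts, seat volumes, and the leading team for each tweet type), can be cross-checked with a few lines of Python. The sketch below only re-enters the numbers printed in the table; the names `TABLE_4_3`, `column_totals`, and `seats_by_team` are illustrative and not part of the original analysis.

```python
# Each row of Table 4.3, in column order:
# transactions (Zone100 tweets, Zone100 seats, Zone400 tweets, Zone400 seats),
# offers       (Zone100 tweets, Zone100 seats, Zone400 tweets, Zone400 seats)
TABLE_4_3 = {
    "Alabama":      (78, 193, 70, 166, 53, 188, 37, 123),
    "Baylor":       (37, 86, 40, 117, 21, 51, 19, 60),
    "Clemson":      (94, 249, 93, 252, 55, 164, 38, 87),
    "Iowa":         (30, 90, 38, 104, 5, 17, 19, 52),
    "Louisiana St": (82, 233, 76, 247, 23, 58, 24, 89),
    "Michigan St":  (4, 4, 69, 214, 15, 48, 23, 77),
    "Mississippi":  (18, 67, 29, 99, 16, 43, 18, 75),
    "Notre Dame":   (76, 208, 88, 235, 49, 154, 59, 188),
    "Ohio State":   (57, 150, 66, 184, 53, 174, 47, 135),
    "Oklahoma":     (45, 132, 39, 124, 3, 7, 16, 59),
}
REPORTED_TOTAL = (521, 1412, 608, 1742, 293, 904, 300, 945)


def column_totals(table):
    """Sum every column across teams."""
    return tuple(sum(col) for col in zip(*table.values()))


def seats_by_team(table, kind):
    """Seat volume per team, summed over both zones, for one tweet type."""
    offset = 0 if kind == "transactions" else 4
    return {team: row[offset + 1] + row[offset + 3] for team, row in table.items()}


totals = column_totals(TABLE_4_3)
transaction_seats = seats_by_team(TABLE_4_3, "transactions")
offer_seats = seats_by_team(TABLE_4_3, "offers")

print("Totals match the TOTAL row:", totals == REPORTED_TOTAL)
print("Transactions:", totals[0] + totals[2], "tweets,", totals[1] + totals[3], "seats")
print("Offers:", totals[4] + totals[6], "tweets,", totals[5] + totals[7], "seats")
print("Leading teams:", max(transaction_seats, key=transaction_seats.get),
      "and", max(offer_seats, key=offer_seats.get))
```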

December 31st.[13] Our data spans the period from September 1st to December 31st; however, we note that TeamTix were not sold during the conference championship week (from November 30th to December 5th), following a decision by the College Football Playoff.

After combining these data sources, we choose 10 teams with high tweet volumes: Alabama, Baylor, Clemson, Iowa, Louisiana State, Michigan State, Mississippi, Notre Dame, Ohio State, and Oklahoma. Table 4.3 presents the number of tweets and the related seat capacity (which we call volume) for each team, seating zone, and tweet type (transactions vs. offers). For market transactions, we have a total of 1129 tweets for a seat volume of 3154 (Clemson leads with 501 seats). For trade offers, we have a total of 593 tweets for a seat volume of 1849 (Notre Dame leads with 342 seats). We share a weekly snapshot of Alabama’s regular season in Appendix C. We use daily volume (i.e., actual capacity, not the number of tweets) for market

[13] Winners of the semi-final match-ups, Clemson and Alabama, faced each other in the National

Table 4.4 Variable Descriptions

| Name | Detail |
|---|---|
| $TopWin_{it}$ | 1 if team $i$ won against a top-10 opponent in the last match, 0 o.w. |
| $QualWin_{it}$ | 1 if team $i$ won against an opponent ranked 11-25 in the last match, 0 o.w. |
| $Loss_{it}$ | 1 if team $i$ lost the last match, 0 o.w. |
| $RivalWin_{it}$ | 1 if team $i$ won against a rival in the last match, 0 o.w. |
| $WinAO_{it}$ | 1 if team $i$ won against the odds in the last match, 0 o.w. |
| $Bye_{it}$ | 1 if team $i$ had a bye (i.e., did not play a game) in the last week |
| $Rank_{it}$ | Team $i$’s CFP ranking for a given day (AP Poll ranking when not available) |
| $RankMom_{it}$ | Last change in the ranking of team $i$ |
| $EnterTop4_{it}$ | 1 if team $i$ entered the Top 4 in the last ranking announcement, 0 o.w. |
| $LeaveTop4_{it}$ | 1 if team $i$ left the Top 4 in the last ranking announcement, 0 o.w. |
| $SALES_{ikt}$ | The number of seats involved in transactions for team $i$ in zone $k$ on day $t$ |
| $OFFERS_{ikt}$ | The number of seats involved in offers for team $i$ in zone $k$ on day $t$ |

transactions and trade offers (for each team and zone) as our dependent variables. All variables are defined in Table 4.4. Although our variables are at the daily level, the values of our independent variables change only after games or ranking announcements.
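As an illustration of how the daily dependent variables could be assembled, the sketch below sums seat counts per day for one team, zone, and tweet type, and fills days without tweets with zero so that the series is a proper count variable. The tweet records and the function name `daily_volume` are hypothetical; the actual parsing of the @cfpteamtix feed is not described here.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical parsed tweets: (day, team, zone, tweet type, seats)
tweets = [
    (date(2015, 10, 3), "Alabama", "Zone100", "transaction", 2),
    (date(2015, 10, 3), "Alabama", "Zone100", "transaction", 4),
    (date(2015, 10, 3), "Alabama", "Zone400", "offer", 3),
    (date(2015, 10, 5), "Alabama", "Zone100", "transaction", 1),
]


def daily_volume(records, tweet_type, team, zone, start, end):
    """Seats per day (not tweets per day) for one team, zone, and tweet type."""
    totals = defaultdict(int)
    for day, rec_team, rec_zone, rec_type, seats in records:
        if (rec_team, rec_zone, rec_type) == (team, zone, tweet_type):
            totals[day] += seats
    n_days = (end - start).days + 1
    # Days with no matching tweet get a volume of zero.
    return [totals.get(start + timedelta(days=k), 0) for k in range(n_days)]


sales = daily_volume(tweets, "transaction", "Alabama", "Zone100",
                     date(2015, 10, 3), date(2015, 10, 5))
offers = daily_volume(tweets, "offer", "Alabama", "Zone400",
                      date(2015, 10, 3), date(2015, 10, 5))
print(sales, offers)
```

The two tweets on October 3rd count as a volume of six seats rather than two observations, which matches the distinction between capacity and number of tweets made above.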

4.5.2 Count-data models: Poisson and Negative Binomial

In this section, we explain two count-data models used in our analysis. Our natural starting point is the Poisson model. The Poisson model has a probability mass function (where µ is the rate parameter):

$$\Pr(Y = y) = \frac{e^{-\mu}\mu^{y}}{y!}, \quad y = 0, 1, 2, \ldots \tag{4.7}$$

with $E(Y) = Var(Y) = \mu$ (the equidispersion property, i.e., mean-variance equality). We use the following mean parametrization:

$$\mu = \exp(x'\beta) \tag{4.8}$$

When the equidispersion property is violated, we use the Negative Binomial distribution, whose probability mass function is (where $\Gamma(\cdot)$ denotes the gamma integral and $\alpha$ denotes its variance parameter):

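$$\Pr(Y = y \mid \mu, \alpha) = \frac{\Gamma(\alpha^{-1} + y)}{\Gamma(\alpha^{-1})\,\Gamma(y + 1)} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}} \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y}, \quad y = 0, 1, 2, \ldots \tag{4.9}$$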

with $E(Y \mid \mu, \alpha) = \mu$ and $Var(Y \mid \mu, \alpha) = \mu(1 + \alpha\mu)$. The Negative Binomial model allows the use of the same mean parametrization, $\mu = \exp(x'\beta)$, and leaves $\alpha$ as a constant.
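The moment properties of the two distributions can be checked numerically. The sketch below evaluates the Poisson and Negative Binomial probability mass functions on the log scale and recovers the mean and variance by direct summation; the values $\mu = 3$ and $\alpha = 0.5$ are arbitrary illustrations, and the truncation point of the sum is an assumption that is harmless for parameters of this size.

```python
import math


def poisson_pmf(y, mu):
    """Poisson pmf, computed on the log scale for numerical stability."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def negbin_pmf(y, mu, alpha):
    """Negative Binomial pmf with mean mu and variance mu * (1 + alpha * mu)."""
    r = 1.0 / alpha  # alpha^{-1}
    log_p = (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def mean_and_variance(pmf, upper=500):
    """Mean and variance by summing over y = 0, ..., upper - 1."""
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


mu, alpha = 3.0, 0.5  # illustrative values only
pois_mean, pois_var = mean_and_variance(lambda y: poisson_pmf(y, mu))
nb_mean, nb_var = mean_and_variance(lambda y: negbin_pmf(y, mu, alpha))

print("Poisson mean and variance:", pois_mean, pois_var)
print("Negative Binomial mean and variance:", nb_mean, nb_var)
print("mu * (1 + alpha * mu):", mu * (1 + alpha * mu))
```

Both distributions share the mean $\mu$, while the Negative Binomial variance exceeds it by the factor $1 + \alpha\mu$, which is why $\alpha$ captures overdispersion.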

4.5.3 Empirical Methodology

As for our empirical approach, we start with the simple Poisson model. We next use the Poisson Maximum Likelihood Estimator with a robust estimate of the variance-covariance matrix of the estimator, which relaxes the equidispersion assumption. Then, we test for equidispersion, $Var(y \mid x) = E(y \mid x)$. To do this, we follow Cameron and Trivedi (2009) and run an auxiliary regression of the generated dependent variable, $\{(y - \hat{\mu})^2 - y\}/\hat{\mu}$, on $\hat{\mu}$ without an intercept term, and perform a t test of whether the coefficient of $\hat{\mu}$ is zero. If the coefficient of $\hat{\mu}$ is significantly different from zero, the test suggests the presence of overdispersion. In that case, we use the Negative Binomial model, which explicitly models the overdispersion. Finally, we use a Likelihood Ratio test of the hypothesis $H_0: \alpha = 0$.
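The auxiliary-regression test can be sketched in plain Python. The example below is an illustration under stated assumptions, not the estimation used in this chapter: it fits a Poisson regression with an intercept and a single hypothetical dummy regressor by Newton-Raphson, builds the generated variable $\{(y - \hat{\mu})^2 - y\}/\hat{\mu}$, and regresses it on $\hat{\mu}$ without an intercept. The data are made up and deliberately overdispersed, and the function names are hypothetical.

```python
import math


def fit_poisson(x, y, iterations=50, tol=1e-12):
    """Newton-Raphson for a Poisson regression with mu = exp(b0 + b1 * x)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector X'(y - mu)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix X' diag(mu) X, a 2 x 2 matrix inverted by hand
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def overdispersion_test(y, mu_hat):
    """OLS of {(y - mu)^2 - y} / mu on mu without an intercept; returns (coef, t)."""
    z = [((yi - mi) ** 2 - yi) / mi for yi, mi in zip(y, mu_hat)]
    sxx = sum(mi * mi for mi in mu_hat)
    coef = sum(mi * zi for mi, zi in zip(mu_hat, z)) / sxx
    ssr = sum((zi - coef * mi) ** 2 for zi, mi in zip(z, mu_hat))
    se = math.sqrt(ssr / (len(y) - 1) / sxx)  # one regressor, no constant
    return coef, coef / se


# Made-up counts with many zeros and a few large values (overdispersed)
x = [0] * 10 + [1] * 10
y = [0, 0, 0, 0, 8, 0, 0, 0, 0, 12] + [0, 0, 20, 0, 0, 0, 0, 0, 30, 0]

b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
alpha_coef, t_stat = overdispersion_test(y, mu_hat)
print("coefficient of mu_hat:", alpha_coef, "t statistic:", t_stat)
```

A positive and significant coefficient points to overdispersion of the form $Var(y \mid x) = \mu(1 + \alpha\mu)$, the variance function of the Negative Binomial model. The standard error here is the classical OLS one; a robust version would be a reasonable alternative.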

We estimate these models in Stata 14 with different sets of independent variables, ranging from a model with only control variables to a full model:

• Model 1 (base model) includes team and week fixed effects, and the number of days since the last match (or the bye day) of team $i$: $Team_i$, $Week_t$, $Lag_{it}$

• Model 2: Model 1 and game result variables ($TopWin_{it}$, $QualWin_{it}$, $Loss_{it}$, $RivalWin_{it}$, $WinAO_{it}$, $Bye_{it}$)

• Model 4: Model 1 and Top 4 related variables ($EnterTop4_{it}$, $LeaveTop4_{it}$)

• Model 5: Model 1, game result and ranking variables

• Model 6: Model 1, game result, ranking and Top 4 related variables
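The nested structure of these specifications can be written down as lists of regressors, which makes it easy to loop over the models when estimating. The sketch below is illustrative only: the grouping of $Rank_{it}$ and $RankMom_{it}$ as the "ranking variables" is an assumption based on Table 4.4, the identifiers are plain-text stand-ins for the variable names, and only the models listed above are included.

```python
# Variable groups; the RANKING group is an assumption based on Table 4.4.
BASE = ["Team_i", "Week_t", "Lag_it"]
GAME_RESULT = ["TopWin_it", "QualWin_it", "Loss_it",
               "RivalWin_it", "WinAO_it", "Bye_it"]
RANKING = ["Rank_it", "RankMom_it"]
TOP4 = ["EnterTop4_it", "LeaveTop4_it"]

# Only the specifications listed in the text.
MODELS = {
    "Model 1": BASE,
    "Model 2": BASE + GAME_RESULT,
    "Model 4": BASE + TOP4,
    "Model 5": BASE + GAME_RESULT + RANKING,
    "Model 6": BASE + GAME_RESULT + RANKING + TOP4,
}

for name, regressors in MODELS.items():
    print(name, "has", len(regressors), "regressors:", ", ".join(regressors))
```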
