Let Fo( ) be the unknown true probability distribution of Z. For ` = 1; :::; L with a …xed
…nite L, let F`;o(jx`) be the unknown true conditional probability distribution of Z ` given X` = x`, where Z ` denotes the components of Z not in the conditioning variable X`, and hence Z = (Z0 `; X`0)0. The model (1) - (2) can be rewritten
Z
g (z; o; h1;o( ); :::; hL;o( )) dFo(z) = 0; (48) Z
`(z `; x`; h`;o(x`)) dF`;o(z `jx`) = 0 for almost all x`, ` = 1; :::; L: (49) We note that although the unknown functions h`;o( ); ` = 1; :::; Lenter the conditional moment restrictions (2) (i.e., (49)) through h`;o(X`) only, they could enter the unconditional moment restrictions (1) (i.e., (48)) in a very ‡exible way. We assume that the in…nite dimensional nuisance functions ho( ) = (h1;o( ); :::; hL;o( )) 2 H = H1 HL are identi…ed by the conditional moment restrictions (49), and that if ho( ) were known, the …nite dimensional parameter o 2 is (possibly) over identi…ed by the unconditional moment restrictions (48).
Proof of Lemma 1. For the ease of notation and without loss of generality, we assume in this proof that L = 2. We assume that the regularity condition as in Newey (1990, De…nition A.1) is satis…ed.
Let fo(z)to be the true density of Z with respect to a sigma …nite dominating measure (z), and fo(z `jx`)be the true conditional density of Z ` given X` = x` (` = 1; 2). Here F denotes a class of candidate density function of Z with fo 2 F. De…ne a class of density functions F that satisfy the conditional and unconditional moment conditions:
F = f 2 F : Z
1(z 1; h1(x1)) f (z 1jx1)d (z 1) = 0;
Z
2(z 2; h2(x2)) f (z 2jx2)d (z 2) = 0;
Z
g (z; ; h1; h2) f (z)d (z) = 0 : (50)
We will consider a class of densities of Z indexed by ( ; h1; h2; ), where denotes the parameter that determines the features of the distribution of Z other than the restriction above. More
precisely, let G denote a class of real valued measurable function of Z such that
F = ff (zj ; h1; h2; ) : 2 Gg (51)
for any = ( ; h1; h2) 2 H1 H2. Let V V1 V2 V denote the completion (with respect to the L2-norm) of H1 H2 G f( o; h1;o; h2;o; o)g where o satis…es
f ( zj o; h1;o; h2;o; o) = fo(z):
We will consider the parametric family f ( zj o+ ; h1;o+ 1v1; h2;o+ 2v2; o+ v ) in F . The scores in the direction of , 1, 2, of this family are such that
s (Z) = c ;1( Z 1j X1) + d ;1(X1)
= c ;2( Z 2j X2) + d ;2(X2) ;
sh1(Z) [v1] = ch1;1( Z 1j X1) [v1] + dh1;1(X1) [v1]
= ch1;2( Z 2j X2) [v1] + dh1;2(X2) [v1] ; sh2(Z) [v2] = ch2;1( Z 1j X1) [v2] + dh2;1(X1) [v2]
= ch2;2( Z 2j X2) [v2] + dh2;2(X2) [v2] ;
s (Z) [v ] = c ;1( Z 1j X1) [v ] + d ;1(X1) [v ]
= c ;2( Z 2j X2) [v ] + d ;2(X2) [v ] ; with
E [ c ;1(Z 1; X1)j X1] = 0; (52)
E [d ;1(X1)] = 0; (53)
E [ c ;2(Z 2; X2)j X2] = 0; (54)
E [d ;2(X2)] = 0; (55)
E [ ch1;1( Z 1j X1) [v1]j X1] = 0; (56) E [dh1;1(X1) [v1]] = 0; (57) E [ ch2;1( Z 1j X1) [v1]j X1] = 0; (58) E [dh2;1(X1) [v1]] = 0; (59)
E [ ch2;2( Z 2j X2) [v2]j X2] = 0; (60) E [dh2;2(X2) [v2]] = 0; (61) E [ ch1;2( Z 2j X2) [v2]j X2] = 0; (62) E [dh1;2(X2) [v2]] = 0; (63) and
E [ c ;1(Z 1; X1) [v ]j X1] = 0; (64)
E [d ;1(X1) [v ]] = 0; (65)
E [ c ;2(Z 2; X2) [v ]j X2] = 0; (66)
E [d ;2(X2) [v ]] = 0: (67)
Here, ch1( Z 1j X1) [v1] and dh1(X1) [v1] denote the conditional score of Z 1 given X1 and the marginal score of X1, obtained by di¤erentiating the log-likelihood with respect to 1, for example. Note that s (Z) is a d 1 vector of functions. Below, we will write ch1(Z) [v1] ch1( Z 1j X1) [v1], e.g., for simplicity of notations.
Di¤erentiating the moment restrictions in (50), we obtain the nonparametric tangent space T as the completion of the set consisting of sh1(Z) [v1] + sh2(Z) [v2] + s (Z) [v ], where s’s satisfy (52) - (67) as well as
E 1(Z; h1;o) c ;1(Z)0 X1 = 0; (68)
@m1(X1; h1;o(X1))
@h01 v1(X1) + E [ 1(Z; h1;o) ch1;1(Z) [v1]j X1] = 0; (69) E [ 1(Z; h1;o) ch2;1(Z) [v2]j X1] = 0; (70) E [ 1(Z; h1;o)c ;1(Z) [v ]j X1] = 0; (71)
E 2(Z; h2;o)c ;2(Z)0 X2 = 0; (72) E [ 2(Z; h2;o)ch1;2(Z) [v1]j X2] = 0; (73)
@m2(X2; h2;o(X2))
@h02 v2(X2) + E [ 2(Z; h2;o)ch2;2(Z) [v2]j X2] = 0; (74) E [ 2(Z; h2;o)c ;2(Z) [v ]j X2] = 0; (75) and
@E [g (Z; o; h1;o; h2;o)]
@ 0 + E g (Z; o; h1;o; h2;o) s (Z)0 = 0; (76) E[g (Z; o; h1;o; h2;o) sh1(Z) [v1]] = 0; (77) E [g (Z; o; h1;o; h2;o) sh2(Z) [v2]] = 0; (78) E [g (Z; o; h1;o; h2;o) s (Z) [v ]] = 0; (79) for any (v1; v2; v )2 V1 V2 V , where @m`(X`@h;h`;o0` (X`)) and v`(X`)are d` d` matrix of functions and d` 1 vector of functions respectively. Note that (14) is used in (77) and (78).
The residual of the projection of s on T , s (Z) proj [s (Z)j T ] will give the semiparametric score S (Z) and the semiparametric information bound of owill be E[S (Z)S (Z)0]. See Bickel et al (1993) and Newey (1990). We show that the residual of the projection of s on T is equal to
S (Z) = @E[g (Z)]
@ 0
0
E g (Z) g (Z)0 1g (Z) (80)
where g (Z) = g (Z; o; h1;o; h2;o).
We now de…ne 1(X1) and 2(X2) as solutions to
0 = E 1(Z; h1;o) c ;1(Z)0 S (Z)0 ch1;1(Z) [ 1] ch2;1(Z) [ 2] X1 (81) and
0 = E 2(Z; h2;o) c ;2(Z)0 S (Z)0 ch1;2(Z) [ 1] ch2;2(Z) [ 2] X2 : (82) Note that for ` = 1; 2, `(X`) is a d` d matrix of functions and
ch`;`(Z) [ `] = ch`;`(Z) `;1 ; : : : ; ch`;`(Z) `;d
is a 1 d vector of functions, where `;j(X`)denotes the j-th row of `(X`)for j = 1; : : : ; d . We argue that such 1(X1) and 2(X2) exist as unique objects almost surely for the fol-lowing reason. Letting v1 = 1;j(X1) in (69) and v2 = 2;j(X2) in (70) for j = 1; : : : ; d , we
get
@m1(X1; h1;o(X1))
@h01 1(X1) + E [ 1(Z; h1;o) ch1;1(Z) [ 1]j X1] = 0 (83) and
E [ 1(Z; h1;o) ch2;1(Z) [ 2]j X1] = 0: (84) Using (68), (80), (83) and (84), we note that
E 1(Z; h1;o)c ;1(Z)0 X1 E 1(Z; h1;o)S (Z)0 X1
E [ 1(Z; h1;o)ch1;1(Z) [ 1]j X1] E [ 1(Z; h1;o)ch2;1(Z) [ 2]j X1]
=0 + @E[g (Z)]
@ 0
0
E g (Z) g (Z)0 1E [ g (Z) 1(Z; h1;o)0j X1] + @m1(X1; h1;o(X1))
@h01 1(X1)
0
+ 0;
so we rewrite (81) as
0 = E 1(Z; h1;o) g (Z)0 X1 E g (Z) g (Z)0 1 @E[g (Z)]
@ 0 + @m1(X1; h1;o(X1))
@h01 1(X1) ; which can be solved for 1(X1) as long as @m1(X1; h1;o(X1))/ @h01 is invertible almost surely.
Similarly, we can solve for 2(X2)as long as @m2(X2; h2;o(X2))/ @h02is invertible almost surely.
Now let
0 = s (Z)0 S (Z)0 sh1(Z) [ 1] sh2(Z) [ 2] :
We will show that satis…es the properties (64)-(67), (71), (75), and (79) of the s (Z) [v ] for some v .
By construction, we have E [ ] = 0. Taking de;1(X1) [v ] = E [ j X1]
= d ;1(X1) (dh1;1(X1) [ 1])0 (dh2;1(X1) [ 2])0
+ E c ;1(Z) S (Z) (ch1;1(Z) [ 1])0 (ch2;1(Z) [ 2])0 X1
= d ;1(X1) (dh1;1(X1) [ 1])0 (dh2;1(X1) [ 2])0 E [ S (Z)j X1] ; and
ec;1(Z) [v ] = de;1(X1) [v ]
= c ;1(X1) (ch1;1(Z) [ 1])0 (ch2;1(Z) [ 2])0 S (Z) + E [ S (Z)j X1] ;
we can see that properties (64) and (65) are satis…ed for
=ec;1(Z) [v ] + ed ;1(X1) [v ] :
Withec;2(Z) [v ]and ed ;2(X2) [v ]similarly de…ned, we can see that properties (66) and (67) are also satis…ed.
Equations (81) implies that
E 1(Z; h1;o)ec;1(z) [v ]0 X1
= E 1(Z; h1;o) c ;1(Z)0 S (Z)0 ch1;1(Z) [ 1] ch2;1(Z) [ 2] X1 + E [ 1(Z; h1;o)j X1] E [ S (Z)0j X1]
= 0:
which implies that the property (71) is satis…ed by . Likewise, (75) are satis…ed by . Using (76)-(78), we obtain
E g (Z)0 = E s (Z) g (Z)0 E S (Z) g (Z)0
= @E[g (Z)]
@ 0
0
+ @E[g (Z)]
@ 0
0
E[g (Z) g (Z)0] 1 E[g (Z) g (Z)0]
= 0; (85)
which shows that the property (79) is satis…ed.
These observations lead us to conclude that
sh1(Z) [ 1] + sh2(Z) [ 2] + 2 T . (86) Because S (Z) is proportional to g (Z), we can deduce from (77)-(79) that S (Z) ? T . Along with (86), this implies that S (Z) is the residual of the projection of s on T . Thus the semiparametric information bound of o is
E[S (Z)S (Z)0] = @E[g (Z)]
@ 0
0
E[g (Z) g (Z)0] 1 @E[g (Z)]
@ 0 : (87)
References
[1] Ackerberg, D, K. Caves, and G. Frazer (2006): “Structural Identi…cation of Production Functions,” mimeo, UCLA.
[2] Ackerberg, D., X. Chen, and J. Hahn (2012): “A Practical Asymptotic Variance Estimator for Two-Step Semiparametric Estimators,” Review of Economics and Statistics 94, 481-498.
[3] Aguirregabiria, V. and P. Mira (2010): “Dynamic Discrete Choice Structural Models: A Survey,” Journal of Econometrics, 156, 38-67.
[4] Ai, C. and X. Chen (2003): “E¢ cient Estimation of Conditional Moment Restrictions Models Containing Unknown Functions,” Econometrica, 71, 1795-1843.
[5] Ai, C. and X. Chen (2007): “Estimation of Possibly Misspeci…ed Semiparametric Con-ditional Moment Restriction Models with Di¤erent Conditioning Variables,” Journal of Econometrics 141, 5-43.
[6] Ai, C. and X. Chen (2012): “The Semiparametric E¢ ciency Bound for Models of Sequential Moment Restrictions Containing Unknown Functions,”Journal of Econometrics 170, 442-457.
[7] Aguirregabiria, V., and P. Mira (2002): “Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models,” Econometrica 70, 1519-1543.
[8] Andrews, D. (1994): “Asymptotics for Semi-parametric Econometric Models via Stochastic Equicontinuity,” Econometrica 62, 43-72.
[9] Arcidiacono, P. and R. Miller (2011): “Conditional Choice Probability Estimation of Dy-namic Discrete Choice Models with Unobserved Heterogeneity,” Econometrica 79, 1823-1868.
[10] Armstrong, T.B., M. Bertanha, and H. Hong (2012): “A Fast Resample Method for Para-metric and SemiparaPara-metric Models,” unpublished working paper.
[11] Bajari, P., L. Benkard and J. Levin (2007): “Estimating Dynamic Models of Imperfect Competition,” Econometrica 75, 1331-1370.
[12] Bajari, P., H. Hong and D. Nekipelov (2012): “Game Theory and Econometrics: A Sur-vey of Some Recent Research,” Advances in Economics and Econometrics: Theory and Applications, Tenth World Congress of Econometric Society.
[13] Bajari, P., H. Hong, J. Krainer and D. Nekipelov (2010): “Estimating Static Models of Strategic Interactions,” Journal of Business & Economic Statistics 28, 469-482.
[14] Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993): E¢ cient and Adaptive Inference in Semiparametric Models. Johns Hopkins University Press, Baltimore.
[15] Chamberlain, G. (1992a) “E¢ ciency Bounds for Semiparametric Regression,” Economet-rica, 60, 567-596.
[16] Chamberlain, G. (1992b) “Comment: sequential moment restrictions in panel data,”Jour-nal of Business and Economic Statistics 10, 20-26.
[17] Chen, X., O. Linton and I. van Keilegom (2003) “Estimation of Semiparametric Models when the Criterion Function is not Smooth,” Econometrica, 71, 1591-1608.
[18] Collard-Wexler, A. (2012): “Demand Fluctuations in the Ready-Mix Concrete Industry,”
forthcoming in Econometrica.
[19] Crepon, B., F. Kramarz, and A. Trognon (1997): “Parameters of Interest, Nuisance Para-meters, and Orthogonality Conditions: An Application to Autoregressive Error Component Models,” Journal of Econometrics 82, 135-156.
[20] Fang, H. and Y. Wang (2012): “Estimating Dynamic Discrete Choice Models with Hy-perbolic Discounting with an Application to Mammorgraphy Decisions,” working paper, University of Pennsylvania.
[21] Ghosh, M., W.C. Parr, and K. Singh (1984): “A Note on Bootstrapping the Sample Median,” Annals of Statistics 12, 1130-1135.
[22] Gonçalves, S., and H. White (2005): “Bootstrap Standard Error Estimates for Linear Regression,” Journal of American Statistical Association 100, 970-979.
[23] Hayashi, F., and C. Sims (1983): “Nearly E¢ cient Estimation of Time Series Models with Predetermined, but not Exogenous Instruments,” Econometrica 51, 783–798.
[24] Hong, H., A. Mahajan, and D. Nekipelov (2010): “Extremum Estimation and Numerical Derivatives,” mimeo, Stanford University.
[25] Hotz, J. and R. Miller (1993): “Conditional Choice Probabilities and the Estimation of Dynamic Models,” Review of Economic Studies 60, pp. 497–529.
[26] Hotz, J., Miller, R., Sanders, S., and J. Smith (1994): “A Simulation Estimator for Dy-namic Models of Discrete Choice,” Review of Economic Studies 61, pp. 265-289.
[27] Hansen, L.P. (1982): “Large Sample Properties of Generalized Method of Moments Esti-mators,” Econometrica, 50, 1029-1054.
[28] Kasahara, H., and K. Shimotsu (2009): “Nonparametric Identi…cation and Estimation of Finite Mixture Models of Dynamic Discrete Choices,” Econometrica 77, 135-175.
[29] Levinsohn, J. and A. Petrin (2003): “Estimating Production Functions using Inputs to Control for Unobservables,” Review of Economic Studies 70, 317-341
[30] Machado, J.A.F., and P. Parente (2005): “Bootstrap Estimation of Covariance Matrices via the Percentile Method,” Econometrics Journal 8, 70-78.
[31] Murphy, K. M. and R. H. Topel (1985): “Estimation and Inference in Two-Step Econo-metric Models,” Journal of Business and Economic Statistics 3, pp. 370 –379.
[32] Newey, W.K. (1984): “A Method of Moments Interpretation of Sequential Estimators,”
Economics Letters 14, pp. 201 –206.
[33] Newey, W.K. (1990): “Semiparametric E¢ ciency Bounds,”Journal of Applied Economet-rics 5, 99 –135.
[34] Newey, W.K. (1994): “The Asymptotic Variance of Semiparametric Estimators,” Econo-metrica 62, 1349 –1382.
[35] Newey, W.K., and J.L. Powell (1999): “Two-step Estimation, Optimal Moment Condi-tions, and Sample Selection Models,” working paper, MIT.
[36] Newey, W. (2004): “E¢ cient Semiparametric Estimation via Moment Restrictions,”
Econometrica, 72, 1877-1897.
[37] Olley, G. and A. Pakes (1996): “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica 64, pp. 1263–1297.
[38] Pakes, A. and G. Olley (1995): “A Limit Theorem for a Smooth Class of Semiparametric Estimators,” Journal of Econometrics 65, 295–332.
[39] Pakes, A., Ostrovsky, M, and S. Berry (2007): “Simple Estimators for the Parameters of Discrete Dynamic Games, with Entry/Exit Examples,” RAND Journal of Economics 38, 373-399.
[40] Pesendorfer, M. and P. Schmidt-Dengler (2008): “Asymptotic Least Squares Estimators for Dynamic Games,” Review of Economic Studies 75, 901–928.
[41] Ritov, Y. and P.J. Bickel (1990): “Achieving Information Bounds in Non and Semipara-metric Models,” Annals of Statistics 18, 925–938.
[42] Ryan, S. (2012): “The Costs of Environmental Regulation in a Concentrated Industry,”
Econometrica 80, 1019–1061.
[43] Shao, J. (1992): “Bootstrap Variance Estimators with Truncation,”Statistics & Probability Letters 15, 95-101.
[44] Wooldridge, J.M. (2002): Econometric Analysis of Cross Section and Panel Data, Cam-bridge: MIT Press.
[45] Wooldridge, J.M. (2009): “On Estimating Firm-Level Production Functions Using Proxy Variables to Control for.Unobservables,” Economics Letters 104, 112-114
[46] Wu, C.F.J. (1986): “Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis,” Annals of Statistics 14, 1261-1295.