Dear Dan,

> Here is the problem that has come up repeatedly here:
>
> Users want an insane number of variables in panel estimation. It isn't
> enough to have a separate intercept for each county or hospital in the
> US, or each person in the PSID or NLSY. Stata can handle those directly
> or with some variation of taking out means. Now users want to have that
> plus to interact the dummies with other variables, which multiplies the
> number of variables in the regression, and which I don't know how to do
> without putting all the nuisance variables into the regression equation.
> Any suggestions would be appreciated.

I found my notes on this yesterday, and there is a fairly simple
solution which you can use immediately. Say you want to estimate
the model

   Y = X*B + Z*G_i

where X are the variables with constant coefficients, and Z are the
variables with different coefficients for each individual.

You can get the estimate of B by regressing transformed Y on transformed X,
where the transformed variables are the residuals from regressing Y and
each X on Z with individual coefficients. This can be done easily in TSP
with PANEL(BYID), and perhaps it can be done in Stata as well, if people
prefer to use Stata. I programmed a PROC in TSP to automate it - see the
example below. (I'll send grunfeld.txt in a separate email.)

Sincerely,
Clint

--------------

options double crt;
name gcoefi 'coefficients varying by i';
? by Clint Cummins 4/06

? Data Source: Grunfeld (1958)
? Description: Panel Data, 10 U.S. firms over 20 years, 1935-1954.
? Variables:
doc FN 'Firm Number';
doc YR 'Year 1935-1954';
doc I 'Annual real gross investment';
doc F 'Real value of the firm (shares outstanding)';
doc K 'Real value of the capital stock';

list vars FN YR I F K;
const n,10 t,20 ystart,1935;
set nt=n*t;
smpl 1,nt;
read(file='grunfeld.txt') vars;
freq(panel,n=n,t=t,id=FN,time=YR,start=ystart) a;

? example with model:
?    I = F*b + a_i + K*g_i

title 'direct results with OLSQ';
dummy fn;
dot 1-10;
   k. = k*fn.;
enddot;
olsq i f fn1-fn10 k1-k10;

title 'results with transformed variables using panel(byid)';
? Method:
?  Regress dependent variable and all RHS variables with nonvarying
?  coefficients on the other variables, using panel(byid).
?  The transformed variables are the residuals from these byid regressions.
?  In the example below, I is the dependent variable, and F is the
?  only RHS variable with a constant coefficient.
?  Then regress using the transformed variables, and make a d.f.
?  correction to the SEs to reflect the extra estimated coefficients.
panel(noall,byid,silent) i c k;
i_t = @resi;
rename @coefi g_i;
panel(noall,byid,silent) f c k;
f_t = @resi;
rename @coefi g_f;
olsq i_t f_t;

title 'd.f. correction';
? 2*10 = (2 variables in Z: c and k) * (10 firms) nuisance coefficients
mat v = @vcov*(@nob-1)/(@nob-1-2*10);
tstats(names=@rnms) @coef v;

title 'nuisance coefficients';
mat gamma = g_i - g_f*@coef;
print gamma;

? Automated version, with PROC COEFI
list X f;
list Z c k;
Coefi i X Z B G RESI SSRI LOGLI 1;
print B G;
print SSRI LOGLI;

Proc Coefi Y X Z B G RESI SSRI LOGLI IFPRINT;
? Estimate the model Y = X*B + Z*G_i
?  X and Z are lists of variables
?  (Assumes FREQ(PANEL) is in effect)
?  Also assumes no missing data.
?  This version assumes that B are the primary coefficients of
?  interest, so SEs for the G_i are not computed, although they
?  could be.
? by Clint Cummins 4/06

? Create transformed variables and save coefs
   local y_t g_y X_t;
   panel(noall,byid,silent) y Z;
   y_t = @resi;
   rename @coefi g_y;
   dot X;
      local ._t g_.;
      panel(noall,byid,silent) . Z;
      ._t = @resi;
      rename @coefi g_.;
   enddot;
   local nz_ni;
   ? @AICI = -@LOGLI + NZ*NI, so:
   set nz_ni = @AICI + @LOGLI;
   list(suffix=_t) X_t X;

? Regression with transformed variables
   olsq(silent) y_t X_t;
   mat B = @COEF;
   RESI = @RES;
   set SSRI = @SSR;
   set LOGLI = @LOGL;

? d.f. correction
   if (IFPRINT); then; do;
      local v;
      mat v = @vcov*(@nob-@ncoef)/(@nob-@ncoef-nz_ni);
      tstats(names=X) @coef v;
   enddo;

? nuisance coefficients
   mat G = g_y;
   dot(index=j) X;
      mat G = G - g_.*B(j);
   enddot;
Endproc;
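
--------------

For anyone without TSP, the same transformation can be sketched in Python with NumPy. This is only an illustration of the method described above, not a port of the PROC. The data are simulated (it does not read grunfeld.txt), and the names residualize_by_id, coefi_sketch and direct_ols are made up for this example. It residualizes y and X on Z separately for each individual, regresses the residuals, applies the same d.f. correction for the NZ*NI nuisance coefficients, and compares the result with the direct regression on the full set of interacted dummies.

```python
import numpy as np


def residualize_by_id(v, Z, ids):
    """Residuals from regressing v on Z separately for each individual."""
    out = np.empty_like(v, dtype=float)
    for g in np.unique(ids):
        m = ids == g
        coef, *_ = np.linalg.lstsq(Z[m], v[m], rcond=None)
        out[m] = v[m] - Z[m] @ coef
    return out


def coefi_sketch(y, X, Z, ids):
    """Estimate B in y = X*B + Z*G_i from by-individual residuals."""
    y_t = residualize_by_id(y, Z, ids)
    X_t = np.column_stack(
        [residualize_by_id(X[:, j], Z, ids) for j in range(X.shape[1])]
    )
    b, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
    resid = y_t - X_t @ b
    nobs, kx = X_t.shape
    n_nuisance = Z.shape[1] * np.unique(ids).size  # NZ*NI
    # d.f. correction: also count the nuisance coefficients already removed
    s2 = resid @ resid / (nobs - kx - n_nuisance)
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X_t.T @ X_t)))
    return b, se


def direct_ols(y, X, Z, ids):
    """Same model, estimated directly with every Z column times every dummy."""
    groups = np.unique(ids)
    D = (ids[:, None] == groups[None, :]).astype(float)
    ZD = np.column_stack([Z[:, [j]] * D for j in range(Z.shape[1])])
    W = np.column_stack([X, ZD])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ coef
    s2 = resid @ resid / (W.shape[0] - W.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(W.T @ W)))
    kx = X.shape[1]
    return coef[:kx], se[:kx]


# Simulated panel (an assumption for the demo): 10 individuals, 20 periods
rng = np.random.default_rng(0)
n, t = 10, 20
ids = np.repeat(np.arange(n), t)
f = rng.normal(size=n * t)
k = rng.normal(size=n * t)
a_i = rng.normal(size=n)  # individual intercepts
g_i = rng.normal(size=n)  # individual slopes on k
y = 0.5 * f + a_i[ids] + g_i[ids] * k + rng.normal(scale=0.1, size=n * t)

X = f[:, None]
Z = np.column_stack([np.ones(n * t), k])

b_t, se_t = coefi_sketch(y, X, Z, ids)
b_d, se_d = direct_ols(y, X, Z, ids)
print("transformed:", b_t, se_t)
print("direct:     ", b_d, se_d)
```

The two sets of estimates and standard errors should agree up to rounding, which is the point of the method: the large dummy-interaction matrix never has to be formed, only one small regression per individual.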