Stata uses a finite sample correction to reduce downwards bias in the errors due to the finite number of clusters. It is a multiplicative factor on the variance-covariance matrix, $c=frac{G}{G-1} cdot frac{N-1}{N-K}$, where G is the number of groups, N is the number of observations, and K is the number of parameters. I think coeftest
only uses $c'=frac{N-1}{N-K}$ since if I scale R's standard error by the square of the first term in c, I get something pretty close to Stata's standard error:
display 0.21136*(46/(46-1))^(.5)
.21369554
Here's how I would replicate what Stata is doing in R:
require(plm)
require(lmtest)
data(Cigar)
model <- plm(price ~ sales, model = 'within', data = Cigar)
G <- length(unique(Cigar$state))
c <- G/(G - 1)
coeftest(model,c * vcovHC(model, type = "HC1", cluster = "group"))
This yields:
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
sales -1.219563 0.213773 -5.70496 1.4319e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
which agrees with Stata's error of 0.2137726 and t-stat of -5.70.
This code is probably not ideal, since the number of states in the data may be different than the number of states in the regression, but I am too lazy to figure out how to get the right number of panels.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…