Title: | Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics |
Version: | 0.6-5 |
Date: | 2025-01-19 |
Description: | Instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression or by robust-regression via M-estimation (2SM) or MM-estimation (2SMM). The main ivreg() model-fitting function is designed to provide a workflow as similar as possible to standard lm() regression. A wide range of methods is provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Depends: | R (≥ 3.6.0) |
Imports: | car (≥ 3.0-9), Formula, lmtest, MASS, stats |
Suggests: | AER, effects (≥ 4.2.0), knitr, insight, parallel, rmarkdown, sandwich, testthat, modelsummary, gt, ggplot2 |
Encoding: | UTF-8 |
LazyData: | true |
VignetteBuilder: | knitr |
BugReports: | https://github.com/zeileis/ivreg/issues/ |
URL: | https://zeileis.github.io/ivreg/ |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-01-19 09:14:58 UTC; zeileis |
Author: | John Fox |
Maintainer: | Achim Zeileis <Achim.Zeileis@R-project.org> |
Repository: | CRAN |
Date/Publication: | 2025-01-19 10:40:02 UTC |
U.S. Cigarette Demand Data
Description
Determinants of cigarette demand for the 48 continental US States in 1995 and compared between 1995 and 1985.
Usage
data("CigaretteDemand", package = "ivreg")
Format
A data frame with 48 rows and 10 columns.
- packs
Number of cigarette packs per capita sold in 1995.
- rprice
Real price in 1995 (including sales tax).
- rincome
Real per capita income in 1995.
- salestax
Sales tax in 1995.
- cigtax
Cigarette-specific taxes (federal and average local excise taxes) in 1995.
- packsdiff
Difference in
log(packs)
(between 1995 and 1985).- pricediff
Difference in
log(rprice)
(between 1995 and 1985).- incomediff
Difference in
log(rincome)
(between 1995 and 1985).- salestaxdiff
Difference in
salestax
(between 1995 and 1985).- cigtaxdiff
Difference in
cigtax
(between 1995 and 1985).
Details
The data are taken from the online complements to Stock and Watson (2007) and
had been prepared as panel data (in long form) in CigarettesSW
from the AER package (Kleiber and Zeileis 2008). Here, the data are provided by
state (in wide form), readily preprocessed to contain all variables needed for
illustrations of OLS and IV regressions. More related examples from Stock and
Watson (2007) are provided in the AER package in StockWatson2007
.
A detailed discussion of the various cigarette demand examples with R code
is provided by Hanck et al. (2020, Chapter 12).
Source
Online complements to Stock and Watson (2007).
References
Hanck, C., Arnold, M., Gerber, A., and Schmelzer, M. (2020). Introduction to Econometrics with R. https://www.econometrics-with-r.org/
Kleiber, C. and Zeileis, A. (2008). Applied Econometrics with R. Springer-Verlag
Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed., Addison Wesley.
See Also
Examples
## load data
data("CigaretteDemand", package = "ivreg")
## basic price elasticity: OLS vs. IV
cig_ols <- lm(log(packs) ~ log(rprice), data = CigaretteDemand)
cig_iv <- ivreg(log(packs) ~ log(rprice) | salestax, data = CigaretteDemand)
cbind(OLS = coef(cig_ols), IV = coef(cig_iv))
## adjusting for income differences (exogenous)
cig_iv2 <- ivreg(log(packs) ~ log(rprice) + log(rincome) | salestax + log(rincome),
data = CigaretteDemand)
## adding a second instrument for log(rprice)
cig_iv3 <- update(cig_iv2, . ~ . | . + cigtax)
## comparison using heteroscedasticity-consistent standard errors
library("lmtest")
library("sandwich")
coeftest(cig_iv2, vcov = vcovHC, type = "HC1")
coeftest(cig_iv3, vcov = vcovHC, type = "HC1")
## long-run price elasticity using differences between 1995 and 1985
cig_ivdiff1 <- ivreg(packsdiff ~ pricediff + incomediff | incomediff + salestaxdiff,
data = CigaretteDemand)
cig_ivdiff2 <- update(cig_ivdiff1, . ~ . | . - salestaxdiff + cigtaxdiff)
cig_ivdiff3 <- update(cig_ivdiff1, . ~ . | . + cigtaxdiff)
coeftest(cig_ivdiff1, vcov = vcovHC, type = "HC1")
coeftest(cig_ivdiff2, vcov = vcovHC, type = "HC1")
coeftest(cig_ivdiff3, vcov = vcovHC, type = "HC1")
Partly Artificial Data on the U.S. Economy
Description
These are partly contrived data from Kmenta (1986), constructed to illustrate estimation of a simultaneous-equation econometric model. The data are an annual time-series for the U.S. economy from 1922 to 1941. The values of the exogenous variables D, and F, and A are real, while those of the endogenous variables Q and P are simulated according to the linear simultaneous equation model fit in the examples.
Usage
data("Kmenta", package = "ivreg")
Format
A data frame with 20 rows and 5 columns.
- Q
food consumption per capita.
- P
ratio of food prices to general consumer prices.
- D
disposable income in constant dollars.
- F
ratio of preceding year's prices received by farmers to general consumer prices.
- A
time in years.
Source
Kmenta, J. (1986) Elements of Econometrics, 2nd ed., Macmillan.
See Also
Examples
data("Kmenta", package = "ivreg")
deq <- ivreg(Q ~ P + D | D + F + A, data = Kmenta) # demand equation
seq <- ivreg(Q ~ P + F + A | D + F + A, data = Kmenta) # supply equation
summary(deq, tests = TRUE)
summary(seq, tests = TRUE)
U.S. Returns to Schooling Data
Description
Data from the U.S. National Longitudinal Survey of Young Men (NLSYM) in 1976 but using some variables dating back to earlier years.
Usage
data("SchoolingReturns", package = "ivreg")
Format
A data frame with 3010 rows and 22 columns.
- wage
Raw wages in 1976 (in cents per hour).
- education
Education in 1976 (in years).
- experience
Years of labor market experience, computed as
age - education - 6
.- ethnicity
Factor indicating ethnicity. Is the individual African-American (
"afam"
) or not ("other"
)?- smsa
Factor. Does the individual reside in a SMSA (standard metropolitan statistical area) in 1976?
- south
Factor. Does the individual reside in the South in 1976?
- age
Age in 1976 (in years).
- nearcollege
Factor. Did the individual grow up near a 4-year college?
- nearcollege2
Factor. Did the individual grow up near a 2-year college?
- nearcollege4
Factor. Did the individual grow up near a 4-year public or private college?
- enrolled
Factor. Is the individual enrolled in college in 1976?
- married
factor. Is the individual married in 1976?
- education66
Education in 1966 (in years).
- smsa66
Factor. Does the individual reside in a SMSA in 1966?
- south66
Factor. Does the individual reside in the South in 1966?
- feducation
Father's educational attainment (in years). Imputed with average if missing.
- meducation
Mother's educational attainment (in years). Imputed with average if missing.
- fameducation
Ordered factor coding family education class (from 1 to 9).
- kww
Knowledge world of work (KWW) score.
- iq
Normed intelligence quotient (IQ) score
- parents14
Factor coding living with parents at age 14: both parents, single mother, step parent, other
- library14
Factor. Was there a library card in home at age 14?
Details
Investigating the causal link of schooling on earnings in a classical model for wage determinants is problematic because it can be argued that schooling is endogenous. Hence, one possible strategy is to use an exogonous variable as an instrument for the years of education. In his well-known study, Card (1995) uses geographical proximity to a college when growing up as such an instrument, showing that this significantly increases both the years of education and the wage level obtained on the labor market. Using instrumental variables regression Card (1995) shows that the estimated returns to schooling are much higher than when simply using ordinary least squares.
The data are taken from the supplementary material for Verbeek (2004) and are based
on the work of Card (1995). The U.S. National Longitudinal Survey of Young Men
(NLSYM) began in 1966 and included 5525 men, then aged between 14 and 24.
Card (1995) employs labor market information from the 1976 NLSYM interview which
also included information about educational attainment. Out of the 3694 men
still included in that wave of NLSYM, 3010 provided information on both wages
and education yielding the subset of observations provided in SchoolingReturns
.
The examples replicate the results from Verbeek (2004) who used the simplest specifications from Card (1995). Including further region or family background characteristics improves the model significantly but does not affect much the main coefficients of interest, namely that of years of education.
Source
Supplementary material for Verbeek (2004).
References
Card, D. (1995). Using Geographical Variation in College Proximity to Estimate the Return to Schooling. In: Christofides, L.N., Grant, E.K., and Swidinsky, R. (eds.), Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp, University of Toronto Press, Toronto, 201-222.
Verbeek, M. (2004). A Guide to Modern Econometrics, 2nd ed. John Wiley.
Examples
## load data
data("SchoolingReturns", package = "ivreg")
## Table 5.1 in Verbeek (2004) / Table 2(1) in Card (1995)
## Returns to education: 7.4%
m_ols <- lm(log(wage) ~ education + poly(experience, 2, raw = TRUE) + ethnicity + smsa + south,
data = SchoolingReturns)
summary(m_ols)
## Table 5.2 in Verbeek (2004) / similar to Table 3(1) in Card (1995)
m_red <- lm(education ~ poly(age, 2, raw = TRUE) + ethnicity + smsa + south + nearcollege,
data = SchoolingReturns)
summary(m_red)
## Table 5.3 in Verbeek (2004) / similar to Table 3(5) in Card (1995)
## Returns to education: 13.3%
m_iv <- ivreg(log(wage) ~ education + poly(experience, 2, raw = TRUE) + ethnicity + smsa + south |
nearcollege + poly(age, 2, raw = TRUE) + ethnicity + smsa + south,
data = SchoolingReturns)
summary(m_iv)
Methods for "ivreg"
Objects
Description
Various methods for processing "ivreg"
objects; for diagnostic methods,
see ivregDiagnostics
.
Usage
## S3 method for class 'ivreg'
coef(object, component = c("stage2", "stage1"), complete = TRUE, ...)
## S3 method for class 'ivreg'
vcov(object, component = c("stage2", "stage1"), complete = TRUE, ...)
## S3 method for class 'ivreg'
bread(x, ...)
## S3 method for class 'ivreg'
estfun(x, ...)
## S3 method for class 'ivreg'
vcovHC(x, ...)
## S3 method for class 'ivreg'
terms(x, component = c("regressors", "instruments", "full"), ...)
## S3 method for class 'ivreg'
model.matrix(
object,
component = c("regressors", "projected", "instruments"),
...
)
## S3 method for class 'ivreg_projected'
model.matrix(object, ...)
## S3 method for class 'ivreg'
predict(
object,
newdata,
type = c("response", "terms"),
na.action = na.pass,
se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
df = Inf,
level = 0.95,
weights,
...
)
## S3 method for class 'ivreg'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'ivreg'
update(object, formula., ..., evaluate = TRUE)
## S3 method for class 'ivreg'
residuals(
object,
type = c("response", "projected", "regressors", "working", "deviance", "pearson",
"partial", "stage1"),
...
)
## S3 method for class 'ivreg'
Effect(focal.predictors, mod, ...)
## S3 method for class 'ivreg'
formula(x, component = c("complete", "regressors", "instruments"), ...)
## S3 method for class 'ivreg'
find_formula(x, ...)
## S3 method for class 'ivreg'
alias(object, ...)
## S3 method for class 'ivreg'
qr(x, ...)
## S3 method for class 'ivreg'
weights(object, type = c("variance", "robustness"), ...)
Arguments
object , model , mod |
An object of class |
component |
For |
complete |
If |
... |
arguments to pass down. |
x |
An object of class |
newdata |
Values of predictors for which to obtain predicted values; if missing predicted (i.e., fitted) values are computed for the data to which the model was fit. |
type |
For |
na.action |
|
se.fit |
Compute standard errors of predicted values (default |
interval |
Type of interval to compute for predicted values: |
df |
For |
level |
for confidence or prediction intervals, default |
weights |
Either a numeric vector or a one-sided formula to provide weights for prediction
intervals when the fit is weighted. If |
digits |
For printing. |
formula. |
To update model. |
evaluate |
If |
focal.predictors |
Focal predictors for effect plot, see |
See Also
ivreg
, ivreg.fit
, ivregDiagnostics
Summary and Inference Methods for "ivreg"
Objects
Description
Summary method, including Wald tests and (by default) certain diagnostic tests, for
"ivreg"
model objects, as well as other related inference functions.
Usage
## S3 method for class 'ivreg'
confint(
object,
parm,
level = 0.95,
component = c("stage2", "stage1"),
complete = TRUE,
vcov. = NULL,
df = NULL,
...
)
## S3 method for class 'ivreg'
summary(object, vcov. = NULL, df = NULL, diagnostics = NULL, ...)
## S3 method for class 'summary.ivreg'
print(
x,
digits = max(3, getOption("digits") - 3),
signif.stars = getOption("show.signif.stars"),
...
)
## S3 method for class 'ivreg'
anova(object, object2, test = "F", vcov. = NULL, ...)
## S3 method for class 'ivreg'
Anova(mod, test.statistic = c("F", "Chisq"), vcov. = NULL, ...)
## S3 method for class 'ivreg'
linearHypothesis(
model,
hypothesis.matrix,
rhs = NULL,
test = c("F", "Chisq"),
vcov. = NULL,
...
)
Arguments
object , object2 , model , mod |
An object of class |
parm |
parameters for which confidence intervals are to be computed; a vector or numbers or names; the default is all parameters. |
level |
confidence level; the default is |
component |
Character indicating |
complete |
If |
vcov. |
Optionally either a coefficient covariance matrix or a function to compute such a covariance
matrix from fitted |
df |
For |
... |
arguments to pass down. |
diagnostics |
Report 2SLS "diagnostic" tests in model summary (default is |
x |
An object of class |
digits |
Minimal number of significant digits for printing. |
signif.stars |
Show "significance stars" in summary output? |
test , test.statistic |
Test statistics for ANOVA table computed by |
hypothesis.matrix , rhs |
For formulating a linear hypothesis; see the documentation
for |
See Also
ivreg
, ivreg.fit
, ivregDiagnostics
Examples
## data and model
data("CigaretteDemand", package = "ivreg")
m <- ivreg(log(packs) ~ log(rincome) | log(rprice) | salestax, data = CigaretteDemand)
## summary including diagnostics
summary(m)
## replicate global F test from summary (against null model) "by hand"
m0 <- ivreg(log(packs) ~ 1, data = CigaretteDemand)
anova(m0, m)
## or via linear hypothesis test
car::linearHypothesis(m, c("log(rincome)", "log(rprice)"))
## confidence intervals
confint(m)
## just the Wald tests for the coefficients
library("lmtest")
coeftest(m)
## plug in a heteroscedasticity-consistent HC1 covariance matrix (from sandwich)
library("sandwich")
## - as a function passing additional type argument through ...
coeftest(m, vcov = vcovHC, type = "HC1")
## - as a function without additional arguments
hc1 <- function(object, ...) vcovHC(object, type = "HC1", ...)
coeftest(m, vcov = hc1)
## - as a matrix
vc1 <- vcovHC(m, type = "HC1")
coeftest(m, vcov = vc1)
## in summary() with diagnostics = TRUE use one of the function specifications,
## the matrix is only possible when diagnostics = FALSE
summary(m, vcov = vcovHC, type = "HC1") ## function + ...
summary(m, vcov = hc1) ## function
summary(m, vcov = vc1, diagnostics = FALSE) ## matrix
## in confint() and anova() any of the three specifications can be used
anova(m0, m, vcov = vcovHC, type = "HC1") ## function + ...
anova(m0, m, vcov = hc1) ## function
anova(m0, m, vcov = vc1) ## matrix
Deletion and Other Diagnostic Methods for "ivreg"
Objects
Description
Methods for computing deletion and other regression diagnostics for 2SLS regression.
It's generally more efficient to compute the deletion diagnostics via the influence
method and then to extract the various specific diagnostics with the methods for
"influence.ivreg"
objects. Other diagnostics for linear models, such as
added-variable plots (avPlots
) and component-plus-residual
plots (crPlots
), also work, as do effect plots
(e.g., predictorEffects
) with residuals (see the examples below).
The pointwise confidence envelope for the qqPlot
method assumes an independent random sample
from the t distribution with degrees of freedom equal to the residual degrees of
freedom for the model and so are approximate, because the studentized residuals aren't
independent.
For additional information, see the vignette Diagnostics for 2SLS Regression.
Usage
## S3 method for class 'ivreg'
influence(
model,
sigma. = n <= 1000,
type = c("stage2", "both", "maximum"),
applyfun = NULL,
ncores = NULL,
...
)
## S3 method for class 'ivreg'
rstudent(model, ...)
## S3 method for class 'ivreg'
cooks.distance(model, ...)
## S3 method for class 'influence.ivreg'
dfbeta(model, ...)
## S3 method for class 'ivreg'
dfbeta(model, ...)
## S3 method for class 'ivreg'
hatvalues(model, type = c("stage2", "both", "maximum", "stage1"), ...)
## S3 method for class 'influence.ivreg'
rstudent(model, ...)
## S3 method for class 'influence.ivreg'
hatvalues(model, ...)
## S3 method for class 'influence.ivreg'
cooks.distance(model, ...)
## S3 method for class 'influence.ivreg'
qqPlot(
x,
ylab = paste("Studentized Residuals(", deparse(substitute(x)), ")", sep = ""),
distribution = c("t", "norm"),
...
)
## S3 method for class 'ivreg'
influencePlot(model, ...)
## S3 method for class 'influence.ivreg'
influencePlot(model, ...)
## S3 method for class 'ivreg'
infIndexPlot(model, ...)
## S3 method for class 'influence.ivreg'
infIndexPlot(model, ...)
## S3 method for class 'influence.ivreg'
model.matrix(object, ...)
## S3 method for class 'ivreg'
avPlots(model, terms, ...)
## S3 method for class 'ivreg'
avPlot(model, ...)
## S3 method for class 'ivreg'
mcPlots(model, terms, ...)
## S3 method for class 'ivreg'
mcPlot(model, ...)
## S3 method for class 'ivreg'
Boot(
object,
f = coef,
labels = names(f(object)),
R = 999,
method = "case",
ncores = 1,
...
)
## S3 method for class 'ivreg'
crPlots(model, terms, ...)
## S3 method for class 'ivreg'
crPlot(model, ...)
## S3 method for class 'ivreg'
ceresPlots(model, terms, ...)
## S3 method for class 'ivreg'
ceresPlot(model, ...)
## S3 method for class 'ivreg'
plot(x, ...)
## S3 method for class 'ivreg'
qqPlot(x, distribution = c("t", "norm"), ...)
## S3 method for class 'ivreg'
outlierTest(model, ...)
## S3 method for class 'ivreg'
spreadLevelPlot(x, main = "Spread-Level Plot", ...)
## S3 method for class 'ivreg'
ncvTest(model, ...)
## S3 method for class 'ivreg'
deviance(object, ...)
## S3 method for class 'rivreg'
influence(model, ...)
Arguments
model , x , object |
A |
sigma. |
If |
type |
If |
applyfun |
Optional loop replacement function that should work like
|
ncores |
Numeric, number of cores to be used in parallel computations. If set
to an integer the
|
... |
arguments to be passed down. |
ylab |
The vertical axis label. |
distribution |
|
terms |
Terms for which added-variable plots are to be constructed; the default,
if the argument isn't specified, is the |
f , labels , R |
see |
method |
only |
main |
Main title for the graph. |
Value
In the case of influence.ivreg
, an object of class "influence.ivreg"
with the following components:
coefficients
the estimated regression coefficients
model
the model matrix
dfbeta
influence on coefficients
sigma
deleted values of the residual standard deviation
dffits
overall influence on the regression coefficients
cookd
Cook's distances
hatvalues
hatvalues
rstudent
Studentized residuals
df.residual
residual degrees of freedom
In the case of other methods, such as rstudent.ivreg
or
rstudent.influence.ivreg
, the corresponding diagnostic statistics.
Many other methods (e.g., crPlot.ivreg
, avPlot.ivreg
, Effect.ivreg
)
draw graphs.
See Also
ivreg
, avPlots
,
crPlots
, predictorEffects
,
qqPlot
, influencePlot
,
infIndexPlot
, Boot
,
outlierTest
, spreadLevelPlot
,
ncvTest
.
Examples
kmenta.eq1 <- ivreg(Q ~ P + D | D + F + A, data = Kmenta)
summary(kmenta.eq1)
car::avPlots(kmenta.eq1)
car::mcPlots(kmenta.eq1)
car::crPlots(kmenta.eq1)
car::ceresPlots(kmenta.eq1)
car::influencePlot(kmenta.eq1)
car::influenceIndexPlot(kmenta.eq1)
car::qqPlot(kmenta.eq1)
car::spreadLevelPlot(kmenta.eq1)
plot(effects::predictorEffects(kmenta.eq1, residuals = TRUE))
set.seed <- 12321 # for reproducibility
confint(car::Boot(kmenta.eq1, R = 250)) # 250 reps for brevity
car::outlierTest(kmenta.eq1)
car::ncvTest(kmenta.eq1)
Instrumental-Variable Regression by 2SLS, 2SM, or 2SMM Estimation
Description
Fit instrumental-variable regression by two-stage least squares (2SLS). This is equivalent to direct instrumental-variables estimation when the number of instruments is equal to the number of regressors. Alternative robust-regression estimators are also provided, based on M-estimation (2SM) and MM-estimation (2SMM).
Usage
ivreg(
formula,
instruments,
data,
subset,
na.action,
weights,
offset,
contrasts = NULL,
model = TRUE,
y = TRUE,
x = FALSE,
method = c("OLS", "M", "MM"),
...
)
Arguments
formula , instruments |
formula specification(s) of the regression
relationship and the instruments. Either |
data |
an optional data frame containing the variables in the model.
By default the variables are taken from the environment of the
|
subset |
an optional vector specifying a subset of observations to be used in fitting the model. |
na.action |
a function that indicates what should happen when the data
contain |
weights |
an optional vector of weights to be used in the fitting process. |
offset |
an optional offset that can be used to specify an a priori known component to be included during fitting. |
contrasts |
an optional list. See the |
model , x , y |
logicals. If |
method |
the method used to fit the stage 1 and 2 regression:
|
... |
further arguments passed to |
Details
ivreg
is the high-level interface to the work-horse function
ivreg.fit
. A set of standard methods (including print
,
summary
, vcov
, anova
, predict
, residuals
,
terms
, model.matrix
, bread
, estfun
) is available
and described in ivregMethods
. For methods related to regression
diagnostics, see ivregDiagnostics
.
Regressors and instruments for ivreg
are most easily specified in a
formula with two parts on the right-hand side, e.g., y ~ x1 + x2 | z1
+ z2 + z3
, where x1
and x2
are the explanatory variables and z1
,
z2
, and z3
are the instrumental variables. Note that exogenous regressors
have to be included as instruments for themselves.
For example, if there is
one exogenous regressor ex
and one endogenous regressor en
with instrument in
, the appropriate formula would be y ~ en +
ex | in + ex
. Alternatively, a formula with three parts on the right-hand
side can also be used: y ~ ex | en | in
. The latter is typically more convenient, if
there is a large number of exogenous regressors.
Moreover, two further equivalent specification strategies are possible that are
typically less convenient compared to the strategies above. One option is to use
an update formula with a .
in the second part of the formula is used:
y ~ en + ex | . - en + in
. Another option is to use a separate formula
for the instruments (only for backward compatibility with earlier versions):
formula = y ~ en + ex, instruments = ~ in + ex
.
Internally, all specifications are converted to the version with two parts on the right-hand side.
Value
ivreg
returns an object of class "ivreg"
that inherits from
class "lm"
, with the following components:
coefficients |
parameter estimates, from the stage-2 regression. |
residuals |
vector of model residuals. |
residuals1 |
matrix of residuals from the stage-1 regression. |
residuals2 |
vector of residuals from the stage-2 regression. |
fitted.values |
vector of predicted means for the response. |
weights |
either the vector of weights used (if any) or |
offset |
either the offset used (if any) or |
estfun |
a matrix containing the empirical estimating functions. |
n |
number of observations. |
nobs |
number of observations with non-zero weights. |
p |
number of columns in the model matrix x of regressors. |
q |
number of columns in the instrumental variables model matrix z |
rank |
numeric rank of the model matrix for the stage-2 regression. |
df.residual |
residual degrees of freedom for fitted model. |
cov.unscaled |
unscaled covariance matrix for the coefficients. |
sigma |
residual standard deviation. |
qr |
QR decomposition for the stage-2 regression. |
qr1 |
QR decomposition for the stage-1 regression. |
rank1 |
numeric rank of the model matrix for the stage-1 regression. |
coefficients1 |
matrix of coefficients from the stage-1 regression. |
df.residual1 |
residual degrees of freedom for the stage-1 regression. |
exogenous |
columns of the |
endogenous |
columns of the |
instruments |
columns of the |
method |
the method used for the stage 1 and 2 regressions, one of |
rweights |
a matrix of robustness weights with columns for each of the stage-1
regressions and for the stage-2 regression (in the last column) if the fitting method is
|
hatvalues |
a matrix of hatvalues. For |
df.residual |
residual degrees of freedom for fitted model. |
call |
the original function call. |
formula |
the model formula. |
na.action |
function applied to missing values in the model fit. |
terms |
a list with elements |
levels |
levels of the categorical regressors. |
contrasts |
the contrasts used for categorical regressors. |
model |
the full model frame (if |
y |
the response vector (if |
x |
a list with elements |
References
Greene, W.H. (2003) Econometric Analysis, 5th ed., Upper Saddle River: Prentice Hall.
See Also
ivreg.fit
, ivregDiagnostics
, ivregMethods
,
lm
, lm.fit
Examples
## data
data("CigaretteDemand", package = "ivreg")
## model
m <- ivreg(log(packs) ~ log(rprice) + log(rincome) | salestax + log(rincome),
data = CigaretteDemand)
summary(m)
summary(m, vcov = sandwich::sandwich, df = Inf)
## ANOVA
m2 <- update(m, . ~ . - log(rincome) | . - log(rincome))
anova(m, m2)
car::Anova(m)
## same model specified by formula with three-part right-hand side
ivreg(log(packs) ~ log(rincome) | log(rprice) | salestax, data = CigaretteDemand)
# Robust 2SLS regression
data("Kmenta", package = "ivreg")
Kmenta1 <- Kmenta
Kmenta1[20, "Q"] <- 95 # corrupted data
deq <- ivreg(Q ~ P + D | D + F + A, data=Kmenta) # demand equation, uncorrupted data
deq1 <- ivreg(Q ~ P + D | D + F + A, data=Kmenta1) # standard 2SLS, corrupted data
deq2 <- ivreg(Q ~ P + D | D + F + A, data=Kmenta1, subset=-20) # standard 2SLS, removing bad case
deq3 <- ivreg(Q ~ P + D | D + F + A, data=Kmenta1, method="MM") # 2SLS MM estimation
car::compareCoefs(deq, deq1, deq2, deq3)
round(deq3$rweights, 2) # robustness weights
Fitting Instrumental-Variable Regressions by 2SLS, 2SM, or 2SMM Estimation
Description
Fit instrumental-variable regression by two-stage least squares (2SLS). This is equivalent to direct instrumental-variables estimation when the number of instruments is equal to the number of predictors. Alternative robust-regression estimation is also supported, based on M-estimation (22M) or MM-estimation (2SMM).
Usage
ivreg.fit(
x,
y,
z,
weights,
offset,
method = c("OLS", "M", "MM"),
rlm.args = list(),
...
)
Arguments
x |
regressor matrix. |
y |
vector for the response variable. |
z |
instruments matrix. |
weights |
an optional vector of weights to be used in the fitting process. |
offset |
an optional offset that can be used to specify an a priori known component to be included during fitting. |
method |
the method used to fit the stage 1 and 2 regression:
|
rlm.args |
a list of optional arguments to be passed to the |
... |
further arguments passed to |
Details
ivreg
is the high-level interface to the work-horse function
ivreg.fit
. ivreg.fit
is essentially a convenience interface to
lm.fit
(or lm.wfit
)
for first projecting x
onto the image of
z
, then running a regression of y
on the projected
x
, and computing the residual standard deviation.
Value
ivreg.fit
returns an unclassed list with the following
components:
coefficients |
parameter estimates, from the stage-2 regression. |
residuals |
vector of model residuals. |
residuals1 |
matrix of residuals from the stage-1 regression. |
residuals2 |
vector of residuals from the stage-2 regression. |
fitted.values |
vector of predicted means for the response. |
weights |
either the vector of weights used (if any) or |
offset |
either the offset used (if any) or |
estfun |
a matrix containing the empirical estimating functions. |
n |
number of observations. |
nobs |
number of observations with non-zero weights. |
p |
number of columns in the model matrix x of regressors. |
q |
number of columns in the instrumental variables model matrix z |
rank |
numeric rank of the model matrix for the stage-2 regression. |
df.residual |
residual degrees of freedom for fitted model. |
cov.unscaled |
unscaled covariance matrix for the coefficients. |
sigma |
residual standard error; when method is |
x |
projection of x matrix onto span of z. |
qr |
QR decomposition for the stage-2 regression. |
qr1 |
QR decomposition for the stage-1 regression. |
rank1 |
numeric rank of the model matrix for the stage-1 regression. |
coefficients1 |
matrix of coefficients from the stage-1 regression. |
df.residual1 |
residual degrees of freedom for the stage-1 regression. |
exogenous |
columns of the |
endogenous |
columns of the |
instruments |
columns of the |
method |
the method used for the stage 1 and 2 regressions, one of |
rweights |
a matrix of robustness weights with columns for each of the stage-1
regressions and for the stage-2 regression (in the last column) if the fitting method is
|
hatvalues |
a matrix of hatvalues. For |
See Also
ivreg
, lm.fit
,
lm.wfit
, rlm
, mad
Examples
## data
data("CigaretteDemand", package = "ivreg")
## high-level interface
m <- ivreg(log(packs) ~ log(rprice) + log(rincome) | salestax + log(rincome),
data = CigaretteDemand)
## low-level interface
y <- m$y
x <- model.matrix(m, component = "regressors")
z <- model.matrix(m, component = "instruments")
ivreg.fit(x, y, z)$coefficients