"""Score, lagrange multiplier and conditional moment tests
|
||
robust to misspecification or without specification of higher moments
|
||
|
||
Created on Thu Oct 30 00:42:38 2014
|
||
|
||
Author: Josef Perktold
|
||
License: BSD-3
|
||
|
||
Notes
|
||
-----
|
||
|
||
This module is a mixture of very general and very specific functions for
|
||
hypothesis testing in general models, targeted mainly to non-normal models.
|
||
|
||
Some of the options or versions of these tests are mainly intented for
|
||
cross-checking and to replicate different examples in references.
|
||
|
||
We need clean versions with good defaults for those functions that are
|
||
intended for the user.
|
||
|
||
|
||
|
||
References
|
||
----------
|
||
|
||
The following references are collected after my intitial implementation and is
|
||
most likely not exactly what I used.
|
||
|
||
The main articles on which the functions are directly based upon, are Boos 1992,
|
||
Tauchen 1985 and Whitney 1985a. Wooldrige artificial regression is
|
||
based on several articles and his text book.
|
||
Background reading are the textbooks by Cameron and Trivedi, Wooldridge and
|
||
Davidson and MacKinnon.
|
||
Newey and MacFadden 1994 provide some of the theoretical background.
|
||
|
||
Poisson dispersion tests are based on Dean 1992 and articles and text books by
|
||
Cameron and Trivedi.
|
||
|
||
The references currently do not include the literature on LM-test for
|
||
specification and diagnostic testing, like Pagan, Bera, Bera and Yoon and
|
||
many others, except those for the Poisson excess dispersion case and Pagan
|
||
and Vella.
|
||
|
||
|
||
Boos, Dennis D. 1992. “On Generalized Score Tests.” The American Statistician 46
|
||
(4): 327–33. https://doi.org/10.2307/2685328.
|
||
|
||
Breslow, Norman. 1989. “Score Tests in Overdispersed GLM’s.” In Statistical
|
||
Modelling, edited by Adriano Decarli, Brian J. Francis, Robert Gilchrist, and
|
||
Gilg U. H. Seeber, 64–74. Lecture Notes in Statistics 57. Springer New York.
|
||
http://link.springer.com/chapter/10.1007/978-1-4612-3680-1_8.
|
||
|
||
Breslow, Norman. 1990. “Tests of Hypotheses in Overdispersed Poisson Regression
|
||
and Other Quasi- Likelihood Models.” Journal of the American Statistical
|
||
Association 85 (410): 565–71. https://doi.org/10.2307/2289799.
|
||
|
||
Cameron, A. Colin, and Pravin K. Trivedi. 1986. “Econometric Models Based on
|
||
Count Data. Comparisons and Applications of Some Estimators and Tests.” Journal
|
||
of Applied Econometrics 1 (1): 29–53. https://doi.org/10.1002/jae.3950010104.
|
||
|
||
Cameron, A. Colin, and Pravin K. Trivedi. 1990a. “Conditional Moment Tests and
|
||
Orthogonal Polynomials.” Indiana University, Department of Economics, Working
|
||
Paper, 90–051.
|
||
|
||
Cameron, A. Colin, and Pravin K. Trivedi. 1990b. “Regression-Based Tests for
|
||
Overdispersion in the Poisson Model.” Journal of Econometrics 46 (3): 347–64.
|
||
https://doi.org/10.1016/0304-4076(90)90014-K.
|
||
|
||
Cameron, A. Colin, and Pravin K. Trivedi. Microeconometrics: methods and
|
||
applications. Cambridge university press, 2005.
|
||
|
||
Cameron, A. Colin, and Pravin K. Trivedi. Regression analysis of count data.
|
||
Vol. 53. Cambridge university press, 2013.
|
||
|
||
Davidson, Russell, and James G. MacKinnon. 1981. “Several Tests for Model
|
||
Specification in the Presence of Alternative Hypotheses.” Econometrica 49 (3):
|
||
781–93. https://doi.org/10.2307/1911522.
|
||
|
||
Davidson, Russell, and James G. MacKinnon. 1990. “Specification Tests Based on
|
||
Artificial Regressions.” Journal of the American Statistical Association 85
|
||
(409): 220–27. https://doi.org/10.2307/2289548.
|
||
|
||
Davidson, Russell, and James G. MacKinnon. 1991. “Artificial Regressions and C
|
||
(α) Tests.” Economics Letters 35 (2): 149–53.
|
||
https://doi.org/10.1016/0165-1765(91)90162-E.
|
||
|
||
Davidson, Russell, and James G. MacKinnon. Econometric theory and methods. Vol.
|
||
5. New York: Oxford University Press, 2004.
|
||
|
||
Dean, C. B. 1992. “Testing for Overdispersion in Poisson and Binomial Regression
|
||
Models.” Journal of the American Statistical Association 87 (418): 451–57.
|
||
https://doi.org/10.2307/2290276.
|
||
|
||
Dean, C., and J. F. Lawless. 1989. “Tests for Detecting Overdispersion in
|
||
Poisson Regression Models.” Journal of the American Statistical Association 84
|
||
(406): 467–72. https://doi.org/10.2307/2289931.
|
||
|
||
Newey, Whitney K. 1985a. “Generalized Method of Moments Specification Testing.”
|
||
Journal of Econometrics 29 (3): 229–56.
|
||
https://doi.org/10.1016/0304-4076(85)90154-X.
|
||
|
||
Newey, Whitney K. 1985b. “Maximum Likelihood Specification Testing and
|
||
Conditional Moment Tests.” Econometrica 53 (5): 1047–70.
|
||
https://doi.org/10.2307/1911011.
|
||
|
||
Newey, Whitney K. and Kenneth D. West. 1987. “Hypothesis Testing with Efficient
|
||
Method of Moments Estimation.” International Economic Review 28 (3): 777–87.
|
||
https://doi.org/10.2307/2526578.
|
||
|
||
Newey, Whitney K. and Daniel McFadden. 1994 "Large sample estimation and
|
||
hypothesis testing." Handbook of econometrics 4: 2111-2245.
|
||
|
||
Pagan, Adrian, and Frank Vella. 1989. “Diagnostic Tests for Models Based on
|
||
Individual Data: A Survey.” Journal of Applied Econometrics 4 (S1): S29–59.
|
||
https://doi.org/10.1002/jae.3950040504.
|
||
|
||
Tauchen, George. 1985. “Diagnostic Testing and Evaluation of Maximum Likelihood
|
||
Models.” Journal of Econometrics 30 (1–2): 415–43.
|
||
https://doi.org/10.1016/0304-4076(85)90149-6.
|
||
|
||
White, Halbert. 1981. “Consequences and Detection of Misspecified Nonlinear
|
||
Regression Models.” Journal of the American Statistical Association 76 (374):
|
||
419–33. https://doi.org/10.2307/2287845.
|
||
|
||
White, Halbert. 1983. “Maximum Likelihood Estimation of Misspecified Models.”
|
||
Econometrica 51 (2): 513. https://doi.org/10.2307/1912004.
|
||
|
||
White, Halbert. 1994. Estimation, Inference and Specification Analysis.
|
||
Cambridge: Cambridge University Press. https://doi.org/10.1017/CCOL0521252806.
|
||
|
||
Wooldridge, Jeffrey M. 1991. “Specification Testing and Quasi-Maximum-
|
||
Likelihood Estimation.” Journal of Econometrics 48 (1–2): 29–55.
|
||
https://doi.org/10.1016/0304-4076(91)90031-8.
|
||
|
||
Wooldridge, Jeffrey M. 1990. “A Unified Approach to Robust, Regression-Based
|
||
Specification Tests.” Econometric Theory 6 (1): 17–43.
|
||
|
||
Wooldridge, Jeffrey M. 1991a. “On the Application of Robust, Regression- Based
|
||
Diagnostics to Models of Conditional Means and Conditional Variances.” Journal
|
||
of Econometrics 47 (1): 5–46. https://doi.org/10.1016/0304-4076(91)90076-P.
|
||
|
||
Wooldridge, Jeffrey M. 1991b. “On the Application of Robust, Regression- Based
|
||
Diagnostics to Models of Conditional Means and Conditional Variances.” Journal
|
||
of Econometrics 47 (1): 5–46. https://doi.org/10.1016/0304-4076(91)90076-P.
|
||
|
||
Wooldridge, Jeffrey M. 1991c. “Specification Testing and Quasi-Maximum-
|
||
Likelihood Estimation.” Journal of Econometrics 48 (1–2): 29–55.
|
||
https://doi.org/10.1016/0304-4076(91)90031-8.
|
||
|
||
Wooldridge, Jeffrey M. 1994. “On the Limits of GLM for Specification Testing: A
|
||
Comment on Gurmu and Trivedi.” Econometric Theory 10 (2): 409–18.
|
||
https://doi.org/10.2307/3532875.
|
||
|
||
Wooldridge, Jeffrey M. 1997. “Quasi-Likelihood Methods for Count Data.” Handbook
|
||
of Applied Econometrics 2: 352–406.
|
||
|
||
Wooldridge, Jeffrey M. Econometric analysis of cross section and panel data. MIT
|
||
press, 2010.
|
||
|
||
"""
import warnings

import numpy as np
from scipy import stats

from statsmodels.tools.decorators import cache_readonly
from statsmodels.regression.linear_model import OLS


# deprecated dispersion functions, moved to discrete._diagnostics_count


def dispersion_poisson(results):
    """Score/LM type tests for Poisson variance assumptions

    .. deprecated:: 0.14

       dispersion_poisson moved to discrete._diagnostics_count

    The null hypothesis is

    H0: var(y) = E(y), assuming E(y) is correctly specified
    H1: var(y) ~= E(y)

    The tests are based on the constrained model, i.e. the Poisson model.
    The tests differ in their assumed alternatives, and in their maintained
    assumptions.

    Parameters
    ----------
    results : Poisson results instance
        This can be a results instance for either a discrete Poisson or a GLM
        with family Poisson.

    Returns
    -------
    res : ndarray, shape (7, 2)
        Each row contains the test statistic and p-value for one of the 7
        tests computed here.
    description : 2-D list of strings
        Each test has two strings: a descriptive name and a string for the
        alternative hypothesis.
    """
    msg = (
        'dispersion_poisson here is deprecated, use the version in '
        'discrete._diagnostics_count'
    )
    warnings.warn(msg, FutureWarning)

    from statsmodels.discrete._diagnostics_count import test_poisson_dispersion
    return test_poisson_dispersion(results, _old=True)


def dispersion_poisson_generic(results, exog_new_test, exog_new_control=None,
                               include_score=False, use_endog=True,
                               cov_type='HC3', cov_kwds=None, use_t=False):
    """A variable addition test for the variance function

    .. deprecated:: 0.14

       dispersion_poisson_generic moved to discrete._diagnostics_count

    This uses an artificial regression to calculate a variant of an LM or
    generalized score test for the specification of the variance assumption
    in a Poisson model. The performed test is a Wald test on the coefficients
    of the `exog_new_test`.

    Warning: insufficiently tested, especially for options.
    """
    msg = (
        'dispersion_poisson_generic here is deprecated, use the version in '
        'discrete._diagnostics_count'
    )
    warnings.warn(msg, FutureWarning)

    from statsmodels.discrete._diagnostics_count import (
        _test_poisson_dispersion_generic
    )

    res_test = _test_poisson_dispersion_generic(
        results, exog_new_test, exog_new_control=exog_new_control,
        include_score=include_score, use_endog=use_endog,
        cov_type=cov_type, cov_kwds=cov_kwds, use_t=use_t,
    )
    return res_test


class ResultsGeneric:

    def __init__(self, **kwds):
        self.__dict__.update(kwds)


class TestResults(ResultsGeneric):

    def summary(self):
        txt = 'Specification Test (LM, score)\n'
        stat = [self.c1, self.c2, self.c3]
        pval = [self.pval1, self.pval2, self.pval3]
        description = ['nonrobust', 'dispersed', 'HC']

        for row in zip(description, stat, pval):
            txt += '%-12s statistic = %6.4f pvalue = %6.5f\n' % row

        txt += '\nAssumptions:\n'
        txt += 'nonrobust: variance is correctly specified\n'
        txt += 'dispersed: variance correctly specified up to scale factor\n'
        txt += 'HC : robust to any heteroscedasticity\n'
        txt += 'test is not robust to correlation across observations'

        return txt


def lm_test_glm(result, exog_extra, mean_deriv=None):
    '''score/Lagrange multiplier test for GLM

    Wooldridge procedure for testing the mean function in a GLM

    Parameters
    ----------
    result : GLMResults instance
        results instance with the constrained model
    exog_extra : ndarray or None
        additional exogenous variables for variable addition test
        This can be set to None if mean_deriv is provided.
    mean_deriv : None or ndarray
        Extra moment conditions that correspond to the partial derivative of
        a mean function with respect to some parameters.

    Returns
    -------
    test_results : Results instance
        The results instance has the following attributes which are score
        statistic and p-value for 3 versions of the score test.

        c1, pval1 : nonrobust score_test results
        c2, pval2 : score test results robust to over or under dispersion
        c3, pval3 : score test results fully robust to any heteroscedasticity

        The test results instance also has a simple summary method.

    Notes
    -----
    TODO: add `df` to results and make df detection more robust

    This implements the auxiliary regression procedure of Wooldridge,
    implemented based on the presentation in chapter 8 in Handbook of
    Applied Econometrics 2.

    References
    ----------
    Wooldridge, Jeffrey M. 1997. “Quasi-Likelihood Methods for Count Data.”
    Handbook of Applied Econometrics 2: 352–406.

    and other articles and textbooks by Wooldridge

    '''

    if hasattr(result, '_result'):
        res = result._result
    else:
        res = result

    mod = result.model
    nobs = mod.endog.shape[0]

    # mean_func = mod.family.link.inverse
    dlinkinv = mod.family.link.inverse_deriv

    # derivative of mean function w.r.t. beta (linear params)
    def dm(x, linpred):
        return dlinkinv(linpred)[:, None] * x

    var_func = mod.family.variance

    x = result.model.exog
    x2 = exog_extra

    # test omitted
    try:
        lin_pred = res.predict(which="linear")
    except TypeError:
        # TODO: standardize to which="linear" and remove linear kwarg
        lin_pred = res.predict(linear=True)
    dm_incl = dm(x, lin_pred)
    if x2 is not None:
        dm_excl = dm(x2, lin_pred)
        if mean_deriv is not None:
            # allow both and stack
            dm_excl = np.column_stack((dm_excl, mean_deriv))
    elif mean_deriv is not None:
        dm_excl = mean_deriv
    else:
        raise ValueError('either exog_extra or mean_deriv has to be provided')

    # TODO: check for rank or redundancy, note OLS calculates the rank
    k_constraint = dm_excl.shape[1]
    fittedvalues = res.predict()  # discrete has linpred instead of mean
    v = var_func(fittedvalues)
    std = np.sqrt(v)
    res_ols1 = OLS(res.resid_response / std,
                   np.column_stack((dm_incl, dm_excl)) / std[:, None]).fit()

    # case: nonrobust assumes variance implied by distribution is correct
    c1 = res_ols1.ess
    pval1 = stats.chi2.sf(c1, k_constraint)

    # case: robust to dispersion
    c2 = nobs * res_ols1.rsquared
    pval2 = stats.chi2.sf(c2, k_constraint)

    # case: robust to heteroscedasticity
    from statsmodels.stats.multivariate_tools import partial_project
    pp = partial_project(dm_excl / std[:, None], dm_incl / std[:, None])
    resid_p = res.resid_response / std
    res_ols3 = OLS(np.ones(nobs), pp.resid * resid_p[:, None]).fit()
    # c3 = nobs * res_ols3.rsquared  # this is Wooldridge
    c3b = res_ols3.ess  # simpler if endog is ones
    pval3 = stats.chi2.sf(c3b, k_constraint)

    tres = TestResults(c1=c1, pval1=pval1,
                       c2=c2, pval2=pval2,
                       c3=c3b, pval3=pval3)

    return tres
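

# The following usage sketch is added for illustration and is not part of the
# original module. The helper name `_example_lm_test_glm` and the simulated
# data are hypothetical; it assumes the statsmodels GLM API. It runs the
# variable addition test for a squared regressor omitted from the mean
# function of a Poisson GLM.
def _example_lm_test_glm():
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12345)
    nobs = 500
    x = rng.normal(size=(nobs, 2))
    exog = sm.add_constant(x)
    mu = np.exp(exog.dot([0.5, 0.2, -0.1]))
    endog = rng.poisson(mu)

    res = sm.GLM(endog, exog, family=sm.families.Poisson()).fit()
    # variable addition test: does x1**2 belong in the linear predictor?
    exog_extra = (x[:, 0] ** 2)[:, None]
    tres = lm_test_glm(res, exog_extra)
    print(tres.summary())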


def cm_test_robust(resid, resid_deriv, instruments, weights=1):
    '''score/Lagrange multiplier test of Wooldridge

    generic version of the Wooldridge procedure for a test of conditional
    moments

    Limitation: This version allows only one unconditional moment
    restriction, i.e. resid is scalar for each observation.
    Another limitation is that it assumes independent observations, no
    correlation in residuals, and weights cannot be replaced by
    cross-observation whitening.

    Parameters
    ----------
    resid : ndarray, (nobs, )
        conditional moment restriction, E(r | x, params) = 0
    resid_deriv : ndarray, (nobs, k_params)
        derivative of conditional moment restriction with respect to parameters
    instruments : ndarray, (nobs, k_instruments)
        indicator variables of Wooldridge, multiply the conditional moment
        restriction
    weights : ndarray
        This is a weights function as used in WLS. The moment
        restrictions are multiplied by weights. This corresponds to the
        inverse of the variance in a heteroskedastic model.

    Returns
    -------
    stat : float
        score/LM statistic
    pval : float
        p-value of the test based on the chisquare distribution
    stat2 : float
        auxiliary regression statistic (`nobs * rsquared` variant), computed
        for cross-checking

    Notes
    -----
    This implements the auxiliary regression procedure of Wooldridge,
    implemented based on procedure 2.1 in Wooldridge 1990.

    Wooldridge allows for multivariate conditional moments (`resid`)
    TODO: check dimensions for multivariate case for extension

    References
    ----------
    Wooldridge, Jeffrey M. 1990. “A Unified Approach to Robust,
    Regression-Based Specification Tests.” Econometric Theory 6 (1): 17–43.

    and other articles by Wooldridge

    '''
    # notation: Wooldridge uses too many Greek letters
    # instruments is capital lambda
    # resid is small phi
    # resid_deriv is capital phi
    # weights is C

    nobs = resid.shape[0]

    from statsmodels.stats.multivariate_tools import partial_project

    w_sqrt = np.sqrt(weights)
    if np.size(weights) > 1:
        w_sqrt = w_sqrt[:, None]
    pp = partial_project(instruments * w_sqrt, resid_deriv * w_sqrt)
    mom_resid = pp.resid

    moms_test = mom_resid * resid[:, None] * w_sqrt

    # we get this here in case we extend resid to be more than 1-D
    k_constraint = moms_test.shape[1]

    # use OPG variance as in Wooldridge 1990. This might generalize
    cov = moms_test.T.dot(moms_test)
    diff = moms_test.sum(0)

    # see Wooldridge last page in appendix
    stat = diff.dot(np.linalg.solve(cov, diff))

    # for checking, this corresponds to nobs * rsquared of auxiliary regression
    stat2 = OLS(np.ones(nobs), moms_test).fit().ess
    pval = stats.chi2.sf(stat, k_constraint)

    return stat, pval, stat2
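

# Usage sketch added for illustration (hypothetical helper, not part of the
# original module): a robust omitted-variable test for a linear model
# estimated by OLS. The moment restriction is the OLS residual, its
# derivative w.r.t. the coefficients is -X, and the "indicator" is the
# candidate omitted regressor.
def _example_cm_test_robust():
    import numpy as np

    rng = np.random.default_rng(0)
    nobs = 500
    x = np.column_stack((np.ones(nobs), rng.normal(size=nobs)))
    y = x.dot([1.0, 0.5]) + rng.normal(size=nobs)

    beta = np.linalg.lstsq(x, y, rcond=None)[0]
    resid = y - x.dot(beta)
    resid_deriv = -x  # d(y - x b) / d b
    instruments = (x[:, 1] ** 2)[:, None]  # candidate omitted regressor
    stat, pval, stat2 = cm_test_robust(resid, resid_deriv, instruments)
    print(stat, pval)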


def lm_robust(score, constraint_matrix, score_deriv_inv, cov_score,
              cov_params=None):
    '''general formula for score/LM test

    generalized score or Lagrange multiplier test for implicit constraints

    `r(params) = 0`, with gradient `R = d r / d params`

    linear constraints are given by `R params - q = 0`

    It is assumed that all arrays are evaluated at the constrained estimates.

    Parameters
    ----------
    score : ndarray, 1-D
        derivative of objective function at estimated parameters
        of constrained model
    constraint_matrix : ndarray
        restriction matrix `R`: linear restriction matrix or Jacobian of
        nonlinear constraints
    score_deriv_inv : ndarray, symmetric, square
        inverse `Ainv` of the second derivative of the objective function
        TODO: could be OPG or any other estimator if information matrix
        equality holds
    cov_score : ndarray, symmetric, square
        covariance matrix `B` of the score. This is the inner part of a
        sandwich estimator.
    cov_params : ndarray, symmetric, square
        covariance `V` of the full parameter vector evaluated at the
        constrained parameter estimate. This can be specified instead of
        cov_score.

    Returns
    -------
    lm_stat : float
        score/lagrange multiplier statistic

    '''
    # shorthand alias
    R, Ainv, B, V = constraint_matrix, score_deriv_inv, cov_score, cov_params

    tmp = R.dot(Ainv)
    wscore = tmp.dot(score)  # R Ainv score

    if B is None and V is None:
        # only Ainv is given, so we assume information matrix identity holds
        # computational shortcut, should be the same if Ainv == inv(B)
        lm_stat = score.dot(Ainv.dot(score))
    else:
        # information matrix identity does not hold
        if V is None:
            inner = tmp.dot(B).dot(tmp.T)
        else:
            inner = R.dot(V).dot(R.T)

        # lm_stat2 = wscore.dot(np.linalg.pinv(inner).dot(wscore))
        # assume inner is invertible, TODO: check if a use case for pinv exists
        lm_stat = wscore.dot(np.linalg.solve(inner, wscore))

    return lm_stat  # , lm_stat2
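

# Numeric sketch added for illustration (hypothetical values, not part of the
# original module): when the score of the unconstrained parameters is zero at
# the constrained estimate and `cov_score` equals the hessian, the robust
# statistic reduces to the nonrobust shortcut `score' Ainv score`.
def _example_lm_robust():
    import numpy as np

    hess = np.array([[2.0, 0.5, 0.2],
                     [0.5, 1.5, 0.1],
                     [0.2, 0.1, 1.0]])
    ainv = np.linalg.inv(hess)
    score = np.array([0.0, 0.0, 0.3])  # zero block for unconstrained params
    r = np.eye(3)[-1:]  # constraint on the last parameter only

    stat_shortcut = lm_robust(score, r, ainv, None)
    stat_robust = lm_robust(score, r, ainv, hess)
    assert np.allclose(stat_shortcut, stat_robust)
    print(stat_shortcut, stat_robust)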


def lm_robust_subset(score, k_constraints, score_deriv_inv, cov_score):
    '''general formula for score/LM test

    generalized score or Lagrange multiplier test for constraints on a subset
    of parameters

    `params_1 = value`, where params_1 is a subset of the unconstrained
    parameter vector.

    It is assumed that all arrays are evaluated at the constrained estimates.

    Parameters
    ----------
    score : ndarray, 1-D
        derivative of objective function at estimated parameters
        of constrained model
    k_constraints : int
        number of constraints
    score_deriv_inv : ndarray, symmetric, square
        inverse of second derivative of objective function
        TODO: could be OPG or any other estimator if information matrix
        equality holds
    cov_score : ndarray, symmetric, square
        covariance matrix `B` of the score. This is the inner part of a
        sandwich estimator.

    Returns
    -------
    lm_stat : float
        score/lagrange multiplier statistic
    p-value : float
        p-value of the LM test based on chisquare distribution

    Notes
    -----
    The implementation is based on Boos 1992 section 4.1. The same derivation
    is also in other articles and in textbooks.

    '''

    # Notation in Boos
    # score `S` = sum(s_i)
    # score_obs `s_i`
    # score_deriv `I` is derivative of score (hessian)
    # `D` is covariance matrix of score, OPG product given independent
    # observations

    # k_params = len(score)

    # Note: I reverse the order between constrained and unconstrained compared
    # to Boos

    # submatrices of score_deriv/hessian
    # these are I22 and I12 in Boos
    # h_uu = score_deriv[-k_constraints:, -k_constraints:]
    h_uu = score_deriv_inv[:-k_constraints, :-k_constraints]
    h_cu = score_deriv_inv[-k_constraints:, :-k_constraints]

    # TODO: pinv or solve ?
    tmp_proj = h_cu.dot(np.linalg.inv(h_uu))
    tmp = np.column_stack((-tmp_proj, np.eye(k_constraints)))

    cov_score_constraints = tmp.dot(cov_score.dot(tmp.T))

    # lm_stat2 = wscore.dot(np.linalg.pinv(inner).dot(wscore))
    # assume inner is invertible, TODO: check if a use case for pinv exists
    lm_stat = score.dot(np.linalg.solve(cov_score_constraints, score))
    pval = stats.chi2.sf(lm_stat, k_constraints)

    # # check second calculation, Boos referencing Kent 1982 and Engle 1984
    # # we can use this when robust cov_params of the full model is available
    # h_inv = score_deriv_inv
    # v = h_inv.dot(cov_score.dot(h_inv))  # this is robust cov_params
    # v_cc = v[:k_constraints, :k_constraints]
    # h_cc = score_deriv[:k_constraints, :k_constraints]
    # # brute force calculation:
    # h_resid_cu = h_cc - h_cu.dot(np.linalg.solve(h_uu, h_cu))
    # cov_s_c = h_resid_cu.dot(v_cc.dot(h_resid_cu))
    # diff = np.max(np.abs(cov_s_c - cov_score_constraints))
    return lm_stat, pval  # , lm_stat2


def lm_robust_subset_parts(score, k_constraints,
                           score_deriv_uu, score_deriv_cu,
                           cov_score_cc, cov_score_cu, cov_score_uu):
    """robust generalized score tests on a subset of parameters

    This is the same as lm_robust_subset with the arguments given as the
    blocks of partitioned matrices.
    This can be useful when we have the parts based on different estimation
    procedures, i.e. when we do not have the full unconstrained model.

    Calculates mainly the covariance of the constrained part of the score.

    Parameters
    ----------
    score : ndarray, 1-D
        derivative of objective function at estimated parameters
        of constrained model. This is the score component for the restricted
        part under the hypothesis. The unconstrained part of the score is
        assumed to be zero.
    k_constraints : int
        number of constraints
    score_deriv_uu : ndarray, symmetric, square
        first derivative of moment equation or second derivative of objective
        function for the unconstrained part
        TODO: could be OPG or any other estimator if information matrix
        equality holds
    score_deriv_cu : ndarray
        first cross derivative of moment equation or second cross
        derivative of objective function between constrained and
        unconstrained parts.
    cov_score_cc : ndarray
        covariance matrix of the score for the constrained part.
        This is the inner part of a sandwich estimator.
    cov_score_cu : ndarray
        covariance matrix of the score for the off-diagonal block, i.e.
        covariance between constrained and unconstrained parts.
    cov_score_uu : ndarray
        covariance matrix of the score for the unconstrained part.

    Returns
    -------
    lm_stat : float
        score/lagrange multiplier statistic
    p-value : float
        p-value of the LM test based on chisquare distribution

    Notes
    -----
    TODO: this function should just return the covariance of the score
    instead of calculating the score/LM test.

    Implementation similar to lm_robust_subset and is based on Boos 1992,
    section 4.1 in the form attributed to Breslow (1990). It does not use the
    computation attributed to Kent (1982) and Engle (1984).
    """

    tmp_proj = np.linalg.solve(score_deriv_uu, score_deriv_cu.T).T
    tmp = tmp_proj.dot(cov_score_cu.T)

    # this needs to make a copy of cov_score_cc for further inplace
    # modification
    cov = cov_score_cc - tmp
    cov -= tmp.T
    cov += tmp_proj.dot(cov_score_uu).dot(tmp_proj.T)

    lm_stat = score.dot(np.linalg.solve(cov, score))
    pval = stats.chi2.sf(lm_stat, k_constraints)
    return lm_stat, pval


def lm_robust_reparameterized(score, params_deriv, score_deriv, cov_score):
    """robust generalized score test for transformed parameters

    The parameters are given by a nonlinear transformation of the estimated
    reduced parameters

    `params = g(params_reduced)` with Jacobian `G = d g / d params_reduced`

    score and other arrays are for the full parameter space `params`

    Parameters
    ----------
    score : ndarray, 1-D
        derivative of objective function at estimated parameters
        of constrained model
    params_deriv : ndarray
        Jacobian `G` of the parameter transformation
    score_deriv : ndarray, symmetric, square
        second derivative of objective function
        TODO: could be OPG or any other estimator if information matrix
        equality holds
    cov_score : ndarray, symmetric, square
        covariance matrix `B` of the score. This is the inner part of a
        sandwich estimator.

    Returns
    -------
    lm_stat : float
        score/lagrange multiplier statistic
    p-value : float
        p-value of the LM test based on chisquare distribution

    Notes
    -----
    Boos 1992, section 4.3, expression for T_{GS} just before example 6
    """
    # Boos notation
    # params_deriv G

    k_params, k_reduced = params_deriv.shape
    k_constraints = k_params - k_reduced

    G = params_deriv  # shortcut alias

    tmp_c0 = np.linalg.pinv(G.T.dot(score_deriv.dot(G)))
    tmp_c1 = score_deriv.dot(G.dot(tmp_c0.dot(G.T)))
    tmp_c = np.eye(k_params) - tmp_c1

    cov = tmp_c.dot(cov_score.dot(tmp_c.T))  # warning: reduced rank

    lm_stat = score.dot(np.linalg.pinv(cov).dot(score))
    pval = stats.chi2.sf(lm_stat, k_constraints)
    return lm_stat, pval


def conditional_moment_test_generic(mom_test, mom_test_deriv,
                                    mom_incl, mom_incl_deriv,
                                    var_mom_all=None,
                                    cov_type='OPG', cov_kwds=None):
    """generic conditional moment test

    This is mainly intended as an internal function in support of diagnostic
    and specification tests. It has no conversion and checking of correct
    arguments.

    Parameters
    ----------
    mom_test : ndarray, 2-D (nobs, k_constraints)
        moment conditions that will be tested to be zero
    mom_test_deriv : ndarray, 2-D (k_constraints, k_params)
        derivative of moment conditions under test with respect to the
        parameters of the model summed over observations.
    mom_incl : ndarray, 2-D (nobs, k_params)
        moment conditions that were used in estimation, assumed to be zero.
        This is score_obs in the case of (Q)MLE.
    mom_incl_deriv : ndarray, 2-D, square (k_params, k_params)
        derivative of moment conditions of estimator summed over observations.
        This is the information matrix or Hessian in the case of (Q)MLE.
    var_mom_all : None, or ndarray, 2-D, (k, k) with k = k_constraints + k_params
        Expected product or variance of the joint (column_stacked) moment
        conditions. The stacking should have the variance of the moment
        conditions under test in the first k_constraints rows and columns.
        If it is None, then it will be estimated based on cov_type.
        I think: This is the Hessian of the extended or alternative model
        under full MLE and score test assuming information matrix identity
        holds.

    Returns
    -------
    results

    Notes
    -----
    TODO: cov_type other than OPG is missing
    initial implementation based on the Cameron and Trivedi 1998 count book,
    p. 48, p. 56

    also included: mom_incl can be None if expected mom_test_deriv is zero.

    References
    ----------
    Cameron and Trivedi 1998 count book
    Wooldridge ???
    Pagan and Vella 1989
    """
    if cov_type != 'OPG':
        raise NotImplementedError

    k_constraints = mom_test.shape[1]

    if mom_incl is None:
        # assume mom_test_deriv is zero, do not include effect of mom_incl
        if var_mom_all is None:
            var_cm = mom_test.T.dot(mom_test)
        else:
            var_cm = var_mom_all

    else:
        # take into account the effect of parameter estimates on mom_test
        if var_mom_all is None:
            mom_all = np.column_stack((mom_test, mom_incl))
            # TODO: replace with inner sandwich covariance estimator
            var_mom_all = mom_all.T.dot(mom_all)

        tmp = mom_test_deriv.dot(np.linalg.pinv(mom_incl_deriv))
        h = np.column_stack((np.eye(k_constraints), -tmp))

        var_cm = h.dot(var_mom_all.dot(h.T))

    # calculate test results with chisquare
    var_cm_inv = np.linalg.pinv(var_cm)
    mom_test_sum = mom_test.sum(0)
    statistic = mom_test_sum.dot(var_cm_inv.dot(mom_test_sum))
    pval = stats.chi2.sf(statistic, k_constraints)

    # normal test of individual components
    se = np.sqrt(np.diag(var_cm))
    tvalues = mom_test_sum / se
    pvalues = stats.norm.sf(np.abs(tvalues))

    res = ResultsGeneric(var_cm=var_cm,
                         stat_cmt=statistic,
                         pval_cmt=pval,
                         tvalues=tvalues,
                         pvalues=pvalues)

    return res
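

# Usage sketch added for illustration (hypothetical helper, not part of the
# original module): test that sample moments are jointly zero when no
# estimation effect has to be taken into account (mom_incl=None), using the
# default OPG covariance.
def _example_conditional_moment_test_generic():
    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.normal(size=1000)  # stand-in for residuals with E(u) = E(u**3) = 0
    mom_test = np.column_stack((u, u ** 3))
    res = conditional_moment_test_generic(mom_test, None, None, None)
    print(res.stat_cmt, res.pval_cmt)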


def conditional_moment_test_regression(mom_test, mom_test_deriv=None,
                                       mom_incl=None, mom_incl_deriv=None,
                                       var_mom_all=None, demean=False,
                                       cov_type='OPG', cov_kwds=None):
    """generic conditional moment test based on an artificial regression

    this is very experimental, no options implemented yet

    so far
    OPG regression, or
    artificial regression with robust Wald test

    The latter is (as far as I can see) the same as an overidentifying test
    in GMM where the test statistic is the value of the GMM objective function
    and it is assumed that parameters were estimated with optimal GMM, i.e.
    the weight matrix is equal to the expectation of the score variance.
    """
    # so far coded from memory
    nobs, k_constraints = mom_test.shape

    endog = np.ones(nobs)
    if mom_incl is not None:
        ex = np.column_stack((mom_test, mom_incl))
    else:
        ex = mom_test
    if demean:
        ex -= ex.mean(0)
    if cov_type == 'OPG':
        res = OLS(endog, ex).fit()

        statistic = nobs * res.rsquared
        pval = stats.chi2.sf(statistic, k_constraints)
    else:
        res = OLS(endog, ex).fit(cov_type=cov_type, cov_kwds=cov_kwds)
        tres = res.wald_test(np.eye(ex.shape[1]))
        statistic = tres.statistic
        pval = tres.pvalue

    return statistic, pval
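

# Sketch added for illustration (hypothetical helper, not part of the
# original module): the same zero-moment hypothesis as above, via the OPG
# artificial regression of ones on the moment conditions (nobs * rsquared).
def _example_conditional_moment_test_regression():
    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.normal(size=1000)
    mom_test = np.column_stack((u, u ** 3))
    stat, pval = conditional_moment_test_regression(mom_test)
    print(stat, pval)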


class CMTNewey:
    """generic moment test for GMM

    This is a class to calculate and hold the various results.

    This is based on Newey 1985 on GMM, mainly Lemma 1 and Theorem 1.

    The main method is `chisquare` which returns the result of the
    conditional moment test.

    Warning: name of class and methods will likely be changed

    Parameters
    ----------
    moments : ndarray, 1-D
        moments that are tested to be zero. They do not need to be derived
        from a likelihood function.
    cov_moments : ndarray
        An estimate for the joint (expected) covariance of all moments. This
        can be a heteroscedasticity or correlation robust covariance estimate,
        i.e. the inner part of a sandwich covariance.
    moments_deriv : ndarray
        derivative of the moment function with respect to the parameters that
        are estimated
    weights : ndarray
        Weights used in the GMM estimation.
    transf_mt : ndarray
        This defines the test moments where `transf_mt` is the matrix that
        defines a linear combination of moments that have expected value equal
        to zero under the null hypothesis.

    Notes
    -----
    The one letter names in Newey 1985 are

    moments, g :
    cov_moments, V :
    moments_deriv, H :
    weights, W :
    transf_mt, L :
        linear transformation to get the test condition from the moments

    not used, add as argument to methods or __init__?
    `K` cov for misspecification, or misspecification_deriv

    This follows the GMM version in Newey 1985a, not the MLE version in
    Newey 1985b. Newey uses the generalized information matrix equality in the
    MLE version, Newey (1985b).

    Newey 1985b Lemma 1 does not impose a correctly specified likelihood, but
    assumes it in the following. Lemma 1 in both articles is essentially the
    same, assuming D = H' W.

    References
    ----------
    - Newey 1985a, Generalized Method of Moments specification testing,
      Journal of Econometrics
    - Newey 1985b, Maximum Likelihood Specification Testing and Conditional
      Moment Tests, Econometrica
    """

    def __init__(self, moments, cov_moments, moments_deriv,
                 weights, transf_mt):

        self.moments = moments
        self.cov_moments = cov_moments
        self.moments_deriv = moments_deriv
        self.weights = weights
        self.transf_mt = transf_mt

        # derived quantities
        self.moments_constraint = transf_mt.dot(moments)
        self.htw = moments_deriv.T.dot(weights)  # H'W

        # TODO: check these
        self.k_moments = self.moments.shape[-1]  # in this case only 1-D
        # assuming full rank of L'
        self.k_constraints = self.transf_mt.shape[0]

    @cache_readonly
    def asy_transf_params(self):

        moments_deriv = self.moments_deriv  # H
        # weights = self.weights  # W

        htw = self.htw  # moments_deriv.T.dot(weights)  # H'W
        res = np.linalg.solve(htw.dot(moments_deriv), htw)
        # res = np.linalg.pinv(htw.dot(moments_deriv)).dot(htw)
        return -res

    @cache_readonly
    def project_w(self):
        # P_w = I - H (H' W H)^{-1} H' W
        moments_deriv = self.moments_deriv  # H

        res = moments_deriv.dot(self.asy_transf_params)
        res += np.eye(res.shape[0])
        return res

    @cache_readonly
    def asy_transform_mom_constraints(self):
        # L P_w
        res = self.transf_mt.dot(self.project_w)
        return res

    @cache_readonly
    def asy_cov_moments(self):
        """
        `sqrt(T) * g_T(b_0) asy N(K delta, V)`

        mean is not implemented,
        V is the same as the cov_moments argument in `__init__`
        """

        return self.cov_moments

    @cache_readonly
    def cov_mom_constraints(self):

        # linear transformation
        transf = self.asy_transform_mom_constraints

        return transf.dot(self.asy_cov_moments).dot(transf.T)

    @cache_readonly
    def rank_cov_mom_constraints(self):
        return np.linalg.matrix_rank(self.cov_mom_constraints)

    def ztest(self):
        """statistic, p-value and degrees of freedom of separate moment test

        currently two sided test only

        TODO: This can use generic ztest/ttest features and return
        ContrastResults
        """
        diff = self.moments_constraint
        bse = np.sqrt(np.diag(self.cov_mom_constraints))

        # Newey uses a generalized inverse
        stat = diff / bse
        pval = stats.norm.sf(np.abs(stat)) * 2
        return stat, pval

    @cache_readonly
    def chisquare(self):
        """statistic, p-value and degrees of freedom of joint moment test
        """
        diff = self.moments_constraint
        cov = self.cov_mom_constraints

        # Newey uses a generalized inverse
        stat = diff.T.dot(np.linalg.pinv(cov).dot(diff))
        df = self.rank_cov_mom_constraints
        pval = stats.chi2.sf(stat, df)  # Theorem 1
        return stat, pval, df
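

# Shape/usage sketch added for illustration (synthetic matrices, hypothetical
# values, not part of the original module): three moment conditions, one
# estimated parameter, efficient GMM weights, and `transf_mt` selecting the
# last two moments as the test conditions.
def _example_cmt_newey():
    import numpy as np

    moments = np.array([0.05, -0.1, 0.2])
    cov_moments = np.diag([1.0, 1.5, 2.0])  # V
    moments_deriv = np.array([[1.0], [0.5], [0.2]])  # H, (k_moments, k_params)
    weights = np.linalg.inv(cov_moments)  # W, efficient GMM weights
    transf_mt = np.eye(3)[1:]  # L, test the last two moment conditions

    cmt = CMTNewey(moments, cov_moments, moments_deriv, weights, transf_mt)
    stat, pval, df = cmt.chisquare
    print(stat, pval, df)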


class CMTTauchen:
    """generic moment tests or conditional moment tests for Quasi-MLE

    This is a generic class based on Tauchen 1985.

    The main method is `chisquare` which returns the result of the
    conditional moment test.

    Warning: name of class and of methods will likely be changed

    Parameters
    ----------
    score : ndarray, 1-D
        moment condition used in estimation, score of log-likelihood function
    score_deriv : ndarray
        derivative of score function with respect to the parameters that are
        estimated. This is the Hessian in quasi-maximum likelihood
    moments : ndarray, 1-D
        moments that are tested to be zero. They do not need to be derived
        from a likelihood function.
    moments_deriv : ndarray
        derivative of the moment function with respect to the parameters that
        are estimated
    cov_moments : ndarray
        An estimate for the joint (expected) covariance of score and test
        moments. This can be a heteroscedasticity or correlation robust
        covariance estimate, i.e. the inner part of a sandwich covariance.
    """

    def __init__(self, score, score_deriv, moments, moments_deriv, cov_moments):
        self.score = score
        self.score_deriv = score_deriv
        self.moments = moments
        self.moments_deriv = moments_deriv
        self.cov_moments_all = cov_moments

        self.k_moments_test = moments.shape[-1]
        self.k_params = score.shape[-1]
        self.k_moments_all = self.k_params + self.k_moments_test

    @cache_readonly
    def cov_params_all(self):
        m_deriv = np.zeros((self.k_moments_all, self.k_moments_all))
        m_deriv[:self.k_params, :self.k_params] = self.score_deriv
        m_deriv[self.k_params:, :self.k_params] = self.moments_deriv
        m_deriv[self.k_params:, self.k_params:] = np.eye(self.k_moments_test)

        m_deriv_inv = np.linalg.inv(m_deriv)
        # K_inv J K_inv
        cov = m_deriv_inv.dot(self.cov_moments_all.dot(m_deriv_inv.T))
        return cov

    @cache_readonly
    def cov_mom_constraints(self):
        return self.cov_params_all[self.k_params:, self.k_params:]

    @cache_readonly
    def rank_cov_mom_constraints(self):
        return np.linalg.matrix_rank(self.cov_mom_constraints)

    # TODO: not DRY, mostly copied from CMTNewey
    def ztest(self):
        """statistic, p-value and degrees of freedom of separate moment test

        currently two sided test only

        TODO: This can use generic ztest/ttest features and return
        ContrastResults
        """
        # Note: CMTTauchen has no `moments_constraint`; the test moments are
        # `self.moments` directly.
        diff = self.moments
        bse = np.sqrt(np.diag(self.cov_mom_constraints))

        # Newey uses a generalized inverse
        stat = diff / bse
        pval = stats.norm.sf(np.abs(stat)) * 2
        return stat, pval

    @cache_readonly
    def chisquare(self):
        """statistic, p-value and degrees of freedom of joint moment test
        """
        diff = self.moments
        cov = self.cov_mom_constraints

        # Newey uses a generalized inverse, we use it also here
        stat = diff.T.dot(np.linalg.pinv(cov).dot(diff))
        # df = self.k_moments_test
        # We allow for redundant mom_constraints:
        df = self.rank_cov_mom_constraints
        pval = stats.chi2.sf(stat, df)
        return stat, pval, df
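

# Shape/usage sketch added for illustration (synthetic arrays, hypothetical
# values, not part of the original module): one estimated parameter, two test
# moments, and an OPG-style joint covariance of score and test moments.
def _example_cmt_tauchen():
    import numpy as np

    rng = np.random.default_rng(0)
    nobs = 500
    score_obs = rng.normal(size=(nobs, 1))
    mom_obs = rng.normal(size=(nobs, 2))
    all_obs = np.column_stack((score_obs, mom_obs))

    score = score_obs.sum(0)
    moments = mom_obs.sum(0)
    score_deriv = np.array([[nobs * 1.0]])  # stand-in for the hessian
    moments_deriv = np.zeros((2, 1))  # test moments insensitive to params
    cov_moments = all_obs.T.dot(all_obs)  # OPG of the stacked moments

    cmt = CMTTauchen(score, score_deriv, moments, moments_deriv, cov_moments)
    stat, pval, df = cmt.chisquare
    print(stat, pval, df)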