* Declare the data as time-series using the variable "year" tsset year * Estimate an OLS regression of y on x1 and x2 reg y x1 x2 * Compute the Durbin–Watson statistic for autocorrelation estat dwatson * Breusch–Godfrey test for autocorrelation up to lag 2 (large-sample version) estat bgodfrey, lags(2) * Obtain residuals from the regression and save them as "uhat" predict uhat, resid * Perform the Ljung–Box Q test for autocorrelation up to lag 2 wntestq uhat, lags(2) * Display the autocorrelation and partial autocorrelation of the residuals up to lag 2 * (this output also includes Ljung–Box Q statistics for each lag) corrgram uhat, lags(2) * Estimating a linear regression model when the errors exhibit AR(1) serial correlation using the Cochrane–Orcutt (corc) transformation. The disturbances are assumed to follow a first-order autoregressive process. prais y x1 x2, corc twostep * twostep: stop after the first iteration * Estimating a linear regression model when the errors exhibit AR(1) serial correlation using the Prais-Winsten transformation. The disturbances are assumed to follow a first-order autoregressive process. prais y x1 x2, twostep * twostep: stop after the first iteration ************************************************************************************************************** *----------------------------------------------------------------------- * A) DURBIN–WATSON TEST FOR FIRST-ORDER AUTOCORRELATION *----------------------------------------------------------------------- * Estimate the main OLS regression model: Y_t = β0 + β1 X1_t + β2 X2_t + u_t reg y x1 x2 * Compute the Durbin–Watson statistic for testing first-order autocorrelation: * H0: ρ = 0 (no autocorrelation) * H1: ρ ≠ 0 (autocorrelation exists) *----------------------------------------------------------------------- * Comparison: Correlation between uhat and uhat_lag1 vs. Estimated ρ_hat from the residual regression uhat_t = ρ_hat * uhat_(t-1) + v_t (no constant term) *----------------------------------------------------------------------- * Obtain residuals from the regression and save them as "uhat" predict uhat, resid * Generate lagged residuals (uhat_t-1) gen uhat_lag1 = L.uhat * Compute the sample correlation between uhat and uhat_lag1 corr uhat uhat_lag1 * This correlation provides an approximate estimate of ρ_hat obtained from the regression: uhat_t = ρ_hat * uhat_(t-1) + v_t (no constant term) reg uhat uhat_lag1, noconstant * The estimated coefficient _b[uhat_lag1] should closely match the sample correlation reported above. *----------------------------------------------------------------------- * Compare the two methods of calculating Durbin–Watson statistic *----------------------------------------------------------------------- * METHOD 1: Direct Durbin–Watson test (built-in) *----------------------------------------------------------------------- * d = Σ(uhat_t - uhat_(t-1))^2 / Σ(uhat_t^2) *----------------------------------------------------------------------- * Compute squared differences (uhat_t − uhat_t-1)^2 gen diff_sq = (uhat - uhat_lag1)^2 replace diff_sq = . if missing(uhat_lag1) * Compute squared residuals uhat_t^2 gen uhat_sq = uhat^2 * Sum numerator: Σ(diff_sq) quietly summarize diff_sq scalar num = r(sum) * Sum denominator: Σ(uhat_sq) quietly summarize uhat_sq scalar den = r(sum) * Compute Durbin–Watson statistic scalar d_manual = num / den display "--------------------------------------------------------------" display " Manual Durbin–Watson statistic (computed from residuals):" display " d = " %9.6f d_manual display "--------------------------------------------------------------" * METHOD 2: Using the correlation between residuals and their lag * Regression of residuals on their lag (uhat_t = ρ_hat * uhat_(t-1) + v_t) reg uhat uhat_lag1, noconstant * Compute the approximate DW statistic using d ≈ 2*(1 - ρ_hat) display "Approximate DW from residual-lag correlation: " 2*(1 - _b[uhat_lag1]) * Compare the DW statistic from Method 1 with the approximate DW from Method 2 * They should be very close, demonstrating equivalence of the two approaches ************************************************************************************************************** *----------------------------------------------------------------------- * B.1) BREUSCH–GODFREY TEST FOR AR(p) AUTOCORRELATION (manual calculation) *----------------------------------------------------------------------- * STEP 1: Estimate the main OLS regression model: Y_t = β0 + β1 X1_t + β2 X2_t + u_t reg y x1 x2 * STEP 2: Obtain residuals from the regression and save them as "uhat" predict uhat, resid * STEP 3: Generate lagged residuals up to lag p = 2 (example) gen uhat_lag1 = L.uhat gen uhat_lag2 = L2.uhat * STEP 4: Auxiliary regression of residuals on original regressors + lagged residuals * uhat_t = α0 + α1 X1_t + α2 X2_t + ρ1 uhat_(t-1) + ρ2 uhat_(t-2) + v_t reg uhat x1 x2 uhat_lag1 uhat_lag2 * STEP 5: Record R-squared from the auxiliary regression display "R-squared from auxiliary regression: " e(r2) *----------------------------------------------------------------------- * STEP 6: Compute LM statistic manually * LM = (T - p) * R^2_aux * Under H0, LM ~ χ² with p degrees of freedom *----------------------------------------------------------------------- local p = 2 local T = e(N) local LM_stat = (`T' - `p') * e(r2) display "LM statistic: " `LM_stat' * STEP 7: Compute p-value for LM statistic * p-value = P(χ²_p > LM_stat) display "p-value for LM test: " chi2tail(`p', `LM_stat') *----------------------------------------------------------------------- * STEP 8: Compute F statistic manually * F = (R^2_aux / p) / ((1 - R^2_aux) / (T - k - 2p - 1)) * Under H0, F ~ F(p, T - k - 2p - 1) *----------------------------------------------------------------------- local k = 2 local F_stat = (e(r2)/`p') / ((1 - e(r2))/(`T' - `k' - 2*`p' - 1)) display "F statistic: " `F_stat' * STEP 9: Compute p-value for F statistic display "p-value for F test: " Ftail(`p', `T' - `k' - 2*`p' - 1, `F_stat') *----------------------------------------------------------------------- * Notes: * - This manual procedure replicates the Breusch–Godfrey test. * - LM statistic follows chi-square distribution with p degrees of freedom. * - F statistic follows F-distribution with p and T-k-2p-1 degrees of freedom. * - Reject H0 if p-value < significance level α, indicating evidence of autocorrelation. *----------------------------------------------------------------------- ************************************************************************************************************** *----------------------------------------------------------------------- * B.2) BREUSCH–GODFREY TEST FOR AR(p) AUTOCORRELATION * (manual calculation WITH ZERO-FILLED LAGGED RESIDUALS) *----------------------------------------------------------------------- * STEP 1: Estimate the main OLS regression model * Y_t = β0 + β1 X1_t + β2 X2_t + u_t reg y x1 x2 *----------------------------------------------------------------------- * STEP 2: Obtain residuals from the regression and save them as "uhat" *----------------------------------------------------------------------- predict uhat, resid *----------------------------------------------------------------------- * STEP 3: Generate lagged residuals up to lag p = 2 * Replace missing lag values with ZERO (no observations dropped) *----------------------------------------------------------------------- local p = 2 gen uhat_lag1 = L.uhat replace uhat_lag1 = 0 if missing(uhat_lag1) gen uhat_lag2 = L2.uhat replace uhat_lag2 = 0 if missing(uhat_lag2) *----------------------------------------------------------------------- * STEP 4: Auxiliary regression: * uhat_t = α0 + α1 X1_t + α2 X2_t + ρ1 uhat_(t-1) + ρ2 uhat_(t-2) + v_t *----------------------------------------------------------------------- reg uhat x1 x2 uhat_lag1 uhat_lag2 *----------------------------------------------------------------------- * STEP 5: Extract R-squared from the auxiliary regression *----------------------------------------------------------------------- display "R-squared from auxiliary regression: " e(r2) *----------------------------------------------------------------------- * STEP 6: Compute LM statistic (zero-filled version) * LM = T * R^2_aux (not T - p) * LM ~ χ²(p) *----------------------------------------------------------------------- local T = e(N) local LM_stat = `T' * e(r2) display "LM statistic (zero-filled version): " `LM_stat' display "LM p-value: " chi2tail(`p', `LM_stat') *----------------------------------------------------------------------- * STEP 7: Compute F statistic (zero-filled version) * F = (R^2_aux / p) / ((1 − R^2_aux) / (T − k − p − 1)) * F ~ F(p, T − k − p − 1) *----------------------------------------------------------------------- local k = 2 local df_denom = `T' - `k' - `p' - 1 local F_stat = (e(r2)/`p') / ((1 - e(r2)) / `df_denom') display "F statistic (zero-filled version): " `F_stat' display "F-test p-value: " Ftail(`p', `df_denom', `F_stat') *----------------------------------------------------------------------- * Notes: * - This version uses zero-filling instead of dropping the first p observations. * - Therefore: * LM = T * R^2_aux * F = (R^2_aux / p) / ((1 - R^2_aux) / (T - k - p - 1)) * - LM ~ chi-square(p), F ~ F(p, T-k-p-1) * - Reject H0 if p-value < α → evidence of AR(p) autocorrelation. *----------------------------------------------------------------------- ************************************************************************************************************** *-----------------------------------------------------------------------* * C) LJUNG–BOX TEST FOR AUTOCORRELATION (manual calculation, p = 2) *-----------------------------------------------------------------------* * Step 1: Run the OLS regression * Regress y on x1 and x2 regress y x1 x2 * Step 2: Obtain residuals * Store residuals in variable 'uhat' predict uhat, resid * Step 3: Compute denominator for autocorrelations * Denominator = sum of squared residuals gen uhat2 = uhat^2 quietly summarize uhat2 if !missing(uhat2) local denom = r(sum) * Step 4: Generate lagged residuals (lag 1 and 2) gen uhat_lag1 = L1.uhat gen uhat_lag2 = L2.uhat * Step 5: Compute numerators for autocorrelations gen num1 = uhat * uhat_lag1 quietly summarize num1 if !missing(num1) local rho1 = r(sum)/`denom' gen num2 = uhat * uhat_lag2 quietly summarize num2 if !missing(num2) local rho2 = r(sum)/`denom' * Step 6: Calculate Ljung–Box statistic (order 2) local T = _N scalar LB = `T'*(`T'+2)*((`rho1'*`rho1')/(`T'-1) + (`rho2'*`rho2')/(`T'-2)) display "Ljung–Box statistic (order 2): " %9.6f LB * Step 7: Calculate p-value using correct upper-tail function scalar pvalue = chi2tail(2, LB) display "p-value: " %9.6f pvalue * Step 8: Decision * If p-value <= 0.05 -> reject H0 (autocorrelation exists) * If p-value > 0.05 -> fail to reject H0 (no autocorrelation) ************************************************************************************************************** *----------------------------------------------------------------------- * D.1. COCHRANE–ORCUTT TRANSFORMATION (manual implementation) *----------------------------------------------------------------------- * STEP 1: Estimate the original OLS model * Y_t = β0 + β1 X1_t + β2 X2_t + ... + βk Xk_t + u_t reg y x1 x2 *----------------------------------------------------------------------- * STEP 2: Obtain OLS residuals u_hat *----------------------------------------------------------------------- predict uhat, resid *----------------------------------------------------------------------- * STEP 3: Regress residuals on their first lag to estimate ρ * u_hat_t = ρ * u_hat_(t-1) + v_t *----------------------------------------------------------------------- gen uhat_lag = L.uhat reg uhat uhat_lag, noconstant * Store the estimated autocorrelation coefficient scalar rho = _b[uhat_lag] display "Estimated rho (Cochrane–Orcutt): " rho *----------------------------------------------------------------------- * STEP 4: Generate transformed dependent variable * Y_t* = Y_t – ρ_hat * Y_(t-1) *----------------------------------------------------------------------- gen y_lag = L.y gen y_star = y - rho * y_lag *----------------------------------------------------------------------- * STEP 5: Generate transformed regressors * X_it* = X_it – ρ_hat * X_i,(t-1) *----------------------------------------------------------------------- gen x1_lag = L.x1 gen x1_star = x1 - rho * x1_lag gen x2_lag = L.x2 gen x2_star = x2 - rho * x2_lag * (Add more regressors similarly if k > 2) gen x0_star = 1 - rho *----------------------------------------------------------------------- * STEP 6: Estimate the transformed model: * y*_t = β0 x0*_t + β1 x1*_t + β2 x2*_t + ... + v_t *----------------------------------------------------------------------- reg y_star x0_star x1_star x2_star, noconstant * Store estimated coefficients scalar b0_hat = _b[x0_star] scalar b1_hat = _b[x1_star] scalar b2_hat = _b[x2_star] *----------------------------------------------------------------------- display "Estimated beta_0 (original intercept): " b0_hat display "Estimated beta_1: " b1_hat display "Estimated beta_2: " b2_hat *----------------------------------------------------------------------- * Interpretation Notes: * - Cochrane–Orcutt corrects for AR(1) autocorrelation. * - If desired, iterations can be applied, but the above code implements the pure one-step Cochrane–Orcutt procedure. ************************************************************************************************************** *----------------------------------------------------------------------- * D.2. PRAIS–WINSTEN TRANSFORMATION (manual implementation) *----------------------------------------------------------------------- *----------------------------------------------------------------------- * STEP 1: Estimate the original OLS model *----------------------------------------------------------------------- reg y x1 x2 *----------------------------------------------------------------------- * STEP 2: Obtain OLS residuals *----------------------------------------------------------------------- predict uhat, resid *----------------------------------------------------------------------- * STEP 3: Estimate rho by regressing residuals on lagged residuals *----------------------------------------------------------------------- gen uhat_lag = L.uhat reg uhat uhat_lag, noconstant scalar rho = _b[uhat_lag] display "Estimated rho (Prais–Winsten): " rho *----------------------------------------------------------------------- * STEP 4: Construct the transformation multiplier for t=1 *----------------------------------------------------------------------- scalar pw = sqrt(1 - rho^2) *----------------------------------------------------------------------- * STEP 5: Generate lags *----------------------------------------------------------------------- gen y_lag = L.y gen x1_lag = L.x1 gen x2_lag = L.x2 * (Add further regressors similarly if needed) *----------------------------------------------------------------------- * STEP 6: Generate transformed dependent variable Y* *----------------------------------------------------------------------- gen y_star = . replace y_star = pw * y if _n == 1 replace y_star = y - rho * y_lag if _n > 1 *----------------------------------------------------------------------- * STEP 7: Generate transformed regressors X* *----------------------------------------------------------------------- gen x1_star = . replace x1_star = pw * x1 if _n == 1 replace x1_star = x1 - rho * x1_lag if _n > 1 gen x2_star = . replace x2_star = pw * x2 if _n == 1 replace x2_star = x2 - rho * x2_lag if _n > 1 * (Add more regressors as needed) *----------------------------------------------------------------------- * STEP 8: Transformed intercept regressor X0* *----------------------------------------------------------------------- gen x0_star = . replace x0_star = pw if _n == 1 replace x0_star = 1-rho if _n > 1 *----------------------------------------------------------------------- * STEP 9: Estimate the Prais–Winsten transformed model (with no constant) *----------------------------------------------------------------------- reg y_star x0_star x1_star x2_star, noconstant *----------------------------------------------------------------------- display "Estimated beta_0: " _b[x0_star] display "Estimated beta_1: " _b[x1_star] display "Estimated beta_2: " _b[x2_star] *----------------------------------------------------------------------- * This implements the single-iteration Prais–Winsten estimator. * Iterative Prais–Winsten can be programmed but is typically handled automatically by Stata's built-in command. *-----------------------------------------------------------------------