How to do (and how not to do) time series analysis
Ironman of Political Calculations tries (and fails) to do a serious analysis of whether the drought had an impact on Kansas GDP. He fails on so many counts, it’s hard to cover all bases, but I’ll provide a few pointers. Overarching this entire debate, recall the initial question was whether the measured shortfall in Kansas overall GDP growth could be attributed to the drought.
Problem 1. Just because drought measurably affects the agricultural sector doesn’t mean the entire economy is measurably affected.
Agriculture, forestry, fishing, hunting (hereafter “agriculture”) value added has accounted for only 3.7% of Kansas GDP, on average, over the available sample period, as shown in Figure 1.
Figure 1: Share of Kansas GDP attributable to Agriculture, forestry, fishing, hunting (blue), and average value (red). Light brown shading is drought as defined by Political Calculations. Source: BEA (July 27, 2016), and author’s calculations.
I have no difficulty with the proposition that drought might have an impact on agricultural value added; but if agriculture only accounts for a small portion of the economy, one could have a drought and still not have a substantial impact on GDP.
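To get a feel for the orders of magnitude, here is a minimal back-of-the-envelope sketch in Python. The 3.7% share is the sample average quoted above; the declines in agricultural output are purely hypothetical numbers chosen for illustration, and the calculation is a first-order approximation that ignores any spillovers to other sectors.

```python
# Back-of-the-envelope: direct contribution of one sector to aggregate growth.
# The share is the sample average quoted in the text; the sector shocks are
# hypothetical and serve only to illustrate the arithmetic.

AG_SHARE = 0.037  # average agriculture share of Kansas GDP (from the text)


def direct_gdp_contribution(sector_share: float, sector_growth: float) -> float:
    """Approximate contribution of a sector to GDP growth.

    First-order approximation: contribution = share * sector growth.
    Spillovers to other sectors are ignored.
    """
    return sector_share * sector_growth


if __name__ == "__main__":
    for shock in (-0.05, -0.10, -0.20):  # hypothetical declines in ag output
        hit = direct_gdp_contribution(AG_SHARE, shock)
        print(f"ag output {shock:+.0%} -> GDP {hit * 100:+.2f} percentage points")
```

Under these assumptions, even a large hypothetical fall in agricultural output translates into a direct GDP effect of well under one percentage point.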
Interestingly, the share of value added rises during the period in which Political Calculations alleges a drought-induced hit to Kansas GDP. Expressing the (log) shares in real terms, one finds a similar story: the agriculture share is higher during the drought. (For a discussion of why log ratios are used, see here.) This is shown in Figure 2.
Figure 2: Log ratio of Kansas real GDP attributable to Agriculture, forestry, fishing, hunting minus log ratio in 2005Q1 (blue), and average value (red). Light brown shading is drought as defined by Political Calculations. Source: BEA (July 27, 2016), and author’s calculations.
If the drought hit agriculture hard enough to diminish Kansas GDP, why doesn’t it show up in these averages?
Problem 2. Running regressions when one doesn’t know what one is doing can be problematic. Ironman runs a simple bivariate regression of Kansas real agricultural output on the Kansas drought index. Nothing stops one from doing that; the question is whether the estimated coefficient converges in distribution to anything in particular. It won’t if both variables are stochastically trending, but fail to be cointegrated. Figure 3 shows the relevant time series.
Figure 3: Kansas real GDP attributable to Agriculture, forestry, fishing, hunting, in millions Ch.2009$ (blue), and Palmer Drought Severity Index, PDSI (red). Light brown shading is drought as defined by Political Calculations. Source: BEA (July 27, 2016), NOAA, and author’s calculations. (graph updated to correct PDSI series for transcription errors, 9/11)
What do formal tests for stationarity indicate? For the level of real agricultural output Political Calculations uses, the Augmented Dickey-Fuller test (constant, trend) fails to reject the null hypothesis of a unit root at the 5% level. The Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test with a trend stationarity null rejects at the 5% level (Bartlett kernel, Newey-West bandwidth). For more on using a combination of unit root and trend stationarity tests to make inferences, see e.g. Cheung and Chinn (Oxford Economic Papers, 1996).
What about the drought index? For the level of this series, the test fails to reject the unit root null at any conventional level of significance (5%, 10%, 20%, 50%). Depending on the specification, the KPSS test fails to reject the trend stationary null at the 5% level (using Newey-West bandwidth, the EViews default) or borderline rejects (using Andrews bandwidth) [edited 9/12]. So this too seems like an integrated series. So, if one had to guess, one could say both series are integrated of order 1.
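For readers who have not seen a unit root test in action, the following is a minimal sketch of the idea using only the Python standard library. It is not the test specification used above: it runs the simplest Dickey-Fuller regression (constant, no trend, no lagged differences) on simulated data and compares the t-statistic with the asymptotic 5% critical value of -2.86. The function names, sample size, and AR coefficients are all illustrative assumptions.

```python
import random

DF_CRIT_5PCT = -2.86  # asymptotic 5% Dickey-Fuller critical value (constant, no trend)


def df_tstat(series):
    """t-statistic on rho in the regression diff(x_t) = a + rho * x_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    n = len(lagged)
    mx, my = sum(lagged) / n, sum(diffs) / n
    sxx = sum((v - mx) ** 2 for v in lagged)
    sxy = sum((v - mx) * (d - my) for v, d in zip(lagged, diffs))
    syy = sum((d - my) ** 2 for d in diffs)
    rho = sxy / sxx
    resid_var = (syy - rho * sxy) / (n - 2)
    return rho / (resid_var / sxx) ** 0.5


def simulate_ar1(n, phi, rng):
    """Simulate x_t = phi * x_{t-1} + e_t; phi = 1 gives a random walk."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def rejection_rate(phi, n=100, reps=400, seed=0):
    """Share of simulated series for which the unit root null is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        df_tstat(simulate_ar1(n, phi, rng)) < DF_CRIT_5PCT for _ in range(reps)
    )
    return rejections / reps


if __name__ == "__main__":
    print("random walk (phi = 1.0):", rejection_rate(1.0))
    print("stationary  (phi = 0.5):", rejection_rate(0.5))
```

The point of the simulation is that the test rarely rejects when the series really is a random walk and rejects often when it is clearly stationary. In applied work one would use a packaged routine that handles trends, lag selection, and finite-sample critical values.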
In other words, Ironman is regressing an integrated series on an integrated series. The high R-squared (0.76) and big t-statistic (7.4) that Ironman touts are exactly what Granger and Newbold (1974) predicted when one regresses a random walk series on another random walk series, i.e., “spurious correlation”. Every serious empirical economist I know is aware of this; Ironman is apparently not. Unless the two series are cointegrated, estimation in first differences is appropriate, and a conventional Johansen maximum likelihood test (constant in the cointegrating vector and in the test VAR, 1 lag of first differences) fails to reject the no-cointegration null hypothesis. (See this paper for a primer.) A simple regression of the first (log) difference of agricultural output on the first difference of the drought index yields an adjusted R-squared of -0.00 (corrected from -0.01) and a t-statistic on the drought index of 1.44 (corrected from 1.05), using HAC robust standard errors.
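The Granger and Newbold result is easy to reproduce. The sketch below, using only the Python standard library, simulates pairs of independent random walks and records how often a bivariate regression reports |t| > 2, first in levels and then in first differences. The sample length of 41, the number of replications, and the seed are arbitrary illustrative choices, and the t-statistics use conventional OLS standard errors rather than the HAC standard errors reported above.

```python
import random


def ols(x, y):
    """Bivariate OLS of y on a constant and x; return (t-stat on slope, R-squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    ssr = syy - slope * sxy  # sum of squared residuals
    se = (ssr / (n - 2) / sxx) ** 0.5
    return slope / se, 1.0 - ssr / syy


def random_walk(n, rng):
    """Cumulated standard normal shocks: an integrated series of order 1."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def share_significant(n=41, reps=500, seed=1):
    """Share of simulations with |t| > 2, in levels and in first differences."""
    rng = random.Random(seed)
    hits_levels = hits_diffs = 0
    for _ in range(reps):
        # The two series are independent by construction.
        x, y = random_walk(n, rng), random_walk(n, rng)
        t_lev, _ = ols(x, y)
        t_dif, _ = ols(diff(x), diff(y))
        hits_levels += abs(t_lev) > 2
        hits_diffs += abs(t_dif) > 2
    return hits_levels / reps, hits_diffs / reps


if __name__ == "__main__":
    levels, diffs = share_significant()
    print(f"'significant' in levels:      {levels:.0%}")
    print(f"'significant' in differences: {diffs:.0%}")
```

Although the series are unrelated, the levels regression finds a "significant" relationship far more often than the nominal 5%, while the differenced regression does not.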
Problem 3. Forgetting the question. This debate arose because I asserted that lagging growth in Kansas economic activity overall was due to tax cuts and associated government spending cuts. Ironman asserted the drought was the culprit. But if one wants to answer the original question, one needs to look at what happened to Kansas GDP, not agricultural value added.
In order to find out what is attributable to what, one needs to include the factors deemed of importance. Hence, I estimate an error correction model incorporating (log) Kansas GDP, US GDP, the real value of the dollar for Kansas (from the Dallas Fed) in levels, and the Palmer Drought Severity Index (PDSI) and the government component of Kansas GDP in first differences. y is log GDP, r is the real value of the dollar for Kansas.
$$\Delta y_t^{KS} = 1.696 - 0.258\,y_{t-1}^{KS} + 0.165\,y_{t-1}^{US} - 0.055\,r_{t-1} + 0.993\,\Delta y_t^{US} + (\text{0 to 2 lags of first difference of PDSI}) + (\text{0 to 2 lags of first difference of log KS govt}) + (\text{0 to 2 lags of first difference of } r)$$
Adj. R-squared = 0.57, SER = 0.0077, n = 41, sample = 2006q1-2016q1, Q(4) stat = 3.618 [p-val = 0.451], Q(8) stat = 5.323 [p-val = 0.723]. Bold denotes significance at 5% msl, using HAC robust standard errors. Bold italics denotes significance at 15% level. [Regression output] [Data] [NOAA data, accessed 9/11/16] (PDSI data transcription errors corrected 9/11)
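One way to read the level terms in an error correction model is to ask what they imply once all first differences are set to zero. The sketch below does that arithmetic in Python using the point estimates reported above. The long-run coefficients and the half-life are my own illustrative calculations, not figures from the regression output; they ignore sampling uncertainty, and the half-life formula also ignores the short-run dynamics.

```python
import math

# Point estimates copied from the estimated equation in the text.
ADJ_COEF = -0.258     # coefficient on lagged log Kansas GDP
US_COEF = 0.165       # coefficient on lagged log US GDP
DOLLAR_COEF = -0.055  # coefficient on the lagged real dollar value


def long_run_coefficient(level_coef: float, adj_coef: float) -> float:
    """Long-run response of y implied by setting all first differences to zero.

    From 0 = c + adj_coef * y + level_coef * x, one gets dy/dx = -level_coef / adj_coef.
    """
    return -level_coef / adj_coef


def half_life(adj_coef: float) -> float:
    """Quarters needed to close half of a gap from the long-run relation.

    Approximation that treats the adjustment as a simple AR(1) with
    persistence (1 + adj_coef), ignoring the short-run difference terms.
    """
    return math.log(0.5) / math.log(1.0 + adj_coef)


if __name__ == "__main__":
    print(f"long-run coefficient on US GDP: {long_run_coefficient(US_COEF, ADJ_COEF):.2f}")
    print(f"long-run coefficient on dollar: {long_run_coefficient(DOLLAR_COEF, ADJ_COEF):.2f}")
    print(f"half-life of a deviation:       {half_life(ADJ_COEF):.1f} quarters")
```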
Real Kansas GDP and the fitted values from the estimated equation are displayed in Figure 4.
Figure 4: Kansas GDP in millions Ch.2009$ SAAR (blue), and fitted values (red). Left scale in log terms. Light brown shading is drought as defined by Political Calculations. Source: BEA (July 27, 2016), and author’s calculations (corrected 9/11).
A Wald test (F-test) for a restriction that the coefficients on the first differences of the dollar value are all zero fails to reject. The corresponding restriction for government value added is borderline rejected at the 5% level, and the restriction that the sum of the coefficients is zero is not rejected.
The joint restriction that all the coefficients on the PDSI are zero is also rejected. The interesting thing is that the only statistically significant coefficient goes in the wrong direction: a worsening of the drought is associated with an acceleration in growth at a 2-quarter lag. In addition, the restriction that the sum is zero is rejected, in favor of the alternative of a negative coefficient.
What variable is most important for determining the growth rate of the Kansas economy? The reported coefficients do not directly address that issue; that depends on the interaction of the coefficients and the independent variables. One way of getting at the question is to refer to a “standardized beta”. Of the statistically significant first difference terms, the largest is for contemporaneous US GDP; the second is government value added at lag 1 (0.32). The third is the PDSI at lag 2 (-0.22), although the coefficient is of the opposite direction of what is asserted in Political Calculations.
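For reference, a standardized beta rescales a regression coefficient by the ratio of the standard deviation of the regressor to that of the dependent variable, so that it reads as "standard deviations of y per standard deviation of x". The sketch below assumes that usual definition; the function name and the toy data are hypothetical and have nothing to do with the Kansas series.

```python
from statistics import stdev


def standardized_beta(coef, x, y):
    """Rescale a regression coefficient by sd(x) / sd(y).

    coef is the estimated coefficient on regressor x in a regression
    with dependent variable y (both measured over the same sample).
    """
    return coef * stdev(x) / stdev(y)


if __name__ == "__main__":
    # Toy data, made up for illustration: y is exactly 2 * x.
    x = [1, 2, 3, 4]
    y = [2.0, 4.0, 6.0, 8.0]
    print(standardized_beta(2.0, x, y))
```

Because the rescaling puts every regressor on a common footing, standardized betas can be compared across variables measured in very different units, such as a drought index and log GDP.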
Bottom line: Excel in the hands of the econometrically-ignorant is like a kid running with scissors.
Corollary: Government’s share of GDP in Kansas seems more important than the drought in driving Kansas growth rates.