# Identities, Parameters and Regressions

Your final jab in this post regresses the state output gap on the fiscal gap. You then conclude that there is a positive relation between the two and that this somehow implies that a reduction in gov’t spending is a drag on the economy. I’ll just point out that …. that gov’t spending is a component of GSP. Of course they’re positively related.

I think this comment reflects a commonplace confusion between identities, functional relationships, and reduced form coefficients.

For instance, consider industry-level data for a state, such as those reported by BEA: total private output (agriculture, mining, manufacturing, and so on through other services) and government output have to sum to total gross state product (in the current lexicon, state GDP). So, one would think that regressing total output on government output has to yield a positive coefficient. Run the regression over the 2005Q1-15Q3 period (the entire available sample) for Wisconsin:

$GDP_{WI,t} = 262893.4 - 0.215\,GOV_{WI,t} + u_t$

Adj-R² = -0.02. Bold denotes significance at 10% msl using HAC robust standard errors.

Lest you think this proves higher government spending causes less output, well, consider the same regression for Kansas.

$GDP_{KS,t} = 18829.71 + 5.895\,GOV_{KS,t} + u_t$

Adj-R² = 0.34. Bold denotes significance at 10% msl using HAC robust standard errors.

This is a lesson I learned as an undergraduate — do not appeal to an accounting identity for information about a coefficient.
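A minimal simulation sketch of the point, using made-up numbers rather than the BEA data: total output is built as private plus government output, so the identity holds exactly in every observation, yet the sign of the regression slope depends entirely on how private output co-moves with government output. The function name, the means, and the offset of 3 are all hypothetical choices for illustration.

```python
import numpy as np

def slope_of_total_on_component(private, government):
    """OLS slope from regressing total = private + government on government."""
    total = private + government              # the accounting identity
    g_dev = government - government.mean()
    return float(g_dev @ (total - total.mean()) / (g_dev @ g_dev))

rng = np.random.default_rng(0)
n = 200
government = rng.normal(100.0, 5.0, n)

# Hypothetical case A: private output moves independently of government output.
private_indep = rng.normal(900.0, 10.0, n)
# Hypothetical case B: private output falls by 3 for each unit rise in government output.
private_offset = 1200.0 - 3.0 * government + rng.normal(0.0, 5.0, n)

slope_indep = slope_of_total_on_component(private_indep, government)
slope_offset = slope_of_total_on_component(private_offset, government)
print(f"independent private sector: slope = {slope_indep:.2f}")
print(f"offsetting private sector:  slope = {slope_offset:.2f}")
```

In both cases the identity is satisfied by construction; only the assumed behavior of the private sector differs.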

Let’s turn to an example that is more familiar. We all know the expenditure side definition of GDP from macro:

Y ≡ C + I + G + EX – IM

I stress in my undergraduate courses that one cannot use identities to explain how the world works or, more concretely, how Y responds when G changes. For that, one needs a model.

So, the hapless regression-runner might regress Y on G, asserting one must get a positive coefficient since G is by construction a component of Y. But in fact different theories yield different implied coefficients.
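To see why the identity alone does not pin down the sign, it can help to write out the OLS slope from regressing Y on G. The shorthand $X \equiv C + I + EX - IM$ is introduced here only for illustration, and hats denote sample moments:

$$
\hat{\beta} = \frac{\widehat{\mathrm{Cov}}(Y, G)}{\widehat{\mathrm{Var}}(G)} = \frac{\widehat{\mathrm{Cov}}(X + G,\, G)}{\widehat{\mathrm{Var}}(G)} = 1 + \frac{\widehat{\mathrm{Cov}}(X, G)}{\widehat{\mathrm{Var}}(G)}
$$

The identity delivers the "1"; everything else depends on how the other components co-move with G in the sample. If that covariance is sufficiently negative, the slope is negative even though G is a component of Y.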

Suppose aggregate supply is given by $Y = Y_n = \Phi F(K, N)$, and aggregate demand is given by $Y = \mathrm{fn}(G, T, M/P)$, where $K$, $N$ are given. Then the correlation between Y and G is … zero!

Suppose P is predetermined within a period, and aggregate demand takes the same form as above. Then the correlation between Y and G is positive, but not necessarily one (it’ll depend on the marginal propensity to consume, interest sensitivity of investment, interest sensitivity of money demand, etc.).

Suppose P depends positively on the output gap (i.e., a Phillips Curve holds), such that $P = P^e + \theta(Y - Y_n)$. Then the correlation is positive, but (holding all else constant) smaller than that in the previous example.

Suppose P depends positively on the output gap, and the fiscal authorities rely on a “fiscal rule” that achieves counter-cyclical stabilization, e.g., $G = \varphi(Y - Y_n)$, $\varphi < 0$. Then the implied correlation between Y and G is now negative.

One could in principle estimate the reduced form coefficient in the first three cases (after accounting for omitted variables, etc.); the parameters (e.g., θ) would require determining an appropriate instrumental variable. The correlation can always be calculated; whether it’s meaningful is another question.
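The four cases can be sketched with a linearized toy model. Everything here is an assumption made for illustration: variables are deviations from their means, `m` is a fixed-price multiplier, `a` is the sensitivity of demand to the price level, and the fiscal rule is given small discretionary shocks `eta` so that G is not an exact function of the output gap. The sign pattern, not the magnitudes, is the point.

```python
import numpy as np

def ols_slope(y, g):
    """OLS slope from regressing y on g (with an intercept)."""
    g_dev = g - g.mean()
    return float(g_dev @ (y - y.mean()) / (g_dev @ g_dev))

rng = np.random.default_rng(1)
n = 5000
m, a, theta, phi = 1.5, 0.5, 1.0, -0.5    # all hypothetical parameter values

g_exog = rng.normal(0.0, 1.0, n)   # exogenous government spending
eps = rng.normal(0.0, 1.0, n)      # demand shocks
supply = rng.normal(0.0, 1.0, n)   # shocks to natural output
eta = rng.normal(0.0, 0.1, n)      # small deviations from the fiscal rule

# Case 1: flexible prices, so output equals natural output whatever G is.
y1 = supply
# Case 2: P predetermined, so demand determines output.
y2 = m * g_exog + eps
# Case 3: Phillips curve; the price response damps the effect of demand.
damp = 1.0 / (1.0 + a * theta)
y3 = damp * (m * g_exog + eps)
# Case 4: Phillips curve plus rule g = phi * y + eta, solved jointly with
# y = damp * (m * g + eps).
m_d = damp * m
y4 = (m_d * eta + damp * eps) / (1.0 - m_d * phi)
g4 = phi * y4 + eta

slopes = {
    "flexible prices": ols_slope(y1, g_exog),
    "fixed prices": ols_slope(y2, g_exog),
    "Phillips curve": ols_slope(y3, g_exog),
    "fiscal rule": ols_slope(y4, g4),
}
for case, slope in slopes.items():
    print(f"{case:16s} slope of Y on G = {slope:6.2f}")
```

In the fiscal-rule case the negative slope reflects the policy reaction to demand shocks, not the effect of G on Y, which is still positive by construction (`m_d > 0`).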

Bottom Line: Identities do not tell you about behavior. Inferring causality is hard, but if one wants to tell a story, one has to try to deal with the data in an intelligent way.

1. 2slugbaits

Menzie I can see where some folks might misunderstand and not catch the fact that your regressions are using the same “t” on both sides of the regressions, so these are contemporaneous “relationships”…or rather “non-relationships.”

Ok, but come on, is that really what the commenter meant? If in time t a government goes in debt and does stimulus spending at 20 percent of GDP, couldn’t one realistically assume that in time t GDP would be greater than it would if the stimulus had not occurred, given that G is in the formula?

If you can’t even assume that, then should anyone really be arguing for stimulus policies?

1. Menzie Chinn Post author

Anonymous: Sure, but when does government spending to GDP rise by 20 percentage points? In 2008Q4, government consumption and investment was 21.0%; by 2009Q3 (as the ARRA kicked in), it peaked at 21.6%, and then declined. And it’s been declining ever since. Hence, your example is completely irrelevant.

5. Julian Silk

Dear Menzie,

I agree with all this, but think you would have a more conclusive result, if one with poorer R-squares, if you regressed changes in GSP vs. changes in state government spending. After all, it was the cuts in Kansas government spending that were supposed to usher in this glorious era. It hasn’t quite worked out, has it?

Julian

6. Rick Stryker

Menzie,

Your conclusion that you can’t infer parameters from identities is certainly correct. But your regression analysis does not establish that point.

As you put it: “So, one would think that regressing total output on government output has to yield a positive coefficient.” If you did think that, you’d be right. The negative coefficient you got for Wisconsin is an artifact of the small sample size you have. You are regressing the levels of two non-stationary variables on each other, variables that would have a positive trend with sufficiently long data. If you had a long enough data set, you would always get a highly statistically significant positive coefficient in that Wisconsin regression.
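The small-sample claim can be illustrated with a rough Monte Carlo sketch. It assumes two independent random walks with the same positive drift (a stand-in for trending non-stationary series, not the actual Wisconsin data) and counts how often a levels regression yields a negative slope at 43 observations, the length of the 2005Q1-15Q3 sample, versus a much longer sample. The drift of 0.2 and unit shock variance are arbitrary choices.

```python
import numpy as np

def share_negative_slopes(T, n_sims, drift, rng):
    """Share of simulations in which the levels regression has a negative slope."""
    # Two independent random walks per simulation, each with the same drift.
    x = np.cumsum(drift + rng.normal(size=(n_sims, T)), axis=1)
    y = np.cumsum(drift + rng.normal(size=(n_sims, T)), axis=1)
    x_dev = x - x.mean(axis=1, keepdims=True)
    y_dev = y - y.mean(axis=1, keepdims=True)
    slopes = (x_dev * y_dev).sum(axis=1) / (x_dev ** 2).sum(axis=1)
    return float((slopes < 0).mean())

rng = np.random.default_rng(2)
share_short = share_negative_slopes(T=43, n_sims=400, drift=0.2, rng=rng)
share_long = share_negative_slopes(T=2000, n_sims=400, drift=0.2, rng=rng)
print(f"T = 43:   share of negative slopes = {share_short:.3f}")
print(f"T = 2000: share of negative slopes = {share_long:.3f}")
```

The series are unrelated by construction, so a positive slope in the long sample reflects only the shared upward drift.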

By the way, there is a typo in the results: the intercepts of the Wisconsin and Kansas regressions are switched around.

1. Menzie Chinn Post author

Rick Stryker: Yes, in a big enough sample … but for those of us who’ve worked in cointegration and the “great ratios” (e.g., C/Y), it’s amazing how big the sample has to be.

Thanks for pointing out the typo.

1. Rick Stryker

Menzie,

I’m curious why you think the Wisconsin series are cointegrated though (if that’s what you meant). Neither the Phillips-Ouliaris Pz nor Pu tests indicate cointegration, although that may be a small sample problem. If they aren’t cointegrated, then the regression you ran is spurious. Even so, you’d still expect to get a positive, highly significant coefficient in large enough samples.

1. Menzie Chinn Post author

Rick Stryker: I think you are kind of missing the point I was aiming at Professor Russell. His point was if Y ≡ Z + W, then in a regression it must be the case that there is a positive correlation (irrespective of correlation, etc.) between Y and Z. I used GDP and GOV in a linear form because that was what was consistent with an identity. If I had my ‘druthers, I would have estimated in log first differences given what I know (by the way, Johansen does reject the no cointegration null at 10% for levels if allowing for quadratic deterministic trend). But that would not have driven home the incorrectness of Professor Russell’s assertion.

1. Rick Stryker

Menzie,

No, I didn’t miss the point. I began my initial comment in this thread with “Your conclusion that you can’t infer parameters from identities is certainly correct. But your regression analysis does not establish that point.” My point is that the regression you selected to drive home your point doesn’t actually make your point, since you selected a regression that would get a positive coefficient if you just had enough data.

By the way, if you really believed that the levels of the data had a quadratic deterministic trend, then that would imply in general a linear trend in the cointegrating relation in the Johansen framework. However, you did not include a linear trend in your regression. I’d also point out that if you really do think this is a cointegrating regression, you really can’t just calculate HAC standard errors as you did, since the t stats are generally invalid in this case. Also, I’m not sure why you would report R², since it goes to 100% as the sample size gets large. The t stat and R² points are true whether the underlying regression is spurious or cointegrating.
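The R² point can be checked with a short sketch under one particular assumption: two independent random walks that both drift upward (drift 0.2, unit-variance shocks, both arbitrary). For a bivariate regression with an intercept, R² equals the squared sample correlation, which is what the helper computes.

```python
import numpy as np

def r_squared(y, x):
    """R-squared of a bivariate OLS regression of y on x with an intercept."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

rng = np.random.default_rng(3)
T = 50000
# Independent random walks with positive drift: unrelated except for the trend.
x = np.cumsum(0.2 + rng.normal(size=T))
y = np.cumsum(0.2 + rng.normal(size=T))

r2_by_sample = {n: r_squared(y[:n], x[:n]) for n in (50, 500, 5000, 50000)}
for n, r2 in r2_by_sample.items():
    print(f"first {n:6d} observations: R-squared = {r2:.3f}")
```

In the longest sample the fit is close to perfect even though neither series has any influence on the other.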

2. Menzie Chinn Post author

Rick Stryker: I report the same stats all the time. I understand the point. In the future I’ll make sure to put in a caveat about inference with integrated series (I’d thought people were bored by this topic, but your comment tells me that’s not the case).

3. Menzie Chinn Post author

Rick Stryker: By the way, what are your thoughts on the revelation of Donald Trump as the face of the Republican Party? I must confess that I am unsurprised.

2. Julian Silk

Dear Rick,

From what I can tell, the sample would have to be enough so that the other variables, such as investment in particular (or investment over GDP or some version of the logs of the two), would have to revert to roughly mean values or higher than mean values on occasion in this case. You’d need this so the variables you were less interested in would be roughly stationary, so that nonstationary government would show up positively vs. nonstationary GDP. In my energy work that was published with Fred Joutz in Energy Economics in 1997, 45 years was nowhere near enough to allow variables you might hope to be stationary to become so, if you allow for structural shifts in the equation. We modeled residential electricity consumption vs. income, from 1949 to 1993, and the variable that you might hope is stationary is natural gas prices. We used dummy variables, because you can’t model everything, and it makes a big difference.

J.

1. Rick Stryker

Thanks Julian. I’ll look at your paper.

I was more focused on the technical aspects of the Wisconsin regression. If the Wisconsin variables were cointegrated, then yes, the Monte Carlo evidence suggests it can take a lot of data to converge to the true parameters, which would represent the long-run relationship between the variables.

Here I think the situation may be a bit different. Although I can see that the variables might be cointegrated, I would not be surprised if they weren’t, especially as this is a bit of an odd regression, with levels rather than log of levels of trending non-stationary variables. The variables don’t seem to be cointegrating according to standard tests. In this case, the parameter that will be estimated by the regression does not represent any economic relationship, in the short run or the long run, but is rather spurious. But even if it is spurious, the estimated parameter will be positive and highly statistically significant given enough data.

1. Julian Silk

Dear Rick,

My hope is to get started on something that uses the cointegration techniques on another issue shortly. But here it’s an interesting question what test one uses to look at this. With Johansen-Juselius tests, there is always a power problem, which people have agreed to ignore, for variables that are near-integrated. A nice look at this may be found in http://www.federalreserve.gov/pubs/ifdp/2007/915/ifdp915.htm. David Hendry and others make the argument that you want to start with this very large data set of possible variables and narrow it down, and so they like the Johansen-Juselius test. But my impression is that the conclusion that you want to have supporting evidence besides just the test is quite acceptable to the OxMetrics people, and maybe to Johansen and Juselius themselves. (I found her very easy to get along with, while he struck me as a forbidding Norseman.) Since the log is a monotonic transformation, if the levels are really cointegrated, with any degree of strength, it would seem the logs are too, but again, it hasn’t really been looked at.

Julian

1. Rick Stryker

Hi Julian,

I believe the paper you are referring to is concerned with the size of the Johansen test rather than the power. The paper finds that the Johansen test tends to find cointegration too often, which is consistent with the Monte Carlo evidence reported in Cheung and Lai (1993) in “Finite Sample Sizes of Johansen’s Likelihood Ratio Tests for Cointegration” in the Oxford Bulletin. That’s another reason why I tend to believe the series that Menzie examined are not cointegrated: in finite samples, the Johansen test is biased toward finding cointegration when it is not there. That fact plus the weak statistical evidence (10% significance level) and the requirement to have a quadratic trend in the data suggest that there is no cointegration. I don’t think Menzie is necessarily arguing for cointegration but, if it’s not there, he has estimated a spurious regression in which all parameter estimates and test statistics are meaningless. I think your earlier suggestion to have run a regression in differences was a good one.

Power and size considerations are of course a big problem with all these unit root and cointegration tests. The problem with Johansen beyond that standard difficulty is that it’s just very easy to make a specification error. I’d agree that Johansen and Juselius would want to bring in other evidence besides the test results. That’s what they did in their 1992 paper that investigated PPP and UCIP. The max eigenvalue and trace tests disagreed on the number of cointegration vectors and J&J argued from considerations outside the tests to determine which test to believe.

Not sure if you are familiar already, but when considering what cointegration tests to use in terms of finite sample bias, I’d recommend Haug’s 1996 Journal of Econometrics article “Tests for Cointegration: A Monte Carlo Comparison,” which investigates a range of cointegration tests in terms of both size and power.

Hope your research goes well.

Note: this comment was rejected by the blog software as already having been made, so I apologize if it’s repeated

7. Levi Russell

Of course, I clarified this comment in the previous post. You are behaving quite disingenuously, Menzie.

1. baffling

levi, a word of advice. an untenured faculty member should really limit creating controversy in the blogosphere, and instead focus on producing quality research and peer reviewed publications. your time will come when you can take over the blogosphere, but now is not the time. you should have mentors at your institution to help guide you in this endeavor. for an untenured faculty member, the blogosphere is like the rest of social media for your students- a disaster waiting to happen. again a friendly piece of advice. content on the internet never disappears…and in due time you will be evaluated for tenure. let your content be peer reviewed papers, not internet banter.

8. XO

Please listen to Baffling, that advice is so so true. Life before and after tenure is very different. Everything before tenure should be focused on peer reviewed publications. Publish early and often. After tenure, you can build a broader spectrum of outlets for your ideas. Anything not peer reviewed now is just taking away from what you need to be focusing on in this 5/6 year period. And, there is no right to be forgotten in the USA….

