The information content of the establishment vs. household-based employment series vs. hours worked.
A recent post on post-recession employment trends sparked a vigorous debate. In response, I decided to look further at the time series characteristics of some oft-cited variables, and at how they relate to real GDP. The series I am going to examine are non-farm payroll employment, based upon the survey of establishments, and the civilian employment series, smoothed and adjusted by the BLS to conform in concept to the establishment series (this series, drawn from BLS, Employment from the BLS household and payroll surveys: summary of recent trends, dated December 2, 2005, differs from the official series). In addition, I’ll look at the BLS measure of total hours worked in the business sector.
The first figure shows the two employment series (the smoothed and adjusted civilian employment series only begins in 1994), with payroll employment in blue and civilian employment in red. The second figure shows the first difference of the two logged series. As is evident, the variability of the household-based (population) series is substantially larger than that of the establishment-based series. Over the 1994m01-2005m11 period, the standard errors of the annualized growth rates of the monthly employment levels are 0.00138 versus 0.00284 for the establishment and household series, respectively. That means that the latter is about twice as variable as the former. (Interestingly, the “drift” terms are about the same, with the implied annualized monthly growth rates equal to 1.5% in both instances.)
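For readers who want to reproduce this kind of summary, here is a minimal sketch of the calculation: take logs, first-difference, then report the mean (the "drift") and the standard deviation of the monthly changes. The function name, the sample levels, and the choice to annualize by simply multiplying the mean monthly log change by 12 are all assumptions made for illustration; they are not the BLS data or necessarily the exact scaling used for the numbers quoted above.

```python
import math
import statistics

def log_growth_stats(levels, periods_per_year=12):
    """Summarize the first difference of a logged series."""
    # Log first difference: log(x_t) - log(x_{t-1}) = log(x_t / x_{t-1}).
    growth = [math.log(b / a) for a, b in zip(levels, levels[1:])]
    return {
        # Assumption: annualize the drift by scaling the mean monthly change.
        "annualized_drift": periods_per_year * statistics.mean(growth),
        # Sample standard deviation of the monthly log changes.
        "monthly_sd": statistics.stdev(growth),
    }

# Hypothetical monthly employment levels (thousands), for illustration only.
demo_levels = [130000, 130150, 130420, 130510, 130800, 130950, 131200]
demo_stats = log_growth_stats(demo_levels)

# Ratio of the two standard errors quoted in the text (household / payroll).
variability_ratio = 0.00284 / 0.00138
```

Applying the same function to both employment series over a common sample gives the drift and variability comparison directly; the ratio computed at the end is just the arithmetic behind "about twice as variable."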
One reason to be interested in employment is to see how it links up with measures of real activity. In the next figure, the two measures are plotted against real GDP (real GDP in blue, payroll in red, civilian employment in green; all series now logged). The last figure is hours worked and real GDP, over a comparable period. One would be tempted to do some “ocular regressions” to determine which series is more informative regarding real GDP, but different readers would likely come away with different interpretations. Instead, I decided to characterize the data statistically.
First, for all series, a standard augmented Dickey-Fuller test fails to reject the unit root null (allowing for constant and trend, with the lag length selected using a Bayesian information criterion). That means all the series seem to be integrated of order one. One implication of these findings is that it is inappropriate to conduct inference on any of these series using deterministic time trends to “detrend” the data (this is also true for payroll employment over a longer sample, 1950m01-2005m11).
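To make the unit root test concrete, the sketch below runs the augmented Dickey-Fuller regression with a constant and trend and picks the lag length by BIC over a common sample. It is a simplified illustration using NumPy, not the software or settings used for the results above; the function names, the maximum lag of 8, and the simulated series are assumptions. The statistic is compared with Dickey-Fuller critical values rather than the usual t table; the asymptotic 5% value for the constant-and-trend case is roughly -3.41.

```python
import numpy as np

def adf_regression(y, lags, start):
    """OLS for dy_t = a + b*t + rho*y_{t-1} + sum_i g_i*dy_{t-i} + e_t.

    Returns (t-statistic on rho, BIC). `start` fixes a common sample so
    that BIC values are comparable across lag lengths.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = len(dy) - start
    cols = [np.ones(n), np.arange(n, dtype=float), y[start:-1]]
    cols += [dy[start - i:len(dy) - i] for i in range(1, lags + 1)]
    X = np.column_stack(cols)
    target = dy[start:]
    beta = np.linalg.lstsq(X, target, rcond=None)[0]
    resid = target - X @ beta
    rss = resid @ resid
    k = X.shape[1]
    cov = rss / (n - k) * np.linalg.inv(X.T @ X)
    tau = beta[2] / np.sqrt(cov[2, 2])          # t-ratio on the lagged level
    bic = n * np.log(rss / n) + k * np.log(n)
    return tau, bic

def adf_with_bic(y, max_lags=8):
    """Run the regression for 0..max_lags lags and keep the lowest-BIC fit."""
    fits = [adf_regression(y, p, max_lags) for p in range(max_lags + 1)]
    best = min(range(max_lags + 1), key=lambda p: fits[p][1])
    return fits[best][0], best

# Simulated data for illustration only: a random walk and white noise.
rng = np.random.default_rng(0)
demo_walk = np.cumsum(rng.normal(size=400))
demo_noise = rng.normal(size=400)
tau_walk, lags_walk = adf_with_bic(demo_walk)
tau_noise, lags_noise = adf_with_bic(demo_noise)
```

A statistic well below the critical value (as one would expect for the white noise series) rejects the unit root; a statistic above it, as reported for the logged employment, hours, and GDP series, does not.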
Second, over the 1995q1-2004q4 period (in all subsequent statistical analyses, I have truncated the sample at end-2004 to minimize the effect of data revisions), payroll employment appears to be cointegrated with real GDP, at the 10% marginal significance level, using a maximum likelihood approach (allowing for a constant in the cointegrating equation and a constant in the VAR(4), which is like allowing a deterministic time trend in all of the variables). Similarly, hours worked is cointegrated with real GDP, at even higher levels of significance.
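For readers unfamiliar with the maximum likelihood approach to cointegration, the following is a rough sketch of a Johansen-style trace statistic with an unrestricted constant. It is an illustration under stated assumptions, not the code behind the results above: the function name and simulated data are hypothetical, a VAR(4) in levels is taken to mean three lagged differences in the error-correction form, and the critical values mentioned in the comments are commonly tabulated asymptotic values that should be checked against a published table.

```python
import numpy as np

def johansen_trace(levels, var_lags=4):
    """Eigenvalues and trace statistics for H0: rank <= r, r = 0..k-1."""
    Y = np.asarray(levels, dtype=float)
    dY = np.diff(Y, axis=0)
    p = var_lags - 1                       # lagged differences in the VECM
    n = dY.shape[0] - p                    # usable observations
    Z0 = dY[p:]                            # current differences
    Z1 = Y[p:-1]                           # lagged levels
    short_run = [dY[p - i:dY.shape[0] - i] for i in range(1, p + 1)]
    Z2 = np.column_stack([np.ones((n, 1))] + short_run)  # constant + lags

    def partial_out(A):
        # Residuals from regressing A on the constant and lagged differences.
        return A - Z2 @ np.linalg.lstsq(Z2, A, rcond=None)[0]

    R0, R1 = partial_out(Z0), partial_out(Z1)
    S00, S11, S01 = R0.T @ R0 / n, R1.T @ R1 / n, R0.T @ R1 / n
    M = np.linalg.solve(S11, S01.T @ np.linalg.solve(S00, S01))
    eig = np.sort(np.linalg.eigvals(M).real)[::-1]
    # trace[r] = -n * sum of log(1 - eigenvalue) over the k - r smallest.
    trace = -n * np.cumsum(np.log(1 - eig)[::-1])[::-1]
    return eig, trace

# Simulated data for illustration only.
rng = np.random.default_rng(0)
common = np.cumsum(rng.normal(size=400))
coint_pair = np.column_stack([common + rng.normal(size=400),
                              0.5 * common + rng.normal(size=400)])
indep_pair = np.cumsum(rng.normal(size=(400, 2)), axis=0)
eig_c, trace_c = johansen_trace(coint_pair)
eig_i, trace_i = johansen_trace(indep_pair)
# For two variables, trace[0] is compared with roughly 13.43 (10%) or
# 15.49 (5%); treat these values as approximate and verify before use.
```

With a short quarterly sample such as 1995q1-2004q4, asymptotic critical values are generous, which is why a finite sample correction can change a borderline result.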
Third, the smoothed and adjusted civilian employment series, based on the household survey, does not appear to be cointegrated with real GDP, even at the 10% marginal significance level. Using finite sample critical values would make the failure to find cointegration even more marked.
Fourth, extending the sample back to 1968, both payroll employment and the unadjusted civilian employment series (the adjusted series begins in 1994), as well as hours worked, are borderline cointegrated with real GDP, but with the wrong sign and statistically insignificant coefficients.
Fifth, since the series are all integrated, it makes sense to examine in addition the information content of the log first-differenced series. Without taking a stand on causality, one approach to doing this is to use principal components analysis, applied to real GDP, payroll employment, adjusted civilian employment, and hours worked. This methodology identifies the unit-length vector of weights that maximizes the variance of a linear combination of the variables. The first principal component accounts for 68% of the variation; the series with the highest coefficient is payroll employment, the series with the second highest is hours worked, followed by adjusted civilian employment and finally real GDP. A similar result obtains if GDP is dropped from the group; then the payroll series has the highest coefficient.
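As an illustration of the principal components calculation, the sketch below extracts the first component from a set of growth-rate series and reports its loadings and share of total variation. It assumes the analysis is run on standardized series (that is, on the correlation matrix), which may or may not match the choice made for the results above; the function name and the simulated data are hypothetical.

```python
import numpy as np

def first_principal_component(growth):
    """Loadings and variance share of the first principal component.

    `growth` is an (observations x series) array of log first differences.
    Assumption: series are standardized, so this is correlation-matrix PCA.
    """
    X = np.asarray(growth, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # symmetric eigendecomposition
    top = np.argmax(eigvals)
    loadings = eigvecs[:, top]
    if loadings.sum() < 0:                    # sign is arbitrary; fix it
        loadings = -loadings
    share = eigvals[top] / eigvals.sum()
    return loadings, share, Z

# Simulated data for illustration only: four series sharing a common
# factor, with different amounts of idiosyncratic noise.
rng = np.random.default_rng(0)
common = rng.normal(size=200)
demo = np.column_stack([common + s * rng.normal(size=200)
                        for s in (0.3, 0.5, 0.8, 1.2)])
demo_loadings, demo_share, demo_Z = first_principal_component(demo)
```

The loadings are the coefficients referred to in the text: a series with a larger coefficient moves more closely with the common component, and the share is the fraction of total variation that component accounts for.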
Sixth, little of this analysis speaks directly to the issue raised by Tim Kane regarding real time analysis. In order to address this issue, somebody (with some spare research assistant resources) would have to use the different vintages of data, such as that available at the
My conclusion is not that one should only pay attention to payroll employment. Clearly, all the series have information content. Rather, it is that each series has different characteristics. However, if I had to choose one series, I would still choose the payroll employment series.