There has been some discussion recently about discrepancies between different government estimates of the state of the labor market. Although a legitimate issue has been raised, there has also been a bit of misunderstanding.
The Bureau of Labor Statistics provided a great boon to business cycle researchers when it began publishing the Business Employment Dynamics data. The BED data divide establishments into two categories: (a) those that are either new or are employing more workers than in the previous quarter, and (b) those that have either gone out of business or are employing fewer workers than in the previous quarter. The number of net job additions from establishments in the first class is referred to as “gross job gains,” whereas the number of net jobs lost from establishments in the second class is referred to as “gross job losses.” The BED numbers for gross job gains and gross job losses come from the Quarterly Census of Employment and Wages.
These BED data are collected separately from (and reported with a considerably longer delay than) the Current Employment Statistics data. The CES data are instead simple counts of the total number of people working at surveyed establishments. The CES numbers are collected and reported monthly and provide the basis for the “nonfarm payroll employment” numbers that get most of the coverage in the press.
You’d think in principle that if you took the difference between the gross job gain and loss numbers from the BED, you’d get the same number as the change in the number of people working according to CES. Historically you typically did get roughly comparable numbers, but the estimates differ somewhat for the most recently available quarter, 2006:Q3. Barron’s Alan Abelson (via Barry Ritholtz) reports the following claims attributed to Philippa Dunne and Doug Henwood of the Liscio Report:
compared with a gain for the [2006:Q3] quarter of 442,000 jobs reported in the so-called establishment survey, the Business Employment Dynamics, or BED, reckoning was a scant 19,000 additions. In manufacturing, the 9,000 jobs lost according to the payroll figures balloon into a loss of 95,000 jobs in the BED data; the improbable 20,000 additions in construction (think: housing) turns into a loss of 77,000 by BED’s measure; the 507,000 gain in private services shrinks to 108,000. And so it goes. Or, more accurately, so goes the job mirage.
One likely culprit, Philippa and Doug suggest, is that curious concoction known as the “birth/death” model used by the Bureau of Labor Statistics to estimate the gains/losses in jobs from the launching and demise of businesses. Thanks to this voodoo calculation, 156,000 were added in last year’s third quarter and a hefty 388,000 in the opening four months of this year.
Bloomberg reported some slightly different numbers attributed to Ray Stone of Stone & McCarthy Research Associates:
The new [BED] data revealed a seasonally adjusted third-quarter private payroll gain of only 19,000, in sharp contrast to the BLS’ published monthly payroll [CES] increase of 498,000 for the quarter.
Calculated Risk has also mentioned the discrepancy between BED and CES, though he did not present the particular numerical calculations repeated above.
The 442,000 jobs growth number for 2006:Q3 was apparently arrived at by Dunne and Henwood by taking the average of the seasonally adjusted CES-reported levels of employment for July, August and September and subtracting the average seasonally adjusted values for April, May and June. However, temporally averaging CES data is not conceptually the correct way to get a number comparable to the BED value, since the latter is intended to reflect the situation on the 12th day of the third month of the quarter. The 498,000 figure calculated by Stone is better, being based on the difference between the September and the June CES values.
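To see why the two calculations can give different answers, here is a minimal Python sketch. The monthly levels are made-up illustrative numbers (in thousands), not CES values, and the function names are my own.

```python
def change_in_quarterly_averages(prev_quarter, this_quarter):
    """Average level over this quarter minus average level over the previous one."""
    prev_avg = sum(prev_quarter) / len(prev_quarter)
    this_avg = sum(this_quarter) / len(this_quarter)
    return this_avg - prev_avg


def end_of_quarter_change(prev_quarter, this_quarter):
    """Level in the last month of this quarter minus the last month of the previous one."""
    return this_quarter[-1] - prev_quarter[-1]


# Hypothetical employment levels in thousands (assumed, for illustration only).
q2_levels = [1000, 1200, 1400]  # April, May, June
q3_levels = [1500, 1550, 1600]  # July, August, September

avg_change = change_in_quarterly_averages(q2_levels, q3_levels)
end_change = end_of_quarter_change(q2_levels, q3_levels)

print(avg_change)  # change between quarterly averages
print(end_change)  # change between June and September
```

When employment growth slows during the period, as in these invented numbers, the change in quarterly averages still reflects the faster growth earlier on, so it overstates the change between the two end-of-quarter months.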
But the glaring error made by both sets of analysts was in trying to compare the 442,000 or 498,000 CES figure with a number of 19,000 from the BED. The 19,000 figure was arrived at by first taking the BED raw count of gross job gains and seasonally adjusting it, and then taking the BED raw count of gross job losses and seasonally adjusting it separately. The seasonal patterns of these two series are quite different, and when you seasonally adjust them separately and then take the difference between the two seasonally adjusted series, you get an artifact that should in no way be construed as the seasonally adjusted net employment gain.
The correct procedure, if you want to know whether the two surveys have come up with the same count, is to use the seasonally unadjusted values for each.
The change in the actual, seasonally unadjusted CES count of the number of people working at private establishments between June and September was a loss of 298,000 jobs. The difference between the seasonally unadjusted BED job gains and losses in 2006:Q3 was a net loss of 453,000 jobs. The discrepancy between the two is therefore 453 – 298 = 155 thousand jobs, not the 479 thousand job discrepancy claimed by Stone nor the 423 thousand job discrepancy claimed by Dunne and Henwood. In other words, about 2/3 of the discrepancy claimed by these analysts resulted from a misuse of the data by the analysts rather than a problem with the data that BLS reported.
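The arithmetic behind these figures can be checked with a few lines of Python. Every number below is taken from the text above (in thousands of jobs); only the variable names are mine.

```python
# Seasonally unadjusted changes for 2006:Q3, in thousands of jobs (from the text).
ces_unadjusted_change = -298   # CES, June to September, private establishments
bed_unadjusted_net = -453      # BED gross job gains minus gross job losses

# Discrepancy when like is compared with like.
unadjusted_gap = abs(bed_unadjusted_net - ces_unadjusted_change)

# Discrepancies implied by the analysts' seasonally adjusted comparisons.
bed_adjusted_net = 19
stone_gap = 498 - bed_adjusted_net
dunne_henwood_gap = 442 - bed_adjusted_net

# Share of each claimed discrepancy that disappears with unadjusted data.
stone_share_explained = 1 - unadjusted_gap / stone_gap
dunne_henwood_share_explained = 1 - unadjusted_gap / dunne_henwood_gap

print(unadjusted_gap, stone_gap, dunne_henwood_gap)
print(round(stone_share_explained, 2), round(dunne_henwood_share_explained, 2))
```

Both shares come out a little under or a little over two-thirds, which is the basis for the "about 2/3" statement.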
For perspective, the graph below plots the cumulative seasonally unadjusted net employment change over 4 quarters as calculated by BED and over 12 months as calculated by CES. The two series track each other pretty well.
Even so, a discrepancy of 155,000 workers within the single quarter 2006:Q3 is more than it should be, and suggests that something is wrong with at least one of the measures.
BLS has reported an analysis of some of the prior discrepancies between the BED and CES figures. The report investigated a number of possibilities. Some firms answer one survey but not the other, some firms give different answers to the same question on different surveys, and there are possible errors that can arise from either means of data collection. The analysis did not find evidence that such factors could account for big differences between the two measures. Instead, the most likely factor identified in the BLS report is indeed the birth/death model fingered by Abelson above.
I do not know what Abelson’s definition of “voodoo” might be, but I doubt that many statisticians would want to describe the BLS birth/death model with such a term. Details of how it works can be found here. Perhaps “voodoo” to Abelson means “something with math in it.” The basic idea behind and motivation for the birth/death model is quite simple. The CES number of people working is based on counts received from firms that filed a report. Unfortunately, not all firms file reports on time. One case in which firms often do not file a timely report is if they have gone out of business. If you didn’t receive data from a firm this month, that could mean that the firm has gone out of business, or it could mean that it just didn’t get its report filed this month.
If a firm filed a report last month but did not file this month, is your estimate of the number of people working there now equal to zero? Or is your estimate the number the firm reported for the most recent available month? If you gave either answer, please go back and retake Stat 101. The best guess of the number of people working would come from forming some estimate of how many of the nonresponders represent business “deaths” and how many represent late reports. The way you would form such an estimate would be to look at historical data to see how often a missing observation turned out to represent a business death rather than just a late report.
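The logic of that best guess can be written down as a simple expected value. The Python sketch below is only an illustration of the idea, not the BLS birth/death model itself: the function names, the invented history, and the assumption that a late reporter still employs its last reported headcount are all mine.

```python
def estimate_death_probability(past_nonresponses):
    """Fraction of past missing reports that turned out to be business deaths.

    past_nonresponses: list of booleans, True if the firm had gone out of business.
    """
    if not past_nonresponses:
        raise ValueError("need at least one historical observation")
    return sum(past_nonresponses) / len(past_nonresponses)


def expected_employment(last_reported, p_death):
    """Expected current employment at a firm that did not report this month.

    Assumes a dead firm employs 0 people and a late reporter still employs
    its last reported headcount (a simplifying assumption for illustration).
    """
    if not 0.0 <= p_death <= 1.0:
        raise ValueError("p_death must be between 0 and 1")
    return p_death * 0 + (1.0 - p_death) * last_reported


# Hypothetical history: 1 of 4 past nonresponders had actually closed.
history = [True, False, False, False]
p_death = estimate_death_probability(history)

# Hypothetical firm that reported 40 workers last month and nothing this month.
estimate = expected_employment(last_reported=40, p_death=p_death)
print(p_death, estimate)
```

The estimate falls strictly between zero and last month's count, which is why neither of the two naive answers is right. It is also only as good as the assumption that the historical death rate still applies today.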
The other kind of firm that you’re going to miss with the CES survey is one whose existence you did not know about at the time you set up the survey design. Again, is your statistical estimate of the number of people working at new firms this month zero? Mine would be based on looking at what the past numbers for jobs coming from new firms have been and how they correlate with things I currently know.
And that is what the BLS birth/death model is all about. It is not voodoo and it is not mysterious. It is just an effort by the BLS to use the data it has to estimate the data it does not have.
Now, that is not to say that the statistical basis for the birth/death adjustments could not be improved, nor is it to claim that even the very best conceivable model could always tell you accurately the values for numbers you did not observe. As Calculated Risk has emphasized, at the moment, with a weak economic environment in general and a very troubled housing sector in particular, it seems very likely that a higher fraction than usual of the nonreporting establishments have in fact gone out of business, and that there are fewer new businesses starting than would be typical.
For this reason, it seems very likely to me that recent CES data have been overstating the extent of employment in residential construction. The next question is, what should we do about it?
The birth/death issue strikes me not as a problem with how the CES data are constructed, but instead as a fundamental limitation of any data gathered directly from firms. This in my view is another good reason to be using the BLS household data, for which a surveyor goes to a particular residential address to collect data on how many people there are working, as a supplement to the CES establishment data. Yes, I know, the household count has problems of its own, and it would be an even bigger error to rely on it alone. But the good news is that the household survey gives us an estimate that at least is not contaminated by the birth/death issue. And the household data would lead you to conclude that employment growth in the first half of this year has been weaker than the CES estimates suggest.
Given these concerns, beginning next month I will be increasing the weight I place on the BLS household survey from 10% to 20% and decreasing the weight on the CES establishment data from 80% to 70%.
I conclude that the problems with the CES data are more significant than I had been estimating, though substantially less severe than some analysts have suggested. I continue to believe that the best way to deal with such problems is not to throw out data, but instead to widen the set of variables that are regarded as informative. That approach supports the inference that U.S. employment growth in the first half of this year was likely less robust than is currently reported, particularly in the construction sector.