For PubAffr819: What Not to Say as a Policy Analyst

Three rules: (1) do not make absolutist statements without knowing the nature of the data; (2) do not abuse statistical terminology; (3) do not assert that a conspiracy is in place just because the data do not conform to your preferred narrative.

First, consider a comment on the Hurricane Maria death toll:

This [assertion that thousands of American citizens have died] is categorically false, Menzie. Excess deaths in PR through year end, those recorded by the Statistics Office, numbered only 654. Most of these occurred in the last ten days of September and the whole of October. While the power outages there were exacerbated by the state ownership of PR’s utility, a large portion of the excess deaths would likely have occurred regardless, given the terrain and the strength of the hurricane. Thus, perhaps 300-400 of the excess deaths would have occurred regardless of steps anyone could have made to fix the power supply. The remainder can be attributed essentially to the state ownership of the power utility.

I would note that excess deaths fell by half in December. Thus, the data suggests that the hurricane accelerated the deaths of ill and dying people, rather than killing them outright. I would expect the excess deaths at a year horizon (through, say, Oct. 1, 2018) to total perhaps 200-400. Still a notable number, but certainly not 4,600.

See the analysis: https://www.princetonpolicy.com/ppa-blog/2018/5/30/reports-of-death-in-puerto-rico-are-wildly-exaggerated

I would note that the official death toll is 2,975, per the GWU report commissioned by the Commonwealth of Puerto Rico; see the discussion of estimates here.

Second, a 2018 post regarding uncertainty in statistical inference.

Mr. Steven Kopits takes issue with the Harvard School of Public Health led study’s point estimate (4645) and confidence interval (793, 8498) for Puerto Rico excess fatalities post-Maria thusly:

Does Harvard stand behind the study, or not?

That is, does Harvard SPH believe that the central estimate of excess deaths to 12/31 is 4645, or not? Does it stand behind the confidence interval, or not? Is there still a 50+ probably that the death toll comes in over 4600? If there is, then the people of PR need to start looking for the 3,250 missing or the press needs to assume PR authorities are lying. Those are the implied action items.

Or should we just take whatever number HSPH publishes in the future and divide by 3 to get a realistic estimate of actual?

Let’s show a detail of the graph previously displayed (in this post):


Figure 1: Estimates from Santos-Lozada and Jeffrey Howard (Nov. 2017) for September and October (calculated as difference of midpoint estimates), and Nishant Kishore et al. (May 2018) for December 2017 (blue triangles), and Roberto Rivera and Wolfgang Rolke (Feb. 2018) (red square), and Santos-Lozada estimate based on administrative data released 6/1 (large dark blue triangle), end-of-month figures, all on log scale. + symbols indicate upper and lower bounds for 95% confidence intervals. Orange triangle is Steven Kopits estimate for year-end as of June 4. Cumulative figure for Santos-Lozada and Howard for October is author’s calculation based on reported monthly figures.

The middle paragraph (highlighted red) shows a misunderstanding of what a confidence interval is. The true parameter either is or is not in any given confidence interval; the 95% figure is not the probability that it lies there. Rather, this would be a better characterization of a 95% CI:

“Were this procedure to be repeated on numerous samples, the fraction of calculated confidence intervals (which would differ for each sample) that encompass the true population parameter would tend toward 95%.”

In other words, it is a mistake to say there should be a 50% probability that the actual number will be above the point estimate. But that is exactly what Mr. Kopits believes a confidence interval means. He is in this regard incorrect. From PolitiFact:

University of Puerto Rico statistician Roberto Rivera, who along with colleague Wolfgang Rolke used death certificates to estimate a much lower death count, said that indirect estimates should be interpreted with care.

“Note that according to the study the true number of deaths due to Maria can be any number between 793 and 8,498: 4,645 is not more likely than any other value in the range,” Rivera said.

Once again, I think it best that those who wish to comment on estimates should be familiar with statistical concepts.
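The repeated-sampling characterization of a 95% CI quoted above can be checked by simulation. What follows is a minimal sketch, not the method of the Harvard study: it assumes a hypothetical normally distributed population with a known true mean, draws many samples, computes a textbook 95% interval from each, and counts how often the interval covers the true mean. The function name and all parameter values are illustrative assumptions.

```python
import random
import statistics

def coverage_rate(true_mean=100.0, true_sd=15.0, n=50, trials=2000, seed=0):
    """Share of simulated 95% intervals that contain true_mean.

    All defaults are hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    z = 1.96  # normal approximation to the 95% critical value
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample; each sample yields a different interval.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        lower, upper = mean - z * se, mean + z * se
        # A given interval either covers the true mean or it does not.
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals covering the true mean: {rate:.3f}")
```

The share should come out close to 0.95. That figure describes the procedure across repeated samples; it says nothing about where the truth sits inside any single interval.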

Third, here is an example of data paranoia, from a recent post.

Reader Steve Kopits writes about the debate over employment numbers:

At the same time, I thought it possible that both surveys were in fact correct, but garbled with the effect of the recovery from the suppression, thereby creating misleading impressions because we were misinterpreting the data. That still seems possible, though I’ve read that others think the CES was manipulated to provide a more rosy picture heading into the election.

 

This statement joins a long pile of such allegations, e.g., Senator Barrasso, Jack Welch, former Rep. Allen West, Zerohedge, and Mick Mulvaney, among others. All I can say is that if there was a conspiracy, they didn’t do a very good job. With the benefit of the January benchmark revision, we can update our assessment of how badly the purported conspirators performed their job.

Figure 1: Nonfarm payroll employment in January 2023 release (red), in October 2022 release (blue), in 000’s, s.a. Source: BLS via FRED.

Now, it may eventually turn out (after another benchmark revision, the results of which will be released in February 2024) that Q2 NFP was lower than indicated in the CES. But for purposes of deceiving the electorate in November 2022, this seems like a lousy way of doing it.

In any case, before people start crying that the data are manipulated, I wish they would read the BLS technical notes on (1) revisions and mean absolute revisions, (2) benchmark revisions, (3) the calculation of seasonal adjustment factors, and (4) the application of population controls in the CPS. Before they start citing the various series, I wish they understood the informational content (relative to business cycle fluctuations) of the CPS employment series vs. that of the CES employment series. That understanding can be obtained by reading works by people who understand the characteristics of the macro data (Furman (2016), CEA (2017), Goto et al. (2021)).
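To make the notion of a revision between data vintages concrete, here is a minimal sketch of how a mean revision and a mean absolute revision could be computed from two releases of the same series. The numbers are made up for illustration and are not BLS data, and the function names are hypothetical; the BLS technical notes give the official definitions.

```python
def mean_revision(earlier, later):
    """Average signed difference (later minus earlier) between two vintages."""
    if len(earlier) != len(later) or not earlier:
        raise ValueError("vintages must be non-empty and cover the same months")
    return sum(b - a for a, b in zip(earlier, later)) / len(earlier)

def mean_absolute_revision(earlier, later):
    """Average absolute difference between two vintages of the same series."""
    if len(earlier) != len(later) or not earlier:
        raise ValueError("vintages must be non-empty and cover the same months")
    return sum(abs(b - a) for a, b in zip(earlier, later)) / len(earlier)

# Hypothetical monthly employment changes, in thousands (NOT actual BLS data).
first_print = [250, 310, 290, 260]
revised = [270, 300, 330, 240]

mr = mean_revision(first_print, revised)
mar = mean_absolute_revision(first_print, revised)
print(f"Mean revision: {mr:.1f} thousand")
print(f"Mean absolute revision: {mar:.1f} thousand")
```

A signed mean that is small relative to the mean absolute revision would suggest that revisions go in both directions rather than tilting systematically one way.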

From a sociological perspective, I do wonder why conspiracy theories are so attractive to some individuals. Here’s a Scientific American article laying out some of the character traits that are associated with adherence to conspiracy theories.


17 thoughts on “For PubAffr819: What Not to Say as a Policy Analyst”

  1. pgl

    But, but, but Princeton has never been a policy analyst. More like the most incompetent consultant ever with his day job being a ‘contributor’ to Fox and Friends.

    1. Menzie Chinn Post author

      Steven Kopits: Italics are there because it’s a quote. Done by WordPress. If you understood from context the hundreds of other posts of mine you’ve read, you would not find it remarkable. I am, however, happy you are impressed.

  2. Steven Kopits

    Let’s put this comment up here as well:

    With regards to SVB: Both you and I are using Dec. 31 numbers, Menzie. The difference is that I used the 10-K and you used Yahoo Finance.

    I don’t see any unusual reliance at SVB on interbank lending or anything similar. Are you stating that 10% debt is somehow ‘reckless’? On what basis? What interest rate are they paying and how did that contribute to the financial meltdown? You haven’t made that case, because it didn’t.

    What happened to SVB is that it was unable to digest a 5% (pp) risk free (r*) interest rate increase in 14 months. Well, duh. Banks borrow short — that’s what demand deposits are — and lend long, that’s what mortgages are. When interest rates were at zero, everyone refi-ed or took out a rock bottom loan and SVB is locked into that interest rate. Similarly for securities purchased then. You can flip them and repurchase, but you’ll ride the interest rate rise all the way up. On the other hand, with mortgage rates at 7%, no one is borrowing even as deposit rates are heading to 5%.

    If your interest income is 3% of assets, and your interest paid is 5% + another 1.5% of assets to cover operations, well, the bank will be losing money at the pace of 3-4% of assets per year. If equity is 5% of total assets, then the bank will find itself on life support pretty fast.

    This is not rocket science. Except at the Fed and Treasury, apparently.

    If you’re right, then there’s no systemic risk. Then why CS? Why Signature? Why a giant banking downgrade? Why did the Treasury (FDIC) guarantee the entirety of US demand deposits? Doesn’t seem like just an SVB problem to me.

    You simply confused assets and liabilities. Simple as that.

    1. Menzie Chinn Post author

      Steven Kopits: Look at where the numbers I pulled are from on the balance sheet. Under liabilities. But in any case, this set of comments will be point 4 for my next cautionary note to my students (macro course Econ 702).

        1. Menzie Chinn Post author

          Steven Kopits: Yes. And look at the 10-K Consolidated Balance Sheet, under liabilities section, short term debt (only last two year-ends reported), at https://d18rn0p25nwr6d.cloudfront.net/CIK-0000719739/f36fc4d7-9459-41d7-9e3d-2c468971b386.pdf. Notice how short term borrowing jumped up. That was what I was referring to. If SVB was so flush with deposits, why was it accessing short term borrowing via capital markets? The answer is that *it had been* flush with deposits, but SVB clients were accessing their cash, so deposits were falling. That necessitated SVB borrowing short term on short term capital markets.

    2. baffling

      “Then why CS? Why Signature?”
      my understanding is these are two completely independent issues, which simply happened to occur during the same timeframe. they are NOT related.

      “Why did the Treasury (FDIC) guarantee the entirety of US demand deposits? Doesn’t seem like just an SVB problem to me.”
      well, that is NOT accurate. you may want it to be accurate, but that is NOT accurate. Yellen explicitly makes the statement that not all deposits are covered. so you must be calling Yellen a liar, Steven, but based on what evidence? do you know something she does not? or is this your great consultant instinct we hear?
      https://www.reuters.com/business/finance/yellen-tells-senators-us-banking-system-remains-sound-2023-03-16/

  3. Steven Kopits

    In other words, it is a mistake to say there should be a 50% probability that the actual number will be above the point estimate. But that is exactly what Mr. Kopits believes a confidence interval means. He is in this regard incorrect. From PolitiFact:

    University of Puerto Rico statistician Roberto Rivera, who along with colleague Wolfgang Rolke used death certificates to estimate a much lower death count, said that indirect estimates should be interpreted with care.

    “Note that according to the study the true number of deaths due to Maria can be any number between 793 and 8,498: 4,645 is not more likely than any other value in the range,” Rivera said.

    Once again, I think it best that those who wish to comment on estimates should be familiar with statistical concepts.

    Let me reiterate that I think that’s wrong. If you re-run the numbers, then the results of the trials should cluster around the true sample mean. If you’re saying that 100 people said that Mr. Smith appeared to be 6 ft tall on average, with a confidence interval at 95% of +/- 1 foot. You’re saying that his height could be anything between 5 ft and 7 ft with equal probability. He is as likely to prove 5 ft tall as 7 ft tall or 6 ft tall. I don’t think that’s what a normal distribution gives you. If the distribution of observations is not normal, yes, you could get that. But not in a normal distribution.

    If one contends that “true number of deaths due to Maria can be any number between 793 and 8,498: 4,645 is not more likely than any other value in the range”, yeah, I think that’s dead wrong, unless you assume the data is evenly distributed, ie, a non-normal distribution. Possible, but for mortality problems, I would think highly unlikely. We’d default pretty easily to a normal curve.

    And as for saying 793 is as probable as 8,498, I’d say you have a garbage researcher there. You can’t get your best guess to within an entire order of magnitude? Well, stay at home, because you’re useless in real world decision-making. That would be like saying that oil prices would likely fall between $16 and $160 barrel with equal probability. Why bother showing up at all? What’s your value-added in any real-world discussion?

    1. baffling

      Steven, what evidence do you have for that normal distribution you so insist on? other than that is what you assume, and require in order for your argument to not look so foolish? your commentary demonstrates a severe lack of understanding of elementary statistical analysis.

      “You can’t get your best guess to within an entire order of magnitude? ”
      I want an analyst who gives me analysis, not a “best guess”. you think that because you give me a “guess” with authority, it should hold sway over what the data analysis says? Steven, you are not advocating for analysis. you are advocating for acceptance via authority. I would be wary of any consultant who insisted on “best guess” decision making.

      1. Steven Kopits

        I advised Andy Hall, the most prominent oil trader of his time. If I said to Andy, “Well, oil prices have an equal chance of being $16 or $160,” that would literally be my last phone call with him. As it would with any other money manager.

        The history of the normal curve deserves a better exposition. I am sure Menzie has one.

        1. Menzie Chinn Post author

          Steven Kopits: You do know that the concept of a confidence interval is distinct from that of a Normal distribution (even if in practice we often assume it)?

        2. baffling

          so you do not understand the proper use of the normal curve, Steven? you are the one advocating for its use. perhaps you could clarify on why it should be in play?

          ” If I said to Andy, “Well, oil prices have an equal chance of being $16 or $160,” that would literally be my last phone call with him. ”
          so you think it is preferable to indicate with confidence a number with which you have zero certainty in its accuracy. just a strong hunch and gut feeling. just as long as people understand those recommendations of yours are dependent upon your gut, and not a strong analytical process, I guess they get what they pay for.

        3. Pgl

          Andy Hall. Is that Bruce Hall’s dumber brother? Stevie the name dropper. Dude, no one cares what dorks you have talked to.

  4. Macroduck

    The point to this post is that Stevie doesn’t seem to know a valid argument from a mess, so Stevie makes messes. And Stevie steps right up and makes a mess in answer. Stevie argues based on italics. Stevie argues based on Yahoo. Stevie says “I’m right” because Stevie says. Stevie offers no evidence.

    QED
