One feature of previous formal studies of excess deaths in Puerto Rico in the aftermath of Hurricane Maria is that the population decline in the years preceding the hurricane, while acknowledged, is not accounted for (Santos-Lozada and Howard, 2017; Rivera and Rolke, 2018; Kishore et al., 2018; Santos-Lozada and Howard, 2018). This factor is potentially important because the “normal” number of deaths per month is a function of population size. With a shrinking population, ignoring that fact when averaging over several years to infer the normal rate will bias the estimated normal rate upward and the implied number of excess fatalities downward.
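The direction of the bias can be seen in a small arithmetic sketch. All figures below are made up for illustration (they are not Puerto Rico data), and the sketch assumes a constant per-capita death rate, which is a simplification.

```python
# Hypothetical numbers for illustration only; not Puerto Rico data.
death_rate = 0.0007  # assumed constant monthly deaths per person
past_populations = [3_700_000, 3_650_000, 3_600_000, 3_550_000, 3_500_000]
current_population = 3_400_000  # population in the period being evaluated

# "Normal" deaths implied by averaging past years, ignoring population change.
past_deaths = [death_rate * p for p in past_populations]
naive_normal = sum(past_deaths) / len(past_deaths)

# "Normal" deaths implied by the same rate applied to the current population.
adjusted_normal = death_rate * current_population

observed = 2_900  # hypothetical observed deaths in one month
naive_excess = observed - naive_normal
adjusted_excess = observed - adjusted_normal

print(f"naive normal:    {naive_normal:.0f}, excess {naive_excess:.0f}")
print(f"adjusted normal: {adjusted_normal:.0f}, excess {adjusted_excess:.0f}")
```

Because the historical average reflects a larger population than the current one, the naive baseline is too high and the naive excess is too low.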
In this short note, I attempt to control for this factor, using standard time series methods. Taking this route is not costless; it means introducing greater sampling error (another parameter must be estimated), as well as potential for specification error. Hence, the results below should not be construed as being superior, but rather “alternative”.
I find that, using a conservative approach that adjusts for population and takes serial correlation into account, excess deaths are 804 (vs. 740 when not adjusting for population). That my estimate of 740 is lower than the Santos-Lozada and Howard (2018) estimate of 1139 through December likely arises from the different treatment of standard errors, as I formally account for serial correlation (which widens the estimated confidence interval).
If one looks at merely the deviation from predicted values, through March 2018 I obtain a baseline estimate of 1926 excess deaths, versus 2251 after adjusting for population decline. Using quantile regression, I obtain an estimate of 2705.
Data and Specification
The data are graphically presented below.
Figure 1: Mortality per month (blue, left log scale), and population (red, right log scale), cubic interpolation from IMF World Economic Outlook database data. Out of sample period shaded light green. Source: Santos-Lozada and Howard, 2017, June release of Vital Statistics data, IMF WEO April 2018 database, and author’s calculations.
I estimate two specifications. The first emulates the Santos-Lozada and Howard (2018) specification, in that it treats the time variation in “normal periods” as merely a function of the month (i.e., seasonality). It differs from their approach in its treatment of an outlier (October 2014) and in its use of HAC robust standard errors (Newey-West default in EViews).
(1) m_t = β0 + δ·OCT14_t + monthly dummies + u_t
where m is log mortality and OCT14 is a dummy for October 2014.
The second adds in log population, pop:
(2) m_t = β0 + β1·pop_t + δ·OCT14_t + monthly dummies + u_t
These regressions are estimated over the 2010M01-2016M12 period. I add a dummy for October 2014 because it’s a clear outlier. The regression output is reported in this memo. The regression residuals exhibit serial correlation, but not extreme non-Normality. For completeness’ sake, I estimate (2) using quantile regression as well.
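As a rough sketch of how the population-adjusted specification could be fit, the code below computes the OLS coefficients in plain Python. It is not the EViews code behind the results above. It relies on two standard facts: with a full set of month dummies, the slope on log population equals the slope from within-month demeaned data (Frisch-Waugh), and giving one observation its own dummy yields the same remaining coefficients as dropping that observation. The function name `fit_population_model` and the synthetic data are hypothetical, and standard errors are not computed here.

```python
import math

def fit_population_model(log_deaths, log_pop, months, skip=()):
    """OLS of log deaths on month dummies and log population.

    Observations whose index is in `skip` are dropped, which matches
    giving each of them its own dummy variable.
    Returns (beta1, {month: month-specific intercept}).
    """
    keep = [i for i in range(len(log_deaths)) if i not in skip]
    # Month-specific means of both variables over the kept sample.
    mean_m, mean_p = {}, {}
    for mo in set(months[i] for i in keep):
        idx = [i for i in keep if months[i] == mo]
        mean_m[mo] = sum(log_deaths[i] for i in idx) / len(idx)
        mean_p[mo] = sum(log_pop[i] for i in idx) / len(idx)
    # Slope from the within-month demeaned series.
    num = sum((log_deaths[i] - mean_m[months[i]]) *
              (log_pop[i] - mean_p[months[i]]) for i in keep)
    den = sum((log_pop[i] - mean_p[months[i]]) ** 2 for i in keep)
    beta1 = num / den
    # Each month's intercept is beta0 plus that month's dummy coefficient.
    intercepts = {mo: mean_m[mo] - beta1 * mean_p[mo] for mo in mean_m}
    return beta1, intercepts

# Synthetic, noise-free data for 2010M01-2016M12 (84 months); illustration only.
n = 84
months = [t % 12 + 1 for t in range(n)]
log_pop = [math.log(3_720_000) - 0.0012 * t for t in range(n)]
seasonal = {mo: 0.05 * math.cos(2 * math.pi * (mo - 1) / 12) for mo in range(1, 13)}
true_beta = 1.2
log_deaths = [-10.35 + seasonal[months[t]] + true_beta * log_pop[t] for t in range(n)]

oct14 = (2014 - 2010) * 12 + 9  # zero-based index of October 2014
log_deaths[oct14] += 0.3        # inject an outlier in that month

beta1, intercepts = fit_population_model(log_deaths, log_pop, months, skip={oct14})
print(f"estimated beta1: {beta1:.4f}")
```

Because the synthetic series contains no noise apart from the injected outlier, the fit recovers the assumed slope once October 2014 is excluded. Real data would also require HAC standard errors, which this sketch leaves out.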
For each specification, I calculate the excess fatalities, measured either as deviations from the 95% upper bound or as deviations from the conditional mean. The time series of the deviations from the conditional means are shown in the figure below (Figure 3 in the memo).
Figure 3: Deviations from predicted values, for simple time dummies OLS model (blue), OLS model adjusting for population (green), and Quantile Regression model adjusting for population (red). Gray shading denotes pre-Maria sample. Source: author’s calculations.
Using the conservative approach of taking only entries above the 95% upper bound yields a baseline estimate of 740 excess deaths. This estimate is below the Santos-Lozada and Howard (2018) estimate of 1139 (through December) because (1) I have taken a slightly different approach to estimating the conditional mean, and, likely more importantly, (2) I have accounted for serial correlation in calculating the standard errors. Accounting for population change, the excess rises to 804, still using HAC robust standard errors.
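To see why accounting for serial correlation widens the interval, here is a simplified sketch of a Newey-West (Bartlett kernel) standard error. It is computed for a sample mean rather than for regression coefficients, so it is only an analogue of the HAC standard errors used above. The residual series is made up, and the lag truncation of 3 is an assumption rather than the setting used in the estimates.

```python
import math

def newey_west_se_of_mean(x, lags):
    """Standard error of the sample mean using Bartlett (Newey-West) weights."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]

    def gamma(j):
        # Sample autocovariance at lag j (divides by n, a common HAC convention).
        return sum(d[t] * d[t - j] for t in range(j, n)) / n

    long_run_var = gamma(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)  # Bartlett weight, declining in the lag
        long_run_var += 2.0 * weight * gamma(j)
    return math.sqrt(long_run_var / n)

# Hypothetical residuals that drift slowly, so neighbours are positively correlated.
pattern = [0.03, 0.03, 0.02, 0.02, 0.01, 0.01,
           -0.01, -0.01, -0.02, -0.02, -0.03, -0.03]
residuals = pattern * 2

naive_se = newey_west_se_of_mean(residuals, lags=0)  # ignores serial correlation
hac_se = newey_west_se_of_mean(residuals, lags=3)    # lag choice is an assumption

print(f"naive SE: {naive_se:.5f}, HAC SE: {hac_se:.5f}")
```

With positively autocorrelated residuals, the HAC standard error exceeds the naive one, so the 95% upper bound sits higher and fewer deaths count as exceeding it.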
Summing the deviations from the conditional mean instead, the baseline model (no population change accounted for) yields excess deaths of 1926; using the population-adjusted OLS approach, I obtain an estimate of 2251 excess deaths (2705 using quantile regression).
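The two ways of counting can be sketched as follows. This reflects one reading of the procedure: the conservative measure sums only the amounts by which actual deaths exceed the 95% upper bound, while the other sums every deviation from the predicted value. The function name `excess_deaths`, the input figures, and the per-month standard error are hypothetical. The conversion from logs to levels uses a plain exponential with no retransformation correction.

```python
import math

def excess_deaths(actual, log_pred, log_se, z=1.96):
    """Return (sum of deviations from the mean, sum of exceedances over the upper bound)."""
    from_mean = 0.0
    above_upper = 0.0
    for deaths, mu, se in zip(actual, log_pred, log_se):
        predicted = math.exp(mu)           # conditional mean in levels
        upper = math.exp(mu + z * se)      # approximate 95% upper bound in levels
        from_mean += deaths - predicted    # counts every month, positive or negative
        above_upper += max(0.0, deaths - upper)  # counts only months above the bound
    return from_mean, above_upper

# Hypothetical post-hurricane months; not the actual Puerto Rico figures.
actual = [2900, 3000, 2600, 2500]
log_pred = [math.log(2400), math.log(2450), math.log(2450), math.log(2480)]
log_se = [0.04, 0.04, 0.04, 0.04]

from_mean, above_upper = excess_deaths(actual, log_pred, log_se)
print(f"deviation from mean: {from_mean:.0f}, above 95% bound: {above_upper:.0f}")
```

The bound-based count is smaller by construction, since it discards months inside the interval and subtracts the width of the interval from the months it keeps.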
Note that these estimates extend only through March 2018. Presumably, there are still excess deaths given the destruction of infrastructure, and associated illnesses (e.g., the leptospirosis epidemic).
To reiterate: these estimates should not be construed to be “better”. Rather, they illustrate how adjusting for the population decline might affect inferences regarding excess mortality.