Guest Contribution: “Are Google data really useful for macroeconomic nowcasting?”

Today, we’re pleased to present a guest contribution by Laurent Ferrara (Professor of Economics at Skema Business School, Paris and Director of the International Institute of Forecasters).

The recent sequence of economic, financial and pandemic crises around the globe has considerably shortened the prediction horizon for macroeconomic forecasters. At the heart of the Covid-19 crisis, the horizon of interest was the end of the week rather than two years ahead. This led practitioners to focus on new types of high-frequency and alternative datasets, thus raising new challenges for econometricians (unstructured data, very large datasets, mixed frequencies, high volatility, short samples, …).

Various sources of alternative data have been used in the recent literature, such as web-scraped data, scanner data or satellite data. Generally, those datasets are extremely large and can be considered big data. One of the main sources of alternative data is Google search data, and the seminal papers on the use of such data for forecasting are those by Hal Varian and co-authors (see for example here). In the area of nowcasting/forecasting, the literature tends to show evidence of some forecasting power for Google data, at least for some specific macroeconomic variables such as the unemployment rate (D’Amuri and Marcucci, 2017) and employment (Borup and Montes Schütte, 2020), building permits (Coble and Pincheira, 2017) or car sales (Nymand and Pantelidis, 2018). However, when Google data are correctly compared with other sources of information, the jury is still out on the gain that economists can get from using them for forecasting and nowcasting. A side question, highly debated on Econbrowser, concerns the replicability of those data by practitioners (see here for a discussion between Hal Varian and Simon van Norden).

In a recent paper, published with Anna Simoni in the Journal of Business and Economic Statistics (see here for a mimeo), we ask whether Google data are still useful in nowcasting quarterly GDP growth when controlling for official variables generally used by forecasters, such as opinion surveys or manufacturing production. And if so, we ask when exactly those alternative data improve nowcasting accuracy. Nowcasting GDP growth is extremely useful for policy-makers to assess macroeconomic conditions in real time. The concept of macroeconomic nowcasting was popularized by Giannone et al. (2008) and differs from standard forecasting approaches in the sense that it aims at evaluating current macroeconomic conditions on a high-frequency basis. The idea is to provide policy-makers with a real-time evaluation of the state of the economy ahead of the release of official Quarterly National Accounts, which always come out with a delay. See for example here for the U.S. economy and here for a recent post on Econbrowser.

Because Google search data are of high dimension, in the sense that the number of variables is large compared to the time series dimension, there is a price to pay for using them: first, we need to reduce their dimensionality from ultra-high to high by using a screening procedure and, second, we need to use a regularized estimator to deal with the pre-selected variables. Regularization techniques are a way to include many, potentially correlated, variables in a linear regression (see for example Ridge estimation). In this respect, we put forward a new approach combining variable pre-selection and Ridge regularization, which makes it possible to handle a large database. In the paper, we provide some theoretical results on the good asymptotic properties of this estimation strategy, which we refer to as Ridge after Model Selection.
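To make the two steps concrete, here is a minimal sketch in Python. It is not the estimator from the paper: the screening rule below (keep the variables with the largest absolute correlation with GDP growth), the fixed penalty `alpha`, the number of retained variables `keep`, and all function names are assumptions chosen for illustration, and the data are simulated.

```python
import numpy as np


def screen_by_correlation(X, y, keep):
    """Indices of the `keep` columns of X with the largest |correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scale = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scale = np.where(scale == 0, np.inf, scale)  # constant columns get score 0
    score = np.abs(Xc.T @ yc) / scale
    return np.sort(np.argsort(score)[::-1][:keep])


def ridge_fit(X, y, alpha):
    """Closed-form Ridge on centered data; returns (intercept, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    beta = np.linalg.solve(gram, Xc.T @ yc)
    return y_mean - x_mean @ beta, beta


def ridge_after_selection(X, y, keep, alpha):
    """Step 1: screen the columns. Step 2: Ridge on the retained columns only."""
    selected = screen_by_correlation(X, y, keep)
    intercept, beta = ridge_fit(X[:, selected], y, alpha)
    return selected, intercept, beta


# Simulated illustration (assumed sizes): 40 quarters, 500 candidate variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
y = 0.5 * X[:, 3] - 0.4 * X[:, 10] + 0.1 * rng.normal(size=40)

selected, intercept, beta = ridge_after_selection(X, y, keep=20, alpha=1.0)
fitted_last = intercept + X[-1, selected] @ beta  # in-sample fit, last quarter
print(selected.shape, beta.shape, float(fitted_last))
```

In practice both `keep` and `alpha` would have to be chosen from the data, and the evaluation would be done out of sample; the sketch only shows how screening and regularization fit together.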

In addition to those theoretical results, we obtain a set of empirical results that may be of interest to people using high-dimensional alternative data for macroeconomic nowcasting. Our objective is to nowcast GDP growth every week of the quarter, for the U.S., the euro area and Germany, over 3 types of economic periods: (i) a calm period (2014-16), (ii) a period with a sudden downward shift in GDP growth (2017-18, related to the trade war between the U.S. and China/Europe) and (iii) a recession period with large negative growth rates (2008-09, driven by the Global Financial Crisis). In this respect we use classical macro data (surveys and production), as well as alternative data stemming from Google (Google Search Data, already grouped into categories and sub-categories). We compare various approaches based on their nowcasting ability, as measured by the Root Mean Squared Forecasting Error (RMSFE). Four salient facts emerge from our empirical analysis.
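For readers unfamiliar with the metric, the RMSFE for a given week of the quarter is the square root of the average squared gap between the nowcast made in that week and the GDP growth rate eventually observed. A minimal sketch follows; the function name, the array layout (one row per quarter, one column per week) and the numbers are made up for illustration and do not come from the paper.

```python
import numpy as np


def rmsfe_by_week(actual, nowcasts):
    """RMSFE for each week of the quarter.

    actual:   shape (n_quarters,), realized quarterly GDP growth.
    nowcasts: shape (n_quarters, n_weeks), nowcast made in each week.
    """
    actual = np.asarray(actual, dtype=float)
    nowcasts = np.asarray(nowcasts, dtype=float)
    errors = nowcasts - actual[:, None]  # nowcast error per quarter and week
    return np.sqrt(np.mean(errors ** 2, axis=0))


# Made-up numbers for two quarters and three weeks, purely illustrative.
actual = [0.4, 0.2]
nowcasts = [[0.7, 0.5, 0.4],
            [0.5, 0.4, 0.2]]
print(rmsfe_by_week(actual, nowcasts))
```

Plotting the resulting vector against the week index gives the kind of within-quarter profile shown in the figures.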

First, we compare a standard regression (with Ridge regularization) with a regression after pre-selection (our Ridge after Model Selection approach). Figure 1 shows the results for the euro area during a calm period (2014-16). We clearly see the gain in nowcasting accuracy from pre-selecting the data before they enter the model. The idea is that having too many variables adds too much noise. This is specifically the case with Google Search Data, as some of them are not directly related to economic activity. This result confirms previous results obtained in the context of dynamic factor models (see Bai and Ng, 2008 or Barhoumi et al., 2009).

Figure 1: RMSFEs for the euro area during a calm period (2014-16) stemming from a standard regression with Ridge regularization (blue bars) and from the Ridge after Model Selection approach (orange bars). Evolution of RMSFEs within the 13 weeks of the current quarter. Source: Ferrara and Simoni (2023)

Second, we point out the usefulness of Google search data in nowcasting the GDP growth rate during the first four weeks of the quarter, that is, when no official information about the current quarter is yet available. In Figure 1, we see that at the beginning of the quarter (from week 1 to week 4), Google data indeed provide an accurate picture of the GDP growth rate, in the sense that RMSFEs are reasonably low (between 0.2% and 0.3%), only slightly higher than those at the end of the quarter when all the information is available (about 0.2%).

Figure 2: RMSFEs for the euro area during a calm period (2014-16) stemming from a standard regression with Ridge regularization (blue bars), from the Ridge after Model Selection approach (orange bars), from the Ridge after Model Selection approach using only Google data (green bars) and from a basic regression model without any Google data (yellow bars). Evolution of RMSFEs within the 13 weeks of the current quarter. Source: Ferrara and Simoni (2023)

Third, as soon as official data become available, that is, starting from week 5 with the release of the first opinion survey of the quarter (in the euro area case), the relative nowcasting power of Google data rapidly vanishes. We see in Figure 2 that, for week 5, the RMSFE with all data (orange bar) is equivalent to the one without any Google data (yellow bar), that is, with only the macro information contained in the first survey of the quarter. We also note that RMSFEs stemming from the Ridge after Model Selection approach using only Google data (green bars) do not show any decline over time, suggesting that the gain visible in the orange bars starting from week 5 comes from the integration of macro variables.

Fourth, recession periods present a specific pattern, as the model without any pre-selection and with only Google data as information set provides the lowest RMSFEs (green bars in Figure 3). This pattern is also generally visible for German and U.S. data. This result calls for further research, but it might be related to the well-known higher uncertainty that we observe during recessions, meaning that more data must be used to account for it. In any case, this can be seen as a justification for the use of alternative data during crises.

Figure 3: RMSFEs for the euro area during a recession period (2008-09) stemming from a standard regression with Ridge regularization (blue bars), from the Ridge after Model Selection approach (orange bars), from the Ridge after Model Selection approach using only Google data (green bars) and from a basic regression model without any Google data (yellow bars). Evolution of RMSFEs within the 13 weeks of the current quarter. Source: Ferrara and Simoni (2023)

Various robustness checks confirm that those empirical results hold for all the countries/areas in our analysis and remain valid when we enlarge the macroeconomic information set by considering 22 usual variables (sales, exports, employment, …). Last, a true real-time analysis for the euro area with vintages of data confirms the ranking of the various approaches. Overall, those results show that, after a pre-selection step, Google data can be very useful for GDP growth nowcasting during expansion phases when other information is lacking. However, as soon as official macroeconomic information arrives, the marginal gain from Google data tends to vanish rapidly. During recession phases, it seems that forecasters need the largest available information set to assess what is going on in economic activity.

This post written by Laurent Ferrara.

27 thoughts on “Guest Contribution: “Are Google data really useful for macroeconomic nowcasting?””

  1. pgl

    This Dominion Voting Systems defamation case is getting a lot of interesting attention:

    MSNBC guest host Jason Johnson criticized Fox News commentator Tucker Carlson and others at the network for “essentially calling their viewers idiots” following the release of court filings from the Dominion Voting Systems defamation case on Thursday. Johnson’s takedown comes one day after court documents revealed that Carlson and other network stars, including Sean Hannity, ridiculed the election fraud lies being pushed by Donald Trump and his lawyers, but they told a different story on air.

    Well Tucker Carlson and Sean Hannity have been lying to their viewers for many years because the people who watch their trash ARE IDIOTS.

    1. Moses Herzog

      Think about the high quality of the posts Prof Chinn, Prof Hamilton, Prof Frankel, Mr Ferrara above etc in stark contrast to some of the losers who are attracted to the comments section…….. Now …….. let’s think of the average type FOX might attract. Uhm, is it really possible to insult their intelligence?? I had some jerkwad on YT today who wanted to tell me the existence of viruses (ANY virus) had never been proven and that the man who invented PCR tests had been “murdered by Fauci”. This is the America 2023 we now live in.

  2. pgl

    And you thought Trump hanged with some evil people:

    Former Trump administration official and South Carolina Gov. Nikki Haley (R) kicked off the first rally for her new presidential campaign on Wednesday in South Carolina with an invocation by John Hagee, an evangelical pastor who opposes same-sex marriage and LGBTQ+ anti-discrimination ordinances. Hagee has also said that the Antichrist is gay, that God sent Hitler to push Jews into Israel, and that God sent Hurricane Katrina to punish New Orleans for holding a Pride parade. The following day, while speaking at an Exeter, New Hampshire town hall, Haley said of Florida’s “Don’t Say Gay” law “I don’t think [it] goes far enough.” While the law bans Kindergarten through third-grade teachers from discussing subjects related to sexual orientation and gender identity, which some say includes even mentioning LGBTQ+ people, a teacher’s same-sex partner, or a student’s LGBTQ+ parents, she compared such acknowledgment to sexual education.

  3. pgl

    Russia has committed crimes against humanity in Ukraine, Kamala Harris says
    Speaking at the Munich Security Conference in Germany, the U.S. vice president said, “We have examined the evidence. We know the legal standards. And there is no doubt. These are crimes against humanity.”

    Now I get that Moses is not that fond of VP Harris in general. And I will admit that this obvious statement was long overdue.

    But this had to be said. And I would put this above when St. Reagan told Gorby to “tear down this wall”. Now time to put words into real action.

    1. Moses Herzog

      Much better politics than saying on nationwide TV you went to the southern border after getting elected when it’s literally part of the public record you never went to the southern border. Then she had to go into the Hillary Clinton’s old campaign bunker and turn the ringer off on her phone for 1 year. Did she learn from having to hide out for one year from the U.S. media that she shouldn’t tell the lies of a 6 year old?? I doubt it.

      Since Copmala is replicating Hillary’s “hide out from the media during a Presidential bid” strategy, we’ll have to see if she breaks out into some kind of an HPV rash when she enters Wisconsin, Michigan, or Pennsylvania like Hillary was terrified with. Apparently, Hillary watched that film “It Follows” too many times and was terrified her handzies were gonna “get something” if she shook hands with someone from Milwaukee.

  4. JohnH

    18,955 civilian casualties in the Ukraine in the last year: 7,199 killed and 11,756 injured.

    Hundreds of thousands were killed in Iraq (2003-2006). Average annual deaths were at least an order of magnitude higher than in Ukraine, perhaps a couple.

    This observation is in no way intended to condone Russia’s behavior, only to put it in perspective.

    If people are going to get apoplectic and sanctimonious about Russia’s behavior, shouldn’t they at least demand that the US acknowledge and atone for its behavior? If Putin deserves to be tried at The Hague, shouldn’t Bush, Condi, Cheney, Gates and Rumsfeld have been tried by now? Where’s the ‘never again’ call to keep the US from doing exactly what Putin is doing?

    1. Menzie Chinn Post author

      JohnH: Yes, Bush, Condi, Cheney, Gates and Rumsfeld should have been tried by now. Now will you please cease and desist in your apology tour for Russian atrocities in the Ukraine? I seldom say this, but your excusing of an unprovoked expanded invasion of the Ukraine (the 2014 initial invasion was also unprovoked) makes me ill in a way that I feel compelled to remark upon.

      Please note that if you go back in the posts, you will see that I for one was not a supporter of the US invasion of Iraq. So please stop trying to paint those of us who believe the Russian action in Ukraine illegitimate as apologists for US actions in Iraq.

        1. pgl

          Raise your hand if anyone supported that pathetic 2003 invasion. I never did. In fact I actively protested the invasion of Iraq. What did Jonny do? Took his wife to dinner.

          1. Moses Herzog

            At the time I remember having (seriously) what I would call “mixed feelings” about it. But I remember thinking, and verbalizing to my IRL friends that I couldn’t understand why we weren’t attacking Afghanistan FIRST, because Afghanistan was the logical choice if the true basis was revenge for the Twin Towers. Something that still makes me scratch my head (in extreme aggravation) to this very day.

          2. Moses Herzog

            While we’re on the topic, I never understood (and this is pre Osama Bin Ladin location and assassination) why we called Pakistan an “ally”. Another joke of absurdist comedy to this very day.

      1. Anonymous

        “unprovoked” may apply to bush invasion of iraq, and afghanistan, as well as us bombing of libya. it may also apply to us and saudi support to radical islamists in trying to depose assad, and kurds in iraq and syria.

        but unprovoked may be less obvious given the two minsk agreements, that were never implemented and the years of shelling the ‘breakaway’ oblasts that comprise donbas.

        osce reports of ceasefire violations show upticks in jan and feb 2022…..

        you can find osce special monitoring mission to ukraine maps of cease fire violations by dates up to the russian operations in 24 feb 2022.

        18 feb 2022:


        1. Macroduck

          In 2014, Russia invaded Ukraine. In 2022, there was violence in the Russian-occupied parts of Ukraine. To anyone not spoiling for a fight, violence in an occupied part of Ukraine is a funny kind of “provocation” to invade the unoccupied part of Ukraine. Occupation involves violence.

          Russia invaded Ukraine in 2014. If that had not happened, the “provocation” wouldn’t have been available as an excuse for Russia’s invasion of the rest of Ukraine.

          Nice try. Thanks for playing.

          1. Anonymous

            do look up who fought the battles in 2014 which assured independence of parts of donbas and who did most of the fighting prior to russian federation calling the 300,000 troop mobilization soon to be mustered.

            and the donbas votes held in 2014 and 2022 were less valid than the counts that elected biden?

            send links.

            your pols have no business near any nuclear weapon, they are a clear and present danger to humanity!

          2. Moses Herzog

            This must be UllenRusky aka “UlenSpiegel”. He knows we’re onto his fraud game of pretending he’s German so he can perform image rehabilitation for Russian war crimes. So going with “Anon” now. What a POS trying to pass itself off as a human being.

      2. Moses Herzog

        I have no prediction on how this turns out on air warfare. I could see it being a super bad play by Russia, or Russia making their biggest gains on control of the region. But I would say this, so far Ukraine’s military and citizens’ defense of themselves have surprised us to the upside.

        My prayers will be with the people of Ukraine, and if you believe in the power of prayer, I encourage all others to do likewise.

    2. pgl

      OHCHR believes that the actual figures are considerably higher, as the receipt of information from some locations where intense hostilities have been going on has been delayed and many reports are still pending corroboration. This concerns, for example, Mariupol (Donetsk region), Lysychansk, Popasna, and Sievierodonetsk (Luhansk region), where there are allegations of numerous civilian casualties.

      We will not know how many Ukrainians have died from the Putin war crimes you celebrated for a year. And it took you the entire year to finally say something? Atta boy Putin’s poodle.

    3. pgl

      Estimates of the number of people who have died as a result of Putin’s unprovoked invasion differ widely. Here is another account:

      Western intelligence sources estimate that each side has suffered approximately 150,000 casualties since Russia launched its invasion on Feb. 24, 2022, Agence France-Presse reported. Norway’s defense chief Gen. Eirik Kristoffersen said last month on Norwegian TV that the most recent intelligence suggests that Ukraine has lost 100,000 troops and 30,000 civilians in the course of the war. Meanwhile, Kristoffersen said Russia had 180,000 soldiers who were either wounded, killed or missing. More recently, US officials said Russian casualties are fast approaching 200,000, according to reporting from the New York Times, citing American and other Western officials.

      Jonny boy wants us to believe there have been less than 19 thousand deaths – a number even he knows is a subset of the number of actual deaths. Why does Jonny boy lie to us this way? Simple – Putin tells him to do so.

  5. Bruce Hall

    Got a good laugh today. pgl reminds me of Dilbert’s boss.

    Nothing like an ad hominem response to prove your point, eh?

    Bruce Hall
    February 16, 2023 at 1:41 pm
    I know this about the PPI (just to forestall a snarky comment from the usual suspect), but I thought this was a good way of looking at the CPI and inflation.

    February 16, 2023 at 2:29 pm
    There was some magical point to a link that told us we do not have negative inflation? Come on Brucie – no interpretation? Oh yea – Kelly Anne is still emailing to you what you are supposed to say. Got it!

    1. pgl

      Bruce – did you READ the first part of this cartoon? I guess not because I was playing Dilbert. OK – you are the boss man!!!

    2. pgl

      There was some magical point to a link that told us we do not have negative inflation? Come on Brucie – no interpretation?

      I see you can repeat my challenge to you. Of course, my little boss man is totally incapable of addressing it.

      Come on Brucey – if you are going to go after me, come armed with something more than your little pea shooter.

    3. Moses Herzog

      @ Bruce Hall
      You don’t think that a post stating that we didn’t have negative inflation was really pretty stupid in the context of 2023?? Did you also need confirmation that gravity pulls things towards the ground and not towards the sky??

      1. pgl

        Oh my – your point may be a bit dangerous for little Brucie. After all – Brucie has proven over and over he does not understand the difference between a price level and the rate of change of that price level. Now if his understanding of gravity is just as weak, he might just fall off the planet.

        1. Moses Herzog

          I bet Bruce is obsessed with weather/spy balloons in the last week or so. Maybe we could show Bruce that one weather balloon can make it to 60,000 feet faster than another weather balloon, even though they were both rising higher over time??

      2. pgl

        I was curious who this Felix Richter that Brucie boy is citing is. Did Brucie even know his job description is “Data Journalist” not economist. Did Brucie boy not even realize his phone number means he works in Hamburg, Germany?

        A German data journalist is Brucie boy’s latest expert on the US macroeconomy. Yea Brucie boy is really, really dumb!

    4. pgl

      Seven reliable sources that support my position – Brucie actually thinks he is Dilbert here. What reliable source has Brucie ever provided us?

      That supposed Johns Hopkins analysis that was actually written by a Cato Clown, not peer reviewed, and thoroughly debunked by the actual scientists at Johns Hopkins?

      Or that high schooler website that showed CPI data that they forgot to tell Brucie boy was not seasonally adjusted?

      Bruce Hall never provides reliable sources? Why – because he does not even TRY to find a reliable source. After all he works for Kelly Anne Conway.
