Today we are pleased to present a guest contribution written by Jeannine Bailliu, Xinfen Han, Mark Kruger, Yu-Hsien Liu and Sri Thanabalasingam (all Bank of Canada). This research may support or challenge prevailing policy orthodoxy. Therefore, the views expressed in this paper are solely those of the authors and may differ from official Bank of Canada views. No responsibility for them should be attributed to the Bank.
Although issues have been raised with respect to many of China’s official statistics, those pertaining to the labour market have been seen as particularly problematic. The main problem is that while the official statistics capture formal employment, they do not appear to include migrant workers, who are typically engaged on an informal basis. This omission matters because migrant workers represent a large share of the labour force. It is generally agreed that the official unemployment rate underestimates the level of unemployment in China and has failed to capture what is known about key historical developments in China’s labour market. The official rate has remained fairly stable over time; notably, it did not increase by much during the global financial crisis (GFC) in spite of the significant employment losses over that period.
Bailliu et al. (2018) use machine learning techniques, specifically text analytics, to construct a labour market conditions index (LMCI) for China by extracting labour market information from mainland Chinese-language newspapers over the period 2003 to 2017. We employ a supervised machine learning approach, training a support vector machine (SVM) in a two-stage process. In the first stage, we train our SVM to find articles that are relevant to the state of the Chinese labour market. In the second stage, we train the classifier to distinguish between articles that represent positive and negative labour market sentiment.
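To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: the tiny English headlines, the bag-of-words features, the hand-rolled linear SVM (hinge loss with an L2 penalty, fitted by subgradient descent) and all function names are illustrative assumptions. The actual study works with Chinese-language newspaper text and a far larger training set.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())

def train_linear_svm(texts, labels, epochs=20, lr=0.1, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent.

    labels must be +1 or -1. Hyperparameters are arbitrary toy values.
    """
    weights, bias = {}, 0.0
    data = [(featurize(t), y) for t, y in zip(texts, labels)]
    for _ in range(epochs):
        for feats, y in data:
            score = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
            # L2 penalty: shrink every weight a little on each step.
            for k in weights:
                weights[k] *= 1.0 - lr * lam
            # Hinge loss: update only when the margin is violated.
            if y * score < 1.0:
                for k, v in feats.items():
                    weights[k] = weights.get(k, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias

def predict(model, text):
    """Return +1 or -1 according to the sign of the decision function."""
    weights, bias = model
    score = bias + sum(weights.get(k, 0.0) * v for k, v in featurize(text).items())
    return 1 if score > 0 else -1

# Stage 1 (toy data): labour-market relevant (+1) vs. irrelevant (-1).
relevance_model = train_linear_svm(
    ["factory hiring workers wages rise",
     "layoffs unemployment migrant workers jobs",
     "football team wins match",
     "new film opens festival"],
    [1, 1, -1, -1],
)

# Stage 2 (toy data): positive (+1) vs. negative (-1) sentiment,
# trained on relevant articles only.
sentiment_model = train_linear_svm(
    ["factory hiring workers wages rise",
     "firms expand recruitment jobs growth",
     "layoffs unemployment migrant workers jobs",
     "plant closures layoffs wages cut"],
    [1, 1, -1, -1],
)

def classify(article, relevance_model, sentiment_model):
    """Apply the two stages in sequence: filter first, then score sentiment."""
    if predict(relevance_model, article) < 0:
        return "irrelevant"
    return "positive" if predict(sentiment_model, article) > 0 else "negative"

for headline in ["factory hiring rise", "layoffs unemployment", "football match festival"]:
    print(headline, "->", classify(headline, relevance_model, sentiment_model))
```

Splitting the task in two means the sentiment classifier only ever sees articles that passed the relevance filter, so each model has a narrower distinction to learn.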
We find that the behaviour of our LMCI appears to be consistent with the economic shocks that have impacted the Chinese labour market (Figure 1):
Figure 1: Labour Market Conditions Index for China (2003-2017). Source: Bailliu et al. (2018)
The usefulness of our LMCI will depend on the extent to which it captures direct measures of labour market outcomes. Moreover, the LMCI’s value added needs to be assessed relative to that of the official measures of labour market activity. We test the ability of the LMCI to explain and predict the behaviour of wages and credit against that of the official measures: the registered urban unemployment rate, the urban labour demand-supply ratio and the employment sub-indices of the purchasing managers’ indices. Our results suggest that, although each of the official labour market indicators does contain some information for either wage or credit growth, the information in our LMCI is more consistent. Moreover, the LMCI provides wage and credit forecasts that are better than those from any single official labour market indicator.
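One generic way to compare the forecasting content of competing indicators is an out-of-sample exercise, sketched below in Python. This post does not spell out the forecasting models used in the paper, so everything here is an assumption made for illustration: the one-regressor OLS model, the expanding estimation window, the RMSE criterion, the function names and the synthetic numbers, which are not Chinese data.

```python
from math import sqrt

def fit_ols(x, y):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def expanding_window_rmse(indicator, target, min_train=4):
    """RMSE of one-step-ahead forecasts of target[t+1] from indicator[t].

    The model is re-estimated each period on data available up to t only.
    """
    errors = []
    for t in range(min_train, len(target) - 1):
        # Training pairs (indicator[s], target[s+1]) for s = 0 .. t-1.
        a, b = fit_ols(indicator[:t], target[1:t + 1])
        forecast = a + b * indicator[t]
        errors.append(target[t + 1] - forecast)
    return sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic placeholders: the target is built so that indicator_a leads it
# exactly (target[t+1] = 2 * indicator_a[t] + 1); indicator_b is unrelated.
indicator_a = [1, 3, 2, 5, 4, 6, 5, 8, 7, 9]
indicator_b = [5, 1, 4, 2, 6, 3, 5, 2, 4, 1]
target = [0, 3, 7, 5, 11, 9, 13, 11, 17, 15]

rmse_a = expanding_window_rmse(indicator_a, target)
rmse_b = expanding_window_rmse(indicator_b, target)
print("RMSE using indicator_a:", rmse_a)
print("RMSE using indicator_b:", rmse_b)
```

The indicator with the lower out-of-sample RMSE is the one that would be judged to carry more predictive information under this criterion.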
Since our dataset covers newspapers from a range of Chinese cities, we can also analyze how regional labour market conditions may vary. To test this, we construct two LMCI sub-indices: one for the export-oriented coastal provinces and a second for the remaining inland provinces. We find that exports are a predictor of labour market conditions in the coastal region (and for the country as a whole) but not for the inland region.
These results suggest that text analytics can be used to extract useful labour market information from Chinese media.
This post written by Jeannine Bailliu, Xinfen Han, Mark Kruger, Yu-Hsien Liu and Sri Thanabalasingam.