- Vol 001
- 2025
- DOI: 10.21627/XXX
Abstract
We review what machine learning (ML) might have to offer central banks as an analytical approach to support monetary policy decisions. After describing the central bank’s “problem” and providing a brief introduction to ML, we draw on the evolution of vector autoregression (VAR) methods in central banks to speculate about how ML models must (will?) evolve to become influential analytical tools supporting central banks’ monetary policy decisions. We argue that VAR methods achieved that status only after they incorporated elements that allowed users to interpret them in terms of structural economic theories, and we believe that the same must be the case for ML.
Introduction
Machine learning (ML) is the new “car” on the street, and as with most things that are new, it is flashy, popular, and full of excitement. Central banks, or at least some central bank economists, appear to share this excitement, and as a result, a number of central banks have invested resources in ML. Is the excitement warranted?
Advances in artificial intelligence (AI) research have transformed many tasks previously carried out “manually”: interpreting X-ray images, machine translation, and facial recognition are just a few of the processes that already use ML methods successfully. AI has also outperformed humans in playing highly complex games such as chess and Go, and it holds the promise of being able to give us driverless cars. Similar successes are also seen in banking applications, such as credit scoring (). In addition, ML is increasingly used to inform public policy decisions (). At the same time, applications to macro and monetary policy questions are also on the rise: surveys of ML for central bank policy have viewed it almost exclusively from the predictive perspective, and early warning models for predicting financial crises have been developed using ML techniques.
Little wonder that central banks ask themselves whether these underlying methods can be used to cater to the needs of central banks. After all, if computers can be trusted to drive cars, trucks, and buses safely on busy streets filled with cyclists and pedestrians, can they not also be trusted with some aspects of the conduct of monetary policy? In this article, we will attempt to answer this important question.
Our analysis starts by providing answers to the following two questions: First, what does a monetary policymaking authority want from analytical models, and second, how have existing models, particularly vector autoregressions (VARs), evolved to serve these needs? In other words, we will use the evolution of VARs as a lens through which we will assess whether ML methods can be used to answer the questions that the central banks have.
What does a monetary policymaking authority or central bank require from an empirical tool? We argue that a monetary policymaking institution would like to (1) describe and summarize macroeconomic data; (2) make forecasts; (3) conduct structural analysis, including risk and scenario analysis; and (4) communicate its decisions/analyses.
We believe that the VAR literature made a critical contribution to the way that the central banks think about data, forecasting, and structural analysis. We will therefore use the developments and achievements of VAR methodology, as well as the challenges that it has had to overcome, as a lens through which we can assess the potential usefulness of ML for the central banks.
To avoid misunderstanding, we would like to make it clear that we are not ML experts. However, we do have experience in the use of VARs in central banking, and we will use this experience and our understanding of the current state of ML to speculate about how these tools need to evolve to challenge and perhaps overtake VARs as a dominant analytical tool for central banks.
We also want to emphasize that we are focusing in this article only on the potential use of ML in the monetary policy functions of the central bank. ML and AI have been used more actively in other aspects of central banking, such as financial stability, crisis forecasting, and banking supervision. We will not comment on these important applications for ML in this paper.
Almost two decades ago, Adrian Pagan argued that macroeconomic models used in central banks face a trade-off between theoretical consistency and data consistency (Figure 1). Given that central banks have a fixed amount of resources (e.g., staff) available, this trade-off could be interpreted as a production possibility frontier. One could also think about this trade-off chart as the preferences of policymakers or researchers between theory-driven and data-driven approaches. The models that we consider to be on this frontier are listed in the figure.
The Pagan frontier
Models like unrestricted VARs are located close to the lower-right part of the frontier, reflecting their data-heavy feature. Close to the upper-left part, we find real business cycle (RBC) and dynamic stochastic general equilibrium (DSGE) models that are typically not estimated in the conventional sense. Rather, they are calibrated using estimates of structural parameters obtained from other studies.
Over the past four decades, VARs have evolved substantially such that they now cover a wider spectrum on the Pagan frontier. The development of a range of VARs that incorporate restrictions to give them structural interpretations (structural VARs, or SVARs) has perhaps pushed them toward the middle of the Pagan frontier. For example, it has been argued that DSGE-VARs provide a framework with which one could evaluate DSGE models (). In other words, the development of VARs has been rather dynamic, not static.
The big question, we believe, is whether ML models that currently are located on the data-heavy corner of the frontier can move toward the middle, and what modifications need to be introduced to enable them to do so. In other words, can they combine good forecasting performance with an ability to provide insights into the more structural aspects of the macroeconomy that central banks need?
The remainder of this article is structured as follows: Section 2 sets out what we think central banks need from a model. Section 3 summarizes our understanding of current-generation ML models and how they might be used by central banks. Section 4 describes briefly the development of the VAR literature, and Section 5 discusses how well VARs have served in solving or answering central banks’ needs. Section 6 covers whether ML can develop into something similar or better than VARs. Section 7 concludes the discussion.
The Central Banks’ Problem
Here, we propose a list of what central banks need from models by extending Stock and Watson’s description of what macroeconometricians at policy institutions do. Stock and Watson (2001) argued that they do four things: (1) describe and summarize macroeconomic data, (2) make macroeconomic forecasts, (3) quantify what we do or do not know about the true structure of the macroeconomy, and (4) advise (and sometimes become) macroeconomic policymakers.
We adapt this classification to central bank policymaking. Further, we argue that, to a first approximation, policymaking involves the following steps, and models play a role in every one of them:
Central banks summarize and analyze data.
They forecast the key macroeconomic variables.
They conduct risk analysis and balance of uncertainties.
They do structural/causal analysis, as well as scenario analysis.
They make decisions and communicate and justify these decisions to the public.
Prior to the advent of VAR models, these tasks were performed using a variety of statistical or econometric techniques. These ranged from large models with hundreds of equations, to single-equation models that focused on the interactions of a few variables, to simple univariate time-series models involving only a single variable. As we note in Section 4, dissatisfaction with the performance of these models led Christopher Sims to propose a new class of models, VARs, in his seminal Econometrica article (Sims, 1980).
In the remainder of this paper, we will be arguing that the VARs have served well in each of these five steps listed here. We will also argue that to be helpful to policymakers in central banks, ML methods must be able to perform well in these areas.
What Is Machine Learning?
ML is so far principally a prediction tool, but change may be coming
It has its origins in computational statistics. Its primary concern has been to use algorithms to identify patterns or interrelationships that exist in data and to apply these patterns to make predictions. While the algorithms used in ML can be common techniques such as ordinary least squares regression, they can also be more complex methodologies such as decision trees, clustering algorithms, and deep learning multilayer neural networks.
Currently, the principal use of ML in economics has been for prediction. This is particularly the case in macroeconomic applications (). The growing popularity of ML comes from its ability to uncover complex patterns in the data that have not been prespecified a priori. This flexibility is in sharp contrast with the approaches to forecasting traditionally used in central banks. In forecasting inflation, for example, researchers usually start with a structure that comes from theoretical considerations linking inflation to a set of causal determinants. For analytical and empirical tractability, the structure is often represented by a model that is linear in the underlying variables. But if forecast accuracy is the prime goal, this prespecified linear structure can be a weakness, and the advantage of ML is that it does not have to impose such a structure on the estimation of the model, and hence on the forecast. For example, deep learning neural network algorithms do not impose any particular functional form on the relationship between the explanatory variables and the forecast target. Instead, this functional form is an outcome of the network algorithm. The variables included in the forecasting model, whether it is a simple linear regression or a complex deep learning neural network, are still determined by the researcher, however.
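As a stylized illustration (with simulated data and assumed variable names, not an application from the literature reviewed here), the sketch below compares a prespecified linear autoregression with a flexible ML learner on a mildly nonlinear “inflation” process. The point is only that the ML learner does not require the functional form to be specified in advance.

```python
# A minimal sketch, assuming a simulated nonlinear AR(2) "inflation" series:
# compare a prespecified linear model with a flexible random forest.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
T = 400
pi = np.zeros(T)
for t in range(2, T):                        # nonlinear data-generating process (assumed)
    pi[t] = 0.6 * pi[t - 1] - 0.2 * pi[t - 2] ** 2 + 0.3 * rng.standard_normal()

X = np.column_stack([pi[1:-1], pi[:-2]])     # two lags as predictors
y = pi[2:]
split = int(0.8 * len(y))                    # hold out the last 20% for evaluation

linear = LinearRegression().fit(X[:split], y[:split])                         # imposes linearity
forest = RandomForestRegressor(n_estimators=300, random_state=0).fit(X[:split], y[:split])

for name, model in [("linear AR(2)", linear), ("random forest", forest)]:
    rmse = np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2))
    print(f"{name}: out-of-sample RMSE = {rmse:.3f}")
```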
While the main use of ML in economics is in prediction, recent developments have shown that, in certain contexts, it is also possible to design algorithms that allow causal inference. As has been pointed out, “Prediction and causal inference are distinct (though closely related) problems. Outside of randomized experiments, causal inference is only possible when the analyst makes assumptions beyond those required for prediction methods, assumptions that typically are not directly testable and thus require domain expertise to verify.” As we will argue next, for central bankers to make full use of ML techniques, it is important that causal analysis also be possible in nonexperimental contexts.
“Training” Versus “Validation” and the Need for Large Amounts of Data
Since ML algorithms impose relatively little a priori structure, there is a potential for overfitting whereby the algorithm will continue to add complex relationships until it perfectly explains the data that it is given. Some restraints must therefore be imposed. This can be done both at the stage of designing the algorithm that is used to explain the data and at the stage of validating the output of the algorithm.
For example, the number of nodes of a decision tree determines its complexity and its ability to explain the data it confronts. The risk of overfitting (explaining not only the systematic relationships embedded in the data, but also random noise arising from measurement errors and from idiosyncratic, unforecastable variation that nevertheless influences the outcome) can be reduced by limiting ex ante the complexity of the algorithm used in the analysis.
In addition to limiting the complexity ex ante, the analyst will typically divide the data sample into two parts: a training sample and a validation sample. The former is used to generate a model that best explains the data in the sample, while the latter is used to check how well the model explains data that are assumed to be driven by the same data-generating process but not used in the initial training phase. If the model’s ability to explain the validation data set is significantly worse than its ability to explain the training data set, it is likely that overfitting the data is a problem. The fundamental parameters of the algorithm are then adjusted and the training and validation process is repeated. The iteration process will end when the model’s explanation of the training data set is not significantly better than its ability to explain the validation data set.
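The sketch below illustrates this on simulated data (all settings are assumptions): a decision tree grown without a depth limit fits the training sample almost perfectly but does worse on the validation sample than a tree whose complexity is restricted ex ante.

```python
# A minimal sketch of the training/validation logic described above (illustrative only).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.3 * rng.standard_normal(500)   # signal plus noise

X_train, X_val = X[:350], X[350:]            # training vs. validation split
y_train, y_val = y[:350], y[350:]

for depth in (2, 4, 8, None):                # None = grow the tree without limit
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_train, y_train)
    in_sample = np.mean((tree.predict(X_train) - y_train) ** 2)
    out_sample = np.mean((tree.predict(X_val) - y_val) ** 2)
    print(f"max_depth={depth}: train MSE={in_sample:.3f}, validation MSE={out_sample:.3f}")
```

An unrestricted tree drives the training error toward zero while the validation error rises, which is the signal that the iteration described above should rein in the model’s complexity.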
The practice of using part of the available data as a training ground for the ML algorithm and part of it for validation implies that ML modeling is best suited to situations where we have a large number of observations on the events that we are trying to forecast. This can potentially be a problem for forecasting inflation, output fluctuations, and other macroeconomic variables at the business cycle frequencies that central banks are particularly concerned with. Indeed, the methods developed in the ML literature have been argued to be particularly successful in big data settings.
There is no universally agreed definition of “big data.” However, the term usually refers to large structured and/or unstructured data sets where the number of observations can run to hundreds of thousands or even millions. Textual data can also be digitized and made available for computer-aided analysis of content. Examples include digitizing documents containing the latest financial regulations so that they can be incorporated into compliance routines, and mining newspaper articles for indicators of economic uncertainty or for other information useful to regulators. With the availability of large-scale data storage combined with rapid increases in computing power, these sources of information have become more useful.
Cars with ML Drivers Versus Economies with ML Policy Decisions: Similarities and Differences
If we can design cars that navigate safely using ML algorithms, why should we not be able to design monetary policy rules based on the same principles? After all, a self-driving car must be able to observe the environment in real time (including possibly idiosyncratic behavior of pedestrians, motorcyclists, and other vehicles), interpret what it observes in terms of objectives that it wants to achieve (arriving at a predetermined destination, avoiding accidents, adhering to driving rules and conventions, etc.), and decide on adjustments to be made to driving direction, speed, avoidance maneuvers, and other elements. Similarly, to conduct monetary policy, the central bank management needs to make observations about the current state of the economy (the current and probable future rate of inflation and growth, for example), interpret this information in relation to the objectives that it is charged with achieving (macroeconomic and financial stability in particular), and take the appropriate policy action (changing policy interest rates, intervening in the foreign exchange market, etc.).
At this high level of generality, the task of creating a driverless car seems similar to designing an ML model to conduct monetary policy. However, as we dig deeper, significant differences are revealed that need to be recognized and addressed before monetary policy can be driven, at least partially, by digital means.
Consider first observing the economy. There have been considerable advances in nowcasting methods to provide a detailed picture of the current state of inflation, growth, and other variables relevant for monetary policy decisions. These methods typically use a wide variety of indicators and sophisticated analytical approaches. ML tools are also being used to provide forecasts of these same variables. However, as we will discuss in some detail later in this paper, determining policy actions typically requires more than knowledge of the current state of the economy and forecasting its future path. For example, it is often necessary to know how the economy has arrived at its current state and what the underlying determinants of the forecasted future path are: Are they driven by supply or demand shocks, by implicit forecasts of changes in economic activity and inflation abroad, or by projected changes in consumer behavior? The appropriate policy response will depend on the context. In contrast, the driverless car does not really need to know whether the pedestrian who is crossing the street is coming from a movie theater or the grocery store, or, at least to a first approximation, whether the bend in the road ahead will be followed by a straightaway or another bend. In other words, narratives that provide a context are important for monetary policymaking, and here the “black box” aspects of current-generation ML models constitute a hurdle.
Another difference between the driverless car example and the challenges facing the central bank is that the time pattern of the responses to the actions taken by the driving algorithm and the actions taken by the central bank is quite different. In the parlance of monetary policy, the transmission mechanism is different. If a pedestrian crosses the street, the algorithm in the driverless car will initiate actions to slow down or come to a complete stop. The car will react virtually immediately to the action or event. For the central bank, on the other hand, if the ML algorithm initiates actions to slow the economy by raising the policy interest rate, the impact on the economy will be felt only several quarters later. It would be more like trying to stop a supertanker than stopping a car. To be sure, the algorithm can build this reaction lag into the decision-making structure, but this adds another layer of complexity to the central banking example. The algorithm must also learn how the economy reacts to a policy rate increase, whereas the algorithm for the driverless car can incorporate the mechanical relationships directly. But understanding the reaction of the economy to a policy change requires either having a reliable model of the transmission mechanism or training the algorithm on past experiences with policy rate changes. In the latter case, the required data may be insufficient to allow for a relatively unstructured ML approach.
Finally, there are the communication aspects of monetary policy decisions. The effects of a particular monetary policy are likely to depend on how well it is explained and justified. The policymaker needs to provide a plausible causal story for both the assessment of the current state of the economy and the policy decisions made. It is not likely to be sufficient to just refer to a computer algorithm. In contrast, the driverless car does not have to provide a story about why it stopped for the pedestrian, so long as it did so in a timely manner.
In the following sections, we will first elaborate on how central banks have tried to deal with the challenges of formulating policy based on a mixture of theoretical and empirical modeling, before discussing how ML needs to (will?) evolve to provide more support for the monetary policy decisions of central banks.
The Development of VARs at a Glance
More than four decades ago, Sims (1980) provided a new macroeconometric framework in a paper with the provocative title “Macroeconomics and Reality.” The motivation was a view that the econometric models used predominantly at the time were unreliable because they relied on “incredible identification” (Sims, 1980, p. 1) based on theories that, in his judgment, were not justifiable. He introduced an alternative methodology, estimation, and analysis based on VARs, which were designed to “let the data speak” more freely about the relationships between major macroeconomic variables without being constrained by a priori restrictions. In general terms, the VAR system proposed is an n-equation, n-variable linear model in which each variable is in turn explained by its own lagged values plus past values of the remaining n−1 variables.
This simple framework enables researchers to capture, in a systematic way, rich dynamics in multiple time series. Stock and Watson (2001) argue that “VARs held out the promise of providing a coherent and credible approach to data description, forecasting, structural inference and policy analysis.”
Will models based on ML and AI be able to extend Sims’s “let the data speak” approach even further by doing away with the linear structure? Before we attempt to answer this question, we briefly review the evolution of the use of VAR models in empirical macroeconomics in general, and in central banks in particular.
From Linear Dynamic Stochastic General Equilibrium Models to VARs and Back
From Structural Models to VARs
Every linear DSGE model can be transformed into a VAR representation. Let $Y_t$ be a vector of endogenous variables represented by the model. The linear DSGE model can then be written as

$$A Y_t = B Y_{t-1} + U_t, \tag{1}$$

where, for simplicity, the dynamics are restricted to one lag. The elements of $U_t$ are shocks to each of the structural equations, and they are assumed to be independent of each other. $A$ and $B$ are structural parameters derived from the underlying theory. To convert this structural model into a VAR, simply premultiply both sides of Equation (1) by the inverse of matrix $A$:

$$Y_t = C Y_{t-1} + V_t, \tag{2}$$

where $C = A^{-1}B$ and $V_t = A^{-1}U_t$.
Four observations follow from Equation (2). First, the reduced form, Equation (2), can be estimated efficiently, equation by equation, by ordinary least squares. Second, the parameters in C are linear combinations of the structural coefficients in A and B. Third, even if the elements in U are independent of each other, the elements in V will not be. Fourth, and most crucially for our purposes, it is in general not possible to infer the values of the structural coefficients from the reduced-form parameters. Similarly, it is not possible to infer the nature of the structural shocks (Us) from the properties of the reduced-form errors (Vs).
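To fix ideas, the following minimal numerical sketch traces the mapping from Equation (1) to Equation (2). The matrices are illustrative assumptions, not estimates, and the simulation shows that the reduced-form errors are correlated even though the structural shocks are not.

```python
# Minimal numerical sketch of Equations (1)-(2) with illustrative (assumed) matrices.
import numpy as np

A = np.array([[1.0, 0.0],          # contemporaneous structure (assumed)
              [0.5, 1.0]])
B = np.array([[0.7, 0.1],          # structural lag coefficients (assumed)
              [0.2, 0.6]])

C = np.linalg.inv(A) @ B           # reduced-form lag matrix, C = A^{-1} B
print("C =\n", C.round(3))

rng = np.random.default_rng(0)
U = rng.standard_normal((10_000, 2))   # independent structural shocks, one row per period
V = U @ np.linalg.inv(A).T             # reduced-form errors, V_t = A^{-1} U_t

# Even though the structural shocks are independent, the reduced-form errors are not.
print("corr(V1, V2) =", round(float(np.corrcoef(V.T)[0, 1]), 2))
```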
When VARs are used for pure forecasting, these observations are not problematic. Indeed, in its simple form, VARs are frequently used for this purpose. As we will be discussing later in this paper, more elaborate variants with time-varying parameters, stochastic volatility, and a number of other extensions have also been developed and are used frequently by central banks.
But if we want to tell stories of the type “This is what our estimates tell us about the consequences of an unexpected increase in the policy interest rate” (assuming that this variable is an element of Y), then the VAR estimates are not sufficient. We need to reverse engineer the VAR to identify the underlying structural equations. This identification problem has spawned a large literature.
Giving VARs a Structural Interpretation
To give VARs a structural interpretation, we need to impose restrictions on the model so that we can recover the elements of A and B from the estimates of C, and ascertain the nature of the Us from the estimated Vs. A very large body of literature has evolved to do just that.
The method originally proposed by Sims was to assume a temporal ordering through which the shocks had an impact on the endogenous variables. For example, an unanticipated change in the short-term monetary policy interest rate would affect a longer-term interest rate before aggregate demand or any other variable in the model. Similarly, an unexplained shock to aggregate demand would affect output before it would change inflation. If such a temporal ordering were imposed on the variables in the economy, it would be possible to recover the original structural parameters of the model, as well as the time pattern of the structural shocks. That identification would then make it possible to trace the effects of any shock in the system on all the endogenous variables; that is, to tell stories of the type mentioned in the previous subsection.
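For concreteness, the sketch below shows (on simulated data and with an assumed two-variable ordering) how such a recursive, Cholesky-based identification recovers mutually uncorrelated structural shocks from correlated reduced-form residuals.

```python
# A sketch of recursive ("Cholesky") identification on simulated residuals.
# The ordering and the true impact matrix below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
A0 = np.array([[1.0, 0.0],                # assumed contemporaneous structure
               [0.8, 1.0]])
U = rng.standard_normal((5000, 2))        # true structural shocks (unit variance)
V = U @ np.linalg.inv(A0).T               # pretend these are estimated VAR residuals

Sigma = np.cov(V.T)                       # residual covariance matrix
P = np.linalg.cholesky(Sigma)             # lower-triangular impact matrix
U_hat = V @ np.linalg.inv(P).T            # recovered structural shocks

print("impact matrix P:\n", P.round(2))   # close to inv(A0) in this simulation
print("corr of recovered shocks:\n", np.corrcoef(U_hat.T).round(2))
```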
Other identification schemes followed: some based on restrictions on the contemporaneous interactions between the endogenous variables in the structural model (the A matrix), and others on the sign pattern of the impulse response functions. Yet others achieved identification through the theoretical presumption that a particular shock would have no long-run effect on a specific variable, through information coming from high-frequency data, or through heteroscedasticity patterns in the data.
The goals of these identification strategies have always been the same: to be able to conduct causal analysis and to understand and, importantly, be able to communicate to stakeholders how monetary policy affects the economy. Here, the recent and fast-developing field of causal ML holds great promise. Causal ML is an umbrella term for ML methods that formalize the data-generation process as a structural causal model (SCM). This perspective makes it possible to reason about the effects of changes to this process (interventions) and about what would have happened in hindsight (counterfactuals).
Recent surveys categorize work in causal ML into five groups according to the problems addressed: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, and (5) causal reinforcement learning. They also review data-modality-specific applications in computer vision, natural language processing, and graph representation learning, as well as causal benchmarks and the open problems of this nascent field.
Incorporating Large Data Sets in VARs
The number of variables that can be directly included in a VAR model is frequently constrained by the available number of observations. An n-variable model that includes four lags of each variable will have 4n+1 free parameters to estimate per equation. If the frequency of observation is quarterly, then a four-variable system will require more than 40 years of observations if the rule-of-thumb requirement of 10 observations per parameter is respected. For many economies, this is very demanding.
Two types of solutions to this problem have been proposed in the literature. One is to reduce the effective number of parameters by soft constraints on the lag patterns associated with each variable in the system. This can be achieved by applying the so-called Minnesota priors during the estimation process.
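The rough sketch below conveys the flavor of Minnesota-style shrinkage; the formula and hyperparameter values are illustrative assumptions, and actual implementations differ in their details.

```python
# A rough sketch of Minnesota-style priors (illustrative formula and hyperparameters):
# prior means of 1 on each variable's own first lag and 0 elsewhere, with prior
# standard deviations that shrink toward zero as the lag length grows.
import numpy as np

def minnesota_prior(n_vars, n_lags, sigma, lam=0.2, theta=0.5, decay=1.0):
    """Return prior means and standard deviations for the VAR lag coefficients.
    sigma: residual scale of each variable (e.g., from univariate autoregressions)."""
    mean = np.zeros((n_vars, n_vars, n_lags))
    sd = np.zeros((n_vars, n_vars, n_lags))
    for i in range(n_vars):                # equation
        mean[i, i, 0] = 1.0                # own first lag centred on a random walk
        for j in range(n_vars):            # regressor
            for l in range(n_lags):        # lag
                tight = lam / (l + 1) ** decay
                sd[i, j, l] = tight if i == j else tight * theta * sigma[i] / sigma[j]
    return mean, sd

mean, sd = minnesota_prior(n_vars=3, n_lags=4, sigma=np.array([1.0, 0.5, 2.0]))
print(sd[0, :, :].round(3))                # priors tighten at longer lags
```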
Another solution involves reducing the number of variables by combining them into factors and using these factors in the model instead of the underlying variables themselves. The resulting VARs have been given the name “factor-augmented VARs” (FAVARs). For example, to study the propagation mechanism of monetary policy, it might be desirable to include a number of financial variables to capture different channels of transmission. But doing so would render the VAR too large to estimate. Instead, analysts combine the financial variables into a financial factor and use that factor in the VAR. Usually, the factor is a linear combination of the underlying variables, often based on principal component analysis or dynamic factor models.
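The sketch below illustrates the first step on simulated data: collapsing a panel of indicators into a single factor with principal components, which would then enter the VAR as an additional variable. The data and settings are assumptions.

```python
# A minimal sketch (simulated data): extract one "financial factor" from many indicators.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
T, n_series = 200, 30
common = rng.standard_normal(T)                              # latent financial factor
loadings = rng.uniform(0.5, 1.5, n_series)
panel = np.outer(common, loadings) + 0.5 * rng.standard_normal((T, n_series))

panel_std = (panel - panel.mean(0)) / panel.std(0)           # standardize each series
factor = PCA(n_components=1).fit_transform(panel_std)[:, 0]  # first principal component

print("correlation with latent factor:", round(abs(float(np.corrcoef(factor, common)[0, 1])), 2))
# The extracted factor would then be appended to the VAR's variable list (a FAVAR).
```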
Another example of reducing the number of variables comes up in the context of open-economy interactions with the rest of the world where, instead of letting all variables for all countries enter every equation, foreign variables are combined into foreign factors representing “world” prices, “world” interest rates, and “world” output, for example. Mumtaz and Surico (2009), for instance, use data from a panel of 17 industrialized countries to investigate the international transmission mechanism.
How Well Have the VARs Served in “Solving” the Central Banks’ Problem?
Forecasting and Risk Analysis
The standard VARs and their extended variants, such as time-varying parameter VARs, VARs with stochastic volatility, and FAVARs, have been extensively used in forecasting. Their forecasting performance has been found to be good, and most central banks use them. One advantage of VARs in forecasting is their system nature: every endogenous variable can be forecast with the same model. This is in contrast with ML models, which, to the best of our knowledge, typically predict one variable at a time. VARs have been used successfully for point forecasts as well as density forecasts (). Therefore, we can say without qualification that VARs have been an important part of the forecasting toolkit of central banks.
Recently, monetary policymakers have also become interested in tail risks and in taking potential tail outcomes into account when forecasting. Building on the finance literature on assessing tail risks in asset prices and returns, this line of research has emphasized the importance of tail risks in macroeconomic policymaking. Although this literature started with the risk of significant declines in GDP growth and has relied on quantile regression methods to estimate tail risks (), its applications have widened to other important variables such as house prices and capital flows. Recently, this framework has been extended to a VAR context, and Bayesian VARs with stochastic volatility have been shown to capture tail risks in macroeconomic forecast distributions and outcomes. Bayesian VARs, which have been commonly used for point and density forecasting, are able to capture more time variation in downside risk than in upside risk for output growth, consistent with the quantile regression results of earlier research. Moreover, Bayesian VARs come with additional gains in the form of standard point and density forecasts.
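As a stylized illustration of the quantile-regression approach that underlies this literature, the following sketch (simulated data, assumed variable names) estimates the lower tail of future “growth” conditional on a financial-conditions indicator.

```python
# A growth-at-risk style sketch on simulated data (variable names are assumptions).
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(5)
T = 300
fci = rng.standard_normal(T)                                  # financial conditions index
growth = 2.0 - 0.5 * fci + (1.0 + 0.8 * np.maximum(fci, 0)) * rng.standard_normal(T)

X = sm.add_constant(fci)
q05 = QuantReg(growth, X).fit(q=0.05)                         # lower-tail regression
q50 = QuantReg(growth, X).fit(q=0.50)                         # median for comparison
print("5th percentile slope:", round(float(q05.params[1]), 2))
print("median slope:        ", round(float(q50.params[1]), 2))
```

In this simulated example the lower-tail slope is steeper than the median slope, mirroring the finding that downside risks to growth vary more with financial conditions than the central tendency does.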
The VAR literature has also been responsive to the most recent challenge posed by the post-COVID-19 pandemic sample. With observations unlike anything seen in the past century, are there implications for VAR forecasts? Recent work has shown that VAR forecasts can indeed handle this massive anomaly in the data.
Structural and Scenario Analysis
Traditional econometric and estimation work in central banks is often concerned with questions beyond simple out-of-sample forecasting. This is an important difference between traditional ML applications and econometrics. In many (arguably most) cases, central banks are interested in “average treatment effects” or other causal or structural relationships. As we have already noted, the challenge for ML models is to be able to make causal inferences in contexts important for monetary policy decisions.
Structural VARs have now been used to identify a large number of structural shocks and their causal effects, including monetary policy shocks, demand shocks, commodity price shocks, and oil market shocks. Moreover, structural VARs are used in distinguishing between different theoretical structures.
Structural analysis also includes scenario analysis, which central banks often conduct. For example, a central bank may be interested in the economic consequences of alternative paths for the policy interest rate: holding it constant for an extended period before raising it to a higher level, versus starting to raise it immediately but gradually until it reaches the new level. VARs can be and have been used to carry out such scenario analysis, or conditional forecasting. Early applications of conditional forecasting date back to the 1980s. One influential framework, for example, computes and evaluates forecasts of endogenous variables within an SVAR conditional on hypothetical paths of monetary policy, and it provides a good example of VARs moving along the Pagan frontier as discussed in the introduction: starting from a purely data-driven objective, they also become more consistent with theory. That framework is grounded in a theoretical model establishing “when linear projections are reliable even though policy switches from one regime to another.”
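The deliberately naive sketch below illustrates only the mechanics of such a conditional forecast under assumed coefficients: a reduced-form VAR(1) is iterated forward while the policy rate is pinned to a hypothetical path. The frameworks cited above go further by also disciplining the shocks implied by the scenario.

```python
# A naive conditional ("scenario") forecast sketch with an assumed VAR(1).
import numpy as np

C = np.array([[0.8, -0.1],        # assumed coefficients; variables: [output gap, policy rate]
              [0.3,  0.7]])
y = np.array([0.5, 1.0])          # latest observed values (assumed)
rate_path = [1.0, 1.5, 2.0, 2.0]  # hypothetical policy-rate scenario

for h, r in enumerate(rate_path, start=1):
    y = C @ y                     # unconditional one-step projection
    y[1] = r                      # impose the scenario value for the policy rate
    print(f"h={h}: output gap = {y[0]:.2f}, policy rate = {y[1]:.2f}")
```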
Structural VARs are also used to understand macroeconomic history in terms of the drivers of business cycles. With historical decompositions, one can see which structural economic shocks have driven the observed paths of macroeconomic variables over time.
It is legitimate to ask why other causal inference techniques are not considered here. These techniques often refer to quasi-experimental research designs that try to uncover causality without using randomized controlled trials (RCTs). Such techniques, including instrumental variables (IV), regression discontinuities, event studies, and difference-in-differences, along with the gold standard of RCTs, are powerful causal tools that are used more and more in the macro literature as well. However, unlike VARs, they cannot be used in the routine forecasting tasks that central banks perform.
In addition, at least the earlier, more canonical causal inference models tended to establish a causal relationship that was not dynamic. Recent studies address this issue. However, understanding the channels through which the causality occurs is often difficult to accomplish. SVARs are good at both capturing the dynamic causal effects and distinguishing between different channels that transmit the original shock.
Communication
Over the past few decades, transparency, accountability, and the need for clear and timely communication have come to be widely recognized as essential components of successful central banking. In part, this development is the result of demands by the general public for greater transparency and accountability in government institutions.
In part, more active communication by central banks also stems from their attempts to manage and guide market expectations and thereby increase the effectiveness of monetary policy. For example, stating in a monetary policy briefing that “the short-term policy interest rate will be maintained at a low level for a considerable period” can be part of a strategy to influence longer-term market interest rates that typically have a greater effect on consumption and investment decisions.
A component of a central bank’s communication consists of explaining why and on the basis of what information a policy decision has been made. This is typically followed by an explanation of how the policy change is expected to influence the economy and lead to desirable outcomes.
For the communication to be credible, therefore, the policy statement must not only make it clear which variables the central bank is primarily focused on, but it must also explain the economic mechanisms that the central bank relies on to achieve its objectives. All this requires having a clear causal structure in mind. This structure can be a calibrated theoretical model that reflects the central bank’s understanding of how the economy works, or it can be a more data-driven model, such as a VAR, which has been given a structural interpretation using methods described thus far in this paper. But crucially, the economic reasoning behind the central bank’s policy decision must be clear and transparent for it to be credible in the eyes of the public.
Can Machine Learning Do the Same?
In Section 2, we characterized the central banks’ problem as one of summarizing and analyzing data, forecasting key macroeconomic variables, conducting risk analysis and balance of uncertainties, carrying out structural/causal as well as scenario analysis, and finally communicating and justifying policy decisions vis-a-vis the public. Then, in Sections 4 and 5, we argued how VAR methodology has evolved and become a workhorse to assist central banks in responding to these tasks. In this section, we ask how ML may need to evolve to become a complementary or alternative methodology guiding monetary policy decisions and communication.
Forecasting and Risk Analysis
As we have argued here, ML in its current form is essentially about prediction. ML’s advantage in short-term forecasting is significant, although it is subject to similar issues as classical econometric techniques; for example, a structural break in the data caused by an event like the COVID-19 pandemic is likely to have implications for both the training sample and the resulting forecasts.
Perhaps forecasts from more traditional models can be compared with those from ML algorithms, and, to the extent that the latter are superior, it may be possible to tweak the traditional models so they “have a story to tell.”
Risk analysis has become commonplace in traditional econometrics with the use of quantile regressions and density forecasts. We are not aware of similar applications in the ML sphere, but it would seem that carrying out the equivalent of quantile regressions should be possible with extant ML tools. Estimating and forecasting entire densities appear to us to be more of a challenge, as they require some assumptions about the underlying data-generating process.
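One route, sketched below under illustrative assumptions, is to train standard ML learners with a quantile (pinball) loss, which delivers conditional quantiles in much the same spirit as quantile regression.

```python
# A small sketch of an ML analogue of quantile regression: gradient boosting
# with the quantile (pinball) loss. Data and settings are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(6)
X = rng.standard_normal((400, 3))
y = X[:, 0] - 0.5 * X[:, 1] + (1 + np.abs(X[:, 2])) * rng.standard_normal(400)

lower = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)   # 5th percentile
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)   # 95th percentile

x_new = np.zeros((1, 3))
print("predicted 5%-95% band at x=0:",
      round(float(lower.predict(x_new)[0]), 2), "to", round(float(upper.predict(x_new)[0]), 2))
```

Recovering a full predictive density, rather than a set of quantiles, would still require distributional assumptions of the kind discussed above.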
Structural/Scenario Analysis and Communication
We have argued that clear and timely communication about its policy objectives, analytical framework, and potential risks is an essential component of a successful monetary policy framework. Having access to a well-specified, economywide model would clearly be desirable in this context. Of course, it would have to be estimated (or calibrated) on data for the economy in question to be useful. The large-scale econometric models that were commonly used in the 1960s and 1970s were of this type, although critics would question whether they were well specified. As Sims (1980) says, they were built on “incredible identifying restrictions.”
The popularity of the structural VARs developed following Sims’s seminal paper was due to their reliance on more palatable (in the views of those who developed these models) identifying restrictions that allowed the user to give causal interpretations to the estimated relationships. The importance of causal inference in economics, and the attention that some strategies for identifying causal effects have received in the new ML/causal inference literature, has been discussed at length elsewhere. However, to our knowledge, current ML algorithms are still in their infancy in terms of their penetration into the macro literature and their ability to provide users with a similar structural interpretation of the relationships between the input and output variables in the data, or indeed between the output variables themselves. These algorithms are often characterized as “black boxes.” However, there is a growing body of literature on interpretable ML algorithms (). One recent macro example is Tiffin (2019), who uses the Causal Forest algorithm to show how one can estimate the average impact of a crisis. This avenue of identification has been gaining momentum in other fields of economics, and we see it as an exciting opportunity that will eventually reach macro issues. Moreover, we see potential for this line of identification to be married with structural VAR identification. In this sense, we expect such work to become useful for central banking in the future, especially when combined with the big data sets that central banks are using more and more actively ().
The literature on interpretable machine learning (IML) has boomed in recent years, although the field has its roots in regression modeling and rule-based ML going back to the 1960s. Many new IML methods have recently been proposed, many of them model-agnostic, alongside interpretation techniques specific to deep learning and tree-based ensembles. IML methods typically do one of two things: (1) analyze model components and the sensitivity of results to input perturbations, or (2) analyze local or global surrogate approximations of the ML model. A number of important challenges remain, however. Dealing with dependent features, causal interpretation, and uncertainty estimation are a few of the issues the literature cites; perhaps the most important challenge is finding a rigorous and widely agreed-upon definition of interpretability.
It has also been shown that predictions from ML models can be decomposed into drivers. Using a Shapley value-based approach to interpretability, it has been argued that “the black box problem resulting from both the opaqueness of nonlinear models and the high dimensionality of the input space” can be addressed. Mapping the game-theoretical concept of Shapley values into forecasting enables researchers to decompose a prediction into linear contributions of the individual variables that enter the prediction specification (). However, this approach is still confined to the context of predictability. It would perhaps be a major breakthrough if the Shapley value approach, or something similar, could be taken into a structural context akin to historical decompositions in the SVAR setting, where the drivers are the fundamental shocks.
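To make the Shapley idea concrete, the toy sketch below computes exact Shapley contributions for a single prediction from a fitted model, replacing “missing” features with their sample means. It illustrates the general concept under assumed data and settings, not the method of the papers cited, which rely on dedicated approximations.

```python
# A toy, exact Shapley decomposition of one prediction (illustrative assumptions only):
# each feature's contribution is its average marginal effect over all orderings.
import numpy as np
from itertools import permutations
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
X = rng.standard_normal((300, 3))
y = 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] ** 2 + 0.1 * rng.standard_normal(300)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

x_star, baseline = X[0], X.mean(axis=0)

def value(coalition):
    """Prediction with features outside the coalition replaced by their sample means."""
    z = baseline.copy()
    z[list(coalition)] = x_star[list(coalition)]
    return model.predict(z.reshape(1, -1))[0]

phi = np.zeros(3)
for order in permutations(range(3)):
    members = []
    for j in order:
        before = value(members)
        members.append(j)
        phi[j] += (value(members) - before) / 6   # 3! = 6 orderings

print("prediction minus baseline:   ", round(value([0, 1, 2]) - value([]), 2))
print("sum of Shapley contributions:", round(float(phi.sum()), 2))   # equal by construction
```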
It may be possible for algorithms to be constrained to produce forecasts that are considered consistent with generally accepted economic theory. Here, VARs identified through sign restrictions on the impulse response functions may provide inspiration, although the “generally accepted economic theory” would have to be specifically tailored to the ML environment. An example might be: “if the algorithm produces a forecast that variable y1 will increase, then it must simultaneously forecast that variable y2 will increase.”
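One simple way to impose such a restriction ex post, sketched below with assumed numbers, mirrors the acceptance sampling used with sign-restricted VARs: candidate joint forecast draws that violate the restriction are simply discarded.

```python
# A stylized sketch of enforcing the hypothetical sign restriction from the text:
# keep only forecast draws in which y1 and y2 move in the same direction.
# The point forecasts and covariance matrix below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(8)
mean = np.array([0.4, 0.1])                          # assumed point forecasts for (y1, y2)
cov = np.array([[0.25, 0.05], [0.05, 0.25]])         # assumed forecast uncertainty
draws = rng.multivariate_normal(mean, cov, size=10_000)

consistent = draws[np.sign(draws[:, 0]) == np.sign(draws[:, 1])]
print(f"kept {len(consistent)} of {len(draws)} draws")
print("restricted mean forecast:", consistent.mean(axis=0).round(2))
```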
Other constraints that might be introduced would be related to the long-run restrictions in VAR models. The ML algorithm would be instructed to ensure that a certain class of variables would have no effect on the long-run forecast of another class of variables. A possible combination of ML and FAVAR models would ask the ML algorithm to produce the factors that go into the FAVAR model, doing away with the linearity assumption of conventional principal component methods.
Conclusions
As others have noted, it is not easy to make predictions about the impact of ML on economics in general. It is perhaps even harder to be specific about its impact on central banking, not least because the changes are still very much underway.
The thesis in this paper is that the evolution of VAR methodology has led to its gradual adoption by central banks as a useful analytical tool. VAR methodology started at the lower-right (heavily data-focused) corner of the Pagan frontier (Figure 1), but it has gradually moved upward to incorporate theoretical elements, thus allowing central banks to use it to conduct structural/causal analysis, including scenario analysis. Such a focus, we have argued, is necessary for the central bank’s communication with the public about policy choices and decisions to be credible.
If our thesis is correct, then for ML models to be widely adopted into central banks’ monetary policy toolboxes, their causal structure and interpretation need to become more transparent. We hope that our analysis and modest suggestions in this paper will lead to a conversation between traditional econometricians and ML experts on how to combine the best elements of each approach to make more informed policy decisions and achieve a better understanding of these decisions by the general public.
In this study, we did not present a specific application, partly because a single ML model that can address all the problems of a central bank has not yet been developed. However, future studies could apply the growing set of causal ML methods in a central banking context. We remain cautiously optimistic.
References
- Abadie, A., & Cattaneo, M. D. 2018. Econometric methods for program evaluation. Annual Review of Economics, 10(1), 465–503. https://doi.org/10.1146/annurev-economics http://ideas.repec.org/a/anr/reveco/v10y2018p465-503.html
- Adrian, T., Boyarchenko, N., & Giannone, D.2019. Vulnerable growth. American Economic Review, 109(4), 1263–1289. http://ideas.repec.org/a/aea/aecrev/v109y2019i4p1263-89.html
- Angrist, J. D., & Pischke, J.-S.2009. Mostly harmless econometrics: An empiricist’s companion. Economics Books 8769. Princeton University Press. http://ideas.repec.org/b/pup/pbooks/8769.html
- Aruoba, S. B., Mlikota, M., Schorfheide, F., & Villalvazo, S.2021. SVARs with occasionally-binding constraints. NBER Working Papers 28571. National Bureau of Economic Research, Inc. (NBER). http://ideas.repec.org/p/nbr/nberwo/28571.html
- Athey, S. 2017. Beyond prediction: Using big data for policy problems. Science, 355(6324), 483–485. https://doi.org/10.1126/science.aal4321 http://science.sciencemag.org/content/355/6324/483/tab-article-info
- Athey, S.2018. The impact of machine learning on economics. In The economics of artificial intelligence: An agenda (pp. 507–547). National Bureau of Economic Research, Inc. (NBER), June. http://ideas.repec.org/h/nbr/nberch/14009.html
- Athey, S., & Imbens, G.2019. Machine learning methods economists should know about. Papers 1903.10075. arXiv.org, March. http://ideas.repec.org/p/arx/papers/1903.10075.html
- Babii, A., Ghysels, E., & Striaukas, J.2020. Machine learning time series regressions with an application to nowcasting. arXiv: 2005.14057 [econ.EM].
- Bernanke, B. S.1986. Alternative explanations of the money-income correlation. Carnegie-Rochester Conference Series on Public Policy, 25(1), 49–99. http://ideas.repec.org/a/eee/crcspp/v25y1986ip49-99.html
- Bernanke, B. S., Boivin, J., & Eliasz, P. 2005. Measuring the effects of monetary policy: A factor-augmented vector autoregressive (FAVAR) approach. Quarterly Journal of Economics, 120(1), 387–422.
- Blinder, A. S., Ehrmann, M., Fratzscher, M., De Haan, J., & Jansen, D.-J. 2008. Central bank communication and monetary policy: A survey of theory and evidence. Journal of Economic Literature, 46(4), 910–945. http://ideas.repec.org/a/aea/jeclit/v46y2008i4p910-45.html
- Bluwstein, K., Buckmann, M., Joseph, A., Kang, M., Kapadia, S., & Simsek, Ö.2020. Credit growth, the yield curve and financial crisis prediction: Evidence from a machine learning approach. Bank of England Working Papers 848. Bank of England, January. https://ideas.repec.org/p/boe/boeewp/0848.html
- Boudette, N. E.2017. Tesla’s self-driving system cleared in deadly crash. The New York Times. http://www.nytimes.com/2017/01/19/business/tesla-model-s-autopilot-fatal-crash.html
- Carriero, A., Clark, T. E., & Marcellino, M. 2020. Capturing macroeconomic tail risks with Bayesian vector autoregressions. Working Papers 202002R. Federal Reserve Bank of Cleveland, January. https://doi.org/10.26509/frbc-wp-202002 http://ideas.repec.org/p/fip/fedcwq/87375.html
- Carriero, A., Clark, T. E., & Marcellino, M. 2020. Nowcasting tail risks to economic activity with many indicators. Working Papers 202013R2. Federal Reserve Bank of Cleveland, May. https://doi.org/10.26509/frbc-wp-202013 http://ideas.repec.org/p/fip/fedcwq/87955.html
- Cengiz, D., Dube, A., Lindner, A. S., & Zentler-Munro, D. 2021. Seeing beyond the trees: Using machine learning to estimate the impact of minimum wages on labor market outcomes. Working Paper Series 28399. National Bureau of Economic Research (NBER), January. https://doi.org/10.3386/w28399 http://www.nber.org/papers/w28399
- Chakraborty, C., & Joseph, A.2017. Machine learning at central banks. Bank of England Working Papers 674. Bank of England, September. http://ideas.repec.org/p/boe/boeewp/0674.html
- Chang, M., Chen, X., & Schorfheide, F.2021. Heterogeneity and aggregate fluctuations. University of Pennsylvania. Working Paper. http://web.sas.upenn.edu/schorf/files/2021/05/EvalHAmodels_v6_pub.pdf
- Chavleishvili, S., & Manganelli, S.2019. Forecasting and stress testing with quantile vector autoregression. Working Paper Series 2330. European Central Bank, November. http://ideas.repec.org/p/ecb/ecbwps/20192330.html
- Christiano, L. J., Eichenbaum, M., & Evans, C. L. 1999. Monetary policy shocks: What have we learned and to what end? In J. B. Taylor & M. Woodford (Eds.), Handbook of macroeconomics (Vol. 1, Ch. 2, pp. 65–148). Elsevier. http://ideas.repec.org/h/eee/macchp/1-02.html
- Dees, S., Di Mauro, F., Pesaran, M. H., & Smith, L. V.2007. Exploring the international linkages of the euro area: A global VAR analysis. Journal of Applied Econometrics, 22(1), 1–38. http://www.jstor.org/stable/25146503
- Del Negro, M., & Schorfheide, F.2006. How good is what you’ve got? DSGE-VAR as a toolkit for evaluating DSGE models. Economic Review, 91(Q 2), 21–37. http://ideas.repec.org/a/fip/fedaer/y2006iq2p21-37nv.91no.2.html
- Doan, T., Litterman, R. B., & Sims, C. A.1986. Forecasting and conditional projection using realistic prior distribution. Staff Report 93. Federal Reserve Bank of Minneapolis. http://ideas.repec.org/p/fip/fedmsr/93.html
- Doerr, S., Gambacorta, L., & Garralda, J. M. S.2021. Big data and machine learning in central banking. BIS Working Papers 930. Bank for International Settlements (BIS), March. http://ideas.repec.org/p/bis/biswps/930.html
- Fouliard, J., Howell, M., & Rey, H. 2020. Answering the queen: Machine learning and financial crises. Working Paper Series 28302. National Bureau of Economic Research (NBER), December. https://doi.org/10.3386/w28302 http://www.nber.org/papers/w28302
- Fuchs-Schündeln, N., & Hassan, T. A.2016. Natural experiments in macroeconomics. In J. B.Taylor & H.Uhlig (Eds.), Handbook of macroeconomics (vol. 2, ch. 0, pp. 923–1012). Elsevier. http://ideas.repec.org/h/eee/macchp/v2-923.html
- Gambacorta, L., Huang, Y., Qiu, H., & Wang, J.2019. How do machine learning and non-traditional data affect credit scoring? New evidence from a Chinese fintech firm. BIS Working Papers 834. Bank for International Settlements (BIS), December. http://ideas.repec.org/p/bis/biswps/834.html
- Gelos, R. G., Gornicka, L., Koepke, R., Sahay, R., & Sgherri, S.2019. Capital flows at risk: Taming the ebbs and flows. IMF Working Papers 19/279. International Monetary Fund (IMF), December. http://ideas.repec.org/p/imf/imfwpa/19-279.html
- Giannone, D., Lenza, M., & Primiceri, G. E.2015. Prior selection for vector autoregressions. Review of Economics and Statistics, 97(2), 436–451. http://ideas.repec.org/a/tpr/restat/v97y2015i2p436-451.html
- Jordà, Ò. 2005. Estimation and inference of impulse responses by local projections. American Economic Review, 95(1), 161–182. https://doi.org/10.1257/0002828053828518 http://www.aeaweb.org/articles?id=10.1257/0002828053828518
- Joseph, A.2019. Shapley regressions: A framework for statistical inference on machine learning models. Technical Report, King’s Business School, King’s College London. March.
- Joseph, A., Kalamara, E., Kapetanios, G., & Potjagailo, G.2021. Forecasting UK inflation bottom up. Bank of England Working Paper 915. Bank of England, March. http://ideas.repec.org/p/boe/boeewp/0915.html
- Karagedikli, O., Vahey, S. P., & Wakerly, E. C.2019. Improved methods for combining point forecasts for an asymmetrically distributed variable. CAMA Working Paper 2019-15. Centre for Applied Macroeconomic Analysis, Crawford School of Public Policy, Australian National University, February. http://ideas.repec.org/p/een/camaaa/2019-15.html
- Kelleher, J. D.2019. Deep learning. MIT Press.
- Kilian, L., & Lutkepohl, H.2018. Structural vector autoregressive analysis. Cambridge Books. Cambridge University Press.
- Kuhn, M., & Johnson, K.2013. Applied predictive modeling. Springer. http://www.amazon.com/Applied-Predictive-Modeling-Max-Kuhn/dp/1461468485/
- Leeper, E. M., & Zha, T.2003. Modest policy interventions. Journal of Monetary Economics, 50(8), 1673–1700. http://ideas.repec.org/a/eee/moneco/v50y2003i8p1673-1700.html
- Loaiza-Maya, R., & Smith, M. S.2020. Real-time macroeconomic forecasting with a heteroscedastic inversion copula. Journal of Business & Economic Statistics, 38(2), 470–486. http://ideas.repec.org/a/taf/jnlbes/v38y2020i2p470-486.html
- Masini, R. P., Medeiros, M. C., & Mendes, E. F.2021. Machine learning advances for time series forecasting. arXiv: 2012.12802 [econ.EM].
- Molnar, C., Casalicchio, G., & Bischl, B.2020. Interpretable machine learning—a brief history, state-of-the-art and challenges. Technical report, Cornell University. http://arxiv.org/abs/2010.09337
- Mullainathan, S., & Spiess, J.2017. Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87–106. http://ideas.repec.org/a/aea/jecper/v31y2017i2p87-106.html
- Mumtaz, H., & Surico, P.2009. The transmission of international shocks: A factor-augmented VAR approach. Journal of Money, Credit and Banking, 41(s1), 71–100. http://ideas.repec.org/a/wly/jmoncb/v41y2009is1p71-100.html
- Nakamura, E., & Steinsson, J.2018. High frequency identification of monetary non-neutrality: The information effect. NBER Working Papers 19260. National Bureau of Economic Research, Inc. (NBER), January. http://ideas.repec.org/p/nbr/nberwo/19260.html
- Pearl, J.2018. The seven tools of causal inference with reflections on machine learning. Communications of Association for Computing Machinery, 1(1), 1–6. http://ftp.cs.ucla.edu/pub/stat_ser/r481.pdf
- Pearl, J., & Mackenzie, D.2018. The book of why: The new science of cause and effect. Basic Books.
- Plagborg-Møller, M., & Wolf, C. K. Forthcoming. Local projections and VARs estimate the same impulse responses. Econometrica, https://scholar.princeton.edu/sites/default/files/mikkelpm/files/lp_var.pdf
- Primiceri, G., & Tambalotti, A. 2020. Macroeconomic forecasting in the time of COVID-19. Working Paper. Federal Reserve Bank of New York.
- Ramey, V. A. 2016. Macroeconomic shocks and their propagation. In J. B. Taylor & H. Uhlig (Eds.), Handbook of macroeconomics (Vol. 2, pp. 71–162). Elsevier. https://doi.org/10.1016/bs.hesmac.2016.03 http://ideas.repec.org/h/eee/macchp/v2-71.html
- Sims, C. A.1980. Macroeconomics and reality. Econometrica, 48(1), 1–48. http://ideas.repec.org/a/ecm/emetrp/v48y1980i1p1-48.html
- Smith, M. S., & Vahey, S. P.2016. Asymmetric forecast densities for U.S. macroeconomic variables from a Gaussian copula model of cross-sectional and serial dependence. Journal of Business & Economic Statistics, 34(3), 416–434. http://ideas.repec.org/a/taf/jnlbes/v34y2016i3p416-434.html
- Stock, J., & Watson, M.2001. Vector autoregressions. Journal of Economic Perspectives, 15(4), 101–115. http://www.aeaweb.org/articles?id=10.1257/jep.15.4.101
- Stock, J., & Watson, M.2016. Dynamic factor models, factor-augmented vector autoregressions, and structural vector autoregressions in macroeconomics. In J. B.Taylor & H.Uhlig (Eds.), Handbook of macroeconomics (vol. 2, pp. 415–525). Elsevier. http://ideas.repec.org/h/eee/macchp/v2-415.html
- Tiffin, A. J.2019. Machine learning and causality: The impact of financial crises on growth. IMF Working Papers 2019/228. International Monetary Fund (IMF), November. http://ideas.repec.org/p/imf/imfwpa/2019-228.html
- Waggoner, D. F., & Zha, T. 1999. Conditional forecasts in dynamic multivariate models. Review of Economics and Statistics, 81(4), 639–651. http://ideas.repec.org/a/tpr/restat/v81y1999i4p639-651.html