The Energy and Gross Domestic Product Causality Nexus in Latin America 1900-2010

A better understanding of the relationship between energy consumption and economic growth is important for the less developed regions of the world, such as Africa or Latin America, whose future might be compromised by the imposition of the transition to a lower carbon economy. Studies on the energy-GDP nexus for Latin America have been few and limited to short periods. We fill this gap by searching for causal paths between energy and GDP for 20 Latin American countries using a newly compiled dataset spanning the 20th century. Our main identification strategy is based on super exogeneity, which we complement with Granger tests following Toda and Yamamoto and enrich by controlling for structural breaks and the False Discovery Rate. The results highlight the absence of a homogeneous relation between energy and GDP across highly heterogeneous spatial and temporal dimensions, and thus the need to enhance our theoretical understanding of this relation. The policy implication is that designing and implementing energy policies based on a single methodological approach and on aggregated results should be avoided.


INTRODUCTION
Studies on the causal relation between energy consumption and GDP have yielded mixed results, despite the correlation between per capita energy consumption and GDP (Figure 1), its relation to economic development (Beaudreau, 2005; Cleveland et al., 1984; Weissenbacher, 2009; White, 1943), and the theoretical arguments that expansion of energy use underlies GDP growth. Understanding the relationship between energy use and economic growth has become critical to resolving future world-scale challenges, particularly because the current fossil-reliant global energy system accounts for roughly 60% of total global greenhouse gas emissions. A better understanding of this relationship is even more important for the less developed regions of the world, whose future might be compromised by the imposition of the transition to a lower carbon economy. Despite the large reserves of oil and coal held in the region, Latin America is (and has always been) a low energy consuming region, with the second smallest primary energy consumption per capita in the world after Africa. Thus, a better understanding of the relationship between energy consumption and economic growth in Latin America is key.
Meta-analyses of this literature have identified an array of paths for future research. Based on 51 studies, Menegaki (2014) argues that future work should focus on developing countries, more advanced econometrics, and multivariate analysis. Moreover, based on 158 studies, Kalimeris et al. (2014) call for using energy prices and elasticities, quality adjusted energy measures, and grouping analysis by countries with similar patterns of energy consumption. Lastly, based on 72 studies, Bruns et al. (2014) emphasize the need for longer time series, quality adjusted energy measures, and better theory.
This paper contributes to the energy-GDP nexus literature in four ways. First, we use a new dataset that spans the twentieth century and covers 20 Latin American countries. The existing studies on this region (and the majority of the literature) rely on relatively short time periods, as they usually start after 1970 given data availability and sometimes contain no more than 30 observations. This raises important concerns about sampling variability (Stern and Enflo, 2013) and can pose significant problems for time series estimation due to the asymptotic nature of all relevant statistics (Smyth and Narayan, 2015). Although these problems have been addressed with panel data (Apergis and Payne, 2009; Herrerias et al., 2013), short time-spans inhibit the modeling of parameter heterogeneity, which has proven important in practice (Baltagi and Griffin, 1997). To our knowledge, only Stern and Enflo (2013), Csereklyei et al. (2016), and Vaona (2012) use time series comparable to ours, but for other nations.
Second, we study a region that has been historically underrepresented in the energy-GDP nexus literature. Of the 101 papers surveyed by Payne (2010), only 11 include at least one Latin-American country. Similarly, of the 61 papers surveyed by Ozturk (2010), just 4 include at least one Latin-American country. Neither survey contains a paper that includes the entire region.
Third, we follow the identification strategy in Rodriguez-Caballero and Ventosa-Santaularia (2016), which controls for structural breaks and searches for evidence of super exogeneity (Engle et al., 1983). The authors argue that evidence of super exogeneity is valuable to establish causal links and to assess whether inferences can be used for policy-making. This strategy differs from the standard approach to inferring causality in the energy-GDP nexus literature, which is based on bivariate (Kraft and Kraft, 1978) and multivariate (Stern, 1993) Granger tests that have been modified to account for integration (Wolde-Rufael, 2004), cointegration (Masih and Masih, 1996), panel data availability (Lee, 2005), and regime shifts (Kocaaslan, 2013). These approaches, although increasingly refined and sophisticated, suffer from the fact that Granger-causality does not imply any meaningful sense of causality in non-stationary settings (Hendry, 2004).
Fourth, we account for multiple hypothesis testing by controlling the False Discovery Rate (FDR) (Benjamini and Hochberg, 1995). To our knowledge, this issue has been broadly neglected by the energy-GDP nexus literature. For example, Chontanawat et al. (2008) and Apergis and Tang (2013) study over 100 and 85 countries respectively and perform hundreds of tests, yet fail to mention any multiple inference procedure (Narayan, 2016).
The rest of the paper is organized as follows. Section 2 presents the data and details the methods used to analyze it. Section 3 presents our results. Section 4 discusses our results and Section 5 concludes.

Data
Our dataset covers GDP (Maddison Project, 2018) and modern energy use for 20 Latin American countries from 1900 to 2010. The series on energy use contains the annual modern energy consumption (coal, oil, gas, and primary electricity). It has been built from preexisting compilations by energy historians, which have been linked to internationally available statistics after 1970. The series were originally constructed by Rubio et al. (2010) for 1900-1930, and later expanded by Rubio and Folchi (2012) to cover 1856-1960. Updates to coal consumption from Yañez et al. (2013) and to hydroelectricity from Rubio and Tafunell (2014) are included. Data missing from the previous sources after 1950 were obtained from the United Nations (UN/WES) (1976) until 1971. After 1971 the series are matched with International Energy Agency (2013) data, excluding the biofuels and waste estimates from the IEA's totals in order to keep the data consistent across the entire period. Data are expressed in terajoules (TJ). Details can be found in the supplementary materials.
[Figure note: the boxes span the 1st to 3rd quartiles, with the vertical line at the median value. The whiskers extend from the lower to the upper adjacent values. Countries are sorted by the median value of energy consumption.]
The context of the data and the data itself must be taken into consideration when looking for evidence of the energy-GDP nexus in Latin America. Contrary to the world as a whole, in most Latin American countries energy consumption outpaced economic growth over the century (Figure 2). In fact, many countries have had impressive energy consumption growth rates over time, with averages for the period under study in Brazil, Colombia, Mexico, and Venezuela higher than their standard deviations. From 1950 to 2000 alone, the entire region's energy consumption increased roughly 12-fold while its GDP only grew about 9-fold. While energy consumption has tended to converge in the region over the past century, GDP levels have not (Figure 3).
[Figure source: own elaboration based on the supplementary materials for energy and Maddison Project (2018) for GDP.]
The regional averages hide vast heterogeneity over space between countries and over time for the same country (Figure 2). Three countries (Argentina, Brazil and Mexico) concentrate the bulk of energy consumption and economic production in the region. These three countries account for up to 70% of the region's energy consumption and GDP over the twentieth century. Moreover, 90% of both indicators are accounted for by adding four more countries (Chile, Colombia, Uruguay and Venezuela). Thus, the remaining 13 countries share the remaining 10% of the region's energy consumption and economic output. Latin American heterogeneity across space can also be exemplified with per capita consumption: in 1890 a Chilean consumed on average about 370 times more energy than the average Guatemalan (Rubio et al., 2010), and in 2000, the average Mexican consumed over 250 times more energy than the average Haitian.
In addition to the heterogeneity between countries, the heterogeneity within countries over time is remarkable. While all experienced important increases in both GDP and energy use throughout the 20th century, some had particularly dramatic paths. For example, Venezuela went from being a poor agrarian country in the 1900s to the second largest oil producer in the world by the 1950s, turning into one of the richest countries in the world by the 1960s, but then collapsing towards the end of the century.
Whereas these differences in space and time seem to correlate well with differences in GDP, as argued by Rubio et al. (2010), there is no theoretical reason to expect that a constant and identifiable relation should hold across such contrasting settings. As argued by Toman and Jemelkova (2003) and Yu and Choi (1985), lack of consistency in the relation between energy and GDP can be attributed not only to varying econometrics and omitted variables bias, but also to different structure and stages of economic development, energy consumption patterns, and even climate conditions.

Integration
To study Granger-causality and super exogeneity we first determine the order of integration I(d) of the series under analysis. As all series are growing (Table 1), they represent processes that are either trend stationary or that contain a unit root with drift (Elder and Kennedy, 2001). The alternative of a unit root with a deterministic trend implies an explosive data generating process, and thus is discarded. To discriminate between the two possible processes we use the Elliott et al. (1996) and Kwiatkowski et al. (1992) tests (hereafter DF-GLS and KPSS respectively). The DF-GLS test is a Dickey-Fuller type test. In its fitting regression, with β the coefficient on the lagged level, γ the coefficient on the deterministic trend, and ϵ_t a white noise error term, the null hypothesis of a unit root with drift implies setting γ = 0 and testing H_0: β = 0. On the other hand, the KPSS test uses a fitting regression in which the null is trend stationarity, ϵ_t represents a white noise error term, and r_t a random walk.
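For reference, the two fitting regressions are commonly written as below. This is a sketch of the standard textbook forms, not necessarily the authors' exact notation: β, γ, ϵ_t, and r_t follow the text, while α, ζ_j, k, ξ, and u_t are assumed symbols. In the test of Elliott et al. (1996) the Dickey-Fuller regression is applied to the series after GLS detrending, so the deterministic terms shown in the first line are removed beforehand in the actual DF-GLS procedure.

```latex
% Dickey-Fuller type regression; unit root with drift: \gamma = 0 and H_0: \beta = 0
% (\alpha, \zeta_j, and k are assumed symbols)
\begin{equation*}
\Delta y_t = \alpha + \beta y_{t-1} + \gamma t + \sum_{j=1}^{k} \zeta_j \Delta y_{t-j} + \epsilon_t
\end{equation*}

% KPSS decomposition; trend stationarity corresponds to a zero variance of u_t
% (\xi and u_t are assumed symbols)
\begin{equation*}
y_t = \xi t + r_t + \epsilon_t, \qquad r_t = r_{t-1} + u_t
\end{equation*}
```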
Accounting for structural breaks is important as they can be mistaken for unit roots (Campos et al., 1996; Perron, 1989). Thus, we use the Zivot and Andrews (1992) and Perron (1997) tests (hereafter ZA and P97 respectively). Model C of the ZA test, which is based on a Dickey-Fuller type test, follows from Perron (1989), and allows for an endogenous break in constant and trend under the alternative. In the fitting regression of this model, the null hypothesis of a unit root with drift is represented by H_0: β = 1, with β the coefficient on the lagged level, ϵ_t is a white noise error term, and DU_t = 1(t > T_b) is a level-shift dummy. As usual, 1(•) is the indicator function, t a deterministic time trend, and T_b the endogenously defined structural break date. On the other hand, the P97 test allows for a break in the constant and trend under both the null and the alternative, and in its fitting regression the null hypothesis of a unit root with drift is likewise represented by H_0: β = 1.
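As a hedged reference, the two break-test regressions are usually written along the following lines. Only β, ϵ_t, DU_t, t, and T_b come from the text; μ, θ, γ, δ, d, c_j, k, the slope dummy DT_t, and the pulse dummy D(T_b)_t are assumed symbols, and the exact definition of the slope dummy differs across presentations of the Perron (1997) test.

```latex
% ZA Model C (break in constant and trend under the alternative); H_0: \beta = 1
\begin{equation*}
y_t = \mu + \theta DU_t + \gamma t + \delta DT_t + \beta y_{t-1}
      + \sum_{j=1}^{k} c_j \Delta y_{t-j} + \epsilon_t
\end{equation*}
% Dummies: DU_t shifts the level, DT_t shifts the slope (DT_t is an assumed definition)
\begin{equation*}
DU_t = 1(t > T_b), \qquad DT_t = (t - T_b)\, 1(t > T_b)
\end{equation*}

% P97-type regression: same terms plus a one-period pulse dummy; H_0: \beta = 1
\begin{equation*}
y_t = \mu + \theta DU_t + \gamma t + \delta DT_t + d\, D(T_b)_t + \beta y_{t-1}
      + \sum_{j=1}^{k} c_j \Delta y_{t-j} + \epsilon_t,
\qquad D(T_b)_t = 1(t = T_b + 1)
\end{equation*}
```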

Granger causality
We study Granger-causality with the modified Wald statistic proposed by Toda and Yamamoto (Dolado and Lütkepohl, 1996; Toda and Yamamoto, 1995), hereafter TY. This method has been widely used because time series must be stationary for Granger tests from a vector autoregressive (VAR) model to be valid, and energy and GDP are usually non-stationary. A vector error correction model (VECM) can be used to do Granger tests in the presence of cointegration (Granger, 1988), as has been done by Akinlo (2008) while studying 11 African countries, Ghali and El-Sakka (2004) for Canada, and Yuan et al. (2008) for China, among others. Yet, pre-testing for unit roots and cointegration leads to problems with the size and power of Granger tests based on a VECM (Cheung and Lai, 1993; Clarke and Mirza, 2006; Harris and Sollis, 2003).
In contrast, TY can be implemented regardless of the order of integration and the cointegration relations of the series and, by bypassing pre-tests, leads to more reliable Granger-causality results.
In fact, there is strong evidence that, compared to the procedure based on a VECM, the TY approach has a smaller size distortion and a similar performance (Clarke and Mirza, 2006; Yamada and Toda, 1998). Although Zapata and Rambaldi (1997) demonstrated that the TY method has lower power than the VECM approach in bivariate and trivariate models with sample sizes of 50 or fewer, our dataset has between 50 and 100 observations per country.
The implementation of the TY procedure requires choosing an appropriate lag length p for a VAR containing the variables of interest. This is usually done with the AIC, BIC, or Hannan-Quinn information criterion. We chose the lag length by majority vote among the three criteria and, in the case of a tie, we prioritized the BIC. Whenever evidence of serial correlation of the residuals was found, the lag length was increased to eliminate it.
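The selection rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the authors' code: the function name `choose_lag` and its inputs (the lag each criterion selects on its own) are hypothetical, and the subsequent step of lengthening the lag to remove residual serial correlation is not covered.

```python
from collections import Counter


def choose_lag(aic_lag, bic_lag, hq_lag):
    """Pick the lag chosen by at least two of the three criteria.

    If all three criteria disagree (a tie), fall back on the BIC choice.
    """
    votes = Counter([aic_lag, bic_lag, hq_lag])
    lag, count = votes.most_common(1)[0]
    return lag if count >= 2 else bic_lag


# Hypothetical inputs: AIC and Hannan-Quinn agree, BIC prefers a shorter lag.
majority_lag = choose_lag(aic_lag=2, bic_lag=1, hq_lag=2)
# Hypothetical inputs: all three disagree, so the BIC choice is used.
tie_lag = choose_lag(aic_lag=3, bic_lag=1, hq_lag=2)
```

With three voters, a tie can only arise when all three criteria select different lags, which is the case the BIC fallback handles.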
Next, the lag length is extended by the maximum order of integration d_m of the series under study, such that the VAR has p + d_m lags. The TY procedure is thus implemented on a bivariate VAR in GDP and energy, in which each γ and δ is a parameter to be estimated and each ε is a white noise error term. Here δ_{1,i} denotes the coefficient on the i-th lag of energy in the GDP equation, and γ_{2,i} the coefficient on the i-th lag of GDP in the energy equation. With this VAR, Granger-causality is inferred by testing for the joint significance of the first p parameters. Specifically, the null of Granger non-causality running from energy to GDP is represented by H_0: δ_{1,i} = 0 ∀ i ∈ {1, ..., p}, and the converse by H_0: γ_{2,i} = 0 ∀ i ∈ {1, ..., p}.
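The lag-augmented test can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: it estimates a single equation of the VAR by OLS, assumes an intercept (the text does not say whether one is included), uses a homoskedastic covariance estimate, and restricts only the first p lags of the candidate causal variable. All function names and the simulated series are hypothetical.

```python
import random


def lagged_design(y, x, lags):
    """Rows [1, y_{t-1..t-lags}, x_{t-1..t-lags}] with targets y_t."""
    rows, targets = [], []
    for t in range(lags, len(y)):
        row = [1.0]
        row += [y[t - i] for i in range(1, lags + 1)]
        row += [x[t - i] for i in range(1, lags + 1)]
        rows.append(row)
        targets.append(y[t])
    return rows, targets


def invert(a):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ty_wald(y, x, p, d_max):
    """Wald statistic for H0: the first p lags of x do not enter the y equation.

    The equation is fitted with p + d_max lags; the extra d_max lags are
    estimated but left unrestricted.
    """
    lags = p + d_max
    X, Y = lagged_design(y, x, lags)
    n, k = len(X), len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yt for r, yt in zip(X, Y)) for a in range(k)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [yt - sum(b * v for b, v in zip(beta, r)) for r, yt in zip(X, Y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    idx = [1 + lags + i for i in range(p)]  # columns holding x_{t-1}, ..., x_{t-p}
    b_r = [beta[i] for i in idx]
    v_inv = invert([[sigma2 * xtx_inv[i][j] for j in idx] for i in idx])
    stat = sum(b_r[i] * v_inv[i][j] * b_r[j] for i in range(p) for j in range(p))
    return stat, p  # statistic and its degrees of freedom


# Simulated (not real) data: energy is a random walk that feeds into GDP with a lag.
rng = random.Random(0)
energy, gdp = [0.0], [0.0]
for _ in range(200):
    gdp.append(0.5 * gdp[-1] + 0.8 * energy[-1] + rng.gauss(0, 1))
    energy.append(energy[-1] + rng.gauss(0, 1))

stat, df = ty_wald(gdp, energy, p=1, d_max=1)
```

Under the null the statistic is asymptotically chi-squared with p degrees of freedom (Toda and Yamamoto, 1995); for p = 1 the 5% critical value is about 3.84, which the simulated example exceeds by construction because energy truly enters the GDP equation.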

Super exogeneity
Although Granger tests have been the most popular identification strategy in the energy-GDP nexus literature, they are neither a necessary nor a sufficient condition for causality (Hendry, 2004). Granger-causality is a measure of forecast capability (Granger, 1980), and such capacity does not imply causality in non-stationary settings (Hendry and Mizon, 2000). Thus, Granger-causality should be used to study forecasting proficiency, and the establishment of causal links should rest on theory or super exogeneity.
A variable is super exogenous if it is weakly exogenous, and if the parameters of the conditional model of interest are invariant to a class of interventions (Engle et al., 1983;Hendry, 2004). A variable is found to be weakly exogenous if the joint distribution under study is not influenced when conditioned on such variable. This property allows for valid conditional inference. Parameter constancy is found when the marginal density of the weakly exogenous variable has no influence on all the parameters of the conditional model used to study such joint distribution. This property allows for valid policy analyses.
To study whether a variable is super exogenous we follow the VECM-based procedure described by Rodriguez-Caballero and Ventosa-Santaularia (2016). The procedure rests on the idea that if the error correction terms (ECTs) of a variable are jointly non-significant, then its marginal density can be marginalized from the analysis, as the variable does not adjust upon shocks to the long run relationship of the joint distribution (Engle and Granger, 1987). On the other hand, parameter constancy can be studied by testing whether the parameters of the reduced rank matrix of the VECM are recursively stable (Hansen and Johansen, 1999).
Estimating a VECM first requires studying the integration and cointegration of the series. If the unit root tests described above show that the series are integrated of the same order, then cointegration tests can be implemented to establish the rank of the cointegrating space. As the conventional cointegration test in Johansen (1988; 1991) is invalid under structural breaks, we use the one in Johansen et al. (2000) to account for them. The dates of the structural breaks T_b were chosen by evaluating the endogenously chosen break dates from the ZA and P97 tests and the results from a Chow-type test (Chow, 1960) in light of each country's unique historical events (see Tables A3 and A4 in the Appendix).
For countries with evidence of cointegration, we estimate a VECM in which each α is an ECT, each β is an element of the cointegrating vector, and each Γ is a parameter representing a variable's short run dynamics. Also, t is a deterministic time trend, μ is a parameter associated with the constant term, D_t = 1(t > T_b) is a B × 1 vector containing B level breaks, and Ψ is a B × 1 vector of parameters associated with such breaks. Finally, the lag length is that of the underlying VAR, and each error term is white noise.
[Note to Table 2: ECT-GDP and ECT-E are the error correction terms in the GDP and energy consumption equations respectively. Beta is the parameter of energy in the cointegrating vector. P-values in parentheses and q-values in square brackets.]
[Note to the summary table of results: "C" means contradictory results. "-" for procedures that cannot be done. "neutral" supports the neutrality hypothesis. "No" means that weak or super exogeneity was not found. (a) GDP→E at 10% level of significance. (b) GDP←E at 10% level of significance. (c) GDP↔E at 10% level of significance.]
Given the VECM in (5), there is evidence of weak exogeneity of GDP and energy if we fail to reject the null H_0: α_1 = 0 and H_0: α_2 = 0 respectively. Furthermore, parameter constancy is tested with the time path of the τ-statistics of recursive eigenvalues of the reduced rank matrix Π = [α_1 α_2]' [β_1 β_2 γ]. As the null of eigenvalue stability is rejected if the τ-statistics are larger than their critical values, failure to reject implies that there is no evidence against the proposition that the marginal distribution of the weakly exogenous variables is independent of the joint distribution of the two variables. If parameter constancy fails to be rejected, the variables found to be weakly exogenous are also super exogenous.
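A VECM consistent with the symbols defined above can be written as follows. This is a hedged reconstruction from the verbal definitions rather than the authors' displayed equation: the placement of the trend inside the cointegrating relation follows from the stated form of Π, while the lag symbol k, the equation subscripts on μ, Ψ, Γ, and ε, and the per-equation layout are assumptions.

```latex
\begin{align*}
\Delta GDP_t &= \alpha_1 \left( \beta_1 GDP_{t-1} + \beta_2 E_{t-1} + \gamma t \right)
  + \sum_{i=1}^{k-1} \left( \Gamma_{11,i} \Delta GDP_{t-i} + \Gamma_{12,i} \Delta E_{t-i} \right)
  + \mu_1 + \Psi_1' D_t + \varepsilon_{1,t} \\
\Delta E_t &= \alpha_2 \left( \beta_1 GDP_{t-1} + \beta_2 E_{t-1} + \gamma t \right)
  + \sum_{i=1}^{k-1} \left( \Gamma_{21,i} \Delta GDP_{t-i} + \Gamma_{22,i} \Delta E_{t-i} \right)
  + \mu_2 + \Psi_2' D_t + \varepsilon_{2,t}
\end{align*}
% Reduced rank matrix whose recursive eigenvalues are tested for stability
\begin{equation*}
\Pi = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}
      \begin{pmatrix} \beta_1 & \beta_2 & \gamma \end{pmatrix}
\end{equation*}
```

In this layout, weak exogeneity of GDP corresponds to α_1 = 0 (GDP does not adjust to deviations from the long run relation) and weak exogeneity of energy to α_2 = 0.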

Multiple inference testing
We use the FDR introduced by Benjamini and Hochberg (1995) to control for the expected proportion of type I errors among the rejected hypotheses under multiple hypothesis testing. We use standard q-values as described in Anderson (2008), which modify the p-values from a family of tests as follows. Sort M hypotheses in order of decreasing significance such that their corresponding p-values are ordered as p_1 < ... < p_M. Then, choose any value q ∈ (0,1) and the largest r ∈ {1, ..., M} such that $p_r \le q \, r / M$. Controlling the FDR at the q level of significance implies rejecting the hypotheses associated with p_1 < ... < p_r. Finally, the smallest q-value at which a hypothesis would be rejected, which is the equivalent of the p-value, is computed by repeating this procedure for all levels of q and identifying the point where the significance changes.
In this study we test 40 Granger-causality and 33 weak exogeneity hypotheses, and therefore present alongside each p-value its corresponding q-value. To obtain such values we treat the Granger tests and the weak exogeneity tests as distinct "families" (Hochberg and Tamhane, 1987). Furthermore, we do not use the FDR for the integration and cointegration tests due to their nonstandard distributions.
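The q-value computation can be sketched as follows. Rather than literally repeating the procedure over a grid of q levels, this illustration uses the equivalent closed-form step-up shortcut, in which the q-value of the hypothesis ranked i is the minimum of p_j M / j over all ranks j ≥ i. The function name `bh_qvalues` and the example p-values are hypothetical, and the code is a sketch rather than the authors' implementation.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg q-values, returned in the original order of pvals."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p-value
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the least to the most significant hypothesis so that each
    # q-value is the minimum of p_j * m / j over all ranks j >= its own rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical family of four tests.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = bh_qvalues(example_p)
# Hypotheses rejected when controlling the FDR at the 5% level.
rejected = [q <= 0.05 for q in example_q]
```

In this hypothetical family, three p-values fall below 0.05 but only one q-value does, which illustrates how the adjustment can change inferences. Each family of tests (for instance, the Granger tests and the weak exogeneity tests) would be passed to the function separately.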

RESULTS
The unit root tests show generally consistent evidence of nonstationarity, even under structural breaks. For GDP, all countries except Bolivia and Ecuador are inferred to be I(1). While Ecuador is inferred to be I(2), Bolivia shows contradictory results with stronger evidence supporting I(0) (Table A1 in the appendix). For energy, 14 countries are inferred to be I(1). While Bolivia is inferred to be I(2), Ecuador, El Salvador, Haiti, Honduras, and Paraguay show contradictory results (Table A2 in the appendix).
Granger tests following TY are done on all countries under study, expanding the optimal lag length by one or two according to the unit root tests. In the case of countries with contradictory results we expand the lag length by two. At the 5% level of significance and taking into consideration q-values, Granger-causality running from GDP to energy is found for seven countries (Argentina, Bolivia, Chile, El Salvador, Paraguay, Peru, and Venezuela), and from energy to GDP for one country (Cuba). Furthermore, bidirectional Granger-causality is found for two countries (Honduras and Mexico), while no relation is found for 10 countries (Table 1). At the 10% level of significance, two of these 10 countries show evidence of Granger-causality, one running from GDP to energy (Brazil) and the other in the opposite direction (Costa Rica). Moreover, at this less stringent level two countries (Chile and Venezuela) that previously showed Granger-causality from GDP to energy show evidence of bidirectional Granger-causality. Lastly, the use of q-values instead of p-values changes relevant inferences in five cases at the 5% level of significance, and in four at the 10% level of significance.
Given the pre-tests, a VECM can be estimated for only 11 of the 20 countries under study, which still represent roughly 90 per cent of the GDP and energy consumption of the region. The ECTs of the estimated VECMs (Table 2) show that at the 5% level of significance, GDP is weakly exogenous in four countries (Chile, Costa Rica, Nicaragua, and Mexico), and energy is weakly exogenous in two (Panama and Venezuela). The remaining five countries (Argentina, Brazil, Cuba, the Dominican Republic, and Uruguay) show no evidence of weak exogeneity. Note that the adjustment for multiple inferences makes GDP weakly exogenous for Mexico.
At the 5% level of significance, Costa Rica, Mexico, Panama, and Venezuela fail to reject the null of eigenvalue stability, while Chile and Nicaragua reject it (Figure 4). Thus, results are split evenly among the six countries with evidence of weak exogeneity, with super exogeneity of GDP in Costa Rica and Mexico, super exogeneity of energy in Panama and Venezuela, and no super exogeneity in Chile and Nicaragua. Moreover, the five countries with no evidence of weak exogeneity also fail to reject the null of eigenvalue stability, which implies a stable mutual adjustment between energy and GDP.

DISCUSSION AND CONCLUSION
A summary of all results is provided in the accompanying table. Our results largely contradict those of Rodriguez-Caballero and Ventosa-Santaularia (2016), which is the closest previous study in terms of methodology and countries. Using super exogeneity for roughly the same countries, the authors report causality running from energy to GDP for eight countries, while we do so for only two (Panama is in both). Moreover, they report causality from GDP to energy for Colombia, Mexico, and the USA, while we do so for Mexico and Costa Rica. These differences are likely due to the energy series, as we study 100 years of data on all commercial energy sources, whereas they use 40 years of data on electricity consumption only. The evidence suggesting that electricity consumption is particularly limiting for economic growth (Ozturk, 2010) and the shallow electrification of the region (Rubio and Tafunell, 2014) help explain our relatively scarce findings. Regarding the other studies in the region, using Granger tests Apergis and Payne (2009) and Chang and Carballo (2011) (2012).
We do not find a consistent causal relationship between energy and GDP despite using a new century-long dataset for Latin America and a novel identification strategy. This highlights the difficulties of our current approaches in modelling the relation between energy and the economy, especially given the absence of a homogeneous relation under highly heterogeneous spatial and temporal dimensions. Given the essential relation between energy, economies, and societies, new approaches are required to model their interaction.
We should point to several caveats of our results. First, omitted variables bias might be present given the bivariate setting (Menegaki, 2014; Ozturk, 2010; Stern, 1993). Unfortunately, relevant data series (e.g., capital, labor) spanning over 100 years for these 20 countries were not available. Second, our controls for structural breaks might be insufficient to account for the major economic crises (e.g., Great Depression, Debt Crisis) and technological breakthroughs (e.g., new prime movers, efficiency enhancements) that took place during the twentieth century. Third, the proper assessment of our results is limited by the "atheoretical" modelling of long-run dynamics done in this paper, and generally in the literature, as opposed to theory-based analysis (Pesaran, 1997). Yet, there is no alternative to this approach as long as the energy-GDP nexus lacks a sound theoretical framework (Bruns et al., 2014).
These qualifications point to future research avenues. One is expanding the dataset to include labor and capital to enrich this analysis. Another is delimiting testing periods by major events, which might reveal that the energy-GDP nexus is regime specific.
A third is to assess whether there are patterns capable of explaining the current heterogeneity of results, be it regime changes, structural transformation, or others. For example, as suggested by Stern (2011), the relation could evolve according to stages of economic development, where in earlier stages (contained in our data) energy might be relatively abundant, and thus GDP, as a function of relatively scarce capital and labor, limits energy consumption. In later stages (those represented in studies starting after 1970), energy becomes relatively scarce and a limiting input to growth. Although this idea is compatible with findings that developing countries tend to lend more support to the conservation hypothesis (Chen et al., 2012), it is not compatible with the falling cost shares of energy (Csereklyei et al., 2016).
A last major future research avenue is a theoretical understanding of the energy-GDP nexus (Bruns et al., 2014) capable of specifying how it is influenced by labor, capital, and other factors. Admittedly, without an accompanying theory there is no reason to expect a homogeneous relation between these variables that holds across space and time (Voudouris et al., 2015), nor is it possible to identify the factors that drive the mutating relationship. Perhaps the most valuable insight of this "no-results" paper is the inadequacy of our current modelling approach, as little-to-nothing can be said with decent data and complex methods. This recognition underscores the urgency to better model how energy influences economies and societies. Attempts to provide such a theory can be found in Stern (2011), Kümmel et al. (2015), Court et al. (2018), and Keen et al. (2019), but it is still a work in progress.
Although there is no evidence to define one-size-fits-all policies on energy management, there is enough to allow for some policy recommendations and a word of caution. If policy were to be based on the results of the Granger-causality test alone, then for a large majority of countries in Latin America there would be a strong case to pursue aggressive energy efficiency measures without regard to their impact on GDP, given the evidence supporting the neutrality hypothesis and the seven countries where GDP growth has predictive capacity for energy use. However, the results on weak and super exogeneity call for a more cautious approach. Only in Mexico and Costa Rica do such policies seem unlikely to affect GDP growth, while in Venezuela and Panama curtailing energy consumption would impact GDP growth negatively. Furthermore, for three of the largest economies of the region (Argentina, Brazil and Uruguay), plus Cuba and the Dominican Republic, our results imply a stable mutual adjustment between energy and GDP. Thus, policy makers designing and implementing energy policies should take with a grain of salt any advice on the energy-GDP nexus coming from a single methodological approach and based on aggregated results.