Relatedness and regional economic complexity: Good news for some, bad news for others

Purpose – This article aims to evaluate the entry and exit of companies from local productive structures, with a specific focus on the sectoral complexity of these activities and the complexity of these portfolios. The study focuses on empirically demonstrating the thesis that related economic diversification exacerbates the development gap between more and less complex regions.
Design/methodology/approach – The article uses indicators formulated by the economic complexity approach. They allow a relevant descriptive analysis of the economic diversification process in Brazilian micro-regions and provide the foundation for the econometric tests conducted. Through three distinct estimation strategies (OLS, logit, probit), the influence of complexity and relatedness on the entry and exit events of firms from local portfolios is tested.
Findings – In all estimated models, a stronger relationship between an activity and a portfolio significantly increases its probability of entering the productive structure and, at the same time, acts as a significant factor in preventing its exit. Furthermore, the results reveal that the complexity of a sector reduces the probability of its specialization in less complex regions while increasing it in more complex regions. On the other hand, sectoral complexity significantly increases the probability of a sector leaving less complex local structures but has no significant effect in highly complex regions.
Research limitations/implications – Due to the data used, the indicators are calculated considering only formal job numbers. Additionally, the tests do not detect the influence of spatial issues. These limitations should be addressed by future research.
Practical implications – The article characterizes a prevailing process of uneven development among Brazilian regions and brings relevant implications, primarily for policymakers. Specifically, for less complex regions, policies should focus on creating opportunities to improve their diversification capabilities in complex sectors that are not too distant from their portfolios.
Originality/value – The article makes an original contribution by proposing an evaluation of regional diversification in Brazil with a focus on complexity, introducing a more detailed differentiation of regions based on

1. Introduction
Hausmann and Klinger (2007) and Hidalgo, Klinger, Barabasi, and Hausmann (2007) claim that an economy's specialization in a certain product significantly affects its future performance. This is because economies possess different capabilities that either enable or restrict their competitiveness in producing a particular product. That is, economies with more (or less) diversified capabilities are closer to (or further from) acquiring competitiveness in new sectors. Additionally, at the regional level, there is also evidence that complexity stimulates economic growth and employment (Romero et al., 2022).
Empirical evidence supports the notion that economic diversification tends to occur towards sectors that require similar capabilities to be produced, as summarized by the principle of relatedness (Hidalgo et al., 2018). Studies examining regional growth and the entry and exit of activities from local portfolios have demonstrated that diversification patterns in Dutch (Frenken, Van Oort, & Verburg, 2007), Italian (Boschma & Iammarino, 2009), Swedish (Neffke, Henning, & Boschma, 2011), American (Boschma, Balland, & Kogler, 2015; Essletzbichler, 2013) and Brazilian (Freitas, 2019; Françoso, Boschma, & Vonortas, 2022) regions follow a similar trajectory. In other words, regional economic diversification tends to prioritize sectors that are already related to the existing productive structure.
However, this process reinforces the existence of a diverging pattern of diversification among regions, as observed by Pinheiro et al. (2022) in the context of European regions. Specifically, while more complex regions have the potential to diversify into more complex activities, less complex regions face significant obstacles in achieving such diversification. This structural challenge hinders their development since complex sectors offer greater economic benefits. Consequently, although diversification is driven by related sectors, it can perpetuate economic disparities between regions, indicating that relatedness can be good news for some and bad news for others.
The existing literature in this area is still limited, and this study aims to contribute to this theoretical field. Previous analyses by Freitas (2019) and Françoso et al. (2022) had a broader target, primarily assessing the influence of relatedness on regional diversification in Brazil. However, complexity emerges as a key structural determinant of diversification in these regions. Therefore, this paper aims to evaluate the entry and exit of firms in local portfolios, with a specific focus on the sectoral complexity of these activities and on the complexity of these portfolios. While Freitas (2019) focused on examining differences in the influence of complexity only for the most complex regions, and Françoso et al. (2022) focused solely on the influence on the probability of entry, the primary contribution of this article is to emphasize the effect of complexity on regional diversification.
To achieve this objective, this study utilizes formal employment data in productive activities across micro-regions in Brazil from 2009 to 2019. Drawing on the complexity and density indicators formulated by Hidalgo et al. (2007), we examine the influence of these two variables on the likelihood of entering or exiting productive activities within local portfolios. In addition to examining whether the principle of relatedness (Hidalgo et al., 2018) holds true, as in Freitas (2019) and Françoso et al. (2022), where density promotes sector entry and hinders exit, our main focus is on assessing how sectoral complexity influences these probabilities, while also considering the level of regional complexity. For this, we propose a categorization of regions based on the economic complexity index (ECI): low, medium-low, medium-high and high.
These complexity-based regional groups enable the examination of previously untested hypotheses. It is assumed that sectoral complexity reduces the likelihood of new activity entry in less complex regions, while increasing it in more complex regions. Conversely, complexity raises the probability of activity exit in less complex regions but does not impact sector exit in already complex regions. By employing three different model specifications (OLS, logit and probit), the econometric results confirm the assumed hypotheses and highlight that the effect of sectoral complexity is more pronounced on the probability of exiting sectors. Moreover, investigating the effects of product complexity and relatedness for regions at different complexity levels also helps to understand the different diversification strategies to be pursued in regions with different development levels, an area that is still relatively unexplored in the literature.
Finally, the analysis is organized as follows: Section 2 discusses the literature on the subject, bringing relevant and similar contributions applied to different regions across the world. Section 3 presents the indicators, database and econometric specifications used. Section 4 is a brief descriptive analysis. Section 5 brings the results of the econometric tests and Section 6 ends with the final considerations.

2. Relatedness: Review of the empirical literature
The evolutionary perspective on economic change has introduced valuable concepts that enhance the understanding of the development process (Nelson & Winter, 1982). Extending this viewpoint to address the questions of economic geography has enabled the theoretical and empirical construction necessary to comprehend the evolution of regional economies and, more specifically, the determinants of their productive diversification (Frenken et al., 2007; Neffke et al., 2011; Boschma et al., 2015; Françoso et al., 2022). Therefore, this section aims to discuss the evolution of empirical literature that evaluates regional economic development, with a focus on describing and relating how these studies explain the related diversification process.
The work of Neffke et al. (2011) marks the beginning of a series of regional studies focused on analyzing the entry, maintenance and exit of firms based on the proximity of productive structures to sectors. The authors examine 70 Swedish regions from 1969 to 1994 and investigate the influence of the number of closely related industries in a region on the probability of entry, maintenance or exit of specific sectors in the local economy. They employ three groups of estimations: models to assess (1) the probability of entry, (2) maintenance and (3) exit. The dependent variables are represented by dummy variables indicating the occurrence of these events for each region-industry pair within five-year intervals during the specified period. The estimations utilize ordinary least squares (OLS), probit and logit models. The findings align with the assumed hypotheses, indicating that industries technologically related to existing industries are more likely to enter and persist in the regional portfolio, while those on the technological periphery are more prone to exit.
A similar analysis was conducted by Essletzbichler (2013) using data from 360 US metropolitan areas. Essletzbichler examined the effect of proximity on the probability of firm entry, maintenance and exit. However, there are two notable differences compared to the earlier work. First, the measurement of the proximity indicator differs. While Neffke et al. (2011) determined relatedness based on the occurrence of products from distinct industries in manufacturing plant portfolios, Essletzbichler (2013) measured relatedness by analyzing the intensity of flows between pairs of industries using input-output relations. The results of the analysis indicate that an industry's proximity to the regional portfolio increases the odds of membership by 6.9% and the odds of entry by 3.7%, while decreasing the odds of exit by 3.1% per additional link.
Rigby (2013) utilizes US patent data spanning from 1975 to 2005 to examine the impact of proximity on the likelihood of entering and exiting technology classes within cities' patent networks. The author investigates the effect of time-lagged proximity, measured by the degree of technological relatedness, on these probabilities. Linear probability models and conditional logit models are estimated using maximum likelihood techniques, with fixed effects incorporated for cities and technology classes. According to the linear probability model, increasing proximity by 1 unit for a technology in which a city has no specialization enhances the probability of developing specialization in that area by 0.7%, while increasing proximity by 1 unit for a technology in which the city is already specialized decreases the probability of losing that specialization by 2.72%.
Boschma et al. (2015) conducted a similar analysis to Rigby (2013) using US patent data. However, they employed different variables and a distinct model specification. To examine whether cities diversified into related sectors, the authors accounted for potential omitted variable bias by including city characteristics and technology class variables as controls. City-level characteristics considered included employment information, population density, inventive capacity (inventors-to-employees ratio), technological specialization, growth in the number of inventors and income per employee. Technology-level variables used as controls included the number of inventors in the class, technological concentration, growth in knowledge production and a measure of patent age. The authors then tested the impact of relatedness on the entry and exit of technologies in US cities, focusing on a 5-year window within this period. They employed OLS models with fixed effects for cities, technologies, and years. To ensure the robustness of the OLS model results, alternative methods such as probit and logit were employed. The results consistently showed that relatedness density had a significant statistical and economic effect on diversification across all model specifications.
In the study conducted by Boschma, Heimeriks, and Balland (2014), the authors applied a similar analytical approach to analyze the influence of scientific relatedness on the emergence or disappearance of biotech research topics in the scientific portfolios of cities worldwide. The relatedness indicator was calculated based on the co-occurrence of topics in journal articles. The model specification resembled that of Neffke et al. (2011). As control variables, the number of publications at the city and topic levels was used. The findings indicated that new scientific topics in biotech tend to emerge in cities with existing related scientific fields. On the other hand, loosely related topics are more likely to disappear from a city's scientific portfolio.
This method was also used to assess regional diversification in Brazil. Freitas (2019) used employment data from Brazilian microregions to investigate regional diversification in Brazil. The study focused on the impact of density and complexity on the entry, maintenance and exit of sectors in the regions' productive structure. By incorporating economic complexity indicators into the analysis, the author hypothesized that regions would be less inclined to develop new specializations in less related and more complex activities. Using OLS, logit and probit models with fixed effects for region, productive activity and period, the study examined three 5-year windows between 2006 and 2016. The findings confirmed the hypotheses, indicating that proximity to the local productive structure increased the likelihood of sectors remaining or entering the current portfolio, while regions demonstrated reduced propensity for developing new specializations in more complex activities.
Françoso et al. (2022) conducted a study utilizing employment and patent data from Brazilian meso-regions to examine the impact of relatedness and complexity on regional diversification in Brazil. The authors focused on the probability of new sector entry into the local economies' productive portfolios between 2006 and 2019. To mitigate potential omitted variable bias, control variables such as population density, gross domestic product (GDP) per capita, and proxies for sector size and region diversity were employed. The study also compared the OLS results of two different samples, namely the 50% more complex and 50% less complex regions, to explore potential variations in the role of these variables. Across both employment and patent datasets, the findings demonstrated that regions tend to diversify into more related sectors, and higher levels of complexity generally reduce the probability of new sector entry. However, the relationship between complexity and sector entry is reversed in the highly complex region sample, indicating that higher complexity in such regions may actually increase the probability of new sector entry.
The inclusion of complexity indicators in the analysis of regional diversification in Brazil yields noteworthy findings that warrant further attention. In the study by Freitas (2019), focusing on the top 25% of regions with higher complexity, it is observed that the level of complexity in a sector has a positive impact on the specialization of new economic activities. However, this coefficient also has a positive effect on explaining sector exit probability and a negative effect on explaining maintenance. The hypothesis put forth by the author suggests that complexity facilitates access to more complex activities but does not alleviate the "trap" of low complexity, as the probability of a sector remaining in the local structure is inversely proportional to its complexity even among the most complex regions. As previously mentioned, similar results were found by Françoso et al. (2022) when comparing the influence of complexity on activity and technological class entry in regions with varying levels of complexity.
These findings shed light on an aspect that has been largely overlooked in previous studies. The influence of complexity and relatedness in shaping regional diversification in Brazil appears to perpetuate a growing economic disparity between regions. The reversal of the complexity coefficient's sign, depending on the region, highlights an ongoing worsening of economic inequality among regions (Pinheiro et al., 2022; Hartmann & Pinheiro, 2022). However, this structural characteristic of the Brazilian regional diversification process was not the primary focus of the analyses conducted by Freitas (2019) and Françoso et al. (2022). The former examined only the diversification of the most complex microregions, while the latter only assessed the effect of complexity on the probability of entry of new productive activities. Therefore, it is crucial to investigate how the process of related diversification unfolds in Brazil, under the influence of complexity, to determine the extent of this uneven development.
The findings presented by Pinheiro et al. (2022) regarding European regions highlight a feedback loop of inequality between regions. Advanced economies tend to specialize in related high-complexity activities, while lagging and less complex regions concentrate on related low-complexity activities. The partial results reported by Freitas (2019) and Françoso et al. (2022) provide evidence of a similar pattern in Brazil, suggesting that relatedness can have positive implications for certain regions while potentially exacerbating challenges for others. This observation supports the thesis that relatedness is good news for some regions and bad news for others.
Indeed, the inability of regions to diversify into more complex activities represents a structural challenge for their development. Hidalgo and Hausmann (2009) argue that engagement in complex productive sectors brings substantial economic benefits to regions due to the combination of various resources that are difficult to acquire and replicate. This creates a competitive advantage for the region, which can persist over time. In contrast, less complex activities are easier to imitate and can disperse quickly, offering less economic value and limited potential for competitive advantage. Moreover, the significance of regional complexity is empirically acknowledged and is associated with greater future GDP and employment growth (Romero et al., 2022). Hence, a diversification strategy centered on highly complex activities holds greater economic benefits for regions.
Therefore, understanding the process of regional diversification in Brazil from this perspective is crucial, as the country is characterized by a structural malformation in terms of regional inequalities, extensively explored in the literature on the subject. Silva (2017) emphasizes that this structural issue manifests in the notion that, irrespective of the socioeconomic indicator used to gauge regional inequality, the results consistently exhibit the same pattern. The North and Northeast regions consistently show the poorest rates, while the South and Southeast regions stand out with the most favorable averages. The differences between regions play a significant role in shaping their diversification trajectories and can contribute to divergent development outcomes.
Furthermore, a specific segment of the literature delves into the convergence among municipalities and regions to scrutinize the dynamics of regional inequalities. Vreyer and Spielvogel (2009) estimated the speed of per capita income convergence in Brazilian municipalities from 1970 to 1996. Their findings underscored the lack of convergence and spatial dependence among municipalities, elucidating the persistence of inequalities and the clustering of impoverished areas in less developed regions of the country. However, more recent studies, such as those by Neto (2014) and Magalhaes and Alves (2021), suggest an improvement in the scenario, particularly from the 2000s onward. Nevertheless, Magalhaes and Alves (2021) contend that this improvement does not translate into low levels of regional inequality, highlighting the enduring structural challenge of fostering sustained and less unequal national development.
When examining the panorama of Brazilian regional complexity, studies align closely with anticipated patterns. Freitas and Paiva (2016) characterize regional inequality, emphasizing its manifestation in terms of complexity. The authors point out a concentration of poles of diversity and sophistication in exports solely within the South and Southeast regions, without discernible signs of improvement in the assessed period for other regions. In a separate study, Rezende et al. (2023) analyze the distribution of employment across Brazilian states from 2006 to 2020. They underscore a noteworthy concentration of jobs in highly complex activities exclusively within the South and Southeast regions, highlighting an ongoing process of regressive specialization across the country.
Given this Brazilian context, the challenges faced by less complex regions in diversifying into activities that require less common capabilities should be addressed through targeted policies aimed at avoiding a potential low-complexity trap. However, to design effective policies, it is essential to examine how this logic manifests itself in Brazil. This requires a comprehensive analysis of regional dynamics, economic capabilities and the interplay between complexity, relatedness and regional development policies in order to foster inclusive and sustainable diversification across all regions in Brazil.
In this context, some hypotheses will be tested for this evaluation:
H1. The productive activities that enter the portfolio of a region are more related to other activities already produced in that location.
H2. A region is more likely to cease specialization in a particular activity when it is less related to the other activities within the local productive structure.
H3. Highly complex regions possess sufficient capabilities, such that increasing sector complexity enhances the probability of sector entry into their portfolios, while it has little influence on the probability of sector exit.
H4. Among less complex regions, greater sector complexity decreases the probability of sector entry and simultaneously acts as a significant factor driving sector exit.
The first two hypotheses, which are based on the principle of relatedness, have already been investigated in previous studies, including those conducted on regions in Brazil. Françoso et al. (2022) specifically examined the probability of entry for new activities and found a positive effect of relatedness, supporting Hypothesis 1. Freitas (2019) analyzed both the probability of entry and exit and found positive effects of relatedness on entry (Hypothesis 1) and negative effects on exit (Hypothesis 2). However, in addition to attesting the principle of relatedness, the subsequent hypotheses in this study aim to shed light on the persistent divergent pattern of diversification among Brazilian regions and provide insights into why relatedness is only beneficial to some regions.
In this case, Hypotheses 3 and 4 have not been fully tested together in previous studies and represent the main contribution of this paper. Françoso et al. (2022) partially examined Hypotheses 3 and 4, finding a positive influence of sectoral complexity on the probability of entry of new activities in regions with complexity above the median and a negative influence in regions with complexity below the median. However, they did not evaluate the effect of complexity on the probability of exiting activities from the portfolio, which is not sufficient to fully confirm Hypotheses 3 and 4. On the other hand, Freitas (2019) only tested Hypothesis 3 by focusing on the most complex regions (4th quartile of ECI). Although he found positive effects of sectoral complexity on the probability of entry and exit of firms, the analysis is not sufficient to confirm Hypothesis 4 as it did not include less complex regions. The following section describes the methods and data that will be used to test these assumptions.
3. Data and method
3.1 Complexity measures
Hidalgo et al. (2007) utilized international trade data as the basis for their methodological approach, which builds upon the concept of revealed comparative advantage (RCA) introduced by Balassa (1965). The RCA index serves as a criterion for identifying specialization in a particular economic activity by comparing the share of that activity in the local economy to its total share in the overall economy. If the numerator (share in the local economy) is greater than the denominator (share in the overall economy), it indicates that the country or region has a competitive advantage in that sector. The RCA index shares the same conceptual framework as the location quotient (LQ) used in regional literature. The formal expression of the RCA index is presented below:
$$ RCA_{i,j} = \frac{X_{i,j} \big/ \sum_{i} X_{i,j}}{\sum_{j} X_{i,j} \big/ \sum_{i} \sum_{j} X_{i,j}} \tag{1} $$
where $X_{i,j}$ represents the quantity of product i exported by country j. Therefore, if the calculation of RCA yields a value equal to or greater than 1, it indicates that country j competitively produces product i in comparison to other countries. Conversely, if the resulting index is less than 1, it implies that product i does not play a significant role in the analyzed market.
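As an illustration of the index just described, the following minimal Python sketch computes RCA for a small, made-up export table and flags the cells where the index reaches 1. The function name `rca_matrix`, the toy numbers and the assumption that every row and column total is positive are illustrative choices, not part of the original method. Applying the same function to employment counts would give a location quotient.

```python
def rca_matrix(x):
    """Return rca[i][j] for a table x[i][j] (rows: products i, columns: countries j).

    Assumes every product total and every country total is strictly positive.
    """
    n_products, n_countries = len(x), len(x[0])
    country_totals = [sum(x[i][j] for i in range(n_products)) for j in range(n_countries)]
    product_totals = [sum(row) for row in x]
    grand_total = sum(product_totals)
    rca = []
    for i in range(n_products):
        row = []
        for j in range(n_countries):
            local_share = x[i][j] / country_totals[j]           # share of i in country j
            overall_share = product_totals[i] / grand_total     # share of i in the whole economy
            row.append(local_share / overall_share)
        rca.append(row)
    return rca


# Hypothetical exports: 2 products (rows) by 2 countries (columns).
exports = [[8.0, 2.0],
           [2.0, 8.0]]
rca = rca_matrix(exports)
# Binary specialization flags: 1 when RCA >= 1, 0 otherwise.
specialized = [[1 if value >= 1 else 0 for value in row] for row in rca]
```

In this toy table each country is flagged as specialized only in the product that dominates its own export basket.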
Based on this, Hidalgo and Hausmann (2009) propose a methodology for measuring the internal productive capacities of economies, which explains the differences in growth between countries. They analyze foreign trade data using a bipartite network approach, where countries are connected to the products they export, allowing the measurement of the complexity level of the capabilities concentrated in these economies. This measurement is based on two indicators that quantify the sophistication of products and the diversification of countries. Formally:
$$ D_{j} = \sum_{i} M_{i,j} \tag{2} $$
$$ U_{i} = \sum_{j} M_{i,j} \tag{3} $$
The number of goods exported with RCA serves as an indicator of the diversification of countries ($D_{j}$), while the number of countries that export a particular product with RCA reflects the ubiquity of that product ($U_{i}$). In this framework, complexity is measured based on both diversification and ubiquity. Hence, a complex country is characterized by high diversification and by exporting products of low ubiquity. The binary matrix ($M_{i,j}$) is used to represent the sectors in which countries possess RCA, taking the value of 1 when RCA exists and 0 otherwise.
To summarize the complexity of countries and products, Hidalgo and Hausmann (2009) employ iterated combinations of the two indicators. These combinations are designed to weigh the characteristics when one criterion alone is insufficient to determine high or low complexity. For instance, countries with high diversification but concentrated in the production of highly ubiquitous goods are considered less complex. Similarly, products that are not ubiquitous but produced by countries with limited diversification are also deemed less complex.
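The iterated combinations mentioned above are often called the method of reflections. The sketch below is a minimal pure-Python version under stated assumptions: the binary matrix is indexed as `m[i][j]` (product i, country j), every country has at least one specialization and every product at least one exporter, and the function name `reflections` and the toy matrix are hypothetical.

```python
def reflections(m, n_iter):
    """Iterate diversification and ubiquity averages on a binary matrix m[i][j]."""
    n_products, n_countries = len(m), len(m[0])
    # Order-zero indicators: diversification of countries, ubiquity of products.
    diversification = [sum(m[i][j] for i in range(n_products)) for j in range(n_countries)]
    ubiquity = [sum(m[i]) for i in range(n_products)]
    k_country, k_product = diversification[:], ubiquity[:]
    for _ in range(n_iter):
        # Each country takes the average of the previous product values it is specialized in.
        new_country = [
            sum(m[i][j] * k_product[i] for i in range(n_products)) / diversification[j]
            for j in range(n_countries)
        ]
        # Each product takes the average of the previous values of its exporters.
        new_product = [
            sum(m[i][j] * k_country[j] for j in range(n_countries)) / ubiquity[i]
            for i in range(n_products)
        ]
        k_country, k_product = new_country, new_product
    return diversification, ubiquity, k_country, k_product


# Hypothetical matrix: rows are products, columns are countries A, B, C.
m = [[1, 0, 0],
     [1, 1, 0],
     [1, 1, 1]]
div, ubi, k_country_1, k_product_1 = reflections(m, 1)
```

After one iteration, country A (three specializations) exports products with an average ubiquity of 2.0, while country C (one specialization) exports a product with ubiquity 3.0, which is the kind of weighing the paragraph above describes.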
Therefore, the complexity measure depends on the eigenvector of the iteration matrix associated with its second largest eigenvalue, which captures most of the variance in the original data. Consequently, the ECI and the product complexity index (PCI) are derived from the same operations in opposite ways and are formally defined as [1]:
$$ ECI = \frac{K - \langle K \rangle}{\mathrm{stdev}(K)} \tag{4} $$
$$ PCI = \frac{Q - \langle Q \rangle}{\mathrm{stdev}(Q)} \tag{5} $$
where K and Q are the eigenvectors associated with the second largest eigenvalue, the operator $\langle \cdot \rangle$ denotes the mean, and stdev represents the standard deviation.
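For reference, the iteration matrices behind K and Q can be written out explicitly. The block below follows the standard presentation of the economic complexity approach rather than anything stated in this article, and the symbols $\tilde{M}$ and $\hat{M}$ are introduced here only for illustration; $D_j$, $U_i$ and $M_{i,j}$ are the diversification, ubiquity and binary specialization matrix defined earlier.

```latex
% Country-by-country matrix: K is its eigenvector for the second largest eigenvalue.
\tilde{M}_{j,j'} = \sum_{i} \frac{M_{i,j}\, M_{i,j'}}{D_{j}\, U_{i}}

% Product-by-product matrix: Q is its eigenvector for the second largest eigenvalue.
\hat{M}_{i,i'} = \sum_{j} \frac{M_{i,j}\, M_{i',j}}{U_{i}\, D_{j}}
```

The eigenvector associated with the largest eigenvalue of each matrix is a vector of identical entries and therefore carries no information, which is why the second one is used.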
3.2 Relatedness measure
Hidalgo et al. (2007) introduced the concept of proximity, which plays a crucial role in understanding the relationship between economic activities. In addition, this concept also extends to measuring the proximity or distance between the productive structures of different locations and specific goods. The framework developed by Hidalgo et al. (2007) centers around quantifying relatedness by examining the likelihood of two products being exported together by countries. If two goods are frequently co-exported, it implies that they share common production capabilities and are therefore related. The proximity between each pair of products is then determined by taking the minimum value among the pairwise conditional probabilities of locations that competitively produce product i, given that they also competitively produce product f. Formally, this concept can be expressed as follows:
$$ \phi_{i,f} = \min \left\{ P(M_{i,j} = 1 \mid M_{f,j} = 1),\; P(M_{f,j} = 1 \mid M_{i,j} = 1) \right\} \tag{6} $$
In this expression, for a location j:
$$ M_{i,j} = \begin{cases} 1 & \text{if } RCA_{i,j} \geq 1 \\ 0 & \text{otherwise} \end{cases} \tag{7} $$
However, proximity reveals relatedness only between products. To understand the influence of relatedness in the process of productive diversification in countries and regions, an indicator capable of measuring the distance between the portfolio of an economy and a given product is needed. This indicator was also formulated by Hidalgo et al. (2007) and is called density. Density measures the distance between a given good and the productive portfolio of a location. This index also represents the difficulty for a location to specialize in a sector since the further away the product is from the local portfolio, the lower the chances of having common capabilities for its production. Therefore, we call this indicator relatedness density, as defined by Boschma et al. (2014):
$$ d_{i,j} = \frac{\sum_{f \neq i} \phi_{i,f}\, M_{f,j}}{\sum_{f \neq i} \phi_{i,f}} \times 100 \tag{8} $$
Equation (8) demonstrates the relatedness density between a product i and the productive structure of a given country j. The indicator is the sum of the proximities between product i and the other goods in which country j has RCA, weighted by the sum of the proximities between this product and all other goods. It represents the weighted proportion of goods related to product i that are competitively produced by country j. This indicator varies between 0% and 100%. A density of 0% for a given good i and country j means that there are no other related products in that country's portfolio. On the other hand, a density of 100% means that all goods related to product i are competitively produced by country j.
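The proximity and relatedness density indicators described in this subsection can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions: conditional probabilities are estimated as simple frequencies across locations, the binary matrix is indexed as `m[i][j]` (product i, location j), and the function names and the toy matrix are hypothetical.

```python
def proximity(m):
    """phi[i][f]: minimum of the two conditional co-specialization frequencies."""
    n = len(m)
    ubiquity = [sum(row) for row in m]
    phi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for f in range(n):
            if i == f:
                continue
            both = sum(a * b for a, b in zip(m[i], m[f]))  # locations specialized in both
            # min{both/ubiquity_f, both/ubiquity_i} equals both / max(ubiquity_i, ubiquity_f)
            largest = max(ubiquity[i], ubiquity[f])
            phi[i][f] = both / largest if largest else 0.0
    return phi


def relatedness_density(m, phi, i, j):
    """Percentage of the proximity mass around product i already held by location j."""
    n = len(m)
    total = sum(phi[i][f] for f in range(n) if f != i)
    held = sum(phi[i][f] * m[f][j] for f in range(n) if f != i)
    return 100.0 * held / total if total else 0.0


# Hypothetical matrix: rows are products 0..2, columns are locations A, B, C.
m = [[1, 0, 0],
     [1, 1, 0],
     [1, 1, 1]]
phi = proximity(m)
density_b = relatedness_density(m, phi, 0, 1)  # product 0 seen from location B
density_c = relatedness_density(m, phi, 0, 2)  # product 0 seen from location C
```

Location B already holds both products related to product 0, so its density is at the upper bound, whereas location C holds only the more weakly related one and sits well below it.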

3.3 Data
To assess the diversification process of Brazilian microregions, our main data source will be employment data in economic-productive activities. Employment data has been widely utilized for subnational analyses as it provides more up-to-date information, covers the entire territorial dimension and offers a high level of specification (Freitas, 2019; Françoso et al., 2022; Romero et al., 2022). Unlike foreign trade data, which was utilized by Hidalgo et al. (2007) and Hidalgo and Hausmann (2009), employment data is more suitable for regional analysis in Brazil due to the large number of municipalities that do not engage in exporting or importing activities, thereby lacking relevant information. Moreover, considering the significant weight of the domestic market in the Brazilian economy, employment data provides a comprehensive perspective. Therefore, in order to measure the aforementioned indicators, we utilized the LQ instead of the RCA and adapted the concept of proximity based on the co-location of productive activities among Brazilian microregions. Formally, expressions (1) and (6) are constructed as follows:
$$ LQ_{i,j} = \frac{E_{i,j} \big/ \sum_{i} E_{i,j}}{\sum_{j} E_{i,j} \big/ \sum_{i} \sum_{j} E_{i,j}} \tag{9} $$
$$ \phi_{i,f} = \min \left\{ P(LQ_{i,j} \geq 1 \mid LQ_{f,j} \geq 1),\; P(LQ_{f,j} \geq 1 \mid LQ_{i,j} \geq 1) \right\} \tag{10} $$
In equation (9), the quantity exported is replaced by employment ($E_{i,j}$) in productive activity i and microregion j. In equation (10), proximity is calculated based on the concept of co-location between activities, which is determined by the number of locations where both activities are competitively and jointly produced. Thus, the proximity between activities i and f is defined as the minimum conditional probability that the LQ is at least 1 in one activity, given that the LQ is at least 1 in the other activity. This adaptation enables the measurement of indicators for Brazilian microregions and, consequently, facilitates the analysis of the diversification proposed in this paper.
The main data source for this study is the annual social information report (RAIS), organized by the Ministry of Labor and Employment.This database contains mandatory administrative records for formal employment establishments in Brazil.From the RAIS, we extracted information on the number of formal jobs by municipality (and region) and by economic activity sector.Economic activities were grouped based on the 6-digit class of the National Classification of Economic Activities (CNAE) proposed by the Brazilian Institute of Geography and Statistics (IBGE).The chosen territorial unit for analysis is the microregions [2] (IBGE, 1990).This approach allowed us to organize a database comprising 558 Brazilian micro-regions and 670 productive activities classified according to the CNAE class.Furthermore, it is essential to note that the results to be presented must be interpreted with the awareness that the data exclusively reflects the formal job market.This limitation is a crucial consideration for future research to address, since estimates indicate that about 40% of the employment in Brazil is informal.

Econometric specifications
The model specification follows the pattern tested by Neffke et al. (2011) and replicated in several other articles. Hence, we opted to assess the impact of relatedness density and sector complexity on the probability of entry and exit of productive activities in the portfolios of Brazilian regions. To do so, we will use binary dependent variables that indicate the occurrence of entry or exit for a specific activity in a given region.
The variable Entry is defined as 1 if a micro-region j is not specialized in economic activity i at time t (LQ < 1), but becomes specialized at t + 5 (LQ ≥ 1). It takes the value 0 when the microregion was not specialized at time t and also does not become specialized at time t + 5. So, it considers only the subset of activities that were not competitively produced by the micro-regions at time t (LQ < 1). On the other hand, the variable Exit follows the opposite logic. It is assigned the value 1 when microregion j was specialized in activity i at time t (LQ ≥ 1) but ceases to be specialized at time t + 5 (LQ < 1). It is assigned the value 0 when the microregion was specialized at time t and continues to be specialized at time t + 5. Therefore, for the "Exit" variable, the observations are limited to cases where the activity was competitively produced at time t (LQ ≥ 1). Formally, the definitions are as follows:

$$
\text{Entry}_{j,i,t+5} =
\begin{cases}
1 & \text{if } LQ_{j,i,t} < 1 \text{ and } LQ_{j,i,t+5} \geq 1 \\
0 & \text{if } LQ_{j,i,t} < 1 \text{ and } LQ_{j,i,t+5} < 1
\end{cases}
$$

$$
\text{Exit}_{j,i,t+5} =
\begin{cases}
1 & \text{if } LQ_{j,i,t} \geq 1 \text{ and } LQ_{j,i,t+5} < 1 \\
0 & \text{if } LQ_{j,i,t} \geq 1 \text{ and } LQ_{j,i,t+5} \geq 1
\end{cases}
$$

The choice of 5-year intervals follows the approach adopted by Boschma et al. (2014, 2015) and Freitas (2019). We selected two 5-year periods between 2009 and 2019 (2009-2014 and 2014-2019), resulting in a balanced and complete panel with 1,121,580 observations. However, for the estimation of the entry and exit models, the panel is further reduced. For the dummy variable Entry, only activities that have the potential to enter the portfolio in the subsequent period are considered. This means that the LQ must be less than 1 in the initial periods (2009 or 2014). As a result, the subsample used for the entry model comprises 647,801 observations. For the Exit dummy variable, we consider only activities that could potentially leave the portfolio of microregions in the following period (LQ ≥ 1 in 2009 or 2014), resulting in a subsample of 99,919 observations. The models are specified in equations (13) and (14), where $\text{Regions}_{j,t-5}$ is the vector of variables used to control for observable characteristics that vary over time in Brazilian microregions. Similarly, $\text{Activities}_{i,t-5}$ is a vector of variables that summarize the characteristics of productive activities. These variables are presented in Table 1. The fixed effects are represented by $\phi_j$ for regions and $\psi_i$ for activities. Finally, $\varepsilon_{j,i,t}$ represents the residuals.
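The construction of the two dummies and the sample sizes reported above can be checked with a short Python sketch. The helper name `entry_exit` and the use of `None` to mark observations outside a model's subsample are assumptions made for illustration. The arithmetic suggests that the full panel covers the three reference years (2009, 2014 and 2019), an inference from the reported totals rather than a statement from the text.

```python
from typing import Optional, Tuple


def entry_exit(lq_t: float, lq_t5: float) -> Tuple[Optional[int], Optional[int]]:
    """Return (entry, exit) for one region-activity pair over a 5-year window.

    None marks an observation that is outside the corresponding subsample.
    """
    if lq_t < 1.0:
        # Not specialized at t: eligible for entry only.
        return (1 if lq_t5 >= 1.0 else 0, None)
    # Specialized at t: eligible for exit only.
    return (None, 1 if lq_t5 < 1.0 else 0)


# Sample-size arithmetic using the figures reported in the text.
REGIONS, ACTIVITIES = 558, 670
cells = REGIONS * ACTIVITIES      # region-activity pairs per year
full_panel = cells * 3            # three reference years (inferred)
two_windows = cells * 2           # 2009-2014 and 2014-2019

print(full_panel)                 # 1121580
print(two_windows)                # 747720, which equals 647801 + 99919
```

Every region-activity pair in each window falls into exactly one of the two subsamples, which is why the entry and exit samples add up to the two-window total.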
The main independent variables for the analyses conducted in this article are relatedness density and PCI. However, we also include additional variables to control for specific characteristics of regions and activities. Table 1 summarizes the control variables used.
In addition to controlling for structural characteristics such as per capita GDP, population, human capital, productivity, presence of entrepreneurial incentives and average sector size, we also include variables to capture the effects of local economy diversity (Diversity) and sector spatial concentration (coefficient of localization (CL)). The inclusion of the diversity variable aligns with the literature that aims to understand the influence of a diverse local economy in attracting new industries (Glaeser, Kallal, Scheinkman, & Shleifer, 1992; Henderson, Kuncoro, & Turner, 1995). Furthermore, the CL is used to account for the inherent difficulty of attracting or losing a particular sector. It is assumed that a higher spatial concentration of an activity indicates a lower probability of entry or exit.
Finally, it is important to note that regressions (13) and (14) will be estimated taking into account the complexity of each region. The micro-regions will be divided into four complexity groups, which are expected to exhibit different diversification processes according to the hypotheses. The classification criteria used will be explained and discussed in the section below.
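Read together with the variable definitions above, the entry regression could take a form such as the one below. This is only a sketch: the coefficient names ($\beta_1$, $\beta_2$, $\gamma$, $\delta$) are assumptions, $\phi_j$ and $\psi_i$ stand for the region and activity fixed effects, and the outcome is dated at the end of each 5-year window with regressors measured at its start. The exit regression would replace Entry with Exit and use the subsample of activities with LQ ≥ 1 at the start of the window. Under logit or probit, the right-hand side would enter through the corresponding link function.

```latex
\text{Entry}_{j,i,t} = \beta_1\,\text{Density}_{j,i,t-5} + \beta_2\,\text{PCI}_{i,t-5}
  + \gamma'\,\text{Regions}_{j,t-5} + \delta'\,\text{Activities}_{i,t-5}
  + \phi_j + \psi_i + \varepsilon_{j,i,t}
```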

Complexity groups
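The hypotheses of this study will be tested by grouping the regions according to their level of complexity. The stratification will be based on the ECI value, resulting in four distinct groups of microregions:
(1) Low complexity: microregions with an ECI up to 0.25.
(2) Medium-low complexity: microregions with an ECI between 0.25 and 0.50.
(3) Medium-high complexity: microregions with an ECI between 0.50 and 0.75.
(4) High complexity: microregions with an ECI above 0.75.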
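The stratification can be sketched as a simple threshold rule. The cut-points 0.25, 0.50 and 0.75 are inferred from the ranges discussed for each group in the text, and the treatment of values exactly on a boundary is an assumption; the function name is hypothetical.

```python
def complexity_group(eci: float) -> str:
    """Assign a microregion to a complexity group from its ECI value.

    Cut-points are inferred from the text; boundary values are assumed
    to belong to the lower group ("up to" is read as inclusive).
    """
    if eci <= 0.25:
        return "low"
    if eci <= 0.50:
        return "medium-low"
    if eci <= 0.75:
        return "medium-high"
    return "high"


for value in (0.10, 0.40, 0.60, 0.90):
    print(value, complexity_group(value))
```

Fixed cut-points of this kind produce groups of unequal size, in contrast to quantile-based groupings.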

The existing literature has not yet converged on a single approach to classifying regions based on their complexity levels. Prevailing contributions often distinguish regions according to the distribution of the ECI, as in the studies by Freitas (2019) and Françoso et al. (2022). However, this method frequently groups together regions with significantly divergent complexity levels. To address this concern, we have adopted a strategy that differentiates regions based on their specific index values. While this approach may yield groups with varying numbers of regions, it ensures a more equitable representation of complexity levels across these groups. It is essential to note, however, that this strategy relies on arbitrary cut-off values for the ECI.
Figure 1a, b and c show the distribution of microregions across these complexity groups for the reference years of analysis. The configuration of the groups appears to be consistent and stable across all years. Microregions with high and medium-high complexity are primarily concentrated in the South and Southeast regions, particularly around major urban centers. Microregions with low and medium-low complexity are located in more inland regions as well as in the North, Northeast and Midwest regions.
Figure 1d and e illustrate the distribution of ECI within each complexity group, with separate graphs for entry models and exit models. As mentioned earlier, the reason for this separation is that these models represent different sets of activities. The distribution within each group reveals distinct patterns. Among microregions with low complexity, the ECI values are predominantly concentrated around 0.25, with some outliers towards the lower end of the range. For microregions with medium-low complexity, the distribution tends to be closer to 0.50. Similarly, microregions with medium-high complexity also show a distribution closer to 0.50, indicating concentration towards the lower values of the range. Finally, highly complex microregions tend to have ECI values close to the lower limit of the range, around 0.75.
The visualization of the S-shaped curve, representing the relationship between region complexity and proximity to new complex activities, is crucial for testing the hypotheses in this study. Figure 1f presents this curve for Brazilian microregions, which exhibits a distinct configuration compared to European regions (Pinheiro et al., 2022) and other countries (Hartmann, Bezerra, Lodolo, & Pinheiro, 2020). Unlike the European context, Brazil's regional development features two primary stages. The first stage includes regions where an increase in complexity does not significantly reduce the distance from less complex productive structures (ECI between 0 and 0.50). The second stage comprises intermediate regions, where even small increments in complexity lead to substantial increases in proximity to complex activities (ECI between 0.50 and 1.00). This nuanced pattern highlights the emphasis on the intermediate stage of development in Brazil, where a significant number of complex microregions exhibit varying levels of proximity to new complex activities.
Finally, to complete the analysis, it is necessary to evaluate the dynamics of entry and exit of firms according to the previously defined complexity groups. Following Boschma et al. (2015), Figure 1g and h show, respectively, the entry and exit rates of activities in the portfolio of microregions according to the average density in each of them. In addition, the color of the dots identifies the group to which each region belongs. Figure 1g shows that the entry rate is well correlated with the average relatedness of the region, except that the groups of greater complexity present a greater dispersion around the line. On the other hand, for activity exit rates (Figure 1h), there is a negative relationship with the average density of the regions, but a weaker correlation, mainly due to the less complex groups.

Econometric tests
Figure 2 illustrates the strength of the correlation among the dependent and independent variables used in the models. Correlations are in most cases positive, indicating some degree of relationship, although they are generally weak. The strongest correlations are observed between diversity and relatedness (0.82) and between regional productivity and GDP per capita (0.65). The former can be explained by the fact that more diversified economies possess a range of capabilities that facilitate entry into new sectors. In the latter case, it is assumed that wealthier regions have higher average worker salaries, which serves as a proxy for productivity. Importantly, there is no significant collinearity among the regressors that would impede the estimation of the models.
Table 2 presents the construction of the final model, which is estimated by differentiating the regions into complexity groups. The table consists of six estimates: regression (1) measures the influence of only the main variables (relatedness density and PCI); regression (2) considers only variables controlling observable characteristics of microregions; regression (3) considers only control variables for activities; and regressions (4), (5) and (6) are the final models specified in equation (13). Table 3, in turn, presents the results of the final model considering the groups of microregions by complexity. Since the estimates using the three estimation strategies (OLS, logit and probit) are similar and consistent with each other, we chose to focus on presenting the results of the logit model in this section for simplicity [3]. However, the corresponding estimates using OLS and probit can be found in the Annex, which ensures the robustness of the logit model. Tables 4 and 5 follow the same approach, but for the dependent variable Exit.
The results presented in Table 2 support a consistent narrative. While the magnitude of the coefficients cannot be evaluated for the reasons discussed earlier, comparing models (4), (5) and (6) reveals that the direction of the effects of the independent variables is consistent across all three specifications, and the same variables remain significant. Consequently, the hypothesis of the principle of relatedness (Hidalgo et al., 2018) is validated, indicating a positive relationship between relatedness density and the probability of entry. Therefore, acquiring a diverse set of skills that enhances the micro-regions' capacity to specialize in other activities is crucial for regional economic diversification. As mentioned, this confirmation is also present in previous studies on Brazil. However, the analysis of the PCI's role introduces complications, as, on average, sectors with higher complexity are less likely to become specialized within a region. This suggests that the process of accumulating skills and diversifying economic activities is not straightforward. To delve deeper into this issue, further analysis considering the complexity level of microregions is warranted. This dynamic, although cited, was not the primary focus of Freitas (2019) and Françoso et al. (2022) and will be our main contribution.
Table 3, in turn, represents the first part of the main contribution of this article. Previous studies either do not achieve such a level of disaggregation when differentiating regions by complexity (Françoso et al., 2022) or focus solely on the most complex regions (Freitas, 2019). The segmented analysis based on the complexity level of regions reveals the inherent inequality in the diversification process of Brazilian microregions. Across all three estimation strategies, the relatedness density consistently shows a significant and positive effect on the probability of a new activity entering the local productive structures, irrespective of the region's complexity level (Hypothesis 1). However, the complexity of regional portfolios differentiates the influence of PCI on the probability of new sector emergence. In the less complex groups (low and medium-low), an increase in PCI negatively affects the likelihood of a particular activity specializing in these regions. The medium-high complexity group appears to be in a transitional position, with varying coefficient signs across models and without statistical significance. In contrast, the microregions with high complexity (high) exhibit positive and significant coefficients for PCI in all models. This indicates that the impact of activity complexity on the probability of new sector entry is reversed, becoming positive. This pattern reinforces the thesis that relatedness is good news only for some, as only the most complex regions possess capabilities that enable production in new, more complex sectors.
In quantitative terms, the results also demonstrate economic significance. Since we cannot interpret the coefficients directly as in OLS, the values enclosed in square brackets in Table 3 illustrate the impact of relatedness density and PCI on the probability of entry through the average marginal effects. A 0.1 increase in relatedness density corresponds to an approximately 4-5% increase in the probability of entry for low and medium-low complexity regions, an 8% increase for medium-high, and a 13% increase for high. As for the PCI variable, which is central to our argument, a 0.1 increase in the indicator results in a decrease of 1.2-1.4% in the probability of entry for regions with low and medium-low complexity, and an increase of 0.6% for regions with high complexity.
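For readers less familiar with average marginal effects, the following Python sketch shows how such a value is obtained for a continuous regressor in a logit model: the coefficient is multiplied by p(1 − p) for each observation and the result is averaged. The coefficient and linear predictors used here are made-up numbers, not values from Table 3, and the function names are assumptions.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(beta_k: float, linear_predictors: Sequence[float]) -> float:
    """Average over observations of beta_k * p * (1 - p) for a logit model."""
    effects = [beta_k * sigmoid(z) * (1.0 - sigmoid(z)) for z in linear_predictors]
    return sum(effects) / len(effects)


# Hypothetical example: coefficient of 2.0 and three fitted linear predictors.
ame = average_marginal_effect(2.0, [-1.0, 0.0, 1.0])

# First-order change in probability for a 0.1 increase in the regressor.
print(round(ame, 4), round(0.1 * ame, 4))
```

Scaling the average marginal effect by 0.1 gives the approximate change in probability associated with a 0.1 increase in the regressor, which is how effects of this kind are usually read.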
Tables 4 and 5 present the same estimates for assessing the probability of activities exiting. Comparing models (4), (5) and (6) in Table 4, the coefficients maintain the same sign, and the significant variables remain the same across the models. Once again, the principle of relatedness is supported, as the effect of relatedness density is consistently negative and significant in all estimates. This implies that a higher density decreases the likelihood of the activity ceasing to be specialized in the region. Conversely, complexity exerts a contrasting force, increasing the probability of an activity leaving. This finding aligns with the results found by Freitas (2019). The control variables exhibit effects opposite to those in the entry models, and generally, the same variables remain significant.
However, what is crucial here is to understand the impact of these variables while considering the differentiation of the complexity level among micro-regions. This represents the second part of our contribution, as it involves an analysis that is missing in previous works. Freitas (2019) examines the probability of exit only for the top 25% most complex regions, while Françoso et al. (2022) do not assess the removal of activities from the local portfolio. Once again, the results demonstrate an uneven diversification process, as the same pattern emerges in all three estimates. Relatedness density plays a role in reducing the probability of activity exit, regardless of the level of complexity. Nonetheless, the complexity of sectors is pivotal in increasing the likelihood of exiting the local portfolio, particularly in less complex groups (low and medium-low). On the other hand, as the complexity of microregions increases, the effect of PCI diminishes and becomes insignificant. This phenomenon highlights that less complex microregions lack the necessary skills, knowledge and capabilities to sustain complex activities within their structure, while more complex regions do not face the same challenge.
In terms of quantitative interpretation, the coefficients reveal a more pronounced effect of the main variables on the probability of exiting activities compared to the entry models. According to Table 5, a 0.1 increase in relatedness density results in approximately a 21% decrease in the probability of activities leaving low-complexity local structures. This reduction is 22% for medium-low and 17% for medium-high and high complexity microregions. On the other hand, the PCI has a greater impact on the probability of activity exit than on the entry of new sectors. Among low-complexity groups, a 0.1 increase in the PCI translates to a 14% increase in the exit probability for low regions and a 5% increase for medium-low regions. However, for more complex regions, the effect diminishes to 1.0% for medium-high and becomes insignificant for high complexity micro-regions.
The econometric tests conducted in this paper validate the hypotheses proposed. Across all estimated models, relatedness density strongly promotes the entry of new sectors into local structures and simultaneously acts as a significant factor in preventing the exit of existing sectors (supporting Hypotheses 1 and 2). While these results have been empirically demonstrated before, this work contributes by delving deeper into the study within the context of activity and regional complexity that had not yet been fully studied (Hypotheses 2 and 3). Furthermore, the results reveal that the complexity of a sector reduces the probability of its specialization in less complex regions while increasing it in more complex regions. Conversely, sectoral complexity significantly raises the likelihood of a sector leaving less complex local structures but has no significant effect in high complexity regions. These findings provide evidence for Hypotheses 3 and 4, supporting the notion that the process of related diversification is inherently uneven.
It is also worth noting that complexity has a stronger impact on the probability of sectoral exits compared to its effect on attracting new activities. Figure 3 summarizes the percentage changes in the probability of entry and exit due to a 0.1 increase in the PCI of the targeted activity. The greater effect on exit probability at lower levels of regional complexity indicates that unrelated diversification is very risky for low complexity regions. Moreover, this also means that it is much harder to keep complex activities in the least developed regions than to attract them. This finding has important policy implications, stressing the importance of additional policies alongside the usual investment attraction strategies.
Figure 3 also demonstrates that the effect of entering activities with higher complexity becomes weaker in regions at the medium-high complexity level. This suggests that this is the least risky development level at which to pursue unrelated diversification. The results also show that medium-low complexity regions could be the target of hybrid strategies, since the exit risk falls considerably relative to low complexity regions.
Furthermore, the econometric results reinforce the interpretation discussed in Figure 1f, which illustrates an exponential curve. The figure helps us understand that as regional complexity increases, there is a closer proximity to more complex sectors, but only among those with an ECI of at least 0.50. Attracting more complex sectors to less complex regions is indeed a challenging task. This is why the PCI has a negative impact on the probability of entry for new sectors and a notably positive impact on the probability of exit among low and medium-low regions. Conversely, in regions of high complexity, the PCI increases the probability of a given activity becoming specialized and does not significantly affect the probability of exit. This is because the distance from more complex sectors is much smaller in high complexity regions and decreases exponentially as regional complexity increases. The steep slope of the curve in this group indicates that the higher the regional complexity, the greater the proximity gains to more complex sectors.

Concluding remarks
This article introduced new elements for understanding the inherent inequality in the diversification process. An analysis under the prism of economic complexity highlights the challenges that less complex regions face in diversifying their economies, as well as the ability of more complex regions to maintain a productive structure that attracts and retains more complex activities.
The descriptive analysis of formal employment data from 558 Brazilian microregions yielded significant findings. Firstly, complexity in Brazil is regionally concentrated, with the highest rates observed in regions containing state capitals. Furthermore, in contrast to its application in other regions, the S-shaped curve actually takes on an exponential form for Brazilian microregions. Instead of development occurring in two distinct stages, as argued by Pinheiro et al. (2022), the composition in Brazil consists of one extreme stage represented by less complex regions and a transitional stage represented by more complex regions. While the former are trapped in a cycle where increases in complexity do not result in a closer proximity to more complex products, the latter find that even small increases in regional complexity can lead to significant advancements in proximity to more complex sectors. In a context where development is heavily influenced by path dependence, regional polarization tends to escalate rapidly, exacerbating existing disparities.
The econometric tests confirmed the hypothesis of the principle of relatedness, while indicating the influence of sectoral complexity on diversification, considering the differentiation of regions based on their complexity levels. The regions were grouped to examine the effect of relatedness density and PCI on the probability of entry or exit of sectors from local portfolios. Consistent with expectations, all estimated models demonstrated a positive relationship between density and the probability of entry, as well as a negative relationship between density and the probability of exit. However, the impact of PCI has proven to be crucial to understand. In highly complex regions, an increase in sectoral complexity raised the probability of new sector entry, while having minimal impact on the probability of sector exit. Conversely, in less complex regions, greater sector complexity was found to decrease the probability of entry and significantly contribute to sector exit.
In this regard, this paper addresses a significant gap in the existing literature on the subject. Prior studies primarily focused on evaluating hypotheses related to the principle of relatedness, while giving less attention to the influences of sectoral and regional complexity. Previously, there was no disaggregated differentiation of regions based on complexity, and consequently, there was a lack of a comprehensive analysis of the combined influence of relatedness density and PCI on the probability of sector entry and exit in the local portfolio. Therefore, the main contribution was to propose an assessment of regional diversification in Brazil with a focus on complexity, by introducing a more detailed differentiation of regions based on their complexity levels and examining the impact of sectoral complexity on the diversification patterns within each group.
The findings emphasize that the loss of complex activities poses a primary challenge for less complex regions. This dynamic restricts their diversification opportunities, widening the gap with more complex regions. These findings suggest that diversification into more complex (unrelated) activities is very risky for low complexity regions. Moreover, the effect of entering activities with higher complexity becomes weaker in regions at the medium-high complexity level, indicating that this is the least risky development level at which to pursue unrelated diversification, while medium-low complexity regions could be the target of hybrid strategies, since the exit risk falls considerably relative to low complexity regions.
However, the discussion does not stop here. There is still a limited amount of research in the literature that aims to evaluate relatedness from this perspective. While it has been established why related diversification is good news only for some regions, further questions arise to delve deeper into this analysis. How good or how bad is this news? What policies can effectively support the development of less complex regions? Furthermore, it is crucial to assess the results considering the methodological limitations of the research. The database used exclusively covers formal jobs, and the definition of the ECI complexity groups originated from arbitrary values. Future estimates should take into account the sizeable informal segment of the Brazilian labor market, further explore the categorization of regions based on complexity, and test sector entry and exit events adopting alternative criteria. Additionally, case studies would provide valuable insights into understanding the potential obstacles faced by low complexity regions and exploring the variations among high complexity regions that exhibit different levels of proximity to other complex sectors.
2. Microregions are geographic areas consisting of neighboring municipalities that share similarities in terms of spatial organization. These regions are characterized by specific features related to the agricultural, industrial, mineral extraction and fishing production structures, as established by IBGE (1990).
3. A further reason for presenting the logit model for the microregion groups is that the coefficients for PCI are slightly more significant.

Figure 3. Changes in the probabilities of entry and exit in response to a 0.1 increase in the complexity of the targeted activity

Table 1. Control variables
Note(s): a Municipal Basic Information Survey (MUNIC), conducted by IBGE; b Monthly Banking Statistics by Municipality (ESTBAN), provided by the Central Bank
Source(s): Own elaboration

Table 3. Emergence of new activities: logit models
Note(s): * p<0.1; ** p<0.05; *** p<0.01. Robust standard errors (clustered at the microregion and activity level) are in parentheses. Initial periods (t) are 2009 and 2014 and final periods (t + 5) are 2014 and 2019. The values in square brackets [ ] represent the average marginal effects
Source(s): Own elaboration

Table 5. Exit of activities: logit models