Patterns of inconsistency: a literature review of empirical studies on the multinationality – performance relationship

Purpose – This study aims to understand the performance implications of when a business internationalizes. Many managers take the performance implications of internationalization for granted. Whether seeking a broader customer base or cost reduction through cross-border outsourcing, the overwhelming belief is that internationalization leads to higher profits. Design/methodology/approach – This paper offers a systematic review, content analysis and cross-tabulation analysis of 115 empirical studies from over 40 major journals in management, strategy and international business between 1977 and 2021. Focusing on research settings, sample characteristics, underlying theoretical approaches, measurements of key variables and moderators influencing the multinationality and performance relationship, this study offers a detailed account of definitions and effects. Findings – The findings of this study suggest a tenuous connection between internationalization and performance. No strain of research literature conclusively identifies a consistent direct path from internationalization to performance. The context specificity of the relationship makes general declarations impossible. Research limitations/implications – Future researchers should recognize that internationalization is a process taking different forms, with no specific dominant form. General declarations are misleading. The focus should be on the process of internationalization rather than on the outcome. Originality/value – This study contributes to the international business literature by exploring reasons for the inconsistent results and lack of consensus. Through a detailed account of definitions and effects, this paper explores the lack of consensus as well as the identified shapes of the relationship.


Introduction
A substantial body of research in international business, strategy and general management is devoted to understanding firm internationalization. In this paper, understanding the performance implications of internationalization is of particular interest. Over the past half century, research on the relationship between firm multinationality and performance has been growing steadily, and given the increase in internationalization activities, it is seen as a seminal issue in strategic management (Hitt et al., 2006; Kirca et al., 2011). The term "multinationality" is frequently used to describe the spread of a firm's international activities and refers to the extent of value-adding activities conducted outside its home country (cf. Hitt et al., 2006; Lu and Beamish, 2004). In concrete terms, it is the extent of investment and/or control of assets and activities outside of the home market (Cantwell and Sanna-Randaccio, 1993; Teece, 1981). Multinationality measurements can be broadly divided into either scale or scope metrics (Rugman and Oh, 2011). There were only a few studies published prior to 1996 on the relationship between multinationality and performance, after which publication frequency increased dramatically.
Several theoretical perspectives, such as resource-based theory, internalization theory and organizational learning theory, offer explanations for the increased engagement in international activity. Two main arguments are that internationalization offers: increased strategic flexibility; and scale economies (Gaur et al., 2011).
In addition, international expansion is argued to enable firms to acquire cheaper resources, reduce capital costs and diversify operations geographically (Benito, 2015; Dunning, 1993; Sapienza et al., 2006). This, in turn, reduces risk and increases leverage. Together, these benefits are argued to have a positive effect on firm performance because they lower total costs and increase productivity (Yang and Driffield, 2012). However, the internationalization process also involves additional costs to a firm. International expansion generates a more complex and culturally diverse organization that is difficult to manage (Lu and Beamish, 2004). Early stages of the internationalization process are risky and carry high learning costs. Together, these costs have a negative effect on firm performance.
The contradictory outcomes of firm internationalization have triggered interest in explaining the multinationality and performance (M-P) relationship, yet despite the large body of empirical research, results are inconclusive. Authors have found strong support for a positive linear relationship (Grant, 1987; Kim et al., 1989; Kotabe et al., 2002), a negative linear relationship (Michel and Shaked, 1986; Powell, 2014; Singla and George, 2013), a U-shaped relationship (Capar and Kotabe, 2003; Contractor et al., 2007; Lu and Beamish, 2001), an inverted U-shaped relationship (Geringer et al., 1989; Hitt et al., 1997; Tallman and Li, 1996), an S-shaped relationship (Contractor et al., 2003; Lu and Beamish, 2004; Ruigrok et al., 2007), an M-shaped relationship (Almodóvar, 2012; Almodóvar and Rugman, 2014; Lee, 2010) and a W-shaped relationship (Almodóvar, 2012) [1]. Meanwhile, some studies argue that there is no systematic relationship at all (Hennart, 2007; Rugman et al., 2016). These inconclusive results suggest that we are far from reaching consensus on understanding the M-P relationship, and that the way forward might not be additional empirical studies on the subject, but rather a search for answers in the vast number of existing studies. Tallman and Pedersen (2012, p. 313) highlight that the topic of multinationality and performance is, "[O]ne of the mainstays of studies of multinational enterprises and their strategies yet they remain disappointed by the fact that the 'empirical results [in previous studies] have largely been disappointing, perplexing, and inconclusive'". Contractor et al. (2007) speak of previous findings as contradictory and Hennart (2007) calls them disappointing. The diversity in the results has been attributed to underlying theories (Wiersema and Bowen, 2011), measures (Rugman and Oh, 2010, p. 484; Verbeke and Forootan, 2012), sampling issues, availability of data or how the M-P relationship is moderated.
We suggest that one important step forward in finding possible explanations for the incongruent results lies within the vast number of existing studies and not in conducting yet another empirical study, as there are reasons to suspect that it would only be another study with inconclusive results. In this paper, we analyze almost half a century of M-P literature, searching for patterns in the empirical studies to possibly bring clarity into why the results diverge. Through a detailed account of definitions and effects, the paper explores reasons for inconsistent results and lack of consensus within and across research streams as well as in relation to the identified shapes of the relationship. Consequently, we question the dominant academic discourse in international business focused on finding support for a relationship between internationalization and performance outcomes. It may well be futile to continue on the same path, testing new measures and moderators in pursuit of an explanation.
The paper offers a systematic review and content analysis of the international business, strategy and general management literatures, analyzing 115 empirical studies from 42 major journals between 1977 and 2021, with focus on: research settings; measurements of key variables; underlying theoretical approaches; and moderators influencing the M-P relationship.
By providing a systematic overview of M-P studies in the fields of international business, strategy and general management, this literature review also differs from existing review articles (Annavarjula and Beldona, 2000; Li, 2007; Nguyen, 2017; Nguyen and Kim, 2020; Sullivan, 1994) in multiple ways. First, one major contribution is to summarize and present moderators used to study the relationship between multinationality and performance. This has implications for questioning the direction of the causal link between multinationality and performance. Second, it illustrates and critically discusses the influence that different research settings, measurements, theoretical assumptions and moderators have on the M-P relationship. Third, it encompasses the most relevant empirical studies published over the past 44 years (i.e. since the start of the Uppsala School of Internationalization), investigating key constructs, measures, samples, major findings and analytical methods, making it the most recent and most comprehensive review so far.

Research methodology
The starting point for the systematic literature review and content analysis was a Boolean search in the Web of Science and Business Source Premier databases for peer-reviewed articles, using the self-constructed search string [(multinational* OR international*) AND performance].
The search was limited to the publication period between 1977 and 2021, and to journals in the fields of international business, general management and strategy that were rated 2, 3 or 4 in the Chartered Association of Business Schools Academic Journal Guide 2015. This was followed by an issue by issue search in the same fields in all 61 journals to ensure that no articles were overlooked. Appendix 1 presents an overview of the selected journals, as well as an indication of initial hits and articles included in this literature review. Multinationality, internationality and performance are popular terms, especially within the international business literature and are often referred to or used for argumentation without defining or measuring the concepts. As the focus of this literature review is the relationship between the two concepts multinationality and performance, it is important that they were key concepts in the articles. As authors tend to mention their key concepts in the title, and to avoid an overly large and irrelevant sample of academic papers, the search was limited to the title of the article. This resulted in 491 articles. As some authors refer to multinationality or internationality as regional or geographic diversification, an additional Boolean search in both databases and an issue-by-issue search in the selected journals was done with the self-constructed search string [((region* OR geographic*) diversification) AND performance] and the same limitations. This resulted in 152 additional articles. Moreover, to capture the variety in vocabulary used to describe multinational firms, a third Boolean search in both databases and an issue-by-issue search in the selected journals was done with the self-constructed search string [(transnational* OR "born global*") AND performance], applying the same limitations as above. This resulted in 11 additional articles. 
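The logic of the title-restricted Boolean searches described above can be illustrated with simple pattern matching. The following is a minimal sketch, assuming that a trailing `*` acts as a truncation wildcard and that matching is case-insensitive; the function names and sample titles are hypothetical, and the actual query syntax of Web of Science and Business Source Premier may differ.

```python
import re

def term_pattern(term):
    """Compile one search term; a trailing '*' is treated as a truncation wildcard."""
    if term.endswith("*"):
        # e.g. 'multinational*' matches 'multinationality', 'multinationals', ...
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

def title_matches(title, any_of, all_of):
    """True if the title contains at least one `any_of` term AND every `all_of` term."""
    has_any = any(term_pattern(t).search(title) for t in any_of)
    has_all = all(term_pattern(t).search(title) for t in all_of)
    return has_any and has_all

# First search string from the text: (multinational* OR international*) AND performance
FIRST_SEARCH = (["multinational*", "international*"], ["performance"])

# Hypothetical titles, for illustration only (not taken from the sample)
titles = [
    "Multinationality and firm performance",
    "Internationalization and performance of SMEs",
    "Product diversification and performance",
    "International marketing strategy",
]
hits = [t for t in titles if title_matches(t, *FIRST_SEARCH)]
print(hits)
```

The second and third search strings could be expressed in the same way by swapping the term lists.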
As the search strings could overlap, all articles were downloaded into a citation management system and checked for duplicates. Duplicates were deleted, resulting in a sample of 654 unique scholarly articles.
The articles were screened against a set of predefined exclusion criteria. Following Sinkovics and Reuber (2021), a search protocol with a detailed account of the exclusion criteria can be found in Appendix 2. First, both multinationality and performance had to be key variables in the study, excluding those studies where, for example, one of the concepts was used as a control variable. Second, studies included in the literature review had to measure corporate performance, meaning that studies were excluded if they either measured different kinds of performance (such as corporate social performance or environmental performance) or used a unit of analysis other than the firm (e.g. subsidiary performance).
Third, studies had to undergo a qualitative assessment by the researcher about their relevance for the literature review. For example, a study by Jean et al. (2015) fulfilled the previous criteria, but focused its analysis on the customer-supplier relationship. Consequently, a number of studies could not be included in the final sample because: either multinationality or performance were used as a moderator or control variable (−16 articles); different kinds of performance were measured (−110 articles); performance was not measured on a corporate level (−36 articles); different kinds of diversification (e.g. product diversification or board diversification) were measured (−92 articles); and multinationality and/or performance were not a key variable (−261 articles).
As our focus was on the empirical findings, we limited our sample to only empirical papers. As a consequence, from the remaining 139 articles that fulfilled the requirements outlined above, conceptual papers[2] (−7 articles) and literature reviews[3] (−10 articles) were excluded. We also excluded meta-analyses[4] (−7 articles) for two reasons. First, the results of meta-analyses are based on largely the same empirical papers as are used for this literature review. Second, meta-analyses are highly criticized for investigating weakly defined and operationalized constructs that could lead to misleading results (Klein and Delery, 2012). Therefore, the final sample consists of 115 empirical studies. Table 1 provides an overview of the search results and exclusion criteria, and their effect on the final sample. Appendix 3 summarizes the 115 empirical articles in the final sample, highlighting their theoretical perspective, dependent and independent variables, moderators and the form of their relationship.
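The screening steps reported above can be retraced arithmetically. The short sketch below uses only the counts stated in the text; the variable names and category labels are illustrative shorthand rather than the authors' own coding.

```python
# Counts as reported in the text; labels are illustrative shorthand.
search_hits = {
    "(multinational* OR international*) AND performance": 491,
    "((region* OR geographic*) diversification) AND performance": 152,
    "(transnational* OR born global*) AND performance": 11,
}
# The three counts sum to the 654 unique articles reported after duplicate removal.
unique_articles = sum(search_hits.values())

relevance_exclusions = {
    "M or P used as moderator or control variable": 16,
    "different kinds of performance measured": 110,
    "performance not measured at corporate level": 36,
    "different kinds of diversification measured": 92,
    "M and/or P not a key variable": 261,
}
after_relevance = unique_articles - sum(relevance_exclusions.values())

type_exclusions = {
    "conceptual papers": 7,
    "literature reviews": 10,
    "meta-analyses": 7,
}
final_sample = after_relevance - sum(type_exclusions.values())

print(unique_articles, after_relevance, final_sample)  # 654 139 115
```

The intermediate value of 139 and the final value of 115 match the figures given in the text.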

CPOIB
Each article underwent a content analysis where information about different parameters was collected and coded categorically. In a first step, each article was given equal attention and coded descriptively and attributively (Saldaña, 2015, pp. 59-64). In a next step, the initial descriptive and attributive codes were categorized into clusters based on similar attributes. In a final step, the clusters were aggregated to a topical, descriptive level, and organized into main categories and subcategories. Table 2 shows the three levels of the categorization scheme. The categories included information about the underlying theoretical arguments and information about the sample and research context, for example, the region where the research was conducted, firm size and industry. Fundamental to understanding the relationship is to also understand how it has been measured. Thus, the categorical codes include different types of performance (e.g. accounting-based, market-based or operational performance) and their measures (e.g. return on assets, return on sales, return on equity, Tobin's Q), different types of multinationality (e.g. structural or financial measures, or index-based) and their measures (e.g. foreign sales to total sales, foreign assets to total assets, ratio of foreign to total employees, number of countries the firm has operations/subsidiaries in) and finally the shape of the identified relationship between multinationality and performance. The codes for the moderators (e.g. firm characteristics, home-country context or strategy) and their measures (e.g. firm size, firm age, family ownership, entry mode or cultural diversity) were derived descriptively and attributively in order to cover the full range of moderators applied to the M-P relationship literature. Table 3 provides an overview of the identified shapes of the M-P relationship by the year of the published articles.
It shows that, although there were some studies published earlier, it was during the late 1990s that the M-P relationship as a research topic became more and more popular. This can be explained by the general rise of globalization that triggered research projects associated with the performance outcomes of global activities. During the past 12 years, the research field grew even more, peaking with 11 publications in 2012. The identified shapes of the relationship, however, are scattered across the whole spectrum, leading to no clear pattern that could be associated with the year of publishing and the identified shape. In most recent years, a positive linear shape, along with an inverted U-shape and an S-shape, are the most frequently found relationships. Part of the explanation for this finding lies in the evolution of statistical analysis, which has allowed for more complex investigation of nonlinear relationships. This indicates that continued development of statistical methods might, also in the future, shape findings more than the factual relationship between multinationality and performance does. The content analysis presented in Table 4 shows a summary of the frequency of the coded categories, such as type of theory, cross-tabulated with the shapes of the relationships between multinationality and performance. To test whether there is an association between the identified relationships between multinationality and performance (including no relationship), and the theory used, the region, firm size, industry, measurement type for performance and multinationality, and type of moderator, we conducted a cross-tabulation analysis. Using the data from Table 4, we applied the chi-square test for independence to all possible 2 × 2 cross-tabulation tables. This tests for a statistically significant association between categories, for example, the type of theory and the form of the relationship between multinationality and performance.
No chi-square test indicated a statistically significant pattern between categories.
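To make the procedure concrete, a chi-square test for independence on a single 2 × 2 cross-tabulation can be sketched as follows. This is a minimal illustration and not the authors' own analysis: the function name is hypothetical, no continuity correction is applied, and the example counts are invented for demonstration and are not taken from Table 4.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value). No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function reduces to erfc
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, NOT taken from Table 4:
# rows = one theory vs. all other theories;
# columns = positive linear shape vs. any other shape
stat, p = chi_square_2x2(12, 18, 20, 35)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) would indicate no statistically significant association for that pair of categories. With small expected cell counts, which are likely in a sample of 115 studies split across many categories, an exact test may be a more suitable choice.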
Findings reveal a great variety of empirical studies investigating the M-P relationship. This can be observed in: different research settings; measurements of key variables; underlying theoretical approaches and identified shapes of the M-P relationship; and moderators influencing the M-P relationship.
All of these approaches contribute to diverse and inconsistent findings, thereby confounding the search for a unified theory for the relationship between multinationality and performance. Below, the diverse approaches are presented in more detail. They are contrasted with the outcomes presented in the papers to identify possible patterns in previous findings.

Research settings
Variety within the research setting is beneficial to the overall validity of findings. While the majority of studies still choose to focus on a single country as their research setting (81 studies), comparative studies that investigate and compare multiple countries have been on the rise, with a dramatic increase from three studies between 1988 and 1998 to 11 studies in 1999-2009, and 17 studies between 2010 and 2021. Yang and Driffield reported in 2012 that 42% of studies use a US sample, indicating an overrepresentation of US firms. Our results show 38.3% of empirical studies focus on US firms, 35.6% on European firms and 42.6% on Asian firms, indicating that since 2012 the research settings have become more balanced. Table 5 shows that the number of positive linear relationships and inverted-U shaped relationships is also quite evenly distributed between Asian, European and US firms.
Notes: Some papers have either found multiple different shapes or have not made a clear statement about the identified shape of the relationship. Therefore, the number of papers published per period does not match the total number of identified shapes per period. POS LIN = The paper has found a positive linear relationship between M and P; NEG LIN = The paper has found a negative linear relationship between M and P; U = The paper has found a U-shaped relationship between M and P; INV U = The paper has found an inverted U-shaped relationship between M and P; S = The paper has found an S-shaped relationship between M and P; M = The paper has found an M-shaped relationship between M and P; NONE = The paper has found no relationship between M and P

In total, 38 studies out of 115 explicitly state that they investigate emerging markets. Between 1988 and 1998, there was only one study with an emerging market setting. During the following decade there were 10 studies, and the decade after that there were 24. The most dominant identified shapes of the M-P relationship were positive linear (9 studies) and inverted-U shaped (8 studies). This indicates that, as in many other fields of research, emerging markets have become more and more relevant to the research setting and are likely to continue to grow in importance in the future. Overall, positive linear and inverted-U shaped relationships are the dominant forms throughout the different research settings. Nevertheless, no consistent linear or nonlinear pattern is observed for the M-P relationship when investigating different countries. Furthermore, there is no difference in papers focusing on single or multiple countries (see Table 4).

Sample characteristics
Concerning characteristics of the samples used in the empirical studies, 7% of the studies solely investigate small- and medium-sized firms, while 45% focus on large firms. As many large firms might be publicly listed, financial information is easier to obtain from their annual reports than for small- and medium-sized firms. This might explain an overrepresentation of large firms in previous empirical studies. Interestingly, 27% of the studies were not clear in reporting the size of the firm. Comparing firm size with the identified relationship shapes, no clear pattern can be observed. The category for large firms is the largest group in the sample and finds all the different relationships except for an M-shape. However, this may simply mean that none of these studies tested for the M-shape. Again, positive linear and inverted-U shaped relationships are the most commonly identified M-P relationships for empirical studies investigating large firms. Those studies that have not stated any firm size explicitly found an S-shaped relationship as the second most prominent relationship identified (after positive linear).
Concerning industry, there is a bias toward manufacturing firms. Forty-four studies solely consider manufacturing, whereas only 14 solely look at the service industry. Thirty-six are blended studies and 20 do not reveal the industry the study was investigating. Comparing the different shapes to the industries, no clear pattern is observed (see Table 6). All industries are represented in every category, except for the M-shaped relationship. Between 2010 and 2021, there were three published articles finding an M-shaped M-P relationship for manufacturing firms. When comparing the time-span of the samples in each of the empirical studies, no pattern emerges. As can be seen in Table 6, papers divided into long-term perspective (from 7 years up to 35 years) and short-term perspective (from 1 year up to 6 years) are quite homogeneously distributed. There is, though, a slight trend for long-term perspective studies to more frequently find an S-shaped relationship. This could be explained by the fact that longitudinal data are required to fully plot an S-shaped M-P relationship.

Underlying theories
Within the internationalization process literature, multiple theories have been applied to explain both the benefits and drawbacks of an increased degree of multinationality and its effect on performance. Although many studies apply different theories in an attempt to explain the assumed causal relationship between multinationality and performance, there are no conclusive results connected to the use of the underlying theory. However, certain trends can be observed. For example, it is not surprising that no study using the resource-based view found a negative linear relationship between M and P. Although the sample is quite small, the logic behind the resource-based view, advocating for benefits of internationalization stemming from the exploitation of firm strategic advantages, indicates a positive relationship. Finding a negative linear relationship would contradict the theory.
Economic theories, such as transaction-cost theory, mainly found a positive linear and an inverted-U shaped M-P relationship. Interestingly, only 1 out of 41 studies using an economic theory found no relationship at all. Table 7 provides a detailed account of the theories and the identified shapes of the M-P relationship over the years.

Measures of multinationality and performance
Findings related to the broad variety of measures used for both key variables are presented in Table 8. To capture the depth of the key variable Multinationality, it was split into structural, financial and index-based measurements. Financial measurements are the most dominant (64%), followed by structural (37%) and index-based measures (23%). The ratio of foreign sales to total sales is the key financial measure for multinationality, employed in 84% of the studies. The number of foreign subsidiaries is measured in 58% of the studies and is the leading measure for structural multinationality. For index-based measures, an entropy measure is most popular.
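To illustrate how two of these measures are typically computed, the sketch below calculates the foreign sales to total sales ratio and an entropy index. The entropy formulation shown (the sum of each region's sales share multiplied by the natural logarithm of its inverse) is one common variant and is assumed here; the reviewed studies may define regions, weights or the index itself differently. The function names and the example firm are hypothetical.

```python
import math

def fsts(foreign_sales, total_sales):
    """Foreign sales to total sales ratio (a financial measure of multinationality)."""
    if total_sales <= 0:
        raise ValueError("total_sales must be positive")
    return foreign_sales / total_sales

def entropy_index(sales_by_region):
    """Entropy of the sales distribution: sum of p_i * ln(1 / p_i) over regions.

    Equals 0 when all sales come from one region and ln(k) when sales are
    spread evenly over k regions. This formulation is an assumption.
    """
    total = sum(sales_by_region.values())
    if total <= 0:
        raise ValueError("total sales must be positive")
    # Regions with zero sales contribute nothing and are skipped
    shares = [v / total for v in sales_by_region.values() if v > 0]
    return sum(p * math.log(1 / p) for p in shares)

# Hypothetical firm, for illustration only
sales = {"home": 60.0, "region_a": 25.0, "region_b": 15.0}
print(fsts(sales["region_a"] + sales["region_b"], sum(sales.values())))
print(entropy_index(sales))
```

The two measures capture different things: the ratio reflects how much of the firm's activity is foreign, whereas the entropy index also reflects how evenly that activity is spread across regions. This may be one reason why studies using different measures are hard to compare.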
For the key variable Performance, we followed Hult et al. (2008), and split the performance measure into financial performance, operational performance and overall performance. By far (110 studies), financial performance is the dominant measure. The most popular measurement for financial performance is return on assets (57%). Comparing the different types of measures, no patterns are identified concerning the M-P relationship. Note that many studies use multiple measures, so the totals exceed the 115 papers included in Table 8.

Moderators
M-P research strongly suggests a dynamic relationship that requires going beyond simple linear explanations (Lu and Beamish, 2004). Given their fundamental importance to understanding the M-P relationship, we documented all moderating variables. We report a detailed record in Appendix 4. In total, 54 out of the 115 empirical studies (i.e. 47%) have introduced at least one moderator, and 90 unique moderators are identified. It is important to note that, although researchers sometimes use the same moderators, the measurements are different. Given the sensitivity to context and measurement, it is no surprise that the findings are inconsistent. No patterns connected to the identified shapes of the M-P relationship are identified. Furthermore, there is no difference between papers that include moderators and papers that do not include moderators. Again, positive linear and inverted U-shaped M-P relationships are marginally more common than the other shapes, although all shapes are represented. However, it is evident that adding moderators to the model became more popular during the past 12 years than it was before.
In the examination of the moderators, it is possible to identify and group them into three clusters based on shared features, which are shown in Appendix 4. The first cluster includes moderators that are commonly listed as firm characteristics (Kogan and Tian, 2012; Subrahmanyam and Titman, 2001; Zou and Stan, 1998), for example, the size of the firm (Fisch, 2012; Kirca et al., 2012; Singla and George, 2013), the age of the firm (Singla and George, 2013) or business group affiliations (Gaur and Kumar, 2009; Kim et al., 2004; Singla and George, 2013). The second cluster is associated with factors usually described as the institutional or the home-country context (Devinney et al., 2010; Ghemawat, 2001; Scott, 2008), for example, home-country legal institutions (Li and Yue, 2008; Marano et al., 2016), home-country political stability (Chao and Kumar, 2010; Tan and Chintakananda, 2016) and home-country governance (Chao and Kumar, 2010; Li and Yue, 2008). In the last cluster, the moderators are linked to strategic decisions a firm makes in diverse areas, and include, for example, advertising intensity (Kirca et al., 2016; Lu and Beamish, 2004), R&D intensity (Bae et al., 2008; Berry and Kaul, 2016; Kirca et al., 2016; Kotabe et al., 2002; Lu and Beamish, 2004; Pattnaik and Elango, 2009) and entry mode decisions (Jain and Prakash, 2016). The three clusters have been compared for patterns, but again, no clear pattern emerges (see Table 9).
In sum, there is a broad variety of moderators that have a positive, negative or no effect on the M-P relationship. It is interesting to see that although many researchers use the same moderators, the results are different. Hence, the random use of moderating variables has made it difficult to identify consistent patterns in relation to the identified shape of the M-P relationship.

Concluding remarks
Discussion
This literature review and content analysis encompasses the 115 most relevant empirical studies published over the past 44 years on the relationship between multinationality and performance at the firm level. Categorizing for different research settings, measurements, theories and moderators, we search for patterns that may explain the variety of incongruent findings in the extant literature. We test for patterns through cross-tabulation analysis and chi-square tests. Our findings challenge the prevalent belief in the international business literature that a direct and overall positive relationship exists between multinationality and performance.
First, we investigated different research settings, defined as different countries or regions, and found no clear linear or nonlinear pattern for identified shapes of the M-P relationship, neither from the content analysis nor from the cross-tabulation analysis. This includes single and multiple country settings. We conclude that there are no systematic patterns between the type of research setting and the nature of the M-P relationship.
Second, for sample characteristics we compared firm size and industry to the shape of the M-P relationship. We also considered whether the data represented a short-term (up to and including 6 years) or long-term (7-35 years) perspective. Many studies claim that firm-specific characteristics of small- and medium-sized enterprises (SMEs) impact their internationalization (Cavusgil and Knight, 2015; Chetty and Campbell-Hunt, 2004; Hilmersson and Johanson, 2020; Hilmersson et al., 2022). Size is a boundary condition to firm internationalization, as small size often implies limited resources, including assets, finances and infrastructure (Knight and Kim, 2009). However, size also impacts firm governance, organization and decision-making (Verbeke and Ciravegna, 2018). Given this, it is somewhat surprising that we could not identify any patterns in the content analysis or the cross-tabulation analysis. The limited number of articles in the size category may very well have contributed to not finding significant patterns in our data. Another explanation may be the diversity of definitions and measures of SMEs (Zahoor et al., 2020), what Child et al. (2022) describe as inconsistencies in conceptualizing SMEs. We conclude that sample characteristics do not systematically influence the shape of the relationship between multinationality and performance. One common problem concerning samples, and thus results, lies in the ambiguity of definitions and measures of sample characteristics. That is, ambiguity in the sample creates ambiguity in the results (Sumpter et al., 2019). Klein and Delery (2012, p. 58) explain it as, "(. . .) the most serious consequence of construct ambiguity is the lack of confidence that can be placed in the conclusions drawn from the extant literature."
Third, we scrutinized the underlying theories applied to explain the relationship between multinationality and performance. The authors explain the several shapes of the relationship using many different and sometimes contradictory theories. The most popular explanations are derived from transaction cost theory, internalization theory and the resource-based view of the firm. All theories share the common denominator that multinationality affects performance. Interestingly, almost all the theories have results across the spectrum of shapes of the relationship, leading us to conclude that there is no systematic relationship between the applied theory and the shape of the multinationality and performance relationship. This finding is in line with several researchers arguing that there is no systematic relationship between the two concepts (cf. Hennart, 2007; Rugman et al., 2016). The results of the cross-tabulation analysis support this conclusion. However, one interesting observation is the lack of consideration of the individual manager, who plays a vital role in the decision-making process concerning internationalization. Bridging the existing macro-level theories with micro-level foundations would allow for a more detailed understanding of how multinationality and performance interact (cf. Cowen et al., 2022).
Notes: Some papers have either found multiple different shapes or have not made a clear statement about the identified shape of the relationship. Therefore, the number of papers published per period does not match the total number of identified shapes per period. POS LIN = The paper has found a positive linear relationship between M and P; NEG LIN = The paper has found a negative linear relationship between M and P; U = The paper has found a U-shaped relationship between M and P; INV U = The paper has found an inverted U-shaped relationship between M and P; S = The paper has found an S-shaped relationship between M and P; M = The paper has found an M-shaped relationship between M and P; NONE = The paper has found no relationship between M and P
Fourth, we examined the measurements used for multinationality and performance. We found that most of the studies applied financial measures for both concepts. Return on assets is the most popular measure of performance, and the ratio of foreign sales to total sales is the most popular measure of multinationality. The ease of access to this kind of financial data would explain these preferred measures, in spite of the possibility that they may not represent the most accurate depiction of the degree of multinationality or performance. Hult et al. (2008) advocate incorporating operational performance and overall performance to complement financial performance, thus providing a more accurate and holistic view of performance. We could not identify any statistically significant pattern between these types of measures and the shape of the relationship between multinationality and performance. One possible explanation is a lack of clarity in the definition and measurement of the constructs. There are limited discussions on what constitutes the constructs and how they are actually being measured (Klein and Delery, 2012; Suddaby, 2010). Promising progress has been made by Miller et al. (2016), who split multinationality into international intensity, international distance and international diversity to capture a more holistic picture of the different aspects that constitute multinationality. Giachetti and Spadafora (2017) suggest conformity in multinationality as a new measure that captures the extent to which a firm's multinationality resembles the multinationality of its peers at a particular point in time. This allows for more comparative analyses of individual firms in relation to their competitors.
Last, we investigated the effect of different moderators or no moderator on the shape of the relationship between multinationality and performance. No patterns emerged. We conclude that there are no systematic effects of moderators on the shape of the multinationality and performance relationship. Although investigating different moderators is crucial for the development of future research (Zahoor et al., 2020), instead of enlarging the spectrum of applied moderators to the M-P relationship, it is imperative that researchers fundamentally question the nature and direction of the relationship between multinationality and performance.
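The cross-tabulation and chi-square procedure referred to throughout this discussion can be illustrated with a minimal sketch. It is not the analysis code used for this review: the function name `chi_square_independence` and the counts are hypothetical and serve only to show how a study characteristic is tabulated against the reported shape of the M-P relationship, and how cells with low expected counts (a concern when a category contains few articles) can be flagged.

```python
from typing import List, Tuple


def chi_square_independence(
    table: List[List[float]],
) -> Tuple[float, int, List[List[float]]]:
    """Pearson chi-square test of independence for a cross-tabulation.

    Returns the chi-square statistic, the degrees of freedom and the
    expected counts under the hypothesis that rows and columns are
    independent.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count for each cell: row total * column total / grand total.
    expected = [
        [r * c / grand_total for c in col_totals] for r in row_totals
    ]

    # Sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts for illustration only, NOT taken from the reviewed studies.
# Rows: research setting (single country, multiple countries).
# Columns: reported shape (linear, nonlinear, none).
hypothetical_counts = [
    [20, 25, 5],
    [10, 12, 3],
]

statistic, dof, expected = chi_square_independence(hypothetical_counts)

# Cells with an expected count below 5 weaken the chi-square approximation.
small_cells = sum(1 for row in expected for e in row if e < 5)

print(f"chi-square = {statistic:.3f}, dof = {dof}, "
      f"cells with expected count < 5: {small_cells}")
```

A statistic close to zero relative to its degrees of freedom indicates that the observed counts are close to what independence would predict, that is, no systematic pattern. In practice, a p-value would be obtained from the chi-square distribution with the returned degrees of freedom, for example with a statistics library such as SciPy.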

Conclusions and suggestions for future research
We set out to explore reasons for inconsistent results in research on the M-P relationship. Given the absolute lack of any consistent results, our conclusion is that the relationship is so complex and contextually bound that it is neither possible nor fruitful to strive for a unifying theory. The content analysis shows that despite the variety of results there is consistency in the importance of the variables we have identified. The relationship between multinationality and performance can take many forms; however, it is an oversimplification of the relationship to examine it as simply two variables and a possible moderator.
The inconsistency may also be a function of the dynamics in the relationship. Internationalization is an evolving process, yet the vast majority of the published research relies on cross-sectional research designs. Findings at one time in the relationship will most likely differ from findings at a different time, depending on where the relationship is in terms of the stage of the process. Frankly, the form of the relationship may simply be a function of the analytical choices made by the researchers. If the researchers are only testing linear relationships, then they may just see the linear part of what in actuality is a nonlinear relationship. This could even be a function of the available analytical tools and computing power. Future researchers should recognize that the relationship is a process taking different forms. There is no specific dominant form. The context specificity of the relationship makes general declarations difficult, if not impossible.
Over the past four decades, the M-P paradigm has been a major focus of practitioners and researchers (Elango and Sethi, 2007). Paradigms, to some degree, are immune to contradictory empirical evidence (cf. Håkanson and Kappen, 2017). By their nature, they are accepted as the established norm. Our findings concur with a growing body of evidence (cf. Hennart, 2011; Tallman and Pedersen, 2012; Verbeke and Brugman, 2009) that we are due for a paradigmatic shift (Kuhn and Hacking, 2012), which would allow the international business research field to develop in a fruitful new direction. Specifically, there is a small but growing literature arguing to turn the tables and investigate the performance-multinationality relationship (cf. Grant, 1987; Beamish, 2001, 2004; Morck and Yeung, 1991; Schmuck et al., 2022). A handful of empirical studies have investigated either a dual or a reversed causality (Grant et al., 1988; Hong Luan et al., 2013; Jung and Bansal, 2009). Though promising, the outcomes from these studies require further investigation.
We suggest that future research focus more on the process of internationalization rather than on the outcome. Although the goal of internationalization is to achieve a particular outcome, multiple contextual factors need to be considered in the model. Depending on, for example, financial assets, strategic decisions or time since the founding of the company, firms reside in different stages of their internationalization processes. Cross-sectional observations fail to properly represent the process and thereby distort general conclusions. Moreover, a successful and sustainable internationalization process should be the focus of strategic decision making, rather than potential financial gains or losses. After all, as other literature reviews have shown, and as our findings show, after 44 years the international business research community still cannot agree on the effect of multinationality on firm performance. A theme for future consideration is to capture the time dimension in the internationalization process and the effect of time on performance, that is, the speed and timing of internationalization (Hilmersson et al., 2017; Hult et al., 2020).
We have endeavored to provide an overview and classification of the M-P moderators. Due to the large diversity in the moderators, we suggest researchers use more diligence in selecting and measuring moderators, multinationality and performance. In sum, we do not see a fruitful future for research on the M-P relationship as long as researchers continue to rely on the dominant paradigm and other underlying assumptions. We advocate a critical reevaluation of the current oversimplifications of the M-P relationship and suggest that future research critically assess the choices of theories, methods, models and statistical analyses.
5. Elango (2006) identified a positive linear relationship for service firms, and an inverted U-shaped relationship for manufacturing firms.
(2007)
11. The statistical analysis used by Dikova and Veselova (2021) did not allow for making conclusions on the relationship between multinationality and firm performance.
Notes: Latest ranking according to the Academic Journal Guide 2015 in brackets behind the journal name.

Elango and Sethi
The following journals had no initial hits and are therefore excluded from this table: General management, ethics and social responsibility: California Management Review (3)
Total of initial search results: n = 654 articles
(2) Downloading the bibliographic information (title, year, author, abstract, journal) of the 654 articles into the EndNote reference manager software and exporting it into an Excel file to create a database
(3) Manual reading and checking of all articles included in the initial database against the following exclusion criteria:
Studies using one of the key concepts multinationality or firm performance as a moderator or control variable (16 articles)
Studies not measuring corporate performance
Studies measuring different kinds of performance (e.g. corporate social performance, or environmental performance) (110 articles)
Studies where the unit of analysis is not on a firm level (e.g. subsidiary performance) (36 articles)
Studies measuring different kinds of diversification (e.g. product diversification, or board diversification) (92 articles)
Studies not using both key concepts multinationality and firm performance as key variables (261 articles)
Total of articles that fulfilled the selection criteria: n = 139 articles
(4) Selection of empirical articles, due to the focus of the literature review
Exclusion of conceptual papers (7 articles)
Exclusion of literature reviews (10 articles)
Exclusion of meta-analyses (7 articles)
Final sample: n = 115 articles
Notes: ATNITA = after-tax net income to total assets; EBITOA = earnings before interest and taxes divided by total assets; ESTS = export sales to total sales; FATA = ratio of foreign to total assets; FETE = ratio of foreign to total employees; FITI = ratio of foreign to total income; FOTO = ratio of foreign to total offices; FORSUB = number of foreign subsidiaries; FRTR = foreign to total revenues; FSTS = ratio of foreign to total sales; GPM = gross profit margin; GSI = Geographic Spread Index; NPM = net profit margin; OCTS = operating costs to total sales; OPM = operating profit margin; OPSAL = ratio of operating costs to sales; OPSALINV = ratio of sales to operating costs; PEP = profits per equity partner; ROA = return on assets; ROE = return on equity; RONA = return on net assets; ROOA = return on operating assets; ROS = return on sales; RSTS = regional sales to total sales; TAT = total asset turnover; Tobin's Q = sum of the market value of equity and the book value of debt divided by the book value of assets
Table A2.
Notes: The coding of the effect (positive, negative or none) is based on the claims made by the authors in the respective paper, even though they sometimes reported insignificant results
Table A3.