Using single impact metrics to assess research in business and economics: why institutions should use multi-criteria systems for assessing research

Sergio Olavarrieta (Department of Business Administration, FEN, University of Chile, Santiago, Chile)

Journal of Economics, Finance and Administrative Science

ISSN: 2218-0648

Article publication date: 21 March 2022

Issue publication date: 12 July 2022

Abstract

Purpose

Despite the general recommendation to use a combination of multiple criteria for research assessment and faculty promotion decisions, the rise of quantitative indicators is generating an emerging trend in Business Schools to use single journal impact factors (IFs) as the key (or sole) drivers of these important school decisions. This paper aims to investigate the effects of using single Web of Science (WoS)-based journal impact metrics when assessing research from two related disciplines, Business and Economics, and the potential impact on the strategic sustainability of a Business School.

Design/methodology/approach

This study collected impact indicator data for Business and Economics journals from the Clarivate Web of Science database, concentrating on the IF indicators, the eigenfactor and the article influence score (AIS). It then examined the correlations between these indicators and ranked disciplines and journals using each of the different impact metrics.

Findings

Consistent with previous findings, this study finds positive correlations among these metrics. The study then ranks the disciplines and journals using each impact metric, finding relevant and substantial differences depending on the metric used. Using the AIS instead of the IF raises the relative ranking of Economics, while Business retains essentially the same rank.

Research limitations/implications

This study contributes to the research assessment literature by adding substantial evidence that given the sensitivity of journal rankings to particular indicators, the selection of a single impact metric for assessing research and hiring/promotion and tenure decisions is risky and too simplistic. This research shows that biases may be larger when assessment involves researchers from related disciplines – like Business and Economics – but with different research foundations and traditions.

Practical implications

Consistent with the literature, given the sensitivity of journal rankings to particular indicators, the selection of a single impact metric for assessing research, assigning research funds and making hiring/promotion and tenure decisions is risky and simplistic. However, this research shows that risks and biases may be larger when assessment involves researchers from related disciplines – like Business and Economics – but with different research foundations and trajectories. The use of multiple criteria is advised for such purposes.

Originality/value

This is an applied work using real data from WoS that addresses a practical case of comparing the use of different journal IFs to rank-related disciplines like Business and Economics, with important implications for faculty tenure and promotion committees and for research funds granting institutions and decision-makers.

Citation

Olavarrieta, S. (2022), "Using single impact metrics to assess research in business and economics: why institutions should use multi-criteria systems for assessing research", Journal of Economics, Finance and Administrative Science, Vol. 27 No. 53, pp. 6-33. https://doi.org/10.1108/JEFAS-04-2021-0033

Publisher

Emerald Publishing Limited

Copyright © 2022, Sergio Olavarrieta

License

Published in Journal of Economics, Finance and Administrative Science. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode.


1. Introduction

There is a continuous and increasing interest in how to assess research at academic institutions (Adler and Harzing, 2009). University and school administrators need to manage their resources to increase research output and school reputation, raise rankings, achieve or keep international accreditations and maintain or increase external funding (Peters et al., 2018). Research assessment is thus linked to relevant strategic goals of these institutions. At the same time, research assessment plays an important role at the micro or individual faculty level. Research assessment practices may be linked to research promotion policies, economic incentives, academic careers and school- and university-level promotions. Good assessment practices may improve individual and institutional research output, due to the direct and indirect effects of assessment methods on individual performance. Moreover, shrinking budgets and increased societal pressures regarding the sustainability of universities in terms of fulfilling the needs of multiple stakeholders (Jack, 2021) suggest that sound research assessment practices may be even more important if universities and schools want to fulfil their strategic goals and remain sustainable over time.

Universities, schools and national agencies establish assessment procedures to evaluate existing/previous research and to assign research funds and benefits (e.g. course reductions, travel funds), honours and awards, academic promotion and direct economic incentives. Different assessment methods have been used, including journal lists (institutional or external lists like the ABDC in Australia, the ABS in the UK or the Financial Times list), individual citation patterns, peer-reviewed assessments and collegiate review committees. Strategic control and assessment systems are crucial for guiding an institution’s behaviour and performance (Kaplan and Norton, 1996).

With the increasing bibliographic information on journals and citations (e.g. Salcedo, 2021a), and the rising burden and complexity of faculty and school assessment tasks, quality peer evaluation has been partly displaced by the use of journal impact metrics (Garfield, 1972, 2006; Adler and Harzing, 2009; Rizkallah and Sin, 2010; Haustein and Lariviere, 2015; Brown and Gutman, 2019). Two factors are probably driving this trend: their availability and their status as objective measures. These effects may be even more relevant for institutions where management needs to use discretion and judgement rather than just financial measures to assess performance, or for institutions with a less “formal” or well-understood strategy (Gibbons and Kaplan, 2015). In those cases, Gibbons and Kaplan (2015) argue that formal measures – included in assessment systems – may give “clarity to the strategy” (p. 449) and to school and faculty actions. The design of a school’s research assessment system is therefore a key element in implementing a higher education institution's strategy. The choice of research impact indicators will affect both individual and institutional research behaviour (Fischer et al., 2019; Jack, 2021).

Research assessment systems based on single impact indicators may be risky for institutions because they can channel faculty and school efforts towards indicators aligned with particular disciplines, stakeholders or goals, neglecting the full spectrum of outcomes expected of a sustainable Business School or university. These risks become particularly serious when university or business school revenues are contingent upon serving those other needs (Peters et al., 2018; Morales and Calderón, 1999). We argue that the challenges are even greater when faculty from different disciplines within the same school or institution are included in the same assessment process.

Despite the problems derived from overestimating the value of these impact metrics, institutions continue using them, with potentially complex implications for the assessment process itself and for achieving the schools’ strategic goals and sustainability (see, for example, Jack, 2021, on the challenges of using overly narrow metrics in business school rankings).

Only a few authors have addressed this issue empirically, warning about the problems of using single journal-level indicators for assessing research contribution. In a recent study of business and management journals, Mingers and Yang (2017) provide evidence that in the business disciplines multiple impact indicators should be used in order to overcome the biases that particular indicators may introduce when ranking journals and using those rankings to assess business research. We aim to provide further empirical evidence regarding the risks of using single indicators in assessing research outputs, especially when assessing journals or researchers from different disciplines.

This paper explores the effect of using particular single impact metrics when assessing research contributions in related disciplines, in this case Business and Economics. Even though both disciplines are regularly taught in Business Schools and programs, their relationship is not as strong as one might think. Azar (2009), for example, reports that only 6.9% of citations in business journal articles come from economics, a share that has declined over time. Other disciplines, like psychology, sociology, decision sciences and communications, have a strong influence on Business. Since specific research impact indicators have different objectives and assumptions and are sensitive to specific citation patterns (the raw input for those indicators), the use of particular impact indicators may significantly affect the relative assessment of scientific work when different scientific disciplines are evaluated together.

In this paper, we first briefly review the literature on research and journal assessment and impact metrics, and their connection with university rankings, strategic performance and sustainability. We then define the main Web of Science (WoS)-based impact metrics and analyse them for Business and Economics journals. We analyse the effects of using single impact indicators – standard impact factor (IF) measures and the newer eigenfactor and article influence scores (AIS) – for ranking Business and Economics journals and assessing the work of Business School scholars. As in previous research, we compute the correlations among these different indicators, finding results generally consistent with the existing literature. We then generate relative rankings for all journals in the Business and Economics WoS categories using these different indicators. Significant changes in rankings are identified depending on the type of measure used (e.g. standard WoS IFs vs eigenfactor or AIS scores). By calculating the implicit academic value of different disciplines using AIS journal scores, we provide further insight into the reasons for these different results, lending additional support to the need for multiple families of indicators when designing a sound and fair research and promotion assessment system that helps institutions achieve their strategic goals. Implications for the theory and practice of research assessment, future research avenues and conclusions are provided in the last section of the paper.
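As a concrete illustration of this kind of exercise, the rank correlation between two journal metrics and the resulting rank shifts can be sketched as follows. All metric values are invented and this simple Spearman formula ignores ties; it is not the paper's actual dataset or code.

```python
# Toy sketch: Spearman rank correlation between two journal impact metrics.
# All values are invented; no tie handling (fine for distinct values).

def ranks_desc(values):
    """Rank 1 = largest value."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Classic Spearman formula: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    rx, ry = ranks_desc(x), ranks_desc(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Five hypothetical journals scored by two metrics:
impact_factor = [4.1, 3.2, 2.8, 1.5, 0.9]
article_influence = [2.0, 3.5, 1.1, 0.7, 0.4]
print(spearman(impact_factor, article_influence))  # high, but not 1.0
```

Even with a high rank correlation, individual journals can move several positions between the two rank vectors, which is precisely what makes single-metric assessment risky.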

2. Literature review

The evaluation of research output is very important in academic life since it drives hiring, funding and tenure and promotion decisions. The implications are highly relevant for individual researchers, since their academic careers and economic incentives may be driven by these decisions. In the following sections, we examine the relevant literature on research assessment systems and metrics.

2.1 Research assessment systems and indicators

As stated earlier, research assessment is a relevant but very complex process that affects the behaviour of individual faculty and the whole institution. For this reason, the academic tradition established peer review committees of senior faculty members as a reasonable way to handle this strategic process. These committees normally review individual manuscripts and outputs for quality, relevance and overall value. To provide a more standard rule for comparing different research outputs, some schools developed internal lists of desired journals, ranking them in terms of subjective quality. Other schools used journal quality lists developed by external parties and associations (e.g. the ABDC list in Australia, the ABS list in the UK, the University of Texas–Dallas list in the USA, Capes/Qualis in Brazil; see, for example, Harzing.com).

Additionally, research publications may be evaluated through quantitative indicators like direct citations to the paper or through some impact metric of the journal, based on the total citations to the journal (Garfield, 1972, 2006; Franceschet, 2010). The availability of large bibliometric databases (WoS, Scopus or even Google Scholar) has made citation-based metrics easier to find and use, and a more common assessment approach (Haustein and Lariviere, 2015; Harzing, 2019). Journals and editors build reputation by expanding their indexing and becoming better known and more cited by relevant research communities (see, for example, Salcedo, 2021b).

Despite some concerns regarding the validity of impact metrics (see, for example, Carey, 2016; Paulus et al., 2018), the burden of assessing research output for an increasing faculty body has made the use of journal impact metrics to assess individual faculty research a common practice in many institutions. Here we present the main impact metrics used in academia, separated into two groups: the standard or more traditional IF scores and the newer eigenfactor-related scores.

2.2 Standard/traditional impact factor scores

Total cites (TotCite). The total number of citations received by a journal in a given year, counting citations to its articles published in all previous years.

The journal impact factor (IF). It represents the total citations obtained in a year by articles published in the previous two years divided by the total number of articles published (by the journal) in the previous two years. Self-citations – citations to journals from articles published in the same journal – are included in the count and computations.

The 5-year impact factor (5YIF). Similar to the regular IF but with a five-year window: the total citations obtained in a year by articles published in the previous five years, divided by the total number of articles published by the journal in those five years.

The impact factor without self-cites (IFwoSC). The same as the journal IF, but with self-citations – citations to a journal from articles published in that same journal – excluded from the numerator: the total citations (excluding self-cites) obtained in a year by articles published in the previous two years, divided by the total number of articles published by the journal in those two years.

Immediacy index (IMMI). This index can also be defined as a zero-year IF and is computed as “the total citations to papers published in a journal in the same year divided by the total articles published by the journal in that year” (Chang et al., 2016).
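Since all the traditional metrics above are simple citation ratios, they can be sketched in a few lines. The citation counts below are invented for illustration and do not correspond to any real journal.

```python
# Minimal sketch of the traditional WoS impact-factor family of metrics.
# All citation and article counts are hypothetical.

def impact_factor(cites_to_prev_2y, articles_prev_2y):
    """Journal IF: citations this year to articles from the previous two
    years, divided by the number of articles published in those two years."""
    return cites_to_prev_2y / articles_prev_2y

def five_year_if(cites_to_prev_5y, articles_prev_5y):
    """5YIF: same ratio over a five-year window."""
    return cites_to_prev_5y / articles_prev_5y

def if_without_self_cites(cites_to_prev_2y, self_cites_prev_2y, articles_prev_2y):
    """IFwoSC: self-citations removed from the numerator."""
    return (cites_to_prev_2y - self_cites_prev_2y) / articles_prev_2y

def immediacy_index(cites_to_same_year, articles_same_year):
    """IMMI: a zero-year impact factor."""
    return cites_to_same_year / articles_same_year

# Example: 300 citations this year to 120 articles from the last two years,
# 40 of them self-citations.
print(impact_factor(300, 120))              # 2.5
print(if_without_self_cites(300, 40, 120))  # ~2.17
```

The sketch makes the family resemblance explicit: the metrics differ only in the citation window and in whether self-citations are counted.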

2.3 Eigenfactor-related metrics

The creators of the eigenfactor metrics indicate that they derived an algorithm based on Google's PageRank idea for sorting and ranking web pages, i.e. based on the network of links pointing to particular websites. Instead of the links used for ranking webpages, they use the citations a particular journal receives in the WoS database to compute the eigenfactor through an iterative algorithm. Bergstrom (2007) argues that a “single citation from a high-quality journal may be more valuable than multiple citations from peripheral publications”. The importance of a single citation can be computed as the “influence of the citing journal divided by the total number of citations appearing in that journal.” Through this procedure, the authors “aim to identify the most influential journals, where a journal is considered influential if it is cited by other influential journals”. However, they recognize that the eigenfactor aggregates the individual influence of all papers appearing in a particular journal and will therefore be higher for larger journals: larger journals generate more visits, more citations and, consequently, larger eigenfactor scores. The authors suggest that this procedure corrects for differences in citation patterns and propensities across disciplines, but it also disadvantages more peripheral or newer disciplines and journals. Its computation is therefore not neutral to the newness or centrality of disciplines, particularly since citation counts and academic reputation are built over time and these variables may also affect centrality in the whole scientific field.
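A toy version of this iterative idea can be sketched with a tiny invented citation matrix. This simplification omits the real method's corrections for dangling journals and teleportation, so it illustrates only the core "influence flows along citations" mechanism.

```python
# Toy PageRank-style iteration in the spirit of the eigenfactor.
# cites[j][i] = citations FROM journal j TO journal i; the diagonal is zero
# because the eigenfactor method excludes self-citations. Data invented.
cites = [[0, 3, 1],
         [4, 0, 1],
         [1, 2, 0]]
n = len(cites)

# Column-normalize: each journal distributes one unit of influence across
# the journals it cites, in proportion to its outgoing citations.
out_totals = [sum(row) for row in cites]
H = [[cites[j][i] / out_totals[j] for j in range(n)] for i in range(n)]

pi = [1 / n] * n                      # uniform starting influence vector
for _ in range(200):                  # power iteration towards the leading
    pi = [sum(H[i][j] * pi[j] for j in range(n)) for i in range(n)]

print([round(p, 3) for p in pi])      # steady-state relative influence
```

The vector `pi` converges to the leading eigenvector of the normalized citation matrix, so a journal's influence depends on who cites it, not merely on how often it is cited.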

Eigenfactor (EIG). For the reasons indicated above, the eigenfactor score is calculated annually by a PageRank-type algorithm based on five-year citation data, published on eigenfactor.org, and defined as “the journal's total importance to the scientific community” (for a more detailed description of the method, see www.eigenfactor.org; Eigenfactor, 2009). Two important elements of the eigenfactor computation are that it excludes journal self-citations and that citations are normalized by the total number of outgoing citations of each journal. Journals' eigenfactor scores are scaled so that the sum over all journals included in the Journal Citation Reports (JCR, 2017 Journal Impact Factors, 2018) of the WoS adds up to 100. Thus, if a journal has an eigenfactor score of 0.085 (the average journal eigenfactor), it holds 0.085% of the total influence of all indexed publications. The eigenfactor score can also be labelled a journal’s influence score (Chang et al., 2016).

Normalized eigenfactor score (NEig). It is a rescaled eigenfactor score so that the average journal scores 1 (instead of 0.085) and can be computed as the Eig*N/100, where N is the number of journals included in the JCR. Therefore, correlations between the eigenfactor and its normalized version are 1.0, and rankings of journals using both impact metrics generate the same results.

Article influence score (AIS). The article influence score is calculated by dividing a journal's eigenfactor score by the fraction of all articles published in the 5-year window that appeared in that journal (0.01 × eigenfactor score/(5-year journal article count/5-year all-journals article count)). The AIS is scaled so that the average article published in the WoS database (Sciences and Social Sciences) in a particular year scores 1.0. Bergstrom thus suggests that a journal with an AIS of 17 means that the average article appearing in that journal has 17 times the influence of the average article in all sciences. Two important clarifications apply to this number. First, the fact that AIS is scaled so that the average article scores 1.0 does not mean that the average journal AIS in the database is 1.0; the average AIS for journals across all sciences is 0.84. Second, the sciences and social sciences have different average AIS (the sciences score higher). Despite the intended objective of controlling for differences across disciplines, several studies have indicated that this is not achieved, favouring more traditional and central basic sciences over newer and more applied social disciplines (Waltman and Van Eck, 2010; Dorta-González and Dorta-González, 2013; Walters, 2014; Merigo et al., 2016).
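Under the formula quoted above, the AIS calculation is a one-liner. The numbers below are invented to match the scale of the published statistics (an average eigenfactor of 0.085); the total article count is an assumption for illustration only.

```python
# Sketch of the AIS formula described above; all numbers hypothetical.
def article_influence_score(eigenfactor, journal_articles_5y, all_articles_5y):
    """AIS = 0.01 * eigenfactor / (journal's share of all articles in 5 years)."""
    article_fraction = journal_articles_5y / all_articles_5y
    return 0.01 * eigenfactor / article_fraction

# A journal with the average eigenfactor (0.085) that published 500 of an
# assumed 1,000,000 indexed articles in the 5-year window:
ais = article_influence_score(0.085, 500, 1_000_000)
print(round(ais, 2))  # 1.7
```

The example shows how a small journal can have a modest eigenfactor yet an above-average per-article influence once size is divided out.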

2.4 Other impact metrics: Journal lists, Scopus-based, Google Scholar and web-based measures

In addition to WoS-based impact metrics, there are other sources of journal impact and quality measures. Scopus and Google Scholar are the two most relevant ones (apart from WoS) and have the advantage over WoS of including a broader array of journals in most disciplines (37,000+ in Scopus vs 11,500+ in WoS). For example, in a recent review performed by the author, Scopus included 1,742 journals in Business against only 448 in WoS. Scopus and other institutions publish impact metrics based on the citations and records included in the Scopus database, and these are easily available online. They publish the CiteScore, SNIP and SJR, the latter two being attempts to normalize and measure the “prestige” or influence of journals within the Scopus database (González-Pereira et al., 2010). Google Scholar, on the other hand, uses the information available on the Internet, thus providing an even wider set of titles and citations. Google Scholar publishes the h5-index, which is the h-index for a journal calculated over the articles published in the last five years (the h-index is the largest number h such that a journal has h papers with at least h citations each; see Harzing, 2009, 2019).
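The h-index mentioned above has a simple operational definition. A minimal sketch, with invented citation counts rather than data from any real journal:

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have h or more citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i          # the i most-cited papers all have >= i citations
        else:
            break
    return h

# Five hypothetical papers cited 10, 8, 5, 3 and 1 times: three papers have
# at least 3 citations each, but the fourth-most-cited paper has only
# 3 < 4 citations, so h = 3.
print(h_index([10, 8, 5, 3, 1]))  # 3
```

Google Scholar's h5 applies this definition to the articles a journal published in the most recent five years.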

The availability of web-based information on research manuscripts has given rise to altmetrics: evaluation metrics that do not use citations but instead focus on attention and visibility, measuring views, hits, downloads or other indicators of reader engagement with the research piece (Weller, 2015). Newer developments, using text mining and data science techniques, have focused on examining the relevance of academic research. For example, Jedidi et al. (2021) recently published the R2M (relevance to marketing) index, built by contrasting the top concepts appearing in practitioner marketing journals with those published in academic journals.

Finally, an alternative to impact metrics is the development of journal lists that consider several indicators but are normally curated by a group of peer scholars (see the different lists available at Harzing.com). These journal lists are published by universities and academic institutions and provide a more holistic perspective on the impact and relevance of journals. One of the most comprehensive is the list published by the Australian Business Deans Council (the ABDC list), which includes over 2,700 journals in Business and Management. However, most of these lists have a language bias, underrepresenting journals published in Spanish, Portuguese and other languages.

The comparison with those different research assessment metrics is beyond the scope of this paper.

2.5 Issues and challenges in using research impact metrics

The use of research assessment metrics to measure impact and rank journals is quite controversial, and the controversy now extends to disputes among the defenders of particular impact metrics. For example, Carey (2016) mentions eight major criticisms of the computation of the traditional IF: (1) citation mingling; (2) self-citations; (3) a restricted evaluation period (just two years in the case of the IF); (4) subject dependency; (5) publication emplacement dependency; (6) indiscriminate parity among authors; (7) disproportionate significance of highly cited articles; and (8) different citation patterns by discipline. As Carey notes, editorial teams can game the system by including highly citable items or by encouraging citation stacking. Some authors suggest that impact metrics may be better regarded as measures of the visibility of publications rather than of their quality (Gorraiz et al., 2017).

The creators of the eigenfactor metrics (Bergstrom and West, 2008; Bergstrom et al., 2008) suggest that these metrics fix some of these problems, such as self-citations and differing citation patterns across disciplines, and should be preferred for assessing the real influence of journals. Opposing this view, some authors argue that since IFs (2-year, 5-year and 2-year without self-cites) correlate highly with the AIS, and total cites correlate highly with eigenfactors, parsimony and simplicity would advise using the existing, simpler metrics (Davis, 2008; Arendt, 2010; Elkins et al., 2010; Salvador-Oliván and Agustín-Lacruz, 2015).

Other authors have taken a more neutral and pragmatic approach. They do not dispute the generally high correlations between IF and eigenfactor metrics, but they note that these correlations are not perfect and that assessment may benefit from the different specific information provided by each measure (Chang et al., 2011, 2016; Kianifar et al., 2014). The less-than-perfect correlations may also suggest that disciplines differ in their citing patterns and traditions. These authors therefore indicate that using just the IF (or just the eigenfactor metrics) would be risky, and they advise the combined use of research assessment metrics. They provide examples for paediatric neurology and economics journals, using harmonic means of the rankings based on these different metrics to produce a unified ranking.
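The harmonic-mean combination mentioned above can be sketched as follows. Journal names and per-metric ranks are invented for illustration; the cited studies of course work with full journal sets.

```python
# Combine per-metric journal ranks with a harmonic mean (toy invented data).
from statistics import harmonic_mean

ranks_by_metric = {            # journal -> (IF rank, 5YIF rank, AIS rank)
    "Journal A": (1, 3, 6),
    "Journal B": (2, 2, 3),
    "Journal C": (5, 4, 1),
}
combined = {j: harmonic_mean(r) for j, r in ranks_by_metric.items()}

# Lower is better; the harmonic mean is pulled towards each journal's best
# (smallest) rank, so a journal that tops any one metric is not buried.
for journal in sorted(combined, key=combined.get):
    print(journal, round(combined[journal], 2))
```

Note the design choice embedded in the harmonic mean: unlike an arithmetic mean of ranks, it rewards excellence on at least one metric, which matters when the metrics systematically favour different disciplines.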

Several issues have been raised regarding the inconvenience of using citation-based indicators for assessing research (Paulus et al., 2018). At the most general level, two major critiques are presented. First, direct citations are a proxy – not a perfect measure – of the quality of a paper. Papers with errors and controversial papers may be very highly cited, yet those citations cannot be considered a signal of quality. Conversely, excellent or very relevant papers may be published in lesser-known or newer journals (particularly when new subdisciplines or themes are emerging) or in working paper or document series not covered by the established databases, receiving very few citations because of where they were published. Other criticisms focus on the validity of citations and the ways they might be manipulated (Carey, 2016).

Secondly, aggregate impact measures such as the IF of a journal are also a distant proxy of the quality of a particular paper published in that journal. Journal quality does not equal paper quality. As Brito and Rodríguez-Navarro (2019) show, the difficulties of assessing and discriminating paper quality based on journal impact are even higher if the differences in those impact factors are lower, penalizing new research or research in fields that are less cited. An interesting point is made by Paulus et al. (2018), who suggest that the use of single IF metrics may in fact imply that peers or assessors have weak arguments to justify the quality of a research piece or that they are uncertain of its particular value. Consistently – on a specific application for the business field – Mingers and Yang (2017) offer a similar but extended perspective favouring the use of multiple indicators. They rank business and management journals based on research assessment metrics computed with WoS, Scopus and Google Scholar information, deriving a synthetic rank from the total sum of the different ranks for each journal. Based on their results, they suggest the Google Scholar h-index and Scopus-based SNIP index should be preferred for assessing business journals.

2.6 Research assessment, rankings and business school strategies

Research assessment systems are relevant at the individual researcher level but are also crucial for Schools attempting to fulfil their established missions and serve stakeholders in a highly competitive and globalized world. Research assessment systems are relevant for explaining both individual and school/university behaviour and performance and, therefore, for strategy implementation.

Business school education in particular has experienced important transformations in the past 50 years (Peters et al., 2018). Starting as a more practice-oriented discipline, the business discipline has transformed itself moving towards a more scientific and theoretically strong field of study, borrowing from the traditions of other related fields such as sociology, psychology, decision sciences and economics. The theoretical advancement in particular business disciplines like management, finance and marketing, the strengthening of business doctoral education, global competition and international accreditation agencies and rankings have played an important role in this development process.

Today, business schools face two main evaluation systems: accreditation and rankings (Pitt-Watson and Quigley, 2019). Despite the advances in business schools and business school education, there is a wider debate about whether business school curricula and research outputs are consistent with the societal needs of a more sustainable and inclusive 21st-century economy (Pitt-Watson and Quigley, 2019). This debate has generated important changes in the accreditation standards of the major agencies (AACSB, AMBA and EQUIS) to include and value societal impact beyond academia (AACSB, 2020). Business school rankings are also embracing these challenges, and institutions like the Financial Times are adapting their methodologies to include the broader impact of business schools (Jack, 2021). These changes are recent and may not yet be fully absorbed into the internal discussions of academic, research assessment and promotion committees within business schools, which represent the academic stakeholders. For example, although several research outputs or intellectual contributions can be identified (AACSB, 2012), academics tend to focus on articles published in peer-reviewed journals. Most of these evaluations are based on the quality of research publications (i.e. journal impact metrics) despite the multi-dimensional nature of business school missions.

Business school and higher education administrators then face an important challenge: how to integrate these external changes and expectations into the formulation and implementation of their business models. Promoting and assessing an adequate research mix or portfolio for the school is part of this key challenge. Early on, Ghoshal (2005) and other business scholars warned about the growing distance between business schools and scholars on the one hand and business practice on the other, suggesting that an excess of bad or insufficiently tested theories was destroying management practice. More recently, the Responsible Research in Business and Management network (see the RRBM position paper, 2020) also posits that business schools and scholars should “transform their research toward responsible science, producing useful and credible knowledge that addresses relevant problems for business and society”.

In order to fulfil their institutional mission, university administrators have different levers for implementing a defined Business School strategy. They can assign resources and define systems and processes that may help generate the desired behaviours and outcomes. The research assessment system within a business school is one of these key levers since it has strong effects on directing faculty resources, behaviours and energies, which may be reproduced in the future (see for example Riazanova and McNamara, 2015). Figure 1 exhibits a guiding framework based on the reviewed literature that describes the role of research assessment systems for the implementation of a business school strategy and its sustainability.

Figure 1 presents a process model with three phases: strategy formulation, implementation and outputs/feedback from stakeholders. Schools develop their strategies to provide research and teaching (and some other) outputs to fulfil their mission and serve societal needs. Business schools implement their strategies through securing and deploying resources and the functioning of designed systems and processes. In this framework, we focus mainly on the research value chain, which will generate effects on research and overall school outputs. As suggested by O'Brien et al. (2010), business schools' research production may generate economic value for students and constituencies, measured by salary differences in the USA. Based on this belief, schools have developed systems for stimulating research outputs through strong faculty recruitment and selection processes, research assessment systems and faculty promotion procedures aimed towards publishing in the best journal outlets. All these systems combined with the faculty body deployment and the resources allocated to research (funding, incentives and support) will interact and produce individual faculty outputs, which in turn will generate the school’s research and teaching aggregate outputs. These outputs may be mediated by the effects of intrinsic and extrinsic motivations (see Gibbons and Kaplan, 2015 for the effect of formal measures on individual behaviour and organizational culture, and Fischer et al., 2019, for the effects of intrinsic and extrinsic motivation on creativity and innovation). Other individual differences like particular starting conditions such as the original doctoral school research emphasis or research collaboration opportunities and strategies, may also help to explain particular individual research outputs (Li et al., 2019; Riazanova and McNamara, 2015). These individual and synergistic (or not) organizational behaviours will produce the school’s overall real research and teaching outputs. 
These outputs will be contrasted with planned and expected outputs by stakeholders strengthening or diminishing the school’s sustainability.

As suggested by Peters et al. (2018), in addition to research, business schools have strong value chains dedicated to the delivery of different business education programs, which are at the core of the new emerging business models in today's competitive world. Business schools increasingly depend on the revenues generated by these programs. Stakeholders like students, employers, academia, the government and accreditation and ranking agencies will assess business school performance and provide feedback in terms of opinions, recommendations, money, purchases of services, etc., promoting or hindering the school's sustainability (see e.g. AACSB, 2020; RRBM, 2020; Jack, 2021). School rankings and accreditation agencies, representing and anticipating such stakeholder opinions and assessments, will generate the feedback that institutions need to modify strategies, resources and systems.

According to strategic theory, the specificities of a school research assessment system should be consistent with external standards and expectations (society, accreditation agencies and ranking makers) and with the school strategy to compete and become sustainable in today's competitive environment. In particular, our framework in Figure 1 suggests – consistent with strategic fit and control literature (Gibbons and Kaplan, 2015; Kaplan and Norton, 1996) and dynamic capabilities and micro-foundations approach to strategy (Teece, 2007, 2017) – that the alignment (misalignment) of systems/processes with the school strategy can be considered a key driver of strategic success (failure).

Based on these trends, one may argue that research assessment may also require adaptations that are consistent with these changing external assessment criteria and that are broader in nature. AACSB, the global accreditation agency, for example, has included within its new standards the need to report on the impact of scholarship, considering the quality of intellectual contributions and the ability to contribute to a wide variety of external stakeholders through a mix of basic, applied and/or pedagogical research. Very narrow research assessment systems based on specific, single metrics may generate large risks for Business Schools in the pursuit of their strategic goals and sustainability.

Adding to this growing literature, in this paper, we examine the effect of particular impact measures on the ranking of journals from two related but different disciplines: Business and Economics. We want to study the effects of using single indicators on individual and school-level research assessment, focusing on the strategic and managerial implications of such system designs. Based on the research assessment and university strategic management literature, we hypothesize that designs combining multiple indicators will be more appropriate for assessing individual research outputs, particularly when faculty members participate in different disciplines with different research traditions and communities.

The use of single impact metrics may produce strong misalignments between a school's research output, its expected impact and the school's sustainability potential.

3. Method

3.1 Data collection

We collected all the data from the Clarivate Web of Science database, particularly from the Journal Citation Reports Social Science and Science collections (2017 Journal Impact Factors, 2018). As indicated earlier, other databases like SCOPUS may contain a broader and more diverse collection of journals in the social sciences and in business and economics. However, the most used impact indicators are the impact factor indicators computed using the WoS database, and the eigenfactor scores and AIS are also computed using this database.

The data gathered contained journal and publisher information, total cites, citable items, journal impact indicators, WoS categories and other relevant information. We concentrated on three impact factor indicators – the general IF, the 5YIF and the IF without self-cites (IFwoSC) – and two eigenfactor-based scores: the eigenfactor (Eig) and the AIS. We also considered the total cites indicator and the immediacy index (IMMI). We obtained all these indicators for all the journals included in the above databases.

3.2 WoS category assignment and management

Journals in the WoS database are organized under categories. A major category is the one that distinguishes the Sciences (SCIE) from the Social Sciences (SSCI). Within these general categories, journals are included (they can request it) in particular categories defined by WoS. However, many journals are included in more than one category, which makes defining a journal's main category a relevant issue, particularly if you want to have single journal records. While some authors have suggested more complex, combined methods to assign journals to a single WoS category (see for example Dorta-González and Dorta-González, 2013), we decided to use a simpler procedure.

Journals indexed in one category were considered in the registered WoS class. For journals that appear in 2, 3 or 4 WoS categories, we used the following procedure to assign the journal to a particular one. We considered the main business/economics category and the relative percentile of the journal IF within each category as the main classification criteria. A hypothetical example can explain it. Journal X is classified under three WoS categories: management, psychology and economics. Considering the journal IF (the default information presented by JCR), Journal X is ranked 60 out of 200 in category M = management (percentile 70%), 25 out of 100 in category P = psychology (percentile 75%) and 250 out of 300 in economics (percentile 17%). Given our specific focus on business and economics journals, our procedure favours the assignment of journals to business and economics categories. Therefore, we assigned the journal to the Business or Economics category with the highest percentile. In our example, we assigned Journal X to category M: management, even though the journal had a better percentile in psychology (75%).
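The assignment rule can be expressed as a short function. This is a minimal sketch with hypothetical category names and percentile values, not the authors' actual code: among a journal's Business/Economics categories, pick the one with the best IF percentile; other categories are only used when no Business/Economics category applies.

```python
# Sketch of the best-percentile assignment rule (hypothetical data and names).
BUSINESS_ECON = {"management", "business", "business-finance", "economics"}

def assign_category(percentiles: dict) -> str:
    """percentiles maps WoS category -> IF percentile (higher is better)."""
    # Restrict to Business/Economics categories when any are present
    focal = {c: p for c, p in percentiles.items() if c in BUSINESS_ECON}
    pool = focal if focal else percentiles  # fall back to all categories
    return max(pool, key=pool.get)

# A journal like hypothetical Journal X, listed in three categories:
journal_x = {"management": 70, "psychology": 75, "economics": 17}
print(assign_category(journal_x))  # management, despite psychology's higher percentile
```

The expert-judgment override described below (title inspection for near-ties) would sit on top of this rule and is not sketched here.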

We performed this journal allocation process manually, going over all journals case by case. In a few cases, we observed obvious misclassifications from using this rule. For example, some journals might be assigned to a secondary category by minor differences in percentile rank – 85% vs 86%. For those cases, we added an expert judgment rule to the best-percentile-rank rule. The expert judgment rule involved examining the journal title and considering: (a) the inclusion of the names of the WoS categories in the title (management, business, finance and economics) and (b) the order in which they appear in the title, along with general subject-area classifications (e.g. strategy, general management, OB and entrepreneurship journals were considered under Management, while marketing, logistics and multidisciplinary journals were included in the Business category).

We used this combined procedure for two purposes: (1) to keep all 669 selected Business and Economics journals within these fields, since a simple top-percentile rule would leave some of the journals in other SSCI or SCIE categories and (2) to obtain stronger validity for our results within the field of business and economics. We believe that since authors can send their papers to any journal, business schools tend to favour those journals that fall under the Business and Economics WoS categories. Therefore, assigning journals to the top Business or Economics category appears to be the better solution to this multiple-category issue.

4. Results

4.1 Correlations between journal impact indicators

Consistent with previous studies in different disciplines (see, for example, Salvador-Oliván and Agustín-Lacruz, 2015), we found significant relationships between many IF indicators. Table 1 presents the correlation matrix; most correlations are large, positive and significant. Particularly strong correlations were found between the eigenfactor score and total cites (0.93) and between the AIS and the 5YIF. The correlation of the AIS with the impact factor was 0.83, and with the IF without self-cites it was 0.89. These results are consistent with previous studies, as presented in Table 2. We also computed the correlations for different specific disciplines (within the sciences and social sciences), finding consistent results across all of them, and we report the correlation coefficients for the Business and Economics WoS categories.
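The computation behind Table 1 is a standard Pearson correlation matrix across journal-level metrics. The sketch below uses small synthetic values for five hypothetical journals, not the actual JCR data, purely to illustrate the procedure:

```python
# Pearson correlation matrix between journal metrics (synthetic data).
import numpy as np

# Hypothetical values for five journals: IF, 5YIF, eigenfactor, AIS
ifs  = np.array([1.2, 2.5, 4.0, 0.8, 3.1])
fyif = np.array([1.4, 2.8, 4.5, 0.9, 3.4])
eig  = np.array([0.004, 0.010, 0.020, 0.002, 0.015])
ais  = np.array([0.5, 1.1, 2.0, 0.3, 1.5])

# 4x4 symmetric matrix; row/column order follows the stacking order
r = np.corrcoef(np.vstack([ifs, fyif, eig, ais]))
print(np.round(r, 2))
```

With real JCR data the same call would reproduce the kind of large positive off-diagonal entries reported in Table 1.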

These relationships can also be visualized graphically (Figure 2). Eigenfactor scores are highly correlated with total cites, and AI scores show a strong linear relationship with the 5YIF. Of course, since most correlations were below 1.0, some variance is not captured by the other indicators, but they are good predictors of eigenfactor and AI scores.

Figure 2. Dispersion graphs showing high correlations between logarithms of eigenfactor scores and total cites; eigenfactor and AIS scores; AIS and 5YIF scores; and eigenfactor scores and IF scores without self-citations. Source: Own elaboration.

4.2 Are different citations patterns an issue?

Many researchers have indicated that self-citations (in this case, citations to a given journal coming from articles in the same journal) may differ across disciplines and have stated the need to control for them. In fact, WoS publishes the IF without self-cites as a way to have a cleaner IF. We computed two self-citation effect variables: the absolute increase in IF due to self-cites (DiffIF = IF – IFwoSC) and the percentage or relative increase in IF due to self-cites (PercDiff = DiffIF/IFwoSC). Figure 3 shows no strong relationships between the AIS and eigenfactor indices and the absolute self-citation effect. It also shows a soft negative relationship for the percentage or relative self-citation effect, providing some evidence that the AIS and eigenfactor may reduce these effects.
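The two self-citation effect variables defined above are direct arithmetic on published JCR values; a minimal sketch (illustrative IF values, not real journal data):

```python
# DiffIF and PercDiff as defined in the text.

def diff_if(impact_factor: float, if_without_selfcites: float) -> float:
    """Absolute increase in IF due to self-citations: DiffIF = IF - IFwoSC."""
    return impact_factor - if_without_selfcites

def perc_diff(impact_factor: float, if_without_selfcites: float) -> float:
    """Relative increase in IF due to self-citations: DiffIF / IFwoSC."""
    return diff_if(impact_factor, if_without_selfcites) / if_without_selfcites

# e.g. a hypothetical journal with IF = 3.0 and IFwoSC = 2.4:
print(round(diff_if(3.0, 2.4), 2))   # 0.6 (absolute self-citation effect)
print(round(perc_diff(3.0, 2.4), 2)) # 0.25, i.e. a 25% relative increase
```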

As stated by Arendt (2010), the persistent correlations between the AIS and IFs (particularly the 5YIF) provide a strong argument that the differences in citation patterns across fields (if they exist) are not removed by using the AIS. Two explanations provided by Arendt (2010) might account for this phenomenon. The first is that structural differences between scientific fields do exist: some fields cite more, and some fields are more citable, influential or "prestigious" than others. The second explanation is linked to a field's particular connection (position) to the citation network. Fields with a larger number of journals and with already prestigious journals (higher IFs or AIs) will be favoured by citations and will be more influential.

Dorta-González and Dorta-González (2013) cover some of these issues, suggesting four potential sources of citation variance in addition to the number of references per article in the field: different dissemination channels (e.g. books and proceedings vs journal articles, and the relative WoS coverage of different disciplines), different field growth (or reduction) rates, the ratio of total citations in the discipline within the target window and different ratios of cited to citing journals (or citation exchange between fields).

Similarly, Waltman and Van Eck (2010) suggest that no impact measure (not even the eigenfactor) can satisfy two opposite properties at once: insensitivity to field differences and insensitivity to insignificant journals. The eigenfactor and AIS cannot deal with both situations simultaneously, and their capability to deal with one more than the other depends on the parameter alpha (0–1) used in the algorithm estimation. In fact, several other researchers have offered their own metrics to try to account for these field differences, like the Audience Factor (Zitt and Small, 2008), the Source Normalized Impact and the SCImago Journal Rank, which is basically a PageRank-inspired indicator, similar to the eigenfactor, but computed using the SCOPUS database (González-Pereira et al., 2010). No single impact metric can capture the complexity of quality assessment while controlling at the same time for all intervening variables.

4.3 Does the use of impact factor vs eigenfactor metrics affect the assessment of business and economics research?

As stated earlier, we wanted to examine the effects of using particular impact metrics when assessing research outputs from Business and Economics. Therefore, we computed the average scores for all six previously mentioned journal assessment metrics, plus the total cites, also published by WoS, using the Business and Economics journals included in Web of Science (in the Business, Business/Finance, Management and Economics categories). As tabulated in Table 3, mean scores for the IF, 5YIF, IFwoSC and total cites are higher for All Business journals compared to Economics journals. Eigenfactor metrics – eigenfactor and article influence scores – on the other hand, are larger for Economics journals. What is the explanation for these results? Are there specific assumptions that may affect the computation of IF metrics and eigenfactor metrics that may induce these inconsistent results?

An answer can be found in the computation logic of the eigenfactor scores. According to Bergstrom (2007), the total influence of a discipline in a year is the sum of the eigenfactors of all journals in that discipline, and the total production of science (defined as the WoS citable pieces) is normalized to 100. The eigenfactor is a measure of the influence of a particular journal on the sciences. Since there are approximately 11,681 journals in the database, the average estimated contribution of a journal (assuming all journals contribute the same) is 100/11,681, or about 0.0086 (which can also be read as a percentage of influence). Table 3 reports the average eigenfactors for All Business (0.0033) and Economics journals (0.0046) and for the individual Business categories.

Using this information, we can also calculate the relative influence of each WoS category on science in general (i.e. on the science included in all 11,681 journals). Multiplying the average eigenfactor score by the number of journal titles provides the following WoS category influence scores: Business (0.25 ≈ 0.0025 × 99), Management (0.54), Business-Finance (0.36), All Business – the sum of Business, Management and Business-Finance – (1.145) and Economics (1.47). If we divide these numbers by the total citations obtained by journals in those disciplines, we get the value of a citation in a business journal (0.0000076) versus the value of a citation in an economics journal (0.0000192) or in an average scientific journal (0.0000155).

We can estimate the relative value of a citation in different disciplines by dividing these citation values by the citation value of an average journal. The relative values of citations are: Business (0.486), Management (0.554), Business-Finance (1.048) and Economics (1.223). According to these computations based on the eigenfactor algorithm, a business citation is worth about half an average cite across all scientific journals, while a citation in an economics journal is worth 1.2 average cites. It is important to notice that these values would change if a different database were considered (e.g. SCOPUS) and are sensitive to the disciplinary coverage of each database. These different relative citation values – an economics journal citation counting as roughly 2.5 business journal citations – are relevant to explaining the positive differences in eigenfactor and AI scores for economics journals when compared to business journals.
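The back-of-the-envelope computation above can be reproduced in a few lines. The figures are approximate values taken from the text (influence per citation), and the normalization of total science influence to 100 follows Bergstrom's (2007) definition:

```python
# Relative citation values implied by the eigenfactor normalization.

TOTAL_JOURNALS = 11_681
avg_contribution = 100 / TOTAL_JOURNALS  # average influence per journal, ~0.0086

# Approximate citation values from the text (influence earned per citation):
business_cite = 0.0000076
economics_cite = 0.0000192
average_cite = 0.0000155  # citation in an average scientific journal

# Relative value = category citation value / average citation value
bus_rel = business_cite / average_cite    # ~0.49: half an average cite
econ_rel = economics_cite / average_cite  # ~1.24: above an average cite
print(round(econ_rel / bus_rel, 1))  # an economics cite is worth ~2.5 business cites
```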

4.4 Using impact factor vs eigenfactor/AIS metrics for assessing specific business and economics journals

After providing a general overview of the impact of using standard IF or eigenfactor metrics to assess research from different disciplines, we wanted to examine their effects at the specific journal level. Using the different impact and eigenfactor metrics, we ranked all 669 Business and Economics journals. Tables 4 and 5 show the top 50 journals under each of these rankings. Major differences can be observed between these rankings in terms of the representation of the different disciplines.

In Table 6, we present a summary of the presence of journals from each WoS category in the top 50 when a particular impact or influence metric is used. As can be seen, the effect is quite dramatic: the presence of Economics journals goes from 11 to 12 (23%) if the IF or the 5YIF is used, to 32 and 29 (around 61%) if the eigenfactor scores or AIS are used to prepare the ranking.

The stronger presence of economics journals in the top 50 list when the eigenfactor metrics are used may derive from the higher value the underlying algorithm assigns to economics journals (a total influence of 1.47 vs 1.145 for business), which can be associated with the characteristics of each disciplinary network (size, centrality and density) within the whole network of science. Rosvall and Bergstrom (2011) present a hierarchical map of science consistent with this construction, where Economics is a more central and larger/denser network within the Social Sciences and is closer to the gateways (e.g. Statistics) to the other major scientific fields: Physical Sciences, Ecology and Earth Sciences, and Life Sciences (see Figure 3 for a graphical representation of these arguments).

4.5 Using impact factor vs eigenfactor/AIS metrics for assessing journals in the social sciences

To look for more generalizable findings beyond Business and Economics, a similar analysis was performed for all Social Sciences. As in the previous analysis, the presence of different disciplines in the top 100 journals (in this case representing the top 3% of the social science journals) changes depending on the type of metric used (Table 7). If the traditional IF is used, both Psychology and Management have a strong presence in the top 100, with 25 and 26 journals respectively, while Economics has 11 journals. Similar results are obtained with the other derived impact factor measures. Economics' presence increases dramatically, from 11 to 35 journals in the top 100, if AI scores are used. In fact, Economics plus Finance represents 40% of the top 100 across all sciences, compared to 14% if the IF is used.

These results are consistent with the average scores and ranks by discipline. We ranked all scientific disciplines according to the different impact metrics and established the IF-based rank as the benchmark. Then we computed the differences in ranks when using a particular metric compared to the benchmark IF-based rank. In Table 8, we include the 20 disciplines that increase their rankings the most when the AI score is used (last column in the table). Mathematics (#1) rises 172 places, Statistics and Probability (#2) rises 170 places, Applied Mathematics (#3) 129 places, Economics (#4) 127 places and Business-Finance (#8). Most of these 20 fields with the largest ranking increases are older fields and very quantitative in nature (therefore more central to the total science spectrum and, according to Rosvall and Bergstrom, 2011, more influential).
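The rank-shift procedure just described is simple to state in code. The sketch below uses hypothetical discipline scores (not the actual WoS averages) to show the mechanics: rank disciplines under each metric, then subtract each discipline's AIS rank from its benchmark IF rank, so positive values mean the discipline moves up under the AIS.

```python
# Rank-shift computation against the IF benchmark (hypothetical scores).

def ranks(scores: dict) -> dict:
    """Map discipline -> rank (1 = best) under a given score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {d: i + 1 for i, d in enumerate(ordered)}

if_scores  = {"Psychology": 3.0, "Management": 2.8, "Economics": 1.9, "Business": 2.1}
ais_scores = {"Psychology": 1.0, "Management": 1.1, "Economics": 1.6, "Business": 0.8}

benchmark = ranks(if_scores)
ais_rank  = ranks(ais_scores)
# Positive shift = the discipline moves up when the AIS replaces the IF
shift = {d: benchmark[d] - ais_rank[d] for d in benchmark}
print(shift)  # Economics shows a positive shift in this toy example
```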

We also include the 10 disciplines that face the largest reductions; smaller fields appear to be the ones most affected by the AIS ranking. Business remains basically in the same position (its rank is only two places higher), while Management experiences an increase of 18 places. These results produce, however, important changes in the relative positions and distances among the Business and Economics disciplines, with Economics in 25th place among all disciplines, Business-Finance in 35th place, Management in 40th place and Business in 74th place (the opposite happens if the impact factor metrics are used). The prevalence of quantitative methods and mathematical modelling, cross-citation between fields, the relative (im)balance in citing vs cited patterns, centrality in the chosen scientific database, the underrepresentation of disciplines in the database (e.g. WoS vs Scopus), the number of journals and the tradition/history of the field are potential reasons for these differences that need further study. Among the business disciplines, Business-Finance, with a heavier empirical and mathematical approach (similar to Economics in its use of econometrics), is the field with the largest ranking increase.

The category Management includes the most traditional and original subdisciplines (organization theory and behaviour, general management and management science) and journals (e.g. Administrative Science Quarterly; Academy of Management Journal; and Academy of Management Review) that regularly cross-cite with journals in Sociology, Psychology, Decision Sciences and Economics, while the Business category includes more recently developed and applied fields like Marketing and Advertising, Logistics, Electronic Markets, Services, that use a combination of quantitative and qualitative approaches and are smaller in size (99 Business vs 167 Management journals) (Table 9).

5. Discussion

5.1 Theoretical implications

Our results confirm previous studies that show high correlations between different impact factor metrics, supporting the idea that they largely measure a similar underlying construct. Most of the important research quality variance is already captured by the regular impact metrics – total cites and the 5-year IF – which are highly correlated with the eigenfactor score and the AIS, respectively. The arguments of Chang et al. (2011, 2013, 2016) and other scholars in favour of simpler and more transparent impact metrics thus gain additional support. However, our results also confirm that in certain cases eigenfactor metrics, in particular the AIS, can capture information that is not captured by standard IF metrics.

They seem to provide some control for self-citations and place a major weight on the centrality (or influence) of particular journals and disciplines in the overall scientific network. Older, more traditional, more interconnected and dense disciplines will be favoured when using the eigenfactor, and particularly the AIS score. However, the main assumption that the value of a discipline is represented by the size and centrality of its network needs to be further discussed and justified. When using these assumptions, the Social Sciences represent less than 10% of the value provided by all sciences (see for example Rosvall and Bergstrom (2011), who estimated 4%).

Thus, when assessing the research outputs of researchers, social science scholars will be undervalued compared to science researchers because their contributions are part of a smaller and less influential subnetwork, according to the way the AIS is computed. The same effect occurs when comparing the Business and Economics disciplines. Under the AIS, Business journals represent 1.14% of the total value in science and Economics journals represent 1.47%. Then, when considering both types of researchers together and using just AIS scores to give recognition, awards and promotions or to assess individual research output, business research will be considered, on average, as having a lower impact.

Therefore, a stronger theoretical discussion is needed regarding the underlying assumptions of specific metrics on the relative value or influence of sciences, social sciences and its particular disciplines.

At the individual level, we suggest that these results may also be connected to the incentives literature on intrinsic and extrinsic motivation, fairness, internal and external equity and the overall design of incentive systems in research organizations (see Welpe et al., 2015, for a compendium on this issue; Rizkallah and Sin, 2010; Fischer et al., 2019).

Our research has also some implications for the business school management literature. Based on the capability-based and strategic fit approaches in strategic theory (Teece, 2007; Kaplan and Norton, 1996), we argue that school outcomes will be better and more sustainable the larger the fit between systems and resource allocations with the original strategy and mission. Research assessment systems are at the core of these processes since they may enhance faculty selection (due to candidates' self-selection or cultural self-replicating actions) and affect faculty promotion and tenure decisions. Research assessment systems may also generate stress with faculty deployment systems to fulfil the faculty needs of both research and teaching value chains.

We argue based on the above theories that these effects may be more negative for Business Schools and universities, the larger the misalignment between the needed research outcomes and the type of research favoured by particular metrics. Consistently, these initial results provide support to the literature, suggesting that journal impact metrics cannot be used as complete substitutes for qualitative assessment of individual research contributions. This is in line with new developments on the research incentives literature examining broader research outcomes given stakeholders' expectations, like research translation, dissemination and utilization (Jessani et al., 2020). Additionally, since assessment systems are developed by those being assessed (professors) affecting their promotions, benefits and internal power, there are obvious self-regulation risks and issues (Gomes and Frade, 2019) that need to be accounted for in the design process. Further research needs to examine these particular relationships in more detail.

5.2 Implications for research assessment systems and strategic management of business schools and universities

The previous discussion has placed special attention on the issues of research assessment to build more valid and fair systems that generate the conditions and incentives to improve the research outputs at the individual, school or institutional level. A related but different issue has to do with the linkage between research assessment systems and strategy formulation and implementation at higher education institutions.

Some scholars will still argue that economic growth research has a stronger intrinsic value than fashion management research, but if you work for a State School of Fashion and you are training fashion marketers in a province of Colombia or Guangzhou, maybe this "less central research" will be the basis for better professional training, the main driver of economic growth in those regions and very valuable for serving your school's mission. However, the use of the eigenfactor and AIS will value the economic growth article as more "influential" than the fashion management piece, just because economics journals are cited by more influential journals and are overrepresented in the WoS database (compared to fashion or management journals). This element should be considered when designing research promotion policies to provide better guidance to faculty and to ensure a consistent strategy and use of resources within schools and universities.

Then, as a general implication, our research confirms the notion that the design of research assessment systems needs to consider both qualitative and quantitative indicators and should be administered by senior scholars with a sound knowledge of the disciplines, the school’s mission and the expectations of a wide variety of stakeholders (not just Academia).

Additionally, since business schools (and other schools within universities) are becoming increasingly multidisciplinary, the inclusion of multiple impact metrics is advised. If a single metric were used for assessing research from different disciplines – e.g. Psychology, Economics, Sociology and Business – the fight over which metric to use would be filled with conflicts of interest and would not necessarily serve the aim of promoting research. Besides, the use of single metrics also provides a fruitful scenario for the appearance of winner-take-all markets, particularly if some fields or subfields have an initial advantage in terms of research traditions, number of existing journals, use of mathematical/quantitative methods and modelling vs cases and qualitative research, or previous publications in the particular subfield.

Similar implications can be drawn for the design of university-wide and national research assessment systems, which should take into consideration a wide variety of fields and disciplines from the Sciences, Social Sciences and Humanities. Despite the intent to control for disciplinary differences in citations, peer review, authorship and publication patterns, it would be difficult to justify that the disciplines in the Natural Sciences have nine times more value (influence) than the Social Sciences. University-wide research assessment and tenure and promotion systems need to have higher legitimacy within all disciplines, and the use of multiple metrics may be relevant for reducing the undesired effects and biases that the embedded computation logic of eigenfactor and AIS calculations produces against the Social Sciences and more peripheral, newer or practice-oriented disciplines.

The above tables indicate that the relative ranks of both disciplines and journals may be very sensitive to the type of impact measure used, and undesired winner-take-all markets may be fostered instead of competitive ones. Also, since dramatic changes are present in journal-level ranks, special care is needed when schools and research bodies use journal-level ranks to assess article quality and research productivity (Mingers and Yang, 2017).

Our results suggest that the use of multiple impact metrics may provide a better solution and a broader perspective on journal and research assessment. No particular metric fulfils all desirable criteria, and some impact metrics – like the eigenfactor and AIS – embed the implicit assumption that certain fields are more influential than others given the existing network size and cross-citation patterns, which goes against their original objective and may reduce their acceptance. Additionally, for business schools with balanced teaching-research missions – like most business schools in emerging nations – it would be less advisable to use impact metrics (e.g. eigenfactor and AIS scores) that consider business disciplines as less valuable and more peripheral than other disciplines that support the business profession (e.g. political science or statistics). It would be difficult to justify business schools' research assessment systems rewarding research in those disciplines more than research on core business subjects.

5.3 Implications for individual research assessment

Finally, research committees at the national, university or school level, when considering individual research records, should use multiple indicators and discipline-based benchmarks. Even bibliometric studies of particular fields or regional areas will provide a better understanding of the contributions of a researcher (school or country) to a particular field (see for example Cancino et al., 2018; Olavarrieta and Villena, 2014). Like traditional IFs, AIS and eigenfactor scores vary considerably between disciplines. To assess individual research performance, you need to combine information from different indicators and go into the particularities of each case. Recently, Nature (2017 Journal Impact Factors, 2018), one of the top science journals, decided to diversify the presentation of its impact and performance indicators. It did so recognizing the differences in citation patterns across disciplines and the fact that IFs sometimes overrate journals with a few very highly cited papers and undervalue research with few citations, particularly in fields with lower citation propensities. Although the AIS may reduce the effects of self-citation patterns, most differences across fields cannot be attributed mainly to this particular dimension. Our results indicate that network-related indicators such as the eigenfactor and the AIS are particularly conditioned by the characteristics or structure of the network, making comparisons based on the AIS particularly complex if researchers publish in different subnetworks or disciplines. Internal and external equity issues should be considered to stimulate extrinsic motivation and avoid the negative effects of unfairness perceptions. Let us consider two cases: (a) individuals with a few non-cited papers published in good, influential, central journals and (b) researchers with several highly cited papers published in journals of more peripheral disciplines.
Would it be fair to rule out the second, not even letting them compete for research funds, awards or promotions, in favour of the first, based solely on a comparison of journal-level IFs? Would this promote relevant, good research in our institutions and countries? Would it promote a sound resource allocation process? Algorithms based on single metrics cannot substitute for research profiles, article-level information and disciplinary peer judgment (Mingers and Yang, 2017; Adams et al., 2019). Multiple indicators are advised, and given the limited relative coverage of business journals in WoS (Harzing, 2020), additional indicators based on Scopus or Google Scholar will also add relevant information.
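The multi-criteria recommendation can be made concrete with a small sketch. The IF and AIS benchmark figures below echo the Business and Economics discipline means reported in this study's tables, but the citations-per-paper benchmarks, the equal-weight averaging and both researcher profiles are hypothetical, invented only to illustrate cases (a) and (b) above:

```python
from statistics import mean

# Hypothetical discipline benchmarks. IF and AIS figures echo the category
# means reported in this study; citations_per_paper is invented.
BENCHMARKS = {
    "economics": {"if": 1.46, "ais": 1.23, "citations_per_paper": 8.0},
    "business":  {"if": 2.29, "ais": 0.77, "citations_per_paper": 12.0},
}

def field_normalized_score(record: dict, field: str) -> float:
    """Equal-weight average of indicator ratios against the researcher's own field.

    A score of 1.0 means 'at the field benchmark' on average, which makes
    researchers from different disciplines roughly comparable.
    """
    bench = BENCHMARKS[field]
    ratios = [record[k] / bench[k] for k in bench]
    return mean(ratios)

# Case (a): papers in central, influential journals but few citations.
econ_researcher = {"if": 2.0, "ais": 1.5, "citations_per_paper": 2.0}
# Case (b): highly cited papers in journals of a more peripheral discipline.
bus_researcher = {"if": 1.5, "ais": 0.6, "citations_per_paper": 30.0}

print(round(field_normalized_score(econ_researcher, "economics"), 2))
print(round(field_normalized_score(bus_researcher, "business"), 2))
```

Under this field-normalized, multi-indicator view, the highly cited case (b) researcher is not dominated by case (a), whereas a single journal-level IF comparison would rank (a) ahead.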

5.4 Future research agenda

In this study, we present descriptive evidence on the general effects of using standard impact metrics versus eigenfactor-derived metrics when assessing and ranking journals and disciplines. We developed plausible explanations for these effects based on the design and computation definitions provided by their developers (Bergstrom, 2007; Bergstrom and West, 2008; Bergstrom et al., 2008; Rosvall and Bergstrom, 2011). Future research may further investigate how different factors shape journal scores and the rankings based on them. Key variables include the age of the field and journal; journal and disciplinary network size and density; centrality and closeness; and cross-citation patterns of the journal or discipline with fields outside the social sciences. The newness and practice orientation of a particular field are also worth exploring. Again, eigenfactor indicators may assign lower impact levels to journals and disciplines that are rising and fostering innovation. If this were the case, the relative validity of standard IFs versus eigenfactor metrics would rise, and the case for peer-based, multi-dimensional assessment systems would be stronger.
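The network dependence discussed here can be illustrated with a toy computation. Eigenfactor-style scores are, in essence, the stationary distribution of a random walk over the journal cross-citation network; the sketch below is a minimal power-iteration version with invented journals and citation counts (the real eigenfactor also weights by article counts and uses a five-year window, which this sketch omits):

```python
# Toy eigenfactor-style score: a damped random walk over a journal
# cross-citation network, with self-citations zeroed out on the diagonal
# as in the eigenfactor method. Journals and counts are invented.
JOURNALS = ["EconA", "EconB", "BusA", "BusB"]
# CITES[i][j] = citations from journal i's articles to journal j.
CITES = [
    [0, 30, 2, 1],   # EconA cites mostly EconB
    [25, 0, 3, 1],   # EconB cites mostly EconA
    [5, 4, 0, 10],   # BusA cites across both fields
    [1, 1, 12, 0],   # BusB cites mostly BusA
]

def influence_scores(cites, damping=0.85, iters=200):
    """Power iteration for the stationary distribution of the citation walk."""
    n = len(cites)
    out = [sum(row) for row in cites]  # total outgoing citations per journal
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for j in range(n):
            # Weight flowing into journal j from every citing journal i.
            inflow = sum(scores[i] * cites[i][j] / out[i] for i in range(n))
            new.append((1 - damping) / n + damping * inflow)
        scores = new
    total = sum(scores)
    return [s / total for s in scores]

scores = dict(zip(JOURNALS, influence_scores(CITES)))
```

In this toy network the two economics journals form a nearly closed citation cluster, so they retain most of the walk's weight and outscore the business pair even though the raw citation counts are comparable; this is exactly the kind of structural effect, rather than a pure citation-volume effect, that the text argues conditions eigenfactor and AIS comparisons across disciplines.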

Future studies may extend the analysis of impact metrics on journal and discipline rankings beyond the WoS database to Scopus. Since Scopus has a wider representation of business journals and territories, smaller changes might be found when using that database.

Both at the individual and institutional levels, it would be relevant to study the effects of using particular metrics, or single-metric versus holistic research assessment systems, to channel research activities and resources, on school research outcomes, teaching outcomes and overall sustainability. At the individual level, research should consider the potential effects of initial conditions (e.g. doctoral school) and the mediating effects of intrinsic and extrinsic motivation in explaining individual output. At the organizational level, studies can focus on school outcomes, on mediating and moderating factors (school resources, faculty size, composition and organizational culture) or on stakeholders' evaluations (e.g. students, employers and accreditation agencies).

From a higher education management perspective, future research may investigate the specific effects of using single indicators on research performance and sustainability at the school or university level, compared with institutions that design broader research assessment systems. The framework presented in Figure 1 can guide such work at different levels of analysis: studies can focus on individual output or on departmental, school, university or country performance. Studies can also examine effects on intermediate variables, such as intrinsic and extrinsic motivation, as drivers of final research output.

Other research avenues include the effects on overall performance, considering both research and teaching outputs, and on the external assessment by stakeholders.

6. Conclusions

Our research shows that research assessment based on a single IF metric (e.g. journal impact factor vs AIS) may bias the evaluation of researchers' or areas' outputs. These biases may be larger when assessment involves researchers from related disciplines – like Business and Economics – with different research foundations and traditions. The AIS favours older, more traditional, quantitative-oriented disciplines that are better connected to the sciences at large, while reducing the assessed value of newer, peripheral or more qualitatively oriented disciplines. A single-metric design may therefore misallocate resources and steer business schools and universities towards outcomes that only partially fit the school's strategic goals and long-term sustainability. The use of peer-based assessment systems combining multiple metrics and criteria is advised to reduce these problematic effects. Business school rankings and accreditation agencies may amplify or mitigate these effects depending on whether they include stakeholder expectations in their methodologies and standards.

Figures

Figure 1. Research assessment system and its impact on strategic sustainability – the risk of misalignment in single metric systems

Figure 2. Eigenfactor scores and other journal indicators

Figure 3. Eigenfactor and AI scores and self-citations (DiffIF: absolute increase in impact factor due to self-citations; PercDiff: percentual increase)

Correlations between impact and eigenfactor metrics for all SSCI and SCIE journals

| | lgIF | lg5YIF | lgIFwoSC | lgEigen | lgAIS | lgCites | lgIMMI |
|---|---|---|---|---|---|---|---|
| lgIF | 1 | 0.967** | 0.988** | 0.786** | 0.864** | 0.723** | 0.790** |
| lg5YIF | 0.967** | 1 | 0.963** | 0.792** | 0.913** | 0.738** | 0.776** |
| lgIFwoSC | 0.988** | 0.963** | 1 | 0.787** | 0.885** | 0.716** | 0.778** |
| lgEigen | 0.786** | 0.792** | 0.787** | 1 | 0.785** | 0.930** | 0.621** |
| lgAIS | 0.864** | 0.913** | 0.885** | 0.785** | 1 | 0.670** | 0.706** |
| lgCites | 0.723** | 0.738** | 0.716** | 0.930** | 0.670** | 1 | 0.578** |
| lgIMMI | 0.790** | 0.776** | 0.778** | 0.621** | 0.706** | 0.578** | 1 |

Note(s): **p < 0.01

Source(s): Own elaboration
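The table's correlations are computed on log-transformed indicators (the `lg` prefix). A minimal pure-Python sketch of the same procedure, using invented IF and AIS values for six hypothetical journals rather than the study's dataset:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical journal-level indicators (IF and AIS for six journals).
impact_factor = [0.8, 1.2, 2.3, 3.1, 4.0, 6.5]
article_influence = [0.3, 0.6, 0.9, 1.4, 1.8, 3.2]

# Log-transform first, as in the table above, to tame the heavy right skew
# of citation-based indicators before correlating.
lg_if = [math.log10(x) for x in impact_factor]
lg_ais = [math.log10(x) for x in article_influence]
r = pearson_r(lg_if, lg_ais)
```

As in the table, a high correlation between two metrics does not imply they produce the same rankings: reordering effects near any top-N cutoff can remain substantial even at r above 0.85.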

Previous impact metrics correlation studies compared

| Metric 1 | Metric 2 | Corr or R2 | Davis (2008) | Fersht (2009) | Rousseau et al. (2009) | Franceschet (2010) | Elkins et al. (2010) | Arendt (2010) | Salvador-Oliván and Agustín-López (2015) | Chang et al. (2016) Economics | This study, all disciplines | This study, Business and Economics |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Eigenfactor | IF | r | | | 0.827 | 0.77 | | | 0.80 | 0.626 | 0.786 | 0.704 |
| | | R2 | 0.86 | | | | | | | | | |
| Eigenfactor | 5YIF | r | | | | 0.77 | | | 0.807 | 0.703 | 0.792 | 0.765 |
| Eigenfactor | IF wo Self Cites | r | | | | | | | | 0.625 | 0.787 | 0.740 |
| Eigenfactor | AI | r | | | 0.827 | 0.76 | | | 0.78 | 0.943 | 0.785 | 0.889 |
| Eigenfactor | Total Cites | r | | | | | | | | | 0.93 | 0.887 |
| | | R2 | 0.95 | 0.968 | | | | | | | | |
| AI | IF | r | | | 0.918 | 0.81 | 0.79 | 0.772 | 0.834 | 0.826 | 0.864 | 0.744 |
| | | R2 | | | | | | 0.596 | | | | |
| AI | 5YIF | r | | | | 0.88 | | | | 0.902 | 0.913 | 0.811 |
| AI | IF wo Self Cites | r | | | | | | | | 0.85 | 0.885 | 0.786 |
| AI | Total Cites | r | | | | | | | | | 0.67 | 0.735 |

Source(s): Own elaboration

Journal impact metrics and relative values by disciplines

| Category | | Impact Factor | 5Yr Impact Factor | IF without Journal Self Cites | Eigenfactor | Norm. Eigenfactor | Article Influence | Total Cites | Citable Items | Relative Contribution to Science | Cite Value Contrib. to Science | Cite Value Relative to Total Avg Cite Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BUSINESS | Mean | 2.29 | 3.16 | 1.90 | 0.00248 | 0.29011 | 0.7657 | 3286.9 | 53.2 | 0.25 | 0.00000076 | 0.486 |
| | N | 99 | 83 | 99 | 99 | 99 | 83 | 99 | 99 | | | |
| BUSINESS, FINANCE | Mean | 1.55 | 2.11 | 1.37 | 0.00425 | 0.49655 | 1.0872 | 2609.8 | 45.1 | 0.36 | 0.00000163 | 1.048 |
| | N | 84 | 78 | 84 | 84 | 84 | 78 | 84 | 84 | | | |
| MANAGEMENT | Mean | 2.52 | 3.47 | 2.18 | 0.00321 | 0.37547 | 1.0196 | 3730.0 | 46.2 | 0.54 | 0.00000086 | 0.554 |
| | N | 167 | 160 | 167 | 167 | 167 | 160 | 167 | 167 | | | |
| ALL BUSINESS | Mean | 2.22 | 3.06 | 1.90 | 0.0033 | 0.3804 | 0.9704 | 3335.8 | 47.9 | 1.14 | | |
| | N | 350 | 321 | 350 | 350 | 350 | 321 | 350 | 350 | | | |
| ECONOMICS | Mean | 1.46 | 1.86 | 1.32 | 0.0046 | 0.5376 | 1.2283 | 2400.8 | 55.3 | 1.47 | 0.00000192 | 1.233 |
| | N | 319 | 306 | 319 | 319 | 319 | 306 | 319 | 319 | | | |
| Total | Mean | 2.33 | 2.54 | 2.12 | 0.0086 | 1.0000 | 0.8 | 5506.3 | 109.6 | 100.00 | 0.00000155 | 1.000 |
| | N | 11,651 | 11,270 | 11,681 | 11,681 | 11,681 | 11,270 | 11,681 | 11,681 | | | |

Source(s): Own elaboration
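The last two columns of the table above follow from the others: a category's "cite value" is its mean eigenfactor divided by its mean total cites (the eigenfactor contribution per citation received), and the relative column divides that by the all-journals figure. A quick check using the table's own rounded category means (small deviations from the printed 0.486 and 1.233 reflect that rounding):

```python
# Reproduce the last two columns of the table from the earlier ones.
# Values are the rounded category means reported in the table above.
rows = {
    "Business":  {"eigenfactor": 0.00248, "total_cites": 3286.9},
    "Economics": {"eigenfactor": 0.0046,  "total_cites": 2400.8},
    "Total":     {"eigenfactor": 0.0086,  "total_cites": 5506.3},
}

def cite_value(row):
    # Mean eigenfactor contribution per citation received.
    return row["eigenfactor"] / row["total_cites"]

total_cv = cite_value(rows["Total"])
for name, row in rows.items():
    cv = cite_value(row)
    print(f"{name}: cite value {cv:.8f}, relative {cv / total_cv:.3f}")
```

The computation makes the paper's point visible in one line each: a Business citation carries roughly half the eigenfactor weight of the average journal's citation, while an Economics citation carries over 1.2 times as much.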

Top 50 journals in Business and Economics according to different journal metrics, rank based on impact factor metrics

Top 50 journals in Business and Economics, rank based on eigenfactor metrics

Journals by discipline in the top 50 Business and Economics rank, given a particular single metric is used to rank the journals

| Category | IF | 5Yr IF | IF wo SC | Eig | AIS |
|---|---|---|---|---|---|
| Econ | 11 | 12 | 13 | 32 | 29 |
| BusFin | 4 | 4 | 4 | 4 | 5 |
| Management | 26 | 23 | 23 | 9 | 14 |
| Business | 9 | 11 | 10 | 5 | 2 |
| Total | 50 | 50 | 50 | 50 | 50 |

Source(s): Own elaboration
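The shifts in the composition tables come from re-sorting the same journal list under different metrics and counting categories in the top N. A sketch of that procedure with invented journal records (names, categories and scores are hypothetical):

```python
from collections import Counter

# Hypothetical journal records: (name, category, IF, AIS). Invented data,
# chosen only to illustrate how the top-N composition shifts with the metric.
journals = [
    ("J1", "Economics", 1.2, 2.5),
    ("J2", "Economics", 1.0, 2.1),
    ("J3", "Management", 3.1, 1.0),
    ("J4", "Management", 2.8, 0.9),
    ("J5", "Business", 2.5, 0.7),
    ("J6", "Business, Finance", 1.8, 1.4),
]

def top_n_composition(journals, key_index, n):
    """Count categories among the top-n journals under the chosen metric."""
    ranked = sorted(journals, key=lambda j: j[key_index], reverse=True)
    return Counter(cat for _, cat, *_ in ranked[:n])

by_if = top_n_composition(journals, key_index=2, n=3)   # rank by IF
by_ais = top_n_composition(journals, key_index=3, n=3)  # rank by AIS
```

With these invented numbers the IF top 3 is dominated by Management journals while the AIS top 3 is dominated by Economics journals, mirroring in miniature the Econ shift from 11 to 29 seen in the top-50 table.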

Journals by discipline in the top 100 All Social Sciences rank, given a particular single metric is used to rank the journals

| WoS category | IF | 5Yr IF | IF wo Self Cites | Eigenfactor Score | Article Influence Score |
|---|---|---|---|---|---|
| Psychology | 25 | 21 | 27 | 31 | 18 |
| Economics | 11 | 12 | 12 | 32 | 35 |
| Management | 26 | 24 | 22 | 11 | 15 |
| Business | 8 | 12 | 8 | 5 | 4 |
| Business and Finance | 3 | 5 | 4 | 4 | 5 |
| Sociology | 4 | 5 | 4 | 1 | 5 |
| Political Science | 2 | 4 | 2 | 3 | 9 |

Source(s): Own elaboration

Differences in rankings for disciplines when using AIS vs impact factor

Relative presence (size) of disciplines in WoS vs Scopus

| SCOPUS categories | SCOPUS | % BusEcon | % Tot Sci | WoS | % BusEcon | % Tot Sci | WoS categories |
|---|---|---|---|---|---|---|---|
| Business, Management and Accounting | 1,441 | | | 140 | | | Business |
| | | | | 210 | | | Management |
| Finance | 269 | | | 98 | | | Business, Finance |
| All Business | 1,710 | 66.6% | 4.6% | 448 | 55.9% | 3.6% | |
| Economics | 858 | 33.4% | 2.3% | 353 | 44.1% | 2.9% | |
| Tot Bus Econ | 2,568 | | | 801 | | | |
| Total Sciences | 37,461 | | | 12,327 | | | |

Source(s): Own elaboration

References

2017 Journal Impact Factors (2018), Journal Citation Reports Science and Social Science Edition.

AACSB (2012), “Impact of research: a guide for business schools, electronic document”, available at: https://www.aacsb.edu/insights/publications/impact-of-research-a-guide-for-business-schools.

AACSB (2020), “AACSB 2020 Guiding principles and standards for business accreditation”, Electronic document, available at: https://www.aacsb.edu/educators/accreditation/business-accreditation/aacsb-business-accreditation-standards.

Adams, J., McVeigh, M., Pendlebury, D. and Szomszor, M. (2019), “Profile, not metrics”, Institute for Scientific Information, Clarivate Analytics, available at: https://clarivate.com/webofsciencegroup/campaigns/profiles-not-metrics/.

Adler, N. and Harzing, A. (2009), “When knowledge wins: transcending the sense and nonsense of academic rankings”, Academy of Management Learning and Education, Vol. 8 No. 1, pp. 72-95, doi: 10.5465/amle.2009.37012181.

Arendt, J. (2010), “Are article influence scores comparable across scientific fields?”, Issues in Science and Technology Librarianship, Vol. 60, Winter, pp. 1-14, doi: 10.5062/F4FQ9TJW.

Azar, O.H. (2009), “The influence of Economics articles on Business Research: analysis of journals and time trends”, The Journal of Industrial Economics, Vol. 57 No. 4, pp. 851-869.

Bergstrom, C.T. (2007), “Eigenfactor: measuring the value and prestige of scholarly journals”, College and Research Libraries News, Vol. 68 No. 5, pp. 314-316.

Bergstrom, C.T. and West, J.D. (2008), “Assessing citations with the Eigenfactor metrics”, Neurology, Vol. 71 No. 23, pp. 1850-1851.

Bergstrom, C.T., West, J.D. and Wiseman, M.A. (2008), “The Eigenfactor metrics”, The Journal of Neuroscience, Vol. 28 No. 45, pp. 11433-11434.

Brito, R. and Rodríguez-Navarro, A. (2019), “Evaluating research and researchers by the journal impact factor: is it better than coin-flipping?”, Journal of Informetrics, Vol. 13 No. 1, pp. 314-324.

Brown, T. and Gutman, S.A. (2019), “Impact factor, eigenfactor, article influence, Scopus SNIP, and SCImago journal rank of occupational therapy journals”, Scandinavian Journal of Occupational Therapy, Vol. 26 No. 7, pp. 475-483.

Cancino, C.A., Merigó, J.M., Torres, J.P. and Díaz, D. (2018), “A bibliometric analysis of venture capital research”, Journal of Economics, Finance and Administrative Science, Vol. 23 No. 45, pp. 182-195.

Carey, R.M. (2016), “Quantifying scientific merit: is it time to transform the impact factor?”, Circulation Research, Vol. 119 No. 12, pp. 1273-1275.

Chang, C.-L., McAleer, M. and Oxley, L. (2011), “How are journal impact, prestige and article influence related: an application to neurosciences”, Journal of Applied Statistics, Vol. 38 No. 11, pp. 2563-2573.

Chang, C.-L., McAleer, M. and Oxley, L. (2013), “Journal impact factor, eigenfactor, journal influence, and article influence”, Tinbergen Institute Discussion Paper TI2013-002.

Chang, C.-L., Maasoumi, E. and McAleer, M. (2016), “Robust rankings of journal quality: an application to economics”, Econometric Reviews, Vol. 35 No. 1, pp. 50-97.

Davis, P.M. (2008), “Eigenfactor: does the principle of repeated improvement result in better estimates than raw citation counts?”, Journal of the American Society for Information Science and Technology, Vol. 59, pp. 2186-2188.

Dorta-González, P. and Dorta-González, M.I. (2013), “Comparing journals from different fields of science and social science through a JCR subject categories normalized impact factor”, Scientometrics, Vol. 95, pp. 645-672.

Eigenfactor (2009), “EigenfactorTM score and article influence TM score: detailed methods”, available at: http://www.eigenfactor.org/methods.pdf.

Elkins, M.R., Maher, C.G., Herbert, R.D., Moseley, A.M. and Sherrington, C. (2010), “Correlation between the journal impact factor and three other journal citation indices”, Scientometrics, Vol. 85, pp. 81-93.

Fersht, A. (2009), “The most influential journals: impact factor and eigenfactor”, Proceedings of the National Academy of Sciences, Vol. 106 No. 17, pp. 6883-6884, doi: 10.1073/pnas.0903307106.

Fischer, C., Malycha, C.P. and Schafmann, E. (2019), “The influence of intrinsic motivation and synergistic extrinsic motivators on creativity and innovation”, Frontiers in Psychology, Vol. 10, Art. 137, pp. 1-15, doi: 10.3389/fpsyg.2019.00137.

Franceschet, M. (2010), “Journal influence factors”, Journal of Informetrics, Vol. 4 No. 3, pp. 239-248.

Garfield, E. (1972), “Citation analysis as a tool in journal evaluation”, Science, Vol. 178 No. 4060, pp. 471-479.

Garfield, E. (2006), “The history and meaning of the journal impact factor”, The Journal of the American Medical Association, Vol. 295 No. 1, pp. 90-93.

Ghoshal, S. (2005), “Bad management theories are destroying good management practices”, Academy of Management Learning and Education, Vol. 4 No. 1, pp. 75-91.

Gibbons, R. and Kaplan, R.S. (2015), “Formal measures in informal management: can a balanced scorecard change a culture?”, American Economic Review, Vol. 105 No. 5, pp. 447-451.

Gomes, O. and Frade, J. (2019), “Fool me once: deception, morality and self-regeneration in decentralized markets”, Journal of Economics, Finance and Administrative Science, Vol. 24 No. 48, pp. 312-326.

González-Pereira, B., Guerrero-Bote, V.P. and Moya-Anegón, F. (2010), “A new approach to the metric of journals' scientific prestige: the SJR indicator”, Journal of Informetrics, Vol. 4 No. 3, pp. 379-391.

Gorraiz, J., Wieland, M. and Gumpenberger, C. (2017), “To be visible, or not to be, that is the question”, International Journal of Social Science and Humanity, Vol. 7 No. 7, pp. 467-471.

Harzing, A.W. (2019), “Two new kids on the block: how do crossref and dimensions compare with Google scholar, microsoft academic, Scopus and the web of science”, Scientometrics, Vol. 120, pp. 341-349.

Harzing, A.W. (2020), “Everything you always wanted to know about research impact”, in Clark, T. and Wright, M. (Eds), How to Get Published in the Best Management Journals, Edward Elgar Publishing, pp. 127-141.

Harzing, A.W. and Van der Wal, R. (2009), “A Google Scholar h-index for journals: an alternative metric to measure journal impact in economics and business”, Journal of the American Society for Information Science and Technology, Vol. 60 No. 1, pp. 41-46.

Haustein, S. and Larivière, V. (2015), “The use of bibliometrics for assessing research: possibilities, limitations and adverse effects”, in Welpe, I., Wollersheim, J., Ringelhan, S. and Osterloh, M. (Eds), Incentives and Performance, Springer, doi: 10.1007/978-3-319-09785-5_8.

Jack, A. (2021), “Business school rankings: the financial times’ experience and evolutions”, Business and Society, May, pp. 1-6, doi: 10.1177/00076503211016783.

Jedidi, K., Schmitt, B.H., Ben Sliman, M. and Li, Y. (2021), “R2M index 1.0: assessing the practical relevance of academic marketing articles”, Journal of Marketing, Vol. 85 No. 5, pp. 22-41, doi: 10.1177/00222429211028145.

Jessani, N.S., Valmeekanathan, A., Babcock, C.M. and Ling, B. (2020), “Academic incentives for enhancing faculty engagement with decision-makers—considerations and recommendations from one School of Public Health”, Humanities and Social Science Communications, Vol. 7, 148, doi: 10.1057/s41599-020-00629-1.

Kaplan, R.S. and Norton, D.P. (1996), The Balanced Scorecard: Translating Strategy into Action, Harvard Business School Press, Boston, MA.

Kianifar, H., Sadeghi, R. and Zarifmahmoudi, L. (2014), “Comparison between Impact factor, Eigenfactor metrics, and SCImago journal rank indicator of pediatric neurology journals”, Acta Informatica Medica, Vol. 22 No. 2, pp. 103-106.

Li, W., Aste, T., Caccioli, F. and Livan, G. (2019), “Early coauthorship with top scientists predicts success in academic careers”, Nature Communications, Vol. 10 No. 1, pp. 1-9.

Merigó, J.M., Zurita, G. and Link-Chaparro, S. (2016), “Normalization of the article influence score between categories”, in Proceedings of the World Congress on Engineering, Vol. 1, pp. 403-408.

Mingers, J. and Yang, W. (2017), “Evaluating journal quality: a Review of journal citation indicators and ranking in business and management”, European Journal of Operational Research, Vol. 257 No. 2, pp. 323-337.

Morales, M. and Calderón, L.F. (1999), “Assessing service quality in schools of business: dimensions of service quality in continuing professional education”, Journal of Economics, Finance and Administrative Science, Vol. 5 Nos 9-10, pp. 125-140, doi: 10.46631/jefas.1999.n9-10.06.

O'Brien, J.P., Drnevich, P.L., Crook, T.R. and Armstrong, C.E. (2010), “Does business school research add economic value for students?”, Academy of Management Learning and Education, Vol. 9 No. 4, doi: 10.5465/amle.9.4.zqr638.

Olavarrieta, S. and Villena, M. (2014), “Innovation and business research in Latin America: an overview”, Journal of Business Research, Vol. 67 No. 4, pp. 489-497.

Paulus, F.M., Cruz, N. and Krach, S. (2018), “The impact factor fallacy”, Frontiers in Psychology, Vol. 9, 1487, doi: 10.3389/fpsyg.2018.01487.

Peters, K., Thomas, H. and Smith, R.R. (2018), Rethinking the Business Models of Business Schools: A Critical Review and Change Agenda, Emerald Publishing, Bingley.

Pitt-Watson, D. and Quigley, E. (2019), “Business school rankings for the 21st century”, United Nations Global Compact Report, available at: https://grli.org/resources/business-school-rankings-for-the-21st-century-january-2019/.

Riazanova, O. and McNamara, P. (2015), “Socialization and proactive behavior: multilevel exploration of research productivity drivers in the US business schools”, Academy of Management Learning and Education, Vol. 15 No. 3, doi: 10.5465/amle.2015.0084.

Rizkallah, J. and Sin, D.D. (2010), “Integrative approach to quality assessment of medical journals using impact factor, eigenfactor, and article influence scores”, PLoS One, Vol. 5 No. 4, e10204.

Rosvall, M. and Bergstrom, C.T. (2011), “Multilevel compression of random walks on networks reveals a hierarchical organization in large integrated systems”, PLoS One, Vol. 6 No. 4, e18209.

Rousseau, R. and the STIMULATE 8 Group (2009), “On the relation between the WoS impact factor, the Eigenfactor, the SCImago Journal Rank, the Article Influence Score and the journal h-index”, available from E-LIS archive, ID: 16448.

RRBM co-founders (2020), “A vision for responsible research in business and management: striving for useful and credible knowledge”, Position paper, available at: www.rrbm.network.

Salcedo, N. (2021a), “Editorial: review and roadmap from the last 10 years (2010-2020)”, Journal of Economics, Finance and Administrative Science, Vol. 26 No. 51, pp. 2-6, doi: 10.1108/JEFAS-06-2021-271.

Salcedo, N. (2021b), “Editorial: an upcoming 30th anniversary encouraging the papers' publication”, Journal of Economics, Finance and Administrative Science, Vol. 26 No. 52, pp. 178-181, doi: 10.1108/JEFAS-11-2021-329.

Salvador-Oliván, J.A. and Agustín-López, C. (2015), “Correlación entre indicadores bibliométricos en revistas de Web of Science y Scopus”, Revista General de Información y Documentación, Vol. 26 No. 51, pp. 2-6.

Teece, D.J. (2007), “Explicating dynamic capabilities: the nature and microfoundations of (sustainable) enterprise performance”, Strategic Management Journal, Vol. 28 No. 13, pp. 1319-1350, doi: 10.1002/smj.640.

Teece, D.J. (2017), “Dynamic capabilities and (digital) platform lifecycles”, In Furman, J., Gawer, A., Silverman, B. S. and Stern, S. (Eds.), Advances in Strategic Management, Vol. 37, pp. 211-225, published online, doi: 10.1108/s0742-332220170000037008.

Walters, W.H. (2014), “Do article influence scores overestimate the citation impact of social science journals in subfields that are related to higher-impact natural science disciplines?”, Journal of Informetrics, Vol. 8, pp. 421-430.

Waltman, L. and Van Eck, N. (2010), “The relation between eigenfactor, audience factor, and influence weight”, Journal of the American Society for Information Science and Technology, Vol. 61 No. 7, pp. 1476-1486.

Weller, K. (2015), “Social media and altmetrics: an overview of current alternative approaches to measuring scholarly impact”, in Welpe, I., Wollersheim, J., Ringelhan, S. and Osterloh, M. (Eds), Incentives and Performance, Springer, Cham, doi: 10.1007/978-3-319-09785-5_16.

Welpe, I., Wollersheim, J., Ringelhan, S. and Osterloh, M. (Eds), (2015), Incentives and Performance, Springer, Cham, ebook doi: 10.1007/978-3-319-09785-5.

Zitt, M. and Small, H. (2008), “Modifying the journal impact factor by fractional citation weighting: the audience factor”, Journal of the American Society for Information Science and Technology, Vol. 59 No. 11, pp. 1856-1860.

Acknowledgements

Funding: This research was funded by the FEN Universidad de Chile Research Fund.

Corresponding author

Sergio Olavarrieta can be contacted at: solavar@fen.uchile.cl
