Methodological roadmapping: a study of centering resonance analysis

Purpose – This paper aims to map the creation and evolution of centering resonance analysis (CRA). This method was an innovative approach developed to conduct textual content analysis in a semi-automatic, theory-informed and analytically rigorous way. Nevertheless, despite its robust procedures to analyze documents and interviews, CRA is still broadly unknown and scarcely used in management research.
Design/methodology/approach – To track CRA's development, the roadmapping approach was properly adapted. The traditional time-based multi-layered map format was customized to depict, graphically, the results obtained from a systematic literature review of the main CRA publications.
Findings – In total, 19 papers were reviewed, from the method's introduction in 2002 to its last tracked methodological development. In all, 26 types of CRA analysis were identified and grouped in five categories. The most innovative procedures in each group were discussed and exemplified. Finally, a CRA methodological roadmap was presented, including a layered typology of the publications, in terms of their focus and innovativeness; the number of analyses conducted in each publication; references for further CRA development; a segmentation and description of the main publication periods; main turning points; citation-based relationships; and four possible future scenarios for CRA as a method.
Originality/value – This paper offers a unique and comprehensive review of CRA's development, favoring its broader use in management research. In addition, it develops an adapted version of the roadmapping approach, customized for mapping methodological innovations over time.


Introduction
Case studies have been essential to the development of management theories. The evolution of strategic management research, both worldwide (Herrmann, 2005; Hitt et al., 1998; Hoskisson et al., 1999; Ketchen et al., 2008) and particularly in Brazil (Colla et al., 2011; Walter et al., 2008), exemplifies this.
Case-oriented research [1], unlike variable-oriented studies, allows intensive knowledge of the objects of study (Blatter and Haverland, 2012), which in turn enables distinctive theory-building approaches (George and Bennett, 2005). To be robust, however, the research inquiry needs to rely on formal analysis (Mahoney, 2000), and most of the analytical methods developed so far are not applicable to case analysis (Ragin, 2008). Thus, it is necessary to develop methodological alternatives that are adequate for this specific kind of analysis. Some initiatives in that direction have been observed over the past decades, especially in sociology and political science.
In the late 1980s, for instance, Ragin (1987) published the first qualitative comparative analysis (QCA) book. Since then, more than 500 academic papers on the theme have been published (www.compasss.org/bibdata.htm, accessed on 16 October 2017), including two other books written by Ragin (2000, 2008), which consolidated QCA principles. Similarly, event structure analysis (ESA) was proposed in the late 1980s in sociology (Heise, 1989). In addition to the seminal article, Corsaro and Heise (1990), Griffin (1993) and Griffin and Korstad (1998) established the foundations of the method. Because of its originality and logical rigor, ESA, together with QCA, established a new methodological category called formal qualitative analysis (Griffin and Ragin, 1994). Several applications of ESA procedures were made in the social sciences during the 1990s and 2000s (www.indiana.edu/socpsy/ESA/ESApubs.html, accessed on 16 October 2017).
More recently, original approaches were developed to track the causal chains observed in case studies (Bennett, 2010; George and Bennett, 2005; Hall, 2008; Blatter and Haverland, 2012). Among them, causal process tracing has been gaining importance in comparative historical research (Hall, 2008; Kittel and Kuehn, 2013; Rohlfing, 2013). Combinations of this approach with ESA (Mahoney, 2012) and QCA (Baumgartner, 2013; Schneider and Rohlfing, 2013) have even been proposed.
Finally, the same gap in adoption by management research is observed for centering resonance analysis (CRA). Developed in the field of social communication, CRA is a methodological innovation for case analysis which is used to infer: the most important words in a discourse (either a transcript or a written text); the way in which these words interact with one another; and the similarity between different texts. Frequently adopted in information and computer sciences, CRA was only recently introduced into management-oriented journals (Hofer et al., 2012; Tate et al., 2010; Barbosa et al., 2017). Yet only a few works using this method have been published in this area, and no Brazilian publication has been found. The same observation can be made for several other case-oriented research methods [2]. However, it is not the goal of this section to review these possibilities exhaustively. The proposition of this article is that there are methodological innovations in case analysis which could be more broadly and effectively used in management research.
As it is one of the most recent case analysis innovations, CRA was chosen as this article's focus, to facilitate its diffusion among management scholars. Thus, the overall aim of this research was to map CRA's methodological evolution. The specific objectives were: to build an adequate typology to classify the analytic processes associated with CRA; to evaluate the methodological innovations of CRA's publications; to track the evolution of CRA, identifying its main historical patterns, development periods and turning points; and to characterize CRA's state of the art by inferring possible scenarios for this methodological strand.
RAUSP 53,3

Methodology
To achieve these objectives, a literature review (Hart, 1999; Knopf, 2006) was conducted, comprising a bibliographic survey, a narrative review and a graphic review, the last by means of a methodological roadmapping procedure.
The bibliographic survey was carried out in the following databases: ISI Web of Science (WoS), Scopus and SciELO [3], all of which are multidisciplinary [4]. WoS was chosen for its selective indexing of only high-impact journals (cf. JCR, Journal Citation Reports), Scopus for its broader search reach and SciELO for its Ibero-American representativeness. Table I lists the five search entries that were used. The entries used both the method's full name and its initials, were applied to both the "topic" and the "title" search fields and were written in English, Portuguese and Spanish.
The results were reviewed narratively, focusing on the points most relevant to the objectives of this research. The review was also done in a graphical format (Grant and Booth, 2009), adapting the typical roadmap structure (Freitas et al., 2013; Freitas, 2014; Freitas et al., 2011b; Phaal and Muller, 2009; Phaal et al., 2010; Oliveira et al., 2012). The detailed procedures can be found in the next section, together with the roadmap itself.

Results
The bibliographic survey results are shown in Table II. Out of the 20 items, full-text access was not granted for only one paper (Willis and Miertschin, 2010b). Most of these publications were journal articles, but there were also some book chapters and conference papers. A few publications had high citation levels, but most had not yet shown a high impact on the academic literature, as expected.
3.1 Introductory publication
CRA was proposed by Corman et al. (2002). In this introductory publication, CRA was presented as a new type of computerized text analysis and, more specifically, as a representational method, because its goal is to extract an efficient representation of the content of a given text. As these representations have a (word) network format, the authors also classified CRA as a network analysis method. According to Corman et al. (2002), CRA focuses on texts because these correspond to a detailed level of human communication (i.e. the word level). On the other hand, the fact that it is computerized makes the analysis applicable to a wide range of textual material (i.e. in terms of both quantity and content aggregation level). The authors stress that this combination of depth and range was not possible with the common methodological alternatives in human communication research (the cited examples are ethnographic participant observation, conversation analysis, surveys and computational simulation).
Moreover, as it adopts a representational approach, CRA seeks to extract the words' meaning without using external references (e.g. dictionaries). Thus, results of different studies are easily comparable. Finally, by representing texts as word networks, CRA brings to bear the full analytical potential of the techniques developed for studying this kind of data structure (Carley, 1997; Carley and Kaufer, 1993).
However, according to Corman et al. (2002), what differentiated CRA (i.e. in 2002) from other computerized word-network representation methods was its capacity to identify units of analysis and their relations in a text. While other similar methods rely on words' co-occurrence within a window of text defined by a given software to identify these units, CRA unitizes and links the words based on a linguistic theory that considers the way in which texts are produced. Thus, CRA is not based on arbitrary software window sizes but, rather, on a consistent theoretical perspective.
Specifically, CRA is based on centering theory (Grosz et al., 1995; Walker et al., 1998). This theory posits that human beings bring coherence to discourses by using "centers". Centers can be understood as nouns and adjectives (i.e. noun phrases) that work as the subject or object of a sentence. Therefore, the more these centers refer to each other (both retrospectively and prospectively), the more coherent a discourse becomes. Another important notion in CRA is "resonance", which is a measure of the discursive similarity between two different texts. This similarity is based on words' co-occurrence weighted by their importance in the corresponding texts. High resonance values, for instance, indicate that the two texts are very similar in the way that discursive coherence was obtained using words, i.e. in how words were articulated and in their relative importance.
Corman et al. (2002) define four basic steps for implementing CRA. In the first step ("Selection"), noun phrases (i.e. phrases that contain at least one noun, associated or not with an adjective) [5] are identified. These selected phrases are minimally pre-processed to eliminate pronoun ambiguity (by replacing pronouns with their respective nouns) and to standardize prefixes and suffixes (e.g. plural stemming).
In the second step ("Linking"), words belonging to the same noun phrase are connected (non-directionally). In addition, the last word of a noun phrase is connected to the first word of the following noun phrase. The underlying assumption is that this way of connecting words reflects the writer's (unconscious) centering process. Once linked, the words form a semantic network.
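The linking rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `link_noun_phrases` and the example phrases are hypothetical, the noun phrases are assumed to have been already extracted and normalized in the "Selection" step, and it is assumed that every pair of words inside a phrase is connected.

```python
from itertools import combinations


def link_noun_phrases(noun_phrases):
    """Build an undirected word network from an ordered list of noun phrases.

    Each noun phrase is a list of already-normalized words (hypothetical input;
    the parsing done in the "Selection" step is not shown here).
    Returns a set of edges, each edge being a frozenset of two words.
    """
    edges = set()
    previous_last = None
    for phrase in noun_phrases:
        if not phrase:
            continue
        # Assumption: all words inside the same noun phrase are linked pairwise.
        for a, b in combinations(phrase, 2):
            if a != b:
                edges.add(frozenset((a, b)))
        # The last word of the previous phrase is linked to the first word of this one.
        if previous_last is not None and previous_last != phrase[0]:
            edges.add(frozenset((previous_last, phrase[0])))
        previous_last = phrase[-1]
    return edges


# Hypothetical noun phrases, in the order in which they appear in a text.
phrases = [["strategic", "management"], ["case", "study"], ["management", "theory"]]
network = link_noun_phrases(phrases)
```

Under these assumptions, the three example phrases yield a network in which "management" is linked to every other word, because it appears in two phrases and also sits at two phrase boundaries.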
The third step is "Indexing". Here, a word's betweenness centrality is calculated to measure how much the discursive coherence of a text depends on that specific word; this value is the word's influence. Once betweenness values are calculated, the resonance between two texts can be obtained. Resonance is calculated as the sum, over the words that occur in both texts, of the products of each word's influence in the two texts. Resonance can also be found for pairs of words (instead of individual words) by using pair influence (betweenness) as its input parameter. Both single-word and word-pair resonance must be standardized to compensate for differences in text size.
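The indexing step can be sketched in Python as follows. The sketch makes two assumptions that are not stated in the text above: influence is taken as betweenness centrality normalized by the number of word pairs (computed here with Brandes' algorithm), and resonance is standardized with a cosine-style denominator. The function names `betweenness` and `word_resonance` and the toy networks are hypothetical.

```python
from collections import deque
from math import sqrt


def betweenness(edges):
    """Normalized betweenness centrality for an undirected, unweighted graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to every node.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    # Every unordered pair was visited twice, so divide by (n - 1) * (n - 2).
    scale = (n - 1) * (n - 2) if n > 2 else 1
    return {v: score[v] / scale for v in adj}


def word_resonance(influence_a, influence_b):
    """Sum of influence products over shared words, with an assumed cosine-style standardization."""
    shared = influence_a.keys() & influence_b.keys()
    raw = sum(influence_a[w] * influence_b[w] for w in shared)
    norm = sqrt(sum(x * x for x in influence_a.values())
                * sum(x * x for x in influence_b.values()))
    return raw / norm if norm else 0.0


# Three hypothetical word networks, given as lists of undirected edges.
text_a = [("case", "study"), ("study", "method")]
text_b = [("case", "study"), ("study", "theory")]
text_c = [("study", "case"), ("case", "theory")]
inf_a, inf_b, inf_c = betweenness(text_a), betweenness(text_b), betweenness(text_c)
same_center = word_resonance(inf_a, inf_b)
different_center = word_resonance(inf_a, inf_c)
```

In this toy example, texts A and B are organized around the same central word and resonate fully, whereas text C is organized around a different word and does not resonate with A, although the two texts have words in common. This is the sense in which resonance weighs shared words by their importance rather than by their mere presence.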
Finally, the fourth step is "Application", in which the indexed network (or a part of it) can be used for specific analytical purposes (e.g. visualization, scaling, clustering and information retrieval).
In short, CRA's introductory publication highlighted that the method introduced a new way, consistent from a theoretical perspective and semi-automated from a technical point of view, of identifying important words and similar texts in a textual corpus [6].

3.2 The evolution of the method
This section reviews the evolution of CRA since the seminal publication by Corman et al. (2002). The 19 papers collected were ordered chronologically. For each of them, we listed: the type of data used; the analytical procedures (categorized by type and group), in the order in which they were reported; and the results obtained.
Finally, each analysis-result combination was assessed [7] for its innovativeness in relation to the previously published combinations. Analyzing the 19 papers in this way reveals the diversity of data already analyzed in CRA applications, for example, from short excerpts (Dooley et al., 2002) to thousands of pages (Tate et al., 2010; Canary and Jennings, 2008). In addition, 26 kinds of analysis were performed in the applications of the method (Table IV). These types could be grouped into five distinct groups [8]. In the following subsections, we highlight the main innovations for each of these groups of analytical procedures.
Most of the methodological innovations corresponded to combinations of CRA with other methods (group "CRA + other methods"). Among these, the comparison between CRA results and those from other methods for cross-validation was the most innovative type (Type C). Dwyer (2012), for example, pointed to the superiority of the CRA resonance index in its ability to distinguish influential writers from writers susceptible to influence in sequential texts (e.g. posts from a blog and their subsequent comments). Another type of innovative analysis was the introduction of statistical tests to verify the significance of values (or differences in values) of the CRA indexes (Type G). Tate [...] Cluster analysis (Type D) and exploratory factor analysis (Type K) were the most reproduced procedures in subsequent articles (i.e. seven and six times, respectively).

3.2.2 Calculation and comparison of influences.
The second most analytically innovative group was the one concerned with calculations and comparisons of influences ("influence" group). Comparing and contrasting the influence values between groups of texts (Type J) can be considered an innovative approach because of how this procedure was operationalized (i.e. differently from other analyses). O'Connor and Shumate (2014), for instance, compared the highest and the lowest influence of words and word pairs between distinct groups of texts to identify typical terms of each group. Another type of analysis from this group was the calculation of theme-level influence [9] (Type L). Both the sum (Hofer et al., 2012) and the average (Williamson et al., 2004) of the word influences associated with the theme were used (for a hybrid procedure, see Tate et al., 2010).
Finally, there was also some innovation in calculating the influence of words or word pairs (Type H). A simple innovation was to calculate a word's influence in a set of texts as the average of the word's influence in each of the corresponding texts (Williamson et al., 2004). This was only an incremental innovation, but it seems to be more consistent than the alternative solution (Canary and Jennings, 2008) of calculating this aggregate influence by treating the set of texts as a single text.
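The two aggregation ideas discussed in this subsection (theme-level influence as a sum or an average, and set-level influence as an average over texts) can be sketched in Python. The function names, the example influence values and the choice to count absent words as zero are assumptions made for illustration only; they are not taken from the reviewed publications.

```python
from statistics import mean


def theme_influence(word_influence, theme_words, how="sum"):
    """Aggregate the influence of the words assigned to one theme in one text."""
    # Assumption: a theme word that does not occur in the text has zero influence.
    values = [word_influence.get(word, 0.0) for word in theme_words]
    if how == "sum":
        return sum(values)
    if how == "mean":
        return mean(values) if values else 0.0
    raise ValueError("how must be 'sum' or 'mean'")


def set_influence(influences_per_text, word):
    """Average a word's influence over all texts of a (non-empty) set."""
    # Assumption: texts in which the word is absent contribute zero to the average.
    return mean(text.get(word, 0.0) for text in influences_per_text)


# Hypothetical word influences for two interview transcripts.
interviews = [
    {"supplier": 0.5, "buyer": 0.25, "contract": 0.125},
    {"supplier": 0.25, "price": 0.5},
]
theme = ["supplier", "buyer"]
theme_sum = theme_influence(interviews[0], theme, how="sum")
theme_mean = theme_influence(interviews[0], theme, how="mean")
supplier_in_set = set_influence(interviews, "supplier")
```

The sum rewards themes expressed through many words, while the average does not, which is one reason why the two variants can rank the same themes differently.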

3.2.3 Calculation and comparison of resonances.
This group ranked third in number of innovations but was proportionally the most innovative (i.e. eight of its nine analysis-outcome combinations were innovative). We note, however, that no innovation of this group has been replicated so far.
In this group, the most important type of analysis was the calculation of the resonance between sequential texts (Type P). All four analytical procedures of this type were innovative. Canary and Jennings (2008) introduced this way of using the resonance indexes (i.e. longitudinally). Dwyer (2012) developed it considerably by distinguishing source resonance (i.e. between a text and a later text) from target resonance (i.e. between a text and an earlier text). Based on this differentiation, the author proposed a new set of metrics to identify the influence of a writer or a theme in a sequence of texts (Dwyer, 2012). Dwyer (2012) also innovated by proposing a way of discounting the effect of homophily (i.e. preexisting cognitive similarity) in calculating the resonance between two writers (Type W). Another innovation introduced by the author was the calculation of theme-level resonance, i.e. not of words or word pairs alone (Type V). Considering that different people can use different words to refer to the same theme, Dwyer (2012) proposed that different words that correspond to the same theme should also be considered the same in the resonance calculation.
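The distinction between source and target resonance can be illustrated with a short Python sketch. Dwyer's (2012) actual metrics are not reproduced here: averaging a text's resonance with all later texts (source) and with all earlier texts (target) is an assumption made only for illustration, as are the cosine-style standardization, the function names and the example influences.

```python
from math import sqrt
from statistics import mean


def resonance(influence_a, influence_b):
    """Standardized word resonance between two texts (cosine-style, an assumption)."""
    shared = influence_a.keys() & influence_b.keys()
    raw = sum(influence_a[w] * influence_b[w] for w in shared)
    norm = sqrt(sum(v * v for v in influence_a.values())
                * sum(v * v for v in influence_b.values()))
    return raw / norm if norm else 0.0


def source_and_target(influences):
    """For texts in chronological order, average resonance with later and earlier texts."""
    profile = []
    for i, current in enumerate(influences):
        later = [resonance(current, other) for other in influences[i + 1:]]
        earlier = [resonance(current, other) for other in influences[:i]]
        profile.append({
            # Source resonance: how much later texts echo this one (None for the last text).
            "source": mean(later) if later else None,
            # Target resonance: how much this text echoes earlier ones (None for the first text).
            "target": mean(earlier) if earlier else None,
        })
    return profile


# Hypothetical word influences for a blog post followed by two comments.
posts = [
    {"budget": 1.0, "plan": 0.0},
    {"budget": 1.0, "cost": 0.0},
    {"holiday": 1.0},
]
profile = source_and_target(posts)
```

In this toy sequence the first post has a non-zero source resonance because the first comment picks up its central word, whereas the second comment, organized around an unrelated word, has zero target resonance.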
Finally, another interesting innovation in this group was the calculation of the resonance between different types of texts (Type F). Dooley et al. (2002) calculated the resonance between consulting demands (expressed in the form of inquiries) and teachers' resumés to identify which teacher would be the most recommended to answer each question.
3.2.4 Interpretation and comparison of word networks.
The group of interpretation and comparison of word networks also introduced some innovations. The way that Corman et al. (2002) interpreted a network became the standard for later publications (Type A). The same was true for the comparison between different networks. On the other hand, the way in which Garyantes and Murphy (2010) compared and contrasted two networks is not recommended, as they focused the analysis only on the identification of unique or shared words, or word pairs (Type B).
3.2.5 Preprocessing of texts.
Finally, the last group of analytical procedures, with the lowest total number of innovations but proportionally innovative (i.e. five of seven analysis-result combinations), was the preprocessing of texts. Willis and Miertschin (2010a, 2010b) inserted implicit centers in the original texts to increase their textual coherence. O'Connor and Shumate (2014) fixed the number of texts per set of texts to limit the undesirable effects of differences in text size (Type T). Williamson et al. (2004), on the other hand, developed a specific ontological dictionary for their CRA application to associate the original words with semantically broader categories. With the same purpose, Dwyer (2012) used a standard English thematic dictionary to automatically categorize all words of the texts under analysis.

3.3 Graphical review
Having reviewed the main innovations introduced during the evolution of CRA, we illustrate this methodological development graphically in Figure 1. The bubbles represent the 19 publications (see numbering in Table II). Bubble size corresponds to the number of different types of analysis performed in each publication, ranging from 1 (articles 15 and 19) to 11 (article 10). The arrows represent the citations (i.e. when a previous publication was cited by a subsequent work). As publication 1 is cited by all publications except the 19th, its arrows were formatted differently to indicate this paper's widespread impact without visually polluting the map. The asterisk indicates turning points, highlighting publications that were highly cited and which, therefore, are outstanding. The underlined numbers, on the other hand, emphasize very innovative publications that were subsequently not mentioned at all, thus representing items relevant for building a new agenda of CRA's methodological research and development.
The bubbles are positioned in relation to two axes: publication typology (vertical) and time (horizontal). The typology was inductively constructed from the categorization of publications in two dimensions: transversality (i.e. transversal versus focal) and innovation (i.e. innovative versus conservative). A publication was considered transversal if its analytical procedures covered at least three of the five analysis groups, and focal if not. Similarly, from the analysis of the distribution of innovative analysis-outcome combinations [10], a publication [...]
Figure 1. Notes: Legend. Bubble: reviewed work (Table II) [...] the more innovative (for example, "3" is more innovative than "1", which is more innovative than "2").
On the horizontal axis, analyzing the distribution of publications per year, the past was divided into three main historical phases (separated by two periods without publications). The first phase (2002-2004) was entitled "Introduced by the idealizers" because it comprises three publications authored by the proponents of the CRA method (Steven Corman, Timothy Kuhn, Robert McPhee and Kevin Dooley). The second (2008-2010) was entitled "Few reproductions and big innovations" because it reproduces publication 1 in a focal and conservative way (i.e. publications 5, 6, 7 and 9), but also includes considerable innovations, such as publications 4 and 10 (transversal) and, to a lesser degree, publication 8. Finally, the third historical period (2012-2013) was named "Second generation reproduction" because most of its publications are conservative and based on publications of the second period (cf. citations of 4, 6, 8 and 10). The recent past/present, or state of the art, has been called "Conservative, with apparent alienation" because, so far, it is restricted to conservative publications, unaligned with developments from previous periods.
From this visualization, we suggest four scenarios based on the combination of the publications' typology (Figure 1). A possible scenario would be the persistence of a conservatism focused on a small set of well-established practices of the method. In this case, CRA may become marginalized, being treated as a method of minor importance or small analytical potential.
A second scenario would result from some aspects of CRA gaining prominence over others because of the emphasis on some of its technical particularities in its future developments. In this scenario, CRA could evolve to an updated version, specialized in the technical improvement of a subset of its initial characteristics.
A third scenario could be the reproduction of applications of the current version of the method. In this case, a widespread diffusion of CRA can be expected. However, the speed of diffusion would depend on the publication rate per year and on the capability of these new publications to incorporate the methodological benefits of CRA that were developed over the past years.
Finally, a fourth scenario would be characterized by a group of methodological innovations in various aspects of the method, driving CRA to a notable change when compared to its original proposition. In this case, new text coherence theories (e.g. vein theories; Cristea et al., 1998), new semantic network centrality calculations, new similarity matrix metrics and new combinations with other methods (e.g. with ESA, due to its focus on verb phrases rather than on noun phrases) could foster the emergence of a new method, more robust than CRA.
This literature review was done precisely to contribute to the further evolution of CRA in this direction, towards a type of scenario that considers the rich methodological framework that has already been developed for the application of the method.

Centering resonance analysis' contributions to management research
CRA is relevant for management research in situations similar to those in which other content analysis methods are applied, as in the applications identified by Duriau et al. (2007), for example. After all, like these other methods, CRA consists, essentially, of a quantitative analysis of textual data. In this sense, CRA can support any analytical work intended to infer the most important words of a discourse (transcript or written), the way these words relate to each other or the similarity level among different texts. However, unlike the most common approaches to content analysis, CRA determines the importance of each word by calculating its betweenness centrality, which is more accurate than counting its frequency of occurrence. The method selects words and links them based on centering theory and not on subjective criteria or on arbitrary software characteristics (e.g. text window size). Finally, CRA provides a robust index to compare texts' similarity (i.e. resonance), whereas other methods limit themselves to comparing the most frequent words used in each text.
Specifically, regarding management research, CRA could be applied to: literature reviews (e.g. most important words from a research stream; similarity between articles; article clusters); interview analysis (e.g. identification of main themes and their connections; similarity between interviewees' discourses); and document analysis (e.g. codes of conduct; CEOs'/shareholders' letters; advertisements and releases; corporate reports).
For example, Barbosa et al. (2017) highlight some possible applications of CRA in supply chain management: analysis of documents exchanged between suppliers and buyers; job description studies (i.e. knowledge, tasks and responsibilities) in supply chain management; identification of the competitive forces that influence the supply chain of an industrial sector, from an interviewee's point of view; and corporate report analysis to understand how companies publish their social and environmental strategies.
Finally, we suggest Visone (www.visone.info/html/extensions.html) as a supporting tool for CRA's application, specifically its "Natural Language Processing (NLP) extension". This extension includes a CRA "module" capable of transforming an input text into a word network, using centering theory rules and the Stanford Lexicon Parser, embedded in the program. The software also provides a tutorial for using this module.

Conclusion
The underlying proposition of this paper is that management research does not satisfactorily explore the methodological innovations that are applicable to case analysis, despite their potential for theoretical development. Specifically, we have argued for a widespread use of CRA because, as our literature review emphasizes, it is still a relatively unknown method, but considerably useful to analytical approaches involving written or transcribed materials (e.g. case studies).
To contribute to CRA's wider adoption, this paper mapped the evolution of the method, highlighting the main publications and innovations that contributed to its development. We reviewed 19 publications from selected databases and identified 26 technical analysis types concerning CRA. These were clustered into five distinct groups, for each of which we pointed out the corresponding methodological innovations. In addition, we presented a methodological roadmap of CRA's evolution, divided into past, present and future scenarios foreseen for the method.
Surely, this paper has its own limitations, which open up possibilities for future refinement. First, new article databases, less selective in nature, should be incorporated to increase the number of papers included. During the literature review, we identified new references associated with CRA that were not included in this paper. In this sense, a citation analysis of each reviewed publication could refine these initial search results.
In addition, the innovativeness categorization and assessment may be refined. Instead of having two research assistants carry out this process jointly, the task could be done independently by each of them, so that coding robustness could be evaluated a posteriori. This strategy would strengthen the reliability of the results and, therefore, of the corresponding inferences.
However, we hope that this paper, despite its limitations, will contribute as a reference for how a methodological roadmap can be constructed and analyzed, so that other relevant innovations may be mapped and tracked over time.