This study aims to present a guide for using grounded theory methods for exploring organizational phenomena of the new online era.
A reflexive account is adopted on how one can build upon the foundations of traditional offline grounded theory for conducting grounded theorizing with online-based data.
Guidelines for conducting grounded theory in online contexts are presented for crafting research questions, gathering online data and using consolidated methods for analyzing online data. This study shows future and present challenges posed by the new online era for grounded theorizing, as well as helpful lessons to be learned from traditional offline grounded theory to mitigate them.
The implications are helpful for established qualitative organizational scholars who have yet to catch up in the boundary-spanning process of using digital sources of data in grounded theory. They are equally helpful for newcomers to qualitative grounded theory, guiding them on where and how to start the challenging research endeavor of grounded theorizing in this new online era.
Scant attention has been given to applications of grounded theory in the new online era. The differences between online and offline settings have not been clearly defined to date, nor do guidelines exist for how qualitative grounded theorists can take advantage of online data to build theory about new organizational phenomena emerging in the online era.
Bonfim, L. (2020), "Spanning the boundaries of qualitative grounded theory methods: breaking new grounds into the new online era", RAUSP Management Journal, Vol. 55 No. 4, pp. 491-509. https://doi.org/10.1108/RAUSP-04-2019-0061
Emerald Publishing Limited
Copyright © 2020, Leandro Bonfim.
Published in RAUSP Management Journal. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode
Grounded theory has been one of the most influential and widely adopted methods in qualitative research for a long time (Murphy, Klotz, & Kreiner, 2017; Timonen, Foley, & Conlon, 2018). Grounded theory is defined as a method for discovering theory from a systematic comparative analysis of emerging patterns from social science data (Glaser & Strauss, 1967). Since its inception in the latter end of the 1960s, the method has been in constant evolution after a wide range of second-generation grounded theorists decided to follow Glaser and Strauss’ (1967) footsteps (Morse, 2016). The attention to the method can be expressed by the volume of work discussing grounded theory, showing how it is applied (Charmaz, 2006; Gioia, Corley, & Hamilton, 2013; Morse, Stern, Corbin, Charmaz, & Clarke, 2016) and presenting the most common myths and misconceptions when doing qualitative grounded theory research (Suddaby, 2006; Timonen et al., 2018).
However, to the best of our knowledge, only scant attention has been given to addressing how scholars can make the best use of what the new online era of research has to offer for constructing a theory grounded upon online data (Levina & Vaast, 2016). While quantitative scholars are taking advantage of it by using Twitter or Instagram data through data mining or Amazon’s Mechanical Turk (www.mturk.com/) for conducting large-scale surveys and experiments, qualitative scholars are yet to explore the opportunities (and the challenges) presented by the new online era in their research (Hewson, 2014; Mauskapf & Hirsch, 2016). But why is it important to be addressed in the first place?
One can answer, for instance, that it is important because the new forms of organizing that are emerging in contemporary days, such as online communities or the gig-economy, are not easily explained by traditional offline grounded theorizing (Massa, 2017; Roberts & Zietsma, 2018). Furthermore, addressing this issue is important because beyond presenting certain commonalities with traditional offline research, online-based data have idiosyncratic features that can shape grounded theory in a more creative and flexible manner (Hewson, 2014). For example, qualitative scholars can explore big data available “to generate rich, contextual explanations of social phenomena” (Mauskapf & Hirsch, 2016, p. 28). However, to date, there is not a clear guide on how they can achieve it by adopting a grounded theory method. Moreover, online settings can become an ally for grounded theorists when analyzing organizational phenomena through processual lenses (Garud, Berends, & Tuertscher, 2018; Levina & Vaast, 2016), because the processes held in the online world are fast-paced (so one can study the entire process at once) and leave a large amount of highly traceable evidence behind (Hewson, 2014).
Thus, this article aims to present a comprehensive guide for those interested in engaging with grounded theorizing and exploring what the new online era has to offer. The central purpose here is not to look back at the roots of grounded theory and its “tussles, tensions, and resolutions” (Morse, 2016), but to look forward to what the future – and to some extent, the present – holds for the next generation of grounded theorists, helping them navigate the vast ocean of the new online era that is yet to be discovered in organization and management research. This aim is achieved by presenting theoretical and practical discussions on conducting grounded theory in online contexts, starting from crafting the research question, followed by the methods of online data gathering. Alternative consolidated methods for analyzing these new kinds of data are also presented.
Moreover, the article addresses future and present challenges posed by the new online era for grounded theorizing, as well as how the lessons learned from traditional offline grounded theory can help mitigate them. This article can be helpful both for established scholars who have yet to start spanning the boundaries of the traditional offline method by using digital sources of data and for newcomers to qualitative grounded theory methods who do not know where and how to start the challenging research endeavors that are characteristic of grounded theory (Suddaby, 2006) in this new online era.
Crafting a good research question
In the academic community, it is almost consensual that, whether using traditional in-person interview and archival data or new online and internet-based data, good grounded theories (actually, all good research) must start with well-crafted research questions (Gioia et al., 2013; Sandberg & Alvesson, 2011). Although it is acknowledged (and also desired) that a research question almost always evolves and changes somewhat during the research process (Kreiner, 2016), one has to know “a good deal” about where he or she wants to get to, because if you as a researcher “don’t much care where – then it doesn’t matter which way you walk” (Carroll, 1901, p. 87). In practical terms, this means that a research question will define not only what the researcher is looking for, but also where he/she is going to look (that is, the research context), as “grounded theory efforts are more concerned with finding data sources that are highly relevant to the research question at hand” (Murphy et al., 2017, p. 294).
Unlike in traditional qualitative inquiry, grounded theorists do not start with theoretical gap-spotting (Sandberg & Alvesson, 2011); that is, when the researcher starts with finding blind spots in the literature concerning a given phenomenon and then tries to find an appropriate real-world problem and a research context (Pratt, 2016). They can instead craft their research questions through two alternative means. First, they can get intrigued by a specific research context, something that is relatively new and for which extant literature does not seem to offer reasonable explanations of the phenomena under study [what Murphy et al. (2017) call a blue-sky topic], and then try to identify a real-world problem within this context. For example, Roberts and Zietsma (2018) crafted their research questions based on a recent and understudied context – the digital on-demand economy (i.e. mobile applications or apps) – and then they found a real-world problem in this context: understanding how workers deal with role conflicts that emerge in this new kind of job (the drivers play at the same time the roles of app users, driver-bots and business partners of the company they work for/with).
On the other hand, one can also begin with an interesting and undertheorized real-world problem, and then find a research context in which to theorize. For example, Kataria, Kreiner, Hollensbe, Sheep, and Stambaugh (2018) crafted their research question with a real-world problem in mind: How do emotions expressed in virtual environments affect sensemaking processes during organizational changes? Aiming at answering this question, they looked for a context in which they could gather in-depth data containing interactions held during an organizational change, which led them to study how people were expressing their emotions in blog interactions in the context of change in a large worldwide Christian community. In both cases, instead of beginning with a blank agenda (Suddaby, 2006), the authors crafted good research questions related to the new online era, which led them to develop theoretically sound contributions grounded in data available on the internet.
Getting the hands dirty
So far, we have seen that completing grounded theory’s turn to the new online era is not that different from traditional offline grounded theory: one must start with good research questions – either a “blue sky” or a “black box” topic (Murphy et al., 2017) – even before starting the research itself. The next step is guiding grounded theorists through the minefield of gathering data in online spaces and analyzing it effectively. First and foremost, grounded theory has been mistakenly considered an interview-focused or even an interview-only method, and some researchers actually assume that in-depth or semi-structured interviews are the core of their view of grounded theory (Charmaz, 2016; Gioia et al., 2013; Reay, Zafar, Monteiro, & Glaser, 2019).
However, as the kind of data to be gathered must flow from the research question (Charmaz, 2016), one has to bear in mind that everything can become data when conducting grounded theory (Glaser, 2007; Walsh et al., 2015), especially considering that online data allow taking a multimodal approach to research (Zilber, 2017). But what steps need to be taken for grounded theorizing in the new online era? To answer this question, this section addresses how one can conduct grounded theory with online data. Figure 1 details a five-step approach proposed in this article to conduct grounded theory with online data effectively. Moreover, it helps those interested in grounded theory based upon online data to anticipate the major decision points and important cautions they are likely to face while engaging with data in the new online era.
Step one: finding out the temporality of data
The first step to be taken is identifying what kind of data one’s research questions ask for in terms of temporality of data. In this regard, one can gather real-time (synchronous) data, which are collected at the moment the phenomenon takes place, and retrospective (asynchronous) data, which are available in online records or accessed through the informants’ memories (e.g. personal online interviewing) (Hewson, 2014). The collection of retrospective data is where the new online era shines the most, as “the Web as it exists today readily creates a mass of easily locatable, often content-searchable traces of online collaborative activity and social interaction, traces that create a potentially rich source of data for use” in qualitative research (Hewson, 2014, p. 424).
Retrospective data can be helpful for theoretical sampling and constant comparison because the field is always “open” for the researcher to come back and check the emerging categories against the “real-world” over and over again (Levina & Vaast, 2016). An empirical illustration of the use of temporally asynchronous data for grounded theorizing can be found in Massa (2017), in which the author relies on previous computer-mediated communications (forum postings, chat logs, memes, images and videos) to trace longitudinally the emergence of an online community (Anonymous) from 2003 to 2011.
On the other hand, real-time data have the strength of allowing the researcher to gather data closer to the event and to perceive the instant reactions of the people or organizations engaging in it (Hewson, 2014), or even to conduct participant-observation research. Synchronous data gathering is very helpful for those academics interested in conducting process-based as well as interaction-based research (Garud et al., 2018; Levina & Vaast, 2016). For example, Roberts and Zietsma (2018) conducted an online ethnography to collect synchronous data that allowed them to theorize on how workers of online applications (e.g. Uber drivers) understand their roles in boundary defining activities of on-demand organizations. The value of gathering real-time data for the authors resides in the fact that “online ethnographic data was collected under a period of significant rate drops throughout the United States. Thus, there was a lot of fluctuation in how workers understood their work as drivers beyond monetary compensation” (Roberts & Zietsma, 2018, p. 201).
Although this seems to be a trivial step of grounded theorizing, researchers need to be aware that the boundaries defining what is real-time (synchronous) and what is retrospective (asynchronous) data are not clear when dealing with online data (Hewson, 2014; Tunçalp & Lê, 2014). In the new online era, real-time data can become retrospective data in a short span of time. This is because “data emerge in real time but usually leave persisting empirical traces” (Roberts & Zietsma, 2018, p. 200); that is, the real-time social interactions in which researchers engage will likely be recorded in logs that are later accessible online to the researchers themselves or even to the general public.
Step two: identifying the right platforms
After defining whether the research demands synchronous or asynchronous data types (or even both), the next step is identifying the right platforms for gathering them. The number and variety of platforms available for collecting retrospective data online are astonishing, and the type of platform that is going to be the most helpful will depend both on the research question and on the temporality of data (Tunçalp & Lê, 2014). One empirical example of a coherent rationale for platform identification can be found in Kataria et al. (2018). The authors argue that blogs were the ideal platform to address their research question because this medium “is especially appropriate for studying sensemaking since blogs are a place where people express themselves with a specific type of audience/reader in mind” (Kataria et al., 2018, p. 460).
Moreover, one needs to bear in mind that no matter which data source he/she chooses, the platforms must adhere to the following criteria: relevance of data available in the platform to the phenomenon under investigation; accessibility of data in the platform; credibility of data gathered in the platform, including adherence of the data collection procedures to ethical and legal standards; and activeness of the users of the platform, as grounded theorizing demands the collection of thick data. Although not explicitly mentioned by Vaast and Levina (2015), the four criteria stated above can be noticed in their research on occupational identity through the BankingOC forum (an online community of retail bankers). For example, activeness can be perceived because “the online community had a relatively tightly knit group of very active participants that was relatively stable over our period of investigations” (Vaast & Levina, 2015, p. 79).
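As a minimal sketch of how the four criteria could be recorded during platform screening, the following Python snippet keeps one judgment per criterion for each candidate platform. The class name, field names and example platforms are hypothetical and serve only for illustration; the judgments themselves remain qualitative decisions made by the researcher.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlatformAssessment:
    """Researcher's judgment of one candidate platform (hypothetical fields)."""
    name: str
    relevant: bool    # data speak to the phenomenon under investigation
    accessible: bool  # data can actually be reached and retrieved
    credible: bool    # data are trustworthy and collectable ethically and legally
    active: bool      # users interact often enough to yield thick data

    def unmet_criteria(self) -> List[str]:
        """Return the names of the criteria this platform fails."""
        checks = {
            "relevance": self.relevant,
            "accessibility": self.accessible,
            "credibility": self.credible,
            "activeness": self.active,
        }
        return [criterion for criterion, ok in checks.items() if not ok]

    def qualifies(self) -> bool:
        """A platform qualifies only if all four criteria are met."""
        return not self.unmet_criteria()


# Invented candidates, not platforms taken from any cited study.
candidates = [
    PlatformAssessment("practitioner forum", True, True, True, True),
    PlatformAssessment("dormant blog", True, True, True, False),
]
for platform in candidates:
    print(platform.name, platform.qualifies(), platform.unmet_criteria())
```

Recording the unmet criteria, rather than only a yes/no verdict, leaves a trace of why a platform was discarded, which can later be reported in the methods section.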
Five main types of platforms on which researchers can rely for gathering online data for grounded theorizing are addressed here: communication tools, blogs, online communities, streaming platforms and social media. The first and most often adopted type of platform is represented by online communication tools. These tools are usually adopted for asynchronous data collection, making them the most similar to traditional offline techniques such as interviews (Hewson, 2014). Current tools allow for video recording, and the recordings can later be transcribed into textual data or even coded in vivo as video through qualitative analysis software. Another asynchronous type of platform that is widely adopted in qualitative grounded theory is personal and corporate blogs. This type of platform allows accessing informants’ in-depth textual and even visual expressions of their personal emotions and feelings without any interference or bias from the researcher (Kataria et al., 2018).
Also considered a rich source of data, online communities are often used when the researcher needs to get access to data related to specific events, interest groups or communities (Massa, 2017; Roberts & Zietsma, 2018; Younger & Fisher, 2020). Online communities are often used for asynchronous data collection. However, depending on the interactions the researcher makes during the research process (e.g. online chats and private messaging), online communities can be turned into real-time data collection (Tunçalp & Lê, 2014). Streaming platforms are also a useful source of primary data for qualitative grounded theorists because videos “can be commented on by others, often generating an exchange of opinions and views on the video content itself” (Hewson, 2014, p. 436). These platforms combine both asynchronous and synchronous features, as live events can be broadcast with instant interactions from the audience, and these interactions can be recorded for researchers to access at a later date.
More recently, with the technological development of portable gadgets with broad connectivity such as smartphones, social media is becoming the crown jewel of the new online era, as individual, professional and corporate users are migrating from other platforms to social media (Subramaniam, Iyer, & Venkatraman, 2019). In this regard, these platforms (that can be both retrospective and real-time) are relevant if one aims at understanding, for example, phenomena such as social contagion or manipulation (Barberá-Tomás, Castellò, de Bakker, & Zietsma, 2019), as many social movements that are influencing organizational life are being organized in the online space of social media.
Step three: accessing and collecting data
Once the grounded theorist has identified and selected the platforms in which he/she is going to gather data, the next step is accessing it. Prior to effective data collection, the researcher must go through a process of familiarization with the field, that is, a process of cultural embeddedness of the researcher in the context and phenomena. This is helpful as it allows the researcher to become accustomed to the jargon, language and practices adopted by the community, and then readily identify key actors and influencers (Kozinets, 2010). As Massa (2017) shows in his online grounded theory research, this process was essential for him as a researcher to make the appropriate distinction between noise and relevant data concerning the Anonymous online community.
Data access can vary to some extent depending on whether the data to be gathered in online spaces are publicly available archival data or privately held interactional data (Kozinets, 2010; Levina & Vaast, 2016). If one decides to conduct research based on publicly available archival data (such as social media publications, blog posts or videos posted or broadcast on streaming platforms), the researcher must be aware of the platform’s terms and conditions (Voss, Lvov, & Thompson, 2017). Collecting and storing this kind of data can be done with the help of updated versions of the main computer-assisted qualitative data analysis software (CAQDAS) packages (e.g. ATLAS.ti – https://atlasti.com/, NVivo – www.qsrinternational.com/nvivo/ and MAXQDA – www.maxqda.com/), as they have built-in functions that allow retrieving social media content through hashtag (#) or “mention” (@) signs, as well as filters for geolocation and language.
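To make the logic of hashtag, mention and language filtering concrete, the following Python sketch applies such filters to posts that have already been collected. It is not the interface of ATLAS.ti, NVivo or MAXQDA; the function name `select_posts`, the record fields `text` and `lang`, and the sample posts are all assumptions made for illustration.

```python
import re

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def select_posts(posts, hashtags=frozenset(), mentions=frozenset(), language=None):
    """Keep posts carrying any wanted hashtag or mention, optionally in one language."""
    wanted_tags = {tag.lower() for tag in hashtags}
    wanted_mentions = {name.lower() for name in mentions}
    selected = []
    for post in posts:
        # Skip posts in other languages when a language filter is set.
        if language is not None and post.get("lang") != language:
            continue
        tags = {tag.lower() for tag in HASHTAG.findall(post["text"])}
        names = {name.lower() for name in MENTION.findall(post["text"])}
        if tags & wanted_tags or names & wanted_mentions:
            selected.append(post)
    return selected


# Invented records; a real study would load posts exported from the platform.
posts = [
    {"text": "Rates dropped again #gigeconomy", "lang": "en"},
    {"text": "Nada a ver com isso", "lang": "pt"},
    {"text": "@some_company please answer us", "lang": "en"},
]
kept = select_posts(posts, hashtags={"GigEconomy"}, mentions={"some_company"}, language="en")
print(len(kept))
```

Matching is case-insensitive here because hashtags are commonly written with inconsistent capitalization; whether that choice suits a given study is for the researcher to decide.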
Beyond automatically retrieving data with analytical software packages, an alternative form to collect publicly available data is doing it manually (Barberá-Tomás et al., 2019; Kozinets, 2010), because “manual data collection strategy has proven beneficial to enhance the depth of trace data, by allowing the researchers to document their context of production, to identify new areas of interest and to make sense of the emerging patterns” (Latzko-Toth, Bonneau, & Millette, 2017, p. 207). In this way, manual collection of data can be done by “saving web pages, making screenshots of mobile applications or even copying and pasting the entries of interest” (Voss et al., 2017, p. 165). As exemplified by Barberá-Tomás et al. (2019), grounded theorists can then organize the data in text-processing or spreadsheet software that can assist them during the coding process.
On the other hand, when gathering data from privately held interactions, it can become challenging to get access to the field. This is because it demands a high level of trust of the community in the person engaging in the interactions, so they can keep their routine interactions without being constrained by the presence of the researcher in the group. It is important to note that prior consent for accessing and using the data of privately held interactions must be given, and the researcher must be sensitive concerning confidentiality, anonymity and the security of data (Hewson, 2014; Levina & Vaast, 2016). If the online source is closed, one might need to create an account to have access to data or even reach out to some gatekeeper who can give access to the online space (e.g. WhatsApp discussion groups). After accessing data, the researcher has to decide if he/she is going to play a participant or non-participant role in the observations (Kozinets, 2010; Tunçalp & Lê, 2014).
Finally, although engaging with different platforms for collecting online data is encouraged, narrowing down the sources of data is also important to some extent. During traditional offline grounded theory research, the risk of getting lost before getting found in the data is commonplace (Gehman et al., 2018; Gioia et al., 2013) because grounded theorists deal with an overwhelming amount of offline hard-to-get-access-to data. However, risks only get amplified when researchers have at their disposal almost unlimited access to online kinds of data (Levina & Vaast, 2016). Thus, sticking close to research questions from the outset, to the emergent categories during theoretical sampling, and knowing when to stop collecting and analyzing data is of utmost importance for handling the iterative process of grounded theorizing.
Step four: storing and cleaning up the data
The advantages of getting access to and gathering online data are encouraging. However, iteratively analyzing them can become a daunting task when compared to traditional offline methods. While in the latter there is a need to break larger narratives into smaller and more manageable sets of data, in the former the challenge is analyzing and making sense of small chunks of disorderly data (Reay et al., 2019), such as a collection of 140-character interactions on Twitter. Furthermore, the history behind the data is not logically constructed, nor are the data directed by the researcher as they would be in traditional interviews or archival material (Levina & Vaast, 2016). In this regard, effective data treatment is crucial for effective online grounded theorizing (Halaweh, 2018). This can be exemplified by Massa’s (2017) study of the Anonymous online community, where the author created “time codes,” which allowed him to put data into chronological order and also to establish distinct phases in which events took place.
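The idea of time codes can be sketched in a few lines of Python: each record receives a label for the phase in which it was posted, and the records are then sorted chronologically. This is an illustration of the general idea rather than a reconstruction of Massa’s (2017) procedure; the phase boundaries, field names and records are hypothetical.

```python
from datetime import date

# Hypothetical phase boundaries, set by the researcher after reading the data.
PHASES = [
    ("phase A", date(2020, 1, 1), date(2020, 6, 30)),
    ("phase B", date(2020, 7, 1), date(2020, 12, 31)),
]


def time_code(posted_on):
    """Return the label of the phase containing the date, or 'unphased'."""
    for label, start, end in PHASES:
        if start <= posted_on <= end:
            return label
    return "unphased"


# Invented records, deliberately out of order as online data often are.
records = [
    {"id": "r2", "posted_on": date(2020, 9, 14), "text": "second excerpt"},
    {"id": "r1", "posted_on": date(2020, 2, 3), "text": "first excerpt"},
    {"id": "r3", "posted_on": date(2021, 1, 5), "text": "third excerpt"},
]

# Put the data into chronological order, then attach the time code to each record.
timeline = sorted(records, key=lambda record: record["posted_on"])
for record in timeline:
    record["time_code"] = time_code(record["posted_on"])
    print(record["id"], record["posted_on"].isoformat(), record["time_code"])
```

Records falling outside every phase are flagged as "unphased" instead of being dropped, so that they can prompt the researcher to revisit the phase boundaries.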
Moreover, treating and cleaning up the data is also important because inefficient data treatment can be harmful: it can cause the researcher to lose data depth or thickness (Latzko-Toth et al., 2017) and may also lead to ethical issues, as stated above. Regarding ethical standards, researchers must anonymize the data before disclosing it in publications (Hewson, 2014). For example, Roberts and Zietsma (2018) were careful to protect their online data sources to the point of anonymizing both of their research platforms by using pseudonyms (i.e. social media and standalone). Thus, creating tables with codes, so that only the researcher can link data with each participant’s traceable nickname or avatar, is essential to protect individuals from any potential harm that the disclosure of data can cause.
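A table of codes of this kind can be sketched as follows in Python. The class `Pseudonymizer`, the alias format and the sample handles are hypothetical; the sketch only covers the replacement of nicknames and assumes that the key table is stored separately from the working data set.

```python
import re


class Pseudonymizer:
    """Assigns stable aliases to handles and keeps the private key table."""

    def __init__(self, prefix="Participant"):
        self.prefix = prefix
        self.key = {}  # real handle -> alias; to be stored apart from the data

    def alias(self, handle):
        """Return the alias for a handle, creating one on first sight."""
        if handle not in self.key:
            self.key[handle] = f"{self.prefix}_{len(self.key) + 1:03d}"
        return self.key[handle]

    def scrub(self, text):
        """Replace every registered handle in the text, with or without '@'."""
        for handle, alias in self.key.items():
            text = re.sub(rf"@?\b{re.escape(handle)}\b", alias, text)
        return text


# Invented handles and posts for illustration.
raw_posts = [
    ("driver_joe", "Rates fell again"),
    ("night_owl", "@driver_joe same here"),
]
pseudonymizer = Pseudonymizer()
# First pass registers every author; second pass scrubs handles quoted in the text.
with_aliases = [(pseudonymizer.alias(author), text) for author, text in raw_posts]
clean_posts = [(author, pseudonymizer.scrub(text)) for author, text in with_aliases]
print(clean_posts)
```

Two passes are used because a post may mention a participant who only appears as an author later in the data set; scrubbing before all handles are registered would leave such mentions untouched.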
Concerning the depth and thickness of online data in grounded theory research, it is vital to maintain the elements that convey nuanced meaning and emotions in virtual written communication, such as emoticons, emojis, acronyms and abbreviations (Kataria et al., 2018). The grounded theorist must pay careful attention to these kinds of elements, as well as to cues such as the capitalization of words (all caps), because they may denote, for example, sarcasm, anger or joy, and attending to them helps avoid misinterpreting the data. Barberá-Tomás et al. (2019) took advantage of these elements during their data treatment and further data analysis by coding and storing emojis, emoticons and capitalized words in social media comments. Furthermore, the researcher must also be aware that in online interactions, the use of slang and profane language is not unusual (Roberts & Zietsma, 2018). Thus, during cleanup of the data, the researcher must decide whether to clean them out, bearing in mind that doing so may cost some data depth.
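One way to avoid losing these elements during cleanup is to store them alongside each excerpt before any text normalization takes place. The Python sketch below uses rough heuristics that are assumptions of this example: emojis are approximated by the Unicode category "So" (which also covers other symbols and splits composite emojis), emoticons by a small pattern of common forms, and all caps by words of two or more capital letters. It is not the procedure of Barberá-Tomás et al. (2019).

```python
import re
import unicodedata

# Common western emoticons such as :) ;-) :( =D (a deliberately small pattern).
EMOTICON = re.compile(r"(?<!\w)[:;=][-']?[()DPp](?!\w)")
# Words written entirely in capitals, at least two letters long.
ALL_CAPS = re.compile(r"\b[A-Z]{2,}\b")


def expressive_markers(text):
    """Collect emojis, emoticons and all-caps words found in one excerpt."""
    emojis = [char for char in text if unicodedata.category(char) == "So"]
    return {
        "emojis": emojis,
        "emoticons": EMOTICON.findall(text),
        "all_caps": ALL_CAPS.findall(text),
    }


# Invented excerpt; \U0001F621 is the "pouting face" emoji.
sample = "GREAT, another rate cut :( \U0001F621"
markers = expressive_markers(sample)
print(markers)
```

The stored markers can then be attached to the excerpt as attributes in the coding software, so that the original tone remains visible even if the text itself is cleaned.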
Finally, after accessing and collecting online data, the grounded theorist must be careful with storing it. In this regard, “[m]aximizing the security of data storage and transmission methods online is imperative, particularly when data are of a highly sensitive and personal nature” (Hewson, 2014, p. 434). Thus, the researcher must store the data in highly secured servers protected by a strong cryptographic system, because “it is wise to assume that data are at risk and that, therefore, they need to be carefully curated and preserved to remain discoverable, accessible and of value to potential users” (Voss et al., 2017, p. 162) and protected from disclosure for non-academic or non-scientific purposes.
Step five: analyzing data and getting out of the “online field”
The fifth step of grounded theorizing is analyzing the online data and knowing when to stop data collection. The data analysis in grounded theory consists of three stages: coding, categorizing and theorizing. However, it can vary in terms of the approach the researcher takes toward existing literature, that is, if he/she believes in theorizing purely from empirical data (inductive approach) or from a combination of the data with extant knowledge regarding the phenomena (abductive approach) (Gehman et al., 2018; Gioia et al., 2013). While the former primarily “presumes a level of semi-ignorance or some suspension of belief in the received wisdom of prior work” (Gioia et al., 2013, p. 23), the latter “formally, consciously, and reflexively brings existing theory into the coding process, even into the earliest stages of data analysis” (Kreiner, 2016, p. 352).
If the researcher decides to take the inductive approach, he/she begins the analysis by “identifying relevant concepts in the data and grouping them into categories (open coding)” (Gioia, Price, Hamilton, & Thomas, 2010, p. 8). This stage (first-order analysis) uses in vivo coding, that is, whenever possible, the researcher must keep the wording of codes representing emerging concepts as close as possible to terms used in the online source of data, including emoticons, emojis and slang (Gehman et al., 2018; Gioia et al., 2013). After the first-order coding, the next stage is the second-order (axial) coding, in which one must take the emergent themes and constantly compare them with existing theory to form aggregate dimensions (Gioia et al., 2013). In this stage, the researcher can have “a clear sense of the developing relationships among categories and their related themes” (Gioia et al., 2010, p. 8). The aggregate dimensions become, then, the core elements of the emerging grounded theory (Langley & Abdallah, 2011; Rheinhardt, Kreiner, Gioia, & Corley, 2018).
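The progression from first-order concepts to second-order themes and aggregate dimensions can be kept as a simple nested structure, sketched below in Python. All labels are hypothetical placeholders rather than findings from any cited study, and the helper `index_concepts` is introduced here only to show how a first-order code can be traced back to its theme and dimension.

```python
# Hypothetical labels for illustration only.
data_structure = {
    "Aggregate dimension 1": {
        "Second-order theme 1a": [
            "love this community :)",   # first-order code kept in vivo
            "we r in this together",
        ],
        "Second-order theme 1b": [
            "nobody listens to us",
        ],
    },
    "Aggregate dimension 2": {
        "Second-order theme 2a": [
            "rules change overnight",
        ],
    },
}


def index_concepts(structure):
    """Map every first-order concept to its (theme, dimension) pair."""
    index = {}
    for dimension, themes in structure.items():
        for theme, concepts in themes.items():
            for concept in concepts:
                index[concept] = (theme, dimension)
    return index


concept_index = index_concepts(data_structure)
print(concept_index["nobody listens to us"])
```

Keeping the first-order codes verbatim, including the emoticon and the informal spelling, reflects the in vivo principle described above.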
On the other hand, if one decides on taking an abductive approach, he/she has to perform both in vivo and theory coding (open and axial coding) simultaneously and iteratively, building emerging theories that are both groundbreaking and connected with existing scholarly conversations (Kreiner, 2016; Murphy et al., 2017). The analysis starts with coding any kind of data one has at his/her disposal (interviews, blog posts, social media interactions, video transcripts, etc.) according to the researchers’ feelings regarding what is observed. In this regard, if one sees something that is connected to existing theoretical concepts, to grounded concepts, or to both, he/she must code it as it is (Kreiner, 2016). After concomitant open and axial coding (that is, constant comparison), the next stage is categorizing the data in hierarchically organized families of codes, which are the baseline for establishing the relations that are going to shape the emerging grounded theory (Kreiner, 2016; Murphy et al., 2017).
At this stage of research, one has to resort to constant comparison and theoretical sampling to decide when to stop collecting data (Morse et al., 2016), especially if he/she is dealing with synchronous interactive kinds of data. One empirical study that applies constant comparison and theoretical sampling with online data is Roberts and Zietsma’s (2018) study on workers for on-demand organizations. The authors report that “[a]fter a period of time away from the online ‘field,’ or the forum, to focus on data analysis, we then returned to the field for a second iteration of online ethnographic data collection to deepen our emerging analysis” (Roberts & Zietsma, 2018, pp. 201–202).
In sum, one can stop gathering new online data when he/she has put the emerging concepts and categories to the test through additional data collection and analysis until the emerging theory is consistent enough (constant comparison); and the additional data collection did not bring any novel emerging concepts and categories to the emerging theory (theoretical sampling) (Charmaz, 2016; Suddaby, 2006; Timonen et al., 2018). Additionally, it is essential for the researcher to be open to unexpected pathways during the iterative process of data collection and analysis, because it can provide novel and creative insights to the theory emerging from data (theoretical sensitivity). Table 1 sums up the challenges, opportunities and pitfalls of applying each of the five steps for getting your hands dirty during grounded theorizing. Moreover, empirical illustrations are provided for those interested in grounded theory based upon online data, so relevant research applying the tenets of grounded theory in the new online era can be easily located.
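The second stopping condition, that additional data bring no novel concepts or categories, can be tracked with a simple tally, sketched below in Python. The function names, the parameter `quiet_rounds` and the example codes are hypothetical. Such a tally is only a mechanical aid: judging whether the emerging theory is consistent enough remains an interpretive decision of the researcher.

```python
def new_codes_per_round(rounds):
    """For each collection round, return the codes not seen in any earlier round."""
    seen = set()
    novel = []
    for codes in rounds:
        fresh = set(codes) - seen
        novel.append(fresh)
        seen |= fresh
    return novel


def may_stop(rounds, quiet_rounds=1):
    """True if the last `quiet_rounds` rounds added no new codes."""
    novel = new_codes_per_round(rounds)
    if len(novel) <= quiet_rounds:
        return False  # too few rounds to compare against earlier ones
    return all(not fresh for fresh in novel[-quiet_rounds:])


# Invented codes from three rounds of data collection and analysis.
rounds = [
    {"role conflict", "rate drops"},
    {"rate drops", "identity work"},
    {"identity work", "role conflict"},
]
print(new_codes_per_round(rounds))
print(may_stop(rounds))
```

Raising `quiet_rounds` makes the check more conservative, requiring several consecutive rounds without novelty before data collection is considered complete.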
Looking forward, but not putting the past out of sight
The lack of a boilerplate for evaluating qualitative studies in general and grounded theory specifically has been a long-lasting concern in management and organization research (Pratt, 2009). Although this problem has been partially addressed by the emergence of institutionalized qualitative research templates (Langley & Abdallah, 2011; Murphy et al., 2017), the new online era brings a whole new set of issues to be addressed by grounded theorists. A relief for those intending to adopt a grounded theory approach beyond the traditional offline settings is that recent advances in the field ended up “leaving a trail of bread crumbs” (Rheinhardt et al., 2018, p. 526) on how to be taken seriously by the overall academic system.
One of the major challenges posed to the next generation of grounded theorists is keeping the method flexible without becoming unsystematic (Rheinhardt et al., 2018; Walsh et al., 2015). It is important to stick with the traditional tenets of the method: online data collection and analysis must be iterative, so that researchers can adjust their sensitizing concepts during theoretical sampling and constant comparison (Charmaz, 2016; Kreiner, 2016; Suddaby, 2006). However, care is also needed so that rigor does not become rigor mortis (Eisenhardt, Graebner, & Sonenshein, 2016), that is, so that researchers do not end up forcing a fit into predetermined models instead of balancing the systematic analytical tools of scientific methods with the creativity necessary for the art of crafting theory fundamentally grounded in online data (Suddaby, 2006).
Another caution is that, although new kinds of online data are important, scholars must be aware of the risk of organizational researchers becoming locked in their own “digital labs” and losing contact with the “real life” of organizations. Thus, scholars are urged to dive deep into the vast ocean of the new online era while also taking care of the roots they have grown on firm ground. To avoid these risks, grounded theorists are advised to take seriously the call for strong multimodal research (Zilber, 2017) by combining online data with more traditional offline data. For example, one can mix online observation with semi-structured interviews along the lines of some exemplary articles (Barberá-Tomás et al., 2019; Illia, Romenti, Cánovas, Murtarelli, & Carroll, 2017; Vaast & Levina, 2015; Younger & Fisher, 2020).
Finally, research on the new online era demands additional axiological considerations compared to traditional offline research (Garud et al., 2018). In this regard, because institutionalized ethical standards for qualitative online-based research are still lacking, ethical issues are the most delicate ones (Hewson, 2014; Whiting & Pritchard, 2018). The main debates concern whether data are to be considered public domain because they are available in online spaces or whether they maintain their private character; to what extent research must be disclosed so that participants are aware of and consent to being part of a study; and how one can deal with the traceability issues inherent to online environments to protect the anonymity of groups and individuals and to avoid any possibility of harming others to any degree (Hewson, 2014; Levina & Vaast, 2016; Whiting & Pritchard, 2018).
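One partial, technical response to the traceability concern is to replace usernames with stable pseudonyms before the data are stored or analyzed. The following is a minimal sketch using a keyed hash from the Python standard library; it is an illustration under stated assumptions rather than an established ethical standard. The function names, the `participant_` prefix and the `@name` mention pattern are hypothetical, and the approach does not address the fact that verbatim quotes can still be found through a simple online search.

```python
import hashlib
import hmac
import re

# Assumption: mentions look like "@name"; real platforms differ in their rules.
MENTION = re.compile(r"@(\w+)")


def pseudonym(username, secret_key):
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256).

    The same username always yields the same pseudonym for a given key, so
    interactions can still be followed across posts, while the mapping cannot
    be rebuilt by someone who does not hold `secret_key`.
    """
    digest = hmac.new(secret_key, username.lower().encode("utf-8"), hashlib.sha256)
    return "participant_" + digest.hexdigest()[:8]


def scrub_mentions(text, secret_key):
    """Replace every @mention in a post with the pseudonym of that user."""
    return MENTION.sub(lambda match: "@" + pseudonym(match.group(1), secret_key), text)


# Invented example; the key would be kept separately from the stored data.
key = b"keep-this-key-out-of-the-dataset"
print(pseudonym("Driver_Joe", key))
print(scrub_mentions("agree with @Driver_Joe on the new fares", key))
```

Because the pseudonyms are stable, the researcher can still follow who replies to whom over time, which matters for interactional data.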
This article presents an introductory guide for extending grounded theory into the terrain of the new online era. It has shown that grounded theorizing starts by crafting good research questions that lead to interesting unsolved real-world problems. If answering the research questions requires navigating the online world, researchers can feel confident in spanning the boundaries of the method into this new reality. Moreover, it shows that gathering online data can be easier than collecting traditional offline data, but it is also challenging because of the overwhelming volume and variety of data available. Thus, understanding the research question is essential for the researcher to narrow the pool down to only those online data that are actually going to be useful in the grounded theorizing process.
It is also shown that grounded theorists do not need to create entirely new methods for analyzing online data because they already have proven tools at their disposal for getting the job done. Thus, scholars only have to identify which of these tools offer them the most for answering their research questions and building theoretically and practically sound grounded theory. In sum, the future of qualitative grounded theory in this new online era is bright indeed, but there is a need to build upon the foundations that have been laid over the past decades so that organizational research does not become detached from the real life of organizations. The list of avenues of research in this new online era keeps growing. For example, devices such as the Amazon Alexa or Google Home are soon to spread all over the organizational world, so one can look at how interactions between humans and non-human objects (robots) are shaping the way people deal with their organizational life.
Finally, compared with current and past generations of grounded theorists, future organizational scholars (i.e. current students of doctoral programs) are much better versed in the technologies that have been emerging lately. It is therefore likely that current and future generations will “hit the ground running” on grounded theory based upon online types of data. Hence, the migration of qualitative organizational grounded theorists to alternative online sources of data is not a matter of if, but of when this kind of data will be regarded as mainstream in qualitative organizational research.
Table 1. The opportunities, challenges and pitfalls of grounded theory in the new online era

Step 1 – temporality of data
Opportunities:
- Online data provide traceable, searchable and easily accessible retrospective data
- Online grounded theorists can draw on the temporality of data to ease the process of constant comparison and theoretical sampling (the field is always open for the researcher to come back)
- Real-time events in which grounded theorists can access the instant reactions of organizations and audiences are abundant
- Online data are a rich source of synchronous interactional data, access to which would be more restricted in traditional offline settings
- Online settings provide opportunities to conduct longitudinal research in a short period of time
Challenges and pitfalls:
- It is difficult to define the boundaries between real-time (synchronous) and retrospective (asynchronous) data
- If the grounded theorist decides to work with synchronous data, the immersion in the field can become time-consuming or even exhausting because of the full-time availability of and access to data
- The sources of asynchronous or retrospective data are so abundant that they can become overwhelming for the researcher to deal with in the first place
Empirical illustrations:
- Massa (2017) relies on retrospective (asynchronous) online data (previous computer-mediated communications such as forum postings, chat logs, memes, images and videos) to trace longitudinally the emergence of an online community (Anonymous) from 2008 to 2011
- Roberts and Zietsma (2018) conducted an online ethnography with real-time (synchronous) online data (forum and social media interactions between app drivers) to theorize on how workers of online applications (e.g. Uber drivers) understand their roles in boundary-defining activities of on-demand organizations

Step 2 – platform identification
Opportunities:
- The pool of platforms available for online data collection is astonishing, varying from blog posts to podcasts, streaming, forums and online communication tools (see Appendix for a comprehensive list)
- The combination of multiple platforms provides opportunities to adopt a multimodal approach to research (gathering textual and visual data simultaneously)
- There is likely an adequate platform for collecting data for a wide range of research questions concerning organizational phenomena
- Different platforms can provide access to real-time data without geographical or budgetary constraints (e.g. long-distance interviewing)
Challenges and pitfalls:
- It is not easy to assess the quality criteria of online platforms (relevance, accessibility, credibility and activeness)
- Although online data tend to be traceable and searchable, some platforms store their events for a limited time only (e.g. social media platforms such as Snapchat, Instagram or Facebook have “stories” that disappear 24 hours after being posted)
- Grounded theorists must avoid the trap of collecting data on platforms that are not suitable for answering their research questions (e.g. if social media data are not helpful, gathering data on such platforms will consume time and effort that could be invested in collecting data on more suitable platforms)
Empirical illustrations:
- Vaast and Levina (2015) show how grounded theorists can effectively identify whether a platform meets the four quality criteria: the BankingOC online community was relevant for addressing their research question, it was accessible, it was credible for the banking community (over 23 thousand registered members) and it had an active and stable membership
- Barberá-Tomás et al. (2019) use social media data (Facebook, Twitter, Instagram, YouTube, etc.) because they can convey traces of emotional contagion and manipulation that are necessary to bring about social change in the case of commotion for fighting plastics pollution
- Younger and Fisher (2020) combine blog posts, website texts and podcasts to get key organizational actors’ accounts (founders and investors) regarding organizational image formation of new ventures of an emerging organizational category (venture accelerators)

Step 3 – data access and collection
Opportunities:
- Grounded theorists have at their disposal several tools that facilitate the access to and the collection of online data (e.g. CAQDAS)
- Researchers have almost unlimited access to online data, considering that much of the data available in the new online era are stored on and can be accessed through public-domain servers and platforms
- Access to research informants can become less challenging because several online platforms (e.g. LinkedIn) allow contacting organizational actors without the burden of getting through gatekeepers, as in traditional offline data collection procedures
Challenges and pitfalls:
- Collecting online data may demand extra effort from the researcher to become familiar with the jargon particular to each online community, or even to understand the codes used to locate relevant data on each platform (mentions, hashtags, etc.)
- Collecting observational data in online settings does not exempt researchers from building high levels of trust within the community with which they are engaging. One must bear in mind that those involved in interactional data collection should be aware that their interactions are being observed
- It is difficult to assess whether people are being honest when collecting interactional data through online sources, which makes it difficult for the researcher to assess whether he/she is dealing with relevant data or with noise
Empirical illustrations:
- Massa (2017) shows that accessing and collecting online data can also be time-consuming. The author spent 10 hours per week online following Anonymous forum threads for a 38-month period
- Barberá-Tomás et al. (2019) used as a tool for data collection a software package that fetches and prepares data for analysis (Social Data Analytics Tools, http://cssl.cbs.dk/software/sodato). Data collection was complemented by “hashtag” searches and monitoring on platforms such as Twitter and Instagram and by manual data retrieval (Facebook and YouTube comments)
- Kataria et al. (2018) show that even in open online spaces, there is a need to get informed consent from research participants. The authors requested and obtained permission from blog owners to read and use their blog posts as online data for their research

Step 4 – data treatment and storage
Opportunities:
- There is a wide range of options available to the researcher to store and protect data on encrypted servers
- Most CAQDAS provide the opportunity of treating data to preserve nuanced meaning and emotions in virtual written communication, such as emoticons, emojis, acronyms and abbreviations
Challenges and pitfalls:
- Inefficient treatment of data can be harmful in terms of losing the depth or thickness of data
- Anonymizing and protecting data can be a challenging task in the new online era, considering that simple online searches can reveal the identity of informants
- The use of slang and profanity is not unusual. Thus, the grounded theorist must decide whether to clean them out, knowing that doing so may lead to losing some depth of data
- Data must be stored on highly secured servers and protected by strong cryptographic systems, which can lead to additional costs for the research team
Empirical illustrations:
- Massa (2017) performed data treatment to avoid false positives in his research. For example, when searching for the term “Anonymous” in several online data sources, the author got 1,683 articles, but only 178 were not false positives
- Roberts and Zietsma (2018) were zealous in protecting their online data sources, to the point of anonymizing both platforms adopted in the research by using pseudonyms (i.e. social media and standalone). Moreover, the authors also decided not to clean up all the profanity used by the Uber drivers for the sake of preserving the thickness of data

Step 5 – data analysis
Opportunities:
- Researchers can incorporate elements that may denote emotions and meaning (for example, sarcasm, anger or joy), such as the capitalization of words (all caps)
- The openness of the field provides the opportunity to adhere to the systematic tenets of grounded theorizing (e.g. theoretical sampling and constant comparison)
- The new online era increases the chances of finding unexpected pathways during the iterative process of data collection and analysis, providing novel and creative insights for the theory emerging from data (theoretical sensitivity)
Challenges and pitfalls:
- The large volume of online data available is a double-edged sword. Although the process of getting lost in the data before getting found is commonplace in offline grounded theory, the risk is amplified when dealing with data from the new online era
- The history behind the data is not logically constructed, nor are the data directed by the researcher, as they would be in traditional offline grounded theory (for example, it can become challenging to analyze and make sense of small chunks of disordered data collected on platforms such as Twitter, Instagram or Snapchat)
- As the field is always open, it is more difficult for the researcher to know when to stop data collection and analysis (getting out of the field)
Empirical illustrations:
- Massa (2017) tackles the lack of order of online data by creating what he calls time codes. This analytical procedure was useful for the author to put data into chronological order, making it possible to establish distinct phases in which events took place
- Barberá-Tomás et al. (2019) take advantage of the elements denoting emotions during their data analysis by coding and storing emojis, emoticons and capitalized words in social media comments
- Younger and Fisher (2020) adopt constant comparison techniques in both first- and second-order coding. Roberts and Zietsma (2018) provide an exemplar of the use of constant comparison and theoretical sampling to decide whether new data collection is needed to confront the emerging theory grounded upon online data
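
The time codes attributed to Massa (2017) in the Step 5 illustrations are described only in general terms (putting data into chronological order and establishing distinct phases). A loose sketch of that general idea, and not of the author's actual procedure, could look as follows; the function name, the `phase_N` labels, the phase boundaries and the example items are all hypothetical.

```python
from bisect import bisect_right
from datetime import date


def assign_time_codes(items, phase_starts):
    """Sort dated items chronologically and tag each one with a phase label.

    `items` is a list of (date, text) pairs gathered in any order.
    `phase_starts` lists the dates on which the researcher judges a new phase
    to begin; items dated before the first boundary fall into "phase_0".
    """
    ordered = sorted(items, key=lambda item: item[0])
    starts = sorted(phase_starts)
    coded = []
    for when, text in ordered:
        # Number of boundaries at or before this date gives the phase index.
        phase = bisect_right(starts, when)
        coded.append((when, "phase_" + str(phase), text))
    return coded


# Invented example: posts retrieved out of order from an online community.
posts = [
    (date(2010, 5, 1), "thread about a coordinated campaign"),
    (date(2008, 2, 1), "early forum post"),
    (date(2011, 1, 1), "press coverage shared in chat"),
]
boundaries = [date(2009, 1, 1), date(2010, 12, 1)]

for when, phase, text in assign_time_codes(posts, boundaries):
    print(when.isoformat(), phase, text)
```

In practice the boundaries would themselves emerge from the analysis and would likely be revised as constant comparison proceeds.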
Sample of useful online platforms for data collection

Blogs
Source: Tumblr
URL: www.tumblr.com/
Temporality: Asynchronous
How it can be helpful/advantageous:
- Gathering in-depth insights or textual accounts of events in personal or corporate blogs
- There is no constraining limit on the space available for informants to express their feelings, emotions, thoughts and beliefs
- Provide material elements (such as photos or videos) attached to each text

Social media
URL: www.facebook.com/
Temporality: Synchronous/asynchronous
How it can be helpful/advantageous:
- Social media are rich sources for collecting relational or interactional kinds of data
- Provide tools for analyzing instant reactions of social actors to events (like, dislike and comments)
- Most platforms allow simultaneous real-time and retrospective data collection
- Data collection is essentially multimodal, going beyond textual data (e.g. photos and videos)

Social media (professional/academic)
URL: www.linkedin.com/
Temporality: Asynchronous
How it can be helpful/advantageous:
- Can be very helpful for investigating professional environments, such as human resource management practices
- Can provide access to people that would be difficult to approach otherwise (directors, chief executive officers [CEOs], board members from companies, professors, researchers or staff from universities)
- The platforms also have some features of online communities (e.g. focused discussion groups, question and answer sections, recommend, like and comment sections on publications)

Online communities (forums)
URL: www.reddit.com/
Temporality: Synchronous/asynchronous
How it can be helpful/advantageous:
- Online communities are a good source of data if one needs access to specialists’ accounts of events
- mIRC is helpful for getting access to geek communities. It also offers real-time interaction, which can make data collection synchronous
- Reddit is functional if one wants to understand how communities make sense of and evaluate the topics under discussion (popular topics and replies are pinned at the top of the page)

Streaming platforms
Source: YouTube
URL: www.youtube.com/
Temporality: Synchronous/asynchronous
How it can be helpful/advantageous:
- Can become synchronous when live events are broadcast with instant interactions from the audience
- Helpful when researchers want to investigate corporate narratives (including executives and CEOs)
- Excellent source of data to gather discourses or narratives from actors that are no longer alive

Communication tool
Source: Skype
URL: www.skype.com/
Temporality: Asynchronous
How it can be helpful/advantageous:
- Can help mitigate the geographical constraints researchers face by cutting down on cost-prohibitive travel to do research “on-site”
- Online interviewing makes cross-context comparisons more feasible, generating more theoretically sound and also more generalizable concepts and theories

Communication tool
URL: www.whatsapp.com/
Temporality: Synchronous/asynchronous
How it can be helpful/advantageous:
- Beyond traditional interviewing features, it can become a hybrid of communication tool and online community, as the discussion groups currently being formed on this platform are a rich source of primary data (e.g. advocacy groups)
Adapted from Kozinets (2010).
A set of platforms for gathering online data, as well as how each platform can be helpful, is available in the Appendix.
Although it is recognized that online interviewing through platforms such as Skype or Zoom represents a relevant part of data collection in the new online era, the process of accessing and collecting the data does not vary much when compared to traditional offline methods (e.g. getting in touch with the person, asking permission to interview and recording the interview – video or audio-only data). Thus, for the sake of conciseness, the focus here is on the platforms that differ the most from traditional offline data collection.
Every time one enters the field, even in online spaces, it is suggested that researchers be open with participants about their observations so that participants can give informed consent. In forums or online communities where it is not possible to speak to each participant, try to reach out to moderators or other possible gatekeepers and explain the motivations of your stay in their community as well as your research goals.
This section on how to analyze and theorize upon the online data is inspired by two grounded theory methodologies that are taking the management and organization field by storm (Garud et al., 2018; Langley & Abdallah, 2011; Murphy et al., 2017): the Gioia methodology (Gioia et al., 2013), a purely inductive qualitative approach to grounded theory; and the twin slate methodology (Kreiner, 2016), an abductive approach.
Barberá-Tomás, D., Castellò, I., de Bakker, F. G. A., & Zietsma, C. (2019). “Energizing through visuals: how social entrepreneurs use emotion-symbolic work for social change”. Academy of Management Journal, 62, 1789–1817, 10.5465/amj.2017.1488
Carroll, L. (1901). Alice’s adventures in the wonderland, Boston, MA: DeWolfe, Fiske & Co.
Charmaz, K. (2006). Constructing grounded theory: a practical guide through qualitative analysis, Thousand Oaks, CA: Sage.
Charmaz, K. (2016). “Shifting the grounds: constructivist grounded theory methods”. In J. M. Morse, P. N. Stern, J. M. Corbin, K. Charmaz & A. E. Clarke, (Eds), Developing grounded theory: the second generation (pp. 127–154). New York, NY: Routledge.
Eisenhardt, K.M., Graebner, M.E., & Sonenshein, S. (2016). “Grand challenges and inductive methods: Rigor without rigor mortis”. Academy of Management Journal, 59, 1113–1123, 10.5465/amj.2016.4004
Garud, R., Berends, H., & Tuertscher, P. (2018). “Qualitative approaches for studying innovation as process”. In R. Mir, & S. Jain, (Eds), The routledge companion to qualitative research in organization studies (pp. 226–247). New York, NY: Routledge.
Gehman, J., Glaser, V.L., Eisenhardt, K.M., Gioia, D.A., Langley, A., & Corley, K.G. (2018). “Finding theory–method fit: a comparison of three qualitative approaches to theory building”. Journal of Management Inquiry, 27, 284–300, 10.1177/1056492617706029
Gioia, D.A., Corley, K.G., & Hamilton, A.L. (2013). “Seeking qualitative rigor in inductive research: notes on the Gioia methodology”. Organizational Research Methods, 16, 15–31, 10.1177/1094428112452151
Gioia, D.A., Price, K.N., Hamilton, A.L., & Thomas, J.B. (2010). “Forging an identity: an insider-outsider study of processes involved in the formation of organizational identity”. Administrative Science Quarterly, 55, 1–46, 10.2189/asqu.2010.55.1.1
Glaser, B.G. (2007). Doing formal grounded theory: a proposal, Mill Valley, CA: Sociology Press.
Glaser, B.G., & Strauss, A.L. (1967). The discovery of grounded theory, Chicago, IL: Aldine.
Halaweh, M. (2018). “Integrating social media and grounded theory in a research methodology: a possible road map”. Business Information Review, 35, 157–164, 10.1177/0266382118809168
Hewson, C. (2014). “Qualitative approaches in internet-mediated research: Opportunities, issues, possibilities”. In P. Leavy, (Ed.), The oxford handbook of qualitative research (pp. 423–451). Oxford, UK: Oxford University Press.
Illia, L., Romenti, S., Cánovas, B.R., Murtarelli, G., & Carroll, C.E. (2017). “Exploring corporations’ dialogue about CSR in the digital era”. Journal of Business Ethics, 146, 39–58, 10.1007/s10551-015-2924-6
Kataria, N., Kreiner, G., Hollensbe, E., Sheep, M.L., & Stambaugh, J. (2018). “The catalytic role of emotions in sensemaking: Evidence from the blogosphere”. Australian Journal of Management, 43, 456–475, 10.1177/0312896217734589
Kozinets, R.V. (2010). Netnography: doing ethnographic research online, London, UK: Sage.
Kreiner, G.E. (2016). “Tabula geminus: a “both/and” approach to coding and theorizing”. In K.D. Elsbach, & R.M. Kramer, (Eds), Handbook of qualitative organizational research: Innovative pathways and methods (pp. 350–361). New York, NY: Routledge.
Langley, A., & Abdallah, C. (2011). “Templates and turns in qualitative studies of strategy and management”. In D.D. Bergh, & D. J. Ketchen, (Eds), Building methodological bridges: Research methodology in strategy and management (pp. 201–235). Bingley, UK: Emerald.
Latzko-Toth, G., Bonneau, C., & Millette, M. (2017). “Small data, thick data: thickening strategies for trace-based social media research”. In L. Sloan, & A. Quan-Haase, (Eds), The SAGE handbook of social media research methods (pp. 199–214). London, UK: Sage.
Levina, N., & Vaast, E. (2016). “Leveraging archival data from online communities for grounded process theorizing”. In K.D. Elsbach, & R.M. Kramer, (Eds), Handbook of qualitative organizational research: Innovative pathways and methods (pp. 215–224). New York, NY: Routledge.
Massa, F.G. (2017). “Guardians of the internet: building and sustaining the anonymous online community”. Organization Studies, 38, 959–988, 10.1177/0170840616670436
Mauskapf, M., & Hirsch, P.M. (2016). “Ups and downs: Trends in the development and reception of qualitative methods”. In K.D. Elsbach, & R.M. Kramer, (Eds), Handbook of qualitative organizational research: Innovative pathways and methods (pp. 24–30). New York, NY: Routledge.
Morse, J.M. (2016). “Tussles, tensions, and resolutions”. In J. M. Morse, P.N. Stern, J.M. Corbin, K. Charmaz, & A.E. Clarke, (Eds), Developing grounded theory: the second generation (pp. 13–22). New York, NY: Routledge.
Morse, J.M., Stern, P. N., Corbin, J.M., Charmaz, K., & Clarke, A.E. (2016). Developing grounded theory: the second generation, New York, NY: Routledge.
Murphy, C., Klotz, A.C., & Kreiner, G.E. (2017). “Blue skies and black boxes: the promise (and practice) of grounded theory in human resource management research”. Human Resource Management Review, 27, 291–305, 10.1016/j.hrmr.2016.08.006
Pratt, M.G. (2009). “From the editors: for the lack of a boilerplate: tips on writing up (and reviewing) qualitative research”. Academy of Management Journal, 52, 856–862, 10.5465/amj.2009.44632557
Pratt, M.G. (2016). “Crafting and selecting research questions and contexts in qualitative research”. In K. D. Elsbach, & R.M. Kramer, (Eds), Handbook of qualitative organizational research: Innovative pathways and methods (pp. 177–185). New York, NY: Routledge.
Reay, T., Zafar, A., Monteiro, P., & Glaser, V.L. (2019). “Presenting findings from qualitative research: One size does not fit all”. In T.B. Zilber, J.M. Amis, & J. Mair, (Eds), The production of managerial knowledge and organizational theory: New approaches to writing, producing and consuming theory (pp. 1–24). Bingley, UK: Emerald.
Rheinhardt, A., Kreiner, G.E., Gioia, D.A., & Corley, K.G. (2018). “Conducting and publishing rigorous qualitative research”. In C. Cassell, A.L. Cunliffe, & G. Grandy, (Eds), The sage handbook of qualitative business and management research methods (pp. 515–531). London, UK: Sage.
Roberts, A., & Zietsma, C. (2018). “Working for an app: organizational boundaries, roles, and meaning of work in the “on-demand” economy”. In L. Ringel, P. Hiller, & C. Zietsma, (Eds), Toward permeable boundaries of organizations? (research in the sociology of organizations) (Vol. 57, pp. 195–225). Bingley, UK: Emerald.
Sandberg, J., & Alvesson, M. (2011). “Ways of constructing research questions: gap-spotting or problematization?”. Organization, 18, 23–44, 10.1177/1350508410372151
Subramaniam, M., Iyer, B., & Venkatraman, V. (2019). “Competing in digital ecosystems”. Business Horizons, 62, 83–94, 10.1016/j.bushor.2018.08.013
Suddaby, R. (2006). “From the editors: what grounded theory is not”. Academy of Management Journal, 49, 633–642, 10.5465/amj.2006.22083020
Timonen, V., Foley, G., & Conlon, C. (2018). “Challenges when using grounded theory: a pragmatic introduction to doing GT research”. International Journal of Qualitative Methods, 17, 1–10, 10.1177/1609406918758086
Tunçalp, D., & Lê, P. L. (2014). “(Re)locating boundaries: a systematic review of online ethnography”. Journal of Organizational Ethnography, 3, 59–79, 10.1108/JOE-11-2012-0048
Vaast, E., & Levina, N. (2015). “Speaking as one, but not speaking up: Dealing with new moral taint in an occupational online community”. Information and Organization, 25, 73–98, 10.1016/j.infoandorg.2015.02.001
Voss, A., Lvov, I., & Thompson, S.D. (2017). “Data storage, curation and preservation”. In L. Sloan, & A. Quan-Haase, (Eds), The SAGE handbook of social media research methods (pp. 161–176). London, UK: Sage.
Walsh, I., Holton, J.A., Bailyn, L., Fernandez, W., Levina, N., & Glaser, B. (2015). “What grounded theory is…a critically reflective conversation among scholars”. Organizational Research Methods, 18, 581–599, 10.1177/1094428114565028
Whiting, R., & Pritchard, K. (2018). “Digital ethics”. In C. Cassell, A.L. Cunliffe, & G. Grandy, (Eds), The sage handbook of qualitative business and management research methods (pp. 562–579). London, UK: Sage.
Younger, S., & Fisher, G. (2020). “The exemplar enigma: New venture image formation in an emergent organizational category”. Journal of Business Venturing, 35, 10.1016/j.jbusvent.2018.09.002
Zilber, T.B. (2017). “A call for ‘strong’ multimodal research in institutional theory”. In M.A. Höllerer, T. Daudigeos, & D. Jancsary, (Eds), Multimodality, meaning, and institutions (research in the sociology of organizations) (Vol. 54A, pp. 63–84). Bingley, UK: Emerald.
The author gratefully acknowledges the Management and Organization Department Faculty and doctoral students at the Smeal College of Business at The Pennsylvania State University for the discussions held during his stay there. Special thanks to Charlene Zietsma and Anna Roberts, from whom he had the opportunity to learn a lot about qualitative research in general and online-based research in particular. He further thanks Andrea Segatto for the valuable knowledge on qualitative research she has been sharing with him in the past four years. He would like to extend his gratitude to Marlei Pozzebon and Diógenes Bido for their guidance on the submission process, and to RAUSP Management Journal’s Editor-in-Chief Flavio Hourneaux and the two anonymous reviewers for their insightful comments and suggestions, which led to considerable improvements in this article that would not have been possible otherwise. This study was financed in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq: 141051/2019-1) and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.