Data-driven innovation processes within federated networks

Purpose – Within digital innovation, there are two significant consequences of the pervasiveness of digital technology: (1) the increasing connectivity is enabling a wider reach and scope of innovation structures, such as innovation networks and (2) the unprecedented availability of digital data is creating new opportunities for innovation. Accordingly, there is a growing domain for studying data-driven innovation (DDI), especially in contemporary contexts of innovation networks. The purpose of this study is to explore how DDI processes take form in a specific type of innovation networks, namely federated networks.
Design/methodology/approach – A multiple case study design is applied in this paper. We draw our analysis from data collected over six months from four cases of DDI. The within-case analysis is aimed at constructing the DDI process instance in each case, while the cross-case analysis focuses on pattern matching and cross-case synthesis of common and unique characteristics in the constructed processes.
Findings – Evidence from the cross-case analysis suggests that the widely accepted four-phase digital innovation process (including discovery, development, diffusion and post-diffusion) does not account for the explorative nature of data analytics and DDI. We propose an extended process comprising an explicit exploration phase before development, where refinement of the innovation concept and exploring social relationships are essential. Our analysis also suggests two modes of DDI: (1) asynchronous, i.e. data acquired before development and (2) synchronous, i.e. data acquired after (or during) development. We discuss the implications of these modes on the DDI process and the participants in the innovation network.
Originality/value – The paper proposes an extended version of the digital innovation process that is more specifically suited for DDI. We also provide an early explanation of the variation in DDI process complexities by highlighting the different modes of DDI processes. To the best of our knowledge, this is the first empirical investigation of DDI following the process from early stages of discovery until postdiffusion.


Introduction
The digital innovation literature at large looks at how digital technology is changing the landscape of products, services, business models or even entire industries (Nambisan et al., 2017). Within the area of digital innovation, there are two significant consequences of the pervasiveness of digital technologies. First, the amount and speed of digital data generated from our everyday interactions, along with advances in storage and analytical technologies, are creating new opportunities for innovation (Trabucchi et al., 2018). This has led to an increasing interest in data-driven innovation (DDI) (Rindfleisch et al., 2017; Trabucchi and Buganza, 2019). Second, digital technology enhances connectivity, which enables a wider reach to innovation participants, as well as a broader scope for their participation and

Related research
We position this paper within the broader digital innovation literature, sharing Yoo et al.'s (2010) definition of digital innovation as "the carrying out of new combinations of digital and physical components to produce novel products" (p. 726). Products here are interpreted as general innovation outcomes, including products, services or combinations thereof, as it becomes challenging to recognize the boundaries between them (Yoo et al., 2012). Since the generation of digital data and the associated analytical techniques are considered digital components, DDI is a specific phenomenon of digital innovation. Since we aim to explore DDI processes in the context of federated networks, the following subsections review the related literature on the digital innovation process, the role of data analytics, and the challenges that federated networks bring to such a process.

Digital innovation process
Innovation processes are, in general, regarded to be nonlinear and complicated (Van de Ven, 2017; Van de Ven et al., 2008). Nevertheless, various streams of the literature assert that every digital innovation goes through phases of discovery, development and diffusion (Fichman et al., 2014; Garud et al., 2013; Kohli and Melville, 2019). In addition, conceptual models of innovation processes often include a postdiffusion phase where the innovations are scaled up, exploited or even terminated (George and Lin, 2017; Kusiak, 2009; Van de Ven et al., 2008). Here, we delimit our scope to the temporal processes through which an innovation outcome, such as a product, service or a combination thereof, progresses. Accordingly, digitizing processes or process innovations lie outside the scope of this paper. It is important to note that we regard these phases as high-level stages of significance to the specific innovation outcomes, rather than a precise model of activities (Garud et al., 2013, 2016). The following subsections briefly describe the aforementioned phases.
2.1.1 Discovery. Discovery refers to the emergence of novel ideas that have the potential to develop into one of the digital innovation outcomes discussed previously. The idea in digital innovation may be triggered and/or enabled by digital technology (Nambisan, 2013). This phase may also be referred to as invention (Garud et al., 2013), exploration of concepts (Bergvall-Kåreborn and Ståhlbröst, 2009), idea generation (Foulonneau and Turki, 2015; Hansen and Birkinshaw, 2007) and initiation (Kohli and Melville, 2019). The emergence of novel ideas often follows a period of conception and incubation; predicting exactly when or how these ideas emerge is almost impossible, but the recombination of ideas and artifacts across knowledge domains is suggested to enable such novelty (Garud et al., 2013). Instead, understanding the conditions underlying their emergence, and how to filter sound ideas from noise, is recommended (Kohli and Melville, 2019). Ideas then go through some form of evaluation or critical assessment (Bergvall-Kåreborn and Ståhlbröst, 2009; Foulonneau and Turki, 2015; Garud et al., 2016).
2.1.2 Development. This phase covers the transformation of the idea or concept into one or more of the abovementioned innovation outcomes (Fichman et al., 2014; Garud et al., 2013). This phase may start with the proliferation of ideas into a few plausible developmental paths (Van de Ven et al., 2008), where the different resources are matched to facilitate the development (Wheeler, 2002). The level of maturity of such an outcome depends on the iteration, resources and choices of the innovator; thus, the outcome from this stage can be a prototype, an earlier version of the innovation outcome, or a version ready for full launch into the market (Bergvall-Kåreborn and Ståhlbröst, 2009). Questions addressed in this phase include design questions on how things work, as well as why they work. Various reference domains converge in this phase, where activities from design to evaluation to understanding adoption factors come into play (Kohli and Melville, 2019).
2.1.3 Diffusion. The diffusion phase is when the innovation "spreads across a population of potential users" (Fichman et al., 2014, p. 336). This may also refer to innovation adoption or implementation (Bergvall-Kåreborn and Ståhlbröst, 2009; Garud et al., 2013; Kohli and Melville, 2019; Rogers, 2010). We do not use implementation to describe this phase, in order to avoid confusion with hardware and software implementation (i.e. postsales activities). As digital technology improves connectivity and reach to potential users, the boundaries between development and diffusion become blurry (Nambisan et al., 2017), especially when users are also involved early in the process for testing. Thus, diffusion includes both the use of the innovation and the integration of the innovation with resources in its intended context (Fichman et al., 2014).
2.1.4 Postdiffusion. Various studies highlight different phases following innovation diffusion, including its impact, scaling, exploitation or even termination. With impact, the focus is on assessing and understanding the effects of the innovation (Fichman et al., 2014). The level of impact depends on the scope of diffusion preceding this stage (e.g. it can be assessed at the individual, organizational, industry or societal level). Since diffusion does not happen in a vacuum, the integration of the innovation with existing artifacts and practices produces both intended and unintended consequences that need to be accounted for. This includes assessing the perceived value of the innovation in use (Wheeler, 2002). Depending on the success of the innovation and its consequent diffusion, innovators are then concerned with either scaling (George and Lin, 2017; Kusiak, 2009), exploiting new opportunities enabled by the innovation (Kohli and Melville, 2019) or considering the termination of the innovation (Garud et al., 2013; Van de Ven et al., 2008). Table 1 summarizes the digital innovation process phases. These four phases are used (1) to organize the DDI literature and (2) as seed categories to analyze the empirical data collected in this study.

Data-driven innovation
DDI is an emerging phenomenon that refers to the integration of digital data and analytics into innovation (Engel and Ebel, 2019). Given the early stage of the literature, relevant studies are dispersed. Hence, we use the four main stages of innovation to organize the literature on DDI (see Table 2). This does not imply that all these forms exist in one process instance though.
The role of data analytics in the innovation process varies from one phase to another. The most common role of data analytics is providing innovators with new knowledge or knowledge combinations that they could not obtain otherwise in a feasible manner (Wu et al., 2019). This includes analyzing existing data sets to extract patterns that reveal opportunities for innovation, by means of identifying needs of users (Kuehl et al., 2016) or identifying new users (Trabucchi et al., 2017, 2018). It is also used to identify problems that define a need for innovation (Herterich et al., 2015). Using data analytics for this type of knowledge search may create the conditions for innovation discovery or the sprout of innovative ideas (Garud et al.,

Table 1. Digital innovation process phases
Phase | Description | References
Discovery | Emerging new ideas that are triggered and/or enabled by digital technology | (Nambisan, 2013; Nambisan et al., 2017)
Development | Transformation of ideas into product, service or a combination | (Fichman et al., 2014; Garud et al., 2013)
Diffusion | Innovation's adoption and use by a population of users and integration with existing resources | (Fichman et al., 2014; Kohli and Melville, 2019)
Postdiffusion | Activities commencing after innovation integration, including scaling, assessing its impact, exploitation or termination | (Garud et al., 2013; George and Lin, 2017)

Garud et al., 2013), which can also be conducted using data analytics (Kusiak, 2009). Engel and Ebel (2019) differentiate between the roles of data analytics in knowledge discovery and validation as explorative and validative DDI, respectively. The latter is more commonly present in the phase of innovation development, where data are collected for monitoring (Brunswicker et al., 2015; Lee et al., 2014) or testing and simulation (Meyer, 2015; Wrasse et al., 2015) purposes. However, data analytics is also used in an explorative manner to inform design and development decisions that would improve user experience (Lin et al., 2016) or incorporate user insights into the design (Yeh and Chen, 2018). Moreover, although it is less a role of data analytics itself, explicating data needs during the design of DDI is crucial for some of the following phases (Seidelin et al., 2017). This is particularly relevant for innovations that continue to generate data during diffusion and postdiffusion phases, namely generative DDIs (Engel and Ebel, 2019).
In diffusion, data analytics is viewed as an embedded technology in the use and delivery of innovation. For example, the servitization of the manufacturing industry yielded some of the early insights on how data analytics is used to provide predictive maintenance of products (Lee et al., 2014; Neely, 2008). With ubiquitous digitalization, innovations continue to be customizable and individual through data analytics (Lehrer et al., 2018). Analytics also affords the possibility of continuous monitoring of user engagement during this phase (Okazaki et al., 2015) and the possibility for early interventions. Operating in such a data-driven manner after launching an innovation into the market is found to be a contingent factor for scaling (Huang et al., 2017). Last but not least, data analytics creates a loop out of the traditional innovation process, where data generated from one innovation can be exploited for improvements, trading or the exploration for new knowledge to inform the discovery of other innovations (Trabucchi et al., 2018; Trabucchi and Buganza, 2019).
In addition, the literature provides key insights on cross-cutting issues that affect various phases. These insights are represented by different conceptual frameworks that aim at guiding innovators in their journey, such as identifying the scope of analytics and innovation (George and Lin, 2017), understanding types of data-driven outcomes (Rizk et al., 2018), clarifying the objects and people on which data are collected (Maglio and Lim, 2016), and building relevant data analytics capabilities to support the process (Wu et al., 2019).

Table 2. DDI literature organized by innovation phase
Phase | Role of data analytics | References
Development | … | … Lin et al., 2016; Seidelin et al., 2017; Yeh and Chen, 2018)
Development | Testing and simulation | (Meyer, 2015; Wrasse et al., 2015)
Development | Monitoring and evaluating development | (Brunswicker et al., 2015; Lee et al., 2014)
Diffusion | Customization and individualization | (Lehrer et al., 2018)
Diffusion | Continuous service delivery | (Lee et al., 2014; Neely, 2008)
Diffusion | Monitoring user engagement | (Okazaki et al., 2015)
Postdiffusion | Exploitation of innovation data | (Herterich et al., 2015; Trabucchi et al., 2018; Trabucchi and Buganza, 2019)
Postdiffusion | Scaling | (Huang et al., 2017)

While the literature is rapidly providing foundational knowledge for DDI, a holistic process following DDI outcomes is largely underexplored. To the best of our knowledge, the current body of knowledge lacks the process perspectives of how such outcomes emerge throughout discovery, development and diffusion, and how they evolve and sustain onwards (or terminate).

Federated networks
The vast majority of digital and DDI literature reports on innovation activities within organizational structures, where established routines and practices provide an organizational backdrop to support the innovation activities (Kohli and Melville, 2019). However, digital technology is pushing the boundaries of innovation beyond the organizational perimeters, into a world of networked collaborations (West and Bogers, 2017). A federated network is one type of such networks, where innovation participants come from heterogeneous groups of actors, who bring along heterogeneous knowledge contributions (Lyytinen et al., 2016). However, federated networks have a central node, which typically exercises tight control over the innovation outcome through highly centralized coordination mechanisms. Thus, similar to the political use of the term, stakeholders involved in a federated network operate with some degree of autonomy while working under the coordination of the central node. Recent evidence suggests that federated networks are the most common archetype of digital innovation networks in practice and that other networks are also shifting toward becoming federated ones (Hund and Wagner, 2019).
Federated networks may enjoy a wider knowledge base that boosts innovation, but they also face complicated, nonlinear innovation processes. More specifically, federated networks' participants capture and share knowledge with different representations, which makes their respective cognitive translations ambiguous and emergent (Lyytinen et al., 2016). They also face social translations, the process by which social relations between the network participants change due to their participation in the innovation process (Lyytinen et al., 2016). Indeed, those two translation processes are interwoven. How data analytics influences these processes, or vice versa, along the main phases of DDIs is unknown. We explore this phenomenon by following four process instances of DDIs taking place in federated network structures. The following section outlines the method used for data collection and analysis.

Research method
In this study, we chose a case study research approach, motivated by a few factors. First, empirical studies reporting on the DDI process are limited, and the phenomenon itself is one of a contemporary nature. Second, the boundaries between the process, the outcome, and the context are not easy to separate in practice (Yin, 2018). The research strategy is an exploratory one; i.e. it explores the innovation processes in their rich contextual detail and illustrates how they shape the innovation outcomes.

Case context
This study is part of a smart city project: OrganiCity. OrganiCity is a €7.2m project funded by the European Commission's Horizon 2020 Research and Innovation framework program from 2015 to 2018. The scope of the project was to build an Experimentation-as-a-Service (EaaS) platform where third-party teams innovate with Internet of Things (IoT), urban data and cocreation tools in three European cities: London, Santander and Aarhus. The project operated by providing the experimentation facility and cascading funds to teams that are interested in innovating with said tools to design and deliver their own data-driven solutions. The teams applied to an open call for applications and their proposals were evaluated on the basis of novelty, expected impact and feasibility.
A total of 37 teams (of 2-7 members each) were funded by the project with up to 60,000 Euros over two open calls, with an execution period of six months each. During this period, the teams were required to submit a plan and two progress reports as part of the funding requirements. In addition, they were required to cocreate their innovations with their respective community and to integrate their solutions with the EaaS facility. On the one hand, the teams were given freedom regarding their working methods, choice of technology, partners and processes. On the other hand, the consortium's objective of creating an experimentation ecosystem for smart cities imposed some constraints on those processes. Thus, innovations are neither emerging from a homogeneous knowledge community, nor are they products of networks of collaboration that are working toward a single objective (Brunswicker et al., 2015; West and Bogers, 2017). Each of the selected teams had at least two types of external stakeholders with whom they were cocreating their innovations (e.g. end user community, business customer, governmental authority, etc.). Thus, their respective innovations are emerging through a federated network coordinated by the centralized control of the teams, drawing on the heterogeneous knowledge pools of the consortium, their respective users, customers and other stakeholders (Lyytinen et al., 2016). This context brings other aspects of uncertainty into the DDI process.

Case selection
Four proposals, in which the teams relied on data as a main resource for delivering functionality and value, were selected and analyzed for the purpose of this study. Their respective proposals were funded for six months, from October 2016 until April 2017. The selection of the cases was based on purposeful sampling (Patton, 2005), where the cases are suitable to study the target phenomenon and its unit of analysis (Coyne, 1997). The four teams coordinating these innovations comprised a few members each, but their networks included participants from the scientific communities, organizational beneficiaries, end users and the OrganiCity consortium, all of which contributed to shaping their innovation outcomes (Lyytinen et al., 2016). In addition, we selected teams that were operating in the same city (London) to eliminate cultural or structural factors as sources of variation in the process.

Data collection
Researchers studying innovation from a process perspective are recommended to look at critical events along the journey (Bygstad et al., 2016; Tuertscher et al., 2014). To create an account of those events and journeys of the four selected innovation processes, we relied on multiple data sources that were produced during those six months. The data sources used in the study include the innovation proposals, initial planning documentation and progress reports, public communications such as blogs, and interviews. In addition, some teams chose to share other material such as their developed artifacts and/or stakeholder engagement workshop material. The teams' innovation proposals are their applications to the OrganiCity project's open call for funding. Proposals were submitted through a competition created on F6S, which is a startup community platform. In these proposals, the teams outlined and documented the initial forms of the ideas behind the DDIs, along with a high-level execution plan and expected impact. These proposals provided insights on the early shape of the ideas and information on the team members, their demographics and competences.
Upon selection, the teams were asked to draft initial planning documents as part of their Experiment Agreement (EA), with more detailed plans concerning their development and cocreation activities, foreseen milestones and deliverables to the project. Progress reports were submitted twice for each case across the six months of their funding, and each submission was a condition for releasing a portion of their funding. Interim reports submitted halfway through the project covered their experience with experimentation and their development thus far, as well as reporting on their cocreation activities with their respective beneficiaries. Final reports were submitted at the end of the project as a reflection on the whole journey of innovation, with a focus on the whole process and on assessing the impact of their innovations. Blog entries were written throughout the project at significant moments for the different innovations and were posted either on the team's own website or on a third-party hosting service, such as the main project's page or Medium. Both reports and blog entries enabled capturing the different activities and events in close proximity to the time they took place, which makes those data sources of high accuracy and credibility (Silverman, 2013).

Interviews are used to collect rich data about a phenomenon in its natural context (Kvale, 2008). The purpose of the interviews is to capture the big picture of the process as experienced by the teams firsthand, in addition to filling any gaps in the reports. Nine semistructured in-depth interviews were conducted with the four teams. Semistructured interviews are guided by a set of questions that are designed beforehand in order to address relevant information, in addition to open-ended questions to account for unexpected information (Hove and Anda, 2005). The interviews were conducted with the team lead and with any other team members needed to provide details about areas that the team lead was not familiar with. For example, if the team lead was a social designer, the data scientist was also interviewed to complement the lead's view on the development and deployment of the innovation.
All interviews were conducted online through Skype or Google Hangouts, and were voice recorded and later transcribed. An interview protocol containing open-ended questions was used, addressing different themes around the processes, activities and tools and technologies used throughout their six-month experimentation period. Follow-up questions were asked when needed to clarify and/or expand on a certain event or activity. Interviews lasted between 45 and 115 min. All interviews were conducted in English, transcribed and sent to the team leads to confirm accuracy. Table 3 provides a summary of the empirical data collected and used in this study. In addition to the proposals, planning and progress reports, which were equally distributed among the cases, Table 4 describes the distribution of the three other data sources among the cases.

Data analysis
In this study, we adopt the view of a process as "a sequence of events representing changes in things" to understand how the four DDIs emerged within their respective networks over time (Garud et al., 2018, p. 227). Accordingly, the first step toward this understanding was the inductive analysis of data to specify and order the events. Indeed, the scale of the events is proportionate to the scale of the innovations, meaning that we sought to identify events that were significant in shaping the innovations. Each event was identified by coding the textual data produced by the team (either transcriptions of interviews or documents they wrote). Each event was then associated with an outcome to confirm it is an actual event. Actors involved in the activities leading up to this event, or in the event itself, were also identified. The events and actors were subject to open coding by the first author, who interviewed the teams. The chronological sequence of events was then corroborated from the different sources of data listed in Table 3, and then sent to the team leads to confirm the events' accuracy and order. This yielded 81 distinct events for all four cases. A sample of the coded events is shown in the Annex, along with excerpts from the sources.
Next, those events were subject to concept-driven coding using the digital innovation phases as seed categories (Gibbs, 2007). The events were mapped to the phases according to the phase descriptions presented earlier (Garud et al., 2013; Van de Ven et al., 2008). It is important to note that these phases were used as an analytical instrument rather than for theory testing. Two issues emerged with the four-phase analysis: (1) around 17% of the activities and events (14 out of 81) did not clearly match the description of the phases and (2) no clear pattern was observed across the cases. Inspired by Wheeler's (2002) idea of the extended phase of conversion of ideas into concepts to be developed, an "exploration" phase was added between discovery and development, and the events were remapped accordingly.
After this classification by the first author, a secondary classification was conducted by a group of innovation researchers during a three-hour workshop, in which consensus was reached after briefly discussing each event. Their codes were then compared with the first author's and an inter-rater reliability (IRR) score was calculated, yielding 96% agreement. Disagreements were then discussed with the second author and resolved. An event-listing matrix (Miles et al., 2013) was then developed for each case to obtain an overview of the process. In addition, the process with a networked view of the participating actors was visualized (see Figures 1-4), which further enabled the cross-case analysis and pattern detection. The scope of the cross-case analysis was to identify, in each phase, common enabling or challenging social and cognitive translations.
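The agreement score reported above can be read as simple percent agreement, i.e. the share of events to which both coders assigned the same phase. The following is a minimal sketch under that assumption; the source does not state which IRR statistic was used, and the function names and example codes are hypothetical.

```python
# Hypothetical sketch: percent agreement between two coders' phase codes.
# Assumes each coder assigns exactly one phase label per event, in the same order.

def percent_agreement(codes_a, codes_b):
    """Return the fraction of events on which both coders chose the same phase."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same number of events")
    if not codes_a:
        raise ValueError("at least one coded event is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def disagreements(codes_a, codes_b):
    """Return the positions of events whose codes differ, for later discussion."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


if __name__ == "__main__":
    # Illustrative data only: four events, one of them coded differently.
    first_author = ["discovery", "exploration", "development", "diffusion"]
    workshop = ["discovery", "development", "development", "diffusion"]
    print(percent_agreement(first_author, workshop))
    print(disagreements(first_author, workshop))
```

As an arithmetic illustration only, 78 matching codes out of 81 events would round to 96%; the raw counts are not reported in the source. Percent agreement does not correct for chance agreement, which is one reason a chance-corrected statistic is sometimes preferred.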

Case analysis
To make a distinction between the phases, specific events were defined for each phase. Proposal submission and the decision for funding marked the end of the discovery phase, since a relatively clear idea was proposed and OrganiCity judged it to be worthy of allocating resources for its pursuit. A refined innovation concept or idea marked the end of the exploration phase, during which available resources and network connections were explored and examined. A ready innovation launched to the respective market to be used by its users marked the end of the development phase and the beginning of diffusion. Since the analysis focused on the data collected during the teams' work in the OrganiCity project, the diffusion view was limited to that period. However, we included a postdiffusion status, which reports on the status of the teams and innovations three years after their funding by OrganiCity (at the time of writing the paper).
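The boundary rules above can be sketched as a small procedure that walks through a chronologically ordered event list and advances the phase whenever a boundary event occurs. This is an illustrative reading, not the authors' tooling; the marker names and the example events are hypothetical, and postdiffusion is left out because it is reported as a later status rather than derived from events.

```python
# Hypothetical sketch: assign phases to ordered events using boundary markers.
# Marker names are invented for illustration; each marker ends exactly one phase.

PHASE_ORDER = ["discovery", "exploration", "development", "diffusion"]

END_MARKERS = {
    "funding_decision": "discovery",      # proposal submitted and funded
    "concept_refined": "exploration",     # refined innovation concept or idea
    "innovation_launched": "development", # ready innovation launched to its users
}


def assign_phases(events):
    """Map (description, marker) pairs, in chronological order, to phases.

    The event carrying a marker is counted in the phase that it ends.
    """
    phase_idx = 0
    assigned = []
    for description, marker in events:
        current = PHASE_ORDER[phase_idx]
        assigned.append((description, current))
        if marker is not None:
            if END_MARKERS.get(marker) != current:
                raise ValueError(f"marker {marker!r} is unknown or out of order")
            phase_idx += 1
    return assigned


if __name__ == "__main__":
    # Illustrative events loosely modeled on a single case.
    example = [
        ("core team formed", None),
        ("proposal funded", "funding_decision"),
        ("literature review", None),
        ("scope refined", "concept_refined"),
        ("map developed", None),
        ("map launched", "innovation_launched"),
        ("organized walks", None),
    ]
    for description, phase in assign_phases(example):
        print(f"{phase:12s} {description}")
```

Counting the boundary event in the phase it closes is a design choice made here for simplicity; the source only states which events mark the end of each phase.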
Within each phase of the process instance, the participating actors, activities/events and resources were visualized temporally. In doing so, a few elements needed to be highlighted (summarized in Table 5, legend to Figures 1-4):
(1) Social interactions between participating actors, which overall contributed to understanding social translations.
(2) Knowledge exchange specific to shaping the innovation outcome, or resulting from interaction with the innovation, which contributed to understanding cognitive translations.
(3) Data flow, representing important datasets used in the design and development of the innovation, or generated as a result of its use.
(4) Product/service offerings represented by the innovation outcome, or the ones indirectly affected by the outcome.
(5) Contractual resources representing both social and knowledge exchange that were bound by a contractual agreement, specifically between the teams and OrganiCity.
Otherwise, the main activities and events were sequenced in the process instances using alphabetical sequencing. In the following four cases, descriptive narratives are provided, focusing on illustrating the unique events for each case and visualized in Figures 1-4 below. Accordingly, their relationship with OrganiCity was moderated, especially during the discovery and development phases, where the funding process and subsequent reporting/support were similar across the cases.

(Waters et al., 2019). Prior to their involvement with OrganiCity, an acoustic consultant and a team of individuals composed of artists and urban designers launched a social media campaign (via Instagram) so that London residents and visitors (the end users) could share their perceptions of tranquility through crowdsourced images.
In case A, the discovery phase started with creating a core team using their personal networks. Their proposed idea focused on understanding the characteristics, both objective and subjective, of tranquil spaces, with the aim of encouraging the design and use of such places. Objective metrics included air and noise pollution data, and subjective characteristics were based on the end users' perceptions through patterns observed in their crowdsourced contributions. The main idea was to explore "if we can change our perception of the city to not be noisy, busy and polluted all the time" (Team lead). Their proposal considered the end users who would benefit from understanding the health and well-being benefits of tranquil spaces, as well as how this understanding would influence the protection and creation of more of those spaces in cities in the future (from the experiment agreement).
During the exploration phase, the team worked intensely on understanding tranquility as their phenomenon of interest. This entailed interacting with various knowledge bases and data sources. Three of the most significant knowledge development efforts and representations that shaped their innovation are (1) conducting a literature review drawing on related aspects such as air quality, noise pollution, happiness and emotional well-being and ecology/biodiversity, (2) exploring the subjective aspect of tranquility by extracting the main features represented in the images from the end users' crowdsourced data set and (3) exploring the objective aspect of tranquility by collecting open data sets of noise and air quality. Accordingly, they refined their scope as follows: "to understand the links of urban tranquility with low pollution exposure" (Sustainability consultant). In parallel, they worked with city planners to try and find a common language: "We also work with master planning, so we were trying to understand and find a language that they could understand and buy in. Something like, 'By protecting these sites, there is a value associated to that. Value in terms of health, in terms of property prices, in terms of many things.'" (Sustainability consultant).
During the development phase, the team conducted data analytics and service development activities. The latter was regarded as more value-adding toward the innovation. As described by the sustainability consultant, "the guys have been mainly involved in crunching that data. But the key thing is the map. Because if you just provide the data to the public, it does not necessarily make sense, but making the map makes working on it super easy." They also conducted workshops with both the end users and city planners to test the map and interpret the insights, respectively.
As a result of these workshops, a new feature was recommended: the user would compare routes, and the pollution exposure of each route would be calculated. The difference between routes would then be quantified in different ways: "you can put a monetary value, quantify it through the days of life lost, or the days of enhanced quality of life, etc. That was possible because we know about pollution impact and how these things affect you. So we have been working with statistical econometric analysis behind that. And all the methodologies are based on WHO's" (Sustainability consultant).
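The route comparison described above can be illustrated with a minimal sketch. It is not the team's implementation: the segment representation, the time-weighted exposure measure and all names (`route_exposure`, `compare_routes`) are assumptions made for illustration, and the monetary or days-of-life quantification mentioned by the consultant is deliberately left out because the source does not specify it.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a route is a list of segments, each given as
# (minutes spent on the segment, pollutant concentration in ug/m3).
Segment = Tuple[float, float]


def route_exposure(segments: List[Segment]) -> float:
    """Time-weighted exposure: sum of minutes * concentration per segment."""
    return sum(minutes * concentration for minutes, concentration in segments)


def compare_routes(route_a: List[Segment], route_b: List[Segment]) -> Dict[str, float]:
    """Return the exposure of both routes and the difference (a minus b)."""
    exposure_a = route_exposure(route_a)
    exposure_b = route_exposure(route_b)
    return {"a": exposure_a, "b": exposure_b, "difference": exposure_a - exposure_b}


# Illustrative numbers only, not data from the case.
busy_route = [(10, 40.0), (5, 60.0)]
quiet_route = [(12, 20.0), (6, 25.0)]
result = compare_routes(busy_route, quiet_route)
```

In this sketch a positive `difference` means the first route carries the higher exposure; any further valuation step would take that difference as its input.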
During these workshops, the team realized the challenge associated with inviting heterogeneous participants (end users and city planners) to the same workshops. As the team lead described it: "we had that mix of people. So that was great because we had people who were influential and people who were just citizens who wanted to just reduce their exposure. But perhaps due to their different expectations, an urban planner said 'these things should be left to professionals.'"
In the diffusion phase, and throughout the process, the crowdsourced images using their designated hashtag were still being posted. The team launched the map along with organized walks to use the map in its designated context, i.e. exploring tranquil spaces. Simultaneously, the team exchanged the insights they gathered with city planners, although this engagement was relatively more challenging. The team lead compared how the two types of participants interacted with the innovation as follows, starting with end users: "People were just literally trying to find the best route; they were not complicating things, not trying to think if they know this street or not. They react to the experiment in a much more simple and straightforward way, which worked well. And people who had a bit more influence [city planners] were thinking at higher level, I guess on a grander scale."
As a result of their development with OrganiCity, they also uploaded a tranquility data set that contained images of tranquil spaces and their spatially associated pollution index, adding to the overall facility's open datasets. From that point onwards, the team continued to define themselves to be working with a movement or initiative rather than a specific innovation, partially in order to keep crowdsourced data coming. Three years after their initial funding from OrganiCity, they ran a portfolio of services that includes the map, an aggregated data analysis tool (ADAT), a tranquility index, an API and the experience of walks and tours.

Case B
The team behind case B comprised four individuals who got together during a hackathon a few years before OrganiCity's open call. As a result of that hackathon and seed funding afterward, they developed a mobile sensor that measures air quality indicators with a location. They had considered mounting such sensors on bikes, but that did not work. The team learned about OrganiCity's open call through their team lead, who noted "I work for an NGO campaigning on air quality and we were invited to the [OrganiCity] initial launch event." This event was a workshop in which OrganiCity was defining city challenges. During this workshop, the team identified the problems with existing air quality monitoring in London as follows: "So, in terms of city challenges, we identified the challenge of air quality, and identified the challenge that there are only 120 static monitors in London with modelled data in between, and we wanted to provide more spatiotemporally granular data, so that is what we are trying to do through harnessing existing networks of moving agents." (Team lead)
The team then approached a green cargo company to partner with them in order to host the team's sensor on their fleet of electric vehicles, based on which the team would provide them with a routing tool. Accordingly, the company, as a business customer, provided the team with the opportunity, while OrganiCity provided them with the funding and resources for pursuing their idea.
During the exploration phase, the team investigated a few crucial elements for the development of their innovation. First, they discussed the design requirements for the new configuration they were seeking, that is, mounting the sensor on vehicles. Then they scanned the OrganiCity platform for tools that they could use in developing their innovation, and explored one of the main sensing toolkits, namely SensiNact. They also started meeting with their prospective business customer and hosting partner to explore their needs and refine their innovation concept for the sensor and the routing tool. However, the team realized the challenge with this partner early in the development phase. The team leader described their experience as follows: "we had this high aspiration of creating this routing tool which turns out to be really difficult. We have discussed with them and talked to them about what it would and could look like and talked about the possibility. The difficulty was also the fact that it's such a novel dataset and service made it really difficult for them to imagine how it could integrate."
In addition to this readiness challenge, the business customer voiced their drivers' (i.e. passive end users) concerns over their privacy and safety, in terms of revealing their real-time locations. The team widened their engagement efforts to seek interested prospects as a fallback scenario. They cooperated with scientists on calibration experiments to improve their prototype and integrate other needed sensors in their device. After a few iterations between the labs and the field, they finally mounted the sensors on the vehicles. Initially, the team designed the sensors to send their data directly to the OrganiCity platform; however, due to limited capacity and the users' privacy concerns, the team introduced an intermediary server that stored the data generated by the sensors and on which their data analytics was conducted.
In the diffusion phase, the team integrated a new sensing module for OrganiCity's SensiNact tool, mounted a stable device composed of a number of air quality sensors on ten vehicles and developed visualizations from the generated data. These visualizations were used both in the digital service provided to their business customer (routing tool) and in approaching investors to further expand their innovation. Three years after the end of the OrganiCity funding, team B's innovation is used by four different organizations, including a local council (in construction sites) and schools (in their playgrounds).

Case C
The team behind case C aimed to develop a digital application that maps citizens' well-being using both objective and subjective well-being data. The objective was to help both citizens and their respective service providers to gain deeper insights on aspects of well-being of which they were not aware. During the discovery phase, the opportunity for OrganiCity's open call was spotted through the team lead's professional network, in which she also approached two other individuals to form the core team to submit their proposal for funding. All three members were social designers, with little experience developing digital innovations; thus, it was understood that later on during development they would need to expand the team to include a member with technical expertise. In writing the proposal, the team searched for problems and solutions around well-being, problems such as those facing their local council, social service providers and citizens and solutions that use digital technology to overcome them. Accordingly, they partnered with a social service provider that works with families to improve the lives of their children. The partnership's objective was two-fold: (1) define the well-being problem they would target through their innovation and (2) access the families who are their innovation's prospective end users. Both the idea and the partnership were targeting the proposal and funding, as the team lead notes in the interviews "So we just needed a kind of a framework for how to create a possible idea around the proposal criteria" and later on the timing of their partnership with the social service providers "They come onboard for the proposal basically." Until submitting the proposal, the idea was an "application" that will empower citizens and foster cocreative relationships between service providers and beneficiaries through a better understanding of well-being data. 
During the exploration phase, their activities commenced by conducting ethnographic research with families in overcrowded housing (specific end users) to establish trust and understand their perceptions of over crowdedness and how it affected their well-being (phenomenon of interest). They analyzed their research data to better understand the phenomenon, based on which they developed their application's design principles (e.g. the application should not consume the users' scarce resources). Then they used existing digital and data components, represented by ingredient cards, in a workshop with the end users to generate concrete and desirable application ideas. During this workshop, the first concrete innovation concept was proposed: a chatbot that acts like a digital stress ball. The design principles were used as assessment criteria for the generated ideas, matching what is possible and desirable to what is feasible and relevant. Additionally, this phase had intense interaction between the team and the end users, in which the latter contributed to the innovation concept (i.e. the chatbot).
In the development phase, the team hired a data scientist since they were not familiar with AI technologies. Due to time limitations, they developed the chatbot using an existing AI engine (motion.ai) to be deployed through Facebook Messenger, an interface that the end users were familiar with and that required no consumption of their resources. The chatbot was tested using test conversations, and three visualizations were developed accordingly: (1) a data diary for the end user to gain insight on their own well-being with a temporal dimension and in relation to their local community, (2) a dashboard for the social service providers to understand the topics that the users talk about, also with a temporal dimension and (3) a dashboard for the team to monitor the use and engagement of users with the innovation.
During diffusion and when the chatbot was released, the key data resource, that is the content and metadata extracted from the conversation with the chatbot, was generated simultaneously with the chatbot use. Based on this stream of data, the different visualizations were updated automatically. As a result of this chatbot use, the social service provider acquired new knowledge on their beneficiaries' well-being through a series of workshops: "These data visualizations and data collection from parents show the new insights where people actually talk about other issues like transportation, family, relationship, and communities. These are really important and have a strong relationship to their general wellbeing." (Team lead) In turn, this new knowledge can be used to improve existing services and/or provide them with new services. While the diffusion of their chatbot was a success among their target users, the adoption of the associated dashboards by the social service providers was challenged. One team member (service designer) described the main challenge as follows: "the organization as a whole needed some kind of capability building around service design, co-design, and systems thinking. And even data thinking. So we started doing that with them now [during workshop], but we needed to do that with them before we even started development." The team retrospectively highlighted the role of culture in challenging their engagements with the different stakeholders in the early phases of the process. The team lead noted, "There were two different cultures that were present. With parents, it was very easy to engage in the experimentation, play and role-playing. It might be something to do that we were in a children's center and that children were in and out of the room, so the vibe was different. And then the culture with our service [providers] was not conducive to experimentation. They were very solution focused. 
It's a common narrative, but we did not have a solution yet." Beyond those two stakeholders, the team did not upload or integrate their data sets with the OrganiCity facility, in order to protect the end users' privacy. Even though the innovation was relevant to its end users, it was terminated after the OrganiCity funding ended. Due to the challenges with the service providers, the team could not seek further funding from them or approach similar paying customers. Similarly, the lack of integration with the OrganiCity facility made it difficult to get more funding during the following open calls.

Case D
In case D, the two cofounders of a small data science consultancy company attended one of OrganiCity's networking events for funding opportunities, in which they partnered with urban designers/planners to write their proposal for funding. They aimed to improve accessibility for mobility-impaired users on the London transport network (LTN). Their proposed idea was to use spatiotemporal data from their users' movement, along with their accessibility profiles, to identify mobility blackspots in LTN. The places with issues would be investigated and design recommendations would be provided to the infrastructure and service provider, Transport for London (TfL). Even though the team described both users and the infrastructure provider as key stakeholders in the proposal, later during one of the interviews the team lead noted that the latter was their main target: "One of the problems here is that major redesign [of LTN] is going to be significantly very expensive. So we are going to propose various improvements where we can put a when and where [problems] happen. It's difficult to say but the idea hopefully would make it more useful for people accessing the transport network, but mainly for TfL in order to identify where these problems are. At the moment what they do is that they have regular meetings with the different groups where people verbally say, you know, we have problems in this or that area. But there is not really a lot of quantitative data to represent the problem. So that's where we hope to add additional insights."
Neither of these stakeholders was engaged until later during development, though, and even then only the users were involved. During their exploration phase, the team was focused on exploring the tools provided by the OrganiCity facility, especially those that allowed tracking user movement using smartphones. A tool named Smartphone Experimentation (SE) was selected and development commenced right away. The first prototype was designed and developed as a mobile application built over SE's tracking module. However, few sign-ups for testing were made, which the team attributed to the instability of the SE tool and the users' privacy concerns over location tracking. Hence, in the second prototype the team developed their own web application from scratch and changed the data acquisition mechanism to crowdsourcing, where the users could control what they shared by uploading their location history from Google Maps.
The data collected were then subjected to (a) preprocessing, in terms of treating missing points, extracting the start and end points of each journey and feature extraction, and (b) analytics to identify routing patterns and compare their time differentials to TfL's app routes. These patterns were discussed with the urban designers to help them interpret such "blackspots." Through these discussions, the team realized that due to limited usage, the routes did not overlap enough for them to identify many blackspots. Assuming privacy to be a main barrier, they modified their application to enable the users to annotate problematic areas directly on the map. With the help of both crowdsourced and computed blackspots, the team hoped to get a better picture and develop recommendations both for the users (alternative navigation routes) and for TfL (redesign recommendations).
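As a rough illustration of the preprocessing and comparison steps described above, the following sketch drops missing points from a location trace, extracts the start and end points of a journey and computes the time differential against a planned duration. It is a simplified sketch under stated assumptions, not the team's pipeline: the `Point` structure, the function names and the idea of passing the planner's estimate as `planned_seconds` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Point:
    t: float                # seconds since an arbitrary reference time
    lat: Optional[float]    # None marks a missing reading
    lon: Optional[float]


def clean_trace(points: List[Point]) -> List[Point]:
    """Drop points with missing coordinates and sort the rest by time."""
    valid = [p for p in points if p.lat is not None and p.lon is not None]
    return sorted(valid, key=lambda p: p.t)


def journey_summary(points: List[Point], planned_seconds: float) -> Optional[Dict]:
    """Extract start/end points and the differential to a planned duration.

    planned_seconds stands in for a journey planner's estimate (assumption).
    Returns None when fewer than two valid points remain.
    """
    trace = clean_trace(points)
    if len(trace) < 2:
        return None
    start, end = trace[0], trace[-1]
    actual = end.t - start.t
    return {
        "start": (start.lat, start.lon),
        "end": (end.lat, end.lon),
        "actual_seconds": actual,
        "time_differential": actual - planned_seconds,
    }


# Illustrative trace only, not data from the case.
trace = [Point(0, 51.50, -0.10), Point(60, None, None), Point(900, 51.52, -0.12)]
summary = journey_summary(trace, planned_seconds=600)
```

Journeys with a large positive `time_differential` would be the natural candidates to inspect for possible blackspots, although how the team actually flagged them is not described in the source.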
After launching the innovation, the diffusion rate was extremely low. The team tried to reach out to mobility-impaired users through social media and different nongovernmental organizations (NGOs) that serve them, with whom they conducted interviews to understand why adoption was challenged. According to the team lead, the timing of introducing the innovation to the users is a major factor, especially with vulnerable groups: "a lot of people go to these vulnerable groups and say I can do this or that, but they do not really deliver. And obviously, we contacted them in November based on launching in December. We were basically falling into this category, because we promised we're going to do this, and with the technology it wasn't feasible at this stage." The challenging user diffusion meant that the data set of movements was limited, which in turn meant that the insights (blackspots) were limited. This had two important consequences: the team could neither give back to the users the insights they expected, nor share any new knowledge with the infrastructure provider. In addition, the team could not share the collected data on the OrganiCity facility due to privacy concerns, as the facility was designed with sensory data (e.g. air quality, weather or traffic) in mind, where data sets are by default public. With such challenged diffusion, the innovation was terminated by the end of the OrganiCity funding.
Figure 4. Data-driven innovation process at case D

Cross-case analysis
This cross-case analysis highlights the main common and unique characteristics observable across the four process instances (summarized in Table A2 in the Appendix). In this study, we found that, even though OrganiCity's open call was identified as an opportunity to innovate and collaborate in these new networks, the starting point for the different DDIs varied across the cases. The teams came to know about OrganiCity's open call through active professional and personal networks, a plausible beginning in all cases, but there was a subtle difference in what preceded their knowledge of the open call. For instance, the teams in cases A and B had already started to draft some related ideas and collaborated with each other before spotting the open call, but the ideas that got funded were specifically new to their work. Here, the related ideas were already in motion with some of the team members, unlike cases C and D, whose teams and efforts were put together specifically for the funding.
After their ideas were selected for funding, a distinct phase of exploration and experimentation could be observed in all cases, before the activities pertaining to developing the DDI commenced. This phase was characterized by exploring data sources, increasing understanding of the phenomenon at hand and/or establishing important relationships. Cases A and B ensured there was a dialog with their organizational beneficiaries (city planners and business customer, respectively). During this phase, they also integrated these stakeholders' needs and viewpoints into their development plans. Having deep insights into their needs became a crucial knowledge contribution for the team to develop an innovation that was viewed as relevant to their prospective funder or paying customer.
On the other hand, the team in case C invested in their relationship with the end user during exploration, resulting in an innovation viewed as valuable by the end user, but not by their respective organizational beneficiary. In case D, the team focused on technical exploration of existing tools at the expense of building relationships with, and understanding the needs of, end users or organizational beneficiaries.
Another common element in the process instances of cases A and B was the engagement with a scientific knowledge community, whether through desktop research (A) or through experimental collaborations (B). This engagement helped them stay at the forefront of their fields and ensured the novelty of their execution, even though this engagement took place in different phases due to situational circumstances in their specific innovation journey. Neither of the teams in cases C and D involved a similar stakeholder.
In all four instances, as expected with DDI, there were a number of data science activities (i.e. data collection or generation, as well as data analytics or visualization). However, the manner in which these activities took place, involvement of stakeholders, and their timing varied. To clarify, in both cases A and B, the data were acquired first, based on which a digital service was developed. This was done asynchronously, in different time frames. On the other hand, cases C and D developed services that synchronously collected data and provided (visualized) insights. We refer to these as modes of DDI processes. While there were no observable advantages or disadvantages associated with either of the modes, there were a few observable implications on the innovation process.
With regard to development, the synchronous mode entails that data science and service design/development activities were highly intertwined, adding to the complexity of the process. In contrast, the asynchronous mode allowed the teams to analyze and interpret the relevant insights, even with other participants in the network (in case A), before building on them further in designing and developing their innovation.
The implications of these modes are also visible in the diffusion phase. In cases C and D, it was more challenging to motivate end users to adopt the innovation, since they could not see the value of the innovation until after a pool of users had started using it. This dependency on scale also made the organizational beneficiaries skeptical until that level of diffusion was achieved. In contrast, in case A the air quality data sets were available independent of the users' adoption of the innovation, and the end users were motivated to provide the crowdsourced images independent of the innovation. In case B, the sensor device was provided first, as part of their innovation, which motivated the organizational beneficiary to install it and provide the data, with no extra effort.
With regard to integration in the diffusion phase, two facets of the DDIs were significant: data integration and innovation integration. When data were available and integrated with other resources (such as in cases A and B), the traction of the innovation increased as other innovators used this data. On the other hand, as the innovation integrated with users' existing resources and cognitive frames, it became easier for them to adopt the innovation. For each of the four cases, the innovation needed to integrate with resources of at least two types of stakeholders: the organizational beneficiaries and the end users.
As pointed out by the teams in B and C, it was challenging for organizational beneficiaries to adopt new insights if they are not guided on how to use them considering their working practices. As for end users, integrating the innovation with other services they are already using minimized this gap and enabled adoption.
Moreover, innovation outcomes that relied on the analysis or visualization of data as their value proposition, without providing other actors with suggestions for action, were less likely to be perceived as valuable by those actors. Teams in cases A and B considered the alternative routing features as their value-adding features, whereas in cases C and D the analyses were presented to the actors (both end users and organizational beneficiaries) without a suggestion for action.
There are two issues that could be observed in all process instances, posing challenges for the innovation teams. First, with the heterogeneity of participating actors, it was common and challenging to mix those groups in cocreative activities. The diversity of expectations and nature of relationships with the innovation team made those activities relevant to some groups at the expense of others. While the end users appreciated informal setups, as close to the innovation use situations as possible, the organizational beneficiaries were critical when involved in similar informal setups. On the other hand, organizational beneficiaries were willing to adopt prototypes and earlier versions of the analytical insights, whereas end users were critical of immature innovations. Second, privacy was a crosscutting issue in all instances. Whether it was a conscious contribution by crowdsourcing (A), passive tracking of location (B) or capturing of a subjective personal experience (C and D), privacy was a concern regarding how this data would be visible to others within the innovation network and beyond. The privacy issue was also amplified in cases C and D, where the end users were described as vulnerable groups and the respective data were regarded as more sensitive compared to cases A and B.

Discussion
Our empirical analysis of the four DDI process instances suggests two novel properties of DDI processes: (1) DDI is essentially explorative and thus the DDI process should accommodate such exploration and (2) there are different modes of DDI processes that involve a diversity of complexities and opportunities in the innovation process. These two properties are discussed as follows.

Extended data-driven innovation process
When performing DDI processes, the widely accepted four-phase innovation process (see Fichman et al., 2014; Garud et al., 2013; Kohli and Melville, 2019) does not fully cover all important elements of DDI. In this study, we have observed a phase of exploration emerging between the phases of discovery (of ideas) and development. Our cases indicate that the main characteristic of this phase is the refinement of the innovation idea and/or concept by means of experimenting with data resources and learning about essential social relationships. While it may be argued that this is a form of critical evaluation of the proposed idea (traditionally taking place in the discovery phase (Bergvall-Kåreborn and Ståhlbröst, 2009)), these activities appeared as explorative activities made possible by the allocation of resources to pursue and mature specific ideas, which traditionally marks the transition from discovery to development.
In addition to this five-phase process, we propose a shift in perspective pertaining to digital innovation research, especially in networked structures. Our analysis shows that data-driven insights can be used as a probe supporting different actors to find new meanings around specific phenomena of interest. This suggests that such data narratives enable digital innovation through their potential in establishing shared cognition and joint sense-making among innovation actors or participants (Nambisan et al., 2017). However, this enabling role of data comes with a challenge, since, at least in a networked context, the coordinating team is expected to share the data-driven insights in relevant forms and functions, differently among different social relationships. Thus, while the data narrative may be the same, it is often used as leverage in negotiating social relations and needs different cognitive translations to create value for the network participants who contributed data and knowledge throughout the process. We suggest that data and analytics should be regarded as a distinct analytical unit in digital innovation research when considering social and cognitive translations and sense-making (Lyytinen et al., 2016; Nambisan et al., 2017).
In this study, we move from merely focusing on exploring what data and analytics enable in processes of innovation to include the perspective of exploring how the innovation outcomes that are driven by data and analytics emerge over time. Currently, the literature lacks an empirical investigation of such processes that explores data analytics from the early to the later stages of innovation (Engel and Ebel, 2019). The five-phase process as suggested in this paper extends the process of digital innovation and offers a first step toward establishing a DDI process grounded in the empirical analysis of DDI processes.

Modes of data-driven innovation processes
Two distinct modes of DDI processes were observed in this study. The first mode is referred to as the asynchronous mode, i.e. data before development. In the asynchronous mode, the data are acquired before the development of the innovation starts. Here, relevant innovation activities had started prior to the teams presenting their ideas to the open call from OrganiCity. This means that in this mode, the gestation period (Van de Ven et al., 2008) was already ongoing and the teams had a preunderstanding of their data and the potential value it could offer, leading up to their new idea. In fact, the data were an integral part of the discovery phase, as suggested by Trabucchi et al. (2018), focusing on identifying opportunities for innovations in existing data sources by, e.g. understanding tranquility or measuring air quality. Using a preexisting data source facilitates a deep understanding of the data, which reveals opportunities for innovation (Kuehl et al., 2016); this was also apparent in our study, where the data fostered digital innovation.
The second mode is the synchronous mode of DDI, i.e. data after (and during) development. In this mode, the teams design their innovations in a manner in which data are simultaneously generated from the use of the innovation and exploited through data analytics to deliver insights and value. In the literature, this type of innovation can be classified as the generative-explorative or fully integrated DDI (Engel and Ebel, 2019). While there is acknowledgment that this kind of DDI requires new and extended capabilities and competencies (Engel and Ebel, 2019), our analysis suggests that this mode also introduces major complexities into the innovation process. In the following, we outline two main consequences on the innovation process associated with the different modes:
(1) Network participants: In many contemporary digital innovation processes, the innovation has moved to the periphery of organizations, as in our cases; hence, finding means to facilitate the engagement of heterogeneous knowledge sources becomes important (Yoo et al., 2012). The need to integrate heterogeneous knowledge sources has also been intensified by the convergence of digital resources into new contexts. Interestingly, in the asynchronous mode, the engagement of heterogeneous knowledge sources did not only include end users and organizational beneficiaries but was expanded to include the scientific community. The knowledge sources were consulted through desktop research or through experimental collaboration in the early stages of the process, with the aim of remaining at the forefront of their fields. In line with the literature, these network participants were only temporarily integrated in the process to support specific tasks (Lyytinen et al., 2016; Yoo et al., 2012).
In the synchronous mode, interaction and relationship building between the different participants in the network was challenged, since the team lacked data and insights that could act as boundary objects around which discussion could be held. In the asynchronous mode, by contrast, the availability of data and insights made it possible to communicate the value of the innovation prior to its actual use, since it could be contextualized and visualized and thus critically assessed at an early stage, as suggested by Foulonneau and Turki (2015).
(2) Value of innovation: As for the digital innovations themselves, the digital services built on acquired data also, in these cases, offered users suggestions for actions based on the data. In this way, the service not only visualized the data but also offered users an enhanced experience by providing guidance. The digital innovations developed in the asynchronous mode are, three years after the funding ended, still alive to different extents and sometimes in a different form. It is important to note that we do not claim that this mode is more successful than the synchronous mode, but it may have profound implications for the value proposition of the innovation and its diffusion among users. Another challenge with synchronous DDI suggested by our analysis of development and diffusion activities is that the knowledge generated from data and the value from insights are fully realized only if and when the innovation is diffused and adopted at scale. While this may be more feasible in specific domains, i.e. when there is a clear and outspoken need for a specific innovation or when there is a customer willing to pay (Engel and Ebel, 2019), our analysis suggests that it is more complex in other domains and in a networked context.

Conclusion
We began this paper with a review of the digital innovation literature, focusing on the process and highlighting the value of data analytics to the innovation process. One of our motivations was the need for research to explicate the DDI process, which increasingly takes place in a federated network context. We then introduced our multiple case study research design. Four cases were selected for the event- and process-based analysis we conducted, using data triangulation in each of the four cases (see Table 3). With regard to the first research question, "How do innovation processes take form in federated networks?", we found that the widely accepted four-phase digital innovation process, which has remained a frequently revisited model in the literature, needs to account for an important element of DDI. This element is exploration; a corresponding phase is suggested between the discovery and development phases, yielding an extended, five-phase process of DDI. Evidence from the cases pinpoints the importance of this phase in refining the innovation idea via experimentation with data resources and learning about essential social relationships within the innovation network. In light of our findings, we call for more research on the process.
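As an illustration only, the extended five-phase process and the idea of mapping coded events to phases can be sketched in a few lines of Python. The names `Phase`, `phase_sequence` and `count_returns`, as well as the example events, are hypothetical and are not taken from the study's data or coding scheme; the sketch merely assumes that each coded event has already been assigned to exactly one phase.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Extended DDI process: exploration sits between discovery and development."""
    DISCOVERY = 1
    EXPLORATION = 2
    DEVELOPMENT = 3
    DIFFUSION = 4
    POST_DIFFUSION = 5


def phase_sequence(events):
    """Collapse chronological (label, phase) events into the phases visited, in order."""
    visited = []
    for _label, phase in events:
        # Only record a phase when it differs from the previous event's phase.
        if not visited or visited[-1] != phase:
            visited.append(phase)
    return visited


def count_returns(sequence):
    """Count how often the process moves back to an earlier phase."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if cur < prev)


# Hypothetical events for one process instance (labels are invented).
events = [
    ("initial idea", Phase.DISCOVERY),
    ("experiment with a data source", Phase.EXPLORATION),
    ("meet potential partner", Phase.EXPLORATION),
    ("first prototype", Phase.DEVELOPMENT),
    ("revise concept after data test", Phase.EXPLORATION),
    ("second prototype", Phase.DEVELOPMENT),
]

sequence = phase_sequence(events)
returns = count_returns(sequence)
```

Representing phases as ordered values makes it easy to see where a process instance loops back, for example from development to exploration, which a strictly linear reading of the phases would hide.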
With regard to the second research question, "How does data analytics influence the innovation processes within federated networks?", our work extends prior studies on DDI (Engel and Ebel, 2019; George and Lin, 2017; Rizk et al., 2018) by providing a process perspective, and extends studies on innovation processes (Fichman et al., 2014; Garud et al., 2013) by providing DDI-specific insights from federated network constellations. More specifically, we discussed two modes of DDI processes: synchronous and asynchronous DDI. They are characterized by when the key data resource is acquired in relation to the design and development of the innovation outcome. The implications of these modes for participation in the innovation network and for the value of the innovation were discussed.
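The distinction between the two modes can be made concrete with a minimal sketch, assuming that the moment the key data resource was acquired and the moment development started can each be reduced to a single date. This is a simplification for illustration; `DDIMode`, `ProcessInstance` and `classify_mode` are hypothetical names, and the dates below are invented.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class DDIMode(Enum):
    ASYNCHRONOUS = "asynchronous"  # key data acquired before development
    SYNCHRONOUS = "synchronous"    # key data acquired during or after development


@dataclass(frozen=True)
class ProcessInstance:
    name: str
    data_acquired: date        # when the key data resource became available
    development_started: date  # when design and development of the outcome began


def classify_mode(instance: ProcessInstance) -> DDIMode:
    """Classify a process instance by when data arrive relative to development."""
    if instance.data_acquired < instance.development_started:
        return DDIMode.ASYNCHRONOUS
    return DDIMode.SYNCHRONOUS


# Hypothetical instances; names and dates are invented for illustration.
case_a = ProcessInstance("case A", date(2017, 1, 10), date(2017, 3, 1))
case_b = ProcessInstance("case B", date(2017, 6, 15), date(2017, 3, 1))

mode_a = classify_mode(case_a)
mode_b = classify_mode(case_b)
```

In practice, data acquisition and development are rarely single points in time, so a real classification would rest on the coded events rather than on two dates.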
This study is also subject to a few limitations, in spite of the careful application of the case study research method. Since the coding of events relies on the researchers' interpretation, owing to the nature of qualitative research, further cases and other research designs are needed to test our findings. We mitigated this limitation by triangulating data sources and confirming the analysis of events with our teams. We also conducted the workshop with experts to validate the event-to-phase mapping. In addition, the size of the teams was small, which limited the number of interviews conducted. This was also mitigated by relying on other data sources that offered close to real-time documentation of the events. Finally, the activities conducted in the four process instances were bound by a contract with OrganiCity; thus, our proposed five-phase process and modes of DDI processes need further investigation within startup companies and other settings with different funding structures.
In this concluding section, we examine some of the practical implications of our ideas. These practical implications are relevant for entrepreneurs, innovation managers and universities that assume a coordinating role within an innovation network dealing with various stakeholders, and who wish to pursue DDI initiatives. Our research suggests that the role of end users is prominent in the exploration, development and diffusion of DDI. Their early involvement in the process brings value through knowledge sharing (West and Bogers, 2017) and feedback on early versions of the innovation (Bergvall-Kåreborn and Ståhlbröst, 2009). However, it also brings challenges with regard to their expectations and protecting their privacy. We suggest four practical recommendations to enable easier diffusion of DDI:
(1) Utilizing the innovation as a data collection tool requires a strong value proposition and realization. Scan existing data sources that may provide similar insights to reduce the process complexity.
(2) Recommend actions for users based on the generated insights in order to encourage engagement.
(3) Utilize services and platforms that the user already uses for deployment in order to lower the barriers to uptake among the user base.
(4) Anonymization is not necessarily sufficient to address privacy issues, because such issues persist even when data are not collected directly about people (e.g. when data are tied to a physical space in which people are present). Integrative privacy protection along the process is needed.
Regarding organizational beneficiaries:

(1) Involve them as early as the discovery and exploration phases to ensure their buy-in and the relevance of the innovation, as well as its potential sustainability.
(2) Understand their working practices and co-interpret insights with them to reach recommendations for action that suit their working practices.
(3) Treat their adoption distinctively from end users' diffusion throughout the innovation process, while reflecting on how they influence one another.