How shopping habits change with artificial intelligence: smart speakers' usage intention

Purpose – The research aims to understand how smart speakers are perceived by their actual and potential users, their attitude towards smart speakers and consequently their intention to use them.
Design/methodology/approach – The authors apply a structural equation modelling (SEM) approach to test the research hypotheses through data coming from a structured questionnaire.
Findings – The results show that the higher the importance attributed to usefulness and ease of use, the higher the positive attitude that in turn positively affects the intention to use smart speakers. A significant relationship also emerged between task technology fit and attitude towards smart speakers, as well as between perceived enjoyment and attitude towards smart speakers. Perceived privacy risk, innovativeness and social attraction have been found to not significantly impact attitudes towards smart speakers.
Originality/value – Although several academic studies have focused on various aspects of smart technologies, only a few studies discuss the factors that push consumers to use smart speakers for activities related to commercial transactions. Therefore, looking at the rapid rise of smart speakers for daily tasks and the gradual acceptance of voice interaction with digital tools, the authors proposed a study about Italian users' intention to use smart speakers. Specifically, to fill the gap in the existing literature, the authors applied a SEM approach to identify utilitarian and hedonic benefits that motivate the use of these devices.


Introduction
The digital revolution and big data have paved the way for new tools and applications to be used in management practice and marketing. The digitalization process has created new media, channels and touchpoints to articulate the new digital customer journey (Schweidel et al., 2022). Specifically, modern society is faced with advances in the field of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) that require scholars and managers to reflect on and reshape the relationship between man and technological applications. Literature has recently investigated the implementation of these innovations in numerous fields of human activity, with a particular focus on marketing activities that concern one of their most successful applications for the retail sector: chatbots that, thanks to Natural Language Processing (NLP), allow users to interact with voice assistants through their voices.
The software that allows the operation of a voice assistant can be found both in smart speakers and in the operating systems of modern smartphones. The software, through the microphones of the specific device, is constantly waiting for a keyword to be activated (Kasmi and Esteves, 2015;Hoy, 2018). Once the keyword is spoken and a command is given, the smart speaker records the information that is sent to a specialized server, to be processed and interpreted. Depending on the type of command, the server provides in a few moments, thanks to the Internet connection, the device with the information that meets the user's request and can then execute it (Kasmi and Esteves, 2015;Hoy, 2018).
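The activation flow described above can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions: the wake word `assistant`, the `fake_server` lookup table and the `run_assistant` helper are hypothetical names introduced for illustration, and real devices process audio rather than text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

WAKE_WORD = "assistant"  # hypothetical keyword, not a real product's wake word


@dataclass
class Response:
    command: str  # the text that followed the wake word
    answer: str   # what the (simulated) server returned


def fake_server(command: str) -> str:
    """Stand-in for the remote server that interprets a command (assumption)."""
    known = {
        "what time is it": "It is 9 o'clock.",
        "play music": "Playing music.",
    }
    return known.get(command, "Sorry, I did not understand.")


def run_assistant(
    utterances: Iterable[str],
    server: Callable[[str], str] = fake_server,
) -> List[Response]:
    """Ignore speech until the wake word is heard, then forward the command."""
    responses = []
    for utterance in utterances:
        text = utterance.lower().strip()
        if not text.startswith(WAKE_WORD):
            continue  # the device keeps listening; nothing is sent to the server
        command = text[len(WAKE_WORD):].strip(" ,")
        if command:
            responses.append(Response(command, server(command)))
    return responses
```

In this sketch, only utterances that begin with the keyword are forwarded to the server, which mirrors the idea that the device stays idle until it is explicitly activated.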
Nowadays, the number of devices, smart speakers and other services that support voice commands is increasing dramatically, as is the number of users who use them. From 2016 to 2021, the number of households with at least one smart speaker in the United States rose from 8% to 50%, and by 2025, this penetration rate is expected to reach 75% (Statista, 2022a).
By 2025, revenues generated by the smart speaker market, which have been growing steadily since 2016, are expected to amount to $35.5 billion (Statista, 2022b). A market of this size attracts enormous capital and involves equally important investments and a great stream of research into different and better uses of this technology and its applications, which increasingly involve users now immersed in a new digital ecosystem, especially after the Covid-19 pandemic that forced billions of people to stay at home for months (Statista, 2021).
These are significant results if we think that until a few years ago this market did not exist. It was only in 2010 that the voice assistant appeared, thanks to Apple first launching the voice assistant Siri (Rzepka et al., 2022). In 2018, the Cupertino-based company, recognizing the growth potential, decided to enter the smart speaker market by integrating Siri into Apple devices. In 2013, Microsoft released Cortana, which is currently only present in the desktop version, where it operates through the search engine Bing. A year later, in 2014, it was Amazon that physically entered the homes of Americans with Alexa and the Echo family of devices, thus creating a new market. By installing the Alexa application, people can set and control all available features such as the playback of songs, activation of skills, adding notes and reminders, and connecting many smart objects for home automation such as bulbs, thermostats or motion sensors. Finally, in 2016, Google also released its own first voice assistant, Google Assistant, available for smartphones and integrated into Google Home devices. Beyond these four most widespread and advanced smart speakers and voice assistants, others have been added: noteworthy are Alice by Yandex and Bixby by Samsung, as well as the Chinese giants Baidu and Alibaba, which have launched their devices Xiaodu and AliGenie, respectively.
The enormous success in the world found by smart speakers and, in general, in the interactions through the voice, has led these tools to become everyday use, especially within the home. Recent research by Statista (2020) explains how smart speakers are used in everyday life: almost 40% of American smart speaker owners used them every day to listen to music, almost 30% to ask questions and 33.9% to listen to weather forecasts. Interestingly, most of the activities performed by a voice assistant are about routine operations (setting the alarm or timer, checking the weather, listening to the news and preparing a recipe). This constant use makes the smart speaker an integral part of the daily lives of the users and consecrates them as tools to support and improve life in the relaxation of their home. The same study emphasizes a slow but constant change in the relationship with Internet research: 83.1% of Americans have tried at least once to ask an assistant, who has searched for the answer through the web and 29.4% of users do it daily (Statista, 2020). Interestingly, this trend is destined to grow, not only for what concerns the answer to more or less complex questions but also for looking for information about specific products and then for making purchases. Specifically, 27.9% of Americans say they search for information about specific products monthly and 7.3% daily (Statista, 2020). These statistics are relevant if we consider that this activity could be configured as an activity immediately before the purchase through the smart speakers. Furthermore, 14.3% of Americans have tried monthly to buy something through their assistant and 25.2% say they have tried at least once in their life (Statista, 2020). As a result, including the massive business opportunities, two of the companies contending for leadership in the American and European markets, Amazon and Google, have integrated the ability to make purchases through voice. 
This has opened the way to a new way of doing business, a phenomenon that can be considered an evolution of e-commerce, the so-called Voice Commerce (Mari et al., 2020; Zaharia and Würfel, 2020).
One of the most discussed issues within the scientific community is the adoption of new technologies by consumers. Recently, several academic studies have focused on various aspects of smart technologies, including, for example, the degree of consumer acceptance, involvement, and perceived pleasure in using it, the privacy risk related to the disclosure of sensitive user data, the usability of the voice to require the performance of specific tasks and the impact they have on web search habits (Kowalczuk, 2018; Moriuchi, 2019; McLean and Osei-Frimpong, 2019; Ling et al., 2021; Zwakman et al., 2021). However, only a few studies discuss the factors that push consumers to adopt and use smart speakers for activities related to commercial transactions, such as the recent study proposed by Zaharia and Würfel (2020) in which the authors analysed the acceptance of smart speakers by German consumers, finding that they generally perceive high risk in using new technologies.
Therefore, looking at the rapid rise of smart speakers for daily tasks, the increased use of voice assistants and the gradual acceptance of voice interaction with technological tools, we decided to propose a study that would focus on Italian users to better understand their intention to use smart speakers.
For all these reasons, this work aims to study how individuals interact with smart speakers to define utilitarian and hedonic benefits that motivate the use of these devices and increase the number of individuals adopting such technology. Starting from the Technology Acceptance Model (TAM) by Davis et al. (1989), a SEM model is used to identify the variables affecting the intention to use smart speakers. From a theoretical perspective, our work confirms the validity of applying the TAM to measure the factors underlying the adoption of a new and specific technology, the smart speaker, with the same success as for other technologies. From a managerial perspective, our work attempts to shed light on the revolution taking place so that the individual intention to use smart speakers can be fully satisfied; these devices have the potential to substantially alter all phases of the customer journey, from searching for information on a product to repurchasing, even automated (Mari et al., 2020).
The interaction that millions of users are establishing all over the world with voice assistants opens up multiple interesting scenarios in economic and marketing terms for companies able to reap the benefits of the changes taking place.
The paper is organized as follows. First, a literature review of smart speakers and voice assistants' functionality as well as motivations underlying the use of such technology is presented. Secondly, we present our hypotheses that lead to our conceptual framework of smart speaker adoption and the methodology used. Then, we present and discuss our findings that offer theoretical and managerial implications concerning the role played by smart speakers in the retail sector. Finally, the last section is devoted to limitations, and future directions aimed at underlining how voice assistants are slowly changing the human-machine relationship.

Theoretical background and research objectives
The recent development of ML has allowed scholars to take huge steps in Natural Language Processing (NLP), a field of study that combines AI, computer science and linguistics (Hirschberg and Manning, 2015;Kaplan and Haenlein, 2019;Zhao et al., 2021). NLP uses computational techniques for learning, understanding and producing linguistic content to process human language for numerous tasks and applications (Zhao et al., 2021). The origins of this technology date back to 1964, when Joseph Weizenbaum, a German computer scientist at MIT, created ELIZA, the first program capable of reproducing a human conversation (Kaplan and Haenlein, 2019). However, it is only in the last twenty years that computational linguistics has become an area of interesting scientific research as well as a practical technology incorporated in several commercial products (Kasmi and Esteves, 2015;Hoy, 2018).
According to Hirschberg and Manning (2015), numerous developments in NLP are due to four key factors: a vast increase in the computing power of computer devices, a huge amount and availability of linguistic data, the development of ML methods and a greater understanding of the structure of human language.
The recent shift toward messaging systems as the primary channel for communication (both personal and professional) has contributed to the rapid spread of this innovative technology. One of the most common and successful applications of NLP is the chatbot, a tool that simulates human conversation, allowing individuals to interact with digital devices (De Cicco et al., 2020). According to Shawar and Atwell (2007), chatbots are defined as computer applications with which humans are able to maintain a conversation as natural as a conversation with another person.
Since voice interaction with users is what characterizes these innovative and smart tools and has allowed their spread across the world in a very short time, the literature identifies two main types of chatbots: declarative and conversational (Shawar and Atwell, 2007; Bors et al., 2020; Anagnoste et al., 2021). The first type is limited-domain, that is, only able to interact through automated, specific and structured responses, mainly for support and service functions. Such chatbots can be found on companies' websites, where they answer common questions (as an alternative to FAQs or customer service) and act as the first point of contact between customer and company before a human operator is asked to solve problems or specific requests. This kind of tool, recognized in the literature as the traditional chatbot, acts as a digital touchpoint to improve the customer experience (De Cicco et al., 2020).
Conversely, conversational chatbots, typically referred to as virtual or digital assistants, are open-domain chatbots and therefore much more interactive and sophisticated. These elaborate tools use predictive intelligence and data analysis to interact with users, provide recommendations and learn about tastes or preferences (Bors et al., 2020). The software, through the microphones of the specific device, waits for a keyword to be activated (Kasmi and Esteves, 2015;Hoy, 2018). Once the keyword is spoken and a command is given, the smart speaker records the information that is sent to a specialized server, to be processed and interpreted. Depending on the type of command, the server provides in a few moments, thanks to the Internet connection, the device with the information that meets the user's request and can then execute it (Kasmi and Esteves, 2015;Hoy, 2018). It is precisely in this ability that a radical difference between smart speakers and other conversational agents such as chatbots is manifested: while the latter is limited in most cases to answer the questions that are asked to them through a question-and-answer mechanism, the former allows much more dynamic conversations and they are also able to grasp nuances of meaning, such as irony and sarcasm, and to respond with a completely similar tone (Mari, 2019). This is compatible with the entertainment purpose of the smart speakers, with the desire to arouse a positive emotion in the user and contribute to a process of anthropomorphization and humanization of such systems.
The significant improvements recorded within AI in solving problems or highly specific tasks, such as voice recognition, and the interactive features of smart speakers, which allow two-way communication with users (unlike the main means of communication used in marketing, which are by their nature unidirectional), are the main aspects that have contributed most to the large-scale deployment of these systems (Chou, 2018; Mari, 2019). Specifically, the spread of chatbots, smart speakers and voice assistants has significantly increased the overall number of conversations between user and machine that, according to Chou (2018), will exceed the total number of text conversations as early as 2023.
Scholars identify three characteristic functions of smart speakers and voice assistants (VA) (Mari, 2019): first, the use of natural language for which they are designed and built to imitate the interactions that normally occur between people; second, the understanding of the context that represents their ability to identify the relevant aspects and factors in which a given conversation took place and more complex factors such as preferences or consumer browsing or purchasing history; third, the ability to learn and self-improve that indicates the ability of the AI systems to understand the existence of any errors of understanding or dissatisfaction by the customer, that, once identified, are recorded in the memory to be considered in further similar situations.
In this sense, smart speakers become the major sources of data collection on consumer activity and for the conclusion of transactions. Specifically, smart speakers become tools for searching for information and evaluating alternatives in the new digital customer journey. According to Taylor (2017), the entire research process will take place mainly through voice assistants. This opens interesting scenarios on Search Engine Optimization (SEO) and the management of results displayed within the search and voice search pages (Nair and Gupta, 2021).
In addition, since the use of smart speakers and voice assistants, in general, meets the time-saving needs of the consumer, several scholars suggest that constantly using a voice assistant can increase traffic and purchase intention in the new digital customer journey (Nair and Gupta, 2021). Smart speakers operate, in fact, in an attempt to understand the needs of users/consumers based on their previous purchasing activity and the information required regarding certain categories of products; the data collected in this way are subsequently processed to identify possible future purchases, to try to predict future purchasing behaviour and suggest products and services compatible with it. In this way, the software processes preferences and identifies implicit behaviours, through which to suggest the most suitable product for the present and future needs of the customer.
According to Davenport et al. (2020), thanks to AI-based technologies, retailers try to better predict what their customers want, moving their attention from a shopping-then-shipping business model towards a shipping-then-shopping business model. In this sense, as stated by Agrawal et al. (2017, 2018), retailers will be able, thanks to AI, to identify customers' preferences and ship items to customers without a formal order, with the option to return what customers do not need. The shipping-then-shopping model, which is well developed in the fashion sector, together with AI-based technologies and algorithms, might improve both the efficacy and efficiency of the mass customization process by analysing information not explicitly provided by the consumer and relying on a broad big data database for a given consumer (Grandinetti, 2020). In addition, this model involves customers at an emotional level and lets them live a gratifying experience (Tao and Xu, 2018), thus achieving high levels of customer satisfaction and loyalty (Davenport et al., 2020; Grandinetti, 2020). This shift would transform retailers' marketing strategies and customer behaviours, especially for information search, and may generate new and original developments in the very near future (Davenport et al., 2020; Grandinetti, 2020).
Precisely because of their ability to act as real prompters and recommenders of purchases, making the interaction much richer and more engaging, smart speakers and voice assistants have allowed a new way of doing business, a phenomenon that can be considered an evolution and a subcategory of e-commerce (Weber and Ludwig, 2020), the so-called Voice Commerce (Mari et al., 2020; Zaharia and Würfel, 2020), a "zero-click purchase" phenomenon in the B2C context that deserves monitoring (Kraus et al., 2019). According to the literature, voice commerce refers to commercial interactions occurring online through voice-activated, conversational user interfaces with mobile or home-based virtual assistants and smart speakers or other speech recognition tools (Kraus et al., 2019; Zaharia and Würfel, 2020; Rzepka et al., 2022).

Smart speakers' usage intention
Smart speakers have therefore integrated the possibility of making purchases through the voice, opening the way to what is to be considered an evolution, but not an alternative, of e-commerce. To date, the studies related to this innovative form of e-commerce are relatively few and are based mainly on the intention to use this tool (Zaharia and Würfel, 2020). The reason seems to be mainly linked to the still limited diffusion of voice commerce among those who have a smart speaker as well as to the inherent limitations of this mode of shopping. According to Mari (2019), an effortless purchase like this does not guarantee an optimal level of satisfaction. In fact, on the one hand, a simplified representation of alternatives, such as that provided by smart speakers, significantly reduces the number of alternatives available to consumers, emphasizing even more the role and weight of the ranking system used (Mari, 2019); on the other hand, voice-based technologies risk limiting the senses of users, who shop without browsing any visual content such as photos or videos (Mari et al., 2020).
One of the most discussed issues within the scientific community is the adoption of new technologies by consumers to identify the antecedents associated with the intention to use a specific technology (Davis et al., 1989). The most used theory behind the adoption of new technology is the Technology Acceptance Model (TAM) by Davis et al. (1989) which was designed to predict the likelihood of a new technology being adopted by a group of individuals or companies. The TAM suggests how the Perceived Usefulness (PU) and the Perceived Ease of Use (PEOU) are the two determining factors for the adoption of any technology. Over the years, some re-elaborations of the original formulation of the TAM model have been proposed. However, although the TAM model has been enriched by new constructs, numerous studies have confirmed its validity in the analysis of the adoption of several technologies, ranging from software packages to online and mobile services, in different sectors, from fashion to finance, medicine and retailing (Venkatesh et al., 2012;Aiolfi and Bellini, 2019).
Regarding the studies on shopping habits, recently some scholars tested the TAM for the adoption of mobile apps for grocery shopping. Specifically, Shukla and Sharma (2018) and Aiolfi and Bellini (2019) used the TAM model to understand the factors that determine respectively the adoption by Indian and Italian consumers of a mobile app to buy grocery goods. Finally, a recent meta-analysis by Luceri et al. (2022) confirms the validity of the TAM in mobile commerce adoption.
Regarding the research on smart speakers' adoption, the literature identified some benefits associated with the adoption of smart speakers during the consumer journey. Specifically, the perception of a greater anthropomorphism of the smart speaker and voice assistants and their systems of speech recognition (McLean and Osei-Frimpong, 2019) as well as the efficiency, convenience and enjoyment proved during the usage of smart speakers are considered among the impactful benefits of smart speakers' adoption (Kraus et al., 2019;Moussawi et al., 2021). Furthermore, according to McLean and Osei-Frimpong (2019), the motivations that push individuals of different generations to adopt and use a smart speaker daily are primarily utilitarian, then symbolic and finally social benefits while hedonistic benefits do not motivate the use of this technology.
Among the well-discussed topics in the literature regarding voice assistants and smart speakers, a vital element to consider is the conflict of interest of some VA developers (Aguirre et al., 2020). For instance, smart speakers simultaneously sell VA services and offer platforms to advertise and promote the purchase of some brands. Therefore, it is uncertain whether voice assistants sell only goods relevant to customers based on their preferences and data acquired by AI-based technologies or whether they sell the items in which their producers have an interest, thus favouring those brands that pay for their promotional services to be proposed first to the users of their VA services (Nilsson et al., 2019; Seymour et al., 2020, 2022). For a deeper understanding of the conflicts of interest inherent in VAs, Aguirre et al. (2020) discussed how voice assistants tend to reduce the perception of conflicts of interest and seem to act in the best interests of users, while giving priority to information and services for the benefit of sellers.
Several scholars discuss customer scepticism and acceptance of conversational commerce in online shopping (Devi and Chanda, 2022) as well as ethical concerns with voice assistants and the privacy risk related to the disclosure of sensitive user data (Seymour et al., 2022; Zhan et al., 2022). Specifically, Seymour et al. (2022) and Zhan et al. (2022) recently conducted systematic reviews of the main ethical concerns with voice assistants and smart speakers. In addition, a recent study by Devi and Chanda (2022) explained that global risks associated with AI-based technologies, hacking, unauthorized access as well as misuse of personal information and security threats were the main reasons for consumer scepticism and rejection of digital assistants and smart speakers, which led to customer dissatisfaction with such technologies.
Other scholars investigated the degree of consumer acceptance, involvement and perceived pleasure in using smart speakers, the usability of the voice to require the performance of specific tasks as well as the impact that smart speakers have on Web search habits (Kowalczuk, 2018;Moriuchi, 2019;McLean and Osei-Frimpong, 2019;Ling et al., 2021;Zwakman et al., 2021).
However, only a few studies investigated the antecedents of smart speakers' adoption in the consumer journey for activities related to commercial transactions. For example, the recent study proposed by Zaharia and Würfel (2020) is of interest: the authors analysed the acceptance of smart speakers by German consumers, finding that they generally perceive high risk in using such new technologies. Specifically, in the model developed by Zaharia and Würfel (2020) the key factors positively influencing the acceptance of smart speakers to obtain information as well as make a purchase were performance expectancy, hedonic motivation, perceived price value and previous experience with smart speakers, while perceived risk showed a negative effect on intended usage. Therefore, starting from these considerations and looking at the rapid rise of smart speakers for daily tasks, the increased use of voice assistants and the gradual acceptance of voice interaction with technological tools, we tried to fill the gap in the existing literature by proposing a study that would focus on Italian users to better understand their intention to use smart speakers. Specifically, the present study aims to understand how this technology is perceived by its actual and potential users. Knowing how individuals interact with smart speakers is essential to identify the utilitarian and hedonic benefits that motivate the use of these devices and the kind of attitude individuals have towards smart speakers, and consequently to investigate their intention to use them, in order to increase the number of individuals adopting such technology.
Since the TAM model is considered a solid theoretical and methodological tool to predict the behaviour of potential and real users and their respective attitudes towards a specific technology (Aiolfi and Bellini, 2019), we used its variables to develop a SEM model to identify the variables affecting the intention to use smart speakers. In addition to the original TAM variables, other dimensions capable of directly influencing the attitude and indirectly the intention to use smart speakers were added. Ultimately, the conceptual framework is based on the eight hypotheses shown in Figure 1.
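As an illustration of the structure just described, the following minimal Python sketch writes the structural part of such a model in lavaan-style regression notation (`outcome ~ predictors`). It is not the authors' estimation code: the construct abbreviations, the ordering of the antecedents and the helper names are assumptions introduced here, and the list of antecedents follows one reading of the abstract (seven antecedents of attitude plus the attitude-to-intention path, which is consistent with a count of eight hypotheses).

```python
# Construct abbreviations are illustrative assumptions, not the authors' labels.
ANTECEDENTS = {
    "PU": "perceived usefulness",
    "PEOU": "perceived ease of use",
    "PPR": "perceived privacy risk",
    "INN": "innovativeness",
    "TTF": "task technology fit",
    "PE": "perceived enjoyment",
    "SA": "social attraction",
}


def structural_model(antecedents, mediator="ATT", outcome="INT"):
    """Return the structural paths as lavaan-style regression lines.

    Every antecedent points to attitude (ATT); attitude points to
    intention to use (INT). Measurement items are deliberately omitted.
    """
    lines = [
        f"{mediator} ~ " + " + ".join(antecedents),
        f"{outcome} ~ {mediator}",
    ]
    return "\n".join(lines)


def count_paths(antecedents):
    """One path per antecedent plus the attitude -> intention path."""
    return len(antecedents) + 1


if __name__ == "__main__":
    print(structural_model(ANTECEDENTS))
    print("hypothesized paths:", count_paths(ANTECEDENTS))
```

The sketch only enumerates the hypothesized paths; estimating them would additionally require the questionnaire items for each construct and an SEM package.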
Perceived usefulness – attitude towards smart speakers
Over time, several studies demonstrate the validity of the TAM model in providing a useful model for researchers to analyse the cause of rejection or acceptance of a technology (Aiolfi and Bellini, 2019). The TAM model suggests how the Perceived Usefulness (PU) and the Perceived Ease of Use (PEOU) are the two determining factors for the adoption of any technology that directly impact the attitude of use of a technology and, therefore, indirectly on the intention to use it. Specifically, Davis et al. (1989, p. 320) defined PU as "the degree to which a person believes that using a particular system would enhance his or her job performance".
As far as smart speakers are concerned, perceived usefulness refers to the extent to which users believe smart speakers will help them perform better based on their purposes. Thus, an NLP tool high in perceived usefulness, in turn, is one for which a user believes in the existence of a positive acceptance-performance relationship. This relationship is strengthened by the level of personalization, the type of personal data used to target the information and the amount of information used. In addition, according to Zaharia and Würfel (2020), the main reasons customers mention for using smart speakers refer to usefulness and convenience, namely to save time and conduct "hands-free" operations. Therefore, we hypothesized the following:
H1. The perceived usefulness of smart speakers positively influences the attitude towards these devices.
Perceived ease of use – attitude towards smart speakers
TAM has been successfully applied in many areas, including mobile commerce and online transactions (e.g. Luceri et al., 2022), and it is a robust model of technology acceptance behaviours in which the causal linkage between perceived ease of use and users' attitude is well studied. According to Davis et al. (1989, p. 320), PEOU refers to "the degree to which a person believes that using a particular system would be free of effort". The construct is considered an indicator of the cognitive effort that individuals need when learning and utilizing a new technology. As far as the adoption of smart speakers, it represents the degree to which those devices are perceived to be easy to understand, learn about and use. According to Venkatesh et al. (2012), when a technology is considered user-friendly, it lowers the effort needed to operate: if an individual does not perceive particular difficulties in using a given technology and believes that it can simplify a certain activity, the attitude towards this innovation will be favourable. Regarding smart speakers, the option to use verbal commands over a traditional keypad appears to be easier and quicker for some consumers (Kessler and Martin, 2017; Zaharia and Würfel, 2020). Therefore, we supposed that the perception of ease in using user-friendly smart speakers influences positively the attitude towards those devices.
H2. The perceived ease of use of smart speakers positively influences the attitude towards these devices.
Perceived privacy risk – attitude towards smart speakers
Perceived privacy risk has been heavily investigated, and in most cases, it negatively affects consumers' perceptions and activities. According to McLean and Osei-Frimpong (2019, p. 30), privacy risks about technology refer to "the perceived threat to an individual's privacy due to the increased level of information that technology gathers on individuals beyond the individual's knowledge and sometimes control". Consistent with George (2004), perceived privacy risk refers to a consumer's general concern about improper acquisition of information, improper use of information, privacy invasion (e.g. direct mailing) and improper storage (e.g. no opting-out) which have a negative impact on shopping attitudes and behaviours. As far as smart speakers are concerned, they can perform sophisticated commands that require an extensive set of permissions and information to undertake their tasks, which individuals overwhelmingly provide (Lei et al., 2018; McLean and Osei-Frimpong, 2019).
Another threat is hacker attacks and smart speaker vulnerabilities that concern three aspects: one-factor authentication; no access control based on physical presence; poor security of applications from third-party developers (Lei et al., 2018).
The characteristics of smart speakers, due to the voice interface, make it complex to read privacy policies if not through another device, such as a smartphone. In addition, supervision and protection by companies and competent organizations are also lacking.
Therefore, due to the initial uncertainty about smart speakers' activities in terms of privacy loss, perceived privacy risk is found to be a negative predictor of technology adoption (Benlian et al., 2020). Specifically, if users perceive the intent of smart speakers' operations as manipulative, concerns about the invasion of one's privacy outweigh the possible benefits in terms of relevance, resulting in a negative attitude towards smart speakers (Ham, 2017; Benlian et al., 2020). Therefore, we posit the following hypothesis:
H3. Perceived privacy risk negatively affects attitudes towards a smart speaker.
Innovativeness – attitude towards smart speakers
Perceived innovativeness is defined by Watchravesringkan et al. (2010, p. 266) as "the degree to which consumers believe that the product possesses important attributes of innovation such as newness and uniqueness".
Several authors discussed the role of perceived innovativeness related to the adoption of new technology and its relevance for the success of the new product/service (e.g. Hwang et al., 2019;Kim et al., 2021). Since perceived innovativeness indicates that the product itself reflects the novelty of the technology, the level of perceived innovativeness has a positive impact on the adoption behaviour for new technologies (Leckie et al., 2017).
The stream of existing studies on this topic indicates that if individuals perceive high levels of innovativeness from a new technology, they demonstrate a favourable attitude and they are more likely to use the new technology over time (Hwang et al., 2019;Kim et al., 2021).
As far as smart speakers are concerned, perceived innovativeness refers to the consumers' perceptions of the newness of the smart device and to how this perceived innovativeness affects their daily life and changes their behaviour.
Therefore, consistent with the literature, since smart speakers are still in the early stages of adoption and many consumers have never used such technology before, we hypothesized that their innovativeness acts positively on attitude towards smart speakers:
H4. The perceived innovativeness of smart speakers positively influences the attitude towards these devices.
Perceived enjoyment – attitude towards smart speakers
Literature on the antecedents of the adoption of technology suggests that hedonistic aspects also play a crucial role in consumers' attitudes and behaviours (Venkatesh et al., 2012). Specifically, among all the hedonic motivations, literature found that perceived enjoyment is an important predictor of attitude (Zaharia and Würfel, 2020). Perceived enjoyment refers to the intrinsic reward derived from the technology use (Verhoef et al., 2007). According to Zaharia and Würfel (2020), as far as smart speakers are concerned, perceived enjoyment reflects the extent to which a consumer perceives the use of smart speakers during the customer journey as fun, entertaining, exciting and pleasant. From this perspective, users can be fascinated by the ability of smart speakers to let them interact, converse with the device or AI voice assistant and entertain themselves through games or fun conversations (Kessler and Martin, 2017; Zaharia and Würfel, 2020). Thus, feeling pleasure, satisfaction and fun using a smart speaker can create a positive emotional relationship towards the smart speaker individuals use and form the basis of a positive and lasting relationship. For these reasons, we hypothesized the following:
H5. The perceived enjoyment towards smart speakers positively influences the attitude towards these devices.
Social attraction – attitude towards smart speakers
Social attraction is defined by McCroskey (1974) as a "social or personal liking property". In this sense, literature refers to social attraction as the extent to which machines make individuals feel as though they are in the presence of another social entity (McLean and Osei-Frimpong, 2019). Specifically, Moon (2000) posited that since humans are socially oriented beings, they apply social roles when interacting with technology. For example, human-like attributes elicit social responses such as pausing for a response and courtesy during interactions in the same way as they would with another human. Regarding smart speakers, language-based conversations between users and AI-powered devices serve as an important human-like attribute that elicits a sense of social presence in the mind of individuals (Sundar et al., 2017; McLean and Osei-Frimpong, 2019).
McLean and Osei-Frimpong (2019) and Lee et al. (2020) suggest that an anthropomorphic perception of smart speakers significantly affects the feeling of a social relationship with them which, in turn, leads to positive attitudes and behaviours, that is, the user's intention to communicate and make friends with those devices. Therefore, as individuals become comfortable in their conversations with an artificial personification, we hypothesized that social attractiveness may motivate individuals to engage with the AI technology of smart speakers in the same way as they would with other human counterparts:
H6. Social attraction perceived by the users of smart speakers has a positive influence on the attitude towards these devices.
Task technology fit – attitude towards smart speakers
Being able to engage in an active dialogue with software able to carry out thousands of actions in response to a specific request is radically changing the man-machine relationship. The ability of a technology to support a task is defined in literature with the construct of Task Technology Fit (TTF). Goodhue and Thompson (1995, p. 216) defined TTF as "the degree to which a technology assists individuals in performing their portfolio of tasks". Specifically, TTF is the correspondence between task requirements, individual abilities and the functionality of the technology to match the capabilities of the technology to the demands of the task (Dishaw and Strong, 1999). Scholars investigated how performance impacts will result from TTF and how TTF should be a determinant of whether technologies are believed to be more useful, more relevant or give more relative advantage to the users (Goodhue and Thompson, 1995). According to Goodhue and Thompson (1995), the higher the discrepancy between task and technology, the lower the perceived TTF and the lower the value users get from using technology. Conversely, a higher level of TTF positively affects users' adoption and utilization of the technology. The same findings are discussed by Rzepka et al. (2022), who illustrate how, in the service context, TTF has already been investigated concerning the whole customer journey and specific tasks such as information search.
As far as chatbots are concerned, a recent study by Chen et al. (2021) investigates the matching of a chatbot's interaction style with goal-directed and experiential tasks. In addition, Go and Sundar (2019) illustrated how interactive experiences through chatbots and voice assistants are considered better than static information transmission and a high level of interactivity can often compensate for the impersonal nature of a chatbot.
Regarding smart speakers, the TTF is the perception of the adequacy or inadequacy of a smart speaker to perform the actions for which it is programmed, correctly, without margins of error. For a tool that is activated through the voice, this reflection does not concern only the execution of the command itself but rather the immediately preceding step, namely the understanding of that command so as to perform exactly the required task. Precisely because the use of smart speakers is more about carrying out everyday actions, the spoken sentence inputs can also differ, be misunderstood and generate frustration for users who interface with them daily. In addition, at the core of the reflection is undoubtedly the topic of customization: smart speakers, thanks to AI and ML, continuously collect information and feedback to improve and optimize their operation and respond better to subsequent requests of the same user. All these considerations and the theoretical background lead to the following hypothesis:
H7. Higher levels of Task Technology Fit positively affect users' attitudes towards smart speakers.

Attitude towards smart speakers – intention to use smart speakers
The concept of attitude towards new technology and intention to use it are two of the most studied constructs in consumer behaviour as important factors in predicting individuals' decisions or behaviours (Ajzen, 1991;Fishbein and Ajzen, 1977;Hwang et al., 2019). Attitude is defined by Ajzen (1991, p. 188) as "the degree to which a person has a favourable or unfavourable evaluation or appraisal of the behaviour". Regarding the adoption of new technology, attitude is considered an individual's psychological disposition to react in a manner that is favourable or unfavourable to the particular behaviour of using new technology (Hew et al., 2016;Hwang et al., 2019).
According to several theoretical and empirical studies (Ajzen, 1991;Davis et al., 1989;Fishbein and Ajzen, 1977), attitude plays a critical role in stimulating the intention to use a new technology: if consumers have a favourable attitude towards a specific new technology, they are more likely to use the new technology in the future (Hew et al., 2016;Hwang et al., 2019).
According to Oliver et al. (1997, p. 28), behavioural intention is defined as "a stated likelihood to engage in a behaviour". Regarding the adoption of new technology, intention to use/adopt a technology refers to the first time individuals adopt a technology, that is, a motivational component of behaviour that measures the strength of individuals' conscious effort exerted to perform the particular behaviour of adopting the new technology (Fishbein and Ajzen, 1977; Davis et al., 1989).
Based on the theoretical and empirical backgrounds, we posit the following hypothesis:
H8. Attitude towards smart speakers positively affects intentions to use those devices.

Methodology
To test all the hypotheses of the proposed model, a self-administered questionnaire was filled in online by a sample of 159 subjects randomly recruited via online channels and social networking sites. Consumers who surf the Web might represent the most suitable target for the research objectives (Aiolfi et al., 2021). Indeed, those who are used to surfing online are most likely more familiar with voice assistants and more inclined to consider using innovative smart speakers. Consistent with prior research, we conducted the survey through a questionnaire, a methodological tool widely used in recent research on new technology adoption (Venkatesh et al., 2012; Shukla and Sharma, 2018; Aiolfi and Bellini, 2019). In addition, surveys are one of the most commonly used research methods across all fields of research (Lazar and Feng, 2010; Van Biljon, 2014) to provide a quantitative description of trends, attitudes and opinions as well as to find cause and effect or the relationships between variables, making the generalization of quantitative findings possible in terms of reliability and validity.
In order to prevent response bias, and consistent with research studies dealing with human responses, we followed a specific protocol and guidelines to design the survey (among others, Evan and Miller, 2006 and Kountur, 2011). Specifically, we measured all the variables considered in the online survey with multiple-item scales that come from previous research about shoppers and marketing theories. All the items, once translated into Italian, were adapted for the context of smart speakers and measured using a 7-point anchored Likert scale from 1 (disagree) to 7 (agree) (see Table A1 in Appendix). According to Hartman et al. (2002), the use of a Likert scale with both positive and negative questions is useful to neutralize the random answers given by respondents who might not read the questions carefully.
Specifically, the four-item scales of Perceived Usefulness and Perceived Ease of Use, already used in the TAM, were derived from Fröhlke and Pettersson (2015) and subsequently adapted to the digital context by Aiolfi and Bellini (2019). Perceived Privacy Risk was measured through four items adapted from McLean and Osei-Frimpong (2019), the level of Innovativeness through four items adapted from Sundar and Noseworthy (2016) and Perceived Enjoyment through a three-item scale adapted from Kowalczuk (2018); the three-item scale of Social Attraction was drawn from McLean and Osei-Frimpong (2019), while the four-item scale of Task Technology Fit was adapted from the recent work of Ling et al. (2021). Finally, Attitude towards smart speakers was measured by three items adapted from Fishbein and Ajzen (1977), while Intention to use smart speakers was measured by three items adapted from Al-rahmi and Othman (2013). Both of these scales were subsequently adapted to the digital context by Aiolfi and Bellini (2019).
In addition to the scales discussed above, demographic questions and smart speakers' usage habits were included (e.g. individual awareness; motivations and frequency of use). When designing this section of the questionnaire as well as the scales, we were careful to build simple, clear and relatively short questions: we avoided double-barrelled questions as well as double negatives; we used both mutually exclusive and exhaustive response categories for closed-ended questions. Furthermore, we reversed the wording in some of the questions to help prevent response sets, we considered unbiased language, free from gender and ethnic bias (Kountur, 2011). Furthermore, to neutralize the response bias we tested the survey on a first sample to understand the response times, if there were errors in understanding the questions or if there were errors in the construction of the questionnaire. Once it was all established, the questionnaire was administered to the online random sample.
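Before scales that mix positively and negatively worded items can be analysed, the reverse-worded items are usually recoded so that all items point in the same direction. The sketch below illustrates this common step for a 7-point scale; it is an assumption about standard practice rather than a description of the authors' procedure, and the item names and responses are hypothetical.

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 7) -> int:
    """Recode a response to a reverse-worded Likert item (7 becomes 1, 6 becomes 2, ...)."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside {scale_min}..{scale_max}")
    return scale_min + scale_max - response


# Hypothetical answers of one respondent; the "_reversed" suffix marks
# an item whose wording was reversed in the questionnaire.
raw = {"item_1": 6, "item_2_reversed": 2, "item_3": 7}

recoded = {
    name: reverse_score(value) if name.endswith("_reversed") else value
    for name, value in raw.items()
}
```

After recoding, a low raw answer on a reversed item counts as high agreement with the construct, which is what makes inattentive, uniform answering visible in the data.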
As for the analytic procedures, data underwent two phases of analysis. First, a confirmatory factor analysis (CFA) with the latent variables considered was performed to obtain evidence of reliability (Santos, 1999), convergent validity (Anderson and Gerbing, 1988) and discriminant validity (Fornell and Larcker, 1981) for each of the scales and individual items used in the questionnaire.
Second, the paths of the hypothesized relationships were explored. Structural equation modelling (SEM) with the maximum likelihood method was employed for the CFA and the analysis of the conceptual model. Data analysis was performed using the IBM SPSS 25.0 statistical software and the software LISREL 8.80.

Respondent profile
The sample consisted of 159 subjects, 72% women and 28% men, with a mean age of 37.8 (min = 18; max = 75).
The difference in terms of gender reflects women's online presence in Italy in 2021, which according to recent research (Lazzati, 2022) has grown by 29.1%, with the research carried out online by women equal to 41.7%, compared to 58.3% for men. This phenomenon is not only Italian: in the United Kingdom, digital women accounted for 42.1% in 2021, in France 44.6%, in Germany 45.8%, and in Spain they reached the majority with 51.2% (Lazzati, 2022).
Almost all respondents (95.7%) stated that they know about smart speakers and 44.2% own at least one. Among them, almost 80% own an Amazon device of the Echo Dot series. The majority of the respondents (63.8%) use their smart speakers at least once a day and almost all smart speaker users adopt those devices to listen to music (95.7%). For a complete understanding of the socio-demographic profiles and habits see respectively Tables A2 and A3 in Appendix.

Analysis of the measurement model
As the skew and kurtosis statistics showed that the normality assumption was violated, the model was estimated using the method by Satorra and Bentler (1994). The fit statistics indicated that the measurement models fit the data well (χ² = 735.850, df = 428, p = 0.000, CFI = 0.979, RMSEA = 0.065, NNFI = 0.976, SRMR = 0.058). All items substantially and significantly loaded onto the expected latent construct (Anderson and Gerbing, 1988) and all constructs also showed satisfactory levels of Composite Reliability (CR) and Average Variance Extracted (AVE), exceeding the recommended cut-off points for adequacy of 0.70 and 0.50 respectively (Fornell and Larcker, 1981). Discriminant validity was also assessed based on Fornell and Larcker's (1981) criterion. We tested reliability using Cronbach's alpha and all values are higher than the minimum acceptable value of 0.70 (Santos, 1999). Table A1 in Appendix reports the reliability and validity indexes for each construct.
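The reliability and validity indexes mentioned above can be computed from standardized loadings and raw item scores with the usual textbook formulas. The following is a minimal sketch under stated assumptions: the loadings, the responses and the function names are hypothetical and are not taken from Table A1, and the code is not the SPSS or LISREL procedure the analysis actually relied on.

```python
from statistics import variance


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1 - lam ** 2 for lam in loadings)  # standardized loadings assumed
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(lam ** 2 for lam in loadings) / len(loadings)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows are respondents, columns are the items of one scale."""
    k = len(rows[0])
    item_variances = sum(variance(column) for column in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


def fornell_larcker_holds(ave_a, ave_b, correlation):
    """Discriminant validity: each AVE exceeds the squared inter-construct correlation."""
    return min(ave_a, ave_b) > correlation ** 2


# Hypothetical standardized loadings for a four-item construct.
loadings = [0.80, 0.85, 0.75, 0.90]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical 7-point responses of five respondents to a three-item scale.
responses = [[6, 7, 6], [4, 5, 4], [2, 2, 3], [7, 6, 7], [5, 5, 4]]
alpha = cronbach_alpha(responses)

print(f"CR = {cr:.3f} (cut-off 0.70), AVE = {ave:.3f} (cut-off 0.50), alpha = {alpha:.3f}")
```

With these formulas a construct passes the cut-offs cited in the text when CR exceeds 0.70 and AVE exceeds 0.50, and two constructs are discriminant when both AVE values are larger than their squared correlation.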
Tests of the structural model
The fit indices indicated an acceptable overall fit of the structural model to the data:

The results of the path analysis are shown in Figure 2 with all the path coefficients (intensity and direction of relations) and the significance (t-value) for each of them while Table A4 in Appendix summarizes the SEM's results.
Results in Figure 2 show that the variables of the original TAM significantly affect the attitude towards smart speakers. Therefore, the model supports H1 and H2, for which the higher the importance attributed to usefulness and ease of use, the higher the positive attitude (γ_usefulness = 0.337, p < 0.001; γ_easeofuse = 0.171, p < 0.01) that in turn positively affects the intention to use the smart speakers, proving H8 (β_attitude = 0.649, p < 0.001).
Thus, as suggested by the original TAM, perceived usefulness and perceived ease of use are the two determining factors for the adoption of a new technology in general, and of smart speakers in particular: they directly impact the attitude towards the use of a technology and, therefore, indirectly affect the intention to use it. A positive acceptance-performance relationship exists and it is strengthened by the level of personalization, the type of personal data used to target the information, the amount of information used and the level of convenience, namely the opportunity to save time and conduct "hands-free" operations (Zaharia and Würfel, 2020).
A significant relationship also emerged between perceived enjoyment and attitude towards smart speakers (γ_enjoyment = 0.419, p < 0.001) as well as between task technology fit and attitude towards smart speakers (γ_tasktechnologyfit = 0.250, p < 0.01), thus supporting H5 and H7.
Specifically, among the factors affecting the acceptance of smart speakers, the variable that shows the strongest relationship with attitude towards smart speakers (except for the relation attitude-intention) is perceived enjoyment. This confirms the importance of hedonistic aspects in consumers' attitudes and behaviours in general (Venkatesh et al., 2012), and of perceived enjoyment, as a predictor of attitude, in particular (Zaharia and Würfel, 2020). Feeling pleasure, entertainment and fun when using a smart speaker fascinates users, thus creating a positive emotional relationship towards the smart speaker that can form the basis of a positive and lasting relationship (Zaharia and Würfel, 2020). The utilitarian aspect of smart speakers, namely the ability of the technology to support a task, is confirmed to be a determining factor of whether technologies are believed to be more useful, more relevant, or give more relative advantage to the users (Goodhue and Thompson, 1995). The correspondence between task requirements, individual abilities and the functionality of the technology to match the capabilities of the technology to the demands of the task affects the positive perception of individuals, reducing frustration for users who interface with these devices daily. In addition, smart speakers, thanks to AI and ML, continuously collect information and feedback to optimize their operation and respond better to the requests of the users, allowing them to live interactive experiences. It is precisely this high level of interactivity through smart speakers and voice assistants that can compensate for the impersonal nature of a smart speaker (Go and Sundar, 2019).
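Because attitude mediates the effect of each antecedent on intention, the size of the indirect effects can be approximated as the product of the two path coefficients involved. The sketch below applies this product rule to the significant coefficients reported in the text; it assumes that the coefficients are standardized and that no direct paths from the antecedents to intention are estimated, and the resulting values are illustrative point estimates that the study itself does not report or test.

```python
# Path from attitude to intention (H8), as reported in the text.
BETA_ATTITUDE_TO_INTENTION = 0.649

# Significant paths from each antecedent to attitude (H1, H2, H5, H7).
paths_to_attitude = {
    "perceived usefulness": 0.337,
    "perceived ease of use": 0.171,
    "perceived enjoyment": 0.419,
    "task technology fit": 0.250,
}


def indirect_effect(path_to_mediator: float, mediator_to_outcome: float) -> float:
    """Indirect effect through a single mediator: product of the two paths."""
    return path_to_mediator * mediator_to_outcome


indirect = {
    name: indirect_effect(gamma, BETA_ATTITUDE_TO_INTENTION)
    for name, gamma in paths_to_attitude.items()
}

# List the antecedents from the largest to the smallest indirect effect.
for name, value in sorted(indirect.items(), key=lambda item: item[1], reverse=True):
    print(f"{name} -> attitude -> intention: {value:.3f}")
```

Under these assumptions the ordering of the indirect effects mirrors the ordering of the direct paths to attitude, with perceived enjoyment first.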
Contrary to what was hypothesized, the relationship between perceived privacy risk and attitude has been proved to be not statistically significant (γ_privacyrisk = −0.033, p > 0.05), thus not supporting H3. One possible justification might come from the Privacy Paradox: although people say they care about their privacy and are not willing to share their data, they actually give their private information in exchange for small benefits or convenience (Norberg et al., 2007; Aiolfi et al., 2021). Therefore, although people say they are afraid of privacy risks, they behave in the opposite way and these concerns do not impact the attitude towards smart speakers.
Finally, no significant relationship has been found between innovativeness and attitude towards smart speakers nor between social attraction and attitude towards smart speakers. Therefore, H4 and H6 have not been supported.
The resulting negative direction of the relationship assumed in H4 appears contrary to what is claimed in the literature (e.g. Hwang et al., 2019; Kim et al., 2021). However, the fact that the negative relationship is not significant in our model might be considered a positive result when compared with prior analyses, in which high levels of perceived innovativeness of a new technology were associated with a favourable attitude towards using the new technology over time. Nevertheless, H4 is a point deserving of further analysis. For instance, since perceived innovativeness indicates that the product itself reflects the novelty of the technology (Leckie et al., 2017), the level of familiarity of the users with the context in which the new technology is inserted may have affected the level of perceived innovativeness. If users are not familiar with AI-based technologies, it becomes difficult for them to evaluate their perception of those tools.
Finally, despite the positive direction of the effect of social attraction on attitude (H6), our model shows that this relationship is not statistically significant. This may be another consequence of the level of familiarity with voice assistants, as well as of the level of awareness of the individuals about the human-like attributes of smart speakers that can elicit a sense of social presence in the mind of individuals (Sundar et al., 2017; McLean and Osei-Frimpong, 2019). Therefore, if respondents are not familiar with the human-like attributes of smart speakers, it becomes difficult for them to engage with the AI technology of smart speakers in the same way as they would with other human counterparts.

Conclusions and implications
Smart speakers are already a real revolution in the customer experience and accelerate the need to adopt an omnichannel managerial perspective in which the entire customer journey must be completely rethought. This phenomenon, in addition to stimulating important developments in the daily life of human beings, opens up multiple interesting scenarios for companies able to reap the benefits of the changes taking place. The spread of voice assistants and smart speakers represents a major challenge, both for manufacturers and for retailers. Smart speakers are real intermediaries and real sales channels within which consumers and companies exchange feedback, complete transactions and establish real business relationships.

In a context radically transformed by scientific progress, where the relationship between man and machine is radically changing, the present study has attempted to understand the variables that influence the attitude towards smart speakers and the intention to use them in order to analyse how those technologies will impact the future of marketing, namely how AI may influence marketing strategies and customer behaviours.
From a theoretical perspective, our work examines the evolution of smart speakers and their potential effects on the marketing practice and, concerning consumers' perceptions, it confirms the validity of the application of the TAM model to measure the factors underlying the adoption of new and specific technology, the smart speaker, with the same success as other technologies. The antecedents of attitude and intention are in line with the characteristics of smart speakers most appreciated by those who are familiar with them.
At the core of this technology improvement, there is undoubtedly the topic of personalization. The art of offering potential customers personalized items, capable of satisfying their wishes and expectations, is as old as commerce and is still practiced by sales assistants in traditional shops. What is innovative is the evolution of this approach thanks to AI-based technologies that allow smart speakers to continuously collect information and feedback on the requests submitted to them. A user's preference can be inferred from previous searches or purchases made, through the suggestion of related products or categories, or obtained from previous conversations (Kraus et al., 2019). Smart speakers process and use this knowledge to improve and optimize their functioning and be able, later, to better respond to subsequent requests from the same user.
From a managerial perspective, the challenge today is to make conversations and interactions similar to those that take place between human beings. This phenomenon highlights the importance of how interactive experiences are considered better than transmitting static information, and a high level of interactivity often manages to compensate for the impersonal nature of a chatbot (Go and Sundar, 2019).
The voice is the characterizing factor of smart speakers. It represents the only means of interaction between machine and user, which is why it must be treated in detail in terms of tone, intonation and choice of language. A voice that is not pleasant to the ear or not consistent with the institutional communication of the brand risks compromising the usefulness perceived by the users. The range of actions that a smart speaker can perform must be expanded and go beyond carrying out routine activities. The way to increase the perceived usefulness is to give evidence that users can use the voice to obtain answers even for unexpected actions, perform these actions optimally and easily, and perceive the benefit in terms of time, efficiency and comfort. It is, therefore, necessary to invest resources to create a stable and continuous link between a user and a smart speaker based on the logical chain that the more it can carry out multiple, easy and differentiated actions, the more useful it is and therefore the greater the attitude towards it and consequently its use.
Moreover, since the TTF is the perception of the adequacy or inadequacy of a smart speaker to perform the actions for which it is programmed, all the activities it can perform, present and future, must always be carried out correctly, without margins of error. This aspect does not concern so much the execution of the command itself as the immediately preceding step, namely the understanding of the voice command so as to perform exactly the required task. The main limitations to the development of a smart speaker depend on the progress achieved in the field of NLP, which clashes with the complexity of human language: it does not always follow a linear flow and is full of aspects that are complicated to translate into computational linguistics. The spoken phrases and inputs can also differ or be misunderstood, thus generating frustration for the users who interface with these devices daily. Thus, beyond all the ancillary features that a smart speaker may have, companies are forced to ensure better speech recognition, understood as a greater understanding of the sentences and commands given, the ability to understand multiple languages and the ability to add emotional aspects within conversations with devices (Zwakman et al., 2021). It is precisely this emotional bond that must be created and most exploited by firms to form the basis of a positive and lasting relationship. Perceiving enjoyment can create a positive emotional relationship with people's voice assistants. This is, for example, the main reason why users can activate skills that allow them to be entertained through music, games or fun conversations.
In addition, despite the growing interest of users in the protection of personal and sensitive data, our results show how individuals are not afraid of sharing their data through smart speakers. This result could be attributable to the paradox of privacy (Norberg et al., 2007). Today giving consent to the processing of personal data has become almost a routine, an automatic gesture (sometimes mandatory to use some applications) and although people say they care about their privacy and are not willing to share their information, they give their data in exchange for small benefits or convenience (Aiolfi et al., 2021).
To sum up, our work has attempted to light on the revolution taking place to fully satisfy the individual intention to use smart speakers that may turn on to be also a new retail channel. Smart speakers have the potential to substantially alter all phases of the customer journey, from searching for information on a product to repurchasing, even automated (Mari et al., 2020); in the phase before the sale, they allow individuals to search for details of a specific product and compare prices and alternatives; complete the purchase only through a voice confirmation and finally, the post-purchase is based on updates related to the status of the delivery (Zaharia and W€ urfel, 2020).
Thus, in the next future, it is essential to pay attention to the development of this new sales channel, voice commerce, that has the potential to change the current purchasing habits of individuals who use e-commerce, especially convenience goods, for which the consumer is not willing to invest time in the search for information and evaluation of alternatives. With ALexa, Amazon and Echo, however, voice shopping has reached full maturity and voice commerce, to be expected, will determine a number of new shopping best practices and e-commerce experiences that will impact not only customer behaviours but also retailers' marketing strategies and business models. Thanks to a smart speaker device, e-retailers will be able to guide the user in the navigation of a website, facilitating the choice of the product with targeted recommendations, improving the user experience and reducing the risk of cart abandonment.
Since AI-based tools enable better predictions for what customers want, and with high accuracy, online retailer (e-tailers) might move away from a shopping-then-shipping business model and towards a shipping-then-shopping business model (Davenport et al., 2020;Grandinetti, 2020). The customization, achieved through the interaction between an AI-tool, the smart speaker, and a consumer, improves the efficacy (and efficiency) of the mass customization process (Grandinetti, 2020). Using AI tools retailers can, in fact, analyse constantly expanding information not explicitly provided by the consumers that are involved on an emotional level thank to the globally gratifying experience where they are immerse in (Tao and Xu, 2018;Grandinetti, 2020). Leading e-tailers had already previously started their services to provide customers with buying recommendations and recently, in the fashion industry, some e-tailers have started to combine a sophisticated and complex AI system with the work of human stylists that lead to developing projects in the field of "anticipatory shipping" (Davenport et al., 2020;Grandinetti, 2020). More generally, the association between smart speakers and customization is an area worth exploring that could generate new and original developments for e-tailers in several sectors.
While it is true that individuals are still sceptical about the actual usefulness of smart speakers, more and more users consider them a valuable help in finding and buying products faster and, consequently, in enjoying better shopping experiences. Both manufacturers and retailers have to start seizing the opportunities offered by voice commerce and, to gain an advantage over their competitors, they need to consider the emerging habits of consumers in their omnichannel processes and strategies. Finally, it will be necessary to take concrete actions regarding the integration of voice features into their apps and websites.

Limitations and future research
Some limitations are associated with the online survey and the sample size. Although the sample is considered acceptable for a structural equation model, we intend, for future research, to enlarge the sample.
In addition, some relationships in our model may have been influenced by users' level of familiarity with smart speakers, as well as by individuals' level of awareness of the human-like attributes of smart speakers, which can elicit a sense of social presence in the mind of individuals. Therefore, for future research, we suggest evaluating the level of awareness of and familiarity with smart speakers and using them as control variables or moderators of the relationships in the model. In addition, to simplify the model, we considered only the most cited variables of the TAM, in the belief that TAM is the most studied approach to technology adoption; we may therefore have overlooked other variables affecting the adoption of smart speakers. For further research, we intend to consider other relevant variables affecting attitude and intention coming from UTAUT and its extensions, which have already been used in recent research on language assistants. Specifically, we intend to investigate whether brand awareness among the target audience and the strength of the relationships currently maintained with customers can amplify or mitigate the perception of privacy risk. Brand credibility plays a fundamental role in solving privacy issues; although brands can take advantage of an AI system that can fully meet the social and psychological needs of consumers, consumers may still be reluctant to use such systems depending on the credibility and trust associated with the brand. The credibility and reputation of the brand might be moderators of the relationship that is established between smart speakers and users.
For further research, a more sophisticated model should analyse the main ethical concerns that are not considered in our research and that go beyond privacy risk. A deeper analysis of the various ethical concerns that have emerged in the literature on smart speakers (Aguirre et al., 2020; Seymour et al., 2020, 2022; Devi and Chanda, 2022; Zhan et al., 2022), which in our research are limited to privacy concerns, would yield useful new insights for developers of voice assistants and smart speakers who try to improve their credibility in the eyes of users.
In addition, the emotional side of the relationship with smart speakers should be investigated in future studies because, although the pragmatic aspects of using a smart speaker have reached a fair level of understanding, the emotional aspect remains unknown.
Finally, we suggest that researchers who want to investigate this topic concentrate their efforts on studying the new phenomenon of voice commerce promoted by smart speaker adoption. Since shopping-related smart speakers and voice assistants are likely to radically change the way consumers search for and purchase products, it is crucial to analyse their effects on brands. AI-enabled tools represent a "black box" for brand owners and, thus, for further research we propose studying brand owners' interpretation of a voice-enabled marketplace, which may influence future marketing choices.