Search results

1 – 10 of over 1000
Article
Publication date: 3 June 2019

Tran Khanh Dang, Duc Minh Chau Pham and Duc Dan Ho

Abstract

Purpose

Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.

Design/methodology/approach

The data modification problem requires careful examination, in which the data are re-collected and the two datasets are overlapped to verify their reliability. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that uses weights of importance in addition to anomaly detection. The target is to significantly reduce the amount of data in need of verification while still guaranteeing high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.

Findings

The authors examine several techniques for detecting anomalies in the data of users and products, which give an accuracy of approximately 80 per cent. The integration with the weight selection model is also shown to detect more than 80 per cent of the existing fraudulent data while taking care not to accidentally include data that are not fraudulent, especially when the proportion of frauds is high.

Originality/value

With the rapid development of e-commerce, fraud detection on its data, as well as in Web crawling systems, is a new and necessary research topic. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.

Details

International Journal of Web Information Systems, vol. 15 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 30 October 2019

Tingting Jiang, Qian Guo, Shunchang Chen and Jiaqi Yang

Abstract

Purpose

The headlines of online news are created carefully to influence audience news selection today. The purpose of this paper is to investigate the relationships between news headline presentation and users’ clicking behavior.

Design/methodology/approach

Two types of unobtrusive data were collected and analyzed jointly for this purpose. A two-month server log file containing 39,990,200 clickstream records was obtained from an institutional news site. A clickstream data analysis was conducted at the footprint and movement levels, which extracted 98,016 clicks received by 7,120 headlines ever displayed on the homepage. Meanwhile, the presentation of these headlines was characterized from seven dimensions, i.e. position, format, text length, use of numbers, use of punctuation marks, recency and popularity, based on the layout and content crawled from the homepage.

Findings

This study identified a series of presentation characteristics that prompted users to click on the headlines, including placing them in the central T-shaped zones, using images, increasing text length properly for greater clarity, using visually distinctive punctuation marks, and providing recency and popularity indicators.

Originality/value

The findings have valuable implications for news providers in attracting clicks to their headlines. Also, the successful application of nonreactive methods has significant implications for future user studies in both information science and journalism.

Details

Aslib Journal of Information Management, vol. 72 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 8 March 2024

Juan Shi

Abstract

Purpose

Users' voluntary forwarding behavior opens a new avenue for companies to promote their brands and products on social networking sites (SNS). However, research on voluntary information disseminators is limited. This paper aims to bring an in-depth understanding of voluntary disseminators by answering the following questions: (1) What is the underlying mechanism by which some users are more enthusiastic to voluntarily forward content of interest? (2) How to identify them? We propose a theoretical model based on the Elaboration-Likelihood Model (ELM) and examine three types of factors that moderate the effect of preference matching on individual forwarding behavior, including personal characteristics, tweet characteristics and sender–receiver relationships.

Design/methodology/approach

Via the Twitter API, we randomly crawled data on 1,967 Twitter users to validate the conceptual framework. Each user’s original tweets and retweeted tweets, as well as profile data such as the number of followers and followees and verification status, were obtained. The final corpus contains 163,554 data points describing the retweeting behavior of 1,634 valid twitterers. Tweets produced by these core users' followees were also crawled. These data points constitute an unbalanced panel dataset, and we employ different models (fixed-effects, random-effects and pooled logit models) to test the moderation effects. The robustness test shows consistency among these different models.

Findings

Preference matching significantly affects users' forwarding behavior, implying that SNS users are more likely to share content that aligns with their preferences. In addition, we find that popular users with many followers, heavy SNS users who author tweets or forward other-sourced tweets more frequently and users who tend to produce longer original content are more enthusiastic about disseminating content of interest. Furthermore, interaction strength has a positive moderating effect on the relationship between preference matching and individuals' forwarding decisions, suggesting that users are more likely to disseminate content of interest when it comes from strong ties. However, the moderating effect of perceived affinity is significantly negative, indicating that an online community of individuals with many common friends is not an ideal place to engage individuals in sharing information.

Originality/value

This work brings about a deep understanding of users' voluntary forwarding behavior of content of interest. To the best of our knowledge, the current study is the first to examine (1) the underlying mechanism by which some users are more likely to voluntarily forward content of interest; and (2) how to identify these potential voluntary disseminators. By extending the ELM, we examine the moderating effect of tweet characteristics, sender–receiver relationships as well as personal characteristics. Our research findings provide practical guidelines for enterprises and government institutions to choose voluntary endorsers when trying to engage individuals in information dissemination on SNS.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 13 April 2023

Dandan He, Zhong Yao, Futao Zhao and Yue Wang

Abstract

Purpose

Retail investors are prone to be affected by information dissemination in social media with the rapid development of Web 2.0. The purpose of this study is to recognize the factors that may impact users' retweet behavior, namely information dissemination in the online financial community, through machine learning techniques.

Design/methodology/approach

This paper crawled data from the Chinese online financial community (Xueqiu.com) and extracted author-related, content-related, situation-related, stock-related and stock market-related features from the dataset. The best information dissemination prediction model based on these features was determined by evaluating five classifiers with various performance metrics, and the predictability of different feature groups was tested.

Findings

Five prevalent classifiers were evaluated with various performance metrics, and the random forest classifier proved to be the best retweet prediction model in the authors’ experiments. Moreover, the predictability of author-related, content-related and market-related features was shown to be relatively better than that of the other two feature groups. Finally, several particularly important features, such as the author's followers and the rise and fall of the stock index, were identified.

Originality/value

This study contributes to in-depth research on information dissemination in the financial domain. The findings of this study have important practical implications for government regulators to supervise public opinion in the financial market.

Details

Aslib Journal of Information Management, vol. 76 no. 4
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 28 July 2021

Yong-Hai Li, Jin Zheng, Shan-Tao Yue and Zhi-Ping Fan

Abstract

Purpose

In recent years, electronic word-of-mouth (e-WOM) concerning travel products reflected in online review information has become an important reference for tourists to make their product purchase decisions, while for travel service providers (TSPs), monitoring and improving the e-WOM of their travel products is always an important task. Therefore, based on the online review information, how to capture e-WOM of travel products and find out specific ways to improve the e-WOM is a noteworthy research problem. The purpose of this paper is to develop a method for capturing and analyzing e-WOM toward travel products based on sentiment analysis and stochastic dominance.

Design/methodology/approach

Specifically, online review information of travel products is first crawled and preprocessed. Second, sentiment strengths of online review information toward travel products concerning each feature are judged. Then, the matrix of structured online review information toward travel products is formed. Further, the matrix of e-WOM comparisons between any two travel products is constructed, and e-WOM ranking concerning each travel product is determined. Finally, trade-off chart models are constructed to conduct the e-WOM improvement analyses concerning the travel products.

Findings

An empirical study based on the online review information toward six travel products crawled from the Tuniu.com website is given to illustrate the use of the proposed method.

Originality/value

The proposed method not only enables real-time e-WOM monitoring of travel products but is also useful for TSPs seeking to improve the e-WOM of their travel products.

Details

Kybernetes, vol. 51 no. 10
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 19 June 2020

Liyaning Tang, Logan Griffith, Matt Stevens and Mary Hardie

Abstract

Purpose

The purpose of this paper is to discover similarities and differences in the construction industry in China and the United States by using data analytic tools on data crawled from social media platforms.

Design/methodology/approach

The method comprised comprehensive data analytics using network link analysis and natural language processing tools to discover similarities and differences of social networks, topics of interests and sentiments and emotions on different social media platforms.

Findings

The research showed that all clusters (construction company, construction worker, construction media and construction union) shared similar trends in follower-following ratios and sentiment analysis on both social media platforms. The biggest difference between the two countries is that public accounts (e.g. company, media and union) on Twitter posted more on public interests, including safety and energy.

Research limitations/implications

The research contributes to knowledge about an alternative method of data collection for both academia and industry practitioners. Statistical bias can be introduced by only using social media platform data. The analyzed four clusters can be further divided to reflect more fine-grained groups of construction industries. The results can be integrated into other analyses based on traditional methodologies of data collection such as questionnaire surveys or interviews.

Originality/value

The research provides a comparative study of the construction industries in China and the USA among four clusters using social media platform data.

Details

Engineering, Construction and Architectural Management, vol. 27 no. 8
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 23 August 2013

Changhyun Byun, Hyeoncheol Lee, Yanggon Kim and Kwangmi Ko Kim

Abstract

Purpose

It is difficult to build our own social data set because data in social media is generally too vast and noisy. The aim of this study is to specify design and implementation details of the Twitter data collecting tool with a rule‐based filtering module. Additionally, the paper aims to see how people communicate with each other through social networks in a case study with rule‐based analysis.

Design/methodology/approach

The authors developed a Java‐based data gathering tool with a rule‐based filtering module for collecting data from Twitter. This paper introduces the design specifications and explains the implementation details of the Twitter Data Collecting Tool with detailed Unified Modeling Language (UML) diagrams. The Model View Controller (MVC) framework is applied in this system to support various types of user interfaces.

Findings

The Twitter Data Collecting Tool is able to gather a huge amount of data from Twitter and filter the data with modest rules for complex logic. This case study shows that a historical event creates buzz on Twitter and that people's interest in the event is reflected in their Twitter activity.

Research limitations/implications

Applying data‐mining techniques to the social network data has so much potential. A possible improvement to the Twitter Data Collecting Tool would be an adaptation of a built‐in data‐mining module.

Originality/value

This paper focuses on designing a system handling massive amounts of Twitter Data. This is the first approach to embed a rule engine for filtering and analyzing social data. This paper will be valuable to those who may want to build their own Twitter dataset, apply customized filtering options to get rid of unnecessary, noisy data, and analyze social data to discover new knowledge.

Details

International Journal of Web Information Systems, vol. 9 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 18 April 2023

Fei Fan, Kara Chan, Yan Wang, Yupeng Li and Michael Prieler

Abstract

Purpose

Online influencers are increasingly used by brands around the globe to establish brand communication. This study aims to investigate the characteristics of social media content in terms of presentation style and brand communication among online influencers in China. The authors identified how characteristics of social media posts influence young consumers’ engagement with the posts.

Design/methodology/approach

The authors analyzed 1,779 posts from the Sina Weibo accounts of ten top-ranked online influencers by combining traditional content analysis with Web data crawling of audience engagement with social media posts.

Findings

Online influencers in China used photos more frequently than videos to communicate with their social media audience. Altogether 8% and 6% of posts carried information about promotions and events, respectively. Posts with promotional incentives as well as event information were more likely to engage audiences. Altogether 22% of the sampled social media posts mentioned brands. Posts with brand information, however, were less likely to engage audiences. Furthermore, long text was more effective than photos/images in generating likes from social media audiences.

Originality/value

Combining content analysis of social media posts and engagement analytics obtained via Web data crawling, this study is, to the best of the authors’ knowledge, one of the first empirical studies to analyze influencer marketing and young consumers’ reactions to social media in China.

Details

Young Consumers, vol. 24 no. 4
Type: Research Article
ISSN: 1747-3616

Article
Publication date: 26 April 2018

Eugene Wong and Yan Wei

Abstract

Purpose

The purpose of this paper is to develop a customer online behaviour analysis tool, segment high-value customers, analyse their online purchasing behaviour and predict their next purchases from an online air travel corporation.

Design/methodology/approach

An operations review of the customer online shopping process of an online travel agency (OTA) is conducted. A customer online shopping behaviour analysis tool is developed. The tool integrates competitors’ pricing data mining, customer segmentation and predictive analysis. The impacts of competitors’ price changes on customer purchasing decisions regarding the OTA’s products are evaluated. The integrated model for mining pricing data, identifying potential customers and predicting their next purchases helps the OTA recommend tailored product packages to its individual customers with reference to their travel patterns.

Findings

In the customer segmentation analysis, 110,840 customers are identified and segmented based on their purchasing behaviour. The relationship between the purchasing behaviour in an OTA and the price changes of different OTAs is analysed. There is a significant relationship between the flight duration time and the purchase lead time. The next travel destinations of segmented high-value customers are predicted with reference to their travel patterns and the significance of the relationships between destination pairs.

Practical implications

The developed model contributes to pricing evaluation, customer segmentation and package customization for online customers.

Originality/value

This study provides a novel method and insights into customer behaviour towards OTAs through an integrated model of customer segmentation, customer behaviour and prediction analysis.

Details

International Journal of Retail & Distribution Management, vol. 46 no. 4
Type: Research Article
ISSN: 0959-0552

Book part
Publication date: 10 July 2019

Jingjing Wang, Zhiqiang Li, Huanhuan Feng, Yuanjing Guo, Zhengbo Liang, Luyao Wang, Xing Wan and Yalin Wang

Abstract

Recently, the sharing economy has gradually been accepted by people, and it has expanded from daily life to knowledge. It is important to encourage people to produce high-quality content in the knowledge sharing area, and knowledge payment is one of the most effective ways to achieve this. Knowledge payment has therefore been regarded as a huge business opportunity, and it is worthwhile to study its development trend and feasibility. This chapter uses big data methods to analyze the business model of Zhihu (a Chinese knowledge sharing platform) after it introduced knowledge payment projects such as Zhihu Live and Pay Consultation. Using data on Zhihu users’ Q&A, fields of interest and other attributes, this chapter outlines a user profile to identify the target groups of different topics, the proper form of knowledge payment and the hot topics of Zhihu Live. Through knowledge graph analysis, this chapter finds that Zhihu Live is expected to be the mainstream form of knowledge payment in the future, and that the topics with the most potential are mainly science, law and business. The chapter also establishes a pricing model for Zhihu Live and provides suggestions for the development of knowledge payment.

Details

The New Silk Road Leads through the Arab Peninsula: Mastering Global Business and Innovation
Type: Book
ISBN: 978-1-78756-680-4
