Search results

1 – 10 of over 1000
Article
Publication date: 3 June 2019

Tran Khanh Dang, Duc Minh Chau Pham and Duc Dan Ho

Abstract

Purpose

Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.

Design/methodology/approach

The data modification problem requires careful examination, in which the data are re-collected and the two datasets are overlapped to verify their reliability. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that uses weights of importance in addition to anomaly detection. The target is to significantly reduce the amount of data in need of verification while still guaranteeing high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.

Findings

The authors examine several techniques for detecting anomalies in the data of users and products, which achieve an accuracy of approximately 80 per cent. The integration with the weight selection model is also shown to detect more than 80 per cent of the existing fraudulent data items while avoiding the accidental inclusion of legitimate ones, especially when the proportion of frauds is high.

Originality/value

With the rapid development of e-commerce, fraud detection on e-commerce data and in Web crawling systems is a new and necessary research topic. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.

Details

International Journal of Web Information Systems, vol. 15 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 30 October 2019

Tingting Jiang, Qian Guo, Shunchang Chen and Jiaqi Yang

Abstract

Purpose

Today, the headlines of online news are carefully crafted to influence audiences’ news selection. The purpose of this paper is to investigate the relationships between news headline presentation and users’ clicking behavior.

Design/methodology/approach

Two types of unobtrusive data were collected and analyzed jointly for this purpose. A two-month server log file containing 39,990,200 clickstream records was obtained from an institutional news site. A clickstream data analysis was conducted at the footprint and movement levels, which extracted 98,016 clicks received by 7,120 headlines ever displayed on the homepage. Meanwhile, the presentation of these headlines was characterized along seven dimensions, i.e. position, format, text length, use of numbers, use of punctuation marks, recency and popularity, based on the layout and content crawled from the homepage.

Findings

This study identified a series of presentation characteristics that prompted users to click on the headlines, including placing them in the central T-shaped zones, using images, increasing text length properly for greater clarity, using visually distinctive punctuation marks, and providing recency and popularity indicators.

Originality/value

The findings have valuable implications for news providers in attracting clicks to their headlines. Also, the successful application of nonreactive methods has significant implications for future user studies in both information science and journalism.

Details

Aslib Journal of Information Management, vol. 72 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 28 July 2021

Yong-Hai Li, Jin Zheng, Shan-Tao Yue and Zhi-Ping Fan

Abstract

Purpose

In recent years, electronic word-of-mouth (e-WOM) concerning travel products reflected in online review information has become an important reference for tourists to make their product purchase decisions, while for travel service providers (TSPs), monitoring and improving the e-WOM of their travel products is always an important task. Therefore, based on the online review information, how to capture e-WOM of travel products and find out specific ways to improve the e-WOM is a noteworthy research problem. The purpose of this paper is to develop a method for capturing and analyzing e-WOM toward travel products based on sentiment analysis and stochastic dominance.

Design/methodology/approach

Specifically, online review information of travel products is first crawled and preprocessed. Second, sentiment strengths of online review information toward travel products concerning each feature are judged. Then, the matrix of structured online review information toward travel products is formed. Further, the matrix of e-WOM comparisons between any two travel products is constructed, and e-WOM ranking concerning each travel product is determined. Finally, trade-off chart models are constructed to conduct the e-WOM improvement analyses concerning the travel products.
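
As a rough illustration of the pairwise comparison step, the sketch below checks first-order stochastic dominance between two products' sentiment-strength distributions for a single feature. The abstract does not state which dominance rules the paper applies, so the five-level sentiment scale, the sample frequencies and the function name `first_order_dominates` are all assumptions made for illustration.

```python
from itertools import accumulate

def first_order_dominates(p, q, tol=1e-12):
    """Return True if distribution p first-order stochastically dominates q.

    p and q are probability vectors over the same ordered sentiment levels,
    listed from worst to best. p dominates q when its cumulative distribution
    is never above q's and is strictly below it at least once.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same levels")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    never_above = all(a <= b + tol for a, b in zip(cdf_p, cdf_q))
    strictly_below = any(a < b - tol for a, b in zip(cdf_p, cdf_q))
    return never_above and strictly_below

# Hypothetical frequencies over five levels (very negative ... very positive)
product_a = [0.05, 0.10, 0.15, 0.30, 0.40]
product_b = [0.10, 0.20, 0.20, 0.30, 0.20]

a_beats_b = first_order_dominates(product_a, product_b)
b_beats_a = first_order_dominates(product_b, product_a)
```

Repeating such a check for every pair of products and every feature would fill a comparison matrix of the kind the abstract mentions; how the paper aggregates those comparisons into a ranking is not specified here.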

Findings

An empirical study based on the online review information toward six travel products crawled from the Tuniu.com website is given to illustrate the use of the proposed method.

Originality/value

The proposed method not only enables real-time monitoring of the e-WOM of travel products but also helps TSPs improve the e-WOM of their travel products.

Details

Kybernetes, vol. 51 no. 10
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 19 June 2020

Liyaning Tang, Logan Griffith, Matt Stevens and Mary Hardie

Abstract

Purpose

The purpose of this paper is to discover similarities and differences in the construction industry in China and the United States by using data analytic tools on data crawled from social media platforms.

Design/methodology/approach

The method comprised comprehensive data analytics using network link analysis and natural language processing tools to discover similarities and differences of social networks, topics of interest and sentiments and emotions on different social media platforms.

Findings

The research showed that all clusters (construction company, construction worker, construction media and construction union) shared similar trends in follower-following ratios and sentiment analysis on both social media platforms. The biggest difference between the two countries is that public accounts (e.g. company, media and union) on Twitter posted more on public interests, including safety and energy.

Research limitations/implications

The research contributes to knowledge about an alternative method of data collection for both academia and industry practitioners. Statistical bias can be introduced by only using social media platform data. The analyzed four clusters can be further divided to reflect more fine-grained groups of construction industries. The results can be integrated into other analyses based on traditional methodologies of data collection such as questionnaire surveys or interviews.

Originality/value

The research provides a comparative study of the construction industries in China and the USA among four clusters using social media platform data.

Details

Engineering, Construction and Architectural Management, vol. 27 no. 8
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 23 August 2013

Changhyun Byun, Hyeoncheol Lee, Yanggon Kim and Kwangmi Ko Kim

Abstract

Purpose

It is difficult to build our own social data set because data in social media is generally too vast and noisy. The aim of this study is to specify design and implementation details of the Twitter data collecting tool with a rule‐based filtering module. Additionally, the paper aims to see how people communicate with each other through social networks in a case study with rule‐based analysis.

Design/methodology/approach

The authors developed a Java‐based data gathering tool with a rule‐based filtering module for collecting data from Twitter. This paper introduces the design specifications and explains the implementation details of the Twitter Data Collecting Tool with detailed Unified Modeling Language (UML) diagrams. The Model View Controller (MVC) framework is applied in this system to support various types of user interfaces.

Findings

The Twitter Data Collecting Tool is able to gather a huge amount of data from Twitter and filter the data with modest rules for complex logic. This case study shows that a historical event creates buzz on Twitter and people's interest in the event is reflected in their Twitter activity.

Research limitations/implications

Applying data‐mining techniques to the social network data has considerable potential. A possible improvement to the Twitter Data Collecting Tool would be an adaptation of a built‐in data‐mining module.

Originality/value

This paper focuses on designing a system handling massive amounts of Twitter Data. This is the first approach to embed a rule engine for filtering and analyzing social data. This paper will be valuable to those who may want to build their own Twitter dataset, apply customized filtering options to get rid of unnecessary, noisy data, and analyze social data to discover new knowledge.

Details

International Journal of Web Information Systems, vol. 9 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 18 April 2023

Fei Fan, Kara Chan, Yan Wang, Yupeng Li and Michael Prieler

Abstract

Purpose

Online influencers are increasingly used by brands around the globe to establish brand communication. This study aims to investigate the characteristics of social media content in terms of presentation style and brand communication among online influencers in China. The authors identified how characteristics of social media posts influence young consumers’ engagement with the posts.

Design/methodology/approach

The authors analyzed 1,779 posts from the Sina Weibo accounts of ten top-ranked online influencers by combining traditional content analysis with Web data crawling of audience engagement with social media posts.

Findings

Online influencers in China more frequently used photos than videos to communicate with their social media audience. Altogether 8% and 6% of posts carried information about promotions and events, respectively. Posts with promotional incentives as well as event information were more likely to engage audiences. Altogether 22% of the sampled social media posts mentioned brands. Posts with brand information, however, were less likely to engage audiences. Furthermore, long text is more effective than photos/images in generating likes from social media audiences.

Originality/value

Combining content analysis of social media posts and engagement analytics obtained via Web data crawling, this study is, to the best of the authors’ knowledge, one of the first empirical studies to analyze influencer marketing and young consumers’ reactions to social media in China.

Details

Young Consumers, vol. 24 no. 4
Type: Research Article
ISSN: 1747-3616

Article
Publication date: 26 April 2018

Eugene Wong and Yan Wei

Abstract

Purpose

The purpose of this paper is to develop a customer online behaviour analysis tool, segment high-value customers, analyse their online purchasing behaviour and predict their next purchases from an online air travel corporation.

Design/methodology/approach

An operations review of the customer online shopping process of an online travel agency (OTA) is conducted. A customer online shopping behaviour analysis tool is developed. The tool integrates competitors’ pricing data mining, customer segmentation and predictive analysis. The impacts of competitors’ price changes on customer purchasing decisions regarding the OTA’s products are evaluated. The integrated model for mining pricing data, identifying potential customers and predicting their next purchases helps the OTA recommend tailored product packages to its individual customers with reference to their travel patterns.

Findings

In the customer segmentation analysis, 110,840 customers are identified and segmented based on their purchasing behaviour. The relationship between the purchasing behaviour in an OTA and the price changes of different OTAs is analysed. There is a significant relationship between the flight duration time and the purchase lead time. The next travel destinations of segmented high-value customers are predicted with reference to their travel patterns and the significance of the relationships between destination pairs.

Practical implications

The developed model contributes to pricing evaluation, customer segmentation and package customization for online customers.

Originality/value

This study provides a novel method and insights into customer behaviour towards OTAs through an integrated model of customer segmentation, customer behaviour and prediction analysis.

Details

International Journal of Retail & Distribution Management, vol. 46 no. 4
Type: Research Article
ISSN: 0959-0552

Article
Publication date: 3 April 2017

Hei-Chia Wang, Che-Tsung Yang and Yi-Hao Yen

Abstract

Purpose

Community question answering (CQA) websites provide an open and free way to share knowledge about general topics on the internet. However, inquirers may not obtain useful answers, and those who are qualified to provide answers may miss opportunities to share their expertise because they receive no notice of the questions. To address this problem, the purpose of this paper is to provide the means for inquirers to access archived answers and to identify effective subject matter experts for target questions.

Design/methodology/approach

This paper presents a question answering promoter, called QAP, for the CQA services. The proposed QAP facilitates the use of filtered archived answers regarded as explicit knowledge and recommended experts regarded as sources of implicit knowledge for the given target questions.

Findings

The experimental results indicate that QAP can leverage knowledge sharing by refining archived answers based on credibility and distributing raised questions to qualified potential experts.

Research limitations/implications

This proposed method is designed for the traditional Chinese corpus.

Originality/value

This paper proposes an integrated framework for answer selection and expert finding that uses the bottom-up multipath evaluation algorithm, an underlying voting model, the agglomerative hierarchical clustering technique and feature approaches for measuring answer trustworthiness, identifying satisfied learners and assessing the credibility of repliers. Experiments using the corpus crawled from Yahoo! Knowledge Plus are conducted under designed scenarios, and the results are reported in detail.

Article
Publication date: 21 March 2023

Tong Yang, Yanzhong Dang and Jiangning Wu

Abstract

Purpose

This paper aims to propose a method for dynamic product perceived quality analysis using social media data and to achieve a macro–micro combination analysis. The method enables the prioritization of perceived quality attributes and provides perception causes.

Design/methodology/approach

To rationalize the macro–micro combination, ANOVA and multiple linear regression were used to identify the main factors affecting perceived quality which served as the combination basis; by using the combination basis for consumer segmentation, macro-knowledge (i.e. attribute importance and quality category of the attribute) is achieved by term frequency-inverse document frequency (TF-IDF)-based attribute importance calculation and KANO-based attribute classification, which is combined with micro-quality diagnostic information (i.e. perceived quality, perception causes and quality parameters). Further, dynamic perception Importance-Performance Analysis (IPA) is built to present the attribute priority and perception causes.
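
To make the TF-IDF-based attribute importance step more concrete, here is a minimal sketch that scores attribute terms across a small set of tokenized reviews. The abstract does not give the paper's exact weighting formula, so the variant used here (term count divided by review length, times log(N / document frequency)), the function name `tfidf_importance` and the sample reviews are assumptions for illustration only.

```python
import math
from collections import Counter

def tfidf_importance(reviews, attributes):
    """Score each attribute by its summed TF-IDF weight across reviews.

    reviews: list of token lists; attributes: attribute terms to score.
    Uses tf = count / len(review) and idf = log(N / df), one common variant.
    """
    n_docs = len(reviews)
    attr_set = set(attributes)
    doc_freq = Counter()
    for tokens in reviews:
        # Count each attribute at most once per review
        doc_freq.update(set(tokens) & attr_set)
    scores = {}
    for attr in attributes:
        df = doc_freq[attr]
        if df == 0:
            scores[attr] = 0.0  # attribute never mentioned
            continue
        idf = math.log(n_docs / df)
        tf_sum = sum(tokens.count(attr) / len(tokens)
                     for tokens in reviews if tokens)
        scores[attr] = tf_sum * idf
    return scores

# Hypothetical tokenized reviews of a vehicle
reviews = [
    ["price", "range", "price"],
    ["range", "comfort"],
    ["range", "interior"],
]
scores = tfidf_importance(reviews, ["price", "range", "comfort"])
```

With this variant, a term that appears in every review (here "range") receives zero weight, which is one reason a study might pair TF-IDF with a separate classification such as KANO rather than rely on it alone.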

Findings

The framework was validated by the new energy vehicle (NEV) data of Autohome. The results show that price and purchase purpose are the most influential factors of perceived quality and that dynamic perception IPA can effectively prioritize attributes and mine perception causes.

Originality/value

This is one of the first studies to analyze dynamic perceived quality using social media data, which contributes to the research on perceived quality. The paper also contributes by achieving a combined macro–micro analysis of perceived quality. The method rationalizes the macro–micro combination by identifying the factors influencing perceived quality, which provides ideas for other studies using social media data.

Details

Industrial Management & Data Systems, vol. 123 no. 5
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 23 December 2019

Malte Bonart, Anastasiia Samokhina, Gernot Heisenberg and Philipp Schaer

Abstract

Purpose

Survey-based studies suggest that search engines are trusted more than social media or even traditional news, although cases of false information or defamation are known. The purpose of this paper is to analyze query suggestion features of three search engines to see if these features introduce some bias into the query and search process that might compromise this trust. The authors test the approach on person-related search suggestions by querying the names of politicians from the German Bundestag before the German federal election of 2017.

Design/methodology/approach

This study introduces a framework to systematically examine and automatically analyze the variety in query suggestions for person names offered by major search engines. To test the framework, the authors collected data from the Google, Bing and DuckDuckGo query suggestion APIs over a period of four months for 629 different names of German politicians. The suggestions were clustered and statistically analyzed with regard to different biases, such as gender, party or age, and with regard to the stability of the suggestions over time.

Findings

By using the framework, the authors located three semantic clusters within the data set: suggestions related to politics and economics, location information and personal and other miscellaneous topics. Among other effects, the results of the analysis show a small bias in the form that male politicians receive slightly fewer suggestions on “personal and misc” topics. The stability analysis of the suggested terms over time shows that some suggestions are prevalent most of the time, while other suggestions fluctuate more often.

Originality/value

This study proposes a novel framework to automatically identify biases in web search engine query suggestions for person-related searches. Applying this framework to a set of person-related query suggestions provides initial insights into the influence search engines can have on the query process of users who seek information on politicians.

Details

Online Information Review, vol. 44 no. 2
Type: Research Article
ISSN: 1468-4527
