Search results

Article
Publication date: 3 June 2019

Tran Khanh Dang, Duc Minh Chau Pham and Duc Dan Ho

Abstract

Purpose

Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.

Design/methodology/approach

The data modification problem requires careful examination, in which the data are re-collected and the two datasets are overlapped to verify their reliability. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that uses weights of importance in addition to anomaly detection. The aim is to significantly reduce the amount of data in need of verification while still guaranteeing high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.
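
As a rough illustration of the selection idea, the sketch below flags records for re-collection when their price is a statistical outlier or when their importance weight is high. The z-score test stands in for the anomaly detection techniques, which the abstract does not name. The field names (`price`, `weight`), the thresholds and the "either rule" combination are assumptions made for this example, not the authors' model.

```python
from statistics import mean, pstdev


def select_for_recollection(records, z_threshold=2.0, weight_threshold=0.8):
    """Return the ids of records that should be crawled again.

    A record is selected when its price is a z-score outlier OR its
    importance weight reaches the weight threshold. Both rules are
    illustrative assumptions.
    """
    prices = [r["price"] for r in records]
    mu = mean(prices)
    sigma = pstdev(prices)  # population standard deviation
    selected = []
    for r in records:
        # Guard against a zero spread (all prices identical).
        z = abs(r["price"] - mu) / sigma if sigma > 0 else 0.0
        if z > z_threshold or r["weight"] >= weight_threshold:
            selected.append(r["id"])
    return selected


# Hypothetical crawled product records.
records = [
    {"id": "a", "price": 10.0, "weight": 0.1},
    {"id": "b", "price": 11.0, "weight": 0.9},  # important product
    {"id": "c", "price": 9.5, "weight": 0.2},
    {"id": "d", "price": 10.5, "weight": 0.1},
    {"id": "e", "price": 10.2, "weight": 0.3},
    {"id": "f", "price": 95.0, "weight": 0.1},  # suspicious price
]
print(select_for_recollection(records))  # ['b', 'f']
```

Only two of the six records are re-collected here, which mirrors the stated goal of verifying a small subset rather than the whole dataset.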

Findings

The authors examine several techniques for detecting anomalies in user and product data, which achieve an accuracy of approximately 80 per cent. The integration with the weight selection model is also shown to detect more than 80 per cent of the existing fraudulent data while avoiding the accidental inclusion of non-fraudulent data, especially when the proportion of frauds is high.

Originality/value

With the rapid development of e-commerce, fraud detection on e-commerce data and in Web crawling systems is a new and necessary research topic. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.

Details

International Journal of Web Information Systems, vol. 15 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 30 October 2019

Tingting Jiang, Qian Guo, Shunchang Chen and Jiaqi Yang

Abstract

Purpose

The headlines of online news are created carefully to influence audience news selection today. The purpose of this paper is to investigate the relationships between news headline presentation and users’ clicking behavior.

Design/methodology/approach

Two types of unobtrusive data were collected and analyzed jointly for this purpose. A two-month server log file containing 39,990,200 clickstream records was obtained from an institutional news site. A clickstream data analysis was conducted at the footprint and movement levels, which extracted 98,016 clicks received by 7,120 headlines ever displayed on the homepage. Meanwhile, the presentation of these headlines was characterized from seven dimensions, i.e. position, format, text length, use of numbers, use of punctuation marks, recency and popularity, based on the layout and content crawled from the homepage.
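
The joint analysis described above can be sketched as follows: clicks are counted per headline from log records and then averaged over a presentation feature such as position. The log format, the rule that a click is a request whose referrer is the homepage (`"/"`), and the feature values are assumptions for illustration. The abstract does not describe the actual log schema.

```python
from collections import Counter


def clicks_per_headline(log_records, headline_features):
    """Count homepage clicks for each known headline URL."""
    counts = Counter()
    for rec in log_records:
        # Assumed rule: a click is a request for a headline URL
        # that was referred from the homepage.
        if rec["referrer"] == "/" and rec["url"] in headline_features:
            counts[rec["url"]] += 1
    return counts


def mean_clicks_by_feature(counts, headline_features, feature):
    """Average clicks per headline, grouped by one presentation feature."""
    totals, sizes = Counter(), Counter()
    for url, feats in headline_features.items():
        totals[feats[feature]] += counts.get(url, 0)
        sizes[feats[feature]] += 1
    return {value: totals[value] / sizes[value] for value in sizes}


# Hypothetical headlines and clickstream records.
headline_features = {
    "/news/1": {"position": "top"},
    "/news/2": {"position": "side"},
    "/news/3": {"position": "side"},
}
log_records = [
    {"url": "/news/1", "referrer": "/"},
    {"url": "/news/1", "referrer": "/"},
    {"url": "/news/2", "referrer": "/"},
    {"url": "/news/2", "referrer": "/search"},  # not a homepage click
    {"url": "/about", "referrer": "/"},         # not a headline
]

counts = clicks_per_headline(log_records, headline_features)
print(mean_clicks_by_feature(counts, headline_features, "position"))
# {'top': 2.0, 'side': 0.5}
```

Headlines that received no clicks still count in the denominator, so a position is not credited only for its successful headlines.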

Findings

This study identified a series of presentation characteristics that prompted users to click on the headlines, including placing them in the central T-shaped zones, using images, increasing text length properly for greater clarity, using visually distinctive punctuation marks, and providing recency and popularity indicators.

Originality/value

The findings have valuable implications for news providers in attracting clicks to their headlines. Also, the successful application of nonreactive methods has significant implications for future user studies in both information science and journalism.

Details

Aslib Journal of Information Management, vol. 72 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 19 June 2020

Liyaning Tang, Logan Griffith, Matt Stevens and Mary Hardie

Abstract

Purpose

The purpose of this paper is to discover similarities and differences in the construction industry in China and the United States by using data analytic tools on data crawled from social media platforms.

Design/methodology/approach

The method comprised comprehensive data analytics using network link analysis and natural language processing tools to discover similarities and differences in social networks, topics of interest, sentiments and emotions on different social media platforms.

Findings

The research showed that all clusters (construction company, construction worker, construction media and construction union) shared similar trends in follower-following ratios and sentiment analysis on both social media platforms. The biggest difference between the two countries is that public accounts (e.g. company, media and union) on Twitter posted more on public interests, including safety and energy.

Research limitations/implications

The research contributes to knowledge about an alternative method of data collection for both academia and industry practitioners. Statistical bias can be introduced by using only social media platform data. The four clusters analyzed can be further divided to reflect more fine-grained groups within the construction industries. The results can be integrated into other analyses based on traditional methods of data collection such as questionnaire surveys or interviews.

Originality/value

The research provides a comparative study of the construction industries in China and the USA among four clusters using social media platform data.

Details

Engineering, Construction and Architectural Management, vol. 27 no. 8
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 23 August 2013

Changhyun Byun, Hyeoncheol Lee, Yanggon Kim and Kwangmi Ko Kim

Abstract

Purpose

It is difficult to build our own social data set because data in social media is generally too vast and noisy. The aim of this study is to specify design and implementation details of the Twitter data collecting tool with a rule‐based filtering module. Additionally, the paper aims to see how people communicate with each other through social networks in a case study with rule‐based analysis.

Design/methodology/approach

The authors developed a Java‐based data gathering tool with a rule‐based filtering module for collecting data from Twitter. This paper introduces the design specifications and explains the implementation details of the Twitter Data Collecting Tool with detailed Unified Modeling Language (UML) diagrams. The Model View Controller (MVC) framework is applied in this system to support various types of user interfaces.
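
To make the idea of a rule-based filtering module concrete, here is a minimal sketch in which each rule is a named predicate and a tweet is kept only if it satisfies every rule. The original tool is Java-based and uses a rule engine whose rule format the abstract does not describe. The `Rule` class, the tweet fields (`text`, `lang`) and the three example rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A named filtering condition over one tweet (hypothetical format)."""
    name: str
    condition: Callable[[Dict], bool]


def filter_tweets(tweets: List[Dict], rules: List[Rule]) -> List[Dict]:
    """Keep only the tweets that satisfy every rule."""
    return [t for t in tweets if all(r.condition(t) for r in rules)]


# Example rules: keyword match, drop retweets, keep English only.
rules = [
    Rule("mentions_keyword", lambda t: "election" in t["text"].lower()),
    Rule("not_retweet", lambda t: not t["text"].startswith("RT ")),
    Rule("english_only", lambda t: t.get("lang") == "en"),
]

tweets = [
    {"id": 1, "text": "Election results are in", "lang": "en"},
    {"id": 2, "text": "RT @x: election news", "lang": "en"},
    {"id": 3, "text": "Nice weather today", "lang": "en"},
    {"id": 4, "text": "Election day", "lang": "de"},
]

kept = filter_tweets(tweets, rules)
print([t["id"] for t in kept])  # [1]
```

Keeping rules as data separate from the filtering loop means new rules can be added without changing the collector itself.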

Findings

The Twitter Data Collecting Tool is able to gather a huge amount of data from Twitter and filter the data with modest rules for complex logic. This case study shows that a historical event creates buzz on Twitter and that people's interest in the event is reflected in their Twitter activity.

Research limitations/implications

Applying data‐mining techniques to social network data has great potential. A possible improvement to the Twitter Data Collecting Tool would be the addition of a built‐in data‐mining module.

Originality/value

This paper focuses on designing a system handling massive amounts of Twitter Data. This is the first approach to embed a rule engine for filtering and analyzing social data. This paper will be valuable to those who may want to build their own Twitter dataset, apply customized filtering options to get rid of unnecessary, noisy data, and analyze social data to discover new knowledge.

Details

International Journal of Web Information Systems, vol. 9 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 26 April 2018

Eugene Wong and Yan Wei

Abstract

Purpose

The purpose of this paper is to develop a customer online behaviour analysis tool, segment high-value customers, analyse their online purchasing behaviour and predict their next purchases from an online air travel corporation.

Design/methodology/approach

An operations review of the customer online shopping process of an online travel agency (OTA) is conducted. A customer online shopping behaviour analysis tool is developed. The tool integrates competitors’ pricing data mining, customer segmentation and predictive analysis. The impacts of competitors’ price changes on customer purchasing decisions regarding the OTA’s products are evaluated. The integrated model for mining pricing data, identifying potential customers and predicting their next purchases helps the OTA recommend tailored product packages to its individual customers with reference to their travel patterns.

Findings

In the customer segmentation analysis, 110,840 customers are identified and segmented based on their purchasing behaviour. The relationship between the purchasing behaviour in an OTA and the price changes of different OTAs is analysed. There is a significant relationship between the flight duration time and the purchase lead time. The next travel destinations of segmented high-value customers are predicted with reference to their travel patterns and the significance of the relationships between destination pairs.

Practical implications

The developed model contributes to pricing evaluation, customer segmentation and package customization for online customers.

Originality/value

This study provides a novel method and insights into customer behaviour towards OTAs through an integrated model of customer segmentation, customer behaviour and prediction analysis.

Details

International Journal of Retail & Distribution Management, vol. 46 no. 4
Type: Research Article
ISSN: 0959-0552

Book part
Publication date: 10 July 2019

Jingjing Wang, Zhiqiang Li, Huanhuan Feng, Yuanjing Guo, Zhengbo Liang, Luyao Wang, Xing Wan and Yalin Wang

Abstract

Recently, the sharing economy has gradually been accepted by people, and it has expanded from everyday life to knowledge. It is important to encourage people to produce high-quality content in the knowledge-sharing area, and knowledge payment is one of the most effective ways to achieve this. Knowledge payment has therefore been regarded as a huge business opportunity, and it is worthwhile to study its development trend and feasibility. This chapter, through big data methods, analyzes the business model of Zhihu (a Chinese knowledge-sharing platform) after it introduced knowledge payment projects such as Zhihu Live and Pay Consultation. Using data on Zhihu users’ Q&A, fields of interest and other attributes, this chapter outlines the user profile to identify the target groups of different topics, the proper form of knowledge payment and the hot topics of Zhihu Live. Through knowledge graph analysis, this chapter finds that Zhihu Live is expected to be the mainstream form of knowledge payment in the future, and that the topics with the most potential are mainly in science, law and business. The chapter also establishes a pricing model for Zhihu Live and provides suggestions for the development of knowledge payment.

Details

The New Silk Road Leads through the Arab Peninsula: Mastering Global Business and Innovation
Type: Book
ISBN: 978-1-78756-680-4

Article
Publication date: 3 April 2017

Hei-Chia Wang, Che-Tsung Yang and Yi-Hao Yen

Abstract

Purpose

Community question answering (CQA) websites provide an open and free way to share knowledge about general topics on the internet. However, inquirers may not obtain useful answers and those who are qualified to provide answers may also miss opportunities to share their expertise without any notice. To address this problem, the purpose of this paper is to provide the means for inquirers to access archived answers and to identify effective subject matter experts for target questions.

Design/methodology/approach

This paper presents a question answering promoter, called QAP, for the CQA services. The proposed QAP facilitates the use of filtered archived answers regarded as explicit knowledge and recommended experts regarded as sources of implicit knowledge for the given target questions.

Findings

The experimental results indicate that QAP can leverage knowledge sharing by refining archived answers based on credibility and distributing raised questions to qualified potential experts.

Research limitations/implications

This proposed method is designed for the traditional Chinese corpus.

Originality/value

This paper proposes an integrated framework of answer selection and expert finding that uses the bottom-up multipath evaluation algorithm, an underlying voting model, the agglomerative hierarchical clustering technique and feature approaches for measuring answer trustworthiness, identifying satisfied learners and assessing the credibility of repliers. Experiments using the corpus crawled from Yahoo! Knowledge Plus are conducted under designed scenarios, and the results are shown in fine detail.

Article
Publication date: 23 December 2019

Malte Bonart, Anastasiia Samokhina, Gernot Heisenberg and Philipp Schaer

Abstract

Purpose

Survey-based studies suggest that search engines are trusted more than social media or even traditional news, although cases of false information or defamation are known. The purpose of this paper is to analyze query suggestion features of three search engines to see if these features introduce some bias into the query and search process that might compromise this trust. The authors test the approach on person-related search suggestions by querying the names of politicians from the German Bundestag before the German federal election of 2017.

Design/methodology/approach

This study introduces a framework to systematically examine and automatically analyze the variety of query suggestions for person names offered by major search engines. To test the framework, the authors collected data from the Google, Bing and DuckDuckGo query suggestion APIs over a period of four months for 629 different names of German politicians. The suggestions were clustered and statistically analyzed with regard to different biases, such as gender, party or age, and with regard to the stability of the suggestions over time.
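
One simple way to quantify the stability of suggestions over time is the Jaccard similarity between the suggestion sets collected on consecutive days, averaged over the collection period. The abstract does not say which stability measure the authors used, so the metric below is an assumption, and the snapshot data are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty suggestion lists are identical
    return len(a & b) / len(a | b)


def stability(snapshots):
    """Mean Jaccard similarity between consecutive snapshots."""
    pairs = list(zip(snapshots, snapshots[1:]))
    if not pairs:
        return 1.0  # a single snapshot cannot fluctuate
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical daily suggestions for one politician's name.
snapshots = [
    ["party", "age", "wife"],
    ["party", "age", "salary"],
    ["party", "age", "salary"],
]
print(stability(snapshots))  # 0.75
```

A value near 1 indicates suggestions that are present most of the time, while lower values indicate terms that fluctuate between collection dates.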

Findings

By using the framework, the authors located three semantic clusters within the data set: suggestions related to politics and economics, location information and personal and other miscellaneous topics. Among other effects, the results of the analysis show a small bias in the form that male politicians receive slightly fewer suggestions on “personal and misc” topics. The stability analysis of the suggested terms over time shows that some suggestions are prevalent most of the time, while other suggestions fluctuate more often.

Originality/value

This study proposes a novel framework to automatically identify biases in web search engine query suggestions for person-related searches. Applying this framework to a set of person-related query suggestions provides first insights into the influence search engines can have on the query process of users who seek information on politicians.

Details

Online Information Review, vol. 44 no. 2
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 2 December 2019

Fuli Zhou, Ming K. Lim, Yandong He and Saurabh Pratap

Abstract

Purpose

The booming development of e-commerce has stimulated vehicle consumers to express individual reviews through online forums. The purpose of this paper is to probe into vehicle consumers’ consumption behavior and make recommendations for potential consumers from the viewpoint of textual comments.

Design/methodology/approach

A big data analytics-based approach is designed to discover vehicle consumer consumption behavior from an online perspective. To reduce the subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to perform the sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain the specific sentiment value of each attribute class, contributing to multi-grade sentiment classification. To achieve intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytical viewpoint.
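
For readers unfamiliar with VIKOR, the sketch below shows the classical ranking step on a small score matrix. It is not the authors' SSC-VIKOR variant: the Saaty scale-based scoring is not reproduced, the acceptance checks for the compromise solution are omitted, all criteria are treated as benefit criteria, and the brands, scores and weights are invented for illustration.

```python
def vikor(matrix, weights, v=0.5):
    """Classical VIKOR indices for benefit criteria.

    matrix[i][j] is the score of alternative i on criterion j.
    Returns (S, R, Q); a lower Q means a better alternative.
    """
    n_crit = len(weights)
    best = [max(row[j] for row in matrix) for j in range(n_crit)]
    worst = [min(row[j] for row in matrix) for j in range(n_crit)]

    S, R = [], []
    for row in matrix:
        terms = []
        for j in range(n_crit):
            span = best[j] - worst[j]
            # A criterion on which all alternatives tie contributes nothing.
            dist = (best[j] - row[j]) / span if span else 0.0
            terms.append(weights[j] * dist)
        S.append(sum(terms))  # group utility
        R.append(max(terms))  # individual regret

    def rescale(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    Q = [v * s + (1 - v) * r for s, r in zip(rescale(S), rescale(R))]
    return S, R, Q


# Hypothetical sentiment scores per brand on three attributes:
# cost-effectiveness, comfort, safety.
brands = ["A", "B", "C"]
scores = [
    [0.9, 0.6, 0.7],
    [0.5, 0.8, 0.9],
    [0.4, 0.5, 0.6],
]
weights = [0.5, 0.25, 0.25]

S, R, Q = vikor(scores, weights)
ranking = [b for _, b in sorted(zip(Q, brands))]
print(ranking)  # ['A', 'B', 'C']
```

The parameter `v` balances group utility against individual regret, and 0.5 is a common default.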

Findings

The big data analytics indicate that “cost-effectiveness” is the characteristic that vehicle consumers care about most, and the data mining results enable automakers to better understand consumer consumption behavior.

Research limitations/implications

The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation.

Originality/value

Research on consumer consumption behavior is usually based on survey methods, and most previous studies of comment analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill the gap from the big data perspective.

Details

Industrial Management & Data Systems, vol. 120 no. 1
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 3 October 2016

Philipp Max Hartmann, Mohamed Zaki, Niels Feldmann and Andy Neely

Abstract

Purpose

The purpose of this paper is to derive a taxonomy of business models used by start-up firms that rely on data as a key resource for business, namely data-driven business models (DDBMs). By providing a framework to systematically analyse DDBMs, the study provides an introduction to DDBM as a field of study.

Design/methodology/approach

To develop the taxonomy of DDBMs, business model descriptions of 100 randomly chosen start-up firms were coded using a DDBM framework derived from literature, comprising six dimensions with 35 features. Subsequent application of clustering algorithms produced six different types of DDBM, validated by case studies from the study’s sample.

Findings

The taxonomy derived from the research consists of six different types of DDBM among start-ups. These types are characterised by a subset of six of nine clustering variables from the DDBM framework.

Practical implications

A major contribution of the paper is the designed framework, which stimulates thinking about the nature and future of DDBMs. The proposed taxonomy will help organisations to position their activities in the current DDBM landscape. Moreover, framework and taxonomy may lead to a DDBM design toolbox.

Originality/value

This paper develops a basis for understanding how start-ups build business models that capture value from data as a key resource, adding a business perspective to the discussion of big data. By offering the scientific community a specific framework of business model features and a subsequent taxonomy, the paper provides reference points and serves as a foundation for future studies of DDBMs.

Details

International Journal of Operations & Production Management, vol. 36 no. 10
Type: Research Article
ISSN: 0144-3577
