Search results
1 – 10 of over 33000
Bilal Abu-Salih, Pornpit Wongthongtham and Chan Yan Kit
Abstract
Purpose
This paper aims to obtain the domain of the textual content generated by users of online social network (OSN) platforms. Understanding a user’s domain(s) of interest is a significant step towards addressing their domain-based trustworthiness through an accurate understanding of their content in their OSNs.
Design/methodology/approach
This study uses a Twitter mining approach for domain-based classification of users and their textual content. The proposed approach incorporates machine learning modules and comprises two analysis phases. The first phase is a time-aware semantic analysis of users’ historical content that incorporates five commonly used machine learning classifiers and classifies users into two main categories: politics-related and non-politics-related. In the second phase, the likelihood predictions obtained in the first phase are used to predict the domain of users’ future tweets.
Findings
Experiments were conducted to validate the mechanism proposed in the study framework. They verify the applicability of the framework to effective domain-based classification of Twitter users and their content, as evidenced by strong results on several performance evaluation metrics.
Research limitations/implications
This study is limited to an on/off domain classification for content of OSNs. Hence, the authors selected the politics domain because of Twitter’s popularity as a rich source of political deliberations. Such data abundance facilitates data aggregation and improves the results of the data analysis. Furthermore, the currently implemented machine learning approaches assume that uncertainty and incompleteness do not affect the accuracy of the Twitter classification, although in practice data uncertainty and incompleteness may exist. In the future, the authors will formulate the data uncertainty and incompleteness into fuzzy numbers, which can be used to address imprecise, uncertain and vague data.
Practical implications
This study proposes a practical framework comprising significant implications for a variety of business-related applications, such as the voice of customer/voice of market, recommendation systems, the discovery of domain-based influencers and opinion mining through tracking and simulation. In particular, the factual grasp of the domains of interest extracted at the user level or post level enhances the customer-to-business engagement. This contributes to an accurate analysis of customer reviews and opinions to improve brand loyalty, customer service, etc.
Originality/value
This paper fills a gap in the existing literature by presenting a consolidated framework for Twitter mining that aims to uncover the deficiency of the current state-of-the-art approaches to topic distillation and domain discovery. The overall approach is promising in the fortification of Twitter mining towards a better understanding of users’ domains of interest.
Alfredo Milani, Niyogi Rajdeep, Nimita Mangal, Rajat Kumar Mudgal and Valentina Franzoni
Abstract
Purpose
This paper aims to propose an approach for the analysis of user interest based on tweets, which can be used in the design of user recommendation systems. The extracted topics are those seen positively by the user.
Design/methodology/approach
The proposed approach is based on the combination of sentiment extraction and classification analysis of tweets to extract the topic of interest. The proposed hybrid method is original. The topic extraction phase uses a method based on semantic distance in the WordNet taxonomy. Sentiment extraction uses NLPcore.
Findings
The algorithm has been extensively tested using real tweets generated by 1,000 users. The results are encouraging: they outperform state-of-the-art results and confirm the suitability of combining sentiment and categorization for extracting topics of interest.
Research limitations/implications
The hybrid method combining sentiment extraction and classification for user positive topics represents a novel contribution with many potential applications.
Practical implications
The functionality of positive topic extraction is very useful as a component in the design of a recommender system based on user profiling from Twitter user behaviors.
Social implications
The proposed method can be applied widely to short-text social networks, well beyond its application to tweets.
Originality/value
There are few works that have considered both sentiment analysis and classification to find out users’ interest. The algorithm has been extensively tested using real tweets generated by 1,000 users. The results are quite encouraging and outperform state-of-the-art results.
Mieke Jans, Banu Aysolmaz, Maarten Corten, Anant Joshi and Mathijs van Peteghem
Abstract
Purpose
The Accounting Information Systems (AIS) research field emerged around 30 years ago as a subfield of accounting but is at risk of developing further as an isolated discipline. However, given the importance of digitalization and its relevance for accounting, an amalgamation of the parent research field of accounting and the subfield of accounting information systems is pivotal for continuing relevant research that is of high quality. This study empirically investigates the distance between AIS research that is included in accounting literature and AIS research that prevails in dedicated AIS research outlets.
Design/methodology/approach
To understand which topics define AIS research, all articles published in the two leading AIS journals since 2000 were analyzed. Based on this topical inventory, all AIS studies that were published in the top 16 accounting journals, also since 2000, are identified and categorized in terms of topic, subtopic and research methodology. Next, AIS studies published in the general accounting field and AIS studies published in the AIS field were compared in terms of topics and research methodology to gain insights into the distance between the two fields.
Findings
The coverage of AIS topics in accounting journals is, to no small extent, concentrated around the topics “information disclosure”, “network technologies” and “audit and control”. Other AIS topics remain underrepresented. A possible explanation might be the focus on archival studies in accounting outlets, but other elements might play a role. The findings suggest that there is only a partial overlap between the parent accounting research field and the AIS subfield, in terms of both topic and research methodology diversity. These findings suggest a considerable distance between both fields, which might hold detrimental consequences in the long run, if no corrective actions are taken.
Originality/value
This is the first in-depth investigation of the distance between the AIS research field and its parent field of accounting. This study helped develop an AIS classification scheme, which can be used in other research endeavors. This study creates awareness of the divergence between the general accounting research field and the AIS subfield. Given the latter's relevance to the accounting profession, isolation or deterioration of the AIS research must be avoided. Some actionable suggestions are provided in the paper.
Haichao Dong, Siu Cheung Hui and Yulan He
Abstract
Purpose
The purpose of this research is to study the characteristics of chat messages by analysing a collection of 33,121 sample messages gathered from 1,700 sessions of conversations of 72 pairs of MSN Messenger users over a four-month period from June to September of 2005. The primary objective of chat message characterization is to understand the properties of chat messages for effective message analysis, such as message topic detection.
Design/methodology/approach
From the study on chat message characteristics, an indicative term‐based categorization approach for chat topic detection is proposed. In the proposed approach, different techniques such as sessionalisation of chat messages and extraction of features from icon texts and URLs are incorporated for message pre‐processing. Naïve Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing topics from chat sessions.
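As a rough illustration of the classification step described above, the following Python sketch trains a small multinomial Naïve Bayes classifier on chat sessions, where the messages of each session are merged into one token list (a simple stand-in for sessionalisation). The class name, the toy sessions and the topic labels are hypothetical, and the paper's indicative term selection and icon text/URL feature extraction are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def sessionalise(messages):
    """Merge the messages of one chat session into a single token list."""
    return [tok for msg in messages for tok in msg.lower().split()]


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sessions, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(sessions, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        n_sessions = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.class_counts.items():
            total_terms = sum(self.term_counts[label].values())
            logp = math.log(n / n_sessions)  # class prior
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore terms never seen in training
                count = self.term_counts[label][tok]
                logp += math.log((count + 1) / (total_terms + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical toy training data: one list of messages per session.
train_sessions = [
    ["did you watch the match", "great goal by the team"],
    ["the team won the match"],
    ["exam tomorrow", "need to study"],
    ["homework due before the exam"],
]
train_labels = ["sports", "sports", "study", "study"]

model = NaiveBayesTopicClassifier().fit(
    [sessionalise(s) for s in train_sessions], train_labels
)
print(model.predict(sessionalise(["what a goal", "best team"])))
```

Classifying whole sessions rather than single messages gives the classifier more terms to work with, which matters because individual chat messages are typically very short.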
Findings
The indicative term‐based approach is superior to the traditional document frequency‐based approach for feature selection in chat topic categorization.
Originality/value
This paper studies the characteristics of chat messages and proposes an indicative term‐based categorization approach for chat topic detection.
Kalervo Järvelin and Pertti Vakkari
Abstract
Purpose
This paper analyses the research in Library and Information Science (LIS) and reports on (1) the status of LIS research in 2015 and (2) the evolution of LIS research longitudinally from 1965 to 2015.
Design/methodology/approach
The study employs a quantitative intellectual content analysis of articles published in 30+ scholarly LIS journals, following the design by Tuomaala et al. (2014). In the content analysis, we classify articles along eight dimensions covering topical content and methodology.
Findings
The topical findings indicate that the earlier strong LIS emphasis on L&I services has declined notably, while scientific and professional communication has become the most popular topic. Information storage and retrieval has given up its earlier strong position towards the end of the years analyzed. Individuals are increasingly the units of observation. End-user's and developer's viewpoints have strengthened at the cost of intermediaries' viewpoint. LIS research is methodologically increasingly scattered since survey, scientometric methods, experiment, case studies and qualitative studies have all gained in popularity. Consequently, LIS may have become more versatile in the analysis of its research objects during the years analyzed.
Originality/value
Among quantitative intellectual content analyses of LIS research, the study is unique in its scope: length of analysis period (50 years), width (8 dimensions covering topical content and methodology) and depth (the annual batch of 30+ scholarly journals).
Abstract
Purpose
The purpose of this study is to identify the health information needs of senior online communities (SOCs) users, which could provide a basis for improving senior health information services.
Design/methodology/approach
A total of 14,933 health-related posts in the two most popular senior online communities (Yinling and Keai) in China are crawled as a corpus. Based on the results of word frequency analysis, text classification is performed based on two aspects: medical systems (Western medicine and traditional Chinese medicine) and topics. The health information needs of SOCs users are revealed from the composition, growth trends and popularity of health information. Finally, some key points of senior health information services are discussed.
Findings
The health information needs of senior users can be divided into four types: coping with aging, dietary nutrition, physical exercise and mental health. These needs are comprehensive and involve a variety of health issues. Users are mainly concerned with physical health issues. In terms of medical systems, the number of Western medicine posts is relatively larger, whereas traditional Chinese medicine appears more in posts on coping with aging and physical exercise. The health information needs of SOCs users are stable over time. Both the medical systems and topics could have an impact on the popularity of health information, but the number of posts is inconsistent with the level of popularity.
Originality/value
This study combines multiple perspectives to identify the health information needs of seniors in China with a comprehensive overview.
Shana Wagger, Randi Park and Denise Ann Dowding Bedford
Abstract
Purpose
This paper aims to review key content, architecture, and metadata model decisions and strategies in the creation of a publication portal (on DVD to start), based on a 30+ year series of flagship reports from the World Bank.
Design/methodology/approach
The paper describes and analyzes key considerations and aspects of the project, including content architecture, content analysis, DTD selection, retrospective conversion, vendor management, design of metadata architectures, use of automated profiling methods, user‐information behavior, and search architectures supporting complex content architectures. It includes the challenges of applying an institutionally based taxonomy required to express subject‐matter responsibilities and relationships within the World Bank.
Findings
The team learned that the metadata behavior and architecture (inheritance, relationships, variations) are more complex than simple links between parent and child objects. The project also reinforced the importance of a comprehensive and dynamic topic taxonomy for classifying content that is both historical and current. The approach to defining classes for each full report (parent) is likely to change, given what has been learned. The team would recommend that parts be classified and the sum of the part classes be assigned to the whole report. As a result of this exploratory work, the Bank's approach to classification and indexing of report series is changing from top‐down to bottom‐up inheritance.
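The bottom‐up recommendation above can be pictured with a minimal Python sketch. It assumes that "the sum of the part classes" means the set union of the classes assigned to each part; the function name, the chapter identifiers and the topic labels are hypothetical and are not taken from the World Bank taxonomy.

```python
def classify_report(part_classes):
    """Derive report-level classes bottom-up as the union of its parts' classes.

    part_classes maps a part identifier (e.g. a chapter) to the classes
    assigned to that part. Interpreting "sum" as set union is an assumption.
    """
    report_classes = set()
    for classes in part_classes.values():
        report_classes |= set(classes)
    return sorted(report_classes)


# Hypothetical example: three chapters of one report.
chapters = {
    "chapter-1": ["poverty", "health"],
    "chapter-2": ["health", "trade"],
    "chapter-3": ["education"],
}
print(classify_report(chapters))
```

Under this reading the parent report never carries a class that none of its parts has, which is the main difference from a top‐down scheme in which the parts inherit whatever was assigned to the whole.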
Originality/value
The study provides insights into both general and World Bank‐specific challenges in creating a publication portal and derives some best practices for content architecture, metadata architecture, and use of automated profiling methods.
Ziqing Peng and Yan Wan
Abstract
Purpose
In this age of extremely well-developed social media, it is necessary to detect any change in the corporate image of an enterprise immediately so as to take quick action to avoid the wide spread of a negative image. However, existing survey-based corporate image evaluation methods are costly, slow and static, and the results may quickly become outdated. User comments, news reports and we-media articles on the internet offer varied channels for enterprises to obtain public evaluations and feedback. The purpose of this study is to use online information effectively to measure enterprises’ corporate images in a timely and accurate manner.
Design/methodology/approach
A new corporate image evaluation method was built by first using a literature review to establish a corporate image evaluation index system. Next, an automatic text analysis of online public information was performed through a dictionary-based topic classification and sentiment analysis algorithm. The accuracy of the topic classification and sentiment analysis algorithm was then calculated. Finally, three internet enterprises were chosen as cases, and their corporate images were evaluated.
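To make the dictionary-based step more concrete, here is a minimal Python sketch of topic classification followed by sentiment scoring. The topic dictionary, the sentiment word lists, the function names and the scoring rule (positive hits minus negative hits) are all illustrative assumptions, not the index system or the algorithm used in the study.

```python
import re

# Hypothetical dictionaries; the study's own index system is not reproduced.
TOPIC_TERMS = {
    "product": {"app", "feature", "quality"},
    "service": {"support", "delivery", "refund"},
}
POSITIVE = {"good", "great", "fast", "reliable"}
NEGATIVE = {"bad", "slow", "broken", "rude"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def classify_topic(tokens):
    """Return the topic whose dictionary matches most tokens, or None."""
    counts = {
        topic: sum(tok in terms for tok in tokens)
        for topic, terms in TOPIC_TERMS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def sentiment_score(tokens):
    """Positive dictionary hits minus negative dictionary hits."""
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return pos - neg


def evaluate(comments):
    """Accumulate sentiment per topic over a list of comments."""
    totals = {}
    for text in comments:
        tokens = tokenize(text)
        topic = classify_topic(tokens)
        if topic is None:
            continue  # comment matches no topic dictionary
        totals[topic] = totals.get(topic, 0) + sentiment_score(tokens)
    return totals


comments = [
    "The app is great and reliable",
    "Support was slow and rude",
    "Refund was fast",
]
print(evaluate(comments))
```

A dictionary approach like this needs no labelled training data, but its accuracy depends entirely on dictionary coverage, which is presumably why the study measures accuracy before applying the method to the three cases.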
Findings
The results show that the authors’ corporate image evaluation method is effective.
Originality/value
First, in this study, a new corporate image evaluation index system is constructed. Second, a new corporate image evaluation method based on text mining is proposed that can support data-driven decision-making for managers with real-time corporate image evaluation results. Finally, this study improves the understanding of corporate image by generating business intelligence through online information. The findings provide researchers with specific and detailed suggestions that focus on the corporate image management of emerging internet enterprises.
Abstract
Purpose
Smart manufacturing can lead to disruptive changes in production technologies and business models in the manufacturing industry. This paper aims to identify technological topics in smart manufacturing by using patent data, investigating technological trends and exploring potential opportunities.
Design/methodology/approach
The latent Dirichlet allocation (LDA) topic modeling technique was used to extract latent technological topics, and the generalized linear mixed model (GLMM) was used to analyze the relative emergence levels of the topics. Topic value and topic competitive analyses were developed to evaluate each topic's potential value and identify technological positions of competing firms, respectively.
Findings
A total of 14 topics were extracted from the collected patent data, and several fast-growing, high-value topics were identified, such as smart connection, cyber-physical systems (CPSs), manufacturing data analytics and powder bed fusion additive manufacturing. Several leading firms apply broad R&D emphasis across a variety of technological topics, while others focus on a few technological topics.
Practical implications
The developed methodology can help firms identify important technological topics in smart manufacturing for making their R&D investment decisions. Firms can select appropriate technology strategies depending on the topic's emergence position in the topic strategy matrix.
Originality/value
Previous research studies have not analyzed the maturity levels of technological topics. The topic-based patent analytics approach can complement previous studies. In addition, this study provides a multi-valuation framework for exploring technological opportunities, thus providing valuable information that supports a more robust understanding of the technology landscape of smart manufacturing.
Abstract
Purpose
This work aims to investigate the sensitivity of ranking performance with respect to the topic distribution of queries selected for ranking evaluation.
Design/methodology/approach
The authors reweight queries used in two TREC tasks to make them match three real background topic distributions, and show that the performance rankings of retrieval systems are quite different.
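The effect of such reweighting can be sketched in a few lines of Python. The sketch assumes that each query carries one topic label and that a system's overall score is the topic-weighted mean of its per-topic average scores; the function name, the topic labels, the per-query scores and the two target distributions are hypothetical and are not the paper's data or its exact weighting scheme.

```python
from collections import defaultdict


def topic_weighted_score(per_query_scores, query_topics, target_dist):
    """Mean score in which each topic contributes according to target_dist.

    per_query_scores: query id -> effectiveness score of one system
    query_topics:     query id -> topic label
    target_dist:      topic label -> desired probability mass
    """
    by_topic = defaultdict(list)
    for qid, score in per_query_scores.items():
        by_topic[query_topics[qid]].append(score)
    # Renormalise over topics that have at least one query.
    covered = {t: w for t, w in target_dist.items() if t in by_topic}
    total = sum(covered.values())
    if total == 0:
        raise ValueError("target distribution covers none of the query topics")
    return sum(
        (w / total) * (sum(by_topic[t]) / len(by_topic[t]))
        for t, w in covered.items()
    )


# Hypothetical toy data: four queries, two topics, two systems.
query_topics = {"q1": "health", "q2": "health", "q3": "sports", "q4": "sports"}
system_a = {"q1": 0.8, "q2": 0.6, "q3": 0.2, "q4": 0.4}
system_b = {"q1": 0.4, "q2": 0.4, "q3": 0.6, "q4": 0.6}

health_heavy = {"health": 0.9, "sports": 0.1}
sports_heavy = {"health": 0.1, "sports": 0.9}

for name, dist in [("health-heavy", health_heavy), ("sports-heavy", sports_heavy)]:
    a = topic_weighted_score(system_a, query_topics, dist)
    b = topic_weighted_score(system_b, query_topics, dist)
    print(name, "A" if a > b else "B", "ranks first")
```

In this toy setting the two systems swap places depending on which topic distribution is used, which is the kind of sensitivity the study investigates.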
Findings
Search engines tend to perform similarly on queries about the same topic, and search engine performance is sensitive to the topic distribution of queries used in evaluation.
Originality/value
Using experiments with multiple real‐world query logs, the paper demonstrates weaknesses in the current evaluation model of retrieval systems.