Search results
1 – 10 of 393
Jian Zhan, Xin Janet Ge, Shoudong Huang, Liang Zhao, Johnny Kwok Wai Wong and Sean XiangJian He
Abstract
Purpose
Automated technologies have been applied to facility management (FM) practices to address labour demands of, and time consumed by, inputting and processing manual data. Less attention has been focussed on automation of visual information, such as images, when improving timely maintenance decisions. This study aims to develop image classification algorithms to improve information flow in the inspection-repair process through building information modelling (BIM).
Design/methodology/approach
To improve and automate the inspection-repair process, image classification algorithms were used to connect images with a corresponding image database in a BIM knowledge repository. Quick response (QR) code decoding and Bag of Words were chosen to classify images in the system. Graphical user interfaces (GUIs) were developed to facilitate activity collaboration and communication. A pilot case study in an inspection-repair process was applied to demonstrate the applications of this system.
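The decision logic described above — try a QR decode first, and fall back to bag-of-words classification against the image database when no code is found — can be sketched in a few lines. Everything here (`classify_image`, the `bow_model` mapping of labels to visual-word profiles, the pre-computed `qr_payload`) is a hypothetical illustration under stated assumptions, not the authors' implementation; a real system would use an actual QR decoder and a trained visual vocabulary.

```python
from collections import Counter

def classify_image(visual_words, qr_payload, bow_model):
    """Return an asset/defect label for one inspection photo.

    visual_words: quantized visual-word IDs extracted from the image
                  (assumes feature extraction has already been done).
    qr_payload:   decoded QR string, or None if no QR code was found.
    bow_model:    {label: [visual words characteristic of that label]}.
    """
    if qr_payload:                       # QR code found: use it directly
        return qr_payload
    # Bag-of-words fallback: score each label by overlap with its profile
    counts = Counter(visual_words)
    def score(label):
        return sum(counts[w] for w in bow_model[label])
    return max(bow_model, key=score)
```

A usage example: with `bow_model = {"pipe_leak": ["w3", "w7"], "crack": ["w1", "w2"]}`, an image yielding words `["w1", "w2", "w2"]` and no QR code is routed to `"crack"`, while any image with a decoded QR payload is routed by the payload alone.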
Findings
The system developed in this study associates the inspection-repair process with a digital three-dimensional (3D) model, GUIs, a BIM knowledge repository and image classification algorithms. By implementing the proposed application in a case study, the authors found that improving the inspection-repair process and automating image classification with a BIM knowledge repository (such as the one developed in this study) can enhance FM practices by increasing productivity and reducing the time and costs associated with decision-making.
Originality/value
This study introduces an innovative approach that applies image classification and leverages a BIM knowledge repository to enhance the inspection-repair process in FM practice. The system designed provides automated classification of image data from a smartphone, eliminates the time required to input image data manually and improves communication and collaboration between FM personnel during maintenance decision-making.
Abstract
Purpose
The purpose of this paper is to address the problem of named entity disambiguation. The paper disambiguates named entities at a very fine-grained level: each entity is assigned the concrete identifier of the corresponding Wikipedia article describing it.
Design/methodology/approach
For such a fine‐grained disambiguation a correct representation of the context is crucial. The authors compare various context representations: bag of words representation, linguistic representation and structured co‐occurrence representation. Models for each representation are described and evaluated. They also investigate the possibilities of multilingual named entity disambiguation.
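As an illustration of the simplest of these representations, a bag-of-words context can be compared against candidate entity descriptions by cosine similarity over word counts, and the best-scoring Wikipedia article chosen. This is a generic sketch of the bag-of-words idea only — the function names and the toy candidates are invented, and the paper's linguistic and structured co-occurrence models are richer than this.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context_words, candidates):
    """Pick the Wikipedia article whose description words best match the context.

    candidates: {article_identifier: [words describing that entity]}
    """
    ctx = Counter(context_words)
    return max(candidates, key=lambda art: cosine(ctx, Counter(candidates[art])))
```

For example, the mention "Java" in a context about programming is resolved to the programming-language article rather than the island, because only the former shares context words with the surrounding text.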
Findings
Based on this evaluation, the structured co‐occurrence representation provides the best disambiguation results. The evaluation also showed that this method can be successfully applied to languages other than English.
Research limitations/implications
Despite its good results, the structured co‐occurrence context representation has several limitations. It trades precision for recall, which might not be desirable in some use cases. It is also unable to disambiguate two different entities that are mentioned under the same name in the same text. These limitations can be overcome by combining it with the other described methods.
Practical implications
The authors provide a ready‐made web service that can be directly plugged into existing applications using a REST interface.
Originality/value
The paper proposes a new approach to named entity disambiguation exploiting various context representation models (bag-of-words, linguistic and structured co‐occurrence representation). The authors constructed a comprehensive dataset based on all English Wikipedia articles for named entity disambiguation, and evaluated and compared the individual context representation models on this dataset. They also evaluated support for multiple languages.
Sayeh Bagherzadeh, Sajjad Shokouhyar, Hamed Jahani and Marianna Sigala
Abstract
Purpose
Research analyzing online travelers’ reviews has boomed over the past years, but it lacks efficient methodologies that can provide useful end-user value within time and budget. This study aims to contribute to the field by developing and testing a new methodology for sentiment analysis that surpasses the standard dictionary-based method by creating two hotel-specific word lexicons.
Design/methodology/approach
Big data of hotel customer reviews posted on the TripAdvisor platform were collected and appropriately prepared for conducting a binary sentiment analysis by developing a novel bag-of-words weighted approach. The latter provides a transparent and replicable procedure to prepare, create and assess lexicons for sentiment analysis. This approach resulted in two lexicons (a weighted lexicon, L1 and a manually selected lexicon, L2), which were tested and validated by applying classification accuracy metrics to the TripAdvisor big data. Two popular methodologies (a public dictionary-based method and a complex machine-learning algorithm) were used for comparing the accuracy metrics of the study’s approach for creating the two lexicons.
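A minimal sketch of the weighted-lexicon idea: a word's weight is its relative frequency in positive reviews minus its relative frequency in negative ones, and a review is scored by summing the weights of its words. The function names and the tiny training set below are hypothetical illustrations; the study's actual lexicon construction, cleaning and validation procedure is considerably more elaborate.

```python
from collections import Counter

def build_weighted_lexicon(labeled_reviews):
    """labeled_reviews: list of (tokens, label) with label +1 (positive) or -1 (negative).

    Weight(w) = rel. freq. of w in positive reviews - rel. freq. in negative reviews,
    so words typical of praise get positive weights and words typical of
    complaints get negative weights.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in labeled_reviews:
        if label > 0:
            pos.update(tokens); n_pos += len(tokens)
        else:
            neg.update(tokens); n_neg += len(tokens)
    vocab = set(pos) | set(neg)
    return {w: pos[w] / max(n_pos, 1) - neg[w] / max(n_neg, 1) for w in vocab}

def classify(tokens, lexicon):
    """Binary sentiment: sum the lexicon weights of the review's words."""
    score = sum(lexicon.get(w, 0.0) for w in tokens)
    return "positive" if score >= 0 else "negative"
```

On a toy corpus such as `[(["great", "clean", "room"], 1), (["dirty", "noisy", "room"], -1)]`, the neutral word "room" receives weight 0 while "great" and "dirty" receive positive and negative weights respectively, so unseen reviews containing those words are classified accordingly.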
Findings
The results of the accuracy metrics confirmed that the study’s methodology significantly outperforms the dictionary-based method in comparison to the machine-learning algorithm method. The findings also provide evidence that the study’s methodology is generalizable for predicting users’ sentiment.
Practical implications
The study developed and validated a methodology for generating reliable lexicons that can be used for big data analysis aiming to understand and predict customers’ sentiment. The L2 hotel dictionary generated by the study provides a reliable method and a useful tool for analyzing guests’ feedback and enabling managers to understand, anticipate and reactively respond to customers’ attitudes and changes. The study also proposed a simplified methodology for understanding the sentiment of each user, which, in turn, can be used for conducting comparisons aiming to detect and understand guests’ sentiment changes across time, as well as across users based on their profiles and experiences.
Originality/value
This study contributes to the field by proposing and testing a new methodology for conducting sentiment analysis that addresses previous methodological limitations, as well as the contextual specificities of the tourism industry. Based on the paper’s literature review, this is the first research study using a bag-of-words approach for conducting a sentiment analysis and creating a field-specific lexicon.
Elena Fedorova, Pavel Drogovoz, Anna Popova and Vladimir Shiboldenkov
Abstract
Purpose
The paper examines whether, along with the financial performance, the disclosure of research and development (R&D) expenses, patent portfolios, patent citations and innovation activities affect the market capitalization of Russian companies.
Design/methodology/approach
The paper opted for a set of techniques including bag-of-words (BoW) to retrieve additional innovation-related data from companies' annual reports, self-organizing maps (SOM) to perform visual exploratory analysis and panel data regression (PDR) to conduct confirmatory analysis using data on 74 Russian publicly traded companies for the period 2013–2019.
Findings
The paper observes that the disclosure of nonfinancial data on R&D, patents and primarily product and marketing innovations positively affects the market capitalization of the largest Russian companies, which are mainly focused on energy, raw materials and utilities and are operating on international markets. The study suggests that these companies are financially well-resourced enough to take on the risks of innovation and thus to provide positive signals to stakeholders and external agents.
Research limitations/implications
Our findings are important to management, investors, financial analysts, regulators and various agencies providing guidance on corporate governance and sustainability reporting. However, the authors acknowledge that the research results may lack generalizability due to the sample covering a single national context. Researchers are encouraged to test the proposed approach further on other countries' data by using the compiled lexicons.
Originality/value
The study aims to expand the domains of signaling theory and market valuation by providing new insights into the impact that companies' reporting on R&D, patents and innovation activities has on market capitalization. New nonfinancial factors that previous research does not investigate – innovation disclosure indicators (IDI) – are tested.
Elena Fedorova, Igor Demin and Elena Silina
Abstract
Purpose
The paper aims to estimate how corporate philanthropy expenditures and corporate philanthropy disclosure (in general and in different spheres) affect investment attractiveness of Russian companies.
Design/methodology/approach
To assess the degree of corporate philanthropy disclosure the authors compiled lexicons based on a set of techniques: text and frequency analysis, correlations, principal component analysis. To adjust the existing classifications of corporate philanthropic activities to the Russian market the authors employed expert analysis. The empirical research base includes 83 Russian publicly traded companies for the period 2013–2019. To estimate the impact of indicators of corporate philanthropy disclosure on company's investment attractiveness the authors utilized panel data regression and random forest algorithm.
Findings
The authors compiled two Russian lexicons: one on general issues of corporate philanthropy and another on philanthropic activities in various spheres (sports and healthcare; support for certain groups of people; social infrastructure; children protection and youth policy; culture, education and science). The paper observes that the disclosure of non-financial data, both on general issues of corporate philanthropy and on the different spheres, affects the market capitalization of the largest Russian companies. The results of the regression analysis suggest that disclosure of altruism-driven philanthropic activities (such as corporate philanthropy in the sphere of culture, education and science) has a lesser impact on a company's investment attractiveness than disclosure of activities driven by business-related motives (sports and healthcare, children protection and youth policy).
Research limitations/implications
Our findings are important to management, investors, financial analysts, regulators and various agencies providing guidance on corporate governance and sustainability reporting. However, the authors acknowledge that the research results may lack generalizability due to the sample covering a single national context. Researchers are encouraged to test the proposed approach further on other countries' data by using the authors’ compiled lexicons.
Originality/value
The study aims to expand the domains of signaling and agency theories. First, this subject has not been widely examined in emerging markets; the authors’ study is the first to focus on the Russian market. Second, in contrast to the majority of prior studies, the authors use text analysis to examine not only the impact of charitable donations but also the effect of corporate philanthropy disclosure. Third, the authors provide their own lexicon of corporate philanthropy disclosure based on machine learning techniques and expert analysis. Fourth, to estimate the impact of corporate philanthropy on a company's investment attractiveness, the authors use an original approach based on a combination of linear (regression) and non-linear (permutation importance) methods. The findings extend the theoretical concept of Peterson et al. (2021): corporate philanthropy is viewed as a company strategy to reinforce its reputation; it helps to establish more efficient relationships with stakeholders, which, in turn, increases business value.
Eugene Yujun Fu, Hong Va Leong, Grace Ngai and Stephen C.F. Chan
Abstract
Purpose
Social signal processing under affective computing aims at recognizing and extracting useful human social interaction patterns. Fight is a common social interaction in real life. A fight detection system finds wide applications. This paper aims to detect fights in a natural and low-cost manner.
Design/methodology/approach
Research works on fight detection are often based on visual features, demanding substantial computation and good video quality. In this paper, the authors propose an approach to detect fight events through motion analysis. Most existing works evaluated their algorithms on public data sets manifesting simulated fights, where the fights are acted out by actors. To evaluate real fights, the authors collected videos involving real fights to form a data set. Based on the two types of data sets, the authors evaluated the performance of their motion signal analysis algorithm, which was then compared with the state-of-the-art approach based on MoSIFT descriptors with a Bag-of-Words mechanism, and with basic motion signal analysis with Bag-of-Words.
Findings
The experimental results indicate that the proposed approach accurately detects fights in real scenarios and performs better than the MoSIFT approach.
Originality/value
By collecting and annotating real surveillance videos containing real fight events and augmenting with well-known data sets, the authors proposed, implemented and evaluated a low computation approach, comparing it with the state-of-the-art approach. The authors uncovered some fundamental differences between real and simulated fights and initiated a new study in discriminating real against simulated fight events, with very good performance.
Daejin Kim, Hyoung-Goo Kang, Kyounghun Bae and Seongmin Jeon
Abstract
Purpose
To overcome the shortcomings of traditional industry classification systems such as the Standard Industrial Classification (SIC), the North American Industry Classification System (NAICS) and the Global Industry Classification Standard (GICS), the authors explore industry classifications using machine learning methods as an application of interpretable artificial intelligence (AI).
Design/methodology/approach
The authors propose a text-based industry classification combined with a machine learning technique by extracting distinguishable features from business descriptions in financial reports. The proposed method can reduce the dimensions of word vectors to avoid the curse of dimensionality when measuring the similarities of firms.
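The core step — representing business descriptions as word vectors of reduced dimension and measuring firm similarity — might be sketched as follows. Restricting the vocabulary to the k most frequent terms is only a crude stand-in for the paper's actual dimensionality-reduction technique, and all names here are illustrative.

```python
from collections import Counter
import math

def top_k_vocab(descriptions, k):
    """Keep the k terms appearing in the most business descriptions -
    a crude way to cap vector dimensionality."""
    df = Counter()
    for d in descriptions:
        df.update(set(d))          # document frequency, not raw counts
    return [w for w, _ in df.most_common(k)]

def similarity(d1, d2, vocab):
    """Cosine similarity of two firms' term-count vectors over a fixed vocab."""
    c1, c2 = Counter(d1), Counter(d2)
    v1 = [c1[w] for w in vocab]
    v2 = [c2[w] for w in vocab]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

With tokenized descriptions for an oil driller, an oil refiner and a cloud-software firm, the two energy firms come out far more similar to each other than to the software firm, which is the intuition behind forming industry clusters from such similarities.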
Findings
Using the proposed method, the sample firms form clusters of distinctive industries, thus overcoming the limitations of existing classifications. The method also clarifies industry boundaries based on lower-dimensional information. The graphical closeness between industries can reflect the industry-level relationship as well as the closeness between individual firms.
Originality/value
The authors’ work contributes to the industry classification literature by empirically investigating the effectiveness of machine learning methods. The text mining method resolves issues concerning the timeliness of traditional industry classifications by capturing new information in annual reports. In addition, the authors’ approach can solve the computing concerns of high dimensionality.
Sixing Liu, Yan Chai, Rui Yuan and Hong Miao
Abstract
Purpose
Simultaneous localization and map building (SLAM), as a state estimation problem, is a prerequisite for solving the problem of autonomous vehicle motion in unknown environments. Existing algorithms are based on laser or visual odometry; however, the lidar sensing range is small, the number of data features is small, the camera is vulnerable to external conditions, and localization and map building cannot be performed stably and accurately using a single sensor. This paper aims to propose a tightly coupled three-dimensional (3D) laser map-building method that incorporates visual information, using laser point cloud information and image information to complement each other and improve the overall performance of the algorithm.
Design/methodology/approach
The visual feature points are first matched at the front end of the method, and the mismatched point pairs are removed using the bidirectional random sample consensus (RANSAC) algorithm. The laser point cloud is then used to obtain their depth information, while the two types of feature points are fed into the pose estimation module for a tightly coupled local bundle adjustment solution using a heuristic simulated annealing algorithm. Finally, the visual bag-of-words model is fused with the laser point cloud information to establish a threshold and construct a loop-closure framework that further reduces the cumulative drift error of the system over time.
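One simple way to see the idea behind checking matches in both directions is a mutual-nearest-neighbour filter: a pair of feature descriptors is kept only if each is the other's nearest neighbour. This is a deliberately simplified stand-in for illustration, not the paper's bidirectional RANSAC, and the distance function and 1-D descriptors used below are purely hypothetical.

```python
def mutual_matches(desc_a, desc_b, dist):
    """Keep only pairs (i, j) where desc_b[j] is desc_a[i]'s nearest
    neighbour in B AND desc_a[i] is desc_b[j]'s nearest neighbour in A -
    a simple bidirectional consistency check on feature matches."""
    def nearest(q, pool):
        return min(range(len(pool)), key=lambda k: dist(q, pool[k]))
    kept = []
    for i, a in enumerate(desc_a):
        j = nearest(a, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            kept.append((i, j))
    return kept
```

A match that is only one-directional (a descriptor whose nearest neighbour already "belongs" to a closer descriptor on the other side) is discarded, which is the same intuition that motivates filtering mismatched pairs before pose estimation.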
Findings
Experiments on publicly available data sets show that the proposed method can match the real trajectory well. For various scenes, the map can be constructed using the complementary laser and vision sensors, with high accuracy and robustness. At the same time, the method is verified in a real environment using an autonomous walking acquisition platform: the system loaded with the method can run well for a long time and adapts to a variety of scene environments.
Originality/value
A multi-sensor tight-coupling method is proposed to fuse laser and vision information for an optimal pose solution. A bidirectional RANSAC algorithm is used to remove visually mismatched point pairs. Further, oriented FAST and rotated BRIEF (ORB) feature points are used to build a bag-of-words model and construct a real-time loop-closure framework to reduce error accumulation. According to the experimental validation results, the accuracy and robustness of single-sensor SLAM algorithms can be improved.
Panagiotis Stamolampros and Nikolaos Korfiatis
Abstract
Purpose
Although the literature has established the effect of online reviews on customer purchase intentions, the influence of psychological factors on online ratings is overlooked. This paper aims to examine these factors under the perspective of construal level theory (CLT).
Design/methodology/approach
Using review data from TripAdvisor and Booking.com, the authors study three dimensions of psychological distance (temporal, spatial and social) and their direct and interaction effects on review valence, using regression analysis. The authors also examine the effect of these distances on the information content of online reviews, using a novel bag-of-words model to assess review concreteness.
Findings
Temporal distance and spatial distance have positive direct effects on review valence. Social distance, on the other hand, has a negative direct effect. However, its interaction with the other two distances has a positive effect, suggesting that consumers tend to “zoom-out” to less concrete things in their ratings.
Practical implications
The findings provide implications for the interpretation of review ratings by the service providers and their information content.
Originality/value
This study extends the CLT and electronic word-of-mouth literature by jointly exploring the effect of all three psychological distances that are applicable in post-purchase evaluations. Methodologically, it provides a novel application of the bag-of-words model in evaluating the concreteness of online reviews.
Jinwook Choi, Yongmoo Suh and Namchul Jung
Abstract
Purpose
The purpose of this study is to investigate the effectiveness of qualitative information extracted from a firm’s annual report in predicting its corporate credit rating. Qualitative information, represented by published reports or management interviews, has been known as an important source, in addition to quantitative information represented by financial values, in assigning corporate credit ratings in practice. Nevertheless, prior studies leave room for further research in that they rarely employed qualitative information in developing prediction models of corporate credit rating.
Design/methodology/approach
This study adopted three document vectorization methods, Bag-of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform unstructured textual data into numeric vectors that machine learning (ML) algorithms can accept as input. For the experiments, the authors used the corpus of the Management’s Discussion and Analysis (MD&A) section in 10-K financial reports, as well as financial variables and corporate credit rating data.
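Of the three vectorization methods, BOW is the easiest to sketch: each document becomes a fixed-length vector of term counts over a shared vocabulary, which any ML classifier can consume. The helper below is a generic stdlib illustration, not the study's pipeline, and real MD&A processing would add tokenization, stop-word removal and vocabulary pruning.

```python
from collections import Counter

def bow_vectorize(documents):
    """Turn raw texts into fixed-length count vectors over a shared vocabulary.

    Returns (vocab, vectors): vocab is the sorted list of distinct terms,
    and vectors[i][j] counts how often vocab[j] occurs in documents[i].
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted(set(w for toks in tokenized for w in toks))
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        v = [0] * len(vocab)
        for w, c in Counter(toks).items():
            v[index[w]] = c
        vectors.append(v)
    return vocab, vectors
```

For instance, vectorizing the two toy filings "revenue grew strongly" and "revenue fell" yields the vocabulary ["fell", "grew", "revenue", "strongly"] and the vectors [0, 1, 1, 1] and [1, 0, 1, 0], which could then be concatenated with financial variables as model input.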
Findings
Experimental results from a series of multi-class classification experiments show the predictive models trained by both financial variables and vectors extracted from MD&A data outperform the benchmark models trained only by traditional financial variables.
Originality/value
This study proposed a new approach for corporate credit rating prediction by using qualitative information extracted from MD&A documents as an input to ML-based prediction models. Also, this research adopted and compared three textual vectorization methods in the domain of corporate credit rating prediction and showed that BOW mostly outperformed Word2Vec and Doc2Vec.