Abstract
Purpose
This paper proposes a multi-facet sentiment analysis system.
Design/methodology/approach
This paper uses multidomain resources to build a sentiment analysis system. Manual lexicon-based features extracted from the resources are fed into machine learning classifiers, whose performance is then compared. Because the manual lexicon is time-consuming to construct, it is replaced with a custom bag of words (BOW). To make the system run faster and the model more interpretable, the feature set is reduced using existing and custom approaches such as term occurrence, information gain, principal component analysis, semantic clustering and POS tagging filters.
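Of the feature reduction approaches listed above, information gain is the easiest to illustrate. The following is a minimal sketch, not the authors' implementation: it assumes a binary bag of words (each document is a set of tokens) and a hypothetical four-document toy corpus, and ranks terms by how much knowing their presence reduces label entropy.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on the presence of `term`."""
    with_term = [y for doc, y in zip(docs, labels) if term in doc]
    without_term = [y for doc, y in zip(docs, labels) if term not in doc]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (with_term, without_term) if part
    )
    return entropy(labels) - remainder

# Hypothetical toy corpus: each document is a set of tokens (binary bag of words).
docs = [{"great", "phone"}, {"great", "battery"}, {"poor", "battery"}, {"poor", "screen"}]
labels = ["pos", "pos", "neg", "neg"]

# Rank the vocabulary from most to least informative term.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In this toy corpus "great" and "poor" separate the classes perfectly, while "battery" appears equally in both classes and carries no information, so a filter that keeps only the top-ranked terms would drop it.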
Findings
The proposed system, which features automated lexicon extraction and feature-size optimization, proved efficient on multidomain and benchmark datasets, reaching 93.59% accuracy, which makes it competitive with state-of-the-art systems.
Originality/value
The paper constructs a custom BOW and optimizes features using existing and custom feature selection and clustering approaches.
Omar Alqaryouti, Nur Siyam, Azza Abdel Monem and Khaled Shaalan
Abstract
Digital resources such as smart application reviews and online feedback are important sources of customers’ feedback and input. This paper aims to help government entities gain insights into the needs and expectations of their customers. Towards this end, we propose a hybrid aspect-based sentiment analysis approach that integrates domain lexicons and rules to analyse reviews of the entities’ smart apps. The proposed model aims to extract the important aspects from the reviews and classify the corresponding sentiments. The approach adopts language processing techniques, rules and lexicons to address several sentiment analysis challenges and to produce summarized results. According to the reported results, aspect extraction accuracy improves significantly when implicit aspects are considered. In addition, the integrated classification model outperforms the lexicon-based baseline and the other rule combinations by 5% in accuracy on average. On the same dataset, the proposed approach also outperforms machine learning approaches that use a support vector machine (SVM). However, using these lexicons and rules as input features to the SVM model achieves higher accuracy than the other SVM models.
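The combination of lexicons and rules can be pictured with a small sketch. Everything here is an assumption made for illustration: the aspect lexicon, the opinion lexicon, the negation list and the single negation rule are hypothetical and far simpler than the domain lexicons and rule set the paper describes.

```python
# Hypothetical mini-lexicons; a real system would use curated domain lexicons.
ASPECT_TERMS = {"login": "authentication", "password": "authentication",
                "payment": "payment", "interface": "usability"}
OPINION_WORDS = {"easy": 1, "fast": 1, "great": 1,
                 "slow": -1, "broken": -1, "confusing": -1}
NEGATIONS = {"not", "never", "no"}

def aspect_sentiments(review):
    """Assign each aspect the summed polarity of the opinion words in its sentence."""
    scores = {}
    for sentence in review.lower().split("."):
        tokens = sentence.replace(",", " ").split()
        aspects = {ASPECT_TERMS[t] for t in tokens if t in ASPECT_TERMS}
        polarity = 0
        for i, token in enumerate(tokens):
            if token in OPINION_WORDS:
                # Rule: a negation directly before an opinion word flips its polarity.
                sign = -1 if i > 0 and tokens[i - 1] in NEGATIONS else 1
                polarity += sign * OPINION_WORDS[token]
        for aspect in aspects:
            scores[aspect] = scores.get(aspect, 0) + polarity
    return {aspect: "positive" if s > 0 else "negative" if s < 0 else "neutral"
            for aspect, s in scores.items()}

result = aspect_sentiments("Login is not easy. The payment was fast and great.")
```

For the sample review, the negation rule turns "not easy" into a negative opinion on the authentication aspect, while the payment aspect is classified as positive.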
Abhishek Das and Mihir Narayan Mohanty
Abstract
Purpose
Timely and accurate detection of cancer can save the life of the person affected. According to the World Health Organization (WHO), breast cancer has the highest incidence among all cancers and ranks fifth in mortality. Among the many image processing techniques, some works have focused on convolutional neural networks (CNNs) for processing these images. However, deep learning models remain to be explored further.
Design/methodology/approach
In this work, multivariate statistics-based kernel principal component analysis (KPCA) is used to extract essential features; KPCA also helps denoise the data. These features are processed by a heterogeneous ensemble model consisting of three base models: a recurrent neural network (RNN), a long short-term memory (LSTM) network and a gated recurrent unit (GRU). The outputs of these base learners are fed to a fuzzy adaptive resonance theory mapping (ARTMAP) model for decision making. Nodes are added to the F_2^a layer if the winning criteria are fulfilled, which makes the ARTMAP model more robust.
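Only the KPCA step is sketched here; the recurrent base learners and the fuzzy ARTMAP stage are out of scope. The sketch assumes an RBF kernel with a hypothetical `gamma`, a four-point toy data set, and power iteration to obtain the leading component. None of these choices is stated in the abstract.

```python
import math

def rbf_kernel_matrix(points, gamma=1.0):
    """Pairwise RBF kernel values exp(-gamma * squared distance)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]

def center_kernel(K):
    """Center the kernel matrix in feature space (double centering)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    # K is symmetric, so its column means equal its row means.
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_component(Kc, iterations=200):
    """Power iteration for the dominant eigenpair of the centered kernel matrix."""
    n = len(Kc)
    v = [1.0 / (i + 1) for i in range(n)]  # non-uniform starting vector
    for _ in range(iterations):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v

# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0), (2.1, 2.0)]
Kc = center_kernel(rbf_kernel_matrix(points, gamma=0.5))
eigenvalue, v = leading_component(Kc)

# Projection of each training point onto the first kernel principal component.
scores = [math.sqrt(eigenvalue) * x for x in v]
```

On this toy data the first component gives the two pairs scores of opposite sign, which is the kind of compact feature a downstream classifier could consume.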
Findings
The proposed model is verified using a breast histopathology image dataset publicly available on Kaggle. The model achieves 99.36% training accuracy and 98.72% validation accuracy. It applies data processing at every stage: image denoising to reduce data redundancy, and ensemble learning to obtain better results than single models. The final classification is performed by a fuzzy ARTMAP model that controls the number of nodes depending on performance, which makes the classification robust and accurate.
Research limitations/implications
Research in the field of medical applications is an ongoing process, and more advanced algorithms are being developed for better classification. There is still scope to design models with better performance, practicability and cost efficiency. The ensemble may also be built from different combinations of base models with different characteristics, and the proposed model may be verified on signals instead of images. Experimental analysis shows the improved performance of the proposed model, but the method still needs to be verified in practical settings. A practical implementation will be carried out to assess its real-time performance and cost efficiency.
Originality/value
The proposed model uses KPCA for denoising and for reducing data redundancy, so that feature selection is performed by KPCA. Training and classification are performed using a heterogeneous ensemble model with RNN, LSTM and GRU as base classifiers, which gives better results than single models. The adaptive fuzzy mapping model makes the final classification accurate. This work analyzes the effectiveness of combining these methods into a single model.
Marco D’Orazio, Gabriele Bernardini and Elisa Di Giuseppe
Abstract
Purpose
This paper aims to develop predictive methods, based on recurrent neural networks, useful to support facility managers in building maintenance tasks, by collecting information coming from a computerized maintenance management system (CMMS).
Design/methodology/approach
This study applies data-driven and text-mining approaches to a CMMS data set comprising more than 14,500 end-users’ requests for corrective maintenance actions, collected over 14 months. Unidirectional long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM) recurrent neural networks are trained to predict the priority of each maintenance request and the related technical staff assignment. The data set is also used to depict an overview of corrective maintenance needs and related performances and to verify the most relevant elements in the building and how the current facility management (FM) relates to the requests.
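The difference between a unidirectional LSTM and a Bi-LSTM can be seen in a bare forward pass. This is a sketch under stated assumptions, not the study's model: the weights come from a hypothetical `constant_params` helper instead of training, the three input vectors stand in for word embeddings of a maintenance request, and the classification layer that would map the features to a priority or a staff category is omitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; W, U, b hold parameters for gates 'i', 'f', 'o' and candidate 'g'."""
    def pre(gate, k):
        return (sum(W[gate][k][d] * x[d] for d in range(len(x)))
                + sum(U[gate][k][m] * h[m] for m in range(len(h)))
                + b[gate][k])
    new_h, new_c = [], []
    for k in range(len(h)):
        i = sigmoid(pre("i", k))    # input gate
        f = sigmoid(pre("f", k))    # forget gate
        o = sigmoid(pre("o", k))    # output gate
        g = math.tanh(pre("g", k))  # candidate cell value
        c_k = f * c[k] + i * g
        new_c.append(c_k)
        new_h.append(o * math.tanh(c_k))
    return new_h, new_c

def run_lstm(sequence, params, hidden_size):
    """Read the sequence left to right and return the final hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, *params)
    return h

def run_bilstm(sequence, fwd_params, bwd_params, hidden_size):
    """Concatenate the final states of a forward pass and a backward pass."""
    return (run_lstm(sequence, fwd_params, hidden_size)
            + run_lstm(list(reversed(sequence)), bwd_params, hidden_size))

def constant_params(input_size, hidden_size, value):
    """Hypothetical helper: fill every weight and bias with the same value."""
    W = {g: [[value] * input_size for _ in range(hidden_size)] for g in "ifog"}
    U = {g: [[value] * hidden_size for _ in range(hidden_size)] for g in "ifog"}
    b = {g: [value] * hidden_size for g in "ifog"}
    return W, U, b

# Hypothetical 2-dimensional "embeddings" for a three-word request.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
params = constant_params(input_size=2, hidden_size=3, value=0.1)
features = run_bilstm(sequence, params, params, hidden_size=3)
```

The Bi-LSTM output is twice as wide as the unidirectional one because it keeps one summary of the request read left to right and one read right to left.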
Findings
The study shows that LSTM and Bi-LSTM recurrent neural networks can properly recognize the words contained in the requests, thus correctly and automatically assigning the priority and predicting the technical staff to assign for each end-user’s maintenance request. The obtained global accuracy is very high, reaching 93.3% for priority identification and 96.7% for technical staff assignment. Results also show the main critical building elements for maintenance requests and the related intervention timings.
Research limitations/implications
This work shows that LSTM and Bi-LSTM recurrent neural networks can automate the assignment process of end-users’ maintenance requests if trained with historical CMMS data. Results are promising; however, the trained LSTM and Bi-LSTM networks can be applied to other hospitals only if they adopt a similar categorization.
Practical implications
The data-driven and text-mining approaches can be integrated into the CMMS to support corrective maintenance management by facilities management contractors, i.e. to properly and timely identify the actions to be carried out and the technical staff to assign.
Social implications
The improvement of the maintenance of the health-care system is a key component of improving health service delivery. This work shows how to reduce health-care service interruptions due to maintenance needs through machine learning methods.
Originality/value
This study develops original methods and tools easily integrable into IT workflow systems (i.e. CMMS) in the FM field.
Xuan Ji, Jiachen Wang and Zhijun Yan
Abstract
Purpose
Stock price prediction is a hot topic, and traditional prediction methods are usually based on statistical and econometric models. However, these models have difficulty dealing with nonstationary time series data. With the rapid development of the internet and the increasing popularity of social media, online news and comments often reflect investors’ emotions and attitudes toward stocks, and thus contain a lot of important information for predicting stock prices. This paper aims to develop a stock price prediction method that takes full advantage of social media data.
Design/methodology/approach
This study proposes a new prediction method based on deep learning technology, which integrates traditional stock financial index variables and social media text features as inputs of the prediction model. This study uses Doc2Vec to build long text feature vectors from social media and then reduces the dimensions of the text feature vectors with a stacked auto-encoder to balance the dimensions between text feature variables and stock financial index variables. Meanwhile, the stock price time series is decomposed with a wavelet transform to eliminate the random noise caused by stock market fluctuation. Finally, this study uses a long short-term memory model to predict the stock price.
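The wavelet denoising step can be sketched with a single-level Haar transform. The abstract does not say which wavelet family, decomposition depth or thresholding rule was used, so the Haar wavelet, the one-level decomposition, the soft threshold and the price values below are all assumptions chosen to keep the example short.

```python
import math

def haar_decompose(signal):
    """One-level Haar transform of an even-length signal: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

def denoise(signal, threshold):
    """Soft-threshold the detail coefficients, then reconstruct the signal."""
    approx, detail = haar_decompose(signal)
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    return haar_reconstruct(approx, shrunk)

# Hypothetical closing prices (even number of samples).
prices = [10.0, 10.4, 10.2, 10.8, 11.0, 10.6, 11.2, 11.6]
smoothed = denoise(prices, threshold=0.2)
```

Detail coefficients capture the difference between neighbouring samples, so shrinking them toward zero suppresses short-lived fluctuations while the approximation coefficients keep the trend. With a very large threshold every pair of neighbours collapses to its mean.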
Findings
The experimental results show that the method performs better than all three benchmark models on all evaluation indicators and can effectively predict stock prices.
Originality/value
This study proposes a new stock price prediction model that incorporates traditional financial features and social media text features derived using deep learning technology.
Karlo Puh and Marina Bagić Babac
Abstract
Purpose
As the tourism industry becomes more vital for the success of many economies around the world, the importance of technology in tourism grows daily. Alongside the increasing importance and popularity of tourism, the amount of relevant data grows, too. On a daily basis, millions of people write their opinions, suggestions and views about accommodation, services and much more on various websites. Well-processed and filtered data can provide a lot of useful information that can make tourists' experiences much better and help travelers decide when selecting a hotel or a restaurant. Thus, the purpose of this study is to explore machine and deep learning models for predicting sentiment and rating from tourist reviews.
Design/methodology/approach
This paper used machine learning models such as Naïve Bayes, support vector machines (SVM), convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) for extracting sentiment and ratings from tourist reviews. These models were trained to classify reviews into positive, negative, or neutral sentiment, and into one to five grades or stars. Data used for training the models were gathered from TripAdvisor, the world's largest travel platform. The models based on multinomial Naïve Bayes (MNB) and SVM were trained using the term frequency-inverse document frequency (TF-IDF) for word representations while deep learning models were trained using global vectors (GloVe) for word representation. The results from testing these models are presented, compared and discussed.
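The TF-IDF representation used for the MNB and SVM models can be computed by hand on a toy corpus. This sketch uses one common unsmoothed variant (relative term frequency multiplied by the natural log of the inverse document frequency); library implementations usually add smoothing and normalization, and the abstract does not state which variant was used. The three reviews are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical reviews.
reviews = ["great hotel great staff", "terrible hotel", "great location"]
vectors = tf_idf(reviews)
```

In the second review "terrible" receives a higher weight than "hotel", because "hotel" also occurs in another review and is therefore less distinctive. A term that occurs in every document gets weight zero under this variant.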
Findings
The machine learning and deep learning models achieved high accuracy in predicting positive, negative or neutral sentiments and ratings from tourist reviews. The optimal model architecture for both classification tasks was a deep learning model based on BiLSTM. The study’s results confirmed that deep learning models are more efficient and accurate than machine learning algorithms.
Practical implications
The proposed models allow for forecasting the number of tourist arrivals and expenditure, gaining insights into the tourists' profiles, improving overall customer experience, and upgrading marketing strategies. Different service sectors can use the implemented models to get insights into customer satisfaction with the products and services as well as to predict the opinions given a particular context.
Originality/value
This study developed and compared different machine learning models for classifying customer reviews as positive, negative, or neutral, as well as predicting ratings with one to five stars based on a TripAdvisor hotel reviews dataset that contains 20,491 unique hotel reviews.
Vicente Ramos, Woraphon Yamaka, Bartomeu Alorda and Songsak Sriboonchitta
Abstract
Purpose
This paper aims to illustrate the potential of high-frequency data for tourism and hospitality analysis, through two research objectives. First, this study describes and tests a novel high-frequency forecasting methodology applied to big data characterized by fine-grained time and spatial resolution. Second, this paper elaborates on the usefulness of those estimates for visitors and for public and private tourism stakeholders, whose decisions increasingly focus on short time horizons.
Design/methodology/approach
This study uses the technical communications between mobile devices and WiFi networks to build a high-frequency, precisely geolocated big data set. The empirical section compares the forecasting accuracy of several artificial intelligence and time series models.
Findings
The results robustly indicate the superiority of the long short-term memory network model, both for in-sample and out-of-sample forecasting. Hence, the proposed methodology provides estimates that are remarkably better than basing short-term decisions on the current number of residents and visitors (Naïve I model).
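The Naïve I benchmark mentioned above simply carries the last observed value forward. The sketch below shows how such a baseline can be scored with mean absolute error and compared with another model; the visitor counts and the "model" predictions are hypothetical numbers made up for the example, not results from the paper.

```python
def naive_forecast(series):
    """Naive I: the forecast for step t is the value observed at step t - 1."""
    return series[:-1]

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical device counts per time slot.
counts = [120, 135, 150, 160, 155, 140]
actual = counts[1:]

naive_mae = mean_absolute_error(actual, naive_forecast(counts))

# Hypothetical predictions from some other forecasting model for the same slots.
model_predictions = [130, 148, 158, 157, 145]
model_mae = mean_absolute_error(actual, model_predictions)

# A ratio below 1 means the model beats the Naive I baseline.
relative_error = model_mae / naive_mae
```

Reporting the error relative to the naive baseline makes it clear how much a forecasting model adds over acting on the current count alone.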
Practical implications
A discussion section exemplifies how high-frequency forecasts can be incorporated into tourism information and management tools to improve visitors’ experience and tourism stakeholders’ decision-making. Particularly, the paper details its applicability to managing overtourism and Covid-19 mitigating measures.
Originality/value
High-frequency forecasting is new in tourism studies, and the discussion sheds light on the relevance of this time horizon for dealing with some current tourism challenges. For many tourism-related issues, what to do next is no longer what to do tomorrow or next week.
Plain Language Summary
This research initiates high-frequency forecasting in tourism and hospitality studies. Additionally, we detail several examples of how anticipating urban crowdedness requires high-frequency data and can improve visitors’ experience and public and private decision-making.
Abstract
Purpose
This study explores whether a new machine learning method can more accurately predict the movement of stock prices.
Design/methodology/approach
This study presents a novel hybrid deep learning model, Residual-CNN-Seq2Seq (RCSNet), to predict the trend of stock price movement. RCSNet integrates the autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN) and the sequence-to-sequence (Seq2Seq) long short-term memory (LSTM) model.
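A common way to combine a linear time series model with a non-linear one is to fit the linear model first and hand its residuals to the non-linear model. The sketch below illustrates only that first stage, and it makes two assumptions: a least-squares AR(1) fit stands in for a full ARIMA model, and the residual-based split is taken from the model's name, not from a description of RCSNet's internals. The series is a hypothetical toy example.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    b = covariance / variance
    a = mean_y - b * mean_x
    return a, b

def linear_residuals(series, a, b):
    """What the linear model leaves unexplained at each step t >= 1."""
    return [series[t] - (a + b * series[t - 1]) for t in range(1, len(series))]

# Hypothetical toy series with a small non-linear wobble on top of an AR(1) trend.
series = [10.0, 10.6, 10.7, 11.3, 11.4, 12.0, 12.1, 12.7]
a, b = fit_ar1(series)
residuals = linear_residuals(series, a, b)
```

In a hybrid of this kind, `residuals` would become the training target of the non-linear component, and the final forecast would add the two predictions together.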
Findings
The hybrid model is able to forecast both the linear and the non-linear time series components of a stock dataset. CNN and Seq2Seq LSTMs can be effectively combined for dynamic modeling of short- and long-term dependent patterns in non-linear time series forecasting. Experimental results show that the proposed model outperforms baseline models on the S&P 500 index stock dataset from January 2000 to August 2016.
Originality/value
This study develops the RCSNet hybrid model to tackle the challenge by combining both linear and non-linear models. New evidence has been obtained in predicting the movement of stock market prices.
Weiwei Zhu, Jinglin Wu, Ting Fu, Junhua Wang, Jie Zhang and Qiangqiang Shangguan
Abstract
Purpose
Efficient traffic incident management is needed to alleviate the negative impact of traffic incidents. Accurate and reliable estimation of traffic incident duration is of great importance for traffic incident management. Previous studies have proposed models for traffic incident duration prediction; however, most of these studies focus on the total duration and cannot update prediction results in real time. From a traveler’s perspective, the relevant factor is the residual duration of the impact of the traffic incident. Besides, few (if any) studies have used dynamic traffic flow parameters in the prediction models. This paper aims to propose a framework to fill these gaps.
Design/methodology/approach
This paper proposes a framework based on the multi-layer perceptron (MLP) and long short-term memory (LSTM) models. The proposed methodology integrates traffic incident-related factors and real-time traffic flow parameters to predict the residual traffic incident duration. To validate the effectiveness of the framework, traffic incident data and traffic flow data from Shanghai Zhonghuan Expressway are used for model training and testing.
Findings
Results show that the model with a 30-min time window that takes both traffic volume and speed as inputs performed best. The area under the curve values exceed 0.85 and the prediction accuracies exceed 0.75. These indicators demonstrate that the model is appropriate for this study context. The model provides new insights into traffic incident duration prediction.
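The area under the ROC curve reported above has a simple interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition. The binary labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = incident still ongoing after the window) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

Here three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75. A value of 0.5 corresponds to random ranking and 1.0 to a perfect one.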
Research limitations/implications
The number of incident samples used in this study may be insufficient, and the set of variables is limited. The number of injuries and casualties, a more detailed description of the incident location and other variables are expected to be used to characterize traffic incidents comprehensively. The framework needs to be further validated with a sufficiently large number of variables and locations.
Practical implications
Once implemented in intelligent transport systems and traffic management systems, the framework can help reduce the impact of incidents on the safety and efficiency of road traffic.
Originality/value
This study uses two artificial neural network methods, MLP and LSTM, to establish a framework aiming at providing accurate and time-efficient information on traffic incident duration in the future for transportation operators and travelers. This study will contribute to the deployment of emergency management and urban traffic navigation planning.