Search results

1 – 10 of 98
Article
Publication date: 29 October 2018

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

To be sustainable and competitive in the current business environment, it is useful to understand users’ sentiment towards products and services. This critical task can be achieved via natural language processing and machine learning classifiers. This paper aims to propose a novel probabilistic committee selection classifier (PCC) to analyse and classify the sentiment polarities of movie reviews.

Design/methodology/approach

An Indian movie review corpus is assembled for this study. A second, publicly available movie review polarity corpus is also used to validate the results. The greedy stepwise search method is used to extract the features/words of the reviews. The performance of the proposed classifier is measured using different metrics, such as F-measure, false positive rate, receiver operating characteristic (ROC) curve and training time. Further, the proposed classifier is compared with other popular machine-learning classifiers, such as Bayesian, Naïve Bayes, Decision Tree (J48), Support Vector Machine and Random Forest.
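
As an illustration of two of the metrics named above, the following minimal sketch computes the F-measure and the false positive rate from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, recall, F-measure and false positive rate
    from the four confusion-matrix counts of a binary classifier."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # False positive rate: share of true negatives wrongly flagged positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_positive_rate": fpr,
    }


# Hypothetical counts for a positive/negative review classifier.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

A low false positive rate here means that few negative reviews are labelled as positive.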

Findings

The results of this study show that the proposed classifier is good at predicting the positive or negative polarity of movie reviews. The accuracy and the ROC curve value of the PCC are found to be the best of all the classifiers tested in this study. This classifier is also found to be efficient at identifying positive sentiments of reviews, where it gives low false positive rates for both the Indian Movie Review and Review Polarity corpora used in this study. The training time of the proposed classifier is found to be slightly higher than that of Bayesian, Naïve Bayes and J48.

Research limitations/implications

Only movie review sentiments written in English are considered. In addition, the proposed committee selection classifier is prepared only using the committee of probabilistic classifiers; however, other classifier committees can also be built, tested and compared with the present experiment scenario.

Practical implications

In this paper, a novel probabilistic approach is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Social implications

The proposed PCC may be used to classify different product reviews, and hence may be beneficial to organizations to justify users’ reviews about specific products or services. By using authentic positive and negative sentiments of users, the credibility of the specific product, service or event may be enhanced. PCC may also be applied to other applications, such as spam detection, blog mining, news mining and various other data-mining applications.

Originality/value

The constructed PCC is novel and was tested on Indian movie review data.

Article
Publication date: 16 August 2021

Nur Azreen Zulkefly, Norjihan Abdul Ghani, Christie Pei-Yee Chin, Suraya Hamid and Nor Aniza Abdullah

Abstract

Purpose

Predicting the impact of social entrepreneurship is crucial as it can help social entrepreneurs to determine the achievement of their social mission and performance. However, there is a lack of existing social entrepreneurship models to predict social enterprises' social impacts. This paper aims to propose the social impact prediction model for social entrepreneurs using a data analytic approach.

Design/methodology/approach

This study implemented an experimental method using three different algorithms: naive Bayes, k-nearest neighbor and J48 decision tree algorithms to develop and test the social impact prediction model.

Findings

The accuracy of the developed social impact prediction model rests on the list of identified social impact prediction variables, which were evaluated by social entrepreneurship experts. Across the three algorithms used to implement the model, the results showed that naive Bayes is the best-performing classifier in terms of social impact prediction accuracy.

Research limitations/implications

Although there are three categories of social entrepreneurship impact, this research focuses only on social impact. Research that covers all three categories would benefit the future of social entrepreneurship. Future research in this area could also look beyond these three categories, so that the prediction of social impact becomes broader. Prospective researchers can also look beyond the differences and similarities among economic, social and environmental impacts and study those impacts from an overall perspective.

Originality/value

This paper fulfills the need for the Malaysian social entrepreneurship blueprint to design the social impact in social entrepreneurship. No existing prediction model can be used to predict social impact in Malaysia. This study also contributes to social entrepreneurship research, as the new social impact prediction variables it identifies can be used to predict social impact in social entrepreneurship in the future, which may improve prediction performance.

Details

Internet Research, vol. 32 no. 2
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 20 November 2017

Xiangbin Yan, Yumei Li and Weiguo Fan

Abstract

Purpose

Getting high-quality data by removing the noisy data from the user-generated content (UGC) is the first step toward data mining and effective decision-making based on ubiquitous and unstructured social media data. This paper aims to design a framework for removing noisy data from UGC.

Design/methodology/approach

In this paper, the authors consider a classification-based framework to remove the noise from unstructured UGC in social media communities. They treat messages that are not relevant to the topic of concern as noise and apply a text classification-based approach to remove it. They introduce a domain lexicon to help distinguish the topic of concern from noise and compare the performance of several classification algorithms combined with different feature selection methods.

Findings

Experimental results based on a Chinese stock forum show that 84.9 per cent of all the noise data from the UGC could be removed with little loss of valuable information. The support vector machines classifier combined with the information gain feature extraction model is the best choice for this system. Message length also affects system performance: longer messages achieve better classification performance.
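
To make the information gain criterion mentioned above concrete, the sketch below scores a single binary term feature against message labels. It is a generic textbook formulation, not the authors' implementation, and the toy labels and feature vectors are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_present, labels):
    """Entropy reduction obtained by splitting the labels on a binary feature
    (for example, whether a term occurs in the message)."""
    total = len(labels)
    with_term = [y for x, y in zip(feature_present, labels) if x]
    without_term = [y for x, y in zip(feature_present, labels) if not x]
    conditional = (
        len(with_term) / total * entropy(with_term)
        + len(without_term) / total * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy data: four messages labelled as on-topic or noise.
labels = ["relevant", "relevant", "noise", "noise"]
informative_term = [True, True, False, False]  # separates the classes perfectly
useless_term = [True, False, True, False]      # carries no class information

print(information_gain(informative_term, labels))
print(information_gain(useless_term, labels))
```

Terms with higher scores would be kept as features before training a classifier such as a support vector machine.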

Originality/value

The proposed method could be used for preprocessing in text mining and new knowledge discovery from the big data.

Details

Information Discovery and Delivery, vol. 45 no. 4
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 23 June 2020

Ravikumar KN, Hemantha Kumar, Kumar GN and Gangadharan KV

Abstract

Purpose

The purpose of this paper is to study the fault diagnosis of internal combustion (IC) engine gearbox using vibration signals with signal processing and machine learning (ML) techniques.

Design/methodology/approach

Vibration signals from the gearbox are acquired for healthy and induced faulty conditions of the gear. In this study, 50% tooth fault and 100% tooth fault are chosen as gear faults in the driver gear. The acquired signals are processed and analyzed using signal processing and ML techniques.

Findings

The results show variation in the amplitude of the crankshaft rotational frequency (CRF) and gear mesh frequency (GMF) for different conditions of the gearbox under various load conditions. ML techniques were also employed in developing the fault diagnosis system using statistical features. The J48 decision tree provides better classification accuracy, about 85.1852%, in identifying gearbox conditions.
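
For readers unfamiliar with the gear mesh frequency, the short sketch below uses the conventional definition: the number of gear teeth multiplied by the rotational frequency of the shaft carrying that gear. The tooth count and speed are hypothetical values chosen for illustration, not figures from the paper.

```python
def gear_mesh_frequency(teeth: int, shaft_rpm: float) -> float:
    """Return the gear mesh frequency in Hz for a gear with `teeth` teeth
    mounted on a shaft turning at `shaft_rpm` revolutions per minute."""
    shaft_frequency_hz = shaft_rpm / 60.0  # revolutions per second
    return teeth * shaft_frequency_hz


# Hypothetical example: a 20-tooth driver gear at 1500 rpm.
print(gear_mesh_frequency(teeth=20, shaft_rpm=1500.0))  # 500.0
```

In a vibration spectrum, amplitude changes at this frequency and its sidebands are a common indicator of tooth damage.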

Practical implications

The proposed approach can be used effectively for fault diagnosis of IC engine gearbox. Spectrum and continuous wavelet transform (CWT) provide better information about gear fault conditions using time–frequency characteristics.

Originality/value

In this paper, experiments are conducted on real-time running condition of IC engine gearbox while considering combustion. Eddy current dynamometer is attached to output shaft of the engine for applying load. Spectrum, cepstrum, short-time Fourier transform (STFT) and wavelet analysis are performed. Spectrum, cepstrum and CWT provide better information about gear fault conditions using time–frequency characteristics. ML techniques were used in analyzing classification accuracy of the experimental data to detect the gearbox conditions using various classifiers. Hence, these techniques can be used for detection of faults in the IC engine gearbox and other reciprocating/rotating machineries.

Details

Journal of Quality in Maintenance Engineering, vol. 27 no. 2
Type: Research Article
ISSN: 1355-2511

Open Access
Article
Publication date: 29 July 2020

T. Mahalingam and M. Subramoniam

Abstract

Surveillance is an emerging concept in current technology, as it plays a vital role in monitoring activities in every nook and corner of the world. Identifying and tracking moving objects by means of computer vision techniques is a major part of surveillance, and moving object detection is the initial step of video analysis in various computer applications. The main drawback of existing object tracking methods is that they are time-consuming when the video contains a high volume of information, and choosing the optimum tracking technique for such a volume of data raises further issues. The situation becomes worse when the tracked object changes orientation over time, and it is also difficult to predict multiple objects at the same time. To overcome these issues, we propose a robust video object detection and tracking technique. The proposed technique is divided into three phases: a detection phase, a tracking phase and an evaluation phase. The detection phase comprises foreground segmentation and noise reduction. A Mixture of Adaptive Gaussian (MoAG) model is proposed to achieve efficient foreground segmentation, and a fuzzy morphological filter model is implemented to remove the noise present in the foreground-segmented frames. In the tracking phase, moving object tracking is achieved by blob detection. Finally, the evaluation phase comprises feature extraction and classification. Texture-based and quality-based features are extracted from the processed frames and passed on for classification. For classification we use J48, i.e. a decision tree-based classifier. The performance of the proposed technique is compared with the existing techniques k-NN and MLP in terms of precision, recall, f-measure and ROC.

Details

Applied Computing and Informatics, vol. 17 no. 1
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 27 September 2021

Samrakshya Karki and Bonaventura Hadikusumo

Abstract

Purpose

Project manager’s competency is crucial in the construction sector for the successful completion of projects, particularly in the case of developing countries like Nepal. It is therefore essential to select competent project managers by identifying the competency factors they require. Hence, this study aims to identify the characteristics of competent project managers through an expert opinion method and to evaluate their competency level through a questionnaire survey, in order to develop a prediction model using a supervised machine learning approach in the Waikato Environment for Knowledge Analysis (WEKA), a machine learning tool. The model predicts a project manager’s performance as “Higher than expected,” “Expected” or “Lower than expected” for the medium complexity construction projects of Nepal (from US$200,000 up to US$10M).

Design/methodology/approach

The data collection procedure for this research is based on an expert opinion method and survey. Expert opinion method is conducted to find the characteristics of a competent project manager by validating the top 15 competency factors based on literature review. The survey is conducted with the top management to assess their project manager’s competency level. Both qualitative and quantitative approaches are used to collect data for classification and prediction in WEKA, a machine learning tool.

Findings

The results illustrate that the project managers in Nepal have a high score in leadership skills, personal characteristics, team development and delegation, communication skills, technical skills, problem-solving/coping with situation skills and stakeholder/relationship management skills. Furthermore, among the seven classifiers (naïve Bayes, sequential minimal optimization [SMO], multilayer perceptron, logistic, KStar, J48 and random forest), the accuracy given by the SMO algorithm is highest of all in both the percentage split and k-folds cross validation method. The model developed using SMO classifier by k-folds cross-validation (k = 10) is acknowledged as a final model.
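
The k-fold cross-validation scheme mentioned above (k = 10) can be sketched as follows. This is a simplified illustration that uses contiguous, unshuffled folds; it is not WEKA's implementation, and the sample count is hypothetical.

```python
def k_fold_indices(n_samples: int, k: int = 10):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Simplification: no shuffling and no stratification by class.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical example: 25 survey responses split into 10 folds.
for fold_number, (train, test) in enumerate(k_fold_indices(25, k=10), start=1):
    print(fold_number, len(train), len(test))
```

Each sample is used for testing exactly once, and the reported accuracy is typically the average over the k test folds.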

Research limitations/implications

This research focuses on developing a prediction model to predict and analyze the competency of project managers by applying a supervised machine learning approach. Seven widely used algorithms (naïve Bayes, SMO, multilayer perceptron, logistic, KStar, J48 and random forest) are used to check the accuracy of the models, and the algorithm that gives the highest accuracy is adopted. Data collection is carried out by an expert opinion method: in the first round, experts validate the characteristics (factors) essential for competent project managers, and in the second round the same experts are asked to describe each factor as high, medium or low. After the expert opinion rounds, a structured questionnaire is prepared for the survey to assess the competency level of project managers (PMs). The competency level of PMs working on government-funded, foreign-aided or private projects from the contractor’s side is measured. This research is limited to the medium-scale construction projects of Nepal.

Practical implications

This model can be a huge asset in the human resource department of construction companies as it helps to know the performance level of project managers in terms of “Higher than expected,” “Expected” or “Lower than expected” for the medium complexity construction projects of Nepal. Also, the model will assist human intelligence to make the decision while recruiting a new project manager/s for different types of projects at a time. Moreover, the model can be used for self-assessment of project manager/s to know their performance level. The model can be used to develop a user friendly interface system or an application such that it can be conveniently used anywhere any time.

Social implications

This research shows that most of the project managers working in medium complexity construction projects of Nepal are male, and that most of them hold a bachelor’s degree and study for road projects. Furthermore, most of the project managers scored high in leadership skills, personal characteristics, communication skills, technical skills, problem-solving/coping with situation skills, team development and delegation and stakeholder/relationship management skills. The model has given the “Personal characteristics” attribute the highest weightage. Likewise, other attributes having high weightage are communication skills, analytical abilities, project budget, stakeholder/relationship management, team development and delegation and time management skills.

Originality/value

This research was conducted to find the competency factors and to study the competency level of project managers in Nepal to develop a prediction model to predict the PM’s performance using a machine learning approach in medium scale construction projects. There is a lack of research to develop a model that predicts project manager’s competency using the machine learning approach. Therefore, the predictive model developed here helps in the identification of a competent project manager as it will be advantageous for project completion with a high success rate.

Article
Publication date: 20 November 2017

Moloud Abdar and Neil Y. Yen

Abstract

Purpose

This research examines regional characteristics through an analysis of crowd preference and confidence, and investigates how regional characteristics affect people in all aspects of a sharing-economy scenario. The purpose of this paper is to introduce an approach that provides an understandable rating score. The paper also aims to find the relationships between the different features classified in this study by using machine learning methods. Finally, because the performance of the methods is important, the performance of the algorithms is also improved.

Design/methodology/approach

The Rating Matching Rate (RMRate) approach is proposed to provide a score that is simple and understandable for all features. The relationships between features are extracted from an accommodation data set using decision tree (DT) algorithms (J48, HoeffdingTree and REPTree). The usability of these methods was evaluated using different metrics. Two techniques, “ClassBalancer” and “SpreadSubsample,” are applied to improve the performance of the algorithms.
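
The general idea behind class reweighting, which a filter such as ClassBalancer applies, can be sketched as below: every class receives the same total weight while the overall sum of weights stays equal to the number of instances. This is an illustrative sketch under that assumption, not WEKA's code, and the labels are hypothetical.

```python
from collections import Counter


def class_balanced_weights(labels):
    """Return one weight per instance so that every class has the same
    total weight and the weights sum to the number of instances."""
    counts = Counter(labels)
    n_instances = len(labels)
    n_classes = len(counts)
    # Each class gets a total weight of n_instances / n_classes,
    # shared equally among the instances of that class.
    return [n_instances / (n_classes * counts[label]) for label in labels]


# Hypothetical imbalanced data set: 8 listings of one class, 2 of another.
labels = ["apartment"] * 8 + ["house"] * 2
weights = class_balanced_weights(labels)
print(weights[0], weights[-1])  # 0.625 2.5
```

Subsampling approaches such as SpreadSubsample pursue the same goal differently, by discarding instances of the majority class instead of reweighting them.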

Findings

Experimental outcomes using the RMRate approach show that the scores are very easy to understand. Three property types (“apartment”, “house” and “bed and breakfast”) are very popular in almost all of the countries selected for this study. The findings also indicate that “Entire home/apt” is the most common room type and that 4.5 and 5 stars are the ratings most often given by users. The proposed DT algorithms can find the relationships between features to a significant degree. In addition, the applied ClassBalancer (CB) and SpreadSubsample (SS) techniques improved the performance of the algorithms.

Originality/value

This study gives precise details about the guests’ preferences and hosts’ preferences. The proposed techniques can effectively improve the performance in predicting the behavior of users in sharing economy. The findings can also help group decision making in P2P platforms efficiently.

Details

Library Hi Tech, vol. 35 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 22 September 2021

Samar Ali Shilbayeh and Sunil Vadera

Abstract

Purpose

This paper aims to describe the use of a meta-learning framework for recommending cost-sensitive classification methods with the aim of answering an important question that arises in machine learning, namely, “Among all the available classification algorithms, and in considering a specific type of data and cost, which is the best algorithm for my problem?”

Design/methodology/approach

This paper describes the use of a meta-learning framework for recommending cost-sensitive classification methods, with the aim of answering an important question that arises in machine learning, namely, “Among all the available classification algorithms, and in considering a specific type of data and cost, which is the best algorithm for my problem?” The framework is based on the idea of applying machine learning techniques to discover knowledge about the performance of different machine learning algorithms. It includes components that repeatedly apply different classification methods to data sets and measure their performance. The characteristics of the data sets, combined with the algorithms and their performance, provide the training examples. A decision tree algorithm is applied to the training examples to induce the knowledge, which can then be used to recommend algorithms for new data sets. The paper makes a contribution to both meta-learning and cost-sensitive machine learning approaches. Neither field is new; the contribution is a recommender that recommends the optimal cost-sensitive approach for a given data problem. The proposed solution is implemented in WEKA and evaluated by applying it to different data sets and comparing the results with existing studies available in the literature. The results show that the developed meta-learning solution produces better results than METAL, a well-known meta-learning system. The developed solution takes the misclassification cost into consideration during the learning process, which is not available in the compared project.

Findings

The proposed solution is implemented in WEKA and evaluated by applying it to different data sets and comparing the results with existing studies available in the literature. The results show that a developed meta-learning solution produces better results than METAL, a well-known meta-learning system.

Originality/value

The paper presents a major piece of new information in writing for the first time. Meta-learning work has been done before, but this paper presents a new meta-learning framework that is cost-sensitive.

Details

Journal of Modelling in Management, vol. 17 no. 3
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 16 December 2019

Chihli Hung and You-Xin Cao

Abstract

Purpose

This paper aims to propose a novel approach which integrates collocations and domain concepts for Chinese cosmetic word of mouth (WOM) sentiment classification. Most sentiment analysis works by collecting sentiment scores from each unigram or bigram. However, not every unigram or bigram in a WOM document contains sentiments. Chinese collocations carry the main sentiments of WOM. This paper reduces the dimensionality of the document representation and improves sentiment classification.

Design/methodology/approach

This paper builds two contextual lexicons for feature words and sentiment words, respectively. Based on these contextual lexicons, this paper uses association rules and mutual information to build possible Chinese collocation sets. This paper applies preference vector modelling as the vector representation approach to capture the relationship between Chinese collocations and their associated concepts.
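
One common way to score candidate collocations with mutual information is pointwise mutual information (PMI) over document-level co-occurrence. The sketch below shows that variant as an assumption; the abstract does not specify which formulation the authors use. The tokens are hypothetical English stand-ins for segmented Chinese feature and sentiment words.

```python
import math
from collections import Counter


def pmi_scores(documents):
    """Return PMI (in bits) for every word pair that co-occurs in at least
    one document. `documents` is a list of token lists; pairs are keyed in
    alphabetical order."""
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words
    for doc in documents:
        vocab = sorted(set(doc))
        word_df.update(vocab)
        for i, first in enumerate(vocab):
            for second in vocab[i + 1:]:
                pair_df[(first, second)] += 1
    return {
        (a, b): math.log2(
            (count / n_docs) / ((word_df[a] / n_docs) * (word_df[b] / n_docs))
        )
        for (a, b), count in pair_df.items()
    }


# Hypothetical tokenised reviews.
reviews = [
    ["skin", "smooth"],
    ["skin", "smooth"],
    ["skin", "dry"],
    ["price", "high"],
]
scores = pmi_scores(reviews)
print(scores[("high", "price")])   # 2.0
print(scores[("skin", "smooth")])
```

Pairs with high PMI co-occur more often than chance would predict and are therefore candidates for the collocation set.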

Findings

This paper compares the proposed preference vector models with benchmarks, using three classification techniques (i.e. support vector machine, J48 decision tree and multilayer perceptron). According to the experimental results, the proposed models outperform all benchmarks evaluated by the criterion of accuracy.

Originality/value

This paper focuses on Chinese collocations and proposes a novel research approach for sentiment classification. The Chinese collocations used in this paper are adaptable to the content and domains. Finally, this paper integrates collocations with the preference vector modelling approach, which not only achieves a better sentiment classification performance for Chinese WOM documents but also avoids the curse of dimensionality.

Details

The Electronic Library, vol. 38 no. 1
Type: Research Article
ISSN: 0264-0473

Open Access
Article
Publication date: 29 July 2020

Jenri MP Panjaitan, Rudi Prasetya Timur and Sumiyana Sumiyana

Abstract

Purpose

This study aims to acknowledge that most Indonesian small and medium enterprises (SMEs) experience slow growth. It highlighted that this sluggishness is because of some falsification of Indonesia’s ecological psychology. It focuses on investigating the situated cognition that probably supports this falsification, such as affordance, a community of practice, embodiment and the legitimacy of peripheral participation situated cognition and social intelligence theories.

Design/methodology/approach

This study obtained data from newspapers published between October 2016 and February 2019. The authors used the Waikato Environment for Knowledge Analysis and the J48 (C4.5) algorithm. The authors analyzed the data using the emergence of news probability for both the Government of Indonesia (GoI) and Indonesian society and the situated cognition concerning the improvement of the SMEs. The authors inferred ecological psychology from these published newspapers in Indonesia that the engaged actions were still suppressed, in comparison with being and doing.

Findings

This study contributes to the innovation and leadership policies of the SMEs’ managerial systems and the GoI. After this study identified the backward-looking practices, which the GoI and the people of Indonesia held, this study recommended some policies to help create a forward-looking orientation. The second one is also a policy for the GoI, which needs to reduce the discrepancy between the signified and the signifier, as recommended by the structuralist theory. The last one is suggested by the social learning theory; policies are needed that relate to developing the SMEs’ beliefs, attitudes and behavior. It means that the GoI should prepare the required social contexts, which are in motoric production and reinforcement. Explicitly, the authors argue that the GoI facilitates SMEs by emphasizing the internal learning process.

Research limitations/implications

The authors present some possibilities for the limitations of this research. The authors took into account that this study assumes the SMEs are all the same, without industrial clustering. It considers that the need for social learning and social cognition by the unclustered industries is equal. Second, the authors acknowledge that Indonesia is an emerging country, and its economic structure has three levels of contributors; the companies listed on the Indonesian Stock Exchange, then the SMEs and the lowest level is the underground economy. Third, the authors did not distinguish the levels of success for the empowerment programs that are conducted by either the GoI or the local governments. This study recognizes that the authors did not measure success levels. It means that the authors only focused on the knowledge content.

Practical implications

From these pieces of evidence, this study constructed its strategies. The authors offer three kinds of policies. The first is the submission of special allocation funds from which the GoI and local governments develop their budgets for the SMEs’ social learning and social cognition. The second is the development of social learning and social cognition’s curricula for both the SMEs’ owners and executive officers. The third is the need for a national knowledge repository for all the Indonesian SMEs. This repository is used for the dissemination of knowledge.

Originality/value

This study raises argumental novelties with some of the critical reasoning. First, the authors argue that the sluggishness of the Indonesian SMEs is because of some fallacies in their social cognition. This social cognition is derived from the cultural knowledge that the GoI and people of Indonesia disclosed in the newspapers. This study shows the falsifications from the three main perspectives of the structuration, structuralist and social learning theories. Second, this study can elaborate on the causal factor for the sluggishness of Indonesia’s SMEs, which can be explained by philosophical science, especially its fallacies (Hundleby, 2010; Magnus and Callender, 2004). The authors expand the causal factors for each gap in every theory, which determined the SMEs’ sluggishness through the identification of inconsistencies in each dimension of their structuration, structuralism and social learning. This study focused on the fallacy of philosophical science that explains the misconceptions about the SMEs’ improvement because of faulty reasoning, which causes the wrong moves to be made in the future (Dorr, 2017; Pielke, 1999).

Details

Journal of Entrepreneurship in Emerging Economies, vol. 13 no. 5
Type: Research Article
ISSN: 2053-4604
