Search results
1 – 10 of over 12000
Xiaoguang Tian, Robert Pavur, Henry Han and Lili Zhang
Abstract
Purpose
Studies on mining text and generating intelligence on human resource documents are rare. This research aims to use artificial intelligence and machine learning techniques to facilitate the employee selection process through latent semantic analysis (LSA), bidirectional encoder representations from transformers (BERT) and support vector machines (SVM). The research also compares the performance of different machine learning, text vectorization and sampling approaches on the human resource (HR) resume data.
Design/methodology/approach
LSA and BERT are used to discover and understand the hidden patterns from a textual resume dataset, and SVM is applied to build the screening model and improve performance.
Findings
The results show that LSA and BERT are useful for retrieving critical topics, and that SVM can optimize prediction model performance with the help of cross-validation and variable selection strategies.
Research limitations/implications
The technique and its empirical conclusions provide a practical, theoretical basis and reference for HR research.
Practical implications
The novel methods proposed in the study can assist HR practitioners in designing and improving their existing recruitment process. The topic detection techniques used in the study provide HR practitioners insights to identify the skill set of a particular recruiting position.
Originality/value
To the best of the authors’ knowledge, this research is the first study that uses LSA, BERT, SVM and other machine learning models in human resource management and resume classification. Compared with existing machine learning-based resume screening systems, the proposed system can provide more interpretable insights for HR professionals to understand the recommendation results through the topics extracted from the resumes. The findings of this study can also help organizations find a better, more effective approach to resume screening and evaluation.
Rajshree Varma, Yugandhara Verma, Priya Vijayvargiya and Prathamesh P. Churi
Abstract
Purpose
The rapid advancement of technology in online communication and fingertip access to the Internet have resulted in the expedited dissemination of fake news to engage a global audience at a low cost by news channels, freelance reporters and websites. Amid the coronavirus disease 2019 (COVID-19) pandemic, individuals are exposed to these false and potentially harmful claims and stories, which may harm the vaccination process. Psychological studies reveal that the human ability to detect deception is only slightly better than chance; therefore, there is a growing need to develop automated strategies to combat fake news that traverses these platforms at an alarming rate. This paper systematically reviews the existing fake news detection technologies by exploring various machine learning and deep learning techniques pre- and post-pandemic, which has never been done before to the best of the authors’ knowledge.
Design/methodology/approach
The detailed literature review on fake news detection is divided into three major parts. The authors searched papers no later than 2017 on fake news detection approaches on deep learning and machine learning. The papers were initially searched through the Google Scholar platform and then scrutinized for quality. The authors kept “Scopus” and “Web of Science” as quality indexing parameters. All research gaps and available databases, data pre-processing, feature extraction techniques and evaluation methods for current fake news detection technologies have been explored and illustrated using tables, charts and trees.
Findings
The paper is divided into two approaches, namely machine learning and deep learning, to present a better understanding and a clear objective. Next, the authors present a viewpoint on which approach is better and on future research trends, issues and challenges for researchers, given the relevance and urgency of a detailed and thorough analysis of existing models. This paper also delves into fake news detection during COVID-19, and it can be inferred that research and modeling are shifting toward the use of ensemble approaches.
Originality/value
The study also identifies several novel automated web-based approaches used by researchers to assess the validity of pandemic news that have proven to be successful, although currently reported accuracy has not yet reached consistent levels in the real world.
Aishwarya Narang, Ravi Kumar and Amit Dhiman
Abstract
Purpose
This study seeks to understand the connection of methodology by finding relevant papers and their full review using the “Preferred Reporting Items for Systematic Reviews and Meta-Analyses” (PRISMA).
Design/methodology/approach
Concrete-filled steel tubular (CFST) columns have gained popularity in construction in recent decades as they offer the benefit of constituent materials and cost-effectiveness. Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Gene Expression Programming (GEP) and Decision Trees (DTs) are some of the approaches that have been widely used in recent decades in structural engineering to construct predictive models, resulting in effective and accurate decision making. Despite the fact that there are numerous research studies on the various parameters that influence the axial compression capacity (ACC) of CFST columns, there is no systematic review of these Machine Learning methods.
Findings
The implications of a variety of structural characteristics on machine learning performance parameters are addressed and reviewed. The comparison analysis of current design codes and machine learning tools to predict the performance of CFST columns is summarized. The discussion results indicate that machine learning tools better understand complex datasets and intricate testing designs.
Originality/value
This study examines machine learning techniques for forecasting the axial bearing capacity of concrete-filled steel tubular (CFST) columns. This paper also highlights the drawbacks of utilizing existing techniques to build CFST columns, and the benefits of Machine Learning approaches over them. This article attempts to introduce beginners and experienced professionals to various research trajectories.
Abstract
Purpose
This study aims to compare machine learning models, datasets and splitting training-testing using data mining methods to detect financial statement fraud.
Design/methodology/approach
This study uses a quantitative approach from secondary data on the financial reports of companies listed on the Indonesia Stock Exchange in the last ten years, from 2010 to 2019. Research variables use financial and non-financial variables. Indicators of financial statement fraud are determined based on notes or sanctions from regulators and financial statement restatements with special supervision.
Findings
The findings show that the Extremely Randomized Trees (ERT) model performs better than other machine learning models. The original-sampling dataset performs best compared with other dataset treatments. The 80:10 training-testing split performs best compared with other training-testing splitting treatments. The ERT model with an original-sampling dataset and an 80:10 training-testing split is therefore the most appropriate for detecting future financial statement fraud.
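The training-testing splitting treatments compared in the findings can be illustrated with a seeded shuffle-and-cut. This is a minimal sketch only: the function name `train_test_split`, the `train_fraction` parameter and the 0.8 value in the usage note are illustrative assumptions, and the abstract does not describe the study's actual splitting procedure.

```python
import random


def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle records reproducibly and cut them into train and test parts.

    Hypothetical helper for illustration; not the study's own code.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `train_test_split(reports, train_fraction=0.8)` on a list of firm-year records would place roughly 80% of them in the training part; repeating the call with other fractions gives the alternative treatments to compare.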
Practical implications
This study can be used by regulators, investors, stakeholders and financial crime experts to add insight into better methods of detecting financial statement fraud.
Originality/value
This study proposes a machine learning model that has not been discussed in previous studies and performs comparisons to obtain the best financial statement fraud detection results. Practitioners and academics can use findings for further research development.
Md Shamim Hossain, Mst Farjana Rahman, Md Kutub Uddin and Md Kamal Hossain
Abstract
Purpose
There is a strong prerequisite for organizations to analyze customer review behavior to evaluate the competitive business environment. The purpose of this study is to analyze and predict customer reviews of halal restaurants using machine learning (ML) approaches.
Design/methodology/approach
The authors collected customer review data from the Yelp website. The authors filtered the reviews of only halal restaurants from the original data set. Following cleaning, the filtered review texts were classified as positive, neutral or negative sentiments, and those sentiments were scored using the AFINN and VADER sentiment algorithms. Also, the current study applies four machine learning methods to classify each review toward halal restaurants into its sentiment class.
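The lexicon-based scoring step described above can be sketched as follows. This is a simplified illustration in the spirit of AFINN-style scoring (summing word valences), not the authors' pipeline: `TOY_LEXICON` is a hypothetical hand-made word list rather than the real AFINN or VADER lexicon, and the function names and the zero thresholds for the three classes are assumptions.

```python
import re

# Hypothetical mini-lexicon (word -> integer valence); NOT the real AFINN list.
TOY_LEXICON = {
    "good": 3, "great": 3, "delicious": 3, "friendly": 2,
    "slow": -1, "rude": -2, "bad": -3, "terrible": -3,
}


def score_review(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word that appears in the review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label_review(text, lexicon=TOY_LEXICON):
    """Map the summed score to positive, neutral or negative (assumed thresholds)."""
    score = score_review(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Labels produced this way could then serve as targets for the classifiers the study compares; VADER additionally handles negation, intensifiers and punctuation, which this sketch ignores.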
Findings
The experiment showed that most of the customer reviews toward halal restaurants were positive. The authors also discovered that all of the methods (decision tree, linear support vector machine, logistic regression and random forest classifier) can correctly classify the review text into sentiment class, but logistic regression outperforms the others in terms of accuracy.
Practical implications
The results facilitate halal restaurateurs in identifying customer review behavior.
Social implications
Sentiment and emotions, according to appraisal theory, form the basis for all interactions, facilitating cognitive functions and supporting prospective customers in making sense of experiences. Emotion theory also describes human affective states that determine motives and actions. The study looks at how potential customers might react to a halal restaurant’s consensus on social media based on reviewers’ opinions of halal restaurants because emotions can be conveyed through reviews.
Originality/value
This study applies machine learning approaches to analyze and predict customer sentiment based on the review texts toward halal restaurants.
Chedia Dhaoui, Cynthia M. Webster and Lay Peng Tan
Abstract
Purpose
With the soaring volumes of brand-related social media conversations, digital marketers have extensive opportunities to track and analyse consumers’ feelings and opinions about brands, products or services embedded within consumer-generated content (CGC). These “Big Data” opportunities render manual approaches to sentiment analysis impractical and raise the need to develop automated tools to analyse consumer sentiment expressed in text format. This paper aims to evaluate and compare the performance of two prominent approaches to automated sentiment analysis applied to CGC on social media and explores the benefits of combining them.
Design/methodology/approach
A sample of 850 consumer comments from 83 Facebook brand pages are used to test and compare lexicon-based and machine learning approaches to sentiment analysis, as well as their combination, using the LIWC2015 lexicon and RTextTools machine learning package.
Findings
Results show the two approaches are similar in accuracy, both achieving higher accuracy when classifying positive sentiment than negative sentiment. However, they differ substantially in their classification ensembles. The combined approach demonstrates significantly improved performance in classifying positive sentiment.
Research limitations/implications
Further research is required to improve the accuracy of negative sentiment classification. The combined approach needs to be applied to other kinds of CGCs on social media such as tweets.
Practical implications
The findings inform decision-making around which sentiment analysis approach (or combination thereof) is best for analysing CGC on social media.
Originality/value
This study combines two sentiment analysis approaches and demonstrates significantly improved performance.
Abstract
Purpose
The purpose of this study is to examine the state of research into adoption of machine learning systems within the health sector, to identify themes that have been studied and observe the important gaps in the literature that can inform a research agenda going forward.
Design/methodology/approach
A systematic literature strategy was utilized to identify and analyze scientific papers between 2012 and 2022. A total of 28 articles were identified and reviewed.
Findings
The outcomes reveal that while advances in machine learning have the potential to improve service access and delivery, the literature in this area has grown only sporadically, which is perhaps surprising given the immense potential of machine learning within the health sector. The findings further reveal that most authors in this area have focused primarily on themes such as recordkeeping, drug development and streamlining of treatment.
Research limitations/implications
The search was limited to journal articles published in English, resulting in the exclusion of studies disseminated through alternative channels, such as conferences, and those published in languages other than English. Considering that scholars in developing nations may encounter less difficulty in disseminating their work through alternative channels and that numerous emerging nations employ languages other than English, it is plausible that certain research has been overlooked in the present investigation.
Originality/value
This review provides insights into future research avenues for theory, content and context on adoption of machine learning within the health sector.
Bilal Abu-Salih, Pornpit Wongthongtham and Chan Yan Kit
Abstract
Purpose
This paper aims to obtain the domain of the textual content generated by users of online social network (OSN) platforms. Understanding a user’s domain(s) of interest is a significant step towards addressing their domain-based trustworthiness through an accurate understanding of their content in their OSNs.
Design/methodology/approach
This study uses a Twitter mining approach for domain-based classification of users and their textual content. The proposed approach incorporates machine learning modules. The approach comprises two analysis phases: the time-aware semantic analysis of users’ historical content incorporating five commonly used machine learning classifiers. This framework classifies users into two main categories: politics-related and non-politics-related categories. In the second stage, the likelihood predictions obtained in the first phase will be used to predict the domain of future users’ tweets.
Findings
Experiments have been conducted to validate the mechanism proposed in the study framework, further supported by the excellent performance of the harnessed evaluation metrics. The experiments conducted verify the applicability of the framework to an effective domain-based classification for Twitter users and their content, as evident in the outstanding results of several performance evaluation metrics.
Research limitations/implications
This study is limited to an on/off domain classification for content of OSNs. Hence, we have selected a politics domain because of Twitter’s popularity as an opulent source of political deliberations. Such data abundance facilitates data aggregation and improves the results of the data analysis. Furthermore, the currently implemented machine learning approaches assume that uncertainty and incompleteness do not affect the accuracy of the Twitter classification. In fact, data uncertainty and incompleteness may exist. In the future, the authors will formulate the data uncertainty and incompleteness into fuzzy numbers which can be used to address imprecise, uncertain and vague data.
Practical implications
This study proposes a practical framework comprising significant implications for a variety of business-related applications, such as the voice of customer/voice of market, recommendation systems, the discovery of domain-based influencers and opinion mining through tracking and simulation. In particular, the factual grasp of the domains of interest extracted at the user level or post level enhances the customer-to-business engagement. This contributes to an accurate analysis of customer reviews and opinions to improve brand loyalty, customer service, etc.
Originality/value
This paper fills a gap in the existing literature by presenting a consolidated framework for Twitter mining that aims to uncover the deficiency of the current state-of-the-art approaches to topic distillation and domain discovery. The overall approach is promising in the fortification of Twitter mining towards a better understanding of users’ domains of interest.
Prudence Kadebu, Robert T.R. Shoniwa, Kudakwashe Zvarevashe, Addlight Mukwazvure, Innocent Mapanga, Nyasha Fadzai Thusabantu and Tatenda Trust Gotora
Abstract
Purpose
Given how smart today’s malware authors have become through employing highly sophisticated techniques, it is only logical that methods be developed to combat the most potent threats, particularly where the malware is stealthy and makes indicators of compromise (IOC) difficult to detect. After the analysis is completed, the output can be employed to detect and then counteract the attack. The goal of this work is to propose a machine learning approach to improve malware detection by combining the strengths of both supervised and unsupervised machine learning techniques. This study is essential as malware has certainly become ubiquitous as cyber-criminals use it to attack systems in cyberspace. Malware analysis is required to reveal hidden IOC, to comprehend the attacker’s goal and the severity of the damage and to find vulnerabilities within the system.
Design/methodology/approach
This research proposes a hybrid approach for dynamic and static malware analysis that combines unsupervised and supervised machine learning algorithms, and goes on to show how malware exploiting steganography can be exposed.
Findings
The tactics used by malware developers to circumvent detection are becoming more advanced with steganography becoming a popular technique applied in obfuscation to evade mechanisms for detection. Malware analysis continues to call for continuous improvement of existing techniques. State-of-the-art approaches applying machine learning have become increasingly popular with highly promising results.
Originality/value
Cyber security researchers globally are grappling with devising innovative strategies to identify and defend against the threat of extremely sophisticated malware attacks on key infrastructure containing sensitive data. The process of detecting the presence of malware requires expertise in malware analysis. Applying intelligent methods to this process can aid practitioners in identifying malware’s behaviour and features. This is especially expedient where the malware is stealthy, hiding IOC.
Abstract
Purpose
The objective of this research work is to design a data-based solution for administering traffic organization in a smart city by using the machine learning algorithm.
Design/methodology/approach
A machine learning framework for managing traffic infrastructure and air pollution in urban centers relies on a predictive analytics model. The model makes use of transportation data to predict traffic patterns based on the information gathered from numerous sources within the city. It can be used to support strategic planning decisions. The data features volume and calendar variables, including hours of the day, week and month. These variables are leveraged to identify time series-based seasonal patterns in the data. To achieve accurate traffic volume forecasting, the long short-term memory (LSTM) method is recommended.
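The calendar variables mentioned above can be derived from raw timestamps before any forecasting model is fitted. The sketch below is an illustration under stated assumptions: it reads "hours of the day, week and month" as hour of day, day of week and month, and the function names and the `(timestamp, volume)` record layout are hypothetical. It shows only the feature step and a simple hourly profile, not the LSTM model itself.

```python
from datetime import datetime


def calendar_features(timestamp):
    """Extract assumed calendar variables from one observation's timestamp."""
    return {
        "hour": timestamp.hour,              # 0-23
        "day_of_week": timestamp.weekday(),  # Monday = 0 ... Sunday = 6
        "month": timestamp.month,            # 1-12
    }


def mean_volume_by_hour(records):
    """Average traffic volume per hour of day, a crude view of daily seasonality.

    `records` is an iterable of (datetime, volume) pairs (hypothetical layout).
    """
    totals = {}
    for timestamp, volume in records:
        running_sum, count = totals.get(timestamp.hour, (0.0, 0))
        totals[timestamp.hour] = (running_sum + volume, count + 1)
    return {hour: s / n for hour, (s, n) in totals.items()}
```

Features like these would typically be appended to the volume series as extra input columns, so that a sequence model can pick up daily, weekly and monthly patterns.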
Findings
The study has produced a model that is appropriate for the transportation sector in the city and other innovative urban applications. The findings indicate that the implementation of smart transportation systems enhances transportation and has a positive impact on air quality. The study's results are explored and connected to practical applications in the areas of air pollution control and smart transportation.
Originality/value
The present paper has created the machine learning framework for the transportation sector of smart cities that achieves a reasonable level of accuracy. Additionally, the paper examines the effects of smart transportation on both the environment and supply chain.