Search results
1 – 10 of 26
Abstract
Purpose
Identifying the frontiers of a specific research field is one of the most basic tasks in bibliometrics, and research published in leading conferences is crucial to the data mining research community, yet few studies have focused on it. The purpose of this study is to detect the intellectual structure of data mining based on conference papers.
Design/methodology/approach
This study takes as its sample papers from the top nine conferences in the data mining field as ranked by Google Scholar Metrics. Based on publication counts, it first examines the annual output of published documents and their distribution across conferences. Then, from the perspective of keywords, CiteSpace was used to mine the conference papers and identify the frontiers of data mining, focusing on keyword term frequency, keyword betweenness centrality, keyword clustering and burst keywords.
Findings
The research showed that research activity in data mining followed a linear upward trend between 2007 and 2016. Frontier identification based on the conference papers revealed five research hotspots in data mining: clustering, classification, recommendation, social network analysis and community detection. The research content embodied in the conference papers was also very rich.
Originality/value
This study detected the research frontier from leading data mining conference papers. Based on the keyword co-occurrence network, this paper identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016 along four dimensions: keyword term frequency, betweenness centrality, clustering analysis and burst analysis.
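To make the four dimensions concrete, here is a minimal, hypothetical sketch of a keyword co-occurrence analysis: the toy keyword lists and the use of networkx (standing in for CiteSpace) are illustrative assumptions, not the study's actual data or tooling.

```python
# Sketch of a keyword co-occurrence analysis on a toy corpus; the keyword
# lists and networkx (in place of CiteSpace) are illustrative assumptions.
from itertools import combinations
from collections import Counter

import networkx as nx

# Each entry stands in for the keyword list of one conference paper.
papers = [
    ["clustering", "classification", "data mining"],
    ["clustering", "community detection", "social network analysis"],
    ["recommendation", "classification", "data mining"],
    ["social network analysis", "community detection", "data mining"],
]

# Term frequency: how often each keyword appears across papers.
freq = Counter(kw for paper in papers for kw in paper)

# Co-occurrence network: keywords are nodes, shared papers add edge weight.
G = nx.Graph()
for paper in papers:
    for a, b in combinations(sorted(set(paper)), 2):
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# Betweenness centrality flags "pivot" keywords bridging research themes.
bc = nx.betweenness_centrality(G)
top = max(bc, key=bc.get)
print(freq.most_common(3))
print(top, round(bc[top], 3))
```

On this toy data the generic keyword "data mining" bridges the thematic clusters, which is exactly the kind of pivot node betweenness centrality is meant to surface.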
Details
Keywords
The purpose of this paper is to establish and implement a direct topological reanalysis algorithm for general successive structural modifications, based on the updating matrix…
Abstract
Purpose
The purpose of this paper is to establish and implement a direct topological reanalysis algorithm for general successive structural modifications, based on the updating matrix triangular factorization (UMTF) method for non-topological modification proposed by Song et al. [Computers and Structures, 143(2014):60-72].
Design/methodology/approach
In this method, topological modifications are viewed as a union of symbolic and numerical changes to the structural matrices. The numerical part is handled by UMTF, which directly updates the matrix triangular factors. For the symbolic change, an integral structure consisting of all potential nodes/elements is introduced to avoid side effects on efficiency during successive modifications. The necessary pre- and post-processing are also developed for memory-economic matrix manipulation.
Findings
The new reanalysis algorithm is applicable to successive general structural modifications of arbitrary amplitude and location. It explicitly updates the factor matrices of the modified structure and thus guarantees the same accuracy as a full direct analysis while greatly enhancing efficiency.
Practical implications
Examples including evolutionary structural optimization and sequential construction analysis show the capability and efficiency of the algorithm.
Originality/value
This innovative paper makes direct topological reanalysis applicable to successive structural modifications in many different areas.
Details
Keywords
Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan
Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models…
Abstract
Purpose
Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, which only generates keyword-based hot topics with weak interpretability, making it difficult to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure has been proven to model structural information effectively. A key challenge lies in the effective integration of the above information into the hot topic mining process.
Design/methodology/approach
This paper proposes the nonnegative matrix factorization (NMF)-based hot topic mining method, semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is used as a constraint condition to be incorporated into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.
Findings
The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.
Originality/value
Most prior research focuses on keyword-based hot topics. This work advances the literature by mining phrase-based hot topics with syntactic dependency structure, which can effectively capture the semantics. The development of a syntactic dependency structure that combines word order and part-of-speech (POS) is a step forward, as word order and POS are only utilized separately in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components.
Details
Keywords
Yixin Zhang, Lizhen Cui, Wei He, Xudong Lu and Shipeng Wang
The behavioral decision-making of digital-self is one of the important research contents of the network of crowd intelligence. The factors and mechanisms that affect…
Abstract
Purpose
The behavioral decision-making of the digital-self is one of the important research contents of the network of crowd intelligence. The factors and mechanisms that affect decision-making have attracted the attention of many researchers. Among these factors, the mind of the digital-self plays an important role. Exploring the influence mechanism of the digital-self's mind on decision-making helps in understanding the behaviors of the crowd intelligence network and improving transaction efficiency in the network of CrowdIntell.
Design/methodology/approach
In this paper, the authors use a behavioral pattern perception layer, a multi-aspect perception layer and a memory network enhancement layer to adaptively explore the mind of a digital-self and generate its mental representation from three aspects: external behavior, multi-aspect factors of the mind and memory units. The authors use the mental representations to assist behavioral decision-making.
Findings
The evaluation on real-world open data sets shows that the proposed method can model the mind and verify the influence of the mind on behavioral decisions, and its performance is better than the universal baseline methods for modeling user interest.
Originality/value
In general, the authors use the behaviors of the digital-self to mine and explore its mind, which is used to assist the digital-self in making decisions and to promote transactions in the network of CrowdIntell. This work is one of the early attempts to use neural networks to model the mental representation of the digital-self.
Details
Keywords
Hangjing Zhang, Yan Chen and H. Vicky Zhao
The purpose of this paper is to have a review on the analysis of information diffusion based on evolutionary game theory. People now get used to interact over social networks, and…
Abstract
Purpose
The purpose of this paper is to review the analysis of information diffusion based on evolutionary game theory. People now routinely interact over social networks, and one of the most important functions of social networks is information sharing. Understanding the mechanisms of information diffusion over social networks is critical to various applications, including online advertisement and rumor control.
Design/methodology/approach
It has been shown that graphical evolutionary game theory (EGT) is a highly efficient method for studying this problem.
Findings
By applying EGT to information diffusion, the authors can predict each small change in the process, obtain the detailed dynamics and ultimately predict the stable states.
Originality/value
In this paper, the authors provide a general review of the evolutionary game-theoretic framework for information diffusion over social networks by summarizing the results and conclusions of works using graphical EGT.
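A minimal numerical sketch of the replicator dynamics underlying this EGT view of diffusion, with hypothetical payoff values that are not taken from the reviewed works:

```python
# Replicator-dynamics sketch of information diffusion: users choose to
# forward or not forward a piece of information; payoffs are hypothetical.
import numpy as np

# Payoff matrix: rows = my strategy, cols = neighbour's strategy.
u = np.array([[0.8, 0.4],   # forward vs (forward, not-forward)
              [0.3, 0.5]])  # not-forward vs (forward, not-forward)

# Initial forwarding fraction above the unstable threshold (1/6 here),
# so forwarding spreads through the population.
x = 0.3
history = [x]
for _ in range(2000):
    f_fwd = u[0, 0] * x + u[0, 1] * (1 - x)   # fitness of forwarding
    f_not = u[1, 0] * x + u[1, 1] * (1 - x)   # fitness of not forwarding
    phi = x * f_fwd + (1 - x) * f_not          # average fitness
    x = x + x * (f_fwd - phi) * 0.1            # discrete replicator step
    history.append(x)

print(f"stable forwarding fraction ~ {x:.3f}")
```

The trajectory in `history` shows the detailed dynamics the abstract mentions, and the fixed point the iteration converges to is the predicted stable state of the diffusion.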
Details
Keywords
Independent component analysis (ICA) is a widely-used blind source separation technique. ICA has been applied to many applications. ICA is usually utilized as a black box, without…
Abstract
Independent component analysis (ICA) is a widely used blind source separation technique that has been applied in many fields. However, ICA is usually utilized as a black box, without an understanding of its internal details. Therefore, in this paper, the basics of ICA are provided to show how it works, serving as a comprehensive source for researchers interested in this field. This paper starts by introducing the definition and underlying principles of ICA. Different numerical examples are then demonstrated in a step-by-step approach to explain the preprocessing steps of ICA and the mixing and unmixing processes. Finally, different ICA algorithms, challenges and applications are presented.
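As a compact illustration of the mixing and unmixing processes described above, the following sketch mixes two synthetic sources and recovers them with scikit-learn's FastICA (one of the ICA algorithms such surveys cover); the signals and mixing matrix are arbitrary choices.

```python
# ICA demo: two independent sources are linearly mixed, then recovered
# (up to scale, sign and order) with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)

# Independent sources: a sinusoid and a square wave.
s1 = np.sin(2 * np.pi * t)
s2 = np.sign(np.sin(3 * np.pi * t))
S = np.c_[s1, s2]

# Observed signals are unknown linear mixtures X = S @ A.T.
A = np.array([[1.0, 0.5],
              [0.4, 1.2]])
X = S @ A.T

# Unmixing: ICA estimates the independent components from X alone.
ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
S_est = ica.fit_transform(X)

# Each estimated component should correlate almost perfectly with one source.
corr = np.abs(np.corrcoef(S.T, S_est.T))[:2, 2:]
print(np.round(corr, 3))
```

The absolute correlation matrix makes the inherent ICA ambiguities visible: each source matches one estimated component nearly perfectly, but the order and sign of the components are arbitrary.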
Details
Keywords
Qinxu Ding, Ding Ding, Yue Wang, Chong Guan and Bosheng Ding
The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive…
Abstract
Purpose
The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive examination of the research landscape in LLMs, providing an overview of the prevailing themes and topics within this dynamic domain.
Design/methodology/approach
Drawing from an extensive corpus of 198 records published between 1996 and 2023, retrieved from the relevant academic database and encompassing journal articles, books, book chapters, conference papers and selected working papers, this study delves deep into the multifaceted world of LLM research. The authors employed the BERTopic algorithm, a recent advancement in topic modeling, to conduct a comprehensive analysis of the data after it had been meticulously cleaned and preprocessed. BERTopic leverages the power of transformer-based language models such as bidirectional encoder representations from transformers (BERT) to generate more meaningful and coherent topics. This approach facilitates the identification of hidden patterns within the data, enabling the authors to uncover valuable insights that might otherwise have remained obscure.
Findings
The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.
Practical implications
This classification offers practical guidance for researchers, developers, educators, and policymakers to focus efforts and resources. The study underscores the importance of addressing challenges in LLMs, including potential biases, transparency, data privacy, and responsible deployment. Policymakers can utilize this information to shape regulations, while developers can tailor technology development based on the diverse applications identified. The findings also emphasize the need for interdisciplinary collaboration and highlight ethical considerations, providing a roadmap for navigating the complex landscape of LLM research and applications.
Originality/value
This study stands out as the first to examine the evolution of LLMs across such a long time frame and across such diversified disciplines. It provides a unique perspective on the key areas of LLM research, highlighting the breadth and depth of LLM’s evolution.
Details
Keywords
Annye Braca and Pierpaolo Dondio
Prediction is a critical task in targeted online advertising, where predictions better than random guessing can translate to real economic return. This study aims to use machine…
Abstract
Purpose
Prediction is a critical task in targeted online advertising, where predictions better than random guessing can translate to real economic return. This study aims to use machine learning (ML) methods to identify individuals who respond well to certain linguistic styles/persuasion techniques based on Aristotle’s means of persuasion, rhetorical devices, cognitive theories and Cialdini’s principles, given their psychometric profile.
Design/methodology/approach
A total of 1,022 individuals took part in the survey; participants were asked to fill out the ten-item personality measure questionnaire to capture personality traits and the dysfunctional attitude scale (DAS) to measure dysfunctional beliefs and cognitive vulnerabilities. ML classification models using participant profiling information as input were developed to predict the extent to which an individual was influenced by statements containing different linguistic styles/persuasion techniques. Several ML algorithms were used, including support vector machine, LightGBM and Auto-Sklearn, to predict the effect of each technique given each individual's profile (personality, belief system and demographic data).
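A schematic version of this modelling setup on synthetic data; the feature names, the response rule and scikit-learn's gradient boosting (standing in for LightGBM/Auto-Sklearn) are all illustrative assumptions, not the study's data or models.

```python
# Synthetic sketch: psychometric profile features predict whether a
# persuasion technique influences a respondent. All data are simulated.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 600

# Hypothetical profile: a personality trait, a DAS score and age.
X = np.c_[rng.normal(size=n),            # e.g. a personality-trait score
          rng.normal(size=n),            # e.g. DAS dysfunctional-belief score
          rng.integers(18, 70, size=n)]  # age

# Synthetic ground truth: trait and DAS score drive the influence effect.
y = ((0.8 * X[:, 0] + 1.2 * X[:, 1]
      + rng.normal(scale=0.5, size=n)) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

Comparing several such classifiers on held-out accuracy, as the study does across SVM, LightGBM and Auto-Sklearn, is what reveals which techniques are predictable from which profile features.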
Findings
The findings highlight the importance of incorporating emotion-based variables as model input in predicting the influence of textual statements with embedded persuasion techniques. Across all investigated models, the influence effect could be predicted with an accuracy ranging from 53% to 70%, indicating the importance of testing multiple ML algorithms in the development of a persuasive communication (PC) system. The classification ability of models was highest when predicting the response to statements using rhetorical devices and flattery persuasion techniques. By contrast, techniques such as authority or social proof were less predictable. Adding DAS scale features improved model performance, suggesting they may be important in modelling persuasion.
Research limitations/implications
In this study, the survey was limited to English-speaking countries and largely Western society values. More work is needed to ascertain the efficacy of the models for other populations, cultures and languages. Most PC efforts target groups such as users, clients, shoppers and voters, whereas this study was set in the communication context of education; further research is required to explore the capability of predictive ML models in other contexts. Finally, long self-reported psychological questionnaires may not be suitable for real-world deployment and could be subject to bias, so a simpler method needs to be devised to gather user profile data, such as using a subset of the most predictive features.
Practical implications
The findings of this study indicate that leveraging richer profiling data in conjunction with ML approaches may assist in the development of enhanced persuasive systems. There are many applications such as online apps, digital advertising, recommendation systems, chatbots and e-commerce platforms which can benefit from integrating persuasion communication systems that tailor messaging to the individual – potentially translating into higher economic returns.
Originality/value
This study integrates sets of features that have heretofore not been used together in developing ML-based predictive models of PC. DAS scale data, which relate to dysfunctional beliefs and cognitive vulnerabilities, were assessed for their importance in identifying effective persuasion techniques. Additionally, the work compares a range of persuasion techniques that thus far have only been studied separately. This study also demonstrates the application of various ML methods in predicting the influence of linguistic styles/persuasion techniques within textual statements and shows that a robust methodology comparing a range of ML algorithms is important in the discovery of a performant model.
Details
Keywords
Anette Rantanen, Joni Salminen, Filip Ginter and Bernard J. Jansen
User-generated social media comments can be a useful source of information for understanding online corporate reputation. However, the manual classification of these comments is…
Abstract
Purpose
User-generated social media comments can be a useful source of information for understanding online corporate reputation. However, the manual classification of these comments is challenging due to their high volume and unstructured nature. The purpose of this paper is to develop a classification framework and machine learning model to overcome these limitations.
Design/methodology/approach
The authors create a multi-dimensional classification framework for online corporate reputation that includes six main dimensions synthesized from the prior literature: quality, reliability, responsibility, successfulness, pleasantness and innovativeness. To evaluate the classification framework's performance on real data, the authors retrieve 19,991 social media comments about two Finnish banks and use a convolutional neural network (CNN) to automatically classify the comments based on manually annotated training data.
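To sketch the kind of text CNN involved, the following toy forward pass runs word embeddings through convolution filters, max-over-time pooling and a softmax over six hypothetical reputation classes; the weights are random, so this shows only the architecture, not the trained model.

```python
# Toy forward pass of a 1D text CNN: embeddings -> n-gram convolutions
# -> max-over-time pooling -> softmax over six reputation dimensions.
# Weights are random; this sketches the architecture only.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim, seq_len = 1000, 16, 20
n_filters, kernel, n_classes = 8, 3, 6

E = rng.normal(size=(vocab_size, emb_dim))          # embedding table
W = rng.normal(size=(n_filters, kernel, emb_dim))   # conv filters
Wo = rng.normal(size=(n_filters, n_classes))        # output layer

tokens = rng.integers(0, vocab_size, size=seq_len)  # a tokenised comment
x = E[tokens]                                       # (seq_len, emb_dim)

# Convolve each filter over every n-gram window, ReLU, then pool.
conv = np.array([[np.maximum((x[i:i + kernel] * W[f]).sum(), 0.0)
                  for i in range(seq_len - kernel + 1)]
                 for f in range(n_filters)])
pooled = conv.max(axis=1)                           # (n_filters,)

logits = pooled @ Wo
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(np.round(probs, 3))
```

The softmax output assigns one probability per reputation dimension; training such a network on the annotated comments is what turns these random weights into the classifier the study evaluates.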
Findings
After parameter optimization, the neural network achieves an accuracy between 52.7 and 65.2 percent on real-world data, which is reasonable given the high number of classes. The findings also indicate that prior work has not captured all the facets of online corporate reputation.
Practical implications
For practical purposes, the authors provide a comprehensive classification framework for online corporate reputation, which companies and organizations operating in various domains can use. Moreover, the authors demonstrate that using a limited amount of training data can yield a satisfactory multiclass classifier when using CNN.
Originality/value
This is the first attempt at automatically classifying online corporate reputation using an online-specific classification framework.
Details
Keywords
Michael Leumüller, Karl Hollaus and Joachim Schöberl
This paper aims to consider a multiscale electromagnetic wave problem for a housing with a ventilation grill. Using the standard finite element method to discretise the apertures…
Abstract
Purpose
This paper aims to consider a multiscale electromagnetic wave problem for a housing with a ventilation grill. Using the standard finite element method to discretise the apertures leads to an unduly large number of unknowns. An efficient approach to simulate the multiple scales is introduced. The aim is to significantly reduce the computational costs.
Design/methodology/approach
A domain decomposition technique with upscaling is applied to cope with the different scales. The idea is to split the domain of computation into an exterior domain and multiple non-overlapping sub-domains. Each sub-domain represents a single aperture and uses the same finite element mesh. The identical mesh of the sub-domains is efficiently exploited by the hybrid discontinuous Galerkin method and a Schur complement which facilitates the transition from fine meshes in the sub-domains to a coarse mesh in the exterior domain. A coarse skeleton grid is used on the interface between the exterior domain and the individual sub-domains to avoid large dense blocks in the finite element discretisation matrix.
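The Schur-complement step can be illustrated on a small dense system: interior unknowns are condensed onto the interface (skeleton) unknowns, the reduced system is solved, and the interior values are recovered. The random SPD matrix below is a stand-in for an actual FEM discretisation, not the paper's problem.

```python
# Schur-complement (static condensation) sketch on a random SPD system:
# eliminate interior DOFs, solve the reduced interface system, then
# back-substitute, and check against a monolithic solve.
import numpy as np

rng = np.random.default_rng(1)
n_i, n_b = 6, 3                      # interior and interface (skeleton) DOFs
M = rng.normal(size=(n_i + n_b, n_i + n_b))
K = M @ M.T + (n_i + n_b) * np.eye(n_i + n_b)   # SPD system matrix
f = rng.normal(size=n_i + n_b)

A = K[:n_i, :n_i]; B = K[:n_i, n_i:]
C = K[n_i:, :n_i]; D = K[n_i:, n_i:]
f_i, f_b = f[:n_i], f[n_i:]

# Interface system: (D - C A^{-1} B) x_b = f_b - C A^{-1} f_i
A_inv_B = np.linalg.solve(A, B)
A_inv_fi = np.linalg.solve(A, f_i)
S = D - C @ A_inv_B
x_b = np.linalg.solve(S, f_b - C @ A_inv_fi)
x_i = A_inv_fi - A_inv_B @ x_b       # recover interior unknowns

x_full = np.linalg.solve(K, f)       # reference: monolithic solve
print(np.allclose(np.r_[x_i, x_b], x_full))
```

Because identical sub-domains share the same interior block, the factorisation of A (and hence the condensation work) can be reused across all apertures, which is the source of the favourable scaling reported in the findings.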
Findings
Applying a Schur complement to the identical discretisation of the sub-domains leads to a method that scales very well with respect to the number of apertures.
Originality/value
The error compared to the standard finite element method is negligible and the computational costs are significantly reduced.
Details