Search results

1 – 10 of 18
Open Access
Article
Publication date: 10 August 2022

Jie Ma, Zhiyuan Hao and Mo Hu

Abstract

Purpose

The density peak clustering algorithm (DP) identifies cluster centers using two parameters: the ρ value (local density) and the δ value (the distance between a point and the nearest point with a higher ρ value). According to the DP's center-identifying principle, potential cluster centers should have higher ρ and δ values than other points. However, this principle may prevent the DP from identifying categories with multiple centers or centers in lower-density regions. In addition, the DP's assignment strategy can assign non-center points to the wrong cluster. This paper aims to address these issues and improve the clustering performance of the DP.
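
The two DP parameters described above can be sketched as follows. This is a minimal illustration of the basic DP only, not of TMsDP. The cutoff-kernel density, the cutoff distance `d_c`, the ranking score γ = ρ·δ and the sample points are all assumptions made for this example; the abstract does not specify them.

```python
import math

def density_peak_scores(points, d_c):
    """Return (rho, delta) for the basic DP, assuming a cutoff kernel."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points closer than the assumed cutoff d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # distances to every point with strictly higher density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser point takes its largest distance
        # to any other point (a common convention, assumed here).
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Hypothetical data: a group of four points and a group of three points.
points = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (4.9, 5.0)]
rho, delta = density_peak_scores(points, d_c=0.12)

# Rank by gamma = rho * delta and take the two highest as candidate centers.
gamma = [r * d for r, d in zip(rho, delta)]
ranked = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)
centers = sorted(ranked[:2])
print(rho, centers)
```

In this toy data the second center has a lower ρ than the first, which hints at why centers in lower-density regions are harder to detect when both ρ and δ must be high.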

Design/methodology/approach

First, to identify as many potential cluster centers as possible, the authors construct a point-domain by introducing the pinhole imaging strategy to extend the search range for potential cluster centers. Second, they design novel methods for calculating the domain distance, point-domain density and domain similarity. Third, they use domain similarity to drive the domain merging process and optimize the final clustering results.

Findings

Experimental results on 12 synthetic data sets and 12 real-world data sets show that two-stage density peak clustering based on multi-strategy optimization (TMsDP) outperforms the DP and other state-of-the-art algorithms.

Originality/value

The authors propose a novel DP-based clustering method, TMsDP, which transforms the relationship between points into a relationship between domains to further optimize the clustering performance of the DP.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 5 September 2016

Qingyuan Wu, Changchen Zhan, Fu Lee Wang, Siyang Wang and Zeping Tang

Abstract

Purpose

The rapid growth of web-based and mobile e-learning applications such as massive open online courses has created a large volume of online learning resources. Given such a large amount of learning data, it is important to develop effective clustering approaches for user group modeling and intelligent tutoring. The paper aims to discuss these issues.

Design/methodology/approach

In this paper, a minimum spanning tree based approach is proposed for clustering online learning resources. The approach has two main stages: an elimination stage and a construction stage. During the elimination stage, the Euclidean distance is adopted as the metric for measuring the density of learning resources. Resources with very low densities are identified as outliers and removed. During the construction stage, a minimum spanning tree is built by initializing the centroids according to the degree of freedom of the resources. Online learning resources are subsequently partitioned into clusters by exploiting the structure of the minimum spanning tree.
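
The two stages above can be illustrated with a generic minimum-spanning-tree clustering sketch. It is not the authors' algorithm: the neighbour-count density rule, the `radius` and `min_neighbours` parameters, the choice to cut the longest tree edges, and the sample points are assumptions for illustration, and the centroid initialization by degree of freedom is not reproduced.

```python
import math

def mst_clusters(points, k, radius, min_neighbours=1):
    """Drop sparse points, build an MST (Prim), cut the k-1 longest edges."""
    # Elimination stage (assumed rule): keep points that have at least
    # `min_neighbours` other points within Euclidean distance `radius`.
    kept = [p for i, p in enumerate(points)
            if sum(1 for j, q in enumerate(points)
                   if j != i and math.dist(p, q) <= radius) >= min_neighbours]
    n = len(kept)
    # Construction stage: Prim's algorithm over the remaining points.
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        w, i, j = min((math.dist(kept[i], kept[j]), i, j)
                      for i in in_tree for j in range(n) if j not in in_tree)
        in_tree.add(j)
        edges.append((w, i, j))
    # Partition: keep the n-k shortest tree edges, i.e. cut the k-1 longest.
    edges.sort()
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in edges[:n - k]:
        parent[find(i)] = find(j)
    groups = {}
    for idx in range(n):
        groups.setdefault(find(idx), []).append(kept[idx])
    return sorted(sorted(g) for g in groups.values())

# Hypothetical resources embedded as 2-D points; (50, 50) is an outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
clusters = mst_clusters(points, k=2, radius=2)
print(clusters)
```

Removing the outlier before building the tree matters here: otherwise the longest edge would attach the outlier, and cutting it would isolate that single point instead of separating the two groups.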

Findings

Conventional clustering algorithms have a number of shortcomings that prevent them from handling online learning resources effectively. On the one hand, existing partitional clustering methods use a randomly assigned centroid for each cluster, which often leads to ineffective clustering results. On the other hand, classical density-based clustering methods are computationally expensive and time-consuming. Experimental results indicate that the proposed algorithm outperforms traditional clustering algorithms for online learning resources.

Originality/value

The effectiveness of the proposed algorithms has been validated by using several data sets. Moreover, the proposed clustering algorithm has great potential in e-learning applications. It has been demonstrated how the novel technique can be integrated in various e-learning systems. For example, the clustering technique can classify learners into groups so that homogeneous grouping can improve the effectiveness of learning. Moreover, clustering of online learning resources is valuable to decision making in terms of tutorial strategies and instructional design for intelligent tutoring. Lastly, a number of directions for future research have been identified in the study.

Details

Asian Association of Open Universities Journal, vol. 11 no. 2
Type: Research Article
ISSN: 1858-3431

Open Access
Article
Publication date: 13 November 2018

Zhiwen Pan, Wen Ji, Yiqiang Chen, Lianjun Dai and Jun Zhang

Abstract

Purpose

Disability datasets contain information on disabled populations. By analyzing these datasets, professionals who work with disabled populations can better understand the inherent characteristics of those populations, so that working plans and policies that effectively help them can be made accordingly.

Design/methodology/approach

In this paper, the authors propose a big data management and analytic approach for disability datasets.

Findings

By using a set of data mining algorithms, the proposed approach can provide the following services. The data management scheme in the approach can improve the quality of disability data by estimating missing attribute values and detecting anomalous and low-quality data instances. The data mining scheme in the approach can explore useful patterns that reflect the correlation, association and interaction between the disability data attributes. Experiments based on a real-world dataset are conducted at the end to demonstrate the effectiveness of the approach.

Originality/value

The proposed approach can enable data-driven decision-making for professionals who work with disabled populations.

Details

International Journal of Crowd Science, vol. 2 no. 2
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 26 October 2020

Mohammed S. Al-kahtani, Lutful Karim and Nargis Khan

Abstract

Designing an efficient routing protocol that opportunistically forwards data to the destination node through nearby sensor nodes or devices is important for an effective incident response and disaster recovery framework. Existing sensor routing protocols are mostly ineffective in such disaster recovery applications because the networks are affected (destroyed or overused) in disasters such as earthquakes, floods, tsunamis and wildfires. These protocols require a large number of message transmissions to reestablish the clusters and communications, which is not energy efficient and results in packet loss. This paper introduces ODCR, an energy-efficient and reliable opportunistic density clustered-based routing protocol for such emergency sensor applications. We perform simulations to measure the performance of the ODCR protocol in terms of network energy consumption, throughput and packet loss ratio. Simulation results demonstrate that the ODCR protocol outperforms the existing TEEN, LEACH and LORA protocols in terms of these performance metrics.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 19 December 2023

Qinxu Ding, Ding Ding, Yue Wang, Chong Guan and Bosheng Ding

Abstract

Purpose

The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive examination of the research landscape in LLMs, providing an overview of the prevailing themes and topics within this dynamic domain.

Design/methodology/approach

Drawing on a corpus of 198 records published between 1996 and 2023 in the relevant academic database, encompassing journal articles, books, book chapters, conference papers and selected working papers, this study examines the multifaceted world of LLM research. The authors employed the BERTopic algorithm, a recent advancement in topic modeling, to analyze the data after it had been cleaned and preprocessed. BERTopic leverages transformer-based language models such as bidirectional encoder representations from transformers (BERT) to generate more meaningful and coherent topics. This approach facilitates the identification of hidden patterns within the data, enabling the authors to uncover insights that might otherwise have remained obscure.

Findings

The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Practical implications

This classification offers practical guidance for researchers, developers, educators, and policymakers to focus efforts and resources. The study underscores the importance of addressing challenges in LLMs, including potential biases, transparency, data privacy, and responsible deployment. Policymakers can utilize this information to shape regulations, while developers can tailor technology development based on the diverse applications identified. The findings also emphasize the need for interdisciplinary collaboration and highlight ethical considerations, providing a roadmap for navigating the complex landscape of LLM research and applications.

Originality/value

This study stands out as the first to examine the evolution of LLMs across such a long time frame and across such diversified disciplines. It provides a unique perspective on the key areas of LLM research, highlighting the breadth and depth of LLM’s evolution.

Details

Journal of Electronic Business & Digital Economics, vol. 3 no. 1
Type: Research Article
ISSN: 2754-4214

Open Access
Article
Publication date: 10 February 2020

Veronika Fenyves, Kinga Emese Zsido, Ioan Bircea and Tibor Tarnoczi

Abstract

Purpose

Changes in food retailing (globalization, concentration) have negative impacts on smaller, “traditional” food retail businesses. Their market share is decreasing year by year. The purpose of this study is to examine and compare the financial performance of these businesses under the given circumstances and current economic environment in a Hungarian and a Romanian county.

Design/methodology/approach

The study is based on two complete databases, including all companies engaged in retail food activity (according to the NACE code) in the counties of Hajdú-Bihar (Hungary) and Cluj (Romania). The databases contain the financial statements for five consecutive years for 212 and 690 businesses, respectively. The databases were examined using the most typical financial indicators, with multivariate and univariate analysis of variance and k-medoid cluster analysis.
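
To make the k-medoid step concrete, here is a minimal sketch of a simple alternating k-medoids heuristic. The abstract does not say which k-medoids variant or software was used, so the algorithm choice, the naive initialization and the two-indicator sample data are all assumptions for illustration.

```python
import math

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids: assign to nearest medoid, then re-pick medoids."""
    medoids = list(range(k))  # naive deterministic start: the first k points
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
                  for p in points]
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            new_medoids.append(min(
                members,
                key=lambda m: sum(math.dist(points[m], points[o])
                                  for o in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels

# Hypothetical companies described by two financial ratios each.
ratios = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
medoids, labels = k_medoids(ratios, k=2)
print(medoids, labels)
```

Unlike k-means centroids, each medoid is an actual observation, which makes the representative of each group a real company rather than an average, and reduces sensitivity to extreme values.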

Findings

The results of the analysis show that the retail food companies in the two counties differ both in number and in financial performance. Companies in Hajdú-Bihar county perform better in terms of financial ratios than those in Cluj county. The groups created by k-medoids cluster analysis are relatively well distinguished in the case of Hajdú-Bihar county, while the picture is much more mixed in the case of Cluj county. However, it is also important to note that the companies analyzed should generally perform better to survive.

Research limitations/implications

Among the limitations of the study, it is important to note that the findings are relevant only to the two counties examined. Another limiting factor is that quite a few companies had to be excluded from the analysis due to missing data or outliers.

Practical implications

The study presents to corporate decision-makers the current performance of the companies in the examined sector in the two counties. The results highlight the business areas that should concern management. The findings show that these companies need to improve their performance to strengthen their market position. We believe that it is not enough to complain about the expansion of the supermarket chains; the companies should take appropriate action to improve their situation. Based on the results of the study, it can be concluded that retail food companies in both counties need to improve their financial efficiency to survive in the long run. This improvement is essential because retailers can play an important role in smaller settlements and narrower residential environments.

Originality/value

Comparative analysis of retail food companies in similar counties in these two neighboring countries has not been conducted using complex financial analysis. The study revealed the common and/or individual characteristics of these companies.

Details

British Food Journal, vol. 122 no. 11
Type: Research Article
ISSN: 0007-070X

Open Access
Article
Publication date: 26 April 2024

Xue Xin, Yuepeng Jiao, Yunfeng Zhang, Ming Liang and Zhanyong Yao

Abstract

Purpose

This study aims to ensure reliable analysis of dynamic responses in asphalt pavement structures. It investigates noise reduction and data mining techniques for pavement dynamic response signals.

Design/methodology/approach

The paper first conducts time-frequency analysis of pavement dynamic response signals. It then evaluates two common noise reduction methods, namely low-pass filtering and wavelet decomposition reconstruction, for their effectiveness in reducing noise in these signals. Furthermore, because these signals are generated in response to vehicle loading, they contain a substantial amount of data and are prone to environmental interference, potentially resulting in outliers. Hence, it becomes crucial to extract dynamic strain response features (e.g. peaks and peak intervals) efficiently and in real time.

Findings

The study introduces an improved density-based spatial clustering of applications with noise (DBSCAN) algorithm for identifying outliers in denoised data. The results demonstrate that low-pass filtering is highly effective in reducing noise in pavement dynamic response signals within specified frequency ranges. In testing, the improved DBSCAN algorithm effectively identifies outliers in these signals. Furthermore, the peak detection process, using the enhanced findpeaks function, consistently achieves excellent performance in identifying peak values, even when complex multi-axle heavy-duty truck strain signals are present.
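
As a rough illustration of DBSCAN-based outlier detection, the sketch below implements plain DBSCAN and flags the points it labels as noise. It does not reproduce the authors' improved variant; the `eps` and `min_pts` values and the strain readings are hypothetical.

```python
import math

def dbscan_labels(points, eps, min_pts):
    """Plain DBSCAN: one cluster id per point, -1 for noise."""
    n = len(points)
    # Neighbourhoods include the point itself.
    neighbours = [[j for j in range(n)
                   if math.dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels

# Hypothetical denoised strain readings with one spurious spike.
strain = [10.0, 10.2, 9.9, 10.1, 55.0, 10.3, 9.8]
labels = dbscan_labels([(v,) for v in strain], eps=0.5, min_pts=3)
outliers = [i for i, lab in enumerate(labels) if lab == -1]
print(outliers)
```

Points that cannot be reached from any dense region keep the noise label, which is what makes DBSCAN usable as an anomaly detector without fixing the number of clusters in advance.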

Originality/value

The authors identified a suitable frequency domain range for low-pass filtering in asphalt road dynamic response signals, revealing minimal amplitude loss and effective strain information reflection between road layers. Furthermore, the authors introduced the DBSCAN-based anomaly data detection method and enhancements to the MATLAB findpeaks function, enabling the detection of anomalies in road sensor data and automated peak identification.

Details

Smart and Resilient Transportation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2632-0487

Open Access
Article
Publication date: 28 July 2020

Prabhat Pokharel, Roshan Pokhrel and Basanta Joshi

Abstract

Analysis of log messages is very important for identifying suspicious system and network activity. This analysis requires the correct extraction of variable entities. The variable entities are extracted by comparing the log messages against the log patterns. Each of these log patterns can be represented in the form of a log signature. In this paper, we present a hybrid approach for log signature extraction. The approach consists of two modules. The first module identifies log patterns by generating log clusters. The second module uses Named Entity Recognition (NER) to extract signatures from the extracted log clusters. Experiments were performed on event logs from the Windows operating system, Exchange and Unix, and the results were validated by comparing the signatures and the variable entities against the standard log documentation. The experiments showed that the extracted signatures were ready to be used with a high degree of accuracy.

Details

Applied Computing and Informatics, vol. 19 no. 1/2
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 26 April 2022

Katarzyna Piwowar-Sulej, Sławomir Wawak, Małgorzata Tyrańska, Małgorzata Zakrzewska, Szymon Jarosz and Mariusz Sołtysik

Abstract

Purpose

The purpose of the study was to detect trends in human resource management (HRM) research presented in journals during the 2000–2020 timeframe. The research question is: How are the interests of researchers changing in the field of HRM and which topics have gained popularity in recent years?

Design/methodology/approach

The approach adopted in this study was designed to overcome the limitations specific to the systematic literature reviews and bibliometric studies presented in the Introduction. The full texts of papers were analyzed. The text-mining tools first detected clusters and then trends, which limited the impact of researcher bias. The approach applied is consistent with the general rules of systematic literature reviews.

Findings

The article makes a threefold contribution to academic knowledge. First, it uses modern methodology to gather and synthesize HRM research topics. The proposed approach was designed to allow early detection of nascent, non-obvious trends in research, which will help researchers address topics of high value for both theory and practice. Second, the results of our study highlight shifts in focus in HRM over the past 19 years. Third, the article suggests further directions of research.

Research limitations/implications

In this study, an approach designed to overcome the limitations of systematic literature reviews was presented. The analysis was done on the basis of the full text of the articles, and the categories were discovered directly from the articles rather than predetermined. The study's findings may, however, potentially be limited by the following issues. First, the eligibility criteria included only papers indexed in the Scopus and WoS databases and excluded conference proceedings, book chapters, and non-English papers. Second, only full-text articles were included in the study, which could narrow down the research area. As a consequence, important information regarding the research presented in the excluded documents is potentially lost. Third, most of the papers in our database were published in the International Journal of Human Resource Management, and therefore such trends as “challenges for international HRM” can be considered significant (long-lasting). Another – the fourth – limitation of the study is the lack of estimation of the proportion between searches in HRM journals and articles published in other journals. Future research may overcome the above-presented limitations. Although the authors used valuable techniques such as TF-IDF and HDBSCAN, the fifth limitation is that, after trends were discovered, it was necessary to evaluate and interpret them. That could have induced researchers' bias even if – as in this study – researchers from different areas of experience were involved. Finally, this study covers the 2000–2020 timeframe. Since HRM is a rapidly developing field, in a few years from now academics will probably begin to move into exciting new research areas. As a consequence, it might be worthwhile conducting analyses similar to those presented in this study and comparing their results.

Originality/value

The present study provides an analysis of HRM journals with the aim of establishing trends in HRM research. It makes contributions to the field by providing a more comprehensive and objective review than analyses resulting from systematic literature reviews. It fills the gap in literature studies on HRM with a novel research approach – a methodology based on full-text mining and a big data toolset. As a consequence, this study can be considered as providing an adequate reflection of all the articles published in journals strictly devoted to HRM issues and which may serve as an important source of reference for both researchers and practitioners. This study can help them identify the core journals focused on HRM research as well as topics which are of particular interest and importance.

Details

International Journal of Manpower, vol. 44 no. 1
Type: Research Article
ISSN: 0143-7720

Open Access
Article
Publication date: 9 August 2023

Jie Zhang, Yuwei Wu, Jianyong Gao, Guangjun Gao and Zhigang Yang

Abstract

Purpose

This study aims to explore the formation mechanism of aerodynamic noise of a high-speed maglev train and understand the characteristics of dipole and quadrupole sound sources of the maglev train at different speed levels.

Design/methodology/approach

Based on the large eddy simulation (LES) method and the Kirchhoff–Ffowcs Williams and Hawkings (K-FWH) equations, the characteristics of dipole and quadrupole sound sources of maglev trains at different speed levels were simulated and analyzed by constructing a reasonable penetrable integral surface.

Findings

The spatial disturbance resulting from the separation of the boundary layer in the streamlined area of the tail car is the source of aerodynamic sound of the maglev train. The dipole sources are mainly distributed around the radio terminals of the head and tail cars, the bottom of the arms of the streamlined parts of the head and tail cars, and the nose tip area of the streamlined part of the tail car. The quadrupole sources are mainly distributed in the wake area. When the train runs at speeds of 400, 500 and 600 km·h⁻¹, the quadrupole sources account for 62.4%, 63.3% and 71.7% of the radiated energy, respectively, exceeding the contribution of the dipole sources.

Originality/value

This study can help understand the aerodynamic noise characteristics generated by the high-speed maglev train and provide a reference for the optimization design of its aerodynamic shape.

Details

Railway Sciences, vol. 2 no. 3
Type: Research Article
ISSN: 2755-0907
