Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 21 May 2021

Yue Huang, Hu Liu and Jing Pan

Abstract

Purpose

Identifying the frontiers of a research field is one of the most basic tasks in bibliometrics, and research published in leading conferences is crucial to the data mining research community, yet few studies have focused on it. The purpose of this study is to detect the intellectual structure of data mining based on conference papers.

Design/methodology/approach

This study takes as its sample the papers of the top nine data mining conferences ranked by Google Scholar Metrics. Based on paper counts, it first examines the annual number of published papers and their distribution across conferences. Then, from the perspective of keywords, CiteSpace is used to analyze the conference papers and identify the frontiers of data mining, focusing on keyword term frequency, keyword betweenness centrality, keyword clustering and burst keywords.
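
As a rough illustration of two of the keyword measures named above (term frequency and betweenness centrality on a keyword co-occurrence network), the following minimal sketch uses only the Python standard library. It is not CiteSpace's implementation: the function names, the input format (one keyword list per paper) and the choice of unnormalized centrality are assumptions made for the example.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def cooccurrence_graph(papers):
    """Return (term frequencies, undirected co-occurrence graph).

    `papers` is a hypothetical input: one list of keywords per paper.
    Two keywords are linked when they appear on the same paper.
    """
    freq = Counter()
    graph = defaultdict(set)
    for keywords in papers:
        unique = sorted(set(keywords))
        freq.update(unique)
        for kw in unique:
            graph[kw]  # make sure isolated keywords still become nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return freq, graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from source
        dist = dict.fromkeys(graph, -1)   # -1 marks "not reached yet"
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}
```

A keyword with high betweenness sits on many shortest paths between other keywords, which is why it is often read as a bridge between research topics.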

Findings

The results showed that research interest in data mining followed a linear upward trend between 2007 and 2016. The frontier identification based on the conference papers revealed five research hotspots in data mining: clustering, classification, recommendation, social network analysis and community detection. The research content of the conference papers was also very rich.

Originality/value

This study detected the research frontier from leading data mining conference papers. Based on the keyword co-occurrence network, and along four dimensions (keyword term frequency, betweenness centrality, clustering analysis and burst analysis), this paper identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016.

Details

International Journal of Crowd Science, vol. 5 no. 2
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 6 March 2017

Zhuoxuan Jiang, Chunyan Miao and Xiaoming Li

Abstract

Purpose

Recent years have witnessed the rapid development of massive open online courses (MOOCs). With more and more courses being produced by instructors and taken by learners all over the world, an unprecedented mass of educational resources has been aggregated. The educational resources include videos, subtitles, lecture notes, quizzes, etc., on the teaching side, and forum contents, Wiki, log of learning behavior, log of homework, etc., on the learning side. However, the data are both unstructured and diverse. To facilitate knowledge management and mining on MOOCs, extracting keywords from the resources is important. This paper aims to adapt state-of-the-art techniques to MOOC settings and evaluate their effectiveness on real data. In terms of practice, this paper also tries to answer, for the first time, to what extent MOOC resources can support keyword extraction models, and how much human effort is required to make the models work well.

Design/methodology/approach

Based on which side generates the data, i.e. instructors or learners, the data are classified into teaching resources and learning resources, respectively. The approach used on teaching resources is based on machine learning models with labels, while the approach used on learning resources is based on a graph model without labels.
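
The abstract does not say which graph model is applied to the learning resources. One common label-free choice is to rank words by PageRank on a word co-occurrence graph (a TextRank-style approach). The sketch below shows that idea only, under stated assumptions: the function names, the window size, the damping factor and the iteration count are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_word_graph(tokens, window=1):
    """Link each token to the `window` tokens that follow it (undirected)."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected graph.

    Every node in `graph` has at least one neighbor, because nodes are
    only created when an edge is added, so no dangling-node handling
    is needed here.
    """
    nodes = list(graph)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def top_keywords(tokens, k=3, window=1):
    """Return the k highest-ranked words as keyword candidates."""
    rank = pagerank(build_word_graph(tokens, window))
    return sorted(rank, key=lambda w: (-rank[w], w))[:k]
```

No labeled data is needed: a word ranks highly simply because it co-occurs with many other well-connected words in the forum text.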

Findings

From the teaching resources, the methods used by the authors can accurately extract keywords with only 10 per cent labeled data. The authors find that resources of different forms, e.g. subtitles and PPTs, should be considered separately because they differ in modeling ability. From the learning resources, the keywords extracted from MOOC forums are not as domain-specific as those extracted from teaching resources, but they can reflect the topics that are actively discussed in forums. Instructors can then use these keywords as feedback. The authors implement two applications with the extracted keywords: generating a concept map and generating a learning path. The visual demos show they have the potential to improve learning efficiency when they are integrated into a real MOOC platform.

Research limitations/implications

Conducting keyword extraction on MOOC resources is quite difficult because teaching resources are hard to obtain due to copyright. Also, getting labeled data is tough because expertise in the corresponding domain is usually required.

Practical implications

The experimental results indicate that MOOC resources are good enough for building keyword extraction models, and that an acceptable balance between human effort and model accuracy can be achieved.

Originality/value

This paper presents a pioneering study of keyword extraction on MOOC resources and obtains some new findings.

Details

International Journal of Crowd Science, vol. 1 no. 1
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 13 July 2020

Dalia Hamed

Abstract

Purpose

The purpose of this study is to apply a corpus-assisted analysis of keywords and their collocations in the US presidential discourse from Clinton to Trump to discover the meanings of these words and the collocates they have. Keywords are salient words in a corpus whose frequency is unusually high (positive keywords) or low (negative keywords) in comparison with a reference corpus. Collocation is the co-occurrence of words.
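
To make the notion of positive and negative keywords concrete, here is a minimal sketch of one widely used keyness statistic, the log-likelihood ratio. The abstract does not state which statistic or settings were used, so the choice of statistic, the function name and the sign convention (negative values for words that are relatively rarer in the target corpus) are assumptions made for illustration.

```python
import math


def keyness(freq_target, size_target, freq_ref, size_ref):
    """Signed log-likelihood keyness of a word.

    freq_target / freq_ref: occurrences of the word in each corpus.
    size_target / size_ref: total number of tokens in each corpus.
    Positive result: unusually frequent in the target corpus.
    Negative result: unusually infrequent in the target corpus.
    """
    total = freq_target + freq_ref
    # Expected counts if the word were equally likely in both corpora.
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)

    ll = 0.0
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    ll *= 2

    # Attach a sign based on the relative frequencies.
    if freq_target / size_target < freq_ref / size_ref:
        return -ll
    return ll
```

For example, a word occurring 50 times in a 1,000-token target corpus and 10 times in a 1,000-token reference corpus receives a positive score, while swapping the two counts gives the same magnitude with a negative sign.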

Design/methodology/approach

To achieve this purpose, keywords and collocations are generated with AntConc, a corpus-processing software tool.

Findings

This analysis sheds light on the similarities and/or differences amongst the past four American presidents concerning their key topics. Keyword analysis through keyness makes it evident that Clinton and Obama, being Democrats, demonstrate a clear tendency to improve Americans’ lives within their social sphere. Obama surpasses Clinton as regards foreign affairs. Clinton and Obama’s infrequent subjects have to do with terrorism and immigration. This is consistent with their concentrated focus on social and economic improvements. Bush, a Republican, concentrates only on external issues. This is shown by his keywords signifying war against terrorism. Bush’s negative use of words marking cooperative actions conforms to his positive use of words indicating external war. Trump’s positive keywords are about exaggerated descriptions without a defined target. He also shows an unusual frequency in referring to his name and position. His words used with negative keyness refer to reforming programs and external issues. Collocations around each top content keyword clarify the word and harmonize with the presidential orientation indicated by the keywords.

Research limitations/implications

Limitations have to do with the issue of the accurate representation of the samples.

Originality/value

This research is original in its methodology of applying corpus linguistics tools in the analysis of presidential discourses.

Details

Journal of Humanities and Applied Social Sciences, vol. 3 no. 2
Type: Research Article
ISSN: 2632-279X

Open Access
Article
Publication date: 18 June 2019

Youngjoo Na and Jisu Kim

Abstract

Purpose

The purpose of this paper is to analyze the post types of Korean fashion brands' official Instagram accounts and to analyze the images and keywords according to the use of hashtags in them. This study will also provide data on how fashion brands use the new medium of Instagram and how they promote it.

Design/methodology/approach

This study investigated the types of postings and the hashtag (#) keywords of fashion brands’ official Instagram accounts in order to analyze post types and keywords. In total, six apparel brand companies were selected, with two in each of three categories (classic casual brand, outdoor sports brand and designer character brand), and seven types of postings were classified (lookbook and product, collection, broadcasting ads, brand issue, sensibility pictures, sponsorship and event). The frequencies were collected according to their types, which were confirmed by four fashion major specialists.

Findings

First, the proportion of the types of postings varied according to the characteristics of the brand. Second, the six brands used symbolic keywords because it is important to convey brand identity. Third, the sensibility keywords of each brand were investigated, and one of the designer character brands used only practical keywords without sensibility keywords. Fourth, this study examined the number of Instagram hashtags and hearts to determine whether consumer reaction was in alignment with the marketing trends of the company’s official Instagram account. One of the classic casual brands, one of the outdoor sports brands and both designer character brands showed a high proportion of post types on Instagram that matched consumer response well. The study's hypothesis, that the posting types of images and hashtags differ according to the characteristics of the brand, was supported.

Originality/value

Instagram is the fastest growing social network service (SNS) globally, especially among young adults. Instagram is noted for its strong SNS marketing, but it has not been well researched in the apparel industry. The study results will help improve brand image and promotion through official Instagram accounts in the apparel industry.

Details

International Journal of Clothing Science and Technology, vol. 32 no. 1
Type: Research Article
ISSN: 0955-6222

Open Access
Article
Publication date: 11 October 2023

Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati

Abstract

Purpose

The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes essay scoring, which is conducted through two parameters, semantic and keyword similarities, using a SentenceTransformers pre-trained model that can construct the highest vector embedding. The two parameters are combined to improve the accuracy of the model.

Design/methodology/approach

The development of the model in the study is divided into seven stages: (1) data collection, (2) data pre-processing, (3) selecting a pre-trained SentenceTransformers model, (4) semantic similarity (sentence pair), (5) keyword similarity, (6) calculating the final score and (7) evaluating the model.
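
Stages (4) to (6) can be sketched as follows. This is an illustration only, not the authors' model: it assumes the sentence embeddings have already been produced elsewhere (for instance by a SentenceTransformers model) and are passed in as plain lists of floats. The keyword measure (the fraction of rubric keywords found in the response) and the equal weighting are hypothetical choices, since the abstract does not give the scoring formula.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keyword_similarity(response_keywords, rubric_keywords):
    """Fraction of rubric keywords that also appear in the response."""
    response, rubric = set(response_keywords), set(rubric_keywords)
    if not rubric:
        return 0.0
    return len(response & rubric) / len(rubric)


def final_score(semantic, keyword, weight=0.5):
    """Weighted combination of the two similarities.

    `weight` is a hypothetical parameter: 0.5 treats both parts equally.
    """
    return weight * semantic + (1 - weight) * keyword
```

A student response whose embedding is close to the lecturer's answer but which covers only half of the rubric keywords would, under these assumptions, land between the two individual similarity values.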

Findings

The multilingual paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models got the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023). Both multilingual models were adopted in this study. The combination of the two parameters is obtained by comparing the keywords extracted from the responses with the rubric keywords. Based on the experimental results, the proposed combination can increase the evaluation results by 0.2.

Originality/value

This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussion ratings are still manual. In this study, the authors created a model that automatically calculates the value of discussion forum posts, which are essays, based on the lecturer's answers and rubrics.

Details

Asian Association of Open Universities Journal, vol. 18 no. 3
Type: Research Article
ISSN: 1858-3431

Open Access
Article
Publication date: 19 April 2023

Milad Soltani, Alexios Kythreotis and Arash Roshanpoor

Abstract

Purpose

The emergence of machine learning has opened a new way for researchers. It allows them to supplement the traditional manual methods for conducting a literature review and turning it into smart literature. This study aims to present a framework for incorporating machine learning into financial statement fraud (FSF) literature analysis. This framework facilitates the analysis of a large amount of literature to show the trend of the field and identify the most productive authors, journals and potential areas for future research.

Design/methodology/approach

In this study, a framework was introduced that merges bibliometric analysis techniques such as word frequency, co-word analysis and coauthorship analysis with the Latent Dirichlet Allocation topic modeling approach. This framework was used to uncover subtopics from 20 years of financial fraud research articles. Furthermore, the hierarchical clustering method was used on selected subtopics to demonstrate the primary contexts in the literature on FSF.

Findings

This study has contributed to the literature in two ways. First, this study has determined the top journals, articles, countries and keywords based on various bibliometric metrics. Second, using topic modeling and then hierarchy clustering, this study demonstrates the four primary contexts in FSF detection.

Research limitations/implications

In this study, the authors tried to comprehensively view the studies related to financial fraud conducted over two decades. However, this research has limitations that can be an opportunity for future researchers. The first limitation is due to language bias. This study has focused on English language articles, so it is suggested that other researchers consider other languages as well. The second limitation is caused by citation bias. In this study, the authors tried to show the top articles based on the citation criteria. However, judging based on citation alone can be misleading. Therefore, this study suggests that the researchers consider other measures to check the citation quality and assess the studies’ precision by applying meta-analysis.

Originality/value

Despite the popularity of bibliometric analysis and topic modeling, there have been limited efforts to use machine learning for literature review. This novel approach of using hierarchical clustering on topic modeling results enables us to uncover four primary contexts. Furthermore, this method allowed us to show the keywords of each context and highlight significant articles within each context.

Details

Journal of Financial Crime, vol. 30 no. 5
Type: Research Article
ISSN: 1359-0790

Open Access
Article
Publication date: 29 September 2022

K.G. Priyashantha, W.E. Dahanayake and M.N. Maduwanthi

Abstract

Purpose

Research has been conducted to investigate the factors that influence career indecision. This study attempted to synthesize empirical research on career indecision to (1) find the common determinants over the last two decades and (2) find the factors/areas that need to be addressed for future research on career indecision.

Design/methodology/approach

This study used the systematic literature review (SLR) methodology and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Following the predetermined inclusion criteria, 118 articles from the Scopus database were included for review.

Findings

From this research, the authors found four main determinants of career indecision that have been researched in the last two decades, namely (1) career-related decision-making difficulties, (2) adolescent differences, (3) individual and situational career decision-making profiles (CDMPs) and (4) level of individual readiness for career choice. Additionally, eight factors/areas were found that should be addressed in future research on career indecision: those four common determinants; three other determinants, namely (1) individual differences, (2) contextual/environmental factors and (3) social factors; and one outcome, subjective well-being.

Research limitations/implications

The study had limitations in conducting this research, and the findings of the study provide some theoretical and future research implications.

Practical implications

The seven determinants and the only outcome provide some implications for practitioners and policymakers.

Originality/value

The study found seven determinants and one outcome of career indecision derived from empirical studies conducted during 2000–2021.

Details

Journal of Humanities and Applied Social Sciences, vol. 5 no. 2
Type: Research Article
ISSN: 2632-279X

Open Access
Article
Publication date: 2 September 2019

Toan Luu Duc Huynh

Abstract

Purpose

This paper aims to shed light on the impact of Google keywords on the number of new businesses (and the amount of capital registered) in Vietnam, a Southeast Asian country, after 2016, the year of the entrepreneur.

Design/methodology/approach

This study uses a rich set of quantitative techniques from VAR Granger and threshold regression. The whole sample period covers the data (keywords, number of new businesses, an amount of capital invested to register) from the first week of 2016 to October 2018, which includes 144 observations in total.

Findings

The findings suggest that the relationship between Google keywords and new business registrations does not persist in the long run. A short-run shock may change the frequency of the other keywords rather than the number of firms (or the amount of capital). However, below the threshold for the number of firms, keywords have both positive and negative impacts on entrepreneurs, whereas above a higher capital threshold, keywords help predict the amount of money used to register firms.

Practical implications

The Vietnamese Government and executives are advised to consider the Google keywords “entrepreneur” (in Vietnamese) and “start-up”, which cause a decline in entrepreneurial movements. In addition, the current period is likely to reverse the pattern of the previous one in terms of the number of firms and the amount of capital. Finally, there are two critical thresholds for the keywords' influence: 1,602 companies and 35,010m VND.

Originality/value

This study contributes empirical evidence on technological change and entrepreneurship and adds to the existing literature by discussing how this relationship behaves under the threshold.

Details

Asia Pacific Journal of Innovation and Entrepreneurship, vol. 13 no. 2
Type: Research Article
ISSN: 2398-7812

Open Access
Article
Publication date: 17 July 2020

Mukesh Kumar and Palak Rehan

Abstract

Social media networks like Twitter, Facebook, WhatsApp etc. are the most commonly used media for sharing news and opinions and for staying in touch with peers. Messages on Twitter are limited to 140 characters. This has led users to create their own novel syntax in tweets to express more in fewer words. Free writing style, use of URLs, markup syntax, inappropriate punctuation, ungrammatical structures, abbreviations etc. make it harder to mine useful information from them. For each tweet, we can get an explicit time stamp, the name of the user, the social network the user belongs to, or even the GPS coordinates if the tweet is created with a GPS-enabled mobile device. With these features, Twitter is, by nature, a good resource for detecting and analyzing real-time events happening around the world. By using the speed and coverage of Twitter, we can detect events, each a sequence of important keywords being discussed, in a timely manner, which can be used in different applications like natural calamity relief support, earthquake relief support, product launches, suspicious activity detection etc. The keyword detection process for Twitter can be seen as a two-step process: detection of keywords in raw text form (words as posted by the users) and keyword normalization (reforming the users’ unstructured words into complete, meaningful English words). In this paper, a keyword detection technique based upon a graph, spanning tree and the PageRank algorithm is proposed. A text normalization technique based upon a hybrid approach using Levenshtein distance, demetaphone algorithm and dictionary mapping is proposed to work upon the unstructured keywords produced by the proposed keyword detector. The proposed normalization technique is validated using the standard lexnorm 1.2 dataset. The proposed system is used to detect keywords from Twitter text posted in real time. The detected and normalized keywords are further validated against search engine results at a later time for detection of events.
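
The normalization step described above can be sketched in simplified form. The example below combines only two of the three named ingredients, dictionary mapping and Levenshtein distance, and leaves out the phonetic step entirely. The function names, the slang dictionary, the vocabulary and the distance cut-off are hypothetical and are not taken from the paper.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(token, slang_map, vocabulary, max_distance=2):
    """Map a noisy token to a standard word, or return it unchanged.

    1. Exact lookup in a slang dictionary (e.g. "2moro" -> "tomorrow").
    2. Keep the token if it is already a known word.
    3. Otherwise pick the closest vocabulary word by edit distance,
       accepting it only if it is within `max_distance` edits.
    """
    token = token.lower()
    if token in slang_map:
        return slang_map[token]
    if token in vocabulary:
        return token
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda w: (levenshtein(token, w), w), default=token)
    return best if levenshtein(token, best) <= max_distance else token
```

The distance cut-off keeps the normalizer from forcing unrelated tokens, such as user names or hashtags, onto the nearest dictionary word.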

Details

Applied Computing and Informatics, vol. 17 no. 2
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 18 December 2020

Maureen Alice Flynn and Niamh M. Brennan

Abstract

Purpose

While clinical governance is assumed to be part of organisational structures and policies, implementation of clinical governance in practice (the praxis) can be markedly different. This paper draws on insights from hospital clinicians, managers and governors on how they interpret the term “clinical governance”. The influence of best-practice and roles and responsibilities on their interpretations is considered.

Design/methodology/approach

The research is based on 40 in-depth, semi-structured interviews with hospital clinicians, managers and governors from two large academic hospitals in Ireland. The analytical lens for the research is practice theory. Interview transcripts are analysed for practitioners' spoken keywords/terms to explore how practitioners interpret the term “clinical governance”. The practice of clinical governance is mapped to front line, management and governance roles and responsibilities.

Findings

The research finds that interpretation of clinical governance in praxis is quite different from best-practice definitions. The roles and responsibilities that practitioners hold influence their interpretation.

Originality/value

The research examines interpretations of clinical governance in praxis by clinicians, managers and governors and highlights the adverse consequence of the absence of clear mapping of roles and responsibilities to clinical, management and governance practice.

Details

Journal of Health Organization and Management, vol. 35 no. 9
Type: Research Article
ISSN: 1477-7266
