Search results

1 – 10 of over 46000
Article
Publication date: 21 January 2019

Issa Alsmadi and Keng Hoon Gan

Abstract

Purpose

Rapid developments in social networks and their use in everyday life have caused an explosion in the number of short electronic documents. Thus, classifying this type of document into relevant classes based on its text content is of practical interest and has significant implications for many applications. Short-text classification is an essential step in spam filtering, sentiment analysis, Twitter personalization, customer review analysis and many other applications related to social networks. Reviews of short text and its applications are limited. Thus, this paper aims to discuss the characteristics of short text and the challenges and difficulties of classifying it. The paper attempts to introduce all principal stages of classification, the techniques used in each stage and the possible development trends in each stage.

Design/methodology/approach

The paper is a review of the main aspects of short-text classification. It is structured according to the stages of the classification task.

Findings

This paper discusses related issues and approaches to these problems. Further research could be conducted to address the challenges in short texts and avoid poor accuracy in classification. Low performance can be addressed with optimization techniques such as genetic algorithms, which are powerful in enhancing the quality of selected features. Soft computing solutions, which include fuzzy logic, make short-text problems a promising area of research.

Originality/value

Using a powerful short-text classification method significantly affects many applications in terms of efficiency enhancement. Current solutions still have low performance, implying the need for improvement. This paper discusses related issues and approaches to these problems.

Details

International Journal of Web Information Systems, vol. 15 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 26 June 2020

Jamal Al Qundus, Adrian Paschke, Shivam Gupta, Ahmad M. Alzouby and Malik Yousef

Abstract

Purpose

The purpose of this paper is to explore the extent to which the quality of social media short text can be investigated without extensions, and which predictors, if any, of such short text lead to trust in its content.

Design/methodology/approach

The paper applies a trust model to classify data collections based on metadata into four classes: Very Trusted, Trusted, Untrusted and Very Untrusted. These data are collected from the online communities Genius and Stack Overflow. In order to evaluate short texts in terms of their trust levels, the authors have conducted two investigations: (1) A natural language processing (NLP) approach to extract relevant features (i.e. Part-of-Speech and various readability indexes). The authors report relatively good performance of the NLP study. (2) A machine learning technique, more precisely a random forest (RF) classifier using a bag-of-words (BoW) model.
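To make the bag-of-words representation mentioned above concrete, here is a minimal sketch in plain Python. It only shows how short texts might be turned into count vectors before being passed to a classifier; the tokenizer, the function names and the example posts are illustrative assumptions, not the authors' pipeline, and the random forest step itself is not reproduced.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical short texts, not taken from the paper's data.
posts = ["great answer thanks", "this answer is wrong", "thanks thanks"]
vocab = build_vocabulary(posts)
vectors = [bag_of_words(p, vocab) for p in posts]
```

Each vector has one column per vocabulary token, so word order is discarded and only counts remain; such vectors could then be fed to any standard classifier.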

Findings

The investigation of the RF classifier using BoW shows promising intermediate results (62% accuracy on average across both online communities) in identifying the short-text quality that leads to trust.

Practical implications

As social media becomes an increasingly attractive source of information, mostly provided in the form of short texts, businesses (e.g. in search engines for smart data) can filter content without having to apply complex approaches and can continue to work with the information that is considered more trustworthy.

Originality/value

Short-text classifications with regard to a criterion (e.g. quality, readability) are usually extended by an external source or its metadata. This enhancement either changes the original text if it is an additional text from an external source, or it requires text metadata that is not always available. The originality of this study lies in taking on the challenge of investigating the quality of short text (i.e. social media text) without extending or modifying it using external sources, since such modification alters the text and distorts the results of the investigation.

Details

Journal of Enterprise Information Management, vol. 33 no. 6
Type: Research Article
ISSN: 1741-0398

Article
Publication date: 7 August 2017

Hao Wang and Sanhong Deng

Abstract

Purpose

In the era of Big Data, networked digital resources are growing rapidly, and short-text resources such as tweets, comments and messages are growing especially vigorously. This study aims to compare the category discriminative capacity (CDC) of Chinese language fragments of different granularities and to explore and verify the feasibility, rationality and effectiveness of low-granularity features, such as Chinese characters, in Chinese short-text classification (CSTC).

Design/methodology/approach

This study takes discipline classification of journal articles from CSSCI as a simulation environment. After sorting out the distribution rules of classification features at various granularities, including keywords, terms and characters, the classification results obtained with the SVM algorithm are comprehensively compared and evaluated from three angles: using the same experiment samples, testing before and after feature optimization, and introducing external data.

Findings

The granularity of a classification feature has an important impact on CSTC. In general, the larger the granularity is, the better the classification result is, and vice versa. However, a low-granularity feature is also feasible, and its CDC could be improved by reasonable weight setting, even exceeding that of a high-granularity feature when classification precision, computational complexity and text coverage are considered together.

Originality/value

This is the first study to propose that Chinese characters are more suitable as descriptive features in CSTC than terms and keywords and to demonstrate that the CDC of Chinese character features could be strengthened by combining frequency and position as the weight.

Article
Publication date: 29 April 2021

Heng-Yang Lu, Yi Zhang and Yuntao Du

Abstract

Purpose

Topic models have been widely applied to discover important information from vast amounts of unstructured data. Traditional long-text topic models such as Latent Dirichlet Allocation may suffer from the sparsity problem when dealing with short texts, which mostly come from the Web. These models also suffer from a readability problem when displaying the discovered topics. The purpose of this paper is to propose a novel model called the Sense Unit based Phrase Topic Model (SenU-PTM) to address both the sparsity and readability problems.

Design/methodology/approach

SenU-PTM is a novel phrase-based short-text topic model under a two-phase framework. The first phase introduces a phrase-generation algorithm that exploits word embeddings to generate phrases from the original corpus. The second phase introduces a new concept, the sense unit, which consists of a set of semantically similar tokens and is used to model topics with the token vectors generated in the first phase. Finally, SenU-PTM infers topics based on these two phases.
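As a rough illustration of what "a set of semantically similar tokens" could look like in practice, the sketch below greedily groups tokens whose embedding vectors have high cosine similarity. This is not the SenU-PTM algorithm, whose details the abstract does not give; the grouping rule, the threshold, the function names and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_sense_units(embeddings, threshold=0.9):
    """Greedy grouping: a token joins the first unit whose seed (first) token
    is at least `threshold` similar to it, otherwise it starts a new unit."""
    units = []
    for token, vec in embeddings.items():
        for unit in units:
            if cosine(vec, embeddings[unit[0]]) >= threshold:
                unit.append(token)
                break
        else:
            units.append([token])
    return units


# Toy token vectors (hypothetical); real ones would come from a trained
# word-embedding model.
toy_embeddings = {
    "phone": [1.0, 0.1],
    "mobile_phone": [0.9, 0.2],
    "football": [0.1, 1.0],
    "soccer": [0.2, 0.9],
}
units = group_sense_units(toy_embeddings)
```

With these toy vectors the phone-related tokens end up in one group and the football-related tokens in another; a topic model could then treat each group, rather than each individual token, as its basic unit.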

Findings

Experimental results on two real-world and publicly available datasets show the effectiveness of SenU-PTM from the perspectives of topical quality and document characterization. The results reveal that modeling topics on sense units can alleviate the sparsity of short texts and improve the readability of topics at the same time.

Originality/value

The originality of SenU-PTM lies in the new procedure of modeling topics on the proposed sense units with word embeddings for short-text topic discovery.

Details

Data Technologies and Applications, vol. 55 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 10 April 2017

Angeline Close Scheinbaum, Stefan Hampel and Mihyun Kang

Abstract

Purpose

Marketers use e-mail in new, potentially more informative, entertaining and lucrative ways – such as embedding video. The purpose of this paper is to examine consumer responses to audiovisual (i.e. text along with a short video) versus text-only messages in brand communication. Specifically, the authors seek to uncover the effect of marketer-embedded video (vs text-only) in e-mail on the consumer's product interest, informativeness, perceived prestige, electronic word-of-mouth (e-WOM) intentions and willingness to pass the electronic message along digitally or on social media. With dual coding theory and selective visual attention as theoretical guideposts, the intended contribution is a framework that can explain and predict advantages for multi-modal e-mail marketing communications.

Design/methodology/approach

Five hypotheses are tested experimentally with a one-factor experiment with two conditions (text-only vs audiovisual). The sample was 240 adult participants. Real brands (Audi and Apple) were used. For both brands, participants were randomly assigned to one of two conditions of the e-mail (i.e. audiovisual vs text-only). The stimuli are identical, with the exception of embedded video in the e-mail body. The videos are authentic brand videos, are approximately 50 s and use a product feature appeal. Participants’ pre-existing brand attitude was measured. Then, five dependent variables (product interest, informativeness, perceived prestige, e-WOM intentions and willingness to pass the electronic message along digitally or on social media) were considered with respect to consumer exposure to e-mail with video and text in the e-mail from the brand versus text-only e-mail from the brand.

Findings

The results supported the hypotheses that audiovisual messages (i.e. those with text and video) heighten informativeness, product interest, perceived prestige, intentions to spread e-WOM for a brand and willingness to pass the e-mail along to friends and family when compared to text-only messages. These experimental findings from a one-factor experiment with two conditions (text-only vs audiovisual) are generally consistent for an American consumer technology brand, Apple (iPhone), and a German luxury automobile brand, Audi (S4). Hypotheses are supported for both brands, with the exception of product interest for Audi, which may be explained by the high price of a luxury automobile.

Research limitations/implications

An implication here for the dual coding theory is that the theory may be extended to consider what happens after the consumer codes the information with both the verbal and the non-verbal subsystem. The finding of interest to information processing scholars is that a video accompanying text communication from a brand to a consumer has an advantage over text-only communication. Brands that communicate with multi-modal marketing communication have better outcomes in informativeness, brand prestige perceptions and intentions of online consumer behaviors, including positive e-WOM for the brand in general and willingness to pass the specific content along in digital and social media platforms. Consumers can become brand advocates by being more inclined to forward the e-mails with the product short video as well as the e-mail text.

Practical implications

Brand marketers should consider e-mail in an integrated brand promotion (IBP) campaign as a cost advantage; e-mail deserves a solid place in the IBP toolkit partly because of its relatively low cost. The main cost comes with administration and production of the video. As a managerial implication for advertisers, embedding short-format video ads in e-mails is a way to be more effective than plain-text e-mails. Short videos in e-mails are a reasonable element to include in an integrated marketing communications effort (plausibly due to information processing with both a verbal and a non-verbal system). Brands can use videos in e-mails to enhance informativeness regarding products and thereby strengthen product differentiation from competitors. Yet caution is warranted about some disadvantages potentially associated with e-mail marketing and video. The three areas of caution are potential issues of privacy, clutter and technical inhibitors.

Originality/value

Although e-mail is one of the most heavily used communication tools in marketing, there is scarce literature on e-mail and branding. When brands evoke a degree of prestige with embedded videos, consumer willingness to become part of the marketing communications is enhanced, as their e-WOM intentions and willingness to share the branded content increase.

Details

European Journal of Marketing, vol. 51 no. 3
Type: Research Article
ISSN: 0309-0566

Book part
Publication date: 23 August 2022

Carol Abiri and Katina Zammit

Abstract

The teaching of reading in English is fraught with challenges that influence teachers' practices in Papua New Guinea (PNG). There are a plethora of linguistic issues regarding teaching in both the vernacular languages and English. Postcolonial education in PNG has continued to promote English as the medium of instruction while also promoting the use of vernacular and mother tongue. The outcomes-based education reform in the Language and Literacy Policy (1993–2014) supported the use of vernacular languages in the elementary years with the gradual bridging to English in Grade 3. In 2015, the Language and Literacy policy changed to standards-based education. One major shift was from the use of vernacular languages to English as a medium of instruction at all levels of formal education.

In this chapter, we use Tierney's concept of decolonizing spaces to investigate teachers' perspectives on implementing the English standards-based curriculum and the role the vernacular, mother tongue, and translanguaging play in the classroom as Year 4 teachers grapple with the teaching of reading. The chapter problematizes the colonization of English, the place of translanguaging, and the benefits and challenges for teachers when the classroom teacher most likely is not a native speaker of the children's dialect or English.

Article
Publication date: 19 February 2018

Qiujun Lan, Haojie Ma and Gang Li

Abstract

Purpose

Sentiment identification of Chinese text faces many challenges, such as requiring complex preprocessing steps, preparing various word dictionaries carefully and dealing with a lot of informal expressions, which lead to high computational complexity.

Design/methodology/approach

A method based on Chinese characters instead of words is proposed. This method represents the text as a fixed-length vector and introduces the chi-square statistic to measure the categorical sentiment score of a Chinese character. Based on these, sentiment identification can be accomplished in four main steps.
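The abstract does not spell out how the chi-square statistic is applied, so the following is only a sketch of one common formulation: the 2×2 chi-square statistic for the association between a character's presence and the positive class, signed so that characters associated with negative texts receive negative scores. The sign convention, the function names and the tiny labeled corpus are assumptions for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table:
    a = positive docs with the character, b = negative docs with it,
    c = positive docs without it,         d = negative docs without it."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def character_scores(docs):
    """docs: list of (text, label) pairs with label 'pos' or 'neg'.
    Returns a signed chi-square score per character (assumed convention:
    positive sign when the character leans toward the 'pos' class)."""
    n_pos = sum(1 for _, label in docs if label == "pos")
    n_neg = len(docs) - n_pos
    chars = {ch for text, _ in docs for ch in text if not ch.isspace()}
    scores = {}
    for ch in chars:
        a = sum(1 for text, label in docs if label == "pos" and ch in text)
        b = sum(1 for text, label in docs if label == "neg" and ch in text)
        c, d = n_pos - a, n_neg - b
        chi2 = chi_square(a, b, c, d)
        scores[ch] = chi2 if a * d >= b * c else -chi2
    return scores


# Tiny hypothetical corpus; a real one would hold many labeled texts.
corpus = [("好喜欢", "pos"), ("很好", "pos"), ("太差", "neg"), ("很差", "neg")]
scores = character_scores(corpus)
```

Because the scores are computed per character, no word segmentation or sentiment dictionary is needed, which is consistent with the efficiency argument made in the abstract.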

Findings

Experiments on corpora with various themes indicate that the proposed method performs slightly worse than existing Chinese word-based methods on most texts, but better on short and informal texts. In particular, the computational complexity of the proposed method is far lower than that of word-based methods.

Originality/value

The proposed method exploits the property of Chinese characters being linguistic units that carry semantic information. In contrast to word-based methods, the computational efficiency of this method is significantly improved at a slight loss of accuracy. It is more concise and avoids the problems resulting from preparing predefined dictionaries and various data preprocessing steps.

Details

Information Discovery and Delivery, vol. 46 no. 1
Type: Research Article
ISSN: 2398-6247

Abstract

Details

Children and Mobile Phones: Adoption, Use, Impact, and Control
Type: Book
ISBN: 978-1-78973-036-4

Article
Publication date: 20 December 2007

Isak Taksa, Sarah Zelikovitz and Amanda Spink

Abstract

Purpose

The work presented in this paper aims to provide an approach to classifying web logs by personal properties of users.

Design/methodology/approach

The authors describe an iterative system that begins with a small set of manually labeled terms, which are used to label queries from the log. A set of background knowledge related to these labeled queries is acquired by combining web search results on these queries. This background set is used to obtain many terms that are related to the classification task. The system then ranks each of the related terms, choosing those that most fit the personal properties of the users. These terms are then used to begin the next iteration.
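The iteration described above can be sketched roughly as follows. Everything here is a simplified assumption: a plain dictionary stands in for the web search results that supply background knowledge, candidate terms are ranked by raw frequency rather than by the authors' ranking method, and the function name, parameters and toy data are hypothetical.

```python
from collections import Counter


def expand_terms(seed_terms, queries, background, rounds=2, top_k=1):
    """Grow a term set iteratively: label queries containing known terms,
    collect words from their background text, keep the most frequent new
    words, and repeat. Returns the final terms and the labeled queries."""
    terms = set(seed_terms)
    for _ in range(rounds):
        # Label every query that contains at least one known term.
        labeled = sorted(q for q in queries if terms & set(q.split()))
        # Count candidate words in the background text of labeled queries.
        counts = Counter(
            word
            for q in labeled
            for word in background.get(q, "").split()
            if word not in terms
        )
        # Rank by frequency, breaking ties alphabetically for determinism.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        new_terms = [word for word, _ in ranked[:top_k]]
        if not new_terms:
            break
        terms.update(new_terms)
    labeled = sorted(q for q in queries if terms & set(q.split()))
    return terms, labeled


# Toy query log and stand-in "search result" text (all hypothetical).
queries = ["retirement plans", "pension calculator",
           "skateboard tricks", "retirement pension advice"]
background = {
    "retirement plans": "pension savings pension retirement",
    "retirement pension advice": "pension annuity",
    "pension calculator": "annuity pension income",
    "skateboard tricks": "ollie kickflip",
}
terms, labeled = expand_terms({"retirement"}, queries, background)
```

Starting from the single seed term, the first round adds a related term found in the background text, which in turn lets the second round label a query that the seed alone would have missed.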

Findings

The authors identify the difficulties of classifying web logs by approaching this problem from a machine learning perspective. By applying the approach developed, the authors are able to show that many queries in a large query log can be classified.

Research limitations/implications

Testing results in this type of classification work is difficult, as the true personal properties of web users are unknown. Evaluating the classification results by comparing classified queries to well-known age-related sites is a direction that is currently being explored.

Practical implications

This research is background work that can be incorporated in search engines or other web-based applications, to help marketing companies and advertisers.

Originality/value

This research enhances the current state of knowledge in short-text classification and query log learning.

Details

International Journal of Web Information Systems, vol. 3 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 19 November 2018

Kristine Pytash, Todd Hawley and Kate Morgan

Abstract

Purpose

The purpose of this paper is to explore the potential of using digital shorts (Pytash et al., 2017) focusing on social issues in social studies classrooms.

Design/methodology/approach

A qualitative case study is used.

Findings

The students' digital shorts focused on important social issues and included the students' beliefs and perspectives about their chosen social issue, as well as insights into their developing identities as citizens. The authors’ findings demonstrate how this assignment can be a gateway for discussions regarding social issues, how students perceive their identities as tied to contemporary social issues, and how they make sense of these issues within multimodal compositions.

Research limitations/implications

The findings from this research have implications for researching the effectiveness of digital media production analysis for students’ learning of social issues.

Practical implications

The findings from this research have implications for exploring how digital media production analysis can be incorporated into social studies courses.

Originality/value

Despite the push for social studies teachers to provide spaces for students to demonstrate these capacities, few examples exist in the literature.

Details

Social Studies Research and Practice, vol. 13 no. 3
Type: Research Article
ISSN: 1933-5415
