Search results

1 – 10 of over 3000
Article
Publication date: 8 November 2018

Radhia Toujani and Jalel Akaichi

Abstract

Purpose

Nowadays, event detection is important in gathering news from social media; indeed, it is widely employed by journalists to generate early alerts of reported stories. To incorporate available social media data into a news story, journalists must manually process, compile and verify the news content within a very short time span. Despite its utility and importance, this process is time-consuming and labor-intensive for media organizations. For this reason, and because social media provides an essential source of data that supports professional journalists, the purpose of this paper is to propose a citizen clustering technique that allows the community of journalists and media professionals to document news during crises.

Design/methodology/approach

The authors develop, in this study, an approach for detecting news of natural hazard events and for clustering citizens into danger groups, based on three major steps. In the first stage, the authors present a pipeline of natural language processing tasks: event trigger detection, applied to retrieve potential event triggers; named entity recognition, used to detect and recognize the event participants related to the extracted triggers; and, finally, a dependency analysis between all the extracted data. Analyzing the ambiguity and vagueness in the similarity of news plays a key role in event detection, an issue ignored by traditional event detection techniques. To this end, in the second step of the approach, the authors apply fuzzy set techniques to the extracted events to enhance clustering quality and remove the vagueness of the extracted information. The resulting degree of citizens' danger is then injected as input to the introduced citizen clustering method in order to detect communities of citizens with close disaster degrees.
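
To make the fuzzy clustering idea concrete, here is a minimal sketch in the spirit of (but not reproducing) the authors' method: fuzzy c-means over citizens' danger degrees. The danger values and the cluster count are hypothetical.

```python
import numpy as np

def fuzzy_cmeans_1d(x, c=3, m=2.0, iters=100, seed=0):
    """Minimal 1-D fuzzy c-means: returns cluster centres and a
    membership matrix u[i, k], the degree to which citizen i belongs
    to danger cluster k (each row sums to 1)."""
    rng = np.random.default_rng(seed)
    u = rng.dirichlet(np.ones(c), size=len(x))   # random initial memberships
    for _ in range(iters):
        um = u ** m
        centres = (um.T @ x) / um.sum(axis=0)    # membership-weighted centres
        d = np.abs(x[:, None] - centres[None, :]) + 1e-9
        u = 1.0 / d ** (2.0 / (m - 1.0))         # standard FCM update
        u /= u.sum(axis=1, keepdims=True)
    return centres, u

# Hypothetical danger degrees in [0, 1] derived from extracted event data
danger = np.array([0.05, 0.10, 0.12, 0.50, 0.55, 0.90, 0.95])
centres, u = fuzzy_cmeans_1d(danger, c=3)
print(centres, u.argmax(axis=1))   # hard labels shown for display only
```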

Findings

Empirical results indicate that homogeneous and compact citizens' clusters can be detected using the suggested event detection method. It can also be observed that event news can be analyzed efficiently using fuzzy theory. In addition, the proposed visualization process plays a crucial role in data journalism: it is used to analyze event news as well as in the final presentation of the detected danger clusters.

Originality/value

The introduced citizen clustering method helps journalists and editors to better judge the veracity of social media content, navigate the overwhelming volume of content, identify eyewitnesses and contextualize the event. The empirical analysis illustrates the efficiency of the developed method on both real and artificial networks.

Details

Online Information Review, vol. 43 no. 1
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 16 November 2015

Hsien-Tsung Chang, Shu-Wei Liu and Nilamadhab Mishra

Abstract

Purpose

The purpose of this paper is to design and implement new tracking and summarization algorithms for Chinese news content. Based on the proposed methods and algorithms, the authors extract the important sentences that are contained in topic stories and list those sentences according to timestamp order to ensure ease of understanding and to visualize multiple news stories on a single screen.

Design/methodology/approach

This paper takes an experimental approach, implementing a new Dynamic Centroid Summarization algorithm together with a Term Frequency (TF)-Density algorithm and empirically evaluating three target parameters: recall, precision and F-measure.
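
The three target parameters are the standard retrieval metrics; a minimal sketch of their computation, with hypothetical sentence IDs, follows.

```python
def precision_recall_f1(retrieved, relevant):
    """Evaluation metrics used to compare tracking algorithms.
    `retrieved` and `relevant` are sets of item identifiers."""
    tp = len(retrieved & relevant)               # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f = precision_recall_f1({1, 2, 3, 5}, {1, 2, 4, 5, 6})
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```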

Findings

The proposed TF-Density algorithm is implemented and compared with the well-known Term Frequency-Inverse Word Frequency (TF-IWF) and Term Frequency-Inverse Document Frequency (TF-IDF) algorithms. Three test data sets are configured from Chinese news websites for use during the investigation, and two important findings are obtained that help the authors recognize the important words in the text with greater precision and efficiency. First, the authors evaluate the three topic tracking algorithms, i.e., TF-Density, TF-IDF and TF-IWF, against the said target parameters and find that the recall, precision and F-measure of the proposed TF-Density algorithm are better than those of the TF-IWF and TF-IDF algorithms. Second, the authors implement a blind test to evaluate the topic summarizations and find that the proposed Dynamic Centroid Summarization process selects topic sentences more accurately than the LexRank process.
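
LexRank, the baseline in the blind test, ranks sentences by centrality in a similarity graph. A minimal Python sketch in that spirit (not the paper's implementation; the example sentences are hypothetical):

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexrank_scores(sentences, threshold=0.1):
    """Score sentences by PageRank centrality on a cosine-similarity
    graph, in the spirit of LexRank (Erkan & Radev, 2004)."""
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)
    g = nx.Graph()
    g.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if sim[i, j] > threshold:            # keep similar pairs only
                g.add_edge(i, j, weight=sim[i, j])
    return nx.pagerank(g, weight="weight")

sents = ["Floods hit the coastal city overnight.",
         "Overnight floods struck the coastal city, officials said.",
         "The local team won its third game in a row."]
print(lexrank_scores(sents))   # higher score = more central sentence
```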

Research limitations/implications

The results show that the tracking and summarization algorithms for news topics can provide more precise and convenient results for users tracking the news. The analysis and implications are limited to Chinese news content from Chinese news websites such as Apple Library, UDN, and well-known portals like Yahoo and Google.

Originality/value

The research provides an empirical analysis of Chinese news content through the proposed TF-Density and Dynamic Centroid Summarization algorithms. It focusses on improving how a set of news stories is summarized for browsing on a single screen and carries implications for innovative word measurements in practice.

Details

Aslib Journal of Information Management, vol. 67 no. 6
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 14 January 2022

Krishnadas Nanath, Supriya Kaitheri, Sonia Malik and Shahid Mustafa

Abstract

Purpose

The purpose of this paper is to examine the factors that significantly affect the prediction of fake news from the virality theory perspective. The paper looks at a mix of emotion-driven content, sentimental resonance, topic modeling and linguistic features of news articles to predict the probability of fake news.

Design/methodology/approach

A data set of over 12,000 articles was chosen to develop a model for fake news detection. Machine learning algorithms and natural language processing techniques were used to handle the big data efficiently. Lexicon-based emotion analysis provided eight kinds of emotion present in the article text. Clusters of topics were extracted using topic modeling (five topics), while sentiment analysis provided the resonance between the title and the text. Linguistic features were added to the coding outcomes to develop a logistic regression predictive model for testing the significant variables. Other machine learning algorithms were also executed and compared.
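
A minimal sketch of the modeling step described above, using scikit-learn's LogisticRegression on placeholder data; the feature layout (eight emotion scores, five topic proportions, sentiment resonance and linguistic counts) is assumed from the abstract, and the random data stand in for the real 12,000-article corpus.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder feature matrix: assumed columns are 8 emotion scores,
# 5 topic proportions, 1 title/text sentiment resonance score and
# 2 linguistic counts (title length, text length) = 16 features.
rng = np.random.default_rng(42)
X = rng.random((500, 16))            # placeholder features
y = rng.integers(0, 2, 500)          # placeholder fake/real labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
# model.coef_ exposes per-feature weights for significance-style inspection
```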

Findings

The results revealed that positive emotions in a text lower the probability of the news being fake. It was also found that sensational content, such as illegal activities and crime-related content, was associated with fake news. News whose title and text exhibited similar sentiments was found to have a lower chance of being fake. News titles with more words and content with fewer words were found to significantly impact fake news detection.

Practical implications

Several systems and social media platforms today are trying to implement fake news detection methods to filter content. This research provides promising parameters from a virality theory perspective that could help develop automated fake news detectors.

Originality/value

While several studies have explored fake news detection, this study applies a new perspective based on virality theory. It also introduces new parameters, such as sentimental resonance, that could help predict fake news. This study deals with an extensive data set and uses advanced natural language processing to automate the coding techniques in developing the prediction model.

Details

Journal of Systems and Information Technology, vol. 24 no. 2
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 20 August 2018

Dharini Ramachandran and Parvathi Ramasubramanian

Abstract

Purpose

“What’s happening?” around you can be spread to everybody through the highly visible medium of social media, which provides a powerful platform that brings to light the latest news, trends and happenings around the world in “near instant” time. A microblog is a popular Web service that enables users to post small pieces of digital content, such as text, pictures, videos and links to external resources. Raw microblog data are indispensable for extracting information, offering a way to single out the physical events and popular topics prevalent in social media. This study aims to present and review the varied methods for event detection from microblogs. An event is an activity or action with a clear finite duration in which the target entity plays a key role. Event detection helps in the timely understanding of people’s opinions and the actual condition of detected events.

Design/methodology/approach

This paper presents a study of the various approaches adopted for event detection from microblogs. The approaches are reviewed according to the techniques used, their applications and the element detected (event or topic).

Findings

The various ideas explored, the important observations inferred, the corresponding outcomes and an assessment of the results of those approaches are discussed.

Originality/value

The approaches and techniques for event detection are studied in two categories: first, based on the kind of event being detected (physical occurrence or emerging/popular topic); and second, within each category, the approaches are further divided into supervised and unsupervised techniques.

Book part
Publication date: 30 August 2019

Fulya Ozcan

Abstract

This chapter investigates the behavior of Reddit’s news subreddit users and the relationship between their sentiment and exchange rates. Using graphical models and natural language processing, hidden online communities among Reddit users are discovered. The data set used in this project is a mixture of text and categorical data from Reddit’s news subreddit. These data include the titles of the news pages and a few user characteristics, in addition to users’ comments. This data set is an excellent resource for studying user reaction to news, since the comments are directly linked to the webpage contents. The model considered in this chapter is a hierarchical mixture model, a generative model that detects overlapping networks using the sentiment of the user-generated content. The advantage of this model is that the communities (or groups) are assumed to follow a Chinese restaurant process, so it can automatically detect and cluster the communities. The hidden variables and the hyperparameters of this model are obtained using Gibbs sampling.
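
The Chinese restaurant process prior mentioned above can be simulated in a few lines; this sketch shows only the prior over community assignments, not the full hierarchical mixture model or its Gibbs sampler.

```python
import numpy as np

def chinese_restaurant_process(n_users, alpha=1.0, seed=0):
    """Sample community assignments from a CRP prior: user i joins an
    existing community k with probability n_k / (i + alpha) and opens
    a new one with probability alpha / (i + alpha)."""
    rng = np.random.default_rng(seed)
    assignments, counts = [], []
    for i in range(n_users):
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)      # open a new community
        else:
            counts[k] += 1        # join an existing one
        assignments.append(k)
    return assignments, counts

assignments, sizes = chinese_restaurant_process(100, alpha=2.0)
print(f"{len(sizes)} communities with sizes {sizes}")
```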

Details

Topics in Identification, Limited Dependent Variables, Partial Observability, Experimentation, and Flexible Modeling: Part A
Type: Book
ISBN: 978-1-78973-241-2

Article
Publication date: 16 August 2021

Rajshree Varma, Yugandhara Verma, Priya Vijayvargiya and Prathamesh P. Churi


Abstract

Purpose

The rapid advancement of online communication technology and fingertip access to the Internet have enabled news channels, freelance reporters and websites to disseminate fake news at low cost in order to engage a global audience. Amid the coronavirus disease 2019 (COVID-19) pandemic, individuals are confronted with these false and potentially harmful claims and stories, which may harm the vaccination process. Psychological studies reveal that the human ability to detect deception is only slightly better than chance; therefore, there is a growing need to develop automated strategies to combat fake news that traverses these platforms at an alarming rate. This paper systematically reviews existing fake news detection technologies by exploring various machine learning and deep learning techniques pre- and post-pandemic, which, to the best of the authors’ knowledge, has never been done before.

Design/methodology/approach

The detailed literature review on fake news detection is divided into three major parts. The authors searched for papers on machine learning and deep learning approaches to fake news detection dating back no further than 2017. The papers were initially retrieved through the Google Scholar platform and then scrutinized for quality, with “Scopus” and “Web of Science” used as quality indexing parameters. All research gaps and available databases, data pre-processing and feature extraction techniques, and evaluation methods for current fake news detection technologies are explored and illustrated using tables, charts and trees.

Findings

The review is divided into two approaches, namely machine learning and deep learning, to present a better understanding and a clear objective. Next, the authors present a viewpoint on which approach is better, along with future research trends, issues and challenges for researchers, given the relevance and urgency of a detailed and thorough analysis of existing models. The paper also delves into fake news detection during COVID-19, from which it can be inferred that research and modeling are shifting toward the use of ensemble approaches.

Originality/value

The study also identifies several novel automated web-based approaches used by researchers to assess the validity of pandemic news that have proven to be successful, although currently reported accuracy has not yet reached consistent levels in the real world.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 15 February 2024

Xinyu Liu, Kun Ma, Ke Ji, Zhenxiang Chen and Bo Yang

Abstract

Purpose

Propaganda is a prevalent technique used on social media to intentionally express opinions or actions with the aim of manipulating or deceiving users. Existing methods for propaganda detection primarily focus on capturing language features within the content itself. However, these methods tend to overlook information from the external news environment in which propaganda news originates and spreads. This news environment reflects recent mainstream media opinions and public attention and contains the language characteristics of non-propaganda news. Therefore, the authors propose a graph-based multi-information integration network with an external news environment (abbreviated G-MINE) for propaganda detection.

Design/methodology/approach

G-MINE comprises four parts: a textual information extraction module, an external news environment perception module, a multi-information integration module and a classifier. Specifically, the external news environment perception module and the multi-information integration module extract the popularity and novelty signals, integrate them into the textual information and capture the high-order complementary information between them.
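
The published G-MINE architecture is not reproduced here, but a minimal sketch of the integration idea, concatenating a text representation with popularity and novelty features before classification, might look as follows (all dimensions are assumptions).

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Toy stand-in for multi-information integration: concatenate a
    text representation with environment features (popularity, novelty)
    and classify propaganda vs. non-propaganda."""
    def __init__(self, text_dim=768, env_dim=2, hidden=128, n_classes=2):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(text_dim + env_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_vec, env_vec):
        return self.fuse(torch.cat([text_vec, env_vec], dim=-1))

model = FusionClassifier()
text_vec = torch.randn(4, 768)   # e.g. sentence-encoder output (assumed)
env_vec = torch.randn(4, 2)      # popularity and novelty scores (assumed)
print(model(text_vec, env_vec).shape)   # torch.Size([4, 2])
```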

Findings

G-MINE achieves state-of-the-art performance on the TSHP-17, Qprop and PTC data sets, with accuracies of 98.24%, 90.59% and 97.44%, respectively.

Originality/value

An external news environment perception module is proposed to capture the popularity and novelty information, and a multi-information integration module is proposed to fuse this information effectively with the textual information.

Details

International Journal of Web Information Systems, vol. 20 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 18 February 2021

Tahereh Dehdarirad and Jonathan Freer

Abstract

Purpose

In recent years, web technologies and mass media have become prevalent in the context of medicine and health. Two important examples of web technologies used in health are news media and patient forums. Both play a significant role in shaping patients' perspectives and behaviour in relation to health and illness, as well as the way they might choose or change their treatment. In this paper, the authors investigated the application of these web technologies using a data analysis approach, examining the topics discussed and disseminated by patients and journalists in relation to breast and lung cancer. The study also investigated the (dis)alignment between these two groups and scientists in terms of topics.

Design/methodology/approach

Three data sets comprising documents published between 2014 and 2018 were obtained from the ProQuest and Web of Science Medline databases, alongside data from three major patient forums on breast and lung cancer. The analysis and visualisation in this paper were done using the udpipe and igraph R packages and VOSviewer.
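
The authors worked in R (udpipe, igraph) and VOSviewer; as an illustration only, a comparable keyword co-occurrence network with community detection can be sketched in Python with networkx (the keyword lists below are placeholders, not the study's data).

```python
from itertools import combinations
from collections import Counter
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical per-document keyword lists standing in for the
# ProQuest/Medline records and forum posts used in the study.
docs = [
    ["chemotherapy", "side effects", "support"],
    ["screening", "prevention", "support"],
    ["chemotherapy", "prognosis", "screening"],
]

pairs = Counter()
for kws in docs:
    pairs.update(combinations(sorted(set(kws)), 2))  # co-occurring pairs

g = nx.Graph()
for (a, b), w in pairs.items():
    g.add_edge(a, b, weight=w)

# Modularity clusters are roughly comparable to VOSviewer's term maps
for cluster in greedy_modularity_communities(g, weight="weight"):
    print(sorted(cluster))
```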

Findings

The study’s findings showed that, in general, scientists focussed more on the prognosis and treatment of cancer, whereas patients and journalists focussed more on detection, prevention and the role of social and emotional support. The only exception was the news coverage of lung cancer, where the largest cluster related to treatment, research in cancer treatment and therapies. However, when comparing coverage by scientists and journalists in terms of treatment, the focus of news articles for both cancer types was mainly on chemotherapy and complementary therapies. Finally, topics such as lifestyle and pain management were discussed only by breast cancer patients.

Originality/value

The results of this study may provide valuable insights into the topics of interest for each group (scientists, journalists and patients) as well as the (dis)alignment among them in terms of topics. These findings are important because scientific research is heavily dependent on communication and does not exist in a bubble. Scientists and journalists can gain insights from patients' experiences and needs, which in turn may help them form a more holistic and realistic view.

Peer review

The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-06-2020-0228

Details

Online Information Review, vol. 45 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 25 July 2008

Jahna Otterbacher and Dragomir Radev

Abstract

Purpose

Automated sentence‐level relevance and novelty detection would be of direct benefit to many information retrieval systems. However, the low level of agreement between human judges performing the task is an issue of concern. In previous approaches, annotators were asked to identify sentences in a document set that are relevant to a given topic, and then to eliminate sentences that do not provide novel information. This paper aims to explore a new approach in which relevance and novelty judgments are made within the context of specific, factual information needs, rather than with respect to a broad topic.

Design/methodology/approach

An experiment is conducted in which annotators perform the novelty detection task in both the topic‐focused and fact‐focused settings.
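
Agreement between judges in such an experiment is commonly quantified with a chance-corrected statistic such as Cohen's kappa; the abstract does not specify which measure the authors used, so the following is an illustrative sketch.

```python
def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators' binary
    relevance judgments (equal-length lists of 0/1)."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    pe = ((sum(a) / n) * (sum(b) / n)
          + ((n - sum(a)) / n) * ((n - sum(b)) / n))    # chance agreement
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

# Hypothetical sentence-level relevance judgments from two annotators
judge1 = [1, 1, 0, 1, 0, 0, 1, 0]
judge2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"kappa = {cohens_kappa(judge1, judge2):.2f}")    # 0.50 here
```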

Findings

Higher levels of agreement between judges are found on the task of identifying relevant sentences in the fact‐focused approach. However, the new approach does not improve agreement on novelty judgments.

Originality/value

The analysis confirms the intuition that making sentence‐level relevance judgments is likely to be the more difficult of the two tasks in the novelty detection framework.

Details

Journal of Documentation, vol. 64 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 27 November 2020

Hoda Daou

Abstract

Purpose

Social media is characterized by its volume, its speed of generation and its easy and open access, all of which make it an important source of information that provides valuable insights. Content characteristics such as valence and emotion play an important role in the diffusion of information; in fact, emotions can shape the virality of topics in social media. The purpose of this research is to fill the gap in event detection applied to online content by incorporating sentiment, more specifically strong sentiment, as the main attribute for identifying relevant content.

Design/methodology/approach

The study proposes a methodology based on strong sentiment classification using machine learning and an advanced scoring technique.
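
The paper's scoring technique is not detailed in the abstract; as an illustration of the general idea, sentiment strength from a trained classifier could amplify an engagement signal to rank posts, as in this hypothetical sketch.

```python
import numpy as np

def score_posts(posts, sentiment_strength):
    """Rank posts by an assumed score in which strong sentiment
    amplifies engagement; `sentiment_strength` values in [0, 1]
    would come from a trained strong-sentiment classifier."""
    scores = []
    for post, s in zip(posts, sentiment_strength):
        engagement = np.log1p(post["shares"] + post["replies"])
        scores.append(s * engagement)        # strong sentiment boosts rank
    order = np.argsort(scores)[::-1]
    return [(posts[i]["text"], scores[i]) for i in order]

# Hypothetical posts, not data from the study
posts = [
    {"text": "Huge explosion downtown!!", "shares": 900, "replies": 300},
    {"text": "Nice weather today", "shares": 3, "replies": 1},
]
for text, s in score_posts(posts, sentiment_strength=[0.95, 0.2]):
    print(f"{s:6.2f}  {text}")
```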

Findings

The results show the following key findings: the proposed methodology automatically captures trending topics and achieves better classification than state-of-the-art topic detection algorithms. In addition, the methodology is not context specific; it successfully identifies important events from various data sets within the contexts of politics, rallies, various news stories and real tragedies.

Originality/value

This study fills the gap in topic detection applied to online content by building on the assumption that important events trigger strong sentiment in society. In addition, classic topic detection algorithms require tuning in terms of the number of topics to search for. The proposed methodology involves scoring the posts and thus does not require limiting the number of topics; it also allows ordering the topics by relevance based on the value of the score.

Peer review

The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-12-2019-0373

Details

Online Information Review, vol. 45 no. 1
Type: Research Article
ISSN: 1468-4527
