Search results

1 – 10 of 13
Article
Publication date: 1 May 1990

William J. Black

Abstract

Although the structure of texts and the way they are produced and understood are active areas for research in a range of disciplines (linguistics, psychology, artificial intelligence and information science), this research has not yet produced convincing computer‐based solutions to the problem of producing abstracts of technical papers. Genuinely knowledge‐based approaches are probably furthest from producing practical results, because of the amount of background knowledge that they presuppose and because of the difficulty in principle of finding appropriate representations and processes. After reviewing the kinds of structure in abstracts that seem pertinent to the abstracting problem, we discuss a rule‐based approach that requires a relatively simple knowledge base, partitioned into two sets of rules: one for recognising textual fragments that explicitly signal the document topic, and another for detecting when referring expressions (especially pronouns) are linked to objects mentioned or evoked in a different sentence. These two rule sets are used in a method of extracting from text to produce a paragraph which can function as an abstract, pioneered by Paice at the University of Lancaster. The second stage of the method is designed to ensure that sentences extracted in the first do not contain any references to parts of the text that have not been extracted, thus ensuring a minimal standard of coherence. In refining these methods, definite noun phrases (those beginning with ‘the’) pose a particular problem, which is being addressed in current joint work by Paice and the author.

Details

Online Review, vol. 14 no. 5
Type: Research Article
ISSN: 0309-314X

Article
Publication date: 1 June 2015

Trung Tran and Dang Tuan Nguyen

Abstract

Purpose

The purpose of this paper is to enhance the quality of the new reducing sentence in a sentence-generation-based summarizing method by establishing a consequence relationship between two Vietnamese sentences that each express an action, state or process.

Design/methodology/approach

First, types of pairs of Vietnamese sentences are classified based on a presupposition about the consequence relationship: the verb indicating action or state in the first sentence is considered the consequence of the verb indicating action, state or process in the second sentence. Then the main predicates in the Discourse Representation Structure, a logical form that represents the semantics of a given pair of sentences, are analyzed, and inner- and inter-sentential relationships are determined. The next step is to generate the syntactic structure of the new reducing sentence. Finally, this structure is combined with the built set of lexicons to complete the new meaning-summarizing Vietnamese sentence.

Findings

This method makes the new meaning-summarizing Vietnamese sentence satisfy two requirements: it summarizes the semantics of the given pair of Vietnamese sentences, and it reads naturally in common Vietnamese communication. In addition, the method can be extended to summarize more complex Vietnamese paragraphs as well as paragraphs in other languages.

Research limitations/implications

At this first step, only the inter-sentential consequence relationship is considered, and it is applied to a limited set of types of Vietnamese sentence pairs that have a simple structure.

Originality/value

This study presents improvements to the sentence-generation-based summarization method that enhance the quality of new meaning-summarizing Vietnamese sentences. The method proves effective in summarizing the considered pairs of sentences.

Details

International Journal of Pervasive Computing and Communications, vol. 11 no. 2
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 27 September 2021

Sudarshan S. Sonawane and Satish R. Kolhe

Abstract

Purpose

The purpose of this paper is to handle anaphors through anaphora resolution in aspect-oriented sentiment analysis. Sentiment analysis is one form of predictive analytics for social media. In particular, Twitter is an open platform on which subscribers post opinions on contextual issues, events, products, individuals and organizations.

Design/methodology/approach

Sentiment polarity assessment cannot conclusively determine the opinion of the target audience unless the polarity is assessed under diverse aspects. Hence, aspect-oriented sentiment polarity assessment is a crucial objective of opinion assessment over social media. However, aspect-oriented sentiment polarity assessment is often influenced by the curse of anaphora resolution.

Findings

Focusing on these limitations, this article presents a scale for estimating aspect-oriented sentiment polarity under the influence of anaphors. To assess the aspect-based sentiment polarity of tweets, the anaphors in the tweets are used to weight each tweet's contribution to the sentiment polarity.

Originality/value

The experimental study presents the performance of the proposed model by comparing it with contemporary models that estimate the sentiment polarity of tweets under the impact of anaphors.

Details

International Journal of Intelligent Unmanned Systems, vol. 10 no. 1
Type: Research Article
ISSN: 2049-6427

Article
Publication date: 1 May 1978

W.J. Hutchins

Abstract

The common view of the ‘aboutness’ of documents is that the index entries (or classifications) assigned to documents represent or indicate in some way the total contents of documents; indexing and classifying are seen as processes involving the ‘summarization’ of the texts of documents. In this paper an alternative concept of ‘aboutness’ is proposed based on an analysis of the linguistic organization of texts, which is felt to be more appropriate in many indexing environments (particularly in non‐specialized libraries and information services) and which has implications for the evaluation of the effectiveness of indexing systems.

Details

Aslib Proceedings, vol. 30 no. 5
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 5 June 2017

Atika Qazi, Ram Gopal Raj, Glenn Hardaker and Craig Standing

Abstract

Purpose

The purpose of this paper is to map the evidence provided on the review types, and explain the challenges faced by classification techniques in sentiment analysis (SA). The aim is to understand how traditional classification technique issues can be addressed through the adoption of improved methods.

Design/methodology/approach

A systematic literature review was used to search articles published between 2002 and 2014; it identified 24 papers that discuss regular, comparative, and suggestive reviews and the related SA techniques. The authors formulated and applied specific inclusion and exclusion criteria in two distinct rounds to determine the most relevant studies for the research goal.

Findings

The review identified nine practices of review types, eight standard machine learning classification techniques and seven practices of concept learning Sentic computing techniques. This paper offers insights on promising concept-based approaches to SA, which leverage commonsense knowledge and linguistics for tasks such as polarity detection. The practical implications are also explained in this review.

Research limitations/implications

The findings provide information for researchers and traders to consider in relation to a variety of techniques for SA such as Sentic computing and multiple opinion types such as suggestive opinions.

Originality/value

Previous literature review studies in the field of SA have used simple literature reviews to find the tasks and challenges in the field. In this study, a systematic literature review is conducted to find more specific answers to the proposed research questions. This type of study has not been conducted in the field previously and so provides a novel contribution. Systematic reviews help to reduce implicit researcher bias. Through adoption of broad search strategies, predefined search strings and uniform inclusion and exclusion criteria, systematic reviews effectively force researchers to search for studies beyond their own subject areas and networks.

Details

Internet Research, vol. 27 no. 3
Type: Research Article
ISSN: 1066-2243

Open Access
Article
Publication date: 30 May 2022

Amani Mejri

Abstract

Purpose

This corpus-based study provides a descriptive account of the distribution of the polysemous noun nafs in two Arabic varieties, Modern Standard Arabic (MSA) and Classical Arabic (CA). The research objective is to survey the use of nafs as a reflexive marker in local binding domains and as a self-intensifier in NP-adjoined positions.

Design/methodology/approach

The consulted corpora are the Timestamped JSI Web corpus for MSA and the Quran corpus for CA. With differences in corpus size taken into account, MSA and CA exhibit a pattern of difference and similarity in nafs diffusion.

Findings

In the modern variety, nafs is pervasively used as a reflexive marker in canonical binding domains, along with a less frequent, yet notable, intensifier use, and these uses are partially and cautiously attributed to the specific genre in which they occur. In CA, nafs is mainly recurrent as a polysemous noun, along with extensive use as a reflexive marker in local binding settings. As an intensifier, nafs is totally non-existent in the CA corpus, in the same way as it is in absentia in VP-constituent extraction in MSA.

Originality/value

The study examines whether nafs, as a reflexive marker, deviates from canonical binding in Arabic the way English reflexive pronouns do. Building a general account of this distribution is relevant to understanding the explicit (syntactic) and implicit (discourse-based) dimensions of reflexive marker and self-intensifier processing and interpretation in Arabic as a first and second language.

Details

Saudi Journal of Language Studies, vol. 2 no. 2
Type: Research Article
ISSN: 2634-243X

Article
Publication date: 1 January 1989

EMMANUEL J. YANNAKOUDAKIS and HUSSAIN A. ATTAR‐BASHI

Abstract

The Subject‐Object Relationship Interface model (SORI) described in this paper is a novel approach that displays many of the structures necessary to map between the conceptual level and the external level in a database management system, which is an information‐oriented view of data. The model embodies a semantic synthesiser, which is based on an algorithm that maps the syntactic representation of a tuple or a record onto a semantic representation. This is based on table‐driven semantics which are embedded in the database model. The paper introduces a technique for translating tuples into natural language sentences, and discusses a system that has been fully implemented in PROLOG.

Details

Journal of Documentation, vol. 45 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 December 1995

Frances Johnson

Abstract

The prospect of automatically generating abstracts has attracted researchers for some time, but the promise of superseding the human effort has yet to be realized. Surveys the approaches and techniques developed with the view to showing why this is so. Particular emphasis is placed on the requirements for the production of abstracts, which effectively serve their intended function, to show the ways in which this has hampered research in the past. Suggests that progress of automatic abstracting research may come about via the integration of some of the techniques into computerized information retrieval systems. This will allow researchers to shift the aim from reproducing the conventional benefits of abstracts to accentuating the advantages to users of computerized representation of information in large textual databases.

Details

Library Review, vol. 44 no. 8
Type: Research Article
ISSN: 0024-2535

Article
Publication date: 27 April 2010

María‐Dolores Olvera‐Lobo and Lola García‐Santiago

Abstract

Purpose

This study aims to focus on the evaluation of systems for the automatic translation of questions destined to translingual question‐answer (QA) systems. The efficacy of online translators when performing as tools in QA systems is analysed using a collection of documents in the Spanish language.

Design/methodology/approach

Automatic translation is evaluated in terms of the functionality of actual translations produced by three online translators (Google Translator, Promt Translator, and Worldlingo) by means of objective and subjective evaluation measures, and the typology of errors produced was identified. For this purpose, a comparative study of the quality of the translation of factual questions of the CLEF collection of queries was carried out, from German and French to Spanish.

Findings

It was observed that the rates of error for the three systems evaluated here are greater in the translations pertaining to the language pair German‐Spanish. Promt was identified as the most reliable translator of the three (on average) for the two linguistic combinations evaluated. However, for the Spanish‐German pair, a good assessment of the Google online translator was obtained as well. Most errors (46.38 percent) tended to be of a lexical nature, followed by those due to a poor translation of the interrogative particle of the query (31.16 percent).

Originality/value

The evaluation methodology applied focuses above all on the finality of the translation. That is, does the resulting question serve as effective input into a translingual QA system? Thus, instead of searching for “perfection”, the functionality of the question and its capacity to lead one to an adequate response are appraised. The results obtained contribute to the development of improved translingual QA systems.

Details

Journal of Documentation, vol. 66 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 18 April 2017

Mahmoud Al-Ayyoub, Ahmed Alwajeeh and Ismail Hmeidi

Abstract

Purpose

The authorship authentication (AA) problem is concerned with correctly attributing a text document to its corresponding author. Historically, this problem has been the subject of various studies built on the intuitive idea that each author has a unique style that can be captured using stylometric features (SF). Another approach to this problem, known as the bag-of-words (BOW) approach, uses keyword occurrences/frequencies in each document to identify its author. Unlike the first one, this approach is more language-independent. This paper aims to study and compare both approaches, focusing on the Arabic language, which is still largely understudied despite its importance.

Design/methodology/approach

Because AA is a supervised learning problem, the authors start by collecting a very large data set of Arabic documents to be used for training and testing purposes. For the SF approach, they compute hundreds of SF, whereas, for the BOW approach, the popular term frequency-inverse document frequency technique is used. Both approaches are compared under various settings.

Findings

The results show that the SF approach, which is much cheaper to train, can generate more accurate results under most settings.

Practical implications

Numerous advantages of efficiently solving the AA problem are obtained in different fields of academia as well as the industry including literature, security, forensics, electronic markets and trading, etc. Another practical implication of this work is the public release of its sources. Specifically, some of the SF can be very useful for other problems such as sentiment analysis.

Originality/value

This is the first study of its kind to compare the SF and BOW approaches for authorship analysis of Arabic articles. Moreover, many of the computed SF are novel, while other features are inspired by the literature. As SF are language-dependent and most existing papers focus on English, extra effort must be invested to adapt such features to Arabic text.

Details

International Journal of Web Information Systems, vol. 13 no. 1
Type: Research Article
ISSN: 1744-0084
