Search results
Mahmoud Al-Ayyoub, Ahmed Alwajeeh and Ismail Hmeidi
Abstract
Purpose
The authorship authentication (AA) problem is concerned with correctly attributing a text document to its corresponding author. Historically, this problem has been the subject of various studies built on the intuitive idea that each author has a unique style that can be captured using stylometric features (SF). Another approach to this problem, known as the bag-of-words (BOW) approach, uses keyword occurrences/frequencies in each document to identify its author. Unlike the first one, this approach is more language-independent. This paper aims to study and compare both approaches, focusing on the Arabic language, which is still largely understudied despite its importance.
Design/methodology/approach
Because AA is a supervised learning problem, the authors start by collecting a very large data set of Arabic documents to be used for training and testing purposes. For the SF approach, they compute hundreds of SF, whereas for the BOW approach, the popular term frequency-inverse document frequency (TF-IDF) technique is used. Both approaches are compared under various settings.
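As a rough illustration of the two feature families compared here, the sketch below computes TF-IDF weights for a BOW representation and a handful of simple stylometric features. It is not the authors' implementation: the function names (`tokenize`, `tfidf`, `stylometric_features`), the tokenization rule, the particular TF-IDF variant (relative term frequency times log(N/df)) and the three features chosen are all assumptions made for the example; the paper itself reports hundreds of SF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase word tokens. \w is Unicode-aware in
    # Python 3, so Arabic letters match; diacritics would need extra handling.
    return re.findall(r"\w+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document (assumed variant: tf * log(N / df))."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def stylometric_features(text):
    """Three illustrative style features; real SF sets are far larger."""
    tokens = tokenize(text)
    # Split on Latin and Arabic sentence-ending punctuation.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    n_tokens = len(tokens) or 1
    return {
        "avg_word_length": sum(len(t) for t in tokens) / n_tokens,
        "avg_sentence_length": len(tokens) / (len(sentences) or 1),
        "type_token_ratio": len(set(tokens)) / n_tokens,
    }
```

Either feature representation could then be passed to a supervised classifier trained on documents labelled by author. Note that a term occurring in every document receives a TF-IDF weight of zero under this variant, which is why BOW features emphasize author-specific vocabulary.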
Findings
The results show that the SF approach, which is much cheaper to train, can generate more accurate results under most settings.
Practical implications
Efficiently solving the AA problem offers numerous advantages in different fields of academia and industry, including literature, security, forensics, electronic markets and trading. Another practical implication of this work is the public release of its sources. Specifically, some of the SF can be very useful for other problems such as sentiment analysis.
Originality/value
This is the first study of its kind to compare the SF and BOW approaches for authorship analysis of Arabic articles. Moreover, many of the computed SF are novel, while other features are inspired by the literature. As SF are language-dependent and most existing papers focus on English, extra effort must be invested to adapt such features to Arabic text.
Kokil Jaidka, Christopher S.G. Khoo and Jin‐Cheon Na
Abstract
Purpose
This paper aims to report a study of researchers' preferences in selecting information from cited papers to include in a literature review, and the kinds of transformations and editing applied to the selected information.
Design/methodology/approach
This is part of a larger project to develop an automatic summarization method that emulates human literature review writing behaviour. Research questions were: how are literature reviews written – where do authors select information from, what types of information do they select and how do they transform it? What is the relationship between styles of literature review (integrative and descriptive) and each of these variables (source sections, types of information and types of transformation)? The authors analysed the literature review sections of 20 articles from the Journal of the American Society for Information Science and Technology, 2001‐2008, to answer these questions. Referencing sentences were mapped to 279 source papers to determine the source sentences. The type of information selected, the sections of source papers from which the information was taken, and the types of editing changes made when including it in the literature review were analysed.
Findings
Integrative literature reviews contain more research result information and critique, and reference more information from the results and conclusion sections of the source papers. Descriptive literature reviews contain more research method information, and reference more information from the abstract and introduction sections. The most common kind of transformation is the high‐level summary, though descriptive literature reviews have more cut‐pasting, especially for information taken from the abstract. The types of editing – substitutions, insertions and deletions – applied to the source sentences are identified.
Practical implications
The results are useful in the teaching of literature review writing, and indicate ways for automatic summarization systems to emulate human literature review writing.
Originality/value
Though there have been several studies of abstracts and abstracting, there are few studies of literature reviews and literature review writing. Little is known about how writers select information from source papers, integrate it and present it in a literature review. This paper fills some of the gaps.
Zehui Zhan, Jun Wu, Hu Mei, Qianyi Wu and Patrick S.W. Fong
Abstract
Purpose
This paper aims to investigate individual differences in digital reading by examining the eye-tracking records of male and female readers with different reading abilities (including their pupil size, blink rate, fixation rate, fixation duration, saccade rate, saccade duration, saccade amplitude and regression rate).
Design/methodology/approach
A total of 74 participants were selected from 6,520 undergraduate students on the basis of their university entrance exam scores and follow-up reading assessments. Half are men and half are women; they were drawn from the top 3% (good readers) and the bottom 3% (poor readers) and come from different disciplines.
Findings
Results indicated that the major gender differences in reading ability were reflected in saccade duration, regression rate and blink rate. The main effect of reading ability has a larger effect size than the main effect of gender. Among all the indicators examined, blink rate and regression rate are the most sensitive to gender, while fixation rate and saccade amplitude are the least sensitive.
Originality/value
This finding could be helpful for user modeling with eye-tracking data in intelligent tutoring systems, where adjustments might be needed according to users’ individual differences. In this way, instructors could provide purposeful guidance according to what the learners had seen and personalize the experience of digital reading.
Hamish Cunningham, Kalina Bontcheva and Yaoyong Li
Abstract
Purpose
Seeks to explore the gap that exists between knowledge management (KM) systems and the natural language materials that form almost all corporate data stores.
Design/methodology/approach
A conceptual discussion and approach are taken using recent scientific results in the fields of the semantic web and ontology‐based information extraction.
Findings
Provides a high‐level introduction to information extraction (IE) and descriptions of application scenarios for KM tools that exploit IE, a form of natural language analysis to link semantic web models with documents. The paper presents some examples of ontology‐based IE systems, one of which, KIM, is under development in the SEKT Project. KIM offers IE‐based facilities for metadata creation, storage and conceptual search. The system can be used by diverse applications for annotating and querying documents.
Originality/value
Focuses on technologies and facilities that will become an important part of next‐generation KM applications.