Search results
1 – 10 of 177
Zhongyi Wang, Jin Zhang and Jing Huang
Abstract
Purpose
Current segmentation systems almost invariably focus on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feeds but not coherent texts such as the documents of a digital library, which have hierarchical structures. To move beyond linear segmentation and achieve hierarchical segmentation of a digital library’s structured resources, this paper proposes a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks.
Design/methodology/approach
MHTSS adopts an up-down segmentation strategy to divide a structured digital library document into a document segmentation tree. Specifically, it works in three stages: document parsing, coarse segmentation based on document access structures, and fine-grained segmentation based on lexical cohesion.
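The fine-grained stage relies on lexical cohesion. The sketch below is not the MHTSS algorithm; it only illustrates the general idea, assuming that a break is placed wherever the word overlap between two adjacent sentences falls below a threshold. The function names, the cosine measure, and the threshold value are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def bag(sentence):
    """Lowercased word counts for one sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_breaks(sentences, threshold=0.1):
    """Return each index i where a break falls between sentence i-1 and i.

    The threshold is an assumed value, not one taken from the paper.
    """
    bags = [bag(s) for s in sentences]
    return [
        i for i in range(1, len(bags))
        if cosine(bags[i - 1], bags[i]) < threshold
    ]


sentences = [
    "The library stores digital documents.",
    "Digital documents in the library have sections.",
    "Cats sleep all day.",
    "Cats also sleep at night.",
]
print(cohesion_breaks(sentences))
```

In this toy input the only pair of neighbours with no shared vocabulary is the second and third sentence, so that is where the single break is reported.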
Findings
This paper analyzed the limitations of document segmentation methods for structured digital library resources. The authors found that document access structures and lexical cohesion techniques complement each other and, combined, allow for better segmentation of structured digital library resources. Based on this finding, the paper proposed MHTSS for structured digital library resources. To evaluate it, MHTSS was compared with the TT and C99 algorithms on real-world digital library corpora. The comparison showed that MHTSS achieves the top overall performance.
Practical implications
With MHTSS, digital library users can get their relevant information directly in segments instead of receiving the whole document. This will improve retrieval performance as well as dramatically reduce information overload.
Originality/value
This paper proposed MHTSS for structured digital library resources, which combines document access structures and lexical cohesion techniques to decide section breaks. With this system, end-users can access a document by section through a document structure tree.
Sung‐on Hwang, Carolyn L. Piazza, Michael J. Pierce and Sara M. Bryce
Abstract
Purpose
The purpose of this paper is to report on one high school English‐language‐learner's (ELL) breadth and depth of vocabulary as he communicated with his teacher through e‐mail across geographic boundaries for over 18 months.
Design/methodology/approach
The authors began by separating 358 e‐mails into three time periods (beginning, middle, and end) to calculate breadth using lexical density (type‐token ratios). The authors then sampled e‐mails based on personal and impersonal topics within these time periods and linguistically analyzed them for lexical cohesion, semantic usage, and derivational morphology. Interviews with participants before and after the analysis served as member checks.
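A type‐token ratio divides the number of distinct words (types) by the total number of words (tokens). The following is a minimal sketch of that calculation; the tokenization rule and the sample sentence are assumptions for illustration and are not taken from the study.

```python
import re


def type_token_ratio(text):
    """Distinct words divided by total words; 0.0 for empty input."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "I like my school and I like my teacher"
ttr = type_token_ratio(sample)
print(round(ttr, 3))
```

The sample has nine tokens and six types, so the ratio is 6/9. Because the ratio tends to fall as texts get longer, comparisons across time periods are usually made on samples of similar length.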
Findings
The quantitative results showed a steady improvement in the breadth of the student's vocabulary over time. Qualitative analyses revealed four major uses of vocabulary within the context of e‐mail and the teacher‐student relationship.
Practical implications
Given our findings, we offer educators insights into ELL strategies and vocabulary assessment, not only with e‐mail but in all written communication.
Social implications
A social writing tool like e‐mail can be useful for learning English in a safe, non‐threatening environment. Moreover, a trusting social relationship between communicators that develops over time can expedite the language learning process.
Originality/value
Very few studies have looked at the strategic ways ELL students use vocabulary to learn English through e‐mailing.
Abstract
This article examines the discourse of appointment, promotion, and tenure (APT) documents for academic librarians. Discourse analysis can illuminate the social role of language, social systems, and social practices.
This qualitative research analyzes the APT documents for librarians from a group of US universities (n = 50) whose librarians are tenured faculty (n = 35). Linguistic features were examined to identify genre (text type) and register (language variety) characteristics.
The documents showed strong relationships with other texts; vocabulary from the language of human resources (HR); grammatical characteristics such as nominalization; passive constructions; few pronouns; the “quasi-synonymy” of series of adjectives, nouns, or verbs; and expression of certainty and obligation. The documents have a sociolinguistic and social semiotic component. In using a faculty genre, librarians assert solidarity with other faculty, while the prominent discourse of librarians as practitioners detracts from faculty solidarity.
This research is limited to librarians at US land grant institutions. It has implications for other research institutions and other models of librarian status.
This research can help academic librarians fulfill their obligations by understanding how values encoded in these documents reflect positive and negative approaches.
Higher education and academic librarianship are in a state of flux. Understanding the discourse of these documents can help librarians encode appropriate goals and values. Little has been written on the discourse of librarianship. This is a contribution to the understanding of librarians as a discourse community and of significant communicative events.
Abstract
Purpose
The purpose of this study is to investigate EFL learners’ cohesion in writing after the implementation of a small group flipped instruction model through WhatsApp with small group writing activities, compared with an individual flipped instruction model through WhatsApp with individual writing activities.
Design/methodology/approach
A quasi-experimental study with a nonequivalent control group and a pre-test/post-test design was implemented to find any significant difference between the two combinations. The small group class was taught using the small group flipped instruction model through WhatsApp with small group writing activities, and the individual class was taught using the individual flipped instruction model through WhatsApp with individual writing activities. The instrument of this study was a writing test.
Findings
The findings revealed that the mean score for the small group flipped instruction model through WhatsApp with small group writing activities (66.17) was higher than the mean score for the individual flipped instruction model via WhatsApp with individual writing activities (50.19), with a level of significance < 0.05. This meant that the small group flipped classroom instruction model through WhatsApp with small group writing activities performed better than teaching cohesion with individual flipped instruction through WhatsApp with individual writing activities. The results suggested that small group flipped teaching and learning of cohesion with WhatsApp could serve as an alternative form of flipped group discussion for improving learners’ cohesion in writing.
Originality/value
Flipped classroom innovation has attracted English language teaching researchers’ attention to scrutinize its effectiveness. This inquiry, therefore, examined the effect of flipping individual and small group classroom instruction with WhatsApp on EFL learners’ cohesion as part of EFL writing skills.
Ginger G. Collins and Stephanie F. Reid
Abstract
This chapter details how engaging students in digital comics creation might support adolescents in strengthening their narrative writing capabilities. This chapter first provides a more detailed explanation of the micro and macrostructural elements involved in narrative production. Second, the chapter provides an introduction to comics and important design features. The authors also illuminate the complexity of multimodal texts (texts that combine images and words) and link visual narrative pedagogy and curriculum to classroom equity and accessibility. Across these opening sections, academic standards are referenced to show how the comics medium aligns with national visions of what robust English Language Arts education entails. The chapter concludes with descriptions of specific pedagogical strategies and digital comic-making tools that teachers and interventionists might explore with students within various classroom contexts. Examples of digital comics designed using various web tools are also shared.
Robin Sydserff and Pauline Weetman
Abstract
Readability formulas have been criticised as a method for scoring accounting narratives because of their focus on word‐ and sentence‐level features and not on whole‐text aspects, their lack of regard for the interests and motivation of the reader, and their inappropriateness for evaluating adult‐based and technical accounting narratives. The literature of linguistics offers theoretical and practical validation for application of a texture index which addresses these criticisms. The paper shows how the general model drawn from applied linguistics can be tailored to the specific situation of an accounting narrative – the Operating and Financial Review. Rules which provide for objectivity in replication are specified and illustrated for a sample narrative. Illustrative empirical analysis shows that there is no evidence of association with the Flesch readability score. This suggests that the texture index is potentially a powerful tool for analysis of accounting narratives and association testing.
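For context, the Flesch score the abstract refers to is a word‐ and sentence‐level formula of exactly the kind being criticised. The sketch below assumes the standard Flesch Reading Ease formula and takes pre-computed counts as input; the function name and the example counts are hypothetical, and the paper's own scoring procedure is not reproduced here.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease from raw counts (higher = easier).

    Assumes the counts were obtained elsewhere; syllable counting
    itself is not shown.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences
    average_syllables_per_word = syllables / words
    return 206.835 - 1.015 * average_sentence_length - 84.6 * average_syllables_per_word


# Illustrative counts only: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 3))
```

The formula depends only on average sentence length and average syllables per word, which is why it cannot register whole‐text properties such as cohesion.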
Jamal Al Qundus, Adrian Paschke, Shivam Gupta, Ahmad M. Alzouby and Malik Yousef
Abstract
Purpose
The purpose of this paper is to explore to what extent the quality of social media short text can be investigated without extensions and what predictors, if any, of such short text lead to trust in its content.
Design/methodology/approach
The paper applies a trust model to classify data collections based on metadata into four classes: Very Trusted, Trusted, Untrusted and Very Untrusted. These data are collected from the online communities Genius and Stack Overflow. In order to evaluate short texts in terms of their trust levels, the authors conducted two investigations: (1) A natural language processing (NLP) approach to extract relevant features (i.e. Part-of-Speech and various readability indexes). The authors report relatively good performance of the NLP study. (2) A machine learning technique, more precisely a random forest (RF) classifier using a bag-of-words (BoW) model.
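A bag-of-words model represents each text as a vector of word counts over a shared vocabulary. The sketch below shows only that featurization step, under an assumed tokenization; the random forest classifier that would be trained on these vectors is not shown, and the function names and sample texts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Assumed tokenization: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(texts):
    """Sorted list of every distinct token seen in the collection."""
    return sorted({token for text in texts for token in tokenize(text)})


def bow_vector(text, vocabulary):
    """Count of each vocabulary term in the text, in vocabulary order."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


texts = ["good answer good code", "bad answer"]
vocabulary = build_vocabulary(texts)
vectors = [bow_vector(text, vocabulary) for text in texts]
print(vocabulary)
print(vectors)
```

Each row of `vectors` could then be paired with one of the four trust classes as a training example. Word order is discarded, which keeps the representation simple for short texts.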
Findings
The investigation of the RF classifier using BoW shows promising intermediate results (on average 62% accuracy across both online communities) in identifying short-text quality that leads to trust.
Practical implications
As social media becomes an increasingly attractive source of information, mostly provided in the form of short texts, businesses (e.g. in search engines for smart data) can filter content without having to apply complex approaches and continue to work with information that is considered more trustworthy.
Originality/value
Short-text classifications with regard to a criterion (e.g. quality, readability) are usually extended by an external source or its metadata. This enhancement either changes the original text if it is an additional text from an external source, or it requires text metadata that is not always available. This study therefore takes on the challenge of investigating the quality of short text (i.e. social media text) without having to extend or modify it using external sources, since such modification alters the text and distorts the results of the investigation.
Jiunn-Liang Guo, Hei-Chia Wang and Ming-Way Lai
Abstract
Purpose
The purpose of this paper is to develop a novel feature selection approach for automatic text classification of large digital documents, namely the e-books of an online library system. The main idea is to automatically identify discourse features in order to improve the feature selection process, rather than focusing on the size of the corpus.
Design/methodology/approach
The proposed framework automatically identifies the discourse segments within e-books, captures the discourse subtopics that are cohesively expressed in those segments and treats these subtopics as informative and prominent features. The selected set of features is then used to train a support vector machine and perform the e-book classification task.
Findings
The evaluation of the proposed framework shows that identifying discourse segments and capturing subtopic features leads to better performance, in comparison with two conventional feature selection techniques: TFIDF and mutual information. It also demonstrates that discourse features play important roles among textual features, especially for large documents such as e-books.
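TF-IDF, one of the two baselines named above, scores a term highly when it is frequent in one document but rare across the collection. The sketch below assumes one common variant (length-normalized term frequency multiplied by log(N/df)) and ranks terms by their best score in any document; the paper's exact weighting and selection rule are not given in the abstract, so both are assumptions, as are the function names.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per doc."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


def top_features(docs, k):
    """Pick the k terms with the highest TF-IDF weight in any document."""
    best = {}
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            best[term] = max(best.get(term, 0.0), weight)
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(best, key=lambda term: (-best[term], term))[:k]


docs = [["library", "book", "book"], ["library", "search"]]
selected = top_features(docs, k=2)
print(selected)
```

Here "library" occurs in every document, so its weight is zero and it is not selected, while "book" and "search" are kept.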
Research limitations/implications
Automatically extracted subtopic features cannot be entered directly into the feature selection process; they require control of a threshold.
Practical implications
The proposed technique demonstrates a promising application of discourse analysis to enhance the classification of large digital documents such as e-books, compared with conventional techniques.
Originality/value
A new feature selection technique is proposed that can inspect the narrative structure of large documents; it is new to the text classification domain. The other contribution is that it encourages the consideration of discourse information in future text analysis by providing more evidence through evaluation of the results. The proposed system can be integrated into other library management systems.
Pedro Hípola, José A. Senso, Amed Leiva-Mederos and Sandor Domínguez-Velasco
Abstract
Purpose
The purpose of this paper is to look into the latest advances in ontology-based text summarization systems, with emphasis on the methodologies of a socio-cognitive approach, the structural discourse models and the ontology-based text summarization systems.
Design/methodology/approach
The paper analyzes the main literature in this field and presents the structure and features of Texminer, a software tool that facilitates summarization of texts on Port and Coastal Engineering. Texminer combines several techniques, including socio-cognitive user models, Natural Language Processing, disambiguation and ontologies. After processing a corpus, the system was evaluated using as a reference various clustering evaluation experiments conducted by Arco (2008) and Hennig et al. (2008). The results were checked with a support vector machine, Rouge metrics, the F-measure and calculation of precision and recall.
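Precision, recall and the F-measure can all be computed from the word overlap between a system summary and a reference summary, which is also the basis of the unigram Rouge score. The sketch below is a simplified unigram version under assumed whitespace tokenization; it is not the evaluation setup used for Texminer, and the function name and example sentences are hypothetical.

```python
from collections import Counter


def unigram_overlap_scores(candidate, reference):
    """Return (precision, recall, f_measure) from clipped unigram overlap."""
    candidate_counts = Counter(candidate.lower().split())
    reference_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((candidate_counts & reference_counts).values())
    candidate_total = sum(candidate_counts.values())
    reference_total = sum(reference_counts.values())
    precision = overlap / candidate_total if candidate_total else 0.0
    recall = overlap / reference_total if reference_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


precision, recall, f_measure = unigram_overlap_scores(
    "the port is deep",
    "the port is very deep and wide",
)
print(precision, round(recall, 3), round(f_measure, 3))
```

All four candidate words appear in the seven-word reference, so precision is 1.0, recall is 4/7 and the balanced F-measure is 8/11.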
Findings
The experiment illustrates the superiority of abstracts obtained through the assistance of ontology-based techniques.
Originality/value
The authors were able to corroborate that the summaries obtained using Texminer are more efficient than those derived through other systems whose summarization models do not use ontologies to summarize texts. Thanks to ontologies, main sentences can be selected with a broad rhetorical structure, especially for a specific knowledge domain.
Michael Grassmann, Stephan Fuhrmann and Thomas W. Guenther
Abstract
Purpose
Integrated reporting (IR) aims to provide disclosures of the connectivity of non-financial and financial value creation aspects. These disclosures are defined as the disclosed connectivity of the capitals resulting from integrated thinking. This paper aims to investigate the extent of disclosed connectivity of the capitals in integrated reports and its underlying managerial discretion by drawing on economic-based theories.
Design/methodology/approach
Regression analyses are applied to examine the associations between economic firm-level characteristics and the extent of disclosed connectivity of the capitals. The analyses are based on a content analysis of 169 integrated reports disclosed in 2013 and 2014 by Forbes Global 2000 companies.
Findings
This paper finds high heterogeneity in the extent of disclosed connectivity of the capitals in current IR practice. This heterogeneity is related to drivers arising from economic-based theories. Firms’ non-financial and financial performance and the importance of strategic shareholders and debt providers are positively associated with the extent of disclosed connectivity of the capitals. The complexity of the business model and a highly competitive environment are negatively associated with the extent of disclosed connectivity of the capitals.
Research limitations/implications
This paper extends qualitative IR studies on the disclosed connectivity of the capitals by quantitative results from a content analysis for a cross-sectional and global sample. Additionally, this study adds to prior IR literature on the drivers of the binary decision to disclose an integrated report by focusing on the extent of disclosed connectivity of the capitals.
Practical implications
For report preparers, users and standard setters, the results reveal that perceived cost-benefit considerations (signaling vs. direct and proprietary costs) may explain managerial discretion regarding the connectivity of the capitals within integrated reports.
Social implications
This paper examines integrated reports, which are intended to inform providers of financial capital and other stakeholders about the connectivity of the six capitals of the IR framework.
Originality/value
This paper develops a metric disclosure measure of the extent of disclosed connectivity of the capitals. It provides initial evidence of how the IR framework’s focus on this key characteristic is realized in disclosure practice. Concerns about competitive disadvantages and preparation costs limit this key characteristic of integrated reports.