Search results

1 – 10 of over 7000
Article
Publication date: 8 November 2022

Yohanes Sigit Purnomo W.P., Yogan Jaya Kumar and Nur Zareen Zulkarnain

Abstract

Purpose

To date, the corpora available for quotation extraction and quotation attribution in Indonesian remain limited in quantity and depth. This study aims to develop an Indonesian corpus of public figure statement attributions and a baseline model for attribution extraction, thereby fostering research in information extraction for the Indonesian language.

Design/methodology/approach

The methodology is divided into corpus development and extraction model development. During corpus development, data were collected and annotated. The development of the extraction model entails feature extraction, the definition of the model architecture, parameter selection and configuration, model training and evaluation, as well as model selection.

Findings

The Indonesian corpus of public figure statement attributions achieved a 90.06% agreement level between the annotator and experts and could serve as a gold standard corpus. Furthermore, the baseline model predicted most labels and achieved an F-score of 82.026%.
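As a rough illustration of how an agreement level like the one reported above can be computed, the sketch below calculates simple percent agreement between two label sequences. The abstract does not say which agreement measure was used, so the choice of raw percent agreement, the function name `percent_agreement` and the example labels are all assumptions made for illustration only.

```python
from typing import Sequence


def percent_agreement(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Share of positions (in percent) where two annotators chose the same label.

    Assumption: plain percent agreement, not a chance-corrected measure.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("at least one labelled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical token labels, not taken from the corpus described above.
annotator = ["SOURCE", "CUE", "STATEMENT", "O"]
expert = ["SOURCE", "CUE", "STATEMENT", "STATEMENT"]
print(f"{percent_agreement(annotator, expert):.2f}%")
```

Percent agreement does not correct for agreement expected by chance, so it tends to read higher than chance-corrected measures computed on the same annotations.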

Originality/value

To the best of the authors’ knowledge, the resulting corpus is the first corpus for attribution of public figures’ statements in the Indonesian language, which makes it a significant step for research on attribution extraction in the language. The resulting corpus and the baseline model can be used as a benchmark for further research. Other researchers could follow the methods presented in this paper to develop a new corpus and baseline model for other languages.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 2 December 2020

Yohanes Sigit Purnomo W.P., Yogan Jaya Kumar and Nur Zareen Zulkarnain

Abstract

Purpose

Extracting information from unstructured data is a challenging task for computational linguistics. Public figures’ statements attributed by journalists in a story are one type of information that can be processed into structured data. A knowledge base of such statements would therefore be beneficial for further uses such as opinion mining, claim detection and fact-checking. This study aims to understand statement extraction tasks and the models that have already been applied, in order to formulate a framework for further study.

Design/methodology/approach

This paper presents a literature review from selected previous research that specifically addresses the topics of quotation extraction and quotation attribution. Research works that discuss corpus development related to quotation extraction and quotation attribution are also considered. The findings of the review will be used as a basis for proposing a framework to direct further research.

Findings

There are three findings in this study. Firstly, the extraction process still consists of two main tasks, namely, the extraction of quotations and the attribution of quotations. Secondly, most extraction algorithms rely on a rule-based algorithm or traditional machine learning. Lastly, the available corpora are limited in quantity and depth. Based on these findings, a statement extraction framework for Indonesian language corpus and model development is proposed.

Originality/value

The paper serves as a guideline to formulate a framework for statement extraction based on the findings from the literature study. The proposed framework includes a corpus development in the Indonesian language and a model for public figure statement extraction. Furthermore, this study could be used as a reference to produce a similar framework for other languages.

Details

Global Knowledge, Memory and Communication, vol. 70 no. 6/7
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 2 September 2019

Jelena Andonovski, Branislava Šandrih and Olivera Kitanović

Abstract

Purpose

This paper aims to describe the structure of an aligned Serbian-German literary corpus (SrpNemKor) contained in a digital library Bibliša. The goal of the research was to create a benchmark Serbian-German annotated corpus searchable with various query expansions.

Design/methodology/approach

The presented research is particularly focused on the enhancement of bilingual search queries in a full-text search of aligned SrpNemKor collection. The enhancement is based on using existing lexical resources such as Serbian morphological electronic dictionaries and the bilingual lexical database Termi.

Findings

For the purpose of this research, the lexical database Termi is enriched with a bilingual list of German-Serbian translated pairs of lexical units. The list of correct translation pairs was extracted from SrpNemKor, evaluated and integrated into Termi. Also, Serbian morphological e-dictionaries are updated with new entries extracted from the Serbian part of the corpus.

Originality/value

A bilingual search of SrpNemKor in Bibliša is available within the user-friendly platform. The enriched database Termi enables semantic enhancement and refinement of a user’s search query based on synonyms in both Serbian and German. Serbian morphological e-dictionaries facilitate the morphological expansion of search queries in Serbian, thereby enabling the analysis of concepts and concept structures by identifying terms assigned to a concept and by establishing relations between terms in Serbian and German. This makes Bibliša a valuable Web tool that can support research and analysis of SrpNemKor.

Details

The Electronic Library, vol. 37 no. 4
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 12 September 2023

Wenjing Wu, Caifeng Wen, Qi Yuan, Qiulan Chen and Yunzhong Cao

Abstract

Purpose

Learning from safety accidents and sharing safety knowledge has become an important part of accident prevention and improving construction safety management. Because unstructured data in the construction industry are difficult to reuse, the knowledge they contain is difficult to use directly for safety analysis. The purpose of this paper is to explore the construction of a construction safety knowledge representation model and a safety accident graph through deep learning methods, extract construction safety knowledge entities through a BERT-BiLSTM-CRF model and propose a data management model of data–knowledge–services.

Design/methodology/approach

The ontology model of knowledge representation of construction safety accidents is constructed by integrating entity relation and logic evolution. Then, the database of safety incidents in the architecture, engineering and construction (AEC) industry is established based on the collected construction safety incident reports and related dispute cases. The construction method of construction safety accident knowledge graph is studied, and the precision of BERT-BiLSTM-CRF algorithm in information extraction is verified through comparative experiments. Finally, a safety accident report is used as an example to construct the AEC domain construction safety accident knowledge graph (AEC-KG), which provides visual query knowledge service and verifies the operability of knowledge management.

Findings

The experimental results show that the combined BERT-BiLSTM-CRF algorithm has a precision of 84.52%, a recall of 92.35%, and an F1 value of 88.26% in named entity recognition from the AEC domain database. The construction safety knowledge representation model and safety incident knowledge graph realize knowledge visualization.
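The reported F1 value can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only a consistency check on the figures quoted above; the function name `f1_score` is a hypothetical helper, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the findings above.
f1 = f1_score(precision=0.8452, recall=0.9235)
print(f"{f1 * 100:.2f}%")  # prints 88.26%
```

The result matches the stated F1 of 88.26%, so the three reported figures are mutually consistent.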

Originality/value

The proposed framework provides a new knowledge management approach to improve the safety management of practitioners and also enriches the application scenarios of knowledge graph. On the one hand, it innovatively proposes a data application method and knowledge management method of safety accident report that integrates entity relationship and matter evolution logic. On the other hand, the legal adjudication dimension is innovatively added to the knowledge graph in the construction safety field as the basis for the postincident disposal measures of safety accidents, which provides reference for safety managers' decision-making in all aspects.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 1 July 2020

Mike Thelwall, Eleanor-Rose Papas, Zena Nyakoojo, Liz Allen and Verena Weigert

Abstract

Purpose

Peer reviewer evaluations of academic papers are known to be variable in content and overall judgements but are important academic publishing safeguards. This article introduces a sentiment analysis program, PeerJudge, to detect praise and criticism in peer evaluations. It is designed to support editorial management decisions and reviewers in the scholarly publishing process and for grant funding decision workflows. The initial version of PeerJudge is tailored for reviews from F1000Research's open peer review publishing platform.

Design/methodology/approach

PeerJudge uses a lexical sentiment analysis approach with a human-coded initial sentiment lexicon and machine learning adjustments and additions. It was built with an F1000Research development corpus and evaluated on a different F1000Research test corpus using reviewer ratings.

Findings

PeerJudge can predict F1000Research judgements from negative evaluations in reviewers' comments more accurately than baseline approaches, although not from positive reviewer comments, which seem to be largely unrelated to reviewer decisions. Within the F1000Research mode of post-publication peer review, the absence of any detected negative comments is a reliable indicator that an article will be ‘approved’, but the presence of moderately negative comments could lead to either an approved or approved with reservations decision.

Originality/value

PeerJudge is the first transparent AI approach to peer review sentiment detection. It may be used to identify anomalous reviews with text potentially not matching judgements for individual checks or systematic bias assessments.

Details

Online Information Review, vol. 44 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 1 June 2005

Colin C. Williams

Abstract

Purpose

Recently, the recurring narrative that capitalism is stretching its tentacles ever more widely and deeply into every crevice of daily life across the globe has been challenged in the context of Western economies and the Third World by an emerging post‐development corpus of thought. The aim here is to extend this critique of market hegemony by investigating the so‐called “transition” economies of East‐Central Europe.

Design/methodology/approach

The paper analyses the extent to which market practices penetrated the “transition” economies of East‐Central Europe in the years following the collapse of the socialist bloc, first through a review of the post‐development literature and then by examining the nature of work and trajectories of the “transition” economies.

Findings

Analysis highlights not only the shallow permeation of market practices but also the multiplicity of development trajectories being pursued at both the household and societal levels.

Originality/value

The outcome is to provide additional evidence from the post‐socialist East‐Central European bloc to support the critique of market hegemony and open up the future to alternative possibilities beyond marketisation.

Details

Foresight, vol. 7 no. 3
Type: Research Article
ISSN: 1463-6689

Article
Publication date: 2 October 2007

Colin C. Williams and John Round

Abstract

Purpose

The purpose of this paper is to evaluate critically the meta‐narrative that capitalism is becoming totalising and hegemonic. Grounded in an emerging corpus of post‐development thought that has deconstructed this discourse in relation to western economies and the majority (third) world, the paper contributes further to this burgeoning critique by analysing the degree to which capitalism has penetrated a post‐socialist society, namely Ukraine.

Design/methodology/approach

To analyse the penetration of capitalism, a survey is reported of the work practices of 600 households in an array of localities in Ukraine, conducted during 2005/2006 using face‐to‐face interviews.

Findings

Analysing the practices used by households to secure their livelihoods, the finding is that capitalism is far from hegemonic. Even when the formal economy is relied on either as their most important or second most important source of livelihood, it is nearly always combined with some other economic activity. A diverse portfolio of work practices is thus the norm rather than the exception with over 90 per cent of households relying on sources other than the formal market sphere as either their most important or second most important source of livelihood.

Practical implications

Displaying the shallow penetration of capitalism in this array of localities in Ukraine, this paper reveals the need for a re‐representation of the realities of work in such post‐socialist societies so as to open up the feasibility of, and possibilities for, alternative futures for work.

Originality/value

This paper reports the first evaluation of the extent to which capitalism has penetrated work practices in post‐socialist Ukraine.

Details

Journal of Economic Studies, vol. 34 no. 5
Type: Research Article
ISSN: 0144-3585

Article
Publication date: 23 February 2010

Colin C. Williams

Abstract

Purpose

A persistent and recurring narrative is that capitalism has penetrated ever wider and deeper into all aspects of daily life across the globe. Recently, however, this has started to be challenged by an emergent post‐development body of thought that has displayed the shallowness of commodification in a number of global regions. The aim of this paper is to further contribute to this emergent critique of capitalist hegemony by evaluating the degree to which capitalism has managed to permeate everyday life in the Commonwealth of Independent States.

Design/methodology/approach

The findings of a 2001 survey of household economic practices in eight CIS countries are analysed, namely Armenia, Belarus, Georgia, Kazakhstan, Kirgizstan, Moldova, Russia and Ukraine.

Findings

This study reveals a shallow permeation of capitalist practices in the CIS and shows how an array of non‐capitalist economic practices remains a core, integral component of these economies and is heavily relied on by households to secure a livelihood.

Research limitations/implications

This snapshot survey only displays that capitalism is far from hegemonic. It does not show whether there is movement towards greater reliance on the capitalist sphere.

Originality/value

This paper provides further evidence from the CIS to support the emergent post‐development critique of capitalist hegemony and opens up the future of work in this region to alternative possibilities beyond commodification.

Details

Foresight, vol. 12 no. 1
Type: Research Article
ISSN: 1463-6689

Article
Publication date: 3 April 2017

Xin Huang and Wenzhong Zhu

Abstract

Purpose

After over 30 years of reform and opening-up, China, as the second largest economy, is now facing its most essential transformation of management philosophy and its biggest challenge, sustainable business development, amid people’s increasing worry about worsening environmental pollution, food security and human health. It can be said that what China needs urgently today is business ethical value and a long-term sustainable development concept, rather than rapidly growing GDP. The purpose of this paper is to assess how the term “sustainable development” is constructed and valued in the sustainability reports or corporate social responsibility (CSR) reports of Chinese corporations, so as to interpret these Chinese firms’ conception of sustainable development in their real business practices.

Design/methodology/approach

A corpus of sustainability reports from 30 Chinese corporations, totaling 247,311 tokens, is first compiled. The authors then use AntConc, a corpus analysis toolkit, to generate word lists, key-word-in-context concordances and collocation lists, and to calculate statistical significance measures for collocates, of which the mutual information (MI) score 3 is most relevant to the paper’s purposes. Based on the key-word-in-context concordances and collocation lists, the authors identify the contexts in which “sustainable development” usually appears in sustainability reports, and from these infer Chinese corporations’ conception of sustainable development.
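To make the MI measure mentioned above more concrete, the following sketch computes a common pointwise formulation: the base-2 logarithm of observed co-occurrences divided by the co-occurrences expected if the two words were independent. This is an assumption about the general technique, not a reproduction of AntConc's exact calculation, which may differ (for example, in how the collocation window is handled). Only the corpus size comes from the abstract; the counts and the function name `mutual_information` are hypothetical, and treating 3 as a cutoff is one reading of the abstract's wording.

```python
import math


def mutual_information(pair_count: int, node_count: int,
                       collocate_count: int, corpus_size: int) -> float:
    """Pointwise MI: log2(observed co-occurrences / expected co-occurrences)."""
    expected = node_count * collocate_count / corpus_size
    return math.log2(pair_count / expected)


CORPUS_SIZE = 247_311  # token count stated in the abstract

# Hypothetical frequencies, invented purely for illustration.
mi = mutual_information(pair_count=40, node_count=500,
                        collocate_count=800, corpus_size=CORPUS_SIZE)
MI_CUTOFF = 3.0  # assumption: the "score 3" in the abstract read as a threshold
print(f"MI = {mi:.2f}, above cutoff: {mi >= MI_CUTOFF}")
```

A score of 0 means the pair co-occurs exactly as often as chance predicts, and each additional point doubles the ratio of observed to expected co-occurrences.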

Findings

The results indicate that Chinese corporations use the rhetoric of weak sustainability, in which sustainable development is compatible with further economic growth. Chinese corporations, in a China that strongly promotes the concept of the new normal economy, thus still treat economic growth as the dominant goal, on which other dimensions of sustainability such as environmental protection depend.

Research limitations/implications

The data in the current corpus are limited to sustainability reports from 2014; thus, the study provides no hints as to diachronic trends. However, this study increases our understanding of how Chinese corporations attach value to sustainable development from the view of corpus analysis.

Originality/value

Unlike traditional discourse analysis, which usually relies on qualitative analysis of how a word or phrase is constructed in a small number of texts, the authors’ study introduces the method of corpus analysis to explore how Chinese corporations construct “sustainable development” in their sustainability reports. Thus, the number of texts analyzed is larger in the authors’ study and their findings are more representative and convincing. The authors create a more qualitative understanding of what the reports are actually saying and show that corpus methods can bring a new application to the discourse analysis of the biggest challenge to China’s future economic growth, suggesting a potentially novel way to work out the meaning and implication of sustainable development in the Chinese business world.

Details

Chinese Management Studies, vol. 11 no. 1
Type: Research Article
ISSN: 1750-614X

Book part
Publication date: 26 July 2014

Lars Engwall, Enno Aljets, Tina Hedmo and Raphaël Ramuz

Abstract

Computer corpus linguistics (CCL) is a scientific innovation that has facilitated the creation and analysis of large corpora in a systematic way by means of computer technology since the 1950s. This article provides an account of the CCL pioneers in general but particularly of those in Germany, the Netherlands, Sweden, and Switzerland. It is found that Germany and Sweden, due to more advantageous financing and weaker communities of generativists, had a faster adoption of CCL than the other two countries. A particular late adopter among the four was Switzerland, which did not take up CCL until foreign professors had been recruited.

Details

Organizational Transformation and Scientific Change: The Impact of Institutional Restructuring on Universities and Intellectual Innovation
Type: Book
ISBN: 978-1-78350-684-2
