Search results

1 – 10 of over 13000
To view the access options for this content please click here
Article
Publication date: 26 July 2021

Pengcheng Li, Qikai Liu, Qikai Cheng and Wei Lu

This paper aims to identify data set entities in scientific literature. To address poor recognition caused by a lack of training corpora in existing studies, a distant…

Abstract

Purpose

This paper aims to identify data set entities in scientific literature. To address poor recognition caused by a lack of training corpora in existing studies, a distant supervised learning-based approach is proposed to identify data set entities automatically from large-scale scientific literature in an open domain.

Design/methodology/approach

Firstly, the authors use a dictionary combined with a bootstrapping strategy to create a labelled corpus to apply supervised learning. Secondly, a bidirectional encoder representation from transformers (BERT)-based neural model was applied to identify data set entities in the scientific literature automatically. Finally, two data augmentation techniques, entity replacement and entity masking, were introduced to enhance the model generalisability and improve the recognition of data set entities.

Findings

In the absence of training data, the proposed method can effectively identify data set entities in large-scale scientific papers. The BERT-based vectorised representation and data augmentation techniques enable significant improvements in the generality and robustness of named entity recognition models, especially in long-tailed data set entity recognition.

Originality/value

This paper provides a practical research method for automatically recognising data set entities in scientific literature. To the best of the authors’ knowledge, this is the first attempt to apply distant learning to the study of data set entity recognition. The authors introduce a robust vectorised representation and two data augmentation strategies (entity replacement and entity masking) to address the problem inherent in distant supervised learning methods, which the existing research has mostly ignored. The experimental results demonstrate that our approach effectively improves the recognition of data set entities, especially long-tailed data set entities.

To view the access options for this content please click here
Article
Publication date: 1 May 2020

Qihang Wu, Daifeng Li, Lu Huang and Biyun Ye

Entity relation extraction is an important research direction to obtain structured information. However, most of the current methods are to determine the relations between…

Abstract

Purpose

Entity relation extraction is an important research direction to obtain structured information. However, most of the current methods are to determine the relations between entities in a given sentence based on a stepwise method, seldom considering entities and relations into a unified framework. The joint learning method is an optimal solution that combines relations and entities. This paper aims to optimize hierarchical reinforcement learning framework and provide an efficient model to extract entity relation.

Design/methodology/approach

This paper is based on the hierarchical reinforcement learning framework of joint learning and combines the model with BERT, the best language representation model, to optimize the word embedding and encoding process. Besides, this paper adjusts some punctuation marks to make the data set more standardized, and introduces positional information to improve the performance of the model.

Findings

Experiments show that the model proposed in this paper outperforms the baseline model with a 13% improvement, and achieve 0.742 in F1 score in NYT10 data set. This model can effectively extract entities and relations in large-scale unstructured text and can be applied to the fields of multi-domain information retrieval, intelligent understanding and intelligent interaction.

Originality/value

The research provides an efficient solution for researchers in a different domain to make use of artificial intelligence (AI) technologies to process their unstructured text more accurately.

Details

Information Discovery and Delivery, vol. 48 no. 3
Type: Research Article
ISSN: 2398-6247

Keywords

To view the access options for this content please click here
Article
Publication date: 7 June 2021

Marco Humbel, Julianne Nyhan, Andreas Vlachidis, Kim Sloan and Alexandra Ortolja-Baird

By mapping-out the capabilities, challenges and limitations of named-entity recognition (NER), this article aims to synthesise the state of the art of NER in the context…

Abstract

Purpose

By mapping-out the capabilities, challenges and limitations of named-entity recognition (NER), this article aims to synthesise the state of the art of NER in the context of the early modern research field and to inform discussions about the kind of resources, methods and directions that may be pursued to enrich the application of the technique going forward.

Design/methodology/approach

Through an extensive literature review, this article maps out the current capabilities, challenges and limitations of NER and establishes the state of the art of the technique in the context of the early modern, digitally augmented research field. It also presents a new case study of NER research undertaken by Enlightenment Architectures: Sir Hans Sloane's Catalogues of his Collections (2016–2021), a Leverhulme funded research project and collaboration between the British Museum and University College London, with contributing expertise from the British Library and the Natural History Museum.

Findings

Currently, it is not possible to benchmark the capabilities of NER as applied to documents of the early modern period. The authors also draw attention to the situated nature of authority files, and current conceptualisations of NER, leading them to the conclusion that more robust reporting and critical analysis of NER approaches and findings is required.

Research limitations/implications

This article examines NER as applied to early modern textual sources, which are mostly studied by Humanists. As addressed in this article, detailed reporting of NER processes and outcomes is not necessarily valued by the disciplines of the Humanities, with the result that it can be difficult to locate relevant data and metrics in project outputs. The authors have tried to mitigate this by contacting projects discussed in this paper directly, to further verify the details they report here.

Practical implications

The authors suggest that a forum is needed where tools are evaluated according to community standards. Within the wider NER community, the MUC and ConLL corpora are used for such experimental set-ups and are accompanied by a conference series, and may be seen as a useful model for this. The ultimate nature of such a forum must be discussed with the whole research community of the early modern domain.

Social implications

NER is an algorithmic intervention that transforms data according to certain rules-, patterns- or training data and ultimately affects how the authors interpret the results. The creation, use and promotion of algorithmic technologies like NER is not a neutral process, and neither is their output A more critical understanding of the role and impact of NER on early modern documents and research and focalization of some of the data- and human-centric aspects of NER routines that are currently overlooked are called for in this paper.

Originality/value

This article presents a state of the art snapshot of NER, its applications and potential, in the context of early modern research. It also seeks to inform discussions about the kinds of resources, methods and directions that may be pursued to enrich the application of NER going forward. It draws attention to the situated nature of authority files, and current conceptualisations of NER, and concludes that more robust reporting of NER approaches and findings are urgently required. The Appendix sets out a comprehensive summary of digital tools and resources surveyed in this article.

Details

Journal of Documentation, vol. 77 no. 6
Type: Research Article
ISSN: 0022-0418

Keywords

To view the access options for this content please click here
Article
Publication date: 1 June 2015

Quang-Minh Nguyen and Tuan-Dung Cao

The purpose of this paper is to propose an automatic method to generate semantic annotations of football transfer in the news. The current automatic news integration…

Abstract

Purpose

The purpose of this paper is to propose an automatic method to generate semantic annotations of football transfer in the news. The current automatic news integration systems on the Web are constantly faced with the challenge of diversity, heterogeneity of sources. The approaches for information representation and storage based on syntax have some certain limitations in news searching, sorting, organizing and linking it appropriately. The models of semantic representation are promising to be the key to solving these problems.

Design/methodology/approach

The approach of the author leverages Semantic Web technologies to improve the performance of detection of hidden annotations in the news. The paper proposes an automatic method to generate semantic annotations based on named entity recognition and rule-based information extraction. The authors have built a domain ontology and knowledge base integrated with the knowledge and information management (KIM) platform to implement the former task (named entity recognition). The semantic extraction rules are constructed based on defined language models and the developed ontology.

Findings

The proposed method is implemented as a part of the sport news semantic annotations-generating prototype BKAnnotation. This component is a part of the sport integration system based on Web Semantics BKSport. The semantic annotations generated are used for improving features of news searching – sorting – association. The experiments on the news data from SkySport (2014) channel showed positive results. The precisions achieved in both cases, with and without integration of the pronoun recognition method, are both over 80 per cent. In particular, the latter helps increase the recall value to around 10 per cent.

Originality/value

This is one of the initial proposals in automatic creation of semantic data about news, football news in particular and sport news in general. The combination of ontology, knowledge base and patterns of language model allows detection of not only entities with corresponding types but also semantic triples. At the same time, the authors propose a pronoun recognition method using extraction rules to improve the relation recognition process.

Details

International Journal of Pervasive Computing and Communications, vol. 11 no. 2
Type: Research Article
ISSN: 1742-7371

Keywords

To view the access options for this content please click here
Article
Publication date: 10 June 2014

Ping Bao and Suoling Zhu

The purpose of this paper is to present a system for recognition of location names in ancient books written in languages, such as Chinese, in which proper names are not…

Abstract

Purpose

The purpose of this paper is to present a system for recognition of location names in ancient books written in languages, such as Chinese, in which proper names are not signaled by an initial capital letter.

Design/methodology/approach

Rule-based and statistical methods were combined to develop a set of rules for identification of product-related location names in the local chronicles of Guangdong. A name recognition system, with functions of document management, information extraction and storage, rule management, location name recognition, and inquiry and statistics, was developed using Microsoft's .NET framework, SQL Server 2005, ADO.NET and XML. The system was evaluated with precision ratio, recall ratio and the comprehensive index, F.

Findings

The system was quite successful at recognizing product-related location names (F was 71.8 percent), demonstrating the potential for application of automatic named entity recognition techniques in digital collation of ancient books such as local chronicles.

Research limitations/implications

Results suffered from limitations in initial digitization of the text. Statistical methods, such as the hidden Markov model, should be combined with an extended set of recognition rules to improve recognition scores and system efficiency.

Practical implications

Electronic access to local chronicles by location name saves time for chorographers and provides researchers with new opportunities.

Social implications

Named entity recognition brings previously isolated ancient documents together in a knowledge base of scholarly and cultural value.

Originality/value

Automatic name recognition can be implemented in information extraction from ancient books in languages other than English. The system described here can also be adapted to modern texts and other named entities.

To view the access options for this content please click here
Article
Publication date: 1 April 2003

Georgios I. Zekos

Aim of the present monograph is the economic analysis of the role of MNEs regarding globalisation and digital economy and in parallel there is a reference and examination…

Downloads
55487

Abstract

Aim of the present monograph is the economic analysis of the role of MNEs regarding globalisation and digital economy and in parallel there is a reference and examination of some legal aspects concerning MNEs, cyberspace and e‐commerce as the means of expression of the digital economy. The whole effort of the author is focused on the examination of various aspects of MNEs and their impact upon globalisation and vice versa and how and if we are moving towards a global digital economy.

Details

Managerial Law, vol. 45 no. 1/2
Type: Research Article
ISSN: 0309-0558

Keywords

To view the access options for this content please click here
Article
Publication date: 14 May 2019

Ahsan Mahmood, Hikmat Ullah Khan, Zahoor Ur Rehman, Khalid Iqbal and Ch. Muhmmad Shahzad Faisal

The purpose of this research study is to extract and identify named entities from Hadith literature. Named entity recognition (NER) refers to the identification of the…

Abstract

Purpose

The purpose of this research study is to extract and identify named entities from Hadith literature. Named entity recognition (NER) refers to the identification of the named entities in a computer readable text having an annotation of categorization tags for information extraction. NER is an active research area in information management and information retrieval systems. NER serves as a baseline for machines to understand the context of a given content and helps in knowledge extraction. Although NER is considered as a solved task in major languages such as English, in languages such as Urdu, NER is still a challenging task. Moreover, NER depends on the language and domain of study; thus, it is gaining the attention of researchers in different domains.

Design/methodology/approach

This paper proposes a knowledge extraction framework using finite-state transducers (FSTs) – KEFST – to extract the named entities. KEFST consists of five steps: content extraction, tokenization, part of speech tagging, multi-word detection and NER. An extensive empirical analysis using the data corpus of Urdu translation of Sahih Al-Bukhari, a widely known hadith book, reveals that the proposed method effectively recognizes the entities to obtain better results.

Findings

The significant performance in terms of f-measure, precision and recall validates that the proposed model outperforms the existing methods for NER in the relevant literature.

Originality/value

This research is novel in this regard that no previous work is proposed in the Urdu language to extract named entities using FSTs and no previous work is proposed for Urdu hadith data NER.

Details

The Electronic Library , vol. 37 no. 2
Type: Research Article
ISSN: 0264-0473

Keywords

To view the access options for this content please click here
Article
Publication date: 8 January 2018

Mahmood Reza Khabbazi, Jan Wikander, Mauro Onori and Antonio Maffei

This paper introduces a schema for the product assembly feature data in an object-oriented and module-based format using Unified Modeling Language (UML). To link…

Abstract

Purpose

This paper introduces a schema for the product assembly feature data in an object-oriented and module-based format using Unified Modeling Language (UML). To link production with product design, it is essential to determine at an early stage which entities of product design and development are involved and used at the automated assembly planning and operations. To this end, it is absolutely reasonable to assign meaningful attributes to the parts’ design entities (assembly features) in a systematic and structured way. As such, this approach empowers processes such as motion planning and sequence planning in assembly design.

Design/methodology/approach

The assembly feature data requirements are studied and definitions are analyzed and redefined. Using object-oriented techniques, the assembly feature data structure and relationships are modeled based on the identified requirements as five UML packages (Part, three-dimensional (3D) models, Mating, Joint and Handling). All geometric and non-geometric design data entities endorsed with assembly design perspective are extracted or assigned from 3D models and realized through the featured entity interface class. The featured entities are then associated (used) with the mating, handling and joints features. The AssemblyFeature interface is realized through mating, handling and joint packages related to the assembly and part classes. Each package contains all relevant classes which further classify the important attributes of the main class.

Findings

This paper sets out to provide an explanatory approach using object-oriented techniques to model the schema of assembly features association and artifacts at the product design level, all of which are essential in several subsequent and parallel steps of the assembly planning process, as well as assembly feature entity assignments in design improvement cycle.

Practical implications

The practical implication based on the identified advantages can be classified in three main features: module-based design, comprehensive classification, integration. These features help the automation and solution development processes based on the proposed models much easier and systematic.

Originality/value

The proposed schema’s comprehensiveness and reliability are verified through comparisons with other works and the advantages are discussed in detail.

Details

Assembly Automation, vol. 38 no. 1
Type: Research Article
ISSN: 0144-5154

Keywords

To view the access options for this content please click here
Article
Publication date: 20 April 2015

Abubakar Roko, Shyamala Doraisamy, Azrul Hazri Jantan and Azreen Azman

The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that relegates the task of generating structured queries from a user to a search…

Abstract

Purpose

The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that relegates the task of generating structured queries from a user to a search engine while retaining the simple keyword search query interface. A more effective way for searching XML database is to use structured queries. However, using query languages to express queries prove to be difficult for most users since this requires learning a query language and knowledge of the underlying data schema. On the other hand, the success of Web search engines has made many users to be familiar with keyword search and, therefore, they prefer to use a keyword search query interface to search XML data.

Design/methodology/approach

Existing query structuring approaches require users to provide structural hints in their input keyword queries even though their interface is keyword base. Other problems with existing systems include their inability to put keyword query ambiguities into consideration during query structuring and how to select the best generated structure query that best represents a given keyword query. To address these problems, this study allows users to submit a schema independent keyword query, use named entity recognition (NER) to categorize query keywords to resolve query ambiguities and compute semantic information for a node from its data content. Algorithms were proposed that find user search intentions and convert the intentions into a set of ranked structured queries.

Findings

Experiments with Sigmod and IMDB datasets were conducted to evaluate the effectiveness of the method. The experimental result shows that the XKQSS is about 20 per cent more effective than XReal in terms of return nodes identification, a state-of-art systems for XML retrieval.

Originality/value

Existing systems do not take keyword query ambiguities into account. XKSS consists of two guidelines based on NER that help to resolve these ambiguities before converting the submitted query. It also include a ranking function computes a score for each generated query by using both semantic information and data statistic, as opposed to data statistic only approach used by the existing approaches.

Details

International Journal of Web Information Systems, vol. 11 no. 1
Type: Research Article
ISSN: 1744-0084

Keywords

To view the access options for this content please click here
Article
Publication date: 1 January 1977

A distinction must be drawn between a dismissal on the one hand, and on the other a repudiation of a contract of employment as a result of a breach of a fundamental term…

Downloads
1636

Abstract

A distinction must be drawn between a dismissal on the one hand, and on the other a repudiation of a contract of employment as a result of a breach of a fundamental term of that contract. When such a repudiation has been accepted by the innocent party then a termination of employment takes place. Such termination does not constitute dismissal (see London v. James Laidlaw & Sons Ltd (1974) IRLR 136 and Gannon v. J. C. Firth (1976) IRLR 415 EAT).

Details

Managerial Law, vol. 20 no. 1
Type: Research Article
ISSN: 0309-0558

1 – 10 of over 13000