Search results

1 – 10 of 49
Article
Publication date: 6 April 2012

Chengzhi Zhang and Dan Wu

Abstract

Purpose

Terminology is the set of technical words or expressions used in specific contexts. It denotes the core concepts of a formal discipline and is commonly applied in machine translation, information retrieval, information extraction and text categorization. Bilingual terminology extraction plays an important role in bilingual dictionary compilation, bilingual ontology construction, machine translation and cross‐language information retrieval. This paper addresses monolingual terminology extraction and bilingual term alignment based on multi‐level termhood.

Design/methodology/approach

A method based on multi‐level termhood is proposed. The method computes the termhood of each terminology candidate, and of the sentence that contains it, by corpus comparison. Because terms and general words usually have different distributions in a corpus, termhood can also be used as a constraint to improve term alignment when aligning bilingual terms in a parallel corpus. The paper presents bilingual term alignment based on termhood constraints.

Findings

Experimental results show that multi‐level termhood outperforms the existing method for terminology extraction. Using termhood as a constraining factor also improves the performance of bilingual term alignment.

Originality/value

Terminology extraction uses the termhood of both the candidate term and the sentence that contains it, which is called multi‐level termhood. Multi‐level termhood is computed by corpus comparison. A bilingual term alignment method based on termhood constraints is put forward, and termhood is applied to the task of bilingual terminology extraction. Experimental results show that termhood constraints can improve the performance of terminology alignment to some extent.
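The corpus-comparison idea in this abstract can be illustrated with a minimal sketch. This is not the authors' multi‐level termhood formula, which the abstract does not give. It assumes a simple smoothed frequency ratio between a domain corpus and a general corpus, and it assumes the sentence-level score is the mean of the word-level scores. The function names `termhood_scores` and `sentence_termhood` are hypothetical.

```python
from collections import Counter


def termhood_scores(domain_tokens, general_tokens):
    """Score each domain word by how much more frequent it is in the
    domain corpus than in the general corpus (assumed ratio measure)."""
    domain = Counter(domain_tokens)
    general = Counter(general_tokens)
    vocab_size = len(set(domain) | set(general))
    d_total = len(domain_tokens)
    g_total = len(general_tokens)
    scores = {}
    for word in domain:
        # Add-one smoothing keeps words unseen in the general corpus finite.
        p_domain = (domain[word] + 1) / (d_total + vocab_size)
        p_general = (general[word] + 1) / (g_total + vocab_size)
        scores[word] = p_domain / p_general
    return scores


def sentence_termhood(sentence_tokens, scores):
    """Assumed sentence-level score: mean word score over known words."""
    known = [scores[t] for t in sentence_tokens if t in scores]
    return sum(known) / len(known) if known else 0.0


domain_corpus = "the parser builds the alignment matrix".split()
general_corpus = "the cat sat on the mat".split()
scores = termhood_scores(domain_corpus, general_corpus)
# A domain-specific word should outrank a common function word.
print(scores["alignment"] > scores["the"])
```

A score above 1 means the word is relatively more frequent in the domain corpus. In an alignment setting, such scores could serve as a filter on candidate term pairs, which is one possible reading of the "termhood constraint" described above.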

Article
Publication date: 2 September 2019

Jelena Andonovski, Branislava Šandrih and Olivera Kitanović

Abstract

Purpose

This paper describes the structure of an aligned Serbian-German literary corpus (SrpNemKor) held in the digital library Bibliša. The goal of the research was to create a benchmark Serbian-German annotated corpus that can be searched with various query expansions.

Design/methodology/approach

The research focuses on enhancing bilingual search queries in a full-text search of the aligned SrpNemKor collection. The enhancement relies on existing lexical resources such as Serbian morphological electronic dictionaries and the bilingual lexical database Termi.

Findings

For the purpose of this research, the lexical database Termi is enriched with a bilingual list of German-Serbian translated pairs of lexical units. The list of correct translation pairs was extracted from SrpNemKor, evaluated and integrated into Termi. Also, Serbian morphological e-dictionaries are updated with new entries extracted from the Serbian part of the corpus.

Originality/value

A bilingual search of SrpNemKor in Bibliša is available within a user-friendly platform. The enriched database Termi enables semantic enhancement and refinement of a user's search query based on synonyms in both Serbian and German. Serbian morphological e-dictionaries facilitate the morphological expansion of search queries in Serbian. This expansion enables the analysis of concepts and concept structures by identifying the terms assigned to a concept and by establishing relations between terms in Serbian and German, which makes Bibliša a valuable Web tool that can support research on and analysis of SrpNemKor.

Details

The Electronic Library, vol. 37 no. 4
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 24 May 2018

Vesna Pajić, Staša Vujičić Stanković, Ranka Stanković and Miloš Pajić

Abstract

Purpose

A hybrid approach is presented, which combines linguistic and statistical information to semi-automatically extract multiword term candidates from texts.

Design/methodology/approach

The method is designed to be domain and language independent, focusing on languages with rich morphology. Here, it is used for extracting multiword terms from texts in Serbian, belonging to the agricultural engineering domain, as a use case. Predefined syntactic structures were used for multiword terms. For each structure, a finite state transducer was developed, which recognizes text sequences having that structure and outputs the sequence in a normalized form, so that different inflectional forms of the same multiword term can be counted properly. Term candidates were further filtered by their frequencies and evaluated by two domain experts.

Findings

By using language resources, such as electronic dictionaries and grammars, 928 multiword terms were extracted out of 1,523 multiword terms that were recognized as candidates from a corpus having 42,260 different simple word forms; 870 of these were new, not already contained in the existing electronic dictionary of compounds for Serbian, and they were used to enrich the dictionary.

Originality/value

The paper presents a methodology that can contribute significantly to the development of terminology lexicons in different areas. In this particular use case, some important agricultural engineering concepts were extracted from the text, but the approach could be used for other domains and languages as well.
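The pipeline outlined in this abstract (match a predefined syntactic structure, normalize inflected forms, count, filter by frequency) can be illustrated with a minimal sketch. The paper uses finite state transducers and Serbian electronic dictionaries. The sketch below swaps in a plain lookup table and a sliding window, which is a simplification and not the authors' implementation. The toy English lexicon, the noun-noun pattern, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon: word form -> (lemma, part of speech).
LEXICON = {
    "irrigation": ("irrigation", "N"),
    "system": ("system", "N"),
    "systems": ("system", "N"),
    "soil": ("soil", "N"),
    "sample": ("sample", "N"),
    "samples": ("sample", "N"),
}

# Assumed syntactic structure for a multiword term: noun followed by noun.
PATTERN = ("N", "N")


def extract_candidates(tokens, lexicon=LEXICON, pattern=PATTERN):
    """Count normalized (lemmatized) token sequences matching the pattern."""
    counts = Counter()
    n = len(pattern)
    for i in range(len(tokens) - n + 1):
        entries = [lexicon.get(t.lower()) for t in tokens[i:i + n]]
        if any(e is None for e in entries):
            continue  # a word outside the lexicon breaks the match
        if tuple(pos for _, pos in entries) == pattern:
            # Normalizing to lemmas lets inflected variants share one count.
            counts[" ".join(lemma for lemma, _ in entries)] += 1
    return counts


def filter_by_frequency(counts, min_freq=2):
    """Keep only candidates seen at least min_freq times."""
    return {term: c for term, c in counts.items() if c >= min_freq}


text = "the irrigation system and other irrigation systems need soil samples"
candidates = extract_candidates(text.split())
print(filter_by_frequency(candidates))
```

Here "irrigation system" and "irrigation systems" are counted as one candidate after normalization, so the candidate passes a frequency threshold that neither surface form would pass alone. In the paper, the surviving candidates were then evaluated by two domain experts.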

Details

The Electronic Library, vol. 36 no. 3
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 6 April 2012

Daqing He

Details

The Electronic Library, vol. 30 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 14 December 2021

Claudia Lanza, Antonietta Folino, Erika Pasceri and Anna Perri

Abstract

Purpose

The aim of this study is a semantic comparative analysis between the current pandemic and the Spanish flu. It is based on a bilingual terminological perspective oriented to evaluate and compare the terms used to describe and communicate the pandemic's issues both to biomedical experts and to a non-specialist public.

Design/methodology/approach

The analysis is a comparative terminological investigation of two corpora, the first containing scientific articles in English and the second containing issues of Italian national newspapers, covering two pandemics: the Spanish flu and the current Covid-19 disease. It aims to detect semantic similarities and differences between them through computational tasks and corpus linguistics methodologies.

Findings

Given the cross-fielding representativeness of terms, and their relevance within specific historical eras, our study is conducted both on a synchronic and on a diachronic level to discover the common lexical usages in the dissemination of the pandemic issues.

Originality/value

The study extracts the main representative terms for the two pandemics and examines how they were used to share news about pandemic trends with the population. It also integrates a topic modeling procedure to discover some of the main categories representing the lexicon of the pandemics, with reference to a list of classes drawn from external thesauri and ontologies on pandemics. The result is a detailed overview of the discrepancies and similarities found in two historical corpora dealing with a common subject, namely the terminology of pandemics.

Details

Journal of Documentation, vol. 78 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1995

Sophia Ananiadou and John McNaught

Abstract

This paper assesses the degree to which established practices in terminology can provide the translation industry with the lexical means to support mediation of information between languages, especially where such mediation involves modification. The effects of term variation, collocation and sublanguage phraseology present problems of term choice to the translator. Current term resources cannot help much with these problems; however, tools and techniques are discussed which, in the near future, will offer translators the means to make appropriate choices of terminology.

Details

Aslib Proceedings, vol. 47 no. 2
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 1 March 1994

Blaise Nkwenti‐Azeh

Abstract

This paper examines how the changes currently taking place in terminology processing and documentation are related to the multilingual needs of translation, and also how progress in natural language processing in general, and terminology processing in particular, can contribute to the development of reliable, up‐to‐date terminology support tools for translators. The paper also describes some recent experiences in the automatic identification of terminological units from corpora. The paper concludes by identifying some specific areas in terminology software development which can benefit from the expertise of translators and other language professionals.

Details

Aslib Proceedings, vol. 46 no. 3
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 1 March 1998

Robert Gaizauskas and Yorick Wilks

Abstract

In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre‐specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 70s till the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining.

Details

Journal of Documentation, vol. 54 no. 1
Type: Research Article
ISSN: 0022-0418

Book part
Publication date: 10 July 2019

Tianxing Wu, Guilin Qi and Cheng Li

Abstract

With the continuous development of intelligent technologies, the knowledge graph, the backbone of artificial intelligence, has attracted much attention from both academic and industrial communities because of its powerful capability for knowledge representation and reasoning. Knowledge graphs have also been widely applied in areas such as semantic search, question answering and knowledge management. In recent years, knowledge graph techniques in China have developed rapidly, and different Chinese knowledge graphs have been built to support various applications. Against the background of the “One Belt One Road (OBOR)” initiative, cooperating with the countries along OBOR on studying knowledge graph techniques and applications will greatly promote the development of artificial intelligence. China's accumulated experience in developing knowledge graphs is also a good reference. In this chapter, the authors therefore introduce the development of Chinese knowledge graphs and their applications. They first describe the background of OBOR, then introduce the concept of the knowledge graph and three typical Chinese knowledge graphs: Zhishi.me, CN-DBpedia and XLORE. Finally, they demonstrate several applications of Chinese knowledge graphs.

Details

The New Silk Road Leads through the Arab Peninsula: Mastering Global Business and Innovation
Type: Book
ISBN: 978-1-78756-680-4

Article
Publication date: 1 June 1979

Valerie Gilbert

Abstract

Aslib Library holds a collection of thesauri, subject headings and classification schemes, which is used to answer members' enquiries about the existence of schemes for particular subject fields. Many of the items are available on loan for two weeks. Our policy is to acquire all significant English language publications and bilingual or multilingual items with English as one of the languages.

Details

Aslib Proceedings, vol. 31 no. 6
Type: Research Article
ISSN: 0001-253X
