Search results
1 – 10 of 161
Jelena Andonovski, Branislava Šandrih and Olivera Kitanović
Abstract
Purpose
This paper aims to describe the structure of an aligned Serbian-German literary corpus (SrpNemKor) contained in the digital library Bibliša. The goal of the research was to create a benchmark Serbian-German annotated corpus searchable with various query expansions.
Design/methodology/approach
The presented research focuses on enhancing bilingual search queries in a full-text search of the aligned SrpNemKor collection. The enhancement is based on existing lexical resources such as Serbian morphological electronic dictionaries and the bilingual lexical database Termi.
Findings
For the purpose of this research, the lexical database Termi is enriched with a bilingual list of German-Serbian translated pairs of lexical units. The list of correct translation pairs was extracted from SrpNemKor, evaluated and integrated into Termi. Also, Serbian morphological e-dictionaries are updated with new entries extracted from the Serbian part of the corpus.
Originality/value
A bilingual search of SrpNemKor in Bibliša is available within a user-friendly platform. The enriched database Termi enables semantic enhancement and refinement of a user’s search query based on synonyms in both Serbian and German. Serbian morphological e-dictionaries facilitate the morphological expansion of search queries in Serbian. Together, these resources enable the analysis of concepts and concept structures by identifying the terms assigned to a concept and by establishing relations between terms in Serbian and German, which makes Bibliša a valuable Web tool that can support research and analysis of SrpNemKor.
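The query expansion described in this abstract can be illustrated with a minimal sketch. It is not the Bibliša implementation: the three lookup tables and the function `expand_query` are hypothetical stand-ins for the Serbian morphological e-dictionaries and the Termi database, and the sample entries are illustrative only.

```python
# Hypothetical toy resources. The real Serbian e-dictionaries and the
# Termi database are far richer; these entries are for illustration only.
INFLECTIONS = {"kuća": ["kuća", "kuće", "kući", "kuću", "kućom"]}
SYNONYMS = {"kuća": ["dom"]}
TRANSLATIONS = {"kuća": ["Haus"], "dom": ["Heim"]}


def expand_query(term):
    """Expand a Serbian lemma into inflected forms, synonyms and German equivalents."""
    lemmas = [term] + SYNONYMS.get(term, [])  # semantic expansion
    serbian, german = [], []
    for lemma in lemmas:
        # Morphological expansion: fall back to the lemma itself if no forms are known.
        for form in INFLECTIONS.get(lemma, [lemma]):
            if form not in serbian:
                serbian.append(form)
        # Bilingual expansion through the translation pairs.
        for translation in TRANSLATIONS.get(lemma, []):
            if translation not in german:
                german.append(translation)
    return {"sr": serbian, "de": german}


expanded = expand_query("kuća")
print(expanded["sr"])
print(expanded["de"])
```

A search front end could then match any of the returned Serbian forms in the Serbian text and any of the German equivalents in the aligned German text.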
Abstract
This chapter examines factors impacting vocabulary development in preschool dual language learners, providing a cultural and linguistic perspective on vocabulary instruction in this population. Through a multidisciplinary review of the research literature, instructional strategies that can support vocabulary development in this population are identified. The chapter concludes with a detailed illustration of how these strategies can be incorporated into a culturally and linguistically responsive vocabulary approach for Latino preschoolers.
Abstract
Purpose
The purpose of this paper is to address the knowledge acquisition bottleneck problem in natural language processing by introducing a new rule‐based approach for the automatic acquisition of linguistic knowledge.
Design/methodology/approach
The author has developed a new machine translation methodology that only requires a bilingual lexicon and a parallel corpus of surface sentences aligned at the sentence level to learn new transfer rules.
Findings
A first prototype of a web‐based Japanese‐English translation system called Japanese‐English translation using corpus‐based acquisition of transfer (JETCAT) has been implemented in SWI‐Prolog, together with a Greasemonkey user script that analyzes Japanese web pages and translates sentences via Ajax. In addition, linguistic information is displayed at the character, word, and sentence level to provide a useful tool for web‐based language learning. An important feature is customization: the user can simply correct translation results, which leads to an incremental update of the knowledge base.
Research limitations/implications
This paper focuses on the technical aspects and user interface issues of JETCAT. The author is planning to use JETCAT in a classroom setting to gather first experiences and will then evaluate a real‐world deployment. Work has also started on extending JETCAT to include collaborative features.
Practical implications
The research has a high practical impact on academic language education. It also could have implications for the translation industry by superseding certain translation tasks and, on the other hand, adding value and quality to others.
Originality/value
The paper presents an extended version of the paper receiving the Emerald Web Information Systems Best Paper Award at iiWAS2010.
Sophia Ananiadou and John McNaught
Abstract
This paper assesses the degree to which established practices in terminology can provide the translation industry with the lexical means to support mediation of information between languages, especially where such mediation involves modification. The effects of term variation, collocation and sublanguage phraseology present problems of term choice to the translator. Current term resources cannot help much with these problems; however, tools and techniques are discussed which, in the near future, will offer translators the means to make appropriate choices of terminology.
Chuanming Yu, Haodong Xue, Manyi Wang and Lu An
Abstract
Purpose
Owing to the uneven distribution of annotated corpora among different languages, it is necessary to bridge the gap between low resource languages and high resource languages. From the perspective of entity relation extraction, this paper aims to extend the knowledge acquisition task from a single language context to a cross-lingual context, and to improve the relation extraction performance for low resource languages.
Design/methodology/approach
This paper proposes a cross-lingual adversarial relation extraction (CLARE) framework, which decomposes cross-lingual relation extraction into parallel corpus acquisition and adversarial adaptation relation extraction. Based on the proposed framework, this paper conducts extensive experiments in two tasks, i.e. the English-to-Chinese and the English-to-Arabic cross-lingual entity relation extraction.
Findings
The Macro-F1 values of the optimal models in the two tasks are 0.8801 and 0.7899, respectively, indicating that the proposed CLARE framework can significantly improve entity relation extraction for low resource languages. The experimental results suggest that the proposed framework can effectively transfer the corpus as well as the annotated tags from English to Chinese and Arabic. This study reveals that the proposed approach is less labour intensive and more effective in cross-lingual entity relation extraction than the manual method. It also shows that the approach generalizes well across different languages.
Originality/value
The research results are of great significance for improving the performance of cross-lingual knowledge acquisition. The cross-lingual transfer may greatly reduce the time and cost of manually constructing a multilingual corpus. It sheds light on knowledge acquisition and organization from unstructured text in the era of big data.
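For readers unfamiliar with the Macro-F1 metric reported in the findings, the sketch below shows how it is conventionally computed: F1 is calculated per class and then averaged without weighting. The function name `macro_f1` and the relation labels are hypothetical, and the example data is not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[g] += 1  # gold class gets a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative relation labels only.
gold = ["born_in", "born_in", "works_for", "works_for"]
pred = ["born_in", "works_for", "works_for", "works_for"]
print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally to the average, Macro-F1 penalizes poor performance on rare relation types more than accuracy or micro-averaged F1 would.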
Behnam Forouhandeh, Rodney J. Clarke and Nina Louise Reynolds
Abstract
Purpose
The purpose of this paper is to demonstrate the utility of systemic functional linguistics (SFL) as an underlying model to examine the similarities/differences between spoken and written peer-to-peer (P2P) communication.
Design/methodology/approach
An embedded mixed methods experimental design with linguistically standardized experimental stimuli was used to expose the basic linguistic differences between P2P communications that can be attributed to communication medium (spoken/written) and product type (hedonic/utilitarian).
Findings
The findings show, empirically, that consumers’ spoken language is not linguistically equivalent to their written language. This confirms that the capability of language to convey semantic meaning in spoken communication differs from that in written communication. This study extends the characteristics that differentiate hedonic from utilitarian products to include lexical density (i.e. hedonic) vs lexical sparsity (i.e. utilitarian).
Research limitations/implications
The findings of this study are not wholly relevant to other forms of consumer communication (e.g. viral marketing). This research used a few SFL resources.
Practical implications
This research shows that marketers should ideally apply a semantic approach to the analysis of communications, given that communication meaning can vary across channels. Marketers may also want to focus on specific feedback channels (e.g. review site vs telephone) depending on the depth of product detail that needs to be captured. This study also offers metrics that advertisers could use to classify media and to characterize consumer segments.
Originality/value
This research shows the relevance of SFL for understanding P2P communications and has potential applications to other marketing communications.
Carmen Galvez and Félix de Moya‐Anegón
Abstract
Purpose
To evaluate the accuracy of conflation methods based on finite‐state transducers (FSTs).
Design/methodology/approach
Incorrectly lemmatized and stemmed forms may lead to the retrieval of inappropriate documents. Experimental studies to date have focused on retrieval performance, but very few on conflation performance. The process of normalization we used involved a linguistic toolbox that allowed us to construct, through graphic interfaces, electronic dictionaries represented internally by FSTs. The lexical resources developed were applied to a Spanish test corpus for merging term variants in canonical lemmatized forms. Conflation performance was evaluated in terms of an adaptation of recall and precision measures, based on accuracy and coverage, not actual retrieval. The results were compared with those obtained using a Spanish version of the Porter algorithm.
Findings
The conclusion is that the main strength of lemmatization is its accuracy, whereas its main limitation is the underanalysis of variant forms.
Originality/value
The report outlines the potential of transducers in their application to normalization processes.
Marilyn Domas White, Miriam Matteson and Eileen G. Abels
Abstract
Purpose
This paper characterizes translation as a task and aims to identify how it influences professional translators' information needs and use of resources to meet those needs.
Design/methodology/approach
This research is exploratory and qualitative. Data are based on focus group sessions with 19 professional translators. Where appropriate, findings are related to several theories relating task characteristics and information behavior (IB).
Findings
The findings support some of Byström's findings about the relationship between task and information use but also suggest new hypotheses or relationships among task, information need, and information use, including the notion of a zone of familiarity. Translators use a wide range of resources, both formal and informal, including localized sources such as personal contacts with other translators, native speakers, and domain experts, to supplement their basic resources, which are different types of dictionaries. The study addresses translator problems created by the need to translate materials in less commonly taught languages.
Research limitations/implications
Focus group sessions allow only for identifying concepts, relationships, and hypotheses, not for indicating the relative importance of variables or distribution across individuals. The study does not cover literary translation.
Practical implications
The paper suggests content and features of workstations offering access to a wide range of resources for professional translators.
Originality/value
Unlike other information behavior studies of professional translators, this article focuses on a broad range of resources, not just on dictionary use. It also identifies information problems associated not only with normal task activities, but also with translators' moving out of their zone of familiarity, i.e. their range of domain, language, and style expertise. The model of translator IB is potentially generalizable to other groups and both supports and expands other task‐related research.
Abstract
In consideration of the needs of the growing numbers of Spanish-speaking emergent bilingual students in U.S. classrooms who are learning English as a new language, this study explores the teachers’ understanding of instructional practice using a specific pedagogical framework designed for emergent bilingual classroom contexts called Preview/View/Review (P/V/R). A constructivist and a translanguaging lens informed the theoretical framework for this study. One set of qualitative data from interviews was collected from a random sample of teachers who participated in a Master’s program in bilingual education in a border university in South Texas. Interview questions focused on the teachers’ reflections on the planning for and the implementation of the pedagogical structure P/V/R in their dual language contexts. Three findings arose from the data: (a) participants demonstrated an understanding of planning for and implementation of the P/V/R structure as a scaffold to build background knowledge of new concepts in the different disciplines; (b) the P/V/R structure has the potential to facilitate cross-linguistic transfer and the potential to be implemented as a form of translanguaging pedagogy; and (c) the implementation of a well-planned P/V/R structure enhances students’ engagement with the learning in two languages. One identifiable limitation of the study is the small size of the sample. In addition, classroom observations of the implementation of the structure are needed to mitigate the possible over-reporting of P/V/R as a good practice on the part of the teachers. Insights from this study inform teacher educators in teacher preparation programs who are preparing teachers for working with emergent bilingual learners and the professional development of all teachers, including those who teach in bilingual school contexts.