Search results

1 – 10 of 878
Article
Publication date: 23 November 2018

Chih-Ming Chen, Yung-Ting Chen and Chen-Yu Liu

Abstract

Purpose

An automatic text annotation system (ATAS) that can collect resources from different databases through Linked Data (LD) for automatically annotating ancient texts was developed in this study to support digital humanities research. It allows humanists to refer to resources from diverse databases when interpreting ancient texts and provides a friendly text annotation reader for interpreting ancient texts through reading. The paper discusses whether the ATAS helps to support digital humanities research.

Design/methodology/approach

Using a quasi-experimental design, the ATAS developed in this study was compared with the MARKUS semi-ATAS to determine whether the two differ significantly in reading effectiveness and technology acceptance when supporting humanists in interpreting ancient texts from the Ming dynasty’s collections. Additionally, lag sequential analysis was used to analyze users’ operation behaviors on the ATAS. A semi-structured in-depth interview was also conducted to understand users’ opinions and perceptions of using the ATAS to interpret ancient texts through reading.
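
Lag sequential analysis, mentioned above, tests whether one coded behavior follows another more often than chance would predict. The sketch below is a minimal illustration, not the authors' procedure: it computes lag-1 adjusted residuals (z-scores) from a coded sequence, and the behavior codes in the usage note are hypothetical. A z-score above 1.96 is conventionally read as a significant transition.

```python
from collections import Counter
from math import sqrt

def lag1_adjusted_residuals(sequence):
    """Return {(a, b): z} for lag-1 transitions a -> b in a coded behavior sequence."""
    pairs = list(zip(sequence, sequence[1:]))
    n = len(pairs)
    freq = Counter(pairs)
    row = Counter(a for a, _ in pairs)  # how often each code starts a transition
    col = Counter(b for _, b in pairs)  # how often each code ends a transition
    z = {}
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            variance = expected * (1 - row[a] / n) * (1 - col[b] / n)
            if variance > 0:  # skip cells where the residual is undefined
                z[(a, b)] = (freq[(a, b)] - expected) / sqrt(variance)
    return z

# Hypothetical operation log; the code names are illustrative only.
log = ["search", "annotate", "search", "annotate", "read", "search", "annotate"]
scores = lag1_adjusted_residuals(log)
```

In this toy log, `scores[("search", "annotate")]` exceeds 1.96, so "search followed by annotate" would be flagged as a significant sequence; a real analysis would need far longer logs.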

Findings

The experimental results reveal that the ATAS has higher reading effectiveness than the MARKUS semi-ATAS, although the difference is not statistically significant. The technology acceptance of the ATAS is significantly higher than that of the MARKUS semi-ATAS. In particular, the function comparison of the two systems shows that users perceive the ATAS as easier to use than the MARKUS semi-ATAS for term search, connection to source websites and adding annotations. Furthermore, the reading interface of the ATAS is simple and understandable and is more suitable for reading than that of the MARKUS semi-ATAS. Among all the LD sources considered, Moedict, an online Chinese dictionary, was confirmed as the most helpful.

Research limitations/implications

This study adopted the Jieba Chinese parser to perform word segmentation, based on a parser lexicon, for the ancient Chinese texts of the Ming dynasty’s collections. The accuracy of a lexicon-based Chinese parser is limited because it ignores the grammar and semantics of ancient texts. Moreover, the original parser lexicon used in Jieba contains only modern words, which further reduces segmentation accuracy for ancient Chinese texts. These two limitations will significantly affect the effectiveness of using the ATAS to support digital humanities research. This study thus proposed a practicable scheme that improves Jieba’s segmentation accuracy by adding new terms to the parser lexicon based on humanists’ own judgment.
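
The effect of adding terms to a segmentation lexicon can be seen in a small sketch. This is not Jieba's algorithm; it is a deliberately simple greedy forward maximum matching segmenter, and the lexicon entries and sample phrase are assumptions chosen for illustration. It shows how a name missing from the lexicon is broken into single characters until a reader adds it.

```python
def segment(text, lexicon):
    """Greedy forward maximum matching: at each position take the longest lexicon entry."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing in the lexicon matches.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens

lexicon = {"太祖"}                      # hypothetical starting lexicon
before = segment("明太祖朱元璋", lexicon)  # the personal name is split character by character
lexicon.add("朱元璋")                    # a term added on a humanist's judgment
after = segment("明太祖朱元璋", lexicon)   # the name is now kept as one token
```

Here `before` is `['明', '太祖', '朱', '元', '璋']` and `after` is `['明', '太祖', '朱元璋']`.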

Practical implications

Although some digital humanities platforms have been successfully developed to support digital humanities research for humanists, most of them still do not provide a friendly digital reading environment to support humanists in interpreting texts. For this reason, this study developed an ATAS that can automatically retrieve LD sources from different databases on the Internet to supply rich annotation information on the texts being read, helping humanists interpret them. This study breaks new ground for digital humanities research.

Originality/value

This study proposed a novel ATAS that can automatically annotate an ancient text with useful information from LD sources in different databases to increase its readability, thus helping humanists obtain a deeper and broader understanding of the text. Currently, no tool of this kind has been developed for humanists to support digital humanities research.

Article
Publication date: 3 June 2019

Chih-Ming Chen and Chung Chang

Abstract

Purpose

With the rapid development of digital humanities, some digital humanities platforms have been successfully developed to support digital humanities research for humanists. However, most of them still do not provide a friendly digital reading environment or a practicable social network analysis tool to support humanists in interpreting texts and exploring characters’ social network relationships. Moreover, the advancement of digitization technologies for the retrieval and use of Chinese ancient books presents an unprecedented challenge and opportunity. For these reasons, this paper presents a Chinese ancient books digital humanities research platform (CABDHRP) to support historical China studies. In addition to providing digital archives, digital reading, basic search and advanced search functions for Chinese ancient books, the platform provides two novel functions that can more effectively support digital humanities research: an automatic text annotation system (ATAS) for interpreting texts and a character social network relationship map tool (CSNRMT) for exploring characters’ social network relationships.

Design/methodology/approach

This study adopted DSpace, an open-source institutional repository system, as the digital archives system for archiving scanned images, metadata and full texts, in order to develop the CABDHRP for supporting digital humanities (DH) research. Moreover, the ATAS developed in the CABDHRP used the Node.js framework to implement the system’s front- and back-end services, and used application programming interfaces (APIs) provided by different databases, such as the China Biographical Database (CBDB) and TGAZ, to retrieve useful linked data (LD) sources for interpreting ancient texts. Also, Neo4j, an open-source graph database management system, was used to implement the CSNRMT of the CABDHRP. Finally, JavaScript and jQuery were applied to develop a monitoring program embedded in the CABDHRP to record humanists’ usage processes based on xAPI (Experience API). To understand the research participants’ perceptions when interpreting the historical texts and characters’ social network relationships with the support of the ATAS and CSNRMT, semi-structured interviews with 21 research participants were conducted.
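
For readers unfamiliar with xAPI, each recorded action is an "actor, verb, object" statement serialized as JSON. The platform's monitoring program was written in JavaScript and jQuery; the Python sketch below only illustrates the general shape of such a statement. The helper name, the e-mail address and the activity IRI are hypothetical, and which events the platform actually logs is not stated in the abstract.

```python
import json
from datetime import datetime, timezone

def make_statement(user_email, verb_iri, verb_label, activity_iri):
    """Build a minimal xAPI-style statement: who did what to which activity, and when."""
    return {
        "actor": {"objectType": "Agent", "mbox": f"mailto:{user_email}"},
        "verb": {"id": verb_iri, "display": {"en-US": verb_label}},
        "object": {"objectType": "Activity", "id": activity_iri},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical reading event; identifiers are placeholders, not the platform's.
statement = make_statement(
    "reader@example.org",
    "http://adlnet.gov/expapi/verbs/experienced",
    "experienced",
    "http://example.org/books/sample-text/page/1",
)
payload = json.dumps(statement, ensure_ascii=False)  # body that would be sent to a record store
```

Sequences of such statements, ordered by timestamp, are the kind of log that behavior analyses can later be run on.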

Findings

An ATAS embedded in the reading interface of the CABDHRP can collect resources from different databases through LD for automatically annotating ancient texts to support digital humanities research. It allows humanists to refer to resources from diverse databases when interpreting ancient texts and provides a friendly text annotation reader for interpreting ancient texts through reading. Additionally, the CSNRMT provided by the CABDHRP can semi-automatically identify characters’ names using Chinese word segmentation technology and humanists’ input, and helps humanists confirm and analyze characters’ social network relationships in Chinese ancient books by visualizing those networks as a knowledge graph. The CABDHRP can not only stimulate humanists to explore new viewpoints in humanistic research but also foster the public’s interest in and awareness of Chinese ancient books.

Originality/value

This study proposed a novel CABDHRP whose advanced features, including automatic word segmentation of Chinese text, automatic Chinese text annotation, semi-automatic character social network analysis and user behavior analysis, distinguish it from other existing digital humanities platforms. Currently, no digital humanities platform of this kind has been developed for humanists to support digital humanities research.

Details

The Electronic Library, vol. 37 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 19 January 2021

Chih-Ming Chen, Chung Chang and Yung-Ting Chen

Abstract

Purpose

Digital humanities aim to use a digital-based revolutionary new way to carry out enhanced forms of humanities research more effectively and efficiently. This study develops a character social network relationship map tool (CSNRMT) that can semi-automatically assist digital humanists through human-computer interaction to more efficiently and accurately explore the character social network relationships from Chinese ancient texts for useful research findings.

Design/methodology/approach

With a counterbalanced design, semi-structured in-depth interview, and lag sequential analysis, a total of 21 research subjects participated in an experiment to examine the system effectiveness and technology acceptance of adopting the ancient book digital humanities research platform with and without the CSNRMT to interpret the characters and character social network relationships.

Findings

The experimental results reveal that the experimental group supported by the CSNRMT shows higher system effectiveness in interpreting characters and character social network relationships than the control group without the CSNRMT, although the difference is not statistically significant. Encouragingly, the experimental group supported by the CSNRMT shows remarkably higher technology acceptance than the control group without the CSNRMT. Furthermore, use behaviors analyzed by lag sequential analysis reveal that the CSNRMT could assist digital humanists in interpreting character social network relationships. The interview results are positive regarding the integration of the system interface, smoothness of operation and the external search function.

Research limitations/implications

Currently, the system effectiveness of exploring the character social network relationships from texts for useful research findings by using the CSNRMT developed in this study will be significantly affected by the accuracy of recognizing character names and character social network relationships from Chinese ancient texts. The developed CSNRMT will be more practical when the offered information about character names and character social network relationships is more accurate and broad.

Practical implications

This study develops an ancient book digital humanities research platform with an emerging CSNRMT that provides an easy-to-use real-time interaction interface to semi-automatically support digital humanists to perform digital humanities research with the need of exploring character social network relationships.

Originality/value

At present, a real-time social network analysis tool that provides a friendly interaction interface and effectively assists digital humanists in research involving character social network analysis is still lacking. This study thus presents the CSNRMT, which can semi-automatically identify character names from Chinese ancient texts and provides an easy-to-use real-time interaction interface for supporting digital humanities research, so that digital humanists can more efficiently and accurately establish character social network relationships from the analyzed texts, explore complicated character social networks and arrive at useful research findings.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 11 May 2022

Chih-Ming Chen, Tek-Soon Ling, Chung Chang, Chih-Fan Hsu and Chia-Pei Lim

Abstract

Purpose

The digital humanities research platform for biographies of Malaysia personalities (DHRP-BMP) was collaboratively developed in this study by the Research Center for Chinese Cultural Subjectivity in Taiwan, the Federation of Heng Ann Association Malaysia, and the Malaysian Chinese Research Center of Universiti Malaya. Using The Biographies of Malaysia Henghua Personalities as the main archival source, DHRP-BMP adopted Omeka S, a next-generation Web publishing platform for institutions interested in connecting digital cultural heritage collections with other resources online, as its basic development system, and combines close reading and distant reading functions as the foundation of its digital humanities tools.

Design/methodology/approach

The results of the first-stage development are introduced in this study, and a case study of qualitative analysis is provided to describe the research process by a humanist scholar who used DHRP-BMP to discover the character relationships and contexts hidden in The Biographies of Malaysia Henghua Personalities.

Findings

The close reading function of DHRP-BMP was able to support humanities scholars in comprehending full-text contents through a user-friendly reading interface. The distant reading function could assist humanities scholars in interpreting texts from a more macro perspective through text analysis, with functions such as keyword search, geographic information and social network analysis that help scholars grasp the character relationships and geographic distribution in personality biographies, thus accelerating their text interpretation and uncovering hidden context.

Originality/value

At present, there is still no digital humanities research platform with a real-time character relationship analysis tool that can automatically generate visualized character relationship graphs, based on Chinese named entity recognition (CNER) and character relationship identification technologies, to effectively assist humanities scholars in interpreting characters’ relationships. This study thus presents the DHRP-BMP, whose key features automatically identify characters’ names and relationships in personality biographies and provide a user-friendly visualization interface for those relationships, so that humanities scholars can more efficiently and accurately explore complicated character relationships in the analyzed texts and arrive at useful research findings.

Article
Publication date: 29 April 2022

Chih-Ming Chen, Szu-Yu Ho and Chung Chang

Abstract

Purpose

This study aims to develop a hierarchical topic analysis tool (HTAT) based on hierarchical Latent Dirichlet allocation (hLDA) to support digital humanities research that requires topic exploration on the Digital Humanities Platform for Mr. Lo Chia-Lun’s Writings (DHP-LCLW). HTAT can assist humanities scholars in distant reading through the analysis of hierarchical text topics: it classifies time-stamped texts into multiple historical eras, conducts hierarchical topic modeling (HTM) on the texts from each era and presents the results through visualization. A comparative network diagram is also provided to assist humanities scholars in comparing the topics they wish to explore and in tracking how the concept of a topic changes over time from a particular perspective. In addition, HTAT lets humanities scholars view the source texts, so it has high potential to improve the effectiveness of topic exploration by integrating the topic exploration functions of both distant reading and close reading.

Design/methodology/approach

This study adopts a counterbalanced experimental design to examine whether there are significant differences in the effectiveness of topic inquiry, the number of relevant topics inquired and the time spent on them when research participants alternately conduct text exploration using DHP-LCLW with HTAT or DHP-LCLW with the Single-layer Topic Analysis Tool (SLTAT). A technology acceptance questionnaire and semi-structured interviews were also used to understand the research participants' perceptions of and feelings toward using the two tools to assist topic inquiry.

Findings

The experimental results show that DHP-LCLW with HTAT could better assist the research participants, in comparison with DHP-LCLW with SLTAT, in grasping the topic context of the texts from two particular perspectives assigned by this study within a short period. In addition, the interview results revealed that DHP-LCLW with HTAT, in comparison with SLTAT, provided topic terms that better met research participants' expectations and needs and effectively guided them to the corresponding texts for close reading. The analysis of the technology acceptance and interview data shows that the research participants have a strong, positive tendency toward using DHP-LCLW with HTAT to assist topic inquiry.

Research limitations/implications

The Jieba Chinese word segmentation system was used in the Mr. Lo Chia-Lun’s Writings Database in this study to perform word segmentation on Mr. Lo Chia-Lun’s writings for topic modeling based on hLDA. Because Jieba is a lexicon-based word segmentation system, it cannot reliably identify new words that are not yet in its lexicon. The correctness of word segmentation on the target texts therefore affects the results of hLDA topic modeling and the effectiveness of HTAT in assisting humanities scholars with topic inquiry.

Practical implications

An HTAT was developed to support digital humanities research in this study. With HTAT, DHP-LCLW provides humanities scholars with topic clues from different hierarchical perspectives for textual exploration, and with temporal and comparative network diagrams that help them track how the topics of specific perspectives evolve over time and gain a more comprehensive understanding of the overall context of the texts.

Originality/value

In recent years, topic analysis technology that can automatically extract key topic information from a large number of texts has developed rapidly, but the topics generated by traditional topic analysis models such as LDA (Latent Dirichlet allocation) make it difficult for users to understand how the topics of texts differ across hierarchical levels. Thus, this study proposes HTAT, which uses hLDA to build a hierarchical topic tree without the need to define the number of topics in advance, enabling humanities scholars to quickly grasp the concept of textual topics and use different hierarchical perspectives for further textual exploration. It also combines temporal division with a comparative network diagram to assist humanities scholars in exploring topics and their changes across different eras, which helps them discover more useful research clues or findings.

Details

Aslib Journal of Information Management, vol. 75 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 29 January 2020

Abdoulaye Kaba and Chennupati K. Ramaiah

Abstract

Purpose

The purpose of this research paper is to report about an investigation on the relationship between knowledge acquisition and knowledge creation to find out whether knowledge acquisition can predict knowledge creation. The study measures the concept of knowledge acquisition through the faculty use of knowledge acquisition tools and reading knowledge sources while measuring the concept of knowledge creation through the faculty use of knowledge creation tools and publishing knowledge sources.

Design/methodology/approach

The population of the study is faculty members in the United Arab Emirates (UAE). The sample consisted of 300 faculty members affiliated with 26 universities and colleges. Data were collected from the sample through a questionnaire. The stated hypotheses and Mathew’s theory of knowledge consumption–production correlation were tested and verified through a correlation matrix and regression analysis.

Findings

Findings of the study revealed that the use of knowledge acquisition tools by faculty members has a positive effect on the use of knowledge creation tools and on publishing knowledge sources. Likewise, reading knowledge sources appeared to have a positive impact on the use of knowledge creation tools and publishing knowledge sources. Accordingly, the study confirmed the stated four hypotheses. Moreover, the results of the study supported the theory of knowledge consumption–production correlation and strongly confirmed the prediction of knowledge creation through the use of information and communication technology (ICT) tools for knowledge acquisition and reading knowledge sources.

Practical implications

Findings of the study appeal to the decision-makers and stakeholders of academic institutions to make effective investment in ICT facilities and knowledge sources to improve knowledge creation among faculty members.

Originality/value

Not many studies have investigated how knowledge acquisition can predict knowledge creation in the academic environment. This paper contributes to the understanding of the relationship between knowledge acquisition and knowledge creation in academic settings. Findings of the study can be an important reference for providing and improving knowledge sources, knowledge acquisition tools and knowledge creation tools in the academic environment.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 50 no. 3
Type: Research Article
ISSN: 2059-5891

Article
Publication date: 30 March 2012

José L. Navarro‐Galindo and José Samos

Abstract

Purpose

Nowadays, the use of WCMS (web content management systems) is widespread. The conversion of this infrastructure into its semantic equivalent (semantic WCMS) is a critical issue, as this enables the benefits of the semantic web to be extended. The purpose of this paper is to present FLERSA (Flexible Range Semantic Annotation), a tool for flexible range semantic annotation.

Design/methodology/approach

FLERSA is presented as a user‐centred annotation tool for Web content expressed in natural language. The tool has been built in order to illustrate how a WCMS called Joomla! can be converted into its semantic equivalent.

Findings

The development of the tool shows that it is possible to build a semantic WCMS through a combination of semantic components and other resources such as ontologies and emerging technologies, including XML, RDF, RDFa and OWL.

Practical implications

The paper provides a starting‐point for further research in which the principles and techniques of the FLERSA tool can be applied to any WCMS.

Originality/value

The tool allows both manual and automatic semantic annotations, as well as providing enhanced search capabilities. For manual annotation, a new flexible range markup technique is used, based on the RDFa standard, to support the evolution of annotated Web documents more effectively than XPointer. For automatic annotation, a hybrid approach based on machine learning techniques (Vector‐Space Model + n‐grams) is used to determine the concepts that the content of a Web document deals with (from an ontology which provides a taxonomy), based on previous annotations that are used as a training corpus.
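
The automatic annotation idea above, suggesting a concept for new content from previously annotated content using a vector-space model over n-grams, can be sketched as follows. This is a simplified illustration under stated assumptions, not FLERSA's implementation: it uses character trigram counts and cosine similarity, and the function names, concept labels and training texts are all hypothetical.

```python
from collections import Counter
from math import sqrt

def ngrams(text, n=3):
    """Count overlapping character n-grams of a lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(annotated):
    """Merge the n-gram vectors of earlier annotations into one profile per concept."""
    profiles = {}
    for text, concept in annotated:
        profiles.setdefault(concept, Counter()).update(ngrams(text))
    return profiles

def suggest(text, profiles):
    """Return the concept whose profile is most similar to the new text."""
    vector = ngrams(text)
    return max(profiles, key=lambda concept: cosine(vector, profiles[concept]))

# Hypothetical training corpus of earlier manual annotations.
profiles = train([
    ("semantic web ontology languages", "SemanticWeb"),
    ("content management system templates", "CMS"),
])
suggestion = suggest("ontology for the semantic web", profiles)
```

With this toy corpus the suggested concept is `SemanticWeb`; a real system would draw its concepts from an ontology and would need a much larger training corpus.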

Article
Publication date: 13 December 2022

Chengxi Yan, Xuemei Tang, Hao Yang and Jun Wang

Abstract

Purpose

The majority of existing studies about named entity recognition (NER) concentrate on the prediction enhancement of deep neural network (DNN)-based models themselves, but the issues about the scarcity of training corpus and the difficulty of annotation quality control are not fully solved, especially for Chinese ancient corpora. Therefore, designing a new integrated solution for Chinese historical NER, including automatic entity extraction and man-machine cooperative annotation, is quite valuable for improving the effectiveness of Chinese historical NER and fostering the development of low-resource information extraction.

Design/methodology/approach

The research provides a systematic approach for Chinese historical NER with a three-stage framework. In addition to the stage of basic preprocessing, the authors create, retrain and yield a high-performance NER model only using limited labeled resources during the stage of augmented deep active learning (ADAL), which entails three steps—DNN-based NER modeling, hybrid pool-based sampling (HPS) based on the active learning (AL), and NER-oriented data augmentation (DA). ADAL is thought to have the capacity to maintain the performance of DNN as high as possible under the few-shot constraint. Then, to realize machine-aided quality control in crowdsourcing settings, the authors design a stage of globally-optimized automatic label consolidation (GALC). The core of GALC is a newly-designed label consolidation model called simulated annealing-based automatic label aggregation (“SA-ALC”), which incorporates the factors of worker reliability and global label estimation. The model can assure the annotation quality of those data from a crowdsourcing annotation system.

Findings

Extensive experiments on two types of Chinese classical historical datasets show that the authors’ solution can effectively reduce the corpus dependency of a DNN-based NER model and alleviate the problem of label quality. Moreover, the results also show the superior performance of the authors’ pipeline approaches (i.e. HPS + DA and SA-ALC) compared to equivalent baselines in each stage.

Originality/value

The study sheds new light on the automatic extraction of Chinese historical entities in an all-technological-process integration. The solution helps to effectively reduce the annotation cost and control the labeling quality for the NER task. It can be further applied to similar tasks of information extraction and other low-resource fields in theoretical and practical ways.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 19 June 2017

Mohammed Ourabah Soualah, Yassine Ait Ali Yahia, Abdelkader Keita and Abderrezak Guessoum

Abstract

Purpose

The purpose of this paper is to provide online access to digitised images of Arabic manuscripts, which requires a catalogue. Bibliographic cataloguing is unsuitable for old Arabic manuscripts, and it is imperative to establish a new cataloguing model. In the research, the authors propose a new cataloguing model based on manuscript annotations and transcriptions. This model can be an effective solution for dynamically cataloguing old Arabic manuscripts. To this end, the authors used automatic extraction of metadata based on the structural similarity of the documents.

Design/methodology/approach

This work is based on experimental methodology. All of the proposed concepts and formulas were tested for validation. This allows the authors to draw concise conclusions.

Findings

Cataloguing old Arabic manuscripts faces the problem of unavailable information. However, this information may be found elsewhere, in a copy of the original manuscript. Thus, cataloguing an Arabic manuscript cannot be done in one pass; it is a continual process that requires information updating. The idea is to pre-catalogue a manuscript and then try to complete and improve the record through a specific platform. Consequently, in the research work, the authors propose a new cataloguing model, which they call “Dynamic cataloguing”.

Research limitations/implications

The success of the proposed model depends on the involvement of all actors of the model. It relies on the conviction and motivation of the actors of the collaborative platform.

Practical implications

The model can be used in several cataloguing fields where the encoding model is based on XML. The model is innovative and implements a smart cataloguing model. It is put to use through a web platform and allows a catalogue to be updated automatically.

Social implications

The model prompts the user to participate in and enrich the catalogue. The user's role can thus change from passive to active.

Originality/value

The dynamic cataloguing model is a new concept. It has never been proposed in the literature until now. The proposed cataloguing model is based on automatic extraction of metadata from user annotations/transcription. It is a smart system which automatically updates or fills the catalogue with the extracted metadata.

Article
Publication date: 15 June 2015

Masaki Samejima, Daichi Hisakane and Norihisa Komoda

Abstract

Purpose

The purpose of this paper is to automatically annotate each learner’s opinion with an attribute (a problem, a solution or no annotation) to support learners’ discussion without a facilitator. The case method aims at discussing problems and solutions in a target case. However, learners fail to discuss some of the problems and solutions.

Design/methodology/approach

Because opinions about problems and solutions on the same case are similar to each other, the proposed method uses opinions that were correctly annotated in past discussions to annotate an appropriate attribute on each opinion in discussions of the same case. The annotation on each opinion is identified by a Support Vector Machine trained on opinions and annotations from past discussions.

Findings

Compared to a simple method that uses decision tree classification, this proposed method improves the recall rate and the precision rate of annotating the attribute by over 10 per cent. The proposed method is effective for automatic annotation.
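
The precision and recall rates reported here are computed per attribute by comparing predicted annotations with the correct ones. A minimal sketch of that calculation follows; the label names and the four sample opinions are made up for illustration and are not the paper's data.

```python
def precision_recall(gold, predicted, label):
    """Precision and recall for one attribute label over paired gold/predicted annotations."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(g == label and p == label for g, p in pairs)
    false_pos = sum(g != label and p == label for g, p in pairs)
    false_neg = sum(g == label and p != label for g, p in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Made-up annotations for four opinions (illustrative only).
gold = ["problem", "solution", "none", "problem"]
predicted = ["problem", "problem", "none", "solution"]
problem_scores = precision_recall(gold, predicted, "problem")
```

For this toy input `problem_scores` is `(0.5, 0.5)`: one of the two opinions predicted as "problem" is correct, and one of the two true "problem" opinions is found.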

Originality/value

Because the recall rate and the precision rate of annotating an attribute of a problem are over 80 per cent, it is possible to make learners aware of problems that they should discuss. On the other hand, the recall rate and the precision rate of annotating an attribute of a solution are still low. The authors discuss the research issue to improve the rates for automatic annotation.

Details

Interactive Technology and Smart Education, vol. 12 no. 2
Type: Research Article
ISSN: 1741-5659
