Search results

1 – 10 of over 4000
Article
Publication date: 9 August 2011

Lin‐Chih Chen

Web‐snippet clustering has recently attracted a lot of attention as a means to provide users with a succinct overview of relevant results compared with traditional search results…

Abstract

Purpose

Web‐snippet clustering has recently attracted a lot of attention as a means to provide users with a succinct overview of relevant results compared with traditional search results. This paper seeks to research the building of a web‐snippet clustering system, based on a mixed clustering method.

Design/methodology/approach

This paper proposes a mixed clustering method to organise all returned snippets into a hierarchical tree. The method accomplishes two main tasks: one is to construct the cluster labels and the other is to build a hierarchical tree.

Findings

Five measures were used to assess the quality of the clustering results. Based on the results of the experiments, it was concluded that the system performs better than current commercial and academic systems.

Originality/value

A high performance system is presented, based on the clustering method. A divisive hierarchical clustering algorithm is also developed to organise all returned snippets into a hierarchical tree.
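The abstract does not specify how the divisive algorithm works, so the following is only a minimal sketch of the general idea of divisive hierarchical clustering over snippets. The Jaccard similarity, the seed-based split, and the most-frequent-token labeling rule are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter
from itertools import combinations


def tokens(snippet):
    """Lowercased word set of a snippet (a deliberately naive tokenizer)."""
    return frozenset(snippet.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def label(cluster):
    """Hypothetical labeling rule: the most frequent token in the cluster."""
    counts = Counter(t for s in cluster for t in sorted(tokens(s)))
    return counts.most_common(1)[0][0] if counts else ""


def split(cluster):
    """Divide a cluster in two around its least similar pair of snippets."""
    seed_a, seed_b = min(
        combinations(cluster, 2),
        key=lambda pair: jaccard(tokens(pair[0]), tokens(pair[1])),
    )
    left, right = [], []
    for s in cluster:
        sim_a = jaccard(tokens(s), tokens(seed_a))
        sim_b = jaccard(tokens(s), tokens(seed_b))
        (left if sim_a >= sim_b else right).append(s)
    return left, right


def build_tree(cluster, min_size=2):
    """Recursively split clusters top-down until they are small enough."""
    node = {"label": label(cluster), "snippets": cluster, "children": []}
    if len(cluster) > min_size:
        left, right = split(cluster)
        if right:  # an empty side means all snippets look identical
            node["children"] = [
                build_tree(left, min_size),
                build_tree(right, min_size),
            ]
    return node


snippets = [
    "jaguar car speed",
    "jaguar car price",
    "jaguar cat habitat",
    "jaguar cat diet",
]
tree = build_tree(snippets)
```

In this toy run the root is divided into a "car" group and a "cat" group, which mirrors the two tasks named in the abstract: producing cluster labels and arranging the snippets in a tree.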

Article
Publication date: 29 April 2022

Chih-Ming Chen, Szu-Yu Ho and Chung Chang

This study aims to develop a hierarchical topic analysis tool (HTAT) based on hierarchical Latent Dirichlet allocation (hLDA) to support digital humanities research that is…

Abstract

Purpose

This study aims to develop a hierarchical topic analysis tool (HTAT) based on hierarchical latent Dirichlet allocation (hLDA) to support digital humanities research that requires topic exploration on the Digital Humanities Platform for Mr. Lo Chia-Lun’s Writings (DHP-LCLW). HTAT can assist humanities scholars in distant reading by analyzing hierarchical text topics: it classifies time-stamped texts into multiple historical eras, conducts hierarchical topic modeling (HTM) on the texts from each era and presents the results through visualization. The comparative network diagram is another function provided to assist humanities scholars in comparing the differences in the topics they wish to explore and in tracking how the concept of a topic changes over time from a particular perspective. In addition, HTAT lets humanities scholars view the source texts, and thus has high potential to improve the effectiveness of topic exploration by integrating the topic exploration functions of both distant reading and close reading.

Design/methodology/approach

This study adopts a counterbalanced experimental design to examine whether there are significant differences in the effectiveness of topic inquiry, the number of relevant topics inquired and the time spent on them when research participants alternately conducted text exploration using DHP-LCLW with HTAT or DHP-LCLW with a Single-layer Topic Analysis Tool (SLTAT). A technology acceptance questionnaire and semi-structured interviews were also conducted to understand the research participants' perceptions of and feelings toward using the two different tools to assist topic inquiry.

Findings

The experimental results show that DHP-LCLW with HTAT could better assist the research participants, in comparison with DHP-LCLW with SLTAT, to grasp the topic context of the texts from two particular perspectives assigned by this study within a short period. In addition, the results of the interviews revealed that DHP-LCLW with HTAT, in comparison with SLTAT, was able to provide topic terms that better met research participants' expectations and needs, and effectively guided them to the corresponding texts for close reading. The analysis of the technology acceptance and interview data shows that the research participants have a high and positive tendency toward using DHP-LCLW with HTAT to assist topic inquiry.

Research limitations/implications

The Jieba Chinese word segmentation system was used in the Mr. Lo Chia-Lun’s Writings Database to segment Mr. Lo Chia-Lun’s texts for topic modeling based on hLDA. Because Jieba is a lexicon-based word segmentation system, it cannot reliably identify new words that have not yet been added to its lexicon. The correctness of word segmentation on the target texts will therefore affect the results of hLDA topic modeling and the effectiveness of HTAT in assisting humanities scholars with topic inquiry.
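To illustrate why a lexicon-based segmenter struggles with words missing from its lexicon, here is a minimal sketch of greedy forward maximum matching. This is a simplified stand-in and not Jieba's actual algorithm; the `segment` function, the toy lexicon, and the sample text are all assumptions made for illustration.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching against a fixed lexicon.

    At each position the longest lexicon entry is taken; if nothing
    matches, a single character is emitted as a fallback.
    """
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Toy lexicon: "工具" (tool) is deliberately left out.
lexicon = {"主題", "模型", "分析"}
without_word = segment("主題模型分析工具", lexicon)

# Once the word is added to the lexicon it is kept whole.
with_word = segment("主題模型分析工具", lexicon | {"工具"})
```

In the first call the out-of-lexicon word falls apart into single characters, and fragments of that kind would then enter the topic model as separate terms.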

Practical implications

An HTAT was developed to support digital humanities research in this study. With HTAT, DHP-LCLW provides humanities scholars with topic clues from different hierarchical perspectives for textual exploration, and with temporal and comparative network diagrams to assist humanities scholars in tracking the evolution of the topics of specific perspectives over time, to gain a more comprehensive understanding of the overall context of the texts.

Originality/value

In recent years, topic analysis technology that can automatically extract key topic information from a large amount of texts has developed rapidly, but the topics generated by traditional topic analysis models such as LDA (Latent Dirichlet allocation) make it difficult for users to understand the differences between the topics of texts at different hierarchical levels. Thus, this study proposes HTAT, which uses hLDA to build a hierarchical topic tree without the need to define the number of topics in advance, enabling humanities scholars to quickly grasp the concept of textual topics and use different hierarchical perspectives for further textual exploration. At the same time, it also combines temporal division with a comparative network diagram to assist humanities scholars in exploring topics and their changes in different eras, which helps them discover more useful research clues or findings.

Details

Aslib Journal of Information Management, vol. 75 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 6 February 2017

Zhongyi Wang, Jin Zhang and Jing Huang

Current segmentation systems almost invariably focus on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feed…

Abstract

Purpose

Current segmentation systems almost invariably focus on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feed but not coherent texts such as documents of a digital library which have hierarchical structures. To overcome the focus on linear segmentation in document segmentation and to realize the purpose of hierarchical segmentation for a digital library’s structured resources, this paper aimed to propose a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks.

Design/methodology/approach

MHTSS adopts an up-down segmentation strategy to divide a structured digital library document into a document segmentation tree. Specifically, it works in three stages: document parsing, coarse segmentation based on document access structures and fine-grained segmentation based on lexical cohesion.
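The fine-grained stage can be pictured with a small sketch of segmentation by lexical cohesion. The abstract does not give MHTSS's actual procedure, so the bag-of-words cosine measure, the `fine_segments` function, and the threshold value below are assumptions chosen only to show the general technique: a break is placed wherever adjacent sentences share too little vocabulary.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fine_segments(sentences, threshold=0.1):
    """Start a new segment where adjacent sentences are lexically dissimilar."""
    if not sentences:
        return []
    bags = [Counter(s.lower().split()) for s in sentences]
    segments, current = [], [sentences[0]]
    for prev, nxt, sentence in zip(bags, bags[1:], sentences[1:]):
        if cosine(prev, nxt) < threshold:
            segments.append(current)
            current = [sentence]
        else:
            current.append(sentence)
    segments.append(current)
    return segments


sentences = [
    "library users search digital collections",
    "digital collections serve library users",
    "topic models need training data",
    "training data shapes topic models",
]
segments = fine_segments(sentences)
```

In a multi-stage design like the one described, a function of this kind would run inside each coarse section found from the document's access structures, so that the two sources of evidence refine each other.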

Findings

This paper analyzed the limitations of document segmentation methods for structured digital library resources. The authors found that document access structures and lexical cohesion techniques complement each other and, in combination, allow for a better segmentation of structured digital library resources. Based on this finding, this paper proposed the MHTSS for structured digital library resources. To evaluate it, MHTSS was compared to the TT and C99 algorithms on real-world digital library corpora. The comparison showed that MHTSS achieves the top overall performance.

Practical implications

With MHTSS, digital library users can get their relevant information directly in segments instead of receiving the whole document. This will improve retrieval performance as well as dramatically reduce information overload.

Originality/value

This paper proposed MHTSS for the structured, digital library resources, which combines the document access structures and lexical cohesion techniques to decide section breaks. With this system, end-users can access a document by sections through a document structure tree.

Article
Publication date: 29 August 2018

Norihiro Kamide

The purpose of this paper is to develop new simple logics and translations for hierarchical model checking. Hierarchical model checking is a model-checking paradigm that can…

Abstract

Purpose

The purpose of this paper is to develop new simple logics and translations for hierarchical model checking. Hierarchical model checking is a model-checking paradigm that can appropriately verify systems with hierarchical information and structures.

Design/methodology/approach

In this study, logics and translations for hierarchical model checking are developed based on linear-time temporal logic (LTL), computation-tree logic (CTL) and full computation-tree logic (CTL*). A sequential linear-time temporal logic (sLTL), a sequential computation-tree logic (sCTL), and a sequential full computation-tree logic (sCTL*), which can suitably represent hierarchical information and structures, are developed by extending LTL, CTL and CTL*, respectively. Translations from sLTL, sCTL and sCTL* into LTL, CTL and CTL*, respectively, are defined, and theorems for embedding sLTL, sCTL and sCTL* into LTL, CTL and CTL*, respectively, are proved using these translations.

Findings

These embedding theorems allow us to reuse the standard LTL-, CTL-, and CTL*-based model-checking algorithms to verify hierarchical systems that are modeled and specified by sLTL, sCTL and sCTL*.

Originality/value

The new logics sLTL, sCTL and sCTL* and their translations are developed, and some illustrative examples of hierarchical model checking are presented based on these logics and translations.

Details

Data Technologies and Applications, vol. 52 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 16 October 2009

Kwan Yi and Lois Mai Chan

The purpose of this paper is to investigate the linking of a folksonomy (user vocabulary) and LCSH (controlled vocabulary) on the basis of word matching, for the potential use of…

Abstract

Purpose

The purpose of this paper is to investigate the linking of a folksonomy (user vocabulary) and LCSH (controlled vocabulary) on the basis of word matching, for the potential use of LCSH in bringing order to folksonomies.

Design/methodology/approach

A selected sample of a folksonomy from a popular collaborative tagging system, Delicious, was word‐matched with LCSH. LCSH was transformed into a tree structure called an LCSH tree for the matching. A close examination was conducted on the characteristics of folksonomies, the overlap of folksonomies with LCSH, and the distribution of folksonomies over the LCSH tree.
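As a rough picture of word matching against a heading tree, the sketch below normalizes tags and headings and records the depth at which each matched tag sits. The tree, the tags, and the normalization rule are hypothetical illustrations; they are neither real LCSH data nor the study's actual matching procedure.

```python
def normalize(term):
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def index_tree(tree, depth=1, index=None):
    """Map each normalized heading to its depth in the nested-dict tree."""
    index = {} if index is None else index
    for heading, children in tree.items():
        index.setdefault(normalize(heading), depth)
        index_tree(children, depth + 1, index)
    return index


def match_tags(tags, tree):
    """Split tags into those found in the tree (with depth) and the rest."""
    index = index_tree(tree)
    matched = {t: index[normalize(t)] for t in tags if normalize(t) in index}
    unmatched = [t for t in tags if normalize(t) not in index]
    return matched, unmatched


# Hypothetical heading tree and tag sample, for illustration only.
heading_tree = {
    "Science": {"Physics": {"Quantum theory": {}}, "Mathematics": {}},
    "Art": {},
}
tags = ["physics", "Quantum-Theory", "webdesign", "art"]
matched, unmatched = match_tags(tags, heading_tree)
```

Counting matched tags per depth or per branch would then give the kind of distribution over the tree that the study examines.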

Findings

The experimental results showed that the total proportion of tags being matched with LC subject headings constituted approximately two‐thirds of all tags involved, with an additional 10 percent of the remaining tags having potential matches. A number of barriers for the linking as well as two areas in need of improving the matching are identified and described. Three important tag distribution patterns over the LCSH tree were identified and supported: skewedness, multifacet, and Zipfian‐pattern.

Research limitations/implications

The results of the study can be adopted for the development of innovative methods of mapping between folksonomy and LCSH, which directly contributes to effective access and retrieval of tagged web resources and to the integration of multiple information repositories based on the two vocabularies.

Practical implications

The linking of controlled vocabularies can be applicable to enhance information retrieval capability within collaborative tagging systems as well as across various tagging system information depositories and bibliographic databases.

Originality/value

This is among the frontier works that examine the potential of linking a folksonomy, extracted from a collaborative tagging system, to an authority‐maintained subject heading system. It provides exploratory data to support further advanced mapping methods for linking the two vocabularies.

Details

Journal of Documentation, vol. 65 no. 6
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 31 January 2024

Juan Gabriel Brida, Emiliano Alvarez, Gaston Cayssials and Matias Mednik

Our paper studies a central issue with a long history in economics: the relationship between population and economic growth. We analyze the joint dynamics of economic and…

Abstract

Purpose

Our paper studies a central issue with a long history in economics: the relationship between population and economic growth. We analyze the joint dynamics of economic and demographic growth in 111 countries during the period 1960–2019.

Design/methodology/approach

Using the concept of economic regime, the paper introduces the notion of distance between the dynamical paths of different countries. Then, a minimal spanning tree (MST) and a hierarchical tree (HT) are constructed to detect groups of countries sharing similar dynamic performance.
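For readers unfamiliar with the construction, a minimal spanning tree can be built from a pairwise distance table with Kruskal's algorithm, sketched below. The node names and distances are made up for illustration, and the paper's own distance between dynamical paths is not reproduced here.

```python
def minimal_spanning_tree(nodes, distances):
    """Kruskal's algorithm; `distances` maps (a, b) pairs to a distance."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the set representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for (a, b), d in sorted(distances.items(), key=lambda item: item[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate groups
            parent[root_a] = root_b
            tree.append((a, b, d))
    return tree


# Hypothetical countries A-D with invented pairwise distances.
nodes = ["A", "B", "C", "D"]
distances = {
    ("A", "B"): 1.0, ("A", "C"): 4.0, ("A", "D"): 5.0,
    ("B", "C"): 2.0, ("B", "D"): 6.0, ("C", "D"): 3.0,
}
mst = minimal_spanning_tree(nodes, distances)
```

Reading the accepted edges in ascending order of distance gives the merge order of a single-linkage hierarchy, which is one common way to derive a hierarchical tree from an MST.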

Findings

The methodology confirms the existence of three country clubs, each of which exhibits a different dynamic behavior pattern. The analysis also shows that the clusters clearly differ with respect to the evolution of other fundamental variables not previously considered [gross domestic product (GDP) per capita, human capital and life expectancy, among others].

Practical implications

Our results indirectly suggest the existence of dynamic interdependence in the trajectories of economic growth and population change between countries. They also provide evidence against single-model approaches to explaining the interdependence between demographic change and economic growth.

Originality/value

We introduce a methodology that allows for a model-free topological and hierarchical description of the interplay between economic growth and population.

Details

Review of Economics and Political Science, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2356-9980

Article
Publication date: 21 November 2016

Jing Chen, Dan Wang, Quan Lu and Zeyuan Xu

With a mass of electronic multi-topic documents available, there is an increasing need for evaluating emerging analysis tools to help users and digital libraries analyze these…

Abstract

Purpose

With a mass of electronic multi-topic documents available, there is an increasing need for evaluating emerging analysis tools to help users and digital libraries analyze these documents better. The purpose of this paper is to evaluate the effectiveness, efficiency and user satisfaction of THC-DAT, a within-document analysis tool, in reading a multi-topic document.

Design/methodology/approach

The authors first reviewed related literature, then performed a user-centered, comparative evaluation of two within-document analysis tools, THC-DAT and BOOKMARK. THC-DAT extracts a topic hierarchy tree using the hierarchical latent Dirichlet allocation (hLDA) method and takes the context information into account. BOOKMARK provides similar functionality to the Table of Contents bookmarks in Adobe Reader. Three novel kinds of tasks were devised for participants to complete with the two tools, with objective results used to assess reading effectiveness and efficiency. Post-system questionnaires were employed to obtain participants’ subjective judgments about the tools.

Findings

The results confirm that THC-DAT is significantly more effective than BOOKMARK, while not inferior in efficiency. There is some evidence that THC-DAT can slow down the process of approaching cognitive overload and improve users’ willingness to undertake difficult tasks. Based on qualitative data from questionnaires, the results indicate that users were more satisfied when using THC-DAT than BOOKMARK.

Practical implications

Adopting THC-DAT in digital libraries or electronic document reading systems helps improve users’ reading performance, willingness to undertake difficult tasks and general satisfaction. Moreover, THC-DAT is of great value in addressing the cognitive overload problem in the information retrieval field.

Originality/value

This paper evaluates a novel within-document analysis tool in analyzing a multi-topic document, and shows that this tool is superior to the benchmark in effectiveness and user satisfaction, and not inferior in efficiency.

Details

Library Hi Tech, vol. 34 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 18 January 2011

Yael Keshet

Classification is an important process in making sense of the world, and has a pronounced social dimension. This paper aims to compare folksonomy, a new social classification…

Abstract

Purpose

Classification is an important process in making sense of the world, and has a pronounced social dimension. This paper aims to compare folksonomy, a new social classification system currently being developed on the web, with conventional taxonomy in the light of theoretical sociological and anthropological approaches. The co‐existence of these two types of classification system raises the questions: Will and should taxonomies be hybridized with folksonomies? What can each of these systems contribute to information‐searching processes, and how can the sociology of knowledge provide an answer to these questions? This paper aims also to address these issues.

Design/methodology/approach

This paper is situated at the meeting point of the sociology of knowledge, epistemology and information science and aims at examining systems of classification in the light of both classical theory and current late‐modern sociological and anthropological approaches.

Findings

Using theoretical approaches current in the sociology of science and knowledge, the paper envisages two divergent possible outcomes.

Originality/value

While concentrating on classification systems, this paper addresses the more general social issue of what we know and how it is known. The concept of hybrid knowledge is suggested in order to illuminate the epistemological basis of late‐modern knowledge being constructed by hybridizing contradictory modern knowledge categories, such as the subjective with the objective and the social with the natural. Integrating tree‐like taxonomies with folksonomies or, in other words, generating a naturalized structural order of objective relations with social, subjective classification systems, can create a vast range of hybrid knowledge.

Details

Journal of Documentation, vol. 67 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 13 March 2017

Fulvio Mazzocchi

The purpose of this paper, which extends and deepens what was expressed in a previous work (Mazzocchi et al., 2007), is to scrutinize the underlying assumptions of the types of…

Abstract

Purpose

The purpose of this paper, which extends and deepens what was expressed in a previous work (Mazzocchi et al., 2007), is to scrutinize the underlying assumptions of the types of relations included in thesauri, particularly the genus-species relation. Logicist approaches to information organization, which are still dominant, will be compared with hermeneutically oriented approaches. In the light of these approaches, the nature and features of the relations, and what the notion of a priori could possibly mean with regard to them, are examined, together with the implications for designing and implementing knowledge organization systems (KOS).

Design/methodology/approach

The inquiry is based on how the relations are described in literature, engaging in particular a discussion with Hjørland (2015) and Svenonius (2004). The philosophical roots of today’s leading views are briefly illustrated, in order to put them under perspective and deconstruct the uncritical reception of their authority. To corroborate the discussion a semantic analysis of specific terms and relations is provided too.

Findings

All relations should be seen as “perspectival” (not as a priori). On the other hand, different types of relations, depending on the conceptual features of the terms involved, can hold a different degree of “stability.” On this basis, they could be used to address different information concerns (e.g. interoperability vs expressiveness).

Research limitations/implications

Some arguments that the paper puts forth at the conceptual level need to be tested in application contexts.

Originality/value

This paper considers that the standpoints of logic and of hermeneutics (usually seen as conflicting) are both significant for information organization, and could be pragmatically integrated. In accordance with this view, an extension of the set of thesaurus relations is advised, meaning that perspective hierarchical relations (i.e. relations that are not logically based but function contingently) should also be included in such a set.

Details

Journal of Documentation, vol. 73 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 2 November 2015

Hengliang Shi, Xiaolei Bai and Jianhui Duan

In the field of cloth animation, collision detection for fabric under external force is very complex, and it is difficult to meet the demands of realism and real-time performance. The purpose…

Abstract

Purpose

In the field of cloth animation, collision detection for fabric under external force is very complex, and it is difficult to meet the demands of realism and real-time performance. The purpose of this paper is to improve both realism and real-time performance.

Design/methodology/approach

This paper puts forward a mass-spring model that builds a bounding box centered on each particle, and designs a collision detection algorithm based on MapReduce. At the same time, a method is proposed to detect collisions based on geometric units.
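The combination of per-particle bounding boxes with a map and reduce phase can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it emulates the two phases with ordinary functions rather than a MapReduce framework, it tests box against box rather than the particle-triangle intersection the paper describes, and the grid-cell keying scheme, function names, and parameter values are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations, product


def box(center, half):
    """Axis-aligned bounding box centered on a particle."""
    return tuple(c - half for c in center), tuple(c + half for c in center)


def overlaps(a, b):
    """True if two axis-aligned boxes intersect on all three axes."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[k] <= bmax[k] and bmin[k] <= amax[k] for k in range(3))


def map_phase(particles, half, cell):
    """Emit (grid cell, particle id) for every cell the particle's box touches."""
    for pid, center in particles.items():
        lo, hi = box(center, half)
        ranges = [
            range(math.floor(l / cell), math.floor(h / cell) + 1)
            for l, h in zip(lo, hi)
        ]
        for key in product(*ranges):
            yield key, pid


def reduce_phase(pairs, particles, half):
    """Group ids by cell, then test only pairs that share a cell."""
    buckets = defaultdict(list)
    for key, pid in pairs:
        buckets[key].append(pid)
    hits = set()
    for ids in buckets.values():
        for p, q in combinations(sorted(ids), 2):
            if overlaps(box(particles[p], half), box(particles[q], half)):
                hits.add((p, q))
    return hits


# Three hypothetical particles: 0 and 1 are close, 2 is far away.
particles = {0: (0.0, 0.0, 0.0), 1: (0.15, 0.0, 0.0), 2: (2.0, 2.0, 2.0)}
hits = reduce_phase(map_phase(particles, 0.1, 1.0), particles, 0.1)
```

Keying each box to every cell it touches means distant particles never meet in the same bucket, so the pairwise tests within each bucket can run independently, which is what makes the workload suitable for parallel execution.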

Findings

The method can quickly detect the intersection of a particle and a triangle, and then handle the collision response according to the physical characteristics of the fabric. Experiments show that the algorithm improves real-time performance and realism.

Research limitations/implications

Experiments show that 3D fabric simulation can be made more efficient through the MapReduce parallel computation model.

Practical implications

This method can improve realism and reduce the amount of computation.

Social implications

This collision detection method can be applied in other fields, such as 3D games, aero simulation training and garment automation.

Originality/value

The model and method are original, and can be applied to 3D animation, digital entertainment and the garment industry.

Details

International Journal of Clothing Science and Technology, vol. 27 no. 6
Type: Research Article
ISSN: 0955-6222
