Search results

1 – 10 of over 17000
Article
Publication date: 9 August 2021

Xintong Zhao, Jane Greenberg, Vanessa Meschke, Eric Toberer and Xiaohua Hu

The output of academic literature has increased significantly due to digital technology, presenting researchers with a challenge across every discipline, including materials…

Abstract

Purpose

The output of academic literature has increased significantly due to digital technology, presenting researchers with a challenge across every discipline, including materials science, as it is impossible to manually read and extract knowledge from millions of published literature. The purpose of this study is to address this challenge by exploring knowledge extraction in materials science, as applied to digital scholarship. An overriding goal is to help inform readers about the status knowledge extraction in materials science.

Design/methodology/approach

The authors conducted a two-part analysis, comparing knowledge extraction methods applied materials science scholarship, across a sample of 22 articles; followed by a comparison of HIVE-4-MAT, an ontology-based knowledge extraction and MatScholar, a named entity recognition (NER) application. This paper covers contextual background, and a review of three tiers of knowledge extraction (ontology-based, NER and relation extraction), followed by the research goals and approach.

Findings

The results indicate three key needs for researchers to consider for advancing knowledge extraction: the need for materials science focused corpora; the need for researchers to define the scope of the research being pursued, and the need to understand the tradeoffs among different knowledge extraction methods. This paper also points to future material science research potential with relation extraction and increased availability of ontologies.

Originality/value

To the best of the authors’ knowledge, there are very few studies examining knowledge extraction in materials science. This work makes an important contribution to this underexplored research area.

Details

The Electronic Library , vol. 39 no. 3
Type: Research Article
ISSN: 0264-0473

Keywords

Open Access
Article
Publication date: 14 August 2017

Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Anne H.H. Ngu and Yihong Zhang

This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.

2120

Abstract

Purpose

This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.

Design/methodology/approach

In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query stream to augment the ontology of the existing KB (i.e. Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed.

Findings

Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. The future research directions regarding GrandBase construction and extension has also been discussed.

Originality/value

To revolutionize our modern society by using the wisdom of Big Data, considerable KBs have been constructed to feed the massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and different-structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research efforts have been contributed on both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.

Details

PSU Research Review, vol. 1 no. 2
Type: Research Article
ISSN: 2399-1747

Keywords

Article
Publication date: 19 July 2024

Kuoyi Lin, Xiaoyang Kan and Meilian Liu

This study develops and validates an innovative approach for extracting knowledge from online user reviews by integrating textual content and emojis. Recognizing the pivotal role…

Abstract

Purpose

This study develops and validates an innovative approach for extracting knowledge from online user reviews by integrating textual content and emojis. Recognizing the pivotal role emojis play in enhancing the expressiveness and emotional depth of digital communication, this study aims to address the significant gap in existing sentiment analysis models, which have largely overlooked the contribution of emojis in interpreting user preferences and sentiments. By constructing a comprehensive model that synergizes emotional and semantic information conveyed through emojis and text, this study seeks to provide a more nuanced understanding of user preferences, thereby enhancing the accuracy and depth of knowledge extraction from online reviews. The goal is to offer a robust framework that enables more effective and empathetic engagement with user-generated content on digital platforms, paving the way for improved service delivery, product development and customer satisfaction through informed insights into consumer behavior and sentiments.

Design/methodology/approach

This study uses a structured methodology to integrate and analyze text and emojis from online reviews for effective knowledge extraction, focusing on user preferences and sentiments. This methodology consists of four key stages. First, this study leverages high-frequency noun analysis to identify and extract product attributes mentioned in online user reviews. By focusing on nouns that appear frequently, the authors can systematically discern the primary features or aspects of products that users discuss, thereby providing a foundation for a more detailed sentiment and preference analysis. Second, a foundational sentiment dictionary is established that incorporates sentiment-bearing words, intensifiers and negation terms to analyze the textual part of the reviews. This dictionary is used to assign sentiment scores to phrases and sentences within reviews, allowing the quantification of textual sentiments based on the presence and combination of these predefined lexical items. Third, an emoticon sentiment dictionary is developed to address the emotional content conveyed through emojis. This dictionary categorizes emojis based on their associated sentiments, thus enabling the quantification of emotional expressions in reviews. The sentiment scores derived from the emojis are then integrated with those from the textual analysis. This integration considers the weights of text- and emoji-based emotions to compute a comprehensive attribute sentiment score that reflects a nuanced understanding of user sentiments and preferences. Finally, the authors conduct an empirical study to validate the effectiveness of the proposed methodology in mining user preferences from online reviews by applying the approach to a data set of online reviews and evaluating its ability to accurately identify product attributes and user sentiments. The validation process assessed the reliability and accuracy of the methodology in extracting meaningful insights from the complex interplay between text and emojis. This study offers a holistic and nuanced framework for knowledge extraction from online reviews, capturing both explicit and implicit sentiments expressed by users through text and emojis. By integrating these elements, this study seeks to provide a comprehensive understanding of user preferences, contributing to improved consumer insight and strategic decision-making for businesses and researchers.

Findings

The application of the proposed methodology for integrating emojis with text in online reviews yields significant findings that underscore the feasibility and value of extracting realistic user knowledge to gain insights from user-generated content. The analysis successfully captured consumer preferences, which are instrumental in informing service decisions and driving innovation. This achievement is largely attributed to the development and utilization of a comprehensive emotion-sentiment dictionary tailored to interpret the complex interplay between textual and emoji-based expressions in online reviews. By implementing a sentiment calculation model that intricately combines textual sentiment analysis with emoji sentiment analysis, this study was able to accurately determine the final attribute emotion for various product features discussed in the reviews. This model effectively characterized the emotional knowledge of online users and provided a nuanced understanding of their sentiments and preferences. The emotional knowledge extracted is not only quantifiable but also rich in context, offering deeper insights into consumer behavior and attitudes. Furthermore, a case analysis is conducted to rigorously test the validity of the proposed model in a real-world scenario. This practical examination revealed that the model is not only capable of accurately extracting and analyzing user preferences but is also adaptable to different contexts and product categories. The case analysis highlights the robustness and flexibility of the model, demonstrating its potential to enhance the precision of knowledge extraction processes significantly. Overall, the results confirm the effectiveness of the proposed approach in integrating text and emojis for comprehensive knowledge extraction from online reviews. The findings validate the model’s capability to offer actionable insights into consumer preferences, thereby supporting more informed and strategic decision-making by businesses. This study contributes to the broader field of sentiment analysis by showcasing the untapped potential of emojis as valuable indicators of user sentiments, opening new avenues for research and applications in digital marketing and consumer behavior analysis.

Originality/value

This study introduces a pioneering approach to extract knowledge from Web user interactions, notably through the integration of online reviews that incorporate both textual content and emoticons. This innovative methodology stands out because it holistically considers the dual channels of communication, text and emojis, to comprehensively mine Web user preferences. The key contribution of this study lies in its novel insights into the extraction of consumer preferences, advancing beyond traditional text-based analysis to embrace nuanced expressions conveyed through emoticons. The originality of this study is underpinned by its acknowledgment of emoticons as a significant and untapped source of sentiment and preference indicators in online reviews. By effectively merging emoticon analysis and emoji emotion scoring with textual sentiment analysis, this study enriches the understanding of Web user preferences and enhances the accuracy and depth of consumer preference insights. This dual-analysis approach represents a significant leap forward in sentiment analysis, setting a new standard for how digital communication can be leveraged to derive meaningful insights into consumer behavior. Furthermore, the results have practical implications to businesses and marketers. The insights gained from this integrated analytical approach offer a more granular and emotionally nuanced view of customer feedback, which can inform more effective marketing strategies, product development and customer service practices. By pioneering this comprehensive method of knowledge extraction, this study paves the way for future research and practice to interpret and respond more accurately to the complex landscape of online consumer expressions. This study’s originality and value lie in its innovative method of capturing and analyzing the rich tapestry of Web user communication, offering a ground-breaking perspective on consumer preference extraction that promises to enhance both academic research and practical applications in the digital era.

Details

Journal of Knowledge Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1367-3270

Keywords

Article
Publication date: 5 June 2024

Azanzi Jiomekong and Sanju Tiwari

This paper aims to curate open research knowledge graph (ORKG) with papers related to ontology learning and define an approach using ORKG as a computer-assisted tool to organize…

Abstract

Purpose

This paper aims to curate open research knowledge graph (ORKG) with papers related to ontology learning and define an approach using ORKG as a computer-assisted tool to organize key-insights extracted from research papers.

Design/methodology/approach

Action research was used to explore, test and evaluate the use of the Open Research Knowledge Graph as a computer assistant tool for knowledge acquisition from scientific papers.

Findings

To extract, structure and describe research contributions, the granularity of information should be decided; to facilitate the comparison of scientific papers, one should design a common template that will be used to describe the state of the art of a domain.

Originality/value

This approach is currently used to document “food information engineering,” “tabular data to knowledge graph matching” and “question answering” research problems and the “neurosymbolic AI” domain. More than 200 papers are ingested in ORKG. From these papers, more than 800 contributions are documented and these contributions are used to build over 100 comparison tables. At the end of this work, we found that ORKG is a valuable tool that can reduce the working curve of state-of-the-art research.

Article
Publication date: 24 June 2020

Yilu Zhou and Yuan Xue

Strategic alliances among organizations are some of the central drivers of innovation and economic growth. However, the discovery of alliances has relied on pure manual search and…

250

Abstract

Purpose

Strategic alliances among organizations are some of the central drivers of innovation and economic growth. However, the discovery of alliances has relied on pure manual search and has limited scope. This paper proposes a text-mining framework, ACRank, that automatically extracts alliances from news articles. ACRank aims to provide human analysts with a higher coverage of strategic alliances compared to existing databases, yet maintain a reasonable extraction precision. It has the potential to discover alliances involving less well-known companies, a situation often neglected by commercial databases.

Design/methodology/approach

The proposed framework is a systematic process of alliance extraction and validation using natural language processing techniques and alliance domain knowledge. The process integrates news article search, entity extraction, and syntactic and semantic linguistic parsing techniques. In particular, Alliance Discovery Template (ADT) identifies a number of linguistic templates expanded from expert domain knowledge and extract potential alliances at sentence-level. Alliance Confidence Ranking (ACRank)further validates each unique alliance based on multiple features at document-level. The framework is designed to deal with extremely skewed, noisy data from news articles.

Findings

In evaluating the performance of ACRank on a gold standard data set of IBM alliances (2006–2008) showed that: Sentence-level ADT-based extraction achieved 78.1% recall and 44.7% precision and eliminated over 99% of the noise in news articles. ACRank further improved precision to 97% with the top20% of extracted alliance instances. Further comparison with Thomson Reuters SDC database showed that SDC covered less than 20% of total alliances, while ACRank covered 67%. When applying ACRank to Dow 30 company news articles, ACRank is estimated to achieve a recall between 0.48 and 0.95, and only 15% of the alliances appeared in SDC.

Originality/value

The research framework proposed in this paper indicates a promising direction of building a comprehensive alliance database using automatic approaches. It adds value to academic studies and business analyses that require in-depth knowledge of strategic alliances. It also encourages other innovative studies that use text mining and data analytics to study business relations.

Details

Information Technology & People, vol. 33 no. 5
Type: Research Article
ISSN: 0959-3845

Keywords

Article
Publication date: 14 November 2023

Shaodan Sun, Jun Deng and Xugong Qin

This paper aims to amplify the retrieval and utilization of historical newspapers through the application of semantic organization, all from the vantage point of a fine-grained…

Abstract

Purpose

This paper aims to amplify the retrieval and utilization of historical newspapers through the application of semantic organization, all from the vantage point of a fine-grained knowledge element perspective. This endeavor seeks to unlock the latent value embedded within newspaper contents while simultaneously furnishing invaluable guidance within methodological paradigms for research in the humanities domain.

Design/methodology/approach

According to the semantic organization process and knowledge element concept, this study proposes a holistic framework, including four pivotal stages: knowledge element description, extraction, association and application. Initially, a semantic description model dedicated to knowledge elements is devised. Subsequently, harnessing the advanced deep learning techniques, the study delves into the realm of entity recognition and relationship extraction. These techniques are instrumental in identifying entities within the historical newspaper contents and capturing the interdependencies that exist among them. Finally, an online platform based on Flask is developed to enable the recognition of entities and relationships within historical newspapers.

Findings

This article utilized the Shengjing Times·Changchun Compilation as the datasets for describing, extracting, associating and applying newspapers contents. Regarding knowledge element extraction, the BERT + BS consistently outperforms Bi-LSTM, CRF++ and even BERT in terms of Recall and F1 scores, making it a favorable choice for entity recognition in this context. Particularly noteworthy is the Bi-LSTM-Pro model, which stands out with the highest scores across all metrics, notably achieving an exceptional F1 score in knowledge element relationship recognition.

Originality/value

Historical newspapers transcend their status as mere artifacts, evolving into invaluable reservoirs safeguarding the societal and historical memory. Through semantic organization from a fine-grained knowledge element perspective, it can facilitate semantic retrieval, semantic association, information visualization and knowledge discovery services for historical newspapers. In practice, it can empower researchers to unearth profound insights within the historical and cultural context, broadening the landscape of digital humanities research and practical applications.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 29 May 2023

Jinxiang Zeng, Shujin Cao, Yijin Chen, Pei Pan and Yafang Cai

This study analyzed the interdisciplinary characteristics of Chinese research studies in library and information science (LIS) measured by knowledge elements extracted through the…

Abstract

Purpose

This study analyzed the interdisciplinary characteristics of Chinese research studies in library and information science (LIS) measured by knowledge elements extracted through the Lexicon-LSTM model.

Design/methodology/approach

Eight research themes were selected for experiment, with a large-scale (N = 11,625) dataset of research papers from the China National Knowledge Infrastructure (CNKI) database constructed. And it is complemented with multiple corpora. Knowledge elements were extracted through a Lexicon-LSTM model. A subject knowledge graph is constructed to support the searching and classification of knowledge elements. An interdisciplinary-weighted average citation index space was constructed for measuring the interdisciplinary characteristics and contributions based on knowledge elements.

Findings

The empirical research shows that the Lexicon-LSTM model has superiority in the accuracy of extracting knowledge elements. In the field of LIS, the interdisciplinary diversity indicator showed an upward trend from 2011 to 2021, while the disciplinary balance and difference indicators showed a downward trend. The knowledge elements of theory and methodology could be used to detect and measure the interdisciplinary characteristics and contributions.

Originality/value

The extraction of knowledge elements facilitates the discovery of semantic information embedded in academic papers. The knowledge elements were proved feasible for measuring the interdisciplinary characteristics and exploring the changes in the time sequence, which helps for overview the state of the arts and future development trend of the interdisciplinary of research theme in LIS.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 20 April 2012

Mohamed Morsey, Jens Lehmann, Sören Auer, Claus Stadler and Sebastian Hellmann

DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However…

2414

Abstract

Purpose

DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However, the DBpedia release process is heavyweight and releases are sometimes based on several months old data. DBpedia‐Live solves this problem by providing a live synchronization method based on the update stream of Wikipedia. This paper seeks to address these issues.

Design/methodology/approach

Wikipedia provides DBpedia with a continuous stream of updates, i.e. a stream of articles, which were recently updated. DBpedia‐Live processes that stream on the fly to obtain RDF data and stores the extracted data back to DBpedia. DBpedia‐Live publishes the newly added/deleted triples in files, in order to enable synchronization between the DBpedia endpoint and other DBpedia mirrors.

Findings

During the realization of DBpedia‐Live the authors learned that it is crucial to process Wikipedia updates in a priority queue. Recently‐updated Wikipedia articles should have the highest priority, over mapping‐changes and unmodified pages. An overall finding is that there are plenty of opportunities arising from the emerging Web of Data for librarians.

Practical implications

DBpedia had and has a great effect on the Web of Data and became a crystallization point for it. Many companies and researchers use DBpedia and its public services to improve their applications and research approaches. The DBpedia‐Live framework improves DBpedia further by timely synchronizing it with Wikipedia, which is relevant for many use cases requiring up‐to‐date information.

Originality/value

The new DBpedia‐Live framework adds new features to the old DBpedia‐Live framework, e.g. abstract extraction, ontology changes, and changesets publication.

Details

Program, vol. 46 no. 2
Type: Research Article
ISSN: 0033-0337

Keywords

Article
Publication date: 28 December 2023

Na Xu, Yanxiang Liang, Chaoran Guo, Bo Meng, Xueqing Zhou, Yuting Hu and Bo Zhang

Safety management plays an important part in coal mine construction. Due to complex data, the implementation of the construction safety knowledge scattered in standards poses a…

Abstract

Purpose

Safety management plays an important part in coal mine construction. Due to complex data, the implementation of the construction safety knowledge scattered in standards poses a challenge. This paper aims to develop a knowledge extraction model to automatically and efficiently extract domain knowledge from unstructured texts.

Design/methodology/approach

Bidirectional encoder representations from transformers (BERT)-bidirectional long short-term memory (BiLSTM)-conditional random field (CRF) method based on a pre-training language model was applied to carry out knowledge entity recognition in the field of coal mine construction safety in this paper. Firstly, 80 safety standards for coal mine construction were collected, sorted out and marked as a descriptive corpus. Then, the BERT pre-training language model was used to obtain dynamic word vectors. Finally, the BiLSTM-CRF model concluded the entity’s optimal tag sequence.

Findings

Accordingly, 11,933 entities and 2,051 relationships in the standard specifications texts of this paper were identified and a language model suitable for coal mine construction safety management was proposed. The experiments showed that F1 values were all above 60% in nine types of entities such as security management. F1 value of this model was more than 60% for entity extraction. The model identified and extracted entities more accurately than conventional methods.

Originality/value

This work completed the domain knowledge query and built a Q&A platform via entities and relationships identified by the standard specifications suitable for coal mines. This paper proposed a systematic framework for texts in coal mine construction safety to improve efficiency and accuracy of domain-specific entity extraction. In addition, the pretraining language model was also introduced into the coal mine construction safety to realize dynamic entity recognition, which provides technical support and theoretical reference for the optimization of safety management platforms.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Keywords

Article
Publication date: 12 April 2022

Jun Deng, Chuyi Zhong, Shaodan Sun and Ruan Wang

This paper aims to construct a spatio-temporal emotional framework (STEF) for digital humanities from a quantitative perspective, applying knowledge extraction and mining…

Abstract

Purpose

This paper aims to construct a spatio-temporal emotional framework (STEF) for digital humanities from a quantitative perspective, applying knowledge extraction and mining technology to promote innovation of humanities research paradigm and method.

Design/methodology/approach

The proposed STEF uses methods of information extraction, sentiment analysis and geographic information system to achieve knowledge extraction and mining. STEF integrates time, space and emotional elements to visualize the spatial and temporal evolution of emotions, which thus enriches the analytical paradigm in digital humanities.

Findings

The case study shows that STEF can effectively extract knowledge from unstructured texts in the field of Chinese Qing Dynasty novels. First, STEF introduces the knowledge extraction tools – MARKUS and DocuSky – to profile character entities and perform plots extraction. Second, STEF extracts the characters' emotional evolutionary trajectory from the temporal and spatial perspective. Finally, the study draws a spatio-temporal emotional path figure of the leading characters and integrates the corresponding plots to analyze the causes of emotion fluctuations.

Originality/value

The STEF is constructed based on the “spatio-temporal narrative theory” and “emotional narrative theory”. It is the first framework to integrate elements of time, space and emotion to analyze the emotional evolution trajectories of characters in novels. The execuability and operability of the framework is also verified with a case novel to suggest a new path for quantitative analysis of other novels.

Details

Aslib Journal of Information Management, vol. 74 no. 6
Type: Research Article
ISSN: 2050-3806

Keywords

1 – 10 of over 17000