Search results

1 – 10 of over 4000
Article
Publication date: 21 October 2019

Priyadarshini R., Latha Tamilselvan and Rajendran N.

The purpose of this paper is to propose a fourfold semantic similarity that results in more accuracy compared to the existing literature. The change detection in the URL…

Abstract

Purpose

The purpose of this paper is to propose a fourfold semantic similarity that yields higher accuracy than the existing literature. Change detection in the URL and recommendation of source documents are facilitated by a framework in which the fourfold semantic similarity is applied. The latest trends in technology emerge with the continuous growth of resources on the collaborative web. This interactive and collaborative web poses big challenges for recent technologies such as cloud and big data.

Design/methodology/approach

The enormously growing resources need to be accessed in a more efficient manner, and this requires clustering and classification techniques. The resources on the web are described in a more meaningful manner.

Findings

The resources can be described in the form of metadata constituted by the resource description framework (RDF). A fourfold similarity is proposed, in contrast to the threefold similarity proposed in the existing literature. The fourfold similarity includes semantic annotation based on named entity recognition in the user interface, domain-based concept matching with improvised score-based classification based on ontology, a sequence-based word sensing algorithm and RDF-based updating of triples. All these similarity measures are aggregated across components such as the semantic user interface, semantic clustering, sequence-based classification and the semantic recommendation system with RDF updating in change detection.
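
The abstract does not give the aggregation formula, so the following is only a minimal sketch of how four component scores could be combined into one recommendation score. The component names, weights and example values are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: aggregating four similarity components into one score.
# Names, weights and values are illustrative assumptions, not the authors' code.

def fourfold_similarity(annotation_sim: float,
                        concept_match_sim: float,
                        word_sense_sim: float,
                        rdf_triple_sim: float,
                        weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Combine four component scores (each assumed to lie in [0, 1])
    into a single weighted similarity used for recommendation."""
    scores = (annotation_sim, concept_match_sim, word_sense_sim, rdf_triple_sim)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Example: scoring a candidate source document against a changed virtual document.
print(fourfold_similarity(0.8, 0.6, 0.7, 0.9))  # -> 0.75
```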

Research limitations/implications

The existing work suggests that linking resources semantically increases retrieval and search ability. Previous literature shows that keywords can be used to retrieve linked information from an article and to determine the similarity between documents using semantic analysis.

Practical implications

These traditional systems also suffer from scalability and efficiency issues. The proposed study designs a model that pulls and prioritizes knowledge-based content from the Hadoop distributed framework. This study also proposes a Hadoop-based pruning system and recommendation system.

Social implications

The pruning system gives an alert about the dynamic changes in the article (virtual document). The changes in the document are automatically updated in the RDF document. This helps in semantic matching and retrieval of the most relevant source with the virtual document.

Originality/value

The recommendation and detection of changes in the blogs are performed semantically using N-Triples and automated data structures. The user-focussed and choice-based crawling proposed in this system also assists collaborative filtering, which in turn recommends user-focussed source documents. The entire clustering and retrieval system is deployed on multi-node Hadoop in the Amazon AWS environment, and graphs are plotted and analyzed.

Details

International Journal of Intelligent Unmanned Systems, vol. 7 no. 4
Type: Research Article
ISSN: 2049-6427


Article
Publication date: 3 June 2019

Bilal Hawashin, Shadi Alzubi, Tarek Kanan and Ayman Mansour

This paper aims to propose a new efficient semantic recommender method for Arabic content.

Abstract

Purpose

This paper aims to propose a new efficient semantic recommender method for Arabic content.

Design/methodology/approach

Three semantic similarities were proposed to be integrated with the recommender system to improve its ability to recommend based on the semantic aspect. The proposed similarities are CHI-based semantic similarity, singular value decomposition (SVD)-based semantic similarity and Arabic WordNet-based semantic similarity. These similarities were compared with the existing similarities used by recommender systems from the literature.
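
As an illustration of the SVD-based flavour of semantic similarity mentioned above, the sketch below computes an LSA-style similarity by projecting TF-IDF vectors into a low-rank latent space. The toy English documents, the number of components and the exact pipeline are assumptions; the paper's method targets Arabic text and may differ in detail.

```python
# Hedged LSA-style sketch of an SVD-based semantic similarity; not the authors' pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "user likes arabic poetry and classical literature",
    "recommendation of arabic literature for new readers",
    "football match results and league standings",
    "poetry collections recommended to literature fans",
]

# Term-document matrix, then projection into a low-rank latent semantic space.
tfidf = TfidfVectorizer().fit_transform(docs)
latent = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Item-item semantic similarity is cosine similarity in the latent space;
# a recommender can rank candidate items for a user profile with these scores.
sim = cosine_similarity(latent)
print(sim[0, 1], sim[0, 3])  # literature items should relate more strongly than sim[0, 2]
```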

Findings

Experiments show that the proposed semantic method using CHI-based similarity and SVD-based similarity is more efficient than the existing methods on Arabic text in terms of accuracy and execution time.

Originality/value

Although many previous works proposed recommender system methods for English text, very few concentrated on Arabic text; the field of Arabic recommender systems is largely understudied in the literature. Aside from this, there is a vital need to consider the semantic relationships behind user preferences to improve the accuracy of the recommendations. The contributions of this work are the following. First, as many recommender methods were proposed for English text and have never been tested on Arabic text, this work compares the performance of these widely used methods on Arabic text. Second, it proposes a novel semantic recommender method for Arabic text. As this method uses semantic similarity, three novel base semantic similarities are proposed and evaluated. Third, this work should direct attention to further studies of this understudied topic in the literature.

Article
Publication date: 11 November 2013

Nina Preschitschek, Helen Niemann, Jens Leker and Martin G. Moehrle

The convergence of industries exposes the involved firms to various challenges. In such a setting, a firm's response time becomes key to its future success. Hence

Downloads: 2985

Abstract

Purpose

The convergence of industries exposes the involved firms to various challenges. In such a setting, a firm's response time becomes key to its future success. Hence, different approaches to anticipating convergence have been developed in the recent past. So far, IPC co-classification patent analyses in particular have been successfully applied in different industry settings to anticipate convergence at a broader industry/technology level. Here, the aim is to develop a concept for anticipating convergence even in small samples, while simultaneously providing more detailed information on its origin and direction.

Design/methodology/approach

The authors assigned 326 US patents on phytosterols to four different technological fields and measured the semantic similarity of the patents from the different technological fields. Finally, they compared these results to those of an IPC co-classification analysis of the same patent sample.
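
The abstract does not specify how the cross-field semantic similarity was computed; the sketch below shows one plausible reading, the mean pairwise cosine similarity between TF-IDF vectors of patents from two technological fields, which could be tracked per time period. The example texts and field assignments are invented, not the study's data.

```python
# Hedged sketch of a cross-field similarity signal: mean pairwise cosine similarity
# between patent texts of two technological fields. Texts are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

food_patents = [
    "phytosterol enriched margarine for lowering serum cholesterol",
    "food composition containing plant sterols for dietary use",
]
pharma_patents = [
    "phytosterol formulation for cholesterol reduction therapy",
    "pharmaceutical composition of plant sterol esters",
]

vec = TfidfVectorizer().fit(food_patents + pharma_patents)
cross_sim = cosine_similarity(vec.transform(food_patents), vec.transform(pharma_patents))
# A rising mean cross-field similarity over successive time periods would be read
# as an indicator of convergence between the two fields.
print(cross_sim.mean())
```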

Findings

An increasing semantic similarity of food and pharmaceutical patents and personal care and pharmaceutical patents over time could be regarded as an indicator of convergence. The IPC co-classification analyses proved to be unsuitable for finding evidence for convergence here.

Originality/value

Semantic analyses provide the opportunity to analyze convergence processes in greater detail, even if only limited data are available. However, IPC co-classification analyses are still relevant in analyzing large amounts of data. The appropriateness of the semantic similarity approach requires verification, e.g. by applying it to other convergence settings.

Article
Publication date: 27 November 2018

Rajat Kumar Mudgal, Rajdeep Niyogi, Alfredo Milani and Valentina Franzoni

The purpose of this paper is to propose and experiment a framework for analysing the tweets to find the basis of popularity of a person and extract the reasons supporting…

Abstract

Purpose

The purpose of this paper is to propose and experiment with a framework for analysing tweets to find the basis of a person's popularity and to extract the reasons supporting that popularity. Although the problem of analysing tweets to detect popular events and trends has recently attracted extensive research efforts, not much emphasis has been given to finding out the reasons behind a person's popularity based on tweets.

Design/methodology/approach

In this paper, the authors introduce a framework to find out the reasons behind the popularity of a person based on the analysis of events and the evaluation of a web-based semantic set similarity measure applied to tweets. The methodology uses the semantic similarity measure to group similar tweets into events. Although the tweets may not contain identical hashtags, they can refer to the same topic with equivalent or related terminology. A special data structure maintains event information, related keywords and statistics to extract the reasons supporting popularity.
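
A minimal sketch of how tweets might be grouped into events with a set similarity over keywords follows. The word_similarity stub stands in for the paper's web-based measure (which relies on search-engine statistics); the threshold and example keyword sets are assumptions.

```python
def word_similarity(a: str, b: str) -> float:
    """Placeholder for the paper's web-based word similarity (search-engine statistics)."""
    return 1.0 if a == b else 0.0

def set_similarity(kw1: set, kw2: set) -> float:
    """Average best match of kw1's keywords within kw2 (asymmetric, like the paper's measure)."""
    if not kw1 or not kw2:
        return 0.0
    return sum(max(word_similarity(w, v) for v in kw2) for w in kw1) / len(kw1)

def group_into_events(tweet_keywords, threshold=0.5):
    events = []  # each event is a list of keyword sets; events[i][0] acts as its seed
    for kws in tweet_keywords:
        for event in events:
            if set_similarity(kws, event[0]) >= threshold:
                event.append(kws)
                break
        else:
            events.append([kws])
    return events

tweets = [{"election", "vote"}, {"vote", "poll"}, {"goal", "match"}]
print(len(group_into_events(tweets)))  # -> 2 events with the placeholder similarity
```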

Findings

An implementation of the algorithms has been tested on a data set of 218,490 tweets from five different countries for popularity detection and reason extraction. The experimental results are quite encouraging and consistent in determining the reasons behind popularity. Because the web-based semantic similarity measure is based on statistics extracted from search engines, it can dynamically adapt the similarity values to variations in word correlation driven by current social trends.

Originality/value

To the best of the authors’ knowledge, the proposed method for finding the reason of popularity in short messages is original. The semantic set similarity presented in the paper is an original asymmetric variant of a similarity scheme developed in the context of semantic image recognition.

Details

International Journal of Web Information Systems, vol. 14 no. 4
Type: Research Article
ISSN: 1744-0084


Article
Publication date: 11 July 2019

M. Priya and Aswani Kumar Ch.

The purpose of this paper is to merge the ontologies that remove the redundancy and improve the storage efficiency. The count of ontologies developed in the past few eras…

Abstract

Purpose

The purpose of this paper is to merge ontologies in a way that removes redundancy and improves storage efficiency. The number of ontologies developed in the past few years is noticeably high. With the availability of these ontologies, the needed information can be obtained smoothly, but the presence of comparably varied ontologies raises the problems of rework and data merging. Assessment of the existing ontologies exposes superfluous information; hence, ontology merging is the only solution. Existing ontology merging methods focus only on highly relevant classes and instances, whereas somewhat relevant classes and instances are simply dropped, even though they may also be useful or relevant to the given domain. In this paper, we propose a new method called hybrid semantic similarity measure (HSSM)-based ontology merging, using formal concept analysis (FCA) and a semantic similarity measure.

Design/methodology/approach

The HSSM categorizes relevancy into three classes, namely highly relevant, moderately relevant and least relevant classes and instances. To achieve high efficiency in merging, HSSM performs both the FCA part and the semantic similarity part.
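
The paper's exact scoring is not given in the abstract; the sketch below only illustrates the three-way relevancy split, blending a hypothetical FCA-derived similarity with a semantic similarity and thresholding the result. The weighting and cut-off values are invented for illustration.

```python
# Hedged sketch of the three-way relevancy split described for HSSM.
# The weighting and thresholds are illustrative assumptions, not the paper's values.

def hybrid_score(fca_sim: float, semantic_sim: float, alpha: float = 0.5) -> float:
    """Blend an FCA-derived similarity with a semantic similarity."""
    return alpha * fca_sim + (1 - alpha) * semantic_sim

def categorize(score: float) -> str:
    if score >= 0.7:
        return "highly relevant"
    if score >= 0.4:
        return "moderately relevant"
    return "least relevant"

print(categorize(hybrid_score(0.9, 0.8)))  # -> "highly relevant"
print(categorize(hybrid_score(0.5, 0.4)))  # -> "moderately relevant"
```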

Findings

The experimental results show that HSSM produces better results than existing algorithms in terms of similarity distance and time. An inconsistency check can also be performed for the dissimilar classes and instances within an ontology. The output ontology will have a set of highly relevant and moderately relevant classes and instances, as well as a few least relevant classes and instances, which eventually leads to an exhaustive ontology for the particular domain.

Practical implications

In this paper, an HSSM-based method is proposed and used to merge academic social network ontologies; it is observed to be a powerful methodology compared with former studies. The HSSM approach can be applied to various domain ontologies and may offer researchers a novel perspective.

Originality/value

To the best of the authors' knowledge, HSSM has not been applied to ontology merging in any former study.

Details

Library Hi Tech, vol. 38 no. 2
Type: Research Article
ISSN: 0737-8831


Article
Publication date: 21 September 2012

Jorge Martinez‐Gil and José F. Aldana‐Montes

Semantic similarity measures are very important in many computer‐related fields. Previous works on applications such as data integration, query expansion, tag refactoring…

Abstract

Purpose

Semantic similarity measures are very important in many computer-related fields. Previous works on applications such as data integration, query expansion, tag refactoring or text clustering have used semantic similarity measures. Despite the usefulness of semantic similarity measures in these applications, the problem of measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.

Design/methodology/approach

In this article, the authors propose an optimization environment to improve existing techniques that use the notion of co‐occurrence and the information available on the web to measure similarity between terms.
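
As a rough illustration of a co-occurrence, web-count-based similarity of the kind being optimized here, the sketch below computes a normalized-web-distance-style score and converts it to a similarity. The page_counts function is a hypothetical stub with made-up numbers, and total_pages is an assumed estimate of the indexed web size; a real system would query one or more web search engines and combine their counts, as the paper suggests.

```python
import math

def page_counts(query: str) -> int:
    """Hypothetical stub; a real system would combine hit counts from several search engines."""
    fake_index = {"car": 9_000_000, "automobile": 4_000_000, "car automobile": 3_500_000}
    return fake_index.get(query, 1)

def web_similarity(x: str, y: str, total_pages: float = 1e10) -> float:
    """Normalized web distance turned into a similarity score in [0, 1]."""
    fx, fy, fxy = page_counts(x), page_counts(y), page_counts(f"{x} {y}")
    num = max(math.log(fx), math.log(fy)) - math.log(fxy)
    den = math.log(total_pages) - min(math.log(fx), math.log(fy))
    return max(0.0, 1.0 - num / den)

print(web_similarity("car", "automobile"))
```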

Findings

The experimental results using the Miller and Charles and Gracia and Mena benchmark datasets show that the proposed approach is able to outperform classic probabilistic web‐based algorithms by a wide margin.

Originality/value

This paper presents two main contributions. The authors propose a novel technique that beats classic probabilistic techniques for measuring semantic similarity between terms. This new technique consists of using not just a single search engine for computing web page counts, but a smart combination of several popular web search engines. The approach is evaluated on the Miller and Charles and Gracia and Mena benchmark datasets and compared with existing probabilistic web extraction techniques.

Details

Online Information Review, vol. 36 no. 5
Type: Research Article
ISSN: 1468-4527


Article
Publication date: 1 July 2014

Janina Fengel

The purpose of this paper is to propose a solution for automating the task of matching business process models and search for correspondences with regard to the model…

Abstract

Purpose

The purpose of this paper is to propose a solution for automating the task of matching business process models and searching for correspondences with regard to the models' semantics, thus improving the efficiency of such work.

Design/methodology/approach

A method is proposed based on combining several semantic technologies. The research follows a design-science-oriented approach in that a method, together with its supporting artifacts, has been engineered. Its application allows for reusing legacy models and automatically determining semantic similarity.

Findings

The method has been applied, and the first findings suggest the effectiveness of the approach. The results show its feasibility and significance. The suggested heuristic computation of semantic correspondences between semantically heterogeneous business process models is flexible and can support domain users.

Research limitations/implications

Even though the solution offered is directly usable, the full complexity of natural language as found in model element labels cannot yet be completely resolved. Further research could contribute to the optimization and refinement of the automatic matching and linguistic procedures. Nevertheless, an open research question could be addressed.

Practical implications

The method presented is aimed at adding to the methods in the field of business process management and could extend the possibilities of automating support for business analysis.

Originality/value

The suggested combination of semantic technologies is innovative and addresses the aspect of semantic heterogeneity in a holistic manner, which is novel to the field.

Article
Publication date: 3 December 2018

Cong-Phuoc Phan, Hong-Quang Nguyen and Tan-Tai Nguyen

Large collections of patent documents disclosing novel, non-obvious technologies are publicly available and beneficial to academia and industries. To maximally exploit its…

Abstract

Purpose

Large collections of patent documents disclosing novel, non-obvious technologies are publicly available and beneficial to academia and industry. To exploit their potential fully, searching these patent documents has become an increasingly important topic. Although much research has processed large collections, few studies have attempted to integrate both patent classifications and specifications when analyzing user queries. Consequently, queries are often insufficiently analyzed, which limits the accuracy of search results. This paper aims to address this limitation by exploiting semantic relationships between patent contents and their classification.

Design/methodology/approach

The contributions are fourfold. First, the authors enhance the similarity measurement between two short sentences, making it 20 per cent more accurate. Second, the Graph-embedded Tree ontology is enriched by integrating both patent documents and the classification scheme. Third, the ontology does not rely on rule-based methods or text matching; instead, a heuristic meaning comparison is applied to extract semantic relationships between concepts. Finally, the patent search approach uses the ontology effectively, with the results sorted based on their most common order.

Findings

The experiment on searching 600 patent documents in the field of logistics yields a 15 per cent improvement in F-measure when compared with traditional approaches.

Research limitations/implications

The research, however, still requires improvement: the terms and phrases extracted as nouns and noun phrases sometimes make little sense and thus might not yield high accuracy. The large collection of extracted relationships could be further optimized for conciseness. In addition, parallel processing such as MapReduce could be used to improve search processing performance.

Practical implications

The experimental results could help scientists and technologists search for novel, non-obvious technologies in patents.

Social implications

High-quality patent search results will reduce patent infringement.

Originality/value

The proposed ontology is semantically enriched by integrating both patent documents and their classification. This ontology facilitates the analysis of the user queries for enhancing the accuracy of the patent search results.

Details

International Journal of Web Information Systems, vol. 15 no. 3
Type: Research Article
ISSN: 1744-0084


Article
Publication date: 1 March 2013

Wenyu Chen, Zhongquan Zhang, Tao Xiang and Ru Zeng

The purpose of this paper is to obtain more accurate matching between the request and the release of web service.

Downloads: 216

Abstract

Purpose

The purpose of this paper is to obtain more accurate matching between the request and the release of web service.

Design/methodology/approach

This paper adopts the Levenshtein distance algorithm to calculate the name similarity between the published service and the requested service, employs the cosine measure to compute the text similarity, and uses semantic distance to compute the input-output similarity; it then filters out low-similarity and poorly reputed services to construct the candidate service set.
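
A minimal sketch of this multi-level matching idea, assuming a normalized Levenshtein similarity for service names and a bag-of-words cosine for descriptions, follows. The thresholds, toy services and the omission of the input-output and reputation levels are simplifications for illustration, not the paper's algorithm.

```python
from collections import Counter
import math

def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance normalized into a [0, 1] similarity."""
    m, n = len(a), len(b)
    prev_row = list(range(n + 1))
    for i in range(1, m + 1):
        cur_row = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur_row[j] = min(prev_row[j] + 1,        # deletion
                             cur_row[j - 1] + 1,     # insertion
                             prev_row[j - 1] + cost) # substitution
        prev_row = cur_row
    return 1 - prev_row[n] / max(m, n, 1)

def text_cosine(t1: str, t2: str) -> float:
    """Bag-of-words cosine similarity between two short descriptions."""
    v1, v2 = Counter(t1.split()), Counter(t2.split())
    dot = sum(v1[w] * v2[w] for w in v1)
    n1 = math.sqrt(sum(c * c for c in v1.values()))
    n2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

request = {"name": "getWeatherForecast", "text": "return weather forecast for a city"}
service = {"name": "getWeather", "text": "provides the weather forecast of a given city"}

name_sim = levenshtein_similarity(request["name"], service["name"])
text_sim = text_cosine(request["text"], service["text"])
if name_sim > 0.5 and text_sim > 0.3:   # illustrative filter thresholds
    print("keep candidate", round(name_sim, 2), round(text_sim, 2))
```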

Findings

A qualitative and quantitative analysis of the scheme is given in this paper. The experimental results show that the multi-level matching filtering algorithm can significantly improve the recall and precision of web service discovery.

Originality/value

This paper proposes a similarity‐based filtering algorithm for multi‐level matching.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 32 no. 2
Type: Research Article
ISSN: 0332-1649


Article
Publication date: 21 May 2018

Dongmei Han, Wen Wang, Suyuan Luo, Weiguo Fan and Songxin Wang

This paper aims to apply vector space model (VSM)-PCR model to compute the similarity of Fault zone ontology semantics, which verified the feasibility and effectiveness of…

Abstract

Purpose

This paper aims to apply the vector space model (VSM)-PCR model to compute the semantic similarity of fault zone ontologies, verifying the feasibility and effectiveness of applying the VSM-PCR method to the uncertainty mapping of ontologies.

Design/methodology/approach

The authors first define the concept of an uncertainty ontology and then propose the ontology mapping method. The proposed method fully considers the properties of the ontology when measuring the similarity of concepts. It expands the single VSM of concept meaning or instance set into a three-dimensional "meaning, properties, instance" VSM and uses membership degree or correlation to express the level of uncertainty.
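
A minimal sketch of the three-dimensional weighted combination described above follows: one cosine similarity per dimension ("meaning", "properties", "instance"), combined by weights. The vectors and weights are invented; the paper additionally uses membership degrees or correlations, which are not modeled here.

```python
# Hedged sketch of a three-dimensional weighted VSM similarity; illustrative only.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def vsm_pcr_similarity(c1, c2, weights=(0.4, 0.3, 0.3)):
    """Combine per-dimension cosine similarities into one concept-mapping score."""
    dims = ("meaning", "properties", "instance")
    return sum(w * cosine(c1[d], c2[d]) for w, d in zip(weights, dims))

concept_a = {"meaning": [1, 0, 1], "properties": [1, 1, 0], "instance": [0, 1, 1]}
concept_b = {"meaning": [1, 1, 1], "properties": [1, 0, 0], "instance": [0, 1, 0]}
print(round(vsm_pcr_similarity(concept_a, concept_b), 3))
```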

Findings

The method provides better accuracy, which verifies the feasibility and effectiveness of the VSM-PCR method in treating the uncertainty mapping of ontologies.

Research limitations/implications

Future work will focus on exploring the similarity measures and combination methods in every dimension.

Originality/value

This paper presents an uncertainty mapping method for ontology concepts based on a three-dimensional combination-weighted VSM, namely, VSM-PCR. It expands the single VSM of concept meaning or instance set into the three-dimensional "meaning, properties, instance" VSM and uses membership degree or correlation to express the degree of uncertainty. The authors finally provide an example to verify the feasibility and effectiveness of the VSM-PCR method in treating the uncertainty mapping of ontologies.

Details

Information Discovery and Delivery, vol. 46 no. 2
Type: Research Article
ISSN: 2398-6247

