Search results
11–20 of 971
Samira Khodabandehlou, S. Alireza Hashemi Golpayegani and Mahmoud Zivari Rahman
Abstract
Purpose
Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity, scalability and interest drift that affect their performance. Despite the efforts made to solve these problems, there is still no RS that can solve or reduce all the problems simultaneously. Therefore, the purpose of this study is to provide an effective and comprehensive RS to solve or reduce all of the above issues, which uses a combination of basic customer information as well as big data techniques.
Design/methodology/approach
The most important steps in the proposed RS are: (1) collecting demographic and behavioral data of customers from an e-clothing store; (2) assessing customer personality traits; (3) creating a new user-item matrix based on customer/user interest; (4) calculating the similarity between customers with efficient k-nearest neighbor (EKNN) algorithm based on locality-sensitive hashing (LSH) approach and (5) defining a new similarity function based on a combination of personality traits, demographic characteristics and time-based purchasing behavior that are the key incentives for customers' purchases.
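Step (4) above, approximate nearest-neighbour search via locality-sensitive hashing, can be sketched as follows. This is a minimal illustration using random-hyperplane hashing for cosine similarity; the function names, the number of hyperplanes and the interest vectors are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def build_lsh(user_vectors, n_planes=8, seed=0):
    """Cosine-style LSH: hash each user's interest vector with random hyperplanes."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_planes, user_vectors.shape[1]))
    keys = [tuple(row) for row in (user_vectors @ planes.T > 0)]
    buckets = {}
    for user, key in enumerate(keys):
        buckets.setdefault(key, []).append(user)
    return keys, buckets

def knn_via_lsh(user_vectors, user, keys, buckets, k=3):
    """Rank only same-bucket users by cosine similarity instead of scanning everyone."""
    candidates = [u for u in buckets[keys[user]] if u != user]
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return sorted(candidates,
                  key=lambda u: cosine(user_vectors[user], user_vectors[u]),
                  reverse=True)[:k]
```

Because only users that fall into the same bucket are compared, the neighbour search avoids a full scan over all users, which is what makes the approach scale.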
Findings
The proposed method was compared with different baselines (matrix factorization and ensemble). The results showed that the proposed method led to a significant improvement over traditional collaborative filtering (CF) on all evaluation measures and outperformed all baselines by a substantial margin (more than 40%). According to the results, our proposed method, which combines personality information and demographics with tracking of the customer's recent interests and needs via the LSH approach, improves the effectiveness of the recommendations more than the baselines do. This is because the method, using the above information in conjunction with the LSH technique, is more effective and more accurate in addressing the problems of cold start, scalability, sparsity and interest drift.
Research limitations/implications
The research data were limited to only one e-clothing store.
Practical implications
In order to achieve an accurate and real-time RS in e-commerce, it is essential to use a combination of customer information with efficient techniques. In this regard, according to the results of the research, the use of personality traits and demographic characteristics lead to a more accurate knowledge of customers' interests and thus better identification of similar customers. Therefore, this information should be considered as a solution to reduce the problems of cold start and sparsity. Also, a better judgment can be made about customers' interests by considering their recent purchases; therefore, in order to solve the problems of interest drifts, different weights should be assigned to purchases and launch time of products/items at different times (the more recent, the more weight). Finally, the LSH technique is used to increase the RS scalability in e-commerce. In total, a combination of personality traits, demographics and customer purchasing behavior over time with the LSH technique should be used to achieve an ideal RS. Using the RS proposed in this research, it is possible to create a comfortable and enjoyable shopping experience for customers by providing real-time recommendations that match customers' preferences and can result in an increase in the profitability of e-shops.
Originality/value
In this study, by considering a combination of personality traits, demographic characteristics and time-based purchasing behavior of customers along with the LSH technique, we were able for the first time to simultaneously solve the basic problems of CF, namely cold start, scalability, sparsity and interest drift, which led to a significant decrease in recommendation errors and an increase in the accuracy of CF. The average error of the recommendations provided to users based on the proposed model is only about 13%, and the accuracy and compliance of these recommendations with the interests of customers is about 92%. In addition, a 40% difference between the accuracy of the proposed method and the traditional CF method has been observed. This level of accuracy in RSs is very significant and special, and is certainly welcomed by e-business owners. It is also a new scientific finding that is very useful for programmers, users and researchers. In general, the main contributions of this research are: 1) proposing an accurate RS using personality traits, demographic characteristics and time-based purchasing behavior; 2) proposing an effective and comprehensive RS for a “clothing” online store; 3) improving the RS performance by solving the cold start issue using personality traits and demographic characteristics; 4) improving the scalability issue in the RS through efficient k-nearest neighbors; 5) mitigating the sparsity issue by using personality traits and demographic characteristics and also by densifying the user-item matrix and 6) improving the RS accuracy by solving the interest drift issue through developing a time-based user-item matrix.
Narasimhulu K, Meena Abarna KT and Sivakumar B
Abstract
Purpose
The purpose of the paper is to study multiple viewpoints which are required to access the more informative similarity features among the tweets documents, which is useful for achieving the robust tweets data clustering results.
Design/methodology/approach
Let “N” be the number of tweets documents for topics extraction. In the initial pre-processing step, unwanted texts, punctuation and other symbols are removed, and tokenization and stemming are performed. Bag-of-features are determined for the tweets; the tweets are then modelled with the obtained bag-of-features during topics extraction. Approximate topic features are extracted for every tweet document. These sets of topic features of the N documents are treated as multi-viewpoints. The key idea of the proposed work is to use multi-viewpoints in the similarity features computation. As an illustration, consider the multi-viewpoint-based cosine similarity computation of five tweets documents (here N = 5), with the corresponding documents defined in projected space with five viewpoints, say v1, v2, v3, v4 and v5. For example, the similarity features between two documents (viewpoints v1 and v2) are computed with respect to the other three viewpoints (v3, v4 and v5), unlike the single viewpoint in the traditional cosine metric.
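The multi-viewpoint idea can be sketched in a few lines. This is an illustration only; the function name and the toy document vectors are assumptions, and in the paper the similarity features are computed over extracted topic features rather than raw vectors:

```python
import numpy as np

def multi_viewpoint_similarity(docs, i, j):
    """Average similarity of documents i and j as seen from every other
    document h: both vectors are re-expressed relative to viewpoint h
    before the cosine is taken (traditional cosine uses the origin only)."""
    sims = []
    for h in range(len(docs)):
        if h in (i, j):
            continue
        a, b = docs[i] - docs[h], docs[j] - docs[h]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sims.append(float(a @ b / denom) if denom else 0.0)
    return sum(sims) / len(sims)
```

Two identical documents score 1.0 from every viewpoint, while dissimilar documents are penalized from multiple directions rather than just one.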
Findings
The approach is applied to healthcare problems with tweets data. Topic models play a crucial role in the classification of health-related tweets by finding topics (or health clusters) instead of computing term frequency–inverse document frequency (TF–IDF) for unlabelled tweets.
Originality/value
Topic models play a crucial role in the classification of health-related tweets by finding topics (or health clusters) instead of computing TF-IDF for unlabelled tweets.
Adamu Garba, Shah Khalid, Irfan Ullah, Shah Khusro and Diyawu Mumin
Abstract
Purpose
There have been many challenges in crawling the deep web by search engines due to their proprietary nature or dynamic content. Distributed information retrieval (DIR) tries to solve these problems by providing a unified searchable interface to these databases. Since a DIR system must search across many databases, selecting a specific database to search against the user query is challenging. The challenge can be addressed if the users' past queries are considered when selecting collections to search, in combination with word embedding techniques. Combining these would help the best-performing collection selection method speed up the retrieval performance of DIR solutions.
Design/methodology/approach
The authors propose a collection selection model based on word embedding using Word2Vec approach that learns the similarity between the current and past queries. They used the cosine and transformed cosine similarity models in computing the similarities among queries. The experiment is conducted using three standard TREC testbeds created for federated search.
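The core of such a scheme, embedding queries and comparing them with cosine similarity, might be sketched as follows. The hand-set toy vectors below stand in for trained Word2Vec embeddings (an assumption for illustration; in practice the vectors would come from a trained model such as gensim's Word2Vec):

```python
import numpy as np

# Toy embedding table standing in for trained Word2Vec vectors.
EMBED = {
    "deep":   np.array([0.9, 0.1]),
    "web":    np.array([0.8, 0.2]),
    "search": np.array([0.7, 0.3]),
    "pizza":  np.array([0.0, 1.0]),
}

def query_vector(query):
    """Embed a query as the mean of its word vectors (OOV words skipped)."""
    vecs = [EMBED[w] for w in query.lower().split() if w in EMBED]
    return np.mean(vecs, axis=0)

def query_similarity(q1, q2):
    """Cosine similarity between embedded queries, as used to match a new
    query against the past queries routed to each collection."""
    a, b = query_vector(q1), query_vector(q2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A past query embedded close to the new query would then vote for the collections it was previously routed to.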
Findings
The results show significant improvements over the baseline models.
Originality/value
Although lexical matching models for collection selection using similarity based on past queries exist, to the best of our knowledge, the proposed work is the first of its kind that uses word embedding for collection selection by learning from past queries.
Mohamed Haddache, Allel Hadjali and Hamid Azzoune
Abstract
Purpose
The study of skyline queries has received considerable attention from several database researchers since the end of the 2000s. Skyline queries are an appropriate tool that can help users make intelligent decisions in the presence of multidimensional data when different, and often contradictory, criteria are to be taken into account. Based on the concept of Pareto dominance, the skyline process extracts the most interesting (not dominated in the sense of Pareto) objects from a set of data. Skyline computation methods often lead to a set with a large size, which is less informative for end users and not easy to exploit. The purpose of this paper is to tackle this problem, known as the large-size skyline problem, and propose a solution to deal with it by applying an appropriate refining process.
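Pareto dominance and the skyline extraction it induces can be sketched in a few lines (minimizing on every criterion is an illustrative convention, and the function names are assumptions, not the paper's implementation):

```python
def dominates(a, b):
    """a Pareto-dominates b if a is at least as good on every criterion
    and strictly better on at least one (lower values assumed better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    """Keep exactly the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For hotels described as (price, distance) pairs, skyline([(50, 5), (60, 3), (40, 9), (55, 6)]) drops only (55, 6), which is beaten on both criteria by (50, 5); the three survivors form the skyline, and it is this set that can grow too large and call for the refinement the paper proposes.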
Design/methodology/approach
The problem of skyline refinement is formalized in the fuzzy formal concept analysis setting. Then, an ideal fuzzy formal concept is computed in the sense of some particular defined criteria. By leveraging the elements of this ideal concept, one can reduce the size of the computed skyline.
Findings
An appropriate and rational solution is discussed for the problem of interest. Then, a tool named SkyRef is developed. Extensive experiments are conducted with this tool on both synthetic and real datasets.
Research limitations/implications
The authors have conducted experiments on synthetic and some real datasets to show the effectiveness of the proposed approaches. However, thorough experiments on large-scale real datasets are highly desirable to show the behavior of the tool with respect to the performance and time execution criteria.
Practical implications
The developed tool, SkyRef, can have applications in many domains that require decision-making and personalized recommendation and where the size of the skyline has to be reduced. In particular, SkyRef can be used in several real-world applications in areas such as economics, security, medicine and services.
Social implications
This work can be applied in all domains that require decision-making, such as hotel finders, restaurant recommenders, recruitment of candidates, etc.
Originality/value
This study bridges two research fields: artificial intelligence (i.e. formal concept analysis) and databases (i.e. skyline queries). The key elements of the solution proposed for the skyline refinement problem are borrowed from fuzzy formal concept analysis, which makes the solution clearer and more rational, semantically speaking. On the other hand, this study opens the door to using formal concept analysis and its extensions to solve other issues related to skyline queries, such as relaxation.
V. Senthil Kumaran and R. Latha
Abstract
Purpose
The purpose of this paper is to provide adaptive access to learning resources in the digital library.
Design/methodology/approach
A novel method using ontology-based multi-attribute collaborative filtering is proposed. Digital libraries are fully automated; all resources are in digital form, and access to the available information is provided electronically to remote as well as conventional users. To satisfy users' information needs, an enormous amount of newly created information is published electronically in digital libraries. While search applications are improving, it is still difficult for the majority of users to find relevant information. For better service, the framework should also be able to adapt queries to search domains and target learners.
Findings
This paper improves the accuracy and efficiency of predicting and recommending personalized learning resources in digital libraries. To facilitate a personalized digital learning environment, the authors propose a novel method using ontology-supported collaborative filtering (CF) recommendation system. The objective is to provide adaptive access to learning resources in the digital library. The proposed model is based on user-based CF which suggests learning resources for students based on their course registration, preferences for topics and digital libraries. Using ontological framework knowledge for semantic similarity and considering multiple attributes apart from learners' preferences for the learning resources improve the accuracy of the proposed model.
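The user-based CF prediction step can be sketched as a similarity-weighted average of neighbours' ratings. This is a minimal illustration: in the paper the similarity weights would come from the ontology-supported multi-attribute computation, whereas here they are simply supplied as a dictionary (an assumption):

```python
def predict_rating(ratings, sims, target, item):
    """Predict the target user's rating for an item as the
    similarity-weighted average of other users' ratings for it."""
    num = den = 0.0
    for user, their_ratings in ratings.items():
        if user == target or item not in their_ratings:
            continue
        w = sims.get((target, user), 0.0)   # semantic similarity weight
        num += w * their_ratings[item]
        den += abs(w)
    return num / den if den else None       # None: no rated neighbours
```

The recommendation engine would then rank unseen learning resources by this predicted score and return the top N.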
Research limitations/implications
The results of this work majorly rely on the developed ontology. More experiments are to be conducted with other domain ontologies.
Practical implications
The proposed approach is integrated into Nucleus, a Learning Management System (https://nucleus.amcspsgtech.in). The results are of interest to learners, academicians, researchers and developers of digital libraries. This work also provides insights into the ontology for e-learning to improve personalized learning environments.
Originality/value
This paper computes learner similarity and learning resources similarity based on ontological knowledge, feedback and ratings on the learning resources. The predictions for the target learner are calculated and top N learning resources are generated by the recommendation engine using CF.
Priyadarshini R., Latha Tamilselvan and Rajendran N.
Abstract
Purpose
The purpose of this paper is to propose a fourfold semantic similarity that results in more accuracy compared to the existing literature. Change detection in the URL and recommendation of the source documents are facilitated by means of a framework in which the fourfold semantic similarity is applied. The latest trends in technology emerge with the continuous growth of resources on the collaborative web. This interactive and collaborative web presents big challenges for recent technologies like cloud and big data.
Design/methodology/approach
The enormously growing resources should be accessed in a more efficient manner, which requires clustering and classification techniques. The resources on the web need to be described in a more meaningful manner.
Findings
The resources can be described in the form of metadata constituted by the resource description framework (RDF). A fourfold similarity is proposed, compared to the threefold similarity proposed in the existing literature. The fourfold similarity includes semantic annotation based on named entity recognition in the user interface, domain-based concept matching, improvised score-based classification of domain-based concept matching based on ontology, a sequence-based word sensing algorithm and RDF-based updating of triples. All these similarity measures are aggregated, spanning the components semantic user interface, semantic clustering, sequence-based classification and a semantic recommendation system with RDF updating in change detection.
Research limitations/implications
The existing work suggests that linking resources semantically increases the retrieving and searching ability. Previous literature shows that keywords can be used to retrieve linked information from the article to determine the similarity between the documents using semantic analysis.
Practical implications
These traditional systems also suffer from scalability and efficiency issues. The proposed study designs a model that pulls and prioritizes knowledge-based content from the Hadoop distributed framework. This study also proposes a Hadoop-based pruning system and recommendation system.
Social implications
The pruning system gives an alert about the dynamic changes in the article (virtual document). The changes in the document are automatically updated in the RDF document. This helps in semantic matching and retrieval of the most relevant source with the virtual document.
Originality/value
The recommendation and detection of changes in the blogs are performed semantically using n-triples and automated data structures. The user-focussed and choice-based crawling proposed in this system also assists collaborative filtering. Subsequently, collaborative filtering recommends the user-focussed source documents. The entire clustering and retrieval system is deployed on multi-node Hadoop in the Amazon AWS environment, and graphs are plotted and analyzed.
Qiongwei Ye and Baojun Ma
Abstract
Internet + and Electronic Business in China is a comprehensive resource that provides insight and analysis into E-commerce in China and how it has revolutionized and continues to revolutionize business and society. Split into four distinct sections, the book first lays out the theoretical foundations and fundamental concepts of E-Business before moving on to look at internet+ innovation models and their applications in different industries such as agriculture, finance and commerce. The book then provides a comprehensive analysis of E-business platforms and their applications in China before finishing with four comprehensive case studies of major E-business projects, providing readers with successful examples of implementing E-Business entrepreneurship projects.
Hsin-Chang Yang, Chung-Hong Lee and Wen-Sheng Liao
Abstract
Purpose
Measuring the similarity between two resources is considered difficult due to a lack of reliable information and a wide variety of available information regarding the resources. Many approaches have been devised to tackle such difficulty. Although content-based approaches, which adopted resource-related data in comparing resources, played a major role in similarity measurement methodology, the lack of semantic insight on the data may leave these approaches imperfect. The purpose of this paper is to incorporate data semantics into the measuring process.
Design/methodology/approach
The emergence of linked open data (LOD) provides a practical solution to this difficulty. Common methodologies consuming LOD have mainly focused on using link attributes that provide some sort of semantic relation between data. In this work, methods for measuring semantic distances between resources using information gathered from LOD were proposed. These distances were then applied to music recommendation, focusing on the effect of various weight and level settings.
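As one concrete, much simplified, example of a distance over linked data, the direct variant of linked data semantic distance counts the direct links between two resources: the more links, the closer they are. The sketch below is an illustration over RDF-style triples and omits the indirect-link and normalization terms of the published LDSD measure, so it should be read as an assumption-laden stand-in rather than the paper's method:

```python
def direct_ldsd(triples, a, b):
    """Simplified direct linked-data semantic distance over (s, p, o)
    triples: count links running between a and b in either direction,
    then map the count into a distance in (0, 1]."""
    links = sum(1 for s, _, o in triples if {s, o} == {a, b})
    return 1.0 / (1.0 + links)
```

Resources connected by several direct links (e.g. an artist and their band) end up closer than resources sharing only one link, which is the intuition exploited for recommendation.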
Findings
This work conducted experiments using the MusicBrainz dataset and evaluated the proposed schemes for the plausibility of LOD for music recommendation. The experimental results show that the proposed methods effectively improved the classic approaches for both the linked data semantic distance (LDSD) and PathSim methods, by 47% and 9.7%, respectively.
Originality/value
The main contribution of this work is to develop novel schemes for incorporating knowledge from LOD. Two types of knowledge, namely attribute and path, were derived and incorporated into similarity measurements. Such knowledge may reflect the relationships between resources in a semantic manner since the links in LOD carry much semantic information regarding connecting resources.
Seyed Mahmood Zanjirchi and Najmeh Faregh
Abstract
Purpose
The ISM technique is one of the tools of interest in soft operations research. The soft nature of this technique has made the use of indeterminacy theories inevitable. The present research attempts to develop the ISM technique and MICMAC analysis in a neutrosophic space, due to the complexity and uncertainty of the decision-making environment.
Design/methodology/approach
In this study, single-valued triangular neutrosophic numbers are used to develop neutrosophic ISM (NISM) and neutrosophic MICMAC (NMICMAC). First, the general algorithm of NISM and NMICMAC is provided. Then, a complete description of the NISM steps, including level value determination, the factor leveling algorithm and the NISM digraph algorithm, is presented. Finally, the NMICMAC steps are described.
Findings
An illustrative example, the supplier selection problem, is given to verify the effectiveness of the proposed method; in the discussion section, different aspects of the NISM are compared with and analyzed against the previous methods.
Originality/value
In this study, NISM and NMICMAC are presented for the first time, so that each pairwise comparison judgment is provided as a single-valued triangular neutrosophic number. The development of the model continues with neutrosophic numbers until the final stages of the calculations, and only in the final stage are the results presented in crisp form. In addition, not only are the factors of the process leveled, but at each level the factors are ranked and their importance is determined.
Yajun Leng, Qing Lu and Changyong Liang
Abstract
Purpose
Collaborative recommender systems play a crucial role in providing personalized services to online consumers. Most online shopping sites and many other applications now use collaborative recommender systems. The measurement of similarity plays a fundamental role in collaborative recommender systems. Some of the most well-known similarity measures are Pearson's correlation coefficient, cosine similarity and mean squared differences. However, due to data sparsity, the accuracy of the above similarity measures decreases, which leads to the formation of inaccurate neighborhoods and thereby results in poor recommendations. The purpose of this paper is to propose a novel similarity measure based on a potential field.
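The three classic measures named above can be sketched over rating dictionaries as follows (a minimal, co-rated-items-only illustration of the standard formulas, not the paper's potential-field measure):

```python
import math

def pearson(a, b):
    """Pearson correlation over the items two users co-rated."""
    common = [i for i in a if i in b]
    if not common:
        return 0.0
    ma = sum(a[i] for i in common) / len(common)
    mb = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    den = (math.sqrt(sum((a[i] - ma) ** 2 for i in common)) *
           math.sqrt(sum((b[i] - mb) ** 2 for i in common)))
    return num / den if den else 0.0

def cosine(a, b):
    """Cosine similarity over co-rated items."""
    common = [i for i in a if i in b]
    num = sum(a[i] * b[i] for i in common)
    den = (math.sqrt(sum(a[i] ** 2 for i in common)) *
           math.sqrt(sum(b[i] ** 2 for i in common)))
    return num / den if den else 0.0

def mean_squared_difference(a, b):
    """Mean squared difference over co-rated items (lower = more similar)."""
    common = [i for i in a if i in b]
    return sum((a[i] - b[i]) ** 2 for i in common) / len(common) if common else 0.0
```

When two users share only a handful of co-rated items, all three measures become unreliable; that sparsity problem is what motivates the denser user-user potential matrix proposed in the paper.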
Design/methodology/approach
The proposed approach constructs a dense matrix: user-user potential matrix, and uses this matrix to compute potential similarities between users. Then the potential similarities are modified based on users’ preliminary neighborhoods, and k users with the highest modified similarity values are selected as the active user’s nearest neighbors. Compared to the rating matrix, the potential matrix is much denser. Thus, the sparsity problem can be efficiently alleviated. The similarity modification scheme considers the number of common neighbors of two users, which can further improve the accuracy of similarity computation.
Findings
Experimental results show that the proposed approach is superior to the traditional similarity measures.
Originality/value
The research highlights of this paper are as follows: the authors construct a dense matrix, the user-user potential matrix, and use this matrix to compute potential similarities between users; the potential similarities are modified based on users' preliminary neighborhoods, and the k users with the highest modified similarity values are selected as the active user's nearest neighbors; and the proposed approach performs better than the traditional similarity measures. The manuscript will be of particular interest to scientists engaged in recommender systems research as well as to readers interested in solutions to related complex practical engineering problems.