Search results
Abstract
Purpose
The purpose of this paper is to apply link prediction to community mining and to clarify the role of link prediction in improving the performance of social network analysis.
Design/methodology/approach
First, the 2009 version of the Enron e-mail data set provided by Carnegie Mellon University was selected as the research object, and bibliometric and citation analysis methods were used to compare the differences between existing studies. Second, a link model was used to analyze the relationships among people, taking into account the impact of various interpersonal relationships. Finally, matrix factorization was used to obtain the characteristics of the research object and thereby predict unknown relationships.
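The abstract does not specify the factorization used, so the following is only a minimal sketch of the general idea, assuming a plain low-rank factorization trained by stochastic gradient descent on a single relation. The toy matrix, the function names (`factorize`, `squared_error`) and all hyperparameters are hypothetical and are not taken from the paper.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_error(adj, U, V):
    """Sum of squared reconstruction errors over off-diagonal entries."""
    n = len(adj)
    return sum((adj[i][j] - dot(U[i], V[j])) ** 2
               for i in range(n) for j in range(n) if i != j)

def factorize(adj, rank=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Approximate adj[i][j] by dot(U[i], V[j]) using SGD (assumed setup)."""
    rng = random.Random(seed)
    n = len(adj)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue  # self-links carry no information here
                err = adj[i][j] - dot(U[i], V[j])
                for k in range(rank):
                    u, v = U[i][k], V[j][k]
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V

# Hypothetical toy e-mail graph: persons 0-2 write to each other, as do 3-4.
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
U, V = factorize(adj)
score_0_1 = dot(U[0], V[1])  # predicted strength of the link 0 -> 1
```

In a multi-relation setting such as the one the paper describes, one such matrix per relationship type would presumably be combined before or during factorization; how the authors do this is not stated in the abstract.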
Findings
The experimental results show that the prediction results obtained by considering multiple relationships are more accurate than those obtained by considering only one relationship.
Research limitations/implications
Because the data set contains a limited number of objects, the link prediction method has not been tested on large-scale data, and its validity and correctness need to be verified further with larger data sets. Algorithm complexity and optimization, including sparse matrix storage, also need further study. In addition, the accuracy of the link prediction method declines considerably when the data are extremely sparse, so sparse data warrant further research and discussion.
Practical implications
This research focuses on link prediction in social network analysis. Traditional prediction models analyze and predict on the basis of a single relationship between objects, but in real life the relationships between people are diverse and interact with one another. In this study, therefore, a graph model is used to express different kinds of relations, and the influence between different kinds of relations is considered in the prediction process. Experiments on real data sets support the effectiveness and accuracy of this method. Link prediction, as an important part of social network analysis, is also significant for other applications of social network analysis. By applying link prediction to community mining, this study attempts to show that link prediction helps to improve the performance of social network analysis.
Originality/value
This study combines several methods, including link prediction, data mining, literature analysis and citation analysis. The research direction is relatively new, and the experimental results have a certain degree of credibility, so they offer reference value for subsequent related research.
Junsheng Zhang, Yunchuan Sun and Changqing Yao
Abstract
Purpose
This paper aims to semantically link the scientific research events implied by scientific and technical literature to support information analysis and information service applications. Literature research is an important way to acquire the scientific and technical information needed for research, development and innovation in science and technology. Acquiring accurate, timely, concise and comprehensive information from the large-scale and fast-growing literature is difficult but urgently required, especially in the big data era. Existing literature-based information retrieval systems focus on basic data organization and are far from meeting the needs of information analytics. It has become urgent to organize and analyze the scientific research events related to scientific and technical literature in order to forecast development trends in science and technology.
Design/methodology/approach
Scientific literature such as a paper or a patent is represented as a scientific research event, which contains elements including when, where, who, what, how and why. Metadata of literature is used to formulate scientific research events that are implied in introduction and related work sections of literature. Named entities and research objects such as methods, materials and algorithms can be extracted from texts of literature by using text analysis. The authors semantically link scientific research events, entities and objects, and then, they construct the event space for supporting scientific and technical information analysis.
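The event representation described above could be sketched as a small data structure. This is an illustrative assumption only: the class `ResearchEvent`, its field types and the rule "link two events when they share a participant" are hypothetical and are not the authors' model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResearchEvent:
    """One paper or patent viewed as an event (hypothetical schema)."""
    what: str          # title or topic of the work
    when: str          # publication year
    where: str         # venue or institution
    who: frozenset     # authors or inventors
    how: str = ""      # methods, materials, algorithms
    why: str = ""      # stated motivation

def link_events(events):
    """Return (event, event, shared people) triples for events sharing a participant."""
    links = []
    for i, a in enumerate(events):
        for b in events[i + 1:]:
            shared = a.who & b.who
            if shared:
                links.append((a.what, b.what, sorted(shared)))
    return links

# Hypothetical sample events.
events = [
    ResearchEvent("Paper A", "2015", "Venue X", frozenset({"Author 1", "Author 2"})),
    ResearchEvent("Paper B", "2016", "Venue Y", frozenset({"Author 2", "Author 3"})),
    ResearchEvent("Patent C", "2017", "Office Z", frozenset({"Author 4"})),
]
event_links = link_events(events)
```

Shared authorship is just one possible semantic relation; links through shared methods, materials or venues would follow the same pattern on the other fields.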
Findings
This paper represents scientific literature as events, which are coarse-grained units compared with the entities and relations used in current information organization. Events and the semantic relations among them together form a semantic link network, which could support event-centric information browsing, search and recommendation.
Research limitations/implications
The proposed model is theoretical, and its efficiency still needs to be verified in experimental applications. The evaluation and applications of the semantic link network of scientific research events are issues for further research.
Originality/value
This paper regards scientific literature as scientific research events and proposes an approach to semantically link events into a network with multiple types of entities and relations. To meet the needs of scientific and technical information analysis, scientific research events are organized into event cubes distributed in a three-dimensional space, which makes the information easier to understand and visualize.
Alesia Zuccala, Mike Thelwall, Charles Oppenheim and Rajveen Dhiensa
Abstract
Purpose
The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH).
Design/methodology/approach
The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data.
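The kind of summary report described above (listing and counting the domain names in a list of URLs) can be approximated in a few lines. This is a minimal sketch using only the Python standard library; it is not LexiURL's implementation, and the function name `count_domains` and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def count_domains(urls):
    """Count full host names and top-level domains in a list of URLs."""
    domains, tlds = Counter(), Counter()
    for url in urls:
        host = urlparse(url).hostname  # None when the string has no network location
        if not host:
            continue
        domains[host] += 1
        tlds[host.rsplit(".", 1)[-1]] += 1
    return domains, tlds

# Hypothetical list of pages linking to a digital library.
urls = [
    "http://library.example.edu/a",
    "https://library.example.edu/b",
    "http://health.example.org/",
    "not a url",
]
domains, tlds = count_domains(urls)
```

Counting by top-level domain is what would surface a pattern such as a concentration of links from .edu sites.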
Findings
Link data, when analysed together with user transaction log files (i.e. Web referring domains) can provide insights into who is using a digital library and when, and who could be using the digital library if they are “surfing” a particular part of the Web; in this case any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks and search engine searches, or non‐electronic sources.
Originality/value
A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real‐time user transactions, while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.
Andrea Hrckova, Robert Moro, Ivan Srba and Maria Bielikova
Abstract
Purpose
Partisan news media, which often publish extremely biased, one-sided or even false news, are gaining popularity world-wide and represent a major societal issue. Because the number of such media is growing, automatic detection approaches are in high demand. Automatic detection relies on various indicators (e.g. content characteristics) to identify new partisan media candidates and to predict their level of partisanship. The aim of the research is to investigate in greater depth whether hyperlinks are appropriate indicators for improving automatic detection of partisan news media.
Design/methodology/approach
The authors used hyperlink network analysis to study the hyperlinks of partisan and mainstream media. The dataset comprised the hyperlinks of 18 mainstream media and 15 partisan media in Slovakia and the Czech Republic. More than 171 million domain pairs of inbound and outbound hyperlinks of the selected online news media were collected with the Ahrefs tool, then analyzed and visualized with Gephi software. Additionally, 300 articles covering COVID-19 from both types of media were selected for content analysis of hyperlinks to verify the reliability of the quantitative analysis and to provide a more detailed analysis.
Findings
The authors conclude that hyperlinks are reliable indicators of media affinity and that linking patterns could contribute to partisan news detection. In particular, incoming links with the dofollow attribute are reliable indicators of the type of media, as partisan media rarely receive dofollow links from mainstream media. Outgoing links are less reliable indicators, because mainstream and partisan media link to mainstream sources in similar ways.
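One way to turn this finding into a feature is to compute, for each news site, the share of its incoming links that are dofollow links from known mainstream sources. The sketch below is an assumption about how such a feature might look, not the authors' procedure; the function name, the tuple layout and the sample domains are all hypothetical.

```python
def mainstream_dofollow_share(links, mainstream):
    """Share of each target's inbound links that are dofollow links from mainstream sources.

    links: iterable of (source_domain, target_domain, is_dofollow) tuples.
    mainstream: set of domains assumed to be mainstream media.
    """
    totals, hits = {}, {}
    for source, target, is_dofollow in links:
        totals[target] = totals.get(target, 0) + 1
        if is_dofollow and source in mainstream:
            hits[target] = hits.get(target, 0) + 1
    return {target: hits.get(target, 0) / totals[target] for target in totals}

# Hypothetical sample data.
mainstream = {"mainstream-a.example", "mainstream-b.example"}
links = [
    ("mainstream-a.example", "partisan-x.example", False),
    ("blog.example", "partisan-x.example", True),
    ("mainstream-a.example", "mainstream-b.example", True),
    ("blog.example", "mainstream-b.example", True),
]
shares = mainstream_dofollow_share(links, mainstream)
```

A low share would be consistent with the pattern reported for partisan media, although any threshold would have to be chosen empirically.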
Originality/value
In contrast to the extensive research on fake news detection within a piece of text or multimedia content (e.g. news articles, social media posts), the authors shift to characterizing whole news media. The authors also shift geographically from the more heavily researched US-based media to the so far under-researched European context, particularly Central Europe. The results and conclusions can serve as a guide to deriving new features for the automatic detection of possibly partisan news media by means of artificial intelligence (AI).
Peer review
The peer review history for this article is available at the following link: https://publons.com/publon/10.1108/OIR-10-2020-0441.
Esteban Romero‐Frías and Liwen Vaughan
Abstract
Purpose
The paper seeks to extend co‐link analysis to web sites of heterogeneous companies belonging to different industries and countries, and to cluster companies by industries and compare results from different countries.
Design/methodology/approach
Web sites of 255 companies that belong to five stock exchange indexes were included in the study. Data on co‐links pointing to these web sites were gathered using Yahoo!. Co‐link data were analyzed using multidimensional scaling (MDS) to generate MDS maps that would position companies based on their co‐link counts.
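The co-link counts that feed the MDS step can be illustrated as follows: two company sites are co-linked once for every page that links to both. This is a minimal sketch with hypothetical names and data; it covers only the counting, since the MDS itself would normally be done with a statistics package.

```python
from collections import Counter
from itertools import combinations

def colink_counts(outlinks, sites):
    """Count, for each pair of sites, the pages that link to both.

    outlinks: dict mapping a linking page to the sites it links to.
    sites: the company sites under study.
    """
    sites = set(sites)
    counts = Counter()
    for targets in outlinks.values():
        linked = sorted(set(targets) & sites)  # sorted so each pair has one canonical key
        for a, b in combinations(linked, 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical linking pages and company sites.
outlinks = {
    "page1": ["a.example", "b.example"],
    "page2": ["a.example", "b.example", "c.example"],
    "page3": ["c.example", "other.example"],
}
counts = colink_counts(outlinks, ["a.example", "b.example", "c.example"])
```

Higher counts indicate more closely related sites, so the counts would need to be converted to dissimilarities before being passed to an MDS routine.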
Findings
Comparisons of results across different countries and economies showed the following overall pattern: companies whose businesses are information‐based tend to form well‐defined clusters, while companies operating on a more traditional business model tend not to form clear groups. A comparison between the EU zone and the USA suggests that the EU economy is not well integrated yet.
Practical implications
The findings from the study suggest the possibility of using co‐link analysis to distinguish information‐based industries from traditional industries.
Originality/value
The paper extends co‐link analysis from a single industry to heterogeneous industries with global and complex business phenomena.
Masahiro Ito, Kotaro Nakayama, Takahiro Hara and Shojiro Nishio
Abstract
Purpose
Recently, several studies have shown the importance and effectiveness of Wikipedia Mining. One popular research area within Wikipedia Mining focuses on semantic relatedness measurement, and research in this area has shown that Wikipedia can be used for this purpose. However, previous methods face two problems: accuracy and scalability. To solve these problems, this paper proposes an efficient semantic relatedness measurement method that leverages global statistical information from Wikipedia. Furthermore, a new test collection based on Wikipedia concepts is constructed for evaluating semantic relatedness measurement methods.
Design/methodology/approach
The authors' approach leverages global statistical information of the whole Wikipedia to compute semantic relatedness among concepts (disambiguated terms) by analyzing co‐occurrences of link pairs in all Wikipedia articles. In Wikipedia, an article represents a concept and a link to another article represents a semantic relation between these two concepts. Thus, the co‐occurrence of a link pair indicates the relatedness of a concept pair. Furthermore, the authors propose an integration method with tfidf as an improved method to additionally leverage local information in an article. In addition, to construct a new test collection, the authors select a large number of concepts from Wikipedia. The relatedness of these concepts is judged by human test subjects.
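The co-occurrence idea can be illustrated with a small sketch: two concepts are treated as related to the extent that links to both appear in the same articles. The abstract does not give the authors' scoring formula, so the Jaccard ratio used here is an assumption chosen for simplicity, and the function names and sample articles are hypothetical.

```python
def link_index(articles):
    """Map each linked concept to the set of articles that contain a link to it."""
    index = {}
    for title, links in articles.items():
        for concept in set(links):
            index.setdefault(concept, set()).add(title)
    return index

def relatedness(index, a, b):
    """Jaccard ratio of the article sets linking to a and b (assumed measure)."""
    in_a, in_b = index.get(a, set()), index.get(b, set())
    union = in_a | in_b
    return len(in_a & in_b) / len(union) if union else 0.0

# Hypothetical articles and the concepts they link to.
articles = {
    "Article 1": ["Python", "Java"],
    "Article 2": ["Python", "Java", "Coffee"],
    "Article 3": ["Coffee"],
}
index = link_index(articles)
python_java = relatedness(index, "Python", "Java")
java_coffee = relatedness(index, "Java", "Coffee")
```

Because the index is built in one pass over all articles, each relatedness query afterwards only touches two sets, which reflects the low calculation cost that a global co-occurrence approach aims for.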
Findings
An experiment was conducted to evaluate the calculation cost and accuracy of each method. The experimental results show that the calculation cost of this approach is very low compared to one of the previous methods and that the approach is more accurate than all previous methods for computing semantic relatedness.
Originality/value
This is the first proposal of co‐occurrence analysis of Wikipedia links for semantic relatedness measurement. The authors show that this approach measures semantic relatedness among concepts effectively in terms of calculation cost and accuracy. The findings may be useful to researchers who are interested in knowledge extraction, as well as to ontology researchers.
Abstract
Purpose
To construct web visibility profiles of news web sites by examining hyperlinks pointing to the sites.
Design/methodology/approach
National newspapers from the USA (USA Today), Canada (The Globe and Mail), China (People's Daily) and Hong Kong (Sing Tao Daily) were selected for the study. A total of 1,859 links pointing to the four news sites were manually classified according to four aspects: language, country, type of site, and reason or purpose for linking.
Findings
A comparison of the four news sites provided useful information on their web visibility. The Globe and Mail seemed to have a larger international reach than USA Today. Neither newspaper web site attracted links from China or from pages in the Chinese language. Outside China, People's Daily, an official Chinese Government newspaper, is not as visible as Hong Kong based Sing Tao Daily. USA Today and The Globe and Mail were used more for news citing or reprinting purposes while People's Daily seemed to be used more as a research resource.
Research limitations/implications
Link analysis like this provides us with only an indirect view of the online readership and the methodology has limitations. Not all readers create links to the newspaper sites that they visit. Readers could be led to a news site through other venues including “social bookmarking” services.
Practical implications
The study shows that link analysis is a novel and useful method that journalists and information professionals can use to gauge online readership and potential impact of news sites.
Originality/value
The study presents a novel method that complements, but does not replace, other web user studies such as web server log analysis.