Search results
1 – 10 of 401
Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Anne H.H. Ngu and Yihong Zhang
Abstract
Purpose
This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.
Design/methodology/approach
In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query stream to augment the ontology of the existing KB (i.e. Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed.
Findings
Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension have also been discussed.
Originality/value
To revolutionize our modern society by using the wisdom of Big Data, a considerable number of KBs have been constructed to feed the massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research efforts have been devoted to both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.
Zhuoxuan Jiang, Chunyan Miao and Xiaoming Li
Abstract
Purpose
Recent years have witnessed the rapid development of massive open online courses (MOOCs). With more and more courses being produced by instructors and attended by learners all over the world, educational resources are being aggregated on an unprecedented scale. The educational resources include videos, subtitles, lecture notes, quizzes, etc., on the teaching side, and forum contents, Wiki, log of learning behavior, log of homework, etc., on the learning side. However, the data are both unstructured and diverse. To facilitate knowledge management and mining on MOOCs, extracting keywords from the resources is important. This paper aims to adapt the state-of-the-art techniques to MOOC settings and evaluate their effectiveness on real data. In terms of practice, this paper also tries to answer, for the first time, two questions: to what extent can MOOC resources support keyword extraction models, and how much human effort is required to make the models work well.
Design/methodology/approach
Based on which side generates the data, i.e. instructors or learners, the data are classified into teaching resources and learning resources, respectively. The approach used on teaching resources is based on machine learning models with labels, while the approach used on learning resources is based on a graph model without labels.
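As an illustration of the label-free side of this approach, the sketch below ranks words with a TextRank-style co-occurrence graph. The abstract does not name the graph model used, so the technique, the function name rank_keywords and its parameters are assumptions made for illustration only, not the authors' method.

```python
from collections import defaultdict


def rank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words by a PageRank-style score over a co-occurrence graph."""
    # Build an undirected graph: two different words are linked when
    # they occur within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate the PageRank update a fixed number of times.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy token stream (hypothetical, standing in for forum text).
ranked = rank_keywords(["forum", "quiz", "forum", "video", "forum", "wiki"], window=1)
```

In this toy input, "forum" co-occurs with every other word, so it collects the highest score. No labeled data is needed, which is the property the abstract attributes to the approach used on learning resources.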
Findings
From the teaching resources, the methods used by the authors can accurately extract keywords with only 10 per cent labeled data. The authors find that resources of various forms, e.g. subtitles and PPTs, should be considered separately because they differ in modeling ability. From the learning resources, the keywords extracted from MOOC forums are not as domain-specific as those extracted from teaching resources, but they can reflect the topics that are actively discussed in forums. Instructors can then use these topics as feedback. The authors implement two applications with the extracted keywords: generating concept maps and generating learning paths. The visual demos show they have the potential to improve learning efficiency when they are integrated into a real MOOC platform.
Research limitations/implications
Conducting keyword extraction on MOOC resources is quite difficult because teaching resources are hard to obtain owing to copyright restrictions. Also, getting labeled data is difficult because it usually requires expertise in the corresponding domain.
Practical implications
The experimental results indicate that MOOC resources are good enough for building keyword extraction models, and that an acceptable balance between human effort and model accuracy can be achieved.
Originality/value
This paper presents a pioneering study on keyword extraction from MOOC resources and obtains some new findings.
Renato Ribeiro Nogueira Ferraz, Marcus Vinícius Cesso da Silva, Renan Antônio da Silva and Luc Quoniam
Abstract
Purpose
The purpose of this paper is to present the use of a free code computational tool, Patent2net, in the search of patents for the implementation of distance learning aimed at Continuing Medical Education.
Design/methodology/approach
This technical report is based on the extraction and organization of patent information, made available in the form of graphs and dynamic tables, and on information in other patents on the subject held in the Espacenet database.
Findings
As a result, it was possible to identify a Chinese patent, free for reproduction in Brazil, which describes an e-learning system that simulates 3D scenarios for training nursing teams.
Research limitations/implications
The paper used a single patent database, albeit one containing more than 100 million documents.
Practical implications
The selected patent can contribute to the improvement of care and behavioral techniques of the health professionals.
Social implications
The training of health professionals can improve the public and supplementary health systems.
Originality/value
This is the first paper in which the technometric analysis of patents was used to solve a problem regarding the training of health professionals.
Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal
Abstract
Purpose
Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving the dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.
Design/methodology/approach
In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.
Findings
The study proposes several proof-of-concept pipelines including experimental evaluation, which showcase the usage of the framework.
Originality/value
To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework aims to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.
Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen
Abstract
Purpose
This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.
Design/methodology/approach
This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar 1869–1918, which consists of ca. 1.45 million autosegmented articles. The article search database had two versions of each article with different quality OCR. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences of the otherwise identical articles.
Findings
The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.
Originality/value
To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.
Julián Monsalve-Pulido, Jose Aguilar, Edwin Montoya and Camilo Salazar
Abstract
This article proposes an architecture of an intelligent and autonomous recommendation system to be applied to any virtual learning environment, with the objective of efficiently recommending digital resources. The paper presents the architectural details of the intelligent and autonomous dimensions of the recommendation system. The paper describes a hybrid recommendation model that orchestrates and manages the available information and the specific recommendation needs, in order to determine the recommendation algorithms to be used. The hybrid model allows the integration of approaches based on collaborative filtering, content or knowledge. In the architecture, information is extracted from four sources: the context, the students, the course and the digital resources, identifying variables such as individual learning styles, socioeconomic information, connection characteristics, location, etc. Tests were carried out for the creation of an academic course, in order to analyse the intelligent and autonomous capabilities of the architecture.
Tore Ståhl, Eero Sormunen and Marita Mäkinen
Abstract
Purpose
The internet and search engines dominate people’s information acquisition, especially among the younger generations. Given this trend, this study aims to explore whether information and communication technology (ICT) practices, internet reliance and views of knowledge and knowing, i.e. epistemic beliefs, interact with each other. Everyday practices and conceptions among beginning undergraduate students are studied as a challenge for higher education.
Design/methodology/approach
The study builds upon survey-based quantitative data operationalising students’ epistemic beliefs, their internet reliance and their ICT practices. The survey items were used to compute subscales describing these traits, and the connections were explored using correlation analysis.
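The subscale-and-correlation procedure described above can be sketched as follows. This is a minimal illustration only: the item names, the Likert values and the choice of Pearson's coefficient are assumptions, since the abstract specifies neither the survey items nor the correlation measure.

```python
from statistics import mean


def subscale(responses, items):
    """Average one respondent's answers over the items of a subscale."""
    return mean(responses[item] for item in items)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical Likert answers (1-5) from four respondents; not study data.
respondents = [
    {"rely1": 5, "rely2": 4, "certain1": 4, "certain2": 5},
    {"rely1": 2, "rely2": 1, "certain1": 2, "certain2": 1},
    {"rely1": 4, "rely2": 4, "certain1": 3, "certain2": 4},
    {"rely1": 1, "rely2": 2, "certain1": 2, "certain2": 2},
]

# One subscale score per respondent and trait, then correlate the traits.
reliance = [subscale(r, ["rely1", "rely2"]) for r in respondents]
certainty = [subscale(r, ["certain1", "certain2"]) for r in respondents]
r_value = pearson(reliance, certainty)
```

A positive r_value on real data would correspond to the kind of association the study reports between internet reliance and beliefs in certain knowledge. The toy numbers here carry no evidential weight.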
Findings
The results suggest that the more beginning undergraduate students rely on internet-based information, the more they are inclined to epistemic beliefs where knowledge is regarded as certain, unchanging, unambiguous and as being handed down by some authority.
Research limitations/implications
The approach used in the study applies to the sample used, and further research is required to test the applicability of the approach on larger samples.
Practical implications
The study highlights the risk of everyday information practices being transferred into the educational context.
Social implications
Ignorance of these changes may pose a risk for knowledge building at different educational levels and, in the longer term, a threat to democracy.
Originality/value
While there is some research on epistemic beliefs in relation to internet-based information, studies approaching the problem through a possible connection between epistemic beliefs and internet reliance are scarce. In addition, this study implies a conceptual bridge between epistemic beliefs and internet reliance via the concept of algorithmic authority.
Abstract
Purpose
This study aims to provide the history and overview of the major categories of physical education (PE) assistance that Japan has provided to other countries by extracting the major categories from the various materials.
Design/methodology/approach
This study is divided into two phases, Phases 1 and 2. Surveys and analyses were further conducted. In Phase 1, a web browser-based survey was conducted to ascertain the major categories of PE assistance that Japan has provided to other countries. The practices and projects investigated were classified inductively, and the major categories were extracted. In Phase 2, a literature review was conducted to organise the history and overview of each category extracted in Phase 1.
Findings
Six major categories were extracted: (1) dispatch of Japan Overseas Cooperation Volunteers engaged in PE assistance, (2) assistance through training for those involved in PE, (3) revision or formulation of a PE curriculum, (4) preparation of textbooks or instructional materials for PE, (5) organising sports events and (6) maintenance of PE equipment and facilities.
Originality/value
Japan has a long history of providing PE assistance to other countries. However, historical materials on the practices and projects of PE are becoming scattered. Little literature addresses this gap, which this study seeks to address. This study can help policy makers in other countries, who can use Japan’s PE assistance practices and policies for reference, to assist them in formulating their own policies.
Abstract
Purpose
This study aimed to find out the web content accessed by university students and to compare the level of interaction with real-life friends and online friends.
Design/methodology/approach
In this study, a quantitative research design was used, and the researcher collected data through the survey method. The population comprises all undergraduate students at the University of the Punjab, Lahore. A sample of 320 students, aged 18 to 22 years and drawn from eight selected departments, was collected through simple random sampling; after extraction, 284 questionnaires were evaluated using the Statistical Package for Social Sciences (SPSS).
Findings
The findings of the study showed that students' preferred activity on the Internet is accessing social networking sites. Additionally, the mobile phone is the most commonly used device among university students to access the Internet. Furthermore, students mostly used Facebook to keep in touch with their old friends and talked about different topics more easily with their online friends than with real-life friends. The study also shows that the results of both hypotheses are significant; therefore, no difference exists regarding time spent on the Internet in real-life friendship patterns and online friendship patterns.
Originality/value
The research examined the difference between online and real-life friendship patterns for two groups of university students: those who spend less time on the Internet and those who spend more.
Asefeh Asemi, Andrea Ko and Mohsen Nowkarizi
Abstract
Purpose
This paper reviews literature on the application of intelligent systems in libraries, with a special focus on ES/AI and robots. Also, it introduces the potential of libraries to use intelligent systems, especially ES/AI and robots.
Design/methodology/approach
Descriptive and content review methods are applied, and the researchers critically reviewed the articles related to library ESs and robots from Web of Science as a general database and Emerald as a specific database in library and information science from 2007 to 2017. Four scopes were considered to classify the articles: technology, service, user and resource. It is found that published research on intelligent systems has contributed to many library purposes, including library technical services (such as the organization, storage and retrieval of information resources), library public services (such as reference services and the information desk) and other purposes.
Findings
A review of the previous studies shows that ESs are a usable intelligent system in library and information science that mimics expert librarians' behavior to support decision making and management. Also, it is shown that the current information systems have a high potential to be improved by integration with AI technologies. In the reviewed research, librarian robots are mostly designed for detecting and replacing books on the shelf. Improving the technology of gripping, localizing and human-robot interaction is the main concern in recent librarian robot research. Our conclusion is that we need to develop research in the area of smart resources.
Originality/value
This study takes a new approach to the literature review in this area. We compared the published papers in the field of ES/AI, robots and libraries from two databases, one general and one specific.