Search results
1 – 10 of over 11,000
Vishal Kumar and Evelyn Ai Lin Evelyn Teo
Abstract
Purpose
The usability aspect of the construction operations building information exchange (COBie) datasheet has been largely overlooked. Users find it difficult to find relevant data inside COBie and to understand the dependencies between pieces of information. This study is part of a more comprehensive research effort to identify the usability issues associated with COBie and propose solutions to deal with them. This paper aims to discuss the challenges associated with the visualization aspect of COBie and proposes a solution to mitigate them.
Design/methodology/approach
This paper is based on design thinking and waterfall methodology. While the design thinking methodology is used to explore the issues associated with the visualization aspect of COBie, the waterfall methodology is used to develop a working prototype of the visualizer for the COBie datasheet using a spreadsheet format.
Findings
The paper demonstrates that a property graph model based on a node-link diagram can effectively represent the COBie datasheet. This helps store data in a visually connected manner and makes links easier to explore dynamically. Moreover, converting and storing the data in an appropriate database lets users reach data directly rather than navigating multiple workbooks. Such a database can also retain the history of the data inside the COBie datasheet as it develops throughout the project.
Originality/value
This research proposes a novel approach to visualizing the COBie datasheet interactively using the property graph model, a type of node-link diagram. The property graph model helps users see data in a connected way, which is currently missing in the spreadsheet representation of COBie data. Moreover, this research highlights that storing historical changes in COBie data can help in understanding how the data has evolved throughout construction. Additionally, structured storage of the data in relationship form lets users access connected data directly through efficient search.
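The node-link representation the abstract describes can be sketched minimally in plain Python. This is an illustrative property-graph structure with invented COBie-style labels (Facility, Floor, Space, Component), not the authors' actual implementation or schema:

```python
# Minimal property-graph sketch of COBie-style data (illustrative names only).
# Nodes carry a label and properties; edges carry a relationship type, mirroring
# how a node-link diagram connects COBie worksheets instead of spreadsheet rows.

class PropertyGraph:
    def __init__(self):
        self.nodes = {}   # node_id -> {"label": ..., "props": {...}}
        self.edges = []   # (src_id, rel_type, dst_id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def add_edge(self, src, rel_type, dst):
        self.edges.append((src, rel_type, dst))

    def neighbours(self, node_id, rel_type=None):
        return [dst for src, rel, dst in self.edges
                if src == node_id and (rel_type is None or rel == rel_type)]

g = PropertyGraph()
g.add_node("facility-1", "Facility", name="Office Block A")
g.add_node("floor-1", "Floor", name="Level 1")
g.add_node("space-101", "Space", name="Meeting Room 101")
g.add_node("ahu-01", "Component", name="Air Handling Unit 01")
g.add_edge("facility-1", "HAS_FLOOR", "floor-1")
g.add_edge("floor-1", "HAS_SPACE", "space-101")
g.add_edge("space-101", "CONTAINS", "ahu-01")

# Direct traversal replaces cross-referencing multiple spreadsheet workbooks.
print(g.neighbours("space-101", "CONTAINS"))   # -> ['ahu-01']
```

The connected storage is what enables reaching a component directly from its space, rather than looking it up across separate COBie worksheets.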
Paolo Manghi, Claudio Atzori, Michele De Bonis and Alessia Bardi
Abstract
Purpose
Several online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from searching and browsing content to consuming statistics for monitoring and the provision of feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and current problem, existing solutions are dedicated to specific scenarios, operate on flat collections, address only local topology-driven challenges and therefore cannot be re-used in other contexts.
Design/methodology/approach
This work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrarily large information graphs. The paper presents its high-level architecture, describes its implementation as a service used within the OpenAIRE infrastructure system and reports figures from real-case experiments.
Findings
GDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph.
Originality/value
To our knowledge, GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication of graphs while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, national funders and institutions.
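The core deduplication workflow the abstract summarizes, identifying candidate duplicates and merging them into one representative entity, can be sketched as follows. The similarity function, threshold and record names here are illustrative assumptions, not GDup's actual matching algorithms or API:

```python
# Sketch of an entity-deduplication step: pairwise candidate matching followed
# by union-find merging into representative groups. Records and the 0.8
# threshold are invented for illustration.
from difflib import SequenceMatcher

records = {
    "r1": "OpenAIRE Research Graph",
    "r2": "The OpenAIRE research graph",
    "r3": "Microsoft Academic Graph",
}

def similar(a, b, threshold=0.8):
    # String similarity as a stand-in for a real matching strategy.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

# Union-find collapses chains of matches into one representative per entity.
parent = {r: r for r in records}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

ids = list(records)
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if similar(records[ids[i]], records[ids[j]]):
            union(ids[i], ids[j])

groups = {}
for r in records:
    groups.setdefault(find(r), []).append(r)
print(list(groups.values()))
```

A production system would add the pieces the abstract mentions on top of this skeleton: ground-truth management, curator feedback, and a merge policy for the disambiguated output graph.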
Maren Parnas Gulnes, Ahmet Soylu and Dumitru Roman
Abstract
Purpose
Neuroscience data are spread across a variety of sources, typically provisioned through ad hoc and non-standard approaches and formats, and often have no connection to related data sources. This makes it difficult for researchers to understand, integrate and reuse brain-related data. The aim of this study is to show that a graph-based approach offers an effective means of representing, analysing and accessing brain-related data, which are highly interconnected, evolve over time and are often needed in combination.
Design/methodology/approach
The authors present an approach for organising brain-related data in a graph model. The approach is exemplified with a unique data set of quantitative neuroanatomical data about the murine basal ganglia, a group of nuclei in the brain essential for processing information related to movement. Specifically, the murine basal ganglia data set is modelled as a graph, integrated with relevant data from third-party repositories, published through a Web-based user interface and API, and analysed from exploratory and confirmatory perspectives using popular graph algorithms to extract new insights.
Findings
The evaluation of the graph model and the results of the graph data analysis and usability study of the user interface suggest that graph-based data management in the neuroscience domain is a promising approach, since it enables integration of various disparate data sources and improves understanding and usability of data.
Originality/value
The study provides a practical and generic approach for representing, integrating, analysing and provisioning brain-related data and a set of software tools to support the proposed approach.
Francesco Rouhana and Dima Jawad
Abstract
Purpose
This paper aims to present a novel approach for assessing the resilience of transportation road infrastructure against different failure scenarios based on the topological properties of the network. The approach is implemented in the context of developing countries where data scarcity is the norm, taking the capital city of Beirut as a case study.
Design/methodology/approach
The approach is based on graph theory concepts and uses spatial data and an urban network analysis toolbox to estimate resilience under random and rank-ordered failure scenarios. The quantitative approach is applied to statistically model the topological graph properties, centralities and appropriate resilience metrics.
Findings
The research approach provides unique insight into the network configuration in terms of resilience against failures. The road network of Beirut, with an average nodal degree of three, behaves much like a random graph when exposed to failures. The network's topological parameters, connectivity and density indices decline under disruption, revealing a complete dependence on the state of the nodes. The Beirut random network responds similarly to random and targeted removals. The approach also highlights critical network components.
Research limitations/implications
The approach is limited to a specific undirected, weighted graph of Beirut, a context in which the capacity to collect and process the necessary data is limited.
Practical implications
Decision-makers are better able to direct and optimize resources by prioritizing the critical network components, thereby reducing failure-induced downtime in network functionality.
Originality/value
The resilience of the Beirut transportation network is uniquely quantified through graph theory under various node removal modes.
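The node-removal analysis this abstract describes can be illustrated on a toy graph: remove nodes (targeted by degree, or at random) and track how the largest connected component shrinks. The network below is invented for illustration, not the Beirut data:

```python
# Toy illustration of topological resilience under node removal: compare the
# largest connected component after targeted (highest-degree) vs random removal.
import random
from collections import deque

def largest_component(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen, best = set(), 0
    for start in adj:
        if start in removed or start in seen:
            continue
        comp, queue = 0, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            comp += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, comp)
    return best

# Small undirected road-like network as adjacency lists.
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5], 4: [2, 5], 5: [3, 4, 6], 6: [5]}

# Targeted (rank-ordered) removal: drop the highest-degree node first.
hub = max(adj, key=lambda n: len(adj[n]))
after_targeted = largest_component(adj, {hub})

# Random removal: drop one node at random.
random.seed(0)
after_random = largest_component(adj, {random.choice(list(adj))})
print(hub, after_targeted, after_random)
```

Repeating this over many removal fractions yields the decline curves of connectivity that the paper uses as resilience metrics; a network whose targeted and random curves coincide behaves like a random graph, as reported for Beirut.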
Aya Khaled Youssef Sayed Mohamed, Dagmar Auer, Daniel Hofer and Josef Küng
Abstract
Purpose
Data protection requirements have increased heavily owing to rising awareness of data security, legal requirements and technological developments. Today, NoSQL databases are increasingly used in security-critical domains. Current survey works on databases and data security consider authorization and access control only in a very general way and do not address most of today's sophisticated requirements. Accordingly, the purpose of this paper is to discuss authorization and access control for relational and NoSQL database models in detail with respect to requirements and the current state of the art.
Design/methodology/approach
This paper follows a systematic literature review approach to study authorization and access control for different database models. Starting with research on survey works on authorization and access control in databases, the study continues with the identification and definition of advanced authorization and access control requirements, which are generally applicable to any database model. The paper then discusses and compares current database models against these requirements.
Findings
As no survey works so far consider requirements for authorization and access control across different database models, the authors define their own. Furthermore, the authors discuss the current state of the art for the relational, key-value, column-oriented, document-based and graph database models in comparison to the defined requirements.
Originality/value
This paper focuses on authorization and access control for various database models, not concrete products. This paper identifies today’s sophisticated – yet general – requirements from the literature and compares them with research results and access control features of current products for the relational and NoSQL database models.
Milind Tiwari, Jamie Ferrill and Vishal Mehrotra
Abstract
Purpose
This paper advocates the use of graph database platforms to investigate networks of illicit companies identified in money laundering schemes. It explains how to set up the underlying data structure and presents its key application in practice. Grounded in the technology acceptance model (TAM), this paper aims to present key operationalisations and theoretical considerations for effectively driving and facilitating wider adoption among a range of stakeholders focused on anti-money laundering solutions.
Design/methodology/approach
This paper explores the benefits of adopting graph databases and critiques their limitations by drawing on primary data collection processes that have been undertaken to derive a network topology. Such representation on a graph database platform provides the opportunity to uncover hidden relationships critical for combatting illicit activities such as money laundering.
Findings
The move to adopt a graph database for storing information related to corporate entities will aid investigators, journalists and other stakeholders in the identification of hidden links among entities to deter activities of corruption and money laundering.
Research limitations/implications
This paper does not display the nodal data, as it is framed as background on how graph databases can be used in practice.
Originality/value
To the best of the authors’ knowledge, no studies in the past have considered companies from multiple cases in the same graph network and attempted to investigate the links between them. The advocation for such an approach has significant implications for future studies.
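The "hidden links" the abstract refers to are typically paths between entities through shared intermediaries, a query that graph structures make cheap. The entities below are fabricated for illustration; no real case data is shown:

```python
# Illustrative hidden-link query over a company network: find a path between
# two companies via shared intermediaries (directors, shell companies).
from collections import deque

edges = [
    ("Company A", "Director X"), ("Director X", "Company B"),
    ("Company B", "Shell Co"), ("Shell Co", "Company C"),
]
adj = {}
for a, b in edges:   # undirected adjacency
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def path(start, goal):
    """Breadth-first search returning a shortest path, or None."""
    prev, queue, seen = {}, deque([start]), {start}
    while queue:
        u = queue.popleft()
        if u == goal:
            out = [u]
            while u != start:
                u = prev[u]
                out.append(u)
            return out[::-1]
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                prev[v] = u
                queue.append(v)
    return None

print(path("Company A", "Company C"))
```

On a graph database platform the same traversal is a one-line path query, which is why combining companies from multiple cases in one network, as the paper advocates, surfaces links that separate case files keep hidden.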
Sheila Anderson and Tobias Blanke
Abstract
Purpose
The purpose of this paper is to analyse the steps taken to produce new kinds of integrated documentation on the Holocaust in the European Holocaust Research Infrastructure project. The authors present the user investigation methodology as well as the novel data design to support this complex field.
Design/methodology/approach
The paper is based on the scholarly primitives framework. From there, it proceeds with two empirical studies of Holocaust archival research and the implementation steps taken. The paper employs key insights from large-scale technology studies on how to organise such work. In particular, it uses the concepts of socio-technical assemblages and intermediation.
Findings
The paper offers a number of findings. First, from the empirical studies, it presents how Holocaust researchers and archivists perceive the way they currently do research in archives. It then presents how the intermediation and digital transformation of such research can be enabled without violating its foundations. The second major insight is the technical research into how graph databases can integrate heterogeneous research collections and the analysis opportunities this opens up.
Originality/value
The paper builds on the authors' existing work but takes it forward into real-life historical research on archives. It demonstrates that the theoretical foundations of primitives are fit for purpose. The paper presents a completely new approach to (re)organising archives as research infrastructures and offers a flexible way of implementing it. Alongside these major insights, it presents a range of new solutions for arranging the socio-technical assemblages of research infrastructures.
Daniel Hofer, Markus Jäger, Aya Khaled Youssef Sayed Mohamed and Josef Küng
Abstract
Purpose
Log files are a crucial source of information for computer security experts. The time domain is especially important because, in most cases, timestamps are the only links between events caused by attackers, faulty systems or simple errors and their corresponding entries in log files. With the idea of storing and analyzing this log information in graph databases, we need a suitable model to store and connect timestamps and their events. This paper aims to find and evaluate different approaches to storing timestamps in graph databases, along with their individual benefits and drawbacks.
Design/methodology/approach
We analyse three different approaches to representing and storing timestamp information in graph databases. To check the models, we set up four typical questions that are important for log file analysis and tested them against each model. During the evaluation, we used performance and other properties as metrics for how suitable each model is for representing the log files' timestamp information. In the last part, we try to improve one promising-looking model.
Findings
We come to the conclusion that the simplest model, using the fewest graph database-specific concepts, also yields the simplest and fastest queries.
Research limitations/implications
This research is limited in that only one graph database was studied; moreover, improvements to the query engine might change future results.
Originality/value
In the study, we addressed the issue of storing timestamps in graph databases in a meaningful, practical and efficient way. The results can be used as a pattern for similar scenarios and applications.
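Two of the kinds of timestamp models such a comparison can cover are sketched below in plain Python structures. These are generic patterns from graph data modelling (a timestamp property versus a shared time tree), not necessarily the three models the paper evaluates:

```python
# Two common ways to model timestamps for graph storage (illustrative schemas):
# (a) the timestamp as a plain property on the event node,
# (b) a "time tree" where events link to shared year/month/day nodes.
from datetime import datetime

ts = datetime(2021, 3, 14, 9, 26, 53)

# (a) Property model: fewest graph-specific concepts; range queries are
# ordinary comparisons on the property value.
event_property_model = {"label": "LogEvent", "message": "login failed",
                        "timestamp": ts.isoformat()}

# (b) Time-tree model: the event points at a day node, which points at month
# and year nodes, so events sharing a day are one hop apart.
time_tree = {
    "event": {"label": "LogEvent", "message": "login failed"},
    "OCCURRED_ON": {"day": ts.day,
                    "IN_MONTH": {"month": ts.month,
                                 "IN_YEAR": {"year": ts.year}}},
}

# A range query on model (a): ISO 8601 strings sort chronologically, so a
# lexicographic comparison is a valid time filter.
events = [event_property_model]
hits = [e for e in events if e["timestamp"] >= "2021-01-01T00:00:00"]
print(len(hits))
```

The paper's conclusion, that the simplest model yields the simplest and fastest queries, corresponds to preferring pattern (a) unless the extra hops of a time tree buy something the workload actually needs.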
Edoardo Ramalli and Barbara Pernici
Abstract
Purpose
Experiments are the backbone of the development process of data-driven predictive models for scientific applications. The quality of the experiments directly impacts the model performance. Uncertainty inherently affects experiment measurements and is often missing in the available data sets due to its estimation cost. For similar reasons, experiments are very few compared to other data sources. Discarding experiments based on the missing uncertainty values would preclude the development of predictive models. Data profiling techniques are fundamental to assess data quality, but some data quality dimensions are challenging to evaluate without knowing the uncertainty. In this context, this paper aims to predict the missing uncertainty of the experiments.
Design/methodology/approach
This work presents a methodology to forecast the experiments’ missing uncertainty, given a data set and its ontological description. The approach is based on knowledge graph embeddings and leverages the task of link prediction over a knowledge graph representation of the experiments database. The validity of the methodology is first tested in multiple conditions using synthetic data and then applied to a large data set of experiments in the chemical kinetic domain as a case study.
Findings
The analysis results of different test case scenarios suggest that knowledge graph embedding can be used to predict the missing uncertainty of the experiments when there is a hidden relationship between the experiment metadata and the uncertainty values. The link prediction task is also resilient to random noise in the relationship. The knowledge graph embedding outperforms the baseline results if the uncertainty depends upon multiple metadata.
Originality/value
The employment of knowledge graph embedding to predict the missing experimental uncertainty is a novel alternative to the current and more costly techniques in the literature. Such contribution permits a better data quality profiling of scientific repositories and improves the development process of data-driven models based on scientific experiments.
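Link prediction with knowledge graph embeddings, the task this abstract leverages, can be illustrated with a TransE-style score: a triple (head, relation, tail) is plausible when head + relation lands near tail in embedding space. TransE is one common embedding model chosen here for illustration; the paper does not prescribe this exact model, and the tiny hand-picked vectors below are fabricated:

```python
# Minimal TransE-style link prediction sketch: rank candidate tails for the
# (hypothetical) triple (experiment_1, HAS_UNCERTAINTY, ?).
import math

emb = {  # toy 3-dimensional embeddings, hand-picked for illustration
    "experiment_1": [0.9, 0.1, 0.0],
    "uncertainty_low": [1.0, 1.0, 0.0],
    "uncertainty_high": [0.0, 0.0, 1.0],
    "HAS_UNCERTAINTY": [0.1, 0.9, 0.0],
}

def score(head, relation, tail):
    # Negative L2 distance of (head + relation) from tail:
    # higher score means a more plausible link.
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in
                          zip(emb[head], emb[relation], emb[tail])))

candidates = ["uncertainty_low", "uncertainty_high"]
best = max(candidates, key=lambda t: score("experiment_1", "HAS_UNCERTAINTY", t))
print(best)
```

In the paper's setting, the embeddings are learned from the experiment metadata graph, and the top-ranked uncertainty candidate fills in the missing value.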
Eldo C. Koenig and Terry J. Frederick
Abstract
Select properties are presented for a graph model of a general automaton consisting of a processor, environment and time graph. The properties, stated in the form of theorems and corollaries, deal with connectedness, the numbers of points and lines, and indegree and outdegree, as the model relates to the automaton's sets, functions and characteristics. The properties are illustrated with an example automaton.
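The degree and point/line counting properties the abstract mentions can be checked concretely on a small digraph. The transition set below is an invented example, not Koenig and Frederick's formal model:

```python
# Small illustrative digraph of state transitions, checking the kinds of
# properties the theorems concern: point count, line count, indegree, outdegree.
transitions = [("s0", "s1"), ("s1", "s2"), ("s2", "s0"), ("s1", "s0")]

points = {p for edge in transitions for p in edge}   # vertices ("points")
lines = len(transitions)                             # directed edges ("lines")

indegree = {p: sum(1 for _, dst in transitions if dst == p) for p in points}
outdegree = {p: sum(1 for src, _ in transitions if src == p) for p in points}

# In any digraph, total indegree = total outdegree = number of lines.
assert sum(indegree.values()) == sum(outdegree.values()) == lines
print(len(points), lines, indegree["s0"], outdegree["s1"])
```

Theorems of the kind the paper states are exactly such invariants, derived once for the general processor/environment/time-graph model rather than verified per example.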