The purpose of this paper is to present a generic pipeline for Resource Description Framework (RDF) graph mining and to provide a comprehensive review of each step in the knowledge discovery from data process. The authors also investigate different approaches and combinations for extracting feature vectors from RDF graphs, which are then used for the clustering and theme identification tasks.
The proposed methodology comprises four steps. First, the authors generate several graph substructures (Walks, Set of Walks, Walks with backward and Set of Walks with backward). Second, the authors build neural language models to extract numerical vectors from the generated sequences by using word embedding techniques (Word2Vec and Doc2Vec) combined with term frequency-inverse document frequency (TF-IDF). Third, the authors use the well-known K-means algorithm to cluster the RDF graph. Finally, the authors extract the most relevant rdf:type values from the grouped vertices and use them as labels that describe the semantics of each theme.
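The first step can be illustrated with a minimal sketch of walk extraction. The abstract does not define the substructures precisely, so this sketch assumes that a "walk" is an alternating sequence of vertices and predicates of fixed depth, and that "with backward" means edges may also be followed from object to subject. The function names, the `^` prefix for reversed predicates and the `ex:` triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_adjacency(triples, with_backward=False):
    """Map each vertex to its outgoing (predicate, neighbour) pairs."""
    adj = defaultdict(list)
    for subj, pred, obj in triples:
        adj[subj].append((pred, obj))
        if with_backward:
            # Assumption: a reversed edge is marked with a "^" prefix so the
            # resulting token differs from the forward predicate.
            adj[obj].append(("^" + pred, subj))
    return adj


def walks_from(vertex, adj, depth):
    """Enumerate all walks of exactly `depth` hops starting at `vertex`.

    Walks that hit a dead end before `depth` hops are dropped.
    """
    walks = [[vertex]]
    for _ in range(depth):
        extended = []
        for walk in walks:
            for pred, nxt in adj.get(walk[-1], []):
                extended.append(walk + [pred, nxt])
        walks = extended
    return walks


# Toy graph (hypothetical data, not taken from AIFB, BGS or Conference).
triples = [
    ("ex:alice", "ex:memberOf", "ex:groupA"),
    ("ex:bob", "ex:memberOf", "ex:groupA"),
    ("ex:alice", "ex:authorOf", "ex:paper1"),
]

forward_walks = walks_from("ex:alice", build_adjacency(triples), depth=1)
backward_walks = walks_from(
    "ex:groupA", build_adjacency(triples, with_backward=True), depth=1
)
```

Each walk is a list of tokens, so the walks of a vertex could be passed as sentences to a word embedding model or treated as the terms of a document for TF-IDF weighting. In this toy graph, `ex:groupA` has no outgoing edges, so it only yields walks when backward edges are allowed.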
The experimental evaluation on state-of-the-art data sets (AIFB, BGS and Conference) shows that the combination of Set of Walks with backward with TF-IDF and Doc2Vec gives excellent results. In fact, the clustering results exceed 97% in purity and 90% in F-measure. For theme identification, the same combination yields purity and F-measure values above 90% on all the considered data sets.
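For readers unfamiliar with the two evaluation criteria, the sketch below shows how they can be computed from cluster assignments and reference classes. Purity follows its standard definition. The abstract does not say which F-measure variant the authors use, so the pair-counting version here is an assumption, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def purity(cluster_ids, class_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(class_labels)


def pairwise_f_measure(cluster_ids, class_labels):
    """Pair-counting F-measure (one common variant, assumed here)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(class_labels)), 2):
        same_cluster = cluster_ids[i] == cluster_ids[j]
        same_class = class_labels[i] == class_labels[j]
        if same_cluster and same_class:
            tp += 1  # pair correctly grouped together
        elif same_cluster:
            fp += 1  # pair grouped together but from different classes
        elif same_class:
            fn += 1  # pair from the same class but split across clusters
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: six vertices, two clusters, two reference classes.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]

p = purity(clusters, classes)              # 5 of 6 items match their cluster majority
f = pairwise_f_measure(clusters, classes)  # precision 4/6, recall 4/7
```

Purity rewards clusters dominated by a single class but does not penalise splitting a class across many clusters, which is why it is usually reported together with an F-measure.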
The originality of this paper lies in two aspects: first, a new machine learning pipeline for RDF data is presented; second, an efficient process to identify and extract relevant graph substructures from an RDF graph is proposed. The proposed techniques were combined with different neural language models to improve the accuracy and relevance of the obtained feature vectors that will be fed to the clustering mechanism.
In the era of Industry 4.0, managing the design process is a challenging mission. Within a dynamic environment, several disciplines have adopted the complex adaptive system (CAS) perspective. Therefore, this paper aims to explore how to deepen the understanding of the design process as a CAS. In this respect, the key complexity drivers of the design process are discussed and an organizational decomposition for the simulation of the design process as a CAS is conducted.
The proposed methodology comprises three steps. First, the complexity drivers of the design process are presented and are matched with those of CAS. Second, an analysis of over 111 selected papers is presented to choose the appropriate model for the design process from the CAS theory. Third, the paper provides methodological guidelines to develop an organizational decision support system that supports the complexity of the design process.
An analysis of the key drivers of design process complexity shows the need to adopt the CAS theory. In addition, a comparative analysis of the organizational methodologies developed in the literature leads the authors to conclude that the Agent-oriented Software Process for Engineering Complex Systems (ASPECS) is the appropriate methodology for simulating the design process. In this respect, a system requirements phase of the decision support system is conducted.
The originality of this paper lies in analysing the complexity of the design process as a CAS. In doing so, the full richness of the CAS theory can be used to address challenges that already exist in design theory.