Search results

1 – 10 of 677
Article
Publication date: 12 November 2020

Seyed Mohammad Javad Hosseini, Bahman Arasteh, Ayaz Isazadeh, Mehran Mohsenzadeh and Mitra Mirzarezaee

Abstract

Purpose

The purpose of this study is to reduce the number of mutants and, consequently, the cost of mutation testing. Related studies indicate that about 40% of the faults (mutants) injected into source code are effectless (equivalent). Equivalent mutants are one of the major costs of mutation testing, and identifying them is known to be an undecidable problem.

Design/methodology/approach

In a program with n branch instructions (if instructions), there are up to 2^n execution paths (test paths), and the data and code on each of these paths can be considered a target of mutation. Given the role and impact of data in a program, some data and code are more likely than others to propagate injected mutants to the program output. In this study, the error-propagation rate of the program data is first quantified using static analysis of the program's control-flow graph. Then, the most error-propagating test paths are identified by the proposed heuristic algorithm (a genetic algorithm [GA]). Only data and code with a higher error-propagation rate are considered strategic locations for mutation testing.
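The growth in the number of paths can be made concrete with a small sketch. It assumes n independent two-way branches executed in sequence, so every true/false combination is a distinct path, and it uses made-up per-branch error-propagation rates; the names `enumerate_paths` and `path_score` and all numeric values are hypothetical, since the abstract does not give the paper's equations. The exhaustive search shown here is only a baseline that becomes infeasible as n grows, which is the motivation for a heuristic such as a GA.

```python
from itertools import product

def enumerate_paths(n_branches):
    """Return every true/false outcome combination for n two-way branches."""
    return list(product((True, False), repeat=n_branches))

def path_score(path, rate_if_true, rate_if_false):
    """Sum hypothetical per-branch error-propagation rates along one path."""
    return sum(t if taken else f
               for taken, t, f in zip(path, rate_if_true, rate_if_false))

# Hypothetical rates for three branches (illustrative values only).
rate_if_true = [0.9, 0.2, 0.6]
rate_if_false = [0.1, 0.7, 0.3]

paths = enumerate_paths(3)
best = max(paths, key=lambda p: path_score(p, rate_if_true, rate_if_false))
print(len(paths))  # 8
print(best)        # (True, False, True)
```

With 10 branches the same enumeration already yields 1,024 paths.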

Findings

To evaluate the proposed method, an extensive series of mutation testing experiments was conducted on a set of traditional benchmark programs using the MuJava tool set. The results show that the proposed method reduces the number of mutants by about 24%. In the corresponding experiments, the mutation score also increased by about 5.6%. The success rate of the GA in finding the most error-propagating paths of the input programs is 99%. On average, only 7.46% of the mutants generated by the proposed method are equivalent; that is, 92.54% are non-equivalent.
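For readers unfamiliar with the metric, the sketch below uses the common textbook definition of mutation score: killed mutants divided by non-equivalent mutants. The abstract does not state which formula the paper uses, so this definition is an assumption, and the counts are hypothetical rather than taken from the study.

```python
def mutation_score(killed, total, equivalent=0):
    """Killed mutants as a fraction of non-equivalent mutants."""
    non_equivalent = total - equivalent
    if non_equivalent <= 0:
        raise ValueError("no non-equivalent mutants to score")
    return killed / non_equivalent

# Hypothetical counts, not taken from the paper.
print(mutation_score(killed=80, total=100, equivalent=0))   # 0.8
print(mutation_score(killed=80, total=100, equivalent=20))  # 1.0
```

The second call shows why equivalent mutants matter: they can never be killed, so leaving them in the denominator understates the quality of the test suite.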

Originality/value

The main contributions of this study are as follows: proposing a set of equations to measure the error-propagation rate of each data item, basic block and execution path of a program; proposing a genetic algorithm to identify the most error-propagating paths of a program as mutation locations; developing an efficient mutation-testing framework that mutates only the strategic locations of a program identified by the proposed genetic algorithm; and reducing the time and cost of mutation testing by reducing the number of equivalent mutants.

Details

Data Technologies and Applications, vol. 55 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 2 June 2020

Nasrin Shomali and Bahman Arasteh

Abstract

Purpose

Delivering high-quality software applications requires proper testing, and a test suite is effective to the extent that it finds software faults. The traditional method of assessing the quality and effectiveness of a test suite is mutation testing, one of whose main drawbacks is its computational cost. This study addresses that cost: its main goal is to reduce the time and cost of mutation testing.

Design/methodology/approach

According to the 80–20 rule, 80% of the faults are found in 20% of a program's code, the fault-prone part. The proposed method statically analyzes the program's source code to identify its fault-prone locations. Identifying the fault-prone (complex) paths of a program is an NP-hard problem. In the proposed method, a firefly optimization algorithm is used to identify the most fault-prone paths of a program; the mutation operators are then injected only into the identified fault-prone instructions.
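As background on the optimizer, the following is a minimal sketch of a generic continuous firefly search on a toy cost function. It is not the paper's implementation: the abstract does not describe how program paths are encoded or scored, and the function names, parameter values (`alpha`, `beta0`, `gamma`) and the `sphere` cost are assumptions chosen for illustration.

```python
import math
import random

def firefly_minimize(cost, dim, n_fireflies=15, n_iter=50,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimal continuous firefly search; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_fireflies)]
    best = min(swarm, key=cost)[:]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                # Lower cost means a brighter firefly; i moves toward brighter j.
                if cost(swarm[j]) < cost(swarm[i]):
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attraction fades with distance
                    swarm[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
        # Remember the best position seen so far across all iterations.
        candidate = min(swarm, key=cost)
        if cost(candidate) < cost(best):
            best = candidate[:]
    return best, cost(best)

def sphere(x):
    """Toy cost function with its minimum at the origin."""
    return sum(v * v for v in x)

position, value = firefly_minimize(sphere, dim=2)
print(position, value)
```

Applying the same idea to path selection would require a discrete encoding of paths and a fault-proneness measure in place of `sphere`.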

Findings

The source code of five traditional benchmark programs was used to evaluate how effectively the proposed method reduces the number of mutants. The proposed method was implemented in Matlab. Mutation injection was carried out with MuJava, and the output was examined. The results confirm that the proposed method considerably reduces the number of mutants and, consequently, the cost of mutation testing.

Originality/value

The proposed method avoids mutating the non-fault-prone (simple) code of the program, and consequently the number of mutants is considerably reduced. In a program with n branch instructions (if instructions), there are up to 2^n execution paths (test paths), and the data and code on each of these paths can be considered a target of mutation. Identifying the error-prone (complex) paths of a program is an NP-hard problem. In the proposed method, a firefly optimization algorithm is used as a heuristic to identify the most error-prone paths of a program; the mutation operators (faults) are then injected only into the identified fault-prone instructions.

Details

Data Technologies and Applications, vol. 54 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 3 May 2023

Rucha Wadapurkar, Sanket Bapat, Rupali Mahajan and Renu Vyas

Abstract

Purpose

Ovarian cancer (OC) is the most common type of gynecologic cancer in the world with a high rate of mortality. Due to manifestation of generic symptoms and absence of specific biomarkers, OC is usually diagnosed at a late stage. Machine learning models can be employed to predict driver genes implicated in causative mutations.

Design/methodology/approach

In the present study, a comprehensive next-generation sequencing (NGS) analysis of whole exome sequences of 47 OC patients was carried out to identify clinically significant mutations. Nine functional features of the 708 identified mutations were input into a machine learning classification model that uses the eXtreme Gradient Boosting (XGBoost) classifier to predict OC driver genes.

Findings

The XGBoost classifier model yielded a classification accuracy of 0.946, which was superior to that obtained by other classifiers such as decision tree, Naive Bayes, random forest and support vector machine. Further, an interaction network was generated to identify and establish correlations with cancer-associated pathways and gene ontology data.

Originality/value

The final results revealed 12 putative candidate cancer driver genes, namely LAMA3, LAMC3, COL6A1, COL5A1, COL2A1, UGT1A1, BDNF, ANK1, WNT10A, FZD4, PLEKHG5 and CYP2C9, that may have implications in clinical diagnosis.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 7 August 2017

Sathyavikasini Kalimuthu and Vijaya Vijayakumar

Abstract

Purpose

Diagnosing a genetic neuromuscular disorder such as muscular dystrophy is complicated when the defect occurs during splicing. This paper aims to predict the type of muscular dystrophy from gene sequences by extracting well-defined descriptors related to splicing mutations. An automatic model is built to classify the disease through pattern recognition techniques coded in Python using the scikit-learn framework.

Design/methodology/approach

In this paper, the cloned gene sequences are synthesized based on the mutation position and its location on the chromosome by using the positional cloning approach. For instance, in the Human Gene Mutation Database (HGMD), the mutational information for a splicing mutation specified as IVS1-5 T > G indicates the first intron (IVS: intervening sequence, or intron), five nucleotides before the consensus intron site AG, where the variant is nucleotide T altered to G. IVS (+ve) denotes the forward strand 3′, with positive numbers counted from the G of the donor site invariant, and IVS (−ve) denotes the backward strand 5′, with negative numbers counted from the G of the acceptor site. The key idea in this paper is to identify discriminative descriptors from diseased gene sequences based on splicing variants and to provide an effective machine learning solution for predicting the type of muscular dystrophy from splicing mutations. Multi-class classification is carried out through data modeling of gene sequences. Synthetic mutational gene sequences are created because diseased gene sequences are not readily obtainable for this intricate disease. The positional cloning approach supports generating disease gene sequences based on mutational information acquired from HGMD. SNP-, gene- and exon-based discriminative features are identified and used to train the model. A muscular dystrophy prediction model is built using supervised learning techniques in the scikit-learn environment. The data frame is built with the extracted features as a NumPy array. The data are normalized by transforming the feature values into the range between 0 and 1, which aids in scaling the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library.

Findings

To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy based on splicing mutations. Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. A model is built using statistical learning techniques through scikit-learn in the Anaconda framework. This paper also discusses the results of statistical learning carried out on the same set of gene sequences with synonymous and non-synonymous mutational descriptors.

Research limitations/implications

The data frame is built with a NumPy array. Normalizing the data by transforming the feature values into the range between 0 and 1 aids in scaling the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library. While training the SVM model, the cost, gamma and kernel parameters are tuned to attain good results. The classifiers are scored with tenfold cross-validation using the metric functions of the scikit-learn library. Results of the disease identification model based on non-synonymous, synonymous and splicing mutations were analyzed.
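The two preprocessing and evaluation steps named above, scaling to [0, 1] and tenfold cross-validation, can be sketched in plain Python. This is an illustration of the general techniques rather than the authors' code; the helper names `min_max_scale` and `k_fold_indices` and the sample values are hypothetical. In scikit-learn the corresponding utilities are `MinMaxScaler` and `KFold`.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly into [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: map everything to 0
    return [(v - lo) / (hi - lo) for v in column]

def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds."""
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

print(min_max_scale([2, 4, 10]))  # [0.0, 0.25, 1.0]
folds = list(k_fold_indices(25, k=10))
print(len(folds), len(folds[0][1]), len(folds[-1][1]))  # 10 3 2
```

Each sample lands in exactly one test fold, so every sequence is used for evaluation once and for training nine times.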

Practical implications

Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. A model is built using statistical learning techniques through scikit-learn in the Anaconda framework. The performance of the classifiers is increased by using different estimators from the scikit-learn library. Several types of mutations, such as missense, nonsense and silent mutations, are also considered to build models through statistical learning techniques, and their results are analyzed.

Originality/value

To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy based on splicing mutations.

Details

World Journal of Engineering, vol. 14 no. 4
Type: Research Article
ISSN: 1708-5284

Article
Publication date: 19 August 2013

Helder Ken Shimo and Renato Tinos

Abstract

Purpose

The purpose of this paper is to propose two operators for diversity and mutation control in artificial immune systems (AISs).

Design/methodology/approach

The proposed operators are applied in place of the suppression and mutation operators used in AISs. The proposed mechanisms were tested in opt-aiNet, a continuous optimization algorithm inspired by theories of immunology. The traditional opt-aiNet uses a suppression operator based on immune network principles to remove similar cells and add random ones to control the diversity of the population. This procedure is computationally expensive, as the Euclidean distances between every possible pair of candidate solutions must be computed. This work proposes a self-organizing suppression mechanism inspired by the self-organized criticality (SOC) phenomenon, which is less dependent on parameter selection. This work also proposes the use of the q-Gaussian mutation, which allows the form of the mutation distribution to be controlled during the optimization process. The algorithms were tested on a well-known benchmark for continuous optimization and on a bioinformatics problem: the rigid docking of proteins.
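To illustrate why distance-based suppression is costly, here is a minimal sketch of a generic pairwise suppression step. It is not the opt-aiNet source and not the proposed SOC mechanism; it assumes higher fitness is better, and the function name `suppress`, the threshold and the sample cells are hypothetical. In the worst case every cell is compared against every other, giving a quadratic number of distance computations.

```python
import math

def suppress(cells, fitness, threshold):
    """Greedy suppression: drop a cell if a fitter kept cell lies within `threshold`."""
    # Visit cells from fittest to least fit so the better of two neighbours survives.
    order = sorted(range(len(cells)), key=lambda i: fitness[i], reverse=True)
    kept = []
    for i in order:
        if all(math.dist(cells[i], cells[j]) >= threshold for j in kept):
            kept.append(i)
    return sorted(kept)

cells = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0)]
fitness = [1.0, 2.0, 0.5]
print(suppress(cells, fitness, threshold=0.5))  # [1, 2]
```

Cell 0 is removed because the fitter cell 1 is only 0.1 away, while cell 2 is far from both and survives.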

Findings

The proposed suppression operator presented some limitations in unimodal functions, but some interesting results were found in some highly multimodal functions. The proposed q-Gaussian mutation presented good performance in most of the test cases of the benchmark, and also in the docking problem.

Originality/value

First, the self-organizing suppression operator was able to reduce the complexity of the suppression stage in opt-aiNet. Second, the use of q-Gaussian mutation in AISs offered a better compromise between exploitation and exploration of the search space and, as a consequence, better performance than the traditional Gaussian mutation.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 6 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 8 June 2012

Iulia Maries and Emil Scarlat

Abstract

Purpose

The purpose of this paper is to analyse the role of computational intelligence techniques in the process of communities' formation.

Design/methodology/approach

The paper develops a high performance genetic algorithm for community formation based on collective intelligence capacity. An experimental study is presented to illustrate the algorithm.

Findings

Collective intelligence is not the sum of individual intelligences; it is the ability of the community to complete more tasks than single individuals can. The paper reveals the need for mechanisms that allow a large group of professionals to make better decisions than single individuals.

Practical implications

The genetic algorithm proposed in the paper may be used to obtain the optimal structure of a community, in terms of number of members and their role in the community.

Originality/value

The key concept is a new fitness index, an intelligence index, which is the optimal combination between intelligence and cooperation, and allows not only community formation, but also intelligence to be the driving principle in the community formation process.

Article
Publication date: 1 June 2007

Orestes Chouchoulas and Alan Day

Abstract

Although the idea of linking a shape grammar to a genetic algorithm is not new, this paper proposes a novel way of combining these two elements in order to provide a tool that can be used for design exploration. Using a shape grammar for design generation provides a way of creating a range of potential solutions to a design problem which fit with the designer's stylistic agenda. A genetic algorithm can then be used to take these designs and develop them into a much richer set of solutions which can still be recognised as part of the same family. By setting quantifiable targets for design performance, the genetic algorithm can evolve new designs which exhibit the best features of previous generations. The designer is then presented with a wide range of high scoring solutions and can choose which of these to take forward and develop in the conventional manner. The novelty of the proposed approach is in the use of a shape code, which describes the steps that the shape grammar has taken to create each design. The genetic algorithm works on this shape code by applying crossover and mutation in order to create a range of designs that can be tested. The fittest are then selected in order to provide the genetic material for the next generation. A prototype version of such a program, called Shape Evolution, has been developed. In order to test Shape Evolution it has been used to design a range of apartment buildings which are required to meet certain performance criteria.
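The crossover and mutation steps applied to a shape code can be sketched as follows. The encoding is an assumption: the abstract only says the shape code records the steps the grammar took, so here it is modelled as a list of integer rule indices. The function names, the rule count and the mutation rate are likewise hypothetical and do not come from Shape Evolution.

```python
import random

def crossover(parent_a, parent_b, rng):
    """One-point crossover of two equal-length rule sequences."""
    point = rng.randrange(1, len(parent_a))  # cut somewhere strictly inside the code
    return parent_a[:point] + parent_b[point:]

def mutate(code, n_rules, rate, rng):
    """Replace each step with a random rule index with probability `rate`."""
    return [rng.randrange(n_rules) if rng.random() < rate else step for step in code]

rng = random.Random(42)
parent_a = [0, 0, 0, 0, 0, 0]  # hypothetical shape codes: lists of rule indices
parent_b = [3, 3, 3, 3, 3, 3]
child = mutate(crossover(parent_a, parent_b, rng), n_rules=4, rate=0.1, rng=rng)
print(child)
```

In a full genetic algorithm, each child would be decoded back into a design by replaying its rules, scored against the performance targets, and the fittest kept as parents for the next generation.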

Details

Open House International, vol. 32 no. 2
Type: Research Article
ISSN: 0168-2601

Open Access
Article
Publication date: 30 September 2015

Lucia Parisi, Teresa Di Filippo and Michele Roccella

Abstract

Cornelia de Lange syndrome (CdLS) is a congenital disorder characterized by distinctive facial features, growth retardation, limb abnormalities, intellectual disability, and behavioral problems. Cornelia de Lange syndrome is associated with abnormalities on chromosomes 5, 10 and X. Heterozygous point mutations in three genes (NIPBL, SMC3 and SMC1A) are responsible for approximately 50-60% of CdLS cases. CdLS is characterized by autistic features, notably excessive repetitive behaviors and expressive language deficits. The prevalence of autism spectrum disorder (ASD) symptomatology is comparatively high in CdLS. However, the profile and developmental trajectories of these ASD characteristics are potentially different to those observed in individuals with idiopathic ASD. A significantly higher prevalence of self-injury is evident in CdLS. Self-injury was associated with repetitive and impulsive behavior. This study describes the behavioral phenotype of four children with Cornelia de Lange syndrome and ASDs and the rehabilitative intervention that must be implemented.

Details

Mental Illness, vol. 7 no. 2
Type: Research Article
ISSN: 2036-7465

Article
Publication date: 3 July 2017

Peter Martin

Abstract

Purpose

Diagnosing pain and pain-inflicting diseases are crucial issues in the health care of individuals with intellectual and developmental disabilities. The purpose of this paper is to delineate possible peculiarities in pain perception, characterizing a syndrome-specific spectrum of pain-causing diseases as well as particular features of pain expression in Rett syndrome (RTT).

Design/methodology/approach

A selective review of the literature on pain, dolorous disorders and diseases, molecular aspects of pain transduction, pain perception, and expression of painful conditions in RTT was undertaken.

Findings

RTT-causing mutations in the methyl-CpG-binding protein 2 (MECP2) gene have an impact on various endogenous molecules modulating pain transmission. Individuals with RTT are specifically prone to numerous pathological states which can cause pain. Through thorough observation and the application of proper tools, it is possible to recognize painful states in persons with RTT.

Originality/value

This paper imparts empirical/evidence-based data on pain perception/transmission, possible syndrome-specific causes of pain and pain expression/assessment in RTT, with the objective of promoting the quality of clinical practice in this crucial issue.

Details

Advances in Autism, vol. 3 no. 3
Type: Research Article
ISSN: 2056-3868

Article
Publication date: 6 November 2023

Shahram Sedghi and Somayeh Ghaffari Heshajin

Abstract

Purpose

Genetics, a discipline of biology, is one of the most recent and rapidly advancing disciplines in science. This study aims to present a bibliometric analysis of the genetics research output of Iranian authors, map the intellectual structure of these studies and investigate the development path of this literature and the interrelationships among the main topics.

Design/methodology/approach

This study searched the Web of Science database for Iranian genetics research documents published up to 2020. Further, this study used HistCite software to profile and analyze the most cited articles and references and to draw their historiographies.

Findings

A database search revealed 21,329 documents, which formed the study population. The most highly cited publications based on the Global Citation Score (GCS) and Local Citation Score (LCS) achieved scores of 602 and 47, respectively. The publication growth rate study demonstrated consistent expansion over time. The scientific maps based on LCS and GCS had five and four clusters, respectively. Furthermore, journal articles emerged as the predominant type of publication.

Practical implications

The significance of this study is in its contribution to understanding the genetics research position in Iran, informing policymakers and researchers, helping scientific collaboration and its impact on public attitudes and quality of life. The results of the present study, with benefits for various groups of communities, such as policymakers, academic groups and public society, can bridge the gap between theoretical research and practical implications.

Social implications

The results of this study, by helping future advancement in health care, medical genetics and disease prevention, may have a direct and indirect positive influence on the quality of life. Furthermore, it may lead to more informed discussions on health care and biotechnology as well as influencing public attitudes and perceptions.

Originality/value

Ultimately, this study concludes that despite the proliferation of publications in terms of quantity and complexity, especially in areas such as disease diagnosis, prevention and treatment, there remains a need for more attention to other facets of genetics such as biology and biotechnology. Iranian publications are most related to population genetics, human genetics, molecular genetics, medical genetics, genomics, developmental genetics and evolutionary genetics out of 10 branches of genetics. This study reveals patterns in scientific outputs and authorship collaborations and plays an alternative and innovative role in revealing Iranian research trends in genetics.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342
