Search results

1 – 10 of over 195,000
Article
Publication date: 27 January 2020

Renze Zhou, Zhiguo Xing, Haidou Wang, Zhongyu Piao, Yanfei Huang, Weiling Guo and Runbo Ma

With the development of deep learning-based analytical techniques, increased research has focused on fatigue data analysis methods based on deep learning, which are gaining in…


Abstract

Purpose

With the development of deep learning-based analytical techniques, research on fatigue data analysis methods based on deep learning has increased and is gaining in popularity. However, the application of deep neural networks in the materials science domain is mainly inhibited by limited data availability. This paper aims to overcome the difficulty of multifactor fatigue life prediction with small data sets.

Design/methodology/approach

A multiple neural network ensemble (MNNE) with a general and flexible explicit function is developed to accurately quantify the complicated relationships hidden in multivariable data sets. Moreover, a variational autoencoder-based data generator is trained on the small sample sets to expand the size of the training data set. A comparative study involving the proposed method and traditional models is performed. In addition, a filtering rule based on the R² score is proposed and applied in the training process of the MNNE; this approach has a beneficial effect on prediction accuracy and generalization ability.
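
The abstract does not give implementation details; the following is a minimal sketch of the R²-based filtering rule applied to an ensemble of small networks trained on hybrid (real plus generated) data, using scikit-learn. The function `generate_synthetic` is a hypothetical stand-in for the trained variational autoencoder, and the threshold and architecture are illustrative, not the authors' settings.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

def build_ensemble(X, y, generate_synthetic, n_members=10, r2_threshold=0.8, seed=0):
    # Hold out part of the small real data set to score each candidate member.
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=seed)
    members = []
    for i in range(n_members):
        # Hybrid data: the small real training set plus VAE-generated samples.
        X_syn, y_syn = generate_synthetic(n_samples=len(X_train))
        X_hybrid = np.vstack([X_train, X_syn])
        y_hybrid = np.concatenate([y_train, y_syn])
        model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=seed + i)
        model.fit(X_hybrid, y_hybrid)
        # Filtering rule: keep only members that generalize well on held-out real data.
        if r2_score(y_val, model.predict(X_val)) >= r2_threshold:
            members.append(model)
    return members

def ensemble_predict(members, X):
    # The ensemble prediction is the average over the surviving members.
    return np.mean([m.predict(X) for m in members], axis=0)
```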

Findings

The comparative experiments confirm that the use of hybrid data can improve the accuracy and generalization ability of the deep neural network and that the MNNE outperforms support vector machine, multilayer perceptron and deep neural network models in terms of goodness of fit and robustness in the small sample case.

Practical implications

The experimental results imply that the proposed algorithm is a sophisticated and promising multivariate method for predicting the contact fatigue life of a coating when data availability is limited.

Originality/value

A data generation model based on a variational autoencoder was used to compensate for the lack of data, and an MNNE method was proposed for fatigue life prediction in the small data case.

Details

Anti-Corrosion Methods and Materials, vol. 67 no. 1
Type: Research Article
ISSN: 0003-5599


Open Access
Article
Publication date: 17 December 2019

Yingjie Yang, Sifeng Liu and Naiming Xie

The purpose of this paper is to propose a framework for data analytics where everything is grey in nature and the associated uncertainty is considered as an essential part in data…


Abstract

Purpose

The purpose of this paper is to propose a framework for data analytics where everything is grey in nature and the associated uncertainty is considered as an essential part in data collection, profiling, imputation, analysis and decision making.

Design/methodology/approach

A comparative study of the available uncertainty models is conducted, and the feasibility of grey systems is highlighted. Furthermore, a general framework for the integration of grey systems and grey sets into data analytics is proposed.
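
The framework is presented conceptually; as a rough illustration of its basic uncertainty carrier, the sketch below defines an interval grey number with a simple whitenisation step. This is a generic grey-systems illustration, not code from the paper.

```python
from dataclasses import dataclass

@dataclass
class GreyNumber:
    # A value known only to lie somewhere in [lower, upper].
    lower: float
    upper: float

    def __add__(self, other):
        # Interval addition propagates the uncertainty instead of discarding it.
        return GreyNumber(self.lower + other.lower, self.upper + other.upper)

    def whitened(self, weight=0.5):
        # Equal-weight whitenisation: collapse the interval to a representative value.
        return weight * self.lower + (1 - weight) * self.upper

# A missing reading imputed as a grey number keeps its uncertainty explicit
# instead of pretending a single imputed value is exact.
reading = GreyNumber(4.2, 5.8)
print(reading.whitened())  # 5.0
```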

Findings

Grey systems and grey sets are useful not only for small data but also for big data. They are complementary to other models and can play a significant role in data analytics.

Research limitations/implications

The proposed framework represents a radical change in data analytics and may fundamentally change the way uncertainties are dealt with.

Practical implications

The proposed model has the potential to avoid mistakes arising from misleading data imputation.

Social implications

The proposed model adopts the philosophy of grey systems in recognising the limitations of our knowledge, which has significant implications for how we deal with our social life and relations.

Originality/value

This is the first time that data analytics as a whole has been considered from the point of view of grey systems.

Details

Marine Economics and Management, vol. 2 no. 2
Type: Research Article
ISSN: 2516-158X


Article
Publication date: 5 November 2019

R. Dale Wilson and Harriette Bettis-Outland

Artificial neural network (ANN) models, part of the discipline of machine learning and artificial intelligence, are becoming more popular in the marketing literature and in…


Abstract

Purpose

Artificial neural network (ANN) models, part of the discipline of machine learning and artificial intelligence, are becoming more popular in the marketing literature and in marketing practice. This paper aims to provide a series of tests comparing ANN models with competing predictive models.

Design/methodology/approach

A total of 46 pairs of models were evaluated in an objective model-building environment. Either logistic regression or multiple regression models were developed and then compared to ANN models using the same set of input variables. Three sets of B2B data were used to test the models. Emphasis was also placed on evaluating small samples.
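
The B2B data sets are not public, so the paired-comparison protocol can only be illustrated on placeholder data; the sketch below fits a logistic regression and a small ANN on the same inputs and reports holdout accuracy for each, roughly mirroring one of the 46 pairs.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic two-group classification data stand in for the proprietary B2B sets.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Both models see exactly the same input variables, as in the paired comparisons.
logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X_tr, y_tr)

print("logistic regression accuracy:", accuracy_score(y_te, logit.predict(X_te)))
print("ANN accuracy:                ", accuracy_score(y_te, ann.predict(X_te)))
```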

Findings

ANN models tend to generate predictions that are as accurate as or more accurate than those of logistic regression models. However, when ANN models are compared to multiple regression models, the results are mixed. For small sample sizes, the modeling results are the same as for larger samples.

Research limitations/implications

Like all marketing research, this application is limited by the methods and the data used to conduct the research. The findings strongly suggest that, because of their predictive accuracy, ANN models will have an important role in the future of B2B marketing research and model-building applications.

Practical implications

ANN models should be carefully considered for potential use in marketing research and model-building applications by B2B academics and practitioners alike.

Originality/value

The research contributes to the B2B marketing literature by providing a more rigorous test of ANN models on B2B data than has been conducted before.

Details

Journal of Business & Industrial Marketing, vol. 35 no. 3
Type: Research Article
ISSN: 0885-8624


Article
Publication date: 12 November 2019

Kun-Huang Huarng and Tiffany Hui-Kuang Yu

The use of linear regression analysis is common in the social sciences. The purpose of this paper is to show the advantage of a qualitative research method, namely, structured…

Abstract

Purpose

The use of linear regression analysis is common in the social sciences. The purpose of this paper is to show the advantage of a qualitative research method, namely, structured qualitative analysis (SQA), over the linear regression method by using data with different characteristics.

Design/methodology/approach

Data were gathered from a study of online consumer behavior in Taiwan. The authors changed the content of the data to create different data sets. These data sets were used to demonstrate how SQA and linear regression work individually and to contrast the empirical analyses and results obtained from the two methods.

Findings

The linear regression method uses a single equation to model data with different characteristics. When a data set contains a large and a small group with different characteristics, linear regression tends to produce an equation that models the characteristics of the large group and subsumes those of the small one. When a data set contains similarly sized groups with different characteristics, linear regression tends to produce an equation that averages across them. The major concern is that a single equation may not be able to reflect data of various characteristics (different values of the independent variables) that result in the same outcome (the same value of the dependent variable). In contrast, SQA can identify the various variable combinations (multiple relationships) leading to the same outcome. SQA provides multiple relationships to represent differently sized groups with different characteristics, so it produces consistent empirical results.
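
The behaviour described above can be illustrated with a toy example: two distinct variable combinations produce the same outcome, a single regression equation hides them, while a configuration-style grouping (in the spirit of SQA, not the authors' exact procedure) keeps them separate. The data and thresholds below are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Configuration A: high x1 and low x2; configuration B: low x1 and high x2.
A = np.column_stack([rng.uniform(0.8, 1.0, 50), rng.uniform(0.0, 0.2, 50)])
B = np.column_stack([rng.uniform(0.0, 0.2, 50), rng.uniform(0.8, 1.0, 50)])
X = np.vstack([A, B])
y = np.ones(len(X))  # both configurations produce the same outcome value

# A single regression equation assigns (near-)zero weight to both variables,
# hiding the two distinct recipes behind the common outcome.
model = LinearRegression().fit(X, y)
print("single-equation coefficients:", model.coef_)

# A configuration-style view keeps the two recipes separate.
in_A = (X[:, 0] > 0.5) & (X[:, 1] < 0.5)
in_B = (X[:, 0] < 0.5) & (X[:, 1] > 0.5)
print("rows matching configuration A:", in_A.sum(), "configuration B:", in_B.sum())
```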

Research limitations/implications

The two research methods work differently. The popular linear regression method tends to use one equation to model data of different sizes and characteristics, and that single equation may not be able to cover different behaviors that lead to the same outcome. Instead, SQA provides multiple relationships for differently sized groups of data with different characteristics, so its analyses are more consistent and its results more appropriate. Academics may re-examine the existing literature that relies on linear regression; it would be interesting to see whether SQA yields new findings for similar problems. Practitioners gain a new method for modeling real-world problems and for understanding the different possible combinations of variables that lead to the same outcome. Even a relationship obtained from a small data set may be very valuable to practitioners.

Originality/value

This paper compared analyses of online consumer behavior using two research methods on different data sets. Real data sets were manipulated to create data of different sizes and characteristics, and the variations in empirical results from the two methods across these data sets facilitate the comparison. Hence, this paper can serve as a complement to the existing literature, focusing on the justification of research methods and on the limitations of linear regression.

Details

International Journal of Emerging Markets, vol. 15 no. 4
Type: Research Article
ISSN: 1746-8809


Article
Publication date: 14 December 2018

Erion Çano and Maurizio Morisio

The fabulous results of convolution neural networks in image-related tasks attracted attention of text mining, sentiment analysis and other text analysis researchers. It is…

Abstract

Purpose

The fabulous results of convolutional neural networks in image-related tasks have attracted the attention of text mining, sentiment analysis and other text analysis researchers. It is, however, difficult to find enough data to feed such networks, optimize their parameters and make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore the use of convolution and max-pooling neural layers on song lyrics, product review and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared.

Design/methodology/approach

The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices that lead to high-performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations.

Findings

The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, the max-pooling region size should be adapted to the length of the text documents to produce the best feature maps.
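
A minimal Keras-style sketch of the reported pattern (parallel convolutions with filter lengths up to 3 and a pooling region tied to document length) is shown below; the vocabulary size, sequence length and filter counts are placeholders rather than the paper's settings.

```python
from tensorflow.keras import layers, Model

seq_len, vocab_size, embed_dim = 200, 20000, 100

inp = layers.Input(shape=(seq_len,))
emb = layers.Embedding(vocab_size, embed_dim)(inp)

branches = []
for k in (1, 2, 3):  # parallel convolutions with filter lengths up to 3
    c = layers.Conv1D(filters=64, kernel_size=k, activation="relu", padding="same")(emb)
    # Pooling region scaled to document length, reducing each branch to a short feature map.
    p = layers.MaxPooling1D(pool_size=seq_len // 10)(c)
    branches.append(layers.Flatten()(p))

merged = layers.concatenate(branches)
out = layers.Dense(1, activation="sigmoid")(merged)  # binary sentiment polarity
model = Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```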

Originality/value

The top results the authors obtained used feature maps of lengths 6–18. A possible improvement for future neural network models for sentiment analysis would be to generate sentiment polarity predictions for documents by aggregating predictions on smaller excerpts of the entire text.

Details

Data Technologies and Applications, vol. 53 no. 1
Type: Research Article
ISSN: 2514-9288


Book part
Publication date: 17 November 2010

Gregory E. Smith and Cliff T. Ragsdale

Several prominent data-mining studies have evaluated the performance of neural networks (NNs) against traditional statistical methods on the two-group classification problem in…

Abstract

Several prominent data-mining studies have evaluated the performance of neural networks (NNs) against traditional statistical methods on the two-group classification problem in discriminant analysis. Although NNs often outperform traditional statistical methods, their performance can be hindered because of failings in the use of training data. This problem is particularly acute when using NNs on smaller data sets. A heuristic is presented that utilizes Mahalanobis distance measures (MDM) to deterministically partition training data so that the resulting NN models are less prone to overfitting. We show this heuristic produces classification results that are more accurate, on average, than traditional NNs and MDM.
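
The chapter's exact partitioning rule is not given in the abstract; the sketch below shows one way a Mahalanobis-distance-based deterministic split could look, ranking points by distance from the centroid and spreading the held-out set across that range.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

def mahalanobis_split(X, val_every=4):
    # Rank training points by Mahalanobis distance from the sample centroid.
    mean = X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    d = np.array([mahalanobis(x, mean, inv_cov) for x in X])
    order = np.argsort(d)
    # Deterministic split: every val_every-th point (by distance rank) is held out,
    # so both partitions span the whole centre-to-edge range.
    val_idx = order[::val_every]
    train_idx = np.setdiff1d(order, val_idx)
    return train_idx, val_idx

X = np.random.default_rng(0).normal(size=(40, 3))
train_idx, val_idx = mahalanobis_split(X)
print(len(train_idx), len(val_idx))  # 30 10
```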

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-0-85724-201-3

Article
Publication date: 9 October 2019

Rokas Jurevičius and Virginijus Marcinkevičius

The purpose of this paper is to present a new data set of aerial imagery from robotics simulator (AIR). AIR data set aims to provide a starting point for localization system…

Abstract

Purpose

The purpose of this paper is to present a new data set of aerial imagery from a robotics simulator (AIR). The AIR data set aims to provide a starting point for localization system development and to become a typical benchmark for the accuracy comparison of map-based localization algorithms, visual odometry and SLAM for high-altitude flights.

Design/methodology/approach

The presented data set contains over 100,000 aerial images captured from the Gazebo robotics simulator using orthophoto maps as the ground plane. Flights with three different trajectories were performed on maps of urban and forest environments at different altitudes, totaling over 33 kilometers of flight distance.

Findings

A review of previous research studies shows that the presented data set is the largest currently available public data set with downward-facing camera imagery.

Originality/value

This paper presents the problem of missing publicly available data sets for high-altitude (100‒3,000 meters) UAV flights; current state-of-the-art research on map-based localization systems for UAVs depends on real-life test flights and custom simulated data sets for the accuracy evaluation of algorithms. The presented new data set addresses this problem and aims to help researchers improve and benchmark new algorithms for high-altitude flights.

Details

International Journal of Intelligent Unmanned Systems, vol. 8 no. 3
Type: Research Article
ISSN: 2049-6427


Article
Publication date: 22 February 2021

Pierluigi Santosuosso

Despite the potential of Big Data analytics, the analysis of Micro Data represents the main way of forecasting the expected values of recorded amounts and/or ratios for small…

Abstract

Purpose

Despite the potential of Big Data analytics, the analysis of Micro Data represents the main way of forecasting the expected values of recorded amounts and/or ratios for small auditing firms and certified public accountants dealing with analytical procedures. This study aims to examine how effective Micro Data analytics are by testing the forecast accuracy of the ratio of the allowance for doubtful accounts to trade accounts receivable and of the natural logarithm of the net sales of goods and services, the first being exposed to greater uncertainty than the second.

Design/methodology/approach

Micro Data are low in volume, variety, velocity and variability but high in veracity. Given the over-fitting problems affecting Micro Data analytics, in-sample and out-of-sample forecasts were made for both tests. Multiple regression and neural network models were estimated using a sample of 35 Italian industrial listed companies.
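
As a rough illustration of the comparison protocol (not the authors' models or data), the sketch below fits a multiple regression and a small neural network on placeholder data for 35 observations and reports in-sample and out-of-sample mean absolute percentage error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(0)
X = rng.normal(size=(35, 4))  # a few illustrative predictors per company
y = 2.0 + X @ np.array([0.5, -0.3, 0.2, 0.1]) + rng.normal(scale=0.1, size=35)

# Split the Micro Data set into an in-sample and an out-of-sample portion.
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.3, random_state=0)

for name, model in [("multiple regression", LinearRegression()),
                    ("neural network", MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0))]:
    model.fit(X_in, y_in)
    print(name,
          "| in-sample MAPE:", mean_absolute_percentage_error(y_in, model.predict(X_in)),
          "| out-of-sample MAPE:", mean_absolute_percentage_error(y_out, model.predict(X_out)))
```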

Findings

The accuracy of the forecasting models was assessed in terms of mean absolute percentage error and other accuracy measures. The neural network model provided more accurate forecasts than multiple regression in both tests, showing higher accuracy for the amounts exposed to less uncertainty. Moreover, no generalized conclusions on the predictors included in the models could be drawn.

Practical implications

The examination of forecast accuracy helps auditors to evaluate whether analytical procedures can be successfully applied to detect misstatements when Micro Data are used and which model gives the most accurate forecasts.

Originality/value

This is the first study to measure the forecast accuracy of the multiple regression and neural network models performed using a Micro Data set. Forecast accuracy is crucial for evaluating the effectiveness of analytical procedures.

Details

Meditari Accountancy Research, vol. 30 no. 1
Type: Research Article
ISSN: 2049-372X


Open Access
Article
Publication date: 28 February 2023

Luca Rampini and Fulvio Re Cecconi

This study aims to introduce a new methodology for generating synthetic images for facility management purposes. The method starts by leveraging the existing 3D open-source BIM…


Abstract

Purpose

This study aims to introduce a new methodology for generating synthetic images for facility management purposes. The method starts by leveraging existing 3D open-source BIM models and using them inside a graphic engine to produce photorealistic representations of indoor spaces enriched with facility-related objects. The virtual environment creates several images by changing lighting conditions, camera poses or materials. Moreover, the created images are labeled and ready to be used for model training.

Design/methodology/approach

This paper focuses on the challenges characterizing object detection models to enrich digital twins with facility management-related information. The automatic detection of small objects, such as sockets, power plugs, etc., requires big, labeled data sets that are costly and time-consuming to create. This study proposes a solution based on existing 3D BIM models to produce quick and automatically labeled synthetic images.

Findings

The paper presents a conceptual model for creating synthetic images to improve the performance of object detection models for facility management. The results show that virtually generated images are a powerful tool for supplementing existing data sets rather than an alternative to real images. In other words, while a base of real images is still needed, introducing synthetic images helps improve the model's performance and robustness in covering different types of objects.
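
A minimal sketch of the data-mixing step implied by these findings is given below; the directory layout, file format and mixing ratio are hypothetical, and the object detector itself is out of scope here.

```python
import random
from pathlib import Path

def build_training_list(real_dir, synthetic_dir, synthetic_ratio=1.0, seed=0):
    """Return image paths with roughly `synthetic_ratio` synthetic images per real one."""
    real = sorted(Path(real_dir).glob("*.png"))
    synthetic = sorted(Path(synthetic_dir).glob("*.png"))
    rng = random.Random(seed)
    # Cap the number of synthetic images so the real base remains the anchor of the set.
    n_syn = min(len(synthetic), int(len(real) * synthetic_ratio))
    mixed = real + rng.sample(synthetic, n_syn)
    rng.shuffle(mixed)
    return mixed

# Hypothetical usage:
# train_images = build_training_list("data/real", "data/synthetic_bim", synthetic_ratio=2.0)
```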

Originality/value

This study introduces the first pipeline for creating synthetic images for facility management. Moreover, the paper validates this pipeline through a case study in which the performance of object detection models trained on real data alone or on a combination of real and synthetic images is compared.

Details

Construction Innovation, vol. 24 no. 1
Type: Research Article
ISSN: 1471-4175


Article
Publication date: 5 September 2016

Runhai Jiao, Shaolong Liu, Wu Wen and Biying Lin

The large volume of big data makes it impractical for traditional clustering algorithms which are usually designed for entire data set. The purpose of this paper is to focus on…

Abstract

Purpose

The large volume of big data makes traditional clustering algorithms, which are usually designed for an entire data set, impractical. The purpose of this paper is to focus on incremental clustering, which divides the data into a series of data chunks so that only a small amount of data needs to be clustered at a time. Few studies on incremental clustering algorithms address the problems of optimizing cluster center initialization for each data chunk and selecting multiple passing points for each cluster.

Design/methodology/approach

By optimizing the initial cluster centers, the quality of the clustering results is improved for each data chunk, which in turn enhances the quality of the final clustering results. Moreover, by selecting multiple passing points, more accurate information is passed down to improve the final clustering results. A method is proposed to solve these two problems and is applied in the proposed algorithm, which is based on the streaming kernel fuzzy c-means (stKFCM) algorithm.
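
The abstract describes the two ideas without implementation detail; the sketch below illustrates them with plain k-means in place of the paper's kernel fuzzy c-means: each chunk is clustered starting from the previous centers, and several "passing points" per cluster are carried into the next chunk.

```python
import numpy as np
from sklearn.cluster import KMeans

def incremental_cluster(chunks, n_clusters=3, points_per_cluster=5):
    centers, carried = None, np.empty((0, chunks[0].shape[1]))
    for chunk in chunks:
        data = np.vstack([carried, chunk])
        # Initialize from the previous chunk's centers once they exist.
        init = centers if centers is not None else "k-means++"
        n_init = 1 if centers is not None else 10
        km = KMeans(n_clusters=n_clusters, init=init, n_init=n_init, random_state=0).fit(data)
        centers = km.cluster_centers_
        # Multiple passing points: the samples closest to each center are carried forward.
        carried_list = []
        for k in range(n_clusters):
            members = data[km.labels_ == k]
            d = np.linalg.norm(members - centers[k], axis=1)
            carried_list.append(members[np.argsort(d)[:points_per_cluster]])
        carried = np.vstack(carried_list)
    return centers

rng = np.random.default_rng(0)
# Illustrative stream: each chunk contains three well-separated blobs.
chunks = [np.vstack([rng.normal(loc=c, scale=0.5, size=(40, 2)) for c in (0, 5, 10)])
          for _ in range(3)]
print(incremental_cluster(chunks, n_clusters=3))
```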

Findings

Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the streaming kernel fuzzy c-means (stKFCM) algorithm.

Originality/value

This paper addresses the problem of improving the performance of incremental clustering by optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and demonstrates its effectiveness.

Details

Kybernetes, vol. 45 no. 8
Type: Research Article
ISSN: 0368-492X


1 – 10 of over 195,000