Search results

1–10 of over 60,000
Article
Publication date: 10 April 2009

Minghu Ha, Witold Pedrycz, Jiqiang Chen and Lifang Zheng

Abstract

Purpose

The purpose of this paper is to introduce, for the first time, some basic knowledge of statistical learning theory (SLT) based on random set samples in set-valued probability space, and to generalize the key theorem and the bounds on the rate of uniform convergence of Vapnik's learning theory to random sets in set-valued probability space. SLT based on random samples in probability space is currently regarded as one of the fundamental theories of small-sample statistical learning, and it has become a novel and important field of machine learning alongside other concepts and architectures such as neural networks. However, the theory can hardly handle statistical learning problems in which the samples are random sets.

Design/methodology/approach

Motivated by a number of applications, in this paper an SLT is developed based on random set samples. First, a certain law of large numbers for random sets is proved. Second, the definitions of the distribution function and the expectation of random sets are introduced, and the concepts of the expected risk functional and the empirical risk functional are discussed. A notion of the strict consistency of the principle of empirical risk minimization is presented.
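For the simplest random sets, random closed intervals, a law of large numbers can be illustrated numerically: the Minkowski (endpoint-wise) average of i.i.d. random intervals converges to the interval of the endpoint expectations. The following is a minimal sketch of this special case only, not the paper's set-valued probability construction:

```python
import random

def minkowski_mean(intervals):
    """Minkowski average of intervals [a_i, b_i]: the endpoint-wise mean."""
    n = len(intervals)
    lo = sum(a for a, _ in intervals) / n
    hi = sum(b for _, b in intervals) / n
    return (lo, hi)

random.seed(0)
# Random intervals [U, U + V] with U ~ Uniform(0, 1), V ~ Uniform(0, 2);
# the strong-law limit is [E U, E U + E V] = [0.5, 1.5].
sample = []
for _ in range(100_000):
    u, v = random.random(), 2 * random.random()
    sample.append((u, u + v))

lo, hi = minkowski_mean(sample)
print(f"[{lo:.3f}, {hi:.3f}]")   # close to [0.5, 1.5]
```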

Findings

The paper formulates and proves the key theorem and presents the bounds on the rate of uniform convergence of learning theory based on random sets in set‐valued probability space, which become cornerstones of the theoretical fundamentals of the SLT for random set samples.

Originality/value

The paper provides a studied analysis of some theoretical results of learning theory.

Details

Kybernetes, vol. 38 no. 3/4
Type: Research Article
ISSN: 0368-492X

Keywords

Article
Publication date: 18 October 2011

Minghu Ha, Jiqiang Chen, Witold Pedrycz and Lu Sun

Abstract

Purpose

Bounds on the rate of convergence of learning processes based on random samples and probability are one of the essential components of statistical learning theory (SLT). The constructive distribution‐independent bounds on generalization are the cornerstone of constructing support vector machines. Random sets and set‐valued probability are important extensions of random variables and probability, respectively. The paper aims to address these issues.

Design/methodology/approach

In this study, the bounds on the rate of convergence of learning processes based on random sets and set-valued probability are discussed. First, the Hoeffding inequality is enhanced on the basis of random sets, and then, making use of the key theorem, the non-constructive distribution-dependent bounds of learning machines based on random sets in set-valued probability space are revisited. Second, some properties of random sets and set-valued probability are discussed.
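For real-valued i.i.d. samples, the classical Hoeffding inequality that the paper enhances bounds the deviation of the empirical mean by 2·exp(−2nε²) for [0, 1]-valued variables. A quick numeric check of the classical (not the set-valued) inequality:

```python
import math
import random

def hoeffding_bound(n, eps):
    """P(|empirical mean - E X| >= eps) <= 2 exp(-2 n eps^2) for [0,1]-valued i.i.d. X_i."""
    return 2 * math.exp(-2 * n * eps * eps)

random.seed(1)
n, eps, trials = 200, 0.1, 10_000
exceed = 0
for _ in range(trials):
    mean = sum(random.random() for _ in range(n)) / n   # E X = 0.5
    if abs(mean - 0.5) >= eps:
        exceed += 1

empirical = exceed / trials
bound = hoeffding_bound(n, eps)
print(empirical, bound)   # the empirical frequency sits below the bound
```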

Findings

In the sequel, the concepts of the annealed entropy, the growth function, and VC dimension of a set of random sets are presented. Finally, the paper establishes the VC dimension theory of SLT based on random sets and set‐valued probability, and then develops the constructive distribution‐independent bounds on the rate of uniform convergence of learning processes. It shows that such bounds are important to the analysis of the generalization abilities of learning machines.

Originality/value

SLT is considered at present as one of the fundamental theories of small-sample statistical learning.

Open Access
Article
Publication date: 3 July 2023

Hung T. Nguyen

Abstract

Purpose

This paper aims to offer a tutorial/introduction to new statistics arising from the theory of optimal transport to empirical researchers in econometrics and machine learning.

Design/methodology/approach

The material is presented in a tutorial/survey lecture style to help practitioners with the theory.

Findings

The tutorial survey of some main statistical tools (arising from optimal transport theory) should help practitioners to understand the theoretical background in order to conduct empirical research meaningfully.

Originality/value

This study is an original presentation useful for newcomers to the field.

Details

Asian Journal of Economics and Banking, vol. 7 no. 2
Type: Research Article
ISSN: 2615-9821

Keywords

Article
Publication date: 5 May 2023

Nguyen Thi Dinh, Nguyen Thi Uyen Nhi, Thanh Manh Le and Thanh The Van

Abstract

Purpose

The problem of image retrieval and image description exists in various fields. In this paper, a model of content-based image retrieval and image content extraction based on the KD-Tree structure was proposed.

Design/methodology/approach

A Random Forest structure was built to classify the objects in each image on the basis of the balanced multibranch KD-Tree structure. For that purpose, a KD-Tree structure was generated by the Random Forest to retrieve a set of images similar to an input image. A KD-Tree structure is applied at the leaves to determine a relationship word, extracting the relationships between the objects in an input image. The content of an input image is then described on the basis of class names and the relationships between objects.
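The retrieval step rests on nearest-neighbour search in a KD-Tree. A minimal two-branch KD-Tree over feature vectors, as a generic sketch rather than the paper's balanced multibranch variant or its Random Forest ensemble:

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points on alternating coordinates (median split)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, depth=0, best=None):
    """Branch-and-bound nearest-neighbour search."""
    if node is None:
        return best
    point = node["point"]
    if best is None or math.dist(query, point) < math.dist(query, best):
        best = point
    axis = depth % len(query)
    near, far = ((node["left"], node["right"]) if query[axis] < point[axis]
                 else (node["right"], node["left"]))
    best = nearest(near, query, depth + 1, best)
    # Descend the far side only if the splitting plane is closer than the best hit.
    if abs(query[axis] - point[axis]) < math.dist(query, best):
        best = nearest(far, query, depth + 1, best)
    return best

tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))   # → (8, 1)
```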

Findings

A model of image retrieval and image content extraction was proposed based on the proposed theoretical basis; simultaneously, the experiment was built on multi-object image datasets including Microsoft COCO and Flickr with an average image retrieval precision of 0.9028 and 0.9163, respectively. The experimental results were compared with those of other works on the same image dataset to demonstrate the effectiveness of the proposed method.

Originality/value

A balanced multibranch KD-Tree structure was built to apply to relationship classification on the basis of the original KD-Tree structure. Then, KD-Tree Random Forest was built to improve the classifier performance and retrieve a set of similar images for an input image. Concurrently, the image content was described in the process of combining class names and relationships between objects.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288

Keywords

Details

Functional Structure and Approximation in Econometrics
Type: Book
ISBN: 978-0-44450-861-4

Article
Publication date: 20 September 2021

Marwa Kh. Hassan

Abstract

Purpose

The purpose of this study is to obtain the modified maximum likelihood estimator of the stress–strength model using ranked set sampling, to obtain the asymptotic and bootstrap confidence intervals of P[Y < X], to compare the performance of the author's estimates with the estimates under simple random sampling and to apply the estimates to head and neck cancer data.

Design/methodology/approach

The maximum likelihood estimator of R = P[Y < X] is given under ranked set sampling, where X and Y are two independent inverse Weibull random variables with a common shape parameter, which governs the shape of the distribution, and different scale parameters, which govern its dispersion. Together with the asymptotic and bootstrap confidence intervals, Monte Carlo simulation shows that this estimator performs better than the estimator under simple random sampling. The asymptotic and bootstrap confidence intervals under ranked set sampling are likewise better than the corresponding interval estimators under simple random sampling. The application to head and neck cancer data shows, via the estimator of R = P[Y < X], that treatment with radiotherapy is more efficient than treatment with combined radiotherapy and chemotherapy; here too the ranked-set-sampling estimates outperform their simple-random-sampling counterparts.
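Ranked set sampling draws m sets of m units, ranks each set, and retains the i-th order statistic from the i-th set. A Monte Carlo sketch of estimating R = P[Y < X] for inverse Weibull variables under this scheme; the parameter values and the plug-in estimator used here are illustrative assumptions, not the study's modified MLE:

```python
import math
import random

def inv_weibull(shape, scale):
    """Sample X with CDF F(x) = exp(-(scale / x) ** shape)."""
    return scale * (-math.log(random.random())) ** (-1.0 / shape)

def ranked_set_sample(draw, m):
    """One RSS cycle: draw m sets of m units, rank each set,
    and keep the i-th order statistic of the i-th set."""
    return [sorted(draw() for _ in range(m))[i] for i in range(m)]

random.seed(2)
shape, scale_x, scale_y, m, cycles = 2.0, 2.0, 1.0, 5, 200
xs, ys = [], []
for _ in range(cycles):
    xs += ranked_set_sample(lambda: inv_weibull(shape, scale_x), m)
    ys += ranked_set_sample(lambda: inv_weibull(shape, scale_y), m)

# Plug-in (Mann-Whitney type) estimate of R = P[Y < X]; for these
# parameters the true value is 1 / (1 + (scale_y / scale_x) ** shape) = 0.8.
r_hat = sum(y < x for x in xs for y in ys) / (len(xs) * len(ys))
print(round(r_hat, 3))
```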

Findings

Ranked set sampling is more effective than simple random sampling for inference on the stress–strength model based on the inverse Weibull distribution.

Originality/value

This study sheds light on the author’s estimates on head and neck cancer.

Details

International Journal of Quality & Reliability Management, vol. 39 no. 7
Type: Research Article
ISSN: 0265-671X

Keywords

Open Access
Article
Publication date: 29 December 2021

M'Hamed El-Louh, Mohammed El Allali and Fatima Ezzaki

Abstract

Purpose

In this work, the authors are interested in the notion of vector valued and set valued Pettis integrable pramarts. The notion of pramart is more general than that of martingale. Every martingale is a pramart, but the converse is not generally true.

Design/methodology/approach

In this work, the authors present several properties and convergence theorems for Pettis integrable pramarts with convex weakly compact values in a separable Banach space.

Findings

The existence of the conditional expectation of Pettis integrable multifunctions indexed by bounded stopping times is provided. The authors prove the almost sure convergence, in the Mosco and linear topologies, of Pettis integrable pramarts with values in cwk(E), the family of convex weakly compact subsets of a separable Banach space.

Originality/value

The purpose of the present paper is to present new properties and various new convergence results for convex weakly compact valued Pettis integrable pramarts in Banach space.

Details

Arab Journal of Mathematical Sciences, vol. 29 no. 2
Type: Research Article
ISSN: 1319-5166

Keywords

Article
Publication date: 7 August 2017

Eun-Suk Yang, Jong Dae Kim, Chan-Young Park, Hye-Jeong Song and Yu-Seop Kim

Abstract

Purpose

In this paper, the problem of a nonlinear model – specifically the hidden unit conditional random fields (HUCRFs) model, which has binary stochastic hidden units between the data and the labels – exhibiting unstable performance depending on the hyperparameters is considered.

Design/methodology/approach

There are three main search methods for hyperparameter tuning: manual search, grid search and random search. This study shows that the HUCRFs' unstable performance depends on the hyperparameter values used, and tunes their performance by drawing on grid and random searches. All experiments used n-gram features – specifically, unigrams, bigrams and trigrams.
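The difference between the two automated methods can be sketched on a toy objective: grid search enumerates a fixed lattice of values, while random search draws the same budget of trials from distributions. The objective and parameter ranges below are illustrative, not the HUCRF experiments:

```python
import itertools
import math
import random

def objective(lr, reg):
    """Toy validation score, peaking at lr = 0.1 and reg = 0.01."""
    return -((math.log10(lr) + 1) ** 2 + (math.log10(reg) + 2) ** 2)

# Grid search: evaluate every combination of a hand-picked lattice.
grid_lr = [0.001, 0.01, 0.1, 1.0]
grid_reg = [0.001, 0.01, 0.1]
best_grid = max(itertools.product(grid_lr, grid_reg), key=lambda p: objective(*p))

# Random search: the same budget of 12 trials, drawn log-uniformly.
random.seed(3)
trials = [(10 ** random.uniform(-3, 0), 10 ** random.uniform(-3, -1))
          for _ in range(12)]
best_rand = max(trials, key=lambda p: objective(*p))

print("grid:", best_grid, "random:", best_rand)
```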

Findings

Naturally, selecting a list of hyperparameter values on the basis of a researcher's experience, so as to find a set in which the best performance is exhibited, is better than drawing them from a probability distribution. Realistically, however, it is impossible to evaluate all parameter combinations. The present research indicates that the random search method performs better than the grid search method while requiring shorter computation time and lower cost.

Originality/value

In this paper, the issues affecting the performance of HUCRF – a nonlinear model whose performance varies with the hyperparameters but which performs better than CRF – have been examined.

Details

Engineering Computations, vol. 34 no. 6
Type: Research Article
ISSN: 0264-4401

Keywords

Article
Publication date: 1 September 2005

Renkuan Guo and Ernie Love

Abstract

Purpose

Intends to address a fundamental problem in maintenance engineering: how should the shutdown of a production system be scheduled? In this regard, intends to investigate a way to predict the next system failure time based on the system historical performances.

Design/methodology/approach

GM(1,1) model from the grey system theory and the fuzzy set statistics methodologies are used.
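The GM(1,1) model fits the first-order grey differential equation dx⁽¹⁾/dt + a·x⁽¹⁾ = b to the accumulated series and extrapolates it. A compact sketch under the standard GM(1,1) conventions, using illustrative inter-failure times rather than the authors' data:

```python
import math

def gm11(x0, steps=1):
    """Grey GM(1,1): fit dx1/dt + a*x1 = b on the accumulated series x1."""
    n = len(x0)
    x1 = [sum(x0[:k + 1]) for k in range(n)]              # accumulated generating operation
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]  # background values
    y = x0[1:]
    # Least squares for (a, b) in y[k] = -a*z[k] + b (2x2 normal equations).
    m = n - 1
    szz = sum(v * v for v in z)
    sz = sum(z)
    szy = sum(v * w for v, w in zip(z, y))
    sy = sum(y)
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    def x1_hat(k):  # fitted accumulated series, k = 0, 1, 2, ...
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    # Forecast the original series by differencing the fitted accumulated series.
    return [x1_hat(n - 1 + s) - x1_hat(n - 2 + s) for s in range(1, steps + 1)]

# Illustrative inter-failure times (hypothetical data).
times = [120.0, 132.0, 141.0, 155.0, 168.0]
print(gm11(times, steps=1))   # one-step-ahead prediction
```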

Findings

It was found that the system's next unexpected failure time can be predicted by the grey system theory model as well as by the fuzzy set statistics methodology. In particular, the grey modelling is more direct and less complicated in its mathematical treatment.

Research implications

Many maintenance models have been developed, but most of them seek optimality from the viewpoint of probabilistic theory. A new filtering theory based on grey system theory is introduced so that any actual system functioning (failure) time can be effectively partitioned into system characteristic functioning times and repair improvement (damage) times.

Practical implications

In today's highly competitive business world, effectively predicting a production system's next failure time can guarantee product quality and secure on-schedule delivery of the product under contract. The grey filters effectively predict the next system failure time as a function of the chronological time of the production system; the system's near-future behaviour is thereby clearly shown, so that management can use this state information for production and maintenance planning.

Originality/value

Provides a viewpoint on system failure‐repair predictions.

Details

Journal of Quality in Maintenance Engineering, vol. 11 no. 3
Type: Research Article
ISSN: 1355-2511

Keywords

Open Access
Article
Publication date: 15 December 2023

Chon Van Le and Uyen Hoang Pham

Abstract

Purpose

This paper aims mainly at introducing applied statisticians and econometricians to the current research methodology with non-Euclidean data sets. Specifically, it provides the basis and rationale for statistics in Wasserstein space, where the metric on probability measures is taken as a Wasserstein metric arising from optimal transport theory.

Design/methodology/approach

The authors spell out the basis and rationale for using Wasserstein metrics on the data space of (random) probability measures.
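On the real line, the p-Wasserstein distance between two empirical measures with the same number of atoms reduces to the monotone matching of sorted samples. A minimal sketch of this standard one-dimensional fact, not the paper's general construction on spaces of probability measures:

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between the empirical measures (1/n) sum delta_{x_i} and (1/n) sum delta_{y_i}.
    In one dimension the optimal coupling is the monotone (sorted) matching."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    return (sum(abs(x - y) ** p for x, y in zip(xs, ys)) / n) ** (1 / p)

a = [0.0, 1.0, 2.0]
b = [1.0, 2.0, 3.0]
print(wasserstein_1d(a, b, p=1))   # each atom shifts by 1, so W1 = 1.0
```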

Findings

In elaborating the new statistical analysis of non-Euclidean data sets, the paper illustrates the generalization of traditional aspects of statistical inference following Fréchet's program.

Originality/value

Besides the elaboration of research methodology for a new data analysis, the paper discusses the applications of Wasserstein metrics to the robustness of financial risk measures.
