Search results

1 – 10 of over 4000
Article
Publication date: 1 February 2016

Manoj Manuja and Deepak Garg

Syntax-based text classification (TC) mechanisms have been largely replaced by semantic-based systems in recent years. Semantic-based TC systems are particularly useful in those…

Abstract

Purpose

Syntax-based text classification (TC) mechanisms have been largely replaced by semantic-based systems in recent years. Semantic-based TC systems are particularly useful in scenarios where similarity among documents is computed by considering semantic relationships among their terms. Kernel functions have received major attention because of the unprecedented popularity of SVMs in the field of TC. Most kernel functions exploit the syntactic structure of the text, but quite a few also use a priori semantic information for knowledge extraction. The purpose of this paper is to investigate semantic kernel functions in the context of TC.

Design/methodology/approach

This work presents a performance and accuracy analysis of seven semantic kernel functions (Semantic Smoothing Kernel, Latent Semantic Kernel, Semantic WordNet-based Kernel, Semantic Smoothing Kernel with Implicit Superconcept Expansions, Compactness-based Disambiguation Kernel Function, Omiotis-based S-VSM semantic kernel function and Top-k S-VSM semantic kernel) implemented with SVM as the kernel method. All seven semantic kernels are implemented in the SVM-Light tool.
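
As a rough illustration of how such kernels plug into an SVM (the paper itself uses SVM-Light, not scikit-learn), the sketch below wires a semantic-smoothing-style kernel of the form k(d1, d2) = d1 S S^T d2^T into a precomputed-kernel SVM. The toy documents and the term-similarity matrix S are invented placeholders for the a priori semantic knowledge (e.g. WordNet- or LSA-derived term similarities).

# Illustrative sketch only (not the paper's SVM-Light implementation): a
# semantic-smoothing-style kernel k(d1, d2) = d1 S S^T d2, plugged into an SVM
# through scikit-learn's precomputed-kernel interface.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_extraction.text import CountVectorizer

docs = ["cheap loan offer", "meeting agenda attached",
        "loan approval offer", "project meeting notes"]
labels = np.array([1, 0, 1, 0])

vec = CountVectorizer()
X = vec.fit_transform(docs).toarray().astype(float)      # term-frequency vectors
n_terms = X.shape[1]

# S is a hypothetical term-similarity (semantic proximity) matrix; in the paper
# it would come from a priori knowledge such as WordNet or latent semantic analysis.
S = np.eye(n_terms) + 0.1 * np.ones((n_terms, n_terms))

def semantic_kernel(A, B, S):
    # Gram matrix of the smoothed documents: k(a, b) = (a S) . (b S)
    return (A @ S) @ (B @ S).T

K_train = semantic_kernel(X, X, S)
clf = SVC(kernel="precomputed").fit(K_train, labels)
print(clf.predict(semantic_kernel(X, X, S)))              # predictions on the training docs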

Findings

Performance and accuracy parameters of the seven semantic kernel functions have been evaluated and compared. The experimental results show that the Top-k S-VSM semantic kernel has the highest performance and accuracy among all the evaluated kernel functions, which makes it a preferred building block for kernel methods for TC and retrieval.

Research limitations/implications

A combination of semantic kernel functions with syntactic kernel functions needs to be investigated, as there is scope for further improvement in accuracy and performance for all seven semantic kernel functions.

Practical implications

This research provides an insight into TC using a priori semantic knowledge. Three commonly used data sets are exploited. It would be interesting to explore these kernel functions on live web data, which would test their actual utility in real business scenarios.

Originality/value

Comparison of performance and accuracy parameters is the novel point of this research paper. To the best of the authors’ knowledge, this type of comparison has not been done previously.

Details

Program, vol. 50 no. 1
Type: Research Article
ISSN: 0033-0337

Keywords

Details

Machine Learning and Artificial Intelligence in Marketing and Sales
Type: Book
ISBN: 978-1-80043-881-1

Book part
Publication date: 23 June 2016

Yulia Kotlyarova, Marcia M. A. Schafgans and Victoria Zinde-Walsh

For kernel-based estimators, smoothness conditions ensure that the asymptotic rate at which the bias goes to zero is determined by the kernel order. In a finite sample, the…

Abstract

For kernel-based estimators, smoothness conditions ensure that the asymptotic rate at which the bias goes to zero is determined by the kernel order. In a finite sample, the leading term in the expansion of the bias may provide a poor approximation. We explore the relation between smoothness and bias and provide estimators for the degree of the smoothness and the bias. We demonstrate the existence of a linear combination of estimators whose trace of the asymptotic mean-squared error is reduced relative to the individual estimator at the optimal bandwidth. We examine the finite-sample performance of a combined estimator that minimizes the trace of the MSE of a linear combination of individual kernel estimators for a multimodal density. The combined estimator provides a robust alternative to individual estimators that protects against uncertainty about the degree of smoothness.
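
As a toy illustration of combining kernel estimators (not the chapter's trace-of-asymptotic-MSE criterion), the sketch below mixes two kernel density estimates of a multimodal sample using different bandwidths and picks the mixing weight on a held-out split; the bimodal sample, bandwidth factors and held-out-likelihood rule are all illustrative stand-ins.

# Illustrative sketch only: a linear combination f_w = w*f_small + (1-w)*f_large
# of two kernel density estimators with different bandwidths, with the weight
# chosen on a held-out split. (The chapter selects weights by minimizing the
# trace of the asymptotic MSE; held-out likelihood is just a simple stand-in.)
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2, 0.5, 400), rng.normal(2, 1.0, 400)])  # multimodal
train, valid = sample[:600], sample[600:]

f_small = gaussian_kde(train, bw_method=0.1)   # lower bias, higher variance
f_large = gaussian_kde(train, bw_method=0.5)   # higher bias, lower variance

weights = np.linspace(0, 1, 21)
scores = [np.sum(np.log(w * f_small(valid) + (1 - w) * f_large(valid))) for w in weights]
w_star = weights[int(np.argmax(scores))]
print("selected weight on the small-bandwidth estimator:", w_star)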

Details

Essays in Honor of Aman Ullah
Type: Book
ISBN: 978-1-78560-786-8

Keywords

Article
Publication date: 15 June 2015

Zhenyuan Tang and Decheng Wan

The jet impingement usually accompanying large interface movement is studied by the in-house solver MLParticle-SJTU based on the modified moving particle semi-implicit (MPS…

Abstract

Purpose

Jet impingement, which usually involves large interface movement, is studied using the in-house solver MLParticle-SJTU, based on a modified moving particle semi-implicit (MPS) method that can provide more accurate pressure fields and deformed interface shapes. Comparisons of the pressure distribution and the shape of the free surface between the presented numerical results and the analytical solution are investigated. The paper aims to discuss these issues.

Design/methodology/approach

To avoid the instability of the traditional MPS, a modified MPS method is employed, which includes a mixed source term for the Poisson pressure equation (PPE), a kernel function without singularity, a momentum-conservative gradient model and a highly precise free surface detection approach. A detailed analysis of the improved schemes in the modified MPS is carried out. In particular, three kinds of source term in the PPE are considered: the particle number density (PND) method, the mixed source term method and the divergence-free method. Two typical kernel functions are analyzed: the original kernel function with singularity and a modified kernel function without singularity. Three kinds of pressure gradient are considered: the original pressure gradient (OPG), the conservative pressure gradient (CPG) and the modified pressure gradient (MPG). In addition, a particle convergence study is performed by running the simulation at various spatial resolutions. Finally, the pressure fields obtained by the modified MPS and by SPH are compared.
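
For reference, the two kernel shapes discussed above can be sketched as follows: the traditional MPS kernel w(r) = re/r - 1 is singular at r = 0, while w(r) = re/(0.85 r + 0.15 re) - 1 is one singularity-free variant that appears in the MPS literature (the exact modified kernel used in MLParticle-SJTU may differ).

# Sketch of the two kernel shapes (standard MPS literature values; the exact
# modified kernel used by the solver in the paper may differ).
import numpy as np

def kernel_original(r, re):
    # Traditional MPS kernel w(r) = re/r - 1, singular as r -> 0.
    r = np.asarray(r, dtype=float)
    return np.where((r > 0) & (r < re), re / np.where(r > 0, r, 1.0) - 1.0, 0.0)

def kernel_no_singularity(r, re):
    # A singularity-free variant: w(r) = re / (0.85 r + 0.15 re) - 1 for r < re.
    r = np.asarray(r, dtype=float)
    return np.where(r < re, re / (0.85 * r + 0.15 * re) - 1.0, 0.0)

r = np.linspace(0.0, 1.2, 7)
print(kernel_original(r, re=1.0))
print(kernel_no_singularity(r, re=1.0))   # finite value (about 5.67) at r = 0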

Findings

The modified MPS method provides a reliable pressure distribution and free surface shape, compared with the analytical solution, in the steady state after the water jet impinges on the wall. Specifically, the mixed source term in the PPE gives a reasonable free surface profile and pressure distribution, while the PND method adopted in the traditional MPS is not stable in the simulation and the divergence-free method cannot produce a rational pressure field near the wall. The two kernel functions show similar pressure fields; however, the kernel function without singularity is preferred in this case for predicting the free surface profile and the pressure on the wall. The free surface shape obtained by CPG and MPG is in agreement with the analytical solution, while a great discrepancy is observed with OPG. The pressure peak by MPG is closer to the analytical solution than that by CPG, while the pressure distribution on the right-hand side of the pressure peak by the latter matches the analytical solution better than that by the former. Besides, a fine spatial resolution is necessary to achieve good agreement with the analytical results. In addition, the pressure field by the modified MPS is quite similar to that by SPH, which further validates the reliability of the current modified MPS.

Originality/value

The present modified MPS appears to be a stable and reliable tool for impinging jet flow problems involving large interface movement. The mixed source term in the PPE is superior to the PND source term adopted in the traditional MPS and to the divergence-free method. The kernel function without singularity is preferred to improve the computational accuracy in this case. CPG is a good choice for obtaining the free surface shape and the pressure distribution produced by jet impingement.

Details

Engineering Computations, vol. 32 no. 4
Type: Research Article
ISSN: 0264-4401

Keywords

Article
Publication date: 14 November 2016

Shrawan Kumar Trivedi and Shubhamoy Dey

Email is an important medium for sharing information rapidly. However, spam, being a nuisance in such communication, motivates the building of a robust filtering system with…

Abstract

Purpose

Email is an important medium for sharing information rapidly. However, spam, being a nuisance in such communication, motivates the building of a robust filtering system with high classification accuracy and good sensitivity towards false positives. In that context, this paper aims to present a combined classifier technique using a committee selection mechanism, where the main objective is to identify a set of classifiers whose individual decisions can be combined by a committee selection procedure for accurate detection of spam.

Design/methodology/approach

For training and testing of the relevant machine learning classifiers, text mining approaches are used in this research. Three data sets (Enron, SpamAssassin and LingSpam) have been used to test the classifiers. Initially, pre-processing is performed to extract the features associated with the email files. In the next step, the extracted features are taken through a dimensionality reduction method in which non-informative features are removed. Subsequently, an informative feature subset is selected using a genetic feature search. Thereafter, the proposed classifiers are tested on those informative features and the results are compared with those of other classifiers.
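
A minimal software stand-in for this pipeline is sketched below, using toy email strings, TF-IDF features, and chi-square selection in place of the paper's genetic feature search; the final Naive Bayes stage is just a placeholder for the classifiers evaluated in the findings.

# Minimal stand-in for the described pipeline (toy data; chi-square selection
# substitutes for the genetic feature search used in the paper).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting rescheduled to friday",
          "claim your free prize today", "minutes from the friday meeting"]
labels = [1, 0, 1, 0]                                      # 1 = spam, 0 = ham

pipe = Pipeline([
    ("features", TfidfVectorizer(stop_words="english")),   # pre-processing / feature extraction
    ("select", SelectKBest(chi2, k=5)),                    # informative-feature subset
    ("classify", MultinomialNB()),                         # placeholder classifier
])
pipe.fit(emails, labels)
print(pipe.predict(["free prize inside", "agenda for the next meeting"]))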

Findings

For building the proposed combined classifier, three different studies were performed. The first study identified the effect of boosting algorithms on two probabilistic classifiers, Bayesian and Naïve Bayes; in that study, AdaBoost was found to be the best algorithm for performance boosting. The second study examined the effect of different kernel functions on the support vector machine (SVM) classifier, where SVM with the normalized polynomial (NP) kernel was observed to be the best. The last study combined classifiers with committee selection, where the committee members were the best classifiers identified by the first study (Bayesian and Naïve Bayes with AdaBoost) and the committee president was selected from the second study (SVM with the NP kernel). Results show that combining the identified classifiers into a committee machine gives excellent classification accuracy with a low false positive rate.
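
One plausible reading of this committee is sketched below as a majority-vote ensemble with a hand-rolled normalized polynomial kernel for the SVM "president"; the paper's committee-selection rule is more elaborate than plain voting, and the Bayesian member is only approximated here by a second Naive Bayes variant.

# Rough stand-in for the committee machine: two boosted probabilistic members
# and an SVM "president" with a normalized polynomial kernel, combined here by
# simple majority vote. Requires scikit-learn >= 1.2 for the `estimator` keyword.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.naive_bayes import MultinomialNB, BernoulliNB
from sklearn.svm import SVC

def normalized_poly_kernel(X, Y, degree=2):
    # K(x, y) = (x.y + 1)^d / sqrt((x.x + 1)^d * (y.y + 1)^d), for dense arrays.
    G = (X @ Y.T + 1.0) ** degree
    nx = ((X * X).sum(axis=1) + 1.0) ** degree
    ny = ((Y * Y).sum(axis=1) + 1.0) ** degree
    return G / np.sqrt(np.outer(nx, ny))

committee = VotingClassifier(
    estimators=[
        ("nb_boost",  AdaBoostClassifier(estimator=MultinomialNB(), n_estimators=25)),
        ("bnb_boost", AdaBoostClassifier(estimator=BernoulliNB(), n_estimators=25)),
        ("president", SVC(kernel=normalized_poly_kernel)),
    ],
    voting="hard",
)
# Usage (X_train/X_test are hypothetical dense, non-negative feature matrices):
# committee.fit(X_train, y_train); print(committee.predict(X_test))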

Research limitations/implications

This research focuses on the classification of email spam written in the English language. Only the body (text) parts of the emails have been used. Image spam has not been included in this work. The work is restricted to email messages; other message types, such as short message service or multimedia messaging service, were not part of this study.

Practical implications

This research proposes a method of dealing with the issues and challenges faced by internet service providers and organizations that use email. The proposed model provides not only better classification accuracy but also a low false positive rate.

Originality/value

The proposed combined classifier is a novel classifier designed for accurate classification of email spam.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 46 no. 4
Type: Research Article
ISSN: 2059-5891

Keywords

Article
Publication date: 6 November 2017

Fahimeh Saberi Zafarghandi, Maryam Mohammadi, Esmail Babolian and Shahnam Javadi

The purpose of this paper is to introduce a local Newton basis functions collocation method for solving the 2D nonlinear coupled Burgers’ equations. It needs less computer storage…

Abstract

Purpose

The purpose of this paper is to introduce a local Newton basis functions collocation method for solving the 2D nonlinear coupled Burgers’ equations. It needs less computer storage and flops than the usual global radial basis functions collocation method and also stabilizes the numerical solutions of the convection-dominated equations by using the Newton basis functions.

Design/methodology/approach

A meshless method based on a spatial trial space spanned by local Newton basis functions in the “native” Hilbert space of the reproducing kernel is presented. With selected local sub-clusters of domain nodes, an approximation function is introduced as a sum of weighted local Newton basis functions, and the collocation approach is then used to determine the weights. The method leads to a system of ordinary differential equations (ODEs) for the time-dependent partial differential equations (PDEs).
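
The Newton-basis idea can be sketched in one dimension: if K = L L^T is the Cholesky factorization of the kernel matrix, the Newton basis evaluated at the nodes is L and the interpolation weights follow from a single triangular solve. The Gaussian kernel, node count and shape parameter below are illustrative; the paper's local sub-cluster construction and time integration for the Burgers' equations are not reproduced.

# Small 1-D sketch of kernel interpolation in the Newton basis (illustrative
# kernel and parameters; not the paper's local collocation solver).
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def gauss_kernel(x, y, eps=3.0):
    return np.exp(-(eps * (x[:, None] - y[None, :])) ** 2)

nodes = np.linspace(0.0, 1.0, 15)
f = np.sin(2 * np.pi * nodes)                       # data to interpolate

K = gauss_kernel(nodes, nodes)
L = cholesky(K + 1e-12 * np.eye(len(nodes)), lower=True)   # Newton basis values at the nodes
c = solve_triangular(L, f, lower=True)              # weights in the Newton basis

x_eval = np.linspace(0.0, 1.0, 5)
newton_at_x = solve_triangular(L, gauss_kernel(nodes, x_eval), lower=True).T  # N_j(x_eval)
print(newton_at_x @ c)                              # interpolant values at x_eval
print(np.sin(2 * np.pi * x_eval))                   # compare with the target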

Findings

The method is successfully used for solving the 2D nonlinear coupled Burgers’ equations for reasonably high values of the Reynolds number (Re). It is a well-known issue in the analysis of convection-diffusion problems that the solution becomes oscillatory when the problem is convection-dominated if standard methods are used without special treatment. With the proposed method, the authors do not detect any instability near the front, hence no special stabilization technique is needed. The numerical results show that the proposed method is efficient, accurate and stable for flow with reasonably high values of Re.

Originality/value

The authors use basis functions that are more stable than the standard basis of translated kernels for representing kernel-based approximants in the numerical solution of partial differential equations (PDEs). The main values of the paper are the local character of the method, a well-structured implementation that includes enforcing the Dirichlet and Neumann boundary conditions, and accurate, stable results for flow with reasonably high values of Re in the numerical solution of the 2D nonlinear coupled Burgers’ equations without any special technique.

Details

International Journal of Numerical Methods for Heat & Fluid Flow, vol. 27 no. 11
Type: Research Article
ISSN: 0961-5539

Keywords

Article
Publication date: 4 October 2017

Mehdi Habibi and Ahmad Reza Danesh

The purpose of this study is to propose a pulse width based, in-pixel, arbitrary size kernel convolution processor. When image sensors are used in machine vision tasks, large…

Abstract

Purpose

The purpose of this study is to propose a pulse width based, in-pixel, arbitrary size kernel convolution processor. When image sensors are used in machine vision tasks, large amounts of data need to be transferred to the output and fed to a processor. Basic, low-level image processing functions such as kernel convolution are used extensively in the early stages of most machine vision tasks. These low-level functions are usually computationally intensive, and if the computation is performed inside every pixel, the burden on the external processor is greatly reduced.

Design/methodology/approach

In the proposed architecture, digital pulse-width processing is used to perform kernel convolution on the image sensor data. With this approach, the photocurrent fluctuations are expressed as changes in the pulse width of an output signal, and the small processor incorporated in each pixel receives the output signals of the corresponding pixel and its neighbors and produces a binary-coded output result for that specific pixel. The process is carried out in parallel across all pixels of the image sensor.
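
As a behavioral reference for what each in-pixel processor effectively computes (a kernel-weighted sum over the pixel's neighborhood), a plain software sketch follows; the pulse-width hardware itself is not modeled, and the sharpening kernel and edge padding are illustrative choices.

# Behavioral reference of the per-pixel computation: each output pixel is a
# kernel-weighted sum over its neighborhood (the hardware is not modeled).
import numpy as np

def in_pixel_convolution(image, kernel):
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(image, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):          # conceptually done in parallel per pixel
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out

sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
image = np.arange(25, dtype=float).reshape(5, 5)
print(in_pixel_convolution(image, sharpen))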

Findings

It is shown that, using the proposed architecture, not only can kernel convolution be performed in the digital domain inside smart image sensors, but arbitrary kernel coefficients are also obtainable simply by adjusting the sampling frequency at different phases of the processing.

Originality/value

Although in-pixel digital kernel convolution has been reported previously, the presented approach requires no in-pixel analog-to-binary-coded-digital converter. Furthermore, arbitrary kernel coefficients and scaling can be deployed in the processing. The given architecture is a suitable choice for smart image sensors intended for high-speed machine vision tasks.

Details

Sensor Review, vol. 37 no. 4
Type: Research Article
ISSN: 0260-2288

Keywords

Article
Publication date: 26 August 2014

Xin Ma, Rubing Ge and Li Zhang

The purpose of this paper is to build a support vector machine (SVM) model to evaluate the city air quality level, using the three main air pollutants selected as evaluation…

Abstract

Purpose

The purpose of this paper is to build a support vector machine (SVM) model to evaluate the city air quality level, using the three main air pollutants as the evaluation indices.

Design/methodology/approach

PM10, SO2 and NO2 are the three most important air pollutants, and their concentration data are selected as the influencing factors. An SVM model is built and used to evaluate the air quality level of 29 major cities in China in 2011. Cross-validation is adopted to select the optimal penalty parameter and kernel function, and the classification accuracies achieved under different normalization methods and kernel functions are compared.
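
A sketch of the described model-selection loop is given below using synthetic pollutant readings (invented numbers, not the 2011 city data): [0, 1] normalization via min-max scaling, then a cross-validated search over the SVM penalty parameter and kernel.

# Sketch with synthetic PM10/SO2/NO2 readings and labels; mirrors the described
# cross-validated selection of the penalty parameter and kernel after [0, 1] scaling.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.uniform([20, 5, 10], [200, 80, 120], size=(60, 3))   # PM10, SO2, NO2 (synthetic)
score = X @ np.array([0.5, 0.3, 0.2])
y = (score > np.median(score)).astype(int)                   # synthetic quality level

pipe = Pipeline([("scale", MinMaxScaler()), ("svm", SVC())])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10, 100],
                           "svm__kernel": ["linear", "rbf", "poly"]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)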

Findings

The study found that the parameters and kernel functions chosen for the SVM model influence its prediction accuracy. Through continued optimization of the model parameters, it was found that the model performs best with [0, 1] normalization and the RBF kernel function. This shows that the SVM classification model is effective in dealing with the problem of city air quality evaluation.

Practical implications

The results of this study show that the SVM classification model can be applied to predict the city air quality level using air pollutant concentration data as the evaluation index. It can help the government and relevant departments issue corresponding environmental policies and environmental protection measures.

Originality/value

Qualitative and quantitative study methods are combined in this paper, building on previous research results and a careful analysis to select the evaluation indices. The SVM classification model is implemented and simulated in Matlab; beyond comparing accuracy, its outcomes and its efficiency in classification are demonstrated.

Details

Kybernetes, vol. 43 no. 8
Type: Research Article
ISSN: 0368-492X

Keywords

Article
Publication date: 9 May 2008

Slavko Vujević and Petar Sarajčev

This paper aims to describe a numerical procedure for approximating the potential distribution for a harmonic current point source, which is either buried in horizontally…

Abstract

Purpose

This paper aims to describe a numerical procedure for approximating the potential distribution for a harmonic current point source, which is either buried in horizontally stratified multilayer earth, or positioned in the air. The procedure is very efficient and general. The total number of layers and the source position in relation to the medium model layers are completely arbitrary.

Design/methodology/approach

The efficiency of the computation procedure is based on the successful numerical approximation of the two kernel functions in the integral expression for the potential distribution within an arbitrarily chosen layer of the medium model. Each kernel function of the observed layer is approximated by a linear combination of 15 real exponential functions. Using these approximations and analytical integration based on the Weber integral, a simple expression for the numerical approximation of the potential distribution within the boundaries of the observed medium layer is obtained. Potential retardation is taken into account approximately.
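
The form of such an approximation can be sketched as a linear least-squares fit of 15 real exponentials with preset decay rates to a sampled kernel function. The stand-in kernel f(lambda) = 1/(1 + lambda)^1.5 and the geometric grid of decay rates below are invented; the paper's actual fitting procedure and the closed-form Weber-integral step are not reproduced.

# Sketch: approximate a sampled kernel function by a linear combination of
# 15 real exponentials exp(-b_i * lambda) with preset decay rates b_i,
# fitting only the coefficients by linear least squares.
import numpy as np

lam = np.linspace(0.0, 10.0, 400)                 # samples of the integration variable
f = 1.0 / (1.0 + lam) ** 1.5                      # stand-in kernel function, not the paper's

decay = np.geomspace(0.05, 20.0, 15)              # preset exponents b_i
A = np.exp(-np.outer(lam, decay))                 # columns exp(-b_i * lambda)
coeffs, *_ = np.linalg.lstsq(A, f, rcond=None)

approx = A @ coeffs
print("max abs error:", np.max(np.abs(approx - f)))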

Findings

The numerical procedure developed for the approximation of potential distribution for a harmonic current point source, which is positioned arbitrarily in air or in horizontally stratified multilayer earth, is efficient, numerically stable and generally applicable.

Research limitations/implications

The numerical model developed for the harmonic current point source is the basis of wider numerical models for computing the harmonic and transient fields of earthing systems, which consist of earthing grids buried in horizontally stratified multilayer earth and metallic structures in the air.

Originality/value

This is an efficient and numerically stable frequency-dependent harmonic current point source model. Potential retardation, which is neglected in the first step of the approximation, is subsequently added to the potential expression in such a way that the Helmholtz differential equation is approximately solved without introducing the Sommerfeld integrals.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 27 no. 3
Type: Research Article
ISSN: 0332-1649

Keywords

Article
Publication date: 24 June 2020

Ahmad Reza Danesh and Mehdi Habibi

The purpose of this paper is to design a kernel convolution processor. High-speed image processing is a challenging task for real-time applications such as product quality control…

Abstract

Purpose

The purpose of this paper is to design a kernel convolution processor. High-speed image processing is a challenging task for real-time applications such as product quality control of manufacturing lines. Smart image sensors use an array of in-pixel processors to facilitate high-speed real-time image processing. These sensors are usually used to perform the initial low-level bulk image filtering and enhancement.

Design/methodology/approach

In this paper, a convolution image processor using pulse-width modulated signals and regular nearest-neighbor interconnections is presented. The processor is not only capable of processing arbitrary-size kernels, but the kernel coefficients can also be any arbitrary positive or negative floating-point number.

Findings

The performance of the proposed architecture is evaluated on a Xilinx Virtex-7 field programmable gate array platform. The peak signal-to-noise ratio metric is used to measure the computation error for different images, filters and illuminations. Finally, the power consumption of the circuit in different operating conditions is presented.
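
For reference, the peak signal-to-noise ratio used as the figure of merit above is typically computed as sketched below, comparing the hardware output against a floating-point reference result; this is not the authors' measurement code, and the images here are synthetic.

# Typical PSNR computation between a reference image and an approximation,
# in dB for 8-bit data (synthetic example images).
import numpy as np

def psnr(reference, approximation, peak=255.0):
    # Peak signal-to-noise ratio in dB between two images of equal shape.
    mse = np.mean((reference.astype(float) - approximation.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.default_rng(2).integers(0, 256, size=(64, 64))
noisy = np.clip(ref + np.random.default_rng(3).normal(0, 2, size=ref.shape), 0, 255)
print(round(psnr(ref, noisy), 1), "dB")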

Originality/value

The presented processor array can be used for high-speed kernel convolution image processing tasks including arbitrary size edge detection and sharpening functions, which require negative and fractional kernel values.
