Search results
1 – 10 of over 4,000 results
Abstract
The commitment to statistical process control programmes is becoming commonplace in US industry. However, some companies are experiencing failure of these programmes, particularly in multi‐strata (population) production processes. Even if such a process is in a state of statistical control, there is a high likelihood that one or more strata could drift away from the target owing to an assignable cause. The success of a QC programme depends on the ability of a quality control practitioner to detect this shift with greater statistical power (sensitivity) and take corrective action. Addresses the problem faced by the multi‐strata production process of a local manufacturing company in detecting a single‐stratum shift from the target with a high level of sensitivity. Proposes that the selection of an appropriate sampling method (stratified or random) has a strong bearing on the relative sensitivity of detecting such a shift in a single stratum. Develops power curves for the above‐mentioned process under stratified and random sampling scenarios when a shift occurs in a single stratum. Examines the relationship of sample size to the threshold level of the stratum shift and the preferred sampling method.
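The power-curve comparison can be sketched with a short Monte Carlo simulation. The sketch below is illustrative only (the x-bar chart, 3-sigma limits, equal allocation and stratum count are assumptions, not the paper's exact model): one of k strata shifts its mean by delta sigma, and we estimate the probability that a subgroup of size n signals under each sampling method.

```python
import numpy as np

def detection_power(delta, n=8, k=4, sigma=1.0, reps=20000, stratified=True, seed=0):
    """Monte Carlo power of an x-bar chart signal when ONE of k strata
    shifts its mean by `delta` (in sigma units). Control limits assume
    an in-control process centred at 0."""
    rng = np.random.default_rng(seed)
    limit = 3 * sigma / np.sqrt(n)              # 3-sigma limits for the subgroup mean
    if stratified:
        # equal allocation: the shifted stratum contributes exactly n/k
        # units to every subgroup
        counts = np.full(reps, n // k)
    else:
        # simple random sampling: the shifted-stratum count per subgroup
        # is Binomial(n, 1/k)
        counts = rng.binomial(n, 1.0 / k, size=reps)
    # subgroup mean = base noise + shift contribution of the sampled shifted units
    xbar = rng.normal(0, sigma / np.sqrt(n), size=reps) + counts * delta * sigma / n
    return float(np.mean(np.abs(xbar) > limit))

p_strat = detection_power(2.0, stratified=True)
p_srs = detection_power(2.0, stratified=False)
```

Under simple random sampling the shifted-stratum count varies between subgroups, which changes the detection probability relative to stratified sampling; sweeping `delta` traces out the two power curves the abstract refers to.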
Abstract
Purpose
This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other framework. DDPML can also be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.
Design/methodology/approach
In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can later be used for prediction. This knowledge thus becomes a great asset in companies' hands, and exploiting it is precisely the objective of data mining. With data and knowledge now produced in large volumes and at a faster pace, this has become Big Data mining. The authors' proposed work therefore aims at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem raised in this work is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of the classification results. To solve it, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML). The work is divided into two parts. In the first, the authors propose a distributed architecture controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique; this architecture is designed to handle big data processing coherently and efficiently together with the sampling strategy proposed in this work, and it also allows the authors to verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. The same sampling method is also applied to extract the shared learning base (SLB) and the partial learning bases for the first level (PLBL1) and the second level (PLBL2).
The experimental results show the efficiency of the proposed solution, with no significant loss in classification results. In practical terms, the DDPML system is dedicated to big data mining processing and works effectively in distributed systems with a simple structure, such as client-server networks.
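The two-level stratified extraction described above can be sketched in a few lines of pandas. This is a minimal illustration, not the authors' DDPML implementation: the class label serves as the stratum, and the fractions, column names and the use of the remainder for the second level are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# toy data set: 1,000 rows with an imbalanced binary class label
df = pd.DataFrame({"x": rng.normal(size=1000),
                   "label": rng.choice([0, 1], size=1000, p=[0.8, 0.2])})

def stratified_sample(df, stratum_col, frac, seed=0):
    """Draw `frac` of the rows from every stratum so the sample keeps
    the original stratum (class) proportions."""
    parts = [g.sample(frac=frac, random_state=seed) for _, g in df.groupby(stratum_col)]
    return pd.concat(parts)

# level 1: a shared learning base (SLB) drawn from the full data
slb = stratified_sample(df, "label", frac=0.20)
# level 2: a partial learning base drawn from the remaining rows
plb = stratified_sample(df.drop(slb.index), "label", frac=0.10)
```

Because each stratum is sampled at the same rate, both bases preserve the class proportions of the original data, which is what makes the extracted learning base representative.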
Findings
The authors obtained very satisfactory classification results.
Originality/value
The DDPML system is specially designed to handle big data mining classification smoothly.
Abstract
Information and communications technology (ICT) offers enormous opportunities for individuals, businesses and society. The application of ICT is equally important to economic and non-economic activities. Researchers have increasingly focused on the adoption and use of ICT by small and medium enterprises (SMEs) as the economic development of a country is largely dependent on them. Following the success of ICT utilisation in SMEs in developed countries, many developing countries are looking to utilise the potential of the technology to develop SMEs. Past studies have shown that the contribution of ICT to the performance of SMEs is not clear and certain. Thus, it is crucial to determine the effectiveness of ICT in generating firm performance since this has implications for SMEs’ expenditure on the technology. This research examines the diffusion of ICT among SMEs with respect to the typical stages from innovation adoption to post-adoption, by analysing the actual usage of ICT and value creation. The mediating effects of integration and utilisation on SME performance are also studied. Grounded in the innovation diffusion literature, institutional theory and resource-based theory, this study has developed a comprehensive integrated research model focused on the research objectives. Following a positivist research paradigm, this study employs a mixed-method research approach. A preliminary conceptual framework is developed through an extensive literature review and is refined by results from an in-depth field study. During the field study, a total of 11 SME owners or decision-makers were interviewed. The recorded interviews were transcribed and analysed using NVivo 10 to refine the model to develop the research hypotheses. The final research model is composed of 30 first-order and five higher-order constructs which involve both reflective and formative measures. 
Partial least squares-based structural equation modelling (PLS-SEM) is employed to test the theoretical model with a cross-sectional data set of 282 SMEs in Bangladesh. Survey data were collected using a structured questionnaire issued to SMEs selected by applying a stratified random sampling technique. The structural equation modelling utilises a two-step procedure of data analysis. Prior to estimating the structural model, the measurement model is examined for construct validity of the study variables (i.e. convergent and discriminant validity).
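Proportional allocation of a fixed survey budget across strata (here, 282 SMEs) can be sketched as follows; the sector names and frame counts are hypothetical, not the study's actual frame.

```python
def proportional_allocation(strata_sizes, n):
    """Split a total sample of size n across strata in proportion to
    stratum size, with largest-remainder rounding so shares sum to n."""
    total = sum(strata_sizes.values())
    raw = {k: n * v / total for k, v in strata_sizes.items()}
    alloc = {k: int(r) for k, r in raw.items()}          # floor of each share
    leftover = n - sum(alloc.values())
    # hand remaining units to the strata with the largest fractional parts
    for k in sorted(raw, key=lambda k: raw[k] - alloc[k], reverse=True)[:leftover]:
        alloc[k] += 1
    return alloc

# hypothetical sampling frame of SMEs by sector, survey budget of 282
alloc = proportional_allocation(
    {"manufacturing": 5200, "trade": 3100, "services": 1700}, 282)
```

Largest-remainder rounding guarantees the stratum sample sizes add up exactly to the budgeted 282 respondents.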
The estimates show cognitive evaluation as an important antecedent of expectation, which is shaped primarily by the entrepreneurs’ beliefs (perceptions) and also influenced by the owners’ innovativeness and culture. Culture further influences expectation. The study finds that facilitating conditions, environmental pressure and country readiness are important antecedents of expectation and ICT use. The results also reveal that integration and the degree of ICT utilisation significantly affect SMEs’ performance. Surprisingly, the findings do not reveal any significant impact of ICT usage on performance, which apparently suggests the possibility of the ICT productivity paradox. However, the analysis finally proves the non-existence of the paradox by demonstrating that the mediating roles of ICT integration and degree of utilisation explain the influence of information technology (IT) usage on firm performance, which is consistent with resource-based theory. The results suggest that the use of ICT can enhance SMEs’ performance if the technology is integrated and properly utilised. SME owners or managers, interested stakeholders and policy makers may follow the study’s outcomes and focus on ICT integration and degree of utilisation with a view to attaining superior organisational performance.
This study urges concerned business enterprises and government to look at environmental and cultural factors with a view to achieving ICT usage success in terms of enhanced firm performance. In particular, improving organisational practices and procedures by eliminating the traditional power distance inside organisations and implementing necessary rules and regulations are important actions for managing environmental and cultural uncertainties. The application of a Bengali user interface may help to ensure the productivity of ICT use by SMEs in Bangladesh. Establishing a favourable national technology infrastructure and legal environment may contribute positively to improving the overall situation. This study also suggests some changes and modifications to the country’s existing policies and strategies. The government and policy makers should undertake mass promotional programs to disseminate information about the various uses of computers and their contribution to better organisational performance. Organising specialised training programs for SME capacity building may help motivate SMEs to use ICT. Ensuring easy access to the technology by providing loans, grants and subsidies is important. Various stakeholders, partners and related organisations should come forward to support government policies and priorities in order to ensure the productive use of ICT among SMEs, which will finally help to foster Bangladesh’s economic development.
Esteban and D. Morales
Abstract
Uses a unified expression, called the H_{h,v}^{φ1,φ2} entropy, to study the asymptotic properties of entropy estimates. Shows that the asymptotic distribution of entropy estimates, in a stratified random sampling set‐up, is normal. Based on the asymptotic precision of entropy estimates, optimum sample size allocations are developed under various constraints. Gives the relative precision of stratified and simple random sampling. Also provides applications to test statistical hypotheses and to build confidence intervals.
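A standard special case of optimum allocation is Neyman allocation, where stratum h receives a share of the sample proportional to N_h·S_h (stratum size times stratum standard deviation). The sketch below illustrates this classical rule, not the paper's more general constrained allocations; the stratum sizes and standard deviations are invented.

```python
def neyman_allocation(N, S, n):
    """Neyman (optimum) allocation: stratum h receives a sample share
    proportional to N_h * S_h, which minimises the variance of the
    stratified mean estimator for a fixed total sample size n."""
    weights = [Nh * Sh for Nh, Sh in zip(N, S)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

# three equal-sized strata with very different standard deviations:
# the most variable stratum gets most of the sample
alloc = neyman_allocation(N=[1000, 1000, 1000], S=[1.0, 2.0, 5.0], n=80)
```

With equal stratum sizes the allocation is driven entirely by the stratum standard deviations, which is why the third stratum receives five times the sample of the first.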
Iris Stuart, Yong-Chul Shin, Donald P. Cram and Vijay Karan
Abstract
The use of choice-based, matched, and other stratified sample designs is common in auditing research. However, it is not widely appreciated that the data analysis for these studies has to take into account the non-random nature of sample selection in these designs. A choice-based, matched or otherwise stratified sample is a nonrandom sample that must be analyzed using conditional analysis techniques. We review five research streams in the auditing area: work on the determinants of audit litigation, audit fees, auditor reporting in financially distressed firms, audit quality and auditor switches. Cram, Karan, and Stuart (CKS) (2009) demonstrated the accuracy of conditional analysis, compared to unconditional analysis, of nonrandom samples through the use of simulations, replications, and mathematical proofs. Papers published since then have continued to rely on questionable methods, however, and it is hard for researchers to assess the reliability of a given work. We complement and extend CKS (2009) by identifying audit papers in selected research streams whose results will likely differ if the data gathered are analyzed using conditional analysis techniques. Research can thus be advanced either by replication and reanalysis, or by refocusing new research on issues that should no longer be viewed as settled.
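The need for design-aware analysis can be illustrated with a small simulation, which is not the CKS (2009) procedure itself: in a choice-based sample that keeps every event plus an equal number of non-events, a naive logit intercept is badly biased, while the log odds ratio of a binary covariate is not; a prior correction of the Manski–Lerman/King–Zeng type recovers the intercept. All data and parameter values are synthetic.

```python
import numpy as np
from math import log

rng = np.random.default_rng(1)

# population: rare outcome, one binary covariate, logit model y ~ b0 + b1*x
n_pop = 200_000
x = rng.binomial(1, 0.5, n_pop)
b0, b1 = -4.0, 1.5
y = rng.binomial(1, 1 / (1 + np.exp(-(b0 + b1 * x))))

# choice-based sample: every event plus an equal-sized random set of non-events
events = np.flatnonzero(y == 1)
nonevents = rng.choice(np.flatnonzero(y == 0), size=len(events), replace=False)
s = np.concatenate([events, nonevents])
xs, ys = x[s], y[s]

def log_odds_ratio(x, y):
    """Slope of an intercept + binary-covariate logit, from the 2x2 table."""
    a = np.sum((x == 1) & (y == 1)); b = np.sum((x == 1) & (y == 0))
    c = np.sum((x == 0) & (y == 1)); d = np.sum((x == 0) & (y == 0))
    return log(a * d / (b * c))

slope = log_odds_ratio(xs, ys)    # survives outcome-based sampling
# naive intercept: log odds of the outcome at x = 0 in the sample (biased)
naive_b0 = log(np.sum((xs == 0) & (ys == 1)) / np.sum((xs == 0) & (ys == 0)))
# prior correction: shift the intercept by the log of the sampling-rate ratio
tau = y.mean()                    # population event rate (known here)
ybar = ys.mean()                  # sample event rate (0.5 by design)
corrected_b0 = naive_b0 - log(((1 - tau) / tau) * (ybar / (1 - ybar)))
```

The slope is recoverable from the stratified sample because the odds ratio is invariant to outcome-based selection, but the intercept (and hence any predicted probability) needs the correction.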
Christopher S. Henry and Tamás Ilyés
Abstract
For central banks that study the use of cash, acceptance of card payments is an important factor. Surveys to measure levels of card acceptance and the costs of payments can be complicated and expensive. In this paper, we exploit a novel data set from Hungary to examine the effect of stratified random sampling on estimates of payment card acceptance and usage. Using the Online Cashier Registry, a database covering the universe of merchant cash registers in Hungary, we create merchant-level and transaction-level data sets. We compare county (geographic), industry and store-size stratifications to simulate the usual stratification criteria for merchant surveys and assess the effect on estimates of card acceptance for different sample sizes. Further, we estimate logistic regression models of card acceptance/usage to see how stratification biases estimates of key determinants of card acceptance/usage.
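The effect of stratification on survey estimates can be sketched with a synthetic merchant frame (the acceptance rates, size shares and sample sizes below are invented, not Hungarian data): when acceptance varies strongly with store size, proportional size-stratification removes the between-stratum component of sampling variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# synthetic merchant frame: card acceptance rises sharply with store size
n_frame = 50_000
sizes = rng.choice(["small", "medium", "large"], size=n_frame, p=[0.7, 0.25, 0.05])
p_accept = {"small": 0.10, "medium": 0.60, "large": 0.98}
accepts = rng.random(n_frame) < np.array([p_accept[s] for s in sizes])

def srs_estimate(n):
    """Acceptance-rate estimate from a simple random sample."""
    return accepts[rng.choice(n_frame, n, replace=False)].mean()

def stratified_estimate(n):
    """Proportionally allocated, size-stratified estimate."""
    est = 0.0
    for s in ("small", "medium", "large"):
        pool = np.flatnonzero(sizes == s)
        share = len(pool) / n_frame
        take = max(1, round(n * share))
        est += share * accepts[rng.choice(pool, take, replace=False)].mean()
    return est

# compare the sampling variability of the two designs over many draws
reps, n = 400, 300
sd_srs = np.std([srs_estimate(n) for _ in range(reps)])
sd_strat = np.std([stratified_estimate(n) for _ in range(reps)])
```

Both estimators are unbiased for the frame acceptance rate; the stratified one is simply less variable, which is the usual rationale for size stratification in merchant surveys.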
Iraj Rahmani and Jeffrey M. Wooldridge
Abstract
We extend Vuong’s (1989) model-selection statistic to allow for complex survey samples. As a further extension, we use an M-estimation setting so that the tests apply to general estimation problems – such as linear and nonlinear least squares, Poisson regression and fractional response models, to name just a few – and not only to maximum likelihood settings. With stratified sampling, we show how the difference in objective functions should be weighted in order to obtain a suitable test statistic. Interestingly, the weights are needed in computing the model-selection statistic even in cases where stratification is appropriately exogenous, in which case the usual unweighted estimators for the parameters are consistent. With cluster samples and panel data, we show how to combine the weighted objective function with a cluster-robust variance estimator in order to expand the scope of the model-selection tests. A small simulation study shows that the weighted test is promising.
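The role of the weights can be illustrated with least squares as the M-estimation objective. The sketch below is a toy example, not the paper's test statistic: under stratification on the outcome (endogenous stratification), the unweighted estimator is inconsistent, while weighting the objective by inverse selection probabilities restores consistency.

```python
import numpy as np

rng = np.random.default_rng(3)

# population model: y = 1 + 2x + e
n_pop = 100_000
x = rng.normal(size=n_pop)
y = 1 + 2 * x + rng.normal(size=n_pop)

# stratify on the OUTCOME: keep all of the high-y stratum, 2% of the rest
high = y > 2
keep = high | (rng.random(n_pop) < 0.02)
prob = np.where(high, 1.0, 0.02)              # known selection probabilities
xs, ys, ws = x[keep], y[keep], 1.0 / prob[keep]

def wls(x, y, w):
    """Least squares for y = b0 + b1*x with observation weights w,
    i.e. minimising the weighted objective sum_i w_i*(y_i - b0 - b1*x_i)^2."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

b_unweighted = wls(xs, ys, np.ones_like(ws))  # ignores the design: biased here
b_weighted = wls(xs, ys, ws)                  # inverse-probability weighted
```

With endogenous stratification the weights are essential for the point estimates themselves; the paper's further point is that the weighted objective is needed in the model-selection statistic even when stratification is exogenous and unweighted estimators remain consistent.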
Laouni Djafri, Djamel Amar Bensaber and Reda Adjoudj
Abstract
Purpose
This paper aims to solve the problems of big data analytics for prediction, including volume, veracity and velocity, by improving the prediction result to an acceptable level in the shortest possible time.
Design/methodology/approach
This paper is divided into two parts. The first aims to improve the result of the prediction. In this part, two ideas are proposed: a double-pruning enhanced random forest algorithm, and the extraction of a shared learning base with the stratified random sampling method to obtain a learning base representative of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which in turn works coherently and efficiently with the sampling strategy under the supervision of the Map-Reduce algorithm.
Findings
The representative learning base obtained by integrating two learning bases, the partial base and the shared base, represents the original data set very well and gives very good results for Big Data predictive analytics. Furthermore, these results were supported by the improved random forest supervised learning method, which played a key role in this context.
Originality/value
All companies are concerned, especially those that hold large amounts of information and want to mine it to improve their knowledge of their customers and optimize their campaigns.