Search results
Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop
Abstract
In recent years, the problem of classification with imbalanced data has drawn growing attention in the data-mining and machine-learning communities because imbalanced data have become abundant in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree, and AdaBoost. Our study shows that all of the classification methods have difficulty with the imbalanced data, with KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve in overall performance. In particular, the random forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).
We then propose a new procedure to resample the data. Our method is based on the idea of eliminating “easy” majority observations before under-sampling them. It further improves the balanced accuracy of the random forest to 83.7%, making it the best approach for the imbalanced data.
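The balanced accuracy mentioned in this abstract (the average of the true-positive rate and the true-negative rate) can be illustrated with a minimal Python sketch. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the chapter; the sketch only shows why the metric differs from plain accuracy when one class dominates.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Average of the true-positive rate and the true-negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    tpr = tp / (tp + fn)  # share of the minority (positive) class detected
    tnr = tn / (tn + fp)  # share of the majority (negative) class detected
    return (tpr + tnr) / 2


# Toy, made-up labels: 2 minority (1) and 8 majority (0) observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                           # plain accuracy
print(balanced_accuracy(y_true, y_pred))  # balanced accuracy
```

On these toy labels, plain accuracy is 0.8 while balanced accuracy is 0.6875, because only half of the minority class is detected.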
Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn
Abstract
We study the performance of various predictive models, including decision trees, random forests, neural networks, and linear discriminant analysis, on an imbalanced data set of home loan applications. In the process, we propose an undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to be competitive with popular resampling techniques such as random oversampling, undersampling, synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relation between the true positive rate, the true negative rate, and the imbalance of the data.
Jochen Hartmann and Oded Netzer
Abstract
The increasing importance and proliferation of text data provide a unique opportunity and novel lens to study human communication across a myriad of business and marketing applications. For example, consumers compare and review products online, individuals interact with their voice assistants to search, shop, and express their needs, investors seek to extract signals from firms' press releases to improve their investment decisions, and firms analyze sales call transcripts to increase customer satisfaction and conversions. However, extracting meaningful information from unstructured text data is a nontrivial task. In this chapter, we review established natural language processing (NLP) methods for traditional tasks (e.g., LDA for topic modeling and lexicons for sentiment analysis and writing style extraction) and provide an outlook into the future of NLP in marketing, covering recent embedding-based approaches, pretrained language models, and transfer learning for novel tasks such as automated text generation and multi-modal representation learning. These emerging approaches allow the field to improve its ability to perform certain tasks that we have been using for more than a decade (e.g., text classification). But more importantly, they unlock entirely new types of tasks that bring about novel research opportunities (e.g., text summarization, and generative question answering). We conclude with a roadmap and research agenda for promising NLP applications in marketing and provide supplementary code examples to help interested scholars to explore opportunities related to NLP in marketing.
The purpose of the chapter is to develop methodological recommendations and to determine criteria for measuring the “conflict character” of a socio-economic system.
Abstract
Purpose
The purpose of the chapter is to develop methodological recommendations and to determine criteria for measuring the “conflict character” of a socio-economic system.
Methodology
Because conflicts in socio-economic systems are highly diverse, the authors compile methodological recommendations common to all economic conflicts. The criteria for measuring the “conflict character” of a socio-economic system, however, are developed using crisis as an example of conflict, because statistical and information bases are available for it and it can be evaluated precisely, objectively, and reliably. The methodological tools of this work are based on the method of systematization, the method of classification, the method of comparative analysis, and the method of formalization of the authors’ conclusions and recommendations.
Conclusions
Methodological recommendations and criteria for measuring the “conflict character” of a socio-economic system are offered; they allow “conflict” systems to be classified. According to the value of the index of conflict character, socio-economic systems are distinguished as having a reduced, moderate, high, or very high conflict level. They differ in their capability to withstand internal crises, their reaction to external crises, and the model of development of economic conflicts. Using the developed methodological recommendations and the offered criteria, the “conflict” level of the socio-economic systems of developed countries (the USA and Germany) and developing countries (China and Russia) is measured for the period of the 2008 global economic crisis.
Originality/value
Based on the offered classification, it is possible to forecast the development and management of conflicts in various “conflict” socio-economic systems.
Kin Fun Li, Yali Wang and Wei Yu
Abstract
Purpose — To develop methodologies to evaluate search engines according to an individual's preference in an easy and reliable manner, and to formulate user-oriented metrics to compare freshness and duplication in search results.
Design/methodology/approach — A personalised evaluation model for comparing search engines is designed as a hierarchy of weighted parameters. These commonly found search engine features and performance measures are given quantitative and qualitative ratings by an individual user. Furthermore, three performance measurement metrics are formulated and presented as histograms for visual inspection. A methodology is introduced to quantitatively compare and recognise the different histogram patterns within the context of search engine performance.
Findings — Precision and recall are the fundamental measures used in many search engine evaluations due to their simplicity, fairness and reliability. Most recent evaluation models are user oriented and focus on relevance issues. Identifiable statistical patterns are found in performance measures of search engines.
Research limitations/implications — The specific parameters used in the evaluation model could be further refined. A larger scale user study would confirm the validity and usefulness of the model. The three performance measures presented give a reasonably informative overview of the characteristics of a search engine. However, additional performance parameters and their resulting statistical patterns would make the methodology more valuable to the users.
Practical implications — The easy-to-use personalised search engine evaluation model can be tailored to an individual's preference and needs simply by changing the weights and modifying the features considered. A user is able to get an idea of the characteristics of a search engine quickly using the quantitative measure of histogram patterns that represent the search performance metrics introduced.
Originality/value — The presented work is considered original as one of the first search engine evaluation models that can be personalised. This enables a Web searcher to choose an appropriate search engine for his/her needs and hence to find the right information in the shortest time with the least effort.
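Precision and recall, named in the findings above as the fundamental measures in many search engine evaluations, can be sketched in a few lines of Python. This is a minimal illustration under the standard set-based definitions; the function name `precision_recall` and the document identifiers are hypothetical and do not come from the paper, whose own user-oriented metrics are not reproduced here.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query.

    precision = relevant results retrieved / all results retrieved
    recall    = relevant results retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up example: the engine returns four documents, two of which
# are among the three documents judged relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here precision is 2 of 4 retrieved documents and recall is 2 of 3 relevant documents.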
Qiongwei Ye and Baojun Ma
Abstract
Internet + and Electronic Business in China is a comprehensive resource that provides insight and analysis into E-commerce in China and how it has revolutionized and continues to revolutionize business and society. Split into four distinct sections, the book first lays out the theoretical foundations and fundamental concepts of E-Business before moving on to look at internet+ innovation models and their applications in different industries such as agriculture, finance and commerce. The book then provides a comprehensive analysis of E-business platforms and their applications in China before finishing with four comprehensive case studies of major E-business projects, providing readers with successful examples of implementing E-Business entrepreneurship projects.
Internet + and Electronic Business in China is a comprehensive resource that provides insights and analysis into how E-commerce has revolutionized and continues to revolutionize business and society in China.
Martin A. Sims and Nicholas O’Regan
Abstract
Technology is defined by Krajewski and Ritzman (2000, p. 17) as ‘the know-how, physical things, and procedures used to produce products and services’. Over the past two decades, the development of high-technology-based firms has been actively encouraged by governments and development agencies (Westhead & Storey, 1994) as a source of competitive advantage. In many cases, small high-technology-based firms have effectively exploited market opportunities. This has been helped by the emergence of generic technologies, most notably information technology that is knowledge intensive rather than capital and labour intensive (Rothwell, 1994, p. 12). Such technologies have been effectively used to open up new market niches for small- and medium-sized firms (SMEs). Accordingly, high-technology firms have become well established as sources of both competitiveness and employment creation (Oakey, 1991).
Kendra P. DeLoach, Melissa Dvorsky, Elaine Miller and Michael Paget
Abstract
Students with emotional and behavioral challenges are significantly impacted by mental health issues. Teachers and other school staff need mental health knowledge to work more effectively with these students. Collaboration with mental health professionals and sharing of information are essential.