Analysis of Questionnaire Data with R

Massimo Borelli (University of Trieste, Trieste, Italy)

Journal of Workplace Learning

ISSN: 1366-5626

Article publication date: 3 August 2012

Citation

Borelli, M. (2012), "Analysis of Questionnaire Data with R", Journal of Workplace Learning, Vol. 24 No. 6, pp. 439-440. https://doi.org/10.1108/13665621211250342

Publisher

Emerald Group Publishing Limited

Copyright © 2012, Emerald Group Publishing Limited


Multivariable Modeling and Multivariate Analysis for the Behavioral Sciences

Chapman and Hall/CRC

London

2009

320 pp.

9781439807699

There are some books that, once you pick them up and start reading, you regret not having found sooner: much effort would have been saved. This is the case, at least in our opinion, with Analysis of Questionnaire Data with R by Bruno Falissard (2011, Chapman and Hall/CRC – 280 pages). The author provides a nimble but comprehensive reference to the classical statistical techniques that practitioners and researchers have to face when studying workplace learning (and not only workplace learning) by means of questionnaire surveys. As its title suggests, the book is based entirely on the powerful open‐source R language, which can be freely downloaded from the CRAN repository together with a huge number of up‐to‐date contributed packages written to handle almost every statistical need. The strength of the volume lies in its structure, as each section is split into two parts: in the “In a few words” part, the theoretical framework is sketched and appropriate literature references are provided; in the “In practice” part, the R commands and outputs are clearly explained to the reader by means of a simple but effective enumeration trick.

After an introductory chapter, the book offers a rendezvous with descriptive statistics and the more common graphs used to depict data. Measures of association and risk are discussed in the third chapter, with a nice discussion of the spherical representation of a correlation matrix, available in the psy package maintained by the author. The fourth chapter deals with the classical inferential topics of confidence intervals and statistical hypothesis testing, and also discusses how sample size determination differs between the survey perspective and the inferential perspective. Brief mentions of bootstrapping and of hierarchical and mixed models close the chapter devoted to assessing statistical modelling. The seventh chapter deals in detail with the principles for validating a composite score. In particular, the multitrait‐multimethod approach is discussed, factor analysis is explained, and the classical psychometric treatment of measurement error (internal consistency via Cronbach's alpha; inter‐rater reliability) is well covered. Lastly, the book offers a primer on structural equation modelling with some worked examples, a brief introduction to R fundamentals for novices and an appendix of R commands.

Who is the book intended for? The volume appears to be tailored to people who already have some expertise in statistical analysis and who, for instance, want to migrate to the open‐source R, abandoning commercial packages. Alternatively, it can be used as an aide‐mémoire, kept on the bookshelf near your computer and consulted when designing or analysing your research. In other words, the book might not satisfy self‐learners who are looking for an introductory statistics textbook for the social sciences and who, at the same time, wish to approach the R language. In that case, Multivariable Modeling and Multivariate Analysis for the Behavioral Sciences by the authoritative Brian S. Everitt (2009, Chapman and Hall/CRC – 320 pages) may be the optimal choice. Drawing on a considerable number of real datasets, Professor Everitt leads behavioral scientists to discover all the relevant statistical tools needed to model data. Comprehension is eased and practicality is favoured, as the mathematical details are confined to sections separated from the main text; the use of R software is encouraged, and hints and solutions are provided.

After an opening chapter devoted to basic questions about the types and methodology of studies and measurements that a researcher has to face, the volume goes on to present a number of graphs suitable for depicting and exploring data in very different situations. Linear, locally weighted and multiple linear regression form the core chapters of the book, and they are written in a clear style. Addressing the topic of multivariable modelling, and the equivalence of ANOVA and multiple linear regression, Everitt persuades the reader of the importance of choosing the “right” model to grasp the results of an experiment. Attention is also given to simplifying a model in order to find the minimal adequate one, and to the visual diagnosis of model goodness. An introduction to the “glims”, the generalized linear models, is offered, with a deeper look at binary outcomes (logistic regression). Time‐to‐event analysis and random effects for modelling correlated data in longitudinal studies are also considered, in the chapters devoted to survival analysis and to linear mixed models. The classical themes of principal components analysis and factor analysis are discussed in a wealth of detail. The question of unsupervised classification is also discussed, in the cluster analysis chapter. The volume ends by presenting the classical MANOVA techniques in a chapter named grouped multivariate data; an appendix also provides some solutions, written in R code, to the proposed exercises.

About the reviewer

Massimo Borelli (born April 26, 1965) graduated in Mathematical Physics and earned his PhD in Neuroscience and Cognitive Sciences. A lecturer in Biostatistics and Social Statistics, he mainly researches soft computing techniques applied to the neural control of mechanical ventilation.
