# Bayesian factor analysis for mixed data on management studies

## Abstract

### Purpose

Factor analysis is the most used tool in organizational research and its widespread use in scale validations contributes to decision-making in management. However, standard factor analysis is not always applied correctly, mainly due to the misuse of ordinal data as interval data and the inadequacy of the former for classical factor analysis. The purpose of this paper is to present and apply the Bayesian factor analysis for mixed data (BFAMD) in the context of empirical research in management, using the Bayesian paradigm for the construction of scales.

### Design/methodology/approach

Ignoring the categorical nature of some variables often used in management studies, such as the popular Likert scale, may result in a model with false accuracy and possibly biased estimates. To address this issue, Quinn (2004) proposed a Bayesian factor analysis model for mixed data, which is capable of modeling ordinal (qualitative measure) and continuous data (quantitative measure) jointly and allows the inclusion of qualitative information through prior distributions for the model's parameters. This model, adopted here, presents considerable advantages and allows the estimation of the posterior distribution of the latent variables, making the process of inference easier.

### Findings

The results show that BFAMD is an effective approach for scale validation in management studies, making both exploratory and confirmatory analyses possible for the estimated factors and also allowing analysts to insert *a priori* information regardless of the sample size. Factors can be explored and confirmed either by using the credible intervals for factor loadings or by conducting specific hypothesis tests. The flexibility of the Bayesian approach presented is counterbalanced by the fact that the main estimates used in factor analysis, such as uniquenesses and communalities, commonly lose their usual interpretation due to the choice of prior distributions.

### Originality/value

Considering that the development of scales through factor analysis aims to contribute to appropriate decision-making in management, and given the increasing misuse of ordinal scales as interval scales in organizational studies, this proposal seems to be effective for mixed data analyses. The findings presented here are not intended to be conclusive or limiting but offer a useful starting point from which further theoretical and empirical research on Bayesian factor analysis can be built.

#### Citation

Albuquerque, P., Demo, G., Alfinito, S. and Rozzett, K. (2019), "Bayesian factor analysis for mixed data on management studies", *RAUSP Management Journal*, Vol. 54 No. 4, pp. 430-445. https://doi.org/10.1108/RAUSP-05-2019-0108

### Publisher

Emerald Publishing Limited

Copyright © 2019, Pedro Albuquerque, Gisela Demo, Solange Alfinito and Kesia Rozzett.

#### License

Published in *RAUSP Management Journal*. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode.

## Introduction

Factor analysis was initially developed by the psychologist Charles Spearman (Spearman, 1904), who proposed the hypothesis that the wide variety of psychological measures, such as mathematical, verbal and logical reasoning skills, among others, could be explained by an underlying factor of general intelligence named “g”.

Spearman (1904) developed what is known today as factor analysis, which has been widely used mainly to analyze the patterns of interrelationship between variables, to reduce the dimensionality of data, and to support the creation of scales (Rummel, 1988). A century later, this methodology is still widely used in various areas such as Management, Political Science, Economics, and Psychology.

However, traditional factor analysis is not always applied correctly due to the misuse of ordinal data as if they were interval data and the inadequacy of the former for classical factor analysis (Jöreskog & Moustaki, 2001). There are also other problems that compromise the use of this technique, such as the instability of parameters for small samples (Arrindell & Van der Ende, 1985; MacCallum, Widaman, Zhang, & Hong, 1999), the segregation between exploratory and confirmatory factor analyses (Hurley et al., 1997; Suhr, 2006; Thompson, 2004), the impossibility of inserting prior information on both qualitative and quantitative estimation of the parameters, and also the difficulty of using mixed data such as ordinal, interval and ratio variables (Clinton & Lewis, 2008; Quinn, 2004).

In spite of those questions, factor analysis is undoubtedly the most used tool in organizational research. It is disseminated in different fields, such as self-reports appraisal (Podsakoff & Organ, 1986), human resource management (Allen, Shore, & Griffeth, 2003; Aquino, 2000; Bradfield & Aquino, 1999; Hui & Lee, 2000; Lubatkin, Simsek, Ling, & Veiga, 2006; Schuler & Jackson, 1989; Stevens & Campion, 1999), work and family conflicts (Carlson & Perrewé, 1999), managerial communication (Gopinath & Becker, 2000), entrepreneurship (Zahra, Neubaum, & Huse, 2000), psychological climate (Tsai, 2001), performance (Kidder, 2002), leadership (Elenkov & Manev, 2005), and so forth.

The widespread use of the technique confirms the results of Hinkin (1995), who reports that factor analysis is most commonly used for data reduction and construction of scales. Approximately 71 per cent of the studies investigated by the author reported the use of factor analysis for such purpose. As expected, the results presented by Conway and Huffcutt (2003) confirm this scenario, since Comrey (1978) had already observed exponential growth in the use of factor analysis in scale validation.

The development of scales through factor analysis aims to contribute to appropriate decision-making in management. For instance, Dakduk et al. (2017) and Merkle and Wang (2018) argue for the importance of the Bayesian approach in the Customer Behavior field, especially when the priors are known through specialists or another accurate source of information.

Thus, the objective of the present article is to present and apply the Bayesian factor analysis for mixed data (BFAMD) in the context of empirical research in management by using the Bayesian paradigm for the construction of scales. The logic of the Bayesian approach is useful for cases in which the data are mixed, i.e. a combination of interval, ordinal or ratio variables, and also in the presence of prior information such as past studies (e.g. meta-analysis) or information gathered from the experience of specialists (Zyphur & Oswald, 2015). The Bayesian approach also facilitates the estimation of parameters, which may be complicated or even impossible using the classical frequentist approach. Moreover, the BFAMD method also makes exploratory and confirmatory analyses possible for the estimated factors (Lohrke, Carson, & Lockamy, 2018).

Finally, as presented by Van de Schoot et al. (2017), the Bayesian paradigm is a promising approach that overcomes the problems of the standard methods in empirical and experimental fields in psychology, including scale validation in management studies. In a 25-year review, the authors found that the use of Bayes has increased and broadened in the sense that this methodology can be used flexibly to tackle many different forms of questions, thus becoming the most promising method to measure latent variables in Applied Social Science fields.

This study is organized as follows. The theoretical background encompasses the motivation for the use of mixed data in empirical research in management, the existing discussion on the use of ordinal scales in the estimation of factor analysis and how the BFAMD method may be used as a proposed approach. In the Methods section, we introduce the BFAMD and describe an empirical application in the management field. Finally, the results are presented and discussed, pointing out the research limitations and practical implications as well as highlighting directions for future research.

## Theoretical background

Because the corporate environment is multidisciplinary and subject to several different inputs, the construction of scales and latent variables that assist the decision-making process in management tends to be done through both qualitative and quantitative variables. The mixed use implies a different type of treatment for each measurement according to its own constraints and properties.

Stevens (1946, p. 677) defines measurement as “the assignment of numerals to objects or events according to rules”. There are four scales of measurement that are quite different and cannot be used interchangeably even though they may be represented by numerals. The scales are divided into metric (quantitative) and non-metric (qualitative) scales. The quantitative measures are interval variables, with discrete levels in which the interval between each category is equal and well defined (e.g. number of people, number of computers); and ratio variables, with no data restriction and allowance of any value, even fractions (e.g. income and height). The qualitative measures are: nominal variables, in which designated numerals simply represent categories without implying amounts of an attribute or characteristic and the categorization of data does not imply any hierarchy (e.g. gender); and ordinal variables, whose concept is broader and will be presented next.

### Ordinal scales in management studies

The use of ordinal scales in research questionnaires is broad, but its misuse is one of the most persistent and controversial issues in applied social research, according to Vigderhous (1977). The Likert scale (Likert, 1932), for instance, is categorical by construction but is commonly, and inadvertently, treated as a quantitative variable. In this sense, Vigderhous (1977) argues that treating ordinal data as interval without examining the values of the data set and the objective of the analysis may mislead and misrepresent the findings of a study.

Similarly, Jamieson (2004) argues that as Likert scales lie on the ordinal level of measurement, the intervals between the values cannot be assumed to be equal. The author also emphasizes that the common practice of treating the response formats in a Likert-type scale as an interval level of measurement requires attention because descriptive and inferential statistics are different for ordinal and interval variables. Thus, if an inappropriate statistical technique is used, the researcher increases the chance of drawing the wrong conclusions from their research.

Malhotra (1999), for instance, states that Likert scales are one of the most used in the literature of Marketing and researchers have assumed this kind of scale as a quantitative interval variable. In the same vein, Martilla and Davis (1975) consider the treatment of ordinal data as interval as a major “sin” in Marketing Research.

Likewise, Göb, McCollin and Ramalhoto (2007) report that the problem of measuring attitudes, in general, suggests an interpretation of ordinal Likert scales, although appropriate analytical methods are not easily found in textbooks or statistical packages as methods for interval data are. Since the numbers in a Likert scale represent only categories and there is no certainty about the equality of intervals between one category and another, ordinal data cannot be treated as interval (Jöreskog & Moustaki, 2001).

In this context, Mittal and Kamakura (2001) suggest that the assumption of equal intervals for ordinal scales should be questioned both logically and empirically. As for the logic issue, there is no guarantee that respondents would be able to judge equal units or even if units assessed by an interviewee will coincide with those of another. Researchers must be careful then because certain statistical methods like Student’s *t*-tests are affected dramatically when the assumption of equal intervals (linearity) is violated.
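The concern about unequal intervals can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: two latent attitudes are drawn from a bivariate normal distribution with correlation 0.7, and both are coded into a 5-point item through hypothetical, unequally spaced cut points (the function name `likert_codes`, the cut points and the sample size are all choices made here, not values from the studies cited). Treating the integer codes as interval data then typically yields a Pearson correlation below the latent one.

```python
import numpy as np

def likert_codes(latent, cuts):
    """Map latent scores to categories 1..len(cuts)+1 using ordered cut points."""
    return np.searchsorted(cuts, latent, side="left") + 1

rng = np.random.default_rng(0)
n, rho = 50_000, 0.7
cov = np.array([[1.0, rho], [rho, 1.0]])
latent = rng.multivariate_normal(np.zeros(2), cov, size=n)

# Hypothetical, unequally spaced cut points for a 5-point item.
cuts = np.array([-0.2, 0.3, 1.0, 1.8])
items = np.column_stack([likert_codes(latent[:, j], cuts) for j in range(2)])

# Pearson correlation of the latent scores versus the integer codes.
r_latent = np.corrcoef(latent, rowvar=False)[0, 1]
r_codes = np.corrcoef(items, rowvar=False)[0, 1]
print(f"correlation of latent scores: {r_latent:.3f}")
print(f"correlation of Likert codes:  {r_codes:.3f}")
```

The size of the gap depends on the number of categories and on where the cut points fall, which is one reason why the empirical findings discussed in this section are not unanimous.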

Notwithstanding, Carifio and Perla (2007) argue that Likert-scale data and interval data can produce similar results. For example, research conducted by Carifio (1976, 1978) showed that the use of a 100 mm line with 2 to 7 anchor points as a response format to statements of attitudes produced empirically linear and interval data in scale, sub-scale and full level range. Such data indeed correlated with the answers given to the same questions using a Likert-type scale response format from 5 to 7 points.

Accordingly, Holgado-Tello, Chacón-Moscoso, Barbero-García and Vila-Abad (2010) state that numbers should be treated as categories, since they do not show metric properties. However, under certain conditions (e.g. large samples), it seems possible to use exploratory factor analysis for Likert-type scales and to obtain results similar to those of ordinal factor analysis.

Indeed, as stated by Carifio and Perla (2007), the basic problem with the misinterpretation of Likert scales is the belief that an anchoring label such as “I agree” is twice, or one unit more than, “partially agree”, and so forth. This kind of interpretation is usually due to inadequate knowledge or to errors of logic and interpretation. Therefore, an ordinal scale is neither an equal-units scale nor a quantitative metric, and consequently such reasoning tends to be inadequate.

Considering that most studies on scale validation in the management field use exploratory factor analysis regardless of whether the variables are categorical or continuous, a discussion concerning the use of different scales in factor analyses is needed and remains a gap in the literature. This is especially true for management studies, considering that the few studies found in the scientific literature come from Statistics, Psychology and Health Sciences.

The use of exploratory factor analysis itself has also been questioned. Norris and Lecavalier (2010) argue that exploratory factor analysis is a widely used but poorly understood statistical procedure and discuss its methodological variations. They conclude that published recommendations and guidelines, such as the use of exploratory factor analysis instead of principal component analysis; the use of a minimum of 200 participants or a subject-to-item ratio of at least 5:1; the use of oblique rotations; and, especially, the use of polychoric correlations for categorical data (Fabrigar et al., 1999; Floyd & Widaman, 1995; Ford, MacCallum, & Tait, 1986; Gorsuch, 1974; Lee & Comrey, 1992), are largely ignored.

We have found a lack of studies in the literature with the objective to compare exploratory and ordinal factor analysis in scale validations. Demo, Batelli, and Albuquerque (2015) developed and validated a customer relationship management scale (CRMS) for the video game’s industry. They first ran an exploratory analysis and then performed an ordinal analysis with the same criteria. The scale validated through ordinal analysis was found to outperform the one validated through exploratory analysis regarding the validity or the quality of the items. Nevertheless, those variations were considered small since the factor loadings varied slightly, according to the authors. Furthermore, the authors noticed that the total variance explained suffered a significant reduction in the ordinal analysis and Cronbach's alphas for reliability remained unchanged.

The improvement of quality and validity of items in ordinal factor analysis is probably due to the ordinal Likert scale format. Hence, it follows that ordinal analysis is more appropriate, but it does not invalidate the results obtained through exploratory factor analysis since the sample was fairly large (493 subjects). Although the results may suggest that for large samples both exploratory and ordinal factor analyses might present similar results, it is important to keep in mind that ordinal analysis is always preferable, as stated by Armstrong (1981), Jamieson (2004) and Kuzon et al. (1996).

The use of BFAMD turns out to be an important alternative tool to deal with such concerns, taking into consideration that it allows the integration of categorical/ordinal scales and continuous variables in the same model. Thus, such an approach would reduce the issue of the suitability of scales (e.g. Likert scales) for the development of models that support managerial decisions in factor analysis.

### Bayesian paradigm

Unlike the frequentist approach, the Bayesian paradigm does not consider parameters as fixed quantities that should be discovered, i.e. promptly estimated. In fact, instead of being fixed, parameters are random variables, which allows them some variability for each unit of the population (Berger, 1985).

Specifically in the management area, assigning a single parameter for all elements of the population may sound unrealistic and sometimes incorrect. Assuming, for example, that the effect of education on income is the same for all units of the population, or considering that the impact of a particular management policy is the same for all stakeholders is naive and does not have theoretical support since individuals are naturally heterogeneous and respond to the business environment diversely.

After a review of the literature that included more than 10,000 articles published in 15 journals from January 2001 to December 2010, Kruschke, Aguinis and Joo (2012) indicate that Bayesian approaches are virtually absent from the organizational sciences. Their results point to a gap in the literature and may be a call for more research using this tool in organizational studies to strengthen the field.

Among the advantages of the Bayesian approach over the classical frequentist approach, Kruschke, Aguinis and Joo (2012) mention the use of prior information; the estimation of the joint distribution of the model parameters in a global way; the permissibility of accepting the null hypothesis; the ease of running complex tests; the possibility of using small samples; and the possibility of multiple comparisons and more general power analysis.

In this sense, the Bayesian approach considers the parameters of the models as random variables, because each element of the population may have an effect associated with it differently. For example, a policy that encourages reading within an organization to increase productivity may, on average, have a positive effect. However, there will be individuals who will receive the stimulus either in a negative or null way even though the large majority will probably receive the encouragement positively.

One may calculate the probability of a negative effect, and also of an effect that is lower or higher than expected. Moreover, the statistical significance required by the classical frequentist approach is not reasonable in this setting: since the parameters are usually continuous random variables, any single point value has probability zero, so it makes little sense to test a point null hypothesis for them (Gelman, Carlin, Stern, & Rubin, 2003).
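Once draws from a posterior distribution are available, such probabilities reduce to simple proportions. The following minimal sketch assumes hypothetical posterior draws for the effect of the reading policy (a normal distribution with mean 0.3 and standard deviation 0.2, chosen here purely for illustration) and estimates the probability of a negative effect and of an effect above a threshold of 0.5.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical posterior draws of a policy effect (mean 0.3, sd 0.2).
effect_draws = rng.normal(loc=0.3, scale=0.2, size=100_000)

# Posterior probabilities are estimated as proportions of draws.
p_negative = np.mean(effect_draws < 0.0)
p_above_half = np.mean(effect_draws > 0.5)
print(f"P(effect < 0 | data)   ~ {p_negative:.3f}")
print(f"P(effect > 0.5 | data) ~ {p_above_half:.3f}")
```

In an applied analysis the draws would come from the sampler used to fit the model rather than from a random number generator.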

Graphically, the information constructed by the Bayesian approach may be represented in Figure 1.

The ultimate source of all model information is the posterior distribution, which is composed of two sources of information:

- Prior distribution: the prior distribution encodes the information available about the parameters before the data are observed, and it is the starting point from which the estimates for the parameters of interest are generated. Usually, the information is obtained either through interviews with experts or by consulting previous work on the subject. It is possible that no information about the parameter of interest is available. In this case, one may work with a non-informative or an improper prior distribution (Berger, 1985; Samaniego, 2010; Zyphur & Oswald, 2015). At this stage, qualitative information about the phenomenon of interest is quantified, allowing the use of meta-analysis in the construction of prior information for the parameters.
- Data information: at this stage, data on the phenomenon of interest are collected and then the likelihood function is constructed. This step is similar to the estimation step of the frequentist approach, in which a log-likelihood function is maximized to obtain estimates of the parameters of interest.

These two steps are joined to make the posterior distribution, which provides the probability distribution for the parameters given prior information and the likelihood function obtained by the collected data. Through the posterior distribution, the inference for the parameters is performed along with the goodness-of-fit tests and other hypothesis tests of interest to the analyst.

The weight of each component on the posterior distribution is given by the number of observations collected for the likelihood function (Data Information step) and by the accuracy of prior information. This relationship is represented in Figure 2.
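The balance between the two sources can be made concrete with the simplest conjugate case. The sketch below assumes a normal mean with known data precision and a normal prior (a textbook setting chosen here for illustration, not the BFAMD model itself); the function name `normal_posterior` and all numerical values are hypothetical. The posterior mean is a weighted average of the prior mean and the sample mean, and the weight on the data grows with the number of observations.

```python
def normal_posterior(prior_mean, prior_precision, sample_mean, n, data_precision=1.0):
    """Posterior mean and precision for a normal mean with known data precision."""
    post_precision = prior_precision + n * data_precision
    weight_data = n * data_precision / post_precision
    post_mean = weight_data * sample_mean + (1.0 - weight_data) * prior_mean
    return post_mean, post_precision, weight_data

# A fairly precise prior centered at 0 against data whose sample mean is 1.
for n in (5, 50, 500):
    mean, prec, w = normal_posterior(prior_mean=0.0, prior_precision=10.0,
                                     sample_mean=1.0, n=n)
    print(f"n={n:>3}: posterior mean={mean:.3f}, weight on data={w:.3f}")
```

With few observations the precise prior dominates, while with many observations the posterior mean approaches the sample mean.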

In contrast to the frequentist approach, Bayesian inference does not force an artificial dichotomy between null and alternative hypotheses because it allows the construction of general hypotheses for the parameters accurately and does not require large sample sizes. In fact, in the presence of accurate prior information, it is possible to work with small sample sizes (Ansari & Jedidi, 2000; Dunson, 2000; Howson & Urbach, 2006; Scheines, Hoijtink, & Boomsma, 1999), which is common in managerial studies.

In this sense, BFAMD presents an interesting appeal to be used in management research as it allows the use of mixed data, i.e. ordinal, interval and ratio data. It also allows analyses with a small sample size as well as the inclusion of qualitative information through prior elicitation generalizing both exploratory and confirmatory approaches of factor analysis.

## Method

In this section, we introduce the BFAMD and describe an empirical application in the management field.

### Bayesian factor analysis for mixed data

Ignoring the categorical nature of some variables often used in management studies, such as the popular Likert scale, may result in a model with false accuracy and possibly biased estimates. To address this issue, Quinn (2004) proposed a Bayesian Factor Analysis Model for Mixed Data, which is capable of modeling ordinal (qualitative measure) and continuous data (quantitative measure) jointly. It also allows the inclusion of qualitative information through prior distributions for the model's parameters, as previously discussed.

The model proposed by Quinn (2004) presents the advantages already listed and allows the estimation of the posterior distribution of the latent variables, making the process of inference easier. Let $X_{N \times J}$ denote the data matrix, in which each row ($i = 1, \ldots, N$) represents a sampled observation and each column ($j = 1, \ldots, J$) represents an observed variable that may be either ordinal or continuous. In case the $j$-th variable is ordinal, it will present $C_j$ categories.

The objective of the model is to estimate a latent matrix $X^{*}_{N \times J}$. Each element of the observed matrix may be decomposed in the following way:

$$
x_{ij} =
\begin{cases}
x^{*}_{ij}, & \text{if variable } j \text{ is continuous,} \\
c, & \text{if variable } j \text{ is ordinal and } x^{*}_{ij} \in (\gamma_{j(c-1)}, \gamma_{jc}],
\end{cases}
\tag{1}
$$

where $c = 1, \ldots, C_j$. Note that when category $c$ of an ordinal variable is observed, it means that the latent variable is contained in an interval bounded by $\gamma_{j(c-1)}$ and $\gamma_{jc}$.
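The threshold rule described above can be sketched in a few lines. The helper below is a minimal illustration, not part of Quinn's implementation: the name `observe`, the convention of passing only the interior cut points (with minus and plus infinity implied at the two ends) and the right-closed intervals are assumptions made here.

```python
import numpy as np

def observe(latent, cuts=None):
    """Return observed values for one variable given its latent values.

    cuts=None  -> the variable is continuous and is observed directly.
    cuts=array -> ordered interior cut points; category c is returned when
                  the latent value lies in (cut below, cut above], with
                  -inf and +inf implied at the two ends.
    """
    latent = np.asarray(latent, dtype=float)
    if cuts is None:
        return latent
    return np.searchsorted(np.asarray(cuts, dtype=float), latent, side="left") + 1

latent = np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0])
codes = observe(latent, cuts=[-1.0, 0.0, 1.0])   # a 4-category ordinal item
print(codes)
```

Three interior cut points produce four categories, and a latent value that falls exactly on a cut point is assigned to the lower category because the intervals are closed on the right.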

The association between the latent and observed variables is constructed through the traditional factor model, that is:

$$
\boldsymbol{x}^{*}_{i} = \boldsymbol{\Lambda} \boldsymbol{\phi}_{i} + \boldsymbol{\varepsilon}_{i}, \qquad \boldsymbol{\varepsilon}_{i} \sim N_{J}(\boldsymbol{0}, \boldsymbol{\Psi}).
\tag{2}
$$

In the model, $\boldsymbol{x}^{*}_{i}$ is a $J$-dimensional vector representing the values of the latent variables for the $i$-th observation, $\boldsymbol{\Lambda}_{J \times K}$ is the matrix of factor loadings for the $K$ estimated factors, $\boldsymbol{\phi}_{i}$ is a $K$-dimensional vector representing the scores for the $K$ estimated factors, and $\boldsymbol{\varepsilon}_{i}$ is the error vector, assumed to have a multivariate normal distribution with $J$-dimensional zero mean vector and diagonal covariance matrix $\boldsymbol{\Psi}_{J \times J}$.
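To make the factor model concrete, the sketch below simulates latent data from it and compares the sample covariance with the covariance implied by the loadings and the error variances. The loadings, the error variances, the single-factor structure and the assumption that factor scores are standard normal are hypothetical choices made here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
N, J, K = 100_000, 4, 1

# Hypothetical loadings and error variances, chosen only for illustration.
Lambda = np.array([[0.9], [0.8], [0.6], [0.4]])      # J x K factor loadings
Psi = np.diag([0.19, 0.36, 0.64, 0.84])              # J x J diagonal error covariance

Phi = rng.standard_normal((N, K))                    # factor scores, assumed N(0, I)
eps = rng.standard_normal((N, J)) * np.sqrt(np.diag(Psi))
X_star = Phi @ Lambda.T + eps                        # row i is Lambda @ phi_i + eps_i

# With standard normal scores the implied covariance is Lambda Lambda' + Psi.
implied_cov = Lambda @ Lambda.T + Psi
sample_cov = np.cov(X_star, rowvar=False)
max_gap = np.max(np.abs(sample_cov - implied_cov))
print(f"largest gap between sample and implied covariance: {max_gap:.3f}")
```

The gap shrinks as the number of simulated observations grows, which is the sense in which the loadings and error variances jointly reproduce the covariance structure of the latent variables.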

The posterior distribution, according to Quinn (2004) is given by:

Since there is usually no initial information on $\boldsymbol{\gamma} = (\gamma_{j1}, \ldots, \gamma_{jC_j})$, $P(\boldsymbol{\gamma})$ is assumed to follow an improper uniform distribution. Thus, we have:

In this equation, $\boldsymbol{\Phi}_{N \times K}$ is the matrix of scores for the $K$ factors and $N$ observations, and $1(u)$ is an indicator function that equals 1 when $u$ is true and 0 when $u$ is false.
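The definitions given so far indicate the general shape of such a posterior. As a hedged sketch only, assuming independent priors for the loadings, scores and error variances (the factorization, the row vector $\boldsymbol{\lambda}_j$ of $\boldsymbol{\Lambda}$, the diagonal element $\psi_{jj}$ of $\boldsymbol{\Psi}$ and the helper $g_j$ are notation introduced here for illustration, not a transcription of Quinn's expression), a posterior of this kind may be written as:

$$
p(\boldsymbol{X}^{*}, \boldsymbol{\Lambda}, \boldsymbol{\Phi}, \boldsymbol{\Psi}, \boldsymbol{\gamma} \mid \boldsymbol{X}) \propto
\left[ \prod_{i=1}^{N} \prod_{j=1}^{J} N\!\left(x^{*}_{ij} \mid \boldsymbol{\lambda}_{j}^{\top} \boldsymbol{\phi}_{i}, \psi_{jj}\right) g_{j}(x_{ij}, x^{*}_{ij}) \right]
p(\boldsymbol{\Lambda})\, p(\boldsymbol{\Phi})\, p(\boldsymbol{\Psi}),
$$

$$
g_{j}(x_{ij}, x^{*}_{ij}) =
\begin{cases}
1(x_{ij} = x^{*}_{ij}), & \text{if variable } j \text{ is continuous,} \\
\sum_{c=1}^{C_j} 1(x_{ij} = c)\, 1\!\left(\gamma_{j(c-1)} < x^{*}_{ij} \le \gamma_{jc}\right), & \text{if variable } j \text{ is ordinal.}
\end{cases}
$$

Under an improper uniform prior, $P(\boldsymbol{\gamma})$ is constant and is absorbed into the proportionality sign, which is why it does not appear as a separate factor in this sketch.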

The distribution shown in (4) is composed of quantitative information from the data and qualitative information modeled by prior distributions. The posterior distribution is sampled through Markov Chain Monte Carlo (MCMC). Thus, to estimate the latent variables, it is necessary to have a sample (matrix $X_{N \times J}$) and prior information. If the researcher does not have prior information available or is uncertain regarding its accuracy, we suggest the use of a non-informative prior (Berger, 1985).

### An empirical application

Important authors of CRM (Gronroos, 2017; Payne, 2012; Toedt, 2014) agree on the relevance of managing the relationship between organizations and their customers. Thus, the adaptation of the organizational capacity to detect opportunities in the market and the constant effort of companies to establish long-term relationships with their business partners, especially with their customers, have been established as priorities for enterprises (Demo et al., 2018; Scussel & Demo, 2019).

Considering both the strategic relevance of CRM for organizations nowadays and the lack of measuring scales customized for the B2C (Business-to-Consumer) market in general, as well as the importance of validating a scale in different countries for improved generalizability, Demo and Rozzett (2013) validated the CRMS in the USA, based on the previous CRM scale that Rozzett and Demo (2010) developed and validated in Brazil. Afterwards, Demo et al. (2017) validated the CRMS in France to obtain indications of external validity and to proceed with a cross-cultural comparison as well.

Demo and Rozzett (2013) conducted three studies for the development and validation of the CRMS in the USA. For such purpose, three different American samples were collected using the Likert Scale as an ordinal variable.

Data from study 1 (N = 200) were used to select items based on EFA (Exploratory Factor Analysis). Then, CFA (Confirmatory Factor Analysis) was used on data obtained in study 2 (N = 403) to examine factor structure, as well as to provide construct validity through convergent validity. Scale reliability was assessed by Cronbach’s alpha on EFA and Jöreskog’s rho on CFA. Data from study 3 (N = 403) were used to test the scale generalizability.

As to study 1, 65 per cent of the employees were male, 63 per cent were White or Caucasian, 55 per cent were under the age of 26, 49.5 per cent had a Bachelor's degree, 43.5 per cent had been customers of the companies chosen between 1 and 5 years, and 67 per cent affirmed they purchase from the companies chosen on a weekly (33 per cent) or monthly (34 per cent) basis. Regarding study 2, 64 per cent of the employees were male, 55 per cent were White or Caucasian, 45.5 per cent were between 26 and 40 years old, 48 per cent had a Bachelor's degree, 42 per cent had been customers of the companies chosen between 1 and 5 years, and 49 per cent affirmed they purchase from the companies chosen on a monthly basis. Finally, 61 per cent of the employees in study 3 were male, 70 per cent were White or Caucasian, 48 per cent were under the age of 26, 50 per cent had a Bachelor's degree, 41.4 per cent had been customers of the companies chosen between 1 and 5 years, and 41 per cent affirmed they purchase from the companies chosen on a monthly basis.

Based on the Demo and Rozzett (2013) database with 910 subjects, we performed an illustrative empirical application of BFAMD. The model was specified assuming non-informative and uncorrelated priors. Specifically, for factor loadings and factor scores, we assumed a multivariate normal distribution centered at zero with a diagonal covariance matrix and precision equal to 0.001. For the cut points required for ordinal factor analysis, we assumed a uniform improper prior. For the uniquenesses, we assumed a non-informative prior following an inverted gamma distribution with shape and scale parameters equal to 0.001.

The Markov Chain Monte Carlo sampler was run for 1,000,000 iterations, with a burn-in of 10,000 iterations and a thinning interval equal to 100. The 95 per cent credible intervals of the factor loadings for the first three factors are shown in Figure 3.

Figure 3 shows that the only factor contributing to the factor score is the first factor. For factors 2 and 3, the 95 per cent credible intervals of the factor loadings contain zero, which means that a null loading remains among the plausible values for these factors. Since both factors showed the same pattern, they were removed from the analysis.
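The post-processing steps involved here (discarding the burn-in, thinning the chain and checking whether a credible interval contains zero) can be sketched as follows. This is a minimal illustration under stated assumptions: the chains are stand-in independent normal draws rather than output of the BFAMD sampler, the chain length is shortened to keep the example light, and the helper names `keep_draws` and `credible_interval` are hypothetical.

```python
import numpy as np

def keep_draws(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return chain[burn_in::thin]

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval computed from posterior draws."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(lower), float(upper)

rng = np.random.default_rng(3)
# Stand-in chains (independent normals), not output of the BFAMD sampler.
chains = {
    "loading on factor 1": rng.normal(0.80, 0.05, size=110_000),
    "loading on factor 2": rng.normal(0.02, 0.10, size=110_000),
}

for name, chain in chains.items():
    kept = keep_draws(chain, burn_in=10_000, thin=100)
    lo, hi = credible_interval(kept)
    covers_zero = lo <= 0.0 <= hi
    print(f"{name}: {len(kept)} draws kept, "
          f"95% interval = ({lo:.2f}, {hi:.2f}), contains zero: {covers_zero}")
```

An equal-tailed interval is used for simplicity; a highest posterior density interval would be another reasonable choice and could give slightly different bounds for skewed posteriors.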

It is interesting to note that, through the Bayesian approach, factor analysis may be considered both exploratory and confirmatory, since the estimated quantities are random variables, which makes it possible to obtain probabilities for their representativeness in the model.

Table I presents statistics and other tests for the posterior distribution of the constructed model.

It is noticeable that all 20 variables positively affect the factor score. The Heidelberger and Welch (Heidel) diagnostic test uses the Cramér-von Mises statistic to test the null hypothesis that the values sampled by MCMC come from a stationary distribution, a requisite for sound inference from the model. In this case, the test shows that for most of the variables the null hypothesis is not rejected (Heidelberger & Welch, 1981; Plummer et al., 2007).

For all factor loading parameters, except those associated with variables 6 (This company treats its customers with respect.) and 17 (This company has good facilities (either physical, in case of stores, or virtual, in case of websites).), the null hypothesis of stationarity was not rejected, corroborating the suitability of the model built for the posterior distribution of the parameters. The parameter that has the greatest effect on the factor score is the one associated with the second variable (I recommend this company to friends and family.), as can be observed in Figure 3 and Table I.

Indeed, the item concerning the recommendation of a company to friends and family reinforces Payne’s (2006) statement that loyal customers not only buy repeatedly but also go a step further recommending the company to people they care about, like family and friends. Those recommendations reduce future customers’ acquisition costs (Ravald & Grönroos, 1996) and represent a relevant indicator of willingness to develop a long-term relationship.

However, this parameter is the one with the highest variability, suggesting a possible heterogeneity in the perception of respondents regarding this item. A cultural bias might explain this heterogeneity, taking into account that the American population includes immigrants from Latin America, Asia, Africa and other regions, whose different cultural backgrounds may influence their behaviors and consequently their propensity to make recommendations.

In the Bayesian Factor Analysis Method for Mixed Data, the interpretation of uniquenesses and communalities is slightly different, since the model assumes an inverted gamma distribution for the uniquenesses. This allows values above 1 and may therefore produce negative estimates for the communalities (Quinn, 2004).
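The mechanism can be illustrated numerically. The sketch below assumes the usual standardized convention in which the communality equals one minus the uniqueness, and draws uniquenesses from an inverted gamma distribution with hypothetical shape and scale values (these are not the prior settings used in the application); any draw above 1 then corresponds to a negative communality.

```python
import numpy as np

rng = np.random.default_rng(4)
shape, scale = 3.0, 1.5   # hypothetical inverse-gamma parameters, illustration only

# If G ~ Gamma(shape, rate=scale), then 1/G ~ Inverse-Gamma(shape, scale).
# NumPy's gamma takes a scale argument, so the rate is passed as 1/scale.
uniqueness = 1.0 / rng.gamma(shape, 1.0 / scale, size=100_000)
communality = 1.0 - uniqueness   # standardized-variable convention (assumed)

share_negative = np.mean(communality < 0.0)
print(f"share of draws with uniqueness above 1: {share_negative:.3f}")
```

Because the inverted gamma distribution has support on all positive values, some mass above 1 is always present; how much depends on the shape and scale parameters.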

To sum up, concerning uniqueness, the item that carries the largest amount of exclusive (item-specific) information for factor score construction is item 2 (I recommend this company to friends and family). Similarly, as to communalities, the item with the largest amount of information for factor score explanation is item 17 (This company has good facilities, either physical, in case of stores, or virtual, in case of websites). In conclusion, the interpretation remains roughly the same, and the model's ability to capture the population's heterogeneity in the responses was demonstrated.

## Discussion

The Bayesian approach has found widespread use in a variety of scientific fields. However, the organizational sciences have hardly benefited from this approach so far, and few studies have used this inferential paradigm or evaluated its benefits.

Thus, as to academic implications, this study aimed to contribute to the incipient literature on the Bayesian paradigm in management by showing how this paradigm may be used with mixed data in empirical organizational analysis, specifically in scale construction. Given the extensive debate on the use of categorical data in classical factor analysis, this paper proposes a solution based on the Bayesian factor analysis model for mixed data, which handles numeric and ordinal data jointly and allows the analyst to insert prior information regardless of sample size.

In addition, an empirical model using BFAMD was presented, demonstrating the effect of certain constructs in CRM. The constructed model showed good results in the Heidel test for most parameters (Table I). Due to its complexity, the model might be difficult or even impossible to build under the frequentist paradigm, and BFAMD turned out to be an effective approach for scale validation in management studies.

Concerning managerial implications, the BFAMD approach can be used to produce more trustworthy results in scale validations, in the sense that it adequately incorporates the structure of ordinal data as well as prior information. This, in turn, might improve the effectiveness of managers' evaluations of organizational phenomena based on measurement scales, supporting decision-making and problem-solving processes.

The flexibility of the Bayesian approach presented here is counterbalanced by the fact that the main estimates used in factor analysis, such as uniquenesses and communalities, commonly lose their usual interpretation due to the choice of prior distributions. Nevertheless, it is possible to conduct both exploratory and confirmatory analyses of the factor model, either by using the credible intervals for the factor loadings (Figure 3) or by conducting specific hypothesis tests.
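The credible-interval check can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the posterior draws are simulated, and the function names `credible_interval` and `excludes_zero` are hypothetical. It computes an equal-tailed 95% credible interval from posterior draws of a loading and checks whether the interval excludes zero.

```python
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval taken from the empirical quantiles of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0.0 or upper < 0.0


# Simulated draws standing in for MCMC output (assumption: NOT the paper's chains).
rng = random.Random(42)
strong_loading = [rng.gauss(0.80, 0.10) for _ in range(5000)]
weak_loading = [rng.gauss(0.05, 0.10) for _ in range(5000)]

strong_ci = credible_interval(strong_loading)
weak_ci = credible_interval(weak_loading)

print("strong loading:", strong_ci, "excludes zero:", excludes_zero(strong_ci))
print("weak loading:", weak_ci, "excludes zero:", excludes_zero(weak_ci))
```

In this sketch, an interval that excludes zero is read as evidence that the item loads on the factor, while an interval that covers zero leaves the loading undetermined. The 95% level is a conventional choice, not one prescribed by the model.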

As limitations, we highlight the use of noninformative priors and the altered interpretation of the uniquenesses and communalities, which can make these measures harder to interpret, since an inverse gamma distribution is assumed for the uniquenesses.

Since we have not found other empirical research comparing exploratory and ordinal factor analyses so far, we recommend further studies comparing both methods to confirm the theory and the empirical study reviewed (Holgado-Tello et al., 2010; Jamieson, 2004; Kuzon et al., 1996). This would be especially relevant for small samples, which have not been tested yet, in order to check for significant differences. Non-significant results would lead us to conclude that exploratory analysis with small samples is not appropriate for scale validations when categorical scales are used, as argued by authors like Jöreskog and Moustaki (2001). If the results turn out to be significantly different, we would possibly conclude that exploratory factor analysis is not appropriate for the validation of ordinal scales.

It is further suggested that other prior distributions be used to assess the sensitivity of the model to the choice of hyperparameters, depending on the sample size and the variability of the data. In addition, we recommend the use of informative priors derived from interviews with specialists or from meta-analyses, with the results obtained then compared with those of other models.
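As a rough illustration of such a sensitivity check, the Python sketch below uses a deliberately simplified stand-in for the full model: a single variance with an inverse gamma prior and zero-mean normal data, for which the posterior is available in closed form. The hyperparameter values, sample sizes and function names are all hypothetical. It compares the posterior mean of the variance under a vague and an informative prior, for a small and a large sample.

```python
import random


def inverse_gamma_posterior_mean(data, shape, scale):
    """Posterior mean of a variance with an Inverse-Gamma(shape, scale) prior,
    given zero-mean normal data (conjugate update)."""
    n = len(data)
    sum_sq = sum(x * x for x in data)
    post_shape = shape + n / 2.0
    post_scale = scale + sum_sq / 2.0
    return post_scale / (post_shape - 1.0)  # requires post_shape > 1


def prior_gap(data, prior_a, prior_b):
    """Absolute difference between the posterior means under two priors."""
    mean_a = inverse_gamma_posterior_mean(data, *prior_a)
    mean_b = inverse_gamma_posterior_mean(data, *prior_b)
    return abs(mean_a - mean_b)


# Simulated data (assumption: illustrative only, not the survey data).
rng = random.Random(7)
true_sd = 1.2
small_sample = [rng.gauss(0.0, true_sd) for _ in range(20)]
large_sample = [rng.gauss(0.0, true_sd) for _ in range(2000)]

vague_prior = (0.001, 0.001)   # hypothetical noninformative hyperparameters
informative_prior = (5.0, 2.0)  # hypothetical prior with mean 2.0 / (5.0 - 1.0) = 0.5

gap_small = prior_gap(small_sample, vague_prior, informative_prior)
gap_large = prior_gap(large_sample, vague_prior, informative_prior)

print("gap between priors, n = 20:  ", gap_small)
print("gap between priors, n = 2000:", gap_large)
```

In this simplified setting the two priors give noticeably different answers for the small sample and nearly identical answers for the large one, which is why the sensitivity of the model to the hyperparameters is worth checking against the sample size.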

## Conclusion

Finally, we may conclude, in spite of the limitations pointed out, that the main objective of this study was reached: the BFAMD was presented in the context of empirical research in management, discussed from the perspective of the Bayesian paradigm for the construction of scales, and illustrated through an empirical application in marketing. Considering that the development of scales through factor analysis aims to contribute to appropriate decision-making in management, and considering the increasing misuse of ordinal scales as interval scales in organizational studies, our proposal seems to be effective for mixed data analyses. The findings presented here are not intended to be conclusive or limiting, but they offer a useful starting point for further theoretical and empirical research on Bayesian factor analysis.

## Figures

Table I. Summary of the posterior density

| Variables | Loadings (Mean) | Loadings (SD) | Heidel test (p-value) | Uniquenesses | Communalities |
|---|---|---|---|---|---|
| (1) This company deserves my trust | 32.58 | 0.17 | 0.12 | 11.65 | −0.16 |
| (2) I recommend this company to friends and family | 84.77 | 25.98 | 0.12 | 28.27 | −18.27 |
| (3) This company treats me as an important customer | 45.67 | 0.33 | 0.09 | 22.05 | −12.05 |
| (4) My shopping experiences with this company are better than I expected | 33.64 | 0.16 | 0.62 | 13.52 | −0.35 |
| (5) I identify myself with this company | 22.53 | 0.12 | 0.14 | 14.19 | −0.42 |
| (6) This company treats its customers with respect | 44.16 | 0.24 | 0.04 | 15.19 | −0.52 |
| (7) This company offers personalized customer service | 25.64 | 0.13 | 0.82 | 12.22 | −0.22 |
| (8) The products/services sold by this company are a good value (the benefits exceed the cost) | 33.36 | 0.18 | 0.36 | 12.57 | −0.25 |
| (9) This company solves problems efficiently | 31.00 | 0.17 | 0.33 | 12.08 | −0.21 |
| (10) This company tries to get to know my preferences, questions and suggestions | 30.07 | 0.19 | 0.10 | 16.13 | −0.61 |
| (11) This company rewards my loyalty | 19.35 | 0.08 | 0.20 | 12.59 | −0.26 |
| (12) This company has communication channels for complaints and suggestions (e.g., toll free, online customer service, etc.) | 38.54 | 0.28 | 0.84 | 13.55 | −0.35 |
| (13) This company provides information about its policies, projects, products/services and new releases | 39.84 | 0.30 | 0.22 | 15.32 | −0.53 |
| (14) I’m willing to buy other products/services from this company | 34.77 | 0.22 | 0.45 | 11.16 | −0.11 |
| (15) This company encourages interaction among its customers (e.g., events, Facebook) | 24.01 | 0.26 | 0.49 | 13.78 | −0.37 |
| (16) This company is socially and environmentally friendly | 36.43 | 0.55 | 0.84 | 14.70 | −0.47 |
| (17) This company has good facilities (either physical, in case of stores, or virtual, in case of websites) | 33.21 | 0.22 | 0.04 | 0.77 | 0.22 |
| (18) There are a few competitors to this company that have the same importance to me | 18.73 | 0.08 | 0.12 | 10.80 | −0.08 |
| (19) This company offers convenience to its customers (e.g., online services, home delivery, 24-7 customer service) | 27.09 | 0.14 | 0.47 | 10.67 | −0.06 |
| (20) The products/services sold by this company are high quality | 34.46 | 0.18 | 0.64 | 10.74 | −0.07 |

## References

Allen, D. G., Shore, L. M., & Griffeth, R. W. (2003). The role of perceived organizational support and supportive human resource practices in the turnover process. Journal of Management, *29*, 99–118.

Ansari, A., & Jedidi, K. (2000). Bayesian factor analysis for multilevel binary observations. Psychometrika, *65*, 475–496.

Aquino, K. (2000). Structural and individual determinants of workplace victimization: The effects of hierarchical status and conflict management style. Journal of Management, *26*, 171–193.

Armstrong, G. D. (1981). Parametric statistics and ordinal data: A pervasive misconception. Nursing Research, *30*, 60–62.

Arrindell, W. A., & Van der Ende, J. (1985). An empirical test of the utility of the observations-to-variables ratio in factor and components analysis. Applied Psychological Measurement, *9*, 165–178.

Berger, J. O. (1985). Statistical decision theory and Bayesian analysis, New York, NY: Springer.

Bradfield, M., & Aquino, K. (1999). The effects of blame attributions and offender likableness on forgiveness and revenge in the workplace. Journal of Management, *25*, 607–631.

Carifio, J. (1976). Assigning students to career education programs by preference: Scaling preference data for program assignments. Career Education Quarterly, *1*, 7–26.

Carifio, J. (1978). Measuring vocational preferences: Ranking versus categorical rating procedures. Career Education Quarterly, *3*, 34–66.

Carifio, J., & Perla, R. (2007). Ten common misunderstandings, misconceptions, persistent myths and urban legends about Likert scales and Likert response formats and their antidotes. Journal of Social Sciences, *3*, 106–116.

Carlson, D. S., & Perrewé, P. L. (1999). The role of social support in the stressor-strain relationship: An examination of work-family conflict. Journal of Management, *25*, 513–540.

Clinton, J. D., & Lewis, D. E. (2008). Expert opinion, agency characteristics, and agency preferences. Political Analysis, *16*, 3–20.

Comrey, A. L. (1978). Common methodological problems in factor analytic studies. Journal of Consulting and Clinical Psychology, *46*, 648–659.

Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in organizational research. Organizational Research Methods, *6*, 147–168.

Dakduk, S., Ter Horst, E., Santalla, Z., Molina, G., & Malavé, J. (2017). Customer behavior in electronic commerce: A Bayesian approach. Journal of Theoretical and Applied Electronic Commerce Research, *12*, 1–20.

Demo, G., & Rozzett, K. (2013). Customer relationship management scale for the business-to-consumer market: Exploratory and confirmatory validation and models comparison. International Business Research, *6*, 29–42.

Demo, G., Batelli, L., & Albuquerque, P. (2015). Customer relationship management scale for video games’ players: Exploratory and ordinal factor analysis. Revista Organizações Em Contexto, *11*, 285–312.

Demo, G., Rozzett, K., Fogaça, N., & Souza, T. (2018). Development and validation of a customer relationship scale for airline companies. Brazilian Business Review, *15*, 105–119.

Demo, G., Watanabe, E. A. D. M., Chauvet, D. C. V., & Rozzett, K. (2017). Customer relationship management scale for the B2C market: A cross-cultural comparison. RAM. Revista de Administração Mackenzie, *18*, 42–69.

Dunson, D. B. (2000). Bayesian latent variable models for clustered mixed outcomes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), *62*, 355–366.

Elenkov, D. S., & Manev, I. M. (2005). Top management leadership and influence on innovation: The role of sociocultural context. Journal of Management, *31*, 381–402.

Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, *4*, 272.

Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, *7*, 286.

Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The application of exploratory factor analysis in applied psychology: A critical review and analysis. Personnel Psychology, *39*, 291–314.

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian data analysis, New York, NY: CRC.

Göb, R., McCollin, C., & Ramalhoto, M. (2007). Ordinal methodology in the analysis of Likert scales. Quality & Quantity, *41*, 601–626.

Gopinath, C., & Becker, T. E. (2000). Communication, procedural justice, and employee attitudes: Relationships under conditions of divestiture. Journal of Management, *26*, 63–83.

Gorsuch, R. L. (1974). Factor analysis, Philadelphia: W. B. Saunders.

Grönroos, C. (2017). Relationship marketing readiness: Theoretical background and measurement directions. Journal of Services Marketing, *31*, 218–225.

Heidelberger, P., & Welch, P. D. (1981). A spectral method for confidence interval generation and run length control in simulations. Communications of the ACM, *24*, 233–245.

Hinkin, T. R. (1995). A review of scale development practices in the study of organizations. Journal of Management, *21*, 967–988.

Holgado-Tello, F. P., Chacón-Moscoso, S., Barbero-García, I., & Vila-Abad, E. (2010). Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity, *44*, 153–166.

Howson, C., & Urbach, P. (2006). Scientific reasoning: The Bayesian approach, 3rd ed., Peru, IL: Open Court.

Hui, C., & Lee, C. (2000). Moderating effects of organization-based self-esteem on organizational uncertainty: Employee response relationships. Journal of Management, *26*, 215–232.

Hurley, A. E., Scandura, T. A., Schriesheim, C. A., Brannick, M. T., Seers, A., Vandenberg, R. J., & Williams, L. J. (1997). Exploratory and confirmatory factor analysis: Guidelines, issues, and alternatives. Journal of Organizational Behavior, *18*, 667–683.

Jamieson, S. (2004). Likert scales: How to (ab)use them. Medical Education, *38*, 1217–1218.

Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, *36*, 347–387.

Kidder, D. L. (2002). The influence of gender on the performance of organizational citizenship behaviors. Journal of Management, *28*, 629–648.

Kruschke, J. K., Aguinis, H., & Joo, H. (2012). The time has come: Bayesian methods for data analysis in the organizational sciences. Organizational Research Methods, *15*, 722–752.

Kuzon, W. M., Jr, Urbanchek, M. G., & McCabe, S. (1996). The seven deadly sins of statistical analysis. Annals of Plastic Surgery, *37*, 265–272.

Lee, H. B., & Comrey, A. L. (1992). A first course in factor analysis, 2nd ed., Hillsdale, NJ: Erlbaum.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, *22*, 55.

Lohrke, F. T., Carson, C. M., & Lockamy, A. (2018). Bayesian analysis in entrepreneurship decision-making research: Review and future directions. Management Decision, *56*, 972–986.

Lubatkin, M. H., Simsek, Z., Ling, Y., & Veiga, J. F. (2006). Ambidexterity and performance in small-to medium-sized firms: The pivotal role of top management team behavioral integration. Journal of Management, *32*, 646–672.

MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, *4*, 84–99.

Malhotra, N. (1999). Marketing research: An applied orientation, 3rd ed., New York, NY: Prentice Hall.

Martilla, J. A., & Davis, W. C. (1975). Four subtle sins in marketing research. Journal of Marketing, *39*, 8–15.

Merkle, E. C., & Wang, T. (2018). Bayesian latent variable models for the analysis of experimental psychology data. Psychonomic Bulletin & Review, *25*, 256–270.

Mittal, V., & Kamakura, A. (2001). Satisfaction, repurchase intent, and repurchase behavior: Investigating the moderating effect of customer. Journal of Marketing Research, *38*, 131–142.

Norris, M., & Lecavalier, L. (2010). Evaluating the use of exploratory factor analysis in developmental disability psychological research. Journal of Autism and Developmental Disorders, *40*, 8–20.

Payne, A. (2012). Handbook of CRM: Achieving excellence in customer management, Oxford: Elsevier.

Plummer, M., Best, N., Cowles, K., Vines, K., & Plummer, M. M. (2007). The CODA package, France: International Agency for Research on Cancer. Retrieved from http://www-fis.iarc.fr/coda (accessed 30 September 2013).

Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research: Problems and prospects. Journal of Management, *12*, 531–544.

Quinn, K. M. (2004). Bayesian factor analysis for mixed ordinal and continuous responses. Political Analysis, *12*, 338–353.

Ravald, A., & Grönroos, C. (1996). The value concept and relationship marketing. European Journal of Marketing, *30*, 19–30.

Rozzett, K., & Demo, G. (2010). Desenvolvimento e validação fatorial da escala de relacionamento com clientes (ERC). Revista de Administração de Empresas, *50*, 383–395.

Rummel, R. J. (1988). Applied factor analysis, 4th ed., Chicago: Northwestern University Press.

Samaniego, F. J. (2010). A comparison of the Bayesian and frequentist approaches to estimation, New York, NY: Springer.

Scheines, R., Hoijtink, H., & Boomsma, A. (1999). Bayesian estimation and testing of structural equation models. Psychometrika, *64*, 37–52.

Schuler, R. S., & Jackson, S. E. (1989). Determinants of human resource management priorities and implications for industrial relations. Journal of Management, *15*, 89–99.

Scussel, F., & Demo, G. (2019). The relational aspects of luxury consumption in Brazil: The development of a luxury customer relationship perception scale and the analysis of brand personality influence on relationship perception on luxury fashion brands. Brazilian Business Review, *16*, 174–190.

Spearman, C. E. (1904). General intelligence objectively determined and measured. The American Journal of Psychology, *15*, 201–293.

Stevens, M. J., & Campion, M. A. (1999). Staffing work teams: Development and validation of a selection test for teamwork settings. Journal of Management, *25*, 207–228.

Stevens, S. S. (1946). On the theory of scales of measurement. Science, *103*, 677–680.

Suhr, D. D. (2006). Exploratory or confirmatory factor analysis?, Cary, NC: SAS Institute.

Thompson, B. (2004). Exploratory and confirmatory factor analysis: Understanding concepts and applications, Washington, DC: American Psychological Association.

Toedt, M. (2014). A model for loyalty in the context of customer relationship marketing. European Scientific Journal, *4*, 229–237.

Tsai, W. C. (2001). Determinants and consequences of employee displayed positive emotions. Journal of Management, *27*, 497–512.

Van de Schoot, R., Winter, S. D., Ryan, O., Zondervan-Zwijnenburg, M., & Depaoli, S. (2017). A systematic review of Bayesian articles in psychology: The last 25 years. Psychological Methods, *22*, 217.

Vigderhous, G. (1977). The level of measurement and “permissible” statistical analysis in social research. The Pacific Sociological Review, *20*, 61–72.

Zahra, S. A., Neubaum, D. O., & Huse, M. (2000). Entrepreneurship in medium-size companies: Exploring the effects of ownership and governance systems. Journal of Management, *26*, 947–976.

Zyphur, M. J., & Oswald, F. L. (2015). Bayesian estimation and inference: A user’s guide. Journal of Management, *41*, 390–420.

## Further reading

Wilson, E. J., & Vlosky, R. P. (1997). Partnering relationship activities: Building theory from case study research. Journal of Business Research, *39*, 59–70.

## Acknowledgements

Pedro Albuquerque led on methodology and contributed to formal analysis. Gisela Demo led on data curation and contributed to writing the original draft and to reviewing and editing. Solange Alfinito also contributed to writing the original draft and to reviewing and editing. Kesia Rozzett contributed to data curation and validation.