Choice Modelling: The State-of-the-art and The State-of-practice


Proceedings from the Inaugural International Choice Modelling Conference



Table of contents

(28 chapters)



This paper discusses the influence of human sociality on choice behavior through association with social networks, and the influence of these networks on constraints, perceptions, preferences, and decision-making processes. It considers ways to incorporate these factors into choice models while retaining the predictive aspects of the theory of individual rationality. Finally, the paper outlines an econometric method for resolving the “reflection problem” of determining whether social affiliations follow preferences, or preferences follow social affiliations, by distinguishing opportunity-based from preference-based motivations for association with social networks.


Purpose: This chapter introduces a choice modeling framework that explicitly represents the planning and action stages of the choice process.

Methodology: A discussion of evidence from behavioral research is followed by the development of a discrete choice modeling framework with explicit planning and action submodels. The plan/action choice model is formulated for both static and dynamic contexts; the latter is based on the Hidden Markov Model. Plans are often unobservable and are therefore treated as latent variables in model estimation, using observed actions.

Implications: By modeling the interactions between the planning and action stages, we are able to incorporate richer specifications in choice models with better predictive and policy analysis capabilities. The applications of this research in areas such as driving behavior, route choice, and mode choice demonstrate the advantages of the plan/action model in comparison to a “black box” choice model in terms of improved microsimulations of behaviors that better represent real-life situations. As such, the outcomes of this chapter are relevant to researchers and policy analysts.


It has long been recognised that humans draw from a large pool of processing aids to help manage the everyday challenges of life. It is not uncommon to observe individuals adopting simplifying strategies when faced with ever increasing amounts of information to process, especially for decisions where the chosen outcome will have a very marginal impact on their well-being. The transaction costs associated with processing all new information often exceed the benefits from such a comprehensive review. The accumulating life experiences of individuals are also often brought to bear as reference points to assist in selectively evaluating the information placed in front of them. These features of human processing and cognition are not new to the broad literature on judgment and decision-making, where heuristics are offered up as deliberative analytic procedures intentionally designed to simplify choice. What is surprising is the limited recognition of the heuristics that individuals use to process the attributes in stated choice experiments. In this paper we present a case for a utility-based framework within which some appealing processing strategies are embedded (without the aid of supplementary self-stated intentions), as well as models conditioned on self-stated intentions represented as single items of process advice, and illustrate the implications for the willingness to pay for travel time savings of embedding each heuristic in the choice process. Given the controversy surrounding the reliability of self-stated intentions, we introduce a framework in which mixtures of process advice embedded within a belief function might be used in future empirical studies to condition choice, as a way of judging the strength of the evidence.


Many consumer choice situations are characterized by the simultaneous demand for multiple alternatives that are imperfect substitutes for one another. A simple and parsimonious multiple discrete-continuous extreme value (MDCEV) econometric approach to handle such multiple discreteness was formulated by Bhat (2005) within the broader Kuhn–Tucker (KT) multiple discrete-continuous economic consumer demand model of Wales and Woodland (1983). In this chapter, the focus is on presenting the basic MDCEV model structure, discussing its estimation and use in prediction, formulating extensions of the basic MDCEV structure, and presenting applications of the model. The paper examines several issues associated with the MDCEV model and other extant KT multiple discrete-continuous models. Specifically, the paper discusses the utility function form that enables clarity in the role of each parameter in the utility specification, presents identification considerations associated with both the utility functional form and the stochastic nature of the utility specification, extends the MDCEV model to the case of price variation across goods and to general error covariance structures, discusses the relationship among earlier KT-based multiple discrete-continuous models, and illustrates the many technical nuances and identification considerations of the multiple discrete-continuous model structure. Finally, we discuss the many applications of the MDCEV model and its extensions in various fields.
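The role of each parameter in the chapter's utility specification can be illustrated numerically. The function below evaluates a commonly cited additive MDCEV-style utility form; the exact functional form and parameter names here are an assumption for illustration, not a reproduction of the chapter's own specification.

```python
def mdcev_utility(x, psi, gamma, alpha):
    """Additive utility over K goods in a form often associated with MDCEV
    models (illustrative):
        U(x) = sum_k (gamma_k / alpha_k) * psi_k * ((x_k / gamma_k + 1) ** alpha_k - 1)
    psi_k   : baseline marginal utility of good k (marginal utility at x_k = 0)
    gamma_k : translation parameter permitting corner solutions (x_k = 0)
    alpha_k : satiation parameter; alpha_k < 1 gives diminishing returns
    """
    return sum(
        (g / a) * p * ((xk / g + 1.0) ** a - 1.0)
        for xk, p, g, a in zip(x, psi, gamma, alpha)
    )

# At zero consumption utility is zero and the marginal utility of good k
# equals psi_k, so goods with higher baseline utility are consumed first;
# satiation then spreads consumption across imperfect substitutes.
```

In this form, the translation parameter keeps utility finite at zero consumption, which is what permits the corner solutions (unconsumed goods) that distinguish multiple discrete-continuous demand from single discrete choice.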


Facial expression recognition by human observers is affected by subjective components. Indeed, there is no ground truth. We have developed Discrete Choice Models (DCM) to capture the human perception of facial expressions. In a first step, the static case is treated, that is, modelling the perception of facial images. Image information is extracted using a computer vision tool called the Active Appearance Model (AAM). DCM attributes are based on the Facial Action Coding System (FACS), Expression Descriptive Units (EDUs) and outputs of the AAM. Behavioural data have been collected using an Internet survey, where respondents are asked to label facial images from the Cohn–Kanade database with expressions. Different models were estimated by likelihood maximization using the obtained data. In a second step, the proposed static discrete choice framework is extended to the dynamic case, which considers facial videos instead of images. The model theory is described, and another Internet survey is currently being conducted in order to obtain expression labels on videos. In this second Internet survey, videos come from the Cohn–Kanade database and the Facial Expressions and Emotions Database (FEED).



Stated choice experiments can be used to estimate the parameters in discrete choice models by showing hypothetical choice situations to respondents. The attribute levels in each choice situation are determined by an underlying experimental design. Often, an orthogonal design is used, although recent studies have shown that better experimental designs exist, such as efficient designs. These designs provide more reliable parameter estimates. However, they require prior information about the parameter values, which is often not readily available. Serial efficient designs are proposed in this paper, in which the design is updated during the survey. In contrast to adaptive conjoint, serial conjoint only changes the design across respondents, not within respondents, thereby avoiding endogeneity bias as much as possible. After each respondent, new parameters are estimated and used as priors for generating a new efficient design. Results using the multinomial logit model show that such a serial design, starting from zero initial prior values, provides the same reliability of the parameter estimates as the best efficient design (based on the true parameters). Any possible bias can be avoided by using an orthogonal design for the first few respondents. Serial designs do not suffer from misspecification of the priors, as the priors are continuously updated. The disadvantage is the extra implementation cost of an automated parameter estimation and design generation procedure in the survey. Also, the respondents have to be surveyed in a mostly serial fashion instead of all in parallel.
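One iteration of the serial procedure described above can be sketched as a loop: score candidate designs by their D-error under the current priors, field the best one, re-estimate, and repeat. The snippet below shows the scoring-and-selection step for an MNL kernel; the random candidate designs and function names are illustrative, not the authors' design-generation algorithm.

```python
import math
import numpy as np

def mnl_d_error(design, beta):
    """D-error of an MNL design: det(inverse Fisher information)^(1/K).
    design: array of shape (S choice situations, J alternatives, K attributes)."""
    K = design.shape[2]
    info = np.zeros((K, K))
    for X in design:                       # X is J x K for one situation
        v = X @ beta
        p = np.exp(v - v.max())
        p /= p.sum()
        # MNL information contribution: X' (diag(p) - p p') X
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X
    sign, logdet = np.linalg.slogdet(info)
    return math.exp(-logdet / K) if sign > 0 else float("inf")

def pick_efficient_design(candidates, beta):
    """Select the candidate design with the lowest D-error under priors beta."""
    return min(candidates, key=lambda d: mnl_d_error(d, beta))

# One serial step: zero initial priors (as in the paper); after each
# respondent, the priors would be re-estimated from all choices so far
# and a new design selected for the next respondent.
rng = np.random.default_rng(0)
candidates = [rng.integers(0, 2, size=(6, 2, 3)).astype(float) for _ in range(50)]
priors = np.zeros(3)
best = pick_efficient_design(candidates, priors)
```

With zero priors this reduces to utility-neutral efficiency, which is why an orthogonal design is a reasonable starting point for the first few respondents.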


Currently, the state of practice in experimental design centres on orthogonal designs (Alpizar et al., 2003), which are suitable when applied to surveys with a large sample size. In a stated choice experiment involving interdependent freight stakeholders in Sydney (see Hensher & Puckett, 2007; Puckett et al., 2007; Puckett & Hensher, 2008), one significant empirical constraint was the difficulty of recruiting unique decision-making groups to participate. The expected relatively small sample size led us to seek an alternative experimental design. That is, we decided to construct an optimal design that utilised extant information regarding the preferences and experiences of respondents, to achieve statistically significant parameter estimates under a relatively low sample size (see Bliemer & Rose, 2006).

The D-efficient experimental design developed for the study is unique, in that it centred on the choices of interdependent respondents. Hence, the generation of the design had to account for the preferences of two distinct classes of decision makers: buyers and sellers of road freight transport. This paper discusses the process by which these (non-coincident) preferences were used to seed the generation of the experimental design, and then examines the relative power of the design through an extensive bootstrap analysis of increasingly restricted sample sizes for both decision-making classes in the sample. We demonstrate the strong potential for efficient designs to achieve empirical goals under sampling constraints, whilst identifying limitations to their power as sample size decreases.


There have always been concerns about task complexity and respondent burden in the context of stated choice (SC) studies, with calls to limit the number of alternatives, attributes and choice sets. At the same time, some researchers have made the case that too simplistic a design might be counterproductive, given that such designs may omit important decision variables. This paper takes another look at the effects of design complexity on model results. Specifically, we make use of an approach devised by Hensher (2004) in which different respondents in the study are presented with designs of different complexity, and look specifically at effects on model scale in a UK context, adding to previous Chilean evidence by Caussade et al. (2005). The results of our study indicate that the impact of design complexity may be somewhat lower than anticipated, and that more complex designs may not necessarily lead to poorer results. In fact, some of the more complex designs lead to higher scale in the models. Overall, our findings suggest that respondents can cope adequately with a large number of attributes, alternatives and choice sets. The implications for practical research are potentially significant, given the widespread use, especially in Europe, of stated choice designs with a limited number of alternatives and attributes.


In this paper, we analyze statistical properties of stated choice experimental designs when model attributes are functions of several design attributes. The scheduling model is taken as an example. This model is frequently used for estimating the willingness to pay (WTP) for a reduction in schedule delay early and schedule delay late. These WTP values can be used to calculate the costs of travel time variability. We apply the theoretical results to the scheduling model and design the choice experiment using measures of efficiency (S-efficiency and WTP-efficiency). In the simulation exercise, we show that the designs based on these efficiency criteria perform on average better than the designs used in the literature in terms of the WTP for the travel time, schedule delay early, and schedule delay late variables. However, the gains in efficiency decrease with the number of respondents. Surprisingly, the orthogonal design performs rather well in the example we present.



Mixed logit models can represent heterogeneity across individuals, in both observed and unobserved preferences, but require computationally expensive calculations to compute probabilities. A few methods for including error covariance heterogeneity in closed-form models have been proposed, and this paper adds to that collection, introducing a new form of the Network GEV model that sub-parameterizes the allocation values for the assignment of alternatives (and sub-nests) to nests. This change allows the incorporation of systematic (nonrandom) error covariance heterogeneity across individuals, while maintaining a closed form for the calculation of choice probabilities. Also explored is a latent class model of nested models, which can similarly express heterogeneity. The heterogeneous models are compared to a similar model with homogeneous covariance in a realistic scenario and are shown to significantly outperform the homogeneous model; the level of improvement is especially large in certain market segments. The results also suggest that the two heterogeneous models introduced herein may be functionally equivalent.


The search for flexible models has led the simple multinomial logit model to evolve into the powerful but computationally very demanding mixed multinomial logit (MMNL) model. The same search has led to hybrid choice model (HCM) formulations that explicitly incorporate psychological factors affecting decision making in order to enhance the behavioral representation of the choice process. HCMs expand on standard choice models by including attitudes, opinions, and perceptions as psychometric latent variables.

In this paper we describe the classical estimation technique for a simulated maximum likelihood (SML) solution of the HCM. To show its feasibility, we apply it to data of stated personal vehicle choices made by Canadian consumers when faced with technological innovations.

We then go beyond classical methods, and estimate the HCM using a hierarchical Bayesian approach that exploits HCM Gibbs sampling considering both a probit and a MMNL discrete choice kernel. We then carry out a Monte Carlo experiment to test how the HCM Gibbs sampler works in practice. To our knowledge, this is the first practical application of HCM Bayesian estimation.

We show that although HCM joint estimation requires the evaluation of complex multi-dimensional integrals, SML can be successfully implemented. The HCM framework not only proves to be capable of introducing latent variables, but also makes it possible to tackle the problem of measurement errors in variables in a very natural way. We also show that working with Bayesian methods has the potential to break down the complexity of classical estimation.


In previous research (Abou-Zeid et al., 2008), we postulated that people report different levels of travel happiness under routine and nonroutine conditions and supported this hypothesis through an experiment requiring habitual car drivers to switch temporarily to public transportation. This chapter develops a general modeling framework that extends random utility models by using happiness measures as indicators of utility in addition to the standard choice indicators, and applies the framework to modeling happiness and travel mode switching using the data collected in the experiment. The model consists of structural equations for pretreatment (remembered) and posttreatment (decision) utilities and explicitly represents their correlations, and measurement equations expressing the choice and the pretreatment and posttreatment happiness measures as a function of the corresponding utilities. The results of the empirical model are preliminary but support the premise that the extended modeling framework, which includes happiness, will potentially enhance behavioral models based on random utility theory by making them more efficient.


This paper deals with choice set generation for the estimation of route choice models. Two different frameworks are presented in the literature: one aims at generating consideration sets, and one samples alternatives from the set of all paths. Most algorithms are designed to generate consideration sets but in general fail to do so because some observed paths are not generated. In the sampling approach, the observed path as well as all considered paths are in the choice set by design. However, few algorithms can actually be used in the sampling context.

In this paper, we present the two frameworks, with an emphasis on the sampling approach, and discuss the applicability of existing algorithms to each of the frameworks.



It is often found that the value of travel time (VTT) is higher for car drivers than for public transport passengers. This paper examines the possible explanation that the difference is due to a selection effect, which results in an inability to measure the effect of a mode difference, e.g., comfort, among transport modes. We specify a model that captures the mode difference through a mode dummy and use econometric techniques that allow the mode dummy to be treated as the result of an individual choice, and hence endogenous. Using first a standard logit model, we find a large and significant difference between the VTT for bus and car. When we control for endogeneity using instruments, the mode dummy becomes smaller and only marginally significant. Our investigation is novel in that it allows for endogeneity in the estimation of VTT, but like other applications using instruments, the results indicate that we have difficulty in finding good instrumental variables.


The presence of respondents with apparently extreme sensitivities in choice data may have an important influence on model results, yet their role is rarely assessed or even explored. Irrespective of whether such outliers are due to genuine preference expressions, their presence suggests that specifications relying on preference heterogeneity may be more appropriate. In this paper, we compare the potential of discrete and continuous mixture distributions in identifying and accommodating extreme coefficient values. To test our methodology, we use five stated preference datasets (four simulated and one real). The real data were collected to estimate the existence value of rare and endangered fish species in Ireland.


Endogeneity or nonorthogonality in discrete choice models occurs when the systematic part of the utility is correlated with the error term. Under this misspecification, the model's estimators are inconsistent. When endogeneity occurs at the level of each observation, the principal technique used to treat it is the control-function method, where a function that accounts for the endogenous part of the error term is constructed and then included as an additional variable in the choice model. Alternatively, the latent-variable method can also address endogeneity. In this case, the omitted quality attribute is considered as a latent variable and modeled as a function of observed variables and/or measured through indicators. The link between the control-function and the latent-variable methods in the correction for endogeneity has not been established in previous work. This paper analyzes the similarities and differences between a set of variations of both methods, establishes the formal link between them in the correction for endogeneity, and illustrates their properties using a Monte Carlo experiment. The paper concludes with suggestions for future lines of research in this area.
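The control-function construction described above can be sketched in two stages on simulated data: regress the endogenous attribute on an instrument, then carry the stage-1 residual into the choice model as an extra variable that absorbs the endogenous part of the error term. The data-generating process and variable names below are invented for illustration, not drawn from the paper's Monte Carlo experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)                 # instrument: shifts price, unrelated to quality
omega = rng.normal(size=n)             # omitted quality attribute (unobserved)
# Price is endogenous: it responds to both the instrument and the omitted quality.
price = 1.0 + 0.8 * z + 0.5 * omega + rng.normal(scale=0.1, size=n)

# Stage 1: regress the endogenous attribute on the instrument (OLS).
Z = np.column_stack([np.ones(n), z])
coef, *_ = np.linalg.lstsq(Z, price, rcond=None)
delta = price - Z @ coef               # control function = stage-1 residual

# Stage 2 (schematic): delta enters the utility alongside price,
#   V = beta_p * price + beta_d * delta + ...
# so the part of the error correlated with price is absorbed by delta
# and the remaining error is orthogonal to the observed attributes.
```

By construction the residual is orthogonal to the instrument but still carries the omitted-quality component, which is exactly what makes it useful as a control variable.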


This article addresses simultaneously two important features in random utility maximisation (RUM) choice modelling: choice set generation and unobserved taste heterogeneity. We develop and compare definitions and properties of econometric specifications based on mixed logit (MXL) and latent class logit (LCL) RUM models in the additional presence of prior compensatory screening decision rules. The latter allow for continuous latent bounds that determine which choice alternatives are, or are not, considered for decision making. We also evaluate and test each specification against the others in an application to home-to-work mode choice in the Paris region of France using 2002 data.



This paper presents new evidence that the error in estimating the economic welfare of a transport scheme can be very large. This is for two reasons. Firstly, when cost changes are large, the income effect can be significant. This means the change in consumer surplus is no longer a good estimate of the compensating variation — the true measure of welfare benefit. Secondly, in the presence of large cost changes, estimating the change in consumer surplus using the Rule of Half can lead to large errors. The paper uses a novel approach based on stated choice and contingent valuation data to estimate the size of this error for the provision of fixed links to islands in the Outer Hebrides of Scotland.


Interest in car-sharing initiatives, as a tool for improving transport network efficiency in urban areas and on interurban links, has grown in recent years. They have often been proposed as a more cost effective alternative to other modal shift and congestion relief initiatives, such as public transport or highway improvement schemes; however, with little implementation in practice, practitioners have only limited evidence for assessing their likely impacts.

This study reports the findings of a Stated Preference (SP) study aimed at understanding the value that car drivers put on car sharing as opposed to single occupancy trips. Following an initial pilot period, 673 responses were received from a web-based survey conducted in June 2008 amongst a representative sample of car driving commuters in Scotland.

An important methodological aspect of this study was the need to account for differences in behaviour to identify those market segments with the greatest propensity to car share. To this end, we estimated a range of choice model forms and compared the ability of each to consistently identify individual behaviours. More specifically, this included a comparison of:

Standard market segmentation approaches based on multinomial logit with attribute coefficients estimated by reported characteristics (e.g. age, income, etc.);

A two-stage mixed logit approach involving the estimation of random parameters logit models followed by an examination of individual respondents' choices to arrive at estimates of their parameters, conditional on known distributions across the population (following Revelt & Train, 1999); and

A latent-class model involving the specification of C classes of respondent, each with their own coefficients, and assigning each individual a probability that they belong to a given class based upon their observed choices, socioeconomic characteristics and their reported attitudes.

As hypothesised, there are significant variations in tastes and preferences across market segments, particularly for household car ownership, gender, age group, interest in car pooling, current journey time and sharing with a stranger (as opposed to a family member/friend). Comparing the sensitivity of demand to a change from a single-occupancy to a car-sharing trip, it can be seen that the latter imposes a ‘penalty’ equivalent to 29.85 IVT minutes using the mixed logit structure and 26.68 IVT minutes for the multinomial specification. Segmenting this latter value according to the number of cars owned per household results in ‘penalties’ equivalent to 46.51 and 26.42 IVT minutes for one and two-plus car owning households respectively.
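The second approach listed above, conditioning on individual choices following Revelt & Train (1999), can be sketched for a single random coefficient with an MNL kernel: the individual-level estimate is a likelihood-weighted average of draws from the estimated population distribution. The function below is a schematic one-parameter version written for illustration, not the study's actual implementation.

```python
import numpy as np

def conditional_beta(X_list, y_list, mu, sigma, R=2000, seed=0):
    """Simulated posterior mean of an individual's taste coefficient,
    in the spirit of Revelt & Train (1999):
        E[b | choices] = sum_r b_r * L(choices | b_r) / sum_r L(choices | b_r)
    X_list: per choice situation, array of the attribute for each of J alternatives;
    y_list: index of the chosen alternative in each situation;
    mu, sigma: estimated population distribution of the coefficient (normal)."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(mu, sigma, size=R)          # draws from the population
    like = np.ones(R)
    for X, y in zip(X_list, y_list):
        v = np.outer(draws, X)                     # R x J utilities
        p = np.exp(v - v.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        like *= p[:, y]                            # MNL probability of the chosen alternative
    return (draws * like).sum() / like.sum()
```

A respondent who repeatedly chooses the high-attribute alternative receives a posterior mean above the population mean, which is how the approach recovers individual-level heterogeneity from the population-level model.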


Discrete choice models based on cross-sectional data have the important limitation of not considering habit and inertia effects, and this may be especially significant in changing environments; notwithstanding, most demand models to date have been based on this type of data. To avoid this limitation, we started by building a mode choice panel around a drastically changing environment: the introduction of a radically new public transport system for the conurbation of Santiago de Chile. This paper presents the formulation and estimation of a family of discrete choice models that enables us to treat two main elements: (i) the relative values of the modal attributes, as usual, and (ii) the shock resulting from the introduction of this radically new policy. We also analyse the influence of socioeconomic variables on these two forces.

We found that introducing this drastic new policy may even modify the perception of attribute values; in fact, the changes can be different among individuals, as socioeconomic characteristics act as either enhancers or softeners of the shock effects generated by the new policy.


We review what is known and what is still unknown about the process of revealing the impact of unreliability on travel choices. We do this from the perspective of a demand-modelling practitioner who wishes to allow for the benefits from improved reliability in the assessment of a transport scheme. We discuss the travel responses affected by unreliability, the requirements from the data used to model these responses, the explanatory variables used in these models and the additional information required as input when applying them. One of our findings is that there is a conflict between existing studies in their conclusions about the aversion to early arrival. Another is that it is unclear whether the common simplified treatment of the distribution of preferred arrival times is acceptable. We also suggest that the dominance of departure time shifting as a primary response to unreliability might refute the common assumptions about travellers' choice hierarchy, which were established without considering the impact of unreliability; this raises questions about the robustness of assignment models that do not allow time shifting.



Discrete choice models are widely used for estimating the effects of changes in attributes on a given product's likely market share. These models can be applied directly to situations in which the choice set is constant across the market of interest or in which the choice set varies systematically across the market. In both of these applications, the models are used to determine the effects of different attribute levels on market shares among the available alternatives, given predetermined choice sets, or of varying the choice set in a straightforward way.

Discrete choice models can also be used to identify the “optimal” configuration of a product or service in a given market. This can be computationally challenging when preferences vary with respect to the ordering of levels within an attribute as well as the strengths of preferences across attributes. However, this type of optimization can be a relatively straightforward extension of the typical discrete choice model application.

In this paper, we describe two applications that use discrete choice methods to provide a more robust metric for use in Total Unduplicated Reach and Frequency (TURF) applications: apparel and food products. Both applications involve products for which there is a high degree of heterogeneity in preferences among consumers.

We further discuss a significant challenge in using TURF — that with multi-attributed products the method can become computationally intractable — and describe a heuristic approach to support food and apparel applications. We conclude with a summary of the challenges in these applications, which are yet to be addressed.
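The abstract does not spell out the heuristic it describes; a standard greedy heuristic for reach maximization, shown below as an assumed illustration, sidesteps the combinatorial intractability by adding at each step the product that covers the most consumers not yet reached.

```python
def greedy_turf(reach_sets, k):
    """Greedy heuristic for TURF line optimization (illustrative).
    reach_sets: dict mapping product -> set of consumer ids it reaches;
    k: number of products to include in the line.
    Returns the chosen products and the total unduplicated reach."""
    chosen, covered = [], set()
    candidates = dict(reach_sets)
    for _ in range(min(k, len(candidates))):
        # Pick the product adding the most consumers not already covered.
        best = max(candidates, key=lambda p: len(candidates[p] - covered))
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, len(covered)
```

Exhaustive TURF evaluates all C(n, k) product combinations; the greedy pass evaluates only n x k candidate additions, which is what keeps multi-attributed product spaces tractable at the cost of a possibly suboptimal line.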


An assumption made in many applications of stated preference modeling is that preferences remain stable over time and over multiple exposures to information about choice alternatives. However, there are many domains where this assumption can be challenged. One of these is where individuals learn about new products. This paper aims to test how attribute preferences as measured in an experimental choice task shift when respondents are exposed to new product information. The paper presents results from a study investigating consumer preferences for a new consumer electronics product conducted among 400 respondents from a large consumer panel. All respondents received several choice tasks and were then able to read additional information about the new product. After this they completed an additional set of choice tasks. All choices were from pairs of new product alternatives that varied across eight attributes designed according to an orthogonal plan. Using heteroscedastic logit modeling, the paper analyses the shifts in attribute utilities and scale variances that result from the exposure to product information. Results show that as respondents become better informed about a new attribute the attribute has a greater influence on their choices. In addition a significant shift in scale variance is observed, suggesting an increase in preference heterogeneity after information exposure.


We investigate discrepancies between willingness to pay (WTP) and willingness to accept (WTA) in the context of a stated choice experiment. Using data on customer preferences for water services where respondents were able to both ‘sell’ and ‘buy’ the choice experiment attributes, we find evidence of non-linearity in the underlying utility function even though the range of attribute levels is relatively small. Our results reveal the presence of significant loss aversion in all the attributes, including price. We find the WTP–WTA schedule to be asymmetric around the current provision level and that the WTP–WTA ratio varies according to the particular provision change under consideration. Such reference point findings are of direct importance for practitioners and decision-makers using choice experiments for economic appraisal such as cost–benefit analysis, where failure to account for non-linearity in welfare estimates may significantly over- or under-state individuals' preferences for gains and for avoiding losses respectively.


Ranked preference data arise when a set of judges rank, in order of their preference, a set of objects. Such data arise in preferential voting systems and market research surveys. Covariate data associated with the judges are also often recorded. Such covariate data should be used in conjunction with preference data when drawing inferences about judges.

To cluster a population of judges, the population is modeled as a collection of homogeneous groups. The Plackett-Luce model for ranked data is employed to model a judge's ranked preferences within a group. A mixture of Plackett-Luce models is employed to model the population of judges, where each component in the mixture represents a group of judges.

Mixture of experts models provide a framework in which covariates are included in mixture models. Covariates are included through the mixing proportions and the component density parameters. A mixture of experts model for ranked preference data is developed by combining a mixture of experts model and a mixture of Plackett-Luce models. Particular attention is given to the manner in which covariates enter the model. The mixing proportions and group specific parameters are potentially dependent on covariates. Model selection procedures are employed to choose optimal models.

Model parameters are estimated via the ‘EMM algorithm’, a hybrid of the expectation–maximization and the minorization–maximization algorithms. Examples are provided through a menu survey and through Irish election data. Results indicate mixture modeling using covariates is insightful when examining a population of judges who express preferences.
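The Plackett-Luce model and its mixture, as described above, have a compact likelihood: a ranking's probability is a product of sequential choice probabilities, each proportional to the worth of the chosen object among those remaining. A minimal sketch (function and parameter names are illustrative):

```python
def plackett_luce_prob(ranking, worth):
    """Probability of an observed ranking under the Plackett-Luce model:
    the judge picks her first choice with probability proportional to its
    worth, then her second from the remaining objects, and so on.
    ranking: object ids in order of preference; worth: dict id -> positive value."""
    prob = 1.0
    remaining = sum(worth[o] for o in ranking)
    for obj in ranking:
        prob *= worth[obj] / remaining
        remaining -= worth[obj]
    return prob

def mixture_prob(ranking, weights, worths):
    """Mixture of Plackett-Luce models: weights are the mixing proportions,
    worths the group-specific worth parameters (one dict per group)."""
    return sum(w * plackett_luce_prob(ranking, wo)
               for w, wo in zip(weights, worths))
```

In the mixture-of-experts extension the paper describes, the mixing proportions and the group-specific worths would each become functions of the judge's covariates rather than constants.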


This paper applies the mixed logit and the latent class models to analyse the heterogeneity in foreign investment location choices in Central and Eastern Europe. The empirical results show that the responsiveness of the probabilities of choosing to invest in a particular location to country-level variables differs both across sectors and across firms of different characteristics. The paper highlights the superiority of the latent class model with regard to the model fit and the interpretation of results.


This paper introduces a behavioral framework to model residential relocation decisions in island areas, in which the decision is influenced by the characteristics of island regions, policy variables related to accessibility measures and housing prices in the proposed island area, as well as personal, household (HH), job, and latent characteristics of the decision makers.

The model framework corresponds to an integrated choice and latent variable (ICLV) setting where the discrete choice model includes latent variables that capture attitudes and perceptions of the decision makers. The latent variable model is composed of a group of structural equations describing the latent variables as a function of observable exogenous variables and a group of measurement equations, linking the latent variables to observable indicators.

An empirical study has been developed for the Greek Aegean island area. Data were collected from 900 HHs in Greece contacted via telephone. The HHs were presented with hypothetical scenarios involving policy variables, with 2010 as the reference year. ICLV binary logit (BL) and mixed binary logit (MBL) relocation choice models were estimated sequentially. Findings suggest that MBL models are superior to BL models, while both the policy and the latent variables significantly affect the relocation decision and considerably improve the models' goodness of fit. A sample enumeration method is finally used to aggregate the results over the Greek population.
