A survey on measuring efficiency through the determination of the least distance in data envelopment analysis

Purpose – The purpose of this paper is to provide an outline of the major contributions in the literature on the determination of the least distance in data envelopment analysis (DEA). The focus herein is primarily on methodological developments. Specifically, attention is mainly paid to modeling aspects, computational features, the satisfaction of properties and duality. Finally, some promising avenues of future research on this topic are stated. Design/methodology/approach – DEA is a methodology based on mathematical programming for the assessment of relative efficiency of a set of decision-making units (DMUs) that use several inputs to produce several outputs. DEA is classified in the literature as a non-parametric method because it does not assume a particular functional form for the underlying production function and presents, in this sense, some outstanding properties: the efficiency of firms may be evaluated independently of the market prices of the inputs used and outputs produced; it may be easily used with multiple inputs and outputs; a single score of efficiency for each assessed organization is obtained; this technique ranks organizations based on relative efficiency; and finally, it yields benchmarking information. DEA models provide both benchmarking information and efficiency scores for each of the evaluated units when they are applied to a dataset of observations and variables (inputs and outputs). Without a doubt, this benchmarking information gives DEA a distinct advantage over other efficiency methodologies, such as stochastic frontier analysis (SFA). Technical inefficiency is typically measured in DEA as the distance between the observed unit and a “benchmarking” target on the estimated piece-wise linear efficient frontier. The choice of this target is critical for assessing the potential performance of each DMU in the sample, as well as for providing information on how to increase its performance.
However, traditional DEA models yield targets that are determined by the “furthest” efficient projection to the evaluated DMU. The projected point on the efficient frontier obtained as such may not be a representative projection for the judged unit, and consequently, some authors in the literature have suggested determining closest targets instead. The general argument behind this idea is that closer targets suggest directions of enhancement for the inputs and outputs of the inefficient units that may

© Juan Aparicio. Published in Journal of Centrum Cathedra: The Business and Economics Research Journal. Published by Emerald Group Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and noncommercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

The author would like to thank Vincent Charles for the invitation to write this survey. Additionally, this work was supported by the Spanish Ministry for Economy and Competitiveness [grant number MTM2013-43903-P]. This article is part of a special issue Guest Edited by Rajiv D. Banker and Vincent Charles. The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/1851-6599.htm

Data envelopment analysis


Introduction
Data envelopment analysis (DEA) is a non-parametric mathematical programming technique commonly used to measure the relative performance of a set of homogeneous processing units, which use several inputs to produce several outputs. These operating units are usually called decision-making units (DMUs) in recognition of their autonomy in setting their input and output levels. In contrast to the usual parametric techniques, such as stochastic frontier analysis (SFA), DEA does not need to assume a particular functional form for the production function, technical efficiency may be easily evaluated with multiple inputs and outputs, and it also produces relevant benchmarking information from a managerial point of view. In particular, DEA provides both efficient input and output targets (the coordinates of the projection point on the estimated efficient frontier) and peers (the efficient observed DMUs that serve as benchmarks for each of the assessed units).
Since its introduction, DEA has witnessed the definition of many different technical efficiency measures: radial (equiproportional) and non-radial, oriented and non-oriented and so on. Overall, all of them are based upon the determination of a "distance" from the evaluated unit to a target efficient point (projection point) on the piece-wise linear efficient frontier associated with DEA. The selection of a suitable projection point is a very important step in DEA for evaluating the potential performance of each assessed unit, as it can provide key benchmarking information on how inefficient DMUs could become efficient. As we will mention in Section 2, the efficiency measures defined in the first 20 years of life of DEA seek, in some sense, to determine the projection points that are located as far as possible from the evaluated DMU. In this way, the benchmarking information that is indirectly generated by these traditional DEA models is related to the identification of the "furthest" efficient targets in inputs and outputs. This also means that traditional DEA measures maximize the total technical effort required of the evaluated unit to reach the efficient frontier. Instead, it seems more natural to assume that inefficient firms/organizations (DMUs, in general) apply a principle of least action (PLA) with the aim of being technically efficient. Otherwise, inefficient units would need to make an extra effort, decreasing inputs and/or increasing outputs, to reach the frontier. The application of this "natural" PLA is linked to the determination of the closest targets on the efficient frontier of the corresponding DEA production possibility set.
Additionally, the determination of closest targets is connected to the calculation of the least distance from the evaluated unit to the efficient frontier of the reference technology. In fact, the former is usually computed by solving mathematical programming models that minimize some type of distance (e.g. Euclidean). In this particular respect, the main contribution in the literature is the paper by Briec (1998) on Hölder distance functions, where technical inefficiency with respect to the "weakly" efficient frontier is formally defined through mathematical distances.
All the interesting features of the determination of closest targets from a benchmarking point of view have generated, in recent times, an increasing interest among researchers in the calculation of the least distance to evaluate technical inefficiency (Aparicio et al., 2014a). So, in this paper, we present a general classification of published contributions, mainly from a methodological perspective; additionally, we indicate avenues for further research on this topic.
The approaches that we cite in this paper differ in the way that the idea of similarity is made operative. Similarity is, in this sense, implemented as the closeness between the values of the inputs and/or outputs of the assessed units and those of the obtained projections on the frontier of the reference production possibility set. Similarity may be measured through multiple distances and efficiency measures. In turn, the aim is to globally minimize DEA model slacks to determine the closest efficient targets. However, as we will show later in the text, minimizing a mathematical distance in DEA is not an easy task, as it is equivalent to minimizing the distance to the complement of a polyhedral set, which is not a convex set. This complexity will justify the existence of different alternatives for solving these types of models.
The paper unfolds as follows: in the following section, we state the main differences between the DEA approaches based on furthest targets and those based upon the determination of closest targets. In Section 3, we show the existing ways of computing the least distance and review the most important references on this issue. In Section 4, we discuss which properties the new approach satisfies, focusing our attention particularly on monotonicity. Section 5 contains the key existing results on how to measure and decompose economic inefficiency through least distance. Finally, Section 6 draws conclusions.

Furthest versus closest targets
In this section, we first review both the traditional DEA literature, focusing our attention on the efficiency measures defined in the first years of life of DEA, and the literature devoted to the application of the PLA. Additionally, we compare both approaches graphically and numerically, showing that it is possible to find substantial differences between the targets provided by applying the criterion used by the traditional DEA models and those obtained when the criterion of closeness is utilized. What we call "traditional" DEA models in this survey are, of course, the radial and the directional distance function models, together with the Russell input and output measures of technical efficiency and their graph extension, the Russell graph measure of technical efficiency (Färe et al., 1985); the additive model (Charnes et al., 1985); the range-adjusted measure (Cooper et al., 1999); and the enhanced Russell graph measure (Pastor et al., 1999) or slacks-based measure (SBM; Tone, 2001), to name but a few. All these measures maximize, in some sense, a weighted sum of input and output slacks or do so in a second stage, as happens with the radial models. This means that the projection points yielded by them can be considered as the furthest efficient targets, in contrast to the new philosophy that advocates the determination of the closest ones.
In standard microeconomic theory, a firm's economic behavior is usually characterized by cost minimization, revenue maximization or profit maximization. The choice of a specific approach depends, in part, on what assumptions one is willing to make. From a purely technical point of view, when market prices are not available or they make no sense, for example, for non-profit organizations, an alternative behavior is that associated with the PLA. The PLA, a well-known law in physics, states that nature always finds the most efficient course of action. The historical origin of this concept can be traced back at least to Pierre Louis Maupertuis and Leonhard Euler in the eighteenth century. In our context, the PLA is equivalent to minimizing the total technical effort of the assessed unit to become technically efficient. Total technical effort reflects the change in the inputs and outputs required by an inefficient DMU to reach the efficient frontier. In this way, the application of the PLA always yields the efficient targets associated with the least technical effort (see Aparicio et al., 2014c).
Regarding the review of the literature linked to the application of the PLA, Charnes et al. (1996) was the first approach where this philosophy could be utilized, in this case, for assessing sensitivity in efficiency classification in DEA. Additionally, Coelli (1998) suggested a modification of the well-known second stage for radial models with the aim of seeking targets as similar as possible to the original assessed unit in an oriented context. It is also worth mentioning that Walter Briec presents a rich line of research covering theoretical aspects of this new philosophy. In particular, Briec (1998), Briec and Lesourd (1999) and Briec and Lemaire (1999) determined the least distance to the weakly efficient subset of the production possibility set using Hölder norms. Frei and Harker (1999) and Cooper et al. (2000, pp. 60-61) also suggested resorting to Hölder norms: specifically, the Euclidean distance in the case of Frei and Harker (1999) and a weighted $\ell_\infty$-distance in the case of Cooper et al. (2000). Moreover, Cherchye and Van Puyenbroeck (2001) defined the deviation between mixes in an oriented-space framework as the angle between the input vector of the evaluated unit and its projection point on the frontier, which corresponds to maximizing the corresponding cosine to identify the closest targets. Regarding other types of DEA technical efficiency measures, Gonzalez and Alvarez (2001) maximized the input-oriented Russell efficiency measure, whereas the traditional input-oriented Russell efficiency measure is minimized. This means that the Gonzalez and Alvarez approach allows the closest efficient targets to be determined in an oriented setting. Takeda and Nishino (2001) utilized techniques to evaluate sensitivity in efficiency classification based on Hölder norms, following the aforementioned paper by Charnes et al. (1996). Later, Silva Portela et al. (2003) established that closeness or similarity may be measured with different distances and efficiency models in DEA. Consequently, the general aim must be to globally minimize the input and output slacks regardless of the DEA measure finally utilized. Lozano and Villa (2005) suggested a method that, step by step, determines a sequence of targets to be achieved in successive stages, which finally converge to the efficient frontier. More recently, Aparicio et al. (2007) identified closest targets for a dataset of international airlines by applying a new version of the SBM, showing that large differences can be found in practice between the targets provided by the criterion used by the traditional DEA models and those obtained when the PLA is utilized. Liu and Peng (2008) applied an approach based on the least distance to rank efficient units by means of a common set of weights. Furthermore, other authors have focused their analysis on the Euclidean distance, such as Baek and Lee (2009), Suzuki et al. (2010), Amirteimoori and Kordrostami (2010) and Aparicio and Pastor (2014a). In particular, Suzuki et al. (2010) introduced a new distance friction minimization method in the context of DEA to generate an appropriate (non-radial) efficiency-improving projection model, for both input reduction and output increase. Additionally, Fukuyama et al. (2014b) focused on ratio-form efficiency measures and Jahanshahloo et al. (2013) introduced the directional closest-target-based measures of efficiency, mixing Hölder norms and directional distance functions in DEA. Finally, Fukuyama et al. (2016) applied the PLA to Free Disposal Hull technologies.
One of the key derivatives of an efficiency assessment through the application of DEA techniques is the identification of targets. However, as we mentioned beforehand, a flaw of traditional DEA models is that they aim at maximizing input and output slacks, finding targets and peers that are not the closest to the DMUs being assessed. Next, we will try to illustrate the main differences between the two existing philosophies in DEA: one devoted to the determination of furthest targets and the other based on closest targets.
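Let us now introduce some notation. Consider that we have observed $n$ DMUs that use $m$ inputs to produce $s$ outputs. These are denoted by $(X_j, Y_j)$, $j = 1, \ldots, n$. It is assumed that $X_j = (x_{1j}, \ldots, x_{mj}) > 0_m$, $j = 1, \ldots, n$, and $Y_j = (y_{1j}, \ldots, y_{sj}) > 0_s$, $j = 1, \ldots, n$. The relative efficiency of each DMU$_0$ in the sample is assessed with reference to the so-called production possibility set $T = \{(X,Y) \ge 0_{m+s} : Y \text{ can be produced from } X\}$, which can be empirically constructed in DEA from the $n$ observations by assuming several postulates (see Banker et al., 1984). If, in particular, variable returns to scale (VRS) is assumed, then it can be characterized as follows:

$$T = \Big\{ (X,Y) \ge 0_{m+s} : \sum_{j=1}^{n} \lambda_j x_{ij} \le x_i,\ \forall i;\ \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_r,\ \forall r;\ \sum_{j=1}^{n} \lambda_j = 1;\ \lambda_j \ge 0,\ \forall j \Big\}. \tag{1}$$

The weakly efficient frontier of $T$ is defined as $\partial^W(T) = \{(X,Y) \in T : \hat{X} < X,\ \hat{Y} > Y \Rightarrow (\hat{X},\hat{Y}) \notin T\}$. Following Koopmans (1951), to measure technical efficiency in the Pareto sense, it is necessary to isolate a certain subset of $\partial^W(T)$. We are referring to the strongly efficient frontier, defined as follows: $\partial^S(T) = \{(X,Y) \in T : \hat{X} \le X,\ \hat{Y} \ge Y,\ (\hat{X},\hat{Y}) \ne (X,Y) \Rightarrow (\hat{X},\hat{Y}) \notin T\}$. In other words, $\partial^S(T)$ is the set of all the Pareto-Koopmans efficient points of $T$.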
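The Pareto-Koopmans notion of dominance behind the strongly efficient frontier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four DMUs and the helper names `dominates` and `non_dominated` are hypothetical and do not come from the paper. The check compares observed units only, so surviving it is a necessary condition for Pareto-efficiency in $T$, not a sufficient one, because a unit may still be dominated by a convex combination of other units.

```python
from typing import Dict, List, Sequence, Tuple

DMU = Tuple[Sequence[float], Sequence[float]]  # (input vector X_j, output vector Y_j)


def dominates(a: DMU, b: DMU) -> bool:
    """Return True if a Pareto-dominates b: no more of any input, no less of
    any output, and a strict improvement in at least one coordinate."""
    (xa, ya), (xb, yb) = a, b
    no_worse = all(p <= q for p, q in zip(xa, xb)) and all(p >= q for p, q in zip(ya, yb))
    better = any(p < q for p, q in zip(xa, xb)) or any(p > q for p, q in zip(ya, yb))
    return no_worse and better


def non_dominated(dmus: Dict[str, DMU]) -> List[str]:
    """Names of observed DMUs that no other observed DMU dominates."""
    return [
        name for name, unit in dmus.items()
        if not any(dominates(other, unit) for key, other in dmus.items() if key != name)
    ]


# Hypothetical one-input, one-output sample (not taken from the paper).
dmus = {
    "A": ((2.0,), (2.0,)),
    "B": ((4.0,), (6.0,)),
    "C": ((8.0,), (8.0,)),
    "D": ((6.0,), (3.0,)),
}
survivors = non_dominated(dmus)
print(survivors)
```

In this toy sample, unit D uses more input and produces less output than unit B, so it cannot belong to the strongly efficient frontier.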
Next, we show the same inefficiency measure under two different perspectives: the traditional approach, based on furthest targets (FT), and the least distance approach, based upon closest targets (CT). In doing so, we focus our analysis on the well-known Euclidean distance. Before showing the definition of that measure of technical inefficiency for the two different philosophies, it is worth mentioning that it can be done under two additional perspectives. On the one hand, underperformance may be computed by applying the Debreu-Farrell notion of technical inefficiency. In this case, the weakly efficient frontier would play a key role. On the other hand, technical inefficiency could be calculated by resorting to the notion of Pareto-efficiency (Koopmans, 1951). In this case, a subset of the weakly efficient frontier, i.e. the strongly efficient frontier, should be utilized. Seeking simplicity, let us now focus our attention on the strongly efficient frontier $\partial^S(T)$. In this way, the definition of the measure based on the determination of the furthest targets would be as follows:

$$D_2^{FT}(X_0,Y_0) = \max \Big\{ \sqrt{\textstyle\sum_{i=1}^{m} (s_i^-)^2 + \sum_{r=1}^{s} (s_r^+)^2} : (X_0 - s^-, Y_0 + s^+) \in \partial^S(T),\ s^- \ge 0_m,\ s^+ \ge 0_s \Big\}, \tag{2}$$

while the definition related to the identification of the closest targets would be as follows:

$$D_2^{CT}(X_0,Y_0) = \min \Big\{ \sqrt{\textstyle\sum_{i=1}^{m} (s_i^-)^2 + \sum_{r=1}^{s} (s_r^+)^2} : (X_0 - s^-, Y_0 + s^+) \in \partial^S(T),\ s^- \ge 0_m,\ s^+ \ge 0_s \Big\}. \tag{3}$$

Model (2) maximizes the objective function, and therefore, it maximizes the individual slacks. In this way, the yielded projection (target) point, $(x_{10} - s_1^{-*}, \ldots, x_{m0} - s_m^{-*}, y_{10} + s_1^{+*}, \ldots, y_{s0} + s_s^{+*})$, will in general be far from the DMU $(X_0,Y_0)$ under evaluation, where $*$ hereinafter denotes optimality.
The objective function used in Model (3) is the same as that utilized by Model (2) with a striking difference: Model (3) minimizes the objective function, and consequently, it minimizes the individual slacks, instead of maximizing them as Model (2) does. Consequently, whereas Model (2) provides the furthest targets for the assessed unit, Model (3) yields the most easily attainable targets.
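The contrast between furthest and closest Euclidean targets can be sketched numerically for the simplest case of one input and one output, where the strongly efficient frontier is a polyline through the extreme efficient units. The sketch below is not the paper's procedure: it assumes the frontier vertices are already known, uses hypothetical data, and relies on the function names `dominating_pieces` and `euclidean_targets`, which are introduced here for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (input x, output y)


def dominating_pieces(vertices: List[Point], x0: float, y0: float):
    """Clip each frontier segment to the region x <= x0, y >= y0.

    `vertices` must list the extreme efficient units with strictly
    increasing input and output, so x and y both grow along each segment.
    """
    for (ax, ay), (bx, by) in zip(vertices, vertices[1:]):
        dx, dy = bx - ax, by - ay
        t_lo = max(0.0, (y0 - ay) / dy)  # y >= y0 from this parameter on
        t_hi = min(1.0, (x0 - ax) / dx)  # x <= x0 up to this parameter
        if t_lo <= t_hi:
            yield ax, ay, dx, dy, t_lo, t_hi


def euclidean_targets(vertices: List[Point], x0: float, y0: float):
    """Return (furthest, closest) dominating points on the frontier polyline."""
    far: List[Point] = []
    near: List[Point] = []
    for ax, ay, dx, dy, t_lo, t_hi in dominating_pieces(vertices, x0, y0):
        # A distance is convex, so its maximum on a segment sits at an endpoint.
        far += [(ax + t * dx, ay + t * dy) for t in (t_lo, t_hi)]
        # Its minimum is the orthogonal projection, clamped to the clipped piece.
        t = ((x0 - ax) * dx + (y0 - ay) * dy) / (dx * dx + dy * dy)
        t = min(t_hi, max(t_lo, t))
        near.append((ax + t * dx, ay + t * dy))

    def dist(p: Point) -> float:
        return math.hypot(p[0] - x0, p[1] - y0)

    return max(far, key=dist), min(near, key=dist)


# Hypothetical frontier A(2,2), B(4,6), C(8,8) and inefficient unit D(6,3).
frontier = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0)]
furthest, closest = euclidean_targets(frontier, 6.0, 3.0)
print("furthest target:", furthest)
print("closest target:", closest)
```

With these made-up numbers the furthest target asks unit D to raise its output by four units, while the closest target lies on the segment between A and B and requires a smaller overall change, which mirrors the qualitative difference between Models (2) and (3).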
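Graphically, the differences between the two approaches can be illustrated by Figure 1. In this example, DMU D is evaluated through the two philosophies: $D_2^{FT}(X_0,Y_0)$ and $D_2^{CT}(X_0,Y_0)$. In the first case, Model (2) generates unit B as the target, while for the second case, Model (3) yields the (unobserved) point D* as the target. Clearly, D* is closer than unit B to DMU D. Anyway, it is worth mentioning that one could also argue that, for certain situations (e.g. skill requirement for employees or market size of customers), sometimes it may be easier to focus only on input reduction or output increase (see Aparicio et al., 2016).

These significant differences can also be observed when the weighted additive model is used under the two philosophies. To evaluate the level of technical inefficiency of DMU$_0$ with data $(X_0,Y_0)$ applying the traditional approach, one can solve the following weighted additive model (Lovell and Pastor, 1995):

$$
\begin{aligned}
\max\ & \sum_{i=1}^{m} w_i^- s_i^- + \sum_{r=1}^{s} w_r^+ s_r^+ \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{i0}, \quad i = 1, \ldots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{r0}, \quad r = 1, \ldots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& s_i^- \ge 0,\ s_r^+ \ge 0,\ \lambda_j \ge 0, \quad \forall i, r, j,
\end{aligned} \tag{4}
$$

where $w^- = (w_1^-, \ldots, w_m^-) > 0_m$ and $w^+ = (w_1^+, \ldots, w_s^+) > 0_s$ are weights representing the relative importance of unit inputs and unit outputs. Different paths can be followed in choosing such weights. One possibility selects them based on data and information. In this way, it is possible to achieve a dimensionless optimal value in Model (4), in the terminology followed by Lovell and Pastor (1995). Another possibility is to consider fixed values reflecting considerations not embodied in the data. In this sense, weights can represent value judgments provided by managers or policymakers. It is worth mentioning that there is a distinguished list of different weighted additive models. Among them, we want to highlight the following:

the measure of inefficiency proportions (see Cooper et al., 1999), considering $(w^-, w^+) = (1/X_0, 1/Y_0)$, where $1/X_0 = (1/x_{10}, \ldots, 1/x_{m0})$ and $1/Y_0 = (1/y_{10}, \ldots, 1/y_{s0})$;

the range adjusted measure of inefficiency (see Cooper et al., 1999), considering $(w^-, w^+) = (1/[(m+s)R^-], 1/[(m+s)R^+])$, where $R^- = (R_1^-, \ldots, R_m^-)$ with $R_i^- = \max_{1 \le j \le n}\{x_{ij}\} - \min_{1 \le j \le n}\{x_{ij}\}$, $i = 1, \ldots, m$, and $R^+ = (R_1^+, \ldots, R_s^+)$ with $R_r^+ = \max_{1 \le j \le n}\{y_{rj}\} - \min_{1 \le j \le n}\{y_{rj}\}$, $r = 1, \ldots, s$;

the bounded adjusted measure of inefficiency (see Cooper et al., 2011), considering $(w^-, w^+) = (1/[(m+s)(X_0 - \underline{X})], 1/[(m+s)(\bar{Y} - Y_0)])$, where $\underline{X} = (\underline{x}_1, \ldots, \underline{x}_m)$ with $\underline{x}_i = \min_{1 \le j \le n}\{x_{ij}\}$, $i = 1, \ldots, m$, and $\bar{Y} = (\bar{y}_1, \ldots, \bar{y}_s)$ with $\bar{y}_r = \max_{1 \le j \le n}\{y_{rj}\}$, $r = 1, \ldots, s$;

the normalized weighted additive model (see Lovell and Pastor, 1995), considering $(w^-, w^+) = (1/\sigma^-, 1/\sigma^+)$, where $\sigma^- = (\sigma_1^-, \ldots, \sigma_m^-)$ is the vector of sample standard deviations of inputs and $\sigma^+ = (\sigma_1^+, \ldots, \sigma_s^+)$ is the vector of sample standard deviations of outputs.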
Additionally, Model (4) "maximizes" the weighted $\ell_1$ distance from the assessed DMU$_0$ to the frontier of the production possibility set, thereby increasing outputs and reducing inputs at the same time.
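The weighting schemes named above can be computed directly from the data. The following Python sketch assumes the usual textbook forms of these weights (reciprocals of the observed values for the measure of inefficiency proportions, reciprocals of $(m+s)$ times the sample ranges for the range adjusted measure, reciprocals of $(m+s)$ times the distance to the sample bounds for the bounded adjusted measure, and reciprocals of the sample standard deviations for the normalized model). The function name `additive_weights`, the scheme labels and the data are hypothetical.

```python
from statistics import stdev
from typing import List, Tuple


def additive_weights(X: List[List[float]], Y: List[List[float]], k: int,
                     scheme: str) -> Tuple[List[float], List[float]]:
    """Weights (w_minus, w_plus) of a weighted additive model for DMU k.

    X[j] and Y[j] hold the inputs and outputs of DMU j. The BAM weights are
    undefined when DMU k attains a sample minimum input or maximum output.
    """
    m, s = len(X[0]), len(Y[0])
    cols_x, cols_y = list(zip(*X)), list(zip(*Y))
    if scheme == "MIP":         # measure of inefficiency proportions
        return [1 / x for x in X[k]], [1 / y for y in Y[k]]
    if scheme == "RAM":         # range adjusted measure
        return ([1 / ((m + s) * (max(c) - min(c))) for c in cols_x],
                [1 / ((m + s) * (max(c) - min(c))) for c in cols_y])
    if scheme == "BAM":         # bounded adjusted measure
        return ([1 / ((m + s) * (x - min(c))) for x, c in zip(X[k], cols_x)],
                [1 / ((m + s) * (max(c) - y)) for y, c in zip(Y[k], cols_y)])
    if scheme == "NORMALIZED":  # normalized weighted additive model
        return [1 / stdev(c) for c in cols_x], [1 / stdev(c) for c in cols_y]
    raise ValueError(f"unknown scheme: {scheme}")


# Hypothetical sample: four DMUs, one input, one output; DMU 3 is evaluated.
X = [[2.0], [4.0], [8.0], [6.0]]
Y = [[2.0], [6.0], [8.0], [3.0]]
for name in ("MIP", "RAM", "BAM", "NORMALIZED"):
    print(name, additive_weights(X, Y, 3, name))
```

Only the weights change between these variants; the constraint set of the weighted additive model stays the same.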
It is not difficult to define a version of the weighted additive model based on the determination of the least distance. In particular, let us write that model in a compact way:

$$\min \Big\{ \sum_{i=1}^{m} w_i^- s_i^- + \sum_{r=1}^{s} w_r^+ s_r^+ : (X_0 - s^-, Y_0 + s^+) \in \partial^S(T),\ s^- \ge 0_m,\ s^+ \ge 0_s \Big\}. \tag{5}$$

Model (5) determines the weighted $\ell_1$ distance from $(X_0,Y_0)$ to the set of points belonging to the strongly efficient frontier that, at the same time, dominate DMU$_0$ in the sense of Pareto. Later on, we will show that Model (5) is not equivalent to Model (4) with "Max" substituted by "Min" in the objective function.
Regarding other measures in DEA, if we focus on the SBM of Tone (2001), which is a measure of technical efficiency instead of technical inefficiency, then "Min" and "Max" must be exchanged in the models. Let us show that.
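In the case of the traditional SBM, we have the following model (in a compact format):

$$\Gamma^{FT}_{SBM}(X_0,Y_0) = \min \left\{ \frac{1 - \frac{1}{m}\sum_{i=1}^{m} s_i^-/x_{i0}}{1 + \frac{1}{s}\sum_{r=1}^{s} s_r^+/y_{r0}} : (X_0 - s^-, Y_0 + s^+) \in \partial^S(T),\ s^- \ge 0_m,\ s^+ \ge 0_s \right\}. \tag{6}$$

Model (6) minimizes the objective function, and then, by the sign of the coefficients of each slack in the objective function, it maximizes the individual slacks. In this way, the yielded projection point will be far from the DMU under evaluation. In contrast, following the PLA, we have an alternative for determining closest targets through the evaluation of the following model:

$$\Gamma^{CT}_{SBM}(X_0,Y_0) = \max \left\{ \frac{1 - \frac{1}{m}\sum_{i=1}^{m} s_i^-/x_{i0}}{1 + \frac{1}{s}\sum_{r=1}^{s} s_r^+/y_{r0}} : (X_0 - s^-, Y_0 + s^+) \in \partial^S(T),\ s^- \ge 0_m,\ s^+ \ge 0_s \right\}. \tag{7}$$

Note that, as we pointed out, the objective function used in Model (7) is now maximized instead of minimized to determine the closest targets on the strongly efficient frontier. In this way, Model (7) generates the most easily attainable efficient targets.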
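The role of "Min" and "Max" in the SBM can be illustrated with a small Python sketch that evaluates the SBM ratio of Tone (2001) at a few candidate targets. The data are hypothetical: an inefficient unit D(6, 3) and an assumed frontier through A(2, 2), B(4, 6) and C(8, 8). The candidates are the endpoints of the frontier pieces that dominate D. Since the ratio is a quotient of two functions that are affine in the slacks, with a positive denominator, it is monotone along each segment, so in this toy setting its extreme values over the dominating part of the frontier are attained at those endpoints. The function name `sbm_score` is introduced here for illustration.

```python
from typing import Dict, List, Tuple


def sbm_score(x0: List[float], y0: List[float],
              s_minus: List[float], s_plus: List[float]) -> float:
    """SBM ratio for given input slacks s_minus and output slacks s_plus."""
    m, s = len(x0), len(y0)
    numerator = 1 - sum(si / xi for si, xi in zip(s_minus, x0)) / m
    denominator = 1 + sum(sr / yr for sr, yr in zip(s_plus, y0)) / s
    return numerator / denominator


# Hypothetical unit D = (6, 3) and slacks leading to three candidate targets.
x0, y0 = [6.0], [3.0]
candidates: Dict[str, Tuple[List[float], List[float]]] = {
    "(2.5, 3)": ([3.5], [0.0]),  # input reduction only
    "(4, 6)": ([2.0], [3.0]),    # observed unit B
    "(6, 7)": ([0.0], [4.0]),    # output increase only
}
scores = {name: sbm_score(x0, y0, s_minus, s_plus)
          for name, (s_minus, s_plus) in candidates.items()}

furthest_target = min(scores, key=scores.get)  # traditional criterion: minimize
closest_target = max(scores, key=scores.get)   # least-action criterion: maximize
print(scores)
print("furthest:", furthest_target, "closest:", closest_target)
```

The same ratio is used in both cases; only the direction of optimization changes, and with it the target that is selected.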
A comparison between the results yielded by Models (6) and (7) was carried out in Aparicio et al. (2007) using a real database. In particular, Aparicio et al. (2007) found substantial dissimilarities between the targets provided by applying the criterion used by the traditional DEA models and those obtained when the PLA is considered for finding projection points on the strongly efficient frontier. They applied Models (6) and (7) to a dataset on airlines. Specifically, a set of 28 international airlines from North America, Europe and Asia-Australia was assessed. For each airline, two outputs (passenger-kilometres flown, PASS, and freight tonne-kilometres flown, CARGO) and four inputs (number of employees, LAB; fuel, FUEL; other inputs excluding labor and fuel expenses, MATL; and capital, CAP) were considered. As for the results, Models (6) and (7) detect the same airlines as Pareto-efficient (9 of 28). This is no accident. Both models will always identify the same set of efficient DMUs. Moreover, Aparicio et al. (2007) observed large differences between the optimal values of Models (6) and (7). An example of that is AUSTRIA airlines ($\Gamma^{FT}_{SBM}(X_0,Y_0) = 0.290$ vs $\Gamma^{CT}_{SBM}(X_0,Y_0) = 0.769$). The reason is that the traditional SBM maximizes the slacks, which in the application led to extremely large targets for the output CARGO. Additionally, the improvement percentage suggested for this output under the traditional approach was 961 per cent for USAIR and 742 per cent for EASTERN. Consequently, in this real example, the traditional SBM suggests such high improvement percentages in CARGO that they probably cannot be achieved by the assessed airlines. However, the improvement percentages suggested by the PLA are considerably less exacting: USAIR (38 per cent) and EASTERN (9 per cent).
In general terms, Aparicio et al. (2007) detected substantial differences between the targets determined by resorting to a criterion of either minimization or maximization, which shows that some of the airlines may reach the strongly efficient frontier with less technical effort than that suggested by the classical approach. Additionally, in the application, the traditional SBM suggested the need for considerable enhancements in some variables for units that are actually near the strongly efficient frontier. That is the case of, for example, CANADIAN. For this airline, the solution associated with Model (7) showed that it would become Pareto-efficient practically just by reducing FUEL by 27 per cent, whereas the targets provided by the traditional SBM, Model (6), require this DMU to decrease LAB by 35 per cent and FUEL by 21 per cent and to increase CARGO by 87 per cent. So, the targets associated with the optimal solution of the traditional SBM are more demanding than those corresponding to the application of the PLA.
In view of the preceding discussion, one conclusion is that the determination of the least distance or, what is equivalent, the application of the PLA, can be considered a suitable benchmarking tool in DEA.
In what follows, we will survey the main results regarding modeling, computational aspects and the satisfaction of some properties of interest from a mathematical and economic point of view.

Modeling and computational aspects
In contrast to models that determine the furthest targets, there is a stream in the DEA literature that defends just the opposite, i.e. that the projected points on the efficient frontier obtained as such are not a suitable representative projection for the evaluated unit. However, the implementation of the new philosophy based upon the PLA is not as easy as replacing "Max" by "Min", for example, in Model (4). The determination of the least distance and closest targets is a hard task from a computational point of view. This difficulty results from the complexity of determining the least distance to the frontier of a DEA technology from an interior point, as this problem is equivalent to minimizing a convex function on the complement of a convex set.
To illustrate that we cannot simply replace "Max" by "Min" in Model (4) and directly obtain something mathematically equivalent to Model (5), let us consider Figure 2 and the evaluation of DMU D. Solving Model (4) projects DMU D onto the efficient frontier. However, if we minimize instead of maximize in Model (4), then the solution in which DMU D is its own reference unit, with all slacks equal to zero, is feasible, and it is optimal because the objective function is lower-bounded by zero. But then, the projection point, the target, would be unit D, which is an interior point of the production possibility set.

Another usual misspecification of the model is that in which the convex combinations are restricted to the set of extreme efficient DMUs (Units A, B and C in Figure 2) instead of the whole sample. However, the same simple example is valid to prove that again it is not enough to substitute "Max" by "Min" in Model (4). In this respect, note that DMU D is a convex combination of Units A and C. Therefore, $\lambda_A = 2/3$, $\lambda_B = 0$, $\lambda_C = 1/3$, $s_D^- = s_D^+ = 0$ is an optimal solution of Model (4) with "Min" when DMU D is assessed. So, the projection point is once more DMU D, an interior point of the production possibility set.
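The argument about replacing "Max" by "Min" can be checked by hand with exact arithmetic. The Python sketch below does not use the data of Figure 2, which are not reproduced in the text: the coordinates of units A, B, C and D are hypothetical and merely chosen so that D equals two thirds of A plus one third of C while being dominated by B. Unit weights are assumed in the objective function, and the helper name `slacks_for` is introduced for illustration.

```python
from fractions import Fraction as F
from typing import Dict, Tuple

# Hypothetical extreme efficient units (input, output); not the data of Figure 2.
units: Dict[str, Tuple[F, F]] = {
    "A": (F(2), F(2)),
    "B": (F(4), F(6)),
    "C": (F(8), F(8)),
}
x_d, y_d = F(4), F(4)  # evaluated unit D = 2/3 * A + 1/3 * C

# Candidate solution of the "Min" version of the weighted additive model.
lam = {"A": F(2, 3), "B": F(0), "C": F(1, 3)}


def slacks_for(lam: Dict[str, F], x0: F, y0: F) -> Tuple[F, F]:
    """Input and output slacks implied by the intensity weights lam."""
    x_ref = sum(weight * units[name][0] for name, weight in lam.items())
    y_ref = sum(weight * units[name][1] for name, weight in lam.items())
    return x0 - x_ref, y_ref - y0


s_minus, s_plus = slacks_for(lam, x_d, y_d)
feasible = (sum(lam.values()) == 1 and min(lam.values()) >= 0
            and s_minus >= 0 and s_plus >= 0)
objective = s_minus + s_plus  # unit weights; slacks are nonnegative, so this is >= 0

# D is nevertheless not efficient: B uses the same input and produces more.
d_is_dominated = units["B"][0] <= x_d and units["B"][1] > y_d

print(feasible, objective, d_is_dominated)
```

The candidate solution is feasible and attains the lower bound of zero, so it is optimal for the minimization version, even though the resulting target is the dominated unit D itself.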
Nowadays, there are principally two paths for obtaining the least distance in the DEA literature. The first one is based on identifying all the faces of the efficient frontier of the polyhedral DEA technology in a first stage, determining the least distance as the minimum of the distances to each of the faces in a multi-stage process. In this way, this first path is related to a combinatorial non-deterministic polynomial-time hard (NP-hard) problem and has been followed by Cherchye and Van Puyenbroeck (2001) and Silva Portela et al. (2003). Regarding its implementation in practice, there are several possibilities. Each of them is associated with a way of identifying all the efficient faces of the DEA technology. Chronologically speaking, Paradi and Pille (1997) proposed an algorithm for identifying all the faces in DEA to calculate two mathematical norms: $\ell_1$ and $\ell_2$. Their algorithm is based upon choosing all the combinations of extreme efficient DMUs and calculating the average point for each of them. Then, the (original) additive model (Charnes et al., 1985) is applied to check whether this "virtual" point is on the efficient frontier or not. So, the corresponding distances are determined by projecting each evaluated DMU onto each previously identified face, and selecting the one with the minimum value of the proposed distance. Silva Portela et al. (2003) proposed a different procedure to identify all the face-generating supporting hyperplanes of the considered technology: using the Qhull software (see Barber et al., 1996). Qhull is a general tool that has been developed to yield the convex hull of a dataset. In particular, it can be used to identify the supporting hyperplane equations for the efficient faces in DEA (Olesen and Petersen, 2003). However, we believe that Qhull is not user-friendly software in the case of DEA because it is not designed exclusively for DEA practitioners. As far as we are aware, there is no software that executes the described multistage procedure in just one step. In the same vein, a group of researchers have tried to determine the efficient faces of the production possibility set in DEA through algorithms with different characteristics. This is the case of Jahanshahloo et al. (2005, 2007, 2010).
The second path corresponds to the approach proposed by Aparicio et al. (2007), where the strongly efficient frontier is characterized by linear constraints and binary variables, which consequently allows the closest targets to be determined without explicitly calculating all the efficient faces by resorting to mixed integer linear programming (MILP). Next, we show the main result of Aparicio et al. (2007). Nevertheless, we first need to introduce the dual linear program of Model (4), as the primal and dual programs associated with the weighted additive model will allow us to explain the Aparicio et al. result in a better way.
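The linear dual of Model (4) can be written as follows:

$$
\begin{aligned}
\min\ & \sum_{i=1}^{m} v_i x_{i0} - \sum_{r=1}^{s} u_r y_{r0} + \alpha \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r y_{rj} + \alpha \ge 0, \quad j = 1, \ldots, n, \\
& v_i \ge w_i^-, \quad i = 1, \ldots, m, \\
& u_r \ge w_r^+, \quad r = 1, \ldots, s.
\end{aligned} \tag{8}
$$

The following theorem characterizes the strongly efficient points that, additionally, dominate DMU$_0$ $(X_0,Y_0)$.

Theorem 1 (Aparicio et al., 2007). Let $S(X_0,Y_0;T) := \{(X,Y) \in \partial^S(T) : x_i \le x_{i0},\ \forall i,\ y_r \ge y_{r0},\ \forall r\}$ be the set of strongly efficient points in $T$ dominating $(X_0,Y_0)$ in the sense of Pareto. Then, $(X,Y) \in S(X_0,Y_0;T)$ if and only if $\exists\, \lambda, s^-, s^+, v, u, \alpha, d, b$ such that $x_i = \sum_{j=1}^{n} \lambda_j x_{ij}$, $\forall i$, $y_r = \sum_{j=1}^{n} \lambda_j y_{rj}$, $\forall r$ and

$$
\begin{aligned}
& \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{i0}, \quad \forall i, & (9.1) \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{r0}, \quad \forall r, & (9.2) \\
& \sum_{j=1}^{n} \lambda_j = 1, & (9.3) \\
& \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r y_{rj} + \alpha - d_j = 0, \quad \forall j, & (9.4) \\
& v_i \ge w_i^-, \quad \forall i, & (9.5) \\
& u_r \ge w_r^+, \quad \forall r, & (9.6) \\
& s_i^- \ge 0, \quad \forall i, & (9.7) \\
& s_r^+ \ge 0, \quad \forall r, & (9.8) \\
& d_j \le M b_j, \quad \forall j, & (9.9) \\
& \lambda_j \le 1 - b_j, \quad \forall j, & (9.10) \\
& b_j \in \{0, 1\}, \quad \forall j, & (9.11) \\
& d_j \ge 0, \quad \forall j, & (9.12) \\
& \lambda_j \ge 0, \quad \forall j, & (9.13)
\end{aligned}
$$

where $M$ is a sufficiently big positive number.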
The points satisfying the constraints in Theorem 1 are those of the technology dominating the evaluated DMU that may be expressed as a convex combination of units lying on the same efficient face of the production possibility set. Also, since $v_i \ge w_i^- > 0$, $i = 1, \ldots, m$, and $u_r \ge w_r^+ > 0$, $r = 1, \ldots, s$, the corresponding convex combinations of these units belong to a Pareto-efficient face of the technology. Additionally, the importance of the theorem lies in the fact that the set of feasible points at which the minimum distance to the Pareto-efficient frontier is achieved can be represented through a set of "linear" constraints. This allows overcoming the computational difficulties associated with the non-convexity when dealing with the problem of minimizing a mathematical distance or maximizing an efficiency measure to the efficient frontier.
For applying Aparicio et al.'s approach to find the desired least distance, the practitioner only needs to specify how to measure the similarity between its inputs and outputs and the targets (a mathematical distance or an efficiency measure), as these will be determined as the optimal solution of a mathematical programming problem that minimizes (maximizes) the selected distance (efficiency measure) subject to the linear constraints that were shown in Theorem 1.
In particular, invoking Theorem 1, it is not difficult to formulate a MILP program, Model (10), that is equivalent in optimal solutions and value to Model (5) and seeks to determine the least distance and closest targets. Model (10) resorts to a big M. This type of device is standard in Mathematical Programming. Its value in practice is usually determined by the coefficients of some specific constraint of the model or by some additional information, which makes it possible to fix a particular value when the problem is implemented in an optimizer. However, this is not the case for Model (10). In our context, the value of M can be determined only once we have identified all the efficient faces of the DEA technology, something that we would like to avoid. Nevertheless, this is not a real problem from a computational point of view because constraints (9.9)-(9.13) are equivalent to $\lambda_j d_j = 0$, $\lambda_j \ge 0$, $d_j \ge 0$, $j = 1, \ldots, n$, which can be implemented by means of a special ordered set (SOS) (Beale and Tomlin, 1970). An SOS is a way to specify that a pair of variables cannot take strictly positive values at the same time, and it is handled through special branching strategies. Traditionally, SOS were used with discrete and integer variables, but modern optimizers, for example CPLEX, also support SOS with continuous variables.
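To make the role of the big M concrete, the following block sketches how a complementarity condition of this kind is typically linearized. The binary variables $b_j$ and the grouping of the constraints are assumptions made for illustration; they are not copied from Model (10).

```latex
% Complementarity form (no big M is needed when each pair is declared as an SOS):
\lambda_j \, d_j = 0, \qquad \lambda_j \ge 0, \qquad d_j \ge 0, \qquad j = 1, \ldots, n.

% Assumed big-M linearization with auxiliary binary variables b_j:
d_j \le M \, b_j, \qquad \lambda_j \le M \, (1 - b_j), \qquad b_j \in \{0, 1\}, \qquad j = 1, \ldots, n.
```

If $b_j = 0$ then $d_j = 0$, and if $b_j = 1$ then $\lambda_j = 0$, so the product vanishes provided that M is an upper bound on both variables. Declaring each pair $\{\lambda_j, d_j\}$ as a special ordered set in which at most one member may be nonzero enforces the same condition without having to choose M.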
Although Aparicio et al.'s result works well for the graph context, where inputs and outputs are changed at the same time, Aparicio et al. (2016) have recently shown that all existing approaches to determine the closest Pareto-efficient targets in DEA, even Aparicio et al. (2007), present some weaknesses when they are applied or adapted to the "oriented" framework, i.e. when the interest of the firm/organization is to expand its output bundle without requiring any increase in its inputs, or to contract its input bundle without requiring a reduction in its outputs. In particular, they have proved that Theorem 1 can still be applied, but only under restrictive conditions: output (input) orientation, multiple outputs (inputs), one input (output) and constant returns to scale (CRS). To deal with the oriented problem in a suitable way, Aparicio et al. (2016) introduced a new methodology based upon bilevel linear programming to determine the desired targets. Its implementation relies on the application of the Karush-Kuhn-Tucker optimality conditions to the lower-level problem and on special ordered sets.
Regarding existing alternatives in the literature for determining the least distance, we have mentioned the main two. Nevertheless, other authors have introduced specific procedures to apply the PLA on the strongly efficient frontier. Unfortunately, these approaches do not perform correctly. On the one hand, Frei and Harker (1999) proposed an algorithm to obtain all the efficient facets in DEA based upon the optimal shadow prices identified for each DMU. However, this methodology only partially describes the real frontier because its search procedure is not exhaustive. On the other hand, in the context of measuring technical efficiency through an input-oriented model, Gonzalez and Alvarez (2001) suggested minimizing the sum of all the input-specific contractions needed to reach the strongly efficient frontier:

$$GA_I(X_0, Y_0) = \min\left\{ \sum_{i=1}^{m} (1 - \theta_i) : (\theta_1 x_{10}, \ldots, \theta_m x_{m0}) \in \partial^s L(Y_0),\ \theta_i \le 1,\ \forall i \right\},$$

where $\partial^s L(Y_0)$ is the strongly efficient frontier of the input requirement set. Note that this approach is related to the determination of closest targets under the Pareto-Koopmans criterion of technical efficiency because the model prefers values of $\theta_i$ close to one. To implement their model, Gonzalez and Alvarez introduced a multistage process based on the solution, in the first stage, of m linear models, the k-th of which provides the k-th input-specific contraction. In the second stage, the desired value is obtained as the minimum of all the input-specific contractions determined previously (see Gonzalez and Alvarez, 2001, P1, p. 517). Unfortunately, this algorithm does not always lead to the correct solution, as Aparicio et al. (2016) show by means of a counterexample.
If we focus our attention on the identification of the least distance to the "weakly" efficient frontier, then we can find two procedures, associated with the $\ell_1$ and $\ell_\infty$ metrics, that correctly lead to the desired results (see Briec, 1998). The $\ell_1$ distance is calculated by solving $m + s$ linear programs, as many as the number of variables. Each program coincides with a specific directional distance function (Chambers et al., 1998) that uses a directional vector with all components equal to zero except for the dimension (variable) being evaluated. The $\ell_\infty$ distance can be computed by solving a directional distance function that uses a directional vector with all components equal to one.
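The two procedures can be written compactly as follows. This is a sketch under assumptions that are not stated above: variable returns to scale, the envelopment form of the directional distance function with intensity variables $\lambda_j$, the shorthand $d_1$ and $d_\infty$ for the two least distances, and the reading that the $\ell_1$ distance is the smallest of the $m + s$ optimal values.

```latex
% Directional distance function for DMU_0 with direction g = (g^x, g^y) >= 0 (VRS assumed):
\vec{D}(X_0, Y_0; g) = \max_{\beta, \lambda} \ \beta
\quad \text{s.t.} \quad
\sum_{j=1}^{n} \lambda_j x_{ij} \le x_{i0} - \beta g^x_i, \quad i = 1, \ldots, m,
\qquad
\sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0} + \beta g^y_r, \quad r = 1, \ldots, s,
\qquad
\sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0, \quad j = 1, \ldots, n.

% \ell_\infty distance: a single program with every component of g equal to one.
d_\infty(X_0, Y_0) = \vec{D}(X_0, Y_0; \mathbf{1}_{m+s})

% \ell_1 distance: m + s programs, one per unit vector e_k, keeping the smallest optimal value.
d_1(X_0, Y_0) = \min_{k = 1, \ldots, m+s} \vec{D}(X_0, Y_0; e_k)
```

Each program is an ordinary linear program, which is why the weakly efficient case avoids the non-convexity that complicates the strongly efficient one.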
In view of the preceding discussion, from a computational point of view, the determination of the least distance in DEA has not yet been satisfactorily solved, and the effort to apply new methods to overcome the problem is therefore justified. In this respect, other related papers are those by Martinez-Moreno et al. (2013), Lopez-Espin et al. (2014), Aparicio et al. (2014b) and Gonzalez et al. (2015), who apply genetic algorithms, meta-heuristics and parallel programming to determine closest efficient targets in DEA.

Properties of the new approach
Other techniques for measuring efficiency, such as SFA, which is grounded in statistical tools, can check the goodness of fit of the proposed model by statistical tests (ANOVA, $R^2$, etc.). In contrast, DEA lacks a goodness-of-fit tool.
The way to check the quality of an efficiency measure in DEA is to establish a list of properties that the measure should satisfy a priori. In the context of efficiency measurement, Färe and Lovell (1978) were the first to propose a set of desirable properties that an ideal efficiency measure must meet. Later, Cooper et al. (1999) and Pastor et al. (1999) stated similar requirements and suggested some others. Specifically, the main properties are as follows: (P1) the measure should be between zero and one, with one meaning full efficiency; (P2) the assessed DMU is Pareto-Koopmans efficient if and only if the measure takes a value of one; (P3) units invariance; and (P4) strong monotonicity.
We particularly want to highlight the concept of strong monotonicity in this section because it is an indispensable property for any technical efficiency measure. Strong monotonicity relates the notion of efficiency to Pareto optimality. In detail, if DMU A dominates DMU B in the Pareto sense, then the measure of technical efficiency associated with A should be strictly greater than the measure of technical efficiency of B or, equivalently, the measure of technical inefficiency associated with A should be strictly less than the measure of technical inefficiency of B. Two ways to formulate this property exist: a strong and a weak version. Next, we show their formal definitions.

Definition 1. A technical inefficiency measure D is strongly monotonic if, whenever DMU A dominates DMU B in the Pareto sense, the value of D at A is strictly less than its value at B.

Next, we introduce a relaxed version of Definition 1.

Definition 2. A technical inefficiency measure D is weakly monotonic if, whenever DMU A dominates DMU B in the Pareto sense, the value of D at A is less than or equal to its value at B.
From the above definitions, if D satisfies strong monotonicity, then it is also weakly monotonic, but the converse is not true.
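The gap between the two versions can be checked mechanically on a finite sample. The sketch below is only an illustration: it reads the strong version as a strict inequality and the weak version as a non-strict inequality over every dominating pair, and the three DMUs and the `output_shortfall` measure are hypothetical.

```python
from itertools import permutations


def dominates(a, b):
    """True if DMU a = (inputs, outputs) Pareto-dominates DMU b."""
    (xa, ya), (xb, yb) = a, b
    no_worse = all(p <= q for p, q in zip(xa, xb)) and all(p >= q for p, q in zip(ya, yb))
    better = any(p < q for p, q in zip(xa, xb)) or any(p > q for p, q in zip(ya, yb))
    return no_worse and better


def monotonicity_violations(dmus, inefficiency, strict=True):
    """Pairs (A, B) where A dominates B but the inefficiency ordering fails."""
    violations = []
    for a, b in permutations(dmus, 2):
        if dominates(a, b):
            ia, ib = inefficiency(a), inefficiency(b)
            holds = ia < ib if strict else ia <= ib
            if not holds:
                violations.append((a, b))
    return violations


# Hypothetical single-input, single-output units written as (inputs, outputs).
A = ((2.0,), (3.0,))
B = ((3.0,), (3.0,))
C = ((3.0,), (2.0,))
dmus = [A, B, C]


def output_shortfall(dmu):
    """Toy inefficiency that ignores inputs: shortfall from an assumed best output of 4."""
    return max(0.0, 4.0 - dmu[1][0])


strong_violations = monotonicity_violations(dmus, output_shortfall, strict=True)
weak_violations = monotonicity_violations(dmus, output_shortfall, strict=False)
print("strong version violated by:", strong_violations)
print("weak version violated by:", weak_violations)
```

Here A dominates B because it uses less input for the same output, yet the toy measure assigns both the same value, so the strict version fails while the relaxed one holds.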
To review the literature devoted to these properties, we first need to introduce some definitions.
The Hölder norms $\ell_p$ ($p \in [1, \infty]$) are defined over a g-dimensional real normed space as follows:

$$\|Z\|_p = \begin{cases} \left( \sum_{j=1}^{g} |z_j|^p \right)^{1/p} & \text{if } p \in [1, \infty), \\ \max_{j = 1, \ldots, g} |z_j| & \text{if } p = \infty, \end{cases} \qquad (11)$$

where $Z = (z_1, \ldots, z_g) \in R^g$. From (11), Briec (1998) defines the Hölder distance function for DMU$_0$ with vector of inputs and outputs $(X_0, Y_0)$ as follows:

$$D^p_{\partial^w(T)}(X_0, Y_0) = \inf\left\{ \|(X_0, Y_0) - (X, Y)\|_p : (X, Y) \in \partial^w(T) \right\}. \qquad (12)$$

We will refer to $D^p_{\partial^w(T)}(X_0, Y_0)$ as the "weak" Hölder distance function to distinguish this notion from the one where $\partial^w(T)$ is substituted by $\partial^s(T)$ in (12). Accordingly, it is possible to define strong Hölder distance functions as

$$D^p_{\partial^s(T)}(X_0, Y_0) = \inf\left\{ \|(X_0, Y_0) - (X, Y)\|_p : (X, Y) \in \partial^s(T) \right\}.$$

Now, we are ready to review the first results on monotonicity. In particular, Briec (1998, Proposition 1(3)) proved the following "natural" statement: weak Hölder distance functions meet weak monotonicity. At the same time, it is not hard to show that weak Hölder distance functions do not satisfy Definition 1. Accordingly, an interesting question is this: are strong Hölder distance functions strongly monotonic? Baek and Lee (2009) were the first to attempt to deal with this problem. They defined an inefficiency measure based on the Euclidean distance and tried to show that this measure satisfied strong monotonicity. These authors believed that they had succeeded in proving the property. Unfortunately, strong monotonicity is not satisfied by Baek and Lee's approach, as Pastor and Aparicio (2010) showed by means of a numerical counterexample. Their counterexample also shows that the measure is not even weakly monotonic. Nevertheless, to state a counterexample of this type, it is necessary to deviate from the simple situation of a single input and a single output (at least three dimensions are needed) because, in that case, strong monotonicity is satisfied by the strong Hölder distance functions, as we prove next.
P1. If $m = s = 1$, then the strong Hölder distance functions satisfy strong monotonicity.
(3) If $y_{10} > \tilde{y}_{11}$, we have that $\exists (\bar{X}, Y_0) \in \partial^s(T)$ with $\bar{X} = \inf\{X : (X, Y_0) \in T\}$ under VRS and CRS in two dimensions. In this case, we may carry out the same steps as in ii) and find a contradiction.
For $p = \infty$, it is possible to adapt the above procedure and achieve the same result.
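As a small numerical companion to the Hölder norms used throughout this section, the sketch below evaluates the norm for several values of p and compares two candidate targets for one evaluated unit. The data and the function names are hypothetical and serve only as an illustration.

```python
import math


def holder_norm(z, p):
    """Hölder norm of z for p in [1, inf]; p = math.inf gives the max norm."""
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    if math.isinf(p):
        return max(abs(v) for v in z)
    return sum(abs(v) ** p for v in z) ** (1.0 / p)


def holder_distance(point, target, p):
    """Hölder distance between an evaluated unit and one candidate target."""
    return holder_norm([a - b for a, b in zip(point, target)], p)


# Hypothetical single-input, single-output data written as (input, output).
evaluated = (3.0, 2.5)
target_a = (3.0, 3.5)                 # expands the output only
target_b = (7.0 / 3.0, 19.0 / 6.0)    # changes both dimensions by the same amount

for p in (1, 2, math.inf):
    print(p, holder_distance(evaluated, target_a, p), holder_distance(evaluated, target_b, p))
```

With these numbers, `target_a` is the closer of the two under $p = 1$ and `target_b` is the closer under $p = \infty$, which illustrates that the closest target depends on the chosen norm.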
Following the line of research of Baek and Lee (2009) and Pastor and Aparicio (2010), Ando et al. (2012) showed that, in general, strong Hölder distance functions do not meet strong monotonicity and suggested a solution for satisfying at least weak monotonicity. Their approach was based on modifying the usual definition of efficiency measure in DEA by not allowing the final projection point on the frontier to dominate the evaluated unit $(X_0, Y_0)$. Their model is formulated in terms of the set $T(X_0, Y_0) = \{(X_0 + D_x, Y_0 - D_y) : D_x \ge 0, D_y \ge 0\}$.
Note that $D^p_{\partial^s(T)}(X_0, Y_0) \ge f^p_{\partial^s(T)}(X_0, Y_0)$ and the final projection point, $(X', Y')$, is determined by minimizing the distance from $(\hat{X}, \hat{Y})$, a "virtual" point that is dominated by $(X_0, Y_0)$, to the strongly efficient frontier $\partial^s(T)$.
Later, Fukuyama et al. (2014a) proved strong monotonicity of strong Hölder distance functions by resorting to an adaptation of Ando et al.'s methodology. In contrast to Ando et al. (2012) and Fukuyama et al. (2014a), where the usual definition of the Hölder distance functions is modified, Aparicio and Pastor (2013) and Aparicio and Pastor (2014b) have suggested an alternative possibility based on the extension of the facets of the production possibility set. In particular, Aparicio and Pastor (2013) proved that the output-oriented version of the Russell measure (Färe et al., 1985) is a well-defined efficiency measure, satisfying strong monotonicity on the strongly efficient frontier, if efficiency is evaluated with respect to an extended facet production possibility set based on full dimensional efficient facets (FDEF) instead of the standard DEA technology. Figures 3 and 4 illustrate the idea behind this method graphically. In Figure 3, two FDEFs appear, AB and BC, which are "extended" in Figure 4 to generate a new production possibility set that contains the original one. Additionally, Aparicio and Pastor (2014b) showed that the failure of the monotonicity property is related to the fact that, in general, not all the facets of the production possibility set are FDEFs. In other words, this drawback is associated with the dimensionality of the strongly efficient frontier. It is worth mentioning that these methods, which are founded on the extension of facets, assume that there is at least one FDEF. However, this is not always true; it depends on the geometrical configuration of the observed data.
As a conclusion of this section, let us highlight the main difference between the approaches proposed by Ando et al. (2012) and Fukuyama et al. (2014a) and those suggested by Aparicio and Pastor (2013 and 2014b). The former use the standard DEA production possibility set but need to modify the traditional definition of efficiency. In contrast, Aparicio and Pastor (2013 and 2014b) resort to the traditional definition of efficiency but need to change the standard DEA technology by extending efficient facets.

Duality
In view of the content of the preceding sections, the problem of deriving the least distance appears to be one of the relevant issues in the recent DEA literature. However, one challenge that has not yet been fully solved is the extension of this approach to other frameworks, such as the decomposition of overall inefficiency into technical and allocative inefficiencies. In for-profit organizations, the measurement of overall inefficiency, in addition to the estimation of technical inefficiency, is generally the most important objective of performance evaluation. A firm is usually interested in changing input and output quantities if this leads to real economic gains. In this sense, overall inefficiency measures how close the firm is to the optimal behavior given input and output market prices, while technical inefficiency measures how close the firm is to the frontier of the technology, the latter being a traditional component of the former (see Farrell, 1957). Accordingly, the firm's overall inefficiency has usually been decomposed into technical and allocative (price) inefficiency in the literature.
In DEA, it is well known that no measure satisfies all desirable properties for measuring technical efficiency (Russell and Schworm, 2009). Practitioners must therefore choose between several "imperfect" options to evaluate technical efficiency. This is true if technical efficiency measures are treated as being completely independent of prices and concepts of economic efficiency. However, Russell (1985) shows that if the existence of a dual relationship with the profit, cost or revenue functions is required as an axiom, then Shephard's distance functions are the appropriate choice among all the alternatives because they are the natural dual to the usual measures of economic efficiency, allowing decomposition of the economic efficiency index. Consequently, the existence of a relationship between the technical efficiency measure and a measure of economic efficiency can be considered an additional property to be satisfied, thereby linking this section with the previous one.
Throughout this section, let us assume that DMUs face exogenously determined market input and output prices and that the aim of each unit is to choose the input and output combinations that result in the maximum profit.
In this way, let us denote the value of the profit function hereafter as $\Pi(C, Q)$, given input and output price vectors $C = (c_1, \ldots, c_m)$ and $Q = (q_1, \ldots, q_s)$. The usual profit inefficiency measures found in the literature compare the observed profit for DMU$_0$ with vector of inputs and outputs $(X_0, Y_0)$, $\Pi_0 := \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0}$, with the maximum level of profit $\Pi(C, Q)$. There are two different ways to evaluate this economic inefficiency. On the one hand, we could use a ratio-form measure such as $\Pi_0 / \Pi(C, Q)$. However, this expression is only well-defined for $\Pi(C, Q) \neq 0$, a scenario that cannot always be guaranteed in practice. On the other hand, the economic loss could be determined by means of a difference measure such as $\Pi(C, Q) - \Pi_0$. This measure always takes non-negative values, and a value of zero is related to nil economic inefficiency. Nerlove (1965) was the first to suggest the difference between optimal and actual profit as a measure of overall inefficiency. Unfortunately, as Nerlove (1965, p. 94) pointed out, this alternative has one notable drawback: it is not a suitable economic measure because it is homogeneous of degree one in prices. One solution to this problem was proposed by Chambers et al. (1998) for the directional distance function approach, using a particular deflator: the value of a reference bundle of inputs and outputs.
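The homogeneity drawback and the deflation remedy can be made explicit with a short derivation. The notation $g = (g^x, g^y)$ for the reference bundle and the symbol $\vec{D}$ for the directional distance function are assumptions introduced here for illustration.

```latex
% Scaling all prices by k > 0 scales the difference measure by k:
\Pi(kC, kQ) - \Big( \sum_{r=1}^{s} k q_r y_{r0} - \sum_{i=1}^{m} k c_i x_{i0} \Big)
  = k \, \big[ \Pi(C, Q) - \Pi_0 \big].

% Deflating by the value of a reference bundle g = (g^x, g^y) removes the dependence on k:
\frac{\Pi(C, Q) - \Pi_0}{\sum_{r=1}^{s} q_r g^y_r + \sum_{i=1}^{m} c_i g^x_i}
  \ \ge \ \vec{D}(X_0, Y_0; g).
```

The left-hand side of the second expression is homogeneous of degree zero in prices, so its value does not change when the currency unit changes.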
We now turn to the approaches that have related the notions of least distance and the profit function. As far as we are aware, there is only one paper in the literature that has considered this problem: Briec and Lesourd (1999). These authors resort to Hölder norms for measuring technical inefficiency but use the weakly efficient frontier as a reference set instead of the strongly efficient one. We are referring, therefore, to the "weak" Hölder distance functions.
After introducing some notation and definitions, we are ready to show how we can derive a difference measure of profit inefficiency from a duality result proven in Briec and Lesourd (1999).
P2. (Briec and Lesourd, 1999) Let $(X_0, Y_0)$ be an input-output vector in T. Let $\ell_t$ be the dual space of $\ell_p$ with $1/p + 1/t = 1$. Then,

$$D^p_{\partial^w(T)}(X_0, Y_0) = \inf_{(C, Q) \ge 0} \left\{ \Pi(C, Q) - \Big( \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0} \Big) : \|(C, Q)\|_t \ge 1 \right\}.$$

Proof. See Proposition 3.2 in Briec and Lesourd (1999).

By P2, it is obvious that if the input-output market prices, $(C, Q)$, are such that $\|(C, Q)\|_t \ge 1$, then $\Pi(C, Q) - ( \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0} ) = \Pi(C, Q) - \Pi_0 \ge D^p_{\partial^w(T)}(X_0, Y_0)$ for DMU$_0$. In this way, we would get the usual difference measure of profit inefficiency on the left-hand side of the inequality and the weak Hölder distance function on the right-hand side, showing that it is possible to decompose overall inefficiency through $D^p_{\partial^w(T)}(X_0, Y_0)$. However, as Nerlove (1965) pointed out, the term $\Pi(C, Q) - \Pi_0$ should be deflated to obtain an appropriate measure of profit inefficiency. Accordingly, we propose the following solution.
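Since the profit function is homogeneous of degree one in prices, applying P2 to the normalized prices $(C, Q) / \|(C, Q)\|_t$ yields $[\Pi(C, Q) - ( \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0} )] / \|(C, Q)\|_t \ge D^p_{\partial^w(T)}(X_0, Y_0)$. So, the term $[\Pi(C, Q) - ( \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0} )] / \|(C, Q)\|_t$ may be considered a (deflated) measure of profit inefficiency, whereas $D^p_{\partial^w(T)}(X_0, Y_0)$ would be its technical inefficiency component and $[\Pi(C, Q) - ( \sum_{r=1}^{s} q_r y_{r0} - \sum_{i=1}^{m} c_i x_{i0} )] / \|(C, Q)\|_t - D^p_{\partial^w(T)}(X_0, Y_0)$, the residual term, would be the allocative or price inefficiency component, following Farrell's tradition.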
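The decomposition can be traced numerically with a minimal sketch. Several elements are assumptions made for illustration: the data are hypothetical, the technology is taken to exhibit variable returns to scale (so that maximum profit at nonnegative prices is attained at an observed unit), and the technical inefficiency value is supplied by the caller rather than computed here.

```python
import math


def holder_norm(z, p):
    """Hölder norm of z for p in [1, inf]."""
    if math.isinf(p):
        return max(abs(v) for v in z)
    return sum(abs(v) ** p for v in z) ** (1.0 / p)


def dual_exponent(p):
    """Exponent t with 1/p + 1/t = 1."""
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1.0)


def profit(dmu, c, q):
    """Observed profit: revenue from outputs minus cost of inputs."""
    x, y = dmu
    return sum(qr * yr for qr, yr in zip(q, y)) - sum(ci * xi for ci, xi in zip(c, x))


def max_profit_vrs(dmus, c, q):
    """Maximum profit, assuming VRS so that it is attained at an observed DMU."""
    return max(profit(d, c, q) for d in dmus)


def decompose_profit_inefficiency(dmu0, dmus, c, q, p, technical):
    """Return (overall, technical, allocative) using the deflated difference measure."""
    deflator = holder_norm(list(c) + list(q), dual_exponent(p))
    overall = (max_profit_vrs(dmus, c, q) - profit(dmu0, c, q)) / deflator
    return overall, technical, overall - technical


# Hypothetical single-input, single-output units written as (inputs, outputs).
dmus = [((1.0,), (1.0,)), ((2.0,), (3.0,)), ((4.0,), (4.0,)), ((3.0,), (2.5,))]
dmu0 = dmus[3]

# The technical component (2/3 here) is taken as given for p = infinity.
overall, technical, allocative = decompose_profit_inefficiency(
    dmu0, dmus, c=(1.0,), q=(1.0,), p=math.inf, technical=2.0 / 3.0
)
print(overall, technical, allocative)
```

With unit prices the deflator is the $\ell_1$ norm of the price vector, which equals 2, so the overall term is 1.5 / 2 = 0.75 and the allocative residual is the part of that loss not explained by the technical component.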

Conclusions and further research
The determination of the least distance and closest targets has attracted significant attention over the past few years in the literature on efficiency measurement. A growing list of contributions has appeared, focused especially on methodological aspects of this issue. Considerable attention has been paid to how to calculate the least distance, to both the weakly and the strongly efficient frontier, and to the satisfaction of the property of monotonicity. In this paper, we reviewed the state of the research and classified the different contributions.
To finish the paper, we state some promising avenues of future research on this methodology.
• From a computational point of view, the determination of the least distance has still not been satisfactorily resolved, and consequently, the application of new methods to solve the problem is justified. It particularly seems necessary to apply methodologies that avoid determining all the efficient faces of the production possibility set. In this respect, a paper comparing the execution times of existing approaches would also be of interest.
• Current methodologies lack real applications demonstrating their capability, mainly from a benchmarking perspective. Most contributions are methodological. Papers showing the merits of these techniques with real databases are clearly needed.
• Regarding the satisfaction of the property of monotonicity, some questions still remain open. For example, what happens with respect to other technical efficiency measures, like the SBM, when versions based upon the PLA are defined? Is there a "mathematical" distance (different from the Hölder metrics) that satisfies strong monotonicity on the strongly efficient frontier? Along the same lines, is there a new technical efficiency measure in standard DEA that satisfies strong monotonicity on the strongly efficient frontier?
• P2, which establishes an inequality between profit inefficiency and technical inefficiency measured through $\ell_p$, is true under the assumption of measuring inefficiency with respect to the weakly efficient frontier. The natural question is: does this result hold when the weakly efficient frontier is replaced by the strongly efficient frontier? Research needs to be directed at developing new measures of overall inefficiency capable of dealing with the notion of Pareto-efficiency and the determination of least distance when technical inefficiency is evaluated.
• Finally, the techniques based on the PLA should be extended to other areas in efficiency measurement, for instance, the determination of productivity change over time and its decomposition into its different sources.
Figure 1. Furthest vs closest targets
Figure 2. Example of internal projection
Figure 3. Example of FDEFs