Is auditing the new evaluation? Can it be? Should it be?

Jon Pierre (Department of Political Science, University of Gothenburg, Gothenburg, Sweden)
B. Guy Peters (Department of Political Science, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Jenny de Fine Licht (Department of Political Science, University of Gothenburg, Gothenburg, Sweden)

International Journal of Public Sector Management

ISSN: 0951-3558

Publication date: 13 August 2018



The purpose of this paper is to study the changing relationship between auditing and evaluation. Over the past several years, supreme auditing institutions (SAIs) in a number of advanced democracies have evolved from conventional auditing institutions to becoming increasingly concerned with assisting policy change and administrative reform in the public sector; tasks that are traditionally associated with evaluation. The paper discusses the potential consequences of this development for the SAIs themselves as well as for the audited and reforming institutions and for policy-making.


The paper uses qualitative method and draws on the extensive literature on auditing and evaluation. The analysis has also benefitted from the authors’ recent comparative research on SAIs.


The findings, summarized in six points, are that the growth of auditing in areas previously assigned to evaluators has led to a shortened time perspective; a stronger emphasis on the administration of policies; increased focus on the efficiency of the audited entity; greater independence from the evaluated organizations; a shift in the receiver of information toward the legislature and/or the public; and improved communication.

Practical implications

Evaluation as a professional and scholarly field has developed theories and advanced methods to assess the effectiveness of public programs. The growth of auditing may thus change the focus and quality of policy evaluation.


The paper speaks to both scholars and practitioners. To the best of the authors’ knowledge, a similar analysis has not been done before.



Pierre, J., Peters, B. and de Fine Licht, J. (2018), "Is auditing the new evaluation? Can it be? Should it be?", International Journal of Public Sector Management, Vol. 31 No. 6, pp. 726-739.




Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited

Auditing, evaluation and public management reform

This paper reflects on the developments in the scope, role and impact strategy of auditing in the public sector over the past several decades and how this development has reshaped some elements of the policy process. It argues that the increasing value and significance associated with auditing from the 1980s onwards has placed it in a position in the policy process previously held by evaluation as an integral stage of that process, possibly to the detriment of both auditing and evaluation (see Roberts and Pollitt, 1994). The evidence used will be drawn primarily from advanced western democracies such as the Scandinavian countries, the USA and the antipodes with which we have the greatest familiarity. That said, the trends discussed here appear more universal and could be identified in a number of settings, and we have included some evidence from other settings.

While conventional auditing certainly has an important evaluative component, evaluation – sustained by a host of literature, scholarly experts and professional organizations – developed extensive expertise in different types of evaluation which were of very little interest to auditors. More recently, performance auditing has expanded its agenda to include policy and regulatory evaluation; areas where conventional auditing expertise is of only limited help. More specifically, this shift in the cast of institutions responsible for assessing the performance of public organizations and for guiding change involves at least five important changes. First, the time perspective on change may have been shortened. Second, the emphasis appears to be more on the administration of policies, rather than on the design of those policies. Third, while conventional evaluation aims at informing future policy-making, the mission of supreme auditing institutions (SAIs) is to conduct auditing with a view to increasing the efficiency of the audited entity. Fourth, SAIs have much greater independence from the evaluated organizations than evaluators tend to have in relation to the institutions or programs they assess. Finally, the client for the assessment of the organizations is now more commonly the legislature rather than the executive itself. These points will be elaborated later in the paper.

SAIs are well-established parts of the institutional setup in almost all countries. While SAIs in the western world are typically seen as critical to effective accountability of government and the bureaucracy, SAIs play essential roles in less democratic societies as well (Pollitt et al., 1999). With 194 members and another five associated members, INTOSAI, the global organization for SAIs, demonstrates that auditing is essential to all forms of government.

Although auditing is a perennial component of government, the prominence and significance of auditing offices have varied considerably over time. Auditing in the public sector was long seen as an important, albeit not the most exciting, component of government. Toiling in anonymity characterized a period when auditing was mainly concerned with financial auditing, ensuring that public money was spent prudently and wisely. This was also a time when auditing reports rarely appeared in newspaper headlines or parliamentary debates; neither did auditors’ reports have any noticeable impact on policy, administration or administrative reform. There was very little creative about this type of auditing; it was an ex post facto process ensuring that public servants had acted within their mandate and in accordance with regulatory frameworks.

The 1980s and 1990s (with some variation in different national contexts) witnessed what Michael Power (1994) labeled the “audit explosion”; a surge in audits in all corners of the public sphere, challenging conventional mechanisms of administrative control and accountability. More than anything else, Power argued, auditing “must be understood as an idea […] that is internal to the ways in which practitioners and policy-makers make sense of what they are doing.” The main purpose of auditing in this perspective was not so much to ensure prudent spending of public resources as had been the practice in financial auditing. Now, as performance auditing gained prominence, the issue was more “about ways of talking about administrative control,” a new way of setting government’s administrative priorities and strategies, and ultimately a new type of objectives of government (Power, 1994, pp. 4-5; italics in original).

The “audit explosion” had an impact across the public sector financial and audit system. The objective was to strengthen accountability, increase control over public borrowing, and improve the quality of public services. It transformed audit from a passive, reactive exercise to more proactive and prescriptive control over actors in the public sector; urged public organizations to conform to the requirements of performance auditing, i.e. to make themselves auditable; and challenged the development of international standards in public sector accounting (Brusca et al., 2018).

The expansion of performance audits gave performance auditing a more direct and active role in administrative reform. While “the auditor of the 1960s and 1970s was largely unthinkable of as a possible agent of public sector reform” (Power, 2005, p. 328), this role now became more common. The scope of auditing also changed. While financial auditing remained important, performance auditing gained increasing prominence (Barzelay, 1997; Pollitt and Summa, 1996). To some extent, this development was related to the growing interest in performance measurement and management in most western countries as part of the New Public Management reform drive (Romzek, 2000).

The chief purpose of the “new” performance auditing is not just to audit but also to assist the audited entity to reform. Impact (preferably measurable) of auditing has become a more important objective and auditors and regulators tend to rely less on conventional rules and “sticks” and instead are more inclined to use incentives and advice to shape the behavior of the audited entity. This process has been underway for some time among regulators[1]. There now appears to be a similar reassessment with regard to performance auditing, at least in some jurisdictions. It is a shift from emphasizing means to emphasizing ends.

The development in which performance auditing adopts parts of the role previously played by evaluation in the policy process is potentially damaging to the conventional contributions of both auditing and evaluation to that process. Both provide policy-makers and public sector managers with essential information and both represent distinct professions each with their own discourse and epistemic communities. However, while auditing has increased its focus on performance auditing and in many countries become an agent of administrative reform, conventional evaluation has become marginalized (Roberts and Pollitt, 1994). Critics of this development have argued that performance auditing suffers from a lack of clear standards and diffuse professional interests (Schwarz and Mayne, 2005, pp. 241-244) and that conceptually, performance auditing is “a misnomer for a class of mainly evaluative review activities” (Barzelay, 1997, p. 235).

The paper critically reviews the consequences of the development wherein auditing increasingly adopts key features of evaluation. The next section briefly discusses the key features of traditional evaluation and then contrasts those features with the design and policy role of contemporary auditing. From there the paper assesses whether auditing could be seen as “the new evaluation” and, if so, what the main consequences of that development might be.

Revisiting evaluation – what was so good about the good old days?

The heyday of policy evaluation was in the 1960s and 1970s. The increased concern with evaluation at the time represented the confluence of several factors in politics and in the social sciences. Perhaps the most important factor was the continued expansion of public programs and the then unbridled faith that, given enough resources and authority, the public sector could solve social problems. Even in countries with a history of extreme skepticism about the capacity of government to solve problems, there were significant investments in public policies; the “Great Society” in the USA is the most obvious example (for an overview, see Pattyn et al., 2018).

The second factor that influenced the development and institutionalization of evaluation at the time was the development of methodologies in the social sciences that promised to provide answers about the effectiveness and even efficiency of public programs. Most of the techniques involved were not new, but they began to be applied more extensively and were in some cases taken from academic research into more applied areas. With substantial hubris social scientists argued that they could not only tell whether a program was working or not, but could also provide answers about how to make it work.

The third factor involved in the expansion of evaluation research was that governments, believing in the first two factors above, were willing to invest in evaluation. For example, most of the programs in the Great Society in the USA had substantial budgets for evaluation; up to ten percent of total funding in some cases[2]. This money was designed primarily to assist in improving the programs, but also was intended in part to legitimate the programs and the large-scale public intervention in a society that was largely uncertain about government social programs (Aaron, 1978). The experience in the USA was mirrored in other countries which were willing to use substantial amounts of public money to create evaluation “shops” in many public organizations or to create organizations and programs that worked across the public sector, e.g. PAR in the UK (see Heclo and Wildavsky, 1974) and The Swedish Agency for Public Management (Statskontoret).

Finally, reflecting the above three trends, auditing organizations in the public sector in many countries began to move from “green eyeshade” financial accounting to performance accounting (Mosher, 1979). That trend is reflected more thoroughly in the remainder of this paper, but its roots are in much earlier adoptions of the idea that accountability meant much more than financial accountability and that the access that the SAIs had to the remainder of the public sector gave them a privileged position to begin assessing how well public programs were performing.

Issues in evaluation research

The evaluation movement that evolved at this time had a number of characteristics and principles that shaped its work. There were also important differences in the manner in which evaluations were done that produced seemingly contradictory results. As will be pointed out in later sections of this paper, some of these principles remain in effect in the evaluation work of contemporary SAIs, while others have been largely lost. Some members of the old-style evaluation community regard these changes with more than a little skepticism (see e.g. Schwarz and Mayne, 2005) and there are important questions about the relevance of “the good old days” for contemporary government evaluations.

A key feature of the “old” evaluation approach was that it was longitudinal and in-depth. The general purpose of much of this evaluation work was to uncover the underlying causes of success or failure in a program, and that research goal was considered to require substantial time for an adequate assessment. A program may well be a success or failure in the short term but those effects may decay over time. The greater time frame involved in that style of evaluation enabled researchers to have a better sense of the impact of a program, rather than just looking at the outcomes.

Much contemporary assessment of public programs is based on relatively simplistic indicators (see Bouckaert and Peters, 2002). The older versions of evaluation did indeed use indicators as well, but these tended to be supplemented with more qualitative evidence. The social indicators movement was in full flower at the same time as much of this evaluation was occurring, and there were important attempts to integrate the indicators movement with evaluation (Innes, 1989). That integration of indicators into evaluation was successful to some extent but was also seen by some critics as oversimplifying the very complex processes involved when an intervention from government was being made.

One of the important tensions in the earlier strand of evaluation was the difference between evaluations based on economic criteria (especially efficiency) and evaluations based more on political and effectiveness criteria. Cost-benefit analysis is generally used as an ex ante assessment of policies, but it is also used in ex post evaluations. The results of these analyses are often at odds with those of evaluation methods based on whether a program indeed had the desired effects, and even more so when values such as participation and feelings of efficacy are included in the evaluation[3].

Another tension in this evaluation was the potential conflict between in-depth evaluations and political necessities. Attempting to understand the dynamics of a policy area and its consequences over time requires time and substantial investment. Doing otherwise runs the risk of “sleeper effects” and other significant errors in evaluation (see Salamon, 1979). On the other hand, politicians and many citizens want to know if the program was successful and to know that before the next election, or before the next vote on the budget. If most evaluation is done within organizations with a more or less permanent staff, a longer time perspective on program success is understandable, but that does not match well with political realities[4].

There also was, and continues to be, a tension between objective and subjective forms of policy evaluation (see Vedung, 2013). The older forms of policy evaluation focused attention on objective measures of success or failure, albeit with some concerns with the perceptions of the program recipients. Contemporary evaluation activities also utilize primarily objective indicators, while also adding a variety of participatory mechanisms that enable citizens and stakeholders to give their views on the programs.

Finally, there is a question of whether the purpose of evaluation was learning and improving programs or whether the purpose was to reward or punish the organizations and programs involved. Although there were certainly instances in which evaluation was used as a means of punishing poor performance, or (even more infrequently) rewarding excellent performance, much of the emphasis in traditional evaluation was on feedback and learning (Marier, 2013). Vedung (2006), for example, talks about the uses of evaluation research for accountability and for improvement.

Evaluating evaluation

While this older approach to evaluation research had a number of virtues, the process was far from perfect. One of the most important deficiencies was that much of the evaluation was in-house. As noted above, many organizations in the public sector had their own evaluation shops that might well be expected to be biased in favor of the program. Governments therefore had to trade off the knowledge of the program possessed by internal evaluators vs the greater objectivity of external evaluators.

In addition to the possibilities of bias when there is so much of the evaluation done within the confines of the evaluated organization, there may be additional issues in structuring the process. One of the more important of these is that the evaluation tended to reinforce the “silos” existing in government. While that tendency has perhaps been exacerbated by using performance indicators in much of contemporary evaluation (Bianchi and Peters, 2017), the failure to evaluate broad areas of public sector activity and attempt to develop better policy integration represented a significant problem in evaluation (for a contrary example see Challis, 1988).

Thus, the virtues of the more traditional forms of evaluation research must be weighed against some possible weaknesses. The evaluation movement has to some extent waned but also has served as the foundation for the performance management movement (see Blalock, 1999). For both forms of assessment of public sector programs and organizations, the fundamental questions are very much the same. They are both attempting to understand if those government programs are working, and for whom are they working; citizens or the organizations themselves?

From performance auditing toward evaluation

Auditing in general and performance auditing in particular have developed considerably since the early 1980s. While always relevant to public finance and governance, auditing has become a more noticeable component of government. The new discourse of auditing from the 1980s and 1990s onwards has contributed to defining auditing as a key component of public management and increasing the centrality of auditing in management and administrative reform (Power, 1999).

This development was driven mainly by four factors. The first factor involved a growing focus on efficiency and outcomes of government agencies as a leitmotif in NPM reform (Hood, 1991; Lynn, 2005; Pollitt and Bouckaert, 2011; Pollitt et al., 1999). Also, the increasing usage of contractors to deliver public service heightened government’s need to measure performance (Barberis, 1998; Ferlie et al., 1996). And, as Power (1999, 2005) points out, this growing focus on performance and outcomes was sustained by a change in the discourse of public management emphasizing visible and measurable results and value for money.

Second, the past several decades have seen the consolidation of international standards for the organization and conduct of auditing (Brunsson and Jacobsson, 2000). Membership in the INTOSAI, the international organization for SAIs, has become a seal of quality for SAIs. The consolidation of the international network for SAIs may also have contributed to boosting the domestic status of the SAIs. This was demonstrated, for instance, in the reorganization of the Swedish SAI, where complying with the INTOSAI’s recommendation that SAIs should report to Parliament and not to executive government had a significant impact on domestic reform. It was very clear that the government was anxious to comply with INTOSAI norms (Ahlbäck Öberg, 2001; Bringselius and Ahlbäck Öberg, 2017).

While the INTOSAI seems to be careful not to impose universal standards, governments appear strongly committed to implementing its recommendations (Pierre and de Fine Licht, 2017). The INTOSAI portrays auditing as a keystone of good governance and few governments would want to receive criticism for not providing their SAIs with the autonomy and integrity prescribed by the INTOSAI or to see their auditing office failing to deliver on the INTOSAI’s recommendations. The INTOSAI’s leverage on domestic institutions is soft law (Mörth, 2004) but, given the prestige that surrounds performance auditing, it is highly effective regulation.

Third, as governments’ focus on efficiency increased, so did, by extension, their attention to performance measurement (Bouckaert and Halligan, 2008). Performance auditing, i.e. in-depth analysis of selected agencies’ performance, became an important supplement to the across-the-board strategy of performance measurement and target setting. In some jurisdictions, as will be shown below, audits were not only used to study the performance of an entity but were also seen as opportunities to coach failing entities by assisting in developing networks with other entities, providing benchmarks and promoting best practice and institutional learning.

The focus on performance central to NPM reform has to some extent, and more notably in some jurisdictions than in others, blurred the distinction between reform and auditing. SAIs in many countries are pursuing a reform agenda, identifying themselves as agents of the reform effort by elaborating performance auditing and, in some cases, adopting an advocacy role to facilitate best practice and institutional learning among the entities they audit (Lonsdale, 2007; Pollitt, 2003; Power, 2005). These developments have also meant that the distinction between auditing and evaluation described earlier is today less noticeable.

Fourth, somewhat paradoxically, at the same time that managerialism and New Public Management have become central to governance, the governments making policy have become more ideological and less interested in focused evaluations. Evidence-based policy has been a mantra, but many policy choices (and often instrument choices too) appear driven more by ideology than by evidence. Likewise, evaluations of existing policies appear to be valued less than management information about performance in delivering services.

This emphasis on ideology in evaluation can be seen when contrasting the legislation in the USA that created the Great Society with that which created Obamacare. In the former almost ten percent of the budget was allocated to evaluation, while in the latter there were no formal evaluation funds. Much of the “evaluation” of Obamacare has been political and ideological, with private organizations making the assessments. Many of these “evaluators” have been ideological, e.g. think tanks on both sides of the issue, although they also include some more objective ones such as the Commonwealth Fund and the Kaiser Family Foundation.

Thus, there has been a sequence of developments contributing to the growing prominence of performance auditing and what appears to be an overlap between such auditing and evaluation in many countries, particularly jurisdictions with extensive NPM reform. First, public management reform increased government’s interest in measuring performance in order to boost efficiency and service quality. Second, the focus on performance meant that performance auditing became “an idea in good currency” (Schön, 1983) and that national audit offices became key agents for pursuing public management reform.

As a result, performance auditing expanded its scope and developed partly new strategies which together helped to crowd out conventional evaluation. Jantz et al. (2015, p. 962) suggest that in response to the increasing popularity of policy instruments such as decentralization, performance measurement and management and marketization, SAIs would be expected “to go beyond their traditional roles of ‘bookkeeping’ and ‘guardians of legality,’ to (also) evaluate broader issues of policy effectiveness and ‘value for money.’”

Within these general trends of SAIs, there is considerable variation among different countries in how they perceive their role in the reform process; differences that to some extent can be understood as linked to their administrative history and tradition (e.g. Painter and Peters, 2010; Peters, 2003; but see Pierre and de Fine Licht, 2017). For example, SAIs embedded in an Anglo-American administrative tradition can be expected to emphasize public interest, management and efficiency, whereas the legalistic traditions of the Scandinavian countries, where implementing the legal framework is seen as a safeguard against capture and politicization, can be expected to place more emphasis on “doing things in the right way.”

Even within the Nordic countries there are different strategies adopted by SAIs to cope with the demands on the offices (Jeppesen et al., 2017). Thus, considering the case of Denmark, Triantafillou (2017, p. 3) observes that in the challenges which SAIs often experience between defending their independence and boosting their relevance by promoting management reform, “the Danish SAI has persistently prioritized independence over relevance.” Following a decade of considerable fluctuations concerning the role of the SAI Riksrevisionen in Sweden (Bringselius, 2013), the focus is currently shifting toward reform advocacy and support (Pierre and de Fine Licht, 2017). In Finland, a creative form blending auditing and evaluation has been developed, especially in the so-called steering audits[5]. While evaluation has to some extent been increasing with the development of a well-funded Strategic Research Council, there also has been a shift toward assessing efficiency and effectiveness in evaluation, something more akin to performance auditing (Ahonen, 2015). Also, shifts in technology within the Finnish National Audit Office appear to be moving toward more performance accounting (Ahonen, 2016).

In Sweden, the institutional footprints of the changing salience of auditing and evaluation are easy to see. A number of “inspection agencies” charged with overseeing other agencies were created over the past several years, e.g. in the education, culture and social insurance policy sectors (for the UK, see Hood et al., 1999). The analyses produced by these “inspection agencies” are for the most part similar to conventional program evaluation. Meanwhile, the government, as part of its public management program, has issued secondary legislation on organizational management and control and on annual reports, which is more closely related to auditing functions and technologies.

SAIs’ role in promoting performance auditing remains important, although some are cautious not to become too involved with their auditees, which could compromise their independence. The New Zealand SAI, OAG, for example, “does not comment on policy,” according to its website, and its strategy aims at identifying problems rather than giving specific advice. Audits are essentially one-off interactions with the auditee. The Australian SAI, ANAO, formally subscribes to the principle that the SAI should not give too specific advice since it should not “audit itself.” Compared to the OAG, however, ANAO sees its audits much more as opportunities not just to measure performance and efficiency but also to assist in reforming the entity and, to that end, to develop a more continuous relationship with that organization (Pierre and de Fine Licht, 2017).

These observations are snapshots of phenomena in motion as SAIs in several countries appear to be open to new strategies. Recent changes in the auditing process thus reflect, with significant cross-national variation, a tendency to move the SAIs closer to the administrative reform process. That strategy requires that an audit is not just a formal, one-off interaction but that it also includes the promotion of reform concepts as well as advice and information about how the entity could improve its performance.

To sum up so far, the pattern we see suggests that the more interactive and engaging a SAI is with its auditees, the more inclined and susceptible the SAI would be to blend conventional performance auditing with analyses usually associated with evaluation. Conversely, the more entrenched auditing is in a legal framework, the less likely it is that auditors will also conduct evaluation. That said, there are specific aspects of the growing importance of auditing in relation to evaluation which seem to be more universal in character. For instance, the shorter time perspective and the more distinct focus on efficiency appear to be more universal trends.

We should also be cognizant of at least one pressure for returning to more “old fashioned” in-depth evaluation. To the extent that evidence-based policy-making has become an increasing demand by policy-makers, there must be evidence that the program being adopted does indeed work. Performance management and its emphasis on improving efficiency probably cannot provide the detailed understanding of the efficacy of programs and of their dynamics that would be required for evidence-based policy-making.

Consequences of the growth of auditing

Today, auditing fulfills many of the functions in the policy process that were previously associated with evaluation. There has been a significant decline in the number and staffing of formal evaluation organizations within the public sector. The evaluation function is now being performed by auditing organizations, consultancies, as well as by institutionalized versions of performance management.

While evaluation may still be occurring, it does not appear to be the same in-depth, in-house evaluation advising program managers that was once characteristic of the public sector. For example, one study by the US Government Accountability Office (2013) found that approximately three-fourths of federal managers had not had a thorough evaluation of their programs in over five years. This report was prompted by the Government Performance and Results Act, and noted that the performance auditing associated with that act was significantly different from the in-depth evaluations that had been common in government (for Canada see Mayne, 2006).

There appear to be six important differences in what auditing, as compared with evaluation, can offer to policy-making: a shorter time perspective; an emphasis on the administration of policies rather than an institutional focus; more attention to efficiency and less on due process; a greater independence from the audited or evaluated organizations; a shift in delivery from executive government toward the legislature and the broader public; and improved communication. The paper will now discuss these points in some detail.

Shortened time perspective

As already mentioned, the movement away from other forms of evaluation to the use of auditing as the means of evaluating programs has tended to shorten the time perspective used in assessing whether programs are successful. Most evaluations done through evaluation shops within agencies, or by think tanks, universities and consultants, tended to focus on achievement of the fundamental goals of programs rather than on measurable short-term outputs. The assessment of those basic goals within a program is difficult to perform within a short time frame.

Both of these forms of assessment have their virtues. The longer-term approach of most evaluations may uncover more fundamental reasons for program failures or inefficiencies, but it may do so only after a great deal of time and money has been spent on a less than efficient program, or after the program has become so entrenched that attempting to reverse or terminate it may be politically costly. The shorter-term perspective of auditing and performance management, by contrast, may uncover issues in a more timely manner but may also make premature judgments about the programs’ utility. The shorter-term assessments therefore may be more suitable for management issues than for fundamental issues about policy design. As with several of the issues raised here, there is a need to find some balance between forms of assessment, and perhaps also to engage in both forms of assessment of public programs.

Stronger emphasis on the administration of policies

A basic difference between performance auditing and evaluation is that the former usually targets institutions while the latter tends to target programs. There are obviously close linkages between the organizational and program paradigms, but most agencies deliver multiple programs just as many programs are delivered jointly by several public bodies. This mixture of programs and organizations in governing means that auditing generates a different type of information than evaluation and is of different value to policy-makers.

Certainly, program evaluations reflect on organizational performance, but critical evaluations often point out flaws in program design rather than underperforming organizations. Thus, while there is a linkage between programs and organization in evaluation – just as an audit of an agency will look closely at the programs it delivers – evaluation and auditing draw on two basically different approaches. With the expansion of auditing, to some extent at the expense of evaluation, we today know more about the performance of government organizations and the administration of policies but less about program design and performance (see Chelimsky, 1985, 1996). And, since policy-making to a large extent is a matter of design and the translation of policy into programs, contemporary auditing would appear to have less to offer compared to conventional evaluation.

Another problem relates to the linkages between auditing and evaluation and policy-making. SAIs, as mentioned, go to great lengths to defend their autonomy vis-à-vis executive government as their integrity is essential to professional and independent auditing (see, for instance, Lonsdale, 2008). They are thus operating at some distance from the policy process. Evaluation, on the other hand, is an integral part of the policy process and is better positioned to give advice on policy design.

Increased focus on efficiency of the audited entity

The principal emphasis of most traditional evaluation research was on the effectiveness of programs rather than on their efficiency. Most audit-based evaluations, on the other hand, tend to be at least as, if not more, concerned with the efficiency of provision of public services. This distinction mirrors to some extent the difference between cost-benefit analysis and cost-effectiveness analysis of public policies. While both of these attributes of programs are important for the public sector, there is still a question of emphasis (Leeuw, 1996), and perhaps a need to consider the differential relevance of each criterion for different types of programs.

The focus on efficiency not only emphasizes the economic effects of a program but also tends to assign less importance to unintended consequences of public action. As Vedung (2006) has pointed out, one decision that evaluators must make is how much weight to ascribe to effects of policies that lie outside the goals of the policy. Most auditing assessments of programs concentrate on the extent to which the stated goals of the policy are achieved and tend to ignore other outcomes. While perhaps understandable from an auditing perspective, this may weaken the overall contribution of evaluation to the effectiveness, and even efficiency, of the public sector.

At the extreme, as Leeuw (2009) argued, performance auditing becomes a means of reinforcing management decisions rather than an assessment of policy. In this use of performance auditing it is not a question of what programs work, but how management can make a program work better. Rather than being the arbiter of success and failure for programs, auditing becomes a means of tinkering with programs without asking more fundamental questions. That tinkering can be valuable but may not solve fundamental design flaws.

Greater independence from the evaluated organizations

One of the major virtues of auditing as a form of assessment of public programs is that the organizations involved tend to be independent of the evaluated programs or organizations. The relationship of the evaluator to the evaluated organization is a continuing issue in evaluation research. On the one hand, having an evaluator who works within the organization will give him or her greater access to information and there is a greater likelihood that the programs being implemented will be understood fully. On the other hand, an external evaluator has greater independence and objectivity and can also bring different ideas to bear on the policies involved. NPM appeared to have a general emphasis on autonomy, e.g. agencies, and the use of SAIs is one more example of the commitment to organizational autonomy.

Thus, there is some difficulty in finding the proper balance between autonomy and impact for evaluation organizations (Lonsdale, 2008). There is also some difficulty in finding a balance between using evaluation or auditing as a tool for learning and as a tool for rewarding or punishing performance (Marier, 2013). The internal evaluation can be more conducive to learning and improving the performance of the program, while the external audit may be perceived as a basis of sanctions, even if that may not be the intent of the audit (Schwarz and Struhkamp, 2007).

Shift in receiver of information toward the legislature and/or the public

Some SAIs, for instance the New Zealand supreme auditor, very clearly reach out to the public, portraying themselves as acting in the public’s interest to ensure that New Zealanders’ tax money is spent wisely and appropriately. Other SAIs focus their work internally and view the legislature rather than the public as the more relevant client. As with several of the other strategic choices made by these organizations, this involves benefits and costs for the SAI depending upon which option is selected.

The more public role for the SAI is certainly in line with a democratic and participatory conception of government, but it also involves risks of misunderstanding and incomprehension among a public that is less expert in understanding public programs than are political and administrative elites. On the other hand, keeping all the information about the performance of public programs locked up within the public sector can create the impression of excessive secrecy and that government has something to hide.

Most SAIs adopt a mixed strategy in dealing with the public. While they continue to assume that their principal client is the legislature, or in some cases the executive, they do make their analyses open to the general public and the media. For example, those who wish can get a daily e-mail from the Government Accountability Office with a list of the day’s reports or receive information about audits and other recently published material from the Australian, New Zealand or Swedish SAIs on Twitter. While the average citizen does not sit waiting in eager anticipation for this message, many in the media do (Pierre and de Fine Licht, 2017).

Improved communication

Following from the above point, the use of performance auditing rather than evaluation as the principal means of assessing the work of public sector organizations may improve the communication about the work of holding government accountable. The work of evaluation organizations was often less visible to the public and often was expressed in theoretical terms and/or statistical analysis. Performance auditing, on the other hand, tends to communicate in relatively simple terms about percentages of targets achieved, and other readily comprehensible information.

The question for the public sector then becomes whether it is better, for the sake of public accountability, to have less complete information understood more completely, or vice versa. If, as this paper has been arguing, evaluation research provides more information about the underlying dynamics of a policy, then that more detailed understanding should be useful for experts but perhaps be of little use for general consumption. The rather simple information from performance auditing may not reach the causes of any policy problems but it can be better at raising public awareness of issues.

The paper has discussed these six changes resulting from a shift from evaluation to auditing as contributions, but they may also represent some losses. For example, while quick and clearly communicated assessments of performance are valuable, so too are in-depth evaluations that take more time and produce perhaps somewhat turgid results. We could provide the negative consequences for each of the positives mentioned above, but the fundamental point would not change. Selecting the mode of assessment for public programs and organizations requires judgment, and one size may well not fit all.

Concluding discussion

Against the backdrop of these changes in evaluation and in performance auditing, we can assess whether auditing is becoming “the new evaluation” and, if so, what the consequences of that development are. The title of this paper provides three questions: Is auditing the new evaluation? Can it be? Should it be? These questions are all relevant, but they are also difficult to answer in any definitive manner.

At the beginning of its utilization, performance auditing had a fairly narrow scope, focusing on the efficiency of individual entities within government. In some administrative cultures, SAIs would give recommendations to the entity and follow up some time later to see to what extent those recommendations were implemented. In other cases, the auditor would merely point out shortcomings to the entity and table its report with Parliament. Thus, performance auditing primarily was an exchange between auditor and auditee. The audit and the nature of the interaction between auditor and auditee were not geared to allow the auditor to promote reform, best practice or institutional learning.

Evaluation had, and where still practiced to any significant degree still has, a different approach to assessing programs (see Roberts and Pollitt, 1994). Evaluation has been to a large extent designed specifically to help improve policy and program design. Therefore, evaluation reports were targeted more at policy-making institutions than at the evaluated entity or program. If auditing focuses on administrative organizations, evaluation mainly looked at programs or policies and therefore focused somewhat more on policy-making institutions.

This pattern has changed considerably over the past couple of decades. Performance auditing has become an important component of public management while evaluation plays a less prominent role today than it did in the 1980s. So, in conclusion: auditing is to a large extent “the new evaluation” but while it helps boost the efficiency of public organizations it cannot, and does not seek to, inform policy or program design. In a way this is yet another illustration of a general trend in public administration: the primacy of efficiency objectives over procedural considerations or even policy design. As key agents of boosting public sector efficiency, SAIs in many jurisdictions have gained increased prominence in the public sector. This may come at the expense of policy advice by evaluators. And in the worst cases it may result in poor policies being delivered very efficiently.

Notes



It is, for instance, indicative that the cover of the 2004-5 annual report of the Victoria EPA in Australia featured the slogan “Come with us” alongside a picture of a carrot.


Coming out of that tradition of evaluation of Great Society programs, Head Start (a pre-school program primarily for disadvantaged children) is almost certainly the most evaluated program in human history.


The Washington Monthly once reported on two assessments of the same program. One, by a group of Harvard economists, found the program was a crashing failure. The other assessment, done by political scientists and sociologists from Berkeley, found the program to be a great success. The problem is, of course, that they were both correct.


Interestingly, with performance evaluations of public administrators and their programs now being done on an annual basis or even more frequently, politicians may actually have a longer time perspective than the bureaucrats.

References


Aaron, H. (1978), Politics and the Professors: The Great Society in Perspective, The Brookings Institution, Washington, DC.

Ahlbäck Öberg, S. (2001), “En konstitutionell revolution”, Politologen, Vol. 25, Spring, pp. 25-32.

Ahonen, P. (2015), “Aspects of the institutionalization of evaluation in Finland: basic, agency, process and change”, Evaluation, Vol. 21 No. 3, pp. 308-324.

Ahonen, P. (2016), “Performance audits by the national audit office of Finland: computational analysis of topical structures”, paper presented at Meeting of European Evaluation Society, Maastricht.

Barberis, P. (1998), “The new public management and a new accountability”, Public Administration, Vol. 76 No. 3, pp. 451-470.

Barzelay, M. (1997), “Central audit institutions and performance auditing: a comparative analysis of organizational strategies in the OECD”, Governance, Vol. 10, pp. 235-260.

Bianchi, C. and Peters, B.G. (2017), “Dynamic modeling and policy coordination”, in Borgonovi, E., Anessi-Pessina, E.E. and Bianchi, C. (Eds), Outcome-Based Performance Management in the Public Sector, Springer, Heidelberg, pp. 143-159.

Blalock, A.B. (1999), “Evaluation research and the performance management movement: from estrangement to useful integration”, Evaluation, Vol. 5 No. 2, pp. 117-139.

Bouckaert, G. and Halligan, J. (2008), Managing Performance: International Comparisons, Routledge, Abingdon.

Bouckaert, G. and Peters, B.G. (2002), “Performance measurement and management: the Achilles’ heel of administrative modernization”, Public Performance & Management Review, Vol. 25 No. 4, pp. 359-362.

Bringselius, L. (2013), Organisera Oberoende Granskning: Riksrevisionens Första tio år, Studentlitteratur, Lund.

Bringselius, L. and Ahlbäck Öberg, S. (2017), “Forskning om den statliga revisionen i Sverige”, in Bringselius, L. (Ed.), Den statliga Revisionen i Norden; Forskning, Praktik och Politik, Studentlitteratur, Lund, pp. 61-78.

Brunsson, N. and Jacobsson, B. (Eds) (2000), A World of Standards, Oxford University Press, Oxford.

Brusca, I., Caperchione, E., Cohen, S. and Manes-Rossi, F. (2018), “IPSAS, EPSAS and other challenges in European public sector accounting and auditing”, in Ongaro, E. and van Thiel, S. (Eds), The Palgrave Handbook of Public Administration and Management in Europe, Palgrave, London, pp. 165-186.

Challis, L. (1988), Joint Approaches to Social Policy: Rationality and Practice, Cambridge University Press, Cambridge.

Chelimsky, E. (1985), “Comparing and contrasting auditing and evaluation”, Evaluation Review, Vol. 9 No. 4, pp. 483-503.

Chelimsky, E. (1996), “Auditing and evaluation: whither the relationship?”, New Directions for Evaluation, No. 71, pp. 61-67.

Ferlie, E., Ashburner, L., Fitzgerald, L. and Pettigrew, A. (1996), New Public Management in Action, Oxford University Press, Oxford.

Heclo, H. and Wildavsky, A. (1974), The Private Government of Public Money, University of California Press, Berkeley, CA.

Hood, C. (1991), “A public management for all seasons?”, Public Administration, Vol. 69 No. 1, pp. 3-19.

Hood, C., James, O., Scott, C., Jones, G.W. and Travers, T. (1999), Regulation Inside Government: Waste Watchers, Quality Police and Sleaze Busters, Oxford University Press, Oxford.

Innes, J.E. (1989), “Disappointments and legacies of social indicators”, Journal of Public Policy, Vol. 9 No. 4, pp. 429-433.

Jantz, B., Reichborn-Kjennerud, K. and Vrangbaek, K. (2015), “Control and autonomy – the SAIs in Norway, Denmark, and Germany as Watchdogs in an NPM-Era?”, International Journal of Public Administration, Vol. 38 Nos 13-14, pp. 960-970.

Jeppesen, K.K., Carrington, T., Catasus, B., Johnsen, A., Reichborn-Kjennerud, K. and Vakkuri, J. (2017), “Strategic options of supreme audit institutions: the case of four Nordic countries”, Financial Accountability and Management, Vol. 33 No. 2, pp. 146-170.

Leeuw, F.L. (1996), “Auditing and evaluation: bridging a gap, worlds to meet?”, New Directions for Evaluation, No. 71, pp. 51-60.

Leeuw, F.L. (2009), “Evaluation: a booming industry but is it adding value?”, Evaluation Journal of Australasia, Vol. 9 No. 1, pp. 3-9.

Lonsdale, J. (2007), “Walking a tightrope? The changing role of state audit in accountability regimes in Europe”, in Bemelmans-Videc, M., Perrin, B. and Lonsdale, J. (Eds), Making Accountability Work: Dilemmas for Evaluation and Audit, Transaction Publishers, New Brunswick, pp. 85-104.

Lonsdale, J. (2008), “Balancing independence and responsiveness: a practitioner perspective on the relationships shaping performance audit”, Evaluation, Vol. 14 No. 2, pp. 227-248.

Lynn, L.E. (2005), “Public management: a concise history of the field”, in Ferlie, E., Lynn, L.E. and Pollitt, C. (Eds), The Oxford Handbook of Public Management, Oxford University Press, Oxford, pp. 27-50.

Marier, P. (2013), “Policy feedback and learning”, in Araral, E., Fritzen, S., Howlett, M., Ramesh, M. and Wu, X. (Eds), Routledge Handbook of Public Policy, Routledge, London, pp. 401-414.

Mayne, J. (2006), “Audit and evaluation in public management: challenges, reforms and different roles”, Canadian Journal of Program Evaluation, Vol. 21 No. 1, pp. 11-45.

Mörth, U. (Ed.) (2004), Soft Law in Governance and Regulation: An Interdisciplinary Analysis, Edward Elgar, Cheltenham.

Mosher, F.C. (1979), The GAO: The Quest for Accountability in American Government, Westview, Boulder, CO.

Painter, M. and Peters, B.G. (Eds) (2010), Tradition and Public Administration, Palgrave, Basingstoke.

Pattyn, V., van Voorst, S., Mastenbroek, E. and Dunlop, C.A. (2018), “Policy evaluation in Europe”, in Ongaro, E. and van Thiel, S. (Eds), The Palgrave Handbook of Public Administration and Management in Europe, Palgrave, London, pp. 577-594.

Peters, B.G. (2003), “Administrative traditions and the Anglo-American democracies”, in Halligan, J.A. (Ed.), Civil Service Systems in Anglo-American Countries, Edward Elgar, Cheltenham, pp. 10-26.

Pierre, J. and de Fine Licht, J. (2017), “How do supreme audit institutions manage their autonomy and impact? A comparative analysis”, Journal of European Public Policy, doi: 10.1080/13501763.2017.1408669.

Pollitt, C. (2003), “Performance audits in Western Europe: trends and choices”, Critical Perspectives on Accounting, Vol. 14 Nos 1-2, pp. 157-170.

Pollitt, C. and Bouckaert, G. (2011), Public Management Reform: A Comparative Analysis, Oxford University Press, Oxford.

Pollitt, C. and Summa, H. (1996), “Performance audit and evaluation: similar tools, different relationships?”, New Directions for Evaluation, No. 71, pp. 29-50.

Pollitt, C., Girre, X., Lonsdale, J., Mul, R., Summa, H. and Waerness, M. (1999), Performance or Compliance?, Oxford University Press, Oxford.

Power, M. (1994), The Audit Explosion, Demos, London.

Power, M. (1999), The Audit Society: Rituals of Verification, Oxford University Press, Oxford.

Power, M. (2005), “The theory of the audit explosion”, in Ferlie, E., Lynn, L.E. and Pollitt, C. (Eds), The Oxford Handbook of Public Management, Oxford University Press, Oxford, pp. 326-346.

Roberts, S. and Pollitt, C. (1994), “Audit or evaluation: a national audit office VFM study”, Public Administration, Vol. 72 No. 4, pp. 527-549.

Romzek, B.S. (2000), “Dynamics of public sector accountability in an age of reform”, International Review of Administrative Sciences, Vol. 66 No. 1, pp. 21-44.

Salamon, L.M. (1979), “The time dimension in policy evaluation: the case of new deal land reform”, Public Policy, Vol. 27 No. 1, pp. 129-183.

Schön, D.A. (1983), The Reflective Practitioner: How Professionals Think in Action, Basic Books, New York, NY.

Schwarz, C. and Struhkamp, G. (2007), “Does evaluation build or destroy trust?”, Evaluation, Vol. 13 No. 3, pp. 323-339.

Schwarz, R. and Mayne, J. (Eds) (2005), Quality Matters: Seeking Confidence in Evaluation, Auditing and Performance Reporting, Transaction Publishers, New Brunswick.

Triantafillou, P. (2017), “Playing a zero-sum game? The pursuit of independence and relevance in performance auditing”, Public Administration, doi: 10.1111/padm.12377.

US Government Accountability Office (2013), Program Evaluation: Strategies to Facilitate Agency’s Use of Evaluation, US Government Accountability Office, Washington, DC.

Vedung, E. (2006), “Evaluation research”, in Peters, B.G. and Pierre, J. (Eds), Handbook of Public Policy, Sage, London, pp. 397-416.

Vedung, E. (2013), “Six models of evaluation”, in Araral, E., Fritzen, S., Howlett, M., Ramesh, M. and Wu, X. (Eds), Routledge Handbook of Public Policy, Routledge, London, pp. 387-400.

Corresponding author

Jon Pierre can be contacted at: