Auditing and compliance in public procurement – an empirical assessment

Purpose – This study aims to empirically evaluate the effectiveness of government auditing of local authorities' compliance with the procurement rules. Design/methodology/approach – A diff-in-diff approach is used where the measure of compliance is (changes in) the incidence of private litigation under the Public Procurement Act, in audited vs non-audited municipalities. Further, semi-structured interviews were conducted with chief procurement officials. Findings – No statistically significant effect is found. While strong effects of audits can be ruled out, the statistical results and the interviews do not, however, contradict a modest but long-lasting effect. Originality/value – Few studies have addressed the effect of public procurement auditing on compliance. This study develops an empirical framework and presents empirical results.


Introduction
It is notoriously difficult to measure the impact of audits on behaviour [1]. In fact, following Michael Power's 2000 classic The audit society: Rituals of verification, a whole sub-literature argues that audits do not live up to expectations, are often ineffective and often have unintended dysfunctional consequences. In this vein, two recent empirical studies report distortionary effects of audits and transparency on public procurements. Gerardino et al. (2017) report that more intense audits of Chilean public procurements reduced the use of competitive auctions, while Duguay et al. (2020) find that when auditing possibilities increased in the European Union, the procuring entities more often opted for lowest-price open-auction formats, hence seemingly intensifying competition but, according to the study, at the cost of less efficient provider selection. Although the results are somewhat contradictory, both sets of authors argue that more intense auditing had adverse side-effects stemming from the auditees trying to reduce the risks of legal action following the auditing process [2].
The present study focuses on the effect of audits on conduct, rather than on outcomes such as price and quality. That auditing and monitoring in general can distort incentives when not all quality dimensions and types of effort can be effectively observed has long been recognized in theoretical research (see Holmstrom and Milgrom, 1991, for the seminal article on multitasking and monitoring). Related research has shown that limiting the procurer's discretion and making the contract-award mechanism more transparent and predictable may have adverse consequences for contract execution (Spulber, 1990; Manelli and Vincent, 1995). Cameron (2000) provides empirical evidence that supports these concerns; she finds that although rigid bid evaluation lowers prices by 18% (for a common type of contract), it also increases contract breaches by 50%. Coviello et al. (2022) find similar results for a sample of Italian public-works procurements. Reduced buyer discretion is associated with longer total delivery times, longer delays and larger cost overruns, with inconclusive effects on rebates relative to the reserve price [3]. Some qualitative empirical studies of public procurement (Kelman, 1990) have come to similar conclusions.
However, some recent work points in the other direction, suggesting that reduced buyer discretion may have beneficial effects. For example, Baltrunaite et al. (2021) use a diff-in-diff approach to study the consequences of an Italian reform that raised the threshold under which a negotiated procedure could be used. They find that more discretion resulted both in higher costs and in more frequent contracting with politically connected firms. Szucs (2017) uses similar methods to study the consequences of a Hungarian reform that raised the thresholds for "invitational" procedures, with similar findings. Lalive et al. (2017), in a study of train services in Germany, find that competitive tendering results in better outcomes than direct contracting.
The few existing quantitative empirical studies of audits of public procurements have a positive view of the effects of government oversight, increased transparency and more openly competitive procedures. For example, Olken (2007) reports that road construction costs in Indonesia were substantially and significantly lower when costs were audited with a high probability, while Zamboni and Litschig (2018) report similar findings for a more general set of procurements in Brazil. Di Tella and Schargrodsky (2003) find that increased monitoring reduced prices of hospital supplies in Argentina by 15%. Lewis-Faupel et al. (2016) find that e-procurement lowered prices and/or improved quality in Indian and Indonesian procurements (see Fazekas and Blum (2021) for a survey of the literature). The context of the empirical literature is often one where corruption is a key concern; the authors conclude that previous work mainly supports the theoretical predictions of standard law-and-economics models, that more intense auditing yields higher levels of compliance and, as a consequence, lower prices [4]. Bandiera et al. (2009) propose a new method to distinguish corruption (or "active waste") from sheer inefficiency ("passive waste"). Their findings, based on Italian public procurements, suggest that waste because of inefficiency is far more important than waste because of corruption. Consequently, this adds support for the alternative view, that limiting the public officials' discretion may be the wrong recipe for efficient procurement. The theoretical and empirical literature discussed above takes its point of departure to be reduced buyer discretion, rather than more intense auditing, but it is likely that more intense auditing will effectively result in reduced buyer discretion, for example through more frequent use of lowest-price bidding or other changes in procurement practices that make bid evaluation more rigid.
This study adds to the still rather small empirical literature by looking more in detail at the effects of auditing on the audited entities. Rather than looking at the average effect of auditing on outcome variables, such as transaction prices or variables correlated with rule following, such as lowest-price auctions, it seeks to study the effect on each organization's tendency to comply (or not comply) with the rules. In this sense, the present study is similar to studies of the deterrence effect of (criminal) punishment on illicit behaviour [5].
Specifically, I study whether procuring authorities that are prosecuted by the Swedish public-procurement "watchdog" for violating the Public Procurement Act are subsequently less likely to face private litigation for non-compliance with the same Act. Because of data limitations, i.e. as there is no direct measurement of compliance, the probability of private litigation is used as a proxy measure for compliance with the rules. The validity of this proxy is supported by a robust finding that authorities that face a high risk of private litigation also have a higher (per procurement) probability of a court ruling in favour of the plaintiff.
A general difficulty in studying the effect of being subject to auditing, prosecution or sentencing on individual behaviour is that selection effects are likely to be strong. Entities, whether individuals, firms or authorities, that are subject to audits, prosecution or sentencing are likely to differ significantly from entities that are not. Hence, a naïve regression analysis of whether audited authorities are more or less likely to break rules in the future may result in seemingly strong evidence in support of the hypothesis that audits increase the likelihood of rule breaking. I seek to overcome this problem by using a difference-in-differences (diff-in-diff) approach. That is, I study whether the behaviour of authorities that are audited by the watchdog changes, relative to the behaviour of a comparison group consisting of authorities that are not.
The present study does not find support for the proposition that legal action by the public-procurement watchdog results in procurement procedures more in line with the Public Procurement Act. However, this may be because the statistical power of the regression analysis is relatively weak, since the number of observations is relatively small. A small interview study with chief procurement officers suggests that public procurement audits may, after all, have an impact on the authorities' behaviour.
2. Public procurement in Sweden and procurement regulatory oversight
As in most developed countries, the aggregate value of public procurements in Sweden amounts to a large fraction of GDP, around 18%. This is similar to neighbouring countries in the Nordic region and more generally in north-western Europe [6].
Public procurement in Sweden is regulated by the Public Procurement Act, which was enacted so as to fulfil Sweden's obligation, as a member of the European Union, to implement the EU Public Procurement directives. According to Swedish regulation 2007:1117, the Swedish Competition Authority (SCA) is responsible for regulatory supervision of public procurement. Its oversight can result in cases that can be either mandatory or discretionary, depending on how the rules were violated and other legal circumstances. If a violation of the rules is established, the procuring authority will be fined with up to SEK 10mn. However, the SCA does not have the authority to take the formal fining decision. It effectively has the role of a government prosecutor that brings cases before the city court.
As discussed below, (potential) providers can also seek judicial review in the same court. If the court finds that the rules have been violated, possible consequences are that the procurement must be corrected or redone or that a contract that was the result of the procurement is rendered void. The court may also find that the rules have been violated but that the contract should not be rendered void; such a finding will trigger a mandatory case [7]. The city courts' rulings can be appealed to the relevant Court of Appeals and occasionally, if trial permit is granted, to the Supreme Administrative Court.
During the 2011-2019 period, the authority sought fines in at least 186 cases. On three occasions, it sought the maximum amount, SEK 10mn. The authority may also formally criticize a procuring entity without seeking fines. During the same period, this happened in at least 47 cases. Almost 60% of the cases were directed against municipalities or municipality-owned incorporated companies. Among cases where fines were sought, more than 50% concerned municipalities and their companies. The average fine sought in discretionary cases was close to SEK 1.1mn, almost four times higher than the average fine sought in mandatory cases [8]. The much lower fine level for the latter category is likely because of the SCA finding these rules violations to be less offensive, given that a court has established that the contract should not be rendered void.
The number of cases can be set against the total number of advertised public procurements and can be compared with the number of procurements for which one or more private parties request a judicial review ("private litigation" or just "litigation"). In recent years, about 18,500 procurements have been advertised annually, and for about 7% of those, about 1,300 cases per year, at least one private party has requested a judicial review. In a study covering an earlier period, the National Agency for Public Procurement (NAPP) (NAPP, 2017) reports that reviews were sought for between about 1,300 and 1,600 cases annually, corresponding to about 8% [9]. According to Swedish Competition Authority (2013), the relative incidence of requests for review is high compared to other EU member states [10].
A government inquiry found that in about one third of the requests for review, the court's findings are favourable for the applicant private party [11]. In contrast, NAPP (2017) found that rulings are favourable for the applicant in only about 20% of these cases. The difference is likely because of how the outcomes are classified. These alternative estimates of court-established wrongdoing correspond to between 1.5% and 2.5% of all public procurements.
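The range quoted above can be reproduced with simple arithmetic. The sketch below is an illustration only: it assumes that each success rate is paired with the review incidence from the same discussion (one third with roughly 7%, 20% with roughly 8%), a pairing that the sources do not state explicitly. The function name is hypothetical.

```python
# Figures quoted in the text; how they are paired is an illustrative assumption.
advertised_per_year = 18_500
reviewed_per_year = 1_300

review_share = reviewed_per_year / advertised_per_year  # roughly 0.07


def wrongdoing_share(share_reviewed, applicant_success_rate):
    """Share of all procurements in which a court sides with the applicant."""
    return share_reviewed * applicant_success_rate


high = wrongdoing_share(review_share, 1 / 3)  # government inquiry: about one third succeed
low = wrongdoing_share(0.08, 0.20)            # NAPP (2017): about 8% reviewed, 20% succeed

print(f"review share: {review_share:.1%}")
print(f"court-established wrongdoing: {low:.1%} to {high:.1%}")
```

Under these assumptions both estimates fall inside the 1.5% to 2.5% interval mentioned in the text.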
Meanwhile, about 0.2% of all public procurements made by municipalities, including municipal-owned companies, are likely to be subject to regulatory supervision by the SCA. Even if the relatively high success rate of the SCA and the low success rate in private litigation are taken into account, this still means that a municipality is much more likely to be compelled to take corrective action in a procurement because of private litigation than to have to pay fines following regulatory supervision by the SCA.

Methods
In clinical studies, it is often clear what effect measures we are genuinely interested in. Examples include survival and absence of disease; such measures are known as clinical endpoints. However, as these effects may only be measurable after many years or even decades, it is common to use so-called surrogate endpoints that are measurable after a much shorter period. Examples include cholesterol levels in the blood and shrinking cancer size; measures that in and of themselves have no or little value, but that are believed to be causally related to the clinical endpoints.
When studying the effectiveness of audits in general and procurement audits in particular, the causal chain is much less well understood. The studies cited in the introduction mainly focused on what can be compared to a clinical endpoint: costs. As mentioned, studies of public procurement often find that more intense auditing is associated with lower costs, the implicit assumption being that the link between the "treatment" (audits) and the improved outcome is a higher level of compliance, while the broader literature on auditing presents a more complex picture.
However, a number of studies have focused on indications of distorted incentives in public procurement because of the more intense monitoring. For example, some of the studies cited above found that more auditing or increased transparency (that facilitates monitoring) resulted in more frequent use of lowest-price procurement and made contract breaches more common. Palguta and Pertold (2017) find that procuring authorities bunch procurements below thresholds, likely to avoid the more intense monitoring of high-value procurements and hence to maintain a higher degree of discretion [12].
To summarize, the literature suggests that more intense auditing should increase compliance, but direct evidence of this is largely missing. On the one hand, price reductions following more intense auditing or increased transparency have been interpreted as indications of compliance. On the other hand, contract breaches appear to become more common under more intense auditing, when transparency increases and when the procurers' discretion is limited. Procuring authorities try to protect themselves from the fallout of more intense scrutiny by awarding contracts on the basis of lowest price, or else try to avoid monitoring by downsizing the procurements.
Graphically, the chain of causation can be illustrated as in Figure 1. More intense auditing can result in good-faith compliance, e.g. less corruption, but it can also result in "evasive" compliance. Examples of the latter, documented in the literature, include more frequent use of lowest-price criteria and downsizing of contracts to avoid monitoring. The resulting changes in procurement practices will, in turn, affect outcomes (or endpoints). Several studies have found that prices have fallen as a result of more auditing, but there are also indications of lower quality, for example more frequent contract breaches.
In the present study, absence of adverse legal findings is seen either as a proxy measure for compliance or as a surrogate endpoint. The latter viewpoint, emphasized in the figure, is relevant under the assumption that the procurement rules are well-designed and that, consequently, compliance will result in efficient procurement. The alternative view, that absence of legal findings is a proxy for compliance, is more relevant in a critical perspective that does not take for granted that rules are well-designed. Compliance is difficult to observe directly and previous research has instead emphasized indications of evasive actions, such as bunching below thresholds.
The reason for studying the relation between audits and compliance is not that it is inherently more interesting than the relation between audits and endpoints, rather the opposite. The motivation is instead to further our understanding of the effects of audits on compliance. Since compliance can have both positive and negative direct effects on endpoints (or outcomes), it is of independent value to study the relation between audits and compliance as directly as possible, besides studying the relation between audits and outcomes (endpoints). This is especially so since audits can cause both compliance in good faith and evasive action. Hence, the present study focuses on the relation between audits (or regulatory supervision) and compliance, with absence of adverse legal findings used as a proxy measure for compliance.
A naïve analysis would proceed by studying the statistical relation between audits and compliance (or absence of adverse legal findings). As mentioned in the introduction, however, such a research design would be flawed, since it may well be that audits are targeted at authorities that are known or believed to exhibit a low degree of compliance. Hence, the researcher could effectively be studying the inverse relation, from noncompliance to audits.
A well-known approach to address this sample-selection problem is diff-in-diff regression analysis [13]. Here, I use measures of compliance of individual municipalities during two different time periods to obtain changes over time ("differences"). Next, differences for a group of municipalities that have been subject to regulatory supervision or "audits" (the "treatment group") are compared to differences for a group of municipalities that have not (the "control group"); this comparison of differences constitutes the "difference-in-differences".
Three further complications must be mentioned. Firstly, the most direct (although inverse) measure of compliance, the share of a municipality's procurements for which a court finds non-compliance with the Public Procurement Act, is available for one year only [14]. However, the share of the municipalities' procurements for which at least one request for review (below sometimes referred to as "private litigation") has been submitted to court is available for several years. If it can be established that the latter share is correlated with the former, then the incidence of private litigation can be used as an inverse proxy measure for compliance with the procurement rules. As will be shown below, the statistical relation between private litigation and court findings of non-compliance is positive and statistically significant.
Secondly, while there exists public data on the number of requests for review per municipality and year, data on the number of procurements per municipality per year is available only since 2021. Thirdly, a distinction can be made between general preventive effects and individual preventive effects. Audits of one municipality are likely to have preventive effects both at the individual level and at the general level. If the general preventive effect completely dominates, the diff-in-diff approach will be of no avail, since the control group will be as much affected as the treatment group. If the general effect is weaker than the individual effect, a diff-in-diff analysis will underestimate the latter, since the control group will to some extent be affected. These complications will be further discussed below.
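The third complication can be made concrete with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration; the function name is likewise an assumption. The example shows that a general preventive effect that also reaches the control group shrinks the diff-in-diff estimate towards zero, even though the individual effect on audited municipalities is unchanged.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical review shares in percent; none of these numbers come from the study.
common_trend = 1.0        # change that affects every municipality
individual_effect = -2.0  # effect of an audit on the audited municipality
general_effect = -0.5     # spillover of audits onto non-audited municipalities

treated_pre, control_pre = 9.0, 7.0  # levels may differ; only changes matter
treated_post = treated_pre + common_trend + individual_effect
control_post_no_spill = control_pre + common_trend
control_post_spill = control_pre + common_trend + general_effect

no_spill = diff_in_diff(treated_pre, treated_post, control_pre, control_post_no_spill)
with_spill = diff_in_diff(treated_pre, treated_post, control_pre, control_post_spill)
print(no_spill, with_spill)  # -2.0 -1.5
```

Without spillover the estimate equals the individual effect; with spillover it is attenuated by the size of the general effect.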

Empirical method and data
4.1. Econometric model
The canonical diff-in-diff research design can be explained with reference to the following equation:

$$r_{it} = \beta_0 + \beta_1 T_{it} + \beta_2 S_{it} + \beta_3 T_{it} S_{it} + \beta_4 X_{it} + \varepsilon_{it}$$

where $r_{it}$ is an observation of the dependent variable in period $t$ for individual (here municipality) $i$, $\beta_0$ is an intercept, $T_{it}$ is a dummy variable representing time, with $T_{it} = 1$ in the treatment period, and $S_{it}$ is a dummy variable identifying the treatment group, with $S_{it} = 1$ for individuals (municipalities) that belong to that group. The key parameter of interest is $\beta_3$, which captures the effect of the treatment on the treated, while $\beta_1$ is the average effect of time for both groups and $\beta_2$ is the average effect of being in the treatment group (except for the effect of the treatment). Finally, $X_{it}$ represents other time-varying variables, $\beta_4$ the associated parameters and $\varepsilon_{it}$ the individual error terms.

JOPP
A common implementation is fixed-effect estimation. However, as the data in the present study only covers two periods (see below), a simple and numerically identical estimation method is the first-difference estimator [15]:

$$\Delta r_i = \gamma_1 + \gamma_3 S_i + \gamma_4 \Delta X_i + \Delta\varepsilon_i$$

where $\Delta$ denotes the change between the two periods. Here, the parameter of interest is $\gamma_3$, which captures the treatment effect on the treated. The empirical application uses the share of audited procurements and the fines per inhabitant, per municipality, rather than a single indicator variable, as implied by the above equation. Hence, the empirical specification is:

$$\Delta r_i = \gamma_1 + \gamma_3 A_i + \gamma_4 \Delta X_i + \Delta\varepsilon_i$$

where $A_i$ represents one or more variables that measure the auditing intensity.

The identifying assumptions are that the control group is not affected by the treatment, i.e. that municipalities are not affected by audits of other municipalities, and that in the absence of treatment, the treatment group and the comparison group have similar trends. As mentioned, an audit may become known to and affect other municipalities, causing a downward bias of the estimated effect. If this spillover is sufficiently strong, it will not be possible to detect an effect of the audit. Similarly, if the common-trends assumption is violated, the estimates will be biased.

Parallel trends prior to the treatment, for the control group and the treatment group, are typically taken as evidence in support of the second of the identifying assumptions. Here, however, only two periods are available for analysis. Hence, the common-trends assumption must be addressed more indirectly, by looking at variables that may have a causal relation with the incidence of public procurement and with rules compliance. The analysis presented in the Appendix indicates parallel trends when it comes to the left-right orientation of the governing coalition throughout the 2002-2022 period and at least for the period up until 2016 for other variables such as municipal costs and outsourcing share.
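The claim that first-difference and fixed-effect estimation coincide with two periods can be checked numerically. The following is a minimal sketch on a hypothetical five-municipality panel, with no control variables and with a continuous audit-intensity measure in place of the treatment dummy; the data and the helper `ols_two_regressors` are assumptions introduced here and are not taken from the study.

```python
def ols_two_regressors(y, x1, x2):
    """OLS without intercept on two regressors, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical two-period panel; none of these numbers come from the study.
audit = [0.0, 0.0, 1.5, 3.0, 0.5]    # audit intensity A_i
r_pre = [6.0, 8.0, 9.0, 12.0, 7.0]   # review share, first period
r_post = [7.5, 9.0, 9.5, 11.0, 8.0]  # review share, second period

# First-difference estimator: regress the change on a constant and A_i.
delta_r = [post - pre for pre, post in zip(r_pre, r_post)]
fd_const, fd_audit = ols_two_regressors(delta_r, [1.0] * len(audit), audit)

# Fixed-effect (within) estimator: demean each variable by municipality, then
# regress on the demeaned period dummy and the demeaned dummy-times-A_i term.
y_dm, t_dm, ta_dm = [], [], []
for a, pre, post in zip(audit, r_pre, r_post):
    r_mean = (pre + post) / 2
    for period, r in ((0, pre), (1, post)):
        y_dm.append(r - r_mean)
        t_dm.append(period - 0.5)           # period dummy minus its unit mean
        ta_dm.append(period * a - 0.5 * a)  # interaction minus its unit mean
fe_time, fe_audit = ols_two_regressors(y_dm, t_dm, ta_dm)

print(fd_audit, fe_audit)
print(abs(fd_audit - fe_audit) < 1e-9, abs(fd_const - fe_time) < 1e-9)
```

The within transformation turns each municipality's two observations into plus and minus half of its first difference, which is why the two sets of coefficients agree up to floating-point error.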

Data
The dependent variable measures the relative incidence of requests for review of public procurements (or "private litigation"). Ideally, the variable should have been the fraction of the municipality's procurements for which there is such a request. However, as the number of procurements is only available as of 2021, the dependent variable is defined as the ratio of actual requests for review in 2015-2016 or 2020 and the number of procurements in 2021, per municipality. Hence, the maintained assumption is that the number of procurements per municipality has been stable between 2015 and 2021. At the national level, this is true. The total number of advertised procurements has varied between 18,400 and 18,600 during the seven-year period, except in 2020, when the number fell to just under 18,000.
Two data sets were obtained from the NAPP: firstly, a random sample of 1,000 requests for review from the years 2015 and 2016 and, secondly, all requests for review for 2020 [16]. The first of these data sets was compiled for and analysed in NAPP (2017). Specifically, the study coded the outcome of the judicial review for each case. As mentioned, the applicant was considered to be successful in about 20% of the cases. In the present study, successful requests for review are assumed to indicate non-compliance with the procurement rules.
If a similar study were available for 2020, a possible measure of compliance ($r_{it}$ in the above equations) would be the share of a municipality's procurements for which there was a successful request for review. Since this is not the case, an alternative measure must be found. The chosen alternative is the share of procurements for which there is a review. It is plausible that private parties (providers) are more likely to request a review (or litigate) when there are rules violations. This assumption is supported if the share of reviewed procurements with successful requests for review is positively correlated with the share of procurements for which there is a review.
In the 2015-2016 sample, 156 out of a total of 290 Swedish municipalities are observed. The share of judicial reviews where the applicant is successful was calculated for these municipalities, and this share was modelled as a function of the number of procurements and, critically, the share of procurements for which there was a request for review, whether successful or not. The regression model and the estimated parameters were as follows:

$$\text{Applicant success rate} = \underset{(2.78)}{0.16} - \underset{(-1.03)}{0.001} \cdot \text{No. of procurements} + \underset{(2.11)}{0.99} \cdot \text{Review share}$$

The t-values are in parentheses. According to the estimation, when the share of procurements for which there is a request for review rises by one percentage point, the success rate for the applicant also rises by about one percentage point, suggesting that the review share can, in fact, be used as a proxy for (non-)compliance with the procurement rules.
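To illustrate how the estimated relation translates into fitted values, the sketch below plugs hypothetical inputs into the reported coefficients. It assumes that both the success rate and the review share are measured as fractions (so that 0.08 means 8%); the text does not state the units, and the function name and the example municipality are introduced here for illustration only.

```python
def predicted_success_rate(n_procurements, review_share):
    """Fitted applicant success rate from the coefficients reported in the text.

    Both shares are treated as fractions (0.08 means 8%); this is an assumption.
    """
    return 0.16 - 0.001 * n_procurements + 0.99 * review_share


# Hypothetical municipality: 45 procurements, 8% of them subject to a review request.
base = predicted_success_rate(45, 0.08)
higher = predicted_success_rate(45, 0.09)  # review share one percentage point higher

print(f"fitted success rate: {base:.3f}")
print(f"change for +1 percentage point review share: {higher - base:.4f}")
```

Under this reading, a one-percentage-point rise in the review share raises the fitted success rate by 0.99 percentage points, in line with the interpretation given in the text.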
The number of procurements per municipality is highly correlated with the number of inhabitants and is likely to be fairly stable between years, at least for relatively large municipalities. Figure 2 plots the number of procurements against population and presents the estimated log-linear relation between the two. The estimated model explains 45% of the variation in the data [17].

Turning to the main analysis, the first step in obtaining the dependent variable is to calculate the relative incidence of public procurement reviews, defined as the number of public procurement reviews divided by the number of public procurements in 2021 for the same municipality. The ratio is calculated for reviews in 2015-2016 and 2020. The dependent variable is then calculated as the difference between the two ratios.
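The construction of the dependent variable can be sketched as follows. The function name, the choice to express the result in percentage points and the example numbers are assumptions made for illustration; the only elements taken from the text are that both review counts are divided by the 2021 procurement count and that the two ratios are then differenced.

```python
def diff_share_reviews(reviews_2015_16, reviews_2020, procurements_2021):
    """Change in the review share, in percentage points.

    Returns None when the municipality reported no procurements in 2021,
    since the ratio is then undefined.
    """
    if procurements_2021 <= 0:
        return None
    share_early = reviews_2015_16 / procurements_2021
    share_late = reviews_2020 / procurements_2021
    return 100 * (share_late - share_early)


# Hypothetical municipality: 3 sampled reviews in 2015-2016, 4 reviews in 2020,
# 45 advertised procurements in 2021.
change = diff_share_reviews(3, 4, 45)
print(f"{change:.2f} percentage points")
```

Because the same denominator is used for both periods, the measure relies on the maintained assumption that the number of procurements per municipality was stable between 2015 and 2021.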
With few exceptions, only procurements made by the municipality itself are counted. In particular, procurements made by incorporated companies owned by the municipality are in most cases not counted, mainly because of the practical problem of linking companies to municipalities. The exceptions are, firstly, procurements made by corporations that have as their core business to procure for the municipality that owns the company and, secondly, procurements made by associations that two or more municipalities set up to procure on behalf of the members. The former category was allocated to the municipality that owns the corporation; the latter category was allocated to the dominant municipality within the association or split between two or three dominant municipalities proportionally to their population. In 2021, less than 4% of all procurements made by municipalities fell in either of these categories.
It is common that large municipalities procure on behalf of small neighbouring municipalities and sometimes a small municipality procures on behalf of one or more neighbours. For this reason, municipalities with fewer than 15,000 inhabitants are excluded from the analysis. Among the 160 municipalities that meet the population criterion, five reported no advertised procurement in 2021 and were excluded from the analysis presented in Figure 2. In addition, one relatively large municipality reported only a single procurement in 2021 [18].
Further investigation revealed four extensive cooperative arrangements in which one municipality managed all or most procurements on behalf of neighbouring municipalities [19]. These arrangements served a total of 24 municipalities, 13 of which had more than 15,000 inhabitants and accounted for about 6% of all municipal procurements. The six municipalities mentioned in the previous paragraph as well as five of the seven referred to in footnote 18 were found in this category.
In these cooperations, a large share of the procurements was registered with one of the cooperating municipalities, not necessarily the largest. Three of the cooperations were able to report the approximate number of procurements per municipality. For the fourth procurement cooperation, with five municipalities with more than 15,000 inhabitants, the number of procurements was imputed, using the parameters shown in Figure 2. By coincidence, the sum of the imputed values exactly matched the number of procurements made by these municipalities [20].
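The imputation step can be sketched as a log-linear regression of procurements on population followed by a prediction for the municipalities that lack a reliable count. The data below are hypothetical and lie exactly on a log-linear curve for ease of checking; the function names are assumptions, and the study's own parameter estimates are not reproduced here.

```python
import math


def fit_log_linear(population, procurements):
    """OLS of log(procurements) on log(population); returns (intercept, slope)."""
    x = [math.log(p) for p in population]
    y = [math.log(n) for n in procurements]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def impute_procurements(pop, intercept, slope):
    """Predicted number of procurements for a municipality of size `pop`."""
    return math.exp(intercept + slope * math.log(pop))


# Hypothetical municipalities; none of these numbers come from the study.
population = [20_000, 40_000, 80_000, 160_000]
procurements = [16, 24, 36, 54]

intercept, slope = fit_log_linear(population, procurements)
imputed = impute_procurements(60_000, intercept, slope)
print(f"elasticity: {slope:.3f}, imputed procurements for 60,000 inhabitants: {imputed:.1f}")
```

With real data the fit would not be exact, so imputed counts carry prediction error that the subsequent analysis inherits.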
Of the cases audited by the SCA, about 100 concerned the 160 largest municipalities, including cases against municipal companies. The cases were initiated between 2011 and 2019 and first reported by the SCA in summarizing reports between 2015 and 2020. Of the cases, 20 are from the 2011-2013 period. The first three reports, those from 2015-2017, reported cases that on average had been initiated two years earlier. For the last three reports, the corresponding average lag was one year.
The SCA has had the authority to audit and prosecute violations of the Public Procurement Act since 2010. In this study, it is assumed that the audits' impact on the municipalities came between 2015 and 2020. It is reasonable to assume that most municipalities did not begin reorganizing and enhancing capacities immediately after the first audits. Rather, it is likely that they reacted with a delay. In addition, the court proceedings that follow the audits take time and can be appealed, and at the start it may not have been clear how successful the authority would be. To the extent that some municipalities had already reacted to the audits by 2015 or, alternatively, that late audits had not had their full impact in 2020, the statistical analysis may underestimate the effect of the audits, as discussed further below.
Descriptive statistics for the 160 municipalities and their procurements, procurement audits and court reviews are reported in Table 1. For the three variables affected by adjustments to the procurement count, as described above, descriptives are reported both before and after adjustment.
The 160 municipalities together account for almost 90% of the Swedish population and an almost equal percentage of all municipalities' advertised procurements in 2021 [21]. The average population size was just under 60,000 and the average municipality made 45 procurements in 2021. Over the entire 2011-2019 period, the SCA audited about 0.6 procurements per municipality and sought fines that averaged almost SEK 400,000 per municipality or SEK 6.25 per inhabitant [22].
In 2020, on average 3.7 procurements per municipality came under court review, corresponding to about 8.5% in the sample (figure not reported in the table). The key dependent variable is the change in the share of court-reviewed procurements. The average change (after adjustment for cooperative procurement arrangements) between 2015/2016 and 2020 was 1.19 percentage points [23].
Since the numbers for 2015-2016 are drawn from a sample that corresponds to about 75% of one year's procurement, the number of court-reviewed procurements per municipality per year is likely to be higher than those reported in the Notes: For 2015-2016, (court) reviews were reported only for a sample with a size corresponding to about 75% of one year's procurement. Diff share reviews was calculated by first dividing No. of reviews by No. of procurements and then taking the difference between the two fractions. No. of audits counts audits for the entire 2011-2019 period; Share audit is defined as the ratio of this variable to No. of procurements even though the latter variable only counts procurements for one year. Note that the unweighted averages reported in the table differ from the weighted averages. For example, the weighted share of audited procurements is 1.34%, as against the unweighted average of 1.71. Adjustments to the count of procurements were made for four cooperative arrangements among neighbouring municipalities as well as for a cooperation between one region and one municipality. See main text for further explanations Source: Statistics Sweden, (Swedish) National Agency for Public Procurement, Swedish Competition Authority (SCA), data from reports from the SCA compiled by the author gives the false impression that the share of procurements subject to court review has increased. However, since it is a random sample, it is still possible to analyse if the difference in review share between the two periods correlate with the SCA's audit intensity, as measured by the audit share and the fines per capita sought by the authority. Finally, Share audits was calculated by dividing No. of audits with No of procurements. 
Since the former counts audits over a nine-year period, although 80% of the cases were initiated during the last six years, while the latter counts procurements during only one year, the risk that a given municipality will be subject to an audit by the SCA when it makes a procurement is less than a quarter of a percent, rather than the 1.74% reported in the table [25]. The audit count includes cases against municipal companies, as it is assumed that procurement officials are relatively well-informed about such events within their own municipality.
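The annualization behind the "less than a quarter of a percent" figure can be checked with a short calculation. The sketch below is illustrative only: the function name is an assumption, and it assumes that procurement volumes are roughly constant across years, which the text does not state explicitly.

```python
def annual_audit_risk(share_audits_pct, years):
    """Spread an audit share accumulated over several years evenly across those years."""
    return share_audits_pct / years

share_audits_pct = 1.74  # unweighted Share audits quoted in the text, in percent

# Even spread over the full nine-year audit period (2011-2019).
risk_nine_years = annual_audit_risk(share_audits_pct, 9)

# Alternative reading: 80% of the cases were initiated during the last six years.
risk_last_six = annual_audit_risk(0.80 * share_audits_pct, 6)

print(f"{risk_nine_years:.3f}% per procurement (nine-year average)")
print(f"{risk_last_six:.3f}% per procurement (last six years)")
```

Under either reading the per-procurement risk stays below 0.25%, in line with the statement in the text.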

Results
The preferred dependent variable, Diff share reviews in Table 1, is the change in the share of procurements for which a court review was sought between the two periods of observation, in percent, per municipality. In Model 1A, the explanatory variables are, firstly, a measure of the share of procurements made by the municipality that were audited by the SCA (Share audits) and, secondly, the fines sought by the SCA for violations of the procurement rules, per inhabitant (Fines per 1,000 inhab). As can be seen in Table 2, neither of the corresponding parameters is statistically significant.
Model 2A adds the natural logarithm of the population to the regression. Models 1B and 2B are estimated with the 13 observations with municipalities with extensive cooperative arrangements excluded. These modifications have only small effects on key parameters. [26] The 95% confidence interval around the point estimate for the share audited by the SCA is [−0.59, 0.21] in Model 1A and very similar in the other models. That is, under the assumption that the statistical model is correctly specified, the hypothesis of no effect cannot be ruled out at any conventional statistical level, but it can also be said that each audit made by the SCA during the observational period resulted, with a high probability, in an average reduction in 2020 by less than 0.6 court reviews. The statistical analysis thus suggests that the effect of audits is, at most, relatively modest.
However, to assess the total effect, assumptions must be made as to how long-lasting the effect of an audit may be and how quickly it has an impact. Alternatively, a lag structure can be estimated. If the effect is permanent, we cannot necessarily say that the effect of an audit is small, as even the elimination of, e.g. a fifth or a quarter of a review every year (or one review every four to five years) for each additional audit may be considered substantial. To address this issue, the model was re-estimated with the audit intensity variable split into early (pre-2017) and late audits. As can be seen in Table 3, the parameters remain insignificant, but the point estimates for the variable representing early audits, mainly from the 2012-2015 period, are of similar magnitude to those reported in Table 2, while the point estimates for later audits are very close to zero. Taken at face value, using the boundaries of the 95% confidence interval and assuming that the effect lasts for ten years, each audit would deter between 0 and 5-6 court reviews (private litigations), while the point estimates correspond to about two court reviews being deterred.
The hypothesis that audits have no effect cannot be excluded, but again the results are also consistent with audits having no more than a modest but relatively long-lasting effect. In order for firmer conclusions to be drawn, larger samples or more direct measures of effect are needed.
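The ten-year bounds quoted above can be reproduced with simple arithmetic. The sketch below is a hedged illustration rather than the author's calculation: it uses the Model 1A interval [−0.59, 0.21] as a stand-in for the early-audit estimates (which the text describes as being of similar magnitude), assumes a symmetric interval so that the point estimate is its midpoint, and reads a positive coefficient as zero deterrence. The function name is hypothetical.

```python
def deterred_reviews(coef, years):
    """Court reviews deterred per audit if the yearly effect `coef` persists for `years`.

    A negative coefficient means fewer reviews; a positive one is read as no deterrence.
    """
    return max(0.0, -coef) * years

ci_low, ci_high = -0.59, 0.21    # 95% interval reported for Model 1A
point = (ci_low + ci_high) / 2   # midpoint of a symmetric interval (assumption)
years = 10                       # duration of the effect assumed in the text

lower = deterred_reviews(ci_high, years)   # least favourable bound
central = deterred_reviews(point, years)   # point estimate
upper = deterred_reviews(ci_low, years)    # most favourable bound

print(round(lower, 1), round(central, 1), round(upper, 1))
```

With these inputs the range runs from zero to just under six deterred reviews per audit, with a central value close to two, which matches the orders of magnitude given in the text.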
As a complement to the quantitative study, semi-structured interviews were conducted with chief procurement officials at municipalities that had experienced at least one audit by the SCA. The audited municipalities were stratified according to population into five equally large strata of about ten; from each stratum two municipalities were selected at random. Two of the municipalities elected not to participate. An interview guide with about 20 questions was used in interviews conducted via video link and lasting, in most cases, between 40 and 60 minutes.
Five of the respondents stated that the SCA's audits had not influenced the way the municipality organized procurement staff or their procurement routines; three of them made the statement with emphasis. However, a more nuanced picture emerged during the interviews, leaving the impression that the audits had in fact been influential in the majority of cases. Additional procurement staff had been hired, policies had been revised and procedural routines had been developed, at least to some extent in response to the audits.
The respondents who agreed that audits had, at least to some extent, influenced procurement routines etc. had varying opinions concerning the mechanisms through which the influence was felt. Suggestions included increased respect among politicians and managers for the procurement staff's competence and for the procurement rules, aversion to negative media attention, the possibility of being fined for rules violations and that the procurement specialists learned from the auditing process.
Several of the interviewees hinted at occasional or even ongoing rules transgressions within their own organization. One respondent suggested that the detection probability was sufficiently low and the consequences of detections sufficiently mild, that some decisionmakers continue to make rational calculations whether the rules should be followed or not.
Two respondents suggested that the SCA should audit procurements more often and not only procurements for which they have specific indications that the rules have been violated, as is the current practice. That is, it was suggested that procurements should be audited in a systematic and pro-active way, rather than reactively. Finally, the interviewees' knowledge of audits of their own municipality, their relative lack of knowledge of audits of other municipalities and their description of processes for their staff's skill development provided support for the proposition that learning effects are mainly local. In turn, this suggests that the general preventive effect may be relatively small, compared to the preventive effects at the level of municipalities or regions.
Notes: *** denotes statistical significance at the 0.1% level; ** denotes statistical significance at the 1% level; * denotes statistical significance at the 5% level. Models 3 and 4 estimate separate effect parameters for early (pre-2017) and late audits with control for population size (Model 3) or number of procurements (Model 4).

Discussion
This study develops a methodology for evaluating the impact of procurement audits that potentially can be applied more generally to auditing activities. Perhaps because of the relatively small sample size, the study fails to demonstrate that audits have a statistically significant effect on how procurements are conducted. While the findings are consistent with audits having no effect, it cannot be ruled out that audits have a modest but relatively long-lasting effect on municipalities' procurement practices.
Further, the effect may be underestimated if the general preventive effect is large. It may be that all or most municipalities learn from audits, irrespective of who is audited. Interviews with chief procurement officials at eight municipalities, however, indicate that audits of other than neighbouring municipalities do not receive much attention. If so, the parameter estimates are not consistent with the SCA's audits having strong effects.
The interviews also reveal a diversity of opinions concerning the effectiveness of the SCA's procurement audits. Some officials maintain that audits have no effect on how they procure while some officials suggest they have a relatively large effect. A theme that comes up in several interviews centres on the legitimacy of the procurement rules. Because of private litigation, the municipalities' decisions are frequently challenged in court (subjected to court review). Avoiding court reviews hence becomes an overriding objective for many officials; an objective that partially obscures the objective of procuring high-quality goods and services at reasonable prices, with potentially detrimental effects for the legitimacy of the procurement rules. In this context, it is desirable that the SCA's audits enhance the legitimacy of the procurement rules, rather than the opposite.
Too much focus on optimal deterrence, via sufficiently high detection risks and sufficiently grim consequences upon detection, carries the risk of further reducing the legitimacy of the rules, especially if the audits focus on formalities. In this context, it is interesting to note that the SCA seeks lower fines under mandatory-prosecution rules than it does when its decision to prosecute is discretionary. In mandatory cases, a court of law has already effectively given the procuring authority a partial pardon when it established that the contract resulting from the incorrect procurement should not be rendered void.
Conversely, however, a low detection probability for rules violations can result in a spiral of mistrust between providers and procurers. Providers and potential providers that perceive that municipalities do not follow the rules may become less hesitant in calling for court reviews and less hesitant in cutting corners. Since this is costly for the municipalities, the municipalities may focus even more on avoiding court reviews as well as on raising bureaucratic hurdles that are intended to exclude dishonest providers from the procurements.
A related risk with excessive and poorly targeted auditing is that procuring authorities make bid evaluation rigid, when attempting to make the process transparent and predictable. The consequences of reduced buyer discretion have been analysed extensively in the theoretical literature, typically with emphasis on its detrimental effects. Some influential empirical studies have provided evidence in support of these concerns, while some recent empirical work on public procurement has found evidence to the contrary, as discussed in the introduction. However, more intense auditing could potentially be an attractive alternative to stricter rules and reduced buyer discretion. Unfortunately, little research directly addresses the consequences of audits on compliance with public-procurement rules. More existing research has focused on how procurement outcomes are affected by more intense auditing, but often with applications to settings where there is a high risk of corruption. Some of the findings have been ambiguous, underlining the need for a deeper understanding of the relation between rules, monitoring, discretion, compliance, avoidance and outcomes in public procurement. The present study attempts to fill some of this gap.

Notes
2. See also Yang, 2008; Carrillo et al., 2017; Lichand and Fernandes, 2019.
3. Qualitatively similar results are reported by Spagnolo, 2012 and Coviello et al., 2018. Coviello et al., 2022 provide extensive references to the literature.
4. See Becker, 1968.
5. For a recent survey, see Chalfin and McCrary, 2017, who summarize their findings by concluding that "[e]vidence in favor of deterrence effects is mixed".
7. When a mandatory case comes to the attention of the SCA, it must prosecute for fines. Such cases arise if two conditions are met. Firstly, an authority has contracted directly with a provider, without competitive tendering, in violation of the Public Procurement Act or an authority has signed a contract during the legally required stand-still period. (After the winning bid in a competitive tender has been announced, there is a stand-still period of about two weeks, during which rival providers can appeal the procuring authority's decision.) Secondly, a court has found that the contract should not be declared void, despite the procurement rules having been violated. This may happen if the services of the provider are urgently needed.
In most such cases that come under the court's review, however, it will declare the contract void. The SCA can still take action, although it then has no obligation to do so. A discretionary case will similarly arise if the SCA prosecutes an infringement that has not previously been brought to the court's attention. The two most common types of infringement resulting in discretionary cases are the same as those that trigger a mandatory case: contracting without competitive tendering and contracting during the stand-still period. Other infringements can also trigger discretionary cases.
8. These numbers were compiled by the author from the SCA reports No. 2015:7, 2016:1, 2017:4, 2018:2, 2019:1, 2020:3 and 2021:2. Since reporting principles varied in the first reports, the exact number is uncertain. The true number may include five to ten additional cases, most of which are likely to be cases where the court rejected the SCA's application that the procuring entity should be fined.
9. The NAPP is tasked with providing guidance and support for public procurement, rather than with supervision.
10. Based on the EU's 2012 Annual Public Procurement Implementation Review. Tukiainen and Halonen, 2020, study causes of litigation in Swedish public procurement. They find large variations between industries and some tendency to fewer litigations when contracts are awarded on the basis of lowest price. The study reports that the Swedish litigation rate is about twice as high as that in Finland.
12. Palguta and Pertold find that downsized contracts are more often given to firms with anonymous owners, suggesting that corruption may be a driving force.
13. See, e.g., Card and Krueger (1994); Abadie (2005); Angrist and Pischke (2009).
14. In principle, court rulings are open documents. However, they are not available in an easily accessible database and even if they were, coding the findings of the court for statistical analysis would be a major undertaking. Hence, in practice availability is limited to the data set compiled by the NAPP for the 2015-2016 period, as described below.
15. Wooldridge, 2001.
16. According to NAPP, at most a couple of cases may still not have been registered. The 2020 data was provided via the Swedish Competition Authority.
17. I return to the outliers below.
18. An extended search for misreported data identified seven additional municipalities where the ratio of reported and predicted procurements, or its inverse, exceeded 3.
19. In contrast to procurement via formal associations set up jointly by two or more municipalities, as discussed above, the procurement processes are here managed directly by one municipality.
20. Many individual procurements were made on behalf of two or more municipalities. In one instance, counting each of the established supply relations yielded a number more than twice as large as the number of procurements. Such joint procurements were fractionalized among the participating municipalities.
21. From the National Agency for Public Procurement's database, 7,083 procurements could be attributed to the 160 municipalities. As described in the text, the database reports zero procurements for a handful of large municipalities, while another data source reports a positive number of court reviews of these municipalities' procurements. Hence, a model prediction of the number of procurements was used for this group of municipalities, adding 180 procurements for a total of 7,263. The predicted procurements are included in the table.
22. The weighted average is about SEK 7 per inhabitant.
23. The maximum value, 30%, resulted from one municipality having zero court reviews during the first period, while three of its ten procurements in 2020 were subject to court review. Similarly, the minimum value of −30% was the result of one municipality having 40% of its procurements reviewed in 2015/2016 but only 10% in 2020.

Public procurement
Appendix. The parallel-trends assumption
This section explores the assumption of common trends. Since key data is only available for one or two periods, common trends are explored through other variables that may impact on the municipality's procurement decisions: the left-right orientation of the governing coalition of the municipality, the outsourcing share of total municipal costs, the level of municipal costs and of outsourcing and the number of inhabitants.
The left-right orientation is captured by a dummy variable for the Social Democrats either being represented in the governing coalition or being the party that alone governs the municipality. Figure A1 shows that the trends are parallel for the two groups throughout the 2002-2022 period, although the Social Democrats are in power more often in the audited municipalities.
The audited municipalities are, on average, twice as large in terms of population as are the non-audited municipalities. If plotted in levels, whether the vertical axis is in logarithms or not, the trends appear parallel. However, if population is indexed to 100 in 2011 and then plotted, as in Figure A2, it becomes apparent that population growth is slightly higher in the audited municipalities, with a per annum growth of about 1% as against 0.8% in the control group.
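To see why a growth difference of this size is hard to spot in levels but visible in an index, the following sketch compounds the two approximate growth rates from a common base of 100. The ten-year horizon (2011 to 2021) and the function name are assumptions made for illustration; the growth rates are the rounded figures quoted above.

```python
def population_index(annual_growth, years, base=100.0):
    """Index value after compounding `annual_growth` for `years` from a base-year value of `base`."""
    return base * (1 + annual_growth) ** years

years = 2021 - 2011  # illustrative horizon; the end year is an assumption

audited = population_index(0.010, years)   # about 1% per annum
control = population_index(0.008, years)   # about 0.8% per annum
gap = audited - control

print(f"audited: {audited:.1f}, control: {control:.1f}, gap: {gap:.1f}")
```

After ten years the two indexed series differ by roughly two index points, a divergence that is small relative to the level difference between the groups but visible once both series start from 100.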
Total municipal costs, in SEK, and outsourcing of services, in SEK, vary with the size of the municipality. To facilitate comparison, these are indexed to 100 in 2011 for each municipality and then averaged across the two groups. As shown in Figures A3 and A4, index values virtually coincide for the 2011-2016 period but then diverge slightly.
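The index-then-average procedure described above can be sketched as follows. The helper names and the cost figures for the two municipalities are hypothetical and serve only to show the mechanics; they are not taken from the data used in the study.

```python
def index_to_base(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100 * value / base for year, value in series.items()}

def group_average(indexed_series):
    """Unweighted mean of several indexed series, year by year."""
    years = indexed_series[0].keys()
    return {y: sum(s[y] for s in indexed_series) / len(indexed_series) for y in years}

# Hypothetical municipal costs in SEK million for one large and one small municipality.
large = {2011: 5000, 2016: 6000}
small = {2011: 200, 2016: 220}

avg = group_average([index_to_base(large, 2011), index_to_base(small, 2011)])
print(avg)
```

Indexing each municipality before averaging gives every municipality the same weight, so the group series is not dominated by the largest municipalities, as it would be if costs were summed in levels first.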
Further insights can be gained by looking at the outsourcing share, defined as outsourcing of services as a fraction of municipal costs. Both audited and non-audited municipalities increase the outsourcing share until 2016, but the share increases slightly faster for the latter category. After 2016, the share fell for both categories, but the fall was steeper for audited municipalities. By 2021, the gap between the two groups had been closed (Figure A5).
The fall in the outsourcing share after 2016 could be interpreted as a signal of a general retrenchment of procurement practices. If so, procurement officers could spend more time on each procurement, hence reducing the scope for suppliers to litigate. However, there is no corresponding fall in the total number of procurements, suggesting that the trend reversal was specific for services outsourcing. Note that the number of advertised procurements shown in the figure is for all types of buyers, including regions, central government and government-owned companies. Note also that the fall in 2014 is due to higher thresholds for direct-award procurements.

The Social Democrats, traditionally less interested in outsourcing than centre-right parties, came into power in many municipalities in the 2014 election. However, the outsourcing share kept rising until 2016 and only then fell back. An alternative explanation is that the Syrian refugee crisis in 2016 led to a relatively large cost expansion while not impacting on outsourcing, hence reducing the outsourcing share. However, such an effect would not be large enough to explain the relatively sharp fall in 2017 and 2018.
It does not seem likely that the audits by the SCA caused the fall in outsourcing in 2017.
Corresponding author
Mats A. Bergman can be contacted at: mats.bergman@sh.se