Systematic reviews of clinical decision tools for acute abdominal pain

Clinical Governance: An International Journal

ISSN: 1477-7274

Article publication date: 14 August 2007


Citation

(2007), "Systematic reviews of clinical decision tools for acute abdominal pain", Clinical Governance: An International Journal, Vol. 12 No. 3. https://doi.org/10.1108/cgij.2007.24812cae.002

Publisher: Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited


Systematic reviews of clinical decision tools for acute abdominal pain

J.L.Y. Liu, J.C. Wyatt, J.J. Deeks, S. Clamp, J. Keen, P. Verde, C. Ohmann, J. Wellwood, M. Dawes and D.G. Altman

Background

Making accurate decisions for patients with acute abdominal pain (AAP) is difficult. To avoid missing seriously ill patients, many patients undergo unnecessary surgery, with negative laparotomy rates of 25 per cent; conversely, delaying surgery can lead to perforation rates of 20 per cent. Many conditions cause AAP, and no single clinical finding or test is both sensitive and specific. Many decision tools (DTs) combining two or more findings have been developed to aid AAP management, but there is no consensus on their appropriateness for clinical use.

Objectives

The study aimed to answer the following questions.

  1. What are the diagnostic accuracies of DTs and doctors aided by DTs compared with those of unaided doctors?

  2. What is the impact of providing doctors with an AAP DT on patient outcomes, clinical decisions and actions?

  3. What factors are likely to determine the usage rates and usability of a DT?

  4. What are the associated costs and likely cost-effectiveness of these DTs in routine use in the UK?

Methods

Data sources

MEDLINE, EMBASE, CINAHL, INSPEC, CENTRAL, SIGLE and HEALTH-CD were searched for empirical English-language studies. Searches were conducted up to 1 July 2003.

Study selection (inclusion criteria)

For question 1, the criteria for eligible studies included:

  • unselected patients with AAP were recruited consecutively or randomly sampled from a primary or secondary care setting;

  • patients had previously undiagnosed AAP lasting for seven days or less from onset;

  • the study reported accuracies of AAP DTs, with or without comparisons to unaided doctors’ decisions;

  • an adequate reference standard was described; and

  • sensitivity and specificity could be calculated (the standard definitions are recalled after this list).
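For reference, these quantities follow from the standard 2×2 comparison of each classification (by a DT or by a doctor) against the reference standard:

sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
false-negative rate = FN / (TP + FN) = 1 − sensitivity
false-positive rate = FP / (TN + FP) = 1 − specificity

where TP, FP, TN and FN are the true-positive, false-positive, true-negative and false-negative counts. The false-negative and false-positive rates are the quantities compared by the error rate ratios described under “Data synthesis” below.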

For question 2, the criteria for eligible studies included:

  • the study was a randomised controlled trial (RCT) or quasi-RCT;

  • the patients were the same as for question 1;

  • evaluations were conducted of the impact of AAP DTs, compared with unaided doctors’ decisions; and

  • the study reported some measure of impact on patient outcomes, clinical decisions or actions.

Data extraction

Data from each eligible study were extracted. For question 1, this included patient characteristics, type of DT, healthcare setting, and the accuracy of DTs and unaided doctors’ decisions. For question 2, this included outcomes, clinical decisions and actions for patients of doctors aided or unaided by a DT. Potential sources of heterogeneity were extracted for both questions.

Data synthesis

For the accuracy review, meta-analysis was conducted. For studies comparing the diagnostic accuracy of DTs with that of unaided doctors, error rate ratios were used to quantify how the false-negative and false-positive rates of the DTs compared with those of the unaided doctors. Pooled error rate ratios with 95 per cent confidence intervals (CIs) were computed separately for false-negative and false-positive rates, and metaregression was used to explore heterogeneity.
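As an illustration of this synthesis step, the sketch below shows how per-study error rate ratios might be computed and pooled with a standard DerSimonian-Laird random-effects model. It is not the authors' analysis code: the study counts are invented, and the paired (same-patient) structure of the real comparisons is ignored.

```python
# Minimal sketch (not the authors' analysis code) of computing per-study error
# rate ratios and pooling them with a DerSimonian-Laird random-effects model.
# All study counts below are invented for illustration.
import math

def error_rate_ratio(errors_dt, n_dt, errors_doc, n_doc):
    """Log of the ratio of the DT's error rate (e.g. false-negative rate) to the
    unaided doctors' error rate, with the approximate variance of that log ratio."""
    log_ratio = math.log((errors_dt / n_dt) / (errors_doc / n_doc))
    var = 1 / errors_dt - 1 / n_dt + 1 / errors_doc - 1 / n_doc
    return log_ratio, var

def pool_random_effects(log_ratios, variances):
    """DerSimonian-Laird pooled estimate with a 95% CI, back-transformed to the ratio scale."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, log_ratios)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ratios))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(log_ratios) - 1)) / c)   # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ratios)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return tuple(math.exp(x) for x in (pooled, pooled - 1.96 * se, pooled + 1.96 * se))

# Invented false-negative counts: (FN with DT, diseased n, FN unaided, diseased n) per study.
studies = [(12, 100, 9, 100), (20, 250, 15, 250), (7, 80, 6, 80)]
logs, variances = zip(*(error_rate_ratio(*s) for s in studies))
print(pool_random_effects(logs, variances))   # pooled error rate ratio and 95% CI
```

Each study contributes a log error rate ratio and its approximate variance; the pooled log ratio is then back-transformed, mirroring the pooled error rate ratios and 95 per cent CIs reported in the review.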

Results

Question 1

Overall, 32 studies from 27 articles, all based in secondary care, were eligible for the review of DT accuracies, and two were eligible for the review of the accuracy of hospital doctors aided by DTs. Sensitivities and specificities of DTs ranged from 53 to 99 per cent and from 30 to 99 per cent, respectively; those of unaided doctors ranged from 64 to 93 per cent and from 39 to 91 per cent. A total of 13 studies reported false-positive and false-negative rates for both DTs and unaided doctors, enabling a direct comparison of their performance. In random effects meta-analyses, DTs had significantly lower false-positive rates than unaided doctors (error rate ratio 0.62, 95 per cent CI 0.46 to 0.83), whereas their false-negative rates were higher but not significantly so (error rate ratio 1.34, 95 per cent CI 0.93 to 1.93). Significant heterogeneity was present.

Two studies compared the diagnostic accuracy of doctors aided by DTs with that of unaided doctors. In a multiarm cluster RCT (n=5193), the diagnostic accuracy of doctors not given access to DTs (sensitivity 28.4 per cent, specificity 96.0 per cent) was not significantly worse than that of the three groups of aided doctors (sensitivities 42.4–47.9 per cent, specificities 95.5–96.5 per cent). In an uncontrolled before-and-after study (n=1484), the sensitivity of aided doctors was 95.5 per cent versus 91.5 per cent for unaided doctors (p=0.24), while the corresponding specificities were 78.1 per cent versus 86.4 per cent (p < 0.001).

The metaregression of DTs showed that:

  • prospective test-set validation at the site of the tool’s development was associated with considerably higher diagnostic accuracy than prospective test-set validation at an independent centre (relative diagnostic odds ratio (RDOR) 8.2; 95 per cent CI 3.1 to 14.7);

  • the later the year in which the study was performed, the lower the reported performance (RDOR 0.88 per year, 95 per cent CI 0.83 to 0.92; a worked interpretation follows this list);

  • DTs evaluated by their own developers performed better than those evaluated independently (RDOR 3.0, 95 per cent CI 1.3 to 6.8); and

  • there was no evidence of association between other quality indicators and DT accuracy.
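As a rough worked interpretation, and on the assumption that the year in which a study was performed entered the metaregression as a continuous covariate, an RDOR of 0.88 per year compounds multiplicatively: over ten years the expected diagnostic odds ratio would shrink to about 0.88^10 ≈ 0.28, i.e. roughly a quarter of that reported by an otherwise comparable study a decade earlier.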

Question 2

The single study eligible for the impact review, a four-arm cluster randomised trial (n=5193), showed that the hospital admission rate for patients seen by doctors not allocated to a DT (42.8 per cent) was significantly higher than the rates for doctors allocated to one of three combinations of decision support (34.2–38.5 per cent; p < 0.001). There was no evidence of a difference across the four trial arms in perforation rates (p=0.19) or negative laparotomy rates (p=0.46).

Question 3

Usage rates of DTs by doctors in accident and emergency departments ranged from 10 to 77 per cent in the six studies that reported them. Possible determinants of usability include the reasoning method used, the number of items used and the output format.

Question 4

A deterministic cost-effectiveness comparison demonstrated that a paper checklist is likely to be 100–900 times more cost-effective than a computer-based DT, under stated assumptions.
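The structure of such a deterministic comparison can be sketched as follows. Every cost and effect figure in this example is a hypothetical assumption chosen purely for illustration (as is the function name); none of the numbers comes from the review.

```python
# Illustrative sketch only: a deterministic cost-effectiveness comparison of a
# paper checklist versus a computer-based decision tool. Every figure below is
# a hypothetical assumption for demonstration; none comes from the review.

def cost_per_extra_correct_decision(setup_cost, cost_per_patient,
                                    patients_per_year, extra_correct_per_year,
                                    years_of_use=5):
    """Total cost divided by total additional correct decisions over the tool's lifetime."""
    total_cost = setup_cost + cost_per_patient * patients_per_year * years_of_use
    total_benefit = extra_correct_per_year * years_of_use
    return total_cost / total_benefit

# Hypothetical inputs: a cheap paper checklist versus a costly computer-based DT,
# both assumed here to yield the same clinical benefit.
paper = cost_per_extra_correct_decision(setup_cost=500, cost_per_patient=0.10,
                                        patients_per_year=2000, extra_correct_per_year=50)
computer = cost_per_extra_correct_decision(setup_cost=50_000, cost_per_patient=2.00,
                                           patients_per_year=2000, extra_correct_per_year=50)
print(f"paper: {paper:.1f}, computer: {computer:.1f}, ratio: {computer / paper:.0f}x")
```

In practice the relative figure depends heavily on the assumed set-up and running costs and on the assumed clinical benefit, which is why the review expresses the result as a wide range under stated assumptions.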

Conclusions

Implications for healthcare

  • With their significantly lower false-positive rates (and hence higher specificity) than unaided doctors, DTs are potentially useful for confirming a diagnosis of acute appendicitis, but not for ruling it out.

  • The clinical use of well-designed, condition-specific paper or computer-based structured checklists is a promising way to improve patient outcomes, subject to further research.

Recommendations for research

This review uncovered important evidence gaps. The authors’ research recommendations include the following:

  1. Primary research to compare paper-based checklists with computer-based tools, exploring the type/format that maximises patient benefit.

  2. Empirical research to identify the determinants of successful DTs, to provide more evidence to support the development of clinically useful tools.

  3. More general systematic reviews (across a range of diseases or tools) to assess:

    • factors that make DTs more acceptable to doctors and patients; and

    • the relative clinical value of paper checklists versus computer-based tools.

© 2006 Crown Copyright

J.L.Y. Liu is based at: NHS/Cancer Research UK Centre for Statistics in Medicine, Wolfson College, Oxford University, UK; Department of Public Health, Oxford University, UK; and Health Informatics Centre, University of Dundee, UK. J.C. Wyatt is based at: Health Informatics Centre, University of Dundee, UK; National Institute for Health and Clinical Excellence (NICE), London, UK; and Department of Primary Health Care, Oxford University, UK. J.J. Deeks is based at NHS/Cancer Research UK Centre for Statistics in Medicine, Wolfson College, Oxford University, UK and Department of Public Health and Epidemiology, University of Birmingham, UK. S. Clamp and J. Keen are both based at Yorkshire Centre for Health Informatics, University of Leeds, UK. P. Verde is based at Coordination Centre for Clinical Trials and Theoretical Surgery Unit, Heinrich-Heine University Düsseldorf, Germany and Department of Statistics, University of Dortmund, Germany. C. Ohmann is based at Coordination Centre for Clinical Trials and Theoretical Surgery Unit, Heinrich-Heine University Düsseldorf, Germany. J. Wellwood is based at Department of Surgery, Whipps Cross Hospital, London, UK. M. Dawes is based at NHS R&D Centre for Evidence-Based Medicine, Oxford University, UK and Department of Family Medicine, McGill University, Montreal, Canada. D.G. Altman is based at NHS/Cancer Research UK Centre for Statistics in Medicine, Wolfson College, Oxford University, UK.

Further Reading

Liu, J.L.Y., Wyatt, J.C., Deeks, J.J., Clamp, S., Keen, J., Verde, P. et al. (2006), “Systematic reviews of clinical decision tools for acute abdominal pain”, Health Technology Assessment, Vol. 10 No. 47.
