Tools to help you interpret the clinical and statistical significance of data reported in clinical research.
This page covers the third of the five As -- appraising the evidence you have acquired through your database searches. Critical appraisal is a systematic analysis of a research article to determine the quality of the evidence it presents in relation to your clinical question. In this stage of the EBP process, your task is to analyze the different aspects of each study described below.
Quality of evidence is based on its level, as well as the strength of the study and how directly it applies to your clinical question. This tab presents multiple aspects of articles that you can review in order to make judgements about the level, strength, and applicability of the evidence they present in relation to your clinical question.
Text on this page is adapted from the UW Libraries and is licensed under a CC BY-NC 4.0 license. See details.
Quality of evidence is based in part upon its level. Multiple scales exist to represent an article's level of evidence, such as those produced by the Cochrane Library, as well as multiple graphic representations of this evidence, such as the one to the left. That said, no scale or graphic of this nature is an infallible, authoritative articulation of what places a study at a specific level. Rather, different types of research questions are best answered by different types of study (see the Strength of evidence tab to the right), and one type of study may represent a high level of evidence for a therapy-related question but a lower level of evidence for a prognosis-related question. As noted on the New York Medical College EBM Resource Center's page on appraisal:
In a best-case scenario, you will be able to find a high-quality pre-appraised information source to answer your question. However, this is not available for all topics. Furthermore, you may also find it useful to appraise pre-appraised information. Once you figure out what to look for and where to look, you still have to worry about the quality of the material you find. You should always question the quality of the material you find. Remember, a poorly done systematic review is not better than a well done randomized controlled trial.
With that context, the following table from Winona State University offers a general presentation of the levels of evidence associated with different types of study design:
Level of evidence (LOE) | Description
Level I | Evidence from a systematic review or meta-analysis of all relevant RCTs (randomized controlled trials), evidence-based clinical practice guidelines based on systematic reviews of RCTs, or three or more RCTs of good quality that have similar results.
Level II | Evidence obtained from at least one well-designed RCT (e.g., a large multi-site RCT).
Level III | Evidence obtained from well-designed controlled trials without randomization (i.e., quasi-experimental studies).
Level IV | Evidence from well-designed case-control or cohort studies.
Level V | Evidence from systematic reviews of descriptive and qualitative studies (meta-synthesis).
Level VI | Evidence from a single descriptive or qualitative study.
Level VII | Evidence from the opinion of authorities and/or reports of expert committees.
That said, multiple frameworks presenting different levels of evidence exist. The Joanna Briggs Institute, for example, divides its levels of evidence into separate scales for effectiveness, diagnosis, prognosis, economic evaluation, and meaningfulness. The Cochrane Library uses a four-point scale of high, moderate, low, and very low certainty that a piece of evidence supports a particular outcome.
Various scales, such as the PEDro scale, have been developed to rank studies by their strength. Strength of evidence is based on any research design limitations, methodological limitations, and/or threats to validity that may affect interpretation of findings and generalization of results.
How strongly the evidence supports a given intervention as a means of addressing a clinical question depends in large part upon the type of study conducted. For this reason, the pyramid to the right splits into different domains and presents different study designs as providing stronger or weaker evidence.
Following from the above, the following table from Winona State University suggests the research designs best suited to answering each type of clinical question:
Clinical Question | Suggested Research Design(s)
All Clinical Questions | Systematic review, meta-analysis
Therapy | Randomized controlled trial (RCT), meta-analysis
Etiology | Randomized controlled trial (RCT), meta-analysis, cohort study
Diagnosis | Randomized controlled trial (RCT)
Prevention | Randomized controlled trial (RCT), meta-analysis
Prognosis | Cohort study
Meaning | Qualitative study
Quality Improvement | Randomized controlled trial (RCT)
Cost | Economic evaluation
In addition, partly bleeding into the Apply step of the EBP process, Dartmouth offers the following set of appraisal checklists.
While the additional background and context she provides are important to understand, Greenhalgh (1997 -- the article begins on page 5 of the document) boils assessing an EBP article's methodology down to five points to consider:
The following is a general list of questions to ask regarding the internal and external validity of a study:
Greenhalgh, T. (1997). Assessing the methodological quality of published papers. BMJ, 315: 305-308.
Text in the validity of evidence section is adapted from the UW Libraries and is licensed under a CC BY-NC 4.0 license. See details.
The concept of applicability is presented under multiple terms, including relevance, generalizability, and external validity. Just as it is encompassed by multiple terms, it also has multiple definitions. Shadish, Cook, and Campbell (2002) define it as "inferences about the extent to which a causal relationship holds over variations in persons, settings, treatments and outcomes." Atkins, Chang, Gartlehner, Buckley, Whitlock, Berliner, and Matchar (2010) define it as "the extent to which the effects observed in published studies are likely to reflect the expected results when a specific intervention is applied to the population of interest under 'real-world' conditions." Broadly, applicability addresses the question of how relevant a study or synthesis of studies is to a patient's situation.
The body of literature specifically addressing applicability in evidence-based practice is slimmer than that addressing other aspects of appraisal. While he is writing about a specific piece of software designed to aid in the EBP process, Pearson (2004) offers a general list of questions to pose regarding the applicability of an intervention to a patient's situation:
Despite the existence of questions like these, Atkins et al. (2010) note a lack of standards or guidance for assessing applicability. Rather, applicability has historically been something of a judgement call. Seeking to rectify this situation, Atkins et al. (2010) set as their goal to "describe a systematic but practical approach for considering applicability in the process of reviewing, reporting, and synthesizing evidence from eligible studies." Though they are writing about systematic reviews specifically, their guidance proves applicable (pun intended) to evaluating how well acquired studies apply to a patient's situation.
With the caveat that "applicability depends on context and cannot be assessed with a universal rating system," they draw from a number of existing models (listed in the further reading section below) to describe a four-step general system for assessing applicability. While the fourth step is specific to systematic reviews, the first three steps of that process are relevant to evidence-based practice:
Works cited:
-- Atkins, D., Chang, S., Gartlehner, G., Buckley, D.I., Whitlock, E.P., Berliner, E., and Matchar, D. (2010). Assessing the applicability of studies when comparing medical interventions. In Methods guide for comparative effectiveness reviews. AHRQ Publication No. 10(14)-EHC063-EF. Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
-- Pearson, A. (2004). Balancing the evidence: Incorporating the synthesis of qualitative data into systematic reviews. JBI Reports 2004(2): 45-64.
-- Shadish, W., Cook, T., and Campbell, D. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.
Further reading:
-- Bornhöft G., Maxion-Bergemann S., Wolf U., Kienle G.S., Michalsen A., Vollmar H.C., Gilbertson S., and Matthiessen P.F. (2006). Checklist for the qualitative evaluation of clinical studies with particular focus on external validity and model validity. BMC Medical Research Methodology, 6: 56. doi: 10.1186/1471-2288-6-56.
-- Green L.W., and Glasgow R.E. (2006). Evaluating the relevance, generalization, and applicability of research: Issues in external validation and translation methodology. Evaluation & the Health Professions, 29(1):126-153. doi:10.1177/0163278705284445.
-- Pibouleau L., Boutron I., Reeves B.C., Nizard R., and Ravaud P. (2009). Applicability and generalisability of published results of randomised controlled trials and non-randomised studies evaluating four orthopaedic procedures: Methodological systematic review. BMJ, 339: b4538. doi: 10.1136/bmj.b4538.
-- Rothwell, P.M. (2005). External validity of randomised controlled trials: "to whom do the results of this trial apply?" Lancet, 365(9453): 82-93.
This page is licensed under a Creative Commons license.
Synthesis involves combining ideas or results from two or more sources in a meaningful way. In EBP, the synthesis is focused on the clinical question. You may combine the details from the article appraisals into themes to organize the ideas. The writing must remain objective and accurately report the information from the original sources.
Discuss implications for practice, education, or research. The discussion may include suggestions or recommendations for changes to practice, education or research as well as confirmation of current practice. A table may be used to display the information collected from the articles under discussion.
 | Article 1 [1st Author and Year] | Article 2 [1st Author and Year] | Article 3 [1st Author and Year] | Article 4 [1st Author and Year]
Population | | | |
Setting | | | |
Outcome Measure(s) | | | |
Study Design | | | |
Intervention | | | |
Key Findings | | | |
Critical Appraisal | | | |
Study Quality | | | |
Text on this page is adapted from the UW Libraries and is licensed under a CC BY-NC 4.0 license. See details.