Research Guides: Evidence Based Practice: Appraise

So I found an article, now what?

Congratulations on finding some articles that align with your research question! Now it's time to see if these articles are worthy. Do they align with your research question? Do they have quality evidence? What are the limitations?

Let's appraise what we've found!

Before you read:

Stop! Don't read every article you find from cover to cover. Instead, follow these steps to save time!

Read the abstract! An abstract is a short paragraph (usually no more than 250 words) at the top of the paper that quickly covers the key questions, study design, methods, results, and purpose. This will help you determine if the article truly aligns with your research topic.
Read the introduction! This will provide you with the background information of the study and lay the foundation for the rest of the paper.
Jump to the Results section! Note that charts and graphs offer quick summaries of the data collected. Check if the results answer the authors' question.
Discussion and Conclusion! This is where the authors talk about what their results mean. Do you agree with their conclusions? Could there be alternate conclusions that were not addressed?
Read the Methods! How did the authors address their question? Does the chosen method match the question type (i.e., is it appropriate)? Is it replicable and explicit, or vague?

If, at any step, the study appears to veer away from your topic (applicability) or lacks sound evidence (reliability), then make note of it and move on!

Also, don't forget to check the presented limitations of the study to see if any are a deal-breaker for you!

Want to know where the methods section is? Or what to look for in an introduction? Check out North Carolina State University Library's Anatomy of a Scholarly Article (CC-NC-SA 3.0)

This form from Flinders University has some great questions you can ask yourself at each section of any article to make sure you're getting the most out of what you read!

Levels of evidence — Note. *Figure.* Moira Tannenbaum, M; Sebastian, S, (2022). CC BY 2.0

What is "Evidence"?

Evidence is any information that is used to inform the decision-making process and is derived from rigorous research methods. This can include data from clinical trials, systematic reviews, meta-analyses, observational studies, expert opinions, and patient preferences. A key characteristic of evidence is that it is obtained through reliable and transparent methods, allowing for the evaluation of its validity and applicability to a particular situation or question.

What is the Evidence Hierarchy?

EBP traditionally ranks the various study types based on the strength and precision of their research methods. Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies. Still, most agree that, quantitatively speaking, current, well-designed systematic reviews and meta-analyses are at the top of the pyramid, and that expert opinion and anecdotal experience are at the bottom.

Qualitative data tends to be harder to categorize, but there is a general consensus that some qualitative research methods provide more rigorous and reliable insights than others, with meta-syntheses towards the top of the pyramid, and single case studies towards the bottom.

What should I do if there isn't a systematic review or meta-analysis on my topic?

If there isn't a systematic review or meta-analysis on your topic, then the next-best evidence you should be looking for will depend primarily on your question type!

Therapy or Treatment: search for Randomized Controlled Trials (RCTs)
Prevention: search for RCTs or Prospective Studies
Diagnosis: search for RCTs or Cohort Studies
Prognosis: search for Cohort or Case-Control Studies
Etiology: search for Cohort Studies
Meaning: search for Qualitative Studies

When the Evidence-Based Pyramid Doesn't Fit

The Research Pyramid, introduced by Borgetto et al. (2007), offers a more nuanced approach by categorizing evidence. Each dimension is represented by a side of the pyramid, with varying levels of rigor corresponding to different study types. For example, experimental studies such as randomized controlled trials (RCTs) occupy the top tier of the quantitative side, while meta-syntheses represent the highest tier for qualitative studies.

What is Statistics?

You're reading a delightful paper all about your chosen topic, when BAM! Suddenly there are tables, charts, percentages, and math just ruining your flow. Don't skip this section or fall into despair just yet! This section will help you understand what the results sections are trying to say, so you can confidently make the best decision based on the data.

Statistics, made simple.

So, what is statistics? Statistics is finding the story behind the numbers. It's a field of study that makes sense of data by organizing, analyzing, and interpreting it. With statistics, we can uncover patterns, trends, and relationships hidden within the data. These findings help us make informed decisions, predict outcomes, and understand the world around us better.

There are two main branches of statistics: descriptive and inferential.

Descriptive statistics summarizes and organizes data to give us a clear picture of its characteristics.
Inferential statistics helps us make predictions or generalizations about a larger group based on a sample.

For example, if you were doing a study surveying the Cressman librarians about their favorite authors, then descriptive statistics would help you summarize and describe the most popular authors among them. However, if you found that a certain author was highly favored among the Cressman librarians, then inferential statistics could help you predict whether that author is likely to be popular among librarians in general.

What is data?

To understand statistics, you first have to understand data. Data is a collection of observations, typically coming from a sample of a population. An observation is the unit of measurement in your data. Observations will represent different things for different data.

For example, if your data is describing a population of students, each student is considered an observation. If your data is measuring the price of fruit at the grocery store, each apple or orange would represent a distinct observation.

The characteristics that describe these observations (price, weight, height, gender) are called variables. A variable is a measure of something that differs between observations or can change over time.

Descriptive Statistics

Descriptive statistics is what it sounds like: measures that are intended to describe the variables in your data. Descriptive statistics include measures like minimums, maximums, averages (means), medians, mode, percentiles, and range.

Mean

Diagram of central tendency with positive and negative skew — Note. A negatively skewed, normal, and positively skewed distribution and their respective mean, median, and mode. Adapted from *Ledidi Academy* by Parys A. V., (n.d.), https://ledidi.com/academy/measures-of-central-tendency-mean-median-and-mode, C.C. 2.0.

The mean (also called the average) is calculated by adding up each observation’s value for a certain variable (like height) and dividing by the number of observations. This gives you a sense of what value the variable tends to take on for the observations in your data – like how tall a 5th grader tends to be.
Note: The mean is sensitive to outliers. If your sample includes a few really big (or really small) numbers, they pull the mean toward them, even if most of the numbers are nothing like them. When that happens, the mean no longer shows what most of the numbers are like.
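To see how one outlier can pull the mean, here is a small Python sketch. The heights are made-up numbers for illustration only, not data from any study.

```python
from statistics import mean

# Hypothetical heights (in inches) of five 5th graders
heights = [54, 55, 56, 57, 58]
typical_mean = mean(heights)

# Add one unusually tall student (an outlier)
heights_with_outlier = heights + [80]
skewed_mean = mean(heights_with_outlier)

print(typical_mean)  # 56
print(skewed_mean)   # 60
```

Five of the six students are 58 inches or shorter, yet the mean of the second group is 60. The single outlier moved the mean above what most of the students look like.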

Median

The median is calculated by listing the values of a variable for all observations from least to greatest and by finding the center value. If there are an even number of observations, the median is calculated by taking the average of the two center values.

For example: Suppose your variable is age

Student A: 8 y.o. | Student B: 9 y.o. | Student C: 10 y.o. | Student D: 11 y.o.

If we were only taking the median of students A, B, and C, then the median would be 9 years old because that is the center value. But since there are an even number of students in the sample, the median is the average of the two center values, 9 and 10. So the median student’s age is 9 and a half years old.
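The student example above can be checked with Python's built-in statistics module. This is just a sketch of the same arithmetic; the variable names are our own.

```python
from statistics import median

ages_abc = [8, 9, 10]        # Students A, B, and C (odd number of observations)
ages_abcd = [8, 9, 10, 11]   # Students A, B, C, and D (even number of observations)

print(median(ages_abc))   # 9   (the center value)
print(median(ages_abcd))  # 9.5 (the average of the two center values, 9 and 10)
```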

Percentiles and Quartiles

Percentiles are used to describe how observations rank relative to other observations in a sample.

For example, having a GPA in the 90th percentile means that 90% of students (observations) have a lower GPA.

Quartiles refer to the 25th, 50th, and 75th percentiles. You might see the term “IQR” or “interquartile range,” which refers to the range of values between the 25th and 75th percentiles.

Box Plots

Most often, you'll see descriptive data neatly summarized in what is called a box plot. A box plot contains 5 key pieces of information: the minimum, the first quartile, the median, the third quartile, and the maximum.
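Here is a minimal Python sketch of the five numbers behind a box plot, plus the interquartile range. The exam scores are hypothetical. Note that different software packages use slightly different rules for calculating quartiles, so the "inclusive" method chosen here is an assumption and other tools may report slightly different values.

```python
from statistics import quantiles

# Hypothetical exam scores, already sorted from least to greatest
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

five_number_summary = (min(scores), q1, q2, q3, max(scores))
iqr = q3 - q1  # interquartile range: the spread of the middle half of the data

print(five_number_summary)  # (55, 65.0, 75.0, 85.0, 95)
print(iqr)                  # 20.0
```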

What is Inferential Statistics?

Inferential statistics helps us make conclusions or predictions about a big group by looking at a smaller part of it. We use inferential statistics when we can't measure or observe everything in a group, but we still want to know something about it.

Point Estimation

sampling error

Point estimation is the process of estimating a characteristic (parameter) of a population when you only have access to a sample of that population.

For example, you might be interested in knowing the class’s average test score for an exam you just took, but only three of your friends agreed to tell you their grade! If you take the average score across those three students, you can use this to guess at the average score of the class – but that guess might not be very accurate. The more students you survey (the larger the number of observations) the more likely you are to guess correctly.

Because point estimates are guesses at what the population looks like, they inherently have what we call sampling error: we can’t know exactly what the population looks like if we only have a sample. In general, a larger sample reduces the sampling error. (Other things can reduce the error, too, like making sure to sample randomly. For example, if you only ask the slackers what they got on the exam, your sample average is likely to be lower than the class average.)
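The idea that bigger samples tend to guess better can be sketched with a small simulation. Everything here is an assumption made for illustration: a pretend class of 200 students, randomly generated scores, and a helper function named average_sampling_error that we made up for this example.

```python
import random
from statistics import fmean

random.seed(42)  # fixed seed so the sketch gives the same result each run

# Hypothetical "population": exam scores for a class of 200 students
population = [random.gauss(75, 10) for _ in range(200)]
true_mean = fmean(population)

def average_sampling_error(sample_size, trials=2000):
    """Average gap between a random sample's mean and the true class mean."""
    errors = []
    for _ in range(trials):
        sample = random.sample(population, sample_size)
        errors.append(abs(fmean(sample) - true_mean))
    return fmean(errors)

error_3 = average_sampling_error(3)    # asking only three friends
error_30 = average_sampling_error(30)  # asking thirty classmates

print(error_3, error_30)
```

On average, the guesses based on thirty classmates land much closer to the true class mean than the guesses based on three friends.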

Example:

An example of a point estimate is the average score that a sample of 30 high school students received on the SAT. The average describes all 30 scores with one number (making it a point), and that number may differ from the average SAT score for all high school students nationwide (making it an estimate).

Margins of Error

Because point estimates are only estimates, they come with some uncertainty. Margins of error tell you how much a point estimate from a sample may differ from the true population value.

The larger the margin of error, the less confident researchers can be that the point estimate is approximating the population value.

Confidence Intervals

A confidence interval is calculated by adding and subtracting the margin of error from the point estimate. Confidence intervals suggest what range of values around the point estimate is likely to include the population characteristic.

The wider the margin of error, the wider the confidence interval, and the more uncertainty about what the population characteristic might be.
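Here is a rough Python sketch of building a confidence interval from a point estimate and a margin of error. The scores are hypothetical, and the margin of error formula (1.96 times the standard error, a common normal approximation for a 95% interval) is our assumption, since this guide does not specify one. Researchers working with small samples often use a slightly larger multiplier from the t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical exam scores from a sample of 10 students
sample = [72, 85, 90, 66, 78, 81, 94, 70, 88, 76]

point_estimate = mean(sample)  # the sample average, 80
standard_error = stdev(sample) / sqrt(len(sample))

Z_95 = 1.96  # assumption: normal approximation for a 95% confidence level
margin_of_error = Z_95 * standard_error

# Confidence interval = point estimate minus and plus the margin of error
lower_bound = point_estimate - margin_of_error
upper_bound = point_estimate + margin_of_error

print(point_estimate)
print(lower_bound, upper_bound)
```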

What is Hypothesis Testing?

Whenever we do an experiment, we need to rule out that our results could have occurred simply due to chance. Hypothesis testing is a subset of inferential statistics, and its main goal is to help determine if the results you found in your study (which were found with a sample of a population) could be applied to the larger population and still hold true with a certain level of confidence.

But let's back up a bit. Logically speaking, it is much easier to prove something is false than to prove something is true. If we want to prove something is false, then we only need one counterexample. But if we want to prove something is true, then we need to prove it is true in every possible situation.

For example, the classic argument is the claim "all swans are white." If I want to prove that the statement "all swans are white" is true, then I would need to go out and check every single swan. That's impossible! However, I can prove that the statement "all swans are white" is false by finding just one swan that isn't white.

When we start doing an experiment, we create a hypothesis about a population. A hypothesis (Hₐ), also known as an alternative hypothesis, is a proposed explanation or prediction about a phenomenon or a relationship between variables. For example, we may hypothesize that Cedar Crest students, on average, chat with their librarians more than the average college student.

The null hypothesis (H₀) is a statement that suggests there is no significant difference, effect, or relationship between the phenomena or variables being studied. The null hypothesis exists as the status quo, or default assumption. In our example, the null hypothesis is that Cedar Crest students chat with their librarians just as much as the average college student.

Because it would be much harder to prove our hypothesis is true in every case, we instead look for evidence that the null hypothesis is false in this one case. If the null hypothesis (that there is no change) is unlikely to be true, then our hypothesis (that there is a significant change) is more likely to be true.

P-values

The p-value is a measure of how likely the sample results are, assuming the null hypothesis is true. To put it simply, the p-value is the probability of seeing results at least as extreme as yours if only chance were at work.

The P value is the probability that any particular outcome would have arisen by chance. Standard scientific practice usually deems a P value of less than 1 in 20 (expressed as P<0.05, and equivalent to a betting odds of 20 to 1) as "statistically significant" and a P value of less than 1 in 100 (P<0.01) as "statistically highly significant."
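To make this concrete, here is a small Python sketch using a made-up coin example (not one from this guide). Suppose we flip a coin 20 times and get 15 heads. The null hypothesis is that the coin is fair. The p-value asks: if the coin really were fair, how often would we see 15 or more heads just by chance?

```python
from math import comb

flips = 20
heads_observed = 15

# Null hypothesis: the coin is fair, so all 2**20 sequences of flips are equally likely.
# Count the sequences with 15 or more heads, then divide by the total number of sequences.
extreme_outcomes = sum(comb(flips, k) for k in range(heads_observed, flips + 1))
p_value = extreme_outcomes / 2**flips

print(round(p_value, 4))  # 0.0207
print(p_value < 0.05)     # True
```

Because the p-value is below 0.05, this result would usually be called statistically significant: a fair coin would rarely produce a run this lopsided. This sketch is a one-sided test, which is an assumption on our part. A two-sided test (asking about lopsided results in either direction) would give a larger p-value.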

What is Regression Analysis?

Regression analysis is a statistical tool that is used to understand whether and how two factors are connected, and how one factor can change when another factor does. For example, regression analysis might be used to estimate the average effect an extra hour of studying will have on a student’s exam grade. Regressions use one or more explanatory variables (like time spent studying) to estimate an outcome variable (like a test score).
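Here is a minimal Python sketch of the studying example, using made-up numbers for five students. It fits a straight line with the standard least-squares formulas, which is one common way (though not the only way) to run a simple regression with a single explanatory variable.

```python
from statistics import mean

# Hypothetical data: hours studied and exam score for five students
hours = [1, 2, 3, 4, 5]
scores = [62, 68, 71, 79, 80]

mean_hours = mean(hours)
mean_score = mean(scores)

# Least-squares slope: how much the score changes, on average, per extra hour
slope = (
    sum((h - mean_hours) * (s - mean_score) for h, s in zip(hours, scores))
    / sum((h - mean_hours) ** 2 for h in hours)
)
# Intercept: the predicted score for zero hours of studying
intercept = mean_score - slope * mean_hours

print(round(slope, 2), round(intercept, 2))  # 4.7 57.9
```

In this made-up data, each extra hour of studying goes along with about 4.7 more points on the exam, on average.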

The researchers choose which explanatory variables are used in their regression analysis. The importance of this choice cannot be overstated. Having too few or too many explanatory variables (or simply choosing the wrong ones) can render the results of a regression analysis completely useless!

So, what can go wrong?

For example, you probably know that "time spent studying for an exam" is not the only determinant of a student’s score. Other factors could include: how much sleep the student got the night before, if the student has test anxiety, and the student's general mastery over the material. These factors can have equal, if not greater, influence on the student’s score. If you don't include those variables in your analysis, then you get something called omitted variable bias, and that can discredit the results of the whole regression analysis.

One consequence of omitting key explanatory variables is that it can make two factors seem related when they are not. The classic example is a regression that looks at the effect of ice cream sales on shark attacks. If you were to run a regression analysis on data that measures ice cream sales and shark attacks over time, you would find that ice cream sales are heavily correlated with shark attacks! Before you start crafting theories about sharks having a sweet tooth, consider the omitted variable: hot weather, which sends more people out for ice cream and more people into the ocean. Remember that correlation does not equal causation!

Remember! Correlation does not equal Causation!

A regression result showing that two variables are related does not prove that the two variables are causally related. A regression is not a complex model that replicates the real world.

Interpreting Research Statistics

Statistics can inform your understanding of a research topic, and it can provide evidence to inform your choices. But! It’s important to think about statistics as being able to support an idea but being unable to prove it. Statistics can be a powerful type of evidence, but there are several pitfalls to avoid.

The safest rule for interpreting research statistics in peer-reviewed sources is to rely on authors’ own description of statistical results, the authors’ own interpretation and discussion of the results as evidence to inform a research question, and the authors’ own assessment of the limitations of the statistical evidence.

Researchers typically focus on narrow questions, but their data can be misinterpreted when used to address a different question. After identifying research studies that use statistics that seem to directly address your research question, read the authors’ own interpretation of the statistics. Then, ask yourself:

Did the authors design their statistical analysis in a way that directly helps address my question, or would it take a leap to use this data for my research?

Statistical tests are complicated and designed for a narrow question — so, only use statistical evidence that was designed to address your specific question.

Note: More complex statistical methods generally require narrower applications and interpretations.

Finally, recognize that a statistically significant result is not necessarily a meaningful result in the real world. Ask yourself:

Is the result summarized in the study clinically meaningful or compelling evidence for answering your research question?

In describing your conclusions about the statistics, be sure to stick with other lessons from this guide: correlation is not causation, and good research means acknowledging the limitations of your research sources.

Common Statistical Techniques

T-Tests

t-Tests are used to compare the mean score (on some continuous variable) between two groups.

t-Tests can be employed to compare the mean scores of two different groups (independent-samples t-test) or to compare the same group of people on two different occasions (paired-samples t-test).
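As a rough illustration of what an independent-samples t-test calculates, here is a Python sketch with made-up scores for two groups. It uses Welch's version of the t statistic, which is our assumption (this guide does not name a specific version). Turning the t statistic into a p-value requires the t-distribution, which is usually handled by statistical software, so this sketch stops at the statistic.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical exam scores for two independent groups of students
group_a = [78, 82, 85, 88, 92]
group_b = [70, 72, 75, 79, 84]

def welch_t(sample_1, sample_2):
    """t statistic: the difference in means divided by its standard error."""
    standard_error = sqrt(
        variance(sample_1) / len(sample_1) + variance(sample_2) / len(sample_2)
    )
    return (mean(sample_1) - mean(sample_2)) / standard_error

t_statistic = welch_t(group_a, group_b)
print(round(t_statistic, 2))  # 2.59
```

The further the t statistic is from zero, the harder it is to explain the gap between the two group means by chance alone.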

ANOVA

Whilst t-tests compare the mean score on one variable for two groups, analysis of variance is used to compare the mean scores of more than two groups.

Analysis of variance (ANOVA) checks how much variation exists between groups compared to within each group.
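The "between groups compared to within each group" idea can be sketched in Python. The three groups of scores below are made up, and this is the standard one-way ANOVA F statistic worked out by hand rather than output from any particular statistics package.

```python
from statistics import mean

# Hypothetical scores for three groups
groups = [
    [4, 5, 6],
    [6, 7, 8],
    [8, 9, 10],
]

all_values = [value for group in groups for value in group]
grand_mean = mean(all_values)

# Variation between groups: how far each group's mean sits from the overall mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Variation within groups: how far each value sits from its own group's mean
ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F statistic: between-group variation relative to within-group variation
f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f_statistic)  # 12.0
```

A large F statistic means the groups differ from each other much more than the individuals inside each group differ from one another.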

Correlation

When two variables are correlated, it means that changes in one variable are associated with (but may not cause!) changes in the other variable. There are two key components of correlation: Strength (how closely the data points form a line) and Direction (which direction the line goes).

r versus R²

The lowercase "r" typically refers to the Pearson correlation coefficient, which measures the strength and direction of the linear relationship between two variables.

Strength: Refers to how closely the data points in a scatterplot cluster around a straight line.
- A strong correlation means that the points are tightly clustered around the line (r will be closer to 1 or -1).
- A weak correlation means that the points are more scattered (r will be closer to 0).
  - Small/weak: r= ± .10 to ±.29
  - Medium/moderate: r= ±.30 to ±.49
  - Large/strong: r= ±.50 to ±1
Direction: Refers to whether the relationship between the variables is positive or negative.
- Positive correlation: When one variable increases, the other variable tends to increase as well. In a scatterplot, this relationship is depicted by an upward sloping line. (r gets closer to 1)
- Negative correlation: When one variable increases, the other variable tends to decrease. In a scatterplot, this relationship is depicted by a downward sloping line. (r gets closer to -1)

The uppercase "R" generally represents the coefficient of determination, often denoted as R². This value describes the proportion of variation in the dependent variable that is explained by the independent variable(s) in a regression model. In other words, R² indicates how well the independent variable(s) predict the variation in the dependent variable.
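Here is a short Python sketch that calculates r by hand for a small made-up data set. It then squares r to get R², which works when a regression has just one independent variable (with several independent variables, R² has to come from the regression model itself).

```python
from math import sqrt
from statistics import mean

# Hypothetical paired observations for two variables
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = mean(x), mean(y)

# Pearson r: how much x and y vary together, scaled by how much each varies alone
covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
spread_x = sum((a - mean_x) ** 2 for a in x)
spread_y = sum((b - mean_y) ** 2 for b in y)

r = covariation / sqrt(spread_x * spread_y)
r_squared = r ** 2  # valid shortcut only for a single independent variable

print(round(r, 2))          # 0.77
print(round(r_squared, 2))  # 0.6
```

By the cutoffs listed above, r = 0.77 is a large, positive correlation, and an R² of 0.6 means that about 60% of the variation in y is explained by x in this made-up data.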

Remember that correlation does not equal causation. If you want to see some interesting correlations check out Spurious Correlations.

Multiple regression is an extension of correlation analysis. Multiple regression is used to explore the relationship between one dependent variable and a number of independent variables or predictors.

The purpose of a multiple regression model is to predict values of a dependent variable based on the values of the independent variables or predictors.

Chi-square test for independence is used to explore the relationship between two categorical variables. Each variable can have two or more categories.
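As an illustration, here is a Python sketch of the chi-square statistic for a made-up 2-by-2 table. It compares the counts actually observed with the counts we would expect if the two categorical variables were unrelated. The table values and category labels are hypothetical.

```python
# Hypothetical counts: rows are two groups of students, columns are "yes" / "no" answers
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected in this cell if the two variables were independent
        expected = row_totals[i] * column_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_square, 2), degrees_of_freedom)  # 16.67 1
```

With 1 degree of freedom, a chi-square value above 3.84 corresponds to p < 0.05, so in this made-up table the two variables appear to be related.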

Evidence Based Practice

Check out these review forms for various study designs!