These checklists from the Critical Appraisal Skills Programme can help you make sense of and double-check the evidence in even the most complicated studies!
Congratulations on finding some articles that align with your research question! Now it's time to see if these articles are worthy. Do they align with your research question? Do they have quality evidence? What are the limitations?
Let's appraise what we've found!
Stop! Don't read every article you find from cover-to-cover. Instead, follow these steps to save time!
If at any step the study appears to veer away from your topic (applicability) or lacks sound evidence (reliability), make note of it and move on!
Also, don't forget to check the presented limitations of the study to see if any are a deal-breaker for you!
Evidence is any information that is used to inform the decision-making process and is derived from rigorous research methods. This can include data from clinical trials, systematic reviews, meta-analyses, observational studies, expert opinions, and patient preferences. A key characteristic of evidence is that it is obtained through reliable and transparent methods, allowing for the evaluation of its validity and applicability to a particular situation or question.
EBP traditionally ranks the various study types based on the strength and precision of their research methods. Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies. Still, most agree that, quantitatively speaking, current, well-designed systematic reviews and meta-analyses are at the top of the pyramid, and that expert opinion and anecdotal experience are at the bottom.
Qualitative data tends to be harder to categorize, but there is a general consensus that some qualitative research methods provide more rigorous and reliable insights than others, with meta-syntheses towards the top of the pyramid, and single case-studies towards the bottom.
If there isn't a systematic review or meta-analysis on your topic, then the next-best evidence you should be looking for will depend primarily on your question type!
The Research Pyramid, introduced by Borgetto et al. (2007), offers a more nuanced approach by categorizing evidence along multiple dimensions of research. Each dimension is represented by a side of the pyramid, with varying levels of rigor corresponding to different study types. For example, experimental studies such as randomized controlled trials (RCTs) occupy the top tier of the quantitative side, while meta-syntheses represent the highest tier for qualitative studies.
You're reading a delightful paper all about your chosen topic, when BAM! Suddenly there are tables, charts, percentages, and math just ruining your flow. Don't skip this section or fall into despair just yet! This section will help you understand what the results sections are trying to say, so you can confidently make the best decision based on the data.
So, what is statistics? Statistics is finding the story behind the numbers. It's a field of study that makes sense of data by organizing, analyzing, and interpreting it. With statistics, we can uncover patterns, trends, and relationships hidden within the data. These findings help us make informed decisions, predict outcomes, and better understand the world around us.
There are two main branches of statistics: descriptive and inferential.
For example, if you were doing a study surveying the Cressman librarians about their favorite authors, then descriptive statistics would help you summarize and describe the most popular authors among them. However, if you found that a certain author was highly favored among the Cressman librarians, then inferential statistics could help you predict whether that author is likely to be popular among librarians in general.
To understand statistics, you first have to understand data. Data is a collection of observations, typically coming from a sample of a population. An observation is the unit of measurement in your data. Observations will represent different things for different data.
For example, if your data is describing a population of students, each student is considered an observation. If your data is measuring the price of fruit at the grocery store, each apple or orange would represent a distinct observation.
The characteristics that describe these observations (price, weight, height, gender) are called variables. A variable is a measure of something that differs between observations or can change over time.
Descriptive statistics is what it sounds like: measures that are intended to describe the variables in your data. Descriptive statistics include measures like minimums, maximums, averages (means), medians, mode, percentiles, and range.
The mean (also called the average) is calculated by adding up each observation’s value for a certain variable (like height) and dividing by the number of observations. This gives you a sense of what value the variable tends to take on for the observations in your data – like how tall a 5th grader tends to be.
Note: The mean is sensitive to outliers. If your sample contains a few very large (or very small) values, they can pull the mean toward them, even if most of the values cluster elsewhere. Outliers drag the mean in their direction, so the mean may not reflect what a typical value looks like.
The median is calculated by listing the values of a variable for all observations from least to greatest and by finding the center value. If there are an even number of observations, the median is calculated by taking the average of the two center values.
For example: Suppose your variable is age
Student A: 8 y.o. | Student B: 9 y.o. | Student C: 10 y.o. | Student D: 11 y.o.
If we were only taking the median of students A, B, and C, then the median would be 9 years old because that is the center value. But since there are an even number of students in the sample, the median is the average of the two center values, 9 and 10. So the median student’s age is 9 and a half years old.
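The calculations above are easy to check with a few lines of code. Here's a minimal sketch using Python's standard-library statistics module, with the four hypothetical students above plus one made-up outlier to show how the mean (but not the median) gets pulled:

```python
import statistics

ages = [8, 9, 10, 11]  # Students A-D from the example above

mean_age = statistics.mean(ages)      # (8 + 9 + 10 + 11) / 4 = 9.5
median_age = statistics.median(ages)  # average of the two center values: (9 + 10) / 2 = 9.5
print(mean_age, median_age)  # 9.5 9.5

# An outlier pulls the mean toward it, but barely moves the median:
ages_with_outlier = ages + [80]
print(statistics.mean(ages_with_outlier))    # 23.6
print(statistics.median(ages_with_outlier))  # 10
```

Notice that adding one 80-year-old shifts the mean from 9.5 to 23.6, while the median only moves from 9.5 to 10.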
Percentiles are used to describe how observations rank relative to other observations in a sample.
For example, having a GPA in the 90th percentile means that 90% of students (observations) have a lower GPA.
Quartile refers to the 25th, 50th, and 75th percentiles. You might see the term “IQR” or “interquartile range” which refers to the range of values between the 25th and 75th percentile.
Most often, you'll see descriptive data neatly summarized in what is called a box plot. A box plot contains 5 key pieces of information: the minimum, the first quartile, the median, the third quartile, and the maximum.
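Those five numbers are easy to compute yourself. Here's a minimal sketch using Python's standard library, on a hypothetical list of test scores (the values are made up for illustration):

```python
import statistics

scores = [55, 61, 68, 72, 75, 80, 84, 90, 95]  # hypothetical test scores

# quantiles(..., n=4) returns the three quartile cut points
# (the 25th, 50th, and 75th percentiles)
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = (min(scores), q1, median, q3, max(scores))
print(five_number_summary)   # the 5 values a box plot displays
print("IQR:", q3 - q1)       # the interquartile range is the width of the box
```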
Inferential statistics helps us make conclusions or predictions about a big group by looking at a smaller part of it. We use inferential statistics when we can't measure or observe everything in a group, but we still want to know something about it.
Point estimation is the process of estimating a characteristic (statistic) about a population when you only have access to a sample of that population.
For example, you might be interested in knowing the class’s average test score for an exam you just took, but only three of your friends agreed to tell you their grade! If you take the average score across those three students, you can use this to guess at the average score of the class – but that guess might not be very accurate. The more students you survey (the larger the number of observations) the more likely you are to guess correctly.
Because point estimates are guesses at what the population looks like, they inherently have what we call sampling error: we can’t know exactly what the population looks like if we only have a sample. In general, a larger sample reduces the sampling error. (Other things can reduce the sampling error, too, like making sure to sample randomly. For example, if you only ask the slackers what they got on the exam, your sample average is likely to be lower than the class average.)
An example of a point estimate is the average score that a sample of 30 high school students received on the SAT. The average describes all 30 scores with one number — making it a point — and that point may differ from the average SAT score for all high school students nationwide — making it an estimate.
Because point estimates are only estimates, they come with a margin of error. Margins of error tell you how much a point estimate from a sample may differ from the true population value.
The larger the margin of error, the less confident researchers can be that the point estimate is approximating the population value.
A confidence interval is calculated by adding and subtracting the margin of error from the point estimate. Confidence intervals suggest what range of values around the point estimate is likely to include the population characteristic.
The wider the margin of error, the wider the confidence interval, and the more uncertainty about what the population characteristic might be.
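Putting the last few ideas together, here's a minimal sketch of a point estimate, margin of error, and 95% confidence interval for a mean. The exam scores are hypothetical, and the sketch assumes a large-enough sample that the common normal-approximation multiplier of 1.96 applies:

```python
import math
import statistics

# Hypothetical exam scores from a sample of 20 students
sample = [72, 85, 78, 90, 66, 81, 75, 88, 79, 83,
          70, 77, 86, 74, 80, 69, 91, 76, 84, 73]

n = len(sample)
point_estimate = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n)
std_error = statistics.stdev(sample) / math.sqrt(n)

# 1.96 is the multiplier for 95% confidence under the normal approximation
margin_of_error = 1.96 * std_error
ci_low = point_estimate - margin_of_error
ci_high = point_estimate + margin_of_error

print(f"point estimate: {point_estimate:.1f}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

A larger sample shrinks the standard error, which shrinks the margin of error and tightens the interval.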
Whenever we do an experiment, we need to rule out that our results could have occurred simply by chance. Hypothesis testing is a subset of inferential statistics, and its main goal is to help determine whether the results you found in your study (which were found with a sample of a population) could be applied to the larger population and still hold true with a certain level of confidence.
But let's back up a bit. Logically speaking, it is much easier to prove something is false than to prove something is true. If we want to prove something is false, we only need one counterexample. But if we want to prove something is true, we need to prove it is true in every possible situation.
For example, the classic argument is the claim "all swans are white." If I want to prove that the statement "all swans are white" is true, then I would need to go out and check every single swan. That's impossible! However, I can prove that the statement "all swans are white" is false by finding just one swan that isn't white.
When we start doing an experiment, we create a hypothesis about a population. A hypothesis (HA), also known as an alternate hypothesis, is a proposed explanation or prediction about a phenomenon or a relationship between variables. For example, we may hypothesize that Cedar Crest students, on average, chat with their librarians more than the average college student does.
The null hypothesis (H0) is a statement that suggests there is no significant difference, effect, or relationship between the phenomena or variables being studied. The null hypothesis exists as the status quo, or default assumption. In our example, the null hypothesis is that Cedar Crest students chat with their librarians just as much as the average college student.
Because it would be much harder to prove our hypothesis true in every case, we instead try to show that the null hypothesis is false in this singular case. If the null hypothesis (that there is no change) can be rejected, then our hypothesis (that there is a significant change) becomes more plausible.
The p-value is a measure of how likely the sample results are, assuming the null hypothesis is true. To put it simply, the p-value is the probability that your observed phenomenon happened simply due to chance.
Standard scientific practice usually deems a p-value of less than 1 in 20 (expressed as P<0.05) "statistically significant" and a p-value of less than 1 in 100 (P<0.01) "statistically highly significant."
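To make this concrete, here's a minimal sketch of a p-value calculation for a simple case where the probabilities can be computed exactly: flipping a coin 20 times, observing 15 heads, and asking how likely a result at least that extreme is under the null hypothesis that the coin is fair:

```python
from math import comb

n, observed_heads = 20, 15  # hypothetical experiment: 15 heads in 20 flips

# Under the null hypothesis (fair coin),
# P(exactly k heads in n flips) = C(n, k) / 2**n.
# The one-sided p-value sums the probability of 15 or more heads.
p_value = sum(comb(n, k) for k in range(observed_heads, n + 1)) / 2**n

print(f"one-sided p-value: {p_value:.4f}")  # about 0.0207, below the 0.05 threshold
```

Since the p-value is below 0.05, we would call this result statistically significant and suspect the coin is not fair.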
Regression analysis is a statistical tool that is used to understand how and if two factors are connected, and how one factor can change when another factor does. For example, regression analysis might be used to estimate the average effect an extra hour of studying will have on a student’s exam grade. Regressions use one or more explanatory variables (like time spent studying) to estimate an outcome variable (like a test score).
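Here's a minimal sketch of the simplest case: an ordinary least squares regression with one explanatory variable, computed by hand with Python's standard library (the study-time and score numbers are hypothetical):

```python
import statistics

hours_studied = [1, 2, 3, 4, 5, 6]    # explanatory variable
exam_scores   = [62, 68, 71, 77, 80, 86]  # outcome variable

mean_x = statistics.mean(hours_studied)
mean_y = statistics.mean(exam_scores)

# Ordinary least squares with one predictor:
# slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
covariation = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(hours_studied, exam_scores))
variation = sum((x - mean_x) ** 2 for x in hours_studied)

slope = covariation / variation
intercept = mean_y - slope * mean_x

print(f"each extra hour of study predicts about {slope:.2f} more points")
print(f"predicted score after 4.5 hours: {intercept + slope * 4.5:.1f}")
```

The slope is the regression's estimate of the average effect: how many extra points one additional hour of studying is associated with.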
The researchers choose which explanatory variables are used in their regression analysis. The importance of this choice cannot be overstated. Having too few or too many explanatory variables (or simply choosing the wrong ones) can render the results of a regression analysis completely useless!
For example, you probably know that "time spent studying for an exam" is not the only determinant of a student’s score. Other factors could include how much sleep the student got the night before, whether the student has test anxiety, and the student's general mastery of the material. These factors can have equal, if not greater, influence on the student’s score. If you don't include those variables in your analysis, then you get something called omitted variable bias, which can discredit the results of the whole regression analysis.
One consequence of omitting key explanatory variables is that it can make two factors seem related when they are not. The classic example is a regression that looks at the effect of ice cream sales on shark attacks. If you were to run a regression analysis on data that measures ice cream sales and shark attacks over time, you would find that ice cream sales are heavily correlated with shark attacks! The omitted variable here is warm weather: in summer, more people buy ice cream and more people swim. So before you start crafting theories about sharks having a sweet tooth, remember that correlation does not equal causation!
A regression result showing that two variables are related does not prove that the two variables are causally related. A regression is not a complex model that replicates the real world.
Statistics can inform your understanding of a research topic, and it can provide evidence to inform your choices. But! It’s important to think about statistics as being able to support an idea but being unable to prove it. Statistics can be a powerful type of evidence, but there are several pitfalls to avoid.
Researchers typically focus on narrow questions, but their data can be misinterpreted when used to address a different question. After identifying research studies that use statistics that seem to directly address your research question, read the authors’ own interpretation of the statistics. Then, ask yourself:
Did the authors design their statistical analysis in a way that directly helps address my question, or would it take a leap to use this data for my research?
Note: More complex statistical methods generally require narrower applications and interpretations.
Finally, recognize that a statistically significant result is not necessarily a meaningful result in the real world. Ask yourself:
Is the result summarized in the study clinically meaningful or compelling evidence for answering your research question?
In describing your conclusions about the statistics, be sure to stick with other lessons from this guide: correlation is not causation, and good research means acknowledging the limitations of your research sources.
t-Tests are used to compare the mean score (on some continuous variable) between two groups.
t-Tests can be employed to compare the mean scores of two different groups (independent-samples t-test) or to compare the same group of people on two different occasions (paired-samples t-test).
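For the curious, here's a minimal sketch of the statistic behind an independent-samples t-test (Welch's version, which doesn't assume the two groups have equal variances), computed on hypothetical scores from two groups:

```python
import math
import statistics

# Hypothetical exam scores under two different teaching methods
group_a = [78, 85, 69, 92, 81, 74, 88]
group_b = [72, 64, 70, 77, 68, 75, 66]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
n_a, n_b = len(group_a), len(group_b)

# Welch's t statistic: difference in means divided by its standard error
# t = (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b)
t_stat = (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)
print(f"t = {t_stat:.2f}")
```

The larger the t statistic (in absolute value), the less plausible it is that the two group means differ only by chance; the statistic is compared against a t distribution to get a p-value.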
Whilst t-tests compare the mean score on one variable for two groups, analysis of variance is used to compare the means of more than two groups.
Analysis of variance (ANOVA) checks how much variation exists between groups compared to within each group.
When two variables are correlated, it means that changes in one variable are associated with (but may not cause!) changes in the other variable. There are two key components of correlation: strength (how closely the data points form a line) and direction (which direction the line goes).
The lowercase "r" typically refers to the Pearson correlation coefficient, which measures the strength and direction of the linear relationship between two variables.
The uppercase "R" generally represents the coefficient of determination, often denoted as R². This value describes the proportion of variation in the dependent variable that is explained by the independent variable(s) in a regression model. In other words, R² indicates how well the independent variable(s) predict the variation in the dependent variable.
Remember that correlation does not equal causation. If you want to see some interesting correlations check out Spurious Correlations.
Multiple regression is an extension of correlation analysis. Multiple regression is used to explore the relationship between one dependent variable and a number of independent variables or predictors.
The purpose of a multiple regression model is to predict values of a dependent variable based on the values of the independent variables or predictors.
Chi-square test for independence is used to explore the relationship between two categorical variables. Each variable can have two or more categories.
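Here's a minimal sketch of the chi-square statistic on a hypothetical 2x2 table (the counts and categories are made up for illustration: whether resident and commuter students differ in their use of library chat):

```python
# Hypothetical observed counts for two categorical variables
observed = [
    [30, 20],  # residents:  [used library chat, didn't]
    [15, 35],  # commuters:  [used library chat, didn't]
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# If the variables were independent, the expected count in each cell
# would be (row total * column total) / grand total.
chi_square = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed[i][j] - expected) ** 2 / expected

# With 1 degree of freedom, the 0.05 critical value is about 3.84,
# so a statistic this large suggests the two variables are related.
print(f"chi-square = {chi_square:.2f}")
```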