The ANOVA Family of Tests

Overview

  • Basic Concepts of ANOVAs
  • Example of Main Effects & Interactions
  • Roles & Nature of t- & F-Tests
  • Types of ANOVAs
  • Post hoc Comparisons
  • Reference Tables

Basic Concepts of ANOVAs

What ANOVAs Test

  • ANOVAs are designed to test if two or more groups differ in the mean value of the outcome
    • E.g., if patients of different races tend to have different mean levels of blood lipids (e.g., LDLs)
  • So:
    • The outcome variable (DV) is continuous
    • The main predictors (IVs) are categorical
    • More complex models can test more complex & specific types of relationships

What ANOVAs Test (cont.)

  • ANOVAs are also designed to conduct “omnibus” tests
    • If a predictor is found to be significant, it means that there are one or more significant differences somewhere among that predictor’s levels
      • E.g., that one or more racial groups have different mean levels of blood LDLs
  • We must then conduct post hoc tests to see where these differences are (more below)
    • E.g., conduct a series of t-tests to see which races differed in levels of LDLs

What ANOVAs Test (end)

  • The use of an initial, omnibus test is intentional:
    • The initial omnibus test is used to limit how many significance tests are conducted
    • And thus helps control for false positives (Type I errors)
  • Remember:
    • A test of one variable alone is called a main effect
    • Tests involving two (or more) variables are called interactions

Examples of What ANOVAs Test

  • Effects of nursing care delivery models on nurses’ fatigue (Kisanuki & Oriyama, 2025)
    • “Nurses experienced more fatigue than other healthcare professionals”
  • Effectiveness of training program on clinical decision-making regarding postoperative pain management (Reddy et al., 2025)
    • “Training improved clinical decision-making in experimental nurses compared to controls (F=34.76, p=0.01).”
    • “The experimental group had higher satisfaction with pain treatment, including overall condition, nursing care, and current analgesics (F=180.47, p=0.01).”

Examples of What ANOVAs Test (cont.)

  • Effect of communication training rubric on empathic communication skills of nurses at a tertiary care hospital (Rawat et al., 2024)
    • “After the intervention, all measurements conducted on experimental groups at various locations yielded better results”
  • Effects of single- and double-shift work on hand and cognitive functions in nurses (Ulupinar, 2024)
    • “[T]he double-shift work significantly exacerbated declines in all measured functions.”

ANOVAs as Semipartial Correlations

  • Basic formula for a two-way ANOVA:

DV = IV1 + IV2

e.g.,

Entrepreneurial Skills = Project-based Learning + GPA

  • Adding in GPA isolates the effect of project-based learning from the effect of GPA
    • Just like a semipartial correlation removes the effect of a third variable from one of the correlated pair
  • This tests the main effect of each IV
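
A minimal sketch of this kind of main-effects model in Python using statsmodels; the data file and column names (skills, pbl_group, gpa) are hypothetical stand-ins for the example above:

```python
# Minimal sketch: a main-effects-only model, fit via ordinary least squares.
# The data file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("entrepreneurship.csv")          # hypothetical data file

# C() marks project-based learning as categorical; GPA enters as a covariate
model = smf.ols("skills ~ C(pbl_group) + gpa", data=df).fit()

# Type II sums of squares: each effect is tested after removing the other,
# which is what makes this analogous to a semipartial correlation
print(anova_lm(model, typ=2))
```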

ANOVAs as Semipartial Correlations (cont.)

  • We can also add an interaction term:

DV = IV1 + IV2 + (IV1 × IV2)

e.g.,

Entrepreneurial Skills = Project-based Learning + GPA
+ (Project-based learning × GPA)

  • Adding the interaction term lets us also look at, e.g., the effect that GPA has on the relationship between project-based learning & entrepreneurial skills
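
Continuing the hypothetical sketch above, the interaction term is added by changing the model formula; in statsmodels, `*` expands to both main effects plus their product:

```python
# Continuing the hypothetical example above: "*" in a statsmodels formula
# expands to both main effects plus their interaction.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("entrepreneurship.csv")          # hypothetical data file

# Equivalent to: C(pbl_group) + gpa + C(pbl_group):gpa
model_int = smf.ols("skills ~ C(pbl_group) * gpa", data=df).fit()
print(anova_lm(model_int, typ=2))
```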

Assumptions of ANOVAs

  • The “ANOVA family” is a group of similar tests
  • That all follow the basic idea of a linear regression
    • I.e., assuming the relationship between the predictors (IVs) and outcome (DV) is linear
    • And assuming any deviations from linearity are entirely due to error
  • Also assumes the variables and error are all normally distributed
    • Since it uses F-tests, this assumption is worth investigating more than usual

Example of Main Effects & Interactions

Factors Associated with End-of-Day Fatigue

Effect Type | Variables | β
Main Effect | Busy | 3.54*
Main Effect | Takes Microbreaks | 0.78
Main Effect | Supervisory Support | 0.04
2-Way Interaction | Busy × Takes Microbreaks | -3.28*
2-Way Interaction | Busy × Supervisory Support | -3.59*
3-Way Interaction | Busy × Takes Microbreaks × Supervisory Support | 1.37

* p < .05

From: Jefferson, D. P., Andiola, L. M., & Hurley, P. J. (2025). Surviving busy season: Using the job demands-resources model to investigate coping mechanisms. Contemporary Accounting Research, 42(1), 187 – 216. doi: 10.1111/1911-3846.12999

Main Effects & Interactions Example (cont.)

[Interaction plots, also from Jefferson et al. (2025)]

t-Tests & F-Tests

Roles of t- & F-Tests in ANOVAs

  • Remember, the main use of ANOVAs is to test:
    • Differences between groups
    • In the levels of an outcome variable
  • E.g.:
    • Differences between those with and without COPD
    • In their levels of cognitive functioning
  • Or:
    • Differences between those with COPD, asthma, & bronchiectasis
    • In their levels of cognitive functioning

Roles of t- & F-Tests in ANOVAs (cont.)

  • To test group differences, ANOVAs use t and F tests:
  1. Use an F test on a predictor to see if there are any differences
    • This is called the “omnibus” test because it just looks for any difference between any groups
  2. If the F test is significant, then use t tests to see which specific groups are different
    • These are often done as “post hoc” analyses
    • And are only done if the initial F test for that predictor was significant
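
A sketch of this two-step workflow with SciPy; the three score lists below are hypothetical cognitive-functioning values for the COPD, asthma, & bronchiectasis example:

```python
# Sketch of the omnibus-then-post-hoc workflow; the score lists are
# hypothetical cognitive-functioning values for each diagnosis group.
from itertools import combinations
from scipy import stats

groups = {
    "COPD":           [21, 24, 19, 23, 22, 20],
    "Asthma":         [26, 27, 25, 28, 24, 26],
    "Bronchiectasis": [23, 25, 22, 24, 23, 25],
}

# Step 1: omnibus F-test across all three groups
F, p = stats.f_oneway(*groups.values())
print(f"Omnibus F = {F:.2f}, p = {p:.4f}")

# Step 2: only if the omnibus test is significant, run pairwise t-tests,
# here with a simple Bonferroni adjustment (alpha / number of comparisons)
if p < .05:
    pairs = list(combinations(groups, 2))
    alpha_adj = .05 / len(pairs)
    for a, b in pairs:
        t, p_pair = stats.ttest_ind(groups[a], groups[b])
        print(f"{a} vs. {b}: t = {t:.2f}, p = {p_pair:.4f} "
              f"(significant at adjusted alpha {alpha_adj:.4f}: {p_pair < alpha_adj})")
```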

t & F as Tests of Mean Differences

  • t- and F-tests are tests of mean differences
    • t-tests are used for two means
    • F-tests are used for three or more means
  • I.e., they are used to test whether group means are significantly different from each other
    • They can also be used to test whether a mean is different from zero

t & F as Tests of Signal-to-Noise Ratio

  • Generally, information in a sample of data is placed into two categories:
    • “Signal,” e.g.:
      • Difference between group means,
      • Magnitude of change over time, or
      • Amount two variables co-vary/co-relate
    • “Noise,” e.g.:
      • Differences within a group
      • “Error”—anything not directly measured

As Signal-to-Noise Ratio (cont.)

  • E.g., the F-test in an ANOVA
    • A ratio of the “mean square” between an effect’s groups/levels
      • Versus the “mean square error” within each group
    • If F < 1, then there is more noise than signal
      • And no chance of significance
    • If F > 1, then the sample size (via the degrees of freedom) determines whether the value is big enough to be significant
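
A sketch of the F ratio computed “by hand” as mean square between over mean square within, using small hypothetical samples, then checked against SciPy’s built-in one-way ANOVA:

```python
# The F ratio as signal-to-noise, computed by hand; the three samples are
# hypothetical.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

grand_mean = np.mean(np.concatenate(groups))
k = len(groups)                                  # number of groups
n_total = sum(len(g) for g in groups)

# Signal: mean square between (the effect)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Noise: mean square within (the error)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n_total - k)

F = ms_between / ms_within
p = stats.f.sf(F, k - 1, n_total - k)            # upper-tail p-value for F
print(F, p)

# Should match SciPy's built-in one-way ANOVA
print(stats.f_oneway(*groups))
```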

Example of Signal-to-Noise Ratio

  • \(\frac{\text{Signal}}{\text{Noise}}=\frac{\text{Mean Square Effect}}{\text{Mean Square Error}}\)
  • E.g., for the between groups effect: \(\frac{134.613}{2.159} = 62.360\)

From: Chularee, S., Tapin, J., Chainok, L., & Chiaranai, C. (2024). Effects of project‐based learning on entrepreneurship skills and characteristics of nursing students. Nursing & Health Sciences, 26(3), e13160-n/a. doi: 10.1111/nhs.13160

(Note that the mean square effect for GPA is mis-computed: its df should be 1 for a continuous variable.)

Types of ANOVAs

The ANOVA Family

  • There is a “family” of tests that are all related mathematically
    • And are all given names similar to “ANOVA”
  • All use ordinary least squares to compute the linear regression
    • And similar (but not identical) methods to estimate error
    • Differences in how the error term is computed are the main way they differ

The ANOVA Family (cont.)

  • Common assumptions of all ANOVA-family models
    • Normally distributed errors
    • Homogeneity of variance (except in robust methods)
  • All require post hoc analyses to find specific differences between levels of significant categorical predictors
    • E.g., if there is a significant main effect for (non-binary) gender,
      • We must then conduct post hocs to see which genders had different outcomes

Types of ANOVAs

ANOVA Type | Categorical Predictors (IVs) | Continuous Predictors (IVs) | Number of Outcome Variables (DVs)
One-Way | 1 | 0 | 1
Two-Way | 2 | 0 | 1
ANCOVA | 1+ | 1+ | 1
Repeated Measures | 1+ | 1+ | 1
MANOVA | 1+ | 1+ | 2+
Mixed-Design | 1+ | 1+ | 1+

N.b., we can combine types, e.g., a “repeated measures MANCOVA”

One-Way ANOVA

  • Compares means across 3+ levels of one predictor (IV)
    • (Can be used to test between 2 levels, but t-tests are better)
  • Assumes:
    • Independence of observations
    • Normally distributed errors
    • Homogeneity of variance across groups
One-Way ANOVA Example

Kusi Amponsah, A., Oduro, E., Bam, V., Kyei-Dompim, J., Ahoto, C. K., & Axelin, A. (2019). Nursing students and nurses’ knowledge and attitudes regarding children’s pain: A comparative cross-sectional study. PloS One, 14(10), e0223730. doi: 10.1371/journal.pone.0223730

Two-Way ANOVA

  • Tests effects of two categorical IVs
    • (On one continuous DV)
  • Can evaluate both:
    • Main effects &
    • Interaction
  • Assumptions similar to one-way ANOVA
  • E.g., Arsat et al. (2023): Effects of nurse experience & work setting on caring behavior

ANCOVA

  • Includes both categorical & continuous predictors (IVs)
    • Continuous predictors are added to control for their effects
    • But they are often investigated for significance, interactions, etc.
  • E.g., Solera-Sanchez et al. (2021): Effects of physical activity, diet adherence, sleep quality, sleep duration, and screen time on health-related quality of life in adolescents

Repeated Measures ANOVA

  • Used when same participants measured more than once
  • Accounts for within-participant correlations
  • Assumptions:
    • “Sphericity,” that the variances of the differences between each pair of time points are equal (roughly, that scores don’t get more, or less, spread out over time)
    • And normality, etc.
  • E.g., Gattinger et al. (2023): Effects of an educational program on nurses’ competence, self-efficacy, and musculoskeletal health
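
A minimal repeated-measures sketch using statsmodels’ AnovaRM; the long-format data (one row per participant per time point) and the variable names are hypothetical:

```python
# Repeated-measures ANOVA sketch; participants, time points, and scores are
# hypothetical.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

df = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "time":        ["pre", "post", "followup"] * 4,
    "competence":  [3.1, 4.0, 4.2, 2.8, 3.9, 4.1, 3.5, 4.4, 4.6, 3.0, 3.8, 4.0],
})

# One within-participants factor (time); AnovaRM accounts for the nesting of
# repeated scores within each participant
result = AnovaRM(df, depvar="competence", subject="participant",
                 within=["time"]).fit()
print(result)
```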

MANOVA

  • “Multivariate ANOVA”: multiple DVs analyzed simultaneously
    • Thus tests whether groups differ on a combination of outcomes
  • Requires multivariate normality
  • E.g., Shaygan et al. (2022): Assessed changes in pain intensity and pain catastrophizing between the two groups over time
    • A repeated-measures MANOVA: Alzoubi et al. (2023)
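
A minimal MANOVA sketch with statsmodels; the data file and column names (two pain-related outcomes and a group factor) are hypothetical:

```python
# MANOVA sketch: two DVs tested jointly against one grouping factor.
# The data file and column names are hypothetical.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("pain_study.csv")               # hypothetical data file

# Two DVs on the left-hand side, tested jointly against the group factor
mv = MANOVA.from_formula("pain_intensity + catastrophizing ~ group", data=df)
print(mv.mv_test())                              # Wilks' lambda, Pillai's trace, etc.
```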

Mixed-Design ANOVA

  • Mixed design combines within- and between-participants factors
    • Another way to test effects over time
  • Evaluates:
    • Time effects (within)
    • Group differences (between)
    • Interaction
  • E.g., Erwin et al. (2016): A two-way mixed-design ANOVA was employed with time of testing (morning, afternoon, evening) as the within-participants factor and gender as the between-participants factor

Repeated Measures vs. Mixed-Design

  • Repeated measures ANOVA
    • All independent variable levels applied to the same participants
    • “Do participants change across time or conditions?”
  • Mixed-design ANOVA
    • Some factors vary within participants; others vary between participants
    • “Do groups differ, do individuals change, and do these effects interact?”

Repeated Measures vs. Mixed-Design (cont.)

  • Use repeated measures when all data are within-participant
  • Use mixed design when combining within- and between-participant comparisons

Repeated Measures vs. Mixed-Design (end)

Feature Repeated Measures ANOVA Mixed-Design ANOVA
Design type Within-participants only Combination of within- and between-participants (split-plot design)
Example use Measuring the same individuals’ anxiety at 3 time points Comparing treatment × time in different groups
Primary question Do individuals change over time or across conditions? Do groups differ, do individuals change, and do these effects interact?
Participants Same participants in all conditions Some factors vary within, others between participants
Main effects tested Within-participants effects only Both within-participant and between-participant effects and their interaction
Assumptions Normality, sphericity Normality, sphericity (within), homogeneity of variances (between)
Statistical model All effects are nested within participants Combines nesting with group-level fixed effects

Post hoc Analyses

Post hoc Analyses

  • Since F-tests only test whether there are one or more differences somewhere among the levels of a variable,
    • We must conduct further analyses to analyze where these differences are
  • One may include additional protections against false positives in the post hoc analyses
    • These “family-wise” adjustments keep the error rate at, e.g., .05 for the whole “family” of post hoc analyses
      • At the expense of reduced power for individual tests

Post hoc Analyses (cont.)

  • General strategy is to set the \(\alpha\) so that it stays at 0.05 for the group of tests
    • If conducting multiple comparisons, adjustments must be made to control the family-wise error rate (FWER)
    • Without adjustments, the probability of making at least one false positive increases as the number of comparisons increases

Post hoc Analyses (end)

  • E.g., assume we’re comparing three groups / levels of a predictor in post hoc analyses (three pairwise tests):
    • Each test uses \(\alpha\) = .05
    • Probability of at least one false positive:
      \(P(\text{at least one error}) = 1 - (1 - .05)^3 = 1 - (.95)^3 = 1 - .857 = .143\)
  • Can thus adjust \(\alpha\) per test to be \(\frac{.05}{3}\) = .0167
    • Ensures overall FWER does not exceed .05
  • This is called the Bonferroni correction
    • It is simple, but under-powered (Perneger, 1998)
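
A short sketch of the family-wise error calculation above, plus Bonferroni & Holm adjustments via statsmodels; the three p-values are hypothetical post hoc results:

```python
# Family-wise error rate and two common adjustments; the p-values below are
# hypothetical post hoc results.
from statsmodels.stats.multitest import multipletests

m = 3
fwer_unadjusted = 1 - (1 - .05) ** m             # = .143, as above
print(f"Unadjusted FWER for {m} tests: {fwer_unadjusted:.3f}")

pvals = [.012, .034, .210]
for method in ("bonferroni", "holm"):
    reject, p_adj, _, _ = multipletests(pvals, alpha=.05, method=method)
    print(method, reject, p_adj.round(3))
```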

Summary of Common Post Hocs

Test | Type of Comparison | Equal Variances? | Note | Best For
Bonferroni | Any | Yes | Very conservative | Planned comparisons
Holm | Any | Yes | Less conservative | Planned comparisons
Tukey HSD | Pairwise | Yes | | General ANOVA post hoc
REGW | Pairwise | Yes | Requires equal ns | More power than Tukey
Games-Howell | Pairwise | No | | Unequal variances/samples
Dunnett | Against Control | Yes | | Comparisons vs. a control group
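
A sketch of a Tukey HSD post hoc with statsmodels; the scores & group labels are hypothetical:

```python
# Tukey HSD post hoc sketch; scores and group labels are hypothetical.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([21, 24, 19, 23, 26, 27, 25, 28, 23, 25, 22, 24])
labels = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

# Tukey HSD compares every pair of groups while holding the family-wise
# error rate at alpha
print(pairwise_tukeyhsd(endog=scores, groups=labels, alpha=.05))
```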

When to Use Which Post Hoc

  • Simple Pairwise Comparisons
    • REGW-Q balances well between false positives and false negatives (power)
    • REGW-Q is best with equal variances and sample sizes
  • Unequal Variances or Sample Sizes
    • Games-Howell handles unequal variances and/or unequal sample sizes
  • Complex Comparisons
    • Scheffé’s method offers the necessary flexibility
    • But it is very conservative (Kim, 2015), so simpler analyses (or non-ANOVA models) are worth striving for

Planned Comparisons

  • Instead of using an F-test followed by all possible comparisons in post hoc analyses,
    • Can create a set of planned comparisons of only a subset of interesting tests
      • Including interaction effects
    • Can still use family-wise error correction if desired
  • Theory-driven & flexible
  • Doesn’t waste power on unnecessary comparisons
  • But less open to serendipity

The End

Reference Tables

Key Terms & Concepts

Term | Definition
ANOVA (Analysis of Variance) | A statistical method used to compare means among three or more groups by analyzing variance components.
Main effect | The effect of one independent variable on the dependent variable, averaging across the levels of other variables.
Interaction effect | In factorial ANOVA, the combined effect of two or more independent variables on the dependent variable.
F-statistic | The ratio of between-group variance to within-group variance. A higher F suggests a greater likelihood of a true effect.
p-value | Probability of observing the data (or more extreme) under the null hypothesis. A low p-value suggests statistical significance.
Effect size (e.g., η²) | Measures the proportion of variance in the dependent variable explained by the independent variable.
Post hoc test | Follow-up comparisons performed after a significant ANOVA to determine which specific group means differ.

More Terms & Concepts

Term | Definition
Degrees of Freedom (df) | The number of independent values used to estimate a parameter; differs for between and within groups.
Between-group variance | Variability in the data that is due to the differences between group means. Reflects the effect of the independent variable.
Within-group variance | Variability among individuals within the same group. Often considered “error” or residual variance.
Sum of Squares Between (SSB) | Total squared deviation of each group mean from the overall mean, weighted by group size.
Sum of Squares Within (SSW) | Total squared deviation of individual scores from their respective group mean.
Sum of Squares Total (SST) | Total squared deviation of each observation from the overall mean; SST = SSB + SSW.
Mean Square Between (MSB) | SSB divided by its degrees of freedom; an estimate of between-group variance.
Mean Square Within (MSW) | SSW divided by its degrees of freedom; an estimate of within-group (error) variance.