Overview

  • Descriptive & Inferential Statistics
  • Sources of Variance and the Signal-to-Noise Ratio
  • Designing and Answering Questions
  • Hypothesis Testing

Descriptive & Inferential Statistics

Descriptives vs. Inferentials

  • Descriptives make no assumptions about the population from which the sample was drawn
    • They simply describe the sample of data
      • Counts, percents, ratios/odds
    • Can also describe the distribution of the sample
      • Central tendency (mode, median, mean)
      • Dispersion (standard deviation, skewness, kurtosis)

Descriptives vs. Inferentials (cont.)

  • Inferentials make assumptions—inferences—about the nature of the entire population of data
  • The assumptions made can vary
    • Aren’t always the same
    • And sometimes the assumptions can be tested
  • Making assumptions allows us to conduct hypothesis tests
    • Hypothesis testing doesn’t define inferential stats
    • But is the most common reason to make the assumptions

Review of Descriptive Statistics

  • Not to be underestimated
    • In addition to being informative, they are inherently robust
  • Robust statistics are tolerant of violations of assumptions/inferences made about the population
    • And since descriptives don’t make any assumptions about it…
  • N.b., however, that the line between descriptive & inferential is blurry
    • And inferential tests can be made on descriptives
      • E.g., if the counts/frequencies of occurrence are the same between two groups
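
A minimal sketch of that last bullet in Python, with made-up counts: a chi-square test asks whether event frequencies differ between two groups.

    from scipy.stats import chi2_contingency

    # Hypothetical counts: events vs. non-events in two groups of 100
    table = [[30, 70],   # Group A: 30 events, 70 non-events
             [45, 55]]   # Group B: 45 events, 55 non-events

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p:.3f}")   # tests whether the frequencies differ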

Review of Descriptive Statistics (cont.)

  • Central tendency
    • Yes, where the “center” of the data “tends” to be
    • “Center,” of course, can be differentially defined
    • Mode: the most common value; very robust to outliers
    • Median: the value with the same number of other values on either side
      • Often used instead of mean when there are many outliers
    • Mean: average value; least robust to outliers
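
A quick illustration of the three measures’ differing robustness, using a small hypothetical sample:

    from statistics import mode, median, mean

    scores = [2, 3, 3, 4, 5, 6, 7]    # hypothetical sample
    outliers = scores + [40]          # same sample plus one extreme value

    for label, data in [("without outlier", scores), ("with outlier", outliers)]:
        print(label, "-> mode:", mode(data), "median:", median(data),
              "mean:", round(mean(data), 2))
    # The mode and median barely move; the mean is pulled toward the outlier.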

Review of Descriptive Statistics (cont.)

  • Dispersion
    • How spread out the data are
    • Standard deviation
      • On average, how far a given score is from the mean
      • Equivalent for the median is the median absolute deviation (MAD)
        • The median distance of scores from the median
    • Standard deviation is also related to variance (discussed a bit later)
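
A sketch contrasting the SD and the MAD on a hypothetical sample with one outlier:

    from statistics import median, stdev

    data = [4, 5, 5, 6, 7, 9, 30]              # hypothetical sample with one outlier

    sd = stdev(data)                           # spread around the mean
    med = median(data)
    mad = median(abs(x - med) for x in data)   # median distance from the median

    print(f"SD = {sd:.2f}, MAD = {mad}")       # the outlier inflates the SD; the MAD stays small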

Review of Descriptive Statistics (cont.)

  • Dispersion (cont.)
    • Skew & kurtosis
      • Skew: whether one of the distribution’s tails is longer than the other
      • Kurtosis: how heavy the distribution’s tails are
        • Not a measure of “flatness” or “peakedness”
        • “Leptokurtic”: heavy tails; prone to outliers
        • “Platykurtic”: light tails; few extreme values
    • Skew is more problematic than kurtosis
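
To see skew and kurtosis as distinct properties, compare simulated samples (the distributions chosen here are arbitrary):

    import numpy as np
    from scipy.stats import skew, kurtosis

    rng = np.random.default_rng(0)
    samples = {
        "normal":      rng.normal(size=100_000),
        "t(5)":        rng.standard_t(df=5, size=100_000),   # heavy tails, symmetric
        "exponential": rng.exponential(size=100_000),        # long right tail
    }

    for name, x in samples.items():
        print(f"{name:12s} skew = {skew(x):5.2f}  excess kurtosis = {kurtosis(x):5.2f}")
    # t(5) is leptokurtic with near-zero skew; the exponential is strongly skewed.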

Assumptions in Inferential Statistics

  • Three general types of assumptions:
    1. That the sample represents the population
    2. That each data point (“datum”) is independent of the others
    3. That the population’s data are normally distributed
  • There are more/other assumptions that can be made, e.g.:
    • Other distribution shapes (e.g., logarithmic)
    • Nature of any missing data
    • Whether data are continuous or discrete

Assumptions in Inferential Statistics (cont.)

  • For hypothesis testing, some assumptions matter more than others
    • I.e., hypothesis tests tend to be robust against some violations of our assumptions
    • But not others
  • Understanding assumptions & their effects can
    help interpret results
  • Next:
    • Robustness of common assumptions for
      “ordinary least squares” (t-tests, ANOVAs)

Assumptions in Inferential Statistics:
Representativeness

  • That the sample represents the population
    • Robust with larger samples (also called “asymptotically robust”)
  • This is a manifestation of regression to the mean
    • Probably better called “convergence to the population mean”
    • I.e., that sample values tend to resemble population values when:
      • The sample size gets bigger
      • Repeated samples are drawn
    • N.b. that this also applies to other stats, not just the mean

Assumptions in Inferential Statistics:
Representativeness (cont.)

  • Standard error of the mean (SEM, or just “standard error,” SE)
    • A measure for how well the sample mean represents the population mean
      • E.g., the population mean should have a ~68% chance of being within 1 SEM of the sample mean
    • The standard deviation describes the sample;
      SEM describes relationship between sample & population
    • The SEM gets smaller as the sample size gets larger
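
A small simulation of that last point, using a hypothetical population of scores:

    import numpy as np

    rng = np.random.default_rng(1)
    population = rng.normal(loc=100, scale=15, size=1_000_000)  # hypothetical population

    for n in (10, 100, 1_000, 10_000):
        sample = rng.choice(population, size=n)
        sem = sample.std(ddof=1) / np.sqrt(n)   # SEM = SD / sqrt(n)
        print(f"n = {n:6d}  SEM = {sem:.3f}")
    # The SEM shrinks by roughly a factor of sqrt(10) at each step.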

Assumptions in Inferential Statistics:
Representativeness (cont.)

  • The SEM is an inferential statistic
    • Viz., it assumes that participants/samples are independent & unbiased
      • Often written as “independent and identically distributed” (iid)
      • (N.b. that people can possibly be resampled)
  • The SEM is thus robust to violations of normality

Assumptions in Inferential Statistics:
Representativeness (end)

  • The SEM is robust in part because the distribution of sample means tends to be symmetrical
    • I.e., the population mean has the same chance of being greater than or less than the sample mean
  • And if sample means tend to be symmetrically distributed,
    • Then the distribution of sample means tends to be
      normally distributed
  • This, my friends, is the Central Limit Theorem
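
A minimal simulation of the Central Limit Theorem, drawing repeated samples from a deliberately non-normal (exponential) population:

    import numpy as np
    from scipy.stats import skew

    rng = np.random.default_rng(2)

    # 10,000 means of samples of 50 from an exponential population (skew = 2)
    sample_means = np.array([rng.exponential(size=50).mean() for _ in range(10_000)])

    print("skew of one raw sample:   ", round(skew(rng.exponential(size=10_000)), 2))
    print("skew of the sample means: ", round(skew(sample_means), 2))
    # The distribution of sample means is far more symmetric (and nearly
    # normal) than the population the samples came from.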

Assumptions in Inferential Statistics:
Sample Independence

  • That each participant is independent of the others
    • Not robust! This assumption matters!
    • If one participant’s values are affected by other participants, this can introduce several types of bias
      • Can create false positives and/or false negatives
        • I.e., increase Type 1 and/or Type 2 errors
    • Can sometimes be addressed by properly “nesting” participants
      • E.g., patients in units, units in hospitals, etc.
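
A sketch of why this matters: the simulation below (cluster and noise sizes are made up) runs a naive t-test on clustered data in which the true treatment effect is zero, and the false-positive rate comes out far above the nominal 5%.

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(3)
    n_sims, false_positives = 2_000, 0

    for _ in range(n_sims):
        # Each arm: 5 clusters (e.g., hospital units) of 20 patients.
        # A shared cluster effect makes patients within a unit non-independent.
        a = (rng.normal(0, 1, size=(5, 1)) + rng.normal(0, 1, size=(5, 20))).ravel()
        b = (rng.normal(0, 1, size=(5, 1)) + rng.normal(0, 1, size=(5, 20))).ravel()
        if ttest_ind(a, b).pvalue < .05:   # naive test ignoring the clustering
            false_positives += 1

    print("false-positive rate:", false_positives / n_sims)   # well above .05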

Assumptions in Inferential Statistics:
Sample Independence (cont.)

  • That each data point (“datum”) is independent of the other data
  • A related assumption is that terms in a model are unrelated
    • E.g., that the independent variables (IVs) in an ANOVA are unrelated
      • And unrelated to the error term
    • Can manifest as multicollinearity
      • I.e., that 2+ IVs are highly correlated with each other
    • Ordinary least squares (OLS) tests—like ANOVAs—are generally robust against multicollinearity
      • Unless it’s extreme (e.g., r > .8)
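
A quick check for multicollinearity between a pair of IVs; the 1/(1 − r²) variance-inflation formula below is the two-predictor case.

    import numpy as np

    rng = np.random.default_rng(4)
    x1 = rng.normal(size=500)
    x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)   # x2 nearly duplicates x1

    r = np.corrcoef(x1, x2)[0, 1]
    vif = 1 / (1 - r**2)                         # variance inflation factor (2 IVs)
    print(f"r = {r:.3f}, VIF = {vif:.1f}")       # r near 1 here -> extreme multicollinearity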

Assumptions in Inferential Statistics:
Normality

  • I.e., the population’s data are normally distributed
  • OLS is robust against some deviations from normality:
    • Robust against kurtosis
      • Rarely actually matters
    • Moderately/asymptotically robust against skew
      • Especially from outliers
  • Not robust against multimodality
    • I.e., having more than one “hump” in the data

Assumptions in Inferential Statistics:
Normality (cont.)

  • Outliers have an out-sized effect on results
    • I.e., they affect descriptions & inferences more than data closer to the center
      • This is, in fact, by design: OLS does this on purpose
    • Nonetheless, researchers should somehow address outliers
  • Their effect is lessened as sample size increases
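
A sketch of that last point, adding the same extreme (hypothetical) outlier to samples of increasing size:

    import numpy as np

    rng = np.random.default_rng(5)

    for n in (10, 100, 1_000):
        sample = rng.normal(loc=50, scale=5, size=n)
        with_outlier = np.append(sample, 500)    # one extreme value
        shift = with_outlier.mean() - sample.mean()
        print(f"n = {n:5d}  mean shift from one outlier: {shift:6.2f}")
    # The same outlier moves the mean of 10 points far more than of 1,000.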

Assumptions in Inferential Statistics:
Normality (cont.)

  • Multimodal data
    • Can indicate two subsamples
      • I.e., that the sample should be split, or “stratified”
      • Or analyzed with “localized” measures that focus on only part of the range
    • Must be addressed somehow, since measures of central tendency are misleading for multimodal data
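
A minimal illustration with two made-up subsamples: the mean of the mixed sample describes neither group, while the stratified means do.

    import numpy as np

    rng = np.random.default_rng(6)
    group_a = rng.normal(loc=40, scale=3, size=500)   # hypothetical subsample A
    group_b = rng.normal(loc=70, scale=3, size=500)   # hypothetical subsample B
    mixed = np.concatenate([group_a, group_b])        # bimodal when pooled

    print("mean of mixed sample:", round(mixed.mean(), 1))               # ~55
    print("stratified means:", round(group_a.mean(), 1), round(group_b.mean(), 1))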

The Signal-to-Noise Ratio /
Variance & Covariance

Signal-to-Noise Ratio

  • Generally, information in a sample of data is placed into
    two categories:
    • “Signal,” e.g.:
      • Difference between group means
      • Magnitude of change over time
      • Amount two variables co-vary/co-relate
    • “Noise,” e.g.:
      • Difference within a group
      • “Error”—anything not directly measured

Signal-to-Noise Ratio (cont.)

  • Many statistics & tests are signal-to-noise ratios
    • Some investigate multiple signals & even multiple sources of noise
  • Generally, if there is more signal than noise,
    • We can then test if there is enough of a signal to “matter”
    • I.e., be significant
  • E.g., the F-test in an ANOVA
    • Is a ratio of the “mean square” between groups/levels vs. the
      “mean square error” within each group
    • If F > 1, then use sample size to determine if the value is big enough to be significant

Signal-to-Noise Ratio (cont.)

  Source                        Sum of Squares    df   Mean Square      F      p   Partial η²
  Information Intervention               4.991     1         4.991  5.077   .025        0.028
  Telesimulation Intervention            0.349     1         0.349  0.355   .552        0.022
  Error                                172.061   175         0.983
  • \(\frac{Signal}{Noise}=\frac{Mean\ Square\ Between}{Mean\ Square\ Error}\)
  • E.g., for the Information Intervention: \(\frac{4.991}{0.983} = 5.077\)
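
The p-value in the table can be recovered from the F distribution itself; a sketch using the Information Intervention row:

    from scipy.stats import f

    F = 4.991 / 0.983                   # mean square between / mean square error
    p = f.sf(F, dfn=1, dfd=175)         # upper-tail probability of F(1, 175)
    print(f"F = {F:.3f}, p = {p:.3f}")  # F ≈ 5.077, p ≈ .025, matching the table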

Designing and Answering Questions

Statistics as a Way of Thinking

  • Arguably a fundamental way quantitative research differs from qualitative
  • In part—yes—statistics is knowing the sorts of questions to ask
    • As Fisher said, “[t]o call in the statistician after the experiment is done may be no more than asking [them] to perform a post-mortem examination: [they] may be able to say what the experiment died of.”
  • But hopefully it’s more centered on an understanding of science and the philosophical Zeitgeist within which it operates
    • Stats embodies science’s parsimony, objectivity,
      systematicity, precision, & probabilistic nature

Theory and Observation

  • Understanding the nature of scientific theory
    • And its (preeminent) relationship to research design & interpretation
  • Deciding what & how to measure
    • Levels of measurement, bias, roles of
      assumptions
  • Study design, analysis, & interpretation
    • E.g., possible mechanisms & causal
      relationships
  • Model building

Hypothesis Testing

Basic Assumptions in Hypothesis Testing

  • Null hypothesis:
    • That there is no difference between the groups
      • (Or zero effect of treatment, etc.)
  • Significance test:
    • “What is the probability of obtaining the
      sample data if the null hypothesis is true?”
    • E.g., p = .02 is the probability of finding
      the effect if the null is true

Basic Assumptions in Hypothesis Testing (cont.)

  • But the null is rarely—if ever—true
    • There is likely some effect
  • And the “noise” of most significance tests is reduced by larger samples
    • This is related to the idea of regression to the mean
      • And larger samples having smaller SEMs
    • I.e., population distribution can be accurately measured with little error
  • Therefore, with a large enough sample, even small differences can be detected
    • “[T]he p-value is a measure of sample size” –Andy Gelman
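
A simulation of Gelman’s point: the code below holds a tiny group difference (0.05 SD, an arbitrary choice) fixed and only grows the sample.

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(7)

    for n in (100, 1_000, 10_000, 100_000):
        a = rng.normal(0.00, 1, size=n)
        b = rng.normal(0.05, 1, size=n)   # same small "signal" every time
        print(f"n per group = {n:7d}  p = {ttest_ind(a, b).pvalue:.4f}")
    # On average, p shrinks as n grows even though the effect never changes
    # (any single run is noisy at small n).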

Basic Assumptions in Hypothesis Testing (cont.)

  • That’s not bad or wrong
    • Or good and right
  • It’s simply the result of decisions made about how to make decisions
  • It’s the nature of the tools we use
    • And thus simply informs how we should use them.

Summary

Assumptions & Robustness

  • It’s important to understand what assumptions are made
    • And which matter & how
  • Generally, richer stats are less robust
    • Descriptives are inherently robust
    • Non-parametric statistics are more robust
      than parametric ones

Representativeness

  • Regression to the mean
    • “Convergence to population values”
    • For the mean, but also other parameters (SD, skew, etc.)
    • Not to be confused with the Central Limit Theorem
  • Standard error of the mean
    • Like a SD for the sample mean
    • Asymptotically robust

Sampling Independence

  • Matters greatly
    • Can increase both false positives and false negatives
    • Can not only affect studies,
      • But also entire lines/areas of research
  • Multicollinearity
  • Hierarchical / Multilevel / Mixed Models

Normality

  • The point isn’t to worry so much about whether your data are normal
    • In fact, it’s exceedingly unlikely they are
  • The point is to get good data
    • And to understand them for what they are
  • It’s mostly in collections of well-collected samples that we can rely on normality
    • Otherwise, it’s understanding when & how data are robust to violations of it

Normality (cont.)

  • Few—if any—distributions of real data are normal
    • The exception: statistics aggregated across independent samplings (via the central limit theorem)
    • And most stats are robust to violations of normality
      • But multimodality & non-independence (data that aren’t iid) are dangerous
  • Outliers, though, should not be ignored

Signal-to-Noise Ratio

  • The foundation of most inferential statistics
    • Inherently assumes—categorizes—some information as one or more “signals” & the rest as “noise”
    • Thus, it’s important to ensure this categorization is well done
  • It’s also as much part of study design as it is of study analysis
    • Indeed, along with theory, it’s a main driver of design

Hypothesis Testing

  • p-Value as
    • Chance to find effects assuming the null is true
    • Not just a measure of signal vs. noise
      • It’s also a measure of sample size

   The End