Understanding
Commonly-Reported
Statistics

Overview

  • Definition of Populations & Samples
  • Descriptive vs. Inferential Statistics
  • Descriptives
    • Central tendency & dispersion
    • Relationships
  • Inferential Statistics
    • Concepts
      • Hypothesis Testing
      • Signal-to-Noise Ratio
      • Effect Size
    • Common Tests
      • χ²
      • t & F

Populations & Samples

  • Population
    • An entire, well-defined group
    • E.g.:
      • All of those residing in New York City
      • Community-dwelling adults in NYC
      • Rarely actually measure an entire population
  • Sample
    • A subset of a population
    • Those who are actually measured
    • The larger the sample (i.e., the greater the percentage of the population it represents), the more reliably we can use the sample to make inferences about the population

Descriptives vs. Inferentials

Descriptives

  • Descriptives make no assumptions about the population from which the sample was drawn
    • They simply describe the sample of data
      • Counts, percents, ratios/odds
    • Can also describe the distribution of the sample
      • Central tendency (mode, median, mean)
      • Dispersion (standard deviation, skewness, kurtosis)
    • Or describe relationships between variables
      • Correlations
      • Odds ratios

Inferentials

  • Inferentials make assumptions—inferences—about the nature of the entire population of data
  • The assumptions made can vary
    • Aren’t always the same
    • And sometimes the assumptions can be tested
  • Making assumptions allows us to conduct hypothesis tests
    • Hypothesis testing doesn’t define inferential stats
    • But is the most common reason to make the assumptions

Descriptive Statistics

Value of Descriptive Statistics

  • Not to be underestimated
    • In addition to being informative, they are inherently robust
  • Robust statistics are tolerant of violations of assumptions/inferences made about the population
    • And since descriptives don’t make any assumptions about it…

Central Tendency

  • Yes, where the “center” of the data “tends” to be
  • “Center,” of course, can be differentially defined
    • Mode
      • The most common value
        • Robust to outliers and oddly-shaped distributions
    • Median
      • The value with the same number of other values on either side
      • Used instead of mean when there are many outliers
        • Or less “normal” data
    • Mean
      • Average value; least robust to outliers
      • Abbreviated as \(\bar{X}\) (or sometimes M)
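
As a quick sketch (in Python, with made-up scores), the three "centers" can disagree when an outlier is present:

```python
from statistics import mean, median, mode

# Hypothetical sample of ten scores; 30 is an outlier
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]

print(mode(scores))    # most common value: 5
print(median(scores))  # middle value: 5, robust to the outlier
print(mean(scores))    # 7, pulled upward by the outlier
</parameter>```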

Dispersion

  • How spread out the data are
    • I.e., how much variance there is in the data
  • Standard deviation (SD)
    • On average, how far a given score is from the mean
  • Median absolute deviation (MAD)
    • Equivalent for the median
      • The median distance of scores from the median
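
A minimal sketch (Python, hypothetical data) of SD vs. MAD; note how much a single outlier inflates the SD while the MAD stays small:

```python
from statistics import median, stdev

# Hypothetical sample; 40 is an outlier
scores = [4, 5, 5, 6, 6, 7, 8, 40]

sd = stdev(scores)  # sample standard deviation, inflated by the outlier (≈ 12.1)

# Median absolute deviation: the median distance of scores from the median
med = median(scores)
mad = median([abs(x - med) for x in scores])  # robust: 1.0
```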

Final Thoughts on Dispersion (Variance)

  • Again, measures of dispersion are measures of variance
  • Variance defines how much information there is in a set of data
    • If everyone had the same value, then there would be very little information in that data
    • If everyone had a different value, then there is a lot of information
  • Therefore, variance is not bad!
    • The issue is explaining it…
  • This is related to degrees of freedom (df)
    • df being the degree to which the data
      can take on unique values
    • I.e., the amount of information it holds

Example of Mean & Standard Deviation

Iovino, P., Lyons, K. S., De Maria, M., Vellone, E., Ausili, D., Lee, C. S., Riegel, B., & Matarese, M. (2021). Patient and caregiver contributions to self-care in multiple chronic conditions: A multilevel modelling analysis. International Journal of Nursing Studies, 116, 103574–103574. https://doi.org/10.1016/j.ijnurstu.2020.103574

Descriptions of Relationships

  • We can measure relationships between two (or more) variables without making inferences about the population
    • Thus, measures of relationships can be descriptive
  • But, we can also use inferences to test the significance of relationships
    • Usually if the relationship is likely not zero
      • I.e., that there is some relationship between them

Correlation

  • How much the value of one variable can be predicted by the value of another
  • In magnitude, correlations range from 0 to 1
    • If we can predict with 100% accuracy,
      • we say the correlation is 1
    • If one variable provides no information
      • we say the correlation is 0
  • Strangely, a correlation is actually the square root of this proportion of predictable variance
    • E.g., If a correlation is .50,
      • Then the accuracy with which we could predict one variable from the other is:

\[r^2 = .50^2 = .25 \quad \text{or } 25\%\]
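
A small sketch (Python, invented paired data) of computing Pearson's r by hand and squaring it to get the proportion of predictable variance:

```python
from statistics import mean
from math import sqrt

# Hypothetical paired measurements, e.g., hours studied & exam scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-variation
sxx = sum((a - mx) ** 2 for a in x)                   # variation in x
syy = sum((b - my) ** 2 for b in y)                   # variation in y

r = sxy / sqrt(sxx * syy)  # Pearson correlation ≈ .77
r_squared = r ** 2         # proportion of predictable variance: .60
```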

Correlation (cont.)

  • Often conceived as whether they move together
    • I.e., as the values of one variable go up,
      • do the other values go up (or down), too?
    • If they move in the same direction,
      • we say it’s a positive correlation
    • If they move in opposite directions,
      • we say it’s a negative correlation
  • The symbol varies, depending on how the correlation is computed
    • r: Pearson, for 2 continuous
      • \(r_{pb}\): Point-biserial, 1 continuous & 1 dichotomous
    • ρ: Spearman, for 2 continuous/ordinal
    • τ: Kendall, for 2 ordinal

Example of Correlation

Table 2: Correlation between leadership style with job stress and anticipated turnover

Pishgooie, A. H., Atashzadeh‐Shoorideh, F., Falcó‐Pegueroles, A., & Lotfi, Z. (2019). Correlation between nursing managers’ leadership styles and nurses’ job stress and anticipated turnover. Journal of Nursing Management, 27(3), 527–534. https://doi.org/10.1111/jonm.12707

Odds

  • A measure of the relative frequency of two outcomes within a population
    • E.g., the number of patients who received human papillomavirus (HPV) vaccinations
    • Versus the number who did not
  • Can be expressed as a fraction:
    • E.g., among 1113 non-Hispanic Blacks, Rincon et al. (2024) found that:
      • 200 were vaccinated
      • 833 were not vaccinated
      • \(\frac{200}{833} \approx 0.24\)
      • I.e., about 1 : 4 odds of being vaccinated
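
The arithmetic above, sketched in Python with the counts quoted from Rincon et al. (2024):

```python
# Counts among non-Hispanic Blacks (Rincon et al., 2024)
vaccinated = 200
unvaccinated = 833

odds = vaccinated / unvaccinated          # ≈ 0.24
odds_against = unvaccinated / vaccinated  # ≈ 4.2, i.e., roughly 4 : 1 against
```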

Odds (cont.)

  • Why use odds?
    • Easily presents counts/frequencies
      • Themselves common outcome measures in healthcare
    • Can express rare events well
  • Are used in logistic regression

Odds Ratios

  • A measure of relative odds
    • How likely/common is something in one population
      vs. in another population
    • “What are the rates of HPV vaccinations among non-Hispanic Blacks versus non-Hispanic Whites?”
  • Symbolized as OR
    • aOR is “adjusted odds ratio”
      • The adjustment is usually to better
        reflect the effects of other variables
  • Expressed as a number
    • Usually is between 0.1 and 10
    • But can range from 0 to +\(\infty\)

Odds Ratio (cont.)

  • Is indeed a ratio of one odds to another odds:

\[\text{Odds}_{\text{Blacks}} = \frac{200\ \text{vaccinated non-Hispanic Blacks}}{833\ \text{unvaccinated non-Hispanic Blacks}} \approx 0.24\]

\[\text{Odds}_{\text{Whites}} = \frac{868\ \text{vaccinated non-Hispanic Whites}}{4451\ \text{unvaccinated non-Hispanic Whites}} \approx 0.20\]

\[\text{Odds Ratio} = \frac{\text{Odds}_{\text{Blacks}}}{\text{Odds}_{\text{Whites}}} = \frac{0.24}{0.20} \approx 1.2\]

  • Non-Hispanic Blacks have about 1.2 times the odds of being vaccinated against HPV as non-Hispanic Whites (Rincon et al., 2024).
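
The same ratio, sketched in Python from the counts above:

```python
# Counts from Rincon et al. (2024)
odds_black = 200 / 833    # ≈ 0.24
odds_white = 868 / 4451   # ≈ 0.20

odds_ratio = odds_black / odds_white  # ≈ 1.2
```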

Example of Odds Ratios

Fernandez-Lazaro, C. I., Brown, K. A., Langford, B. J., Daneman, N., Garber, G., & Schwartz, K. L. (2019). Late-career physicians prescribe longer courses of antibiotics. Clinical Infectious Diseases: An Official Publication of the Infectious Diseases Society of America, 69(9), 1467–1475. https://doi.org/10.1093/cid/ciy1130

Risks

  • Risks are simply probabilities
    • Risks are the chances of something happening
    • I.e., the number of people with some condition,
      • Divided by the total number of people
    • E.g., if 29 out of every 100 people in the US who develop hospital-onset methicillin-resistant S. aureus (MRSA) bloodstream infections die,
      • Then the risk of death among these patients is 0.29

Risk Ratios

  • Risk ratios (aka relative risk) are the probability of something happening in one group,
    • Relative to the probability of it happening in another group.
  • E.g.: Kourtis et al. (2019) reported that:
    • Risk of death from hospital-onset MRSA: 0.29
    • Risk of death from community-onset MRSA: 0.18
    • The risk ratio (RR) of mortality of hospital-onset vs. community-onset is thus:

\(RR = \frac{\text{Mortality risk of hospital-onset MRSA}}{\text{Mortality risk of community-onset MRSA}} = \frac{0.29}{0.18} \approx 1.61\)

  • Someone is 1.6 times as likely to die from a hospital-acquired MRSA blood infection as from a community-acquired infection
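
The same computation, sketched in Python with the risks from Kourtis et al. (2019):

```python
# Mortality risks from Kourtis et al. (2019)
risk_hospital = 0.29    # hospital-onset MRSA bloodstream infection
risk_community = 0.18   # community-onset

relative_risk = risk_hospital / risk_community  # ≈ 1.61
```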

Hazards

  • Hazards
    • Aka hazard rates, the risk of something happening over time.
      • E.g., the risk of dying from a procedure within 6 months
    • Typically used for deleterious outcomes (viz., morbidities & mortalities)
  • Hazard ratios
    • The relative risk of a given outcome
    • E.g., the risk of dying within 1 year of a procedure,
      • Versus the risk of dying without undergoing that procedure.

Example of Hazard Ratios

Alquézar-Arbé, A. et al. (2023). Influence of type of household on prognosis at one year in patients ≥65 years attending hospital emergency departments in Spain. The EDEN-6 study. Maturitas, 178, 107852–107852. doi: 10.1016/j.maturitas.2023.107852

Inferential Statistics:


General Concepts

Hypothesis Testing

Hypothesis Testing (cont.)

  • The “null” hypothesis is that there is no effect/difference
    • The p-value is technically the probability of finding the given pattern of data (or one even more extreme) if the null is true
    • It’s couched this way mainly for philosophical reasons
      • I.e., that we can’t prove an effect
        • But simply that there doesn’t seem to be
          nothing
      • Kind of like criminal court
        • We don’t say that someone is innocent
        • Just that there isn’t enough evidence to
          prove guilt


Odds Ratios of CVD and Other Comorbidities Among Older Adults with Opioid Use Disorders

Baumann, S. & Samuels, W. E. (in preparation). Prevalence of Comorbidities Among Older Adults with Opioid Use Disorders [or something like that].

The Signal-to-Noise Ratio &
Its Use in Hypothesis Tests

Signal-to-Noise Ratio

  • Generally, information in a sample of data is placed into
    two categories:
    • “Signal,” e.g.:
      • Difference between group means,
      • Magnitude of change over time, or
      • Amount two variables co-vary/co-relate
    • “Noise,” e.g.:
      • Difference within a group
      • “Error”—anything not directly measured

Signal-to-Noise Ratio (cont.)

  • Many statistics & tests are these ratios
    • Some investigate multiple signals & even multiple sources of noise
  • And if there is more signal than noise,
    • We can then test if there is enough of a signal to “matter”
    • I.e., be “significant”
  • E.g., the F-test in an ANOVA
    • A ratio of “mean square variance” between groups/levels vs. “mean square error” within each group
    • If F > 1, then use sample size to determine if the value is big enough to be significant

Signal-to-Noise Ratio (cont.)

Table 7.4: Tests of Between-Subject Effects on Health Literacy Knowledge

  Source                        Sum of Squares    df   Mean Square       F      p   Partial \(\eta^2\)
  Information Intervention               4.991     1         4.991   5.077   .025                0.028
  Telesimulation Intervention            0.349     1         0.349   0.355   .552                0.022
  Error                                172.061   175         0.983

  • \(\frac{\text{Signal}}{\text{Noise}}=\frac{\text{Mean Square Between}}{\text{Mean Square Error}}\)
  • E.g., for the Information Intervention: \(\frac{4.991}{0.983} = 5.077\)

Patton, S. (2022). Effects of telesimulation on the health literacy knowledge, confidence, and application of nursing students. Doctoral dissertation, The Graduate Center, CUNY.
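
The F ratios in Table 7.4 can be reproduced directly from their mean squares; a sketch in Python using the table's values:

```python
# Mean squares from Table 7.4 (Patton, 2022)
ms_information = 4.991  # signal: between-groups mean square
ms_telesim = 0.349
ms_error = 0.983        # noise: mean square error

f_information = ms_information / ms_error  # ≈ 5.077, as reported
f_telesim = ms_telesim / ms_error          # ≈ 0.355: more noise than signal
```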

Effect Size

  • The signal (along with noise) is thus often used as part of a hypothesis, e.g.:
    • The difference between two means
      • E.g., physicians vs. NPs giving instructions at discharge
    • Or if the mean of an effect is not zero
      • E.g., effect of drinking wine
  • But it can be used on its own to measure the size of the effect
    • This effect size measure is often standardized
      • To allow it to be compared across different outcomes
      • Of different studies
  • The use of effect size measures is growing
    • To help say how significant an effect is
    • Or even to replace significance tests

Effect Size (cont.)

  • There are several measures of effect size
    • Different ones for different types of data & analyses
    • Unfortunately, they’re not all on comparable scales
    • But they are still very useful
      • And at least worth presenting along
        with significance tests
  • Most created by Cohen (1988)
    • But Kraft (2020) argued for more modest
      expectations for education interventions

  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). L. Erlbaum Associates.
  • Kraft, M. A. (2020). Interpreting effect sizes of education interventions. Educational Researcher, 49(4), 241–253. https://doi.org/10.3102/0013189X20912798

Effect Size (end)

  Statistic   Notes & Refs                                             Small*   Medium   Large
  Cohen’s d   Cohen, 1988, p. 25                                       .2       .5       .8
  Cohen’s d   For ed. interventions (Kraft, 2020)                      .05      < .2     ≥ .2
  h           Difference btwn proportions; p. 184                      .2       .5       .8
  r           The correlation coefficient, p. 83                       .1       .3       .5
  q           Difference btwn correlations; p. 115                     .1       .3       .5
  w           For χ² goodness of fit & contingency tables; p. 227      .1       .3       .5
  η²          For (M)AN(C)OVAs                                         .01      .06      .14
  f & β       Also for (M)AN(C)OVAs; p. 285 & p. 355                   .1       .25      .4
  f & β       For ed. interventions (Kraft, 2020)                      .025     < .1     ≥ .1
  f² & β²     For multiple regression/correlation, p. 413;             .02      .15      .35
              multivariate linear regression & multivariate R², p. 477

* A “small” effect is an effect that accounts for about 1% of the total variance
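
A sketch of Cohen's d (Python, with invented group scores); the pooled-SD formula follows Cohen (1988):

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical control & intervention scores
control = [10, 12, 11, 13, 9, 12, 11, 12]
treated = [12, 14, 13, 15, 11, 14, 13, 14]

n1, n2 = len(control), len(treated)
s1, s2 = stdev(control), stdev(treated)

# Pooled standard deviation, then Cohen's d
s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (mean(treated) - mean(control)) / s_pooled  # ≈ 1.56: "large" by Cohen's benchmarks
```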

Example of Effect Sizes

Correlations between:

  • Compassion satisfaction & compassion fatigue
  • Compassion satisfaction & burnout
  • Compassion fatigue and burnout

From Zhang et al. (2018)

Inferential Statistics:


Common Tests

The \(\chi^2\) Distribution & Test

  • Background
    • Invented by Karl Pearson (in an abstruse 1900 article)
    • Originally used to test “goodness of fit”
      • If two sets of data follow the same distribution
        • E.g., if the distribution of healthy & unhealthy outcomes is the same for different races/ethnicities
      • Or how well a set of data fit a theoretical distribution
        • E.g., if a sample’s distribution is the same as a normal distribution

Characteristics of the \(\chi^2\) Distribution

  • The distribution’s shape, location, etc. are all determined by the degrees of freedom
    • I.e.:
      • The mean = df
      • The variance = 2df
        • SD = \(\sqrt{2df}\)
      • The curve’s peak (mode) falls at df – 2
        (when df > 2)
  • As the degrees of freedom increase:
    • The χ² curve approaches a normal distribution &
    • The curve becomes more symmetrical

Characteristics of the \(\chi^2\) Distribution (cont.)

Plots of Several χ² Distributions


Uses of the \(\chi^2\) Distribution

  • Because it only depends on df,
    • And resembles a normal distribution,
  • It is useful for testing if data follow a normal distribution
    • Or, more often, testing if the total set of deviations from normality
      • (Or from any set of expected values)
    • Is greater than expected by chance.
  • It can do this for discrete values—like counts
    • Since it depends only on counts

Uses of the \(\chi^2\) Distribution (cont.)

  • The χ² distribution has many uses, including:
    1. Estimating parameters of a population with an unknown distribution
    2. Checking the relationships between categorical variables
    3. Checking independence of two criteria of classification of multiple qualitative variables
    4. Testing deviations of differences between expected and observed frequencies
    5. Conducting “goodness of fit” tests
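
As an illustrative sketch (not a result reported in any cited paper), a χ² test of independence can be run by hand on the 2 × 2 vaccination counts quoted earlier from Rincon et al. (2024):

```python
# Observed counts: rows = non-Hispanic Blacks, Whites; cols = vaccinated, not
observed = [[200, 833],
            [868, 4451]]

total = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# chi² = sum of (observed - expected)² / expected over all cells
chi_sq = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / total
        chi_sq += (observed[i][j] - expected) ** 2 / expected

# df = (rows - 1)(cols - 1) = 1; the .05 critical value for df = 1 is 3.841
significant = chi_sq > 3.841
```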

Example of a \(\chi^2\) Test

Notes: “AA” = African Americans; p-values are from tests of χ²s

Zhang, A. Y., Koroukian, S., Owusu, C., Moore, S. E., & Gairola, R. (2022). Socioeconomic correlates of health outcomes and mental health disparity in a sample of cancer patients during the COVID-19 pandemic. Journal of Clinical Nursing. https://doi.org/10.1111/jocn.16266

t and F Statistics

  • Very common tests of differences in means
    • These are signal-to-noise ratios
    • Cannot be significant if there is more noise than signal
      • I.e., if |t| < 1 or if F < 1
    • If >1, then can be significant if the sample is big enough
  • t is used to test the mean difference between two groups
    (“t for two”)
    • F is used for three or more groups
  • Mathematically:
    • The distributions of each strongly resemble
      normal distributions
    • t² = F (when comparing two groups)
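
A sketch (Python, invented data) verifying that t² = F when two groups are compared; the t is a pooled two-sample t and the F comes from a one-way ANOVA on the same data:

```python
from statistics import mean
from math import sqrt, isclose

# Two hypothetical groups
g1 = [5, 7, 6, 8, 6]
g2 = [8, 9, 10, 9, 9]

n1, n2 = len(g1), len(g2)
m1, m2 = mean(g1), mean(g2)
ss1 = sum((x - m1) ** 2 for x in g1)  # within-group variation (noise)
ss2 = sum((x - m2) ** 2 for x in g2)

# Pooled two-sample t: mean difference (signal) over pooled standard error
sp2 = (ss1 + ss2) / (n1 + n2 - 2)
t = (m2 - m1) / sqrt(sp2 * (1 / n1 + 1 / n2))

# One-way ANOVA F on the same two groups
grand = mean(g1 + g2)
ms_between = (n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2) / (2 - 1)
ms_within = (ss1 + ss2) / (n1 + n2 - 2)
F = ms_between / ms_within

assert isclose(t ** 2, F)  # t² = F for two groups
```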

Example of t-Tests

β-weights are tested via t- or F-tests.

Associations between

  • Nurse staffing & skill mix and
  • Hospital consumer assessment of health care providers & systems measures
  • In pooled cross-sectional and longitudinal regression models

From Martsolf et al. (2016)

\(The\ End\)