Linear Regression Review
and
Testing Models Theoretically

Overview

  • Summary to Date
  • Review of linear regression model
  • Partialing out variance
  • Combining similar sources of variance
  • Ostensible & non-ostensible variables
  • Model fit

Summary to Date

Descriptives vs. Inferentials

  • Descriptives good
    • Simple & intuitive
      • Can efficiently describe the sample
    • Robust
      • Because they make no assumptions about the population
  • Mean & SD
    • SD as average distance from the mean
    • SD as a standard unit of measurement
      • Standardized (z) scores
        • Why correlation is so popular
        • And covariance isn’t
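As a concrete illustration of that last point, here is a minimal NumPy sketch (simulated height/weight data, invented for this example): covariance changes when the units of measurement change, but correlation, because it is computed on standardized (z) scores, does not.

```python
import numpy as np

rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, 500)              # height in meters
weight = 50 + 30 * height_m + rng.normal(0, 5, 500)

height_cm = height_m * 100                        # same data, new units

print(np.cov(height_m, weight)[0, 1])    # covariance in meter units
print(np.cov(height_cm, weight)[0, 1])   # 100x larger in centimeters
print(np.corrcoef(height_m, weight)[0, 1])    # identical...
print(np.corrcoef(height_cm, weight)[0, 1])   # ...regardless of units
```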

Descriptives vs. Inferentials (cont.)

  • Inferentials
    • Making assumptions about the population
    • Most importantly the distribution
      • Often assume it approximates a normal distribution
        • But we know we’re wrong
      • Assumptions most robust against:
        • Kurtosis (& skew a bit)
        • Non-independence of measures (“multicollinearity”)
        • Non-constant variance (“heteroscedasticity”)

Descriptives vs. Inferentials (end)

  • Inferentials (cont.)
    • Assumptions not robust against:
      • Non-independence of participants
      • Multi-modality (more than one “hump”)
  • Sample stats approximating population stats
    • Accuracy of sample stats improves with:
      • Larger sample sizes
      • More representative sampling
      • Multiple “draws” of samples

Central Limit Theorem

  • “Multiple ‘draws’ of samples”
    • Sample stats never equal population stats
      • A sample stat always has some error to its measurement
    • But! (assuming consistent sampling techniques)
      • The measurement error of sample stats tends strongly to be normally distributed
      • This leads to the Central Limit Theorem
        • Which undergirds—allows for—nearly every statistic you’ll use
        • So remember that, even if you rarely think about it
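A minimal simulation sketch of the theorem (simulated data; NumPy and SciPy, not from the lecture): the population below is strongly skewed, yet the means of repeated sample “draws” come out close to normally distributed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)  # heavily skewed

# Draw many samples and keep each sample's mean
sample_means = [rng.choice(population, size=50).mean()
                for _ in range(2_000)]

print(stats.skew(population))     # large: raw scores are skewed
print(stats.skew(sample_means))   # near zero: means are ~normal
```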

Variance & Covariance

  • Variance = Information
  • Seek to understand that information
    • The more we understand, the better
    • Often quantify “how much we understand” as a signal-to-noise ratio
      • \(\text{Variance understood} = \frac{\text{Variance accounted for}}{\text{Variance \textit{not} accounted for}}\)
  • So, if “accounting” for the effect of one variable on another:
    • \(\text{Variance understood} = \frac{\text{Covariance}}{\text{Unshared variance}}\)
    • Which is a correlation
      • (When it’s standardized)
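A quick worked example of the ratio above, with invented numbers: if a model accounts for 25 of 100 total units of variance, then

\[\text{Variance understood} = \frac{25}{100 - 25} = \frac{25}{75} \approx 0.33\]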

Variance & Covariance (cont.)

  • Variance = Information
  • Seek to understand that information
    • The more we understand, the better (cont.)
  • And if we understand enough, we say we’ve made a “significant” insight
    • When is “enough” enough?
      • Usually when we’re 95% sure we’ve found enough
  • I.e., when we’re 95% sure that our sample stat measures a population stat that is different than the “null” value
    • (“Null” usually being “not different than zero,” “no effect,” “no difference,” “no information,” etc.)

Partialing Out Variance

  • Ways to increase the size of the signal relative to the size of the noise:
  • Increase size of signal
    • Bigger effects
    • Greater range of measurements of effects
  • Decrease size of noise
    • Greater precision of measurement
    • Remove the noise
  • Partial out the variance that is unshared between those variables
    • But is accounted for by some third variable
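A minimal sketch of that last idea (simulated data; the variable names are invented): to partial a third variable z out of x and y, regress each on z and correlate the residuals, i.e., the parts of x and y that z does not account for.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=300)
x = 0.7 * z + rng.normal(size=300)   # x and y are related only
y = 0.7 * z + rng.normal(size=300)   # through the third variable z

def residuals(v, z):
    b = np.polyfit(z, v, 1)          # simple OLS of v on z
    return v - np.polyval(b, z)      # what z does NOT account for

print(np.corrcoef(x, y)[0, 1])       # sizable: inflated by z
print(np.corrcoef(residuals(x, z),
                  residuals(y, z))[0, 1])  # ~0 once z is partialed out
```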

Ways of Communicating

  • Merging visuals with text
    • What is best described where
    • Visuals as “conversation pieces”
      • Text as highlighting what to focus on in visuals
      • Especially vis-à-vis theory
  • Efficiency & simplicity
    • “Information-to-ink ratio”
  • Strong organization
    • Clear guideposts & structure
  • Common—but not colloquial—language
    • Following writing conventions
    • Avoiding jargon and acronyms

Review of Linear Regression

Basic Strategy

  1. Assume the relationships between variables are linear
  2. Find a line that best accounts for all variables in the model
    • Either via “ordinary least squares”
    • Or “maximum likelihood”
  3. If the line—the linear regression—accounts for enough of the total variance, declare the model significant
    • Measured, e.g., with \(R^2\)
  4. And/or look at some/all of the included predictors to see which of them are significant contributors to that linear regression
    • Amount of variance accounted for by that variable vs. total variance
    • (See the sketch after this list)
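A minimal sketch of steps 2–4 using statsmodels (simulated data; variable names are invented for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
y = 2.0 + 0.5 * x1 + rng.normal(size=200)

X = sm.add_constant(x1)        # adds the intercept term b0
model = sm.OLS(y, X).fit()     # step 2: ordinary least squares

print(model.rsquared)          # step 3: variance accounted for (R^2)
print(model.pvalues)           # step 4: significance of each term
```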

Partialing Out Variance

  • At minimum, we separate out the variance associated with our predictor(s) from “error”
    • And, perhaps, the “intercept,” the starting, pre-intervention values for each participant

\[Y = b_{0} + b_{1}X_{1} + e\]

  • Adding other predictors separates out—partials out—the variance associated with each:

\[Y = b_{0} + b_{1}X_{1} + b_{2}X_{2} + e\]
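A minimal sketch of what this partialing does in practice (simulated data; names invented): when \(X_{1}\) and \(X_{2}\) share variance, adding \(X_{2}\) to the model changes \(b_{1}\), because the variance \(X_{1}\) shared with \(X_{2}\) is now partialed out.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x2 = rng.normal(size=300)
x1 = 0.6 * x2 + rng.normal(size=300)   # x1 and x2 share variance
y = 1.0 + 0.4 * x1 + 0.8 * x2 + rng.normal(size=300)

m1 = sm.OLS(y, sm.add_constant(x1)).fit()
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

print(m1.params[1])   # b1 alone: absorbs part of x2's effect (~0.75)
print(m2.params[1])   # b1 with x2 partialed out: near the true 0.4
```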

Partialing Out Variance (cont.)

  • Error can also be separated
    • E.g., if we know the sources of those errors (same hospital, neighborhood, etc.)

\[Y = b_{0} + b_{1}X_{1} + b_{2}X_{2} + e_{1} + e_{2}\]

  • We can also separate out error and effects of predictors over time
    • Assume that measurements taken at the same time point will be more similar to each other
    • And that values within a person will tend to be more similar than values between people
  • There are different ways to do this; one common approach is a mixed-effects (multilevel) model (see the sketch below)
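A minimal sketch of one such approach, a mixed-effects model in statsmodels (simulated data; “hospital” as the grouping source of error, per the slide above):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n, k = 200, 10                         # 200 patients in 10 hospitals
df = pd.DataFrame({
    "hospital": np.repeat(np.arange(k), n // k),
    "x1": rng.normal(size=n),
})
hosp_effect = rng.normal(0, 1.0, k)[df["hospital"]]
df["y"] = 2 + 0.5 * df["x1"] + hosp_effect + rng.normal(0, 1, n)

# Fixed effect of x1, plus a separate error component per hospital
m = smf.mixedlm("y ~ x1", df, groups=df["hospital"]).fit()
print(m.summary())
```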

Combining Similar Sources of Variance

Common Sources of Variance

  • Similar predictors may share too much variance
    • If left unaddressed, this can lead to “multicollinearity”
    • Which leads to unstable model terms
      • E.g., terms will flip from being significant to not & back depending on what other terms are added to the model
  • Usually addressed by removing one of the multicollinear terms
  • But we can also combine or group those variables…
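A minimal sketch of detecting the problem with variance inflation factors (VIFs) in statsmodels (simulated data; a common rule of thumb flags VIFs above roughly 5–10):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(0, 0.1, 300)    # nearly a copy of x1
X = sm.add_constant(np.column_stack([x1, x2]))

for i in (1, 2):                     # skip column 0, the constant
    print(variance_inflation_factor(X, i))   # very large: collinear
```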

Combining Sources of Variance (cont.)

  • We do this all the time, in fact
    • Adding up responses to items on the same survey
    • Taking the average of results from a blood draw
  • But what if we know two variables are related, but not easily combined?
    • E.g., ZIP code and salary
      • 10010 + $75,000 \(\ne\) 85,010

Model Fit

  • We can group them into “families” of variables within the model…

\[Y = b_{0} + b_{1}X_{1} + ( b_{ZIP}X_{ZIP} + b_{Salary}X_{Salary} ) + e \]

  • We can test this by looking at the difference in model fit:

\[\text{First model: } Y = b_{0} + b_{1}X_{1} + e \]

\[\text{Second model: } Y = b_{0} + b_{1}X_{1} + ( b_{ZIP}X_{ZIP} + b_{Salary}X_{Salary} ) + e \]

\[\text{Difference} = R^2_{\text{Second Model}} - R^2_{\text{First Model}}\]

  • If that difference is significant, then that “family” of variables significantly improves our understanding of the outcome
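A minimal sketch of this comparison in statsmodels (simulated data; “zip_ses” and “salary” are invented stand-ins for the family on the slide). anova_lm gives the F-test for the change in variance accounted for between the nested models:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "zip_ses": rng.normal(size=n),
    "salary": rng.normal(size=n),
})
df["y"] = (0.5 * df["x1"] + 0.4 * df["zip_ses"]
           + 0.4 * df["salary"] + rng.normal(size=n))

m1 = smf.ols("y ~ x1", df).fit()
m2 = smf.ols("y ~ x1 + zip_ses + salary", df).fit()

print(m2.rsquared - m1.rsquared)   # improvement from the "family"
print(anova_lm(m1, m2))            # F-test of that improvement
```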

Ostensible & Non-Ostensible Variables

Ostensible & Non-Ostensible

  • Some Things We Can See…
    • Neighborhoods & paychecks
    • Blood pressure & adipose tissue
    • Smiles and cortisol levels
  • Some Things We Can’t See…
    • “Socio-economic status”
    • “Health”
    • “Stress”

Ostensible & Non-Ostensible (cont.)

  • Things we can observe empirically are sometimes called ostensible
  • While the underlying “constructs” they are manifestations of are non-ostensible

Model Testing

ZIP & Salary Example (cont.)

  • This is usually done not with \(R^2\),
    • But instead with an information criterion
    • Determined from maximum likelihood estimations (MLEs)
      • So we need to use MLEs
      • But they are more robust than ordinary least squares
      • And ordinary least squares estimates are MLEs when all assumptions are met
  • Common information criteria:
    • Akaike Information Criterion (AIC)
    • Bayesian Information Criterion (BIC)
    • Both adjust for number of terms in the model
      • But BIC adjusts more aggressively
      • So use BIC when there are a lot of terms in the model
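A minimal sketch of reading information criteria off fitted models in statsmodels (simulated data; OLS results expose .aic and .bic directly):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=300)

m_small = sm.OLS(y, sm.add_constant(x1)).fit()
m_big = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

print(m_small.aic, m_big.aic)   # smaller = less misfit
print(m_small.bic, m_big.bic)   # BIC penalizes extra terms more heavily
```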

Model Testing (cont.)

  • AIC & BIC are measures of misfit
    • So larger numbers are worse
  • Thus we want to see that the model with the “family” of interest has a smaller information criterion
    • I.e.:

\[\text{Difference in Model Fit} =\]

\[ AIC_{\text{Model without Family}} - AIC_{\text{Model with Family}}\]

  • Information Criteria can be large, so this may be, e.g.:

\[\text{Diff. in Model Fit} = 2010 - 2000 = 10\]

Model Testing (cont.)

  • We test this difference against a χ² distribution with degrees of freedom equal to the difference in dfs between the models
  • E.g.:

\(b_{1}X_{1}\)

\(b_{1}X_{1} + ( b_{ZIP}X_{ZIP} + b_{Salary}X_{Salary} )\)

  • The second model has 2 more dfs than the first
    • So test χ² = 10 with df = 2
    • (Which would be significant; critical χ² \(\approx\) 6 for a one-tailed test at α = .05)
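A minimal sketch of that test with SciPy, using the slide’s numbers (a fit difference of 10 with 2 extra dfs):

```python
from scipy import stats

diff, df_diff = 10, 2
print(stats.chi2.ppf(0.95, df_diff))   # critical value: ~5.99
print(stats.chi2.sf(diff, df_diff))    # p ~ .007, so significant
```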

The End