Structural Equation Modeling
Core Concepts
Structural Equation Modeling (SEM) tests the relationships between non-ostensible (latent) factors
Much like a linear regression model tests relationships between ostensible (observed) variables / items
But SEM:
Is more flexible
Can test more complex models
Including causal relationships
Can investigate latent variables as well as observed ones
Typically uses maximum likelihood estimation instead of ordinary least squares
Core Concepts (cont.)
SEM in fact subsumes:
Factor analysis (typically CFA)
Canonical correlation (multivariate analysis of correlation)
Regression analysis (linear, logistic, etc.)
Discriminant analysis (predicting classification into several groups)
And is being used with increasing frequency to investigate, e.g.,
Communicating health-related research
Creating a health index
Studying cultural concepts of love
Basic Analyses
In SEMs, we assess outputs of:
Model fit (χ², etc.)
Model parameters
Factor loadings if conducting factor analysis
Covariances & variances between factor indicators & factors
Often with especial attention given to factor inter-relationships
Error variances
Conceiving of SEMs
SEMs are often conceived of as path diagrams based on the same ideas we used for CFAs:
Same Model without Factor Indicators (Items) or Error Terms
We sometimes simplify the diagrammed model to present only the factors
This does not imply that the unrepresented parameters are not computed
A double-headed arrow denotes a predicted correlational (or bidirectional) relationship between the factors:
A single-headed arrow denotes a predicted causal relationship between the factors such that Factor A causes changes in Factor B:
Three Factors
The power of SEMs begins to show when we consider several factors
And more complex relationships, e.g.,
Factor A affects Factor C
Factor B affects Factor C
Factor A is unrelated to Factor B
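As a worked sketch of this three-factor model, the structural part can be written as one regression of Factor C on Factors A and B, with the covariance between A and B fixed to zero. The symbols γ_A, γ_B, and ζ (the residual of C) are notation assumed here for illustration, and ζ is assumed to be uncorrelated with A and B:

```latex
\begin{aligned}
C &= \gamma_A A + \gamma_B B + \zeta \\
\operatorname{Cov}(A, B) &= 0 \\
\operatorname{Var}(C) &= \gamma_A^2 \operatorname{Var}(A) + \gamma_B^2 \operatorname{Var}(B) + \operatorname{Var}(\zeta) \\
\operatorname{Cov}(A, C) &= \gamma_A \operatorname{Var}(A)
\end{aligned}
```

The last two lines are the variances and covariances the model implies; fitting the SEM amounts to choosing parameter values that make these implied values match the observed ones as closely as possible.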
Three Factors with Moderation
Factor A affects Factor C
Factor B affects the relationship between A & C
I.e., Factor B moderates the relationship between A & C
Here, Factor B has no direct effect on Factor C
Mediation vs. Moderation
Mediation:
When a predictor has no direct effect on an outcome
But that predictor affects something else that does have a direct effect on that outcome
E.g., hand washing per se doesn’t affect infections
Hand washing affects the number of microbes available to infect
Moderation:
When a predictor has a direct effect on an outcome
But the magnitude of the effect is affected by something else
E.g., thoroughness of hand washing affects how strongly hand washing reduces the number of microbes
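The two definitions can be sketched numerically with the hand-washing example. This is a minimal illustration, not an SEM fit: all numbers are made up, the `slope` helper and variable names are assumptions introduced here, and each path is estimated with a separate simple regression rather than simultaneously.

```python
def slope(x, y):
    """Ordinary least squares slope of y regressed on a single predictor x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# --- Mediation: washing -> microbes -> infections (hypothetical data) ---
washes = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [3, -3, -3, 3, 3, -3, -3, 3]        # fixed "error" added to microbes
microbes = [100 - 8 * w + e for w, e in zip(washes, noise)]
infections = [0.5 * m for m in microbes]     # depends on microbes only

a = slope(washes, microbes)        # path: washing -> microbes
b = slope(microbes, infections)    # path: microbes -> infections
indirect = a * b                   # mediated (indirect) effect
total = slope(washes, infections)  # total effect of washing on infections


# --- Moderation: thoroughness changes the washing -> microbes slope ---
def microbes_left(n_washes, thoroughness):
    """Hypothetical rule: each wash removes more microbes when thorough."""
    return 100 - (2 + 3 * thoroughness) * n_washes


slope_careless = slope(washes, [microbes_left(w, 0) for w in washes])
slope_thorough = slope(washes, [microbes_left(w, 1) for w in washes])

print(f"indirect = {indirect:.2f}, total = {total:.2f}")
print(f"slope if careless = {slope_careless:.2f}, if thorough = {slope_thorough:.2f}")
```

Because infections here depend on washing only through microbes, the total effect equals the product of the two paths (full mediation). In the moderation half, washing always has a direct effect on microbes, but its slope differs by level of thoroughness.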
Mediation vs. Moderation (cont.)
Mediators are drawn as going through another factor:
Moderators are drawn as affecting the path between the predictor & the outcome:
Yet More Complex Models
And, sure, a model could contain:
A direct effect,
A mediating effect, and
A moderating effect
All of which could be tested
Mechanics of SEMs
Assumptions
Normality
Typically assume multivariate normality
N.b., maximum likelihood estimation is rather robust against departures from normality
But strongly multivariate non-normal data can create larger χ²s (i.e., greater apparent model misfit)
Leading to more Type I errors (false positives, i.e., rejecting a model that in fact fits well)
I.e., parameter estimates per se will likely be reasonable
(If not asymptotically unbiased)
But standard errors (and thus χ²s) will probably be biased
Typically with standard errors too small & χ²s too large
Assumptions (cont.)
Normality (cont.)
With very non-normal data, can use:
Asymptotically distribution free (aka weighted least squares) estimation
But the Satorra-Bentler χ² correction (Satorra & Bentler, 2010) is preferred when data are interval / ratio
See Finney & DiStefano (2008) for more on robust ML estimation
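In rough terms, the Satorra-Bentler correction rescales the ordinary ML χ² by a factor estimated from the data's multivariate kurtosis. The notation below (T_ML for the ML χ², ĉ for the scaling factor) is assumed here for illustration and is not taken from the slides:

```latex
T_{SB} = \frac{T_{ML}}{\hat{c}}
```

When the data are close to multivariate normal, ĉ is near 1 and the two statistics agree; with heavy-tailed data ĉ tends to exceed 1, so the corrected χ² is smaller than the uncorrected one.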
Assumptions (cont.)
Sample size
Like confirmatory factor analysis, requires large N (>200 or ~20 per factor indicator)
Non-normality increases need for larger N
Ordinal data
If monotonic & arguably sample a non-ostensible continuous construct
And have a large N (> ~500)
Can include via polychoric correlations
Or treat as interval
When there are several (>4) response options
And data are nearly normal
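The sample-size rules of thumb above can be checked with a few lines of code. This is only a sketch: the function name and its return format are hypothetical, and the thresholds are simply the values quoted on the slide, not firm requirements.

```python
def sample_size_check(n, n_indicators, min_n=200, per_indicator=20):
    """Compare a sample size against the two rules of thumb quoted above."""
    needed_per_indicator = per_indicator * n_indicators
    return {
        "meets_min_n": n > min_n,                        # N > 200
        "meets_per_indicator": n >= needed_per_indicator,  # ~20 per indicator
        "needed_per_indicator": needed_per_indicator,
    }


# Hypothetical study: 250 participants, 15 factor indicators (items)
check = sample_size_check(n=250, n_indicators=15)
print(check)
```

Here the sample passes the overall minimum but falls short of 20 cases per indicator, which is the kind of disagreement between the two rules that calls for judgment (and more caution when the data are non-normal).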
SEM Procedure
Largely the same as for general(ized) linear models
Usually with more parameters
Therefore, typically:
Use maximum likelihood estimation to try to fit a proposed model to the data’s covariance matrix (may also use variable / factor indicator means)
Review fit indices (e.g., χ², AIC/BIC, plus a few more)
Review final parameter values for insights
Modify model / parameters & test against other theory-driven models
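For reference, the quantity that maximum likelihood estimation minimizes in the first step is commonly written as the discrepancy function below. The notation is assumed here rather than taken from the slides: S is the sample covariance matrix, Σ(θ) the covariance matrix implied by the model's parameters θ, p the number of observed variables, q the number of free parameters, and N the sample size.

```latex
\begin{aligned}
F_{ML}(\theta) &= \ln\lvert\Sigma(\theta)\rvert + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right) - \ln\lvert S\rvert - p \\
\chi^2 &= (N - 1)\,F_{ML}(\hat{\theta}), \qquad df = \frac{p(p+1)}{2} - q
\end{aligned}
```

F_ML is zero when the implied and observed covariance matrices match exactly, so the model χ² grows as the two diverge. Some software multiplies by N rather than N − 1.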
Comparing Models
Comparing Models: Modifying Parameters
Simplest comparisons are between models that vary only in which parameters are estimated
E.g., whether to include / remove a relationship between two factors, e.g.:
Modification Indices
Lagrange multiplier test
Computes minimum amount the χ² (of the residual covariance matrix) would decrease if the given parameter were freed
Parameters with largest Lagrange multiplier values are most impactful on model
Still needs to be guided by theory
Especially since studies tend to find that models guided by Lagrange multiplier tests generalize poorly to other data
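A minimal sketch of how modification indices might be screened follows. The parameter labels and index values are hypothetical, and the 3.84 cutoff is the usual χ² critical value for 1 degree of freedom at α = .05, used here as an assumed convention rather than something the slides prescribe.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Hypothetical modification indices for three currently fixed parameters
modification_indices = {
    "Factor A <-> Factor B": 12.4,
    "Factor B -> Factor C": 5.2,
    "item 3 <-> item 7": 1.7,
}

CRITICAL_DF1 = 3.84  # chi-square critical value, 1 df, alpha = .05

# Largest expected drop in chi-square first
ranked = sorted(modification_indices.items(), key=lambda kv: kv[1], reverse=True)
for label, mi in ranked:
    flag = "consider, if theory supports it" if mi > CRITICAL_DF1 else "leave fixed"
    print(f"{label:24s} MI = {mi:5.1f}  p = {p_value_df1(mi):.4f}  {flag}")
```

The ranking only says where the model's fit would improve most; whether a flagged parameter should actually be freed remains a theoretical decision.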
Comparing Models: Modifying Models
SEMs can test totally different arrangements of the factors
Or even different factor loadings
Through tests of overall model fits
Usually via χ², AIC, & BIC
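As a sketch of the AIC / BIC part of such a comparison, the standard definitions can be computed directly from each model's log-likelihood and number of free parameters. The model names, log-likelihoods, parameter counts, and sample size below are all hypothetical.

```python
import math


def aic(log_likelihood, n_free_params):
    """Akaike information criterion: -2 logL + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_free_params


def bic(log_likelihood, n_free_params, n_obs):
    """Bayesian information criterion: -2 logL + k ln(N)."""
    return -2.0 * log_likelihood + n_free_params * math.log(n_obs)


# Hypothetical fits: (log-likelihood, number of free parameters)
models = {
    "Model 1 (A & B unrelated)": (-2150.0, 20),
    "Model 2 (A <-> B freed)": (-2144.5, 21),
}
n_obs = 300

for name, (ll, k) in models.items():
    print(f"{name:28s} AIC = {aic(ll, k):8.2f}  BIC = {bic(ll, k, n_obs):8.2f}")

best_by_aic = min(models, key=lambda m: aic(*models[m]))
best_by_bic = min(models, key=lambda m: bic(*models[m], n_obs))
```

Lower values indicate the preferred model on both indices. BIC penalizes each extra parameter more heavily than AIC once N exceeds about 7, so the two can disagree.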
Comparing Models: Comparing Groups
SEMs can test group differences
E.g., whether the same relationships between factors hold for different groups
And thus can be sophisticated alternatives to, e.g., (M)AN(C)OVAs
We can test how well a model fits different groups by placing / removing equality constraints in the model
Adding / removing an equality constraint tests how the model performs when assuming the groups are similar / dissimilar on a given parameter
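One common way to evaluate an equality constraint is a χ² difference test between the constrained model and the same model with the constraint removed. The sketch below assumes ordinary (uncorrected) ML χ²s and uses hypothetical fit values; `chi2_sf` is a small helper written here so that the example needs only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, computed from the
    series expansion of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Compare a constrained model against the nested, less constrained one."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fits: two parameters constrained equal across groups vs. freed
delta_chi2, delta_df, p = chi2_difference_test(58.3, 26, 49.1, 24)
print(f"delta chi-square = {delta_chi2:.1f}, delta df = {delta_df}, p = {p:.4f}")
```

A small p-value suggests the equality constraints significantly worsen fit, i.e., the groups differ on the constrained parameters. With scaled (e.g., Satorra-Bentler) χ²s, a plain subtraction like this is not appropriate and a scaled difference test is needed instead.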
Comparing Models: Comparing Groups (cont.)
Remember SEMs often include a series of linear regressions
These include intercepts for endogenous variables
We can compare these intercepts between groups
This will test the effect of group membership on the (other) model parameters
I.e., akin to adding a dummy variable for that group
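The dummy-variable analogy can be written out for a single endogenous variable y and two groups. The symbols are assumed here for illustration: G is a 0/1 group indicator, α the intercept of the reference group, δ the intercept difference, and β the slope on a predictor x that is held equal across groups.

```latex
\begin{aligned}
y_i &= \alpha + \delta\,G_i + \beta\,x_i + \varepsilon_i, \qquad G_i \in \{0, 1\} \\
\text{Group 0 intercept} &= \alpha \\
\text{Group 1 intercept} &= \alpha + \delta
\end{aligned}
```

Constraining the two groups' intercepts to be equal corresponds to fixing δ = 0, so comparing the constrained and unconstrained models tests whether group membership shifts the level of y.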
Example of SEMs
Overall Model
Effects of Demographics
Effects of Demographics (cont.)
EFs & Academics
Overall Predicting Academics
The End