Conceptual Models

Overview

Conceptual models may help a researcher to:

organize the thought process and improve structured thinking
critically think about causality
identify equivalent models
formulate (informative and testable) hypotheses¹
summarize hypotheses graphically for a lay audience
summarize hypotheses graphically for an expert audience when translated into a path diagram

Although closely related, a conceptual model is not a path diagram.
A conceptual model is a tool for researchers to organize the thought process and to concisely summarize hypotheses. Although it is a good idea to follow some conventions, there are no strict rules for how to graphically represent the assumed relations between concepts. Researchers can (should?!) be creative in how they depict non-linear relations, feedback mechanisms, or complex ‘aggregation rules’.
A path diagram is a formal, graphical representation of a structural model (i.e., a set of structural equations). There are clear rules for how to depict manifest and latent variables, (error) variances between exogenous and endogenous variables, random intercepts and slopes, et cetera.
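To make "a set of structural equations" concrete, here is a minimal illustration that is not taken from the course pages. It assumes a simple mediation model in which X affects Y partly through M. The coefficient labels \(a\), \(b\), \(c'\) and the error terms \(\varepsilon_M\), \(\varepsilon_Y\) are conventional names chosen here for illustration only:

\[M = a X + \varepsilon_M\]
\[Y = c' X + b M + \varepsilon_Y\]

A path diagram of this model would draw one arrow per coefficient. A conceptual model of the same idea could be drawn far more loosely.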

Aim

The intended learning outcomes of this course are:

Theoretical knowledge and insight: you will be able to define core concepts related to conceptual models (e.g. correlation, causality, mediation, moderation) and to explain how conceptual models differ from path diagrams.
Academic attitude: you will develop a positive attitude towards applying conceptual models in your own social science research projects.
Research skills: you will be able to use conceptual models as a tool to formulate strong hypotheses, summarize hypotheses, and assess the system of hypotheses formulated in the social science projects of others.

Structure

Each page discusses a different conceptual model.

Correlation
Direct effect
Two direct effects (partial correlation / multiple regression)
Mediation
Spuriousness
Moderation
Mediated Moderation
Moderated Mediation
Multi-level model
Multi-level mediation (2-1-1 model)
Multi-level moderation (cross-level interaction)

Each page contains the following subsections:

The Conceptual Model
Explanation
Abstract hypothesis/hypotheses
Real life example
Structural equations
Formal test of hypotheses

Feel free to jump to the page and subsection you are most interested in. However, the pages and subsections follow a clear order, and the best way to reach the aim of this course is to go through the pages one by one, from top to bottom.
Use the menu on the left to navigate through the main sections.
Use the menu on the right to navigate to subsections of each current page.

SEM and path diagrams

Although I will try to translate each conceptual model into a more formal set of structural equations and will show you how to test this structural model (with the R package lavaan), this is not a tutorial on SEM or path diagrams.

For useful links see:

Correlation versus Causation

More often than not, the conceptual model will try to summarize causal mechanisms (e.g. X causes Y). And more often than not, it is not possible to give a causal interpretation to the estimates obtained from the hypothesis tests. Throughout this tutorial I will try to discuss the extent to which parameter estimates may be given a causal interpretation.
Also note that I use the label correlation, but the correct measure for the presumed (bivariate) association naturally depends on the level of measurement of the variables involved.

Levels of measurement

Categorical (binary and nominal)
Ordinal
Continuous:
- Interval
- Ratio
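As a hedged illustration of why the level of measurement matters for the choice of association measure, the sketch below contrasts the Pearson correlation (suited to continuous variables) with a rank-based Spearman correlation (often used for ordinal variables). The course itself works in R. Python is used here only for illustration, and the function names and the data are made up for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: cross-products of deviations, scaled by both spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(xs):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up data: y rises with x, but not along a straight line.
x_scores = [1, 2, 3, 4, 5]
y_scores = [1, 4, 9, 16, 25]

print(pearson(x_scores, y_scores))   # below 1: the relation is not linear
print(spearman(x_scores, y_scores))  # the rank order agrees perfectly
```

The rank-based measure only uses the ordering of the scores, which is all that an ordinal scale provides.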

Covariance and Correlation

Social scientists try to explain variance and covariance. It is therefore a good idea to learn the formulas for variance and covariance by heart. The sample variance of a random continuous variable X, VAR(X), is as follows:

\[VAR(X)=s_{xx}^2 = s_x^2= \frac{\Sigma^n_{i=1}(X_i-\overline{X})(X_i-\overline{X})}{n-1}= \frac{\Sigma^n_{i=1}(X_i-\overline{X})^2}{n-1}\]
(And the sample standard deviation is given by: \(STD(X)=\sqrt{s_x^2}=s_x\).)
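The variance formula can be checked by hand with a few lines of code. The following is a minimal sketch in Python (the course itself uses R). The function names and the five scores are made up for illustration, and the result is compared against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import math
import statistics


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_sd(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


scores = [2, 4, 4, 6, 9]  # made-up data, mean = 5

print(sample_variance(scores))      # 7.0  (squared deviations 9+1+1+1+16 = 28; 28 / 4)
print(sample_sd(scores))            # square root of 7
print(statistics.variance(scores))  # same value from the standard library
```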

The sample covariance of two random continuous variables X and Y, COV(X,Y) is as follows:

\[COV(X,Y)=s_{xy}^2 = \frac{\Sigma^n_{i=1}(X_i-\overline{X})(Y_i-\overline{Y})}{n-1}\]

The sample correlation coefficient between two random continuous variables X and Y, COR(X,Y), is the covariance of the standardized variables (\(z_x=X_{sd}=(X-\overline{X})/s_x\)) and hence:

\[COR(X,Y)=r_{xy} = \frac{s_{xy}^2}{s_x s_y}= \frac{\Sigma^n_{i=1}(X_i-\overline{X})(Y_i-\overline{Y})}{\sqrt{\Sigma^n_{i=1}(X_i-\overline{X})^2}\sqrt{\Sigma^n_{i=1}(Y_i-\overline{Y})^2}}\]

For completeness, the population equivalent of the covariance is:

\[\sigma _{xy}^2 = \frac{\Sigma^N_{i=1}(X_i - \mu_x)(Y_i-\mu_y)}{N},\]

with \(\mu_x\) and \(\mu_y\) the population means and \(N\) the population size. The correlation within the population is:

\[\rho_{xy} = \frac{\sigma_{xy}^2}{\sigma_x \sigma_y}\]
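The covariance and correlation formulas above can be verified numerically. The sketch below is in Python (the course itself uses R), with hypothetical function names and made-up data. It computes the sample covariance, shows that the correlation equals the covariance of the standardized variables, and shows that dividing by N instead of n - 1 changes the covariance but not the correlation.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Sample covariance: sum of cross-products of deviations, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def population_cov(xs, ys):
    """Population covariance: treats the data as the full population, divides by N."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def sample_sd(xs):
    # The variance of X is the covariance of X with itself.
    return math.sqrt(sample_cov(xs, xs))


def sample_cor(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    return sample_cov(xs, ys) / (sample_sd(xs) * sample_sd(ys))


def standardize(xs):
    """z-scores: subtract the mean, divide by the sample standard deviation."""
    mx, sx = mean(xs), sample_sd(xs)
    return [(x - mx) / sx for x in xs]


x = [2, 4, 4, 6, 9]  # made-up data
y = [1, 3, 2, 5, 9]  # made-up data

r = sample_cor(x, y)
r_from_z = sample_cov(standardize(x), standardize(y))
rho = population_cov(x, y) / math.sqrt(population_cov(x, x) * population_cov(y, y))

print(sample_cov(x, y))      # 8.25  (cross-products sum to 33; 33 / 4)
print(population_cov(x, y))  # 6.6   (33 / 5)
print(r, r_from_z, rho)      # three routes to the same correlation
```

The divisor (n - 1 or N) cancels in the correlation, which is why `r` and `rho` coincide for the same data.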

Data

For the examples in this tutorial I will use the dataset [The NEtherlands Longitudinal Lifecourse Study (NELLS)](https://doi.org/10.17026/dans-25n-2xjv) (Tolsma et al. 2014). Please follow the link to download the codebook and dataset.

References

Tolsma, Jochem, G. L. M. Kraaykamp, Paul M. de Graaf, Matthijs Kalmijn, and Christiaan WS Monden. 2014. “The NEtherlands Longitudinal Lifecourse Study (NELLS, Panel): Codebook.” https://doi.org/10.17026/dans-25n-2xjv.

¹ By hypothesis (plural: hypotheses) I mean a proposed explanation for a (social) phenomenon. A hypothesis becomes scientific when the explanation is a prediction derived from a theory and can be tested with empirical data and a suitable scientific method.

Last updated on Jan 11, 2022