RSiena II (consequences)

1 Introduction - RSiena behaviour

This page will teach you how to use the R package RSiena for modeling behavioral change. Thus on this page we will focus on node evolution and influence processes.
Please visit the RSiena homepage here. On the website you may find the exceptionally well written manual. But the RSiena website also contains all kind of useful RSiena scripts and ‘lab exercises’. When we model both tie-evolution (which is what we almost without exception do with RSiena) and node-evolution, we also call this a co-evolution model. Before we start with our own lab exercise, I would like you to:

  1. walk through script number 6 on the RSiena website (which is called Rscript04SienaBehaviour.R).

My own lab exercise can be found below.

2 RSiena - some additional explanations

Having taught the course social networks for a couple of years now to the students of the Research Master Social and Cultural Sciences, I think there are a couple of aspects that deserve a little bit more attention than they get in the manual and on the RSiena website. Namely,…

  1. How do we model node evolution and are results similar to ‘normal’ regression models?
  2. What is the difference between an average alter effect and a total alter effect?
  3. Can senders only influence receivers or also vice versa? Do we need both in our models?

2.1 modelling node evolution

2.1.1 RSiena-linear and quad effects

2.1.2 Multi-level hybrid model

2.2 avAlt and totAlt effects

I think this question deserves a knowledge clip. Please watch the video below.

After having watched the video you have learned:

  1. You may be influenced by (the absence of) ties in the network, regardless of the node attributes.
  2. You may be influenced by others to whom you are connected via an undirected tie, an outgoing tie, and an ingoing tie.
  3. The difference between average alter and total alter effects.
  4. That (positive) influence may also mean that you become similar to an alter to which you are connected with a specific node attribute but thereby become less similar to more other alters to which you are also connected.
Average Alter Effect

Figure 2.1: Average Alter Effect

Total Alter Effect

Figure 2.2: Total Alter Effect

2.3 …Alt versus …InAlt effects

Have a look at the picture below.

Who is influencing whom?

Figure 2.3: Who is influencing whom?

  • Undirectied tie: The direction of influence does not depend on the direction tie. In RSiena who is influenced depends on who has the opportunity for change (thus by the behavioral rate function)
  • Directed tie i –> j. Actor j may assimilate to i because of an incoming tie. In RSiena these effects are …inAlt effects (e.g. effects si20beh and si21beh). Example 1a: I teach you how to apply a social network perspective in your research, you come to value a social network perspective and your research behavior becomes more similar to mine.
  • Directed tie i –> j. Actor i may assimilate to j because of an outgoing tie. In RSiena these effects are …Alt effects (e.g. effects si18beh and si19beh). Example 2a: I name you as my friend and therefore I will mimic your behavior.

Naturally, the influence may also be negative: Example 1b: I teach you how to apply a social network perspective in your research, you come to dislike a social network perspective and your research behavior becomes less similar to mine.
Example 2b: I name you as my foe and therefore I will behave as different as possible from you.

The question is, can we tease these processes apart and can we tell which one is more important? To be honest, I am not sure. But we will try below.

3 lab exercise

We are going to use data collected among primary and secondary school pupils in the Netherlands during the MyMovez project.

During the workgroup I will explain all code. For those of you who don’t attend the workgroups, google knows way more than I do.

In the upper left and right corner of the code blocks you will find copy-to-clipboard buttons. Use these buttons to copy the code to your own editor.

3.1 Before you start

Before you start, check whether you run the latest RStudio version (from the Help menu, pick ‘check for updates’ and whether you need to update R.

install.packages("installr")  #you  first install packages
require(installr)  #then you will need to activate packages. 
updateR()  #run the function to start the update process

Give your script a nice name. Include the author, and data when you last modified the script. Include a lot of comments in your script! Don’t forget, always start with cleaning up your workspace.

### Author: JOCHEM TOLSMA### Lastmod: 31-08-2020###

# cleanup workspace
rm(list = ls())

And set your working directory.

# set working directory
setwd("C:\\YOURDIR\\YOURSUBDIR\\YOURSUBSUBDIR\\")  #change to your own workdirectory

Install the packages you will need.

library(RSiena)
library(sna)
library(ape)
library(network)

Define your custom build functions.

fMoran.I:

fMoran.I <- function(x, weight, scaled = FALSE, na.rm = FALSE, alternative = "two.sided", rowstandardize = TRUE) {
    if (rowstandardize) {
        if (dim(weight)[1] != dim(weight)[2]) 
            stop("'weight' must be a square matrix")
        n <- length(x)
        if (dim(weight)[1] != n) 
            stop("'weight' must have as many rows as observations in 'x'")
        ei <- -1/(n - 1)
        nas <- is.na(x)
        if (any(nas)) {
            if (na.rm) {
                x <- x[!nas]
                n <- length(x)
                weight <- weight[!nas, !nas]
            } else {
                warning("'x' has missing values: maybe you wanted to set na.rm = TRUE?")
                return(list(observed = NA, expected = ei, sd = NA, p.value = NA))
            }
        }
        ROWSUM <- rowSums(weight)
        ROWSUM[ROWSUM == 0] <- 1
        weight <- weight/ROWSUM
        s <- sum(weight)
        m <- mean(x)
        y <- x - m
        cv <- sum(weight * y %o% y)
        v <- sum(y^2)
        obs <- (n/s) * (cv/v)
        if (scaled) {
            i.max <- (n/s) * (sd(rowSums(weight) * y)/sqrt(v/(n - 1)))
            obs <- obs/i.max
        }
        S1 <- 0.5 * sum((weight + t(weight))^2)
        S2 <- sum((apply(weight, 1, sum) + apply(weight, 2, sum))^2)
        s.sq <- s^2
        k <- (sum(y^4)/n)/(v/n)^2
        sdi <- sqrt((n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * s.sq) - k * (n * (n - 1) * S1 - 2 * n * 
            S2 + 6 * s.sq))/((n - 1) * (n - 2) * (n - 3) * s.sq) - 1/((n - 1)^2))
        alternative <- match.arg(alternative, c("two.sided", "less", "greater"))
        pv <- pnorm(obs, mean = ei, sd = sdi)
        if (alternative == "two.sided") 
            pv <- if (obs <= ei) 
                2 * pv else 2 * (1 - pv)
        if (alternative == "greater") 
            pv <- 1 - pv
        list(observed = obs, expected = ei, sd = sdi, p.value = pv)
    } else {
        if (dim(weight)[1] != dim(weight)[2]) 
            stop("'weight' must be a square matrix")
        n <- length(x)
        if (dim(weight)[1] != n) 
            stop("'weight' must have as many rows as observations in 'x'")
        ei <- -1/(n - 1)
        nas <- is.na(x)
        if (any(nas)) {
            if (na.rm) {
                x <- x[!nas]
                n <- length(x)
                weight <- weight[!nas, !nas]
            } else {
                warning("'x' has missing values: maybe you wanted to set na.rm = TRUE?")
                return(list(observed = NA, expected = ei, sd = NA, p.value = NA))
            }
        }
        # ROWSUM <- rowSums(weight) ROWSUM[ROWSUM == 0] <- 1 weight <- weight/ROWSUM
        s <- sum(weight)
        m <- mean(x)
        y <- x - m
        cv <- sum(weight * y %o% y)
        v <- sum(y^2)
        obs <- (n/s) * (cv/v)
        if (scaled) {
            i.max <- (n/s) * (sd(rowSums(weight) * y)/sqrt(v/(n - 1)))
            obs <- obs/i.max
        }
        S1 <- 0.5 * sum((weight + t(weight))^2)
        S2 <- sum((apply(weight, 1, sum) + apply(weight, 2, sum))^2)
        s.sq <- s^2
        k <- (sum(y^4)/n)/(v/n)^2
        sdi <- sqrt((n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * s.sq) - k * (n * (n - 1) * S1 - 2 * n * 
            S2 + 6 * s.sq))/((n - 1) * (n - 2) * (n - 3) * s.sq) - 1/((n - 1)^2))
        alternative <- match.arg(alternative, c("two.sided", "less", "greater"))
        pv <- pnorm(obs, mean = ei, sd = sdi)
        if (alternative == "two.sided") 
            pv <- if (obs <= ei) 
                2 * pv else 2 * (1 - pv)
        if (alternative == "greater") 
            pv <- 1 - pv
        list(observed = obs, expected = ei, sd = sdi, p.value = pv)
    }
    
    
}

fanscsv:

fanscsv <- function(ans = ans, filename = "ans.csv", write = TRUE) {
    ans1_mat <- matrix(NA, nrow = length(ans$effects[, 2]), ncol = 3)
    row.names(ans1_mat) <- ans$effects[, 2]
    ans1_mat[, 1] <- (ans$theta)
    ans1_mat[, 2] <- (ans$se)
    ans1_mat[, 3] <- (ans$tconv)
    ans1_mat <- rbind(ans1_mat, c(ans$tconv.max))
    row.names(ans1_mat)[length(row.names(ans1_mat))] <- "Overall maximum convergence ratio:"
    colnames(ans1_mat) <- c("Estimate", "Standard Error", "Convergence t-ratio")
    print(ans1_mat)
    if (write) {
        write.csv(ans1_mat, filename)
    }
}

3.2 Data

We are going to play with a subsample of the MyMovez data. The original data are stored at the Data Archiving and Networked Services.

Download beh_data.Rdata


Load the Robject and have a look at it.

load("beh_data.RData")  #change to your working directory
str(ExData, 1)
str(ExData$depvars, 1)
## List of 8
##  $ nodeSets         :List of 1
##  $ observations     : int 3
##  $ depvars          :List of 3
##  $ cCovars          :List of 4
##  $ vCovars          :List of 1
##  $ dycCovars        : list()
##  $ dyvCovars        : list()
##  $ compositionChange: list()
##  - attr(*, "higher")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   ..- attr(*, "names")= chr [1:9] "friendship,friendship" "advice,friendship" "mvpa_y,friendship" "friendship,advice" ...
##  - attr(*, "disjoint")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   ..- attr(*, "names")= chr [1:9] "friendship,friendship" "advice,friendship" "mvpa_y,friendship" "friendship,advice" ...
##  - attr(*, "atLeastOne")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   ..- attr(*, "names")= chr [1:9] "friendship,friendship" "advice,friendship" "mvpa_y,friendship" "friendship,advice" ...
##  - attr(*, "class")= chr "siena"
## List of 3
##  $ friendship: 'sienaDependent' num [1:149, 1:149, 1:3] 0 1 0 1 1 1 0 0 0 0 ...
##   ..- attr(*, "type")= chr "oneMode"
##   ..- attr(*, "sparse")= logi FALSE
##   ..- attr(*, "nodeSet")= chr "Actors"
##   ..- attr(*, "netdims")= int [1:3] 149 149 3
##   ..- attr(*, "allowOnly")= logi TRUE
##   ..- attr(*, "uponly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "downonly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "distance")= int [1:2] 527 527
##   ..- attr(*, "vals")=List of 3
##   ..- attr(*, "nval")= int [1:3] 21580 21609 21579
##   ..- attr(*, "noMissing")= num [1:3] 472 443 473
##   ..- attr(*, "noMissingEither")= num [1:2] 489 473
##   ..- attr(*, "nonMissingEither")= num [1:2] 21563 21579
##   ..- attr(*, "balmean")= num 0.0908
##   ..- attr(*, "structmean")= num 0.0902
##   ..- attr(*, "simMean")= logi NA
##   ..- attr(*, "symmetric")= logi FALSE
##   ..- attr(*, "missing")= logi TRUE
##   ..- attr(*, "structural")= logi TRUE
##   ..- attr(*, "range2")= num [1:2] 0 1
##   ..- attr(*, "ones")= Named int [1:3] 960 1102 967
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "density")= Named num [1:3] 0.0445 0.051 0.0448
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "degree")= Named num [1:3] 6.58 7.55 6.63
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "averageOutDegree")= num 6.92
##   ..- attr(*, "averageInDegree")= num 6.92
##   ..- attr(*, "maxObsOutDegree")= num [1:3] 27 27 27
##   ..- attr(*, "missings")= num [1:3] 0.0214 0.0201 0.0214
##   ..- attr(*, "name")= chr "friendship"
##  $ advice    : 'sienaDependent' num [1:149, 1:149, 1:3] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "type")= chr "oneMode"
##   ..- attr(*, "sparse")= logi FALSE
##   ..- attr(*, "nodeSet")= chr "Actors"
##   ..- attr(*, "netdims")= int [1:3] 149 149 3
##   ..- attr(*, "allowOnly")= logi TRUE
##   ..- attr(*, "uponly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "downonly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "distance")= int [1:2] 390 295
##   ..- attr(*, "vals")=List of 3
##   ..- attr(*, "nval")= int [1:3] 21576 21609 21577
##   ..- attr(*, "noMissing")= num [1:3] 476 443 475
##   ..- attr(*, "noMissingEither")= num [1:2] 493 475
##   ..- attr(*, "nonMissingEither")= num [1:2] 21559 21577
##   ..- attr(*, "balmean")= num 0.03
##   ..- attr(*, "structmean")= num 0.0297
##   ..- attr(*, "simMean")= logi NA
##   ..- attr(*, "symmetric")= logi FALSE
##   ..- attr(*, "missing")= logi TRUE
##   ..- attr(*, "structural")= logi TRUE
##   ..- attr(*, "range2")= num [1:2] 0 1
##   ..- attr(*, "ones")= Named int [1:3] 362 294 201
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "density")= Named num [1:3] 0.01678 0.01361 0.00932
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "degree")= Named num [1:3] 2.48 2.01 1.38
##   .. ..- attr(*, "names")= chr [1:3] "1" "1" "1"
##   ..- attr(*, "averageOutDegree")= num 1.96
##   ..- attr(*, "averageInDegree")= num 1.96
##   ..- attr(*, "maxObsOutDegree")= num [1:3] 17 27 17
##   ..- attr(*, "missings")= num [1:3] 0.0216 0.0201 0.0215
##   ..- attr(*, "name")= chr "advice"
##  $ mvpa_y    : 'sienaDependent' num [1:149, 1, 1:3] 1 NA 4 NA 2 1 NA 1 2 3 ...
##   ..- attr(*, "type")= chr "behavior"
##   ..- attr(*, "sparse")= logi FALSE
##   ..- attr(*, "nodeSet")= chr "Actors"
##   ..- attr(*, "netdims")= int [1:3] 149 1 3
##   ..- attr(*, "allowOnly")= logi TRUE
##   ..- attr(*, "uponly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "downonly")= logi [1:2] FALSE FALSE
##   ..- attr(*, "distance")= num [1:2] 105 109
##   ..- attr(*, "vals")=List of 3
##   ..- attr(*, "nval")= int [1:3] 102 113 90
##   ..- attr(*, "noMissing")= int [1:3] 47 36 59
##   ..- attr(*, "noMissingEither")= num [1:2] 59 69
##   ..- attr(*, "nonMissingEither")= num [1:2] 90 80
##   ..- attr(*, "symmetric")= logi NA
##   ..- attr(*, "poszvar")= logi TRUE
##   ..- attr(*, "range")= num 4
##   ..- attr(*, "range2")= num [1:2] 1 5
##   ..- attr(*, "moreThan2")= logi TRUE
##   ..- attr(*, "modes")= num [1:3] 2 4 1
##   ..- attr(*, "missing")= logi TRUE
##   ..- attr(*, "simMean")= num 0.622
##   ..- attr(*, "structural")= logi FALSE
##   ..- attr(*, "balmean")= logi NA
##   ..- attr(*, "structmean")= logi NA
##   ..- attr(*, "name")= chr "mvpa_y"
##   ..- attr(*, "simMeans")= Named num [1:2] 0.849 0.773
##   .. ..- attr(*, "names")= chr [1:2] "friendship" "advice"

What do we have?

  • Data on 7 classes. See below how to play with just one class.
  • friendship: dependent tie-variable: who are your friends
  • advice: dependent tie-variable: I am not telling you if it is ‘to whom do you give advice?’ or ‘from whom do you receive advice?’ that would make for a nice question on the exam.
  • mvpa_y: behavioral dependent variable ‘moderate to vigorous physical activity’ measured via FitBit. I made five categories (from low to high)
  • ethnicNLNA: a covariate (including missing values) on ethnicity of pupils: 0 is native Dutch; 1 is other ethnic background.
  • sex: sex of pupils: 0 is boys; 1 is girls.
  • lft: age of pupils (measured in years)
  • primary: primary school (grade 8) or secondary school (grade 9)
  • mvpa_x: please ignore this variable for now.

3.3 Descriptive statistics of similarity in behavior

On the previous page we focused on tie evolution and selection processes. We started the lab exercise with descriptive statistics on homophily and segregation. Are similar people more likely to be connected. We now take a slightly different angle. We want to know if nodes who are closer to one another in the network are more a like.
“Hey, that sounds like some sort of correlation!” Yup, we need some spatial autocorrelation measure. Let us use Moran’s I for this.
We will start with a calculation of the correlation between the score of actor i and the (total/mean) score of the alters of i to whom i is connected directly.

Spatial autocorrelation measures are actually quite complex. A lot of build in functions in different packages of R are not very clear on all the defaults. With respect to Moran’s I, its values are actually quite difficult to compare across different spatial/network settings. Results may depend heavily on whether or not you demean your variables of interest, the chosen neighborhood/weight matrix (and hence on distance decay functions and type of standardization of the weight matrix). Anyways, lets get started.

3.3.1 formula

Moran’s I is given by:


I = γΣiΣjwijzizj,
where wij is the weight matrix zi and zj are the scores in deviations from the mean. And,


$$ \gamma= S_0 * m_2 = \Sigma_i\Sigma_jw_{ij} * \frac{\Sigma_iz_i^2}{n}, $$
Thus S0 is the sum of the weight matrix and m2 is an estimate of the variance. For more information see Anselin 1995.

3.3.2 Moran’s autocorrelation for outgoing ties: RSiena build-in dataset

We need two packages, if we not want to define all functions ourselves: sna and ape.1

Let us first demonstrate the concept on the build-in dataset of RSiena and then apply it to our own data.

First use sna. And give alters a weight of 1 if they are part of the 1.0 degree egocentric network.

library(network)
friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503
drink <- s50a
smoke <- s50s

net1 <- network::as.network(friend.data.w1)
net2 <- network::as.network(friend.data.w2)
net3 <- network::as.network(friend.data.w3)

# nacf does not row standardize!
snam1 <- nacf(net1, drink[, 1], type = "moran", neighborhood.type = "out", demean = TRUE)
snam1[2]  #the first order matrix is stored in second list-element
##         1 
## 0.4331431

Lets calculate the same thing with ape.

geodistances <- geodist(net1, count.paths = TRUE)
geodistances <- geodistances$gdist

# first define a nb based on distance 1.
weights1 <- geodistances == 1

# this function rowstandardizes by default
ape::Moran.I(drink[, 1], scaled = FALSE, weight = weights1, na.rm = TRUE)
## $observed
## [1] 0.486134
## 
## $expected
## [1] -0.02040816
## 
## $sd
## [1] 0.132727
## 
## $p.value
## [1] 0.000135401

Close but not the same value!

Lets use ‘my own’ function, don’t rowstandardize

fMoran.I(drink[, 1], scaled = FALSE, weight = weights1, na.rm = TRUE, rowstandardize = FALSE)
## $observed
## [1] 0.4331431
## 
## $expected
## [1] -0.02040816
## 
## $sd
## [1] 0.1181079
## 
## $p.value
## [1] 0.000122962

Same result as nacf function!

What does rowstandardization mean??

  • rowstandardize: We assume that each node i is influenced equally by its neighbourhood regardless on how large it. You could compare this to the average alter effect in RSiena)
  • not rowstandardize: We assume that each alter j has the same influence on i (if at the same distance). You could compare this to the total alter effect in RSiena.

To not standardize is default in sna::nacf, to standardize is default in apa::Moran.I. This bugs me. I thus made a small adaption to apa::Moran.I and now in the function fMoran.I you can choose if you want to rowstandardize or not.

But…

“What you really, really want
I wanna, (ha) I wanna, (ha)
I wanna, (ha) I wanna, (ha)
I wanna really, really, really
Wanna zigazig ah”
— Spice Girls - Wannabe

What I really would like to see is a correlation between actor i and all the alters to whom it is connected (direct or indirectly) and where alters at a larger distances (longer shortest path lengths) are weighted less.

step 1: for each acter i determine the distances (shortest path lengths) to all other nodes in the network. step 2: based on these distances decide on how we want to weigh. That is, determine a distance decay function.
step 3: decide whether or not we want to row-standardize.

# step 1: calculate distances
geodistances <- geodist(net1, count.paths = TRUE)
geodistances <- geodistances$gdist
# set the distance to yourself as Inf
diag(geodistances) <- Inf


# step 2: define a distance decay function. This one is pretty standard in the spatial
# autocorrelation literature but actually pretty arbitrary.
weights2 <- exp(-geodistances)

# step 3: I dont want to rowstandardize.
fMoran.I(drink[, 1], scaled = FALSE, weight = weights2, na.rm = TRUE, rowstandardize = FALSE)
## $observed
## [1] 0.2817188
## 
## $expected
## [1] -0.02040816
## 
## $sd
## [1] 0.07532918
## 
## $p.value
## [1] 6.052458e-05

Conclusion: Yes pupils closer to one another are more a like! And ‘closer’ here means a shorter shortest path length. You also observe that the correlation is lower than we calculated previously. Apparently, we are a like to alters very close by (path length one) and less so (or even dissimilar) to alters furter off.vvv

3.3.3 Moran’s autocorrelation for outgoing ties: MyMovez dataset

Let’s repeat the exercise but now on our own data. The tie variable we will use are the friendship relations.

# step 1: calculate distances
fnet <- ExData$depvars$friendship[, , 1]
fnet[fnet == 10] <- 0

geodistances <- geodist(fnet, count.paths = TRUE)
geodistances <- geodistances$gdist
# set the distance to yourself as Inf
diag(geodistances) <- Inf
# head(geodistances) #have a look for yourself.

# step 2: define a distance decay function. This one is pretty standard in the spatial
# autocorrelation literature but actually pretty arbitrary.
weights2 <- exp(-geodistances)

# step 3: In this case I do want to rowstandardize because I think the influence is by the total
# class but class sizes vary.
fMoran.I(ExData$depvars$mvpa_y[, , 1], scaled = FALSE, weight = weights2, na.rm = TRUE, rowstandardize = TRUE)
## $observed
## [1] 0.07961828
## 
## $expected
## [1] -0.006756757
## 
## $sd
## [1] 0.04212021
## 
## $p.value
## [1] 0.04029821

Thus Yes, If we look at all classes together we see that pupils who are closer to one another are more a like with respect to physical activity. But the correlations is quite small!

Do we see the same within each class?

# some background info:
nclass <- c(28, 18, 18, 18, 18, 25, 24)
classid <- rep(1:7, times = nclass)

print("Moran's I: class 1")
## [1] "Moran's I: class 1"
fMoran.I(ExData$depvars$mvpa_y[, , 1][1:28], scaled = FALSE, weight = weights2[1:28, 1:28], na.rm = TRUE, 
    rowstandardize = TRUE)
## $observed
## [1] -0.03002763
## 
## $expected
## [1] -0.03703704
## 
## $sd
## [1] 0.02435121
## 
## $p.value
## [1] 0.7734643
print("Moran's I: class 2")
## [1] "Moran's I: class 2"
fMoran.I(ExData$depvars$mvpa_y[, , 1][29:46], scaled = FALSE, weight = weights2[29:46, 29:46], na.rm = TRUE, 
    rowstandardize = TRUE)
## $observed
## [1] 0.04152623
## 
## $expected
## [1] -0.05882353
## 
## $sd
## [1] 0.053813
## 
## $p.value
## [1] 0.06221135

Correlations within classes are somewhat lower and/or not significant. Probably there is thus also similarity beween pupils because they are in the same class (might be due to selection and influence processes, or class/context effects of course).

3.4 A quick co-evolution RSiena model to check.

Please be aware that in co-evolution models (as in multiplex models) the variables defined as dependent variables in your RSiena data object can be both a ‘cause’ (or independent variable) and a ‘consequence’ (or dependent variable). In this influence effect: includeEffects(myEff, name = "drinkingbeh", outdeg, interaction1= "friendship" ) the dependent variable “drinkingbeh” is define by name, the independent networkvariable “friendship” by interaction.

# Step 1: DATA
friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503
drink <- s50a
smoke <- s50s

friendship <- sienaDependent(array(c(friend.data.w1, friend.data.w2, friend.data.w3), dim = c(50, 50, 
    3)))  # create tie-dependent variable

drinkingbeh <- sienaDependent(drink, type = "behavior")  # create behavior-dependent variable

smoke1 <- coCovar(smoke[, 1])  #covariate

# Define the data set and obtain the basic effects object
myCoEvolutionData <- sienaDataCreate(friendship, smoke1, drinkingbeh)

# STEP 2: have a look at data.
print01Report(myCoEvolutionData, modelname = "s50_3_CoEvinit")
# look at the created file!!

# STEP 3: Define effects
myCoEvolutionEff <- getEffects(myCoEvolutionData)
# effectsDocumentation(myCoEvolutionEff)

# structural effects
myCoEvolutionEff <- includeEffects(myCoEvolutionEff, transTrip, cycle3)

# homophily effect for the constant covariate smoking
myCoEvolutionEff <- includeEffects(myCoEvolutionEff, simX, interaction1 = "smoke1")

# selection effect related to drinking behavior?
myCoEvolutionEff <- includeEffects(myCoEvolutionEff, egoX, altX, simX, interaction1 = "drinkingbeh")

# INFLUENCE PART!! inline with the above I use totAlt
myCoEvolutionEff <- includeEffects(myCoEvolutionEff, name = "drinkingbeh", totAlt, interaction1 = "friendship")

# Check what effects you have decided to include:

myCoEvolutionEff

# STEP 4: define algorithm
myCoEvAlgorithm <- sienaAlgorithmCreate(projname = "s50CoEv_3")

# STEP 5: estimate the model
(ans <- siena07(myCoEvAlgorithm, data = myCoEvolutionData, effects = myCoEvolutionEff))

# use this function if you want to save as excel fanscsv(ans, write=FALSE) #uncomment if you want.
##   effectName          include fix   test  initialValue parm
## 1 transitive triplets TRUE    FALSE FALSE          0   0   
## 2 3-cycles            TRUE    FALSE FALSE          0   0   
##   effectName        include fix   test  initialValue parm
## 1 smoke1 similarity TRUE    FALSE FALSE          0   0   
##   effectName             include fix   test  initialValue parm
## 1 drinkingbeh alter      TRUE    FALSE FALSE          0   0   
## 2 drinkingbeh ego        TRUE    FALSE FALSE          0   0   
## 3 drinkingbeh similarity TRUE    FALSE FALSE          0   0   
##   effectName              include fix   test  initialValue parm
## 1 drinkingbeh total alter TRUE    FALSE FALSE          0   0   
##    name        effectName                          include fix   test  initialValue parm
## 1  friendship  constant friendship rate (period 1) TRUE    FALSE FALSE    4.69604   0   
## 2  friendship  constant friendship rate (period 2) TRUE    FALSE FALSE    4.32885   0   
## 3  friendship  outdegree (density)                 TRUE    FALSE FALSE   -1.46770   0   
## 4  friendship  reciprocity                         TRUE    FALSE FALSE    0.00000   0   
## 5  friendship  transitive triplets                 TRUE    FALSE FALSE    0.00000   0   
## 6  friendship  3-cycles                            TRUE    FALSE FALSE    0.00000   0   
## 7  friendship  smoke1 similarity                   TRUE    FALSE FALSE    0.00000   0   
## 8  friendship  drinkingbeh alter                   TRUE    FALSE FALSE    0.00000   0   
## 9  friendship  drinkingbeh ego                     TRUE    FALSE FALSE    0.00000   0   
## 10 friendship  drinkingbeh similarity              TRUE    FALSE FALSE    0.00000   0   
## 11 drinkingbeh rate drinkingbeh (period 1)         TRUE    FALSE FALSE    0.70571   0   
## 12 drinkingbeh rate drinkingbeh (period 2)         TRUE    FALSE FALSE    0.84939   0   
## 13 drinkingbeh drinkingbeh linear shape            TRUE    FALSE FALSE    0.32237   0   
## 14 drinkingbeh drinkingbeh quadratic shape         TRUE    FALSE FALSE    0.00000   0   
## 15 drinkingbeh drinkingbeh total alter             TRUE    FALSE FALSE    0.00000   0   
## siena07 will create an output file s50CoEv_3.txt .
## Estimates, standard errors and convergence t-ratios
## 
##                                                Estimate   Standard   Convergence 
##                                                             Error      t-ratio   
## Network Dynamics 
##    1. rate constant friendship rate (period 1)  6.4714  ( 1.2823   )   -0.0436   
##    2. rate constant friendship rate (period 2)  5.1689  ( 0.8075   )    0.0244   
##    3. eval outdegree (density)                 -2.7653  ( 0.1628   )    0.0442   
##    4. eval reciprocity                          2.3927  ( 0.2379   )    0.0496   
##    5. eval transitive triplets                  0.6507  ( 0.1471   )    0.0403   
##    6. eval 3-cycles                            -0.0799  ( 0.2978   )    0.0439   
##    7. eval smoke1 similarity                    0.1762  ( 0.2153   )    0.0224   
##    8. eval drinkingbeh alter                   -0.0399  ( 0.1113   )   -0.0370   
##    9. eval drinkingbeh ego                      0.0654  ( 0.1129   )   -0.0542   
##   10. eval drinkingbeh similarity               1.3385  ( 0.6253   )    0.0334   
## 
## Behavior Dynamics
##   11. rate rate drinkingbeh (period 1)          1.3061  ( 0.3478   )    0.0225   
##   12. rate rate drinkingbeh (period 2)          1.7778  ( 0.4695   )   -0.0510   
##   13. eval drinkingbeh linear shape             0.4101  ( 0.2213   )   -0.0379   
##   14. eval drinkingbeh quadratic shape         -0.5303  ( 0.2699   )   -0.0671   
##   15. eval drinkingbeh total alter              0.4202  ( 0.2333   )   -0.0270   
## 
## Overall maximum convergence ratio:    0.1491 
## 
## 
## Total of 3197 iteration steps.

What would you conclude?

  • Yes, we observe selection based on drinking.
  • Yes, we observe influence via the drinking behavior of befriended alters.

4 Assignment

Assignment: Formulate at least three research questions with respect to selection and influence effects among pupils with respect to physical activity. Test these effects on the MyMovez dataset.

  • start with descriptive statistics
  • estimate subsequent models (at least three) and please motivate the order by which you include additional effects/variables
  • include at least two multiplex effects
  • include/try at least four behavioral-evolution effects

4.1 testing and finding optimal model

You probably need to estimate quite some models. To speed things up, you may want to tweak your algorithm by sienaAlgorithm or you may want to run you models on a subsample first.

4.1.1 sienaAlgorithm

Have a look at the function: ?sienaAlgorithm. You could change n3 to 500 and nsub to 2.

4.1.2 select only one class

You may want to test your models first on one class only. When you are satisfied, you could run your models on the total class pool. In the code snippet below, you see how to select one class and make a new RSiena dataobject.

# these are the respective class sizes.
nclass <- c(28, 18, 18, 18, 18, 25, 24)
classid <- rep(1:7, times = nclass)

test <- ExData[classid == 1]  #change classid to select a different class. 
# because everything needs to be mean centered again also make sure to run the next command
class1 <- sienaDataCreate(test$depvars$friendship, test$depvars$advice, test$depvars$mvpa_y, test$cCovars$ethnicNLNA, 
    test$cCovars$sex, test$cCovars$lft, test$cCovars$primary, test$vCovars$mvpa_x)

4.2 An example on the MyMovez dataset

require(RSiena)
require(xtable)  # for some html output

# Step 1: DATA
load("beh_data.RData")
mydata <- ExData

# Stept 2: some first summary
print01Report(mydata, modelname = "segtest1")
# look at the printed doc!!

# Step 3: set algorithm
myalgorithm <- sienaAlgorithmCreate(projname = "segtest1")
## siena07 will create an output file segtest1.txt .
# Step 4: set effects
NBeff <- getEffects(mydata)
# have a look at all possible effects effectsDocumentation(NBeff) #uncomment if you want to have a
# look

# possible order?  a: uncontrolled for network structure effects b: controlled for network structure
# effects M1a/b: selection: homophily tendencies demographics: simsex, simethnic, M2a/b: selection:
# homophily tendencies health: MVPA_y M3a/b: influence: on health M4: total

# I am just estimating the total model in this example.

# Structural effects only focus on friendship network in this example, thus specifying 'name'
# argument is not necessary.
NBeff <- includeEffects(NBeff, inPop, transTrip, transRecTrip)
##   effectName                              include fix   test  initialValue parm
## 1 friendship: transitive triplets         TRUE    FALSE FALSE          0   0   
## 2 friendship: transitive recipr. triplets TRUE    FALSE FALSE          0   0   
## 3 friendship: indegree - popularity       TRUE    FALSE FALSE          0   0
# selection effects
NBeff <- includeEffects(NBeff, egoX, altX, egoXaltX, interaction1 = "ethnicNLNA")
##   effectName                                    include fix   test  initialValue parm
## 1 friendship: ethnicNLNA alter                  TRUE    FALSE FALSE          0   0   
## 2 friendship: ethnicNLNA ego                    TRUE    FALSE FALSE          0   0   
## 3 friendship: ethnicNLNA ego x ethnicNLNA alter TRUE    FALSE FALSE          0   0
NBeff <- includeEffects(NBeff, egoX, altX, sameX, interaction1 = "sex")
##   effectName            include fix   test  initialValue parm
## 1 friendship: sex alter TRUE    FALSE FALSE          0   0   
## 2 friendship: sex ego   TRUE    FALSE FALSE          0   0   
## 3 friendship: same sex  TRUE    FALSE FALSE          0   0
NBeff <- includeEffects(NBeff, egoX, altX, absDiffX, interaction1 = "mvpa_y")
##   effectName                         include fix   test  initialValue parm
## 1 friendship: mvpa_y alter           TRUE    FALSE FALSE          0   0   
## 2 friendship: mvpa_y ego             TRUE    FALSE FALSE          0   0   
## 3 friendship: mvpa_y abs. difference TRUE    FALSE FALSE          0   0
# behavioral model: node effects
NBeff <- includeEffects(NBeff, effFrom, name = "mvpa_y", interaction1 = "sex")
##   effectName              include fix   test  initialValue parm
## 1 mvpa_y: effect from sex TRUE    FALSE FALSE          0   0
NBeff <- includeEffects(NBeff, effFrom, name = "mvpa_y", interaction1 = "lft")
##   effectName              include fix   test  initialValue parm
## 1 mvpa_y: effect from lft TRUE    FALSE FALSE          0   0
NBeff <- includeEffects(NBeff, effFrom, name = "mvpa_y", interaction1 = "ethnicNLNA")
##   effectName                     include fix   test  initialValue parm
## 1 mvpa_y: effect from ethnicNLNA TRUE    FALSE FALSE          0   0
# influence effects
NBeff <- includeEffects(NBeff, totSimRecip, name = "mvpa_y", interaction1 = "friendship")
##   effectName                                  include fix   test  initialValue parm
## 1 mvpa_y tot. sim. x reciprocity (friendship) TRUE    FALSE FALSE          0   0
# look at all effects
NBeff
##    name       effectName                                    include fix   test  initialValue parm
## 1  friendship constant friendship rate (period 1)           TRUE    FALSE FALSE    7.28417   0   
## 2  friendship constant friendship rate (period 2)           TRUE    FALSE FALSE    7.27877   0   
## 3  friendship friendship: outdegree (density)               TRUE    FALSE FALSE   -1.25979   0   
## 4  friendship friendship: reciprocity                       TRUE    FALSE FALSE    0.00000   0   
## 5  friendship friendship: transitive triplets               TRUE    FALSE FALSE    0.00000   0   
## 6  friendship friendship: transitive recipr. triplets       TRUE    FALSE FALSE    0.00000   0   
## 7  friendship friendship: indegree - popularity             TRUE    FALSE FALSE    0.00000   0   
## 8  friendship friendship: ethnicNLNA alter                  TRUE    FALSE FALSE    0.00000   0   
## 9  friendship friendship: ethnicNLNA ego                    TRUE    FALSE FALSE    0.00000   0   
## 10 friendship friendship: ethnicNLNA ego x ethnicNLNA alter TRUE    FALSE FALSE    0.00000   0   
## 11 friendship friendship: sex alter                         TRUE    FALSE FALSE    0.00000   0   
## 12 friendship friendship: sex ego                           TRUE    FALSE FALSE    0.00000   0   
## 13 friendship friendship: same sex                          TRUE    FALSE FALSE    0.00000   0   
## 14 friendship friendship: mvpa_y alter                      TRUE    FALSE FALSE    0.00000   0   
## 15 friendship friendship: mvpa_y ego                        TRUE    FALSE FALSE    0.00000   0   
## 16 friendship friendship: mvpa_y abs. difference            TRUE    FALSE FALSE    0.00000   0   
## 17 advice     constant advice rate (period 1)               TRUE    FALSE FALSE    5.39192   0   
## 18 advice     constant advice rate (period 2)               TRUE    FALSE FALSE    4.07544   0   
## 19 advice     advice: outdegree (density)                   TRUE    FALSE FALSE   -1.73553   0   
## 20 advice     advice: reciprocity                           TRUE    FALSE FALSE    0.00000   0   
## 21 mvpa_y     rate mvpa_y (period 1)                        TRUE    FALSE FALSE    2.17740   0   
## 22 mvpa_y     rate mvpa_y (period 2)                        TRUE    FALSE FALSE    2.87832   0   
## 23 mvpa_y     mvpa_y linear shape                           TRUE    FALSE FALSE    0.04291   0   
## 24 mvpa_y     mvpa_y quadratic shape                        TRUE    FALSE FALSE    0.00000   0   
## 25 mvpa_y     mvpa_y tot. sim. x reciprocity (friendship)   TRUE    FALSE FALSE    0.00000   0   
## 26 mvpa_y     mvpa_y: effect from ethnicNLNA                TRUE    FALSE FALSE    0.00000   0   
## 27 mvpa_y     mvpa_y: effect from sex                       TRUE    FALSE FALSE    0.00000   0   
## 28 mvpa_y     mvpa_y: effect from lft                       TRUE    FALSE FALSE    0.00000   0
# Please uncomment this section. I just don't want to reestimate the model. It does take a while.
# (ans <- siena07( myalgorithm, data = ExData, effects = NBeff)) save(ans, file='ans.RData')
# siena.table(ans, type='html', tstat=T, d=2, sig=T)
load("ans.RData")
ans
## Estimates, standard errors and convergence t-ratios
## 
##                                                          Estimate   Standard   Convergence 
##                                                                       Error      t-ratio   
## Network Dynamics 
##    1. rate constant friendship rate (period 1)            5.2601  ( 0.3035   )   -0.0635   
##    2. rate constant friendship rate (period 2)            5.8955  ( 0.4876   )   -0.0676   
##    3. eval friendship: outdegree (density)                2.7341  ( 2.2175   )   -0.0638   
##    4. eval friendship: reciprocity                        2.9414  ( 1.8683   )   -0.1328   
##    5. eval friendship: transitive triplets                0.9225  ( 0.4882   )   -0.0978   
##    6. eval friendship: transitive recipr. triplets       -0.8119  ( 0.5046   )   -0.1375   
##    7. eval friendship: indegree - popularity             -0.2624  ( 0.1349   )   -0.0828   
##    8. eval friendship: ethnicNLNA alter                  -0.1208  ( 0.3393   )    0.0060   
##    9. eval friendship: ethnicNLNA ego                     0.0291  ( 0.3723   )   -0.0137   
##   10. eval friendship: ethnicNLNA ego x ethnicNLNA alter  0.0181  ( 0.9868   )   -0.0493   
##   11. eval friendship: mvpa_y alter                       0.0624  ( 0.5220   )    0.0570   
##   12. eval friendship: mvpa_y ego                        -0.3947  ( 0.3123   )   -0.0445   
##   13. eval friendship: mvpa_y abs. difference            -3.6551  ( 2.5637   )   -0.0319   
##   14. rate constant advice rate (period 1)                7.3773  ( 0.8998   )    0.0094   
##   15. rate constant advice rate (period 2)                5.4382  ( 0.5426   )   -0.0753   
##   16. eval advice: outdegree (density)                   -1.6276  ( 0.0679   )   -0.0297   
##   17. eval advice: reciprocity                            0.9398  ( 0.1646   )   -0.0359   
## 
## Behavior Dynamics
##   18. rate rate mvpa_y (period 1)                         8.6516  ( 3.4273   )   -0.0135   
##   19. rate rate mvpa_y (period 2)                        25.8909  ( 8.7388   )   -0.0022   
##   20. eval mvpa_y linear shape                           -0.0155  ( 0.0547   )    0.0096   
##   21. eval mvpa_y quadratic shape                         0.0934  ( 0.0701   )   -0.0931   
##   22. eval mvpa_y tot. sim. x reciprocity (friendship)    0.1675  ( 0.1606   )    0.0369   
##   23. eval mvpa_y: effect from ethnicNLNA                 0.2842  ( 0.0969   )    0.0263   
##   24. eval mvpa_y: effect from sex                       -0.1610  ( 0.0993   )   -0.0041   
##   25. eval mvpa_y: effect from lft                       -0.0595  ( 0.0309   )   -0.0271   
## 
## Overall maximum convergence ratio:    0.2810 
## 
## 
## Total of 3944 iteration steps.

What would you conclude?

  • The selection variable of interest is in the predicted direction but does not reach significance.
    • Should we model friendship selection based on mvpa_y with a different statistic/effect?
    • Would we observe the same with respect to advice relations?
  • The influenc variable of interest is in the predicted direction but does not reach significance.
    • Should we model network influence on mvpa_y with a different statistic/effect?
    • Would we observe the same with respect to advice relations?

Hopefully, this example gives you enough pointers to make the assignment.


  1. I quite frequently need to calculate Moran’s I and related statistics in my work/hobby. I commonly use the functions in the R package spdep.↩︎

Previous
Next