RSiena I (causes)

1 Introduction - RSiena

This page introduces the R package RSiena. This week/page we will estimate our first RSiena models to explain the structure of the network and how the network structure evolves. With respect to the latter we only focus this week on ‘tie evolution’ and ‘selection processes’. Thus, we are focusing on the causes. On the next page) we will focus on node evolution and influence processes.
Please visit the RSiena homepage here. On the website you may find the exceptionally well written manual. But the RSiena website also contains all kind of useful RSiena scripts and ‘lab exercises’.
Before we start with our own lab exercise, I would like you to:

  1. walk through script number 4, and 5.
  2. walk through script number 1 and make the first assignment.
  3. make the first lab exercise on the RSiena website ‘sex segregation of friendship networks’.

My own lab exercise can be found below.

Having taught the course social networks for a couple of years now to the students of the Research Master Social and Cultural Sciences, I think there are a couple of aspects that deserve a little bit more attention than they get in the manual and on the RSiena website. Namely,…

  1. model assumptions that refer to the ‘Theory of Action’
  2. the interpretation of the (estimated) rate parameter
  3. the interpretation of the (estimated) coefficients within the evaluation function
  4. the difference between a $ s^{net} $ effect and an $ s^{el} $ effect.
  5. how to build subsequent models in RSiena

2 Assumptions

  • ministep:
    1. one actor per time step (no coordination)
    2. one tie change per time step
    3. gradual behavioral change (one unit increase per time step)
  • no memory: the same choice set is evaluated the same each time.
  • no strategic action: actors do not anticipate that one (negatively evaluated) tie change at T1 may open up the opportunity for a (very positively) tie change at T2.
  • local networks: actors base their decisions only on the evaluation of their own local network (no altruism).

3 Rate function

The rate function of actor i is denoted:


λi(x)

Suppose we have three actors: i, j and k. And suppose that the rate function is a constant, thus the rate function does not depend on the network structure or attributes of the actors. Thus suppose for example:

  • λi = 5
  • λj = 10
  • λk = 15

The waiting times of actors i, j and k are exponentially distributed with rate parameter λ. What do these exponential distribution look like?

par(mfrow = c(1, 3))

dist_5 <- rexp(10000, rate = 5)
hist(dist_5, main = "rate = lambda_i = 5", freq = FALSE, xlab = "waiting times", xlim = c(0, 2), ylim = c(0, 
    9))
abline(v = 1/5, col = "red")

dist_10 <- rexp(10000, rate = 10)
hist(dist_10, main = "rate= lambda_j = 10", freq = FALSE, xlab = "waiting times", xlim = c(0, 2), ylim = c(0, 
    9))
abline(v = 1/10, col = "red")

dist_15 <- rexp(10000, rate = 15)
hist(dist_10, main = "rate = lambda_k = 15", freq = FALSE, xlab = "waiting times", xlim = c(0, 2), ylim = c(0, 
    9))
abline(v = 1/15, col = "red")

We now want to determine who will be allowed to take the next ministep. We thus need to sample a waiting time for each actor. Thus each actor gets a waiting time sampled from the exponential distribution with the specified rate parameter:

set.seed(34641)
waitingtimes <- NA
waitingtimes[1] <- rexp(1, rate = 5)
waitingtimes[2] <- rexp(1, rate = 10)
waitingtimes[3] <- rexp(1, rate = 15)
print(paste("waitingtime_", c("i: ", "j: ", "k: "), round(waitingtimes, 3), sep = ""))
## [1] "waitingtime_i: 0.264" "waitingtime_j: 0.414" "waitingtime_k: 0.028"

Actor k has the shortest waiting time and is allowed to take a ministep. In the example above we only took one draw out of each exponential distribution. If we would take many draws the expected value of the waiting time is $\frac{1}{\lambda}$ and these values are added as vertical lines in the figure above. Thus with a higher rate parameter λ the shorter the expected waiting time.

Now let us repeat this process of determining who is allowed to take a ministep a couple of times and keep track of who will make the ministep and the time that has passed:

set.seed(245651)
sam_waitingtimes <- NA
time <- 0
for (ministeps in 1:50) {
    waitingtimes <- NA
    waitingtimes[1] <- rexp(1, rate = 5)
    waitingtimes[2] <- rexp(1, rate = 10)
    waitingtimes[3] <- rexp(1, rate = 15)
    actor <- which(waitingtimes == min(waitingtimes))
    time <- time + waitingtimes[actor]
    sam_waitingtimes[ministeps] <- waitingtimes[actor]
    print(paste("ministep nr.: ", ministeps, sep = ""))
    print(paste("waitingtime_", c("i: ", "j: ", "k: ")[actor], round(waitingtimes, 3)[actor], sep = ""))
    print(paste("time past: ", round(time, 3), sep = ""))
}
## [1] "ministep nr.: 1"
## [1] "waitingtime_i: 0.012"
## [1] "time past: 0.012"
## [1] "ministep nr.: 2"
## [1] "waitingtime_k: 0.003"
## [1] "time past: 0.014"
## [1] "ministep nr.: 3"
## [1] "waitingtime_k: 0.013"
## [1] "time past: 0.027"
## [1] "ministep nr.: 4"
## [1] "waitingtime_i: 0"
## [1] "time past: 0.027"
## [1] "ministep nr.: 5"
## [1] "waitingtime_j: 0.052"
## [1] "time past: 0.08"
## [1] "ministep nr.: 6"
## [1] "waitingtime_j: 0.073"
## [1] "time past: 0.153"
## [1] "ministep nr.: 7"
## [1] "waitingtime_k: 0.054"
## [1] "time past: 0.207"
## [1] "ministep nr.: 8"
## [1] "waitingtime_i: 0.016"
## [1] "time past: 0.223"
## [1] "ministep nr.: 9"
## [1] "waitingtime_k: 0.006"
## [1] "time past: 0.229"
## [1] "ministep nr.: 10"
## [1] "waitingtime_k: 0.019"
## [1] "time past: 0.248"
## [1] "ministep nr.: 11"
## [1] "waitingtime_k: 0.044"
## [1] "time past: 0.292"
## [1] "ministep nr.: 12"
## [1] "waitingtime_k: 0.024"
## [1] "time past: 0.316"
## [1] "ministep nr.: 13"
## [1] "waitingtime_i: 0.022"
## [1] "time past: 0.338"
## [1] "ministep nr.: 14"
## [1] "waitingtime_k: 0.038"
## [1] "time past: 0.376"
## [1] "ministep nr.: 15"
## [1] "waitingtime_i: 0.027"
## [1] "time past: 0.403"
## [1] "ministep nr.: 16"
## [1] "waitingtime_k: 0.002"
## [1] "time past: 0.405"
## [1] "ministep nr.: 17"
## [1] "waitingtime_i: 0.045"
## [1] "time past: 0.45"
## [1] "ministep nr.: 18"
## [1] "waitingtime_k: 0.023"
## [1] "time past: 0.473"
## [1] "ministep nr.: 19"
## [1] "waitingtime_i: 0"
## [1] "time past: 0.474"
## [1] "ministep nr.: 20"
## [1] "waitingtime_i: 0.024"
## [1] "time past: 0.498"
## [1] "ministep nr.: 21"
## [1] "waitingtime_k: 0.013"
## [1] "time past: 0.511"
## [1] "ministep nr.: 22"
## [1] "waitingtime_j: 0.018"
## [1] "time past: 0.529"
## [1] "ministep nr.: 23"
## [1] "waitingtime_j: 0.027"
## [1] "time past: 0.556"
## [1] "ministep nr.: 24"
## [1] "waitingtime_j: 0.061"
## [1] "time past: 0.617"
## [1] "ministep nr.: 25"
## [1] "waitingtime_k: 0.018"
## [1] "time past: 0.635"
## [1] "ministep nr.: 26"
## [1] "waitingtime_j: 0.097"
## [1] "time past: 0.732"
## [1] "ministep nr.: 27"
## [1] "waitingtime_j: 0.007"
## [1] "time past: 0.739"
## [1] "ministep nr.: 28"
## [1] "waitingtime_j: 0.009"
## [1] "time past: 0.748"
## [1] "ministep nr.: 29"
## [1] "waitingtime_k: 0.001"
## [1] "time past: 0.75"
## [1] "ministep nr.: 30"
## [1] "waitingtime_j: 0.009"
## [1] "time past: 0.759"
## [1] "ministep nr.: 31"
## [1] "waitingtime_i: 0.012"
## [1] "time past: 0.77"
## [1] "ministep nr.: 32"
## [1] "waitingtime_i: 0.056"
## [1] "time past: 0.826"
## [1] "ministep nr.: 33"
## [1] "waitingtime_k: 0.031"
## [1] "time past: 0.857"
## [1] "ministep nr.: 34"
## [1] "waitingtime_j: 0.051"
## [1] "time past: 0.908"
## [1] "ministep nr.: 35"
## [1] "waitingtime_k: 0.014"
## [1] "time past: 0.923"
## [1] "ministep nr.: 36"
## [1] "waitingtime_j: 0.021"
## [1] "time past: 0.943"
## [1] "ministep nr.: 37"
## [1] "waitingtime_k: 0.021"
## [1] "time past: 0.965"
## [1] "ministep nr.: 38"
## [1] "waitingtime_k: 0.059"
## [1] "time past: 1.024"
## [1] "ministep nr.: 39"
## [1] "waitingtime_k: 0.029"
## [1] "time past: 1.053"
## [1] "ministep nr.: 40"
## [1] "waitingtime_j: 0.013"
## [1] "time past: 1.065"
## [1] "ministep nr.: 41"
## [1] "waitingtime_j: 0.044"
## [1] "time past: 1.11"
## [1] "ministep nr.: 42"
## [1] "waitingtime_k: 0.069"
## [1] "time past: 1.178"
## [1] "ministep nr.: 43"
## [1] "waitingtime_i: 0.066"
## [1] "time past: 1.244"
## [1] "ministep nr.: 44"
## [1] "waitingtime_j: 0.003"
## [1] "time past: 1.248"
## [1] "ministep nr.: 45"
## [1] "waitingtime_j: 0.01"
## [1] "time past: 1.258"
## [1] "ministep nr.: 46"
## [1] "waitingtime_k: 0.086"
## [1] "time past: 1.343"
## [1] "ministep nr.: 47"
## [1] "waitingtime_k: 0.046"
## [1] "time past: 1.389"
## [1] "ministep nr.: 48"
## [1] "waitingtime_k: 0.031"
## [1] "time past: 1.42"
## [1] "ministep nr.: 49"
## [1] "waitingtime_i: 0.007"
## [1] "time past: 1.427"
## [1] "ministep nr.: 50"
## [1] "waitingtime_j: 0.066"
## [1] "time past: 1.493"

I know you are wondering what the distribution of sampled waiting times look like, don’t you? Well let’s sample some more and plot them.

set.seed(245651)
sam_waitingtimes <- NA
for (ministeps in 1:5000) {
    waitingtimes <- NA
    waitingtimes[1] <- rexp(1, rate = 5)
    waitingtimes[2] <- rexp(1, rate = 10)
    waitingtimes[3] <- rexp(1, rate = 15)
    actor <- which(waitingtimes == min(waitingtimes))
    sam_waitingtimes[ministeps] <- waitingtimes[actor]
}

par(mfrow = c(1, 2))
hist(sam_waitingtimes, freq = FALSE, xlab = "waiting times", main = "sampled waiting times")
abline(v = mean(sam_waitingtimes), col = "red")

hist(rexp(5000, rate = 30), freq = FALSE, xlab = "waiting times", main = "rate=30")
abline(v = 1/30, col = "red")

The distribution of the sampled waiting times is plotted in the figure above on the left. As you see this distribution is ‘identical’ to the exponential distribution with a rate parameter λ of 30 (which is plotted on the right). And the expected waiting time, plotted in red, is 1/30. This leads us to page 43 of the RSiena manual:

The time duration until the next opportunity of change is exponentially distributed with parameter:

λ+(x0) = Σiλi(x0)

Remember, the waiting times for each actor i are exponentially distributed with parameter λi. The observed time duration until the next ministeps are exponentially distributed with parameter λ+.

If an actor has a higher rate parameter, the expected sampled waiting time is shorter. And since the actor with the shortest waiting time will make the ministep, actors with the highest rate parameter have the highest probability to have an opportunity for change. Thus the larger the rate parameter the more opportunities for change there are within a given time period.

So how many opportunities for change do we have before the total time exceeds 1? Please have a look above with the example of actors i, j and k with rate parameters of 5, 10 and 15 respectively. We see that after ministep 38 time exceeds 1. Of course this was only one run. We could repeat the simulation a couple of times and keep track of the total ministeps and the ministeps for each actor. Let us plot the number of ministeps for actors i, j and k for 1000 simulations.

set.seed(245651)

results <- list()
for (nsim in 1:1000) {
    time <- 0
    steps_tot <- 0
    steps_i <- 0
    steps_j <- 0
    steps_k <- 0
    actors <- NA
    while (time < 1) {
        steps_tot <- steps_tot + 1
        waitingtimes <- NA
        waitingtimes[1] <- rexp(1, rate = 5)
        waitingtimes[2] <- rexp(1, rate = 10)
        waitingtimes[3] <- rexp(1, rate = 15)
        actor <- which(waitingtimes == min(waitingtimes))
        time <- time + waitingtimes[actor]
        actors[steps_tot] <- actor
    }
    results[[nsim]] <- actors
}

# sum(results[[1]]==1) hist(sapply(results, length))

par(mfrow = c(1, 3))
{
    hist(sapply(results, function(x) {
        sum(x == 1)
    }), xlab = "nsteps", main = "actor i: lambda=5")
    abline(v = mean(sapply(results, function(x) {
        sum(x == 1)
    })), col = "red")
}

{
    hist(sapply(results, function(x) {
        sum(x == 2)
    }), xlab = "nsteps", main = "actor j: lambda=10")
    abline(v = mean(sapply(results, function(x) {
        sum(x == 2)
    })), col = "red")
}

{
    hist(sapply(results, function(x) {
        sum(x == 3)
    }), xlab = "nsteps", main = "actor k: lambda=15")
    abline(v = mean(sapply(results, function(x) {
        sum(x == 3)
    })), col = "red")
}

Thus the larger the rate parameter the more opportunities for change per actor there are within a given time period. And in RSiena the optimal value for the rate parameter λi is estimated. And from the figure above we see that the estimated parameter has a nice interpretation: the estimated rate parameter refers to the expected number of opportunities for change in a time period.

4 Interpretation of parameters

The evaluation function is defined as:


finet(x) = Σkβknetsiknet(x)
Thus fnet is the evaluation function. And it attaches a value/number to the attractiveness of the network, x. The subscript i refers to the agent, thus each agent will get its own value of the evaluation function. βknet refers to our estimated parameters. These are what the results will spit out. And for each network effect k, siknet, we will obtain a separate estimate. Each agent evaluates the attractiveness of its local network environment. This is why si has a subscript i. We as scientists have ideas about which network effects siknet may play a role in the evaluation of the local networks. Based on the mathematical definition of siknet the value of statistic k will be determined for each of the possible networks that may result after a ministep of agent i. Agent i is most likely to take a ministep that will result in the network with the highest attractiveness value. The RSiena software will then estimate the parameters βknet for which it is most likely to obtain the network observed at T2 given the network observed at T1. More precisely, to observe a network at T2 with similar network structures as the actual network observed at T2.

Now, let us suppose that actor i has an opportunity for change at that after a ministep three possible networks could occur. Or stated otherwise, the choice set consists of three networks for actor i. See below.

Figure: Choice set for actor i.

How actor i evaluates these networks depends on the siknet in the evaluation function. Let us suppose only the outdegree effect is part of the evaluation function. Thus:


finet(x) = β1netsi1net(x)
where


si1net(x) = Σjxij
and given the networks above:


si1net(xa) = 0, si1net(xb) = 1, si1net(xc) = 1
and hence:


finet(xa) = 0, finet(xb) = β1net, finet(xc) = β1net
The probability that xb will be chosen is given by:


$$ P(X= x_b) = \frac{exp(f^{net}_i(x_b))}{exp(f^{net}_i(x_a))+exp(f^{net}_i(x_b))+exp(f^{net}_i(x_c))} $$

For the interpretation is much easier to interpret the ratio of two probabilities:


$$ \frac{P(X= x_b)}{P(X= x_a)} = \frac{exp(f^{net}_i(x_b))}{exp(f^{net}_i(x_a))} = exp(f^{net}_i(x_b) - f^{net}_i(x_a) ) = exp(\beta^{net}_1)$$
Let us assume that β1net =  − 2 This would imply that:


P(X = xb) = exp( − 2) * P(X = xa) ≈ 0.14 * P(X = xa)

Thus the probability to observe a tie between i and j (network xb) is 14% the probability to observe no tie between i and j (network xa).

Is it possible to deduce the density of the network from this formula? Well let suppose actor i would only have options xa and xb then the probabilities would need to sum to 1. And this would imply a density of approximately .12 (0.12=0.14*.88 and .12 + .88 = 1).

The interpretation of the parameters thus resembles the interpretation of a logistic regression: if one covariate xk increases only with one step and the parameter estimate of this covariate is βk, the odds $\frac{p_{x_k=1}}{1-p_{x_k=1}}$ is exp(βk)*$\frac{p_{x_k=0}}{1-p_{x_k=0}}$1

5 lab exercise

We are going to use the same dataset we used in the lab exercise on plotting social networks.

During the workgroup I will explain all code. For those of you who don’t attend the workgroups, google knows way more than I do.

In the upper left and right corner of the code blocks you will find copy-to-clipboard buttons. Use these buttons to copy the code to your own editor.

5.1 Before you start

Before you start, check whether you run the latest RStudio version (from the Help menu, pick ‘check for updates’ and whether you need to update R.

install.packages("installr")  #you  first install packages
require(installr)  #then you will need to activate packages. 
updateR()  #run the function to start the update process

Give your script a nice name. Include the author, and data when you last modified the script. Include a lot of comments in your script! Don’t forget, always start with cleaning up your workspace.

### Author: JOCHEM TOLSMA### Lastmod: 31-08-2020###

# cleanup workspace
rm(list = ls())

And set your working directory.

# set working directory
setwd("C:\\YOURDIR\\YOURSUBDIR\\YOURSUBSUBDIR\\")  #change to your own workdirectory

Install the packages you will need.

# install packages
library(RSiena)

Define your custom build functions.

# density: observed relations divided by possible relations
fdensity <- function(x) {
    # x is your nomination network make sure diagonal cells are NA
    diag(x) <- NA
    # take care of RSiena structural zeros, set as missing.
    x[x == 10] <- NA
    sum(x == 1, na.rm = T)/(sum(x == 1 | x == 0, na.rm = T))
}

# calculate intragroup density
fdensityintra <- function(x, A) {
    # A is matrix indicating whether nodes in dyad have same node attributes
    diag(x) <- NA
    x[x == 10] <- NA
    diag(A) <- NA
    sum(x == 1 & A == 1, na.rm = T)/(sum((x == 1 | x == 0) & A == 1, na.rm = T))
}

# calculate intragroup density
fdensityinter <- function(x, A) {
    # A is matrix indicating whether nodes in dyad have same node attributes
    diag(x) <- NA
    x[x == 10] <- NA
    diag(A) <- NA
    sum(x == 1 & A != 1, na.rm = T)/(sum((x == 1 | x == 0) & A != 1, na.rm = T))
}

# construct dyadcharacteristic whether nodes are similar/homogenous
fhomomat <- function(x) {
    # x is a vector of node-covariate
    xmat <- matrix(x, nrow = length(x), ncol = length(x))
    xmatt <- t(xmat)
    xhomo <- xmat == xmatt
    return(xhomo)
}

# a function to calculate all valid dyads.
fndyads <- function(x) {
    diag(x) <- NA
    x[x == 10] <- NA
    (sum((x == 1 | x == 0), na.rm = T))
}

# a function to calculate all valid intragroupdyads.
fndyads2 <- function(x, A) {
    diag(x) <- NA
    x[x == 10] <- NA
    diag(A) <- NA
    (sum((x == 1 | x == 0) & A == 1, na.rm = T))
}


fscolnet <- function(network, ccovar) {
    # Calculate coleman on network level:
    # https://reader.elsevier.com/reader/sd/pii/S0378873314000239?token=A42F99FF6E2B750436DD2CB0DB7B1F41BDEC16052A45683C02644DAF88215A3379636B2AA197B65941D6373E9E2EE413
    
    fhomomat <- function(x) {
        xmat <- matrix(x, nrow = length(x), ncol = length(x))
        xmatt <- t(xmat)
        xhomo <- xmat == xmatt
        return(xhomo)
    }
    
    fsumintra <- function(x, A) {
        # A is matrix indicating whether nodes constituting dyad have same characteristics
        diag(x) <- NA
        x[x == 10] <- NA
        diag(A) <- NA
        sum(x == 1 & A == 1, na.rm = T)
    }
    
    # expecation w*=sum_g sum_i (ni((ng-1)/(N-1)))
    network[network == 10] <- NA
    ni <- rowSums(network, na.rm = T)
    ng <- NA
    for (i in 1:length(ccovar)) {
        ng[i] <- table(ccovar)[rownames(table(ccovar)) == ccovar[i]]
    }
    N <- length(ccovar)
    wexp <- sum(ni * ((ng - 1)/(N - 1)), na.rm = T)
    
    # wgg1 how many intragroup ties
    w <- fsumintra(network, fhomomat(ccovar))
    
    Scol_net <- ifelse(w >= wexp, (w - wexp)/(sum(ni, na.rm = T) - wexp), (w - wexp)/wexp)
    return(Scol_net)
}

5.2 Data

We are going to play with Twitter Networks among Dutch MPs.

Download data

Download twitter_20190919.Rdata

Load the Robject and have a look at it. Save the list elements in separate objects.

getwd()
load("twitter_20190919.RData")  #change to your working directory
str(twitter_20190919, 1)
keyf <- twitter_20190919[[1]]
mydata <- twitter_20190919[[2]]
seats <- twitter_20190919[[3]]
## [1] "C:/Users/Administrator/Documents/GitHub/homepage/content/courses/Complete-Networks/socio6"
## List of 3
##  $ keyf  :'data.frame':  147 obs. of  41 variables:
##  $ mydata:List of 8
##   ..- attr(*, "higher")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   .. ..- attr(*, "names")= chr [1:9] "fnet,fnet" "atmnet,fnet" "rtnet,fnet" "fnet,atmnet" ...
##   ..- attr(*, "disjoint")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   .. ..- attr(*, "names")= chr [1:9] "fnet,fnet" "atmnet,fnet" "rtnet,fnet" "fnet,atmnet" ...
##   ..- attr(*, "atLeastOne")= Named logi [1:9] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   .. ..- attr(*, "names")= chr [1:9] "fnet,fnet" "atmnet,fnet" "rtnet,fnet" "fnet,atmnet" ...
##   ..- attr(*, "class")= chr "siena"
##  $ seats :'data.frame':  150 obs. of  5 variables:

So, what do we have?

  • keyf: a data.frame on 147 Dutch MPs.
  • mydata: This an object which is ready to analyze in RSiena. It is actually a quite complicated object. For now three things are important:
    1. The nodes in mydata are the same as in keyf and in seats.
    2. It contains the twitter data at three timepoints (in mydata$depvars). We have three layers:
    • fnet: who follows whom
    • atmnet: who atmentions whom
    • rtnet: who retweats whom
    1. It also contains timeinvariant information on the nodes (in mydata$cCovars)
  • seats: a dataset which contains the coordinates of the seats in the House of Parliament in the Netherlands.

We are going to focus on the retweet relations of politicians. This is most closely related to a positive or helping relation. Thus who is retweeting (‘helping’) whom on Twitter?

5.3 Research questions

  1. To what extent do we observer segregation along party affiliation in the retweet network among Dutch MPs.
  2. To what extent is the presumed segregation along party affiliation in the retweet network among Dutch MPs the byproduct of segregation along other social dimensions such as sex, age?
  3. To what extent is the presumed segregation along party affiliation in the retweet network among Dutch MPs the result of propinquity?
  4. To what extent is the presumed segregation along party affiliation in the retweet network among Dutch MPs the result of structural (network) effects?

5.4 Descriptive statistics

5.4.1 densities

Let us first describe the total density and intra- and intergroup densities. Don’t be scared of the long script. I just do it for all kind of dimensions and for all three layers in the twitter network.

# retrieve nominationdata from rsiena object
fnet <- mydata$depvars$fnet
atmnet <- mydata$depvars$atmnet
rtnet <- mydata$depvars$rtnet

# retrieve node-attributes from rsiena object
vrouw <- mydata$cCovars$vrouw
partij <- mydata$cCovars$partij
ethminz <- mydata$cCovars$ethminz
lft <- mydata$cCovars$lft

# de-mean-center node attributes
ethminz <- ethminz + attributes(ethminz)$mean
partij <- partij + attributes(partij)$mean
vrouw <- vrouw + attributes(vrouw)$mean
lft <- lft + attributes(lft)$mean

# construct matrices for similarity for each dimension (dyad characteristics)
vrouwm <- fhomomat(vrouw)
partijm <- fhomomat(partij)
ethminzm <- fhomomat(ethminz)

# just for fun, make dyad characteristic indicating whether both nodes are ethnic minorities
xmat <- matrix(ethminz, nrow = length(ethminz), ncol = length(ethminz))
xmatt <- t(xmat)
minoritym <- xmat == 1 & xmatt == 1

# for age max 5 year difference / for descriptives
xmat <- matrix(lft, nrow = length(lft), ncol = length(lft))
xmatt <- t(xmat)
lftm <- (abs(xmat - xmatt) < 6)

# calculate all possible similar dyads, not the focus of this exercise.  fndyads2(fnet[,,1], vrouwm)
# fndyads2(fnet[,,3], vrouwm) fndyads2(fnet[,,1], partijm) fndyads2(fnet[,,3], partijm)
# fndyads2(fnet[,,1], ethminzm) fndyads2(fnet[,,3], ethminzm)

# make a big object to store all results
desmat <- matrix(NA, nrow = 10, ncol = 9)

# lets start using our functions
desmat[1, 1] <- fdensity(fnet[, , 1])
desmat[1, 2] <- fdensity(fnet[, , 2])
desmat[1, 3] <- fdensity(fnet[, , 3])
desmat[2, 1] <- fdensityintra(fnet[, , 1], vrouwm)
desmat[2, 2] <- fdensityintra(fnet[, , 2], vrouwm)
desmat[2, 3] <- fdensityintra(fnet[, , 3], vrouwm)
desmat[3, 1] <- fdensityinter(fnet[, , 1], vrouwm)
desmat[3, 2] <- fdensityinter(fnet[, , 2], vrouwm)
desmat[3, 3] <- fdensityinter(fnet[, , 3], vrouwm)
desmat[4, 1] <- fdensityintra(fnet[, , 1], partijm)
desmat[4, 2] <- fdensityintra(fnet[, , 2], partijm)
desmat[4, 3] <- fdensityintra(fnet[, , 3], partijm)
desmat[5, 1] <- fdensityinter(fnet[, , 1], partijm)
desmat[5, 2] <- fdensityinter(fnet[, , 2], partijm)
desmat[5, 3] <- fdensityinter(fnet[, , 3], partijm)
desmat[6, 1] <- fdensityintra(fnet[, , 1], ethminzm)
desmat[6, 2] <- fdensityintra(fnet[, , 2], ethminzm)
desmat[6, 3] <- fdensityintra(fnet[, , 3], ethminzm)
desmat[7, 1] <- fdensityinter(fnet[, , 1], ethminzm)
desmat[7, 2] <- fdensityinter(fnet[, , 2], ethminzm)
desmat[7, 3] <- fdensityinter(fnet[, , 3], ethminzm)
desmat[8, 1] <- fdensityinter(fnet[, , 1], minoritym)
desmat[8, 2] <- fdensityinter(fnet[, , 2], minoritym)
desmat[8, 3] <- fdensityinter(fnet[, , 3], minoritym)
desmat[9, 1] <- fdensityintra(fnet[, , 1], lftm)
desmat[9, 2] <- fdensityintra(fnet[, , 2], lftm)
desmat[9, 3] <- fdensityintra(fnet[, , 3], lftm)
desmat[10, 1] <- fdensityinter(fnet[, , 1], lftm)
desmat[10, 2] <- fdensityinter(fnet[, , 2], lftm)
desmat[10, 3] <- fdensityinter(fnet[, , 3], lftm)

desmat[1, 1 + 3] <- fdensity(atmnet[, , 1])
desmat[1, 2 + 3] <- fdensity(atmnet[, , 2])
desmat[1, 3 + 3] <- fdensity(atmnet[, , 3])
desmat[2, 1 + 3] <- fdensityintra(atmnet[, , 1], vrouwm)
desmat[2, 2 + 3] <- fdensityintra(atmnet[, , 2], vrouwm)
desmat[2, 3 + 3] <- fdensityintra(atmnet[, , 3], vrouwm)
desmat[3, 1 + 3] <- fdensityinter(atmnet[, , 1], vrouwm)
desmat[3, 2 + 3] <- fdensityinter(atmnet[, , 2], vrouwm)
desmat[3, 3 + 3] <- fdensityinter(atmnet[, , 3], vrouwm)
desmat[4, 1 + 3] <- fdensityintra(atmnet[, , 1], partijm)
desmat[4, 2 + 3] <- fdensityintra(atmnet[, , 2], partijm)
desmat[4, 3 + 3] <- fdensityintra(atmnet[, , 3], partijm)
desmat[5, 1 + 3] <- fdensityinter(atmnet[, , 1], partijm)
desmat[5, 2 + 3] <- fdensityinter(atmnet[, , 2], partijm)
desmat[5, 3 + 3] <- fdensityinter(atmnet[, , 3], partijm)
desmat[6, 1 + 3] <- fdensityintra(atmnet[, , 1], ethminzm)
desmat[6, 2 + 3] <- fdensityintra(atmnet[, , 2], ethminzm)
desmat[6, 3 + 3] <- fdensityintra(atmnet[, , 3], ethminzm)
desmat[7, 1 + 3] <- fdensityinter(atmnet[, , 1], ethminzm)
desmat[7, 2 + 3] <- fdensityinter(atmnet[, , 2], ethminzm)
desmat[7, 3 + 3] <- fdensityinter(atmnet[, , 3], ethminzm)
desmat[8, 1 + 3] <- fdensityinter(atmnet[, , 1], minoritym)
desmat[8, 2 + 3] <- fdensityinter(atmnet[, , 2], minoritym)
desmat[8, 3 + 3] <- fdensityinter(atmnet[, , 3], minoritym)
desmat[9, 1 + 3] <- fdensityintra(atmnet[, , 1], lftm)
desmat[9, 2 + 3] <- fdensityintra(atmnet[, , 2], lftm)
desmat[9, 3 + 3] <- fdensityintra(atmnet[, , 3], lftm)
desmat[10, 1 + 3] <- fdensityinter(atmnet[, , 1], lftm)
desmat[10, 2 + 3] <- fdensityinter(atmnet[, , 2], lftm)
desmat[10, 3 + 3] <- fdensityinter(atmnet[, , 3], lftm)

desmat[1, 1 + 6] <- fdensity(rtnet[, , 1])
desmat[1, 2 + 6] <- fdensity(rtnet[, , 2])
desmat[1, 3 + 6] <- fdensity(rtnet[, , 3])
desmat[2, 1 + 6] <- fdensityintra(rtnet[, , 1], vrouwm)
desmat[2, 2 + 6] <- fdensityintra(rtnet[, , 2], vrouwm)
desmat[2, 3 + 6] <- fdensityintra(rtnet[, , 3], vrouwm)
desmat[3, 1 + 6] <- fdensityinter(rtnet[, , 1], vrouwm)
desmat[3, 2 + 6] <- fdensityinter(rtnet[, , 2], vrouwm)
desmat[3, 3 + 6] <- fdensityinter(rtnet[, , 3], vrouwm)
desmat[4, 1 + 6] <- fdensityintra(rtnet[, , 1], partijm)
desmat[4, 2 + 6] <- fdensityintra(rtnet[, , 2], partijm)
desmat[4, 3 + 6] <- fdensityintra(rtnet[, , 3], partijm)
desmat[5, 1 + 6] <- fdensityinter(rtnet[, , 1], partijm)
desmat[5, 2 + 6] <- fdensityinter(rtnet[, , 2], partijm)
desmat[5, 3 + 6] <- fdensityinter(rtnet[, , 3], partijm)
desmat[6, 1 + 6] <- fdensityintra(rtnet[, , 1], ethminzm)
desmat[6, 2 + 6] <- fdensityintra(rtnet[, , 2], ethminzm)
desmat[6, 3 + 6] <- fdensityintra(rtnet[, , 3], ethminzm)
desmat[7, 1 + 6] <- fdensityinter(rtnet[, , 1], ethminzm)
desmat[7, 2 + 6] <- fdensityinter(rtnet[, , 2], ethminzm)
desmat[7, 3 + 6] <- fdensityinter(rtnet[, , 3], ethminzm)
desmat[8, 1 + 6] <- fdensityinter(rtnet[, , 1], minoritym)
desmat[8, 2 + 6] <- fdensityinter(rtnet[, , 2], minoritym)
desmat[8, 3 + 6] <- fdensityinter(rtnet[, , 3], minoritym)
desmat[9, 1 + 6] <- fdensityintra(rtnet[, , 1], lftm)
desmat[9, 2 + 6] <- fdensityintra(rtnet[, , 2], lftm)
desmat[9, 3 + 6] <- fdensityintra(rtnet[, , 3], lftm)
desmat[10, 1 + 6] <- fdensityinter(rtnet[, , 1], lftm)
desmat[10, 2 + 6] <- fdensityinter(rtnet[, , 2], lftm)
desmat[10, 3 + 6] <- fdensityinter(rtnet[, , 3], lftm)

colnames(desmat) <- c("friends w1", "friends w2", "friends w3", "atmentions w1", "atmentions w2", "atmentions w3", 
    "retweets w1", "retweets w2", "retweets w3")
rownames(desmat) <- c("total", "same sex", "different sex", "same party", "different party", "same ethnicity", 
    "different ethnicity", "both minority", "same age (<6)", "different age (>5)")
desmat
##                     friends w1 friends w2 friends w3 atmentions w1 atmentions w2 atmentions w3
## total                0.2545583  0.2794521  0.2797969    0.04912612    0.03500236   0.013465660
## same sex             0.2630408  0.2887662  0.2883167    0.05278618    0.03569521   0.013750219
## different sex        0.2449678  0.2689211  0.2701115    0.04498792    0.03421900   0.013142174
## same party           0.7091278  0.7334686  0.7415459    0.19918864    0.14239351   0.063607085
## different party      0.1946538  0.2196204  0.2193593    0.02935044    0.02085004   0.006902729
## same ethnicity       0.2655497  0.2885362  0.2885929    0.04926514    0.03574368   0.013491604
## different ethnicity  0.2096154  0.2423077  0.2435592    0.04855769    0.03197115   0.013358779
## both minority        0.2537506  0.2786431  0.2790029    0.04859054    0.03497372   0.013570823
## same age (<6)        0.2933009  0.3137704  0.3131586    0.05868881    0.03693100   0.014668186
## different age (>5)   0.2354766  0.2625494  0.2635734    0.04441624    0.03405245   0.012880886
##                     retweets w1 retweets w2 retweets w3
## total               0.046008503  0.03401361  0.03373404
## same sex            0.045753961  0.03363111  0.03284288
## different sex       0.046296296  0.03444843  0.03474711
## same party          0.335496957  0.25040258  0.24798712
## different party     0.007858861  0.00569080  0.00569080
## same ethnicity      0.047971781  0.03491604  0.03381587
## different ethnicity 0.037980769  0.03029580  0.03339695
## both minority       0.045723841  0.03402130  0.03355009
## same age (<6)       0.052247352  0.03716890  0.03702649
## different age (>5)  0.042935702  0.03247922  0.03213296

In general we observe quite some homophily in our dyads. Most importantly, we observe that the density is higher within political parties than between political parties. Homophily across social dimensions is, at first sight, not very pronounced.

5.5 Coleman Homophily index.

Because the sizes of the different subgroups vary (e.g. the political parties have different seats) and the number of out-degrees differs between MPs, within party densities may also be higher when MPs randomly select a partner/alter. That is, segregation will be in part structurally induced by differences in relative group sizes and activity on twitter (i.e. outdegrees). A measure of segregation that takes relative group sizes and differences in outdegrees into account is the Coleman’s Homophily Index. In this measure a value of 0 would indicate that the observed number of within-group ties is the same as would be expected under random choice. A value of 1 would indicate maximum segregation and a value of -1 indicates the unlikely case that MPs maximally avoid within group relations.

colmat <- matrix(NA, nrow = 3, ncol = 9)

colmat[1, 1] <- fscolnet(fnet[, , 1], partij)
colmat[1, 2] <- fscolnet(fnet[, , 2], partij)
colmat[1, 3] <- fscolnet(fnet[, , 3], partij)
colmat[1, 4] <- fscolnet(atmnet[, , 1], partij)
colmat[1, 5] <- fscolnet(atmnet[, , 2], partij)
colmat[1, 6] <- fscolnet(atmnet[, , 3], partij)
colmat[1, 7] <- fscolnet(rtnet[, , 1], partij)
colmat[1, 8] <- fscolnet(rtnet[, , 2], partij)
colmat[1, 9] <- fscolnet(rtnet[, , 3], partij)

colmat[2, 1] <- fscolnet(fnet[, , 1], vrouw)
colmat[2, 2] <- fscolnet(fnet[, , 2], vrouw)
colmat[2, 3] <- fscolnet(fnet[, , 3], vrouw)
colmat[2, 4] <- fscolnet(atmnet[, , 1], vrouw)
colmat[2, 5] <- fscolnet(atmnet[, , 2], vrouw)
colmat[2, 6] <- fscolnet(atmnet[, , 3], vrouw)
colmat[2, 7] <- fscolnet(rtnet[, , 1], vrouw)
colmat[2, 8] <- fscolnet(rtnet[, , 2], vrouw)
colmat[2, 9] <- fscolnet(rtnet[, , 3], vrouw)

colmat[3, 1] <- fscolnet(fnet[, , 1], ethminz)
colmat[3, 2] <- fscolnet(fnet[, , 2], ethminz)
colmat[3, 3] <- fscolnet(fnet[, , 3], ethminz)
colmat[3, 4] <- fscolnet(atmnet[, , 1], ethminz)
colmat[3, 5] <- fscolnet(atmnet[, , 2], ethminz)
colmat[3, 6] <- fscolnet(atmnet[, , 3], ethminz)
colmat[3, 7] <- fscolnet(rtnet[, , 1], ethminz)
colmat[3, 8] <- fscolnet(rtnet[, , 2], ethminz)
colmat[3, 9] <- fscolnet(rtnet[, , 3], ethminz)

colnames(colmat) <- c("friends w1", "friends w2", "friends w3", "atmentions w1", "atmentions w2", "atmentions w3", 
    "retweets w1", "retweets w2", "retweets w3")
rownames(colmat) <- c("party", "sex", "ethnicity")
colmat
##           friends w1 friends w2 friends w3 atmentions w1 atmentions w2 atmentions w3  retweets w1
## party     0.23290292 0.21197422 0.21310665    0.39147672   0.399713437    0.48111606  0.804211725
## sex       0.04624336 0.04080333 0.04272129    0.07421140   0.032149289    0.04699739 -0.005381047
## ethnicity 0.12122258 0.09821319 0.09889074   -0.03010028  -0.004564059   -0.02078296  0.051678853
##            retweets w2  retweets w3
## party      0.803175327  0.802304023
## sex       -0.029967472 -0.011929039
## ethnicity -0.007183535 -0.006536557

Well? YES, we observe segregation. Especially within the retweet-network and especially along the party-dimension.

5.6 RSiena models

5.6.1 step 1: Data

Done! Our RSiena object is mydata. We are focussing on the retweetnetwork and do not yet take into account multiplexity.

5.6.2 step 2: define myeff object.

library(RSiena)
myeff <- getEffects(mydata)
myeff
##    name   effectName                      include fix   test  initialValue parm
## 1  fnet   constant fnet rate (period 1)   TRUE    FALSE FALSE    7.22021   0   
## 2  fnet   constant fnet rate (period 2)   TRUE    FALSE FALSE    3.79571   0   
## 3  fnet   fnet: outdegree (density)       TRUE    FALSE FALSE    0.00000   0   
## 4  fnet   fnet: reciprocity               TRUE    FALSE FALSE    0.00000   0   
## 5  atmnet constant atmnet rate (period 1) TRUE    FALSE FALSE   16.26089   0   
## 6  atmnet constant atmnet rate (period 2) TRUE    FALSE FALSE    9.17902   0   
## 7  atmnet atmnet: outdegree (density)     TRUE    FALSE FALSE   -1.76137   0   
## 8  atmnet atmnet: reciprocity             TRUE    FALSE FALSE    0.00000   0   
## 9  rtnet  constant rtnet rate (period 1)  TRUE    FALSE FALSE   10.98716   0   
## 10 rtnet  constant rtnet rate (period 2)  TRUE    FALSE FALSE    9.39819   0   
## 11 rtnet  rtnet: outdegree (density)      TRUE    FALSE FALSE   -1.61392   0   
## 12 rtnet  rtnet: reciprocity              TRUE    FALSE FALSE    0.00000   0
myeff_m1 <- myeff
myeff_m1 <- includeEffects(myeff_m1, sameX, interaction1 = "partij", name = "rtnet")
##   effectName         include fix   test  initialValue parm
## 1 rtnet: same partij TRUE    FALSE FALSE          0   0

5.6.3 step 3: create alogrithm

# I used a seed so you will probably see the same results
myalgorithm <- sienaAlgorithmCreate(projname = "test", seed = 345654)
## siena07 will create an output file test.txt .

Have a look at the created file!!

5.6.4 step 4: estimate model

I generally estimate the model three times to start with. Gives me the opportunity to grab a cup of coffee. Depending on the results I will decide whether or not I need to rerun the model and with which results I want to continue later.

I ❤️ !! 😄

# to speed things up a bit, I am using more cores.
ansM1 <- siena07(myalgorithm, data = mydata, effects = myeff_m1, useCluster = TRUE, nbrNodes = 4, initC = TRUE, 
    batch = TRUE)
ansM1b <- siena07(myalgorithm, data = mydata, prevAns = ansM1, effects = myeff_m1, useCluster = TRUE, 
    nbrNodes = 4, initC = TRUE, batch = TRUE)
ansM1c <- siena07(myalgorithm, data = mydata, prevAns = ansM1b, effects = myeff_m1, useCluster = TRUE, 
    nbrNodes = 4, initC = TRUE, batch = TRUE)

save(ansM1, file = "ansM1a.RData")
save(ansM1b, file = "ansM1b.RData")
save(ansM1c, file = "ansM1c.RData")

Let us have a look at the data.

load("ansM1a.RData")
load("ansM1b.RData")
load("ansM1c.RData")
ansM1
## Estimates, standard errors and convergence t-ratios
## 
##                                            Estimate   Standard   Convergence 
##                                                         Error      t-ratio   
##    1. rate constant fnet rate (period 1)    3.7091  ( 0.1678   )    0.0802   
##    2. rate constant fnet rate (period 2)    1.9690  ( 0.1206   )   -0.0329   
##    3. eval fnet: outdegree (density)       -0.6588  ( 0.0876   )    0.0250   
##    4. eval fnet: reciprocity                0.8762  ( 0.0907   )   -0.0133   
##    5. rate constant atmnet rate (period 1) 25.7496  ( 1.8713   )    0.0220   
##    6. rate constant atmnet rate (period 2)  9.6467  ( 0.6387   )    0.0098   
##    7. eval atmnet: outdegree (density)     -2.3452  ( 0.0321   )   -0.0936   
##    8. eval atmnet: reciprocity              1.7004  ( 0.0727   )   -0.0577   
##    9. rate constant rtnet rate (period 1)  13.4141  ( 0.8603   )    0.0133   
##   10. rate constant rtnet rate (period 2)  12.1080  ( 0.9050   )    0.0024   
##   11. eval rtnet: outdegree (density)      -2.8349  ( 0.0465   )   -0.0047   
##   12. eval rtnet: reciprocity               0.8781  ( 0.0643   )    0.0307   
##   13. eval rtnet: same partij               1.8582  ( 0.0583   )    0.0427   
## 
## Overall maximum convergence ratio:    0.1963 
## 
## 
## Total of 2158 iteration steps.
ansM1b
## Estimates, standard errors and convergence t-ratios
## 
##                                            Estimate   Standard   Convergence 
##                                                         Error      t-ratio   
##    1. rate constant fnet rate (period 1)    3.7037  ( 0.1701   )    0.0505   
##    2. rate constant fnet rate (period 2)    1.9687  ( 0.1205   )   -0.0350   
##    3. eval fnet: outdegree (density)       -0.6541  ( 0.0844   )   -0.0026   
##    4. eval fnet: reciprocity                0.8756  ( 0.0841   )    0.0061   
##    5. rate constant atmnet rate (period 1) 25.7211  ( 1.8487   )    0.0565   
##    6. rate constant atmnet rate (period 2)  9.6449  ( 0.5197   )   -0.0190   
##    7. eval atmnet: outdegree (density)     -2.3452  ( 0.0297   )   -0.0472   
##    8. eval atmnet: reciprocity              1.7000  ( 0.0703   )   -0.0012   
##    9. rate constant rtnet rate (period 1)  13.3847  ( 0.8180   )    0.0404   
##   10. rate constant rtnet rate (period 2)  12.0998  ( 0.7632   )    0.0072   
##   11. eval rtnet: outdegree (density)      -2.8343  ( 0.0503   )   -0.0190   
##   12. eval rtnet: reciprocity               0.8758  ( 0.0614   )   -0.0285   
##   13. eval rtnet: same partij               1.8582  ( 0.0601   )    0.0178   
## 
## Overall maximum convergence ratio:    0.1857 
## 
## 
## Total of 2104 iteration steps.
ansM1c
## Estimates, standard errors and convergence t-ratios
## 
##                                            Estimate   Standard   Convergence 
##                                                         Error      t-ratio   
##    1. rate constant fnet rate (period 1)    3.7053  ( 0.1674   )   -0.0270   
##    2. rate constant fnet rate (period 2)    1.9683  ( 0.1254   )    0.0579   
##    3. eval fnet: outdegree (density)       -0.6570  ( 0.0852   )   -0.0352   
##    4. eval fnet: reciprocity                0.8753  ( 0.0933   )   -0.0508   
##    5. rate constant atmnet rate (period 1) 25.7437  ( 1.4522   )   -0.0013   
##    6. rate constant atmnet rate (period 2)  9.6502  ( 0.5618   )    0.0395   
##    7. eval atmnet: outdegree (density)     -2.3453  ( 0.0304   )   -0.0235   
##    8. eval atmnet: reciprocity              1.7011  ( 0.0645   )    0.0050   
##    9. rate constant rtnet rate (period 1)  13.3964  ( 0.8279   )    0.0523   
##   10. rate constant rtnet rate (period 2)  12.0945  ( 0.8603   )   -0.0005   
##   11. eval rtnet: outdegree (density)      -2.8322  ( 0.0500   )   -0.0197   
##   12. eval rtnet: reciprocity               0.8779  ( 0.0625   )    0.0133   
##   13. eval rtnet: same partij               1.8551  ( 0.0569   )    0.0097   
## 
## Overall maximum convergence ratio:    0.1471 
## 
## 
## Total of 2078 iteration steps.

So, do we have an answer to our first research question? It is up to you answer the other research questions.



  1. $$p = P(Y=1) = \frac{exp(\beta_kx_k)}{1+exp(\beta_kx_k)}$$
    ↩︎

Previous
Next