The data

Description

The network represents communication between organizations involved in search and rescue operations in three Texas counties (Kerr, Kendal and Bandera) after 1978 flood.

Drabek et al. (1981) provided seven case studies of emergent multi-organizational networks in the context of search and rescue (SAR) activities. Networks of interaction frequency are reported, along with several organizational attributes.

Our network was receded so that higher value means more frequent communication -> 1->4, 2->3, 3->2, 4->1 The tie values now mean:

  • 0 - “no communication”
  • 1 - “about once a day or less”
  • 2 - “every few hours”
  • 3 - “about once an hour”
  • 4 - “continuously”

For more see https://rdrr.io/cran/network/man/emon.html.

Loading the data

First we load the blockmodeling and sna packages and the data (available on workshop web page).

library(blockmodeling)
library(sna)

## Loading data
tex<-loadmatrix("texas.net") #File is in Pajek matrix format.
texCounty<-loadvector("texasCounty.clu")
texCounty<-factor(texCounty,labels=c("Other","Kendal","Kerr","Bandera"))

Ploting a partitioned matrix

Then we plot the network and partition by counties in matrix and graph form. We use the plotMat function with paramters M (matrix) and clu (partition).

par(mfrow=c(2,1))
plotMat(tex, clu=texCounty)
gplot1( tex/2,vertex.col=texCounty)

Ploting an image matrix

Based on network and a partition, we can get an image matrix (means by blocks) with funByBlocks.

par(mfrow=c(1,2))
plotMat(tex, clu=texCounty,mar=c(1,4,7,1),x.axis.val.pos = 1.02, y.axis.val.pos = -0.02, main="Matrix by counties",title.line=6,cex.val = 1)

tmpDensIM<-funByBlocks(tex,clu=texCounty,na.rm=TRUE)
plotMat(tmpDensIM,title.line=2,main="Density (image) matrix\ncounties",mar=c(1,4,7,1),x.axis.val.pos = 1.03, y.axis.val.pos = -0.03)

par(mfrow=c(1,1))

Generalized blockmodeling

For finding a partition using generalized blockmodeling, we use optRandomParC function (also on slides 33-26). The function searches for the best partition of a network (matrix) into a specified numner of clusters according to a selected equivalence. It’s main arguments are:

blocks parameter specifies the equivalences as:

Block types (possible values in vector or array supplied to blocks) are:

Homogeneity blockmodeling

We will start with homogeneity blockmodeling, as the least preparations are needed to use it.

For homogeneity blockmodeling we set approaches="hom". Sum of squares blockmodeling is used by default, but we can also explicitly demand it with homFun="ss". For absolute deviations blockmodeling, we would have to specify homFun="ad".

Structural equivalence

The equivalence is specified via blocks argument. For structural equivalence, null ("nul") and complete ("com") block types are allowed, but as for homogeneity blockmodeling, the complete block is a special case of the null block, only complete block can be specified (blocks=c("com")). In the example below, we specify:

  • M=tex - the data is in matrix tex.
  • k=3 - we want partitions into 3 clusters.
  • rep=200 - we want 200 repetitions, that is start with 300 random starting partitions.
  • blocks=c("com") - structural equivalence (se above)
  • approaches="hom",homFun="ss"- homogeneity sum of squares blockmodeling
  • nCores=0 - the number of cores is set to “number of available cores” - 1.

Then we use method plot (which calles plotMat) to plot the results.

set.seed(2021)
texStrSSk3<-optRandomParC(M=tex,k=3,rep=200,blocks=c("com"),approaches="hom",homFun="ss",nCores=0)
## Loading required namespace: doParallel
## Loading required namespace: doRNG
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1072.744 found.

The function has found 1 partition with the minimal value of the criterion function = 1072.744.

Then we print the resoult and plot the matrix.

texStrSSk3
## Network size: 25 
## 
## Approachs (paramter): hom-ss
## Blocks (paramter)
## com 
## 
## Sizes of clusters:
## 1 2 3 
## 8 8 9 
## 
## Error: 1072.744
plot(texStrSSk3)

We see we got 3 clusters of approximately equal size.

Now we can plot na image matrix.

plotMat(funByBlocks(texStrSSk3))

We can make things nicer by giving the names to clusters. Here we also use the argument orderClu of function funByBlocks to order the clusters by decreasing average in-degrees, so that the order is not random.

## Making a nice image matrix
strSSK3im<-funByBlocks(texStrSSk3,orderClu = TRUE)
rownames(strSSK3im)<-colnames(strSSK3im)<-c("Core","Kerr","Pheriphery")
plotMat(strSSK3im,main="Image matrix", mar=c(1,4,7,1), x.axis.val.pos = 1.03, y.axis.val.pos = -0.03, title.line=6)

Determening the number of clusters

We will not go deep into that, but one way to decide on the number of clusters is to check (and plot) the errors for different number of clustrs. We use a for loop for that bellow.

## Selecting number of clusters - using a for loop
texStrSSList<-list()
for(k in 1:5){
  texStrSSList[[as.character(k)]]<-optRandomParC(M=tex,k=k,rep=ifelse(k==1,1,200),blocks=c("com"),approach="hom",homFun="ss",nCores=ifelse(k==1,1,0))
}
## 
## 
## Starting optimization of the partiton 1 of 1 partitions.
## Starting partition: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
## Final error: 1458.5 
## Final partition:    1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1458.5 found. 
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1264.66 found. 
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1072.744 found. 
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 972.1583 found. 
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 878.1958 found.
plot(sapply(texStrSSList,err)~I(1:5),type="o", xlab="k",ylab="CF")

We see an elbow at 3 clusters, indicating that this might be an appropriate number.

We can also use Relative fit - https://link.springer.com/article/10.1007/s10260-021-00595-1.

if(file.exists("texStrSSRF.Rdata")){
    load("texStrSSRF.Rdata")  
  }else{
    set.seed(1)
    texStrSSRF <- NULL
    for(k in 1:8){
      tmp <- optRandomParC(M = tex, 
                                  k = k, 
                                  rep = 200, 
                                  blocks = c("com"), 
                                  approach = "hom", 
                                  homFun = "ss", 
                                  nCores = 0)
      
      # m stands for the number of randomized networks and
      # should be increased to, e.g., 30
      texStrSSRF[k] <- RF(tmp, m = 5)$RF
    }
    save(texStrSSRF,file = "texStrSSRF.Rdata")
}
plot(y = texStrSSRF, x = 1:length(texStrSSRF), type="o", xlab = "k", ylab = "RF")

Binary blockmodeling

For binary blockmodeling, most things stay the same, but we do need a binary network. We will just use tex>0, which essentially converts all values above 0 to 1s. We could also use M=tex with usePreSpecM =1, but that is probably less clear. We of course also have to specify approaches="bin".

Structural equivalence

Again we specify structural equivalence by allowing null and complete blocks. As here null blocks are not a special case of complete blocks, both must be speficied with blocks=c("nul","com"). Below we aslo print and plot the result.

texStrBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com"),approaches="bin",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 132 found.
texStrBink3
## Network size: 25 
## 
## Approachs (paramter): bin
## Blocks (paramter)
## nul com 
## 
## Sizes of clusters:
##  1  2  3 
## 12  7  6 
## 
## IM
##     1   2   3
## 1 nul nul nul
## 2 nul com nul
## 3 com nul com
## 
## Error: 132
plot(texStrBink3)

The argument orderClu can be also used in method plot to order the clusters by decreasing average in-degrees, so that the order is not random.

plot(texStrBink3,orderClu=TRUE)

Then we can also plot the image matrix (ordered).

# image matrix
plotMat(funByBlocks(texStrBink3,orderClu = TRUE))

And the same in graph form.

gplot1(funByBlocks(texStrBink3,orderClu = TRUE))

We can compare the results of homogeneity and binary blockmodeling with a contingency table and Adjusted Rand Index.

## comparison with homogeneity partition
table(clu(texStrSSk3),clu(texStrBink3))
##    
##     1 2 3
##   1 1 7 0
##   2 8 0 0
##   3 3 0 6
crand(clu(texStrSSk3),clu(texStrBink3))
## [1] 0.543518

Regular equivalence

For regular equivalence, we only have to add a regular ("reg") block type to allowed block types (blocks=c("nul","com", "reg")). Unfortunatelly, both partitions, that is in 2 and 3 clusters, are not very usefull, only identifying Salvation Army as a zero out-degree unit.

texRegBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com", "reg"),approaches="bin",nCores=0)
## 
## 
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 2 found.
texRegBink3
## Network size: 25 
## 
## Approachs (paramter): bin
## Blocks (paramter)
## nul com reg 
## 
## Sizes of clusters:
##  1  2  3 
##  1  6 18 
## 
## IM
##     1   2   3
## 1 nul nul nul
## 2 com reg reg
## 3 nul reg reg
## 
## Error: 2 
## 2 solutions with minimal error exits. Only results for the first one are shown above!
plot(texRegBink3)

# 2 clusters
texRegBink2<-optRandomParC(M=tex>0,k=2,rep=200,blocks=c("nul","com", "reg"),approach="bin",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 4 found.
texRegBink2
## Network size: 25 
## 
## Approachs (paramter): bin
## Blocks (paramter)
## nul com reg 
## 
## Sizes of clusters:
##  1  2 
## 24  1 
## 
## IM
##     1   2
## 1 reg nul
## 2 nul nul
## 
## Error: 4
plot(texRegBink2)

Valued blockmodeling

For valued blockmodeling, we have to use approaches="val". In addition, the paramter \(m\) has to be specified via preSpecM. To determine it, an inspection of tie values is useful, but even more is the subject matter knowledge. Here we set it to 4, which represents continuous communication.

table(tex)
## tex
##   0   1   2   3   4 
## 439  27  40  13 106
summary(tex[tex>0])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   2.000   4.000   3.065   4.000   4.000

Structural eqeuivalence

First we find a plot a partition and the corresponding image matrix.

set.seed(2021)
texStrValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com"),approaches="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 438 found.
texStrValm4k3
## Network size: 25 
## 
## Approachs (paramter): val
## Blocks (paramter)
## nul com 
## 
## Sizes of clusters:
##  1  2  3 
##  7  5 13 
## 
## IM
##     1   2   3
## 1 com nul nul
## 2 nul com nul
## 3 nul nul nul
## 
## Error: 438
plot(texStrValm4k3)

plotMat(funByBlocks(texStrValm4k3))

And now we make it a bit nicer.

texStrValm4k3im<-funByBlocks(texStrValm4k3, orderClu = TRUE,na.rm=TRUE)
rownames(texStrValm4k3im)<-colnames(texStrValm4k3im)<-c("Core","Kerr","Pheriphery")
plotMat(texStrValm4k3im)

Now we check what changes if we set \(m\) to 2, that is by preSpecM=2, that is, we treat communication “every few hours” as already “strong enough” not to represent an error in complete blocks.

table(tex)
## tex
##   0   1   2   3   4 
## 439  27  40  13 106
summary(tex[tex>0])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   2.000   4.000   3.065   4.000   4.000
set.seed(2021)
texStrValm2k3<-optRandomParC(M=tex,preSpecM=2,k=3,rep=200,blocks=c("nul","com"),approach="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 325 found.
texStrValm2k3
## Network size: 25 
## 
## Approachs (paramter): val
## Blocks (paramter)
## nul com 
## 
## Sizes of clusters:
## 1 2 3 
## 8 9 8 
## 
## IM
##     1   2   3
## 1 nul nul nul
## 2 com com nul
## 3 nul nul com
## 
## Error: 325
plot(texStrValm2k3)

funByBlocks(texStrValm2k3)
##           1         2         3
## 1 0.1428571 0.5972222 0.2812500
## 2 1.6944444 2.2500000 0.6111111
## 3 0.1562500 0.5416667 2.2142857

Regular equivalence

We have also tried regular equivalence with \(m=4\), but with little success.

## Regular equivalence, m = 4 ("does not work")
set.seed(2021)
texRegValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com","reg"),approach="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 81 found.
plot(texRegValm4k3)

Pre-specified blockmodeling

Pre-specified blockmodeling can be used with any generalized blockmodeling approach, but we demonstrate its use with valued blockmodeling with \(m=4\).

Cohesive groups

First we will pre-specify a model called “cohesive groups”. The model assumes a complete blocks on the diagonal and null blocks elsewhere. In case of one-relational network with only one allowed block type per block, the specification can be a matrix (in general, it should be 4 dimensional array). This should be then supplied as a blocks paramter to optRandomParC function.

Bellow we specify models models for 2-4 clusters and print the one for 4 groups. We also compute the error with 1 cluster (the whole network is one complte block) and with each unit in its own cluster (non-diagonal elements should then be empty).

IMcoh2<-ifelse(diag(2)==1, "com","nul")
IMcoh3<-ifelse(diag(3)==1, "com","nul")
IMcoh4<-ifelse(diag(4)==1, "com","nul")
IMcoh4
##      [,1]  [,2]  [,3]  [,4] 
## [1,] "com" "nul" "nul" "nul"
## [2,] "nul" "com" "nul" "nul"
## [3,] "nul" "nul" "com" "nul"
## [4,] "nul" "nul" "nul" "com"
c(sum(4-tex)-4*nrow(tex),sum(tex))
## [1] 1830  570

First we fit the model with 2 clusters.

set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## 4 solution(s) with minimal error = 894 found.
table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
##    
##      1  2
##   1 12  1
##   2  2 10

We have found 4 equally well fitting partitions, but closer inspections shows that the first two differ only in the order of groups. With pre-specified blockmodeling, the optRandomParC function things the order (or names) of clusters are important. By setting switch.names = TRUE we can tell the function that this is not the case. Then we still get two equally well fitting partitions, but they are in deed different.

set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0, switch.names = TRUE)
## 
## 
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 894 found.
plot(texStrValm4coh2)

plot(texStrValm4coh2,which = 2)

table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
##    
##      1  2
##   1 12  1
##   2  2 10

Then we also try models for 3 and 4 clusters.

texStrValm4coh3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=IMcoh3,approach="val",nCores=0, switch.names = TRUE)
## 
## 
## Optimization of all partitions completed
## 3 solution(s) with minimal error = 626 found.
plot(texStrValm4coh3)

texStrValm4coh4<-optRandomParC(M=tex,preSpecM=4,k=4,rep=200,blocks=IMcoh4,approach="val",nCores=0, switch.names = TRUE)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 520 found.
plot(texStrValm4coh4)

We can notice that the error is decreasing with number of clusters. We can check how long.

if(file.exists("texStrValm4cohList.Rdata")){
  load("texStrValm4cohList.Rdata")  
}else{
  texStrValm4cohList<-list()
  for(k in 2:15){
    tmpIM<-ifelse(diag(k)==1, "com","nul")
    texStrValm4cohList[[as.character(k)]]<-optRandomParC(M=tex,preSpecM=4,k=k,rep=100,blocks=tmpIM,approach="val",nCores=0)
  }
  save(texStrValm4cohList,file = "texStrValm4cohList.Rdata")
}

errCoh<-c("1"=sum(4-tex), sapply(texStrValm4cohList,err))
plot(errCoh,type="o")
abline(h=sum(tex))

plot(errCoh,type="o",ylim=c(min(errCoh),650))
abline(h=sum(tex))

Core-perhiphery

Now we can try a bit more complex model, where in some blocks, several block types will be allowed. We will explore a core-perhiphey model. In this model, two clusters are assumed, where the first one is the core, which means that units within should be strongly connected to eachother. For this one, we assume a complete block. The second is a periphery, which units should not be conntected to each other at all. For this block, we should assume a null block. As different versions of the model allow different conections between the core and the perhiphery, we will allow different block types there.

Structural equivanece

Under structural equivalence, we want to allow both null and complete blocks in the off-diagonal blocks. We can spcify this as follows:

## Core-perhiphery - structural equivalence
# model:
#      1        2
# 1   com    nul,com
# 2 nul,com    nul
IMcpStr<-array(NA,dim=c(2,2,2))
IMcpStr[1,,]<-"nul"
IMcpStr[1,1,1]<-"com"
IMcpStr[2,2,1]<-"com"
IMcpStr[2,1,2]<-"com"
IMcpStr
## , , 1
## 
##      [,1]  [,2] 
## [1,] "com" "nul"
## [2,] NA    "com"
## 
## , , 2
## 
##      [,1]  [,2] 
## [1,] "nul" "nul"
## [2,] "com" NA

As 3 and 4 dimensional arrays are very hard to read, we have created a function to print this a bit nicely.

printBlocks(IMcpStr)
##         1       2
## 1     com nul,com
## 2 nul,com     nul

Then we fit the model using again valued blockmodeling and \(m=4\).

texStrValm4CPstr<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpStr,approach="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 496 found.
plot(texStrValm4CPstr)

texStrValm4CPstr
## Network size: 25 
## 
## Approachs (paramter): val
## Blocks (paramter)
##         1       2
## 1     com nul,com
## 2 nul,com     nul
## 
## Sizes of clusters:
##  1  2 
##  7 18 
## 
## IM
##     1   2
## 1 com nul
## 2 nul nul
## 
## Error: 496

Reglar equivalence

Another option is to also allow regular block type in the off-diagonal blocks.

## Core-perhiphery - regular equivalence
# model:
#        1              2
# 1     com        nul,com,reg
# 2 nul,com,reg        nul
IMcpReg<-array(NA,dim=c(3,2,2))
IMcpReg[1,,]<-"nul"
IMcpReg[1,1,1]<-"com"
IMcpReg[2,2,1]<-"com"
IMcpReg[2,1,2]<-"com"
IMcpReg[3,2,1]<-"reg"
IMcpReg[3,1,2]<-"reg"

# for better idea what we have done
printBlocks(IMcpReg)
##             1           2
## 1         com nul,com,reg
## 2 nul,com,reg         nul
texStrValm4CPreg<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpReg,approach="val",nCores=0)
## 
## 
## Optimization of all partitions completed
## All 200 solutions have err 317
plot(texStrValm4CPreg)

err(texStrValm4CPreg)
## [1] 317
plot(texStrValm4CPreg, use.IM=FALSE)

Relative Fit can also be used to compare different blockmodels.

# The code below can be time consuming
if(file.exists("RFvec.Rdata")){
    load("RFvec.Rdata")
}else {
  set.seed(1)
    RFvec<-c(
        texStrValm4coh3 = RF(texStrValm4coh3)$RF,
        texStrValm4coh4 = RF(texStrValm4coh4)$RF,
        texStrValm4CPreg = RF(texStrValm4CPreg)$RF
    )
    save(RFvec, file = "RFvec.Rdata")
}
barplot(RFvec)

More complex model

We can of course also think of other models that suit particular network.

For example, perhaps her the following model might make sense:

  • groups should be cohesive (complete block), except one, which does not need (can be also null) to be (but can be)
  • ties between groups can be anything (nul, com, reg)

If the number of groups is not known in advance, it is usually beneficial to make a function that will create a model with certain number of groups.

makeMyIM<-function(k){
  myIM<-array(NA,dim=c(3,k,k))
  myIM[1,,]<-"nul"
  myIM[cbind(1,1:(k-1),1:(k-1))]<-"com"
  myIM[2,,]<-"com"
  myIM[cbind(2,1:(k-1),1:(k-1))]<-NA
  myIM[3,,]<-"reg"
  myIM[cbind(3,1:k,1:k)]<-NA
  return(myIM)
}
printBlocks(makeMyIM(4))
##             1           2           3           4
## 1         com nul,com,reg nul,com,reg nul,com,reg
## 2 nul,com,reg         com nul,com,reg nul,com,reg
## 3 nul,com,reg nul,com,reg         com nul,com,reg
## 4 nul,com,reg nul,com,reg nul,com,reg     nul,com
k<-3; texStrValm4myIMk3<-optRandomParC(M=tex,preSpecM=4,k=k,rep=200,blocks=makeMyIM(3),approach="val",nCores=0, switch.names = TRUE)
## 
## 
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 274 found.
plot(texStrValm4myIMk3)

printBlocks(IM(texStrValm4myIMk3))
##     1   2   3
## 1 com nul reg
## 2 nul com nul
## 3 reg nul nul
plot(texStrValm4myIMk3,which = 2)

printBlocks(IM(texStrValm4myIMk3,which = 2))
##     1   2   3
## 1 com reg reg
## 2 reg com reg
## 3 nul nul nul
table(clu(texStrValm4myIMk3),clu(texStrValm4myIMk3,which = 2))
##    
##      1  2  3
##   1  4  3  0
##   2  1  2  2
##   3  1  2 10