The network represents communication between organizations involved in search and rescue operations in three Texas counties (Kerr, Kendal and Bandera) after 1978 flood.
Drabek et al. (1981) provided seven case studies of emergent multi-organizational networks in the context of search and rescue (SAR) activities. Networks of interaction frequency are reported, along with several organizational attributes.
Our network was receded so that higher value means more frequent communication -> 1->4, 2->3, 3->2, 4->1 The tie values now mean:
For more see https://rdrr.io/cran/network/man/emon.html.
First we load the blockmodeling and sna packages and the data (available on workshop web page).
library(blockmodeling)
library(sna)
## Loading data
tex<-loadmatrix("texas.net") #File is in Pajek matrix format.
texCounty<-loadvector("texasCounty.clu")
texCounty<-factor(texCounty,labels=c("Other","Kendal","Kerr","Bandera"))
Then we plot the network and partition by counties in matrix and
graph form. We use the plotMat
function with paramters
M
(matrix) and clu
(partition).
par(mfrow=c(2,1))
plotMat(tex, clu=texCounty)
gplot1( tex/2,vertex.col=texCounty)
Based on network and a partition, we can get an image matrix (means
by blocks) with funByBlocks
.
par(mfrow=c(1,2))
plotMat(tex, clu=texCounty,mar=c(1,4,7,1),x.axis.val.pos = 1.02, y.axis.val.pos = -0.02, main="Matrix by counties",title.line=6,cex.val = 1)
tmpDensIM<-funByBlocks(tex,clu=texCounty,na.rm=TRUE)
plotMat(tmpDensIM,title.line=2,main="Density (image) matrix\ncounties",mar=c(1,4,7,1),x.axis.val.pos = 1.03, y.axis.val.pos = -0.03)
par(mfrow=c(1,1))
For finding a partition using generalized blockmodeling, we use
optRandomParC
function (also on slides 33-26). The function
searches for the best partition of a network (matrix) into a specified
numner of clusters according to a selected equivalence. It’s main
arguments are:
M
– The network in (dense) matrix format. An array for
multirelational networkk
– the number of clusters. If data contain more sets,
this is a vector (advanced).approaches
– one of "bin"
(binary),
"val"
(valued) or "hom"
(homogeneity)
blockmodeling.blocks
– allowed block typesrep
– the number of random starting partitionsnCores
– For multicore computers. Defaults to 1. 0
means automatic (number of available cores – 1).blocks
parameter specifies the equivalences as:
Block types (possible values in vector or array supplied to
blocks
) are:
"nul"
- null or empty block"com"
- complete block"rdo"
, "cdo"
- row and column-dominant
blocks (binary and valued approach only)"reg"
- (f-)regular block"rre"
, "cre"
- row and column-(f-)regular
blocks"rfn"
, "cfn"
- row and column-dominant
blocks (binary and valued approach only)"den"
- density block (binary approach only)"avg"
- average block (valued approach only)"dnc"
- do not care block - the error is always zero
The ordering is important, since if several block types have identical
error, the first on the list is selected.We will start with homogeneity blockmodeling, as the least preparations are needed to use it.
For homogeneity blockmodeling we set approaches="hom"
.
Sum of squares blockmodeling is used by default, but we can also
explicitly demand it with homFun="ss"
. For absolute
deviations blockmodeling, we would have to specify
homFun="ad"
.
The equivalence is specified via blocks
argument. For
structural equivalence, null ("nul"
) and complete
("com"
) block types are allowed, but as for homogeneity
blockmodeling, the complete block is a special case of the null block,
only complete block can be specified (blocks=c("com")
). In
the example below, we specify:
M=tex
- the data is in matrix tex
.k=3
- we want partitions into 3 clusters.rep=200
- we want 200 repetitions, that is start with
300 random starting partitions.blocks=c("com")
- structural equivalence (se
above)approaches="hom",homFun="ss"
- homogeneity sum of
squares blockmodelingnCores=0
- the number of cores is set to “number of
available cores” - 1.Then we use method plot
(which calles
plotMat
) to plot the results.
set.seed(2021)
texStrSSk3<-optRandomParC(M=tex,k=3,rep=200,blocks=c("com"),approaches="hom",homFun="ss",nCores=0)
## Loading required namespace: doParallel
## Loading required namespace: doRNG
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1072.744 found.
The function has found 1 partition with the minimal value of the criterion function = 1072.744.
Then we print the resoult and plot the matrix.
texStrSSk3
## Network size: 25
##
## Approachs (paramter): hom-ss
## Blocks (paramter)
## com
##
## Sizes of clusters:
## 1 2 3
## 8 8 9
##
## Error: 1072.744
plot(texStrSSk3)
We see we got 3 clusters of approximately equal size.
Now we can plot na image matrix.
plotMat(funByBlocks(texStrSSk3))
We can make things nicer by giving the names to clusters.
Here we also use the argument orderClu
of function
funByBlocks
to order the clusters by decreasing average
in-degrees, so that the order is not random.
## Making a nice image matrix
strSSK3im<-funByBlocks(texStrSSk3,orderClu = TRUE)
rownames(strSSK3im)<-colnames(strSSK3im)<-c("Core","Kerr","Pheriphery")
plotMat(strSSK3im,main="Image matrix", mar=c(1,4,7,1), x.axis.val.pos = 1.03, y.axis.val.pos = -0.03, title.line=6)
We will not go deep into that, but one way to decide on the number of clusters is to check (and plot) the errors for different number of clustrs. We use a for loop for that bellow.
## Selecting number of clusters - using a for loop
texStrSSList<-list()
for(k in 1:5){
texStrSSList[[as.character(k)]]<-optRandomParC(M=tex,k=k,rep=ifelse(k==1,1,200),blocks=c("com"),approach="hom",homFun="ss",nCores=ifelse(k==1,1,0))
}
##
##
## Starting optimization of the partiton 1 of 1 partitions.
## Starting partition: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## Final error: 1458.5
## Final partition: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1458.5 found.
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1264.66 found.
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 1072.744 found.
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 972.1583 found.
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 878.1958 found.
plot(sapply(texStrSSList,err)~I(1:5),type="o", xlab="k",ylab="CF")
We see an elbow at 3 clusters, indicating that this might be an appropriate number.
We can also use Relative fit - https://link.springer.com/article/10.1007/s10260-021-00595-1.
if(file.exists("texStrSSRF.Rdata")){
load("texStrSSRF.Rdata")
}else{
set.seed(1)
texStrSSRF <- NULL
for(k in 1:8){
tmp <- optRandomParC(M = tex,
k = k,
rep = 200,
blocks = c("com"),
approach = "hom",
homFun = "ss",
nCores = 0)
# m stands for the number of randomized networks and
# should be increased to, e.g., 30
texStrSSRF[k] <- RF(tmp, m = 5)$RF
}
save(texStrSSRF,file = "texStrSSRF.Rdata")
}
plot(y = texStrSSRF, x = 1:length(texStrSSRF), type="o", xlab = "k", ylab = "RF")
For binary blockmodeling, most things stay the same, but we do need a
binary network. We will just use tex>0
, which
essentially converts all values above 0 to 1s. We could also use
M=tex
with usePreSpecM =1
, but that is
probably less clear. We of course also have to specify
approaches="bin"
.
Again we specify structural equivalence by allowing null and complete
blocks. As here null blocks are not a special case of complete blocks,
both must be speficied with blocks=c("nul","com")
. Below we
aslo print and plot the result.
texStrBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com"),approaches="bin",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 132 found.
texStrBink3
## Network size: 25
##
## Approachs (paramter): bin
## Blocks (paramter)
## nul com
##
## Sizes of clusters:
## 1 2 3
## 12 7 6
##
## IM
## 1 2 3
## 1 nul nul nul
## 2 nul com nul
## 3 com nul com
##
## Error: 132
plot(texStrBink3)
The argument orderClu
can be also used in method
plot
to order the clusters by decreasing average
in-degrees, so that the order is not random.
plot(texStrBink3,orderClu=TRUE)
Then we can also plot the image matrix (ordered).
# image matrix
plotMat(funByBlocks(texStrBink3,orderClu = TRUE))
And the same in graph form.
gplot1(funByBlocks(texStrBink3,orderClu = TRUE))
We can compare the results of homogeneity and binary blockmodeling with a contingency table and Adjusted Rand Index.
## comparison with homogeneity partition
table(clu(texStrSSk3),clu(texStrBink3))
##
## 1 2 3
## 1 1 7 0
## 2 8 0 0
## 3 3 0 6
crand(clu(texStrSSk3),clu(texStrBink3))
## [1] 0.543518
For regular equivalence, we only have to add a regular
("reg"
) block type to allowed block types
(blocks=c("nul","com", "reg")
). Unfortunatelly, both
partitions, that is in 2 and 3 clusters, are not very usefull, only
identifying Salvation Army as a zero out-degree unit.
texRegBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com", "reg"),approaches="bin",nCores=0)
##
##
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 2 found.
texRegBink3
## Network size: 25
##
## Approachs (paramter): bin
## Blocks (paramter)
## nul com reg
##
## Sizes of clusters:
## 1 2 3
## 1 6 18
##
## IM
## 1 2 3
## 1 nul nul nul
## 2 com reg reg
## 3 nul reg reg
##
## Error: 2
## 2 solutions with minimal error exits. Only results for the first one are shown above!
plot(texRegBink3)
# 2 clusters
texRegBink2<-optRandomParC(M=tex>0,k=2,rep=200,blocks=c("nul","com", "reg"),approach="bin",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 4 found.
texRegBink2
## Network size: 25
##
## Approachs (paramter): bin
## Blocks (paramter)
## nul com reg
##
## Sizes of clusters:
## 1 2
## 24 1
##
## IM
## 1 2
## 1 reg nul
## 2 nul nul
##
## Error: 4
plot(texRegBink2)
For valued blockmodeling, we have to use
approaches="val"
. In addition, the paramter \(m\) has to be specified via
preSpecM
. To determine it, an inspection of tie values is
useful, but even more is the subject matter knowledge. Here we set it to
4, which represents continuous communication.
table(tex)
## tex
## 0 1 2 3 4
## 439 27 40 13 106
summary(tex[tex>0])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 4.000 3.065 4.000 4.000
First we find a plot a partition and the corresponding image matrix.
set.seed(2021)
texStrValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com"),approaches="val",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 438 found.
texStrValm4k3
## Network size: 25
##
## Approachs (paramter): val
## Blocks (paramter)
## nul com
##
## Sizes of clusters:
## 1 2 3
## 7 5 13
##
## IM
## 1 2 3
## 1 com nul nul
## 2 nul com nul
## 3 nul nul nul
##
## Error: 438
plot(texStrValm4k3)
plotMat(funByBlocks(texStrValm4k3))
And now we make it a bit nicer.
texStrValm4k3im<-funByBlocks(texStrValm4k3, orderClu = TRUE,na.rm=TRUE)
rownames(texStrValm4k3im)<-colnames(texStrValm4k3im)<-c("Core","Kerr","Pheriphery")
plotMat(texStrValm4k3im)
Now we check what changes if we set \(m\) to 2, that is by
preSpecM=2
, that is, we treat communication “every few
hours” as already “strong enough” not to represent an error in complete
blocks.
table(tex)
## tex
## 0 1 2 3 4
## 439 27 40 13 106
summary(tex[tex>0])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 4.000 3.065 4.000 4.000
set.seed(2021)
texStrValm2k3<-optRandomParC(M=tex,preSpecM=2,k=3,rep=200,blocks=c("nul","com"),approach="val",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 325 found.
texStrValm2k3
## Network size: 25
##
## Approachs (paramter): val
## Blocks (paramter)
## nul com
##
## Sizes of clusters:
## 1 2 3
## 8 9 8
##
## IM
## 1 2 3
## 1 nul nul nul
## 2 com com nul
## 3 nul nul com
##
## Error: 325
plot(texStrValm2k3)
funByBlocks(texStrValm2k3)
## 1 2 3
## 1 0.1428571 0.5972222 0.2812500
## 2 1.6944444 2.2500000 0.6111111
## 3 0.1562500 0.5416667 2.2142857
We have also tried regular equivalence with \(m=4\), but with little success.
## Regular equivalence, m = 4 ("does not work")
set.seed(2021)
texRegValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com","reg"),approach="val",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 81 found.
plot(texRegValm4k3)
Pre-specified blockmodeling can be used with any generalized blockmodeling approach, but we demonstrate its use with valued blockmodeling with \(m=4\).
First we will pre-specify a model called “cohesive groups”. The model
assumes a complete blocks on the diagonal and null blocks elsewhere. In
case of one-relational network with only one allowed block type per
block, the specification can be a matrix (in general, it should be 4
dimensional array). This should be then supplied as a
blocks
paramter to optRandomParC
function.
Bellow we specify models models for 2-4 clusters and print the one for 4 groups. We also compute the error with 1 cluster (the whole network is one complte block) and with each unit in its own cluster (non-diagonal elements should then be empty).
IMcoh2<-ifelse(diag(2)==1, "com","nul")
IMcoh3<-ifelse(diag(3)==1, "com","nul")
IMcoh4<-ifelse(diag(4)==1, "com","nul")
IMcoh4
## [,1] [,2] [,3] [,4]
## [1,] "com" "nul" "nul" "nul"
## [2,] "nul" "com" "nul" "nul"
## [3,] "nul" "nul" "com" "nul"
## [4,] "nul" "nul" "nul" "com"
c(sum(4-tex)-4*nrow(tex),sum(tex))
## [1] 1830 570
First we fit the model with 2 clusters.
set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0)
##
##
## Optimization of all partitions completed
## 4 solution(s) with minimal error = 894 found.
table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
##
## 1 2
## 1 12 1
## 2 2 10
We have found 4 equally well fitting partitions, but closer
inspections shows that the first two differ only in the order of groups.
With pre-specified blockmodeling, the optRandomParC
function things the order (or names) of clusters are important. By
setting switch.names = TRUE
we can tell the function that
this is not the case. Then we still get two equally well fitting
partitions, but they are in deed different.
set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0, switch.names = TRUE)
##
##
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 894 found.
plot(texStrValm4coh2)
plot(texStrValm4coh2,which = 2)
table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
##
## 1 2
## 1 12 1
## 2 2 10
Then we also try models for 3 and 4 clusters.
texStrValm4coh3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=IMcoh3,approach="val",nCores=0, switch.names = TRUE)
##
##
## Optimization of all partitions completed
## 3 solution(s) with minimal error = 626 found.
plot(texStrValm4coh3)
texStrValm4coh4<-optRandomParC(M=tex,preSpecM=4,k=4,rep=200,blocks=IMcoh4,approach="val",nCores=0, switch.names = TRUE)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 520 found.
plot(texStrValm4coh4)
We can notice that the error is decreasing with number of clusters. We can check how long.
if(file.exists("texStrValm4cohList.Rdata")){
load("texStrValm4cohList.Rdata")
}else{
texStrValm4cohList<-list()
for(k in 2:15){
tmpIM<-ifelse(diag(k)==1, "com","nul")
texStrValm4cohList[[as.character(k)]]<-optRandomParC(M=tex,preSpecM=4,k=k,rep=100,blocks=tmpIM,approach="val",nCores=0)
}
save(texStrValm4cohList,file = "texStrValm4cohList.Rdata")
}
errCoh<-c("1"=sum(4-tex), sapply(texStrValm4cohList,err))
plot(errCoh,type="o")
abline(h=sum(tex))
plot(errCoh,type="o",ylim=c(min(errCoh),650))
abline(h=sum(tex))
Now we can try a bit more complex model, where in some blocks, several block types will be allowed. We will explore a core-perhiphey model. In this model, two clusters are assumed, where the first one is the core, which means that units within should be strongly connected to eachother. For this one, we assume a complete block. The second is a periphery, which units should not be conntected to each other at all. For this block, we should assume a null block. As different versions of the model allow different conections between the core and the perhiphery, we will allow different block types there.
Under structural equivalence, we want to allow both null and complete blocks in the off-diagonal blocks. We can spcify this as follows:
## Core-perhiphery - structural equivalence
# model:
# 1 2
# 1 com nul,com
# 2 nul,com nul
IMcpStr<-array(NA,dim=c(2,2,2))
IMcpStr[1,,]<-"nul"
IMcpStr[1,1,1]<-"com"
IMcpStr[2,2,1]<-"com"
IMcpStr[2,1,2]<-"com"
IMcpStr
## , , 1
##
## [,1] [,2]
## [1,] "com" "nul"
## [2,] NA "com"
##
## , , 2
##
## [,1] [,2]
## [1,] "nul" "nul"
## [2,] "com" NA
As 3 and 4 dimensional arrays are very hard to read, we have created a function to print this a bit nicely.
printBlocks(IMcpStr)
## 1 2
## 1 com nul,com
## 2 nul,com nul
Then we fit the model using again valued blockmodeling and \(m=4\).
texStrValm4CPstr<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpStr,approach="val",nCores=0)
##
##
## Optimization of all partitions completed
## 1 solution(s) with minimal error = 496 found.
plot(texStrValm4CPstr)
texStrValm4CPstr
## Network size: 25
##
## Approachs (paramter): val
## Blocks (paramter)
## 1 2
## 1 com nul,com
## 2 nul,com nul
##
## Sizes of clusters:
## 1 2
## 7 18
##
## IM
## 1 2
## 1 com nul
## 2 nul nul
##
## Error: 496
Another option is to also allow regular block type in the off-diagonal blocks.
## Core-perhiphery - regular equivalence
# model:
# 1 2
# 1 com nul,com,reg
# 2 nul,com,reg nul
IMcpReg<-array(NA,dim=c(3,2,2))
IMcpReg[1,,]<-"nul"
IMcpReg[1,1,1]<-"com"
IMcpReg[2,2,1]<-"com"
IMcpReg[2,1,2]<-"com"
IMcpReg[3,2,1]<-"reg"
IMcpReg[3,1,2]<-"reg"
# for better idea what we have done
printBlocks(IMcpReg)
## 1 2
## 1 com nul,com,reg
## 2 nul,com,reg nul
texStrValm4CPreg<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpReg,approach="val",nCores=0)
##
##
## Optimization of all partitions completed
## All 200 solutions have err 317
plot(texStrValm4CPreg)
err(texStrValm4CPreg)
## [1] 317
plot(texStrValm4CPreg, use.IM=FALSE)
Relative Fit can also be used to compare different blockmodels.
# The code below can be time consuming
if(file.exists("RFvec.Rdata")){
load("RFvec.Rdata")
}else {
set.seed(1)
RFvec<-c(
texStrValm4coh3 = RF(texStrValm4coh3)$RF,
texStrValm4coh4 = RF(texStrValm4coh4)$RF,
texStrValm4CPreg = RF(texStrValm4CPreg)$RF
)
save(RFvec, file = "RFvec.Rdata")
}
barplot(RFvec)
We can of course also think of other models that suit particular network.
For example, perhaps her the following model might make sense:
If the number of groups is not known in advance, it is usually beneficial to make a function that will create a model with certain number of groups.
makeMyIM<-function(k){
myIM<-array(NA,dim=c(3,k,k))
myIM[1,,]<-"nul"
myIM[cbind(1,1:(k-1),1:(k-1))]<-"com"
myIM[2,,]<-"com"
myIM[cbind(2,1:(k-1),1:(k-1))]<-NA
myIM[3,,]<-"reg"
myIM[cbind(3,1:k,1:k)]<-NA
return(myIM)
}
printBlocks(makeMyIM(4))
## 1 2 3 4
## 1 com nul,com,reg nul,com,reg nul,com,reg
## 2 nul,com,reg com nul,com,reg nul,com,reg
## 3 nul,com,reg nul,com,reg com nul,com,reg
## 4 nul,com,reg nul,com,reg nul,com,reg nul,com
k<-3; texStrValm4myIMk3<-optRandomParC(M=tex,preSpecM=4,k=k,rep=200,blocks=makeMyIM(3),approach="val",nCores=0, switch.names = TRUE)
##
##
## Optimization of all partitions completed
## 2 solution(s) with minimal error = 274 found.
plot(texStrValm4myIMk3)
printBlocks(IM(texStrValm4myIMk3))
## 1 2 3
## 1 com nul reg
## 2 nul com nul
## 3 reg nul nul
plot(texStrValm4myIMk3,which = 2)
printBlocks(IM(texStrValm4myIMk3,which = 2))
## 1 2 3
## 1 com reg reg
## 2 reg com reg
## 3 nul nul nul
table(clu(texStrValm4myIMk3),clu(texStrValm4myIMk3,which = 2))
##
## 1 2 3
## 1 4 3 0
## 2 1 2 2
## 3 1 2 10