Networks With R
Padgett's Florentine wedding dataset is quite fascinating, and it can help you understand how to create networks with R.
> library(network)
> data(flo)
> nflo=network(flo,directed=FALSE)
> plot(nflo, displaylabels = TRUE,
+      boxed.labels = FALSE)
The next step was to move from the network package to igraph. Since we have the adjacency matrix, we can use it:
> library(igraph)
> iflo=graph_from_adjacency_matrix(flo,
+    mode = "undirected")
> plot(iflo)
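Before going further, it can be worth checking that the conversion kept the structure of the adjacency matrix. The following is only a minimal sketch: it assumes the `network` and `igraph` packages are installed, loads `flo` with `data(flo, package = "network")` so that `network` does not need to be attached, and introduces the name `isolated` purely for illustration.

```r
library(igraph)

# Load the 16 x 16 adjacency matrix shipped with the network package
data(flo, package = "network")

# Marriage ties are mutual, so the matrix should be symmetric
stopifnot(all(flo == t(flo)))

iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# One vertex per family, one undirected edge per pair of 1s in the matrix
vcount(iflo)
ecount(iflo)

# Families without any marriage tie (illustrative helper variable)
isolated <- V(iflo)$name[degree(iflo) == 0]
isolated
```

Knowing which families are isolated helps when reading the later steps, since an isolated vertex has no path to the rest of the graph.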
The good thing is that a lot of functions are available. For instance, we can get shortest paths between two specific nodes. And we can give appropriate colors to the nodes that we’ll cross:
> AP=all_shortest_paths(iflo,
+    from="Peruzzi",
+    to="Ginori")
> L=AP$res[[1]]
> V(iflo)$color="yellow"
> V(iflo)$color[L[2:4]]="light blue"
> V(iflo)$color[L[c(1,5)]]="blue"
> plot(iflo)
We can also highlight the edges along the path, but I found it slightly more complicated, since the edges have to be extracted from the output:
> liens=c(paste(as.character(L)[1:4],
+    "--",
+    as.character(L)[2:5],sep=""),
+    paste(as.character(L)[2:5],
+    "--",
+    as.character(L)[1:4],sep=""))
> df=as.data.frame(ends(iflo,E(iflo)))
> names(df)=c("src","target")
> lstn=sort(unique(c(as.character(df[,1]),as.character(df[,2]),"Pucci")))
> Eliens=paste(as.numeric(factor(df[,1],levels=lstn)),"--",
+    as.numeric(factor(df[,2],levels=lstn)),sep="")
> EU=unlist(lapply(Eliens,function(x) x%in%liens))
> E(iflo)$color=c("grey","black")[1+EU]
> plot(iflo)
But it works. It is also possible to use a D3.js visualization:
> library(networkD3)
> simpleNetwork(df)
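Returning to the highlighted path between Peruzzi and Ginori: a possibly simpler way to get the edges is to ask igraph for them directly. This is only a sketch of an alternative, not the approach used above. It relies on `shortest_paths()` with `output = "both"`, which returns one shortest path as both a vertex sequence and an edge sequence; the names `sp` and `path_edges` are illustrative.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Ask for the vertices AND the edges of a shortest path
sp <- shortest_paths(iflo, from = "Peruzzi", to = "Ginori",
                     output = "both")
path_edges <- sp$epath[[1]]

# Grey everywhere, black on the edges of the path
E(iflo)$color <- "grey"
E(iflo)$color[as.integer(path_edges)] <- "black"
plot(iflo)
```

Note that `shortest_paths()` returns a single path per target, whereas `all_shortest_paths()` returns every shortest path, so the two are interchangeable only when the shortest path is unique.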
Then, the next question was how to add an edge to the network, here to connect the isolated Pucci family. The simplest way to do it is probably through the adjacency matrix:
> flo2=flo
> flo2["Pucci","Bischeri"]=1
> flo2["Bischeri","Pucci"]=1
> nflo2=network(flo2,directed=FALSE)
> plot(nflo2, displaylabels = TRUE,
+      boxed.labels = FALSE)
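The same tie can also be added on the igraph side, without going back to the matrix. The sketch below assumes an igraph object built from `flo` as before and uses `add_edges()`, which accepts vertex names on a named graph; `iflo2` is just an illustrative name.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Add a single undirected edge between the two families
iflo2 <- add_edges(iflo, c("Pucci", "Bischeri"))

# Pucci is no longer isolated
degree(iflo2, "Pucci")
plot(iflo2)
```

Editing the matrix keeps a single source of truth for both packages, while `add_edges()` is convenient when only the igraph object is at hand.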
Then, we played with centrality measures.
The goal was to see how related they were. Here, for all of them, “Medici” is the central node. But what about the others?
> B=betweenness(iflo)
> C=closeness(iflo)
> D=degree(iflo)
> E=eigen_centrality(iflo)$vector
> base=data.frame(betw=B,close=C,deg=D,eig=E)
> cor(base)
           betw     close       deg       eig
betw  1.0000000 0.5763487 0.8333763 0.6737162
close 0.5763487 1.0000000 0.7572778 0.7989789
deg   0.8333763 0.7572778 1.0000000 0.9404647
eig   0.6737162 0.7989789 0.9404647 1.0000000
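The statement that "Medici" comes first for every measure can be checked directly. The sketch below rebuilds the same four measures and picks the top family for each; `top` is an illustrative name, and `which.max()` is used because it skips missing values (depending on the igraph version, `closeness()` may return `NaN` for an isolated vertex such as Pucci).

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Same four centrality measures, with family names as row names
base <- data.frame(betw  = betweenness(iflo),
                   close = closeness(iflo),
                   deg   = degree(iflo),
                   eig   = eigen_centrality(iflo)$vector,
                   row.names = V(iflo)$name)

# Family with the largest value, measure by measure
top <- sapply(base, function(m) rownames(base)[which.max(m)])
top
```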
Those measures are quite correlated. It is also possible to use hierarchical clustering to visualize, as a dendrogram, how close those centrality measures are:
> H=hclust(dist(t(base)),
+    method="ward.D")
> plot(H)
Instead of looking at the values of the centrality measures, it is possible to look at ranks:
> rbase=base
> for(i in 1:4) rbase[,i]=rank(base[,i])
> H=hclust(dist(t(rbase)),
+    method="ward.D")
> plot(H)
Here the eigenvector measure is very close to the degree of vertices.
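The rank-based comparison can also be summarized numerically with a Spearman correlation, which is the Pearson correlation computed on ranks. This is a sketch rather than part of the original analysis: the `use = "pairwise.complete.obs"` argument is an assumption added in case `closeness()` returns `NaN` for the isolated vertex, and `S` is an illustrative name.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

base <- data.frame(betw  = betweenness(iflo),
                   close = closeness(iflo),
                   deg   = degree(iflo),
                   eig   = eigen_centrality(iflo)$vector)

# Rank correlation between measures; pairs with a missing value are dropped
S <- cor(base, method = "spearman", use = "pairwise.complete.obs")
round(S, 2)
```

Pairs of measures with a coefficient close to 1 order the families in almost the same way, which is what the dendrogram on ranks shows graphically.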
Finally, it is possible to seek clusters (in the context of coalition here, in case a war should start between those families):
> kc <- cluster_fast_greedy(iflo)
Here, we have three classes (+1 for the node that is disconnected from the other families):
> V(iflo)$color=c("yellow","orange",
+    "light blue")[membership(kc)]
> plot(iflo)
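To see who ends up in which coalition without reading it off the plot, the community object can be inspected directly. The sketch below uses `cluster_fast_greedy()`, the current igraph name for the fast greedy algorithm, together with the `sizes()`, `modularity()` and `communities()` accessors; the variable `pucci_group` is only illustrative.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

kc <- cluster_fast_greedy(iflo)

# Number of families in each community, and the modularity of the partition
sizes(kc)
modularity(kc)

# Family names, community by community
communities(kc)

# Community of the isolated family
pucci_group <- membership(kc)[["Pucci"]]
pucci_group
```

Since only three colors are supplied in the plotting code, the vertex in the fourth community gets no fill color, which is one way to spot the disconnected family.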
Published at DZone with permission of Arthur Charpentier, DZone MVB.