Networks With R

Padgett's Florentine wedding dataset is quite fascinating, and it can be used to help you understand how to create networks with R.

To practice working with network data in R, we have been playing with Padgett's (1994) Florentine wedding dataset (discussed in the lecture). The dataset ships with the network package as flo:

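> library(network)
> data(flo)
> # assumed reconstruction: build an undirected network object
> # from the adjacency matrix
> nflo <- network(flo, directed = FALSE)
> # the boxed.labels value is cut off in the source; FALSE is assumed
> plot(nflo, displaylabels = TRUE,
+ boxed.labels = FALSE)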

The next step was to move from the network package to igraph. Since we have the adjacency matrix, we can use it:

> library(igraph)
> iflo=graph_from_adjacency_matrix(flo,
+ mode = "undirected")
> plot(iflo)
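Before going further, it can help to check that the conversion kept the whole network. The following is a minimal sketch, assuming both the igraph and network packages are installed; the variable names `n_vertices`, `n_edges` and `isolated` are illustrative and not part of the original code.

```r
library(igraph)

# Load the adjacency matrix without attaching the network package
data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Basic size of the graph
n_vertices <- vcount(iflo)
n_edges <- ecount(iflo)

# Families with no marriage tie at all (degree zero)
isolated <- V(iflo)$name[degree(iflo) == 0]

print(c(vertices = n_vertices, edges = n_edges))
print(isolated)
```

A vertex listed in `isolated` has no path to any other family, which matters later for shortest paths and closeness.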

The good thing is that a lot of functions are available. For instance, we can get shortest paths between two specific nodes. And we can give appropriate colors to the nodes that we’ll cross:

> AP=all_shortest_paths(iflo,
+ from="Peruzzi",
+ to="Ginori")
> L=AP$res[[1]]
> V(iflo)$color="yellow"
> V(iflo)$color[L[2:4]]="light blue"
> V(iflo)$color[L[c(1,5)]]="blue"
> plot(iflo)
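If only the length of the path and the families along it are needed, a shorter route may be enough. This is a sketch under the assumption that a single shortest path is of interest; `shortest_paths()` and `distances()` are igraph functions, while `path_names` and `n_steps` are names chosen here for illustration.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# One shortest path between the two families, as a vertex sequence
sp <- shortest_paths(iflo, from = "Peruzzi", to = "Ginori")
path_names <- as_ids(sp$vpath[[1]])

# Number of edges on a shortest path between the two families
n_steps <- distances(iflo, v = "Peruzzi", to = "Ginori")[1, 1]

print(path_names)
print(n_steps)
```

The path has one more vertex than it has edges, which is why the coloring above treats positions 1 and 5 as the endpoints.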

We can also highlight the edges along the path, but I found it slightly more complicated to extract the edges from the output:

> liens=c(paste(as.character(L)[1:4],
+ "--",
+ as.character(L)[2:5],sep=""),
+ paste(as.character(L)[2:5],
+ "--",
+ as.character(L)[1:4],sep=""))
> df=as.data.frame(ends(iflo,E(iflo)))
> names(df)=c("src","target")
> lstn=sort(unique(c(as.character(df[,1]),as.character(df[,2]),"Pucci")))
> Eliens=paste(as.numeric(factor(df[,1],levels=lstn)),"--",
+ as.numeric(factor(df[,2],levels=lstn)),sep="")
> EU=unlist(lapply(Eliens,function(x) x%in%liens))
> E(iflo)$color=c("grey","black")[1+EU]
> plot(iflo)
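A possibly simpler alternative is to ask igraph for the edges of the path directly. The sketch below assumes that `shortest_paths()` with `output = "both"` is acceptable in place of `all_shortest_paths()`, which is a swap from the approach above; `path_edges` and `edge_ids` are illustrative names.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Request both the vertices and the edges of one shortest path
sp <- shortest_paths(iflo, from = "Peruzzi", to = "Ginori",
                     output = "both")
path_edges <- sp$epath[[1]]

# Plain integer edge ids, used to index the color vector
edge_ids <- as.integer(path_edges)

# Grey everywhere, black along the path
E(iflo)$color <- "grey"
E(iflo)$color[edge_ids] <- "black"

# Only draw when running interactively
if (interactive()) plot(iflo)
```

This avoids building edge labels by hand, at the cost of showing a single path when several shortest paths exist.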

But it works. It is also possible to use a D3.js visualization, through the networkD3 package:

> library( networkD3 )
> simpleNetwork (df)

Then, the next question was how to add an edge to the network, here connecting the isolated Pucci family to the Bischeri family. The simplest way to do it is probably through the adjacency matrix:

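> flo2=flo
> flo2["Pucci","Bischeri"]=1
> flo2["Bischeri","Pucci"]=1
> # assumed reconstruction: undirected network object from the
> # updated matrix; the boxed.labels value is cut off in the source
> nflo2 <- network(flo2, directed = FALSE)
> plot(nflo2, displaylabels = TRUE,
+ boxed.labels = FALSE)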
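The same tie can also be added on the igraph side, without touching the matrix. This is a minimal sketch, assuming the igraph object built from `flo`; `add_edges()` is an igraph function and `iflo2` is a name introduced here for illustration.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Add one undirected edge, referring to vertices by name
iflo2 <- add_edges(iflo, c("Pucci", "Bischeri"))

# The new graph has one more edge and a single connected component
print(ecount(iflo2) - ecount(iflo))
print(components(iflo2)$no)
```

Working on the graph object keeps the original matrix untouched, which can be convenient when comparing the network before and after the change.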

Then, we’ve been playing with centrality measures:

> plot(iflo,vertex.size=betweenness(iflo))
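Vertex sizes in the plot only reflect betweenness. To compare several measures at once, one option is to list the top-ranked family for each of them. The sketch below assumes the igraph and network packages are installed; `measures` and `top_family` are illustrative names.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Four common centrality measures, one value per family
measures <- list(
  betw  = betweenness(iflo),
  close = closeness(iflo),
  deg   = degree(iflo),
  eig   = eigen_centrality(iflo)$vector
)

# which.max() skips NaN values, which some igraph versions
# return for the closeness of an isolated vertex
top_family <- sapply(measures, function(x) names(which.max(x)))
print(top_family)
```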

The goal was to see how related those centrality measures are. Here, “Medici” is the central node for all of them. But what about the other families?

> B=betweenness(iflo)
> C=closeness(iflo)
> D=degree(iflo)
> E=eigen_centrality(iflo)$vector
> base=data.frame(betw=B,close=C,deg=D,eig=E)
> cor(base)
           betw     close       deg       eig
betw  1.0000000 0.5763487 0.8333763 0.6737162
close 0.5763487 1.0000000 0.7572778 0.7989789
deg   0.8333763 0.7572778 1.0000000 0.9404647
eig   0.6737162 0.7989789 0.9404647 1.0000000

Those measures are quite correlated. It is also possible to run a hierarchical clustering and plot the resulting dendrogram to visualize how close those centrality measures are:

> # "ward" was renamed "ward.D" in R 3.1.0
> H=hclust(dist(t(base)),
+ method="ward.D")
> plot(H)

Instead of looking at the values of the centrality measures, it is possible to look at ranks:

> rbase=base
> for(i in 1:4) rbase[,i]=rank(base[,i])
> # "ward" was renamed "ward.D" in R 3.1.0
> H=hclust(dist(t(rbase)),
+ method="ward.D")
> plot(H)

Here the eigenvector measure is very close to the degree of vertices.
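Comparing ranks rather than values is close in spirit to Spearman's rank correlation, which is the usual correlation computed on ranks. The sketch below is an illustration under one loud assumption: the isolated vertex is dropped first, so that closeness is defined for every remaining family. The numbers will therefore differ from the table above; `core` and `rank_cor` are illustrative names.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Assumption: keep only families with at least one tie
core <- delete_vertices(iflo, which(degree(iflo) == 0))

base <- data.frame(
  betw  = betweenness(core),
  close = closeness(core),
  deg   = degree(core),
  eig   = eigen_centrality(core)$vector
)

# Spearman correlation: Pearson correlation of the ranks
rank_cor <- cor(base, method = "spearman")
print(round(rank_cor, 2))
```

A high entry in `rank_cor` means that two measures order the families in nearly the same way, even when their raw values are on different scales.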

Finally, it is possible to look for clusters (in the context of coalitions here, in case a war should start between those families):

> kc <- fastgreedy.community(iflo)

Here, we have three classes (+1 for the node that is disconnected from the other families):

> V(iflo)$color=c("yellow","orange",
+ "light blue")[membership ( kc )]
> plot(iflo)

> plot(kc, iflo)
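To read the clusters without a plot, the membership can be turned into a list of families per group. This is a minimal sketch, assuming a version of igraph that provides `cluster_fast_greedy()`, the newer name for `fastgreedy.community()`; `groups_by_family` and `group_colors` are illustrative names, and `rainbow()` is used only so that there is one color per group.

```r
library(igraph)

data(flo, package = "network")
iflo <- graph_from_adjacency_matrix(flo, mode = "undirected")

# Greedy modularity optimisation
kc <- cluster_fast_greedy(iflo)

# Number of groups, their sizes, and the families in each
n_groups <- length(kc)
group_sizes <- sizes(kc)
groups_by_family <- split(V(iflo)$name, membership(kc))

# One color per group, so no vertex is left without a color
group_colors <- rainbow(n_groups)
V(iflo)$color <- group_colors[membership(kc)]

print(groups_by_family)
print(modularity(kc))
```

Sizing the palette from `n_groups` avoids missing colors when the isolated family ends up in a group of its own.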

Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.