Game of Friendship Paradox
The paradox is that your friends probably have more friends than you. We take a closer look at this head-scratcher and create some data visualization using code.
In the introduction of my course next week, I will (briefly) mention networks, and I wanted to provide some illustration of the Friendship Paradox. On the Network of Thrones site (discussed in Beveridge and Shan (2016)), there is a dataset with the network of characters in Game of Thrones. The word “friend” might be a stretch here, but let’s continue to call connected nodes “friends.” The friendship paradox states that:
People, on average, have fewer friends than their friends.
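# Download the edge list (one row per pair of connected characters) and read it
download.file("https://www.macalester.edu/~abeverid/data/stormofswords.csv", "got.csv")
GoT = read.csv("got.csv")
# Interactive (d3.js) view of the network, using the first two columns as edges
library(networkD3)
simpleNetwork(GoT[,1:2])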
Because it is difficult for me to incorporate some d3.js scripts in the post, I will illustrate this with a more basic graph:
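Consider a vertex $v \in V$ in the undirected graph $G = (V, E)$ (with classical graph notations), and let $d(v)$ denote the number of edges touching it (i.e., $v$ has $d(v)$ friends). The average number of friends of a random person in the graph is:
$$\mu = \frac{1}{|V|}\sum_{v \in V} d(v) = \frac{2|E|}{|V|}$$
The average number of friends that a typical friend has is:
$$\frac{1}{|V|}\sum_{v \in V}\left(\frac{1}{d(v)}\sum_{v' \sim v} d(v')\right)$$
where $v' \sim v$ means that $v'$ is a friend of $v$.
Note that this can be related to the variance decomposition: if $D$ denotes the number of friends of a randomly chosen person, then $\mathrm{Var}[D] = \mathbb{E}[D^2] - \mathbb{E}[D]^2$, so that
$$\frac{\mathbb{E}[D^2]}{\mathbb{E}[D]} = \mathbb{E}[D] + \frac{\mathrm{Var}[D]}{\mathbb{E}[D]} \geq \mathbb{E}[D]$$
where the left-hand side is the average number of friends of a person reached by picking a friendship at random and then one of its two ends (Jensen inequality). But let us get back to our network. The list of nodes is: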
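The objects `M` and `nodes` are used in the code that follows but are not defined in the text. Here is a minimal sketch of how they could be built, assuming `M` is a two-column character matrix that lists every edge in both directions (so that a lookup on the first column finds all friends of a node) and `nodes` is the vector of distinct names. The small data frame `toy` and its column names are hypothetical stand-ins for the first two columns of `GoT`:

```r
# Hypothetical toy edge list standing in for GoT[, 1:2]
toy <- data.frame(Source = c("A", "A", "A", "B"),
                  Target = c("B", "C", "D", "C"),
                  stringsAsFactors = FALSE)

# Assumption: each undirected edge is stored twice, once per direction
E1 <- as.matrix(toy[, 1:2])
M <- rbind(E1, E1[, 2:1])

# Distinct node names (every node appears in column 1 after symmetrising)
nodes <- sort(unique(M[, 1]))

nrow(M) / length(nodes)  # mean number of friends per node
```

With this layout, the mean number of friends is simply the number of rows of `M` divided by the number of nodes, that is, twice the number of edges over the number of nodes.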
And for each of them, we can get the list of friends, and the number of friends:
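# Names of the nodes connected to x (rows of M whose first column is x)
friends = function(x) as.character(M[which(M[,1]==x),2])
# Number of friends, vectorized so that it accepts a vector of node names
nb_friends = Vectorize(function(x) length(friends(x)))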
We can also get the number of friends that each of our friends has, and the average of those numbers:
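# For each friend of y, the number of friends that friend has
friends_of_friends = function(y) (Vectorize(function(x) length(friends(x)))(friends(y)))
# Average number of friends among the friends of x, vectorized over node names
nb_friends_of_friends = Vectorize(function(x) mean(friends_of_friends(x)))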
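To see what these functions return, here is a small self-contained sketch on a hypothetical star graph (one hub `A` linked to `B`, `C` and `D`). The graph, the `edges` object, and the assumption that `M` stores each edge in both directions are illustrative choices, not taken from the dataset:

```r
# Hypothetical star graph: A is linked to B, C and D
edges <- rbind(c("A", "B"), c("A", "C"), c("A", "D"))
M <- rbind(edges, edges[, 2:1])   # assumption: each edge in both directions
nodes <- sort(unique(M[, 1]))

friends <- function(x) as.character(M[which(M[, 1] == x), 2])
nb_friends <- Vectorize(function(x) length(friends(x)))
nb_friends_of_friends <- Vectorize(function(x) mean(nb_friends(friends(x))))

Nb <- nb_friends(nodes)              # A: 3, B: 1, C: 1, D: 1
Nb2 <- nb_friends_of_friends(nodes)  # A: 1, B: 3, C: 3, D: 3

mean(Nb)   # 1.5
mean(Nb2)  # 2.5
```

The hub has three friends, but it is counted only once in `Nb`, while it shows up as "the friend" of each of the three leaves in `Nb2`. That asymmetry is what pushes the second average above the first.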
We can look at the density of the number of friends for a random node (in red), together with that of the average number of friends of its friends (in blue).
Nb = nb_friends(nodes)
Nb2 = nb_friends_of_friends(nodes)
# Red: number of friends; blue: average number of friends of friends
hist(Nb,breaks=0:40,col=rgb(1,0,0,.2),border="white",probability = TRUE)
hist(Nb2,breaks=0:40,col=rgb(0,0,1,.2),border="white",probability = TRUE,add=TRUE)
# Overlay kernel density estimates on the two histograms
lines(density(Nb),col="red",lwd=2)
lines(density(Nb2),col="blue",lwd=2)
And we can also compute the averages, just to check:
> mean(Nb)
[1] 6.579439
> mean(Nb2)
[1] 13.94243
So, indeed, people on average have fewer friends than their friends.
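The variance decomposition mentioned earlier can also be checked numerically. The sketch below uses a hypothetical degree vector `d` (a hub with three friends and three nodes with one friend each) and concerns the edge-weighted average `sum(d^2) / sum(d)`, which is related to, but not the same as, the per-node average `mean(Nb2)` computed above:

```r
# Hypothetical degree vector: one hub with 3 friends, three nodes with 1 friend
d <- c(3, 1, 1, 1)

mu <- mean(d)                # E[D]
sigma2 <- mean(d^2) - mu^2   # Var[D], population version (divides by n)

# Average degree of the end of a randomly chosen edge
edge_weighted <- sum(d^2) / sum(d)

edge_weighted      # 2
mu + sigma2 / mu   # 2
```

The population variance is computed by hand here because R's `var()` divides by `n - 1`, which would not match the identity exactly. Since the variance is never negative, the edge-weighted average can never fall below the plain average `mu`.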
Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.