# Non-Uniform Population Density in European Countries

### Statistical Analysis using R of population density in cities of France and how it compares to that or its surrounding nations.

Join the DZone community and get the full member experience.

Join For FreeA few months ago, I mentioned that France was a country with strong inequalities, especially when you look at higher education and research teams. Paris has almost 50% of CNRS researchers, while only 3% of the population lives there.

CNRS, "répartition des chercheurs en SHS" http://t.co/39dcJJBwrF, Paris 47.52% IdF 66.85% (pop 3.39% et 18.18% resp) pic.twitter.com/OsEXiFywPf

— Arthur Charpentier (@freakonometrics) 28 septembre 2015

It looks like Paris is the only city in France. And I wanted to check that, indeed, France is a country with strong inequalities when we look at population density.

Using data from sedac.ciesin.columbia.edu, it is possible to get population density on a small granular level:

```
> rm(list=ls())
> base=read.table(
+ "/home/charpentier/glp00ag.asc",
+ skip=6)
> X=t(as.matrix(base,ncol=8640))
> X=X[,ncol(X):1]
```

The scales for latitudes and longitudes can be obtained from the text file:

```
> #ncols 8640
> #nrows 3432
> #xllcorner -180
> #yllcorner -58
> #cellsize 0.0416666666667
```

Hence, we have:

```
> library(maps)
> world=map(database="world")
> vx=seq(-180,180,length=nrow(X)+1)
> vx=(vx[2:length(vx)]+vx[1:(length(vx)-1)])/2
> vy=seq(-58,85,length=ncol(X)+1)
> vy=(vy[2:length(vy)]+vy[1:(length(vy)-1)])/2
```

If we plot our density, as in a previous post, on Where People Live:

```
> I=seq(1,nrow(X),by=10)
> J=seq(1,ncol(X),by=10)
> image(vx[I],vy[J],log(1+X[I,J]),
+ col=rev(heat.colors(101)))
> lines(world[[1]],world[[2]])
```

We can see that we have a match between the big population matrix, and polygons of countries.

Consider France, for instance. We can download the contour polygon with higher precision,

```
> library(rgdal)
> fra=download.file( "http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm0.rds", + "fr.rds")
> Fra=readRDS("fr.rds")
> n=length(Fra@polygons[[1]]@Polygons)
> L=rep(NA,n) > for(i in 1:n) L[i]=nrow(Fra@polygons[[1]]@Polygons[[i]]@coords)
> idx=which.max(L)
> polygon_Fr= + Fra@polygons[[1]]@Polygons[[idx]]@coords
> min_poly=apply(polygon_Fr,2,min)
> max_poly=apply(polygon_Fr,2,max)
> idx_i=which((vx>min_poly[1])&(vx<max_poly[1]))
> idx_j=which((vy>min_poly[2])&(vy<max_poly[2]))
> sub_X=X[idx_i,idx_j]
> image(vx[idx_i],vy[idx_j],
+ log(sub_X+1),col=rev(heat.colors(101)),
+ xlab="",ylab="")
> lines(polygon_Fr)
```

We are now able to extract information about the population for France only (actually, it is only mainland France, islands are not considered here to avoid complicated computations):

```
> library(sp)
> xy=expand.grid(x = vx[idx_i], y = vy[idx_j])
> dim(xy) [1] 65730 2
```

Here, we have 65,730 small squares, in France.

```
> pip=point.in.polygon(xy[,1],xy[,2],
+ polygon_Fr[,1],polygon_Fr[,2])>0
> dim(pip)=dim(sub_X)
> Fr=sub_X[pip]
> sum(Fr) [1] 58105272
```

Observe that the total population within the French polygon is close to 60 million people, which is consistent with actual figures. Now, if we look more carefully at repartition over the French territory

```
> library(ineq)
> Gini(Fr) [1] 0.7296936
```

The Gini coefficient is rather high (over 70%), but it is also possible to visualize Lorenz curve,

`> plot(Lc(Fr))`

Observe that in 5% of the territory, we can find almost 54% of the population:

`> 1-min(LcF$L[LcF$p>.95]) [1] 0.5462632`

In order to compare with other countries, we consider the following:

```
> LC=function(rds="fr.rds"){
+ Fra=readRDS(rds)
+ n=length(Fra@polygons[[1]]@Polygons)
+ L=rep(NA,n)
+ for(i in 1:n) L[i]=nrow(Fra@polygons[[1]]@Polygons[[i]]@coords)
+ idx=which.max(L)
+ polygon_Fr=
+ Fra@polygons[[1]]@Polygons[[idx]]@coords
+ min_poly=apply(polygon_Fr,2,min)
+ max_poly=apply(polygon_Fr,2,max)
+ idx_i=which((vx>min_poly[1])&(vx<max_poly[1]))
+ idx_j=which((vy>min_poly[2])&(vy<max_poly[2]))
+ sub_X=X[idx_i,idx_j]
+ xy=expand.grid(x = vx[idx_i], y = vy[idx_j])
+ dim(xy)
+ pip=point.in.polygon(xy[,1],xy[,2],
+ polygon_Fr[,1],polygon_Fr[,2])>0
+ dim(pip)=dim(sub_X)
+ Fr=sub_X[pip]
+ return(list(gini=Gini(Fr),LC=Lc(Fr))
+ }
> FRA=LC()
```

For instance, consider Germany or Italy:

```
> deu=download.file( "http://biogeo.ucdavis.edu/data/gadm2.8/rds/DEU_adm0.rds","deu.rds")
> DEU=LC("deu.rds")
> ita=download.file( "http://biogeo.ucdavis.edu/data/gadm2.8/rds/ITA_adm0.rds","ita.rds")
> ITA=LC("ita.rds")
```

It is possible to plot Lorenz curves together:

```
> plot(FRA$LC,col="blue")
> lines(DEU$LC,col="black")
> lines(ITA$LC,col="red")
```

Observe that France is clearly below the other ones. Compared with Germany, there is a significant difference:

```
> FRA$gini
[1] 0.7296936
> DEU$gini
[1] 0.5088853
```

More precisely, 54% of French people, 40% of Italians, and 32% of Germans live in 5% of their territories.

```
> 1-min(FRA$LC$L[FRA$LC$p>.95])
[1] 0.5462632
> 1-min(ITA$LC$L[ITA$LC$p>.95])
[1] 0.3933227
> 1-min(DEU$LC$L[DEU$LC$p>.95])
[1] 0.3261124
```

Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Comments