
Data Visualization with R: RGoogleMaps and Toronto Open Data



Before my daughter was born, I thought that my wife and I would have to send her to a licensed child care centre somewhere in Toronto. I had heard over and over how long a waiting list I should expect the centre to have, so we’d better get her registered nice and early! Well, it turns out that we found an excellent unlicensed home day care, which she’s been at for two years now. So when I recently went on Toronto Open Data’s website and found a dataset of licensed child care centres throughout Toronto, I thought I might have a fun time analyzing a topic that I thankfully have not had to deal with thus far!

If you look in the dataset (or in the documentation for the dataset) you’ll see that it contains names, addresses, phone info, building type, number of spaces in the daycare (broken down into age categories, and then totalled up) and unprojected latitude/longitude coordinates. This dataset literally begged me to map it, and it also begged me to use one of the number-of-spaces variables in the map as well!

The process I used to create the maps is very similar to the one behind the maps I made when I was analyzing the Toronto Casino Feedback Form (Survey), except that in the maps in this post, the dots are bigger or smaller depending on the percentiles of a quantitative variable (in this case the total number of spaces in the child care centres of a particular building type). You can find the R code I used to generate these maps and stats at the bottom of this post.
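To make the percentile-based sizing concrete, here is a minimal sketch in base R. The space counts are made-up values, not rows from the Toronto dataset; the `ecdf()` call and the multiplier of 4 mirror what the full script at the bottom of the post does.

```r
# Hypothetical total-space counts for six centres (illustrative values only)
totspace <- c(8, 20, 44, 60, 116, 167)

# ecdf() returns a function giving, for each value, the share of
# observations less than or equal to it (a number between 0 and 1)
Fn <- ecdf(totspace)
pct <- Fn(totspace)

# Scale the percentile into a point-size multiplier, as the maps do with cex
cex <- pct * 4

print(data.frame(totspace, pct = round(pct, 2), cex = round(cex, 2)))
```

Because the size depends on rank rather than on the raw count, a single very large centre cannot dwarf every other dot on the map.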

This is meant strictly as an exploratory exercise. To make things clearer, I’ve created multiple maps, where each map shows the locations of child care centres from one building type (e.g. Places of Worship, Public Elementary Schools, High Rise Buildings, etc.). As I describe each map, I’ll also refer to descriptive stats that I’ve calculated and displayed at the bottom of this post (after all of the R code). If you look at the table at the bottom of this post, you’ll notice there are more building types than I’ve mapped here. That’s because I didn’t feel like mapping everything, only some of the most popular ones :)
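For readers without plyr installed, the per-building-type summary can be approximated in base R. This is only a sketch: the `centres` data frame holds invented values, and only the column names `bldg_type` and `TOTSPACE` are taken from the real data.

```r
# Hypothetical miniature of the dataset: building type and total spaces
centres <- data.frame(
  bldg_type = c("House", "House", "Place of Worship",
                "Place of Worship", "Place of Worship", "Purpose Built"),
  TOTSPACE  = c(10, 87, 8, 44, 167, 60),
  stringsAsFactors = FALSE
)

# Split the space counts by building type, then summarise each group
by_type <- split(centres$TOTSPACE, centres$bldg_type)
stats <- do.call(rbind, lapply(by_type, function(x) {
  c(min.space = min(x), average.space = mean(x),
    median.space = median(x), max.space = max(x),
    tot_daycares = length(x))
}))

# Most common building types first
stats <- stats[order(-stats[, "tot_daycares"]), , drop = FALSE]
print(stats)
```

The result is a small matrix with one row per building type, laid out like the table at the end of the post.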

Public Elementary Schools are by far the most popular type of building in which the licensed child care centres in this dataset are found (279 centres, according to the data). Looking at the map below, you can see that there is a very dense cluster of public elementary school child care centres in the core of the GTA (downtown Toronto and North York). As you go west towards Etobicoke and Rexdale, you definitely see fewer centres, and in Scarborough you also see few centres, but they appear to be more dispersed and less clustered than in the other areas. There’s a lot of variability in the number of spaces in these child care centres, ranging from a minimum of 15 to a maximum of 217, with an average of around 74 spaces per centre.

Public Elementary Schools

Places of Worship, as you can see below, are far less numerous than their public elementary school counterparts, with only 116 registered in this dataset.  The first thing that I noticed with this map is that most of the small dots (thus the smaller child care centres in places of worship) seem to fall in the south of the GTA, rather than the north.  I suppose that makes sense to me in an analogical kind of a way.  In the north of the GTA (where I live) a lot of the businesses are big chains that seek to serve as many people as possible, whereas downtown there are a lot of smaller businesses that serve a niche market.  Perhaps it’s a similar story with child care centres in places of worship as well.

On a side note, the vast majority of the places of worship mentioned in this dataset were of Christian or Catholic denominations.  I was perhaps surprised not to find too many synagogues in there, but that might just be my bias speaking!

Child care centres in places of worship ranged from having a minimum of 8 spaces to a maximum of 167 with an average of about 48 spaces per centre.
Places of Worship

High Rise child care centres seem to show a pretty distinctive geographical pattern, as you can see below! They seem to be either in the east of Toronto, near or beyond Highway 404, or in the west of Toronto, most of them beyond Allen Road/Dufferin Street. I wonder what accounts for what looks like a hole in the middle!? Also, you’ll notice that many of the smaller high rise child care centres are in the east of Toronto, rather than the west. High rise child care centres range from a minimum of 20 spaces to a maximum of 145, with an average of about 69 spaces per centre. You’ll notice that the minimum number of spaces is higher than in other categories, probably because they are likely serving many residents in their own high rise building!
High Rises

Purpose Buildings, or buildings that were created with the child care centre in mind, are fairly sparse throughout Toronto, with only 58 in the dataset. In terms of clusters here, it almost looks like you could delineate 4 clusters of buildings: North, South, East, and West. Purpose buildings range from a minimum of 20 spaces to a maximum of 165, with an average of about 72 spaces per centre (the median, however, is about 60, suggesting that there are a few really big ones in there, relative to all the rest). Outliers aside, it seems that purpose buildings are like high rise daycares, in that they are meant for higher capacity than other centres.
Purpose Buildings
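The gap between a mean of about 72 and a median of about 60 is the usual sign of a right-skewed distribution. The following sketch uses invented numbers (not the actual purpose-built centres) to show how one very large centre pulls the mean up while leaving the median nearly untouched.

```r
# Hypothetical space counts: mostly mid-sized centres plus one very large one
spaces <- c(20, 45, 55, 60, 60, 65, 70, 165)

mean_spaces   <- mean(spaces)    # pulled upward by the 165-space centre
median_spaces <- median(spaces)  # middle value, barely affected by it

# Dropping the largest centre shows how much it moves the mean
mean_without_largest <- mean(spaces[spaces < max(spaces)])

cat("mean:", mean_spaces, " median:", median_spaces,
    " mean without largest:", round(mean_without_largest, 1), "\n")
```

In this toy example, removing the single largest value drops the mean below the median, which is why the median is often the safer "typical size" figure for data like this.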

Community and Recreation Centres with child care seem to almost show a circle pattern in how they are laid out around the GTA. The obvious exception is Scarborough, which seems to have very few community and recreation centres with child care compared to what’s going on in the west. Perhaps mirroring the phenomenon we saw with child care centres in places of worship, a lot of the smaller community and recreation centres with child care are in the south, whereas the north is the domain of bigger centres. These centres range from a minimum of 13 spaces to a maximum of 146, with an average of about 64 spaces per centre.
Community and Recreation Centres

Although child care centres in Houses seem to show a very random-looking pattern, I can’t help but notice that there are more than a few centres in close proximity to the GO Train tracks emanating from Union Station. Perhaps there’s an interesting story there, or maybe I’m just seeing patterns that don’t exactly mean anything (after all, there are just 38 of these places registered in the dataset!). Child care centres in houses range from a minimum of 10 spaces to a maximum of 116, with an average of about 50 spaces per centre. I do have to wonder how exactly these houses fit so many kids. Seeing as we are living in a post-Google Street View world, you can just look at whatever house you want in living colour by typing in the address. You can see how big one of these houses is (it has 87 spaces!) below the following map.

Wow!  It’s not a full picture, but you get the sense that the house really is quite big!

Well, so concludes my foray into the world of licensed child care centres.  If you have any commentary to add regarding these results, or can show me a better way of mapping them (although I do like RGoogleMaps), then by all means leave me a comment!  R code is shared below.


library(ff)           # read.csv.ffdf, as.ff, as.ffdf
library(ffbase)       # merge and subset methods for ffdf objects
library(RgoogleMaps)  # MapBackground, PlotOnStaticMap
library(plyr)         # ddply

addTrans <- function(color,trans)
{
  # This function adds transparency to a color.
  # Define transparency with an integer between 0 and 255
  # 0 being fully transparent and 255 being fully visible
  # Works with either color and trans a vector of equal length,
  # or one of the two of length 1.
  if (length(color)!=length(trans)&!any(c(length(color),length(trans))==1)) stop("Vector lengths not correct")
  if (length(color)==1 & length(trans)>1) color <- rep(color,length(trans))
  if (length(trans)==1 & length(color)>1) trans <- rep(trans,length(color))
  num2hex <- function(x)
  {
    # Convert an integer in 0-255 to a two-digit hex string
    hex <- unlist(strsplit("0123456789ABCDEF",split=""))
    return(paste(hex[(x-x%%16)/16+1],hex[x%%16+1],sep=""))
  }
  rgb <- rbind(col2rgb(color),trans)
  res <- paste("#",apply(apply(rgb,2,num2hex),2,paste,collapse=""),sep="")
  return(res)
}

childcare = read.csv.ffdf(file="child-care.csv", first.rows=500,next.rows=500,colClasses=NA,header=TRUE)
pcodes = read.csv.ffdf(file="zipcodeset.txt", first.rows=50000, next.rows=50000, colClasses=NA, header=FALSE)

childcare$PCODE_R = as.ff(as.factor(sub(" ","", childcare[,"PCODE"])))
names(pcodes) = c("PCODE","Lat","Long","City","Prov")

childcare = merge(childcare, as.ffdf(pcodes[,1:3]), by.x="PCODE_R", by.y="PCODE", all.x=TRUE)

childcare.gc = subset(childcare, !is.na(Lat))
childcare.worship = subset(childcare.gc, bldg_type == "Place of Worship")
childcare.house = subset(childcare.gc, bldg_type == "House")
childcare.community = subset(childcare.gc, bldg_type == "Community/Recreation Centre")
childcare.pschool = subset(childcare.gc, bldg_type == "Public Elementary School")
childcare.highrise = subset(childcare.gc, bldg_type == "High Rise Apartment")
childcare.purpose = subset(childcare.gc, bldg_type == "Purpose Built")

Fn = ecdf(childcare.worship[,"TOTSPACE"])
childcare.worship$TOTSPACE.pct = as.ff(Fn(childcare.worship[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.worship[,"Lat"], lon=childcare.worship[,"Long"])
PlotOnStaticMap(mymap, childcare.worship[,"Lat"], childcare.worship[,"Long"], cex=childcare.worship[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

Fn = ecdf(childcare.house[,"TOTSPACE"])
childcare.house$TOTSPACE.pct = as.ff(Fn(childcare.house[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.house[,"Lat"], lon=childcare.house[,"Long"])
PlotOnStaticMap(mymap, childcare.house[,"Lat"], childcare.house[,"Long"], cex=childcare.house[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

Fn = ecdf(childcare.community[,"TOTSPACE"])
childcare.community$TOTSPACE.pct = as.ff(Fn(childcare.community[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.community[,"Lat"], lon=childcare.community[,"Long"])
PlotOnStaticMap(mymap, childcare.community[,"Lat"], childcare.community[,"Long"], cex=childcare.community[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

Fn = ecdf(childcare.pschool[,"TOTSPACE"])
childcare.pschool$TOTSPACE.pct = as.ff(Fn(childcare.pschool[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.pschool[,"Lat"], lon=childcare.pschool[,"Long"])
PlotOnStaticMap(mymap, childcare.pschool[,"Lat"], childcare.pschool[,"Long"], cex=childcare.pschool[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

Fn = ecdf(childcare.highrise[,"TOTSPACE"])
childcare.highrise$TOTSPACE.pct = as.ff(Fn(childcare.highrise[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.highrise[,"Lat"], lon=childcare.highrise[,"Long"])
PlotOnStaticMap(mymap, childcare.highrise[,"Lat"], childcare.highrise[,"Long"], cex=childcare.highrise[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

Fn = ecdf(childcare.purpose[,"TOTSPACE"])
childcare.purpose$TOTSPACE.pct = as.ff(Fn(childcare.purpose[,"TOTSPACE"]))
mymap = MapBackground(lat=childcare.purpose[,"Lat"], lon=childcare.purpose[,"Long"])
PlotOnStaticMap(mymap, childcare.purpose[,"Lat"], childcare.purpose[,"Long"], cex=childcare.purpose[,"TOTSPACE.pct"]*4, pch=21, bg=addTrans("purple",100))

space.by.bldg_type = ddply(as.data.frame(childcare.gc), .(bldg_type), function (x) c(
  min.space = min(x[,"TOTSPACE"], na.rm=TRUE),
  average.space = mean(x[,"TOTSPACE"], na.rm=TRUE),
  median.space = median(x[,"TOTSPACE"], na.rm=TRUE),
  max.space = max(x[,"TOTSPACE"], na.rm=TRUE),
  tot_daycares = sum(!is.na(x[,"TOTSPACE"]))))
space.by.bldg_type = space.by.bldg_type[order(-space.by.bldg_type$tot_daycares),]

                                bldg_type min.space average.space median.space max.space tot_daycares
18               Public Elementary School        15      74.19355         69.0       217          279
17                       Place of Worship         8      48.46552         44.0       167          116
16                                  Other        14      51.17647         48.5       160          102
1              Catholic Elementary School        16      51.50000         49.5       112           76
9                     High Rise Apartment        20      68.56522         62.0       145           69
22                          Purpose Built        20      72.48276         59.5       165           58
8             Community/Recreation Centre        13      63.73333         60.0       146           45
11                                  House        10      49.84211         44.5       116           38
6                     Commercial Building        16      55.95833         51.5       129           24
15                        Office Building        20      69.69565         64.0       162           23
20                     Public High School        16      42.36842         41.0        60           19
21                 Public School (Closed)        22      70.26667         56.0       180           15
4                                  Church        13      51.90909         46.0       148           11
19      Public Elementary School (French)        36      84.71429         70.0       167            7
23                              Synagogue        24      64.00000         61.0       108            7
7            Community College/University        15      55.16667         59.5        78            6
14                     Low Rise Apartment        15      56.00000         62.0        92            6
2      Catholic Elementary School(French)        39      81.20000         76.0       130            5
5  City owned Community/Recreation Centre        28      65.80000         62.0       103            5
3                    Catholic High School        36      51.50000         54.0        62            4
12                                 HUMSRV        45      52.00000         52.0        59            2
13                    Industrial Building        45     109.00000        109.0       173            2
26              Private Elementary School        20     154.50000        154.5       289            2
10                 Hospital/Health Centre        25      25.00000         25.0        25            1
24                                              109     109.00000        109.0       109            1
25            Coomunity/Recreation Centre       156     156.00000        156.0       156            1
27                   Public Middle School        10      10.00000         10.0        10            1
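As a side note on the semi-transparent purple fill used for the dots: the same kind of colour string can probably be produced with base R alone. The sketch below assumes alpha is given on the same 0 to 255 scale as in `addTrans`; `add_alpha` is a hypothetical helper name and is not part of the original script.

```r
# Base-R sketch of a semi-transparent colour, assuming alpha on a 0-255 scale
# (add_alpha is a hypothetical helper name, not part of the original code)
add_alpha <- function(color, alpha) {
  # col2rgb() gives a 3 x n matrix of red/green/blue values in 0-255;
  # transposed, rgb() accepts it as a three-column matrix
  rgb(t(col2rgb(color)), alpha = alpha, maxColorValue = 255)
}

fill <- add_alpha("purple", 100)
print(fill)
```

The returned value is an `#RRGGBBAA` hex string that can be passed to the `bg` argument of a plotting call in the same way as the output of `addTrans`.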



Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
