Over a million developers have joined DZone.

Use GDAL from R Console to Split Raster into Tiles

· Big Data Zone

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

When working with raster datasets I often encounter performance issues caused by the large filesizes. I thus wrote up a little R function that invokes gdal_translate which would split the raster into parts which makes subsequent processing more CPU friendly. I didn't use built-in R functions simply because performance is much better when using gdal from the command line...

The screenshot to the left shows a raster in QGIS that was split into four parts with the below script.

## get filesnames (assuming the datasets were downloaded already.
## please see http://thebiobucket.blogspot.co.at/2013/06/use-r-to-bulk-download-digital.html
## on how to download high-resolution DEMs)
files <- dir(pattern = ".hgt")
## function for single file processing mind to replace the PATH to gdalinfo.exe!
## s = division applied to each side of raster, i.e. s = 2 gives 4 tiles, 3 gives 9, etc.
split_raster <- function(file, s = 2) {
    filename <- gsub(".hgt", "", file)
    gdalinfo_str <- paste0("\"C:/OSGeo4W64/bin/gdalinfo.exe\" ", file)
    # pick size of each side
    x <- as.numeric(gsub("[^0-9]", "", unlist(strsplit(system(gdalinfo_str, intern = T)[3], ", "))))[1]
    y <- as.numeric(gsub("[^0-9]", "", unlist(strsplit(system(gdalinfo_str, intern = T)[3], ", "))))[2]
    # t is nr. of iterations per side
    t <- s - 1
    for (i in 0:t) {
        for (j in 0:t) {
            # [-srcwin xoff yoff xsize ysize] src_dataset dst_dataset
            srcwin_str <- paste("-srcwin ", i * x/s, j * y/s, x/s, y/s)
            gdal_str <- paste0("\"C:/OSGeo4W64/bin/gdal_translate.exe\" ", srcwin_str, " ", "\"", file, "\" ", "\"", filename, "_", i, "_", j, ".tif\"")

## process all files and save to same directory
mapply(split_raster, files, 2) 

Hortonworks Sandbox is a personal, portable Apache Hadoop® environment that comes with dozens of interactive Hadoop and it's ecosystem tutorials and the most exciting developments from the latest HDP distribution, brought to you in partnership with Hortonworks.


Published at DZone with permission of Kay Cichini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}