
Installing NetCDF and R 'ncdf'


If you work with large, gridded datasets, you should probably be using NetCDF, the Network Common Data Form from Unidata:

NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.


Many high-end analysis packages can be made to support NetCDF, and it is indispensable for working with gridded datasets that weigh in at tens of gigabytes or more. This brief post describes the easiest way to install the NetCDF libraries and the R ‘ncdf’ package on our favorite systems: CentOS, Ubuntu and Mac OS X.

CentOS 6.x

CentOS is the operating system of choice if you want a free, robust, open-source server to host your scientific analysis. It is basically an unbranded clone of Red Hat Enterprise Linux. The following instructions worked on CentOS 6.2.

Installing System Libraries

First, go to http://fedoraproject.org/wiki/EPEL and check for the latest version of EPEL, the Extra Packages for Enterprise Linux repository (which contains NetCDF, HDF and many other useful packages). To download the EPEL release package and install NetCDF from the repository, execute the following commands, substituting the current release version if it has changed:

sudo wget http://mirror.metrocast.net/fedora/epel/6/i386/epel-release-6-8.noarch.rpm
sudo rpm -Uvh epel-release-6-8.noarch.rpm
sudo yum --assumeyes install netcdf
sudo yum --assumeyes install netcdf-devel

Note that adding EPEL as a package repository in the second line does not automatically install the packages it contains. We had to install netcdf and netcdf-devel explicitly. A list of other available packages is on the EPEL wiki linked above.
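
As a quick sanity check after the install, something like the following sketch can confirm that the header file and the command line tools landed where expected. The function name check_netcdf is our own invention, and both the default /usr/include location and the presence of ncdump in the netcdf package are assumptions about how the packages are laid out.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetCDF or EPEL): report whether the
# NetCDF header and the ncdump tool can be found.
# $1 = include directory to search (assumed default: /usr/include)
check_netcdf() {
    incdir="${1:-/usr/include}"
    rc=0
    if [ -f "$incdir/netcdf.h" ]; then
        echo "header: found $incdir/netcdf.h"
    else
        echo "header: netcdf.h not found in $incdir (is netcdf-devel installed?)"
        rc=1
    fi
    if command -v ncdump >/dev/null 2>&1; then
        echo "tools: found $(command -v ncdump)"
    else
        echo "tools: ncdump not on PATH (is netcdf installed?)"
        rc=1
    fi
    return "$rc"
}
```

Calling check_netcdf with no arguments checks /usr/include. A nonzero return status means at least one of the two pieces is missing.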

Installing R Package ‘ncdf’

With the libraries in place, we can now install the ncdf package for our favorite statistical package — R.

sudo wget http://cran.r-project.org/src/contrib/ncdf_1.6.6.tar.gz
sudo R CMD INSTALL --configure-args="--with-netcdf-include=/usr/include --with-netcdf-lib=/usr/lib" ncdf_1.6.6.tar.gz
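
The /usr/lib path above matches a 32-bit install. On a 64-bit system we would expect Red Hat style distributions to put libraries in /usr/lib64 instead, so a small sketch like the following could pick the directory from the machine architecture. The helper names netcdf_libdir and ncdf_configure_args are hypothetical, and the /usr/lib64 convention is an assumption worth checking on your own machine.

```shell
#!/bin/sh
# Hypothetical helper: guess the NetCDF library directory from the machine
# architecture. Assumes the Red Hat convention of /usr/lib64 on x86_64.
# $1 = architecture string (defaults to the output of `uname -m`)
netcdf_libdir() {
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64) echo "/usr/lib64" ;;
        *)      echo "/usr/lib" ;;
    esac
}

# Build the --configure-args value passed to R CMD INSTALL.
# $1 = optional architecture string, forwarded to netcdf_libdir
ncdf_configure_args() {
    echo "--with-netcdf-include=/usr/include --with-netcdf-lib=$(netcdf_libdir "$1")"
}

# Print (rather than run) the resulting install command for inspection.
echo "R CMD INSTALL --configure-args=\"$(ncdf_configure_args)\" ncdf_1.6.6.tar.gz"
```

The script only prints the command, so nothing is installed until you copy it and run it with sudo yourself.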

Ubuntu 12.04

Ubuntu is easy to install and has a great user interface for a Linux system. Ubuntu 12.04 is the most recent Long Term Support (LTS) release.

Installing System Libraries

The following instructions worked on Ubuntu 12.04 LTS. To install the NetCDF libraries that allow reading, writing and manipulation of NetCDF files, use apt-get rather than downloading the source files and building them yourself. To install, open a terminal and type:

sudo apt-get install netcdf-bin libnetcdf-dev

Installing R package ‘ncdf’

The base version of R on Ubuntu 12.04 LTS is 2.14.1. Unfortunately, clicking the install button in RStudio and typing 'ncdf' installs the package only at the user level. The package will not be installed for all users or even show up in all of your RStudio projects. To install ncdf in the global library, you must start R as root (for example with sudo R) and use the following command:

install.packages(repos=c('http://cran.fhcrc.org/'),pkgs=c('ncdf'),lib="/usr/lib/R/site-library/")

‘http://cran.fhcrc.org/’ should be replaced by whichever CRAN mirror is closest to you.
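
If you would rather not open an interactive R session as root, the same call can probably be driven from the shell with Rscript. The sketch below only assembles and prints the command. The function name ncdf_install_expr is hypothetical, and the default mirror and library path are simply the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: build the install.packages() call shown above
# for a given CRAN mirror and library directory.
# $1 = CRAN mirror URL, $2 = library path (defaults taken from the article)
ncdf_install_expr() {
    mirror="${1:-http://cran.fhcrc.org/}"
    libdir="${2:-/usr/lib/R/site-library/}"
    printf "install.packages(pkgs='ncdf', repos='%s', lib='%s')\n" "$mirror" "$libdir"
}

# Print the command a user with sudo rights could run; nothing is installed here.
echo "sudo Rscript -e \"$(ncdf_install_expr)\""
```

Passing a different mirror as the first argument, such as the one closest to you, changes only the repos value in the printed command.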

OS X 10.8.4

Macs run OS X, which is Unix-based. The following instructions worked on OS X 10.8.4 (Mountain Lion).

Installing System Libraries

The absolute easiest way to install NetCDF on a Mac requires MacPorts. MacPorts is a package manager designed to make compiling and installing software easy. The MacPorts .pkg installer and installation instructions are available from the MacPorts website.

Once MacPorts is installed, building and installing the NetCDF libraries is a one-step job.

sudo port install netcdf

More details and instructions for installing Fortran and Python APIs are available here.

Installing R Package ‘ncdf’

The NetCDF command line tools are not required to use the ncdf R package. Simply install the package from CRAN, or click the “install packages” button in RStudio. This package allows reading, writing and manipulation of .nc files.

However, the package’s ability to view the contents of .nc files before loading them into the R workspace is limited. For this reason, installing the NetCDF tools as described under “Installing System Libraries” above is still extremely important. Command line tools such as “ncdump” are crucial to working effectively with NetCDF files.
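
To give a feel for how ncdump helps here, the sketch below wraps `ncdump -h`, which prints only a file's header (dimensions, variables and attributes) without dumping the data. The wrapper name nc_header and its return codes are our own choices, not part of the NetCDF tools.

```shell
#!/bin/sh
# Hypothetical wrapper: print only the header of a NetCDF file
# using `ncdump -h`.
# Returns 1 if the file is missing, 2 if ncdump is not installed.
nc_header() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "nc_header: no such file: $file" >&2
        return 1
    fi
    if ! command -v ncdump >/dev/null 2>&1; then
        echo "nc_header: ncdump not found on PATH" >&2
        return 2
    fi
    ncdump -h "$file"
}
```

For example, nc_header mydata.nc (where mydata.nc stands in for one of your own files) would list what the file contains before you decide which variables to load into R.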




Published at DZone with permission of Jonathan Callahan, DZone MVB. See the original article here.
