Unit Testing in R
R is a statistical programming language with a strong focus on mathematical operations. When writing math-heavy code, unit testing becomes very appealing: while equations may look correct on paper, one minor error can ruin the output.
R programming also differs from CRUD or enterprise software in that R's in-memory data structures are often used much like a database. Used this way, the code is more self-contained, and integration tests that need specific data set-up can be easier to manage than tests that need pre-configured data in a database (a requirement, for instance, when testing some parts of ETL scripts).
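As a rough illustration of that point, a test can build its own data frame in memory and assert on the result directly, with no database to prepare. The `orders` data and its column names are made up for this sketch, and it uses only base R.

```r
# Hypothetical fixture: the data lives in memory, so no database set-up is needed
orders <- data.frame(
  customer = c("a", "a", "b"),
  amount   = c(10, 5, 7),
  stringsAsFactors = FALSE
)

# Code under test: total amount per customer
totals <- aggregate(amount ~ customer, data = orders, FUN = sum)

# Assert on the in-memory result (all.equal allows for floating point noise)
stopifnot(isTRUE(all.equal(totals$amount, c(15, 7))))
```

Because the fixture is created inside the script, each run starts from the same known state.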
R has a unit testing package called RUnit, based on the JUnit 3.x APIs, which even includes code coverage. As in JUnit, test functions start with the word “test”, and there are setup and teardown functions (.setUp and .tearDown) that run before and after each test.
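A minimal sketch of what such a test file might look like is shown below. The function under test (`celsius_to_fahrenheit`) and the `fixture` environment are hypothetical names invented for this example; `.setUp`, `.tearDown`, and `checkEquals` are RUnit conventions.

```r
library(RUnit)

# Hypothetical function under test
celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32

# Hypothetical shared fixture, filled before and emptied after each test
fixture <- new.env()

.setUp <- function() {
  fixture$temps <- c(0, 100)
}

.tearDown <- function() {
  rm(list = ls(fixture), envir = fixture)
}

# The name starts with "test", so the default function pattern picks it up
test.freezing_and_boiling <- function() {
  checkEquals(c(32, 212), celsius_to_fahrenheit(fixture$temps))
}
```

When the file is run as part of a suite, RUnit calls `.setUp` and `.tearDown` around each test function, so tests do not leak state into one another.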
Unlike JUnit and NUnit, there is no IDE integration, nor is there a specialized tool; tests are simply run through the R REPL, which gives you some control at the expense of convenience. Unfortunately, there are no one-click installs for CI servers like Jenkins: if you wish to run tests automatically and track the results over time, you have to work out some command-line integration yourself.
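One possible shape for that command-line integration, sketched under assumptions: a small runner script that a CI job invokes with Rscript and that exits with a non-zero status when anything fails. The helper names `run_suite` and `exit_status` and the script name are hypothetical; `defineTestSuite`, `runTestSuite`, and `getErrors` are RUnit functions.

```r
library(RUnit)

# Hypothetical helper: run every test file found in a directory
run_suite <- function(dir) {
  suite <- defineTestSuite("ci", dirs = dir, testFileRegexp = "^.+\\.r$")
  runTestSuite(suite)
}

# Hypothetical helper: map RUnit's summary counts to a process exit status
exit_status <- function(test.result) {
  errs <- getErrors(test.result)
  if (errs$nErr > 0 || errs$nFail > 0) 1L else 0L
}

# In a runner script (say run_tests.r, started with `Rscript run_tests.r`)
# the last line could be:
# quit(status = exit_status(run_suite("tests")))
```

Most CI servers treat a non-zero exit status as a failed build, which is usually enough to get pass/fail tracking without a dedicated plugin.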
Unit testing is not designed for fuzzy operations that have natural failures; for a machine learning exercise you would ideally track accuracy, recall, precision, and similar metrics. This package may be a useful starting point for that, but it would require custom development on top to be really valuable for these types of problems.
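To give a sense of what that custom development might involve, the sketch below computes the three metrics from a set of predictions and asserts that each stays above a threshold. The helper `classification_metrics`, the labels, and the 0.7 thresholds are all invented for illustration.

```r
library(RUnit)

# Hypothetical helper: basic classification metrics from 0/1 vectors
classification_metrics <- function(actual, predicted) {
  tp <- sum(actual == 1 & predicted == 1)  # true positives
  fp <- sum(actual == 0 & predicted == 1)  # false positives
  fn <- sum(actual == 1 & predicted == 0)  # false negatives
  list(
    accuracy  = mean(actual == predicted),
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn)
  )
}

# Made-up labels and predictions
actual    <- c(1, 1, 1, 0, 0, 0, 1, 0)
predicted <- c(1, 1, 0, 0, 0, 1, 1, 0)

# The test tolerates some wrong predictions but fails if quality drops too far
test.model_quality <- function() {
  m <- classification_metrics(actual, predicted)
  checkTrue(m$accuracy  >= 0.7)
  checkTrue(m$precision >= 0.7)
  checkTrue(m$recall    >= 0.7)
}
```

A threshold check like this only gives a pass or fail; tracking how the metrics move over time would still need separate reporting.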
The following code will run a test suite:
library(RUnit)
source('chords.r')
test.suite <- defineTestSuite("example",
                              dirs = file.path("tests"),
                              testFileRegexp = '^.+\\.r$')
test.result <- runTestSuite(test.suite)
printTextProtocol(test.result)
This prints a result like this:
Number of test functions: 168
Number of errors: 2
Number of failures: 166
By default R does not print stack traces when there is an error, so the following may be helpful:
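One common base R idiom for this, offered here as an assumption rather than as the author's exact snippet, is to install an error handler that prints the call stack whenever an error reaches the top level:

```r
# Assumed idiom (base R): print the call stack when an uncaught error occurs.
# traceback(2) prints the current stack, skipping the two innermost calls.
options(error = function() traceback(2))
```

This applies to errors that reach the top level, for example while sourcing the code under test in an interactive session. Errors raised inside test functions are caught and recorded by RUnit itself, so they show up in the test protocol instead.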
RUnit provides a series of “check” functions, for test result assertions:
checkTrue(x > 0)
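A few of the other check functions are sketched below. The test function names and the values being compared are arbitrary examples; the check functions themselves are part of RUnit.

```r
library(RUnit)

test.numeric_checks <- function() {
  # checkEquals compares with a small numeric tolerance (it uses all.equal)
  checkEquals(0.3, 0.1 + 0.2)
  # checkEqualsNumeric is the variant intended for numeric values
  checkEqualsNumeric(2, sqrt(2)^2)
}

test.strict_and_error_checks <- function() {
  # checkIdentical requires an exact match, including type
  checkIdentical(1:3, c(1L, 2L, 3L))
  # checkException passes only if the expression raises an error
  checkException(sqrt("a"), silent = TRUE)
}
```

Each check returns TRUE when it passes and raises an error when it does not, which is how RUnit records a failure.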
The results are available as text or HTML, or you can inspect the R object, for custom output.
printTextProtocol(test.result)
str(test.result)

List of 1
 $ example:List of 8
  ..$ nTestFunc : num 168
  ..$ nDeactivated : int 0
  ..$ nErr : num 2
  ..$ nFail : num 166
  ..$ dirs : chr "tests"
  ..$ testFileRegexp : chr "^.+\\.r$"
  ..$ testFuncRegexp : chr "^test.+"
  ..$ sourceFileResults:List of 1
  .. ..$ tests/wiki-chords.r:List of 168
  .. .. ..$ testA :List of 4
  .. .. .. ..$ kind : chr "failure"
  .. .. .. ..$ msg : chr "Error in checkTrue(cmp) : Test not TRUE\n\n"
  .. .. .. ..$ checkNum : num 1
  .. .. .. ..$ traceBack: NULL
  .. .. ..$ testA7 :List of 4
  .. .. .. ..$ kind : chr "failure"
  .. .. .. ..$ msg : chr "Error in checkTrue(cmp) : Test not TRUE\n\n"
  .. .. .. ..$ checkNum : num 1
  .. .. .. ..$ traceBack: NULL
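As a sketch of the HTML and custom-output options, the example below builds a throwaway two-test suite in a temporary directory, writes an HTML report, and walks the result object to list the failing tests. The suite name `demo`, the file names, and the two test functions are made up; the list layout it relies on is the one shown in the `str` output above.

```r
library(RUnit)

# Hypothetical throwaway suite written to a temporary directory
test.dir <- tempfile("tests")
dir.create(test.dir)
writeLines(c(
  "test.addition <- function() checkEquals(4, 2 + 2)",
  "test.failing  <- function() checkTrue(1 > 2)"
), file.path(test.dir, "example.r"))

suite  <- defineTestSuite("demo", dirs = test.dir, testFileRegexp = "^.+\\.r$")
result <- runTestSuite(suite)

# HTML report, written next to the other temporary files
report <- file.path(tempdir(), "runit-report.html")
printHTMLProtocol(result, fileName = report)

# Custom output: collect the names of tests whose kind is "failure"
file.results <- result$demo$sourceFileResults[[1]]
kinds  <- vapply(file.results, function(t) t$kind, character(1))
failed <- names(kinds)[kinds == "failure"]
```

The same traversal could feed a CSV file or a dashboard if the text and HTML protocols are not enough.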
The full test suite for this example is on GitHub, as it’s a bit long for a blog post.
Published at DZone with permission of Gary Sieling, DZone MVB. See the original article here.