
Building a Statistical Significance Testing Web Service with R

R is a programming language focused on statistical and mathematical computation. R programs often operate on large in-memory data sets, which feels somewhat similar to database programming. Examples in the R Cookbook bear a resemblance to functional programming in Clojure, as others have noted.
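To make the comparison concrete, here is a small sketch using only base R. The data frame and its column names are made up for illustration. The subset-and-aggregate step reads much like a SQL query with WHERE and GROUP BY, and the last lines use Map and Reduce in a functional style.

```r
# Hypothetical in-memory data set (names invented for this example).
visits <- data.frame(
  variant   = c("a", "a", "b", "b", "b"),
  converted = c(1, 0, 1, 1, 0),
  seconds   = c(30, 12, 45, 50, 8)
)

# Roughly: SELECT variant, AVG(converted) FROM visits
#          WHERE seconds > 10 GROUP BY variant
engaged <- subset(visits, seconds > 10)
rates <- aggregate(converted ~ variant, data = engaged, FUN = mean)
print(rates)

# Functional style: map a function over a sequence, then fold the results.
squares <- Map(function(x) x^2, 1:4)
total <- Reduce(`+`, squares)
print(total)
```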

I’ve been exploring the language to gain insight into related but disparate technologies that I use regularly (e.g. Postgres), but for this to be really useful, I’d like to see R behind a web service. Looking through the official website, there are many defunct attempts at using R in this manner, often abandoned once the maintainer finished their master’s degree.

A couple have survived, notably Rook and rApache. Rook is a web server that runs inside R, and rApache, as you might guess, is an Apache module that calls R. I’ve chosen rApache because I’d like a battle-tested front end for this. While R seems to have very committed maintainers, there do not seem to be many of them, and I have yet to find examples of anyone running this as a production application.

Inspired by WolframAlpha’s APIs, I built a small web service to test statistical significance. In the future I intend to test its performance and security and to evaluate the available JSON libraries.

Here is the installation procedure:

apt-get update
apt-get upgrade
apt-get install r-base r-base-dev
apt-get install apache2-mpm-prefork apache2-prefork-dev
apt-get install git-core
git clone https://github.com/jeffreyhorner/rapache.git
cd rapache
./configure
make
make test
make install
vi /etc/apache2/httpd.conf

Apache configuration settings:

 
LoadModule R_module /usr/lib/apache2/modules/mod_R.so
 
<Location /RApacheInfo>
SetHandler r-info
</Location>
 
ROutputErrors
 
<Directory /var/www/R>
        SetHandler r-script
        RHandler sys.source
</Directory>

Then restart Apache:

/etc/init.d/apache2 restart

And these are the contents of ws.R:

 
setContentType("application/json")
 
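# z-score for the difference between two proportions.
# p and pc are the two observed rates (pc presumably the control);
# N and Nc are the corresponding sample sizes.
zscore <- function(p, pc, N, Nc) {
  (p - pc) / sqrt(p * (1 - p) / N + pc * (1 - pc) / Nc)
}

# One-tailed test: 1.65 is roughly the 95th percentile of the standard normal.
significant <- function(p, pc, N, Nc) {
  zscore(p, pc, N, Nc) > 1.65
}

# Reject missing or overly long arguments before converting them.
valid <- function(x) { !is.null(x) && nchar(x) < 10 }

if (!valid(GET$pc)
 || !valid(GET$p)
 || !valid(GET$N)
 || !valid(GET$Nc)) {
  cat('error:arg length')
} else {
  cat(significant(as.numeric(GET$p),
                  as.numeric(GET$pc),
                  as.numeric(GET$N),
                  as.numeric(GET$Nc)))
}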
 
OK
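
The script declares application/json but prints R's bare TRUE or FALSE, which is not valid JSON. One possible way to emit a well-formed body without a JSON library is sketched below. The helper names to_json_result and to_json_error and the field names are hypothetical, not part of rApache or of the original script.

```r
# Hypothetical helper: build a small JSON object by hand with sprintf.
to_json_result <- function(z, threshold = 1.65) {
  sprintf('{"zscore": %.4f, "significant": %s}',
          z, if (z > threshold) "true" else "false")
}

# Hypothetical helper for the error case.
# Assumes msg contains no quotes or backslashes (no escaping is done).
to_json_error <- function(msg) {
  sprintf('{"error": "%s"}', msg)
}

payload <- to_json_result(3.4559)
cat(payload, "\n")
cat(to_json_error("arg length"), "\n")
```

Inside ws.R, the strings returned by these helpers would be passed to cat in place of the current output.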

For instance, the output of http://localhost:8080/R/ws.R?p=.15&pc=.10&N=1000&Nc=1100 is “TRUE”.
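
The arithmetic behind that result can be reproduced in a plain R session, outside Apache. The sketch below restates the z-score formula from ws.R and, as an addition that is not in the original script, converts it to a one-tailed p-value with pnorm.

```r
# Same formula as in ws.R: z-score for the difference of two proportions.
zscore <- function(p, pc, N, Nc) {
  (p - pc) / sqrt(p * (1 - p) / N + pc * (1 - pc) / Nc)
}

z <- zscore(p = 0.15, pc = 0.10, N = 1000, Nc = 1100)
print(z)          # about 3.46, well above the 1.65 cutoff

# One-tailed p-value under the standard normal (not part of ws.R).
p_value <- pnorm(z, lower.tail = FALSE)
print(p_value)

# The 1.65 cutoff is close to the 95th percentile of the standard normal.
print(qnorm(0.95))
```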
