
An Introduction to the R Language

by Giorgio Sironi · Feb. 02, 12 · Web Dev Zone · Interview
R is a language for statistical computing. In a world of big data and scientific approaches to startup ideas, it pays to have in your toolbox a tool for statistical analysis and mathematical computation that is more powerful in that domain than a general-purpose language.

Yet another language?

R is oriented to mathematics rather than general-purpose computation, and has many similarities with Matlab and Octave; for example, it is accessible to people without a computer science background. Moreover, it is open source.

The support for mathematical computation shows in the libraries included out of the box for managing distributions, estimation, and inference tests. R also has a simplified syntax for mathematical expressions (e.g. the ~ operator to specify a regression).
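
As a rough illustration of those bundled facilities, the sketch below uses only functions shipped with a standard R installation. The sample is randomly generated for the occasion and the variable names are invented for this example.

```r
# Sketch: distribution helpers, an inference test, and a formula (base R only).
set.seed(1)                          # arbitrary seed, only for reproducibility

# The d/p/q/r prefixes give density, cumulative probability, quantile, random draws
p_below_zero <- pnorm(0)             # P(Z <= 0) for a standard normal, i.e. 0.5
q_median     <- qnorm(0.5)           # the matching quantile, i.e. 0
draws        <- rnorm(100, mean = 10, sd = 2)   # hypothetical sample

# One-sample t-test of the hypothesis that the true mean is 10
test <- t.test(draws, mu = 10)
print(test$p.value)

# The ~ operator builds a formula object, read as "y is modelled by x"
f <- y ~ x
print(class(f))                      # "formula"
```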

R is still strong on other core concepts, unlike Matlab: it offers classes and objects (a la Clojure), anonymous functions and closures, and named parameters. It is a whole environment rather than just a language interpreter: bundled functions provide graphing capabilities (farewell gnuplot), and the shell has completion and history.
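
A minimal sketch of these language features follows. The names make_counter and person are made up for the illustration; the class uses R's S3 convention, which is only one of the object systems R provides.

```r
# A closure: make_counter() returns a function that remembers `count`.
make_counter <- function(start = 0) {
  count <- start
  function(step = 1) {
    count <<- count + step    # <<- updates the variable in the enclosing scope
    count
  }
}
counter <- make_counter(start = 10)   # named parameter
print(counter())                      # 11
print(counter(step = 5))              # 16

# An anonymous function passed to sapply()
squares <- sapply(1:3, function(x) x^2)
print(squares)                        # 1 4 9

# S3-style object: a list tagged with a class, plus a method for a generic
p <- structure(list(name = "Ada"), class = "person")   # hypothetical class
print.person <- function(x, ...) {
  cat("Person:", x$name, "\n")
  invisible(x)
}
print(p)                              # dispatches to print.person
```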

The practical side of R is taken care of by bundled utilities for reading data from files and databases (MySQL, Oracle, JDBC, ODBC), and for saving the results (e.g. the whole current workspace or single variables).
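
For the file side of this, a small sketch using base functions only is shown here. The data frame is made up and everything is written to temporary files; database access needs additional packages and is not shown.

```r
# Made-up data for the example
scores <- data.frame(name = c("a", "b"), value = c(1.5, 2.5))

# Write a CSV file and read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
restored <- read.csv(csv_path)

# Save a single variable and load it again
rds_path <- tempfile(fileext = ".rds")
saveRDS(scores, rds_path)
same_scores <- readRDS(rds_path)

# Save objects by name; save.image() would store the whole workspace instead
rdata_path <- tempfile(fileext = ".RData")
save(scores, file = rdata_path)
```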

Like Matlab and Octave, R can be used for quick prototyping; after an algorithm is implemented and validated, it can be translated into C or other languages. The reasons for the translation include better performance and easier portability of the code across machines, operating systems, and programmers.

Installation

The installation process depends on your system, but four Linux distributions are supported via repositories (with RPM and Deb packages).

People from other domains (like statistics) install R every day, so it's not like compiling the kernel or hunting for missing libraries. Everything is already compiled and automated, which is a plus compared with other niche languages that make you download different groups of JARs on the assumption that a programmer can handle it.

R basic syntax and data structures

Before beginning, you must know that R's assignment operator is <- and not =. The = operator will work in many cases, but <- is more general, as it can be used anywhere; it is the real equivalent of the assignment operator you have used in C-like languages.

> if (a = 4) 1
Error: unexpected '=' in "if (a ="
> if (a <- 4) 1
[1] 1

Moreover, as you can already see from the code above, there is no need for semicolons.
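
One place where the two operators differ is inside a function call. The sketch below is only an illustration with arbitrary variable names: there, = names an argument, while <- really assigns a variable.

```r
a = 4                          # at the top level, = works as an assignment
b <- 4                         # the conventional form

# Inside a call, = names the argument x of mean(); it does not assign a variable
m1 <- mean(x = c(1, 2, 3))

# Here <- assigns y first, then the value is passed on to mean()
m2 <- mean(y <- c(1, 2, 3))
print(y)                       # y now exists in the workspace

# Semicolons are optional, but can separate statements on one line
u <- 1; w <- 2
```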

Numbers in R are either numeric (which actually means double) or integer.

> answer <- 42
> class(answer)
[1] "numeric"
> answers <- c(42, 43)
> class(answers)
[1] "numeric"
> answers <- 42:43
> class(answers)
[1] "integer"
> class(as.integer(42))
[1] "integer"
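
A few more ways to see the difference between the two number types are sketched here; the variable names are arbitrary.

```r
x <- 42                   # a plain literal is a double
y <- 42L                  # the L suffix makes an integer literal

print(typeof(x))          # "double"
print(typeof(y))          # "integer"
print(is.integer(y))      # TRUE

# Ordinary division always yields a double, even for integer operands
half <- 5L / 2L           # 2.5
# Integer division and remainder keep the integer type
quot <- 5L %/% 2L         # 2
rem  <- 5L %% 2L          # 1
```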

Booleans are represented by the constants TRUE and FALSE:

> if (TRUE) 42
[1] 42
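
Comparisons produce booleans too, and they work element by element. A short sketch with made-up values:

```r
flags <- c(1, 5, 10) > 3       # FALSE TRUE TRUE
print(flags)

some_true <- any(flags)        # TRUE: at least one element is TRUE
all_true  <- all(flags)        # FALSE: the first element is FALSE
how_many  <- sum(flags)        # 2: TRUE counts as 1, FALSE as 0
negated   <- !flags            # TRUE FALSE FALSE

# && and || work on single values, as needed by if()
if (flags[2] && !flags[1]) print("second only")
```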

Strings are also a first-class type, with easier handling than with C libraries:

> message <- "hello"
> message
[1] "hello"
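
Some of the bundled string functions, as a quick sketch (the strings are arbitrary examples):

```r
greeting <- "hello"

n        <- nchar(greeting)               # 5 characters
shout    <- toupper(greeting)             # "HELLO"
sentence <- paste(greeting, "world")      # "hello world" (joined with a space)
compact  <- paste0(greeting, "!")         # "hello!" (no separator)
first3   <- substr(greeting, 1, 3)        # "hel"
parts    <- strsplit("a,b,c", ",")[[1]]   # "a" "b" "c"

print(sentence)
```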

Basically, every R variable is a vector, again as in Matlab/Octave; even scalars are just vectors of length 1. Vectors are created with the concatenation function c():

> my_vector <- c(1, 2, 3)
> my_vector
[1] 1 2 3
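
Because everything is a vector, arithmetic and indexing apply to whole vectors at once. A brief sketch, with arbitrary names:

```r
v <- c(1, 2, 3)

print(length(42))              # 1: a scalar is a vector of length 1

doubled <- v * 2               # 2 4 6, applied element by element
summed  <- v + c(10, 20, 30)   # 11 22 33

first   <- v[1]                # indices start at 1, not 0
big     <- v[v > 1]            # 2 3, selected with a logical vector
rest    <- v[-1]               # 2 3, a negative index drops that element
```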

Lists, however, can store values of any type, while vectors must be homogeneous; c() silently coerces mixed elements to a common type, here character:

> list(42, "a")
[[1]]
[1] 42

[[2]]
[1] "a"

> c(42, "a")
[1] "42" "a"

Moreover, both lists and vectors can act as maps, since their elements can be given string names and looked up by name.
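
A sketch of that map-like usage, with invented keys and values:

```r
# A named vector: all values share one type
ages <- c(alice = 30, bob = 25)
print(ages["bob"])             # a length-1 vector that keeps its name
bob_age <- ages[["bob"]]       # 25, the bare value

# A named list: values can have different types
config <- list(host = "localhost", port = 8080)
port <- config$port            # 8080, $ looks up by name
host <- config[["host"]]       # "localhost", [[ ]] does the same with a string

config$debug <- TRUE           # assigning to a new name adds an entry
print(names(config))           # "host" "port" "debug"
```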

Matrices and data frames are more complex structures. Matrices extend vectors to 2 dimensions, while data frames are similar tabular structures whose columns can hold values of different types. The difference between them mirrors the one between vectors and lists.
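
As a small sketch of both structures (the contents are made up for the example):

```r
# A matrix is filled column by column and holds a single type
m <- matrix(1:6, nrow = 2)
print(m)
print(dim(m))                  # 2 3: rows, then columns
corner <- m[2, 3]              # 6: row 2, column 3

# A data frame has one type per column
df <- data.frame(name  = c("x", "y"),
                 value = c(1.5, 2.5),
                 stringsAsFactors = FALSE)
print(sapply(df, class))       # name is "character", value is "numeric"

values   <- df$value           # a column is an ordinary vector
filtered <- df[df$value > 2, ] # rows selected with a logical condition
```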

There are many more types and useful functions bundled with the interpreter. If you come across an unknown call, type ?entity at the prompt to load the corresponding help page; entity can be a function or a type name.

A quick example: linear regression

R can seamlessly perform linear regression, a staple problem in statistics and machine learning. The operation consists of finding the parameters of a linear combination that best fits several samples of input and output variables. In our case, we want to find the parameters q and m in the model correlated_data = q + m * data.

> data <- c(1, 2, 3, 4)
> data
[1] 1 2 3 4
> mean(data)
[1] 2.5
> var(data)
[1] 1.666667
> sd(data)
[1] 1.290994
> correlated_data <- c(2, 4, 7, 7.5)
> fm <- lm(correlated_data ~ data)
> fm

Call:
lm(formula = correlated_data ~ data)

Coefficients:
(Intercept)         data  
       0.25         1.95  
> attributes(fm)
$names
 [1] "coefficients"  "residuals"     "effects"       "rank"         
 [5] "fitted.values" "assign"        "qr"            "df.residual"  
 [9] "xlevels"       "call"          "terms"         "model"        

$class
[1] "lm"
> fm$coefficients
(Intercept)        data
       0.25        1.95
> fm$coefficients['data']
data
1.95 
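
To check the fit above by hand and put it to use, here is a sketch that recomputes m and q with the closed-form least-squares formulas for a single input variable, then asks the model for a prediction. The new input value 5 is an arbitrary choice for the illustration.

```r
data <- c(1, 2, 3, 4)
correlated_data <- c(2, 4, 7, 7.5)
fm <- lm(correlated_data ~ data)

# Closed-form least squares for one predictor
m <- cov(data, correlated_data) / var(data)    # slope, 1.95
q <- mean(correlated_data) - m * mean(data)    # intercept, 0.25

# coef() is the accessor equivalent to fm$coefficients
print(coef(fm))

# Prediction for a new input: 0.25 + 1.95 * 5 = 10
predicted <- predict(fm, newdata = data.frame(data = 5))
print(predicted)

# Differences between the observed outputs and the fitted line
print(residuals(fm))
```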