# R: Apply a Custom Function Across Multiple Lists

While continuing to play around with R, I wanted to map a custom function over two lists, comparing each item in the first with the corresponding item in the second.

If we just want to use a built-in function such as subtraction between two lists (strictly speaking, vectors created with `c()`), it’s quite easy to do:

```
> c(10,9,8,7,6,5,4,3,2,1) - c(5,4,3,4,3,2,2,1,2,1)
[1] 5 5 5 3 3 3 2 2 0 0
```

I wanted a slight variation on that: instead of returning the difference itself, I wanted to return a text value describing it, e.g. ‘5 or more’, ‘3 to 5’ etc.

I spent a long time trying to figure out how to do that before finding an excellent blog post which describes all the different ‘apply’ functions available in R.

As far as I understand, the ‘apply’ family is the equivalent of ‘map’ in Clojure or other functional languages.
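To make the comparison with ‘map’ a bit more concrete, here is a small sketch using only base R. The doubling and adding functions are made up for illustration; `sapply`, `Map` and `mapply` are the real base R functions.

```r
# sapply maps a function over one vector and simplifies the result to a vector
doubled <- sapply(c(1, 2, 3), function(x) x * 2)

# Map walks several vectors in parallel and always returns a list
summedList <- Map(function(x, y) x + y, c(1, 2, 3), c(4, 5, 6))

# mapply also walks several vectors in parallel,
# but simplifies the result to a vector where it can
summed <- mapply(function(x, y) x + y, c(1, 2, 3), c(4, 5, 6))

print(doubled)
print(summed)
```

The main practical difference between `Map` and `mapply` in this sketch is the shape of the result: a list versus a simplified vector.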

In this case we want the mapply variant, which calls our function once per pair of corresponding items. We can use it like so:

```
> mapply(function(x, y) {
    if((x-y) >= 5) {
      "5 or more"
    } else if((x-y) >= 3) {
      "3 to 5"
    } else {
      "less than 5"
    }
  }, c(10,9,8,7,6,5,4,3,2,1),c(5,4,3,4,3,2,2,1,2,1))
[1] "5 or more"   "5 or more"   "5 or more"   "3 to 5"      "3 to 5"      "3 to 5"      "less than 5"
[8] "less than 5" "less than 5" "less than 5"
```

We could then pull that out into a function if we wanted:

```
summarisedDifference <- function(one, two) {
  mapply(function(x, y) {
    if((x-y) >= 5) {
      "5 or more"
    } else if((x-y) >= 3) {
      "3 to 5"
    } else {
      "less than 5"
    }
  }, one, two)
}
```

which we could call like so:

```
> summarisedDifference(c(10,9,8,7,6,5,4,3,2,1),c(5,4,3,4,3,2,2,1,2,1))
[1] "5 or more"   "5 or more"   "5 or more"   "3 to 5"      "3 to 5"      "3 to 5"      "less than 5"
[8] "less than 5" "less than 5" "less than 5"
```
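For what it’s worth, the same labelling can probably be done without mapply at all, because `ifelse` in base R already works on whole vectors. The following is only a sketch of that alternative, not what I used above, and `summarisedDifferenceVec` is a name made up for this example:

```r
# Hypothetical vectorised variant: ifelse tests every element at once,
# so no explicit mapply call is needed
summarisedDifferenceVec <- function(one, two) {
  difference <- one - two
  ifelse(difference >= 5, "5 or more",
         ifelse(difference >= 3, "3 to 5", "less than 5"))
}

labels <- summarisedDifferenceVec(c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1),
                                  c(5, 4, 3, 4, 3, 2, 2, 1, 2, 1))
print(labels)
```

The thresholds and labels are copied from the mapply version, so both should produce the same ten values for these inputs.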

I also wanted to be able to compare a list of items to a single item, which was much easier than I expected because mapply reuses the single item for every comparison:

```
> summarisedDifference(c(10,9,8,7,6,5,4,3,2,1), 1)
[1] "5 or more"   "5 or more"   "5 or more"   "5 or more"   "5 or more"   "3 to 5"      "3 to 5"
[8] "less than 5" "less than 5" "less than 5"
```
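This works because mapply recycles the shorter argument to match the longer one. A minimal sketch of that behaviour, with throwaway subtraction and addition functions chosen just for illustration:

```r
# The single value 1 is recycled and paired with each of 10, 9 and 8
againstOne <- mapply(function(x, y) x - y, c(10, 9, 8), 1)

# Recycling also applies when the shorter argument has more than one element:
# 1:4 is paired with 1, 2, 1, 2
paired <- mapply(function(x, y) x + y, 1:4, 1:2)

print(againstOne)
print(paired)
```

Here `againstOne` holds the three differences from 1, and `paired` shows the two-element vector being reused in order.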

If we wanted to get a summary of the differences between the lists, we could put them in a data frame and plug that into ddply from the plyr package like so:

```
> library(plyr)
> df = data.frame(x=c(10,9,8,7,6,5,4,3,2,1), y=c(5,4,3,4,3,2,2,1,2,1))
> ddply(df, .(difference=summarisedDifference(x,y)), summarise, count=length(x))
   difference count
1      3 to 5     3
2   5 or more     3
3 less than 5     4
```
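If plyr isn’t available, the same counts can probably be obtained with `table` from base R. This is a sketch of that alternative rather than the approach above; the labelling function is repeated so the snippet runs on its own:

```r
# Same labelling rules as summarisedDifference, repeated for a standalone example
summarisedDifference <- function(one, two) {
  mapply(function(x, y) {
    if ((x - y) >= 5) "5 or more" else if ((x - y) >= 3) "3 to 5" else "less than 5"
  }, one, two)
}

x <- c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
y <- c(5, 4, 3, 4, 3, 2, 2, 1, 2, 1)

# table counts how many times each label occurs
counts <- table(summarisedDifference(x, y))
print(counts)
```

`table` returns a named table object rather than a data frame; wrapping it in `as.data.frame` gives something closer to the ddply result.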

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.