R: Modelling a Conversion Rate with a Binomial Distribution

As part of some work Sid and I were doing last week, we wanted to simulate the conversion rate for an A/B test we were planning.

We started with the following function, which returns simulated conversion rates for two variants, A and B, each with a true conversion rate of 12%:

generateConversionRates <- function(sampleSize) {
	# rbinom's first argument is the number of draws; each draw is a single
	# trial (size = 1) that converts with probability 0.12
	sample_a <- rbinom(sampleSize, 1, 0.12)
	conversion_a <- sum(sample_a == 1) / sampleSize

	sample_b <- rbinom(sampleSize, 1, 0.12)
	conversion_b <- sum(sample_b == 1) / sampleSize

	# observed conversion rates for A and B
	c(conversion_a, conversion_b)
}

If we call it:

> generateConversionRates(10000)
[1] 0.1230 0.1207

We have a 12.3% conversion rate on A and a 12.07% conversion rate on B based on 10,000 sample values.
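
Since the number of conversions in n independent trials with the same probability follows a binomial distribution, the same kind of rate could also be simulated with a single draw of the conversion count. Here is a minimal sketch of that idea, assuming the same 12% rate; `simulateRate` is a hypothetical helper name and the seed is arbitrary, neither comes from our original code:

```r
# Hypothetical helper: simulate one observed conversion rate by drawing
# the number of conversions directly from Binomial(sampleSize, rate).
simulateRate <- function(sampleSize, rate = 0.12) {
  conversions <- rbinom(1, size = sampleSize, prob = rate)
  conversions / sampleSize
}

set.seed(42)  # arbitrary seed, only so the run is repeatable
rates <- c(a = simulateRate(10000), b = simulateRate(10000))
print(rates)
```

This avoids building a vector of 10,000 zeros and ones just to count the ones, which matters more as the sample size grows.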

We then wrote the following function to generate 1,000 pairs of those conversion rates:

generateSample <- function(sampleSize) {
	lapply(seq(1, 1000), function(x) generateConversionRates(sampleSize))
}

We can call that like this (the output shows only the last three of the 1,000 results):

> generateSample(10000)
[[998]]
[1] 0.1179 0.1216
 
[[999]]
[1] 0.1246 0.1211
 
[[1000]]
[1] 0.1248 0.1234
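
A list of 1,000 pairs is awkward to summarise, so one option is to bind the pairs into a matrix and look at the spread of each column. The following is a sketch under assumptions rather than what we actually ran: the column names, the seed, and the choice of the middle 95% are all illustrative, and the list is rebuilt here so the snippet runs on its own.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable
sampleSize <- 10000

# Rebuild a list of 1000 (A, B) rate pairs, each from two binomial draws.
samples <- lapply(seq_len(1000), function(i) {
  rbinom(2, size = sampleSize, prob = 0.12) / sampleSize
})

# Stack the pairs into a 1000 x 2 matrix: one row per simulated test.
ratesMatrix <- do.call(rbind, samples)
colnames(ratesMatrix) <- c("a", "b")

# Middle 95% of the simulated conversion rates for each variant.
spread <- apply(ratesMatrix, 2, quantile, probs = c(0.025, 0.975))
print(spread)
```

The two rows of `spread` give a rough idea of how far an observed rate can drift from 12% through sampling noise alone at this sample size.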

We then used these conversion rates to try to work out how many samples an A/B test would need for us to be reasonably confident that its result reflected the population.
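
One way such a question could be explored is to see how large a gap between A and B shows up purely by chance at different sample sizes. This is only a rough sketch of that idea, not the analysis we were attempting: `noiseGap` is a hypothetical helper, and the sample sizes, the 95th percentile cut-off, and the seed are assumptions chosen for illustration.

```r
# Hypothetical helper: 95th percentile of |rate_A - rate_B| across `runs`
# simulated tests in which both variants truly convert at `rate`.
noiseGap <- function(sampleSize, rate = 0.12, runs = 1000) {
  a <- rbinom(runs, size = sampleSize, prob = rate) / sampleSize
  b <- rbinom(runs, size = sampleSize, prob = rate) / sampleSize
  unname(quantile(abs(a - b), probs = 0.95))
}

set.seed(7)  # arbitrary seed, only so the run is repeatable
sizes <- c(100, 1000, 10000, 100000)
gaps <- sapply(sizes, noiseGap)
print(data.frame(sampleSize = sizes, gap95 = gaps))
```

A difference between A and B smaller than the gap reported for a given sample size would be hard to tell apart from noise, so the table gives a feel for how many samples a test of a particular effect size might need.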

We actually ended up abandoning that exercise, but I thought I’d record the code because I found it pretty interesting.

Published at DZone with permission of
