DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world
Weighted Random Array Selection
Say you have an array of ['apples', 'oranges', 'bananas', 'kiwis']
Selecting a random item is simple enough
arr = ['apples', 'oranges', 'bananas', 'kiwis'] arr[rand(arr.size)]
But say you want to have apples come up 50% of the time, oranges 25%, bananas 15% and kiwis 10% ...
class Array
#sum (and mean) found on http://snippets.dzone.com/posts/show/2161
def sum
inject( nil ) { |sum,x| sum ? sum+x : x }
end
def probability(spread = 2)
z = 1.0
collect {|x| z = z / spread}
end
def weighted_random_index(probability_array = probability(2) )
size.times do |x|
return x if rand < probability_array[0..x].sum
end
return 0
end
def get_weighted_random_item(probability_array = probability(2))
self[weighted_random_index(probability_array)]
end
def get_weighted_random_indexes(num_items, p = probability(2))
res = []
escape = 1000
while res.size < num_items and escape > 0
escape -= 1
tmp = weighted_random_index(p)
res << tmp unless res.include?(tmp)
end
return res.sort
end
end
arr = ['apples', 'oranges', 'bananas', 'kiwis']
10.times do |i|
puts arr.get_weighted_random_item([0.5, 0.25, 0.15, 0.10])
end
oranges
oranges
apples
apples
bananas
apples
apples
bananas
oranges
apples
I'm sure there's some math term for this but I'm calling it 'spread' until I learn otherwise. I put in a default probability array of p = p / 2 which gives you (for an array of 4)
arr.probability => [0.5, 0.25, 0.125, 0.0625]
On larger arrays the probabilities will get (for all practicle purposes) zero quickly
arr = [1,2,3,4,5,6,7,8,9] => [1, 2, 3, 4, 5, 6, 7, 8, 9] arr.probability => [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125]
I put in a simple spread concept to spread out the probabilities a bit more... the higher the spread, the more even the probability distribution (this varies a lot with array size - but here's a simple example):
arr = ['apples', 'oranges', 'bananas', 'guava']
iterations = 10000
4.times do |t|
p = arr.probability(t+2)
res = Array.new(arr.size).fill(0)
iterations.times do |x|
res[arr.weighted_random_index(p)] += 1
end
res2 = res.collect {|x| (x/iterations.to_f) * 100}
puts "For probability spread #{t+2}"
puts "Results #{res2.join(", ")} = #{res2.sum}"
puts ""
end
For probability spread 2
Results 49.71, 37.55, 11.35, 1.39 = 100.0
For probability spread 3
Results 43.44, 29.99, 17.79, 8.78 = 100.0
For probability spread 4
Results 48.3, 23.56, 16.73, 11.41 = 100.0
For probability spread 5
Results 54.22, 19.31, 14.65, 11.82 = 100.0
Here is a longer example:
def weighted_random_index_example
arr = ['apples', 'oranges', 'bananas', 'guava']
puts "Sample array = [#{arr.join(",")}]"
p = [0.5,0.25, 0.15, 0.10]
puts "Probability that each will show up [#{p.join(', ')}]"
puts "1000 runs..."
res = Array.new(arr.size).fill(0)
1000.times do |t|
res[arr.weighted_random_index(p)] += 1
end
res2 = res.collect {|x| (x/1000.0) * 100}
puts "Results:"
4.times do |t|
puts " #{arr[t]}: #{res2[t]}%"
end
puts ""
puts "You can also use more spread out probability arrays"
p = arr.probability(3)
puts "Probability that each will show up with a spread of 3 [#{p.join(', ')}]"
puts "1000 runs..."
res = Array.new(arr.size).fill(0)
1000.times do |t|
res[arr.weighted_random_index(p)] += 1
end
res2 = res.collect {|x| (x/1000.0) * 100}
puts "Results:"
4.times do |t|
puts " #{arr[t]}: #{res2[t]}%"
end
puts ""
puts "You can also just get selected indexes"
puts "arr.get_weighted_random_indexes(3,p) = [#{arr.get_weighted_random_indexes(3,p).join(', ')}]"
puts "The probability spread will depend on the number of items in your array - for an array of 4 it looks like this:"
8.times do |t|
puts " probability(#{t+2}): [#{arr.probability(t+2).collect {|x| sprintf('%0.5f',x)}.join(', ')}]"
end
return nil
end
weighted_random_index_example
Outputs...
Sample array = [apples,oranges,bananas,guava]
Probability that each will show up [0.5, 0.25, 0.15, 0.1]
1000 runs...
Results:
apples: 49.1%
oranges: 38.3%
bananas: 11.3%
guava: 1.3%
You can also use more spread out probability arrays
Probability that each will show up with a spread of 3 [0.333333333333333, 0.111111111111111, 0.037037037037037, 0.0123456790123457]
1000 runs...
Results:
apples: 43.6%
oranges: 29.2%
bananas: 18.0%
guava: 9.2%
You can also just get selected indexes
arr.get_weighted_random_indexes(3,p) = [0, 1, 3]
The probability spread will depend on the number of items in your array - for an array of 4 it looks like this:
probability(2): [0.50000, 0.25000, 0.12500, 0.06250]






Comments
Snippets Manager replied on Thu, 2009/11/26 - 5:32am
Snippets Manager replied on Sat, 2009/09/19 - 7:43pm
def badpick(values, probs) values.each_index { |index| sum = probs[0..index].inject { |x,y| x+y } return values[index] if rand < sum } end def goodpick(values, probs) randValue = rand() values.each_index { |index| sum = probs[0..index].inject { |x,y| x+y } return values[index] if randValue < sum } end def printTotal(total, values) values.each { |v| percent = total.count(v).to_f / 10000.0 puts "#{v}: #{percent}" } end values = ["a", "b", "c", "d"] probs = [0.1, 0.2, 0.3, 0.4] total = [] 10000.times { total << badpick(values, probs) } puts "Bad:" printTotal(total, values) total = [] 10000.times { total << goodpick(values, probs) } puts "Good:" printTotal(total, values)Snippets Manager replied on Sat, 2009/09/19 - 7:43pm
def goodpick(values, probs) randValue = rand() values.each_index { |index| sum = probs[0..index].inject { |x,y| x+y } return values[index] if randValue < sum } end def printTotal(total, values) values.each { |v| percent = total.count(v).to_f / 10000.0 puts "#{v}: #{percent}" } end values = ["a", "b", "c", "d"] probs = [0.1, 0.2, 0.3, 0.4] total = [] 10000.times { total << badpick(values, probs) } puts "Bad:" printTotal(total, values) total = [] 10000.times { total << goodpick(values, probs) } puts "Good:" printTotal(total, values)You'll notice that d is is never 0.4 as it should be. The corrected version of the weighted_random_index method should be:def weighted_random_index(probability_array = probability(2) ) randValue = rand size.times do |x| return x if randValue < probability_array[0..x].sum end return 0 endSnippets Manager replied on Wed, 2007/09/26 - 10:32am
class Array def random_weighted(method) total = self.map(&method).sum running_total = 0 index = rand(total) + 1 #puts "INDEX: #{index}" # we assume it is already sorted in descending order, although I suppose order does not matter self.each do |item| running_total += item.send(method) return item if index <= running_total end # it is possible to arrive here if all the elements had weight of zero. Handle this: return self[rand(self.size)] end endSo, for a contrived example, I can now do something like:>> [1,2,4,8].random_weighted(:to_i) => 8 >> [1,2,4,8].random_weighted(:to_i) => 8 >> [1,2,4,8].random_weighted(:to_i) => 8 >> [1,2,4,8].random_weighted(:to_i) => 8 >> [1,2,4,8].random_weighted(:to_i) => 4 >> [1,2,4,8].random_weighted(:to_i) => 4 >> [1,2,4,8].random_weighted(:to_i) => 8 >> [1,2,4,8].random_weighted(:to_i) => 4 >> [1,2,4,8].random_weighted(:to_i) => 4 >> [1,2,4,8].random_weighted(:to_i) => 1Or, I could do:>> %w(a one a_really_long_word ).random_weighted(:size)These are of course contrived examples. I am using it to do a random weighted selection of things that get clicked on the most on a web site.