

Find All Duplicates In An Array

06.15.2007
Inspired by: Ruby on Rails Helpers: .dups and .dups? Identifies Elements Occuring More Than Once In An Array (http://www.rubyonrailsblog.com/2006/11/1/ruby-on-rails-helpers-dups-and-dups-identifies-elements-occuring-more-than-once-in-an-array)


require 'benchmark' 

module Enumerable
   def map_with_index
      index = -1
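      # Parentheses are required: && binds tighter than ||, so without them
      # an Array called without a block would still reach yield and raise.
      (block_given? && (self.class == Range || self.class == Array))  ?  map { |x| index += 1; yield(x, index) }  :  self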
   end
end


class Array

   def find_dups
      inject(Hash.new(0)) { |h,e| h[e] += 1; h }.select { |k,v| v > 1 }.collect { |x| x.first }
   end


   # Based on hungryblank's version in the comments
   # see http://www.ruby-forum.com/topic/122008

   def find_dups2
      uniq.select{ |e| (self-[e]).size < self.size - 1 }
   end

   def find_ndups     # also returns the number of items
      uniq.map { |v| diff = (self.size - (self-[v]).size); (diff > 1) ? [v, diff] : nil}.compact
   end


   # cf. http://www.ruby-forum.com/topic/122008
   def dups_indices   
      (0...self.size).to_a - self.uniq.map{ |x| index(x) }
   end

   def dup_indices(obj)
      i = -1
      ret = map { |x| i += 1; x == obj ? i : nil }.compact
      #ret = map_with_index { |x,i| x == obj ? i : nil }.compact
      ret.shift
      ret
   end

   def delete_dups(obj)
      indices = dup_indices(obj)
      return self if indices.empty?
      indices.reverse.each { |i| self.delete_at(i) }
      self
   end

end  


array = [1,3,5,5,6,7,9,10,14,18,22,22,4,4,4,3,6]

dups = nil


Benchmark.bm(14) do |t| 

 t.report('find_dups:') do
    dups = array.find_dups
 end 

end 

p dups   #=> [3, 5, 6, 22, 4] on Ruby 1.9+ (hashes keep insertion order); on 1.8 the order is arbitrary, e.g. [5, 22, 6, 3, 4]


p %w(a b a c c d).dups_indices        #=> [2, 4]
p %w(a b a c c d).dup_indices('c')    #=> [4]
p %w(a b a c c d).delete_dups('a')    #=> ["a", "b", "c", "c", "d"]
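
The versions built on self-[e] rescan the whole array once per unique element. As a rough sketch only, the same results can be had in a single pass without reopening Array. The module name DupHelpers and its two methods are hypothetical and not part of the original snippet; the code assumes Ruby 1.9 or later, where Hash#select returns a Hash.

```ruby
# Hypothetical helper module; the names are illustrative, not from the snippet.
module DupHelpers
  module_function

  # Count every element in one pass, then keep the values seen more than once.
  def duplicates(array)
    counts = array.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
    counts.select { |_, n| n > 1 }.keys
  end

  # Indices of the repeated occurrences (everything but the first of each value).
  def duplicate_indices(array)
    seen = {}
    array.each_index.select do |i|
      already_seen = seen.key?(array[i])
      seen[array[i]] = true
      already_seen
    end
  end
end

p DupHelpers.duplicates([1, 3, 5, 5, 3])          #=> [3, 5]
p DupHelpers.duplicate_indices(%w(a b a c c d))   #=> [2, 4]
```

Keeping these as module functions avoids monkey-patching core classes; whether that matters depends on the project.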


    

Comments

Snippets Manager replied on Mon, 2010/05/03 - 6:13am

I am using this rather long version for getting the most common value in a hash:

def mostused(ahash)
   original_size = ahash.size
   sizes = {}
   ahash.uniq.map { |x| sizes[x] = original_size - ((ahash - [x]).size) }
   sizes.sort_by { |k,v| v }.pop
end

I am new to ruby though, any suggestion for improvement?
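
One possible simplification of mostused, offered as a sketch rather than a definitive answer: count the elements in a single pass and let max_by pick the largest count. The name most_used is hypothetical, and the sketch assumes the argument is an Array (the original calls uniq and -, which are Array operations). Like the original, it returns a [value, count] pair.

```ruby
# Hypothetical rewrite of mostused; returns [value, count], or nil for an empty array.
def most_used(items)
  # Tally occurrences in one pass instead of subtracting arrays per unique value.
  counts = items.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
  counts.max_by { |_, n| n }
end

p most_used(%w(a b a c a b))   #=> ["a", 3]
```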

Snippets Manager replied on Sun, 2006/04/16 - 5:07am

Nice. I was looking for some way to do this. Should be part of Array class already!

Snippets Manager replied on Fri, 2007/06/15 - 1:43pm

My version:

class Array
   def find_dups
      uniq.map { |v| (self - [v]).size < (self.size - 1) ? v : nil }.compact
   end
end