Algorithm of the Week: Python vs. Ruby in the Knapsack Problem

by Mark Needham · Feb. 26, 13 · Interview

The latest algorithm we had to code in Algorithms 2 was the knapsack problem, which is defined as follows:

The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.

We did a slight variation on this in that you could only pick each item once, which is known as the 0-1 knapsack problem.
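To make the "each item at most once" restriction concrete, here is a minimal brute-force sketch in Python. It simply tries every subset, so it is only usable on tiny inputs; the function name and the example items are made up for illustration and are not from the course.

```python
from itertools import combinations

def knapsack_brute_force(items, capacity):
    """Try every subset of (value, weight) items; return the best total value that fits."""
    best = 0
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            total_weight = sum(weight for _, weight in subset)
            if total_weight <= capacity:
                best = max(best, sum(value for value, _ in subset))
    return best

# Hypothetical example data: (value, weight) pairs and a knapsack of size 6
items = [(3, 4), (2, 3), (4, 2), (4, 3)]
print(knapsack_brute_force(items, 6))
```

With n items there are 2^n subsets to check, which is why the dynamic programming formulation is needed for anything larger.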

In our case we were given an input file from which you could derive the size of the knapsack, the total number of items and the individual weights & values of each one.
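The layout of the input file isn't shown here, so the following is only a sketch under an assumed format: the first line holds the knapsack size and the number of items, and every following line holds one item's value and weight. Both that layout and the name `parse_knapsack` are assumptions made for illustration.

```python
def parse_knapsack(text):
    """Parse the ASSUMED layout: 'size count' on line one, then 'value weight' per line."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    knapsack_size, number_of_items = (int(x) for x in lines[0])
    rows = [(int(value), int(weight)) for value, weight in lines[1:]]
    if len(rows) != number_of_items:
        raise ValueError("item count in header does not match the number of rows")
    return knapsack_size, number_of_items, rows

# Made-up sample input in the assumed format
sample = """6 4
3 4
2 3
4 2
4 3
"""
knapsack_size, number_of_items, rows = parse_knapsack(sample)
print(knapsack_size, number_of_items, rows)
```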

The pseudocode of the version of the algorithm which uses a 2D array as part of a dynamic programming solution is as follows:

  • Let A = 2-D array of size (n+1) × (W+1), where n is the number of items and W is the size of the knapsack
  • Initialise A[0,x] = 0 for x=0,1,…,W
  • for i=1,2,3,…,n
    • for x=0,1,2,…,W
      • A[i,x] = max { A[i-1, x], A[i-1, x-wi] + vi }, taking only A[i-1, x] when wi > x
      • where vi is the value of the ith item and wi is the weight of the ith item
  • return A[n, W]

This version runs in O(nW) time and O(nW) space. This is the main body of my Ruby solution for that:

# number_of_items, knapsack_size and rows (an array of [value, weight] pairs)
# are read from the input file

# (number_of_items + 1) x (knapsack_size + 1) table; row 0 is the "no items" base case
cache = Array.new(number_of_items + 1) { Array.new(knapsack_size + 1) }
cache[0].fill(0)
 
(1..number_of_items).each do |i|
  value, weight = rows[i-1]
  (0..knapsack_size).each do |x|
    if weight > x
      cache[i][x] = cache[i-1][x] 
    else
      cache[i][x] = [cache[i-1][x], cache[i-1][x-weight] + value].max
    end
  end
end
 
p cache[number_of_items][knapsack_size]
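For comparison, the same table-filling approach can be sketched in Python. This is a direct translation of the pseudocode rather than the code I submitted, and the example items at the bottom are made up.

```python
def knapsack_table(rows, knapsack_size):
    """Bottom-up 0-1 knapsack; rows is a list of (value, weight) pairs."""
    number_of_items = len(rows)
    # cache[i][x] = best value using the first i items with capacity x; row 0 is all zeros
    cache = [[0] * (knapsack_size + 1) for _ in range(number_of_items + 1)]
    for i in range(1, number_of_items + 1):
        value, weight = rows[i - 1]
        for x in range(knapsack_size + 1):
            if weight > x:
                # item i doesn't fit, so carry over the best value without it
                cache[i][x] = cache[i - 1][x]
            else:
                cache[i][x] = max(cache[i - 1][x], cache[i - 1][x - weight] + value)
    return cache[number_of_items][knapsack_size]

rows = [(3, 4), (2, 3), (4, 2), (4, 3)]  # made-up example data
print(knapsack_table(rows, 6))
```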

This approach works reasonably well when n and W are small but in the second part of the problem n was 500 and W was 2,000,000 which means the 2D array would contain 1 billion entries.

If we’re storing integers of 4 bytes each in that data structure then the amount of memory required is roughly 3.72GB, slightly too much for my machine to handle!
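The arithmetic behind that estimate can be checked in a couple of lines. This sketch assumes a tightly packed array with exactly 4 bytes per cell and no per-object overhead, which is optimistic for a dynamic language, so treat the result as a lower bound.

```python
number_of_items = 500
knapsack_size = 2_000_000
bytes_per_entry = 4  # assumption: a packed 4-byte integer per cell

entries = number_of_items * knapsack_size
total_bytes = entries * bytes_per_entry

print(entries)               # number of cells in the n x W table
print(total_bytes / 2**30)   # size in GiB (1 GiB = 2**30 bytes)
```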

Instead a better data structure would be one where we don’t have to allocate everything up front but can just fill it in as we go. In this case we can still use an array for the number of items but instead of storing another array in each slot we’ll use a dictionary/hash map instead.

If we take a bottom-up approach to this problem we seem to end up solving a lot of sub-problems which aren’t relevant to our final solution, so I decided to try a top-down recursive approach, and this is what I ended up with:

# @new_cache[i][w] holds the best value using the first i items with capacity w
@new_cache = Array.new(@number_of_items + 1) { {} }

# index is the number of items still under consideration, so rows[index - 1]
# is the current item and index == 0 means there are no items left
def knapsack_cached(rows, knapsack_size, index)
  return 0 if knapsack_size == 0 || index == 0

  stored_value = @new_cache[index][knapsack_size]
  return stored_value unless stored_value.nil?

  value, weight = rows[index - 1]
  option_1 = knapsack_cached(rows, knapsack_size, index - 1)
  if weight > knapsack_size
    # the item doesn't fit, so the only option is to skip it
    return @new_cache[index][knapsack_size] = option_1
  end

  option_2 = value + knapsack_cached(rows, knapsack_size - weight, index - 1)
  @new_cache[index][knapsack_size] = [option_1, option_2].max
end

p knapsack_cached(rows, @knapsack_size, @number_of_items)

The code is pretty similar to the previous version except we’re starting from the last item and working our way back towards the first. We end up storing 2,549,110 items in @new_cache, which we can work out by running this:

p @new_cache.inject(0) { |acc,x| acc + x.length}

If we’d used the 2D array we’d only have populated about 0.25% of the data structure, which is truly wasteful!
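The same effect can be seen on a small scale. The sketch below runs a top-down solver on randomly generated, made-up items and compares the number of cached sub-problems with the number of cells a full table would need. The function name, the seed and the size ranges are all arbitrary choices for illustration, and the exact ratio will differ from the one above.

```python
import random

def count_subproblems(rows, knapsack_size):
    """Top-down 0-1 knapsack; returns (best value, number of cached sub-problems)."""
    cache = [{} for _ in range(len(rows) + 1)]

    def solve(index, capacity):
        # index = number of items still under consideration
        if index == 0 or capacity == 0:
            return 0
        if capacity not in cache[index]:
            value, weight = rows[index - 1]
            best = solve(index - 1, capacity)
            if weight <= capacity:
                best = max(best, value + solve(index - 1, capacity - weight))
            cache[index][capacity] = best
        return cache[index][capacity]

    best = solve(len(rows), knapsack_size)
    return best, sum(len(row) for row in cache)

random.seed(1)  # fixed seed so the made-up data is repeatable
rows = [(random.randint(1, 100), random.randint(100, 1000)) for _ in range(20)]
best, stored = count_subproblems(rows, 5000)
table_cells = (len(rows) + 1) * (5000 + 1)
print(stored, table_cells, f"{100 * stored / table_cells:.2f}%")
```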

I wanted to do a little bit of profiling on how fast this algorithm ran in Ruby compared to JRuby. I also recently came across nailgun, which allows you to start up a persistent JVM and then run your code via that instead of starting a new one up each time, so I thought I could play around with that as well!

# Ruby
$ time ruby knapsack/knapsack_rec.rb
real	0m18.889s
user	0m18.613s
sys	0m0.138s

# JRuby
$ time ruby knapsack/knapsack_rec.rb
real	0m6.380s
user	0m10.862s
sys	0m0.428s

# JRuby with nailgun
$ ruby --ng-server & # start up the nailgun server

$ time ruby --ng knapsack/knapsack_rec.rb
real	0m6.734s
user	0m0.023s
sys	0m0.021s

$ time ruby --ng knapsack/knapsack_rec.rb
real	0m5.213s
user	0m0.022s
sys	0m0.021s

The first run is a bit slow as the JVM gets launched but after that we get a marginal improvement. I thought the JVM startup time would be a bigger proportion of the running time but I guess not!

I thought I'd try it out in Python as well because on one of the previous problems Isaiah had been able to write much faster versions in Python so I wanted to see if that'd be the case here too.

This was the Python solution:

# knapsack_size, number_of_items and rows (a list of (value, weight) pairs)
# are worked out from the file

# cache[i][w] holds the best value using the first i items with capacity w
cache = [{} for _ in range(number_of_items + 1)]

def knapsack_cached(rows, knapsack_size, index):
    # index is the number of items still under consideration
    if index == 0 or knapsack_size == 0:
        return 0

    if knapsack_size not in cache[index]:
        value, weight = rows[index - 1]
        # option 1: skip this item
        best = knapsack_cached(rows, knapsack_size, index - 1)
        if weight <= knapsack_size:
            # option 2: take this item, which is only possible if it fits
            best = max(best, value + knapsack_cached(rows, knapsack_size - weight, index - 1))
        cache[index][knapsack_size] = best

    return cache[index][knapsack_size]

result = knapsack_cached(rows, knapsack_size, number_of_items)
print(result)

The code is pretty much exactly the same as the Ruby version but interestingly it seems to run more quickly:

$ time python knapsack/knapsack.py
real	0m4.568s
user	0m4.480s
sys	0m0.078s
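Putting the reported wall-clock times side by side gives a rough sense of the gap. This is just arithmetic on the single runs quoted above, so it says nothing about run-to-run variance.

```python
# Wall-clock ("real") times in seconds, copied from the runs above
timings = {
    "Ruby": 18.889,
    "JRuby": 6.380,
    "JRuby with nailgun (second run)": 5.213,
    "Python": 4.568,
}

fastest = min(timings.values())
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.3f}s, {seconds / fastest:.2f}x the fastest run")
```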

I have no idea why that would be the case but it has been for all the algorithms we've written so far. If anyone has any ideas I'd be intrigued to hear them!

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
