Python: Combinations of Values On and Off

Sometimes you just need to generate a grid of boolean values. Let's take a look at how to do this in Python to help facilitate a GridSearch.

In my continued exploration of Kaggle’s Spooky Authors competition, I wanted to run a GridSearch turning on and off different classifiers to work out the best combination.

I, therefore, needed to generate combinations of 1s and 0s enabling different classifiers.

For example, if we had 3 classifiers we’d generate these combinations:

0 0 1
0 1 0
1 0 0
1 1 0
1 0 1
0 1 1
1 1 1

where…

  • ‘0 0 1’ means: classifier1 is disabled, classifier2 is disabled, classifier3 is enabled.
  • ‘0 1 0’ means: classifier1 is disabled, classifier2 is enabled, classifier3 is disabled.
  • ‘1 1 0’ means: classifier1 is enabled, classifier2 is enabled, classifier3 is disabled.
  • ‘1 1 1’ means: classifier1 is enabled, classifier2 is enabled, classifier3 is enabled.

…and so on. In other words, we need to generate the binary representation of every value from 1 to 2^(number of classifiers) - 1. With 3 classifiers that is every value from 1 to 7, which gives the 7 combinations listed above.

We can write the following code fragments to calculate a 3-bit representation of different numbers:

>>> "{0:0b}".format(1).zfill(3)
'001'
>>> "{0:0b}".format(5).zfill(3)
'101'
>>> "{0:0b}".format(6).zfill(3)
'110'
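The zero padding can also be folded into the format specification itself, which avoids the separate zfill call. Here is a minimal sketch of that variation; to_bits is a hypothetical helper name and is not part of the original code:

```python
def to_bits(value, width):
    """Return `value` as a binary string, zero-padded to `width` digits.

    `to_bits` is a hypothetical helper name used only for illustration.
    """
    # The spec "0{width}b" means: binary digits, padded with zeros
    # on the left up to a minimum of `width` characters.
    return format(value, "0{}b".format(width))


print(to_bits(1, 3))
print(to_bits(5, 3))
print(to_bits(6, 3))
```

As with zfill, the width is only a minimum: a value that needs more digits than the width is not truncated.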

We need an array of 0s and 1s rather than a string, so let’s use the list function to create our array and then cast each value to an integer:

>>> [int(x) for x in list("{0:0b}".format(1).zfill(3))]
[0, 0, 1]

Finally, we can wrap that code inside a function that uses a list comprehension to cover every value from 1 to 2^(number of classifiers) - 1:

def combinations_on_off(num_classifiers):
    """Return every on/off combination except all-off, as lists of 0s and 1s."""
    return [[int(x) for x in list("{0:0b}".format(i).zfill(num_classifiers))]
            for i in range(1, 2 ** num_classifiers)]

And let’s check it works:

>>> for combination in combinations_on_off(3):
...     print(combination)

[0, 0, 1]
[0, 1, 0]
[0, 1, 1]
[1, 0, 0]
[1, 0, 1]
[1, 1, 0]
[1, 1, 1]
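The same grid can probably be produced without going through binary strings at all. The sketch below swaps in itertools.product from the standard library, which is a different technique from the one used in this post; combinations_on_off_product is a hypothetical name chosen for illustration:

```python
from itertools import product


def combinations_on_off_product(num_classifiers):
    """Return every on/off combination except the all-off one.

    Hypothetical alternative that does not format binary strings.
    """
    # product([0, 1], repeat=n) yields every n-tuple of 0s and 1s,
    # with the last position changing fastest (binary counting order).
    # any(bits) is False only for the all-zero tuple, so that one is skipped.
    return [list(bits)
            for bits in product([0, 1], repeat=num_classifiers)
            if any(bits)]


for combination in combinations_on_off_product(3):
    print(combination)
```

For 3 classifiers this should give the same 7 rows in the same order. The rest of this post carries on with the binary string version.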

What about if we have 4 classifiers?

>>> for combination in combinations_on_off(4):
...     print(combination)

[0, 0, 0, 1]
[0, 0, 1, 0]
[0, 0, 1, 1]
[0, 1, 0, 0]
[0, 1, 0, 1]
[0, 1, 1, 0]
[0, 1, 1, 1]
[1, 0, 0, 0]
[1, 0, 0, 1]
[1, 0, 1, 0]
[1, 0, 1, 1]
[1, 1, 0, 0]
[1, 1, 0, 1]
[1, 1, 1, 0]
[1, 1, 1, 1]

Perfect! We can now use this function to help work out which combinations of classifiers are needed.
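As a rough illustration of that last step, each row can be treated as a mask over a list of classifiers. This is only a sketch: the classifier names are placeholders invented for the example, and itertools.compress is one possible way to apply the mask, not necessarily how the GridSearch itself was wired up.

```python
from itertools import compress


def combinations_on_off(num_classifiers):
    # Same approach as the function defined earlier in this post.
    return [[int(x) for x in "{0:0b}".format(i).zfill(num_classifiers)]
            for i in range(1, 2 ** num_classifiers)]


# Placeholder names (assumed for illustration); real classifier
# objects could sit in this list instead of strings.
classifiers = ["naive_bayes", "logistic_regression", "random_forest"]

# compress keeps the items whose matching mask value is truthy (1).
enabled_sets = [list(compress(classifiers, mask))
                for mask in combinations_on_off(len(classifiers))]

for enabled in enabled_sets:
    print(enabled)
```

Each printed list is one candidate set of classifiers to evaluate, starting with the last classifier on its own and ending with all three enabled.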

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.