Code Mistakes: Python's for Loop

It's really exciting to learn a new language... up until you write something broken that you can't figure out how to fix. Here's something I overlooked recently in trying to make a Python for loop.

Update: I've gotten a lot of great responses to this post that show better, more functional, and more pythonic ways to solve the problem I came across. I'm adding some of these responses to this post, as I know the first solution I came up with was not ideal, but rather a means of fixing an issue in my code I was having trouble understanding. Thanks everyone very much for all the responses.

I'm learning Python right now after years of using Java. It has helped a lot that I have been doing some work with Lisp and functional programming, which I think has bridged the gap between Java and Python for me. Still, there are some Java practices I cling to that cause me trouble in a new language. Python's for loop was one of those troublemakers for me recently.

I actually really like the way Python handles for loops, but years of Java conditioning had me a bit mixed up. I've gotten so used to defining a loop counter, writing the end condition in terms of that counter, and then defining how to advance it. For many applications in Python, there's no need for all that detail. But it was the simplicity of the for loop that ended up causing me problems.

Essentially, I was trying to take a long list of numbers and iterate over all possible equal-length consecutive segments of that list. In a very basic example, this would look like taking a list:

num_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

and finding every run of three consecutive elements in that list: 123, 234, 345, and so on. So I wrote my Python to look somewhat like this:

for x in num_list:
    some_var = some_function(num_list[x:x+3])
    if some_var > some_other_var:
        some_other_var = some_var
        final_list = num_list[x:x+3]

If you've used Python for loops before, you can probably pretty easily tell what I'm doing wrong here. But if you haven't worked much with these loops, the issue might be a little harder to spot.

The issue here is that Python already knows how to iterate over every item in this list, because it knows what each item is. It doesn't need me to tell it how to go through the list and get each item; it just does it for me.

This means that x in for x in num_list does not refer to the index of each iteration; it refers to the value at each index. But the function call and the final_list assignment are both trying to use these values as indexes. In the example list above, that means that during the first iteration of the loop, when I expect to slice from index 0 (num_list[0:3]), I'm actually slicing from index 1 (num_list[1:4]), since the value 1 sits at index 0 in that list.
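To make the mix-up concrete, here is a minimal sketch in Python 3 syntax that compares the slices produced by treating each value as an index with the slices produced by real positions. The names value_slices and index_slices are made up for illustration and are not part of the original code.

```python
num_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# x takes each *value* in turn, so using it as an index slices from the wrong place.
value_slices = [num_list[x:x+3] for x in num_list]

# Slicing by position gives the consecutive three-element windows instead.
# Stopping at len(num_list) - 2 keeps every window exactly three items long.
index_slices = [num_list[i:i+3] for i in range(len(num_list) - 2)]

print(value_slices[:3])  # [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(index_slices[:3])  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

With this particular list the bug is easy to miss, because every value happens to be a valid index. A list containing larger or negative numbers would produce empty or unexpected slices without raising an error.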

In Java, I would have likely defined my iterator and used the iterator to retrieve the values within the loop. But this for loop in Python is already giving me the value.

Since I still needed an index in order to define my slice ranges, I ended up adding a counter within the loop logic:

i = 0  # manual index counter
for x in num_list:  # x (the value) is no longer used; the loop only advances i
    some_var = some_function(num_list[i:i+3])
    if some_var > some_other_var:
        some_other_var = some_var
        final_list = num_list[i:i+3]
    i += 1

This ends up getting me the indexes I need to find consecutive series within the list.
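For anyone who wants to run the counter-based loop, here is a self-contained sketch in Python 3 syntax. The post never says what some_function does, so this version assumes it simply sums a segment; that choice, and the starting value of some_other_var, are assumptions made for illustration only.

```python
num_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def some_function(segment):
    # Hypothetical stand-in: score a segment by the sum of its items.
    return sum(segment)

some_other_var = float("-inf")  # best score seen so far (assumed starting value)
final_list = []

i = 0
for x in num_list:  # x is unused; the loop only drives the counter i
    segment = num_list[i:i+3]
    some_var = some_function(segment)
    if some_var > some_other_var:
        some_other_var = some_var
        final_list = segment
    i += 1

print(final_list)      # [7, 8, 9]
print(some_other_var)  # 24
```

One thing to watch: because the loop runs once per item, the last two slices contain fewer than three items ([8, 9] and [9]). That is harmless when scoring by sum, but it could matter for a function that expects exactly three values.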

This may very well not be the best way to handle this, and I'd love to hear other solutions that can help make my Python better. But I wanted to point out this difference in case others have had this kind of trouble when learning Python loops.

[Update]

Here are some better solutions presented by great DZone contributors. See the comments for more great discussion, and feel free to leave a comment yourself.

  • Marcin Cuprjak: There is a syntax in Python for that: enumerate. It gives you both the index and the enumerated object:

    for i, x in enumerate(num_list):

  • Tim Desjardins: A more pythonic way would be to do your for loop as:

    for i in xrange(0, len(num_list)):

    Less code, more concise.

  • John Henson: I too have worked with Java since its inception (1996), but Python is awesome in many areas, with tools and constructs that make your life easier. If you need to iterate over combinations of elements in a list, you can use this in Python 2.7.1 or later:

    from itertools import combinations
    
    for x in combinations(range(1,100),3):
        print x

    If you have one list L you can do:

    for x in combinations(L,3):
        print x
  • Andre Burgaud: Assuming you'd be inclined to approach the problem in a more functional fashion, the following might be a starting point:

    num_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
    l = [(x, y, z) for (x, y, z) in zip(num_list, num_list[1:], num_list[2:]) 
         if (x, y, z) == (x, x+1, x+2)]
    
    print(l)
  • Erik Colban:

    final_triple = max((triple for triple in zip(num_list, num_list[1:], num_list[2:])), key=some_function)

    Alternatively:

    final_list = max((num_list[i:i+3] for i in xrange(0, len(num_list) - 2)), key=some_function)
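The suggestions above are written for Python 2 (xrange, print x). As a rough sketch of how they might look in Python 3, assuming again that some_function is a simple sum (a hypothetical stand-in, since the post never defines it):

```python
from itertools import combinations

num_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def some_function(segment):
    # Hypothetical stand-in: score a segment by the sum of its items.
    return sum(segment)

# enumerate yields (index, value) pairs, so no manual counter is needed.
first_pairs = list(enumerate(num_list))[:3]

# range replaces Python 2's xrange; stopping at len - 2 keeps every window full.
windows = [num_list[i:i+3] for i in range(len(num_list) - 2)]

# zip over three staggered views of the list yields the same windows as tuples.
triples = list(zip(num_list, num_list[1:], num_list[2:]))

# max with a key function picks the best-scoring window in one expression.
best_window = max(windows, key=some_function)
best_triple = max(triples, key=some_function)

# combinations yields every 3-item subset, not only adjacent items.
all_subsets = list(combinations([1, 2, 3, 4], 3))

print(first_pairs)   # [(0, 1), (1, 2), (2, 3)]
print(best_window)   # [7, 8, 9]
print(best_triple)   # (7, 8, 9)
print(all_subsets)   # [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
```

Note that itertools.combinations returns non-adjacent groupings such as (1, 2, 4), so it answers a different question from the consecutive-segment problem in this post. The range and zip versions are the ones that produce sliding windows.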

Topics:
java, python, for loops, language learning
