Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Python: Joining Multiple Generators/Iterators

DZone's Guide to

Python: Joining Multiple Generators/Iterators

· Web Dev Zone ·
Free Resource

Bugsnag monitors application stability, so you can make data-driven decisions on whether you should be building new features, or fixing bugs. Learn more.

In my previous blog post I described how I’d refactored some scraping code I’ve been working on to use iterators and ended up with a function which returned a generator containing all the events for one BBC live text match:


match_id = "32683310"
events = extract_events("data/raw/%s" % (match_id))
 
>>> print type(events)
<type 'generator'>

The next thing I wanted to do is get the events for multiple matches which meant I needed to glue together multiple generators into one big generator.

itertools’ chain function does exactly what we want:

itertools.chain(*iterables)

Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.

First let’s try it out on a collection of range generators:

import itertools
gens = [(n*2 for n in range(0, 3)), (n*2 for n in range(4,7))]
>>> gens
[<generator object <genexpr> at 0x10ff3b140>, <generator object <genexpr> at 0x10ff7d870>]
 
output = itertools.chain()
for gen in gens:
  output = itertools.chain(output, gen)

Now if we iterate through ‘output’ we’d expect to see the multiples of 2 up to and including 12:

>>> for item in output:
...   print item
...
0
2
4
8
10
12

Exactly as we expected! Our scraping code looks like this once we plug the chaining in:

matches = ["32683310", "32683303", "32384894", "31816155"]
 
raw_events = itertools.chain()
for match_id in matches:
    raw_events = itertools.chain(raw_events, extract_events("data/raw/%s" % (match_id)))

‘raw_events’ now contains a single generator that we can iterate through and process the events for all matches.

Monitor application stability with Bugsnag to decide if your engineering team should be building new features on your roadmap or fixing bugs to stabilize your application.Try it free.

Topics:
python ,bigdata ,tips and tricks ,joining multiple iterators ,joining generators

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}