I’ve been playing around with meetup.com’s API again and this time wanted to consume the chunked HTTP RSVP stream and filter RSVPs for events I’m interested in.
I use Python for most of my hacking these days and if HTTP requests are required the requests library is my first port of call.
I started out with the following script:
import requests import json def stream_meetup_initial(): uri = "http://stream.meetup.com/2/rsvps" response = requests.get(uri, stream = True) for chunk in response.iter_content(chunk_size = None): yield chunk for raw_rsvp in stream_meetup_initial(): print raw_rsvp try: rsvp = json.loads(raw_rsvp) except ValueError as e: print e continue
This mostly worked but I also noticed the following error from time to time:
No JSON object could be decoded
Although less frequent, I also saw errors suggesting I was trying to parse an incomplete JSON object. I tweaked the function to keep a local buffer and only yield that if the chunk ended in a new line character:
def stream_meetup_newline(): uri = "http://stream.meetup.com/2/rsvps" response = requests.get(uri, stream = True) buffer = "" for chunk in response.iter_content(chunk_size = 1): if chunk.endswith("\n"): buffer += chunk yield buffer buffer = "" else: buffer += chunk
This mostly works although I’m sure I’ve seen some occasions where two JSON objects were being yielded and then the call to ‘json.loads’ failed. I haven’t been able to reproduce that though.
A second read through the requests documentation made me realise I hadn’t read it very carefully the first time since we can make our lives much easier by using ‘iter_lines’ rather than ‘iter_content’:
r = requests.get('http://stream.meetup.com/2/rsvps', stream=True) for raw_rsvp in r.iter_lines(): if raw_rsvp: rsvp = json.loads(raw_rsvp) print rsvp
We can then process ‘rsvp’, filtering out the ones we’re interested in.