Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Clojure: Paging Meetup Data Using Lazy Sequences

DZone's Guide to

Clojure: Paging Meetup Data Using Lazy Sequences

· Java Zone
Free Resource

Bitbucket is for the code that takes us to Mars, decodes the human genome, or drives your next car. What will your code do? Get started with Bitbucket today, it's free.

I’ve been playing around with the meetup API to do some analysis on the Neo4j London meetup and one thing I wanted to do was download all the members of the group.

A feature of the meetup API is that each end point will only allow you to return a maximum of 200 records so I needed to make use of offsets and paging to retrieve everybody.

It seemed like a good chance to use some lazy sequences to keep track of the offsets and then stop making calls to the API once I wasn’t retrieving any more results.

I wrote the following functions to take care of that bit:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))
 
(defn offsets []
  (unchunk (range)))
 
 
(defn get-all [api-fn]
  (flatten
   (take-while seq
               (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets)))))

I previously wrote about the chunking behaviour of lazy collections which meant that I ended up with a minimum of 32 calls to each URI which wasn’t what I had in mind!

To get all the members in the group I wrote the following function which is passed to get-all:

(:require [clj-http.client :as client])
 
(defn members
  [{perpage :perpage offset :offset orderby :orderby}]
  (->> (client/get
        (str "https://api.meetup.com/2/members?page=" perpage
             "&offset=" offset
             "&orderby=" orderby
             "&group_urlname=" MEETUP_NAME
             "&key=" MEETUP_KEY)
        {:as :json})
       :body :results))

So to get all the members we’d do this:

(defn all-members []
  (get-all members))

I’m told that using lazy collections when side effects are involved is a bad idea – presumably because the calls to the API might never end – but since I only run it manually I can just kill the process if anything goes wrong.

I’d be interested in how others would go about solving this problem – core.async was suggested but that seems to result in much more / more complicated code than this version.

The code is on github if you want to take a look.

Bitbucket is the Git solution for professional teams who code with a purpose, not just as a hobby. Get started today, it's free.

Topics:

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}