Clojure: Paging Meetup Data Using Lazy Sequences


I’ve been playing around with the Meetup API to do some analysis on the Neo4j London meetup, and one thing I wanted to do was download all the members of the group.

A feature of the Meetup API is that each endpoint returns a maximum of 200 records per request, so I needed to make use of offsets and paging to retrieve everybody.

It seemed like a good chance to use some lazy sequences to keep track of the offsets and then stop making calls to the API once I wasn’t retrieving any more results.

I wrote the following functions to take care of that bit:

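;; Rebuild s one element at a time so that map cannot realise it in chunks.
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Page offsets 0, 1, 2, ... as an unchunked lazy sequence.
(defn offsets []
  (unchunk (range)))

;; Call api-fn with successive offsets, stop at the first empty page,
;; and flatten the pages into a single sequence of records.
(defn get-all [api-fn]
  (flatten
   (take-while seq
               (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets)))))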

I previously wrote about the chunking behaviour of lazy collections, which is why unchunk is there: without it I ended up with a minimum of 32 calls to each URI, which wasn’t what I had in mind!
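As a rough illustration of that chunking behaviour, here is a minimal sketch that counts how often a mapped function runs when only the first result is requested. The names fake-call, calls and pages are made up for this example, and it maps over a vector rather than (range) because a vector's seq is reliably chunked, whereas whether (range) is chunked depends on the Clojure version.

```clojure
;; Same unchunk as above, repeated so the snippet runs on its own.
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Counts how many times the mapped function runs.
(def calls (atom 0))

(defn fake-call
  "Hypothetical stand-in for an HTTP request: records the call and
  returns its argument."
  [offset]
  (swap! calls inc)
  offset)

;; A vector's seq is chunked, so map processes it a chunk at a time.
(def pages (vec (range 100)))

;; Ask for just the first element of the chunked version.
(reset! calls 0)
(first (map fake-call pages))
(def chunked-calls @calls)

;; Ask for just the first element of the unchunked version.
(reset! calls 0)
(first (map fake-call (unchunk pages)))
(def unchunked-calls @calls)

(println "chunked:" chunked-calls "unchunked:" unchunked-calls)
```

With the chunked input the whole first chunk is processed even though only one result is used; with unchunk only the requested element is.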

To get all the members in the group I wrote the following function which is passed to get-all:

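;; At the REPL; inside an ns form this would be
;; (:require [clj-http.client :as client]).
(require '[clj-http.client :as client])

;; MEETUP_NAME (the group's urlname) and MEETUP_KEY (the API key) are
;; assumed to be defined elsewhere in the namespace.
(defn members
  [{:keys [perpage offset orderby]}]
  (->> (client/get
        (str "https://api.meetup.com/2/members?page=" perpage
             "&offset=" offset
             "&orderby=" orderby
             "&group_urlname=" MEETUP_NAME
             "&key=" MEETUP_KEY)
        ;; :as :json parses the body into maps with keyword keys
        ;; (clj-http needs cheshire on the classpath for this).
        {:as :json})
       :body :results))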

So to get all the members we’d do this:

(defn all-members []
  (get-all members))
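To see the paging stop without hitting the real API, here is a small sketch that swaps members for an in-memory stand-in. Everything named fake-* is hypothetical and exists only for this example: it assumes 450 made-up records served 200 at a time, and repeats get-all and its helpers so the snippet is self-contained.

```clojure
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

(defn offsets []
  (unchunk (range)))

(defn get-all [api-fn]
  (flatten
   (take-while seq
               (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets)))))

;; Hypothetical data: 450 fake members, so pages of 200, 200, 50 and then none.
(def fake-data
  (mapv (fn [i] {:id i :name (str "member-" i)}) (range 450)))

;; Counts how many "requests" were made.
(def calls (atom 0))

(defn fake-members
  "Stand-in for members: returns one page of fake-data instead of
  calling the API."
  [{:keys [perpage offset]}]
  (swap! calls inc)
  (->> fake-data
       (drop (* offset perpage))
       (take perpage)
       vec))

;; doall forces the lazy sequence so that every request happens here.
(def result (doall (get-all fake-members)))

(println (count result) "members fetched in" @calls "calls")
```

The fourth call is the one that returns an empty page, which is what makes take-while stop.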

I’m told that using lazy collections when side effects are involved is a bad idea – presumably because the calls to the API might never end – but since I only run it manually I can just kill the process if anything goes wrong.
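If the worry is calls that never end, one option is an eager loop with a hard cap on the number of requests. This is only a sketch of that idea, not the code from the post: get-all-eagerly, max-pages and endless-api are names invented for the example.

```clojure
(defn get-all-eagerly
  "Hypothetical eager variant of get-all: fetches pages until one comes
  back empty or max-pages requests have been made, whichever is first."
  [api-fn max-pages]
  (loop [offset 0
         acc []]
    (if (>= offset max-pages)
      acc
      (let [page (api-fn {:perpage 200 :offset offset :orderby "name"})]
        (if (seq page)
          (recur (inc offset) (into acc page))
          acc)))))

;; A misbehaving stand-in API that never returns an empty page.
(defn endless-api [_]
  [{:name "someone"}])

;; The cap guarantees termination even though the API never runs dry.
(def capped (get-all-eagerly endless-api 5))

(println (count capped) "records after hitting the cap")
```

Because the loop is eager, all requests happen at the point of the call rather than whenever the sequence is consumed.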

I’d be interested in how others would go about solving this problem – core.async was suggested, but that seems to result in more code, and more complicated code, than this version.

The code is on GitHub if you want to take a look.

Published at DZone with permission of
