The Performance Zone is brought to you in partnership with New Relic. New Relic APM provides constant monitoring of your apps so you don't have to.
This post revisits my earlier evaluation of Google App Engine's post-preview pricing changes and how they affected my project, SMSMyBus. As I noted, the app was projected to cost between $6 and $7 per day under the new platform pricing.
This is authored by Greg Tracy, Co-founder of Asthmapolis and Sharendipity.
Since that post, I’ve been rolling out small, incremental changes to optimize
the code and combat all of the known issues. I’m thrilled to report
that I have the price down to $0/day. And I’m once again impressed by
the snappy and reliable App Engine platform.
Looking back on the changes, I can say that I was doing some bad things, some
abusive things, and App Engine was making some bad choices as well. But
the end result proves that if its developers optimize and do smart
things, they are rewarded. App Engine remains a great solution for my transit API service.
Here’s a history of the changes over the last two months that got me down to $0/day...
1. Instance allocation
The new pricing model charges applications based on their use of instances
(hardware resources where your application is running) rather than CPU
utilization. A key to keeping your instance cost down is to simply
reduce the number of instances that are spinning. Duh. So I grabbed the
instance slider in the application settings and yanked it to the left.
This doesn't prevent scaling, it just limits my billing for normal
2. Delete data
App Engine data storage (for your database) costs $0.008/GByte-day. Doesn’t
sound too expensive, but I had been storing every single API call I had
ever gotten. I thought it would be useful for API developers and for
analytics. My drive to $0 outweighed that, however, so I deleted all of
the history data and got under the free quota for storage.
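To make the storage math concrete, here is a rough sketch of the daily charge. The $0.008/GByte-day rate is the one quoted above; the function name, the stored size, and the free allowance are hypothetical values picked only for illustration.

```python
PRICE_PER_GB_DAY = 0.008  # rate quoted in the post, in dollars


def daily_storage_cost(stored_gb, free_gb):
    """Return the daily storage charge in dollars.

    Both arguments are illustrative inputs, not measured values.
    Only storage above the free allowance is billed.
    """
    billable_gb = max(0.0, stored_gb - free_gb)
    return billable_gb * PRICE_PER_GB_DAY


# Hypothetical example: 5 GB stored with a 1 GB free allowance.
cost = daily_storage_cost(5.0, 1.0)
```

Once the stored size drops below the free allowance, the billable amount is zero, which is the effect deleting the history data was after.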
3. Memcached the application's route listings
I was surprised to find that I wasn’t doing this already, but there it
was. I have a data structure that maps bus routes and bus stops to
scheduling data on the Metro website and it never changes. In some cases
(like the static calls from the kiosk clients)
I was looking up route listing details in the datastore once every
minute!! Fail. I used memcache to keep the common queries in memory and
avoid the extra datastore reads.
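The pattern here is a read-through cache: check the cache first and only fall back to the datastore on a miss. Below is a minimal sketch of that idea, not the actual SMSMyBus code. `get_route_listing`, `load_from_datastore`, the key format, and the TTL are all hypothetical. `DictCache` is a local stand-in so the example runs anywhere; on App Engine the `google.appengine.api.memcache` module offers `get(key)` and `set(key, value, time=...)` calls that could be used in its place.

```python
class DictCache(object):
    """In-process stand-in exposing memcache-style get/set (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss, like memcache.get.
        return self._data.get(key)

    def set(self, key, value, time=0):
        # 'time' (expiry in seconds) is accepted but ignored in this stand-in.
        self._data[key] = value
        return True


def get_route_listing(stop_id, cache, load_from_datastore, ttl_seconds=3600):
    """Return the route listing for a stop, reading the datastore only on a miss."""
    key = 'route_listing:%s' % stop_id  # hypothetical key format
    listing = cache.get(key)
    if listing is None:
        listing = load_from_datastore(stop_id)  # the expensive datastore read
        cache.set(key, listing, time=ttl_seconds)
    return listing
```

Because the route listings never change, a long TTL is reasonable; a kiosk polling once a minute would then hit the datastore only when the cache entry has been evicted.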
4. Limit access during off hours
Another thing that never changes is when the Metro service is running. There
are five+ hours a day where the buses aren’t on the street. But some
clients are still asking for data. I stubbed out most of the API during
these off hours before the code ever gets close to making a datastore or
These four changes brought me down to $0.70 per day. Bam!
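A sketch of the off-hours guard from step 4, under stated assumptions: the service window below (12:30am to 5:30am) is made up for illustration, and `get_arrivals`, `lookup`, and the shape of the stub response are hypothetical names, not the real API. The point is only that the cheap time check runs before any expensive work.

```python
import datetime

# Hypothetical off-hours window; the real Metro schedule is not given in the post.
SERVICE_STOPS = datetime.time(0, 30)
SERVICE_STARTS = datetime.time(5, 30)


def is_off_hours(now):
    """True if 'now' (a datetime in local service time) falls in the off-hours window."""
    return SERVICE_STOPS <= now.time() < SERVICE_STARTS


def get_arrivals(stop_id, now, lookup):
    """Answer with a canned response during off hours, otherwise do the real lookup."""
    if is_off_hours(now):
        # Stubbed response: no datastore reads, no scraping.
        return {'stop': stop_id, 'arrivals': [], 'note': 'service not running'}
    return lookup(stop_id)
```

This assumes `now` is already expressed in the transit system's local time; a real handler would need to convert from UTC first.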
5. Asynchronous screen grabs
If you don’t know, behind the API curtain is an ugly screen scraping task
that extracts the arrival estimates from the Metro website. So when a
client requests arrival data for a stop, the app goes off and requests
multiple web pages, machine-reads the information and aggregates all of
The original implementation of the SMS interface
did this by creating multiple tasks (one for each route traveling
through the respective stop). When a task ran, it stored the results in
the datastore. An aggregator task would read those results out of the
datastore and piece together the response to the caller.
When the API was created, I couldn’t use background tasks because I had to
respond with results in the same HTTP context. That’s when I discovered
the great feature, asynchronous url fetch.
This essentially let me grab all of the different Metro web pages at
the same time. But when I implemented this, I continued to use the
datastore as the mechanism for storing and retrieving results. This was
just lazy. Under the old pricing, I had no incentive to change it other
than the fact that it was a bit slow.
Under the new pricing model, this solution was very expensive. The API is
continuously running this aggregation algorithm - constantly writing and
reading to the datastore for model instances that have a lifespan of
under a minute!
I rolled out a change that removed the use of the datastore and instead sorted the aggregated results in memory. This had a dramatic
effect on my API quota for datastore reads and writes. Especially the
write operations, where you get penalized by an order of magnitude for
this type of behavior because index updates work against your API quota
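The shape of the step 5 change can be sketched as follows: fetch every route page concurrently, then merge and sort the parsed results in a plain list instead of round-tripping through the datastore. This is not the actual implementation. On App Engine the concurrent part would use the asynchronous URL Fetch calls (`urlfetch.create_rpc`, `urlfetch.make_fetch_call`, `rpc.get_result`); to keep the example runnable outside that runtime, a Python 3 thread pool is swapped in here. `fetch_page`, `parse_page`, and the `'minutes'` field are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_arrivals(route_urls, fetch_page, parse_page):
    """Fetch all route pages at once and aggregate the results in memory.

    fetch_page(url) returns the page body; parse_page(body) returns a list
    of dicts with a 'minutes' key. Both are caller-supplied (assumptions).
    """
    if not route_urls:
        return []
    # Issue every request concurrently instead of one after another.
    with ThreadPoolExecutor(max_workers=len(route_urls)) as pool:
        pages = list(pool.map(fetch_page, route_urls))
    # Aggregate in a local list: no datastore writes, no datastore reads.
    arrivals = []
    for page in pages:
        arrivals.extend(parse_page(page))
    arrivals.sort(key=lambda arrival: arrival['minutes'])
    return arrivals
```

Since the intermediate results only live for the duration of one request, nothing is lost by keeping them in memory, and the per-request datastore writes (and their index updates) disappear entirely.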
6. Re-implement the original apps on the API
After optimizing the API, I realized that the original SMSMyBus apps (SMS,
chat, email and phone interfaces for the Metro) were now the long pole.
Those apps were implemented before the API existed so they weren’t
benefiting from the API optimizations. Solution... re-implement to use
the SMSMyBus API.
This should have been done long ago simply as a validation exercise of the
API methods. Credit to the elegance and simplicity of the API - this
port was simple and only took a couple of hours.
These two changes brought me down to $0.10/day. Badda-bing.
7. Run Appstats on all application interfaces
The last stop on the optimization train was Appstats. A truly great tool in the App Engine toolbox. In just a matter of minutes,
you can find the hidden datastore operations that are dragging you
down. In my case, it led me to one area that wasn’t being memcached at
all. And it revealed an area that was simply using the memcache
incorrectly! Love this tool...
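For readers who have not used it, turning on Appstats in the Python runtime is roughly a matter of wrapping the WSGI app with its recording middleware. The fragment below follows the documented hook in `appengine_config.py` and assumes a webapp-based Python application; it only does anything inside the App Engine runtime, where the SDK import resolves.

```python
# appengine_config.py


def webapp_add_wsgi_middleware(app):
    """Wrap the WSGI app so Appstats records each request's RPC calls."""
    # Imported lazily; this module is only available in the App Engine SDK/runtime.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)
```

With recording on, the Appstats console shows the datastore and memcache calls made by each request, which is how a missed or misused cache lookup becomes visible.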
This change brought me down to $0.00/day. Winning!
App Engine remains a great platform for developers that don’t abuse it and take the time to optimize their applications.
The SMSMyBus API now serves over 6,000 transit requests per day. It’s fast,
reliable and flat out fun to use. I’m as proud as ever that I brought
this to Madison.
Next step... find a way to fund my SMS users. :)