
A Graphical View of API Performance Based on Call Location


In this post, we use some Python and data visualization to show that the farther a call's origin is from the system hosting an API, the slower the API's response will be.


The performance of an API depends on two things: the processing time between the API receiving a request and delivering a response, and the time it takes the request and response data packets to traverse the Internet distance between the calling system and the system that hosts the API. cURL, "a command line tool and library for transferring data with URLs," breaks the timing of an API call down into these components; in an earlier post, I outlined what the curl timings mean.
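If you'd rather inspect those same components from Python than from the curl command line, the pycurl bindings expose the equivalent timing fields. The sketch below is only illustrative and is not part of this post's workflow; the URL is a placeholder for whatever endpoint you are measuring.

# component timings for a single request, via pycurl's getinfo()
from io import BytesIO
import pycurl

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'https://api.example.com/endpoint')  # placeholder URL
c.setopt(pycurl.WRITEFUNCTION, buf.write)
c.perform()

print('DNS lookup:    %.3f s' % c.getinfo(pycurl.NAMELOOKUP_TIME))
print('TCP connect:   %.3f s' % c.getinfo(pycurl.CONNECT_TIME))
print('first byte:    %.3f s' % c.getinfo(pycurl.STARTTRANSFER_TIME))
print('total:         %.3f s' % c.getinfo(pycurl.TOTAL_TIME))
c.close()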

Your customers want to see a response to their actions as soon as possible. In this post, I use the API Science API, curl, and a few simple scripts to graphically illustrate the effect of global calling location on overall API performance.

Assume that one component of your app is a call to the World Bank Countries API, whose data center is located in Washington, DC, USA. To test the effect of calling location on this API's performance, I created four API monitors that call the World Bank API from these locations:

  • Washington, DC, USA
  • Oregon, USA
  • Ireland
  • Tokyo, Japan

Next, I created four Linux shell scripts that download the past week's performance data from the API Science Performance Report API. Here's DC_weekly_perf.csh, the script that downloads the past week's data for the monitor that calls the World Bank Countries API from Washington, DC:

curl 'https://api.apiscience.com/v1/monitors/1572020/performance.json?preset=lastWeek&resolution=hour' -H 'Authorization: Bearer MY_AUTH_CODE' 

The downloaded JSON data for each monitor is stored in a text file (for example, DC_perf.json). The call to the API Science API returns the past week's performance data binned by hour, so each JSON file contains 168 data entries (24 hours × 7 days). A Python script (listed below) performs the processing once the JSON files have been retrieved.
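If you prefer to stay in Python for the downloads as well, the same endpoint can be called with the requests library. The sketch below is only illustrative, not the workflow used in this post: the Washington, DC monitor ID comes from the curl example above, while the other three monitor IDs and the bearer token are placeholders you would replace with your own values.

# fetch_weekly_perf.py - illustrative alternative to the four curl scripts
import requests

API_TOKEN = 'MY_AUTH_CODE'  # placeholder: your API Science bearer token

# map each output file to its API Science monitor ID
# (only the Washington, DC ID is from this post; the others are placeholders)
monitors = {
    'DC_perf.json': '1572020',
    'OR_perf.json': 'OREGON_MONITOR_ID',
    'IR_perf.json': 'IRELAND_MONITOR_ID',
    'JP_perf.json': 'TOKYO_MONITOR_ID',
}

for filename, monitor_id in monitors.items():
    url = 'https://api.apiscience.com/v1/monitors/%s/performance.json' % monitor_id
    response = requests.get(
        url,
        params={'preset': 'lastWeek', 'resolution': 'hour'},
        headers={'Authorization': 'Bearer ' + API_TOKEN},
    )
    response.raise_for_status()
    with open(filename, 'w') as f:
        f.write(response.text)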

Our objective is to create a graphical view of the performance timings for calling the World Bank Countries API from the four different locations. So, from each JSON file, we must extract the averageTotal value for each hour. Plotting this data for all four calling locations on a single graph makes it easy to compare the API's performance by calling location.

Here is the Python script:

# gen_loc_report - 15 February 2019

# generate a report based on JSON data showing
# performance with respect to API call location

import sys
import numpy as np
import matplotlib

# force matplotlib not to use an Xwindows backend
matplotlib.use('Agg')

import matplotlib.pyplot as plt
import json

# get the results from each call location;

with open('DC_perf.json') as f:
    DC_perf = json.load(f)

with open('OR_perf.json') as f:
    OR_perf = json.load(f)

with open('IR_perf.json') as f:
    IR_perf = json.load(f)

with open('JP_perf.json') as f:
    JP_perf = json.load(f)

print('number of results:',
      DC_perf['meta']['numberOfResults'],
      OR_perf['meta']['numberOfResults'],
      IR_perf['meta']['numberOfResults'],
      JP_perf['meta']['numberOfResults'])

# for simplicity, assume number of results is
# identical across all the JSON files
n_results = DC_perf['meta']['numberOfResults']

hourly_perf_total = np.zeros(n_results * 4, dtype=float)
hourly_perf_total.shape = (4, n_results)

# extract the total performance data for each location
for i in range(n_results):
    hourly_perf_total[0][i] = DC_perf['data'][i]['averageTotal']
    hourly_perf_total[1][i] = OR_perf['data'][i]['averageTotal']
    hourly_perf_total[2][i] = IR_perf['data'][i]['averageTotal']
    hourly_perf_total[3][i] = JP_perf['data'][i]['averageTotal']

# plot the total performance data for each location
plt.plot(hourly_perf_total[0], label='Wash DC')
plt.plot(hourly_perf_total[1], label='Oregon')
plt.plot(hourly_perf_total[2], label='Ireland')
plt.plot(hourly_perf_total[3], label='Tokyo')

plt.xticks(np.arange(0, n_results + 1, 24.0))
plt.ylabel('Average Total Milliseconds')
plt.xlabel('Hours Since ' + DC_perf['meta']['endPeriod'])
title = 'World Bank Countries API Past Week Performance'
plt.title(title)
plt.legend(loc='best')

# log y axis
plt.semilogy()
plt.grid(True)

#plt.show()
plt.savefig('/home/kevin/APIScience/custom_reports/World_Bank_past_week.png')

And here is the resultant graph:

The milliseconds scale (Y-axis) is logarithmic. The plot provides a clear view of the effect of "Internet distance" on the performance of calls to the World Bank Countries API, which is served from Washington, DC, USA. Calls made from Washington, DC generally complete in under 100 milliseconds; calls from Ireland consistently take significantly longer, and calls from Oregon and Tokyo take longer still.

There is more to investigate. For example, what is the primary cause of these fairly consistent timing differences? The API Science Performance API contains additional timing data, which we'll investigate in a future post.
