My Newbie Challenges With Matplotlib
In this article, I would like to share the challenges I faced (and the solutions!) as a Python newbie using Matplotlib in anger for the first time…
Recently I was tasked with developing a Python-based workflow as well as an article for my employer. The workflow involved generating some Technical Analysis related charts and the article involved plotting some Volatility Surfaces and various Curves.
As someone quite new to Python who had never really used Matplotlib before, I found this quite a challenge.
Sure, I had read a few articles and tutorials on Matplotlib, but quite often these were very ‘atomic’: each focused on an individual bit of functionality, not on how to combine multiple features into one presentable chart. Or they used different approaches/interfaces to achieve a similar end objective, making it harder to combine the snippets of code. And none of them covered all the features I wanted to apply, so I had to glean hints and guidance from multiple sources just to make one chart look pretty.
Each chart I had to create presented a slightly different challenge, which meant more googling and more testing out snippets. So, I promised myself to document all the challenges and solutions to help others in a similar situation, and this article is the result of that promise. I hope you find it interesting as well as useful…
As you can imagine, most of my challenges were around the aesthetics of my charts. For my Technical Analysis workflow, I needed to replicate some charts that are available in my employer’s desktop application and they look much better than the ‘black axes on white background’ stuff that Matplotlib generates by default.
The desktop application generates some very contemporary looking charts with black/charcoal backgrounds and contrasting colour line plots:
Now, clearly, I was not going to be able to replicate the hours of work and polish that had gone into those — but I did want to replicate the look to some degree at least. So, the first challenge was to make the default Matplotlib stuff look something like the above:
Where close is my Closing Price data and sma14 and sma200 represent my short and long period Moving Averages data.
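The original snippet is not reproduced here, so the following is only a minimal sketch of that kind of default plot. The price series is randomly generated stand-in data (an assumption), and only the names close, sma14 and sma200 come from the text above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): a random walk on business days
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-01-01", periods=400)
close = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)

# Short and long period Simple Moving Averages
sma14 = close.rolling(14).mean()
sma200 = close.rolling(200).mean()

# Plain plot calls: default size, default colours, no labels, no title
plt.plot(close.index, close)
plt.plot(sma14.index, sma14)
plt.plot(sma200.index, sma200)
```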
A few immediately obvious things that needed fixing:
- Change the background and foreground colours
- Make the chart bigger
- Label my line plots
- Title the chart
I assumed I would be able to call something like the following to change the background colour — but it seems it is not that simple.
Instead, I had to set the figure and axes colours individually (where ‘0.25’ is a grey shade I quite liked):
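One way to do that, as a rough sketch rather than the exact code I used, is to pass facecolor when creating the figure and then call set_facecolor() on the axes. The plotted values are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# The figure is the outer container; the axes is the plot area inside it
fig = plt.figure(facecolor="0.25")  # '0.25' is a grey level between 0 (black) and 1 (white)
ax = plt.axes()
ax.set_facecolor("0.25")

ax.plot([1, 2, 3], [2, 4, 3], color="y")  # placeholder data
```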
I then found that I needed to access the figure and the axes individually using the subplots call (in order to change the colour of tick markers). In doing so I discovered I could roll the figsize and facecolor into the same subplots access call - saving a couple of lines of code:
The plt.subplots() call returns the enclosing figure (container) and the axes so I can interact with each individually — and whilst doing so I was able to set the size and colour of the figure.
In addition, I did the following:
- call ax.set_facecolor() to set the axes (plot area) colour
- colour each line plot by adding the ‘color’ parameter where y=yellow, g=green and ‘tab:purple’ (which is from the Tableau palette)
- ax.tick_params() to change the colour of the tick values on both axes from the default black to a more legible white, and make the font a bit bigger
- plt.legend() to show the actual line plot labels in a nice little panel
- plt.title() for the chart title
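Put together, the steps above might look something like the sketch below. The data is again randomly generated, and the figure size, label text, title and font size are illustrative choices, not the values from my actual workflow.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-01-01", periods=400)
close = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)
sma14 = close.rolling(14).mean()
sma200 = close.rolling(200).mean()

# Figure and axes in one call, with size and figure colour rolled in
fig, ax = plt.subplots(figsize=(16, 8), facecolor="0.25")
ax.set_facecolor("0.25")  # plot area colour

# Coloured, labelled line plots
ax.plot(close.index, close, color="y", label="Close")
ax.plot(sma14.index, sma14, color="g", label="SMA 14")
ax.plot(sma200.index, sma200, color="tab:purple", label="SMA 200")

# White, slightly larger tick values on both axes
ax.tick_params(axis="both", colors="w", labelsize=12)

plt.legend()  # panel showing the line labels
plt.title("Simple Moving Averages", color="w")
```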
All that brought me much closer to the prettier chart I was looking for:
Much googling later I worked out that I could use the numpy.arange function to produce some evenly spaced data — which worked well, but my x-axis no longer showed the date. So I added the following custom Formatter to restore the dates to the x-axis.
I then reference MyFormatter as part of my final plot code as follows:
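Here is a sketch of both pieces: a Formatter subclass that maps an integer x position back to the date at that index, and plot code that uses it. Only the name MyFormatter comes from my code; the date format, sample data and styling are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import Formatter


class MyFormatter(Formatter):
    """Turn an integer x position back into the date at that index."""

    def __init__(self, dates, fmt="%Y-%m-%d"):
        self.dates = dates
        self.fmt = fmt

    def __call__(self, x, pos=0):
        ind = int(np.round(x))
        if ind < 0 or ind >= len(self.dates):
            return ""  # tick lies outside the data, so show no label
        return self.dates[ind].strftime(self.fmt)


# Stand-in data (assumption): business days only, so the dates have gaps
dates = pd.bdate_range("2020-01-01", periods=60)
close = 100 + np.random.default_rng(0).normal(0, 1, len(dates)).cumsum()

x = np.arange(len(dates))  # evenly spaced positions 0, 1, 2, ...
formatter = MyFormatter(dates)

fig, ax = plt.subplots(figsize=(16, 8), facecolor="0.25")
ax.set_facecolor("0.25")
ax.xaxis.set_major_formatter(formatter)  # show dates instead of 0, 1, 2, ...
ax.plot(x, close, color="y", label="Close")
ax.tick_params(axis="both", colors="w")
fig.autofmt_xdate()  # rotate the date labels
```

Because the line is plotted against positions rather than dates, consecutive data points are always one unit apart, whatever the calendar gap between them.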
I also added fig.autofmt_xdate() to rotate the x-axis labels and avoid overlap. The final result is a much smoother plot with evenly spaced date labels:
For the Simple Moving averages chart, it made sense to show the Closing price line on the same plot as the short and long Moving Average lines — to contrast the Moving Averages with the Close price.
However, for my next two charts, this would not be ideal — I really need to show the lines as separate plots — albeit sharing the same x-axis. In other words, I wanted to ‘stack’ the two charts on top of each other.
First, I tried to create two charts and position them on top of each other — but the gap between the charts was too large and the x-axis was repeated unnecessarily.
I then came across the GridSpec class, which allows the user to ‘Specify the geometry of the grid where a subplot will be placed’. It sounded perfect!
Using GridSpec I created a grid of 2 rows and a single column and then added a subplot to each row. I then plotted my Closing Price chart in the first row and my other chart in the 2nd one:
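A minimal sketch of that layout follows. The height ratio, the zero vertical gap and the shared x-axis are my illustrative choices, and the second series is made-up numbers standing in for a real indicator.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.gridspec import GridSpec

# Stand-in data (assumption); 'rsi' is NOT a real RSI calculation
x = np.arange(250)
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, x.size).cumsum()
rsi = np.clip(50 + rng.normal(0, 3, x.size).cumsum(), 0, 100)

fig = plt.figure(figsize=(16, 8), facecolor="0.25")

# Grid of 2 rows x 1 column; top row twice as tall, no gap between rows
gs = GridSpec(2, 1, figure=fig, height_ratios=[2, 1], hspace=0)

ax1 = fig.add_subplot(gs[0])              # first row: Closing Price
ax2 = fig.add_subplot(gs[1], sharex=ax1)  # second row: shares the x-axis

ax1.plot(x, close, color="y", label="Close")
ax2.plot(x, rsi, color="c", label="RSI")

# Hide the repeated x tick labels on the upper chart
ax1.tick_params(labelbottom=False)
```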
Once I added in the additional stuff for labels, tick params, axis formatting etc I ended up with the following:
Note the dashed ‘RSI threshold’ lines at 70 and 30 — which I added with the following code:
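One straightforward way to draw such lines is axhline(), which spans the full width of the axes at a given y value. This is a sketch with placeholder data; the colours and line width are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Placeholder oscillator values between 0 and 100 (not a real RSI)
x = np.arange(250)
rsi = np.clip(50 + np.random.default_rng(0).normal(0, 3, x.size).cumsum(), 0, 100)

fig, ax = plt.subplots(figsize=(16, 4), facecolor="0.25")
ax.set_facecolor("0.25")
ax.plot(x, rsi, color="c", label="RSI")

# Dashed horizontal threshold lines at 70 and 30
upper = ax.axhline(70, color="r", linestyle="--", linewidth=1)
lower = ax.axhline(30, color="g", linestyle="--", linewidth=1)

ax.set_ylim(0, 100)
```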
I then re-used all the above techniques to create a Stochastic chart that you can see below:
Volatility Surfaces and Curves
My 2nd piece of work required me to include some Volatility Surfaces and a few Curves in the article I was writing. Somebody else had already created some basic plots of the types I needed. However, I did not want the default Matplotlib look:
I didn’t like the colours, the position of the axes labels nor the Y-axis tick labels. I already knew how to do the colours which just left the tick labels and axes label positioning.
So, why were the Y-axis ‘time to expiry’ ticks labelled with numeric values? Because, for this plot type, Matplotlib needs the dates expressed as numbers, namely the number of days since a fixed reference date (its epoch). For example, 737600 falls in late June 2020 under the epoch that older Matplotlib versions used.
I initially tried to find a way of passing in the date in two formats, one for the tick labels and one for the plot — when calling the plot function. However, I soon realised I could just use a custom Formatter again:
And the axes label spacing/position could be improved via the labelpad attribute. So, the following code snippet:
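The sketch below shows the general idea rather than my exact code. It uses FuncFormatter instead of a Formatter subclass, which is a swap made for brevity, and the surface values, axis names, date format and labelpad values are all invented for illustration.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Invented grid: 8 expiry dates x 9 moneyness levels
expiries = [dt.datetime(2020, 6, 25) + dt.timedelta(days=30 * i) for i in range(8)]
expiry_nums = mdates.date2num(expiries)  # dates as day numbers for plotting
moneyness = np.linspace(80, 120, 9)
X, Y = np.meshgrid(moneyness, expiry_nums)
Z = 20 + 0.01 * (X - 100) ** 2 + 0.02 * (Y - expiry_nums[0])  # made-up vols


def format_date(value, pos=None):
    """Convert a Matplotlib day number back into a readable date label."""
    return mdates.num2date(value).strftime("%Y-%m-%d")


fig = plt.figure(figsize=(12, 8), facecolor="0.25")
ax = fig.add_subplot(projection="3d")
ax.set_facecolor("0.25")
ax.plot_surface(X, Y, Z, cmap="viridis")

# Tick labels become dates, while the plot itself still uses the numbers
ax.yaxis.set_major_formatter(FuncFormatter(format_date))

# labelpad moves each axis label away from its tick labels
ax.set_xlabel("Moneyness", color="w", labelpad=10)
ax.set_ylabel("Expiry", color="w", labelpad=20)
ax.set_zlabel("Volatility", color="w", labelpad=10)
ax.tick_params(axis="both", colors="w")
```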
produces the following — which I hope you will agree, looks more appealing:
A Bit of a Cheat!
This final one, I have to admit, is a bit of a cheat. There is probably a correct way of doing it, but my ‘fix’ worked!
I had to plot a chart with data for three instruments over a given period. Should be simple enough — but for some reason, the chart looked like this:
After a bit of digging, it turned out that the GBP data had an extra data point for 2022 (which the EUR and CHF data did not). However, because I was plotting the EUR data before the GBP data, Matplotlib created the mess you see above.
So, how did I ‘fix’ this? I just plotted the GBP data first — giving me the following neater output:
The takeaway from this is that the sequence of a multi-line plot does seem to matter when dealing with non-identical x-axis values. I suppose that if each of the currencies had different dates, I would have to pad/fill the dates out somehow.
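I have not confirmed the root cause, but one possibility is that the dates were strings: Matplotlib treats string x values as categories and positions them in the order it first sees them. Under that assumption, a possible way to pad the dates is to align every series on a shared, sorted datetime index before plotting, as in this sketch with invented rates:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Invented data: GBP has one extra data point that EUR and CHF lack
common = ["2019-12-31", "2020-12-31", "2021-12-31"]
eur = pd.Series([1.10, 1.12, 1.11], index=common)
chf = pd.Series([0.97, 0.95, 0.96], index=common)
gbp = pd.Series([1.30, 1.35, 1.33, 1.25], index=common + ["2022-12-31"])

# Building a DataFrame aligns the series on the union of their dates;
# missing values become NaN, which Matplotlib simply does not draw
df = pd.DataFrame({"EUR": eur, "GBP": gbp, "CHF": chf})
df.index = pd.to_datetime(df.index)  # real dates instead of strings
df = df.sort_index()

fig, ax = plt.subplots(figsize=(12, 6))
for name in ["EUR", "GBP", "CHF"]:  # the plotting order no longer matters
    ax.plot(df.index, df[name], label=name)
ax.legend()
```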
I did have several other issues around creating and formatting these charts — but these were more to do with the data format not being quite how I needed/wanted it to be — rather than Matplotlib issues.
So, thanks for reading and I hope you found it interesting if not entirely useful!
Full Source code:
As promised, here are the links for the full source code behind the snippets I used above. You will need various licences/accounts to run them — however, I am hopeful that you can lift the relevant charting snippets and re-use them with your data in your Python code.