Nice numbers. I have used those articles as reference points when speaking about the potential market size for our memory leak detection tool. But something about these numbers has bothered me for years: there is no trustworthy, public analysis behind them. They are just conjured up from thin air. So I finally decided to do something about it and try to figure it out for good.
It turned out to be a challenging task. After all, with more than seven billion people on our planet I couldn't call everyone and ask them. Well, maybe I could, but if every call took 20 seconds on average, I would need at least 4,439 years to complete the survey, and that is without sleeping, eating or resting. So I had to use other ways of estimating.
After playing around with different sources of information, I decided to dig into four of them for a closer look:
- Labour statistics provided by different governments
- Language popularity sites such as Tiobe and Langpop
- Employment portals - namely Indeed.com and Monster.com
- Download numbers on popular Java tools and libraries - namely Eclipse and Tomcat.
The world population is currently above seven billion. Out of those seven billion we can leave out sub-Saharan Africa (900M) and rural Asia (about 50% of its 2.2B population) as negligible. This leaves us with approximately 5 billion people living in regions where the overall economic and cultural background can be considered suitable for a software industry to emerge.
Now, out of those 5,000,000,000, how many could actually be developing software? A good answer at StackExchange gives us some pointers as to where to find information on the percentage of software developers in different countries. Using the US, Japan, Canada, the EU27 and the UK as a baseline, we can estimate that 0.86% of the population is employed as a software developer or programmer.
0.86% of five billion is 43,000,000. Let's remember this number, as it will be used as the baseline in the following calculations.
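For those who prefer to check the arithmetic, the baseline can be reproduced in a few lines of Python. This is only a sketch of the estimate described here: the variable names are mine, and the inputs are the same rough figures quoted in the text, not measured data.

```python
# All inputs are the rough figures quoted in the text, in people.
world_population = 7_000_000_000
sub_saharan_africa = 900_000_000
rural_asia = 2_200_000_000 // 2  # "about 50% of its 2.2B population"

# People living in regions where a software industry is plausible.
addressable = world_population - sub_saharan_africa - rural_asia

# 0.86% expressed as 86 per 10,000 so the whole calculation stays in integers.
developer_share_per_10k = 86
developers = addressable * developer_share_per_10k // 10_000

print(f"addressable population: {addressable:,}")
print(f"estimated developers:   {developers:,}")
```

Nothing clever is happening here, but writing it down makes it obvious how sensitive the baseline is to the developer share: every 0.01 percentage point moves the result by half a million people.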
In the popularity contest we will use two sources of data - the TIOBE index and Langpop. Other sources, such as the Dataist figures, were hard to interpret, so we’ll stick to just those two.
For the background - the TIOBE ratings are calculated by counting hits of the most popular search engines. The search query that is used is
+"<language> programming", e.g. +“Java programming” in our case.
Langpop uses more sources of input than just search engine queries - it tracks, with equal weights, open job positions, book titles, search engine results, the number of open source projects and other data to calculate its popularity score.
Simplifying TIOBE and Langpop results, we can conclude that according to TIOBE 17% and according to Langpop ~15% of the programmers in the world are using Java. Averaging those numbers we can say that around 16% out of the 43,000,000 developers in the world use Java. This translates to 6,880,000 Java developers out there.
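The same calculation as a small Python sketch. The dictionary and variable names are purely illustrative; the percentages are the simplified TIOBE and Langpop shares quoted above, and the developer total is the baseline from the labour statistics.

```python
TOTAL_DEVELOPERS = 43_000_000  # baseline from the labour statistics

# Simplified share of programmers using Java, in percent, per index.
java_share_percent = {"TIOBE": 17, "Langpop": 15}

# Plain unweighted average of the two indexes.
average_share = sum(java_share_percent.values()) / len(java_share_percent)
java_developers = round(TOTAL_DEVELOPERS * average_share / 100)

print(f"average Java share: {average_share:.0f}%")
print(f"Java developers:    {java_developers:,}")
```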
Job portals, especially when considering both available positions and uploaded resumes, are definitely a good source of information. The larger ones also provide nice reports on the labour market, which we will dig into next. Note that we used Indeed.com and Monster.com - if you can point us towards more and/or better sources of information, we would be glad to correct our calculations.
Using this analysis from Monster.com and the aggregated statistics from Indeed.com, we can say that ~18% of Monster.com applicants can program in Java and ~16% of the open engineering / programming positions scanned by Indeed.com are looking for Java talent. Averaging those numbers we arrive at 17%, which out of 43,000,000 programmers in total translates to 7,310,000 Java guys and girls in the world.
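Since the two job portals do not agree exactly, it is worth seeing how far apart the resulting estimates are. The sketch below (names are illustrative, inputs are the rounded shares quoted above) applies each share to the 43,000,000 baseline separately and then averages the results.

```python
TOTAL_DEVELOPERS = 43_000_000  # baseline from the labour statistics

# Rounded Java shares in percent, as quoted in the text.
java_share_percent = {
    "Monster.com applicants": 18,
    "Indeed.com open positions": 16,
}

# Integer arithmetic keeps the per-source estimates exact.
per_source = {
    name: TOTAL_DEVELOPERS * percent // 100
    for name, percent in java_share_percent.items()
}
average = sum(per_source.values()) // len(per_source)

for name, estimate in per_source.items():
    print(f"{name}: {estimate:,}")
print(f"average: {average:,}")
```

The two sources alone are 860,000 developers apart, which gives a feeling for how much precision to expect from the final figure.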
Every Java developer uses something to build the application. Well, we expect them to use at least a JVM and a compiler. If you happen to know anyone who can get away without those two, please let us know. We would hire him immediately.
But most of us tend to use more than just a compiler and a virtual machine. We use IDEs, application servers, build tools, etc. So we figured that we would look into the publicly available download numbers of these tools and try to estimate the number of developers from the download numbers.
When calculating the total number of developers from estimated number of users, we take into account the market share of the corresponding software. To estimate the market share we use Zeroturnaround’s statistics gathered in the spring of 2012.
Eclipse downloads. Eclipse Juno was released on June 27 and was downloaded 1,200,000 times during its first 20 days. Looking at the historical data published by eclipse.org, we can predict that Juno will be downloaded approximately 8,000,000 times in total. The last four major Eclipse releases have all followed a yearly release calendar, and all of them were released in June:
- Juno - 8,000,000 downloads (projected for the first year, assuming the trend continues; currently at 1,200,000 downloads in the first 20 days)
- Indigo - 6,000,000 downloads
- Helios - 4,100,000 downloads
- Galileo - 2,200,000 downloads
Averaging Juno estimates and Indigo results, we can say that Eclipse is downloaded approximately 7,000,000 times a year.
Using Zeroturnaround’s statistics, we expect 68% of Java developers to use Eclipse as their (primary) IDE.
If we now make the bold claim that each Java developer on Eclipse downloads the IDE exactly once a year, take the number of downloads per year to be 7,000,000 and consider that 32% of Java developers do not use Eclipse at all, we come to the conclusion that there should be 10,300,000 Java developers in total.
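To see how much hangs on the "exactly once a year" claim, here is a rough Python sketch of the Eclipse estimate. The function and its `downloads_per_developer` parameter are hypothetical, introduced only for illustration; the 7,000,000 downloads and the 68% share are the figures from the text.

```python
def developers_from_downloads(yearly_downloads, market_share,
                              downloads_per_developer=1.0):
    """Scale tool downloads up to an estimate of all Java developers.

    downloads_per_developer is an illustrative knob, not a measured value:
    1.0 reproduces the "exactly once a year" assumption from the text.
    """
    tool_users = yearly_downloads / downloads_per_developer
    return round(tool_users / market_share)


once_a_year = developers_from_downloads(7_000_000, 0.68)
twice_a_year = developers_from_downloads(7_000_000, 0.68,
                                         downloads_per_developer=2.0)

print(f"one download per developer:  {once_a_year:,}")
print(f"two downloads per developer: {twice_a_year:,}")
```

With one download per developer the result rounds to the 10,300,000 quoted above. If the average Eclipse user actually grabbed two copies a year (a new laptop, a broken workspace), the estimate would be cut in half.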
Apache Tomcat downloads. Vadim Gritsenko has put together some nice statistics on top of Apache logs. From there we can see that during the last year Tomcat has been downloaded approximately 550,000 times/month. This gives us a yearly total of 6,600,000 Tomcat downloads.
Applying statistics from the same report used for calculating Eclipse’s market share, we can estimate that 59% of Java developers use Tomcat as one of their development platforms.
If we again make the bold claim that each Java developer on Tomcat downloads every major release exactly once and consider that 41% of Java developers do not use Tomcat, we reach the conclusion that there should be 11,186,000 Java developers out there.
Averaging the numbers from Eclipse and Tomcat downloads, we end up with 10,743,000 Java developers.
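The Tomcat figure and the combined download estimate can be retraced the same way. This is a sketch under the same assumptions as the text (one download per Tomcat user per year, a 59% share); the constant names are mine, and the Eclipse figure is simply the rounded 10,300,000 from above.

```python
ECLIPSE_ESTIMATE = 10_300_000     # rounded result of the Eclipse calculation

TOMCAT_MONTHLY_DOWNLOADS = 550_000
TOMCAT_SHARE = 0.59               # share of Java developers using Tomcat

tomcat_yearly = TOMCAT_MONTHLY_DOWNLOADS * 12

# Assumption from the text: one download corresponds to one Tomcat user.
# The result is rounded to the nearest thousand, as in the article.
tomcat_estimate = int(round(tomcat_yearly / TOMCAT_SHARE, -3))

download_based = (ECLIPSE_ESTIMATE + tomcat_estimate) // 2

print(f"Tomcat downloads per year: {tomcat_yearly:,}")
print(f"Tomcat-based estimate:     {tomcat_estimate:,}")
print(f"average with Eclipse:      {download_based:,}")
```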
We used three different sources for estimation - popularity contests, job market analysis and download numbers of popular Java development infrastructure products. The numbers varied quite a bit - from 6,880,000 to 10,743,000. Aggressively averaging the three numbers, we can conclude that there are 8,311,000 Java developers out there. Not quite as many as Oracle or Wikipedia think, but still enough to build a business that provides development tools for the Java community.
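For completeness, the "aggressive averaging" itself, as a short Python sketch. The labels are mine; the three inputs are the per-source results from the sections above.

```python
# Per-source estimates of the number of Java developers, from the text.
estimates = {
    "popularity indexes": 6_880_000,
    "job portals": 7_310_000,
    "tool downloads": 10_743_000,
}

# Unweighted mean: each source counts equally, however shaky it is.
average = sum(estimates.values()) // len(estimates)
spread = max(estimates.values()) - min(estimates.values())

print(f"average: {average:,}")
print(f"spread:  {spread:,}")
```

The spread between the lowest and highest estimate is almost half of the average, so the final number is best read as "somewhere around eight million" rather than as a precise headcount.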
Lies. Damn lies. And statistics.