A few articles in the past week prompted some thinking about the industry of software development. When I say software development, I am talking about developing websites, web applications, commercial software, enterprise software and almost anything else that requires someone who can code. Compared with other scientific industries, software development is still young, but more importantly it is one of the few industries that anyone can get into.
First, let’s look at the articles that prompted all of this. On DZone, there was a post asking us to embrace the fact that we are bad programmers. The first paragraph really summarizes the post:
How many developers think they’re good programmers? We’re not. We’re all fairly bad at what we do. We can’t remember all the methods our code needs to call, so we use autocompleting IDEs to remind us. And the languages we’ve spent years, in some cases decades, learning? We can’t even type the syntax properly, so we have compilers that check and correct it for us.
Admittedly, the post does talk about the benefits of unit testing and other good things, but the premise is interesting. Why are we still dependent upon IDE auto-completion? Why do we really need compilers and interpreters to tell us that we typed the syntax incorrectly? Why are we still talking about syntax?
There are really two parts to this problem. The first is that computer-aided software engineering (CASE) is mostly a failed segment of the industry. The tools were not that bad, but the industry is not yet mature enough to properly specify the type of functionality we need in an application. Part of the reason is that software development is hard. It may also be related to the fact that the industry is mostly unregulated. Pharmaceuticals are heavily regulated. Architecture and construction are regulated, or at least have rules they must follow. Software engineering has soft guidelines at best. I am not talking about SDLC processes like Waterfall, XP and Scrum; I am talking about things our code must or must not do. Static analysis tools like PMD or FindBugs may be part of this, but for the industry to really mature, it feels like something is missing.
Another part of this is the certification or licensing required for each of these jobs. Becoming an architect typically requires an advanced degree, and depending upon what you design, you may need certifications or licenses. Scientists at a drug development company eventually need at least an MS degree, and in many cases a PhD. You cannot become an architect or a scientist without these advanced degrees or certifications. I am not calling for certifications in software engineering, but this is one of the few industries where you can just pick up a book, start coding and get a job. This does not mean that these developers are bad developers, just that anyone can become a developer, good or not.
The second part of this is that the software development industry has become heavily focused on the application side of computer science. I believe this was brought on by the industry itself. When I was in college, there was already pressure from industry to ensure that people graduating with a degree in computer science could be immediately productive in a business environment. You would hear that developers coming out of school did not know how to build an application. The standard curriculum will always change due to the rapid pace of change in our industry, but changing the curriculum to fit what the industry wants may not be the answer. I would argue that the point of education is to teach students the basics of a subject and to teach them how to learn more. If we train students to be more marketable to industry when they graduate, we lose some of the core knowledge we are supposed to give them. Some of the “leading” universities still have programs that teach a breadth of information, but some smaller or more local universities are going the “marketable graduate” route.
The other post that prompted this was from Dave Winer regarding doing more computer science.
I’d like to see young computer science students get up and present to their teachers some new computer science. Some new use of graph theory. Or an old one re-applied to a modern world. We used to do that, when I was a comp sci grad student in the 1970s. I think we got way too caught up in the commercialism.
Why is this a problem? Well, the real issue is that in order to get to the scaling specialist realm of expertise (or any specialization), or even to build one of these fancy social applications, you need to know a lot of information:
- Basic data structures – What is the difference between a linked list and a hashmap?
- Basic algorithms – There are several sorting algorithms and many suck. There are also different algorithms if you are memory constrained, CPU constrained or disk constrained.
- Object Oriented Programming and other programming types – Object oriented programming is not the only type of programming. There is also procedural, functional, logic programming, and other strange paradigms. Knowing some basic information about these can really make you think about a problem differently.
- Database systems theory – Basic knowledge of how databases work is helpful in understanding why your application is slowing down. The NoSQL movement may not be the same as RDBMS technology, but some of the core concepts still come from standard database theory.
- Theory of Computation – There is a lot of theory behind the often mentioned Big-O notation. This helps you understand why your algorithm is so damned slow.
- Statistics and Probability – Much of computer science theory is based on mathematics. Statistics and probability also helped forge many concepts in machine learning and artificial intelligence.
- Machine Learning and Artificial Intelligence – This is typically kept outside of the standard curriculum, but many applications are based on recommendations, categorizations or prediction. A basic understanding of what is possible would be enough to give you an idea of where to look next.
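To make the first bullet concrete, here is a minimal sketch of the linked list versus hashmap difference. The names (`Node`, `build_linked_list`, `linked_list_lookup`) and the sample data are made up for illustration. The linked list has to walk node by node to find a key, while Python's built-in `dict` (a hash table) jumps to the entry by hashing the key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One link in a singly linked list of key/value pairs (illustrative)."""
    key: str
    value: int
    next: Optional["Node"] = None


def build_linked_list(pairs):
    """Build a singly linked list that preserves the order of `pairs`."""
    head = None
    for key, value in reversed(pairs):
        head = Node(key, value, head)
    return head


def linked_list_lookup(head, key):
    """Walk the list from the head; return (value, nodes_visited)."""
    visited = 0
    node = head
    while node is not None:
        visited += 1
        if node.key == key:
            return node.value, visited
        node = node.next
    return None, visited


# Hypothetical sample data: 1000 users keyed by name.
pairs = [(f"user{i}", i) for i in range(1000)]
head = build_linked_list(pairs)
table = dict(pairs)  # hash table: average-case O(1) lookup by key

# The last key is the worst case for the list: every node is visited.
value, visited = linked_list_lookup(head, "user999")
print(f"linked list: value={value}, nodes visited={visited}")

# The dict finds the same value without scanning the other entries.
print(f"hashmap:     value={table['user999']}")
```

The list lookup grows linearly with the number of entries, which is exactly the kind of thing Big-O notation describes. The trade-off is that the hash table uses extra memory and does not keep entries in a meaningful sorted order.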
This is just a basic list of foundational knowledge. As you dig into the more advanced topics, you can find tons of interesting information. Many graduates of computer science do not take courses in each of these areas. I believe this focus on job readiness has forced us to skip the courses that help us create the next computing paradigms. This is why Dave Winer complains about our need to do more computer science, because less computer science is being done in schools. He talked about the companies at TechCrunch Disrupt recently:
The products were all cut from a small number of molds. Somehow they have fixated on turning the internet into a game.
How do all of these things relate to each other? The lack of tools to formalize the process could be due to our lack of foundational computer science. There is a large breadth of information that needs to go into tools that would give us standard rules of programming. The lack of education in advanced topics also hurts computer science innovation. When computer science innovation lags, there are fewer opportunities for “world-changing” computing technologies. Even if we do not target world-changing ideas, a greater breadth in education would give you different ideas on what you could accomplish.
It may not be that you are a bad programmer; we all make mistakes. It may not be that you are using a bad process, even though we all know that the Waterfall process has its problems. You may not have a formal education in computer science either. None of this excuses us from learning from our mistakes, learning more about the software development process, and learning more about areas of computer science that are unknown to us.