The Benefits of Testable Code
One of the most beneficial approaches that you can take in software development is to follow a test-driven development approach. While we all understand why it's a good thing, it's often left to slide in favour of adding functionality to reach a deadline, or just to "save" time.
One of the challenges for test-driven development is making sure that your code is testable. Miško Hevery wrote an article about this very topic a while back. I thought he would be the ideal person to talk to me about testable code and test-driven development.
James Sugrue: Hi Miško, can you introduce yourself to the readers?
Miško Hevery:
My name is Miško Hevery, and I build web-applications for a living. My passion is sharing knowledge among developers, as I think that developing software is too much of an art, and I would like it to become a lot more of a science. I spend a lot of time giving tech talks, both internally at Google and at other companies and conferences, and I mostly focus on how to build better software by focusing more on testability. I want to bring software development back into the realm of everyday users and our teenagers, who will be our next generation of software developers. To that end I am working on allowing others to build web-applications using a novel approach of using nothing but HTML. We call it <angular/>; you can see a demo here:
http://www.getangular.com/screencast
James: <angular/> sounds interesting. Could you give us a quick overview of how it provides the functionality it does through pure HTML?
Miško:
<angular/> has two parts:
(1) HTML is a declarative language, but it is only good at rendering static content. What <angular/> does is extend the HTML language to add data-binding, persistence, and looping constructs. In essence it makes your HTML smarter and better suited for building web-applications.
(2) The second part of <angular/> is your data in the cloud. We provide a REST API to your data and we do the hosting. This liberates the data, as it can be accessed from anywhere and by anyone (not just by the <angular/> client), according to the security rules you specify.
The end product is that you can create an HTML form and, through data-binding, specify where different parts of your data should appear on the page. <angular/> takes care of fetching the data and merging it with the HTML template. And since it is just pure HTML, the HTML can be hosted anywhere, such as on a static host, or embedded in existing systems such as your blog (example) or your content-management system.
James: So, back to testability. Why do you think that it's taken so long for test-driven development to get proper acceptance? After all, it seems like the most sensible way to develop.
Miško:
TDD may seem natural after you have done it for a while, but it is no more natural than riding a bike. It is a skill like any other, and as such the beginnings are hard and it takes a long time to master. Most of us learn to code without TDD, and by the time we encounter it we have had a lot of experience coding in "cowboy" style. Learning something new will slow us down at first, just as walking or running is a faster means of transportation than a bicycle when you are first learning to ride it. However, I think the tide is slowly shifting. Many new languages have a TDD culture baked right in (such as Ruby); you can see that in the tool support you get right out of the box.
James: Let's look at your recent article on testability. I'm going to go through a few of the points you've mentioned with a few questions.
You list mixing object graph construction with application logic as a flaw. Why does separating them help with testing? Surely there are more important things to do first?
Miško:
Testing is all about instantiating a small portion of the application, exercising the code, and asserting that it does something useful. The key here is instantiating a small portion of the application. If you cannot instantiate, you cannot test, and if you can only instantiate the whole application, then you cannot have a small unit test. So instantiation becomes a kind of prerequisite which no one thinks about until they are trying to write a test. Imagine you are trying to test that the cookie handling logic correctly locates a user. You try to instantiate the class responsible, but that class needs other classes, and so, like a single thread hanging on your sweater, you pull on it and other things show up as dependencies, and your small test grows to the point where you have to make sure you have a real database, a Kerberos authentication server, and most of your application running. That is not what we set out to do. It turns out that being able to control the way the individual classes in an application are wired together is very important if we want to keep the code testable on a small scale. For this reason, we have to make sure that the code which does the wiring (your new operators) is not interspersed with the code which does the work (conditionals and loops).
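As an illustration of the cookie example, here is a minimal Java sketch. It is not taken from the interview or the article: the names UserStore, CookieHandler, DatabaseUserStore and ApplicationFactory, and the "session=<id>" cookie format, are all assumptions made for the example. The idea it tries to show is that the class with the conditionals contains no new operators for its collaborators, while the factory contains the new operators and no logic.

```java
// Illustrative sketch; every name here is hypothetical.
interface UserStore {
    // Returns the user name for a session id, or null if there is none.
    String findUserBySession(String sessionId);
}

// Logic only: conditionals, but no `new` of collaborators.
class CookieHandler {
    private final UserStore store;

    CookieHandler(UserStore store) {
        this.store = store;
    }

    // Expects a cookie of the assumed form "session=<id>".
    String locateUser(String cookie) {
        if (cookie == null || !cookie.startsWith("session=")) {
            return null;
        }
        return store.findUserBySession(cookie.substring("session=".length()));
    }
}

// Stand-in for a store that would need a real database in production.
class DatabaseUserStore implements UserStore {
    public String findUserBySession(String sessionId) {
        throw new UnsupportedOperationException("requires a running database");
    }
}

// Wiring only: `new` operators, but no conditionals or loops.
class ApplicationFactory {
    CookieHandler createCookieHandler() {
        return new CookieHandler(new DatabaseUserStore());
    }
}
```

With this split, a unit test can construct CookieHandler with a tiny in-memory UserStore (for example a lambda) and never touch the database.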
James: Your point on dependency injection is very valid. Some people still have trouble grasping the concept of DI. Could you explain it to us in simple terms?
Miško:
I always say: "Ask for things, don't look for things". What I mean here is that if you need a class X, you should ask for it either in the constructor or in the method signature, rather than looking for it by instantiation (the new operator), by looking it up in a global space (aka Singleton.getInstance()), or by literally looking for it in a service locator. What your class is in effect saying is: in order to get my job done, I need A, B, and C; therefore I will ask for A, B, and C in the constructor rather than instantiating A, getting B from global space using B.getInstance(), or looking C up in a service locator as in Registry.get().getC(). This means that you leave the job of instantiating and wiring to someone else, and that someone else is the factory. This gives you a tremendous amount of freedom, and not just in testing, as you can easily wire the application differently to get different behavior. For example, you can wire in an AlwaysAuthenticated class instead of KerberosAuthenticator, which makes development easier, as the developers no longer have to authenticate while testing.
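A small Java sketch of "ask for things, don't look for things" might look like the following. AlwaysAuthenticated and KerberosAuthenticator are the class names Miško mentions; the Authenticator interface, the LoginService class and all method signatures are assumptions added for illustration, and the Kerberos class is only a stub.

```java
// Illustrative sketch; the interface and method names are assumptions.
interface Authenticator {
    boolean authenticate(String user, String password);
}

// Stub only: a real implementation would talk to a Kerberos server.
class KerberosAuthenticator implements Authenticator {
    public boolean authenticate(String user, String password) {
        throw new UnsupportedOperationException("requires a Kerberos server");
    }
}

// Development and test wiring: everyone is let in.
class AlwaysAuthenticated implements Authenticator {
    public boolean authenticate(String user, String password) {
        return true;
    }
}

// Asks for its collaborator in the constructor instead of looking it up.
class LoginService {
    private final Authenticator authenticator;

    LoginService(Authenticator authenticator) {
        this.authenticator = authenticator;
    }

    String login(String user, String password) {
        return authenticator.authenticate(user, password)
                ? "welcome " + user
                : "denied";
    }
}
```

Because LoginService never calls new KerberosAuthenticator() or a static lookup itself, the choice between the two authenticators is made entirely by whoever constructs it.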
James: Doing work in the constructor: it's tempting to do this when you're getting something going initially. Would you support my theory that you can just refactor it later? Or do you think it's best to work the clean way all the time?
Miško:
It may be tempting, but every time I have said, "This time it is different", I have ended up getting bitten by it later and having to refactor. What kills you is that wiring small portions of the application together is by far the most common thing you will do in your unit tests, which means that any work in the constructor will have to execute many times over. Say you want to test a printer. Its print method needs a document. What you want is to say new Document() and get on with the business of testing the printer. But if the Document constructor does work, say an HTTP request to fetch the content, then you will get distracted trying to get your document instantiated while what you really want to do is test your printer. To make tests easy you need to be able to construct objects at will; otherwise you will spend way too much time fighting with the initialization logic in your test and will lose focus.
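The printer example can be sketched in Java roughly as follows. Document and Printer come from the answer above; ContentFetcher, DocumentFactory and the BEGIN/END output format are hypothetical names and details chosen for the sketch. The constructor only assigns a field, and the slow fetching is moved to a separate object.

```java
// Illustrative sketch; ContentFetcher and DocumentFactory are hypothetical names.
interface ContentFetcher {
    String fetch(String url);
}

class Document {
    private final String content;

    // The constructor only assigns fields, so `new Document("...")` is cheap.
    Document(String content) {
        this.content = content;
    }

    String content() {
        return content;
    }
}

// The slow work (for example an HTTP request) lives here, outside the constructor.
class DocumentFactory {
    private final ContentFetcher fetcher;

    DocumentFactory(ContentFetcher fetcher) {
        this.fetcher = fetcher;
    }

    Document load(String url) {
        return new Document(fetcher.fetch(url));
    }
}

class Printer {
    // Returns the text that would be sent to the device.
    String print(Document document) {
        return "BEGIN\n" + document.content() + "\nEND";
    }
}
```

A printer test can then simply call new Printer().print(new Document("hello")) without any network access.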
James: I really like the analogy of global state to goto. Surely there's no excuse for people using global state anymore.
Miško:
That is what I think, but the thing is that most people fail to realize two things: (1) Singletons are global state (they have a static field, usually called instance, in them), and (2) global state is transitive, which means that even if you have a single static field inside a single Singleton, any object which is accessible from that single instance is still globally accessible. Most applications which I have come across are full of Singletons and hence full of global state. Here is a simple test: could you instantiate two instances of your application in a single JVM? If you have any global state, then the answer is no. But why would you want to instantiate two copies in a single JVM? Easy: every test is an instantiation of a small portion of the application, and all tests run in a single JVM.
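The "two instances in one JVM" test can be tried out with a short Java sketch. All class names below (GlobalRegistry, GlobalApp, EventLog, InjectedApp) are invented for the illustration. The first pair shares state through a static field, so two "applications" interfere with each other; the second pair receives its state through the constructor, so each instance is isolated.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; all names are hypothetical.
// Singleton pattern: a static field makes `events` reachable from anywhere.
class GlobalRegistry {
    private static final GlobalRegistry INSTANCE = new GlobalRegistry();
    final List<String> events = new ArrayList<>();

    private GlobalRegistry() {
    }

    static GlobalRegistry getInstance() {
        return INSTANCE;
    }
}

// Looks up its state globally, so every GlobalApp shares the same list.
class GlobalApp {
    void start() {
        GlobalRegistry.getInstance().events.add("started");
    }

    int eventCount() {
        return GlobalRegistry.getInstance().events.size();
    }
}

// Plain class with a public-style constructor and no static state.
class EventLog {
    final List<String> events = new ArrayList<>();
}

// Asks for its state, so each InjectedApp can have its own EventLog.
class InjectedApp {
    private final EventLog log;

    InjectedApp(EventLog log) {
        this.log = log;
    }

    void start() {
        log.events.add("started");
    }

    int eventCount() {
        return log.events.size();
    }
}
```

Starting one GlobalApp changes what a second GlobalApp observes, whereas two InjectedApp instances built with separate EventLog objects do not affect each other.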
James: You hate singletons. I believe there's even a code checker at Google for this. But for me the jury's out on this one. I don't like too many singletons for sure, but just one in my application could possibly help with things. Would you agree? Can you explain your anti-singleton-ness to us?
Miško:
First let's get our terminology straight: (1) There are Singletons (capital S), as in the name given to the design pattern, which has a private constructor and a global internal variable usually called instance. We get hold of a Singleton by calling a static method, as in Singleton.getInstance(). (2) There are also singletons (lowercase s), which simply means that we have a normal class (public constructor, no global state), and inside our application we only have a single instance of it. We get hold of a singleton by asking for it in the constructor. There are a lot of reasons to have singletons, as in a single instance of something shared among many other objects, such as a cache, and I use singletons all the time. The problem is not with having singletons (one of something), but rather with having Singletons (one of something, and it is global). Asking for a singleton through dependency injection (no global state) is good practice, but getting hold of a Singleton through global state is a recipe for disaster.
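The lowercase-s singleton can be sketched in Java as an ordinary class that the wiring code happens to instantiate once and share. The cache is the example Miško gives; the class names Cache, ProfileService, GreetingService and AppWiring, and their methods, are assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; all names are hypothetical.
// A normal class: no private constructor, no static instance field.
class Cache {
    private final Map<String, String> entries = new HashMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }
}

class ProfileService {
    private final Cache cache;

    ProfileService(Cache cache) {
        this.cache = cache;
    }

    void rename(String userId, String name) {
        cache.put(userId, name);
    }
}

class GreetingService {
    private final Cache cache;

    GreetingService(Cache cache) {
        this.cache = cache;
    }

    String greet(String userId) {
        return "hello " + cache.get(userId);
    }
}

// Wiring: exactly one Cache per application, shared by asking, not by looking.
class AppWiring {
    final ProfileService profiles;
    final GreetingService greetings;

    AppWiring() {
        Cache shared = new Cache();
        profiles = new ProfileService(shared);
        greetings = new GreetingService(shared);
    }
}
```

Within one AppWiring both services see the same cache, yet a second AppWiring (for example in another test) gets its own, because nothing is stored in a static field.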
James: You sum up the issue with static methods really well when you say you don't know how to test the main method. Why do you think there is such a fascination with static methods when we have such a good object-oriented language?
Miško:
OO is not intuitive. If it were, then the first language we built would have been OO. OO is about collaboration between objects, whereas procedural code is about telling someone the steps needed to get something done. Static methods are procedural, whereas instance methods are OO. We are wired to break things down into a list of things to be done, instead of thinking about an ecosystem of objects which can collectively get things done. Using static methods sparingly is probably OK, but static methods beget more static methods, since once you cross the boundary from the OO world to the procedural one, it is unlikely that you will come back. To me it is a slippery slope. Rather than spending time thinking about whether a static method is OK in this case, I have a rule not to think about it and just avoid them.
James: Can you explain the composition over inheritance rule a bit more? If I understand you correctly, you mean that it's good practice to avoid using inheritance to reduce the amount of clutter?
Miško:
When I first learned about inheritance I got super excited. With just a single keyword I could get so much behavior into my object. I once wrote a class Pixel (I was doing image processing) and I needed to keep it in a list, so I made Pixel inherit from a list. I was so proud of myself. Life was good until I needed to keep the pixel in two lists at once, and I could not, as the pixel was a list and it could only be part of one list. In retrospect it is the worst thing I could have done. The point of inheritance is to take advantage of polymorphic behavior, not to reuse code, and people miss that; they see inheritance as a cheap way to add behavior to a class. When I design code I like to think about options. When I inherit, I reduce my options. I am now a subclass of that class and cannot be a subclass of something else. I have permanently fixed my construction to that of the superclass, and I am at the mercy of the superclass changing its APIs. My freedom to change is fixed at compile time. On the other hand, composition gives me options. I don't have to call the superclass. I can reuse different implementations (as opposed to reusing only the superclass's methods), and I can change my mind at runtime. Which is why, if possible, I will always solve my problem with composition over inheritance. (But only inheritance can give you polymorphism.)
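The Pixel story can be restated as a small Java sketch. Pixel is the class from the anecdote; PixelGroup and its methods are hypothetical. Instead of a pixel being a list (inheritance), a group has a list of pixels (composition), so the same pixel can belong to any number of groups.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; PixelGroup is a hypothetical name.
// A pixel is just a pixel; it does not extend any list type.
class Pixel {
    final int x;
    final int y;

    Pixel(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Has-a list instead of is-a list.
class PixelGroup {
    private final List<Pixel> pixels = new ArrayList<>();

    void add(Pixel pixel) {
        pixels.add(pixel);
    }

    boolean contains(Pixel pixel) {
        return pixels.contains(pixel);
    }

    int size() {
        return pixels.size();
    }
}
```

Because the list is held by PixelGroup rather than inherited by Pixel, swapping ArrayList for another List implementation later would not change the Pixel class at all.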
James: Could you give a concrete example of how polymorphism is better to use than conditionals?
Miško:
One of the cool things about OO is that polymorphism gives you different behavior based on the instance. To achieve the same thing in a procedural language you would have to use a switch statement and place the code from each of the subclasses into the corresponding case statements. It is also common to have multiple switch structures in different locations in your code. For example, rendering and parsing a date would be in different methods, each of which would switch on type. Adding new behavior, such as GPS coordinates, would require you to remember to change both locations. In OO you can replace what you are switching on with just a dispatch to a method. As a side benefit, the parsing and rendering code for any one type now live next to each other, and you can add new types without recompiling the whole project. An if statement is just a switch with two branches, so it is equivalent. Now, there are some ifs which can never be replaced, such as relative comparisons (<, >), but most of the ifs in an application are actually of the flag type, as in if (green && short) {leprechaun code}. Here we are switching on type, and OO has types as first-class citizens, so we are reinventing something which the language already has.
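The date and GPS example could look roughly like this in Java. The Field interface, the DateField, GpsField and Form classes, and the text formats they accept are all assumptions for the sketch, not something described in the interview. Each type keeps its own parsing and rendering side by side, and the caller dispatches to a method instead of switching on a type flag.

```java
import java.util.Locale;

// Illustrative sketch; all names and text formats are assumptions.
interface Field {
    void parse(String text);

    String render();
}

// Parses "yyyy-mm-dd" and renders "dd/mm/yyyy".
class DateField implements Field {
    private int year;
    private int month;
    private int day;

    public void parse(String text) {
        String[] parts = text.split("-");
        year = Integer.parseInt(parts[0]);
        month = Integer.parseInt(parts[1]);
        day = Integer.parseInt(parts[2]);
    }

    public String render() {
        return String.format(Locale.ROOT, "%02d/%02d/%04d", day, month, year);
    }
}

// Parses "lat,lon" and renders "lat=.. lon=.." with two decimals.
class GpsField implements Field {
    private double latitude;
    private double longitude;

    public void parse(String text) {
        String[] parts = text.split(",");
        latitude = Double.parseDouble(parts[0].trim());
        longitude = Double.parseDouble(parts[1].trim());
    }

    public String render() {
        return String.format(Locale.ROOT, "lat=%.2f lon=%.2f", latitude, longitude);
    }
}

// The caller has no switch: it just dispatches to whatever Field it is given.
class Form {
    String normalize(Field field, String text) {
        field.parse(text);
        return field.render();
    }
}
```

Adding a new kind of field means adding one class that implements Field; Form and the existing fields stay untouched.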
James: Your last two points are all to do with mixing concerns. Do you think it's always because we rush things out the door?
Miško:
Even if you give people more time, they will still mix concerns and make systematic mistakes. They make these mistakes because they have not fully comprehended the implications and because that is how they were taught, so giving them more time will not make the code any better. People like to use "I was under time pressure" as a defense after they realize that some of the decisions they made have cost them dearly. If you know that a decision will cost you dearly in the future, you will not make it, no matter what the time pressure is. Blaming time pressure in hindsight misses the real problem, which is that you simply did not know back then. I discovered this when I was on a project where people were complaining about the state of the code and blaming time pressure. A decision was made to rewrite a small portion, so you would think that people would not make these mistakes again, but on the contrary, they made the same mistakes again; they even copied the problematic code to the new project. This led me to realize that it is easy to see that something sucks, but it is not so easy to realize what the root cause is, or to do anything about it, and so we just blame time pressure to make ourselves feel better.
Mixing of concerns is one of those subtle things which everyone agrees on in principle and then breaks when they start coding. They make the Car class responsible not just for driving (a legitimate concern), but also for building itself, building the engine, and loading the driver from disk.
James: One of my pet hates is the mixing of business logic in user interface code. Does that annoy you too, or have you seen worse?
Miško:
Mixing logic with presentation can make code hard to understand, but that in itself does not necessarily make it hard to test or maintain. Similarly, having separation does not automatically mean testability if you did not properly separate the wiring and logic code. While I personally think it is a good idea and always try to separate them as much as I can, it is important to realize that separation is not for the sake of separation, but for the sake of something greater. I separate because it makes my code easier to test and because it lowers my cognitive load when I am working with it. I feel that many times people separate things because they have always been told to, but they fail to gain any benefits from it, since they have done it the way they were told rather than to achieve some specific benefit. I have broken this rule on a few occasions, since I was able to achieve my goals of lowering the cognitive load and improving testability without paying the added cost of separation, and that is fine, as long as you know what your goals are.
James: Let's look at the complicated case. I've decided that unit tests are the way to go in my project. But I've a load of legacy code. Unit testing will be tough. Is it worth the refactor? How can I make this easier on myself?
Miško:
It is important not to lose sight of the goal, which is to deliver the project. The way I usually solve this is that I write the new code in a clean, testable TDD way, and then I use one of two strategies:
1. I can wrap the legacy code in a clean interface and write glue code between my ideal interface and the hard-to-test legacy code. I will probably have a hard time testing the glue/adapter code, but at least everything else is testable.
2. Or, I can go and refactor the existing code, but only as much as I need to achieve my testability goals.
Which of the two strategies you choose will depend on your experience and the complexity of the refactoring you are facing. One thing I have noticed is that I often start from the opposite end from where most people start. I write my ideal test, and then try to figure out how I can make it work through refactoring and/or adapters, rather than trying to write a test for the current code.
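The first strategy, wrapping legacy code behind an ideal interface, can be sketched in Java as below. Everything here is hypothetical: LegacyBilling stands in for some hard-to-test legacy code with a static entry point and an awkward string format, PriceCalculator is the "ideal interface", LegacyPriceAdapter is the glue code, and Checkout is the new code written test-first.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Illustrative sketch; all names and formats are hypothetical.
// Stand-in for legacy code: static entry point, semicolon-separated input.
class LegacyBilling {
    static int calcTotalCents(String semicolonSeparatedCents) {
        int total = 0;
        for (String part : semicolonSeparatedCents.split(";")) {
            if (!part.isEmpty()) {
                total += Integer.parseInt(part);
            }
        }
        return total;
    }
}

// The ideal interface, designed from the point of view of the new code.
interface PriceCalculator {
    int totalCents(List<Integer> pricesInCents);
}

// Glue code: the only place that knows about the legacy format.
class LegacyPriceAdapter implements PriceCalculator {
    public int totalCents(List<Integer> pricesInCents) {
        String encoded = pricesInCents.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(";"));
        return LegacyBilling.calcTotalCents(encoded);
    }
}

// New code, written against the ideal interface and easy to test with a fake.
class Checkout {
    private final PriceCalculator calculator;

    Checkout(PriceCalculator calculator) {
        this.calculator = calculator;
    }

    String receipt(List<Integer> pricesInCents) {
        int total = calculator.totalCents(pricesInCents);
        return String.format(Locale.ROOT, "Total: %d.%02d", total / 100, total % 100);
    }
}
```

Tests for Checkout can pass in a trivial PriceCalculator, while production wiring passes new LegacyPriceAdapter(); only the thin adapter remains coupled to the legacy code.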
James: If I were to give a statement on the importance of testability, it would go along the lines of "Developer testing early saves you time later, even if it seems to take up time. Writing testable code makes your code more coherent, cleaner and easier to come back to". What would your statement be?
Miško:
Firstly, let me point you to an article where I discuss the cost of testing.
I love TDD because I don't have to think about breaking things. This removes a level of paranoia and adds the confidence to take on any refactoring job, which keeps your code in top-notch shape. But most importantly, since I don't have to think as much about coding, I can use my brain to think more about the overall vision of the application. Testing does not make your application better because your application sucks less (it has fewer bugs); it makes your application better because you spent more time thinking about the application, and you were able to restructure it.
James: For all the readers of this interview who now see the benefits of testability, what one tip would you give them to get them on the fast track to testable code?
Miško:
Try to find your own toy project where you attempt to learn TDD, before you bring TDD to the monster legacy codebase which you have at work. (Learn to bike on flat land before you go mountain biking down a steep slope.) Test-first may sound simple, but it is a skill like any other, and as such it will take you months to become an expert at it.
James: As an industry, software development is still in the early stages. Do you think that TDD helps to give some quality assurance to what we produce? What are your thoughts on how to make software development a more mature profession?
Miško:
People often expect that TDD produces bug-free software, and when it does not they get disappointed. Bugs come in two flavors: (1) I misunderstood what needs to be built and I built the wrong thing, and (2) I understood correctly, but in the process of translating it to code I made a mistake. TDD can only help you with the second one. I like to think of it this way: TDD helps me build what I intended, but whether what I intended is what needs to be built is a different story. So TDD helps to give some quality assurance, but it is not a silver bullet which solves everything.
The software industry is very much in its infancy, as software is much more of an art than a science right now. Looking back, I see that we are discovering new methodologies by which we build software, and these methodologies are slowly being translated into new languages which embrace them. OO is an example: it started as a methodology for dealing with data and became baked into the language itself. I think the next round of new languages will embrace testing and dependency injection as first-class constructs, helping us further codify the best practices.
Finally, I think the industry needs to reach out to our school system and bring the classes more up to date with what the industry needs. C/C++ may be the perfect languages for building operating systems, virtual machines, and video codecs, but they are the wrong languages for building applications, whether they are desktop applications or web-applications. Today most new graduates come out with knowledge of C/C++ and are thrown into building apps without any knowledge of methodologies for building large-scale systems. I would love the universities to focus more on methodology and generic languages, rather than on one specific language which is 30+ years old.