DZone
The Latest Testing, Deployment, and Maintenance Topics

DevOps Liaison Team
We’ve covered the controversy of a DevOps Team on this blog before. DevOps Teams are dangerous. Many organizations realize that their Dev and Ops groups are so far apart that they need a neutral, expert group that can bring them together. At the same time, there are increasing reports of DevOps teams becoming yet another silo, and one that is often arrogant and disliked. More silos are exactly the opposite of what we are going for, so this is a frightening result.

The dangers in a DevOps Team seem to be that they will:
  • End up owning a lot of things, and be a silo unto themselves
  • Be overaggressive in dictating how teams should work

A little over a week ago at the IBM Innovate conference in Orlando an attendee shared her successful experience creating a DevOps Liaison Team. I am very intrigued by the naming here. When the team is formed with the name and charter to bring the other groups together, it is hard for them to either own systems or dictate.

The other pattern that I liked was what WebMD did in their project to select and implement a deployment automation tool. They went to managers in various traditional silos (Dev, QA, Ops, etc.) and asked for a techie who:
  • Did real work
  • Had the respect of their peers
  • The manager would delegate authority to compromise to

They ended with a team full of skilled engineers who would work together, but then go back to their own teams once the project was done. However, they had formed solid working relationships cross-silo and could become a liaison between their group and others.

Both of these approaches seem to form a DevOps Team of sorts, without creating something evil. They also take a chisel to the walls between groups rather than trying to reorganize radically all at once. I’m encouraged that our industry is beginning to find some healthy patterns for enterprise DevOps adoption. For more on non-evil DevOps teams, check out our recorded webinar: Building a DevOps Team that Isn’t Evil.

What about you?
Have you had success forming a dedicated team that helps the rest of the organization grok DevOps? Have you failed? Leave your tips in the comments area below.
June 18, 2013
by Eric Minick
· 5,673 Views
git: Having a branch/tag with the same name (error: dst refspec matches more than one.)
Andres and I recently found ourselves wanting to delete a remote branch which had the same name as a tag, and therefore the normal way of doing that didn’t work out as well as we’d hoped. I created a dummy repository to recreate the state we’d got ourselves into:

    $ echo "mark" > README
    $ git commit -am "readme"
    $ echo "for the branch" >> README
    $ git commit -am "for the branch"
    $ git checkout -b same
    Switched to a new branch 'same'
    $ git push origin same
    Counting objects: 5, done.
    Writing objects: 100% (3/3), 263 bytes, done.
    Total 3 (delta 0), reused 0 (delta 0)
    To ssh://[email protected]/markhneedham/branch-tag-test.git
     * [new branch] same -> same
    $ git checkout master
    $ echo "for the tag" >> README
    $ git commit -am "for the tag"
    $ git tag same
    $ git push origin refs/tags/same
    Counting objects: 5, done.
    Writing objects: 100% (3/3), 266 bytes, done.
    Total 3 (delta 0), reused 0 (delta 0)
    To ssh://[email protected]/markhneedham/branch-tag-test.git
     * [new tag] same -> same

We wanted to delete the remote ‘same’ branch. The following command would work if we hadn’t created a tag with the same name; instead it throws an error, because the short name ‘same’ matches both refs/heads/same and refs/tags/same on the remote:

    $ git push origin :same
    error: dst refspec same matches more than one.
    error: failed to push some refs to 'ssh://[email protected]/markhneedham/branch-tag-test.git'

We learnt that what we needed to do was refer to the full path for the branch when trying to delete it remotely:

    $ git push origin :refs/heads/same
    To ssh://[email protected]/markhneedham/branch-tag-test.git
     - [deleted] same

To delete the tag we could do the same thing:

    $ git push origin :refs/tags/same
    remote: warning: Deleting a non-existent ref.
    To ssh://[email protected]/markhneedham/branch-tag-test.git
     - [deleted] same

Of course the tag and branch still exist locally:

    $ ls -alh .git/refs/heads/
    total 16
    drwxr-xr-x 4 markhneedham wheel 136B 13 Jun 23:09 .
    drwxr-xr-x 5 markhneedham wheel 170B 13 Jun 22:39 ..
    -rw-r--r-- 1 markhneedham wheel 41B 13 Jun 23:08 master
    -rw-r--r-- 1 markhneedham wheel 41B 13 Jun 23:08 same
    $ ls -alh .git/refs/tags/
    total 8
    drwxr-xr-x 3 markhneedham wheel 102B 13 Jun 23:08 .
    drwxr-xr-x 5 markhneedham wheel 170B 13 Jun 22:39 ..
    -rw-r--r-- 1 markhneedham wheel 41B 13 Jun 23:08 same

So we got rid of them as well:

    $ git checkout master
    Switched to branch 'master'
    $ git branch -d same
    Deleted branch same (was 08ad88c).
    $ git tag -d same
    Deleted tag 'same' (was 1187891)

And now they are gone:

    $ ls -alh .git/refs/heads/
    total 8
    drwxr-xr-x 3 markhneedham wheel 102B 13 Jun 23:16 .
    drwxr-xr-x 5 markhneedham wheel 170B 13 Jun 22:39 ..
    -rw-r--r-- 1 markhneedham wheel 41B 13 Jun 23:08 master
    $ ls -alh .git/refs/tags/
    total 0
    drwxr-xr-x 2 markhneedham wheel 68B 13 Jun 23:16 .
    drwxr-xr-x 5 markhneedham wheel 170B 13 Jun 22:39 ..

As it happens, we’d ended up in this situation by mistake rather than by design, but it was still fun to do a little bit of git digging to figure out how to solve the problem we’d created for ourselves.
June 17, 2013
by Mark Needham
· 21,407 Views
Mockito - Extra Interfaces with Annotations and Static Methods
I recently came across a really bad piece of code that relied on class casting to perform actions on objects. Of course the code needed to be refactored, but sometimes you can't (or don't want to, which is understandable) refactor before you have unit tests of that functionality. In this post I will show how to test such code, how to refactor it, and what I think about such code ;)

As for the project structure: as in the post on the Mockito RETURNS_DEEP_STUBS Answer for JAXB, the classes generated by the JAXB compiler live in the com.blogspot.toomuchcoding.model package. Let's omit the discussion of the pom.xml file since it's exactly the same as in the previous post.

In the com.blogspot.toomuchcoding.adapter package we have adapters over the JAXB PlayerDetails class that provide access to the Player interface.

CommonPlayerAdapter.java

    package com.blogspot.toomuchcoding.adapter;

    import com.blogspot.toomuchcoding.model.Player;
    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 09.06.13
     * Time: 15:42
     */
    public class CommonPlayerAdapter implements Player {
        private final PlayerDetails playerDetails;

        public CommonPlayerAdapter(PlayerDetails playerDetails){
            this.playerDetails = playerDetails;
        }

        @Override
        public void run() {
            System.out.printf("Run %s. Run!%n", playerDetails.getName());
        }

        public PlayerDetails getPlayerDetails() {
            return playerDetails;
        }
    }

DefencePlayerAdapter.java

    package com.blogspot.toomuchcoding.adapter;

    import com.blogspot.toomuchcoding.model.DJ;
    import com.blogspot.toomuchcoding.model.DefensivePlayer;
    import com.blogspot.toomuchcoding.model.JavaDeveloper;
    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 09.06.13
     * Time: 15:42
     */
    public class DefencePlayerAdapter extends CommonPlayerAdapter implements DefensivePlayer, DJ, JavaDeveloper {

        public DefencePlayerAdapter(PlayerDetails playerDetails){
            super(playerDetails);
        }

        @Override
        public void defend(){
            System.out.printf("Defence! %s. Defence!%n", getPlayerDetails().getName());
        }

        @Override
        public void playSomeMusic() {
            System.out.println("Oops I did it again...!");
        }

        @Override
        public void doSomeSeriousCoding() {
            System.out.println("System.out.println(\"Hello world\");");
        }
    }

OffensivePlayerAdapter.java

    package com.blogspot.toomuchcoding.adapter;

    import com.blogspot.toomuchcoding.model.OffensivePlayer;
    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 09.06.13
     * Time: 15:42
     */
    public class OffensivePlayerAdapter extends CommonPlayerAdapter implements OffensivePlayer {

        public OffensivePlayerAdapter(PlayerDetails playerDetails){
            super(playerDetails);
        }

        @Override
        public void shoot(){
            System.out.printf("%s Shooooot!.%n", getPlayerDetails().getName());
        }
    }

Ok, now let's go to the more interesting part.
Let us assume that we have a very simple factory of players:

PlayerFactoryImpl.java

    package com.blogspot.toomuchcoding.factory;

    import com.blogspot.toomuchcoding.adapter.CommonPlayerAdapter;
    import com.blogspot.toomuchcoding.adapter.DefencePlayerAdapter;
    import com.blogspot.toomuchcoding.adapter.OffensivePlayerAdapter;
    import com.blogspot.toomuchcoding.model.Player;
    import com.blogspot.toomuchcoding.model.PlayerDetails;
    import com.blogspot.toomuchcoding.model.PositionType;

    /**
     * User: mgrzejszczak
     * Date: 09.06.13
     * Time: 15:53
     */
    public class PlayerFactoryImpl implements PlayerFactory {

        @Override
        public Player createPlayer(PositionType positionType) {
            PlayerDetails player = createCommonPlayer(positionType);
            switch (positionType){
                case ATT:
                    return new OffensivePlayerAdapter(player);
                case MID:
                    return new OffensivePlayerAdapter(player);
                case DEF:
                    return new DefencePlayerAdapter(player);
                case GK:
                    return new DefencePlayerAdapter(player);
                default:
                    return new CommonPlayerAdapter(player);
            }
        }

        private PlayerDetails createCommonPlayer(PositionType positionType){
            PlayerDetails playerDetails = new PlayerDetails();
            playerDetails.setPosition(positionType);
            return playerDetails;
        }
    }

Ok so we have the factory that builds Players.
Let's take a look at the Service that uses the factory:

PlayerServiceImpl.java

    package com.blogspot.toomuchcoding.service;

    import com.blogspot.toomuchcoding.factory.PlayerFactory;
    import com.blogspot.toomuchcoding.model.*;

    /**
     * User: mgrzejszczak
     * Date: 08.06.13
     * Time: 19:02
     */
    public class PlayerServiceImpl implements PlayerService {

        private PlayerFactory playerFactory;

        @Override
        public Player playAGameWithAPlayerOfPosition(PositionType positionType) {
            Player player = playerFactory.createPlayer(positionType);
            player.run();
            performAdditionalActions(player);
            return player;
        }

        private void performAdditionalActions(Player player) {
            if(player instanceof OffensivePlayer){
                OffensivePlayer offensivePlayer = (OffensivePlayer) player;
                performAdditionalActionsForTheOffensivePlayer(offensivePlayer);
            }else if(player instanceof DefensivePlayer){
                DefensivePlayer defensivePlayer = (DefensivePlayer) player;
                performAdditionalActionsForTheDefensivePlayer(defensivePlayer);
            }
        }

        private void performAdditionalActionsForTheOffensivePlayer(OffensivePlayer offensivePlayer){
            offensivePlayer.shoot();
        }

        private void performAdditionalActionsForTheDefensivePlayer(DefensivePlayer defensivePlayer){
            defensivePlayer.defend();
            try{
                DJ dj = (DJ)defensivePlayer;
                dj.playSomeMusic();
                JavaDeveloper javaDeveloper = (JavaDeveloper)defensivePlayer;
                javaDeveloper.doSomeSeriousCoding();
            }catch(ClassCastException exception){
                System.err.println("Sorry, I can't do more than just play football...");
            }
        }

        public PlayerFactory getPlayerFactory() {
            return playerFactory;
        }

        public void setPlayerFactory(PlayerFactory playerFactory) {
            this.playerFactory = playerFactory;
        }
    }

Let's admit it... this code is bad. When you look at its internals (whether or not it uses the instanceof operator) you feel that it is evil :) As you can see, the code has some class casts going on... How on earth can we test it?
With most mocking frameworks you can't perform such class casts on mocks: they are built with the CGLIB library and implement only the mocked type, so a ClassCastException can be thrown. You could return real implementations instead of mocks (assuming that those will not do any ugly stuff in the construction process) and it could actually work, but still - this is bad code :P

Mockito comes to the rescue with its extraInterfaces feature (although you shouldn't overuse it - in fact, if you need it, please consider refactoring the code under test). Its Javadoc reads:

    extraInterfaces

    MockSettings extraInterfaces(java.lang.Class... interfaces)

    Specifies extra interfaces the mock should implement. Might be useful for legacy code or some corner cases. For background, see issue 51 here.

    This mysterious feature should be used very occasionally. The object under test should know exactly its collaborators & dependencies. If you happen to use it often then please make sure you are really producing simple, clean & readable code.

    Examples:

        Foo foo = mock(Foo.class, withSettings().extraInterfaces(Bar.class, Baz.class));

        //now, the mock implements extra interfaces, so following casting is possible:
        Bar bar = (Bar) foo;
        Baz baz = (Baz) foo;

    Parameters: interfaces - extra interfaces the mock should implement.
    Returns: settings instance so that you can fluently specify other settings

Now let's take a look at the test:

PlayerServiceImplTest.java

    package com.blogspot.toomuchcoding.service;

    import com.blogspot.toomuchcoding.factory.PlayerFactory;
    import com.blogspot.toomuchcoding.model.*;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.mockito.InjectMocks;
    import org.mockito.Mock;
    import org.mockito.invocation.InvocationOnMock;
    import org.mockito.runners.MockitoJUnitRunner;
    import org.mockito.stubbing.Answer;

    import static org.hamcrest.CoreMatchers.is;
    import static org.junit.Assert.assertThat;
    import static org.mockito.BDDMockito.*;

    /**
     * User: mgrzejszczak
     * Date: 08.06.13
     * Time: 19:26
     */
    @RunWith(MockitoJUnitRunner.class)
    public class PlayerServiceImplTest {

        @Mock
        PlayerFactory playerFactory;

        @InjectMocks
        PlayerServiceImpl objectUnderTest;

        @Mock(extraInterfaces = {DJ.class, JavaDeveloper.class})
        DefensivePlayer defensivePlayerWithDjAndJavaDevSkills;

        @Mock
        DefensivePlayer defensivePlayer;

        @Mock
        OffensivePlayer offensivePlayer;

        @Mock
        Player commonPlayer;

        @Test
        public void shouldReturnOffensivePlayerThatRan() throws Exception {
            //given
            given(playerFactory.createPlayer(PositionType.ATT)).willReturn(offensivePlayer);

            //when
            Player createdPlayer = objectUnderTest.playAGameWithAPlayerOfPosition(PositionType.ATT);

            //then
            assertThat(createdPlayer == offensivePlayer, is(true));
            verify(offensivePlayer).run();
        }

        @Test
        public void shouldReturnDefensivePlayerButHeWontBeADjNorAJavaDev() throws Exception {
            //given
            given(playerFactory.createPlayer(PositionType.GK)).willReturn(defensivePlayer);

            //when
            Player createdPlayer = objectUnderTest.playAGameWithAPlayerOfPosition(PositionType.GK);

            //then
            assertThat(createdPlayer == defensivePlayer, is(true));
            verify(defensivePlayer).run();
            verify(defensivePlayer).defend();
            verifyNoMoreInteractions(defensivePlayer);
        }

        @Test
        public void shouldReturnDefensivePlayerBeingADjAndAJavaDev() throws Exception {
            //given
given(playerFactory.createPlayer(PositionType.GK)).willReturn(defensivePlayerWithDjAndJavaDevSkills); doAnswer(new Answer
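The effect of extra interfaces on the cast can be illustrated without Mockito on the classpath. The sketch below is not Mockito's implementation: it swaps in the JDK's java.lang.reflect.Proxy as a stand-in for a mock, and DefensivePlayer, DJ, and JavaDeveloper are redeclared locally as hypothetical minimal interfaces. It only shows that a test double created for one interface fails the (DJ) cast, while one created with the extra interfaces passes it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class ExtraInterfacesSketch {
    // Hypothetical minimal stand-ins for the article's model interfaces.
    interface DefensivePlayer { void defend(); }
    interface DJ { void playSomeMusic(); }
    interface JavaDeveloper { void doSomeSeriousCoding(); }

    // Names of the methods invoked on any test double, in call order.
    static final List<String> calls = new ArrayList<>();

    // Builds a recording test double that implements every given interface.
    static Object doubleOf(Class<?>... interfaces) {
        InvocationHandler recorder = (proxy, method, args) -> {
            calls.add(method.getName());
            return null; // every interface method in this sketch returns void
        };
        return Proxy.newProxyInstance(
                ExtraInterfacesSketch.class.getClassLoader(), interfaces, recorder);
    }

    // Same cast-and-catch shape as the service method in the article.
    static boolean playsMusicToo(DefensivePlayer player) {
        player.defend();
        try {
            ((DJ) player).playSomeMusic();
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        DefensivePlayer plain = (DefensivePlayer) doubleOf(DefensivePlayer.class);
        DefensivePlayer skilled = (DefensivePlayer) doubleOf(
                DefensivePlayer.class, DJ.class, JavaDeveloper.class);
        System.out.println(playsMusicToo(plain));   // false: the cast to DJ fails
        System.out.println(playsMusicToo(skilled)); // true: the double also implements DJ
        System.out.println(calls);                  // [defend, defend, playSomeMusic]
    }
}
```

With Mockito itself, the equivalent double is the one declared above with @Mock(extraInterfaces = {DJ.class, JavaDeveloper.class}).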
June 12, 2013
by Marcin Grzejszczak
· 21,757 Views · 2 Likes
Mockito - RETURNS_DEEP_STUBS for JAXB
Sorry for not having written for some time, but I was busy writing the JBoss Drools Refcard for DZone and I am in the middle of writing a book about Mockito, so I don't have much time left for blogging...

Anyway, quite recently on my current project I had an interesting situation regarding unit testing with Mockito and JAXB structures. We have very deeply nested JAXB structures generated from schemas that are provided to us, which means that we can't change them in any way.

The project structure is pretty simple: there is a Player.xsd schema file from which the jaxb2-maven-plugin generates the corresponding JAXB Java classes in the target/jaxb/ folder, in the package defined in the pom.xml. Speaking of which, let's take a look at the pom.xml file. It declares the following:

    modelVersion: 4.0.0
    groupId: com.blogspot.toomuchcoding
    artifactId: mockito-deep_stubs
    version: 0.0.1-SNAPSHOT
    properties: UTF-8, 1.6, 1.6 (presumably the source encoding and the compiler source and target levels)
    repositories:
        spring-release  http://maven.springframework.org/release
        maven-us-nuxeo  https://maven-us.nuxeo.org/nexus/content/groups/public
    dependencies:
        junit:junit:4.10
        org.mockito:mockito-all:1.9.5 (scope: test)
    plugins:
        org.apache.maven.plugins:maven-compiler-plugin:2.5.1
        org.codehaus.mojo:jaxb2-maven-plugin:1.5
            execution: xjc, xjc (presumably the execution id and the goal)
            packageName: com.blogspot.toomuchcoding.model
            schemaDirectory: ${project.basedir}/src/main/resources/xsd

Apart from the project dependencies, the configuration node of the jaxb2-maven-plugin lets you define packageName, the package into which the JAXB classes are generated, and schemaDirectory, where the plugin finds the schema files.
Speaking of which, let's check the Player.xsd schema file (similar to the one that was present in my Spring JMS automatic message conversion article). As you can see, I'm defining some complex types that might make no business sense, but you can find such examples in real life :)

Let's find out what the method that we would like to test looks like. Here we have the PlayerServiceImpl that implements the PlayerService interface:

    package com.blogspot.toomuchcoding.service;

    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 08.06.13
     * Time: 19:02
     */
    public class PlayerServiceImpl implements PlayerService {
        @Override
        public boolean isPlayerOfGivenCountry(PlayerDetails playerDetails, String country) {
            String countryValue = playerDetails.getClubDetails().getCountry().getCountryCode().getCountryCode().value();
            return countryValue.equalsIgnoreCase(country);
        }
    }

We are getting the nested elements from the JAXB generated classes. Although this violates the Law of Demeter, chaining calls like this is quite common with structures, and JAXB generated classes are in fact structures. So I fully agree with Martin Fowler that it should be called the Suggestion of Demeter.
Anyway, let's see how you could test the method:

    @Test
    public void shouldReturnTrueIfCountryCodeIsTheSame() throws Exception {
        //given
        PlayerDetails playerDetails = new PlayerDetails();
        ClubDetails clubDetails = new ClubDetails();
        CountryDetails countryDetails = new CountryDetails();
        CountryCodeDetails countryCodeDetails = new CountryCodeDetails();
        playerDetails.setClubDetails(clubDetails);
        clubDetails.setCountry(countryDetails);
        countryDetails.setCountryCode(countryCodeDetails);
        countryCodeDetails.setCountryCode(CountryCodeType.ENG);

        //when
        boolean playerOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);

        //then
        assertThat(playerOfGivenCountry, is(true));
    }

The test checks that the method returns true when the country codes match. The only problem is the number of setters and instantiations needed to create the input message. In our projects we have twice as many nested elements, so you can only imagine the amount of code that we would have to write to create the input object...

So what can be done to improve this code? Mockito comes to the rescue with the RETURNS_DEEP_STUBS default answer passed to the Mockito.mock(...) method:

    @Test
    public void shouldReturnTrueIfCountryCodeIsTheSameUsingMockitoReturnDeepStubs() throws Exception {
        //given
        PlayerDetails playerDetailsMock = mock(PlayerDetails.class, RETURNS_DEEP_STUBS);
        CountryCodeType countryCodeType = CountryCodeType.ENG;
        when(playerDetailsMock.getClubDetails().getCountry().getCountryCode().getCountryCode()).thenReturn(countryCodeType);

        //when
        boolean playerOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetailsMock, COUNTRY_CODE_ENG);

        //then
        assertThat(playerOfGivenCountry, is(true));
    }

So what happened here is that you use the Mockito.mock(...) method and provide the RETURNS_DEEP_STUBS answer, which creates the intermediate mocks automatically for you.
Mind you that enums can't be mocked; that's why you can't write playerDetailsMock.getClubDetails().getCountry().getCountryCode().getCountryCode().value() inside the Mockito.when(...) call and have to stub the call that returns the enum instead.

Summing up, you can compare the readability of both tests and see how much clearer it is to work with JAXB structures by using the Mockito RETURNS_DEEP_STUBS default answer. Naturally, sources for this example are available at BitBucket and GitHub.
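To make the idea behind a deep stub concrete, here is a minimal sketch that runs on the JDK alone. It is not how Mockito implements RETURNS_DEEP_STUBS (Mockito also handles concrete classes such as the JAXB ones); it swaps in java.lang.reflect.Proxy, and the interfaces PlayerDetails, ClubDetails, and Country below are hypothetical, simplified interface versions of the nested structures. The point is only that every getter returning a nested type hands back another stub, so the whole chain can be walked without any setters.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Collections;
import java.util.Map;

public class DeepStubSketch {
    // Hypothetical, simplified interface versions of the nested structures.
    interface Country { String getCountryCode(); }
    interface ClubDetails { Country getCountry(); }
    interface PlayerDetails { ClubDetails getClubDetails(); }

    // Creates a stub whose interface-typed getters return further stubs and
    // whose String getters return the canned value registered under the method name.
    static <T> T deepStub(Class<T> type, Map<String, String> cannedValues) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                return handleObjectMethod(proxy, method, args);
            }
            Class<?> returnType = method.getReturnType();
            if (returnType.isInterface()) {
                return deepStub(returnType, cannedValues); // go one level deeper
            }
            if (returnType == String.class) {
                return cannedValues.get(method.getName());
            }
            // Other return types are out of scope for this sketch.
            throw new UnsupportedOperationException(method.toString());
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    // equals, hashCode and toString are also routed to the handler, so give them sane answers.
    private static Object handleObjectMethod(Object proxy, Method method, Object[] args) {
        switch (method.getName()) {
            case "equals": return proxy == args[0];
            case "hashCode": return System.identityHashCode(proxy);
            default: return "deepStub@" + System.identityHashCode(proxy);
        }
    }

    // Same shape as the service method under test: a chain of nested getters.
    static boolean isPlayerOfGivenCountry(PlayerDetails player, String country) {
        return player.getClubDetails().getCountry().getCountryCode().equalsIgnoreCase(country);
    }

    public static void main(String[] args) {
        PlayerDetails player = deepStub(
                PlayerDetails.class, Collections.singletonMap("getCountryCode", "ENG"));
        System.out.println(isPlayerOfGivenCountry(player, "eng")); // true
        System.out.println(isPlayerOfGivenCountry(player, "PL"));  // false
    }
}
```

Note that this sketch creates a fresh stub on every call, which is enough for read-only chains; a real mocking library also has to remember the intermediate mocks so that stubbing and verification refer to the same objects.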
June 11, 2013
by Marcin Grzejszczak
· 9,999 Views · 1 Like
Serialization and injection
Serialization is a form of persistence: serialized data survives the process and the RAM where it was created and can be reconstituted inside different processes and machines that live in a different time or place. Sometimes serialization is in fact a poor form of persistence, one that confuses the boundary between the different schemas the data can fit in. However, what I have found useful in the last years of development is to institute a strict separation: serialize Value Objects, Entities, and everything that represents the state of the application. Meanwhile, use Dependency Injection for services that are part of a larger object graph, and never serialize this second kind of object.

In the discussion that follows, I assume that serialization and deserialization occur on the same machine (e.g. as for web-oriented sessions). The problem with serialization, which works transparently most of the time, arises when you need to serialize service objects instead of limiting the procedure to data structures. How can you store such objects?

Not options

Some options to solve this problem are really not options. Serialization by itself will fail because of the staleness of the references contained in these objects. For example, in PHP, trying to serialize a database connection held by a Repository or DAO object will rightly fail with an exception. Whenever an object represents a resource of the current machine, it usually cannot be serialized, except when the only resource involved is RAM. If the resource is disk space or another running process such as a database daemon, the reconstitution of the object in another place and time will fail, and it's best to just stop the developer immediately during storage.

Quasi-options

Some solutions to the problem try to avoid the staleness problem by serializing objects without their resources, and making them grab a new version of them on deserialization.
In PHP, for example, this can be done with the __sleep() and __wakeup() magic methods, called automatically during serialization and deserialization respectively. This deserialization mechanism introduces a dependency from the serialized Entity to external services: such a dependency is already in place when building the object the first time (passing the XService in the constructor), but it is aggravated when deserializing (depending on an XServiceFactory instead of just an XService).

An improvement, from the dependencies point of view, is to reattach collaborators to deserialized objects like you would for other persistence-related tasks. For example, EntityRepository can inject the missing pieces of Entity every time its find() method is called.

However, there is still another option, which is the most resilient from the modelling point of view and not only that of dependency management: injecting non-serializable collaborators through the stack. Objects can collaborate even without keeping field references to each other, and injecting dependencies as parameters moves the starting point of the dependency from the server to the client object (which may or may not be desirable). What is most important is that Entities are relieved of having to manage external references in any context, not only that of persistence and in particular serialization.

The metaphor for the 3rd option

Misko Hevery likes to say: have you ever seen a credit card able to charge itself? If a CreditCard is an Entity in your domain, it would be very strange to keep a wire attached to your wallet wherever you go. With the first option, you have the card spring a wire when it is taken out of the wallet, like in horror movies. This intelligent cable tries its best to attach to the nearest Point of Sale (a bad case of Bluetooth, I think). With Repositories in mind, you're not dealing with automated wires anymore, but you're still attaching cables between cards and fixed devices.
In reality, cards collaborate with the PoS in a fast process that does not last more than a few seconds. Actually, sometimes they don't touch it at all, as in all Internet-based purchases. Keeping services around to deal with external dependencies does not mean the API of your Domain Model has to be biased towards service objects:

    pos.charge(creditCard);
    // can equivalently be:
    creditCard.chargeOn(pos);

This is a form of Double Dispatch, since there are two objects collaborating and you can dispatch (send messages) to both, getting polymorphism by substituting either object. The sequence of calls is:

    client -> creditCard -> pos

The client object still looks at CreditCard as a behaviorally complete object, but it is clear which dependency is necessary to run each use case (CreditCard method). You can persist a CreditCard easily and send it over the wire to caches or databases. When the time comes to charge, it is the client that has to bring forward a service able to connect to a bank.
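The third option can be sketched in Java as follows. This is an illustration under assumptions rather than code from the article: the PointOfSale interface, the chargeOn(pos, amountInCents) signature, the chargedInCents field, and the roundTrip helper are all hypothetical names introduced here. The Entity holds only state and is Serializable; the service never becomes a field and is passed in through the stack when a charge is made, so the card survives a serialization round trip with nothing to reattach.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CreditCardSketch {
    // Service (hypothetical): stands for a connection to the outside world; never serialized.
    interface PointOfSale {
        boolean charge(String cardNumber, long amountInCents);
    }

    // Entity: pure state, so it is safe to serialize.
    static final class CreditCard implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String number;
        private long chargedInCents;

        CreditCard(String number) {
            this.number = number;
        }

        // The collaborator arrives as a parameter; no field reference to it is kept.
        boolean chargeOn(PointOfSale pos, long amountInCents) {
            boolean accepted = pos.charge(number, amountInCents);
            if (accepted) {
                chargedInCents += amountInCents;
            }
            return accepted;
        }

        long chargedInCents() {
            return chargedInCents;
        }
    }

    // Serializes the card to bytes and reads it back, as a session store or cache would.
    static CreditCard roundTrip(CreditCard card) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(card);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CreditCard) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        CreditCard card = new CreditCard("4111-0000");
        // The client owns the service; here a fake that accepts charges up to 100.00.
        PointOfSale pos = (number, amount) -> amount <= 10_000;

        card.chargeOn(pos, 2_500);
        CreditCard restored = roundTrip(card);   // no service to detach or reattach
        restored.chargeOn(pos, 1_000);           // the client brings the service again
        System.out.println(restored.chargedInCents()); // 3500
    }
}
```

Because PointOfSale is an interface, the client can substitute a fake in tests or a different gateway in production, which is the polymorphism on the service side that the Double Dispatch remark refers to.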
June 5, 2013
by Giorgio Sironi
· 7,208 Views
Agile Teamwork in Practice
"Don't tell people how to do things, tell them what to do and let them surprise you with their results" - General George S. Patton What's the best way to encourage agile teamwork? It's a tricky question, because so much of Scrum and Kanban practice is predicated on the assumption that collaborative behavior will "happen". Empowerment is often presented as the mechanism for achieving this success. If you just press the empowerment button, developers will then choose to self-organize and will go on to deliver sterling results. Patently however, that isn't the case. I'm sure that many of us will have experienced teams that are actually less than the sum of their parts. Technically skilled people can be more focused on stack traces than on individuals and interactions, and may view each other as unwanted complications or impediments. All too often the social graces that underpin effective agile teamwork have to be elicited painfully, like drawing teeth. Whenever I consider this matter, the above quote by Patton often comes into my mind. It isn't the perspicacity of his argument that I find compelling, or even that it was said so long ago. I suppose that these days we have just become more accepting of such observations. No...to me the interesting thing about this quote is that someone of Patton's background and temperament said it. You see, George Smith Patton was arguably the most hard-boiled U.S. General in World War 2. He was spit-and-polish to the core, and an absolute stickler for discipline. Even tiny misdemeanours would incur his wrath. His idea of a touchy-feely management style was to kick people in the pants after slapping them about the chops, and he frequently railed against "malingerers" who he reckoned ought to be court-martialed and shot. We have yet to hear Esther Derby or Johanna Rothman prescribe such remedies for disaffected team members. Perhaps the most politically correct thing we can do is to categorize his beliefs as an alternative viewpoint. 
Anyway, it's difficult to imagine anyone less likely than Patton to be sympathetic to agile principles, nor anyone more likely to try and micro-manage those they might consider to be their subordinates. It seems we need a deeper insight if we are to explain this unlikely patronage of a central maxim of agile development.

I suspect that Patton knew that if a team is to self-organize and deliver value successfully, then discipline will be key. It can't really be about empowerment, because an empowered team can still be sloppy and never cut the mustard. While good management isn't about telling people how to do their jobs, it is about making sure that they understand the rules of best practice and are competent to follow them, preferably with very little oversight. Strangely perhaps, this is a route to freedom rather than constraint. It releases individual initiative. I think that's what Patton was getting at. Who are we to empower others, after all? What gift is that? Where is the transfer of value? How much better it is to instil the best practices that make people more effective, and thereby become more valued themselves.

Development Team Membership

Now, a development team is made up of individuals, so when we talk about the rules of team membership we are largely talking about what those individuals do. More specifically, it's about what they do in respect to themselves, and with respect to the wider team of stakeholders including the Scrum Master and Product Owner. So before we go any further, let's look at the behaviors that we can expect a disciplined agile developer to exhibit.
What a good team member will do:
  • Agree with other team members and the Product Owner to deliver a valuable and achievable piece of work every Sprint
  • Understand the Sprint Backlog and how it correlates to the Sprint Goal
  • Participate fully and actively in daily standups, planning sessions, reviews, and retrospectives
  • Work with the rest of the team to meet each Sprint Goal (self-organize)
  • Help other team members and the Product Owner to clarify requirements, such as by writing user stories and acceptance criteria
  • Pro-actively remove skill silos, such as by pair programming or cross training, and without being told to do so
  • Work with the Product Owner on an ongoing basis, so that work is understood, reviewed, and approved continually
  • Make sure that the work done is transparent, such as by updating Scrum and Kanban boards
  • Understand that they, and all team members, are stakeholders in the agile process
  • Estimate work so that the Product Owner and other stakeholders can plan ahead (e.g. for release planning)
  • Fully support and encourage the elicitation of metrics, and be able to interpret them and act on them
  • Resolve outstanding or impeded work before actioning new work from the backlog
  • Limit work in progress so as to maximize throughput
  • Act immediately on impediments by apprising other team members and the Scrum Master of any issues, and help to resolve them
  • Accept personal responsibility for the team's success
  • Accept personal responsibility for their work meeting the team’s Definition of Done

What a good team member doesn’t do:
  • Fail to give the best unpadded estimates that can be provided at the time. Estimates should be given and received in good faith.
  • Cherry pick work from the Sprint Backlog. The backlog is owned by the team and must be actioned in accordance with the team's Sprint Plan.
  • Attempt to work on more than one item at a time. A good team member will pro-actively limit work in progress.
  • Expect somebody else, such as a Scrum Master, to update the Scrum board or Kanban boards. Information radiators are owned by the team.
  • Work in a "skills silo". A good team member does not view their work as a speciality that only he or she is able to work on.
  • Claim that work has been completed if it does not satisfy the team’s Definition of Done
  • Claim that work is complete if it does not meet the specific acceptance criteria that have been agreed for it

Shot at Dawn: Teamwork and the Prime Directive

If we were to apply the Patton philosophy in extremis, I suppose that an agile team would shoot its own malingerers following a retrospective, the Scrum Master standing by to deliver the coup-de-grace if needed. Although this lurid concept is absurd, how many experienced Scrum Masters have never secretly wished for a revolver in their desks, even for just a fleeting moment? It highlights a problem that the agile community is often evasive about. What should actually be done about a developer who causes problems for the rest of the team? Is it possible, or even desirable, to correlate the occurrences of those problems to the individual concerned?

In a Sprint Retrospective, for example, no blame is ever meant to be directed towards any one team member. In fact the format of the session precludes the establishment of such a correlation, or even the inference that a particular individual may have been remiss in some way. Known as the "Prime Directive", this article of faith is meant to be recited at the beginning of each retrospective session, and it has to be said in earnest.

"Regardless of what we discover, we understand and truly believe that everyone did the best job he or she could, given what was known at the time, his or her skills and abilities, the resources available, and the situation at hand."

The question is: what if we don't believe it though? What if all the evidence in the world is stacked against it?
Should we go along with the directive anyway, and just kid ourselves for the duration of the session? If so, how can it possibly help? Where is the transparency, which we covet in agile practice, if we subscribe to this devil's credo that makes a mockery of the truth? The answer is potentially quite shocking, and certainly little understood. Don't think of the Prime Directive as a creed, or even as the temporary suspension of disbelief for the sake of the meeting. Think of it as a pre-condition that must hold, and genuinely be true, before a retrospective can happen at all. The underlying principle is that all of the attendees must be fully able to participate. All are expected to be professionals who can fulfil their duty to each other and to the Scrum process, and inspect and adapt their working practices accordingly. It isn't enough just to leave your knives at the door. You actually have to trust the people you are working with. Really trust them. Given that most developers are assigned to their teams by managers, and not by each other, this expectation of trust is indeed potentially shocking. It gets even scarier than that. Think about what all of this really means should trust be absent, or somehow lost. It means that you can't have a Sprint Retrospective at all until the issues around trust are resolved. It means that if a team member must be removed, then that should happen beforehand. Scrum does not go so far as to prescribe a mechanism for this, but it is established that a team will self-organize to remove its own problems. Perhaps they will have to make collective representations to a line manager, or petition for a member's removal through the Scrum Master. It might even mean that the team can deselect a team member by their own consensus. Yet however it is done, it appears that the team aren't too far removed from assembling a firing squad after all. 
If this all seems very draconian, let's reassert the key principle here: when Scrum is done properly a team will solve its own problems, including distasteful matters like this. Now, it has to be admitted that most Scrum teams across industry today don't get to operate at such a high level of proficiency. The consequences of this cut both ways. On the one hand a team may not be allowed to get on with their jobs without interference from management, while on the other hand they usually don't have to deal with the nastiness of putting a sick dog down. A few conversations with that same pointy-haired boss could be enough to get him to do the deed. Yet as the industry transitions more fully towards agile practice, this "remedy" will no longer be sustainable. Problems regarding a team member's competence won't be someone else's responsibility; rather, it will be incumbent upon the team to find a solution. In an agile world, greater responsibility falls on self-managing teams, along with their greater rights.

Professionalism: from Team Discipline to Self Discipline

In this article we've identified a range of behaviors that typify good team membership, and we've looked squarely at what should happen when things go wrong. In short, it's up to the team to sort out its own problems when a team member doesn't measure up. Yet this is only part of what disciplined agile practice is about. It isn't enough to put the focus on punitive measures and the threat of sanction, even if the exercising of authority is driven entirely by the team itself. What we need to do is to take things a step further. We don't really want discipline to be enforced by the team, even though they should be the ultimate arbiters. What we want is to encourage a self-discipline that wells up from each individual team member, and which serves as an inspiration to others. Disciplined teamwork isn't about empowerment. It's about cascading the release of potential through the clear demonstration of value.
I look at it this way. There is only one person in this world any of us can change. I don't think I need to spell out who that person is. So, wherever you and your team may be on your agile journey, there should always be at least one person who can be relied upon. If that person does their bit, then they are helping to make the team more than the sum of its parts. "Don't empower me. Release me. I'll find my own power, and it will be far greater than anything you can bestow on me" - Tobias Mayer
June 5, 2013
by $$anonymous$$
· 13,481 Views · 1 Like
Create a Couchbase Cluster with Ansible
[This blog was syndicated from http://blog.grallandco.com]

Introduction

When I was looking for a more effective way to create my cluster, I asked some sysadmins which tools I should use. The answer I got during OSDC was not Puppet, nor Chef, but Ansible. This article shows how you can easily configure and create a Couchbase cluster deployed on many Linux boxes, and the only thing you need on these boxes is an SSH server! Thanks to Jan-Piet Mens, who was one of the people who convinced me to use Ansible and who answered the questions I had about it. You can watch the demonstration below, and/or look at all the details in the next paragraph.

Ansible

Ansible is open-source software that allows administrators to configure and manage many computers over SSH. I won't go into all the details of the installation; just follow the steps documented in the Getting Started Guide. As you can see from this guide, you just need Python and a few other libraries, and to clone the Ansible project from GitHub. So I am expecting that you have Ansible working with the various servers on which you want to deploy Couchbase. Also, for these first scripts I am using root on my servers to do all the operations, so be sure you have registered the root SSH keys with your administration server, from where you are running the Ansible scripts.

Create a Couchbase Cluster

Before going into the details of the Ansible script, it is useful to explain how you create a Couchbase cluster. Here are the 5 steps to create and configure a cluster:
  1. Install Couchbase on each node of the cluster, as described in the Couchbase documentation.
  2. Take one of the nodes and "initialize" the cluster, using the cluster-init command.
  3. Add the other nodes to the cluster, using the server-add command.
  4. Rebalance, using the rebalance command.
  5. Create a bucket, using the bucket-create command.

So the goal now is to create an Ansible Playbook that executes these steps for you.
Ansible Playbook for Couchbase

The first thing you need is the list of hosts you want to target, so I have created a hosts file that contains all my servers organized in 2 groups:

    [couchbase-main]
    vm1.grallandco.com

    [couchbase-nodes]
    vm2.grallandco.com
    vm3.grallandco.com

The [couchbase-main] group contains just one of the nodes, which will drive the installation and configuration. As you probably already know, Couchbase does not have any master: all nodes in the cluster are identical.

To ease the configuration of the cluster, I have created another file that contains all the parameters that must be sent to the various commands. This file is located in group_vars/all; see the section Splitting Out Host and Group Specific Data in the documentation.

    # Administrator user and password
    admin_user: Administrator
    admin_password: password

    # ram quota for the cluster
    cluster_ram_quota: 1024

    # bucket and replicas
    bucket_name: ansible
    bucket_ram_quota: 512
    num_replicas: 2

Use this file to configure your cluster. Let's describe the playbook file:

    - name: Couchbase Installation
      hosts: all
      user: root

      tasks:

      - name: download Couchbase package
        get_url: url=http://packages.couchbase.com/releases/2.0.1/couchbase-server-enterprise_x86_64_2.0.1.deb dest=~/.

      - name: Install dependencies
        apt: pkg=libssl0.9.8 state=present

      - name: Install Couchbase .deb file on all machines
        shell: dpkg -i ~/couchbase-server-enterprise_x86_64_2.0.1.deb

As expected, the installation has to be done on all servers as root. Then we need to execute 3 tasks:
  1. Download the product; the get_url command will only download the file if it is not already present.
  2. Install the dependencies with the apt command; state=present tells the system to install this package only if it is not already present.
  3. Install Couchbase with a simple shell command (here I am not checking whether Couchbase is already installed).

So we have now installed Couchbase on all the nodes.
Let's now configure the first node and add the others:

    - name: Initialize the cluster and add the nodes to the cluster
      hosts: couchbase-main
      user: root

      tasks:

      - name: Configure main node
        shell: /opt/couchbase/bin/couchbase-cli cluster-init -c 127.0.0.1:8091 --cluster-init-username=${admin_user} --cluster-init-password=${admin_password} --cluster-init-port=8091 --cluster-init-ramsize=${cluster_ram_quota}

      - name: Create shell script for configuring main node
        action: template src=couchbase-add-node.j2 dest=/tmp/addnodes.sh mode=750

      - name: Launch config script
        action: shell /tmp/addnodes.sh

      - name: Rebalance the cluster
        shell: /opt/couchbase/bin/couchbase-cli rebalance -c 127.0.0.1:8091 -u ${admin_user} -p ${admin_password}

      - name: create bucket ${bucket_name} with ${num_replicas} replicas
        shell: /opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 --bucket=${bucket_name} --bucket-type=couchbase --bucket-port=11211 --bucket-ramsize=${bucket_ram_quota} --bucket-replica=${num_replicas} -u ${admin_user} -p ${admin_password}

Now we need to execute specific tasks on the "main" server:
  • Initialize the cluster using the Couchbase CLI, in the "Configure main node" task.
  • Then the system needs to ask all the other servers to join the cluster. For this, the system needs to get the various IP addresses and, for each one, execute the server-add command with that address. As far as I know it is not possible to get the IP address from the main playbook YAML file, so I ask the system to generate a shell script that adds each node, and then execute the script. This is done by the "Create shell script for configuring main node" and "Launch config script" tasks.

To generate the shell script, I use an Ansible template, which is available in the couchbase-add-node.j2 file.
    {% for host in groups['couchbase-nodes'] %}
    /opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u ${admin_user} -p ${admin_password} --server-add={{ hostvars[host]['ansible_eth0']['ipv4']['address'] }}:8091 --server-add-username=${admin_user} --server-add-password=${admin_password}
    {% endfor %}

As you can see, this script loops over each server in the [couchbase-nodes] group and uses its IP address to add the node to the cluster. Finally, the playbook rebalances the cluster (the "Rebalance the cluster" task) and adds a new bucket (the "create bucket" task).

You are now ready to execute the playbook using the following command:

    ./bin/ansible-playbook -i ./couchbase/hosts ./couchbase/couchbase.yml -vv

I am adding the -vv parameter so that you can see more information about what is happening during the execution of the script. This will execute all the commands described in the playbook, and after a few seconds you will have a new cluster ready to be used! You can, for example, open a browser, go to the Couchbase Administration Console and check that your cluster is configured as expected.

As you can see, it is really easy and fast to create a new cluster using Ansible. I have also created a script to uninstall the cluster properly; just launch:

    ./bin/ansible-playbook -i ./couchbase/hosts ./couchbase/couchbase-uninstall.yml
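To make the effect of the couchbase-add-node.j2 template more concrete, here is a small Java sketch that produces the same kind of lines the generated /tmp/addnodes.sh would contain: one server-add command per node. It is only an illustration of the loop, not part of the playbook; the class name, the method names and the two IP addresses are made up for the example, and only the CLI flags already shown in the template are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Renders one couchbase-cli server-add line per node, mirroring the template loop. */
class AddNodesScript {

    // Hypothetical helper: builds the command line for a single node address.
    static String serverAddCommand(String nodeIp, String adminUser, String adminPassword) {
        return "/opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091"
                + " -u " + adminUser + " -p " + adminPassword
                + " --server-add=" + nodeIp + ":8091"
                + " --server-add-username=" + adminUser
                + " --server-add-password=" + adminPassword;
    }

    // Equivalent of the {% for host in groups['couchbase-nodes'] %} loop.
    static List<String> render(List<String> nodeIps, String adminUser, String adminPassword) {
        List<String> lines = new ArrayList<>();
        for (String ip : nodeIps) {
            lines.add(serverAddCommand(ip, adminUser, adminPassword));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Placeholder addresses standing in for the nodes of the [couchbase-nodes] group.
        List<String> nodes = Arrays.asList("10.0.0.2", "10.0.0.3");
        for (String line : render(nodes, "Administrator", "password")) {
            System.out.println(line);
        }
    }
}
```

In the real playbook the addresses come from the facts Ansible gathers for each host, which is why the template reads them from hostvars rather than from a fixed list.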
June 3, 2013
by Don Pinto
· 5,119 Views · 1 Like
Accessing An Artifact’s Maven And SCM Versions At Runtime
You can easily tell Maven to include the version of the artifact and its Git/SVN/… revision in the JAR manifest file and then access that information at runtime via getClass().getPackage().getImplementationVersion(). (All credit goes to Markus Krüger and other colleagues.)

Include Maven artifact version in the manifest

(Note: You will actually not want to use it if you also want to include an SCM revision; see below.)

pom.xml:

    <project>
      ...
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            ...
            <configuration>
              <archive>
                <manifest>
                  <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
                  <addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
                </manifest>
              </archive>
            </configuration>
            ...
          </plugin>
        </plugins>
      </build>
      ...
    </project>

The resulting MANIFEST.MF of the JAR file will then include the following entries, with values from the indicated properties:

    Built-By: ${user.name}
    Build-Jdk: ${java.version}
    Specification-Title: ${project.name}
    Specification-Version: ${project.version}
    Specification-Vendor: ${project.organization.name}
    Implementation-Title: ${project.name}
    Implementation-Version: ${project.version}
    Implementation-Vendor-Id: ${project.groupId}
    Implementation-Vendor: ${project.organization.name}

(Specification-Vendor and Implementation-Vendor come from the POM’s organization/name.)

Include SCM revision

For this you can either use the Build Number Maven plugin, which produces the property ${buildNumber}, or retrieve the revision from environment variables passed by Jenkins or Hudson (SVN_REVISION for Subversion, GIT_COMMIT for Git). For Git alone, you could also use the maven-git-commit-id-plugin, which can either replace strings such as ${git.commit.id} in existing resource files with the actual values (using Maven’s resource filtering, which you must enable) or output all of them into a git.properties file.

Let’s use the buildnumber-maven-plugin and create the manifest entries explicitly, containing the build number (i.e. revision):

    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>buildnumber-maven-plugin</artifactId>
          <version>1.2</version>
          <executions>
            <execution>
              <phase>validate</phase>
              <goals>
                <goal>create</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <doCheck>false</doCheck>
            <doUpdate>false</doUpdate>
          </configuration>
        </plugin>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-jar-plugin</artifactId>
          <version>2.4</version>
          <configuration>
            <archive>
              <manifestEntries>
                <Implementation-Title>${project.name}</Implementation-Title>
                <Implementation-Version>${project.version} ${buildNumber}</Implementation-Version>
              </manifestEntries>
            </archive>
          </configuration>
        </plugin>
        ...
      </plugins>
    </build>
Accessing the version & revision

As mentioned above, you can access the manifest entries from your code via getClass().getPackage().getImplementationVersion() and getClass().getPackage().getImplementationTitle().

References
  • SO: How to get Maven Artifact version at runtime?
  • Maven Archiver documentation
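The runtime lookup described under "Accessing the version & revision" can be sketched as follows. This is a minimal illustration rather than code from the original post: the class and method names are invented, and the sample manifest text uses made-up values. The first method goes through the standard Package API; the second parses manifest text with java.util.jar.Manifest, which makes it easy to see what the build wrote without packaging a JAR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class VersionInfo {

    /** Returns "Title Version" from the manifest of the JAR that holds the class, or a fallback. */
    static String describe(Class<?> type) {
        Package pkg = type.getPackage();
        // Both values are null when the class was not loaded from a JAR
        // whose manifest carries the Implementation-* entries (e.g. in an IDE).
        String title = pkg == null ? null : pkg.getImplementationTitle();
        String version = pkg == null ? null : pkg.getImplementationVersion();
        if (title == null || version == null) {
            return "unknown (no manifest entries)";
        }
        return title + " " + version;
    }

    /** Reads Implementation-Version directly from manifest text. */
    static String implementationVersion(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(VersionInfo.class));

        // Made-up sample: version followed by a build number, as in the second POM snippet.
        String sample = "Manifest-Version: 1.0\n"
                + "Implementation-Title: demo\n"
                + "Implementation-Version: 1.0-SNAPSHOT 4711\n"
                + "\n";
        System.out.println(implementationVersion(sample));
    }
}
```

The null check matters in practice: when the code runs from compiled classes instead of the packaged JAR, there is no manifest to read, so a fallback value avoids a NullPointerException.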
May 28, 2013
by Jakub Holý
· 12,760 Views
Amazon S3 Parallel MultiPart File Upload
In this blog post, I will present a simple tutorial on uploading a large file to Amazon S3 as fast as the network supports. Amazon S3 is clustered storage service of Amazon. It is designed to make web-scale computing easier. Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, inexpensive infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers. For using Amazon services, you'll need your AWS access key identifiers, which AWS assigned you when you created your AWS account. The following are the AWS access key identifiers: Access Key ID (a 20-character, alphanumeric sequence) For example: 022QF06E7MXBSH9DHM02 Secret Access Key (a 40-character sequence) For example: kWcrlUX5JEDGM/LtmEENI/aVmYvHNif5zB+d9+ct Caution Your Secret Access Key is a secret, which only you and AWS should know. It is important to keep it confidential to protect your account. Store it securely in a safe place. Never include it in your requests to AWS, and never e-mail it to anyone. Do not share it outside your organization, even if an inquiry appears to come from AWS or Amazon.com. No one who legitimately represents Amazon will ever ask you for your Secret Access Key. The Access Key ID is associated with your AWS account. You include it in AWS service requests to identify yourself as the sender of the request. The Access Key ID is not a secret, and anyone could use your Access Key ID in requests to AWS. To provide proof that you truly are the sender of the request, you also include a digital signature calculated using your Secret Access Key. The sample code handles this for you. Your Access Key ID and Secret Access Key are displayed to you when you create your AWS account. They are not e-mailed to you. 
If you need to see them again, you can view them at any time from your AWS account. To get your AWS access key identifiers Go to the Amazon Web Services web site at http://aws.amazon.com. Point to Your Account and click Security Credentials. Log in to your AWS account. The Security Credentials page is displayed. Your Access Key ID is displayed in the Access Identifiers section of the page. To display your Secret Access Key, click Show in the Secret Access Key column. You can use your Amazon keys from a properties file in your application. Here is a sample for properties file containing Amazon keys: # Fill in your AWS Access Key ID and Secret Access Key # http://aws.amazon.com/security-credentials accessKey = secretKey = Here is sample AmazonUtil class for getting AWS Credentials from properties file. public class AmazonUtil { private static final Logger logger = LogUtil.getLogger(); private static final String AWS_CREDENTIALS_CONFIG_FILE_PATH = ConfigUtil.CONFIG_DIRECTORY_PATH + File.separator + "aws-credentials.properties"; private static AWSCredentials awsCredentials; static { init(); } private AmazonUtil() { } private static void init() { try { awsCredentials = new PropertiesCredentials(IOUtil.getResourceAsStream(AWS_CREDENTIALS_CONFIG_FILE_PATH)); } catch (IOException e) { logger.error("Unable to initialize AWS Credentials from " + AWS_CREDENTIALS_CONFIG_FILE_PATH); } } public static AWSCredentials getAwsCredentials() { return awsCredentials; } } Amazon S3 has Multipart Upload service which allows faster, more flexible uploads into Amazon S3. Multipart Upload allows you to upload a single object as a set of parts. After all parts of your object are uploaded, Amazon S3 then presents the data as a single object. With this feature you can create parallel uploads, pause and resume an object upload, and begin uploads before you know the total object size. 
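As a side note on the credentials properties file shown above: the AmazonUtil class only logs an error if the file cannot be read, which leaves awsCredentials null and postpones the failure to the first S3 call. The sketch below shows one possible way to fail fast instead, using nothing but java.util.Properties. The CredentialsFile class is hypothetical and not part of the AWS SDK; only the accessKey and secretKey property names are taken from the sample file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for the two values in aws-credentials.properties. */
class CredentialsFile {

    final String accessKey;
    final String secretKey;

    private CredentialsFile(String accessKey, String secretKey) {
        this.accessKey = accessKey;
        this.secretKey = secretKey;
    }

    /** Loads both keys and rejects a file where either one is missing or blank. */
    static CredentialsFile load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CredentialsFile(require(props, "accessKey"), require(props, "secretKey"));
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing value for '" + key + "'");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Placeholder values only; never hard-code real keys.
        CredentialsFile creds = load(new StringReader("accessKey = EXAMPLEKEYID\nsecretKey = examplesecret\n"));
        System.out.println("Loaded access key id of length " + creds.accessKey.length());
    }
}
```

With this approach the unfilled template file (both values empty) is rejected at startup, which is usually easier to diagnose than a null credentials object later on.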
For more information on Multipart Upload, review the Amazon S3 Developer Guide.

In this tutorial, my sample application uploads the parts of each file to Amazon S3 on different threads, in order to use as much of the available network throughput as possible. Each file part is associated with a thread, and each thread uploads its part through the Amazon S3 API.

Figure 1. Amazon S3 Parallel Multi-Part File Upload Mechanism

The Amazon S3 API supports multipart file upload in this way:
  1. Send an InitiateMultipartUploadRequest to Amazon.
  2. Get a response containing a unique id for this upload operation.
  3. For i in ${partCount}:
     3.1. Calculate the size and offset of split-i in the whole file.
     3.2. Build an UploadPartRequest with the file offset, the size of the current split and the unique upload id.
     3.3. Give this request to a thread and start the upload by running the thread.
          3.3.1. Send the associated UploadPartRequest to Amazon.
          3.3.2. Get the response after a successful upload and save the ETag property of the response.
  4. Wait for all threads to terminate.
  5. Get the ETags (an ETag is an identifier for a successfully completed upload) of all terminated threads.
  6. Send a CompleteMultipartUploadRequest to Amazon with the unique upload id and all ETags, so that Amazon joins all file parts into the target object.

Here is the implementation:

    public class AmazonS3Util {

        private static final Logger logger = LogUtil.getLogger();

        public static final long DEFAULT_FILE_PART_SIZE = 5 * 1024 * 1024; // 5MB
        public static long FILE_PART_SIZE = DEFAULT_FILE_PART_SIZE;

        private static AmazonS3 s3Client;
        private static TransferManager transferManager;

        static {
            init();
        }

        private AmazonS3Util() {
        }

        private static void init() {
            // ...
            s3Client = new AmazonS3Client(AmazonUtil.getAwsCredentials());
            transferManager = new TransferManager(AmazonUtil.getAwsCredentials());
        }

        // ...
        public static void putObjectAsMultiPart(String bucketName, File file) {
            putObjectAsMultiPart(bucketName, file, FILE_PART_SIZE);
        }

        public static void putObjectAsMultiPart(String bucketName, File file, long partSize) {
            List<PartETag> partETags = new ArrayList<PartETag>();
            List<MultiPartFileUploader> uploaders = new ArrayList<MultiPartFileUploader>();

            // Step 1: Initialize.
            InitiateMultipartUploadRequest initRequest =
                    new InitiateMultipartUploadRequest(bucketName, file.getName());
            InitiateMultipartUploadResult initResponse = s3Client.initiateMultipartUpload(initRequest);
            long contentLength = file.length();

            try {
                // Step 2: Upload parts.
                long filePosition = 0;
                for (int i = 1; filePosition < contentLength; i++) {
                    // Last part can be less than part size. Adjust part size.
                    partSize = Math.min(partSize, (contentLength - filePosition));

                    // Create request to upload a part.
                    UploadPartRequest uploadRequest = new UploadPartRequest().
                            withBucketName(bucketName).withKey(file.getName()).
                            withUploadId(initResponse.getUploadId()).withPartNumber(i).
                            withFileOffset(filePosition).
                            withFile(file).
                            withPartSize(partSize);

                    uploadRequest.setProgressListener(new UploadProgressListener(file, i, partSize));

                    // Start a thread that uploads this part and remember it,
                    // so that its ETag can be collected later.
                    MultiPartFileUploader uploader = new MultiPartFileUploader(uploadRequest);
                    uploaders.add(uploader);
                    uploader.upload();

                    filePosition += partSize;
                }

                // Wait for every part to finish and collect the ETags in part order.
                for (MultiPartFileUploader uploader : uploaders) {
                    uploader.join();
                    partETags.add(uploader.getPartETag());
                }

                // Step 3: complete.
                CompleteMultipartUploadRequest compRequest =
                        new CompleteMultipartUploadRequest(
                                bucketName, file.getName(), initResponse.getUploadId(), partETags);

                s3Client.completeMultipartUpload(compRequest);
            } catch (Throwable t) {
                logger.error("Unable to put object as multipart to Amazon S3 for file " + file.getName(), t);
                s3Client.abortMultipartUpload(
                        new AbortMultipartUploadRequest(
                                bucketName, file.getName(), initResponse.getUploadId()));
            }
        }

        // ...
        private static class UploadProgressListener implements ProgressListener {

            File file;
            int partNo;
            long partLength;

            UploadProgressListener(File file) {
                this.file = file;
            }

            @SuppressWarnings("unused")
            UploadProgressListener(File file, int partNo) {
                this(file, partNo, 0);
            }

            UploadProgressListener(File file, int partNo, long partLength) {
                this.file = file;
                this.partNo = partNo;
                this.partLength = partLength;
            }

            @Override
            public void progressChanged(ProgressEvent progressEvent) {
                switch (progressEvent.getEventCode()) {
                    case ProgressEvent.STARTED_EVENT_CODE:
                        logger.info("Upload started for file " + "\"" + file.getName() + "\"");
                        break;
                    case ProgressEvent.COMPLETED_EVENT_CODE:
                        logger.info("Upload completed for file " + "\"" + file.getName() + "\"" +
                                ", " + file.length() + " bytes data has been transferred");
                        break;
                    case ProgressEvent.FAILED_EVENT_CODE:
                        logger.info("Upload failed for file " + "\"" + file.getName() + "\"" +
                                ", " + progressEvent.getBytesTransfered() + " bytes data has been transferred");
                        break;
                    case ProgressEvent.CANCELED_EVENT_CODE:
                        logger.info("Upload cancelled for file " + "\"" + file.getName() + "\"" +
                                ", " + progressEvent.getBytesTransfered() + " bytes data has been transferred");
                        break;
                    case ProgressEvent.PART_STARTED_EVENT_CODE:
                        logger.info("Upload started at " + partNo + ". part for file " + "\"" + file.getName() + "\"");
                        break;
                    case ProgressEvent.PART_COMPLETED_EVENT_CODE:
                        logger.info("Upload completed at " + partNo + ". part for file " + "\"" + file.getName() + "\"" +
                                ", " + (partLength > 0 ? partLength : progressEvent.getBytesTransfered()) +
                                " bytes data has been transferred");
                        break;
                    case ProgressEvent.PART_FAILED_EVENT_CODE:
                        logger.info("Upload failed at " + partNo + ". part for file " + "\"" + file.getName() + "\"" +
                                ", " + progressEvent.getBytesTransfered() + " bytes data has been transferred");
                        break;
                }
            }
        }

        // One thread per part; uses the static s3Client of the enclosing class.
        private static class MultiPartFileUploader extends Thread {

            private UploadPartRequest uploadRequest;
            private PartETag partETag;

            MultiPartFileUploader(UploadPartRequest uploadRequest) {
                this.uploadRequest = uploadRequest;
            }

            @Override
            public void run() {
                partETag = s3Client.uploadPart(uploadRequest).getPartETag();
            }

            private PartETag getPartETag() {
                return partETag;
            }

            private void upload() {
                start();
            }
        }
    }
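Two parts of the implementation above can be looked at in isolation: the offset and size calculation for each part (step 3.1), and waiting for all parts to finish (steps 4 and 5). The sketch below is not the author's code. It swaps the one-thread-per-part Thread subclass for a fixed-size ExecutorService, and the upload itself is a stand-in function rather than a call to the AWS SDK. The PartPlanner, Part, plan and uploadAll names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class PartPlanner {

    /** One slice of the file: 1-based part number, byte offset and length. */
    static final class Part {
        final int number;
        final long offset;
        final long size;

        Part(int number, long offset, long size) {
            this.number = number;
            this.offset = offset;
            this.size = size;
        }
    }

    /** Splits contentLength bytes into parts of at most partSize bytes; the last part may be smaller. */
    static List<Part> plan(long contentLength, long partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        List<Part> parts = new ArrayList<>();
        long offset = 0;
        for (int number = 1; offset < contentLength; number++) {
            long size = Math.min(partSize, contentLength - offset);
            parts.add(new Part(number, offset, size));
            offset += size;
        }
        return parts;
    }

    /** Runs the upload function for every part on a fixed pool; results come back in part order. */
    static <T> List<T> uploadAll(List<Part> parts, Function<Part, T> upload, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Part part : parts) {
                Callable<T> task = () -> upload.apply(part);
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // a failed part surfaces here as ExecutionException
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for parts", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A part failed to upload", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Part> parts = plan(12L * 1024 * 1024, 5L * 1024 * 1024);
        // Stand-in for the real upload call: returns a fake tag per part.
        List<String> tags = uploadAll(parts, part -> "etag-" + part.number, 2);
        for (int i = 0; i < parts.size(); i++) {
            Part p = parts.get(i);
            System.out.println("part " + p.number + " offset=" + p.offset + " size=" + p.size + " -> " + tags.get(i));
        }
    }
}
```

Two design points are worth noting. A bounded pool keeps the number of concurrent connections under control for very large files, where one thread per 5 MB part could mean hundreds of threads. And Future.get() rethrows an exception raised inside a task, whereas Thread.join() returns normally even if run() failed, which in the original code would leave a null ETag in the list passed to the completion request.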
May 28, 2013
by Serkan Özal
· 57,366 Views · 3 Likes
7 Agile Best Practices that You Don’t Need to Follow
There are many good ideas and practices in Agile development, ideas and practices that definitely work: breaking projects into Small Releases to manage risk and accelerate feedback; time-boxing to limit WIP and keep everyone focused; relying only on working software as the measure of progress; simple estimating and using velocity to forecast team performance; working closely and constantly with the customer; and Continuous Integration – and Continuous Delivery – to ensure that code is always working and stable.

But there are other commonly accepted ideas and best practices that aren’t important: if you don’t follow them, nothing bad will happen to you and your project will still succeed. And there are a couple that you are better off not following at all.

Test-Driven Development

Teams that need to move quickly need to depend on a fast, efficient testing safety net. With Test First Development or Test-Driven Development (TDD), there’s no excuse for not writing tests – after all, you have to write a failing test before you write the code. So you end up with a good set of working automated tests that ensure a high level of coverage and regression protection.

TDD is not only a way of ensuring that developers test their code. It is also advocated as a design technique that leads to better quality code and a simpler, cleaner design. A study of teams at Microsoft and IBM (Realizing Quality Improvement through Test Driven Development, Microsoft Research, 2008) found that while TDD increased upfront development costs by 15-35% (TDD demands that developers change the way that they think and work, which slows them down, at least at first), it reduced defect density by 40% (IBM) or as much as 60-90% (Microsoft) compared with teams that did not follow disciplined unit testing.
But in Making Software, Chapter 12, “How Effective is Test-Driven Development”, researchers led by Burak Turhan found that while TDD improves external quality (measured by one or more of test cases passed, number of defects, defect density, defects per test, effort required to fix defects, change density, % of preventative changes) and can improve the quality of the tests (fewer mistakes in the tests, tests that are easier to maintain), TDD does not consistently improve the quality of the design. TDD seems to reduce code complexity and improve reuse, however it also negatively impacts coupling and cohesion. And while method and class-level complexity is better in code developed using TDD, project/package level complexity is worse.

People who like TDD like it a lot, so if you like it, do it. And even if you are not TDD-infected, there are times when working test first is natural – when you have to solve a specific problem in a specific way, or if you’re fixing a bug where the failing test case is already written up for you. But the important thing is that you write a good set of tests and keep them up to date and run them frequently – it doesn't matter if you write them before, or after, you write the code.

Pair Programming

According to the VersionOne State of Agile Development Survey 2012, almost 1/3 of teams follow pair programming – a surprisingly high number, given how disciplined pair programming is, and how few teams follow XP (2%) or Scrum/XP Hybrid (11%) methods where pair programming would be prescribed.

There are good reasons for pairing: information sharing and improving code quality through continuous, informal code reviews as developers work together.
And there are natural times to pair developers, or sometimes developers and testers, together: when you’re working through a hard design problem; or on code that you’ve never seen before and somebody who has worked on it is available to help; or when you’re over your head in troubleshooting a high-pressure problem; or testing a difficult part of the system; or when a new person joins the team and needs to learn about the code and coding practices.

Some (extroverted) people enjoy pairing up, the energy it creates and the opportunities it provides to get to know others on the team. But forcing people who prefer working on their own or who don’t like each other to work closely together is definitely not a good idea. There are real social costs in pairing: you have to be careful to pair people up by skill, experience, style, personality type and work ethic. And sustained pair programming can be exhausting, especially over the long term – one study (Vanhanen and Lassenius 2007) found that people only pair between 1.5 and 4 hours a day on average, because it’s too intense to do all day long.

In Pair Programming Considered Harmful? Jon Evans says that pairing can also have negative effects on creativity:

“Research strongly suggests that people are more creative when they enjoy privacy and freedom from interruption … What distinguished programmers at the top-performing companies wasn’t greater experience or better pay. It was how much privacy, personal workspace and freedom from interruption they enjoyed,” says a New York Times article castigating “the new groupthink”.

And in “Still Questioning Extreme Programming” Pete McBreen points out some other disadvantages and weaknesses of pair programming:

Exploration of ideas is not encouraged, pairing makes a developer focus on writing the code, so unless there is time in the day for solo exploration the team gets a very superficial level of understanding of the code.
Developers can come to rely too much on the unit tests, assuming that if the tests pass then the code is OK. (This follows on from the lack of exploration.) Corner cases and edge cases are not investigated in detail, especially if they are hard to write tests for. Code that requires detail thinking about the design is hard to do when pairing unless one partner completely dominates the session. With the usual tradeoff between partners, it is hard to build technically complex designs unless they have been already been worked out in a solo session. Personal styles matter when pairing, and not all pairings are as productive as others. Pairs with different typing skills and proficiencies often result in the better typist doing all of the coding with the other partner being purely passive. And of course pairing in distributed teams doesn't work well if at all (depending on distance, differences in time zones, culture, working styles, language), although some people still try. While pairing does improve code quality over solo programming, you can get the same improvements in code quality, and at least some of the information sharing advantages, through code reviews, at less cost. Code reviews – especially lightweight, offline reviews – are easier to schedule, less expensive and less intrusive than pairing. And as Jason Cohen points out even if developers are pair programming, you may still need to do code reviews, because pair programming is really about joint problem solving, and doesn’t cover all of the issues that a code review would. Back to Jon Evans for the final word on pair programming: The true answer is that there is no one answer; that what works best is a dynamic combination of solitary, pair, and group work, depending on the context, using your best judgement. Paired programming definitely has its place. (Betteridge’s Law strikes again!) 
In some cases that place may even be “much of most days.” But insisting on 100 percent pairing is mindless dogma, and like all mindless dogma, ultimately counterproductive.

Emergent Design and Metaphor

Incremental development works, and trying to keep design simple makes good sense, but attempting to define an architecture on the fly is foolish and impractical. There’s a reason that almost nobody actually follows Emergent Design: it doesn't work.

Relying on a high-level metaphor (the system is an "assembly line" or a "bill of materials" or a "hive of bees") shared by the team as some kind of substitute for architecture is even more ridiculous. Research from Carnegie Mellon University found that

“… natural language metaphors are relatively useless for either fostering communication among technical and non-technical project members or in developing architecture.”

Almost no one understands what a system metaphor is anyway, or how it is to be used, or how to choose a meaningful metaphor, or how to change it if you got it wrong (and how you would know if you got it wrong), including one of the people who helped come up with the idea:

“Okay I might as well say it publicly - I still haven't got the hang of this metaphor thing. I saw it work, and work well on the C3 project, but it doesn't mean I have any idea how to do it, let alone how to explain how to do it.”
Martin Fowler, Is Design Dead?

Agile development methods have improved development success and shown better ways to approach many different software development problems – but not architecture and design.

Daily Standups

When you have a new team and everyone needs time to get to know each other and to understand what the project is about; or when the team is working under emergency conditions trying to fix something or finish something under extreme pressure, then getting everyone together in regular meetings, maybe even more than once a day, is necessary and valuable.
But whether everyone stands up or sits down and what they end up talking about in a meeting should be up to you. If your team has been working well together for a while and everyone knows each other and knows what they are working on, and if developers update cards on a task board or a Kanban board or the status in an electronic system as they get things done, and if they are grown up enough to ask for help when they need it, then you don’t need to make them all stand up in a room every morning.

Collective Code Ownership

Letting everyone work on all of the code isn't always practical (because not everyone on the team has the requisite knowledge or experience to work on every problem) and collective code ownership can have negative effects on code quality. Share code where it makes sense to do so, but realize that not everybody can – or should – work on every part of the system.

Writing All Requirements as Stories

The idea that every requirement specification can be written as User Stories in 1 or 2 lines on cards, that requirements should be too short on purpose (so that the developer has to talk to someone to explain what’s really needed) and insisting that they should all be in the same template form “As a type of user I want some goal so that some reason…” is silly and unnecessary. This is the same kind of simple-minded orthodoxy that led everyone to try to capture all requirements in UML Use Case format with stick men and bubbles 15 years ago.

There are many different ways to effectively express requirements. Sometimes requirements need to be specified in detail (when you have to meet regulatory compliance or comply with a standard or integrate with an existing system or implement a specific algorithm or…). Sometimes it’s better to work from a test case or a detailed use case scenario or a wireframe or some other kind of model, because somebody who knows what’s going on has already worked out the details for you.
So pick the format and level of detail that works best and get to work.

Relying on a Product Owner

Relying on one person as the Product Owner, as the single solitary voice of the customer and the “one throat to choke” when the project fails, doesn't scale, doesn't last, and puts the team and the project and eventually the business at risk. It’s a naïve, dangerous approach to designing a product and to managing a development project, and it causes more problems than it solves. Many teams have realized this and are trying to work around the Product Owner idea because they have to.

To succeed, a team needs real and sustained customer engagement at multiple levels, and they should take responsibility themselves for making sure that they get what they need, rather than relying on one person to do it all.
May 24, 2013
by Jim Bird
· 49,096 Views
Building a full-text index of git commits using lunr.js and Github APIs
Github has a nice API for inspecting repositories – it lets you read gists, issues, commit history, files and so on. Git repository data lends itself to demonstrating the power of combining full text and faceted search, as there is a mix of free text fields (commit messages, code) and enumerable fields (committers, dates, committer employers). Github APIs return JSON, which has the nice property of resembling a tree structure – results can be recursed over without fear of infinite loops. Note that to download the entire commit history for a repository, you need to page through it by sha hash. The API I use here lacks diffs, which must be retrieved elsewhere.

To test this, access a URL like the one below. The configurable arguments are the repository owner and name fields.

    https://api.github.com/repos/torvalds/linux/commits

This is what a commit looks like:

    {
      "sha": "7638417db6d59f3c431d3e1f261cc637155684cd",
      "url": "https://api.github.com/repos/octocat/Hello-World/git/commits/7638417db6d59f3c431d3e1f261cc637155684cd",
      "author": {
        "date": "2008-07-09T16:13:30+12:00",
        "name": "Scott Chacon",
        "email": "[email protected]"
      },
      "committer": {
        "date": "2008-07-09T16:13:30+12:00",
        "name": "Scott Chacon",
        "email": "[email protected]"
      },
      "message": "my commit message",
      "tree": {
        "url": "https://api.github.com/repos/octocat/Hello-World/git/trees/827efc6d56897b048c772eb4087f854f46256132",
        "sha": "827efc6d56897b048c772eb4087f854f46256132"
      },
      "parents": [
        {
          "url": "https://api.github.com/repos/octocat/Hello-World/git/commits/7d1b31e74ee336d15cbd21741bc88a537ed063a0",
          "sha": "7d1b31e74ee336d15cbd21741bc88a537ed063a0"
        }
      ]
    }

To make the test simple, I download these as JSON locally, then start a Python webserver (the command below is the Python 2 form; on Python 3 the equivalent is python -m http.server). Were I to make many such calls on a public site, I’d set up a proxy to the Github APIs.

    python -m SimpleHTTPServer

This data has a number of nested objects and must be flattened to fit into the lunr.js full-text index.
This example uses the commit number (0, 1, 2..N) as the location in the index, but a real environment should use the commit hash to allow partitioning the ingestion process. Nested objects are flattened by joining subsequent keys with underscores in between. A production-worthy solution needs to escape these to prevent collisions.

    var documents = [];

    // Walk a commit object, joining nested keys with underscores.
    function recurse(doc_num, base, obj, value) {
      if ($.isPlainObject(value)) {
        $.each(value, function (k, v) {
          recurse(doc_num, base + obj + "_", k, v);
        });
      } else {
        process(doc_num, base + obj, value);
      }
    }

    // Store a leaf value as a string on the flattened document.
    function process(doc_num, key, value) {
      if (documents.length <= doc_num)
        documents[doc_num] = {};

      if (value !== null)
        documents[doc_num][key] = value + '';
    }

    $.each(data, function(doc_num, commit) {
      $.each(commit, function(k, v) {
        recurse(doc_num, '', k, v);
      });
    });

Normally, one sets up a lunr full-text index by specifying all the fields, much like Solr’s numerous XML config files. Lunr doesn’t have nearly as many configuration options, since you only specify the ‘boost’ parameter to increase the value of certain fields in ranking. I imagine this will change as the project grows, at the very least to include type hints.

Given the simplicity of field objects, you can infer the field list from JSON payloads. The code below provides two modes, one where you inspect the entire JSON payload, or one where you limit how many commits you check, a good option when JSON data is consistent. The function accepts configuration objects resembling ExtJS config objects, which lets you override as desired. If fields derived from existing data are required, they can be inserted after any documents are inserted.
    function inferIndex(documents, config) {
      return lunr(function() {
        this.ref('id');

        var found = {};
        var idx = this;
        $.each(documents, function(doc_num, doc) {
          // Skip documents past the optional inspection limit.
          if (config && config.limit && config.limit < doc_num)
            return;

          $.each(doc, function(k, v) {
            if (!found[k]) {
              if (config && config[k]) {
                idx.field(k, config[k]);
              } else {
                idx.field(k);
              }
              found[k] = true;
            }
          });
        });
      });
    }

    var index = inferIndex(documents, {limit: 1, 'commit_author_name': {boost: 10}});

Inserting flattened documents into the index becomes simple. The function below accepts an optional callback, should you desire to add calculated fields. (In the original listing the callback was declared as a third argument of the $.each callback, where jQuery never supplies it, so it is passed to a wrapper function here.)

    function addDocuments(index, documents, doc_cb) {
      $.each(documents, function(doc_num, attrs) {
        var doc = $.extend({id: doc_num}, attrs);
        if (doc_cb) {
          doc = doc_cb(doc);
        }
        index.add(doc);
      });
    }

    addDocuments(index, documents);

At this point we’ve indexed the entire commit history from a git repository, which lets us search for commits by topic. While this is useful, it’d be really nice to be able to facet on fields, which would return the number of documents in a category, like a SQL group by. I’ve found it particularly convenient to facet on author, date, or author’s company.

If you have access to the original documents, you can easily construct facets based on the results of a lunr search:

    function facet(index, query, data, field) {
      var results = index.search(query);
      var facets = {};
      $.each(results, function(i, searchResult) {
        // searchResult.ref is the document id assigned at insert time.
        var doc = data[searchResult.ref];
        facets[doc[field]] = (facets[doc[field]] === undefined ? 0 : facets[doc[field]]) + 1;
      });
      return facets;
    }

Commit messages in repositories where I work often contain names of clients who requested a feature or bug fix. Consequently doing a search faceted by author provides a list of who worked with each client the most – this can also tell you who has worked with various pieces of technology.
The following query demonstrates this technique:

    var facets = facet(index, 'driver', documents, 'commit_author_name');

    {"Wolfram Sang":24,"Linus Torvalds":3}

The approach shown here works well, but retrieving results requires access to the original document data. If we want to filter the results to a category, we need a richer search API than lunr currently provides, as well as callback options within the search API. In Solr there are also options to skip lower-casing data, as that may be inappropriate for category titles. Mitigating these issues will be explored further in future essays.
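The article notes that a production-worthy flattener needs to escape keys so that joined names cannot collide. One possible way to do that, along with a facet counter that works on plain arrays, is sketched below without jQuery or lunr. The names escapeKey, flattenCommit and countFacets are hypothetical, and backslash-escaping is just one assumed scheme, not something lunr.js prescribes.

```javascript
// Sketch only: escapeKey, flattenCommit and countFacets are illustrative
// names, not part of lunr.js or the Github API.

// Escape backslashes first, then underscores, so that an unescaped "_"
// in a flattened key always marks a nesting boundary.
function escapeKey(key) {
  return String(key).replace(/\\/g, '\\\\').replace(/_/g, '\\_');
}

// Flatten nested plain objects into a single-level object of strings.
// Arrays fall through to String(), mirroring the article's version,
// and null values are skipped.
function flattenCommit(value, prefix, out) {
  out = out || {};
  if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
    Object.keys(value).forEach(function (k) {
      var key = prefix === undefined ? escapeKey(k) : prefix + '_' + escapeKey(k);
      flattenCommit(value[k], key, out);
    });
  } else if (value !== null && value !== undefined && prefix !== undefined) {
    out[prefix] = String(value);
  }
  return out;
}

// Count how many of the referenced documents fall into each category.
// refs would typically be the ref values from a search result list.
function countFacets(refs, docs, field) {
  var counts = {};
  refs.forEach(function (ref) {
    var value = docs[ref][field];
    counts[value] = (counts[value] || 0) + 1;
  });
  return counts;
}
```

With this scheme a top-level key "a_b" becomes "a\_b", while a key "b" nested under "a" becomes "a_b", so the two no longer share a field name. The trade-off is that field names passed to the index configuration have to be written in their escaped form.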
May 20, 2013
by Gary Sieling
· 8,122 Views
Azure Blob Storage - "The specified blob or block content is invalid"
If you’re uploading blobs by splitting blobs into blocks and you get the error – The specified blob or block content is invalid, then this post is for you.

Short Version

If you’re uploading blobs by splitting blobs into blocks and you get the above mentioned error, ensure that the block ids of your blocks are of the same length. If the block ids of your blocks are of different lengths, you’ll get this error.

Long Version

Now for the longer version of this post. A few days back I was working with the storage client library, especially around uploading blobs in chunks, and with one particular blob I was constantly getting the error – The specified blob or block content is invalid. I tried numerous combinations, even resorting to the REST API directly, but to no avail. It only happened with just one blob. Furthermore, if I uploaded the same blob without splitting it into blocks, all was well. I was at my wits’ end. I tried searching the Internet for this error but could not find a conclusive answer to my problem.

After much trial and error, I was able to simulate the same problem on other blobs as well. Here’s how you can recreate it:

1. Start uploading the blob by splitting it into blocks. For block id, let’s do a 7 character long string e.g. intValue.ToString("d7"). This will ensure that my block ids would be “0000001”, “0000002”, …, “0000010” …..
2. After one or two blocks are uploaded, cancel the operation.
3. Now re-upload the blob by splitting it into blocks. However this time for block id, let’s do a 6 character long string e.g. intValue.ToString("d6").

You’ll get the error as soon as you try to upload the 1st block.

Possible Solutions

Now that we know the root cause of this problem, let’s look at some of the possible solutions.

Wait out

One possible solution is to wait it out. I know it’s lame but it is still a possible solution.
We know that Windows Azure Blob Storage Service keeps all uncommitted blocks for a duration of 7 days and if within 7 days those uncommitted blocks are not committed, the storage service purges them. I wish the storage service provided some mechanism to purge uncommitted blocks programmatically.

Commit uncommitted blocks

You could possibly commit the blocks which are in uncommitted state so that at least you get a blob (which would not be the blob we wanted to upload in the first place). You can then delete that blob and re-upload the blob by specifying block ids which are of the same length.

To fetch the list of uncommitted blocks, if you’re using the REST API directly you can perform the “Get Block List” operation and pass “blocklisttype=uncommitted” as one of the query string parameters. If you’re using the storage client library (assuming you’re using version 2.x of the .Net storage client library), you can do something like the code below:

    // Namespaces assumed for version 2.x of the storage client library.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net;
    using System.Text;
    using Microsoft.WindowsAzure.Storage.Blob;
    using Microsoft.WindowsAzure.Storage.Blob.Protocol;

    private static List<string> GetUncommittedBlockIds(CloudBlockBlob blob)
    {
        // A short-lived, read-only shared access signature stands in for the "Authorization" header.
        var sasUri = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy()
        {
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(5),
            Permissions = SharedAccessBlobPermissions.Read,
        });
        var blobUri = new Uri(string.Format("{0}{1}", blob.Uri, sasUri));
        List<string> uncommittedBlockIds = new List<string>();
        var request = BlobHttpWebRequestFactory.GetBlockList(blobUri, null, null, BlockListingFilter.Uncommitted, null, null);
        //request.Headers.Add("Authorization",
        using (var resp = (HttpWebResponse)request.GetResponse())
        {
            using (var stream = resp.GetResponseStream())
            {
                var getBlockListResponse = new GetBlockListResponse(stream);
                var blocks = getBlockListResponse.Blocks;
                foreach (var block in blocks.Where(b => !b.Committed))
                {
                    // Block names come back Base64 encoded; decode them to the original ids.
                    uncommittedBlockIds.Add(Encoding.UTF8.GetString(Convert.FromBase64String(block.Name)));
                }
            }
        }
        return uncommittedBlockIds;
    }

A few things to keep in mind here: The Microsoft.WindowsAzure.Storage.Blob namespace does not have the capability to get the list of uncommitted blocks.
You would need to make use of the Microsoft.WindowsAzure.Storage.Blob.Protocol namespace. Because we’re kind of invoking the REST API by executing an HttpWebRequest, I created a shared access signature on the blob so that I don’t have to create the “Authorization” header.

Fetch uncommitted blocks to see block id length

You could fetch the list of uncommitted blocks just to find out the length of the block id used. You could then use that block id length for your new upload session and do the upload. Please see the code snippet above to find this information.

Upload another blob with same name without splitting it into blocks

You could also upload another blob with the same name without splitting it into blocks. It could very well be a zero byte blob. That way your uncommitted block list will be wiped clean. Then you could delete that dummy blob and re-upload the actual blob.

A Few Words About Blocks

Since we’re talking about blocks, I thought it might be useful to mention a few points about them:

  • Blocks and block related operations are only applicable for “Block Blobs”. Duh!! You’ll get an error if you’re trying to do these operations on a “Page Blob”.
  • For uploading large blobs, it is recommended that you split your blob into blocks. In fact if your blob size is more than 64 MB, then you have to split it into blocks.
  • Minimum size of a block is 1 Byte and the maximum size of a block is 4 MB. It is recommended that you choose a block size based on your internet connectivity and the number of parallel threads you want to use to upload these blocks.
  • A blob can be split into a maximum of 50000 blocks. It’s important to remember this limitation because you are reminded of this limit when you’re trying to upload the 50001st block.
  • The length of all the block ids must be the same. So if you’re using an integer value to denote block id, make sure that you pad that integer value with “0” so that you get the same length. So you could do something like intValue.ToString("d6").
  • When passing the block id as a parameter, it must be Base64 encoded.
  • While the order in which the blocks are uploaded is not important, the order is important when you commit the block list because that’s when the blob is constructed by the service. For example, let’s say you’re uploading a blob by splitting it into 5 blocks (with ids “000001”, “000002”, “000003”, “000004”, and “000005”). You could upload these blocks in any order – 000004, 000001, 000003, 000005, 000002; however, when you commit the block list, ensure that the block ids are passed in proper order i.e. 000001, 000002, 000003, 000004, 000005.

Summary

That’s it for this post. I hope you’ve found this information useful. I spent a considerable amount of time trying to fix this problem so I hope it will help some folks out. As always, if you find any issues with the post please let me know and I’ll fix it ASAP.
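The block id rules above (same length for every id, Base64 encoding, commit list in blob order) can be sketched as a small helper. This is a minimal sketch that only builds ids and does not call the storage service; makeBlockId and makeBlockList are hypothetical names, and it assumes a Node.js runtime for Buffer.

```javascript
// Sketch only: makeBlockId and makeBlockList are illustrative names,
// not part of any Azure SDK. Assumes Node.js (Buffer, String.padStart).

// Build one block id: zero-pad the number to a fixed width, then Base64 encode.
function makeBlockId(blockNumber, width) {
  if (!Number.isInteger(blockNumber) || blockNumber < 0) {
    throw new RangeError('block number must be a non-negative integer');
  }
  var padded = String(blockNumber).padStart(width, '0');
  // Refuse numbers that overflow the width, since that would change the id length.
  if (padded.length !== width) {
    throw new RangeError('block number ' + blockNumber + ' does not fit in ' + width + ' digits');
  }
  return Buffer.from(padded, 'utf8').toString('base64');
}

// Build the ordered id list to pass when committing: 1, 2, ..., blockCount.
function makeBlockList(blockCount, width) {
  var ids = [];
  for (var n = 1; n <= blockCount; n++) {
    ids.push(makeBlockId(n, width));
  }
  return ids;
}
```

Because every padded id has the same number of characters, every encoded id has the same length too. Blocks can be uploaded in any order using ids from this list, as long as the list itself is committed in the order it was generated.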
May 20, 2013
by Gaurav Mantri
· 10,864 Views
Gradle Goodness: Running a Single Test
Learn how to run test code with Gradle using the test task that is added by the Java plugin.
May 17, 2013
by Hubert Klein Ikkink
· 116,747 Views · 1 Like
Deploy a File Server in the Cloud (WebDav on Windows Azure)
This month, my fellow IT Pro Technical Evangelists and I are authoring a new series of articles on 20 key scenarios with Windows Azure Infrastructure Services. Check out the list of articles here: http://mythoughtsonit.com/2013/05/20-key-scenarios-with-windows-azure-infrastructure-services/ .

Web-based Distributed Authoring and Versioning, or WebDAV, is a set of protocols based on HTTP that allows end users to map a network drive over HTTP and edit content and files stored on the web server. When WebDAV was first offered on Microsoft Server I evaluated it and decided it did not perform well enough for me. The WebDAV extension to IIS was completely rewritten back in the Server 2008 timeframe and is worth taking a look at again.

In this article I will guide you step by step through the process of setting up WebDAV on Server 2012 in a Windows Azure IaaS environment. This will give you a solid performing file share on the internet over port 80 and the HTTP protocol.

First, you need an Azure account. You can set up a free trial of Azure. Details can be found here: http://mythoughtsonit.com/2013/04/step-by-step-guide-to-setting-up-a-windows-azure-free-trial/

Second, provision a Server 2012 machine.

Third, open port 80 to this new server:

1. In the Azure portal, select your 2012 server and choose the “Endpoints” tab on the top.
2. Click “Add Endpoint” at the bottom of the screen.
3. Enter the endpoint information for port 80 to port 80.

Done. Next we need to install the IIS web server and WebDAV.

Installing WebDAV on IIS 8.0

1. Start Server Manager and go to “Add Roles and Features”.
2. Under Server Roles, add the Web Server (IIS) role.
3. Click through the wizard until you come to the Role Services section. Then find and select “WebDAV Publishing” and “Windows Authentication”.
4. Click Next and then Install.

When the install is finished you are ready to move on to the next section.
Configuring IIS 8 for WebDAV

After the installation finishes you need to configure the box for access.

1. Start the IIS Manager tool.
2. Choose the “Default Web Site” on the left side.
3. Click on “Authentication”, then open the Windows Authentication option and enable it.
4. Open the “WebDAV Authoring Rules”.
5. Create a WebDAV rule. I chose to allow all users access to all content. A better security practice is to limit which users can use the service. It’s your data, so you decide.
6. Make sure WebDAV is enabled and that your access rule is set.

That is it… now you’re ready to access your WebDAV file share!

Test and ensure you can hit the web server by using your browser. Because you opened port 80 and installed IIS 8, you should see the default web page when you browse to your server’s internet DNS name. Example: http://yourdomainname.cloudapp.net/

How to map a drive to your WebDAV server

There are two ways I use to connect to the WebDAV server.

How to map a drive to your WebDAV server from the Win 8 GUI:

1. From Windows Explorer, right click on “Computer” and select “Map a network drive”.
2. Map your network drive by entering the address of your server. Example: http://yourdomainname.cloudapp.net/
3. I selected “Connect using different credentials” because my workstation was not joined to the server in any way and I needed to use an account in the server’s local SAM database.
4. Hit “Finish” and enter your credentials.

Now you will have a connected drive that you can access from Windows Explorer or any tool via the drive mapping.

How to map a drive to your WebDAV server from a cmd box:

1. Hit Windows Start and type: cmd
2. Enter the command: net use [drive letter] [url]
   Example: net use e: http://yourdomainname.cloudapp.net/
May 15, 2013
by Brian Lewis
· 15,918 Views
Definitions of Done in Practice
A couple of weeks ago we looked at how to do a quick "health check" of an agile team. We saw that a great deal can be learned just by attending one of their daily stand-ups and by inspecting the state of their Scrum and Kanban boards. Of course these are nothing more than cursory examinations, and serious ailments can lie behind an apparently robust façade of agile practice. If you have reason to believe that a team is dysfunctional, you might have to dig deeper than the superficial evidence suggested by its apparent morphology. In my experience the next thing to examine is the team's "Definition of Done". This is the standard to which all work is put before it can be considered to be complete. Each team is collectively responsible for its own Definition of Done. It's up to them to make sure that it is adequate, and that it is applied by all members to all of the work they do. Without such professional oversight there can be no assurance that any deliverable will truly be fit for release. A spiffy stand-up and a cracker of a Kanban board might suggest rude team health, but they are no guarantee that the Definition of Done is solid, or that it isn't being undercut by someone along the way. Technical debt and rework are the main symptoms to look for. The consequences of backsliding on a Definition of Done might not become apparent until long after the events that caused it. By then, that rework or debt can be difficult to trace to the specific behaviors of those who cheated the system. You see, unfortunately a Definition of Done is a bit like personal hygiene. If there is no oversight, everyone can pretend that they are following the rules for the sake of the team, even though the presence of E. Coli on the office keyboards will tell its own tale about compliance. Everyone knows that it has to be coming from somewhere, but won't admit to their own liability or involvement, perhaps not even to themselves. 
Just as team vomiting will follow one member's dubious hand-washing practices, a short-changed Definition of Done will lead to rework by the team or the creation of technical debt. This is why team ownership and enforcement of what "Done" means is so important. An effective Definition of Done has to be founded on a healthy balance between due diligence and professional trust. What does this mean for agile development? You can think of a Definition of Done as the key defensive bulwark in software development epidemiology. If you balance the right level of team oversight with the right level of trust, severe outbreaks of technical debt or rework will become rare. High levels of oversight may be needed to start with, since team members might not have bought in to the idea of "done" yet. Once people become conditioned to do the right thing and see themselves as stakeholders in the team and its success, the balance can swing more towards trust. People become reluctant to renege on a team investment if they can see that it adds value for everyone including them. What's more, a Definition of Done improves the more it is respected, and becomes better respected the more it improves. In terms of agile best practice a Definition of Done will be used to determine whether or not User Story implementations are release-ready. However, each team can implement many User Stories over the course of a sprint, and making sure that all of these stories meet the Definition of Done can be challenging. Teams that are new to agile methods often have more modest ambitions. For example, their Definition of Done may only extend as far as delivery into a pre-production environment. Of course, anything less than "fully release ready" incurs the risk of technical debt and the need to pay it back later. Yet like a sloppy approach to hand-washing, it has to be admitted that something is better than nothing at all. 
Applying a Definition of Done consistently to even a sub-optimal standard will at least permit the delivery of each User Story to a known level of quality. It might not be great but at least it's there, and it's something that can be built upon and improved. The Lessons of Lean-Kanban Lean-Kanban methodologies have an instructive relationship with the Definition of Done. In these approaches the optimization of the value stream is of great significance. Naturally though, if a value stream is to be optimized it must first be understood. This means breaking the stream down into multiple discrete steps that can be studied for bottlenecks and any other occurrences of waste. For example, "Work in Progress" can be broken down into finer-grained stations such as "In Development", "Peer Review", "QA Test", "Knowledge Transfer", and "In Deployment". Team members will be cross-trained and will move freely across those stations in order to expedite as smooth a workflow as possible. Now here's where it gets interesting. If a Lean-Kanban operation has multiple well-defined stations, the case for having a Definition of Done can seem rather harder to make. After all, by the time a User Story gets to "Done", you already know that it has gone through the key steps you care about in the development process. What value can a Definition of Done really add in such a situation? Doesn't it just become waste itself? To find the answer, we need to look back to the manufacturing roots of Lean-Kanban. In a car plant for example, the steps of construction are exceptionally well defined and team members can move freely over several dozen stations. Some of those stations will be for the chassis, others for the interior, others for the engine block and electrics and so on. Yet despite this the Definition of Done will be an absolute corker, and much of the process of verification will be automated. 
Each station might even have its own Definition of Done so inspection can occur as close as possible to the point of action and potential remedy. The total number of checks that happen before each vehicle leaves the factory will be exhaustive. Why is this thought to be necessary? Because the manufacturers know perfectly well that the verification of "done" adds value. Merely having well-defined stations is no guarantee that everything will be done well. Quality is built in and validated by inspection. One thing's for sure: no-one in IT should accuse car manufacturers of having a weak understanding of what "done" means. The Definition of Done versus Acceptance Criteria However, software projects have a wild-card to deal with that car manufacturers don't have to worry about. Unlike the car doors and carburettors that roll down an assembly line, each User Story is different and has to be treated as a special case. To deal with this, each User Story has Acceptance Criteria that are agreed by the team members and the Product Owner as part of a Sprint Planning Session. Acceptance Criteria have to be quite specific to particular User Stories, because each story can be unique. The Definition of Done, on the other hand, applies to all of the User Stories being worked on by a team. The associated conditions must be invariant. For example, if all work has to be peer reviewed and subjected to QA testing prior to release, then those criteria would be enumerated in the Definition of Done rather than being repeated in each User Story's Acceptance Criteria. If the definition is enforced properly, a developer could never claim that a User Story was “Done” if it hadn't both been reviewed and passed QA testing. Writing a Definition of Done The Scrum Guide describes a Definition of Done as a "shared understanding of what it means for work to be complete". 
No process is suggested for writing a Definition of Done, nor in fact is there any suggestion that one should be written down at all. However, a documented definition may go some way towards providing that shared understanding. Here's how you can set about eliciting one:

1. Review Acceptance Criteria
  • Gather the Acceptance Criteria for work completed so far
  • Look for common criteria that can be abstracted out and applied across work in general
  • Use these common criteria as the basis for a Definition of Done
2. Assess Technical Debt
  • Identify any rework that needs to be done
  • Identify the reasons why it wasn't done properly the first time
  • Identify what measures can be put in place to stop similar rework from occurring
  • Add these measures to the Definition of Done (DoD)
3. Continually update the DoD
  • In each Sprint Review, identify which (if any) work was rejected or which caused rework to be done, then
  • In each Sprint Retrospective, challenge the DoD for relevance and completeness

There isn't a prescribed format for a Definition of Done, but it can be beneficial to use the same as that which is used for Acceptance Criteria, either in whole or in part. This allows a flexible definition based on story type. Alternatively they can be written as simple lists. Here are some examples:

Example of a Definition of Done in Acceptance Criteria Format

Given that a user story has required a code change
When BDD and unit tests have been written for the story
and the code change has been reviewed
and the code change has been approved by a peer
and all BDD and unit tests have been run
and no BDD or unit tests have broken (green bar)
and the code change has been committed to the repository
and QA testing has passed satisfactorily
and the Product Owner has approved the change
Then the user story will be deployed to production
and it will be considered Done.
Given that a user story has required the authoring of documentation
When the documentation has been reviewed and approved by a peer
and the documentation has been approved by the Product Owner
Then a new version of the documentation will be committed
and the user story will be considered Done.

Example of a Definition of Done in List Format

Code changes:
  • BDD tests written and pass
  • Unit tests written and pass
  • Code peer reviewed & approved
  • Code committed to repository
  • QA testing done
  • Product Owner signed off

Documentation:
  • Documentation has been peer reviewed & approved
  • Documentation approved by Product Owner
  • Version number updated

Definitions of Done for IT Infrastructure Support

We've seen that having a good Definition of Done is important, although in IT we also need Acceptance Criteria that address the particulars of each User Story. When used in combination they can approach the levels of rigor that have been shown to be possible in manufacturing. Those working in software development can adopt a similar commitment to quality. Now we need to turn our attention to another function within the IT department: Infrastructure Support.

Infrastructure support teams are increasingly expected to work in an agile way. As part of an enterprise transformation that does not seem unreasonable. After all, the rest of the organization is highly dependent upon them. Their scope includes such things as deploying new workstations and laptops (possibly across entire sites), installing networks, performing miscellaneous diagnostics and repairs, and maintaining and upgrading local server resources. Clearly they will also need Definitions of Done and Acceptance Criteria if they are to be stakeholders in a joined-up agile enterprise. The question is, how on earth can a meaningful Definition of Done be abstracted across wildly different physical tasks?
How can a Definition of Done cover everything from a phone installation to a printer driver upgrade or a memory enhancement, or a firewall configuration to a keyboard replacement? The answer is to focus on the value chain that is represented by each user story. Work is not "released" as such, but rather it is handed over to someone who can derive benefit from it (i.e. the person occupying the user story role). This is the key to understanding "done" in an infrastructure context. If you can identify the parties who derive value, and demonstrably pass that value on to them, then your work is done. Here's an example of a Definition of Done that might be used to close out a support ticket: The receiver of the work has been identified Handover instructions have been completed and given to the receiver The receiver has been notified of the intention to close the ticket, and has not raised an objection A security assessment has been conducted and approved There are three things to point out here. The absence of any reference to a Product Owner. This is because infrastructure teams have to support multiple products, and prioritization of work is traditionally handled not through any sense of ownership of those products, but rather through help-desk triage. It's certainly possible for work to be represented by Product Owners, but it would have to be ownership of the business support function rather than ownership of the actual products being supported. The need to identify and work with the actual receivers of value is still there. The "acceptance by default" position. Receivers typically have little incentive to sign work off as being complete. On the contrary, their incentive is to defer acceptance as long as possible, for potential use as a "banker" in case a requirement for additional unforeseen work transpires. They might hope to ride this new work on an existing ticket instead of having to raise a new one. 
Receivers can be expected to care about their own support needs, not about the big picture of enterprise delivery. If a Product Owner can be identified - even if it is just the most likely owner of the business support function - then this situation can be improved. A Product Owner can apply leverage for appropriate and timely sign-off, such as by not accepting new work from certain parties while their approval (or justified rejection) of prior work remains outstanding. The elicitation of solid Acceptance Criteria can help the Product Owner immensely. Security implications need to be given careful consideration. The reworking of organizational infrastructure offers great potential for security to be compromised. Approval from Information Security should be obtained for all work, either directly or through authorized agents. One approach is for each team to have a designated "security champion" who provides this function. Conclusion Teams that appear to be healthy can still incur rework and technical debt. A poor understanding of what "done" means often underlies such problems, and this should be one of the first things to be looked at if problems are suspected. Having a meaningful Definition of Done encourages a team's sense of ownership of their own process, and helps instil self-discipline into its members to follow agile best practices. The application of this standard requires finding the right balance between team oversight and trust.
May 15, 2013
by $$anonymous$$
· 40,691 Views · 1 Like
Synchronising Multithreaded Integration Tests revisited
I recently stumbled upon the article Synchronising Multithreaded Integration Tests on Captain Debug's Blog. That post highlights the problem of designing integration tests when the class under test runs its business logic asynchronously. This contrived example was given (I stripped some comments):

    public class ThreadWrapper {

        public void doWork() {
            Thread thread = new Thread() {
                @Override
                public void run() {
                    System.out.println("Start of the thread");
                    addDataToDB();
                    System.out.println("End of the thread method");
                }

                private void addDataToDB() {
                    // Dummy Code...
                    try {
                        Thread.sleep(4000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            };
            thread.start();
            System.out.println("Off and running...");
        }
    }

This is only an example of a common pattern where business logic is delegated to some asynchronous job pool we have no control over. Roger Hughes (the author) enumerates a few techniques for testing such code, including:

  • an arbitrary ("long enough") sleep() in the test method to make sure the background logic finishes
  • refactoring doWork() so that it accepts a CountDownLatch and agrees to notify it when the job is done
  • making the method above package private and @VisibleForTesting only
  • "The" solution: refactoring doWork() so that it accepts an arbitrary Runnable. In the test we can wrap this Runnable (decorator pattern) and wait for the inner Runnable to complete

The last solution is not bad, but it changes the responsibilities of ThreadWrapper significantly. Now it's up to the caller to decide what kind of job should be executed asynchronously, while previously ThreadWrapper encapsulated the business logic completely. I am not saying it's a bad design, but it's drastically different from the original method.

Awaitility

Can we write a test without such a massive refactoring? The first solution involves a handy library called Awaitility. This library is not a silver bullet; it simply evaluates a given condition periodically and makes sure it's fulfilled within a given time.
It's the kind of code you have probably written once or twice, wrapped in a library with a well-designed API. So here is our initial approach:

    import static com.jayway.awaitility.Awaitility.await;
    import static java.util.concurrent.TimeUnit.SECONDS;

    //...

    await().atMost(10, SECONDS).until(recordInserted());

    //...

    private Callable<Boolean> recordInserted() {
        return new Callable<Boolean>() {
            @Override
            public Boolean call() throws Exception {
                return dataExists();
            }
        };
    }

I think there is nothing to explain here. dataExists() is simply a boolean method that initially returns false but will eventually return true once the background task (addDataToDB()) is done. In other words, we assume that the background task introduces some side effect and that dataExists() can detect that side effect.

BTW I happened to have JDK 8 with Lambda support installed, and IntelliJ IDEA gives me a nice tooltip suggesting this Java 8-compatible alternative:

    private Callable<Boolean> recordInserted() {
        return () -> dataExists();
    }

But there's more; a second suggestion transforms my code to:

    private Callable<Boolean> recordInserted() {
        return this::dataExists;
    }

The this:: prefix means that dataExists is a method of the current object. Just as well we can say someDao::dataExists. Simply put, this syntax turns a method into a function object we can pass around (this process is called eta expansion in Scala). By now the recordInserted() method is no longer needed, so I can inline it and remove it completely:

    await().atMost(10, SECONDS).until(this::dataExists);

I am not sure what I love more: the new lambda syntax or how IntelliJ IDEA takes pre-Java 8 code and retrofits it for me automatically (well, it's still a bit experimental, just reported IDEA-106670). I can run this intention in IntelliJ project-wide, Lambda-enabling my whole code base in seconds. Sweet!

But back to the original problem. Awaitility helps a lot by providing a decent API and some handy features. I use it extensively in combination with FluentLenium.
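For reference, the kind of hand-written polling code that such a library wraps can be sketched with plain JDK classes. This is not Awaitility's implementation; `PollingWait`, `awaitTrue`, and the 10 ms poll interval are hypothetical names and values chosen only for this sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PollingWait {

    /**
     * Re-evaluates the condition every 10 ms (an arbitrary interval) until it
     * returns true, or throws TimeoutException once the timeout has elapsed.
     */
    public static void awaitTrue(Callable<Boolean> condition, long timeout, TimeUnit unit)
            throws Exception {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!condition.call()) {
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                        "Condition not fulfilled within " + timeout + " " + unit);
            }
            Thread.sleep(10);
        }
    }
}
```

A test could then call something like `PollingWait.awaitTrue(this::dataExists, 10, SECONDS)`, which mirrors the shape of the await() call above without the fluent API.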
But periodically polling for state changes feels a bit like a workaround and still introduces some latency. Notice that running and synchronizing on asynchronous tasks is quite common, and the JDK already provides the necessary facilities: the Future abstraction!

java.util.concurrent.Future

To limit the scope of refactoring I will leave the original new Thread() approach for now and use SettableFuture from Guava. It is a Future implementation that allows triggering completion or failure at any time, from any thread (see DeferredResult - asynchronous processing in Spring MVC for more advanced usage). As you can see, the changes are quite small:

    public class ThreadWrapper {

        public ListenableFuture<Void> doWork() {
            final SettableFuture<Void> future = SettableFuture.create();
            Thread thread = new Thread() {
                @Override
                public void run() {
                    addDataToDB();
                    //...

                    //last instruction
                    future.set(null);
                }

                private void addDataToDB() {
                    // Dummy Code...
                    // ...
                }
            };
            thread.start();
            return future;
        }
    }

doWork() now returns a ListenableFuture whose lifecycle is controlled inside the asynchronous task. We use Void, but in reality you might want to return some asynchronous result instead. The future.set(null) invocation at the end is crucial. It signals that the future is fulfilled, and all threads waiting for that future will be notified. Once again, in practice you would use e.g. Future<Integer>, and then instead of null we would say future.set(someInteger). Here null is just a placeholder for the Void type.

How does this help us? Test code can now rely on future completion:

    final ListenableFuture<Void> future = wrapper.doWork();
    future.get(10, SECONDS);

future.get() blocks until the future is done (with a timeout), i.e. until we call future.set(...). BTW I use ListenableFuture from Guava, but Java 8 introduces an equivalent, standard CompletableFuture - I will write about it soon.

So, we are getting somewhere. Future is a useful abstraction for waiting and signalling completion of background jobs.
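The same idea can be sketched with JDK classes only, swapping Guava's SettableFuture for Java 8's CompletableFuture. This is an assumption-laden illustration rather than the code from the original post: `FutureThreadWrapper` is a hypothetical name, and the short sleep merely stands in for the real database work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

public class FutureThreadWrapper {

    public Future<Void> doWork() {
        final CompletableFuture<Void> future = new CompletableFuture<>();
        Thread thread = new Thread(() -> {
            addDataToDB();
            // Last instruction: wake up every thread blocked in future.get().
            future.complete(null);
        });
        thread.start();
        return future;
    }

    private void addDataToDB() {
        // Placeholder for the real work (hypothetical 100 ms delay).
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a test, `new FutureThreadWrapper().doWork().get(10, SECONDS)` returns as soon as the background thread calls complete(), with no polling interval involved.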
But there is also one immense advantage of Future which we are not yet taking, ekhm, advantage of: exception handling and propagation. Future.get() will block until the future is complete and then either return the asynchronous result or throw the exception originally thrown from our job. This is really useful for asynchronous tests. Currently, if Thread.run() throws an exception it may or may not be logged or visible to us, and the future will never be completed. With Awaitility it's slightly better: the test will time out without any meaningful reason, which has to be tracked down manually in the console/logs. But with a minor modification our test is much more verbose:

    public void run() {
        try {
            addDataToDB();
            //...
            future.set(null);
        } catch (Exception e) {
            future.setException(e);
        }
    }

If some exception occurs in the asynchronous job, it will pop up and be shown as the JUnit/TestNG failure reason. That's it. If addDataToDB() throws an exception it will not be lost. Instead, our future.get() in the test will re-throw that exception for us. Our test won't simply time out, leaving us with no clue what went wrong.

(Listening)ExecutorService

Great, but do we really have to create this special SettableFuture instance? Can't we just use existing libraries that already give us a Future with the correct underlying implementation? Of course! But this requires further refactoring:

    import com.google.common.util.concurrent.ListenableFuture;
    import com.google.common.util.concurrent.ListeningExecutorService;
    import com.google.common.util.concurrent.MoreExecutors;

    import java.util.concurrent.Executors;

    public class ThreadWrapper {

        private final ListeningExecutorService executorService =
                MoreExecutors.listeningDecorator(
                        Executors.newSingleThreadExecutor()
                );

        public ListenableFuture<?> doWork() {
            Runnable job = new Runnable() {
                @Override
                public void run() {
                    //...
                }
            };
            return executorService.submit(job);
        }
    }

This is what you've all been waiting for. Don't start a new Thread all the time, use a thread pool!
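As a side note on the exception propagation discussed above, a plain JDK ExecutorService already behaves this way: Future.get() wraps whatever the job threw in an ExecutionException. The sketch below is only an illustration; `FailingJobDemo`, `runFailingJob`, and the "DB is down" message are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FailingJobDemo {

    /** Submits a job that always fails and returns the exception the caller observes. */
    public static Throwable runFailingJob() throws Exception {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        try {
            Callable<Void> job = () -> {
                throw new IllegalStateException("DB is down");
            };
            Future<Void> future = executorService.submit(job);
            future.get(10, TimeUnit.SECONDS);
            return null; // not reached: the job always throws
        } catch (ExecutionException e) {
            // The original exception from the background job is the cause.
            return e.getCause();
        } finally {
            executorService.shutdown();
        }
    }
}
```

In a real test you would normally let the ExecutionException escape from the test method, so that the test framework reports the underlying cause as the failure reason.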
I actually went one step further by using ListeningExecutorService, an extension to ExecutorService that returns ListenableFuture instances (see why you want that). The solution doesn't require this; I am just spreading good practices. As you can see, the Future instance is now created and managed for us. The test is exactly the same, but the production code is cleaner and more robust.

MoreExecutors.sameThreadExecutor()

The final trick I want to show you involves dependency injection. First let's externalize the creation of the thread pool from the ThreadWrapper class:

    private final ListeningExecutorService executorService;

    public ThreadWrapper() {
        this(Executors.newSingleThreadExecutor());
    }

    public ThreadWrapper(ExecutorService executorService) {
        this.executorService = MoreExecutors.listeningDecorator(executorService);
    }

We can now optionally supply a custom ExecutorService. This is good for various other reasons, but for us it opens a brand new testing opportunity: MoreExecutors.sameThreadExecutor(). This time we modify our test slightly:

    final ThreadWrapper wrapper = new ThreadWrapper(MoreExecutors.sameThreadExecutor());
    wrapper.doWork().get();

See how we pass a custom ExecutorService? It's a very special implementation that doesn't really maintain a thread pool of any kind. Every time you submit() a task to that "pool" it will be executed in the same thread in a blocking manner. This means that we no longer have an asynchronous test, even though the production code wasn't changed that much! wrapper.doWork() will block until the "background" job finishes. The extra call to get() is still needed to make sure exceptions are propagated, but it is guaranteed never to block (because the job is already done).

Using the same thread to execute an asynchronous task instead of a thread pool might have unexpected results if you somehow depend on thread-based properties, e.g. transactions, security, ThreadLocal.
However, if you use the standard ThreadPoolExecutor with CallerRunsPolicy, the JDK already behaves this way when the thread pool is saturated. So it's not that unusual.

Summary

Testing asynchronous code is hard, but you have options. Several options. But one conclusion that strikes me is the side effect of our efforts. We refactored the original code in order to make it testable. But the final production code is not only testable, but also much better structured and more robust. Surprisingly, it's even source-code compatible with the previous version, as we merely changed the return type from void to Future. It seems to be a rule: testable code is often better designed and implemented. A unit test is the first client code using our library. It naturally forces us to think more about consumers, not the implementation.
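To round this off, the injected-executor idea can also be sketched without Guava. In this illustration a direct `Executor` written as `Runnable::run` plays the role of the same-thread executor, and CompletableFuture.runAsync stands in for the listening decorator. `InjectableWrapper`, its `dataExists` flag, and the trivial `addDataToDB()` body are hypothetical and not part of the original ThreadWrapper.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Future;

public class InjectableWrapper {

    private final Executor executor;
    private volatile boolean dataExists;

    /** Production code would pass a thread pool; tests can pass Runnable::run. */
    public InjectableWrapper(Executor executor) {
        this.executor = executor;
    }

    public Future<Void> doWork() {
        // runAsync completes the returned future normally or exceptionally for us.
        return CompletableFuture.runAsync(this::addDataToDB, executor);
    }

    public boolean dataExists() {
        return dataExists;
    }

    private void addDataToDB() {
        dataExists = true; // stand-in for the real side effect
    }
}
```

In a test, `new InjectableWrapper(Runnable::run).doWork().get()` runs the job in the calling thread, so the future is already complete when doWork() returns and get() only serves to surface a failure.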
May 7, 2013
by Tomasz Nurkiewicz
· 8,936 Views · 1 Like
Agile Estimation in Practice
The longer I spend working as an agile coach, the more I find myself in disagreement with Hamlet. To estimate, or not to estimate? That is the question. Out of all of the agile practices which have been adopted in recent years, few have proven more controversial than this one. The battle for and against rages like Shakespearean armies set against each other's teeth. (Free Estimation Ebook) At first blush there doesn't seem to be any reasonable cause for disagreement. The rationale for making estimates is ostensibly straightforward. If a team is to work in Sprints, and to deliver something at the end of each one, then the work must surely be estimated. Otherwise how can the team know if it is even possible to do the work within the Sprint? How can they commit to deliver something by the end of that time-box if the effort involved is of uncertain magnitude? Well, there are two things that we need to draw out at this point. Firstly, the above rationale assumes that Sprints will be used, and that delivery will therefore be time-boxed. That's a very Scrum oriented philosophy...but Scrum isn't the only agile way of working. Lean-Kanban teams, for example, don't use Sprints and rarely make use of estimates. Secondly, Scrum itself says nothing about estimation. It only says that each item in a backlog must be sized - how that sizing happens is up to the team. It should also be remembered that a Scrum team commits to a Sprint Goal that delivers value, not to the delivery of a certain number of estimated points. So then...to estimate, or not to estimate? Let's listen in at the camp-fires of each side, and pick out in more detail the arguments they make for and against. For (Ye Scrum Brigade of Sprinte and Stande-uppe) "Estimates allow us to predict when a Sprint Goal will be met, and therefore when a substantial increment of value will be delivered" "Our estimates help our stakeholders plan ahead. 
They are part of the value we provide" "Estimates help us to de-risk scope of uncertain size and complexity" "Estimated work can be traded in and out of scope for other work of similar size. Without estimates you can't trade" "The very process of estimation adds value. When we estimate we discuss requirements in more detail, and gain a better understanding of what is needed" Against (Ye Lean Kanban Brigade of Boarde Pullers) "Estimates are rarely accurate. All you are doing when you estimate a piece of work is to set false expectations" "In practice, estimation is seen as a commitment, not as a best guess. Every time you make an estimate, you make a rod for your own back" "Estimation is time consuming. The time a team spends playing planning poker or whatever is time that could have been spent on delivery. Estimation is waste." "It's the actuals that matter, not estimates. Agility requires metrics, and the only metrics that count are those that reflect actual delivery" Both sides are right If you see this debate in terms of whether a Scrum or Lean-Kanban process is being followed, then both sides are right. A Scrum process is optimized for project work where scope risk is high and an entire system is represented. The requirements tend to be uncertain, complex, and very heavily intertwined. By committing to a Sprint Goal and to the delivery of a substantial increment of value, that risk can be managed. Uncertain and interdependent requirements are batched together into a Sprint and dealt with as a group. When this is done well, you have a clear Sprint Goal and a coherent Sprint Backlog. When it is done badly, you have a vague or disjointed Sprint Goal, a mishmash of requirements that command no sense of team purpose, and no team commitment towards the delivery of an increment. A Lean-Kanban process, on the other hand, is usually focused on "Business As Usual" (BAU) activities. The diet of a Lean-Kanban operation should consist of small and repeatable changes. 
They don't have to be related at all...in fact they shouldn't be. Things like bug fixes, minor enhancements, and administrative tasks are representative of this kind of work. Scope risk is low because the process of making such changes is well understood. Estimates are generally held to be unnecessary because there is very little uncertainty to deal with. There is no need for work to be batched...each change can be actioned and delivered independently of all others. Work is enqueued and actioned according to priority and the required quality of service. Predictions are based on the actual rate of delivery, not on estimates. In a Lean-Kanban way of working, the actuals are indeed everything. Methods of estimation So then, estimates add value where scope is uncertain and there are associated risks to be managed. That's why Scrum teams engaged on projects typically make use of them, but Lean-Kanban BAU teams generally don't. Now let's look at three simple methods of estimation that Scrum teams, or other teams doing project work, can make use of. Planning Poker This is a well established technique popularised by Mike Cohn, and variations on his Planning Poker cards can be found in offices across the world. A typical Planning Poker set has cards with the following numbers: ½ 1 2 3 5 8 13 20 40 100. Nerds will observe and be irritated by the fact that this is roughly (but not quite) in line with the Fibonacci sequence. Here's how it's played: An identical hand of cards is given to each team member. Each team member will have a set of cards with numbers on the above pseudo-Fibonacci scale. The Product Owner describes the piece of work to be estimated. Normally this is a user story with acceptance criteria. Each team member mentally estimates the size of it on the scale. They can ask the Product Owner questions to clarify any points, but for the moment they will keep their estimate to themselves. 
Each team member places the card that corresponds to their mental estimate face down in front of them. At the facilitator's instruction - usually the Scrum Master - the team turn their cards over. In an ideal case the cards will all have the same value, suggesting that the team have a common understanding of the requirements and the likely effort that will be involved. If the values are different, the team then need to discuss their estimates and the reasoning behind them. They need to understand each other’s thinking, and from that reach a consensus. It may be necessary to replay the cards several times before agreement is reached. By convention, estimates are written on the corner of a User Story card before being placed on a Scrum board. A variation of this takes from a suite of regular playing cards. The Ace (1), 2, 3, 5, 8, Jack, Queen, and King might be used. The Jack signifies that no or negligible work needs doing (jack all). The Queen indicates a larger story that should be broken down in the planning session and reconsidered, while the King indicates an epic that will need greater analysis and cannot be brought into scope for this Sprint. The Joker can be played if anyone wants a coffee break. As an estimation method, Planning Poker has the advantage of being fairly democratic. Every team member gets a hand of cards and is allowed to play, and has a clear opportunity to explain their reasoning to the others. The disadvantage of Planning Poker is that it can be rather time consuming in comparison with other methods. It can also encourage novice teams to estimate in terms of time, as they are often initially prejudiced to correlate points to hours or days. This prejudice must be challenged and eroded if the relative sizing of estimates is to be achieved. Team Sort (T-Shirt Sizing) This is a good way of doing team estimates if no planning poker cards are available. All you need are six scraps of paper and a set of index cards with the requirements (e.g.
user stories) written on them. Normally these will be the same index cards that go on the Scrum board. Write one of the following sizes on each of the scraps of paper: Extra Small (XS), Small (S), Medium (M), Large (L), Extra Large (XL), and Extra Extra Large (XXL). Arrange the sizes in a horizontal line on a table, ordered from XS on the left to XXL on the right. Put the pile of index cards on the table in front of the sizing line. The team then collaborate to organise the requirements on the cards under the headings XS to XXL. They can ask the Product Owner to clarify any questions that they may have while doing so. Once the cards have all been sorted, story points can be allocated to each of them by mapping each T-Shirt size to a value. This allows metrics to be gathered about the flow of work, and used to populate a velocity or burndown chart. Suggested story point values by T-Shirt size: XS = 1, S = 2, M = 3, L = 5, XL = 8, XXL = 13. An advantage of the team sort is that it is quick and easy to do. The complete set of requirements is estimated in one sweep. Also, it is a fairly direct way of achieving relative sizing. There is no temptation to correlate points to hours. The disadvantage is that it is potentially undemocratic, in that assertive team members can dominate meeker ones with their opinions. There is a variant of the team sort which encourages more egalitarian behavior. Each team member takes it in turns to move one card by one position. They also have the option to pass, i.e. to not move a card. Eventually a consensus should be reached and no more cards will be moved. However this is a more time consuming method and deadlocks can occur. These deadlocks can be difficult to spot if multiple card shifts are caught in the cycle. One Point One Card This method is a spin on the Lean-Kanban approach of tracking actuals. Instead of estimating the relative effort required for each story card, the team estimates how many stories it is likely to complete in the Sprint being planned.
This can be as straightforward as using the yesterday's weather analogy for velocity estimation. Just as the weather today is most likely to resemble the weather yesterday, the velocity that will be achieved by a team in the upcoming Sprint will most likely match the velocity of its predecessor. So if two dozen cards were completed in the last sprint, approximately two dozen can be expected in the one that follows. The budget can be adjusted to allow for holiday, foreseeable absences, and other such changes that will impact the team's commitment. The advantage of this system is its raw simplicity. The estimation overhead is almost negligible. Also, it encourages the authoring of small user stories that will spend little time in progress and that stand little chance of being impeded. The liquidity of the board is therefore increased and further requirements analysis is encouraged. Some variation in size will be inevitable, and there will be statistical outliers, but the effects of these will average out as the flow rate increases. The disadvantage of this technique lies in the separation of fine-grained user stories from business value. There is a significant risk that they will become excessively technically focused and task-like. Conclusion Agile estimation is often seen as being invaluable, yet others dismiss it as waste. The reasons for this disagreement can be traced to disparities in Scrum and Lean-Kanban ways of working, and to the fundamental differences between project work and Business As Usual. When seen in the context of Scrum projects, some form of estimation process is valuable. Yet regardless of the method chosen, it must be acknowledged that a Scrum Team is responsible for its own estimates. No-one else can make a team's estimates for them. Going through that process of estimation, and understanding the size and scope of the work, is fundamental to the team's sense of Sprint Backlog ownership and to their commitment to a Sprint Goal.
May 3, 2013
by $$anonymous$$
· 54,540 Views · 3 Likes
Code Ownership – Who Should Own the Code?
A key decision in building and managing any development team is agreeing on how ownership of the code will be divided up: who is going to work on what code; how much work can be, and should be, shared across the team; and who will be responsible for code quality. The approach that you take has immediate impact on the team’s performance and success, and a long-term impact on the shape and quality of the code. Martin Fowler describes three different models for code ownership on a team: Strong code ownership – every module is owned exclusively by someone, developers can only change the code that they own, and if they need to change somebody else’s code, they need to talk to that owner and get the owner’s agreement first – except maybe in emergencies. Weak code ownership – where modules are still assigned to owners, but developers are allowed to change code owned by other people. Owners are expected to keep an eye on any changes that other people make, and developers are expected to ask for permission first before making changes to somebody else’s code. This can be thought of as a shared custody model, where an individual is forced to share ownership of their code with others; or Code Stewardship, where the team owns all of the code, but one person is held responsible for the quality of specific code, and for helping other people make changes to it, reviewing and approving all major changes, or pairing up with other developers as necessary. Brad Appleton says the job of a code steward is not to make all of the changes to a piece of code, but to “safeguard the integrity + consistency of that code (both conceptually and structurally) and to widely disseminate knowledge and expertise about it to others”. Collective Code Ownership – the code base is owned or shared by the entire team, and everyone is free to make whatever changes they need – or want – to make, including refactoring or rewriting code that somebody else originally wrote. 
This is a model that came out of Extreme Programming, where the Whole Team is responsible together for the quality and integrity of the code and for understanding and keeping the design. Arguments against Strong/Individual Code Ownership Fowler and other XP advocates such as Kent Beck don’t like strong individual code ownership, because it creates artificial barriers and dependencies inside the team. Work will stall and pause if you need to wait for somebody to make or even approve a change, and one owner can often become the critical path for the entire team. This could encourage developers to come up with their own workarounds and compromises. For example, instead of changing an API properly (which would involve a change to somebody else’s code), they might shoehorn in a change, like stuffing something into an existing field. Or they might take a copy of somebody’s code and add whatever they need to it, making maintenance harder in the future. Other arguments against strong ownership are that it can lead to defensiveness and protectionism on the part of some developers (“hey, don’t touch my code!”), where they take any criticism of the code as a personal attack, creating tension on the team and discouraging reviewers from offering feedback and discouraging refactoring efforts; and local over-optimization, if developers are given too much time to spend to polish and perfect their precious code without thinking of the bigger picture. And of course there is the “hit by a truck factor” to consider – the impact that a person leaving the team will have on productivity if they’re the only one who works on a piece of code. Ward Cunningham, one of the original XPers, also believes that there is more pride of ownership when code is shared, because everyone’s work is always on display to everyone else on the team. Arguments against Collective Code Ownership But there are also arguments against Collective Code Ownership.
A post by Mike Spille lists some problems that he has seen when teams try to “over-share” code: Inconsistency. No overriding architecture is discernible, just individual solutions to individual problems. Lots of duplication of effort results, often leading to inconsistent behavior. Bugs. People "refactoring" code they don't really understand break something subtle in the original code. Constant rounds of "The Blame Game". People have a knee jerk reaction to bugs, saying "It worked when I wrote it, but since Joe refactored it....well, that's his problem now.". Slow delivery. Nobody has any expertise in any given domain, so people are spending more time trying to understand other people's code, less time writing new code. Matthias Friedrich, in Thoughts on Collective Code Ownership, believes that Collective Code Ownership can only work if you have the right conditions in place: Team members are all on a similar skill level. Programmers work carefully and trust each other. The code base is in a good state. Unit tests are in place to detect problematic changes (although unit tests only go so far). Remember that Collective Code Ownership came out of Extreme Programming. Successful team ownership depends on everyone sharing an understanding of the domain and the design, and maintaining a high level of technical discipline: not only writing really good automated tests as a safety net, but everyone following consistent code conventions and standards across the code base, and working in pairs because hopefully one of you knows the code, or at least with two heads you can try to help each other understand it and make fewer mistakes. Another problem with Collective Code Ownership is that ownership is spread so thin. Justin Hewlett talks about the Tragedy of the Commons problem: people will take care of their own yard, but how many people will pick up somebody else’s litter in the park, or on a street - even if they walk in that park or down that street every day?
If the code belongs to everyone, then there is always “someone else” who can take care of it – whoever that “someone else” may be. As a developer, you’re under pressure, and you may never touch this piece of code again, so why not get whatever you need to do as quickly as possible and get on to the next thing on your list, and let "somebody else" worry about refactoring or writing that extra unit test or...? Code Ownership in the Real World I've always worked on or with teams that follow individual (strong or weak) code ownership, except for an experiment in pure XP and Collective Code Ownership on one team over 10 years ago. One (or maybe two) people own different pieces of the code and do all or most of the heavy lifting work on that code. Because it only makes sense to have the people who understand the code best do most of the work, or the most important work. It’s not just because you want the work “done right” – sometimes you don’t really have a choice over who is going to do the work. As Ralf Sudelbucher points out, Collective Code ownership assumes that all coding work is interchangeable within a team, which is not always true. Some work isn't interchangeable because of technology: different parts of a system can be written in different languages, with different architectures. You have to learn the language and the framework before you can start to understand the other problems that need to be solved. Or it might be because of the problem space. Sure, there is always coding on any project that is “just typing”: journeyman work that is well understood, like scaffolding work or writing another web form or another CRUD screen or fixing up a report or converting a file format, work that has to be done and can be taken on by anyone who has been on the team for a while and who understands where to find stuff and how things are done – or who pairs up with somebody who knows this. 
But other software development involves solving hard domain problems and technical problems that require a lot of time to understand properly – where it can take days, weeks, months or sometimes even years to immerse yourself in the problem space well enough to know what to do, where not just anyone can jump in and start coding, or even be of much help in a pair programming situation. “The worst disasters occur when you turn loose sorcerers' apprentices on code they don't understand. In a typical project, not everyone can know everything - except in some mature domains where there have been few business paradigm shifts in the past decade or two.” (Jim Coplien, Code Ownership) I met someone who manages software development for a major computer animation studio. His team has a couple of expert developers who did their PhDs and post grad work in animating hair – that’s all that they do, and even if you are really smart you’ll need years of study and experience just to understand how they do what they do. Lots of scientific and technical engineering domains are also like this – maybe not so deeply specialized, but they involve non-trivial work that can’t be easily or competently done by generalists, even competent generalists. Programming medical devices or avionics or robotics or weapons control; or any business domain where you are working at the leading edge of problem solving, applying advanced statistical models to big data analysis or financial trading algorithms or risk-management models; or supercomputing and high-scale computing and parallel programming, or writing an operating system kernel or solving cryptography problems or doing a really good job of User Experience (UX) design. Not everyone understands the problems that need to be solved, not everyone cares about the problems and not everyone can do a good job of solving them.
Ownership and Doing it Right

If you want the work done right, or need it to be done right the first time, it should be done by someone who has worked on the code before, who knows it and who has proven that they can get the job done, not by somebody who has only a superficial familiarity with the code. Research work by Microsoft and others has shown that as more people touch the same piece of code, there is more chance of misunderstandings and mistakes, and that the people who have done the most work on a piece of code are the ones who make the fewest mistakes.

Fowler comes back to this in a later post about “Shifting to Code Ownership”, where he shares a story from a colleague who shifted a team from collective code ownership to weak individual code ownership because weaker or less experienced programmers were making mistakes in core parts of the code and impacting quality, velocity and the team’s morale. They changed their ownership model so anyone could work around the code base, but if they needed to change core code, they had to do this with the help of someone who knew that part of the code well.

In deciding on an ownership approach, you have to make a trade-off between flexibility and quality, team ownership and individual ownership. With individual ownership you can have siloing problems and dependencies on critical people, and you’ll have to watch out for trucks. But you can get more done, faster, better and by fewer people.
April 29, 2013
by Jim Bird
· 14,938 Views
Maven Deploy to Nexus
1. Overview

In a previous article, I discussed how a Maven project can locally install a third party jar that has not yet been deployed on Maven central (or on any of the other large and publicly hosted repositories). That solution should only be applied in small projects where installing, running and maintaining a full Nexus server may be overkill. However, as a project grows, Nexus quickly becomes the only real and mature option for hosting third party artifacts, as well as for reusing internal artifacts across development streams. This article will show how to deploy the artifacts of a project to Nexus, with Maven.

2. Nexus requirements in the pom

In order for Maven to be able to deploy the artifacts it creates in the package phase of the build, the pom needs to define where the packaged artifacts will be deployed, via the distributionManagement element:

    <distributionManagement>
       <snapshotRepository>
          <id>nexus-snapshots</id>
          <url>http://localhost:8081/nexus/content/repositories/snapshots</url>
       </snapshotRepository>
    </distributionManagement>

A hosted, public Snapshots repository comes out of the box on Nexus, so there’s no need to create or configure anything further. Nexus makes it easy to determine the URLs of its hosted repositories: each repository displays the exact entry to be added in the <distributionManagement> of the project pom, under the Summary tab.

3. The plugins

By default, Maven handles the deployment mechanism via the maven-deploy-plugin, which is mapped to the deploy phase of the default Maven lifecycle:

    <plugin>
       <artifactId>maven-deploy-plugin</artifactId>
       <version>2.7</version>
       <executions>
          <execution>
             <id>default-deploy</id>
             <phase>deploy</phase>
             <goals>
                <goal>deploy</goal>
             </goals>
          </execution>
       </executions>
    </plugin>

The maven-deploy-plugin is a viable option to handle the task of deploying the artifacts of a project to Nexus, but it was not built to take full advantage of what Nexus has to offer. For that reason, Sonatype built a Nexus specific plugin, the nexus-staging-maven-plugin, that is designed to use the more advanced functionality of Nexus, such as staging.

Although a simple deployment process does not require staging functionality, we will go forward with this custom Nexus plugin since it was built with the clear purpose of talking to Nexus well. The only reason to use the maven-deploy-plugin is to keep open the option of using an alternative to Nexus in the future, for example an Artifactory repository. However, unlike other components that may actually change throughout the lifecycle of a project, the Maven Repository Manager is highly unlikely to change, so that flexibility is not required.

So, the first step in using another deployment plugin in the deploy phase is to disable the existing, default mapping:

    <plugin>
       <groupId>org.apache.maven.plugins</groupId>
       <artifactId>maven-deploy-plugin</artifactId>
       <version>${maven-deploy-plugin.version}</version>
       <configuration>
          <skip>true</skip>
       </configuration>
    </plugin>

Now, we can define:

    <plugin>
       <groupId>org.sonatype.plugins</groupId>
       <artifactId>nexus-staging-maven-plugin</artifactId>
       <version>1.3</version>
       <executions>
          <execution>
             <id>default-deploy</id>
             <phase>deploy</phase>
             <goals>
                <goal>deploy</goal>
             </goals>
          </execution>
       </executions>
       <configuration>
          <serverId>nexus</serverId>
          <nexusUrl>http://localhost:8081/nexus/</nexusUrl>
          <skipStaging>true</skipStaging>
       </configuration>
    </plugin>

The deploy goal of the plugin is mapped to the deploy phase of the Maven build. Also notice that, as discussed, we do not need staging functionality in a simple deployment of -SNAPSHOT artifacts to Nexus, so that is fully disabled via the <skipStaging> element.

4. The Global settings.xml

Deployment to Nexus is a secured operation, and a deployment user exists for this purpose out of the box on any Nexus instance. Configuring Maven with the credentials of this deployment user, so that it can interact correctly with Nexus, cannot be done in the pom.xml of the project. This is because the syntax of the pom doesn’t allow it, not to mention the fact that the pom may be a public artifact, so it is not well suited to hold credential information. The credentials of the server have to be defined in the global Maven settings.xml:

    <servers>
       <server>
          <id>nexus-snapshots</id>
          <username>deployment</username>
          <password>the_pass_for_the_deployment_user</password>
       </server>
    </servers>

The server can also be configured to use key based security instead of raw, plaintext credentials.

5. The deployment process

Performing the deployment process is a simple task:

    mvn clean deploy -Dmaven.test.skip=true

Skipping tests is OK in the context of a deployment job, because this job should be the last job in a deployment pipeline for the project. A common example of such a deployment pipeline would be a succession of Jenkins jobs, each triggering the next only if it completes successfully. As such, it is the responsibility of the previous jobs in the pipeline to run all test suites of the project; by the time the deployment job runs, all tests should already pass. If the build is run as a single command instead, tests can be kept active so that they run before the deploy phase executes:

    mvn clean deploy

6. Conclusion

This is a simple, yet highly effective solution for deploying Maven artifacts to Nexus. It is also somewhat opinionated: nexus-staging-maven-plugin is used instead of the default maven-deploy-plugin, staging functionality is disabled, and so on. It is these choices that make the solution simple and practical. Potentially activating the full staging functionality can be the subject of a future article. Finally, we’ll discuss the Release Process in the next article.
April 24, 2013
by Eugen Paraschiv
· 43,533 Views · 2 Likes
Multipart Upload on S3 with jclouds
1. Goal

In the previous article, we looked at how we can use the generic Blob APIs from jclouds to upload content to S3. In this article we will use the S3 specific asynchronous API from jclouds to upload content and leverage the multipart upload functionality provided by S3.

2. Preparation

2.1. Set up the custom API

The first part of the upload process is creating the jclouds API, which is a custom API for Amazon S3:

    public AWSS3AsyncClient s3AsyncClient() {
        String identity = ...
        String credentials = ...
        BlobStoreContext context = ContextBuilder.newBuilder("aws-s3").
            credentials(identity, credentials).buildView(BlobStoreContext.class);
        RestContext<AWSS3Client, AWSS3AsyncClient> providerContext = context.unwrap();
        return providerContext.getAsyncApi();
    }

2.2. Determining the number of parts for the content

Amazon S3 requires each part of a multipart upload, except the last one, to be at least 5 MB. As such, the first thing we need to do is determine the right number of parts that we can split our content into so that we don’t have parts below this 5 MB limit:

    public static int getMaximumNumberOfParts(byte[] byteArray) {
        int numberOfParts = byteArray.length / fiveMB; // 5*1024*1024
        if (numberOfParts == 0) {
            // content smaller than 5 MB is uploaded as a single part
            return 1;
        }
        return numberOfParts;
    }

2.3. Breaking the content into parts

We’re going to break the byte array into a set number of parts:

    public static List<byte[]> breakByteArrayIntoParts(byte[] byteArray, int maxNumberOfParts) {
        List<byte[]> parts = Lists.<byte[]> newArrayListWithCapacity(maxNumberOfParts);
        int fullSize = byteArray.length;
        long dimensionOfPart = fullSize / maxNumberOfParts;
        for (int i = 0; i < maxNumberOfParts; i++) {
            int previousSplitPoint = (int) (dimensionOfPart * i);
            int splitPoint = (int) (dimensionOfPart * (i + 1));
            if (i == (maxNumberOfParts - 1)) {
                // the last part also takes the remainder of the division
                splitPoint = fullSize;
            }
            byte[] partBytes = Arrays.copyOfRange(byteArray, previousSplitPoint, splitPoint);
            parts.add(partBytes);
        }
        return parts;
    }

We’re going to test the logic of breaking the byte array into parts: we generate some bytes, split the byte array, recompose it using Guava and verify that we get back the original:

    @Test
    public void given16MByteArray_whenFileBytesAreSplitInto3_thenTheSplitIsCorrect() {
        byte[] byteArray = randomByteData(16);
        int maximumNumberOfParts = S3Util.getMaximumNumberOfParts(byteArray);
        List<byte[]> fileParts = S3Util.breakByteArrayIntoParts(byteArray, maximumNumberOfParts);

        assertThat(fileParts.get(0).length + fileParts.get(1).length + fileParts.get(2).length,
            equalTo(byteArray.length));
        byte[] unmultiplexed = Bytes.concat(fileParts.get(0), fileParts.get(1), fileParts.get(2));
        assertThat(byteArray, equalTo(unmultiplexed));
    }

To generate the data, we simply use the support from Random:

    byte[] randomByteData(int mb) {
        byte[] randomBytes = new byte[mb * 1024 * 1024];
        new Random().nextBytes(randomBytes);
        return randomBytes;
    }

2.4. Creating the Payloads

Now that we have determined the correct number of parts for our content and we managed to break the content into parts, we need to generate the Payload objects for the jclouds API:

    public static List<Payload> createPayloadsOutOfParts(Iterable<byte[]> fileParts) {
        List<Payload> payloads = Lists.newArrayList();
        for (byte[] filePart : fileParts) {
            byte[] partMd5Bytes = Hashing.md5().hashBytes(filePart).asBytes();
            Payload partPayload = Payloads.newByteArrayPayload(filePart);
            partPayload.getContentMetadata().setContentLength((long) filePart.length);
            partPayload.getContentMetadata().setContentMD5(partMd5Bytes);
            payloads.add(partPayload);
        }
        return payloads;
    }

3. Upload

The upload process is a flexible multi-step process. This means:

  • the upload can be started before having all the data: data can be uploaded as it’s coming in
  • data is uploaded in chunks: if one of these operations fails, it can simply be retried
  • chunks can be uploaded in parallel: this can greatly increase the upload speed, especially in the case of large files

3.1. Initiating the Upload operation

The first step in the Upload operation is to initiate the process. This request to S3 must contain the standard HTTP headers, and the Content-MD5 header in particular needs to be computed. We’re going to use the Guava hash function support here:

    byte[] md5Bytes = Hashing.md5().hashBytes(byteArray).asBytes();

This is the md5 hash of the entire byte array, not of the parts yet. To initiate the upload, and for all further interactions with S3, we’re going to use the AWSS3AsyncClient, the asynchronous API we created earlier:

    ObjectMetadata metadata = ObjectMetadataBuilder.create().key(key).contentMD5(md5Bytes).build();
    String uploadId = s3AsyncApi.initiateMultipartUpload(container, metadata).get();

The key is the handle assigned to the object. This needs to be a unique identifier specified by the client.
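The MD5 value used for the Content-MD5 header above is computed with Guava. As a minimal sketch, assuming you would rather not depend on Guava for this one step, the same digest can be produced with the JDK alone. The class and method names (`ContentMd5`, `md5`, `headerValue`) are hypothetical and not part of jclouds; the article's code passes the raw digest bytes to jclouds, while the Base64 string is the form the Content-MD5 header takes on the wire (RFC 1864).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of jclouds or Guava.
final class ContentMd5 {
    private ContentMd5() {}

    /** Returns the raw 16-byte MD5 digest of the content. */
    static byte[] md5(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    /** Returns the Base64 form of the digest, as used in a Content-MD5 header. */
    static String headerValue(byte[] content) {
        return Base64.getEncoder().encodeToString(md5(content));
    }
}
```

The raw bytes from `ContentMd5.md5(byteArray)` should be interchangeable with the `md5Bytes` produced by the Guava call, since both are plain MD5 digests of the same input.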
Also notice that, even though we’re using the async version of the API, we’re blocking for the result of this operation, because we need the result of the initiate request to be able to move forward. The result of the operation is an upload id returned by S3. This id will identify the upload throughout its lifecycle and will be present in all subsequent upload operations.

3.2. Uploading the Parts

The next step is uploading the parts. Our goal here is to send these requests in parallel, as the upload part operations represent the bulk of the upload process:

    List<ListenableFuture<String>> ongoingOperations = Lists.newArrayList();
    for (int partNumber = 0; partNumber < filePartsAsByteArrays.size(); partNumber++) {
        // S3 part numbers start at 1
        ListenableFuture<String> future = s3AsyncApi.uploadPart(
            container, key, partNumber + 1, uploadId, payloads.get(partNumber));
        ongoingOperations.add(future);
    }

The part numbers determine the position of each part in the final object (here we number them consecutively, starting from 1), but the order in which the requests are sent is not relevant. After all of the upload part requests have been submitted, we need to wait for their responses so that we can collect the individual ETag value of each part:

    Function<ListenableFuture<String>, String> getEtagFromOp =
        new Function<ListenableFuture<String>, String>() {
            public String apply(ListenableFuture<String> ongoingOperation) {
                try {
                    return ongoingOperation.get();
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
        };
    List<String> etagsOfParts = Lists.transform(ongoingOperations, getEtagFromOp);

If, for whatever reason, one of the upload part operations fails, the operation can be retried until it succeeds. The logic above does not contain the retry mechanism, but building it in should be straightforward enough.

3.3. Completing the Upload operation

The final step of the upload process is completing the multipart operation.
The S3 API requires the responses from the previous part uploads as a Map, which we can now easily create from the list of ETags that we obtained above:

    Map<Integer, String> parts = Maps.newHashMap();
    for (int i = 0; i < etagsOfParts.size(); i++) {
        parts.put(i + 1, etagsOfParts.get(i));
    }

And finally, send the complete request:

    s3AsyncApi.completeMultipartUpload(container, key, uploadId, parts).get();

This will return the final ETag of the finished object and will complete the entire upload process.

4. Conclusion

In this article we built a multipart enabled, fully parallel upload operation to S3, using the custom S3 jclouds API. This operation is ready to be used as is, but it can be improved in a few ways. First, retry logic should be added around the upload operations to better deal with failures. Next, for really large files, even though the mechanism sends all upload part requests in parallel, a throttling mechanism should still limit the number of parallel requests being sent. This is both to avoid bandwidth becoming a bottleneck and to make sure Amazon itself doesn’t flag the upload process as exceeding an allowed limit of requests per second. The Guava RateLimiter can potentially be very well suited for this.

P.S. You might dig following me on Twitter.
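The two improvements named in the conclusion, retries and throttling, could be sketched roughly as follows using only the JDK. This is an illustration under stated assumptions, not the article's implementation: it caps the number of requests in flight with a fixed-size thread pool rather than limiting requests per second with Guava's RateLimiter, and it assumes each part upload is wrapped in a blocking `Callable` (for example one that calls `uploadPart(...)` and then `get()` on the returned future). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of jclouds.
final class ThrottledRetryingUploads {
    private ThrottledRetryingUploads() {}

    /** Calls the task until it succeeds, giving up after maxAttempts attempts in total. */
    static <T> T withRetry(Callable<T> task, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // do not retry when the thread is being shut down
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /**
     * Runs all tasks with at most maxParallel of them in flight at once,
     * retrying each one, and returns the results in the order of the input list.
     */
    static <T> List<T> runAll(List<Callable<T>> tasks, int maxParallel, int maxAttempts)
            throws InterruptedException, ExecutionException {
        // the pool size is what bounds the number of concurrent uploads
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> task : tasks) {
                futures.add(pool.submit(() -> withRetry(task, maxAttempts)));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // rethrows if all attempts failed
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With one task per part, the returned list would line up with the part order, so it could feed the part number to ETag map in the same way as `etagsOfParts`. A production version would probably also add a delay between attempts and retry only on errors known to be transient.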
April 21, 2013
by Eugen Paraschiv
· 6,594 Views · 1 Like