DZone
The Latest Testing, Deployment, and Maintenance Topics

Code Coverage of Jasmine Tests using Istanbul and Karma
For modern web application development, having dozens of unit tests is not enough anymore. The actual code coverage of those tests reveals whether the application is thoroughly exercised or not. For tests written using the popular Jasmine test library, an easy way to get a coverage report is via Istanbul and Karma.

For this example, let's assume that we have a simple library, sqrt.js, which contains an alternative implementation of Math.sqrt. Note also how it throws an exception instead of returning NaN for an invalid input.

    var my = {
        sqrt: function(x) {
            if (x < 0) throw new Error("sqrt can't work on negative number");
            return Math.exp(Math.log(x) / 2);
        }
    };

Using Jasmine placed under test/lib/jasmine-1.3.1, we can craft a test runner that includes the following spec:

    describe("sqrt", function() {
        it("should compute the square root of 4 as 2", function() {
            expect(my.sqrt(4)).toEqual(2);
        });
    });

Opening the spec runner in a web browser will give the expected outcome. So far so good.

Now let's see how the code coverage of our test setup can be measured. The first order of business is to install Karma. If you are not familiar with Karma, it is basically a test runner which can launch and connect to a specific set of web browsers, run your tests, and then gather the report. Using Node.js, what we need to do is:

    npm install karma karma-coverage

Before launching Karma, we need to specify its configuration. It could be as simple as the following my.conf.js (most entries are self-explanatory). Note that the tests are executed using PhantomJS for simplicity; it is, however, quite trivial to add other web browsers such as Chrome and Firefox.
    module.exports = function(config) {
        config.set({
            basePath: '',
            frameworks: ['jasmine'],
            files: [
                '*.js',
                'test/spec/*.js'
            ],
            browsers: ['PhantomJS'],
            singleRun: true,
            reporters: ['progress', 'coverage'],
            preprocessors: { '*.js': ['coverage'] }
        });
    };

Running the tests, as well as performing code coverage at the same time, can be triggered via:

    node_modules/.bin/karma start my.conf.js

which will dump output like:

    INFO [karma]: Karma v0.10.2 server started at http://localhost:9876/
    INFO [launcher]: Starting browser PhantomJS
    INFO [PhantomJS 1.9.2 (Linux)]: Connected on socket n9ndnhj0np92ntspgx-x
    PhantomJS 1.9.2 (Linux): Executed 1 of 1 SUCCESS (0.029 secs / 0.003 secs)

As expected (from the previous manual invocation of the spec runner), the test passed just fine. However, the most interesting piece here is the code coverage report, which is stored (in the default location) under the subdirectory coverage. Open the report in your favorite browser and there you'll find the coverage analysis.

Behind the scenes, Karma is using Istanbul, a comprehensive JavaScript code coverage tool (read also my previous blog post on JavaScript code coverage with Istanbul). Istanbul parses the source file, in this example sqrt.js, using Esprima and then adds some extra instrumentation which is used to gather the execution statistics. The HTML report is only one of the possible outputs; Istanbul can also generate an LCOV report, which is suitable for many continuous integration systems (Jenkins, TeamCity, etc.). An extensive analysis of the coverage data should also prevent any future coverage regression; check out my other post, Hard Thresholds on JavaScript Code Coverage.

One important thing about code coverage is branch coverage. If you look carefully, our test above is still not exercising the situation where the input to my.sqrt is negative.
There is a big "I" marker on the third line of the code. This is Istanbul telling us that the if branch is not taken at all (for the else branch, it would be an "E" marker). Once this missing branch is noticed, improving the situation is as easy as adding one more test to the spec:

    it("should throw an exception if given a negative number", function() {
        expect(function() { my.sqrt(-1); }).
            toThrow(new Error("sqrt can't work on negative number"));
    });

Once the test is executed again, the code coverage report looks way better and everyone is happy.

If you have some difficulties following the above step-by-step instructions, take a look at a Git repository I have prepared: github.com/ariya/coverage-jasmine-istanbul-karma. Feel free to play with it and customize it to suit your workflow!
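The branch-coverage point is not specific to JavaScript. As a rough illustration only, here is the same two-branch function restated in Java; the class name SqrtBranchSketch and the helper throwsOnNegative are hypothetical, and no coverage tool is involved. The sketch simply shows that one check per branch is needed before both paths of the if are exercised.

```java
/** Hypothetical Java restatement of the article's sqrt example; not part of the original project. */
class SqrtBranchSketch {

    /** Mirrors my.sqrt: throws for negative input instead of returning NaN. */
    static double sqrt(double x) {
        if (x < 0) {
            throw new IllegalArgumentException("sqrt can't work on negative number");
        }
        return Math.exp(Math.log(x) / 2);
    }

    /** Exercises the "if" branch, the one the first spec never reached. */
    static boolean throwsOnNegative() {
        try {
            sqrt(-1);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // First check covers the normal path; a tolerance is used because exp/log are approximate.
        System.out.println(Math.abs(sqrt(4) - 2.0) < 1e-9);
        // Second check covers the error path.
        System.out.println(throwsOnNegative());
    }
}
```

With only the first check, a coverage tool would report the throw statement as never executed, which is the situation the report above flags.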
October 8, 2013
by Ariya Hidayat
· 49,214 Views
Introduction to Android Studio
It feels good to be back at the blog. I have been managing GDG Ahmedabad, delivering Android talks, and running workshops locally and outside my region. Last month I was busy organizing the "DevFest" event for GDG Ahmedabad and then preparing the two talks I was invited to deliver at the DevFest organized by GDG Kathmandu. I have already published the slides on my Speakerdeck, but even if you have seen them, give me a chance to write an introduction to Android Studio here.

What is Android Studio?

It's an Android-focused IDE, designed specifically for Android development. It was launched on 16th May 2013, during Google's I/O 2013 event. Android Studio contains all the Android SDK tools to design, test, debug and profile your app. Looking at the development tools and environment, we can see it's similar to Eclipse with the ADT plug-in, but as mentioned above, it's an Android-focused IDE, and it has many features that can increase your development productivity. One great thing is that it is based on the IntelliJ IDEA IDE, which has proved itself to be a great IDE and is in use by many Android engineers.

What is the Difference Between IntelliJ IDEA and Android Studio?

Nothing, in regards to Android.

If you use IntelliJ:
  • Keep using it
  • IntelliJ 13 will have the same stuff
  • The EAP of IntelliJ IDEA 13 includes all the new stuff

If not:
  • Give Android Studio a try

You may have some questions in mind regarding IntelliJ and Android Studio. If so, check the FAQ section: IntelliJ IDEA and Android Studio FAQ.

Let's Download Android Studio

You can download Android Studio from the Android developer site: http://developer.android.com/sdk/installing/studio.html.
Cool Features of Android Studio

As I have mentioned, it's similar to Eclipse with the ADT plug-in, but Android Studio has many features that can help you increase development productivity:
  • Powerful code editing (smart editing, code refactoring)
  • Rich layout editor (as soon as you drag and drop views onto the layout, it shows you a preview on all the screens, including Nexus 4, Nexus 7, Nexus 10 and many other resolutions; layout design is much faster than in Eclipse)
  • Gradle-based build support
  • Maven support
  • Template-based wizards
  • Lint tool analysis (the Android lint tool is a static code analysis tool that checks your Android project source files for potential bugs and optimization improvements for correctness, security, performance, usability, accessibility, and internationalization)

You can experience all of these features by using Android Studio yourself.

Awesome Stuff Inside

Darcula Theme

It's a dark theme, and I enjoy working in it while using Android Studio. By the way, it's the Darcula theme, not Dracula. I am pointing this out because I have seen many people on Stack Overflow and Google+ saying Dracula. You can set the Darcula theme in Android Studio via File > Settings > IDE Settings > Appearance > Theme: Darcula.

Preview All the Screens

We can consider this part of the rich layout editor feature. With it, users can design layouts and check them by previewing on all the possible screens, such as Nexus 4, Nexus 7, Nexus 10 and many other devices. It helps the user improve layout designs while staying compatible with the various resolutions available.

Device Framed Screen Capture

It provides the ability to directly generate a screenshot of your application.
Yes, this was already included in the SDK, but Android Studio provides something more:
  • Device frame (frames for many Nexus devices are available, so you can capture the screenshot in whichever frame you like most)
  • Drop shadow
  • Screen glare

Color Preview

I like this feature very much and have found it helpful while working on big projects. In Eclipse we need a third-party color chooser and picker, but this feature lets us select a color from the built-in color chooser and also shows a preview in the Colors.xml file.

Color Preview – Activity class

While using Eclipse, it's difficult to check which color we have used. Yes, we can imagine the color by its name, but an actual preview is much better. This feature was recently introduced in Android Studio, so you must have the latest version installed.

Hard Coded Strings

Here is another feature I like and have found useful: whenever you use a string resource from Strings.xml, the editor displays the actual value instead of the variable name. This setting is on by default, but in case you aren't able to see hard-coded strings in your activity class, try any of the following:
  • Settings > Editor > Code Folding > Android String References
  • Select the string, right-click on it and go to Folding > Collapse
  • CTRL + Numpad '-'

Create Layout Variation

This provides the ability to create a layout variation directly, for example a layout for the large screen, a layout for the xlarge screen, etc. The great thing is that the created variant layout gets stored in the matching folder, such as layout-xlarge or layout-large-land.

Should I Use Android Studio?

You might have explored all the cool features, or you are ready to explore right now.
But questions might have arisen in your mind: "Should I use Android Studio?", "Should we start using Android Studio right now?", or "Should I continue with IntelliJ or Eclipse?" My answer is a big NO to using Android Studio as your main IDE for Android development, because it is currently an EARLY ACCESS PREVIEW and is still maturing. Engineers have been working hard to improve this IDE, so you should wait until the BETA comes out. I agree with Carlos Vega (who commented on G+) on this point: "You should at least migrate to Intellij Idea 12 so that you get familiar with the IDE's workflow and keyboard shortcuts. That way when Android Studio reach a more stable level, you can switch without a major learning curve." Thanks, Carlos Vega, for the input. By the way, here is the presentation I delivered at the GDG Kathmandu DevFest.
October 7, 2013
by Paresh Mayani
· 26,633 Views
TestNG @Test Annotation and DataProviderClass Example
In the previous post, we saw an example where the dataProvider attribute was used to run the same test method with different sets of input data. TestNG provides another attribute, dataProviderClass, which works in conjunction with dataProvider to fetch the input data for the test methods from an external class. The class that holds the input data is set in the dataProviderClass attribute, and dataProvider holds the name of the method from which the input data is actually fetched. Here is a quick example showing how to use the dataProviderClass and dataProvider attributes.

Code

Service Class

    package com.skilledmonster.example;

    /**
     * Simple calculator service to demonstrate TestNG Framework
     *
     * @author Jagadeesh Motamarri
     * @version 1.0
     */
    public interface CalculatorService {
        int sum(int a, int b);
        int multiply(int a, int b);
        int div(int a, int b);
        int sub(int a, int b);
    }

Service Implementation Class

    package com.skilledmonster.example;

    /**
     * Simple calculator service implementation to demonstrate TestNG Framework
     *
     * @author Jagadeesh Motamarri
     * @version 1.0
     */
    public class SimpleCalculator implements CalculatorService {
        public int sum(int a, int b) {
            return a + b;
        }
        public int multiply(int a, int b) {
            return a * b;
        }
        public int div(int a, int b) {
            return a / b;
        }
        public int sub(int a, int b) {
            return a - b;
        }
    }

Data Provider Class

    package com.skilledmonster.common;

    import org.testng.annotations.DataProvider;

    /**
     * Data Provider class for TestNG test cases
     *
     * @author Jagadeesh Motamarri
     * @version 1.0
     */
    public class TestNGDataProvider {
        /**
         * Data Provider for testing sum of 2 numbers
         *
         * @return
         */
        @DataProvider
        public static Object[][] testSumInput() {
            return new Object[][] { { 5, 5 }, { 10, 10 }, { 20, 20 } };
        }

        /**
         * Data Provider for testing multiplication of 2 numbers
         *
         * @return
         */
        @DataProvider
        public static Object[][] testMultipleInput() {
            return new Object[][] { { 5, 5 }, { 10, 10 }, { 20, 20 } };
        }
    }

Finally, the test class that uses the dataProviderClass attribute to feed the input data to the test methods:

    package com.skilledmonster.example;

    import org.testng.Assert;
    import org.testng.annotations.BeforeClass;
    import org.testng.annotations.Test;

    import com.skilledmonster.common.TestNGDataProvider;

    /**
     * Example to demonstrate use of dataProviderClass and dataProvider attributes of TestNG framework
     *
     * @author Jagadeesh Motamarri
     * @version 1.0
     */
    public class TestNGAnnotationTestDataProviderExample {
        public CalculatorService service;

        @BeforeClass
        public void init() {
            System.out.println("@BeforeClass: The annotated method will be run before the first test method in the current class is invoked.");
            System.out.println("init service");
            service = new SimpleCalculator();
        }

        @Test(dataProviderClass = TestNGDataProvider.class, dataProvider = "testSumInput")
        public void testSum(int a, int b) {
            System.out.println("@Test : testSum()");
            int result = service.sum(a, b);
            Assert.assertEquals(result, a + b);
        }

        @Test(dataProviderClass = TestNGDataProvider.class, dataProvider = "testMultipleInput")
        public void testMultiple(int a, int b) {
            System.out.println("@Test : testMultiple()");
            int result = service.multiply(a, b);
            Assert.assertEquals(result, a * b);
        }
    }

Output

As the console output shows, each of the testSum() and testMultiple() methods is invoked once per set of input data, with the data supplied by an external class through the dataProviderClass attribute.

Advantage

More flexibility and reusability of commonly used data across several test classes.

Download

Download TestNG DataProvider Example
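To make the mechanism concrete, the following is a minimal plain-Java sketch of the idea behind a data provider: a method returns rows of arguments, and the runner calls the test logic once per row. It deliberately does not use TestNG, and the names DataProviderSketch and runForEachRow are hypothetical; TestNG's real runner does considerably more (reporting, type conversion, parallelism).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Plain-Java illustration of the data-provider idea; an assumption-laden sketch, not TestNG itself. */
class DataProviderSketch {

    /** Stand-in for TestNGDataProvider.testSumInput(): each inner array is one set of arguments. */
    static Object[][] testSumInput() {
        return new Object[][] { { 5, 5 }, { 10, 10 }, { 20, 20 } };
    }

    /** Invokes the operation once per row, the way a runner invokes a test method once per data set. */
    static List<Integer> runForEachRow(Object[][] rows,
                                       BiFunction<Integer, Integer, Integer> operation) {
        List<Integer> results = new ArrayList<>();
        for (Object[] row : rows) {
            int a = (Integer) row[0];
            int b = (Integer) row[1];
            results.add(operation.apply(a, b));
        }
        return results;
    }

    public static void main(String[] args) {
        // Three rows in, three invocations out: prints [10, 20, 40]
        System.out.println(runForEachRow(testSumInput(), Integer::sum));
    }
}
```

Keeping the rows in a separate method (or, with TestNG, a separate class) is what lets several test classes share the same input data.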
October 2, 2013
by Jagadeesh Motamarri
· 25,471 Views
Sparse and Memory-mapped Files
One of the problems with memory-mapped files is that you can't actually map beyond the end of the file, so you can't use mapping to extend your file. I had a thought about that and set out to check what happens when I create a sparse file (a file that only takes space when you write to it) and map it at the same time. As it turns out, this works pretty well in practice. Here is how it works:

    using (var f = File.Create(path))
    {
        int bytesReturned = 0;
        var nativeOverlapped = new NativeOverlapped();
        // Mark the file as sparse before extending it
        if (!NativeMethod.DeviceIoControl(f.SafeFileHandle, EIoControlCode.FsctlSetSparse, IntPtr.Zero, 0,
                                          IntPtr.Zero, 0, ref bytesReturned, ref nativeOverlapped))
        {
            throw new Win32Exception();
        }
        f.SetLength(1024*1024*1024*64L);
    }

This creates a sparse file that is 64 GB in size. Then we can map it normally:

    using (var mmf = MemoryMappedFile.CreateFromFile(path))
    using (var memoryMappedViewAccessor = mmf.CreateViewAccessor(0, 1024*1024*1024*64L))
    {
        // buffer is a byte[] defined elsewhere
        for (long i = 0; i < memoryMappedViewAccessor.Capacity; i += buffer.Length)
        {
            memoryMappedViewAccessor.WriteArray(i, buffer, 0, buffer.Length);
        }
    }

And then we can do stuff to it, including writing to yet-unallocated parts of the file. This also means that you don't have to worry about writing past the end of the file; the OS will take care of all of that for you. Happy happy, joy joy, etc.

There is one problem with this method, however. You have a 64 GB file, but you don't have that much allocated, which in turn means that you might not have that much space available for the file. That brings up an interesting question: what happens when you are trying to commit a new page and the disk is out of space? Using file I/O you would get an I/O error with the right code. But when using memory-mapped files, the error would actually turn up during access, which can happen pretty much anywhere.
It also means that it is a Structured Exception Handling (SEH) error in Windows, which requires special treatment. To test this out, I ran the code above against a disk that had only about 50 GB free. I wanted to know what would happen when it ran out of space. That is something that actually happens, and we need to be able to address it robustly. The kicker is that it might happen at any time, which would result in some… interesting behavior with regard to robustness. In other words, I don't think that this is a viable option. It is a really cool trick, but I don't think it is a very well thought out option. By the way, the result of my experiment was that we had an effectively frozen process. No errors, nothing, just a hang. Also, I am pretty sure that WriteArray() is really slow, but I'll check this out at another point in time.
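For comparison, here is a minimal sketch of the same sparse-plus-mapped idea using Java NIO instead of the C# and DeviceIoControl approach above. It is an illustration under assumptions, not a port of the author's code: StandardOpenOption.SPARSE is only a hint that the file system may ignore, a single Java mapping is limited to Integer.MAX_VALUE bytes (so the sketch maps 16 MB rather than 64 GB), and the names SparseMapSketch, writeAtEnd and roundTrip are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch only: creates a sparse-hinted file, maps it, and writes near its end. */
class SparseMapSketch {

    /** Maps `size` bytes of a newly created file, writes one byte at the last offset and reads it back. */
    static byte writeAtEnd(Path path, int size, byte value) {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE_NEW, // the SPARSE hint applies only to newly created files
                StandardOpenOption.SPARSE,     // ignored where sparse files are not supported
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // A READ_WRITE mapping larger than the file extends the file to the mapped size.
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            map.put(size - 1, value);
            return map.get(size - 1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the sketch against a fresh temporary directory; the file is left behind for inspection. */
    static byte roundTrip() {
        try {
            Path dir = Files.createTempDirectory("sparse-demo");
            return writeAtEnd(dir.resolve("data.bin"), 16 * 1024 * 1024, (byte) 42);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

The caveat from the article carries over: if the disk fills up while a mapped page is being committed, the failure surfaces at the memory access rather than as an ordinary I/O error from a write call, so this pattern is hard to make robust.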
October 1, 2013
by Oren Eini
· 8,118 Views
Android Activity Recognition
Activity recognition gives our Android device the ability to detect a number of our physical activities, like walking, riding a bicycle, driving a car or standing idle. All of that can be detected by simply using an API to access Google Play services, an increasingly crucial piece of software available to all Android versions.

As in the article on geofencing, we will download the sample app (ActivityRecognition.zip) from the Android developer's site and start playing with it, eventually modifying parts of it to fit our purposes. We will show here only the most relevant code sections.

The first thing to note is that we need a specific permission to use activity recognition.

As with geofencing or location updates, we use the API to request Google Play services to analyse our data and provide us with the results. The chain of method calls for requesting updates is similar to that of geofencing:

1. Make sure that Google Play services is available.
2. As an activity recognition client, request a connection.
3. Once connected, Location Services calls back the onConnected() method in our app.
4. Proceed with the updates request via a PendingIntent pointing to an IntentService we have written.
5. Google Location Services sends out its activity recognition updates as Intent objects, using the PendingIntent we provided.
6. Get and process the updates in our IntentService's onHandleIntent() method.

The sample app writes all the updates to a log file, and that is OK if we like that sort of thing, though a closer look at the data makes us realize that most of it is garbage. Do we really need to know that we have a 27 percent chance of driving a vehicle and a 7 percent chance of riding a bicycle when we are in fact sitting idle at our desk? Not really. What we want is the most significant data, and in this case, that would be the most probable activity:

    //..
    import com.google.android.gms.location.ActivityRecognitionResult;
    import com.google.android.gms.location.DetectedActivity;

    /**
     * Service that receives ActivityRecognition updates. It receives updates
     * in the background, even if the main Activity is not visible.
     */
    public class ActivityRecognitionIntentService extends IntentService {
        //..
        /**
         * Called when a new activity detection update is available.
         */
        @Override
        protected void onHandleIntent(Intent intent) {
            //...
            // If the intent contains an update
            if (ActivityRecognitionResult.hasResult(intent)) {
                // Get the update
                ActivityRecognitionResult result = ActivityRecognitionResult.extractResult(intent);
                DetectedActivity mostProbableActivity = result.getMostProbableActivity();
                // Get the confidence % (probability)
                int confidence = mostProbableActivity.getConfidence();
                // Get the type
                int activityType = mostProbableActivity.getType();
                /* Types:
                 * DetectedActivity.IN_VEHICLE
                 * DetectedActivity.ON_BICYCLE
                 * DetectedActivity.ON_FOOT
                 * DetectedActivity.STILL
                 * DetectedActivity.UNKNOWN
                 * DetectedActivity.TILTING
                 */
                // process
            }
        }
    }

Instead of writing the updates to a log file, it is simpler to just store them in memory (e.g. in a static list in a dedicated class) and display them to the user of our app. One way to do this is to use a Fragment to display the updates on top of a Google Map. As commented in previous articles, fragments were introduced in Honeycomb but are also available to older Android versions through the Support Library.

Once we define our own XML layout for the ActReconFragment and give it a transparent background (left to the reader as an exercise), we get a nice overlaid display.

Since we have chosen to show the most probable activity to the users of our app, we need the display to be dynamic, like a live feed. For that, we can add a local broadcast in our service:

    // Inside ActivityRecognitionIntentService's onHandleIntent
    Intent broadcastIntent = new Intent();
    // Give it the category for all intents sent by the Intent Service
    broadcastIntent.addCategory(ActivityUtils.CATEGORY_LOCATION_SERVICES);
    // Set the action and content for the broadcast intent
    broadcastIntent.setAction(ActivityUtils.ACTION_REFRESH_STATUS_LIST);
    // Broadcast *locally* to other components in this app
    LocalBroadcastManager.getInstance(this).sendBroadcast(broadcastIntent);

We are using a LocalBroadcastManager (included in Android 3.0 and above, and in the Support Library v4 for earlier releases). Apart from providing our own layout to position the activity detection panel on top of a map, the only new code snippet we wrote is the above local broadcast. For the remainder below, we have simply re-positioned the sample app's code in a Fragment and used in-memory storage of the activity updates instead of a log file. The receiver of that local broadcast is in our Fragment:

    //...
    public class ActReconFragment extends Fragment {
        // Intent filter for incoming broadcasts from the IntentService
        IntentFilter mBroadcastFilter;
        // Instance of a local broadcast manager
        private LocalBroadcastManager mBroadcastManager;
        //...
        /**
         * Called when the corresponding Map activity's
         * onCreate() method has completed.
         */
        @Override
        public void onActivityCreated(Bundle savedInstanceState) {
            super.onActivityCreated(savedInstanceState);
            // Set the broadcast receiver intent filter
            mBroadcastManager = LocalBroadcastManager.getInstance(getActivity());
            // Create a new Intent filter for the broadcast receiver
            mBroadcastFilter = new IntentFilter(ActivityUtils.ACTION_REFRESH_STATUS_LIST);
            mBroadcastFilter.addCategory(ActivityUtils.CATEGORY_LOCATION_SERVICES);
            //...
        }

        /**
         * Broadcast receiver that receives activity update intents.
         * This receiver is local only. It can't read broadcast Intents from other apps.
         */
        BroadcastReceiver updateListReceiver = new BroadcastReceiver() {
            @Override
            public void onReceive(Context context, Intent intent) {
                // When an Intent is received from the update listener IntentService,
                // update the display.
                updateActivityHistory();
            }
        };
        //...
    }

Once we have taken care of the display, we need to move on to other important aspects, like what to do with those activity updates. The sample app gives us one example of that in the ActivityRecognitionIntentService:

    if (
        // If the current type is "moving", i.e. on foot, bicycle or vehicle
        isMoving(activityType)
        &&
        // The activity has changed from the previous activity
        activityChanged(activityType)
        // The confidence level for the current activity is >= 50%
        && (confidence >= 50)) {
        // do something useful
    }

Simply getting the most probable activity might be OK for display purposes, but might not be enough for an app to act on it and do something useful. We need to make sure that the type of activity and the corresponding confidence level (i.e. probability) are adequate for our purposes. While a detected activity type of "unknown" with a confidence level of 52% is next to useless, knowing that the user is moving in a vehicle as opposed to walking can be put to good use: increase the frequency of location updates, enlarge the map area of available points of interest, etc.

Activity recognition has been added as an experimental feature to this geofencing app. Check it out and feel free to post any feedback.
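The decision rule quoted from the sample app (moving, changed, and at least 50% confident) can be tried out without an Android device. The sketch below is a plain-Java approximation under stated assumptions: the integer constants are stand-ins for the DetectedActivity type constants rather than imports from Google Play services, the previous activity is kept in a field, and the class name ActivityFilterSketch and method shouldAct are hypothetical.

```java
/** Plain-Java approximation of the sample app's filter; does not depend on Google Play services. */
class ActivityFilterSketch {

    // Stand-ins for the DetectedActivity type constants (assumed values, for illustration only).
    static final int IN_VEHICLE = 0;
    static final int ON_BICYCLE = 1;
    static final int ON_FOOT = 2;
    static final int STILL = 3;
    static final int UNKNOWN = 4;
    static final int TILTING = 5;

    // Last activity type seen; assumption: kept in memory rather than in persistent storage.
    private int previousType = UNKNOWN;

    /** "Moving" means on foot, on a bicycle or in a vehicle. */
    static boolean isMoving(int type) {
        return type == IN_VEHICLE || type == ON_BICYCLE || type == ON_FOOT;
    }

    /** Returns true only for a moving activity that differs from the previous one with confidence >= 50. */
    boolean shouldAct(int type, int confidence) {
        boolean changed = type != previousType;
        previousType = type; // remember every update, whether or not we act on it
        return isMoving(type) && changed && confidence >= 50;
    }

    public static void main(String[] args) {
        ActivityFilterSketch filter = new ActivityFilterSketch();
        System.out.println(filter.shouldAct(IN_VEHICLE, 80)); // moving, changed, confident
        System.out.println(filter.shouldAct(IN_VEHICLE, 90)); // same activity as before
        System.out.println(filter.shouldAct(STILL, 95));      // not a moving activity
    }
}
```

One design choice worth noting: this sketch records the previous type on every update, so a low-confidence reading still counts as the "previous" activity. Whether that is desirable depends on the app; recording only acted-upon updates is an equally plausible variant.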
September 30, 2013
by Tony Siciliani
· 32,839 Views
ElasticSearch: Java API
ElasticSearch provides a Java API that executes all operations asynchronously through a client object.
September 30, 2013
by Hüseyin Akdoğan
· 137,555 Views · 4 Likes
TestNG @BeforeClass Annotation Example
A TestNG method annotated with @BeforeClass runs before the first test method in the current class is invoked.
September 28, 2013
by Jagadeesh Motamarri
· 45,488 Views · 3 Likes
Connecting to SQL Azure with SQL Management Studio
Intro

If you want to manage your SQL Databases in Azure using tools that you're a little more familiar and comfortable with, for example SQL Management Studio, how do you go about connecting? You could read the help article from Microsoft, or you can follow my screen-based instructions below.

Assumptions

1. I'm assuming you have a version of SQL Management Studio already installed. I believe you'll need at least SQL Server 2008 R2's version or newer.
2. I'm further assuming you've already created a SQL Database in Azure.

Steps to Connect SSMS to SQL Azure

1. Authenticate to the Azure Portal.
2. Click on SQL Databases.
3. Click on Servers.
4. Click on the name of the server you wish to connect to.
5. Click on Configure. If not already in place, click on 'Add to the allowed IP addresses' to add your current IP address (or specify an address you wish to connect from) and click 'Save'.
6. Open SQL Management Studio and connect to Database services (usually comes up by default):
  • Enter the fully qualified server name (.database.windows.net)
  • Change to SQL Server Authentication
  • Enter the login preferred (if a new database, the username you specified when you created the DB server)
  • Enter the correct password
7. Hit the Connect button.

Troubleshooting

  • Ensure you have the appropriate ports open outbound from your local network or connection (typically port 1433).
  • Ensure you have allowed the correct public IP address you're trying to connect from via the Azure Portal (steps 1-5 above).
  • Ensure you are using the correct server name and user name. For SSMS, this is the server name (in step 4) followed by .database.windows.net
  • Ensure you are using SQL Server Authentication. For SSMS the username format is
  • If you forgot the password of your username, you can reset the password in the Azure Portal: in step 4, click on Dashboard.

Lastly…

You can click on the Database (in step 2) to see your connection options.
September 25, 2013
by Rob Sanders
· 262,899 Views
TestNG Dependency Test – Multiple Test Method Dependency
Dependency is a feature in TestNG that allows a test method to depend on a single test method or a group of test methods. This ensures that a set of tests is executed before the dependent test method runs. A dependency on multiple test methods is configured by providing the names of the methods depended on, comma separated, to the dependsOnMethods attribute of the Test annotation. The following example shows a test class where the process() test method depends on the test methods start() and init() of the same class.

Code

    package com.skilledmonster.example;

    import org.testng.annotations.Test;

    /**
     * Example to demonstrate TestNG multiple dependency method execution
     *
     * @author Jagadeesh Motamarri
     * @version 1.0
     */
    public class MultipleDependencyTest {
        @Test
        public void start() {
            System.out.println("Starting the server");
        }

        @Test(dependsOnMethods = { "start" })
        public void init() {
            System.out.println("Initializing the data for processing!");
        }

        @Test(dependsOnMethods = { "start", "init" })
        public void process() {
            System.out.println("Processing the data!");
        }

        @Test(dependsOnMethods = { "process" })
        public void stop() {
            System.out.println("Stopping the server");
        }
    }

Output

As the console output shows, the process() method is executed after the start() and init() methods, and likewise the stop() method is executed after the process() method.

Download

[GitHub]
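The ordering that dependsOnMethods produces can be reasoned about as a dependency graph. The following plain-Java sketch, which does not use TestNG and whose names (DependencyOrderSketch, executionOrder, exampleOrder) are hypothetical, computes one valid execution order for the four methods above with a depth-first traversal. It illustrates the ordering constraint only; it is not a description of how TestNG schedules tests internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: orders method names so that every method comes after the methods it depends on. */
class DependencyOrderSketch {

    static List<String> executionOrder(Map<String, List<String>> dependsOn) {
        Set<String> done = new LinkedHashSet<>();
        for (String method : dependsOn.keySet()) {
            visit(method, dependsOn, done, new LinkedHashSet<>());
        }
        return new ArrayList<>(done);
    }

    private static void visit(String method, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(method)) {
            return;
        }
        if (!inProgress.add(method)) {
            throw new IllegalStateException("Cyclic dependency at " + method);
        }
        // Schedule everything this method depends on first.
        for (String dependency : dependsOn.getOrDefault(method, List.of())) {
            visit(dependency, dependsOn, done, inProgress);
        }
        inProgress.remove(method);
        done.add(method);
    }

    /** The dependencies from MultipleDependencyTest, deliberately declared in reverse order. */
    static List<String> exampleOrder() {
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("stop", List.of("process"));
        dependsOn.put("process", List.of("start", "init"));
        dependsOn.put("init", List.of("start"));
        dependsOn.put("start", List.of());
        return executionOrder(dependsOn);
    }

    public static void main(String[] args) {
        // Prints [start, init, process, stop]
        System.out.println(exampleOrder());
    }
}
```

Declaring the entries in reverse order shows that the result depends on the declared dependencies, not on the order in which the methods appear.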
September 22, 2013
by Jagadeesh Motamarri
· 40,845 Views
The Real Cost of Change in Software Development
There are two widely opposed (and often misunderstood) positions on how expensive it can be to change or fix software once it has been designed, coded, tested and implemented. One holds that it is extremely expensive to leave changes until late, that the cost of change rises exponentially. The other position is that changes should be left as late as possible, because the cost of changing software is (or at least can be) essentially flat; that's why we call it software. Which position is right? Why should we care? And what can we do about it?

Exponential Cost of Change

Back in the early 1980s, Barry Boehm published some statistics (Software Engineering Economics, 1981) which showed that the cost of making a software change or fix increases significantly over time. Boehm looked at data collected from Waterfall-based projects at TRW and IBM in the 1970s, and found that the cost of making a change increases as you move from the stages of requirements analysis to architecture, design, coding, testing and deployment. A requirements mistake found and corrected while you are still defining the requirements costs almost nothing. But if you wait until after you've finished designing, coding and testing the system and delivering it to the customer, it can cost up to 100 times as much.

A few caveats here. First, the cost curve is much higher in large projects (in smaller projects, the cost curve is more like 1:4 instead of 1:100). Those cases when the cost of change rises up to 100 times are rare, what Boehm calls Architecture-Breakers, where the team gets a fundamental architectural assumption wrong (scaling, performance, reliability) and doesn't find out until after customers are already using the system and running into serious operational problems.
This analysis was all done on a small data sample from more than 30 years ago, when developing code was much more expensive and time-consuming and paperworky, and the tools sucked. A few other studies have been done since then that mostly back up Boehm's findings, at least the basic idea that the longer it takes for you to find out that you made a mistake, the more expensive it is to correct it. These studies have been widely referenced in books like Steve McConnell's Code Complete, and used to justify the importance of early reviews and testing:

"Studies over the last 25 years have proven conclusively that it pays to do things right the first time. Unnecessary changes are expensive. Researchers at Hewlett-Packard, IBM, Hughes Aircraft, TRW, and other organizations have found that purging an error by the beginning of construction allows rework to be done 10 to 100 times less expensively than when it's done in the last part of the process, during system test or after release (Fagan 1976; Humphrey, Snyder, and Willis 1991; Leffingwell 1997; Willis et al. 1998; Grady 1999; Shull et al. 2002; Boehm and Turner 2004). In general, the principle is to find an error as close as possible to the time at which it was introduced. The longer the defect stays in the software food chain, the more damage it causes further down the chain. Since requirements are done first, requirements defects have the potential to be in the system longer and to be more expensive. Defects inserted into the software upstream also tend to have broader effects than those inserted further downstream. That also makes early defects more expensive."

There's some controversy over how accurate and complete this data is, how much we can rely on it, and how relevant it is today when we have much better development tools and many teams have moved from heavyweight sequential Waterfall development to lightweight iterative, incremental development approaches.

Flattening the Cost of Changing Code

The rules of the game should change with iterative and incremental development – because they have to. Boehm realized back in the 1980s that we could catch more mistakes early (and therefore reduce the cost of development) if we think about risks upfront and design and build software in increments, using what he called the Spiral Model, rather than trying to define, design and build software in a Waterfall sequence.

The same ideas are behind more modern, lighter Agile development approaches. In Extreme Programming Explained (the first edition, but not the second) Kent Beck states that minimizing the cost of change is one of the goals of Extreme Programming, and that a flattened change cost curve is “the technical premise of XP”:

    Under certain circumstances, the exponential rise in the cost of changing software over time can be flattened. If we can flatten the curve, old assumptions about the best way to develop software no longer hold … You would make big decisions as late in the process as possible, to defer the cost of making the decisions and to have the greatest possible chance that they would be right. You would only implement what you had to, in hopes that the needs you anticipate for tomorrow wouldn't come true. You would introduce elements to the design only as they simplified existing code or made writing the next bit of code simpler.

It’s important to understand that Beck doesn't say that with XP the change curve is flat.
He says that these costs can be flattened if teams work toward this, leveraging key practices and principles in XP, such as:

  • Simple Design, doing the simplest thing that works, and deferring design decisions as late as possible (YAGNI), so that the design is easy to understand and easy to change
  • Continuous, disciplined refactoring to keep the code easy to understand and easy to change
  • Test-First Development – writing automated tests upfront to catch coding mistakes immediately, and to build up a testing safety net to catch mistakes in the future
  • Developers collaborating closely and constantly with the customer to confirm their understanding of what they need to build, and working together in pairs to design solutions, solve problems, and catch mistakes and misunderstandings early
  • Relying on working software over documentation to minimize the amount of paperwork that needs to be done with each change (write code, not specs)
  • The team’s experience working incrementally and iteratively – the more that people work and think this way, the better they will get at it.

All of this makes sense and sounds right, although there are no studies that back up these assertions, which is why Beck dropped this change curve discussion from the second edition of his XP book. But, by then, the idea that the cost of change could be flat with Agile development had already become accepted by many people.

The Importance of Feedback

Scott Ambler agrees that the cost curve can be flattened in Agile development, not because of Simple Design, but because of the feedback loops that are fundamental to iterative, incremental development. Agile methods optimize feedback within the team, with developers working closely together with each other and with the customer and relying on continuous face-to-face communications. Following technical practices like test-first development, pair programming and continuous integration makes these feedback loops even tighter.
But what really matters is getting feedback from the people using the system – it’s only then that you know if you got it right or what you missed. The longer that it takes to design and build something and get feedback from real users, the more time and work that is required to get working software into a real customer’s hands, the higher your cost of change really is.

Optimizing and streamlining this feedback loop is what is driving the lean startup approach to development: defining a minimum viable product (something that just barely does the job), getting it out to customers as quickly as you can, and then responding to user feedback through continuous deployment and A/B testing techniques until you find out what customers really want.

Even Flat Change Can Still Be Expensive

Even if you do everything to optimize these feedback loops and minimize your overheads, this still doesn’t mean that change will come cheap. Being fast isn’t good enough if you make too many mistakes along the way.

The Post Agilist uses the example of painting a house: Assume that it costs $1,000 each time you paint the house, whether you paint it blue, red or white. The cost of change is flat. But if you have to paint it blue first, then red, then white before everyone is happy, you’re wasting time and money.

    “No matter how expensive or cheap the "cost of change" curve may be, the fewer changes that are made, the cheaper and faster the result will be … Planning is not a four letter word.”

(However, I would like to point out that “plan” is.) Spending too much time upfront in planning and design is waste. But not spending enough time upfront to find out what you should be building and how you should be building it before you build it, and not taking the care to build it carefully, is also a waste.

Change Gets More Expensive Over Time

You also have to accept that the incremental cost of change will go up over the life of a system, especially once a system is being used.
This is not just a technical debt problem. The more people using the system, the more people who might be impacted by the change if you get it wrong, the more careful you have to be. This means that you need to spend more time on planning and communicating changes, building and testing a roll-back capability, and rolling changes out slowly using canary releases and dark launching – all of which add costs and delays to getting feedback. There are also more operational dependencies that you have to understand and take care of, and more data that you have to change or fix up, making changes even more difficult and expensive.

If you do things right, keep a good team together and manage technical debt responsibly, these costs should rise gently over the life of a system – and if you don’t, that exponential change curve will kick in.

What is the real cost of change?

Is the real cost of change exponential, or is it flat? The truth is somewhere in between.

There’s no reason that the cost of making a change to software has to be as high as it was 30 years ago. We can definitely do better today, with better tools and better, cheaper ways of developing software. The keys to minimizing the costs of change seem to be:

Get your software into customer hands as quickly as you can. I am not convinced that any organization really needs to push out software changes 10 to 50 to 100 times a day, but you don’t want to wait months or years for feedback, either. Deliver less, but more often.

And because you’re going to deliver more often, it makes sense to build a continuous delivery pipeline so that you can push changes out efficiently and with confidence. Use ideas from lean software development and maybe Kanban to identify and eliminate waste and to minimize cycle time.

We know that, even with lots of upfront planning and design thinking, we won’t get everything right upfront -- this is the Waterfall fallacy. But it’s also important not to waste time and money iterating when you don’t need to.
Spending enough time upfront in understanding requirements and in design to get it at least mostly right the first time can save a lot later on. Whether you’re working incrementally and iteratively, or sequentially, it makes good sense to catch mistakes early when you can, whether you do this through test-first development and pairing, or requirements workshops and code reviews -- whatever works for you.
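
The house-painting argument above comes down to simple arithmetic, and a few lines of Python make it concrete. This is only an illustrative sketch: the function name `total_cost` and the `growth` factor are assumptions introduced here, and the doubling factor in particular is a made-up number, not one taken from Boehm's data.

```python
def total_cost(changes, base_cost, growth=1.0):
    """Total cost of `changes` successive changes.

    The first change costs `base_cost`; each later change costs `growth`
    times the one before it. growth=1.0 models a perfectly flat curve.
    """
    return sum(base_cost * growth ** i for i in range(changes))


# Flat curve: painting the house once versus three times at $1,000 a coat.
right_first_time = total_cost(1, 1000)
three_attempts = total_cost(3, 1000)

# Hypothetical rising curve: every later change costs twice the previous one.
three_attempts_rising = total_cost(3, 1000, growth=2.0)

print(right_first_time, three_attempts, three_attempts_rising)
```

Even with a flat curve the three-coat house costs three times as much as getting the colour right first time, and any growth in the per-change cost only widens the gap.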
September 20, 2013
by Jim Bird
· 21,947 Views
Solving the Detached Many-to-Many Problem with the Entity Framework
Introduction

This article is part of the ongoing series I’ve been writing recently, but can be read as a standalone article. I’m going to do a better job of integrating the changes documented here into the ongoing solution I’ve been building. However, considering how much time and effort I put into solving this issue, I’ve decided to document the approach independently in case it is of use to others in the interim.

The Problem Defined

This issue presents itself when you are dealing with disconnected/detached Entity Framework POCO objects, as the DbContext doesn’t track changes to entities. Specifically, trouble occurs with entities participating in a many-to-many relationship, where the EF has hidden a “join table” from the model itself.

The problem with detached entities is that the data context has no way of knowing what changes have been made to an object graph without fetching the data from the data store and doing an entity-by-entity comparison – and that’s assuming it’s possible to fetch the data the same way as it was fetched originally. In this solution, all the entities are detached, don’t use proxy types and are designed to move between WCF service boundaries.

Some Inspiration

There are no out-of-the-box solutions that I’m aware of which can process POCO object graphs that are detached. I did find an interesting solution called GraphDiff which is available from github and also as a NuGet package, but it didn’t work with the latest RC version of the Entity Framework (v6). I also found a very comprehensive article on how to implement a generic repository pattern with the Entity Framework, but it was unable to handle detached many-to-many relationships. In any case, I highly recommend a read of this article; it was the inspiration for some of the approach I’ve ended up taking with my own design.

The Approach

This morning I put together a simple data model with the relationships that I wanted to support with detached entities.
I’ve attached the solution with a sample schema and test data at the bottom of this article. If you prefer to open and play with it, be sure to add the Entity Framework (v6 RC) via NuGet (I’ve omitted it for file size and licensing reasons).

[Figure: logical view of the model I wanted to support]

[Figure: schema view from SQL Server]

[Figure: Entity Model generated from the above SQL schema]

In the spirit of punching myself in the head, I’ve elected to have one table implement an identity specification (meaning the underlying schema allocates PK ID values), whereas for the other two tables the ID must be specified. Theoretically, if I can handle the entity types in a generic fashion, then this solution can scale out to larger and more complex models.

The scenarios I’m specifically looking to solve in this solution with detached object graphs are as follows:

  • Add a relationship (many-to-many)
  • Add a relationship (FK-based)
  • Update a related entity (many-to-many)
  • Update a related entity (FK-based)
  • Remove a relationship (many-to-many)
  • Remove a relationship (FK-based)

Per the above, here are the scenarios within the context of the above data model:

  • Add a new Secondary entity to a Primary entity
  • Add an Other entity to a Secondary entity
  • Update a Secondary entity by updating a Primary entity
  • Update an Other entity from a Secondary entity (or Primary entity)
  • Remove (but not delete!) a Secondary entity from a Primary entity
  • Remove (but not delete) an Other entity from a Secondary entity

Establishing Test Data

Just to give myself a baseline, the data model is populated (by default) with some test data. This gives us some “existing entities” to query and modify.

[Figure: default test data]

More Work for the Consumer

Although I tried my best, I couldn’t come to a design which didn’t require the consuming client to do slightly more work to enable this to work properly.
Unfortunately the best place for change tracking to occur with disconnected entities is with the layer making changes – be it a business layer or something downstream. To this effect, entities will need to implement a property which reflects the state of the entity (added, modified, deleted etc.). For the object graph to be updated/managed successfully, the consumer of the entities needs to set the entity state properly. This isn’t at all as bad as it sounds, but it’s not nothing.

Establishing some Scaffolding

After generating the data model, the first thing to be done is to ensure each entity derives from the same base class (“EntityBase”). This is used later to establish the active state of an entity when it needs to be processed. I’ve also created an enum (“ObjectState”) which is a property of the base class, and a helper function which maps ObjectState to an EF EntityState.

[Figure: class view of EntityBase and ObjectState]

Constructing Data Access

To ensure that the usage is consistent, I’ve defined a single Data Access class, mainly to establish the pattern for handling detached object graphs. I can’t stress enough that this is not intended as a guide to an appropriate way to structure your data access – I’ll be updating my ongoing series of articles to go into more detail – this is only to articulate a design approach to handling detached object graphs.

Having said all that, here’s a look at my “DataAccessor” class, which can be used with any of the data access entities (by way of generics):

[Figure: DataAccessor class view]

As with my ongoing project, the Entity Framework DbContext is instantiated by this class on construction, and the class implements IDisposable to ensure the DbContext is disposed properly when the accessor itself is disposed.
Here’s the constructor showing the EF configuration options I’m using:

    public DataAccessor()
    {
        _accessor = new SampleEntities();
        _accessor.Configuration.LazyLoadingEnabled = false;
        _accessor.Configuration.ProxyCreationEnabled = false;
    }

Updating an Entity

We start with a basic scenario to ensure that the scaffolding has been implemented properly. The scenario is to query for a Primary entity and then change a property and update the entity in the data store.

    [TestMethod]
    public void UpdateSingleEntity()
    {
        Primary existing = null;
        String existingValue = String.Empty;

        using (DataAccessor a = new DataAccessor())
        {
            existing = a.DataContext.Primaries.Include("Secondaries").First();
            Assert.IsNotNull(existing);
            existingValue = existing.Title;
            existing.Title = "Unit " + DateTime.Now.ToString("MMdd hh:mm:ss");
        }

        // save the modified title through a new DbContext
        using (DataAccessor b = new DataAccessor())
        {
            existing.State = ObjectState.Modified;
            b.InsertOrUpdate(existing);
        }

        // restore the original title
        using (DataAccessor c = new DataAccessor())
        {
            existing.Title = existingValue;
            existing.State = ObjectState.Modified;
            c.InsertOrUpdate(existing);
        }
    }

You’ll notice that there is nothing particularly significant here, except that the object’s State is reset to Modified between operations.

Updating a Many-to-Many Relationship

Now things get interesting. I’m going to query for a Primary entity, then I’ll update both a property of the Primary entity itself, and a property of one of the entity’s relationships.

    [TestMethod]
    public void UpdateManyToMany()
    {
        Primary existing = null;
        Secondary other = null;
        String existingValue = String.Empty;
        String existingOtherValue = String.Empty;

        using (DataAccessor a = new DataAccessor())
        {
            //Note that we include the navigation property in the query
            existing = a.DataContext.Primaries.Include("Secondaries").First();
            Assert.IsTrue(existing.Secondaries.Count() >= 1, "Should be at least 1 linked item");
        }

        //save the original description
        existingValue = existing.Description;
        //set a new dummy value (with a date/time so we can see it working)
        existing.Description = "Edit " + DateTime.Now.ToString("yyyyMMdd hh:mm:ss");
        existing.State = ObjectState.Modified;

        other = existing.Secondaries.First();
        //save the original value
        existingOtherValue = other.AlternateDescription;
        //set a new value
        other.AlternateDescription = "Edit " + DateTime.Now.ToString("yyyyMMdd hh:mm:ss");
        other.State = ObjectState.Modified;

        //a new data access class (new DbContext)
        using (DataAccessor b = new DataAccessor())
        {
            //single method to handle inserts and updates
            //set a breakpoint here to see the result in the DB
            b.InsertOrUpdate(existing);
        }

        //return the values to the original ones
        existing.Description = existingValue;
        other.AlternateDescription = existingOtherValue;
        existing.State = ObjectState.Modified;
        other.State = ObjectState.Modified;

        using (DataAccessor c = new DataAccessor())
        {
            //update the entities back to normal
            //set a breakpoint here to see the data before it reverts back
            c.InsertOrUpdate(existing);
        }
    }

If we actually run this unit test and set the breakpoints accordingly, you’ll see the following in the database:

[Figures: database at breakpoint #1; database at breakpoint #2; database when the unit test completes]

You’ll notice at the second breakpoint that the descriptions of both entities have been updated.

Examining the Insert/Update Code

The function exposed by the “data access” class really just passes through to another private function which does the heavy lifting. This is mainly in case we need to reuse the logic, since it essentially applies state changes to attached entities.

    public void InsertOrUpdate<T>(params T[] entities) where T : EntityBase
    {
        ApplyStateChanges(entities);
        DataContext.SaveChanges();
    }

Here’s the definition of the ApplyStateChanges function, which I’ll discuss below:

    private void ApplyStateChanges<T>(params T[] items) where T : EntityBase
    {
        DbSet<T> dbSet = DataContext.Set<T>();
        foreach (T item in items)
        {
            //loads related entities into the current context
            dbSet.Attach(item);
            if (item.State == ObjectState.Added || item.State == ObjectState.Modified)
            {
                dbSet.AddOrUpdate(item);
            }
            else if (item.State == ObjectState.Deleted)
            {
                dbSet.Remove(item);
            }

            foreach (DbEntityEntry<EntityBase> entry in DataContext.ChangeTracker.Entries<EntityBase>()
                .Where(c => c.Entity.State != ObjectState.Processed
                         && c.Entity.State != ObjectState.Unchanged))
            {
                var y = DataContext.Entry(entry.Entity);
                y.State = HelperFunctions.ConvertState(entry.Entity.State);
                entry.Entity.State = ObjectState.Processed;
            }
        }
    }

Notes on this Implementation

What this function does is iterate through the items to be examined, attach them to the current Data Context (which also attaches their children), act on each item accordingly (add/update/remove) and then process new entities which have been added to the Data Context’s change tracker.

For each newly “discovered” entity (ignoring entities which are unchanged or have already been examined), the entity’s DbEntityEntry state is set according to the entity’s ObjectState (which is set by the calling client). Doing this allows the Entity Framework to understand what actions it needs to perform on the entities when SaveChanges() is invoked later.
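
The state-driven pass described above can also be modelled outside the Entity Framework, which may help when reasoning about it. The following is a language-neutral sketch in Python rather than C#, and it is not Entity Framework code: `Entity`, `ACTION_FOR_STATE` and `apply_state_changes` are hypothetical names invented for this illustration, and the returned list of actions merely stands in for what SaveChanges() would do.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class ObjectState(Enum):
    UNCHANGED = "unchanged"
    ADDED = "added"
    MODIFIED = "modified"
    DELETED = "deleted"
    PROCESSED = "processed"


# Hypothetical stand-in for the ObjectState -> EntityState helper function.
ACTION_FOR_STATE = {
    ObjectState.ADDED: "INSERT",
    ObjectState.MODIFIED: "UPDATE",
    ObjectState.DELETED: "DELETE",
}


@dataclass
class Entity:
    name: str
    state: ObjectState = ObjectState.UNCHANGED
    children: List["Entity"] = field(default_factory=list)


def apply_state_changes(root: Entity) -> List[Tuple[str, str]]:
    """Walk the detached graph and return the (action, name) pairs to run."""
    actions = []
    seen = set()      # object ids already visited, guards against cycles
    pending = [root]
    while pending:
        entity = pending.pop()
        if id(entity) in seen:
            continue
        seen.add(id(entity))
        pending.extend(entity.children)
        action = ACTION_FOR_STATE.get(entity.state)
        if action is not None:
            actions.append((action, entity.name))
            # Mark as processed so a later pass does not act on it again.
            entity.state = ObjectState.PROCESSED
    return actions


primary = Entity("primary", ObjectState.MODIFIED, [
    Entity("secondary-a", ObjectState.MODIFIED),
    Entity("secondary-b"),  # left unchanged by the client
])
print(apply_state_changes(primary))
print(apply_state_changes(primary))  # second pass finds nothing left to do
```

The point of the sketch is the division of labour: the client is the only party that knows what changed, so it sets the state, and the data layer simply translates each state into an action exactly once.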
You’ll also note that I set the entity’s state to “Processed” when it has been examined, so we don’t act on it more than once (for performance purposes).

Fun note: the AddOrUpdate extension method is something I found in the System.Data.Entity.Migrations namespace, and it acts as an ‘Upsert’ operation, inserting or updating entities depending on whether or not they already exist. Bonus!

That’s it for adding and updating, believe it or not.

Corresponding Unit Test

The following unit test establishes the creation of a new many-to-many entity, which is then removed (by relationship) and finally deleted altogether from the database:

    [TestMethod]
    public void AddRemoveRelationship()
    {
        Primary existing = null;

        using (DataAccessor a = new DataAccessor())
        {
            existing = a.DataContext.Primaries.Include("Secondaries")
                .FirstOrDefault();
            Assert.IsNotNull(existing);
        }

        Secondary newEntity = new Secondary();
        newEntity.State = ObjectState.Added;
        newEntity.AlternateTitle = "Unit";
        newEntity.AlternateDescription = "Test";
        newEntity.SecondaryId = 1000;
        existing.Secondaries.Add(newEntity);

        using (DataAccessor a = new DataAccessor())
        {
            //breakpoint #1 here
            a.InsertOrUpdate(existing);
        }

        newEntity.State = ObjectState.Unchanged;
        existing.State = ObjectState.Modified;

        using (DataAccessor b = new DataAccessor())
        {
            //breakpoint #2 here
            b.RemoveEntities(existing, x => x.Secondaries, newEntity);
        }

        using (DataAccessor c = new DataAccessor())
        {
            //breakpoint #3 here
            c.Delete(newEntity);
        }
    }

Test Results

[Figures: pre-test; breakpoint #1; breakpoint #2; breakpoint #3; post execution (new entity deleted); SQL Profile Trace]

Removing a Many-to-Many Relationship

Now this is where it gets tricky. I’d like to have something a little more polished, but the best I have come up with to date is a separate operation on the data provider which exposes functionality akin to “remove relationship”.
The fundamental problem with how the EF POCO entities work without any modifications is that, when they are detached, removing a many-to-many relationship means physically removing the related entity from the collection. When the object graph is sent back for processing, there’s a missing related entity, and the service or data context would have to assume that the omission was on purpose, not to mention that it would have to compare against data currently in the data store.

To make this easier, I’ve implemented a function called “RemoveEntities” which alters the relationship between the parent and the child/children. The one big catch is that you need to specify the navigation property or collection, which might make it slightly undesirable to implement generically. In any case, I’ve provided two options – with the navigation property as a string parameter or as a LINQ expression – they both do the same thing.

    public void RemoveEntities<T, T2>(T parent,
        Expression<Func<T, object>> expression, params T2[] children)
        where T : EntityBase
        where T2 : EntityBase
    {
        DataContext.Set<T>().Attach(parent);
        ObjectContext obj = DataContext.ToObjectContext();
        foreach (T2 child in children)
        {
            DataContext.Set<T2>().Attach(child);
            obj.ObjectStateManager.ChangeRelationshipState(parent, child,
                expression, EntityState.Deleted);
        }
        DataContext.SaveChanges();
    }

Notes on this Implementation

The “ToObjectContext” is an extension method, and is akin to (DataContext as IObjectContextAdapter).ObjectContext. This is to expose a more fundamental part of the Entity Framework’s object model. We need this level of access to get to the functionality which controls relationships.

For each child to be removed (note: not deleted from the physical database), we nominate the parent object, the child, the navigation property (collection) and the nature of the relationship change (delete). Note that this will NOT WORK for Foreign Key defined relationships – more on that below.
To delete entities which have active relationships, you’ll need to drop the relationship before attempting to delete, or else you’ll have data integrity/referential integrity errors, unless you have accounted for cascading deletion (which I haven’t).

Example execution:

    using (DataAccessor c = new DataAccessor())
    {
        //c.RemoveEntities(existing, "Secondaries", s);
        //(or can use an expression):
        c.RemoveEntities(existing, x => x.Secondaries, s);
    }

Removing FK Relationships

As mentioned above, you can’t just edit the relationship to remove an FK-based relationship. Instead, you have to follow the EF practice of setting the FK entity to NULL. Here’s a Unit Test which demonstrates how this is achieved:

    Secondary s = ExistingEntity();
    using (DataAccessor c = new DataAccessor())
    {
        s.Other = null;
        s.OtherId = null;
        s.State = ObjectState.Modified;
        //o is presumably the Other entity previously related to s
        //(its declaration is not shown in this excerpt)
        o.State = ObjectState.Unchanged;
        c.InsertOrUpdate(s);
    }

We use the same “Insert or Update” call – being aware that you have to set the ObjectState properties accordingly.

Note: I’m in the process of testing the reverse removal – i.e. what happens if you want to remove a Secondary entity from an Other entity’s collection.

Deleting Entities

This is fairly straightforward, but I’ve taken a few more precautions to ensure that the entity to be deleted is valid on the server side.

    public void Delete<T>(params T[] entities) where T : EntityBase
    {
        foreach (T entity in entities)
        {
            T attachedEntity = Exists(entity);
            if (attachedEntity != null)
            {
                var attachedEntry = DataContext.Entry(attachedEntity);
                attachedEntry.State = EntityState.Deleted;
            }
        }
        DataContext.SaveChanges();
    }

To understand the above, you should take a look at the implementation of the “Exists” function, which essentially checks the data store and local cache to see if there is an attached representation:

    protected T Exists<T>(T entity) where T : EntityBase
    {
        var objContext = ((IObjectContextAdapter)this.DataContext)
            .ObjectContext;
        var objSet = objContext.CreateObjectSet<T>();
        var entityKey = objContext.CreateEntityKey(objSet.EntitySet.Name, entity);
        DbSet<T> set = DataContext.Set<T>();
        var keys = (from x in entityKey.EntityKeyValues
                    select x.Value).ToArray();
        //Remember, there can be composite keys, so don't assume there's
        //just one column/one value
        //If a composite key isn't ordered properly, the Set<T>().Find()
        //method will fail, use attributes on the entity to determine the
        //proper order.
        //context.Configuration.AutoDetectChangesEnabled = false;
        return set.Find(keys);
    }

This is a fairly expensive operation, which is why it’s pretty much reserved for deletes and not more frequent operations. It essentially determines the target entity’s primary key and then checks whether the entity exists or not.

Note: I haven’t tested this on entities with composite (multi-column) keys, but I’ll get to it at some point. If you have tables with composite keys, you can define the PK key order using attributes on the model entity, but I haven’t done this (yet).

Summary

This article is the culmination of about two days of heavy analysis and investigation. I’ve got a whole lot more to contribute on this topic, but for now, I felt it was worthy enough to post as-is. What you’ve got here is still incredibly rough, and I haven’t done nearly enough testing.
To be honest, I was quite excited by the initial results, which is why I decided to write this post. There’s an incredibly good chance that I’ve missed something in the design and implementation, so please be aware of that. I’ll be continuing to refine this approach in my main series of articles with a much cleaner implementation.

In the meantime though, if any of this helps anyone out there struggling with detached entities, I’ll be glad. There are precious few articles and samples that are up to date, and very few that seem to work. This is provided without any warranty of any kind!

If you find any issues please e-mail me and I’ll attempt to refactor/debug and find ways around some of the inherent limitations. In the meantime, there are a few helpful links I’ve come across in my travels on the WWW. See below.

Example Solution Files

[ Files ]

Note: you’ll need to add the Entity Framework v6 RC package via NuGet; I haven’t included it in the archive.

Helpful Links

  • http://blog.magnusmontin.net/2013/05/30/generic-dal-using-entity-framework/
  • https://github.com/refactorthis/GraphDiff
  • http://stackoverflow.com/questions/11686225/dbset-find-method-ridiculously-slow-compared-to-singleordefault-on-id
  • http://stackoverflow.com/questions/10381106/cannot-update-many-to-many-relationships-in-entity-framework
  • http://stackoverflow.com/questions/8413248/how-to-save-an-updated-many-to-many-collection-on-detached-entity-framework-4-1
  • http://stackoverflow.com/questions/6018711/generic-way-to-check-if-entity-exists-in-entity-framework
September 18, 2013
by Rob Sanders
· 163,459 Views
This is how Facebook develops and deploys software. Should you care?
A recently published academic paper by Prof. Dror Feitelson at Hebrew University, Eitan Frachtenberg, a research scientist at Facebook, and Kent Beck (who is also doing something at Facebook), describes Facebook’s approach to developing and deploying its front-end software. While it would be more interesting to understand how back-end development is done (this is where the real heavy lifting is done scaling up to handle hundreds of millions of users), there are a few things in the paper that are worth knowing about.

Continuous Deployment at Facebook is Not Continuous Deployment

Rather than planning work out into projects or breaking work into time-boxed Sprints, Facebook developers do most of their work in independent, small changes that are released frequently. This makes sense in Facebook’s online business model, with everyone constantly tuning the platform and trying out new options and applications in different user communities, seeing what sticks. It’s a credit to their architecture that so many small, independent changes can actually be done independently and cheaply.

Facebook says that it follows Continuous Deployment, but it’s not Continuous Deployment the way that IMVU made popular, where every change is pushed out to customers immediately, or even how a company like Etsy does Continuous Deployment. At Facebook, code can be released twice a day, but this is done mostly for bug fixes and internal code. New production code is released once per week: thousands of changes by hundreds of developers are packaged up by their small release team on Sundays, run through automated regression testing, and released on Tuesday if the developers who contributed the changes are present.

Release engineers assess the risk of changes based on the size of the change, the amount of discussion done in code reviews (which is recorded through an internal code review tool), and on each developer’s “push karma”: how many problems they have seen from code by this developer before.
A tool called “Gatekeeper” controls what features are available to which customers to support dark launching, and all code is released incrementally – to staging, then a subset of users, and so on. Changes can be rolled back if necessary – individually, or, as a last resort, an entire code release. However, like a lot of Silicon Valley DevOps shops, they mostly follow the “Real Men only Roll Forward” motto.

Code Ownership

A key to the culture at Facebook is that developers are individually responsible for the code that they wrote, for testing it and supporting it in production. This is reflected in their code ownership model:

    Developers must also support the operational use of their software — a combination that’s become known as “DevOps.” This further motivates writing good code and testing it thoroughly. Developers’ personal stake in keeping the system running smoothly complements the engineering procedures and lets the system maintain quality at scale. Methodologies and tools aren’t enough by themselves because they can always be misused. Thus, a culture of personal responsibility is critical.

    Consequently, most source files are modified by only a few engineers. Although at least one other engineer reviews all changes before they’re committed, a third of the source files have only been edited by one engineer, and another quarter by two. Only 10 percent of the files are handled by more than seven engineers. On the other hand, the distribution of engineers per file has a heavy tail, with the most widely shared file handled by no fewer than 870 distinct engineers. These widely shared files are predominantly library files and also include major configuration and top-level PHP files.

Testing? We don’t need no stinking testing …

Facebook doesn't have an independent test team, because, it says, it doesn't need one. First, they depend a lot on code reviews to find bugs:

    At Facebook, code review occupies a central position.
    Every line of code that’s written is reviewed by a different engineer than the original author. This serves multiple purposes: the original engineer is motivated to ensure that the code is of high quality, the reviewer comes with a fresh mind and might find defects or suggest alternatives, and, in general, knowledge about coding practices and the code itself spreads throughout the company.

Developers are also responsible for writing unit tests and their own regression tests – they have “tens of thousands of regression tests” (which doesn't sound like nearly enough for 10+ million lines of mostly PHP code compiled into C++, in both of which languages coding mistakes are easy to make) and automated performance tests.

And developers also test the software by using the development version of Facebook for their personal Facebook use. According to the authors, “this is just one aspect of the departure from traditional software development”. But Facebook developers using their own software internally (and passing this off as “testing”) is no different than the early days at Microsoft where employees were supposed to “eat their own dog food”, a practice that did little if anything to improve the quality of Microsoft products.

Facebook also depends on customers to test the software for it. Software is released in steps for A/B testing and “live experimentation” on subsets of the user base, whether customers want to participate in this testing or not. Because its customer base is so large, it can get meaningful feedback from testing with even a small percentage of users, which at least minimizes the risk and inconvenience to customers.

Security???

While performance is an important consideration for developers at Facebook, there is no mention of security checks or testing anywhere in this description of how Facebook develops and deploys software.
No static analysis, dynamic analysis/scanning, pen testing or explanation of how the security team and developers work together, not even for “privacy sensitive code” – although this code is “held to a higher standard” it doesn’t explain what this “higher standard” is. Presumably it relies on the use of libraries and frameworks to handle at least some AppSec problems, and possibly to look for security bugs in its code reviews, but it doesn't say. There isn’t much information available on Facebook’s AppSec program anywhere. The security team at Facebook seems to spend a lot of time educating people on how to use Facebook safely and how to develop Facebook apps safely and running their bug bounty program which pays outsiders to find security bugs for them. A search on security on Facebook mostly comes back with a long list of public security failures, privacy violations and application security vulnerabilities found over the years and continuing up to the present day. Maybe the lack of an effective AppSec program is the reason for this. This is the way Facebook is Developed. Should you care? While it’s interesting to get a look inside a high-profile organization like Facebook and how it approaches development at scale, it’s not clear why this paper was written. There is little about what Facebook is doing (on its front-end development at least) that is unique or innovative, except maybe the way it uses BitTorrent to push code changes out to thousands of servers like Twitter does, something that I already heard about a few years ago at Velocity and that has been written about before. I like the idea of developers being responsible for their work, all the way into production, which is a principle that we also follow. Code reviews are good. Dark launching features is a good practice and has been a common practice in systems for a long time (even before it was called "dark launching"). Not having testers or doing AppSec is not good. 
Otherwise, I'm not sure what the rest of us can learn from this paper or would want to adopt from it.
September 4, 2013
by Jim Bird
· 42,879 Views · 1 Like
Different Ways to Handle Events in Android
Typically, events respond to user interactions, and Android supports multiple ways to handle events on views. When a user clicks on an Android View, the framework calls a method on that view and then passes control to the application's listeners. For example, when a user clicks on a view such as a button, the onTouchEvent() method is called on that button object. To make our application respond to the event, we would have to extend the class and override the method, but extending every View object just to handle such an event would not be practical. Instead, each View class in Android provides a collection of nested interfaces called listeners, with callbacks that you can define much more easily in order to handle the event.

1. Defining a listener programmatically in the onCreate method

    button.setOnClickListener(new OnClickListener() {
        @Override
        public void onClick(View v) {
            // do stuff
        }
    });

This approach creates an anonymous class for each button you create. It is recommended only if you have few listeners in your class. If we have a complex screen layout with many views, writing a listener programmatically for each view makes the code messy. It's costly and less readable.

2. Setting the android:onClick property in XML

Many people handle click events by writing an onClick attribute in XML. Usually this is not preferable, because it is better to keep listeners inside the code. Internally, Android uses Java reflection to locate the handler method. It is less readable, and it confuses some developers.

3. Implementing the OnClickListener interface on the Activity class and passing a reference to the Button

    public class MainActivity extends Activity implements OnClickListener {
        @Override
        public void onClick(View v) {
            // do stuff
        }

        protected void onCreate(Bundle savedInstanceState) {
            ...
            button.setOnClickListener(this);
        }
    }

Here, we implement the OnClickListener interface on the activity class and pass a self reference to the button. This way, the listener holds a reference to the activity object, and keeping the whole activity object referenced is heavy. We can handle the click events for all views in one place, but we need to differentiate the views by their IDs, using the view.getId() method to see which button was clicked. Again, this is preferable only when we have few views to handle. This approach is also hard to navigate, because you can't determine the type of the listener used by the current button (I know Eclipse will highlight the methods this is pointing at, but with lots of code I think it will be hard to find).

4. Creating a field with the OnClickListener type

    private OnClickListener onClickHandler = new OnClickListener() {
        @Override
        public void onClick(View v) {
            // stuff
        }
    };

    protected void onCreate(Bundle savedInstanceState) {
        ...
        button.setOnClickListener(onClickHandler);
    }

The best practice is to create a field with the OnClickListener type. This way the code is easy to navigate and more readable. But it doesn't stop you from using the other three options above. Everyone has a different way of looking at the problem.
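The difference between options 3 and 4 can be tried out without the Android SDK. The sketch below is plain Java and rests on stated assumptions: `ClickListener` and `FakeButton` are hypothetical stand-ins for `View.OnClickListener` and `Button`, and `ListenerStylesDemo` plays the role of the activity. It only illustrates how a shared listener has to branch on the view ID while a field listener does not.

```java
import java.util.ArrayList;
import java.util.List;

// Plays the role of the Activity (hypothetical name, not an Android class).
class ListenerStylesDemo implements ClickListener {
    static final int ID_SAVE = 1;
    static final int ID_CANCEL = 2;

    final List<String> log = new ArrayList<>();

    // Option 4: a field holding a dedicated listener for one button.
    private final ClickListener cancelHandler = new ClickListener() {
        @Override
        public void onClick(FakeButton source) {
            log.add("field:cancel");
        }
    };

    // Option 3: the "activity" itself is the listener and must branch on the id.
    @Override
    public void onClick(FakeButton source) {
        switch (source.getId()) {
            case ID_SAVE:
                log.add("shared:save");
                break;
            default:
                log.add("shared:unknown");
                break;
        }
    }

    // Wires two buttons, one per style, clicks both, and returns what was recorded.
    static List<String> run() {
        ListenerStylesDemo screen = new ListenerStylesDemo();
        FakeButton save = new FakeButton(ID_SAVE);
        FakeButton cancel = new FakeButton(ID_CANCEL);
        save.setOnClickListener(screen);
        cancel.setOnClickListener(screen.cancelHandler);
        save.click();
        cancel.click();
        return screen.log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Hypothetical stand-in for android.view.View.OnClickListener.
interface ClickListener {
    void onClick(FakeButton source);
}

// Hypothetical stand-in for a Button; only the id and click dispatch are modelled.
class FakeButton {
    private final int id;
    private ClickListener listener;

    FakeButton(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    void setOnClickListener(ClickListener listener) {
        this.listener = listener;
    }

    // Simulates the framework delivering a click to the registered listener.
    void click() {
        if (listener != null) {
            listener.onClick(this);
        }
    }
}
```

In a real activity the same wiring would use `findViewById` and the Android listener types; the stand-ins exist only so the sketch runs on a desktop JVM.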
September 1, 2013
by Nilanchala Panigrahy
· 9,113 Views
How to Display HTML in Android TextView
This example explains how to display HTML in an Android TextView. Often while designing an application you come across a place where you would like to show HTML content on screen, for example a static “EULA” or “help” page. Android has a lovely class, android.text.Html, that processes HTML strings into displayable styled text. Currently Android doesn't support all HTML tags, and the Android API documentation does not stipulate which HTML tags are supported. From a quick look at the Android source code, the tags supported as of now are the ones handled in: http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.2_r1.1/android/text/Html.java

The fromHtml method returns displayable styled text from the provided HTML string. As per Android's official documentation, any <img> tags in the HTML will display as a generic replacement image which your program can then go through and replace with real images.

The Html.fromHtml method takes an Html.TagHandler and an Html.ImageGetter as arguments, as well as the text to parse. We can pass null for the Html.TagHandler, but you need to implement your own Html.ImageGetter as there isn't a default implementation. The Html.ImageGetter needs to run synchronously, and if you're downloading images from the web you'll probably want to do that asynchronously. In my example I am using images from resources to keep my ImageGetter implementation simple.

    package com.javatechig.example.ui;

    import android.app.Activity;
    import android.graphics.drawable.Drawable;
    import android.os.Bundle;
    import android.text.Html;
    import android.view.Menu;
    import android.widget.TextView;

    /*
     * @author: nilanchala
     * http://javatechig.com/
     */
    public class MainActivity extends Activity {

        // Any <img> whose src is "hughjackman.jpg" is resolved by ImageGetter below.
        private final String htmlText = " Heading TextThis tutorial "
                + "explains how to display "
                + "HTML text in android text view. "
                + ""
                + " Example from "
                + "Javatechig.com";

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            TextView htmlTextView = (TextView) findViewById(R.id.html_text);
            // Arguments: HTML source, image getter, tag handler (null = none).
            htmlTextView.setText(Html.fromHtml(htmlText, new ImageGetter(), null));
        }

        @Override
        public boolean onCreateOptionsMenu(Menu menu) {
            // Inflate the menu; this adds items to the action bar if it is present.
            getMenuInflater().inflate(R.menu.main, menu);
            return true;
        }

        private class ImageGetter implements Html.ImageGetter {
            public Drawable getDrawable(String source) {
                int id;
                if (source.equals("hughjackman.jpg")) {
                    id = R.drawable.hughjackman;
                } else {
                    return null;
                }
                Drawable d = getResources().getDrawable(id);
                // Without bounds the drawable is rendered with zero size.
                d.setBounds(0, 0, d.getIntrinsicWidth(), d.getIntrinsicHeight());
                return d;
            }
        }
    }
August 30, 2013
by Nilanchala Panigrahy
· 33,936 Views · 2 Likes
How to Create a Web-app with Quartz Scheduler and Logging
I sometimes help out users in the Quartz Scheduler forums. Once in a while someone asks how to set up Quartz inside a web application. This is actually a fairly simple thing to do. The library already comes with a ServletContextListener that you can use to start a Scheduler. I will show you a simple webapp example here.

First create a Maven pom.xml file.

    <project>
      <modelVersion>4.0.0</modelVersion>
      <groupId>quartz-web-demo</groupId>
      <artifactId>quartz-web-demo</artifactId>
      <packaging>war</packaging>
      <version>1.0-SNAPSHOT</version>
      <dependencies>
        <dependency>
          <groupId>org.quartz-scheduler</groupId>
          <artifactId>quartz</artifactId>
          <version>2.2.0</version>
        </dependency>
      </dependencies>
    </project>

Then you need to create a src/main/webapp/WEB-INF/web.xml file.

    <web-app>
      <context-param>
        <param-name>quartz:config-file</param-name>
        <param-value>quartz.properties</param-value>
      </context-param>
      <context-param>
        <param-name>quartz:shutdown-on-unload</param-name>
        <param-value>true</param-value>
      </context-param>
      <context-param>
        <param-name>quartz:wait-on-shutdown</param-name>
        <param-value>true</param-value>
      </context-param>
      <context-param>
        <param-name>quartz:start-on-load</param-name>
        <param-value>true</param-value>
      </context-param>
      <listener>
        <listener-class>org.quartz.ee.servlet.QuartzInitializerListener</listener-class>
      </listener>
    </web-app>

And lastly, you need a src/main/resources/quartz.properties config file for the Scheduler.

    # Main Quartz configuration
    org.quartz.scheduler.skipUpdateCheck = true
    org.quartz.scheduler.instanceName = MyQuartzScheduler
    org.quartz.scheduler.jobFactory.class = org.quartz.simpl.SimpleJobFactory
    org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
    org.quartz.threadPool.threadCount = 5

You may configure many other things with Quartz, but the above should get you started with an in-memory scheduler. Now you should be able to compile and run it.

    bash> mvn compile
    bash> mvn org.apache.tomcat.maven:tomcat7-maven-plugin:2.1:run -Dmaven.tomcat.port=8081

How to configure logging for Quartz Scheduler

Another frequently asked question is how to set up logging and see the DEBUG level messages. Quartz Scheduler uses SLF4J, so you have many logger options to choose from. I will show you how to set up Log4j as an example. First, add this to your pom.xml:

    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.7.5</version>
    </dependency>

Then add a src/main/resources/log4j.properties file to show messages on STDOUT.
    log4j.rootLogger=INFO, stdout
    log4j.logger.org.quartz=DEBUG
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n

Restart your web application on the command line, and you should now see all the DEBUG level logging messages coming from the Quartz library.

With everything running, your next question might be how to access the scheduler from your web application. When the scheduler is created by the servlet context listener, its factory is stored inside the web app's ServletContext under the org.quartz.impl.StdSchedulerFactory.KEY key. So you may retrieve it and use it in your own Servlet like this:

    import javax.servlet.ServletConfig;
    import javax.servlet.ServletContext;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class YourServlet extends HttpServlet {
        @Override
        public void init(ServletConfig cfg) throws ServletException {
            super.init(cfg);
            String key = "org.quartz.impl.StdSchedulerFactory.KEY";
            ServletContext servletContext = cfg.getServletContext();
            StdSchedulerFactory factory = (StdSchedulerFactory) servletContext.getAttribute(key);
            try {
                // The name must match org.quartz.scheduler.instanceName in quartz.properties.
                Scheduler quartzScheduler = factory.getScheduler("MyQuartzScheduler");
                // TODO use quartzScheduler here.
            } catch (SchedulerException e) {
                throw new ServletException("Could not look up the Quartz scheduler", e);
            }
        }
    }

Now you are on your way to building your next scheduling application! Have fun!
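Since quartz.properties is an ordinary Java properties file, its keys can be sanity-checked with `java.util.Properties` before the webapp is deployed. The following is a minimal sketch under stated assumptions: the class name `QuartzPropsCheck` and its methods are made up for illustration, the sample text simply repeats the settings from the article, and Quartz itself is not involved.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: parses the article's quartz.properties with the JDK only.
class QuartzPropsCheck {
    // Same keys and values as the quartz.properties shown in the article.
    static final String SAMPLE = String.join("\n",
            "# Main Quartz configuration",
            "org.quartz.scheduler.skipUpdateCheck = true",
            "org.quartz.scheduler.instanceName = MyQuartzScheduler",
            "org.quartz.scheduler.jobFactory.class = org.quartz.simpl.SimpleJobFactory",
            "org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool",
            "org.quartz.threadPool.threadCount = 5");

    // Loads properties from text; whitespace around '=' is ignored by Properties.load.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Reads the thread pool size, falling back to 0 when the key is absent.
    static int threadCount(Properties props) {
        return Integer.parseInt(
                props.getProperty("org.quartz.threadPool.threadCount", "0").trim());
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println("scheduler = " + props.getProperty("org.quartz.scheduler.instanceName"));
        System.out.println("threads   = " + threadCount(props));
    }
}
```

The scheduler name read here is the same string that the servlet passes to `getScheduler`, so a mismatch between the two is easy to spot this way.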
August 30, 2013
by Zemian Deng
· 37,581 Views
XP Values: Courage
In a complex system such as a software development team, it's easy for fear to arise.
August 28, 2013
by Giorgio Sironi
· 6,776 Views
java.net.ProtocolException: Server Redirected Too Many Times
A couple of weeks ago I was trying to write a test around some OAuth code that we have on an internal application, and I was using Jersey Client to send the various requests. I initially started with the following code:

    Client client = Client.create();
    ClientResponse response = client.resource( "http://localhost:59680" ).get( ClientResponse.class );

But when I ran the test I got the following exception:

    com.sun.jersey.api.client.ClientHandlerException: java.net.ProtocolException: Server redirected too many times (20)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:151)
        at com.sun.jersey.api.client.Client.handle(Client.java:648)
        at com.sun.jersey.api.client.WebResource.handle(WebResource.java:680)
        at com.sun.jersey.api.client.WebResource.get(WebResource.java:191)
        at com.neotechnology.testlab.manager.webapp.AuthenticationIntegrationTest.shouldRedirectToGitHubForAuthentication(AuthenticationIntegrationTest.java:81)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at com.neotechnology.kirkaldy.testing.Resources$1.evaluate(Resources.java:84)
        at com.neotechnology.kirkaldy.testing.FailureOutput$2.evaluate(FailureOutput.java:37)
        at org.junit.rules.RunRules.evaluate(RunRules.java:18)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
        at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
    Caused by: java.net.ProtocolException: Server redirected too many times (20)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1446)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:249)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
        ... 28 more

If we check the traffic going across port 59680 we can see what’s going wrong:

    $ sudo ngrep -d lo0 port 59680
    interface: lo0 (127.0.0.0/255.0.0.0)
    filter: (ip) and ( port 59680 )
    #####
    T 127.0.0.1:59704 -> 127.0.0.1:59680 [AP]
      GET / HTTP/1.1..User-Agent: Java/1.6.0_45..Host: localhost:59680..Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2..Connection: keep-alive....
    ##
    T 127.0.0.1:59680 -> 127.0.0.1:59704 [AP]
      HTTP/1.1 302 Found..Set-Cookie: JSESSIONID=mdyw3a4fmqc1b6p53birm4dd;Path=/..Expires: Thu, 01 Jan 1970 00:00:00 GMT..Location: http://localhost:59679/authorize?client_id=basic-client&state=the-state&scope=user%2Crepo..Content-Length : 0..Server: Jetty(8.1.8.v20121106)....
########### T 127.0.0.1:59707 -> 127.0.0.1:59680 [AP] GET /auth/callback?code=timey-wimey&state=the-state HTTP/1.1..User-Agent: Java/1.6.0_45..Host: localhost:59680..Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2..Connection: keep-alive.... ## T 127.0.0.1:59680 -> 127.0.0.1:59707 [AP] HTTP/1.1 302 Found..Cache-Control: no-cache..Set-Cookie: JSESSIONID=8gggez0ns9ftiex4314mbgz9;Path=/..Expires: Thu, 01 Jan 1970 00:00:00 GMT..Location: http://localhost:59680/..Content-Length: 0..Server: Jetty(8.1.8.v20121106).... ########### T 127.0.0.1:59713 -> 127.0.0.1:59680 [AP] GET / HTTP/1.1..User-Agent: Java/1.6.0_45..Host: localhost:59680..Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2..Connection: keep-alive.... ## The response we receive includes a direction to the client to store a cookie but we can see on the next request that the cookie hasn’t been included. I came across this post, which had a few suggestions on how to get around the problem, but the only approach that worked for me was to use jersey-apache-client for which I added the following dependency: com.sun.jersey.contribs jersey-apache-client 1.13 jar I then change my client code to read like this: ApacheHttpClientConfig config = new DefaultApacheHttpClientConfig(); config.getProperties().put(ApacheHttpClientConfig.PROPERTY_HANDLE_COOKIES, true); ApacheHttpClient client = ApacheHttpClient.create( config ); client.setFollowRedirects(true); client.getClientHandler().getHttpClient().getParams().setBooleanParameter( HttpClientParams.ALLOW_CIRCULAR_REDIRECTS, true ); ClientResponse response = client.resource( "http://localhost:59680" ).get( ClientResponse.class ); If we run that and watch the output using ngrep we can see that it now handles cookies correctly: $ sudo ngrep -d lo0 port 59680 Password: interface: lo0 (127.0.0.0/255.0.0.0) filter: (ip) and ( port 59680 ) ##### T 127.0.0.1:60372 -> 127.0.0.1:59680 [AP] GET / HTTP/1.1..User-Agent: Jakarta Commons-HttpClient/3.1..Host: 
localhost:59680.... ## T 127.0.0.1:59680 -> 127.0.0.1:60372 [AP] HTTP/1.1 302 Found..Set-Cookie: JSESSIONID=vn8zzf9ep3x4mtw66ydm0n6a;Path=/..Expires: Thu, 01 Jan 1970 00:00:00 GMT..Location: http://localhost:60322/authorize?client_id=basic-client&state=the-state&scope=user%2Crepo..Content-Length : 0..Server: Jetty(8.1.8.v20121106).... ## T 127.0.0.1:60372 -> 127.0.0.1:59680 [AP] GET /auth/callback?code=timey-wimey&state=the-state HTTP/1.1..User-Agent: Jakarta Commons-HttpClient/3.1..Host: localhost:59680..Cookie: $Version=0; JSESSIONID=vn8zzf9ep3x4mtw66ydm0n6a; $Path=/.... ## T 127.0.0.1:59680 -> 127.0.0.1:60372 [AP] HTTP/1.1 302 Found..Cache-Control: no-cache..Location: http://localhost:59680/..Content-Length: 0..Server: Jetty(8.1.8.v20121106).... ## T 127.0.0.1:60372 -> 127.0.0.1:59680 [AP] GET / HTTP/1.1..User-Agent: Jakarta Commons-HttpClient/3.1..Host: localhost:59680..Cookie: $Version=0; JSESSIONID=vn8zzf9ep3x4mtw66ydm0n6a; $Path=/.... ## T 127.0.0.1:59680 -> 127.0.0.1:60372 [AP] HTTP/1.1 200 OK..Vary: Accept-Encoding..Accept-Ranges: bytes..Content-Type: text/html..Content-Length: 2439..Last-Modified: Tue, 23 Jul 2013 10:48:15 GMT..Server: Jetty(8.1.8.v20121106)....... . . . . . . . . . . . ....
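The redirect loop above came down to one missing behaviour: the `JSESSIONID` from the first `Set-Cookie` was never sent back. The JDK's `java.net.CookieManager` models that store-and-replay step, and the sketch below exercises it without any network traffic. Treat it as an illustration of the mechanism rather than the fix used here: the host `auth.example.test` and the class `CookieReplaySketch` are assumptions, and although installing such a manager through `CookieHandler.setDefault` is a commonly suggested remedy for `HttpURLConnection`, the article reports that only the jersey-apache-client route worked in this case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; it never opens a connection.
class CookieReplaySketch {
    // Assumed host name; the article's server ran on localhost:59680.
    static final String BASE = "http://auth.example.test";

    // Stores one Set-Cookie value and returns the Cookie header values
    // that a follow-up request to the same host would carry.
    static List<String> replayedCookies(String setCookieValue) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        try {
            // First response: the server asks the client to keep a session cookie.
            manager.put(URI.create(BASE + "/"),
                    Collections.singletonMap("Set-Cookie",
                            Collections.singletonList(setCookieValue)));
            // Request after the redirect: ask which headers should be attached.
            Map<String, List<String>> headers = manager.get(
                    URI.create(BASE + "/auth/callback"),
                    Collections.<String, List<String>>emptyMap());
            List<String> cookies = headers.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replayedCookies("JSESSIONID=vn8zzf9ep3x4mtw66ydm0n6a; Path=/"));
    }
}
```

A client without any cookie store corresponds to skipping the `put` call, which is why each redirected request in the first capture looked like a brand-new session to the server.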
August 21, 2013
by Mark Needham
· 33,530 Views
OpenStack Savanna: Fast Hadoop Cluster Provisioning on OpenStack
Introduction

OpenStack is one of the most popular open source cloud computing projects for providing an Infrastructure as a Service solution. Its key components are Compute (Nova), Networking (Neutron, formerly known as Quantum), Storage (object and block storage, Swift and Cinder, respectively), OpenStack Dashboard (Horizon), Identity Service (Keystone) and Image Service (Glance). There are other official incubated projects like Metering (Ceilometer) and Orchestration and Service Definition (Heat).

Savanna is a Hadoop as a Service for OpenStack introduced by Mirantis. It is still in an early phase (version 0.2 was released in summer 2013) and, according to its roadmap, version 1.0 is targeted for official OpenStack incubation. In principle, Heat could also be used for Hadoop cluster provisioning, but Savanna is especially tuned for providing Hadoop-specific API functionality while Heat is meant to be used for generic purposes.

Savanna architecture

Savanna is integrated with core OpenStack components such as Keystone, Nova, Glance, Swift and Horizon. It has a REST API that supports the Hadoop cluster provisioning steps. The Savanna API is implemented as a WSGI server that, by default, listens on port 8386. In addition, Savanna can be integrated with Horizon, the OpenStack Dashboard, to create a Hadoop cluster from the management console. Savanna also comes with a Vanilla plugin that deploys a Hadoop cluster image. The standard out-of-the-box Vanilla plugin supports Hadoop version 1.1.2.

Installing Savanna

The simplest option to try out Savanna is to use devstack in a virtual machine. I was using an Ubuntu 12.04 virtual instance in my tests.
In that environment we need to execute the following commands to install devstack and the Savanna API:

    $ sudo apt-get install git-core
    $ git clone https://github.com/openstack-dev/devstack.git
    $ vi localrc   # edit localrc
    ADMIN_PASSWORD=nova
    MYSQL_PASSWORD=nova
    RABBIT_PASSWORD=nova
    SERVICE_PASSWORD=$ADMIN_PASSWORD
    SERVICE_TOKEN=nova

    # Enable Swift
    ENABLED_SERVICES+=,swift
    SWIFT_HASH=66a3d6b56c1f479c8b4e70ab5c2000f5
    SWIFT_REPLICAS=1
    SWIFT_DATA_DIR=$DEST/data

    # Force checkout prerequisites
    # FORCE_PREREQ=1

    # Keystone is now configured by default to use PKI as the token format, which produces huge tokens.
    # Set UUID as the Keystone token format, which is much shorter and easier to work with.
    KEYSTONE_TOKEN_FORMAT=UUID

    # Change the FLOATING_RANGE to whatever IPs the VM is working in.
    # In NAT mode it is the subnet VMware Fusion provides; in bridged mode it is your local network.
    FLOATING_RANGE=192.168.55.224/27

    # Enable auto assignment of floating IPs. By default Savanna expects this setting to be enabled.
    EXTRA_OPTS=(auto_assign_floating_ip=True)

    # Enable logging
    SCREEN_LOGDIR=$DEST/logs/screen

    $ ./stack.sh   # this will take a while to execute
    $ sudo apt-get install python-setuptools python-virtualenv python-dev
    $ virtualenv savanna-venv
    $ savanna-venv/bin/pip install savanna
    $ mkdir savanna-venv/etc
    $ cp savanna-venv/share/savanna/savanna.conf.sample savanna-venv/etc/savanna.conf
    # To start the Savanna API:
    $ savanna-venv/bin/python savanna-venv/bin/savanna-api --config-file savanna-venv/etc/savanna.conf

To install the Savanna UI integrated with Horizon, we need to run the following commands:

    $ sudo pip install savanna-dashboard
    $ cd /opt/stack/horizon/openstack-dashboard
    $ vi settings.py
    HORIZON_CONFIG = {
        'dashboards': ('nova', 'syspanel', 'settings', 'savanna'),
    INSTALLED_APPS = (
        'savannadashboard',
        ....
    $ cd /opt/stack/horizon/openstack-dashboard/local
    $ vi local_settings.py
    SAVANNA_URL = 'http://localhost:8386/v1.0'
    $ sudo service apache2 restart

Provisioning a Hadoop cluster

As a first step, we need to configure the Keystone-related environment variables to get the authentication token:

    ubuntu@ip-10-59-33-68:~$ vi .bashrc
    $ export OS_AUTH_URL=http://127.0.0.1:5000/v2.0/
    $ export OS_TENANT_NAME=admin
    $ export OS_USERNAME=admin
    $ export OS_PASSWORD=nova
    ubuntu@ip-10-59-33-68:~$ source .bashrc
    ubuntu@ip-10-59-33-68:~$
    ubuntu@ip-10-59-33-68:~$ env | grep OS
    OS_PASSWORD=nova
    OS_AUTH_URL=http://127.0.0.1:5000/v2.0/
    OS_USERNAME=admin
    OS_TENANT_NAME=admin
    ubuntu@ip-10-59-33-68:~$ keystone token-get
    +-----------+----------------------------------+
    | Property  | Value                            |
    +-----------+----------------------------------+
    | expires   | 2013-08-09T20:31:12Z             |
    | id        | bdb582c836e3474f979c5aa8f844c000 |
    | tenant_id | 2f46e214984f4990b9c39d9c6222f572 |
    | user_id   | 077311b0a8304c8e86dc0dc168a67091 |
    +-----------+----------------------------------+
    $ export auth_token="bdb582c836e3474f979c5aa8f844c000"
    $ export tenant_id="2f46e214984f4990b9c39d9c6222f572"

Then we need to create the Glance image that we want to use for our Hadoop cluster.
in our example we have used mirantis's vanilla image but we can also build our own image: $ wget http://savanna-files.mirantis.com/savanna-0.2-vanilla-1.1.2-ubuntu-12.10.qcow2 $ glance image-create --name=savanna-0.2-vanilla-hadoop-ubuntu.qcow2 --disk-format=qcow2 --container-format=bare < ./savanna-0.2-vanilla-1.1.2-ubuntu-12.10.qcow2 ubuntu@ip-10-59-33-68:~/devstack$ glance image-list +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ | id | name | disk format | container format | size | status | +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ | d0d64f5c-9c15-4e7b-ad4c-13859eafa7b8 | cirros-0.3.1-x86_64-uec | ami | ami | 25165824 | active | | fee679ee-e0c0-447e-8ebd-028050b54af9 | cirros-0.3.1-x86_64-uec-kernel | aki | aki | 4955792 | active | | 1e52089b-930a-4dfc-b707-89b568d92e7e | cirros-0.3.1-x86_64-uec-ramdisk | ari | ari | 3714968 | active | | d28051e2-9ddd-45f0-9edc-8923db46fdf9 | savanna-0.2-vanilla-hadoop-ubuntu.qcow2 | qcow2 | bare | 551699456 | active | +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ $ export image_id=d28051e2-9ddd-45f0-9edc-8923db46fdf9 then we have installed httpie , an open source http client that can be used to send rest requests to savanna api: $ sudo pip install httpie from now on we will use httpie to send savanna commands. 
we need to register the image with savanna: $ export savanna_url="http://localhost:8386/v1.0/$tenant_id" $ http post $savanna_url/images/$image_id x-auth-token:$auth_token username=ubuntu http/1.1 202 accepted content-length: 411 content-type: application/json date: thu, 08 aug 2013 21:28:07 gmt { "image": { "os-ext-img-size:size": 551699456, "created": "2013-08-08t21:05:55z", "description": "none", "id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9", "metadata": { "_savanna_description": "none", "_savanna_username": "ubuntu" }, "mindisk": 0, "minram": 0, "name": "savanna-0.2-vanilla-hadoop-ubuntu.qcow2", "progress": 100, "status": "active", "tags": [], "updated": "2013-08-08t21:28:07z", "username": "ubuntu" } } $ http $savanna_url/images/$image_id/tag x-auth-token:$auth_token tags:='["vanilla", "1.1.2", "ubuntu"]' http/1.1 202 accepted content-length: 532 content-type: application/json date: thu, 08 aug 2013 21:29:25 gmt { "image": { "os-ext-img-size:size": 551699456, "created": "2013-08-08t21:05:55z", "description": "none", "id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9", "metadata": { "_savanna_description": "none", "_savanna_tag_1.1.2": "true", "_savanna_tag_ubuntu": "true", "_savanna_tag_vanilla": "true", "_savanna_username": "ubuntu" }, "mindisk": 0, "minram": 0, "name": "savanna-0.2-vanilla-hadoop-ubuntu.qcow2", "progress": 100, "status": "active", "tags": [ "vanilla", "ubuntu", "1.1.2" ], "updated": "2013-08-08t21:29:25z", "username": "ubuntu" } } then we need to create a nodegroup templates (json files) that will be sent to savanna. there is one template for the master nodes ( namenode , jobtracker ) and another template for the worker nodes such as datanode and tasktracker . the hadoop version is 1.1.2. 
$ vi ng_master_template_create.json { "name": "test-master-tmpl", "flavor_id": "2", "plugin_name": "vanilla", "hadoop_version": "1.1.2", "node_processes": ["jobtracker", "namenode"] } $ vi ng_worker_template_create.json { "name": "test-worker-tmpl", "flavor_id": "2", "plugin_name": "vanilla", "hadoop_version": "1.1.2", "node_processes": ["tasktracker", "datanode"] } $ http $savanna_url/node-group-templates x-auth-token:$auth_token < ng_master_template_create.json http/1.1 202 accepted content-length: 387 content-type: application/json date: thu, 08 aug 2013 21:58:00 gmt { "node_group_template": { "created": "2013-08-08t21:58:00", "flavor_id": "2", "hadoop_version": "1.1.2", "id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c", "name": "test-master-tmpl", "node_configs": {}, "node_processes": [ "jobtracker", "namenode" ], "plugin_name": "vanilla", "updated": "2013-08-08t21:58:00", "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 } } $ http $savanna_url/node-group-templates x-auth-token:$auth_token < ng_worker_template_create.json http/1.1 202 accepted content-length: 388 content-type: application/json date: thu, 08 aug 2013 21:59:41 gmt { "node_group_template": { "created": "2013-08-08t21:59:41", "flavor_id": "2", "hadoop_version": "1.1.2", "id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6", "name": "test-worker-tmpl", "node_configs": {}, "node_processes": [ "tasktracker", "datanode" ], "plugin_name": "vanilla", "updated": "2013-08-08t21:59:41", "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 } } the next step is to define the cluster template: $ vi cluster_template_create.json { "name": "demo-cluster-template", "plugin_name": "vanilla", "hadoop_version": "1.1.2", "node_groups": [ { "name": "master", "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c", "count": 1 }, { "name": "workers", "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6", "count": 2 } ] } $ http 
$savanna_url/cluster-templates x-auth-token:$auth_token < cluster_template_create.json http/1.1 202 accepted content-length: 815 content-type: application/json date: fri, 09 aug 2013 07:04:24 gmt { "cluster_template": { "anti_affinity": [], "cluster_configs": {}, "created": "2013-08-09t07:04:24", "hadoop_version": "1.1.2", "id": "{ "name": "cluster-1", "plugin_name": "vanilla", "hadoop_version": "1.1.2", "cluster_template_id" : "64c4117b-acee-4da7-937b-cb964f0471a9", "user_keypair_id": "stack", "default_image_id": "3f9fc974-b484-4756-82a4-bff9e116919b" }", "name": "demo-cluster-template", "node_groups": [ { "count": 1, "flavor_id": "2", "name": "master", "node_configs": {}, "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c", "node_processes": [ "jobtracker", "namenode" ], "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 }, { "count": 2, "flavor_id": "2", "name": "workers", "node_configs": {}, "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6", "node_processes": [ "tasktracker", "datanode" ], "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 } ], "plugin_name": "vanilla", "updated": "2013-08-09t07:04:24" } } now we are ready to create the hadoop cluster: $ vi cluster_create.json { "name": "cluster-1", "plugin_name": "vanilla", "hadoop_version": "1.1.2", "cluster_template_id" : "64c4117b-acee-4da7-937b-cb964f0471a9", "user_keypair_id": "savanna", "default_image_id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9" } $ http $savanna_url/clusters x-auth-token:$auth_token < cluster_create.json http/1.1 202 accepted content-length: 1153 content-type: application/json date: fri, 09 aug 2013 07:28:14 gmt { "cluster": { "anti_affinity": [], "cluster_configs": {}, "cluster_template_id": "64c4117b-acee-4da7-937b-cb964f0471a9", "created": "2013-08-09t07:28:14", "default_image_id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9", "hadoop_version": "1.1.2", "id": "d919f1db-522f-45ab-aadd-c078ba3bb4e3", 
"info": {}, "name": "cluster-1", "node_groups": [ { "count": 1, "created": "2013-08-09t07:28:14", "flavor_id": "2", "instances": [], "name": "master", "node_configs": {}, "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c", "node_processes": [ "jobtracker", "namenode" ], "updated": "2013-08-09t07:28:14", "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 }, { "count": 2, "created": "2013-08-09t07:28:14", "flavor_id": "2", "instances": [], "name": "workers", "node_configs": {}, "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6", "node_processes": [ "tasktracker", "datanode" ], "updated": "2013-08-09t07:28:14", "volume_mount_prefix": "/volumes/disk", "volumes_per_node": 0, "volumes_size": 10 } ], "plugin_name": "vanilla", "status": "validating", "updated": "2013-08-09t07:28:14", "user_keypair_id": "savanna" } } after a while we can run the nova command to check if the instances are created and running: $ nova list +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+ | id | name | status | task state | power state | networks | +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+ | 1a9f43bf-cddb-4556-877b-cc993730da88 | cluster-1-master-001 | active | none | running | private=10.0.0.2, 192.168.55.227 | | bb55f881-1f96-4669-a94a-58cbf4d88f39 | cluster-1-workers-001 | active | none | running | private=10.0.0.3, 192.168.55.226 | | 012a24e2-fa33-49f3-b051-9ee2690864df | cluster-1-workers-002 | active | none | running | private=10.0.0.4, 192.168.55.225 | +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+ now we can log in to the hadoop master instance and run the required hadoop commands: $ ssh -i savanna.pem [email protected] $ sudo chmod 777 /usr/share/hadoop $ sudo su 
hadoop
$ cd /usr/share/hadoop
$ hadoop jar hadoop-example-1.1.2.jar pi 10 100

Savanna UI via Horizon

In order to create nodegroup templates, cluster templates, and the cluster itself, we used a command line tool -- HTTPie -- to send REST API calls. The same functionality is also available via Horizon, the standard OpenStack dashboard. First we register the image with Savanna, then we create the nodegroup templates, after that the cluster template, and finally the cluster itself.
August 20, 2013
by Istvan Szegedi
· 9,419 Views
Resource Pooling, Virtualization, Fabric, and the Cloud
One of the five essential attributes of cloud computing (see The 5-3-2 Principle of Cloud Computing) is resource pooling, which is an important differentiator separating the thought process of traditional IT from that of a service-based, cloud computing approach. Resource pooling in the context of cloud computing and from a service provider’s viewpoint denotes a set of strategies and a methodical way of managing resources. For a user, resource pooling institutes an abstraction for presenting and consuming resources in a consistent and transparent fashion.

This article presents these key concepts derived from resource pooling:

  • Resource Pools
  • Virtualization in the Context of Cloud Computing
  • Standardization, Automation, and Optimization
  • Fabric
  • Cloud
  • Closing Thoughts

Resource Pools

Ultimately, data center resources can be logically placed into three categories: compute, networks, and storage. For many, this grouping may appear trivial. It is, however, a foundation upon which some cloud computing methodologies are developed, products designed, and solutions formulated.

Compute

This is a collection of all CPU capabilities. Essentially, all data center servers, whether supporting or actually running a workload, are part of this compute group. The compute pool represents the total capacity for executing code and running instances. The process of constructing a compute pool is to first inventory all servers and identify virtualization candidates, and then implement server virtualization. It is never too early to introduce a system management solution to facilitate these processes, which in my view is a strategic investment and a critical component of all cloud initiatives.

Networks

The physical and logical artifacts put in place to connect, segment, and isolate resources from layer three and below are gathered in the network pool. Networking enables resources to become visible and hence possibly manageable.
In the age of instant gratification, networks and mobility are redefining the security and system administration boundaries, and play a direct and impactful role in user productivity and customer satisfaction. Networking in cloud computing is more than just remote access; it empowers a user to self-serve and consume resources anytime, anywhere, with any device. BYOD and consumerization of IT are various expressions of these concepts.

Storage

This has long been a very specialized and sometimes mysterious part of IT. An enterprise storage solution is frequently characterized as a high-cost item with a significant financial and contractual commitment, specialized hardware, proprietary API and software, a dependency on direct vendor support, etc. In cloud computing, storage has become even more noticeable since the ability to grow and shrink based on demand, i.e. elasticity, requires an enterprise-level, massive, reliable, and resilient storage solution at a global scale. While enterprise IT is consolidating resources and transforming the existing establishment into a cloud computing environment, how to leverage existing storage devices from various vendors and integrate them with the next generation storage solutions is among the highest priorities for modernizing a data center.

Virtualization in the Context of Cloud Computing

In the last decade, virtualization has proved its value and accelerated the realization of cloud computing. Then, virtualization was mainly server virtualization, which in an over-simplified statement means hosting multiple server instances on the same hardware while each instance runs transparently and in isolation, as if each consumes the entire hardware and is the only instance running. Many of the customer expectations, business needs, and methodologies have since evolved.
Now, we should validate virtualization in the context of cloud computing to fully address the innovations rapidly changing how IT conducts business and delivers services. As discussed below, in the context of cloud computing, consumable resources are delivered in some virtualized form. Various virtualization layers collectively construct and form the so-called fabric.

Server Virtualization

The concept of server virtualization remains: running multiple server instances on the same hardware while each instance runs transparently and in isolation, as if each instance is the only instance running and consuming the entire server hardware. In addition to virtualizing and consolidating servers, server virtualization also signifies the practice of standardizing server deployment, switching away from physical boxes to VMs. Server virtualization is for packaging, delivering, and consuming a compute pool.

There are a few important considerations when virtualizing servers. IT needs the ability to identify and manage bare metal such that the entire resource life-cycle management, from commissioning to decommissioning, can be standardized and automated. To fundamentally reduce the support and training cost while increasing productivity, a consistent platform with tools applicable across physical, virtual, on-premises, and off-premises deployments is essential. The last thing IT wants is one set of tools for physical resources and another for those virtualized, one set of tools for on-premises deployment and another for those deployed to a service provider, and one set of tools for development and another for deploying applications. The requirement is one methodology for all, one skill set for all, and one set of tools for all. This advantage is obvious when developing applications and deploying Windows Server 2012 R2 on premises or off premises to Windows Azure.
The Active Directory security model can work across sites, System Center can manage resources deployed off premises to Windows Azure, and Visual Studio can publish applications across platforms. Windows infrastructure architecture, security, and deployment models are all directly applicable.

Network Virtualization

The same idea as server virtualization applies here. Network virtualization is the ability to run multiple networks on the same network device while each network runs transparently and in isolation, as if each network is the only network running and consuming the entire network hardware. Conceptually, since each network instance is running in isolation, one tenant’s 192.168.x network is not aware of another tenant’s identical 192.168.x network running on the same network device. Network virtualization provides the translation between physical network characteristics and the representation and identity of a resource in a virtualized network. Consequently, above the network virtualization layer, various tenants, while running in isolation, can have identical network configurations.

A great example of network virtualization is Windows Azure virtual networking. At any given time, there can be multiple Windows Azure subscribers all allocating the same 192.168.x address space with an identical subnet scheme (192.168.1.x/16) for deploying VMs. Those VMs belonging to one subscriber will however not be aware of or visible to those deployed by others, despite the fact that the network configuration, IP scheme, and IP address assignments may all be identical. Network virtualization in Windows Azure isolates one subscriber from the others such that each subscriber operates as if the subscription is the only one employing a 192.168.x address space.

Storage Virtualization

I believe this is where the next wave of drastic IT cost reduction, after server virtualization, will happen.
Historically, storage has been a high-cost item in any IT budget in every aspect, including hardware, software, staffing, maintenance, SLA, etc. Since the introduction of Windows Server 2012, there is a clear direction where storage virtualization is built into the OS and becoming a commodity. New capabilities like Storage Pool, Hyper-V over SMB, Scale-Out File Server, etc., are now part of the Windows Server OS and are making storage virtualization part of server administration routines, easily manageable with tools and utilities like PowerShell, which is familiar to many IT professionals.

The concept of storage virtualization remains consistent with the idea of logically separating a computing object from its hardware, in this case the storage capacity. Storage virtualization is the ability to integrate multiple and heterogeneous storage devices, aggregate the storage capacities, and present and manage them as one logical storage device with a continuous storage space. JBOD is a technology to realize this concept.

Standardization, Automation, and Optimization

Each of the three resource pools has an abstraction to logically present itself with characteristics and work patterns. A compute pool is a collection of physical (virtualization and infrastructure) hosts and VMs. A virtualization host hosts VMs that run workloads deployed by service owners and consumed by authorized users. A network pool encompasses network resources including physical devices, logical switches, address spaces, and site configurations. Network virtualization, as enabled and defined in configurations, can identify and translate a logical (virtual) IP address into a physical one, such that tenants on the same network hardware can implement an identical network scheme without concern. A storage pool is based on storage virtualization, which is the concept of presenting an aggregated storage capacity as one continuous storage space, as if provided by one logical storage device.
In other words, the three resource pools are wrapped with server virtualization, network virtualization, and storage virtualization, respectively. Each virtualization presents a set of methodologies from which work patterns are derived and common practices are developed. These virtualization layers provide opportunities to standardize, automate, and optimize deployments and considerably facilitate the adoption of cloud computing.

Standardization

Virtualizing resources decouples the dependency between instances and the underlying hardware. This offers an opportunity to simplify and standardize the logical representation of a resource. For instance, a VM is defined and deployed with a VM template that provides a level of consistency with a standardized configuration.

Automation

Once VM characteristics are identified and standardized, we can generate an instance by providing only instance-based information or information that depends on run-time, such as the VM name, which must be validated at run-time to prevent duplicated names. Requiring only minimal information at deployment can significantly simplify and streamline operations for automation. And with automation, resources can then be deployed, instantiated, relocated, taken off-line, brought back online, or removed rapidly and automatically based on set criteria. Standardization and automation are essential mechanisms so that a workload can be scaled on demand, i.e., become elastic.

Optimization

Standardization provides a set of common criteria. Automation executes operations based on set criteria with volume, consistency, and expediency. With standardization and automation, instances can be instantiated with consistency, efficiency, and predictability. In other words, resources can be operated in bulk with consistency and predictability. The next logical step is then to optimize the usage based on SLA.
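The standardization and automation steps can be illustrated with a small sketch. It is only a toy model under stated assumptions: `VMTemplate`, `Deployer`, and the `web-001` naming scheme are hypothetical names invented for illustration and do not correspond to any System Center or Windows Azure API. The point it shows is that the template carries everything standardized, while the VM name is the only per-instance input and is validated at deployment time.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VMTemplate:
    """Standardized settings shared by every VM built from this template."""
    cpu_cores: int
    memory_gb: int
    os_image: str


@dataclass
class Deployer:
    """Hypothetical stand-in for a management system that tracks deployed VMs."""
    vms: dict = field(default_factory=dict)

    def deploy(self, template: VMTemplate, name: str) -> dict:
        # The name is the only run-time input, so it is checked here
        # to prevent duplicated names.
        if name in self.vms:
            raise ValueError(f"duplicate VM name: {name}")
        vm = {
            "name": name,
            "cpu_cores": template.cpu_cores,
            "memory_gb": template.memory_gb,
            "os_image": template.os_image,
        }
        self.vms[name] = vm
        return vm

    def scale_out(self, template: VMTemplate, prefix: str, count: int) -> list:
        # Automation: create `count` instances, generating unused names.
        created = []
        index = 1
        while len(created) < count:
            name = f"{prefix}-{index:03d}"
            if name not in self.vms:
                created.append(self.deploy(template, name))
            index += 1
        return created


if __name__ == "__main__":
    web = VMTemplate(cpu_cores=2, memory_gb=4, os_image="example-image")
    deployer = Deployer()
    for vm in deployer.scale_out(web, "web", 3):
        print(vm["name"])
```

Because every instance comes from the same template, scaling out is a loop over names rather than a series of hand-built servers, which is what makes bulk operation predictable.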
The presented progression is what resource pooling and virtualization can provide and facilitate. These methodologies are now built into products and solutions. Windows Server 2012 R2 and System Center 2012 and later integrate server virtualization, network virtualization, and storage virtualization into one consistent solution platform with standardization, automation, and optimization for building and managing clouds.

Fabric

This is a significant abstraction in cloud computing. Fabric implies accessibility and discoverability, and denotes the ability to discover, identify, and manage a resource. Conceptually, fabric is an umbrella term encompassing all the underlying infrastructure supporting a cloud computing environment. At the same time, a fabric controller represents the system management solution which manages, i.e. owns, fabric.

In cloud architecture, fabric consists of the three resource pools: compute, networks, and storage. Compute provides the computing capabilities, executes code, and runs instances. Networks glue the resources together based on requirements. Storage is where VMs, configurations, data, and resources are kept. Fabric shields the user from the physical complexities of the three resource pools, which are presented through server virtualization, network virtualization, and storage virtualization. All operations are eventually directed by the fabric controller of a data center. Above fabric, there are logical views of consumable resources including VMs, virtual networks, and logical storage drives. By deploying VMs, configuring virtual networks, or acquiring storage, a user consumes resources. Under fabric, there are virtualization and infrastructure hosts, Active Directory, DNS, clusters, load balancers, address pools, network sites, library shares, storage arrays, topology, racks, cables, etc., all under the fabric controller’s command to collectively present and support fabric.
For a service provider, building a cloud computing environment is essentially establishing a fabric controller and constructing fabric: namely, instituting a comprehensive management solution, building the three resource pools, and integrating server virtualization, network virtualization, and storage virtualization to form fabric. From a user’s point of view, how and where a resource is physically provided is not a concern, but the accessibility, readiness, scalability, and fulfillment of SLA are.

Cloud

This is a well-defined term, and we should not be confused by it (see NIST SP 800-145 and the 5-3-2 Principle of Cloud Computing). We need to be very clear on what a cloud must exhibit (the five essential attributes), how to consume it (with SaaS, PaaS, or IaaS), and the model a service is deployed in (like private cloud, public cloud, and hybrid cloud). Cloud is a concept, a state, a set of capabilities such that a business can be delivered as a service, i.e. available on demand.

The architecture of a cloud computing environment is presented with three resource pools: compute, networks, and storage. Each is an abstraction provided by a virtualization layer. Server virtualization presents a compute pool with VMs that supply the computing power, i.e. CPUs, to execute code and run instances. Network virtualization offers a network pool and is the mechanism that allows multiple tenants to have identical network configurations on the same virtualization host while connecting, segmenting, and isolating network traffic with virtual NICs, logical switches, address space, network sites, IP pools, etc. Storage virtualization provides a logical storage device whose capacity appears continuous and aggregated, with a pool of storage devices behind the scenes.
The three resource pools together constitute the fabric (of a cloud) while the three virtualization layers collectively form the abstraction, such that while the underlying physical infrastructure may be intricate, the user experience above fabric remains logical and consistent. With virtualization, deploying a VM, configuring a virtual network, or acquiring storage is transparent regardless of where the VM actually resides, how the virtual network is physically wired, or which devices in aggregate provide the requested storage.

Closing Thoughts

Cloud is a very consumer-focused approach. It is about a customer’s ability and control, based on SLA, to get resources when needed and at scale and, equally important, to release resources when no longer required. It is not about products and technologies. It is about servicing, consuming, and strengthening the bottom line.
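The translation role of network virtualization described above can be sketched in a few lines. This is a toy model, not how Windows Azure or any real product implements it: the class name, the tenant labels, and the `10.0.0.0` physical range are all assumptions made for illustration. It shows only the core idea that the lookup key is the pair (tenant, virtual IP), so two tenants can use the same 192.168.x address without seeing each other.

```python
import ipaddress


class NetworkVirtualization:
    """Toy translation layer: maps (tenant, virtual IP) to a physical address."""

    def __init__(self, physical_base: str = "10.0.0.0"):
        # Hypothetical physical address range used only for this sketch.
        self._base = ipaddress.IPv4Address(physical_base)
        self._allocated = 0
        self._table = {}

    def attach(self, tenant: str, virtual_ip: str) -> str:
        key = (tenant, virtual_ip)
        if key in self._table:
            # Duplicates are rejected only within a single tenant.
            raise ValueError(f"{virtual_ip} already in use by {tenant}")
        self._allocated += 1
        physical = str(self._base + self._allocated)
        self._table[key] = physical
        return physical

    def resolve(self, tenant: str, virtual_ip: str) -> str:
        # A tenant can only resolve addresses in its own virtual network.
        return self._table[(tenant, virtual_ip)]


if __name__ == "__main__":
    layer = NetworkVirtualization()
    print(layer.attach("tenant-a", "192.168.1.10"))
    print(layer.attach("tenant-b", "192.168.1.10"))
```

Both tenants hold the identical virtual address, yet each resolves to a different physical one, and a lookup under a third tenant fails because no such entry exists in its network.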
August 12, 2013
by Yung Chou
· 10,391 Views
Spock - Return Nested Spies / Mocks
Hi! Some time ago I wrote an article about Mockito and using RETURNS_DEEP_STUBS when working with JAXB. Quite recently we faced a similar issue with deeply nested JAXB and the awesome testing framework written in Groovy called Spock. Natively, Spock does not support creating deep stubs or spies, so we needed to create a workaround, and this article will show you how to do it.

Project structure

We will be working on the same data structure as in the RETURNS_DEEP_STUBS when working with JAXB article, so the project structure will be quite similar. As you can see, the main difference is that the tests are in the /test/groovy/ folder instead of the /test/java/ folder.

Extended Spock Specification

In order to use Spock as a testing framework you have to create Groovy test scripts that extend the Spock Specification class. The details of how to use Spock are available here. In this project I have created an abstract class that extends Specification and adds two additional methods for creating nested Test Doubles (I may have seen a prototype of this approach somewhere on the internet).

ExtendedSpockSpecification.groovy

    package com.blogspot.toomuchcoding.spock;

    import spock.lang.Specification

    /**
     * Created with IntelliJ IDEA.
     * User: MGrzejszczak
     * Date: 14.06.13
     * Time: 15:26
     */
    abstract class ExtendedSpockSpecification extends Specification {
        /**
         * The method creates nested structure of spies for all the elements present in the property parameter. Those spies are set on the input object.
         *
         * @param object - object on which you want to create nested spies
         * @param property - field accessors delimited by a dot - JavaBean convention
         * @return Spy of the last object from the property path
         */
        protected def createNestedSpies(object, String property) {
            def lastObject = object
            property.tokenize('.').inject object, { obj, prop ->
                if (obj[prop] == null) {
                    def foundProp = obj.metaClass.properties.find { it.name == prop }
                    obj[prop] = Spy(foundProp.type)
                }
                lastObject = obj[prop]
            }
            lastObject
        }

        /**
         * The method creates nested structure of mocks for all the elements present in the property parameter. Those mocks are set on the input object.
         *
         * @param object - object on which you want to create nested mocks
         * @param property - field accessors delimited by a dot - JavaBean convention
         * @return Mock of the last object from the property path
         */
        protected def createNestedMocks(object, String property) {
            def lastObject = object
            property.tokenize('.').inject object, { obj, prop ->
                def foundProp = obj.metaClass.properties.find { it.name == prop }
                def mockedProp = Mock(foundProp.type)
                lastObject."${prop}" >> mockedProp
                lastObject = mockedProp
            }
            lastObject
        }
    }

These two methods work in a very similar manner. Assuming that the method's argument property looks as follows: "a.b.c.d", the methods tokenize the string by "." and iterate over the resulting list ["a", "b", "c", "d"]. For each element we search the properties of the Meta Class to find the one whose name is equal to prop (for example "a"). Once it is found, we use Spock's Mock/Spy method to create a Test Double of the given class (type). Finally, we have to bind the nested Test Double to its parent. For the Spy it's easy, since we set the value on the parent (obj[prop] = Spy(foundProp.type)). For the mocks, however, we need to use the overloaded >> operator to record the behavior for our mock - that's why we dynamically call the property whose name is held in the prop variable (lastObject."${prop}" >> mockedProp).
Then we return the mocked/spied instance from the closure and repeat the process for it.

Class to be tested

Let's take a look at the class to be tested:

PlayerServiceImpl.java

    package com.blogspot.toomuchcoding.service;

    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 08.06.13
     * Time: 19:02
     */
    public class PlayerServiceImpl implements PlayerService {
        @Override
        public boolean isPlayerOfGivenCountry(PlayerDetails playerDetails, String country) {
            String countryValue = playerDetails.getClubDetails().getCountry().getCountryCode().getCountryCode().value();
            return countryValue.equalsIgnoreCase(country);
        }
    }

The test class

And now the test class:

PlayerServiceImplWrittenUsingSpockTest.groovy

    package com.blogspot.toomuchcoding.service

    import com.blogspot.toomuchcoding.model.*
    import com.blogspot.toomuchcoding.spock.ExtendedSpockSpecification

    /**
     * User: mgrzejszczak
     * Date: 14.06.13
     * Time: 16:06
     */
    class PlayerServiceImplWrittenUsingSpockTest extends ExtendedSpockSpecification {
        public static final String COUNTRY_CODE_ENG = "ENG";

        PlayerServiceImpl objectUnderTest

        def setup(){
            objectUnderTest = new PlayerServiceImpl()
        }

        def "should return true if country code is the same when creating nested structures using groovy"() {
            given:
                PlayerDetails playerDetails = new PlayerDetails(
                    clubDetails: new ClubDetails(
                        country: new CountryDetails(
                            countryCode: new CountryCodeDetails(
                                countryCode: CountryCodeType.ENG
                            )
                        )
                    )
                )
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return true if country code is the same when creating nested structures using spock mocks - requires CGLIB for non interface types"() {
            given:
                PlayerDetails playerDetails = Mock()
                ClubDetails clubDetails = Mock()
                CountryDetails countryDetails = Mock()
                CountryCodeDetails countryCodeDetails = Mock()
                countryCodeDetails.countryCode >> CountryCodeType.ENG
                countryDetails.countryCode >> countryCodeDetails
                clubDetails.country >> countryDetails
                playerDetails.clubDetails >> clubDetails
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return true if country code is the same using ExtendedSpockSpecification's createNestedMocks"() {
            given:
                PlayerDetails playerDetails = Mock()
                CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode >> CountryCodeType.ENG
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return false if country code is not the same using ExtendedSpockSpecification createNestedMocks"() {
            given:
                PlayerDetails playerDetails = Mock()
                CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode >> CountryCodeType.PL
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                !playerIsOfGivenCountry
        }

        def "should return true if country code is the same using ExtendedSpockSpecification's createNestedSpies"() {
            given:
                PlayerDetails playerDetails = Spy()
                CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode = CountryCodeType.ENG
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return false if country code is not the same using ExtendedSpockSpecification's createNestedSpies"() {
            given:
                PlayerDetails playerDetails = Spy()
                CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode = CountryCodeType.PL
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                !playerIsOfGivenCountry
        }
    }

Let's move through the test methods one by one. First I present the code, then a quick description of the snippet.

    def "should return true if country code is the same when creating nested structures using groovy"() {
        given:
            PlayerDetails playerDetails = new PlayerDetails(
                clubDetails: new ClubDetails(
                    country: new CountryDetails(
                        countryCode: new CountryCodeDetails(
                            countryCode: CountryCodeType.ENG
                        )
                    )
                )
            )
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you can see the approach of creating nested structures by using the Groovy feature of passing properties to be set in the constructor.

    def "should return true if country code is the same when creating nested structures using spock mocks - requires CGLIB for non interface types"() {
        given:
            PlayerDetails playerDetails = Mock()
            ClubDetails clubDetails = Mock()
            CountryDetails countryDetails = Mock()
            CountryCodeDetails countryCodeDetails = Mock()
            countryCodeDetails.countryCode >> CountryCodeType.ENG
            countryDetails.countryCode >> countryCodeDetails
            clubDetails.country >> countryDetails
            playerDetails.clubDetails >> clubDetails
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you can find a test that creates mocks using Spock - mind you that you need CGLIB as a dependency when you are mocking non-interface types.
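The path-walking idea behind createNestedSpies does not depend on Spock, so it can be sketched in isolation. The snippet below is a language-neutral illustration in Python, not the article's Groovy code: `create_nested` is a hypothetical helper, and `SimpleNamespace` merely stands in for a test double. It tokenizes the dotted path, creates each missing child, attaches it to its parent, and returns the last object.

```python
from types import SimpleNamespace


def create_nested(root, path, factory=SimpleNamespace):
    """Walk a dotted path from root, creating any missing child objects.

    Returns the object at the end of the path, mirroring the
    'return the last object from the property path' contract.
    """
    current = root
    for name in path.split("."):
        child = getattr(current, name, None)
        if child is None:
            # Bind the new child to its parent, as the spy variant does.
            child = factory()
            setattr(current, name, child)
        current = child
    return current


if __name__ == "__main__":
    player = SimpleNamespace()
    code_details = create_nested(player, "clubDetails.country.countryCode")
    code_details.countryCode = "ENG"
    print(player.clubDetails.country.countryCode.countryCode)
```

Children that already exist are reused rather than replaced, which matches the null check in the spy version; the mock version in the article instead always creates a new mock and records it with the >> operator.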
August 8, 2013
by Marcin Grzejszczak
· 16,274 Views · 1 Like