DZone
The Latest DevOps and CI/CD Topics

Literate Programming and GitHub
I remain captivated by the ideals of Literate Programming. My fork of PyLit (https://github.com/slott56/PyLit-3) coupled with Sphinx seems to handle LP in a very elegant way.

It works like this.
  • Write RST files describing the problem and the solution. This includes the actual implementation code. And everything else that's relevant.
  • Run PyLit3 to build final Python code from the RST documentation. This should include the setup.py so that it can be installed properly.
  • Run Sphinx to build pretty HTML pages (and LaTeX) from the RST documentation.

I often include the unit tests along with the Sphinx build so that I'm sure that things are working.

The challenge is final presentation of the whole package.
  • The HTML can be easy to publish, but it can't (trivially) be used to recover the code. We have to upload two separate and distinct things. (We could use BeautifulSoup to recover RST from HTML and then PyLit to rebuild the code. But that sounds crazy.)
  • The RST is easy to publish, but hard to read, and it requires a pass with PyLit to emit the code and then another pass with Sphinx to produce the HTML. A single upload doesn't work well.
  • If we publish only the Python code we've defeated the point of literate programming. Even if we focus on the Python, we need to do a separate upload of HTML to provide the supporting documentation.

After working with this for a while, I've found that it's simplest to have one source and several targets. I use RST ⇒ (.py, .html, .tex). This encourages me to write documentation first. I often fail, and have blocks of code with tiny summaries and non-existent explanations.

PyLit allows one to use .py ⇒ .rst ⇒ .html, .tex. I've messed with this a bit and don't like it as much. Code first leaves the documentation as a kind of afterthought.

How can we publish simply and cleanly, without separate uploads? Enter GitHub and gh-pages. See the "sphinxdoc-test" project for an example. Also this: https://github.com/daler/sphinxdoc-test.
The bulk of this is useful advice on a way to create the gh-pages branch from your RST source via Sphinx and some GitHub commands.

Following this line of thinking, we almost have the case for three branches in an LP project.
  • The "master" branch with the RST source. And nothing more.
  • The "code" branch with the generated Python code created by PyLit.
  • The "gh-pages" branch with the generated HTML created by Sphinx.

I think I like this. We need three top-level directories. One has RST source. A build script would run PyLit to populate the (separate) directory for the code branch. The build script would also run Sphinx to populate a third top-level directory for the gh-pages branch.

The downside of this shows up when you need to create a branch for a separate effort. You have a "some-major-change" branch off master. Where's the code? Where's the doco? You don't want to commit either of those derived work products until you merge "some-major-change" back into master.

GitHub Literate Programming

There are many LP projects on GitHub. There are perhaps a dozen which focus on publishing with GitHub-flavored Markdown as the source language. Because Markdown is about as easy to parse as RST, the tooling is simple. Because Markdown lacks semantic richness, I'm not switching.

I've found that semantically rich markup is essential. This is a key feature of RST. It's carried forward by Sphinx to create very sophisticated markup. Think :code:`sample` vs. :py:func:`sample` vs. :py:mod:`sample` vs. :py:exc:`sample`. The final typesetting may be similar, but they are clearly semantically distinct and create separate index entries.

A focus on Markdown seems to be a limitation. It's encouraging to see folks experiment with literate programming using Markdown and GitHub. Perhaps other folks will look at more sophisticated markup languages like RST.

Previous Exercises

See https://sourceforge.net/projects/stingrayreader/ for a seriously large literate programming effort.
The HTML is also hosted at SourceForge: http://stingrayreader.sourceforge.net/index.html. This project is awkward because -- well -- I have to do a separate FTP upload of the finished pages after a change. It's done with a script, not a simple "git push."

SourceForge has a Git repository: https://sourceforge.net/p/stingrayreader/code/ci/master/tree/. But SourceForge doesn't use GitHub.com's UI, so it's not clear if it supports the gh-pages feature. I assume it doesn't, but maybe it does. (I can't even log in to SourceForge with Safari... I should really stop using SourceForge and switch to GitHub.)

See https://github.com/slott56/HamCalc-2.1 for another complex LP effort. This predates my dim understanding of the gh-pages branch, so it's got HTML (in doc/build/html), but it doesn't show it elegantly.

I'm still not sure this three-branch Literate Programming approach is sensible. My first step should probably be to rearrange the PyLit3 project into this three-branch structure.
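To make the RST ⇒ .py step concrete, here is a minimal sketch of the idea: pull the indented literal blocks that follow a paragraph ending in `::` out of an RST document and join them into source code. This is not PyLit's actual algorithm (PyLit also carries the prose along as comments and supports round trips); the function name `tangle` is a hypothetical name chosen for illustration.

```python
import textwrap


def tangle(rst_text):
    """Collect the literal blocks of an RST document as plain source code.

    Simplifying assumption: a code block is any indented run of lines that
    follows a paragraph ending in '::'. Everything else is treated as prose
    and dropped.
    """
    blocks = []
    current = None      # lines of the block being collected, or None
    expecting = False   # True after a paragraph that ended with '::'
    for line in rst_text.splitlines():
        indented = line.startswith((" ", "\t"))
        blank = not line.strip()
        if current is not None:
            if blank or indented:
                current.append(line)
                continue
            blocks.append(current)   # an unindented line closes the block
            current = None
        if expecting and not blank:
            expecting = False
            if indented:
                current = [line]     # first line of a new literal block
                continue
        if line.rstrip().endswith("::"):
            expecting = True
    if current is not None:
        blocks.append(current)
    # Dedent each block separately so blocks may use different indentation.
    return "\n\n".join(
        textwrap.dedent("\n".join(block)).strip("\n") for block in blocks
    ) + "\n"
```

A build script in the three-directory layout described above could write the result of `tangle()` into the code directory and then run Sphinx over the same RST files for the HTML.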
June 24, 2015
by Steven Lott
· 1,968 Views
Perforce and Go2Group Integrate Helix SCM Platform with ConnectALL ALM Router
New Integration Provides Seamless Connections Between Perforce Helix and Leading Application Lifecycle Management Systems

WOKINGHAM, UK (June 24, 2015) – Perforce Software, the leader in software configuration management (SCM) and collaboration, and Go2Group, an Atlassian Platinum and Enterprise Expert, today announced the Perforce ConnectALL Adapter. The new adapter for Go2Group’s ConnectALL ALM Router connects Perforce Helix to Application Lifecycle Management (ALM) systems supported by ConnectALL. The companies also announced that they have expanded their partnership, which first began in 2002.

“Very few SCMs can handle binary data, and no other SCM solution supports large file formats that scale across globally distributed enterprises like Helix,” said Brett Taylor, president of Go2Group. “Our customers demand future-proof solutions, and with Perforce we know they don’t have to worry about outgrowing their systems; it will serve them well whether they’re a team of 50 or 50,000.”

With the Perforce adapter, ConnectALL automatically synchronises data and workflow with other ALM systems and integrates ALM systems components within minutes.

“We’re excited to be a part of the ConnectALL ecosystem of adapters and to enable companies to more easily design, configure, synchronise, manage, and monitor their integrations with Perforce,” said Dave Robertson, vice president of Channels at Perforce. “We’re glad to extend our partnership with Go2Group to new technologies and markets.”

Go2Group is part of Perforce’s network of sales partners across Europe, the Middle East, Africa, Asia Pacific and India. Perforce partners serve customers in more than 100 countries worldwide. The Perforce ConnectALL Adapter is available for purchase from the Go2Group website.
June 24, 2015
by Fran Cator
· 955 Views
New Integrated Biometrics System Extends Enterprise Access Control to the Data Centre and Other High-Security Settings
Combining BioConnect with Digitus server cabinet access control makes biometric identity management more effective and easier

ENTERTECH SYSTEMS and Digitus Biometrics have just announced a new technology partnership between ENTERTECH’S BioConnect identity management platform and the Digitus db Bus and db Cabinet Sentry for server cabinet access control. The result is a new, fully-integrated solution called db BioConnect.

The new db BioConnect lets company data centres, co-location data centres and IT room customers simplify biometric implementation, enrollment and management for access control of perimeter doors, interior rooms, cages and now server racks, all integrated into one identity platform. This game-changing solution now extends enterprise access security platforms into the data centre, making investments in Access Control Management Systems far more economical and effective.

“Digitus Biometrics is the leader in providing biometric access control to the critical infrastructure market,” said Rob Douglas, ENTERTECH SYSTEMS CEO. “With Digitus technology integrated to BioConnect, all 15 of our certified access control partners will now be able to offer it to their customers. For the first time, end users will have access to an integrated biometric solution that secures access control from the data centre entrance to the server cabinet instead of having to deal with stand-alone systems.” The list of BioConnect certified access control partners can be found at www.bioconnect.com/partners.

ENTERTECH SYSTEMS’ BioConnect is the most advanced identity management platform on the global market today. Simple, secure and scalable, it provides Suprema biometric authentication across leading access control systems. As an application for security professionals, BioConnect helps enterprises successfully implement identity solutions by making deployment and use of biometrics easier than ever.
BioConnect addresses deployment challenges by reducing costs, overcoming complexity and making it easier to on-board users. The platform provides seamless synchronization of data such as new cardholders, changes and deletions, and is tailor-made for enterprises where true identity is critical for secure access to physical facilities and software applications.

“Our biometric access control solutions are designed to meet the needs of a diverse range of clients,” said David Orischak, Digitus Biometrics CEO. “This technology integration to create db BioConnect will let us offer a single, centralised, highly advanced access control solution that's easy to deploy and use. An industry first, our customers will be able to use one centralised system facility-wide to secure a company’s critical infrastructure and data.”

  • db Bus ServerRack Access Control is designed for data centres needing to secure large volumes of server cabinets with only one small component per cabinet. A single db Bus controller allows the user to secure up to 32 racks with a single 48V power supply and may be infinitely scaled.
  • db Cabinet Sentry is designed for data centres with a structured cabling scheme. It is Digitus’ most compact, cost-effective and energy-efficient means of securing server cabinets. This extremely versatile unit can be deployed in networked or stand-alone environments, using power over Ethernet (PoE) or an external power supply.
  • db BioLock is a server cabinet lock that uses biometric identification with prints for up to 10 fingers per user.

Through this new technology partnership, these db products will be integrated with BioConnect to create the new integrated db BioConnect solution.

“Digitus’ use of leading Suprema biometric technology in their server cabinet access control solutions is a natural fit for ENTERTECH SYSTEMS as the operating partner for Suprema in the US, Canada, UK, Ireland and Puerto Rico,” added Douglas.
“The implications to the market are significant,” added Orischak. “The integrated db BioConnect solution can be used to manage any cabinet system where biometric access control is warranted – even SCIFs and locked areas housing sensitive assets such as pharmaceuticals, hazardous materials, intelligence archives, customer and patient records, as well as critical IP.”

For more information on the db BioConnect integrated solution, visit www.bioconnect.com/db.
June 24, 2015
by Fran Cator
· 1,342 Views
How to Split Single PST File into Multiple PSTs & Merge Multiple PSTs using .NET
This technical tip explains how .NET developers can split and merge PST files using Aspose.Email. The Aspose.Email API can split a single PST file into multiple PST files of a desired file size. It can also merge multiple PST files into a single PST file. Both the split and merge operations can be tracked by subscribing to events on these operations. Aspose.Email for .NET is a set of components allowing developers to easily implement email functionality within their ASP.NET web applications, web services & Windows applications. It supports Outlook PST, EML, MSG & MHT formats.

//Code sample for Splitting a Single PST into multiple PSTs

[C#]

    using (PersonalStorage pst = PersonalStorage.FromFile(@"D:\test\source.pst"))
    {
        // The events subscription is an optional step for the tracking process only.
        pst.StorageProcessed += PstSplit_OnStorageProcessed;
        pst.ItemMoved += PstSplit_OnItemMoved;

        // Splits into PST chunks of about 5 MB each
        pst.SplitInto(5000000, @"D:\test\chunks\");
    }

[VB.NET]

    Using pst As PersonalStorage = PersonalStorage.FromFile("D:\test\source.pst")
        ' The events subscription is an optional step for the tracking process only.
        ' (VB.NET attaches event handlers with AddHandler, not +=.)
        AddHandler pst.StorageProcessed, AddressOf PstSplit_OnStorageProcessed
        AddHandler pst.ItemMoved, AddressOf PstSplit_OnItemMoved

        ' Splits into PST chunks of about 5 MB each
        pst.SplitInto(5000000, "D:\test\chunks\")
    End Using

//Code sample for Merging of Multiple PSTs into a single PST

[C#]

    totalAdded = 0;
    using (PersonalStorage pst = PersonalStorage.FromFile(@"D:\test\destination.pst"))
    {
        // The events subscription is an optional step for the tracking process only.
        pst.StorageProcessed += PstMerge_OnStorageProcessed;
        pst.ItemMoved += PstMerge_OnItemMoved;

        // Merges with the PST files that are located in a separate folder.
        pst.MergeWith(Directory.GetFiles(@"D:\test\sources\"));
        Console.WriteLine("Total messages added: {0}", totalAdded);
    }

[VB.NET]

    totalAdded = 0
    Using pst As PersonalStorage = PersonalStorage.FromFile("D:\test\destination.pst")
        ' The events subscription is an optional step for the tracking process only.
        AddHandler pst.StorageProcessed, AddressOf PstMerge_OnStorageProcessed
        AddHandler pst.ItemMoved, AddressOf PstMerge_OnItemMoved

        ' Merges with the PST files that are located in a separate folder.
        pst.MergeWith(Directory.GetFiles("D:\test\sources\"))
        Console.WriteLine("Total messages added: {0}", totalAdded)
    End Using

//Code sample Merging Folders from another PST

[C#]

    totalAdded = 0;
    using (PersonalStorage destinationPst = PersonalStorage.FromFile(@"D:\test\destination.pst"))
    using (PersonalStorage sourcePst = PersonalStorage.FromFile(@"D:\test\source.pst"))
    {
        FolderInfo destinationFolder = destinationPst.RootFolder.AddSubFolder("FolderFromAnotherPst");
        FolderInfo sourceFolder = sourcePst.GetPredefinedFolder(StandardIpmFolder.DeletedItems);

        // The events subscription is an optional step for the tracking process only.
        destinationFolder.ItemMoved += destinationFolder_ItemMoved;

        // Merges with the folder from another PST.
        destinationFolder.MergeWith(sourceFolder);
        Console.WriteLine("Total messages added: {0}", totalAdded);
    }

[VB.NET]

    totalAdded = 0
    Using destinationPst As PersonalStorage = PersonalStorage.FromFile("D:\test\destination.pst")
        Using sourcePst As PersonalStorage = PersonalStorage.FromFile("D:\test\source.pst")
            Dim destinationFolder As FolderInfo = destinationPst.RootFolder.AddSubFolder("FolderFromAnotherPst")
            Dim sourceFolder As FolderInfo = sourcePst.GetPredefinedFolder(StandardIpmFolder.DeletedItems)

            ' The events subscription is an optional step for the tracking process only.
            AddHandler destinationFolder.ItemMoved, AddressOf destinationFolder_ItemMoved

            ' Merges with the folder from another PST.
            destinationFolder.MergeWith(sourceFolder)
            Console.WriteLine("Total messages added: {0}", totalAdded)
        End Using
    End Using

//Helping Methods

[C#]

    // State shared by the handlers below; declare these as fields of the enclosing class.
    int totalAdded;
    int messageCount;
    string currentFolder;

    // Counts every item moved during a folder merge.
    void destinationFolder_ItemMoved(object sender, ItemMovedEventArgs e)
    {
        totalAdded++;
    }

    // Reports each source PST as the merge reaches it.
    void PstMerge_OnStorageProcessed(object sender, StorageProcessedEventArgs e)
    {
        Console.WriteLine("*** The storage is merging: {0}", e.FileName);
    }

    // Prints a per-folder count whenever the destination folder changes.
    void PstMerge_OnItemMoved(object sender, ItemMovedEventArgs e)
    {
        if (currentFolder == null)
        {
            currentFolder = e.DestinationFolder.RetrieveFullPath();
        }

        string folderPath = e.DestinationFolder.RetrieveFullPath();

        if (currentFolder != folderPath)
        {
            Console.WriteLine(" Added {0} messages to \"{1}\"", messageCount, currentFolder);
            messageCount = 0;
            currentFolder = folderPath;
        }

        messageCount++;
        totalAdded++;
    }

    // Flushes the last folder's count and reports the finished chunk.
    void PstSplit_OnStorageProcessed(object sender, StorageProcessedEventArgs e)
    {
        if (currentFolder != null)
        {
            Console.WriteLine(" Added {0} messages to \"{1}\"", messageCount, currentFolder);
        }

        messageCount = 0;
        currentFolder = null;
        Console.WriteLine("*** The chunk is processed: {0}", e.FileName);
    }

    // Prints a per-folder count whenever the destination folder changes.
    void PstSplit_OnItemMoved(object sender, ItemMovedEventArgs e)
    {
        if (currentFolder == null)
        {
            currentFolder = e.DestinationFolder.RetrieveFullPath();
        }

        string folderPath = e.DestinationFolder.RetrieveFullPath();

        if (currentFolder != folderPath)
        {
            Console.WriteLine(" Added {0} messages to \"{1}\"", messageCount, currentFolder);
            messageCount = 0;
            currentFolder = folderPath;
        }

        messageCount++;
    }
June 24, 2015
by David Zondray
· 1,599 Views
Overcoming Barriers to Performance and Scalability Test Automation
[This article was written by Ophir Prusak]

Guest author Ophir Prusak is chief evangelist at BlazeMeter. To learn more about load and performance testing automation, he invites readers to attend a meetup this Wednesday, June 24, at New Relic’s San Francisco offices.

Performance and load testing are kind of like flossing your teeth. You know you need to do it, but you might not be doing it as much as you should. When your site goes down because it couldn’t handle the load, you look back and realize you might have easily prevented it with a little more testing in advance. That’s why companies are automating their application testing in an effort to lower costs, increase efficiency, and reduce the time needed to release new features.

The importance of automated testing in a continuous delivery era

Continuous Delivery (CD) is rapidly emerging as the “new normal” in software development, as Perforce discovered in an independent survey, with an estimated 80% of SaaS companies and 51% of non-SaaS companies adopting this practice. Companies that provide Software-as-a-Service know they need to be continuously creating new features, updating their websites, and optimizing their backend. But while software development has adapted nicely in terms of automation, the testing side has moved more slowly.

For a full Continuous Delivery and Integration process to be realized, performance testing must be automated. As the need for testing increases, doing it manually can dramatically increase your time to release. Automating testing throughout the CD process can help detect errors instantly and deliver software faster.

Making it work

JMeter is the de facto standard in open source load testing. It’s the most widely used open source tool for performance testing for a good reason. There’s virtually nothing it can’t test (websites, native mobile applications, APIs, and Web applications) and it’s extremely powerful and fully featured.

Yet there are challenges. JMeter poses a steep learning curve in terms of integration and ease of use. Additionally, it doesn’t integrate easily with APM and Continuous Integration (CI) tools. Many developers have been looking for a way to conduct performance testing with less time and effort, and fewer hiccups along the way.

Taurus: An effort to simplify test automation

A new open source project called Taurus (Test AUtomation Running Smoothly) is designed to provide exactly that: a way to remove most of the pain of using JMeter on its own. Taurus can give you the ability to
  • Create and define a load test even without using JMeter.
  • Override existing JMeter files or test configurations.
  • Create human-readable configuration files and testing scripts that are easily added to source control systems like GitHub.
  • Integrate into CI tools like Jenkins.
  • Run multiple tests in parallel.
  • Provide pass/fail criteria back into the CI tool for easier automation of test-results analysis.
  • Make analysis of test results easier and more intuitive.

Taurus still uses JMeter under the hood, but is designed to have a much easier learning curve, especially for simple tests. Taurus also offers a built-in result analysis engine that provides both console-based reporting features and result analysis.

Performance testing and optimizing your applications is not simple, yet there are solutions available that make the process easier and more successful. I’m looking forward to seeing how the technology evolves even further in the near future.

If you want to learn more about Taurus, check out the project on GitHub. Better yet, you are invited to come to a meetup this Wednesday, June 24, at New Relic’s San Francisco offices. You can learn a lot more about Taurus and how you can use it to help scale load and performance testing automation.
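The idea of feeding pass/fail criteria back into a CI tool can be illustrated with a small, tool-independent sketch. It is not Taurus's implementation or configuration format; the function names and the thresholds (500 ms average response time, 1% errors) are assumptions picked for the example. The only general fact relied on is that CI servers treat a non-zero exit status as a failed step.

```python
from statistics import mean


def evaluate_run(response_times_ms, error_count,
                 max_avg_ms=500.0, max_error_rate=0.01):
    """Return (passed, reasons) for one load-test run.

    The thresholds are illustrative defaults, not values from any real tool.
    """
    total = len(response_times_ms)
    if total == 0:
        return False, ["no samples were recorded"]

    reasons = []
    average = mean(response_times_ms)
    if average > max_avg_ms:
        reasons.append(f"average {average:.0f} ms exceeds {max_avg_ms:.0f} ms")

    error_rate = error_count / total
    if error_rate > max_error_rate:
        reasons.append(
            f"error rate {error_rate:.1%} exceeds {max_error_rate:.1%}"
        )
    return not reasons, reasons


def exit_code(passed):
    """Map the verdict to a process exit status for the CI server."""
    return 0 if passed else 1
```

A wrapper script would call `sys.exit(exit_code(passed))` after printing the reasons, so the build log explains why the performance gate failed.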
June 24, 2015
by Fredric Paul
· 1,769 Views
Don't Leave Your Database Behind
Microsoft’s recent announcement that Windows 10 will be the last big-bang release for the operating system was a surprise to many. But what on Earth is going on? Why have they suddenly turned round and decided to upgrade their software with a series of frequent updates instead? The answer is a term not many chief executives will be familiar with, but an important one nevertheless: continuous delivery.

Continuous delivery for software applications means that new features can be released faster and companies can be more competitive. Although it requires a change in processes and an investment in the tools that make it possible, the return on investment is high – and proven. The 2014 State of DevOps Report from Puppet Labs found that high-performing IT organisations that use practices such as continuous delivery are twice as likely to exceed their profitability, market share and productivity goals.

WHAT ABOUT THE DATABASE?

A stumbling block many organisations find in their pursuit of continuous delivery is the database. Databases hold critical information and their stability is vital to the bottom line. Often they’re tied into those same applications that can gain from continuous delivery, but they’re seen as too risky to include in the process. In fact, the DZone Guide to Continuous Delivery 2015 showed that while 61 per cent of companies have already implemented continuous delivery for their applications, it falls to half that number when it comes to the database. The result? Application development is faster, smoother and more cost effective, while database development lags behind.

At Redgate, we’ve been working on a way to resolve the issue with a Database Lifecycle Management (DLM) solution: one that brings all the advantages of continuous delivery to the database, while protecting the data at the same time.
We have some pedigree here. Our tools are already used in more than 90 per cent of Fortune 100 companies, and are trusted in areas such as finance, healthcare and manufacturing, where performance and reliability are not optional.

It’s working too. Clients such as Yorkshire Water are seeing a return on investment of 700 per cent after investing in continuous delivery for their databases. Similarly, StateServ, a medical equipment supplier with customers across the United States, has adopted Redgate’s SQL Source Control and SQL Compare to reduce the time it takes to release new versions of its website by 50 per cent. Property services provider Move With Us has also adopted these tools to reduce the cost of releasing database updates by 97 per cent. As Anthony Hall, IT operations manager, says: “We spend less time deploying and more time developing better software, which keeps us ahead of our competitors. It results in a product that more closely matches what the customer wants, at a cheaper price. That equals happy customers, a happier development team and bigger profits for the company.”

I love comments like that. It demonstrates that all the effort we’ve put into developing our tools and the DLM process to go with it is paying off. Not for us – for our clients. By aligning the continuous delivery of the application with the continuous delivery of the database, a company will inevitably see increased productivity, reduced risk and higher quality end-products. This suddenly means that your technology becomes a strategic advantage, as opposed to an unpredictable risk, and can be used to deliver real financial benefits to the business.

CASE STUDY: KEEPING DATA FLOWING FOR YORKSHIRE WATER

Yorkshire Water manage the collection, treatment and distribution of water in Yorkshire, supplying around 1.24 billion litres of drinking water daily. At the same time they collect, treat and dispose of about one billion litres of waste water safely back into the environment.
As might be imagined, their applications and databases are diverse and technically challenging, and deploying changes to the databases used to be manual and time consuming. The company turned to Redgate’s database development tools to automate the changes, saving time as well as avoiding errors. Using SQL Source Control and SQL Compare, they achieved every aim. In 25 days, they moved their first project from a manual deployment process to full automation and it now takes half a day to start auto-deploying a new project. As software development team leader Carl Davison says: “It’s now a minor overhead to deploy. We predict the return on investment to be in the order of 700 per cent over the next five years. We can deploy database changes as soon as the business needs them, without delays or problems.”
June 24, 2015
by Moe Long
· 1,508 Views
Hazelcast Cluster Quorum
Originally written by David Brimley.

The Death Spiral

A new feature in the 3.5 release of Hazelcast is the Cluster Quorum. In this instance we’re not talking about a Quorum in its traditional distributed systems sense; think of a Cluster Quorum as a kind of gatekeeper, protecting your cluster during times of unexpected member loss. You can use Cluster Quorums to restrict operations on Maps or indeed the entire cluster based upon environmental criteria.

This sounds great, you say, but I’m still not sure how this can help me? OK. Let's take a look at a scenario…

Imagine a cluster that has a very high number of writes to a certain map. We also have other maps that are not updated quite so frequently, and all the while we have hundreds of clients reading from the cluster, but not at the same frequency as the data that is entering the system.

In normal circumstances, if a machine or a number of machines were to die in the cluster, we may still have enough memory available to store our data, but the number of threads available to process requests would be reduced. We now have fewer cores available, and the partition threads in the cluster could quickly become overwhelmed by the one map that is updated rapidly. This could mean other clients becoming starved of threads, with the cluster unable to service their requests. It’s also possible that the remaining members would become so consumed that they’re unable to respond to membership pings; the knock-on effect could be that such a member is forced out of the cluster on the assumption that it is dead.

To protect the rest of the cluster in the event of member loss, we need a way to stop the writes to the high-frequency map whilst allowing operations on the other data structures. We can then continue to provide a good service to our other users whilst the crashed machines are restored to the cluster.

Bring on the Quorum!

As of Hazelcast 3.5 we now have the ability to restrict operations on distinct data structures.
We do this via a Quorum configuration. We observed that other IMDG products provide Quorums that have protection at a cluster level; we decided to go one step further and provide Quorum protection around data structures as well.

In the example below we create a very simple Quorum on the default map. The ‘default’ map in Hazelcast is the configuration used if no other match is found. In this instance no operations will be allowed unless the cluster has a minimum of 3 members. You’ll also note that the Quorum configuration is separate from the Map. This means that you can have multiple Quorums in a cluster attached to many different structures. If the Quorum thresholds are not satisfied then a QuorumException is thrown when we try to interact with the default map in any way, be it from a client or another member.

    <quorum name="quorumRuleWithThreeNodes" enabled="true">
        <quorum-size>3</quorum-size>
    </quorum>

    <map name="default">
        <quorum-ref>quorumRuleWithThreeNodes</quorum-ref>
    </map>

Quorum Functions

It’s simple to set up a Quorum check based on cluster size as we’ve seen above, but if you want to make a slightly more complex check you can do this by applying a Quorum Function.

    QuorumConfig quorumConfig = new QuorumConfig();
    quorumConfig.setName("MyQuorum");
    quorumConfig.setEnabled(true);
    quorumConfig.setType(QuorumType.WRITE);
    quorumConfig.setQuorumFunctionImplementation(new QuorumFunction() {
        @Override
        public boolean apply(Collection<Member> members) {
            return (members.size() >= 3) && (someOtherExternalClusterState);
        }
    });

In the example above we use the Configuration API to set up the Quorum to disallow writes if the boolean returned from the QuorumFunction is false. In the function we test whether the cluster has at least 3 members and also whether a variable named someOtherExternalClusterState is true. You now get the idea that by using a function you can test for other state and not just cluster membership.

Listen In

Another nice feature of Quorums is the ability to listen in to Quorum Events. You can register a new callback interface called, not surprisingly, a QuorumListener.
Quorum listeners are local to the node on which they are registered, so they receive only events that occurred on that local node.

    <quorum name="quorumRuleWithThreeNodes" enabled="true">
        <quorum-size>3</quorum-size>
        <quorum-listeners>
            <quorum-listener>com.company.quorum.ThreeNodeQuorumListener</quorum-listener>
        </quorum-listeners>
    </quorum>

    <map name="default">
        <quorum-ref>quorumRuleWithThreeNodes</quorum-ref>
    </map>

The QuorumListener has just one method, which is called with a QuorumEvent.

    package com.hazelcast.quorum;

    import java.util.EventListener;

    /**
     * Listener to get notified when a quorum state is changed
     */
    public interface QuorumListener extends EventListener {

        /**
         * Called when quorum presence state is changed.
         *
         * @param quorumEvent provides information about quorum presence and current member list.
         */
        void onChange(QuorumEvent quorumEvent);
    }

The QuorumEvent itself allows you to determine if a Quorum has been established or if it has been lost via its isPresent() method call. Additionally it provides the number of cluster members required to form a quorum and also the current membership list.

Query the Quorums

Above we saw how we could receive callbacks, but in some cases we may just wish to make an immediate check to see if the Quorum is established or not. We can do this via the QuorumService.

    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
    QuorumService quorumService = hazelcastInstance.getQuorumService();
    Quorum quorum = quorumService.getQuorum(quorumName);
    boolean quorumPresence = quorum.isPresent();

In Conclusion

The Cluster Quorum feature is another important tool for you to manage your cluster. In future versions of Hazelcast there are plans to add other data structures; for example, you’ll be able to protect operations against Topics or Queues.
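The gatekeeping behaviour of a WRITE-type quorum can be modelled in a few lines, independent of Hazelcast. The sketch below is plain Python and does not use or mirror the Hazelcast API; the class names `WriteGuardedMap` and `QuorumError` are hypothetical. It only shows the rule described above: writes are refused while the member count is below the threshold, and reads keep working.

```python
class QuorumError(Exception):
    """Raised when a guarded write is attempted without enough members."""


class WriteGuardedMap:
    """Toy model of a map protected by a write quorum (not Hazelcast code)."""

    def __init__(self, members, min_members):
        self.members = set(members)      # current cluster membership
        self.min_members = min_members   # quorum threshold
        self._data = {}

    def quorum_present(self):
        return len(self.members) >= self.min_members

    def put(self, key, value):
        # Writes are rejected while the cluster is below the threshold.
        if not self.quorum_present():
            raise QuorumError(
                f"{len(self.members)} member(s) present, "
                f"{self.min_members} required"
            )
        self._data[key] = value

    def get(self, key):
        # Reads are not guarded in a write-only quorum.
        return self._data.get(key)
```

With three members and a threshold of 3, `put` succeeds; after one member is removed from `members`, `put` raises `QuorumError` while `get` still returns the last stored value.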
June 24, 2015
by Andrea Echstenkamper
· 2,834 Views · 2 Likes
New Relic’s Docker Monitoring Now Generally Available
[This article was written by Andrew Marshall]

We’ve been talking a lot about Docker over the past few weeks, with good reason. Docker’s explosive growth in popularity within the enterprise has enabled new distributed application architectures and with them a need for app-centric monitoring of your Docker containers within the context of the rest of your infrastructure. We’re thrilled to announce today that New Relic’s Docker monitoring is now generally available to New Relic customers, just in time for DockerCon 2015! (And as we noted last week, New Relic’s Docker monitoring solution has been selected by Docker for its Ecosystem Technology Partner program as a proven container monitoring solution.)

Why app-centric monitoring?

If you’re a software business using Docker containers, chances are you’ve done so to gain efficiencies from your system resources or portability across environments to shorten the cycle between writing and running code. Either way, adding Docker containers to your app development meant a new tier of infrastructure to monitor, which equated to a “black box” in your data: one that you had no visibility into from a monitoring perspective.

Docker monitoring with New Relic is designed to “fix” this lack of monitoring visibility by adding an app-centric view of Docker containers to the existing New Relic Servers interface you already use. Now, instead of having a gap between the application and server monitoring views, we’ve added the ability to see containers with the same “first-class” experience as you would with virtual machines and servers. You can now drill down from the application (which is really what you care about) to the individual Docker container, and then to the physical server. No more blind spots!

As we strive to do with all of our products, we took the approach of “important” over “impressive” when it comes to the container information we provide to users.
Based on direct feedback from customers, we’ve tried to take the mystery out of finding the right container to help you get back to developing your applications. As the way people use containers changes over time, we plan to continue to listen to our customers to help shape how we approach Docker container monitoring. Restoring 360-degree view of your application environment One example of how app-centric monitoring can impact a team moving to microservices or distributed application environments is Motus, a mobile workforce management company. Motus has been a New Relic customer for more than four years and recently has been shifting to a microservices architecture with approximately 95% of its production workload now running in Docker containers. While Docker helpd Motus gain speed and agility while reducing infrastructure complexity, the link between the application and what was happening with the container it was running on was broken. During the trial of New Relic’s Docker monitoring, Motus was able to more easily identify which container an app was running on, all the way down to the node. That was a big help when they needed to investigate an issue and determine if a new container was required.. During the beta alone, Motus estimates that using New Relic helped them to reduce the time to investigate and fix problems with its Docker containers by 30%! Motus isn’t just using New Relic to diagnose when a problem occurs. Docker monitoring with New Relic has helped Motus analyze and “right size” its containers for the application to better allocate resources for performance and budget. Get started with New Relic’s Docker monitoring today, for more information, please stop by our booth at DockerCon, June 22-23 in San Francisco! Resources: Motus Docker Monitoring Case Study Docker Monitoring with New Relic Enabling Docker Monitoring with New Relic Docker in the New Relic Community Forum
June 24, 2015
by Fredric Paul
· 1,019 Views
Percona XtraDB Cluster (PXC): How Many Nodes Do You Need?
Written by Stephane Combaudon. A question I often hear when customers want to set up a production PXC cluster is: “How many nodes should we use?” Three nodes is the most common deployment, but when are more nodes needed? They also ask: “Do we always need to use an even number of nodes?” This is what we’ll clarify in this post. This is all about quorum I explained in a previous post that a quorum vote is held each time one node becomes unreachable. With this vote, the remaining nodes will estimate whether it is safe to keep on serving queries. If quorum is not reached, all remaining nodes will set themselves in a state where they cannot process any query (even reads). To get the right size for your cluster, the only question you should answer is: how many nodes can simultaneously fail while leaving the cluster operational? If the answer is 1 node, then you need 3 nodes: when 1 node fails, the two remaining nodes have quorum. If the answer is 2 nodes, then you need 5 nodes. If the answer is 3 nodes, then you need 7 nodes. And so on and so forth. Remember that group communication is not free, so the more nodes in the cluster, the more expensive group communication will be. That’s why it would be a bad idea to have a cluster with 15 nodes for instance. In general we recommend that you talk to us if you think you need more than 10 nodes. What about an even number of nodes? The recommendation above always specifies an odd number of nodes, so is there anything bad with an even number of nodes? Let’s take a 4-node cluster and see what happens if nodes fail: If 1 node fails, 3 nodes are remaining: they have quorum. If 2 nodes fail, 2 nodes are remaining: they no longer have quorum (remember 50% is NOT quorum). Conclusion: availability of a 4-node cluster is no better than the availability of a 3-node cluster, so why bother with a 4th node? The next question is: is a 4-node cluster less available than a 3-node cluster? 
Many people think so, specifically after reading this sentence from the manual: Clusters that have an even number of nodes risk split-brain conditions. Many people read this as “as soon as one node fails, this is a split-brain condition and the whole cluster stops working”. This is not correct! In a 4-node cluster, you can lose 1 node without any problem, exactly like in a 3-node cluster. This is not better but not worse. By the way the manual is not wrong! The sentence makes sense with its context. There could actually be reasons why you might want to have an even number of nodes, but we will discuss that topic in the next section. Quorum with multiple data centers To provide more availability, spreading nodes in several datacenters is a common practice: if power fails in one DC, nodes are available elsewhere. The typical implementation is 3 nodes in 2 DCs: Notice that while this setup can handle any single node failure, it can’t handle all single DC failures: if we lose DC1, 2 nodes leave the cluster and the remaining node does not have quorum. You can try with 4, 5 or any number of nodes and it will be easy to convince yourself that in all cases, losing one DC can make the whole cluster stop operating. If you want to be resilient to a single DC failure, you must have 3 DCs, for instance like this: Other considerations Sometimes other factors will make you choose a higher number of nodes. For instance, look at these requirements: All traffic is directed to a single node. The application should be able to fail over to another node in the same datacenter if possible. The cluster must keep operating even if one datacenter fails. The following architecture is an option (and yes, it has an even number of nodes!): Conclusion Regarding availability, it is easy to estimate the number of nodes you need for your PXC cluster. But node failures are not the only aspect to consider: Resilience to a datacenter failure can, for instance, influence the number of nodes you will be using.
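The quorum arithmetic in this post can be sketched in a few lines of Python. This is only an illustration under two assumptions that are not part of the original article: every node carries the same vote weight, and the failures happen simultaneously. The function names are hypothetical and are not part of any PXC tooling.

```python
def has_quorum(total_nodes: int, failed_nodes: int) -> bool:
    """True if the surviving nodes are a strict majority (exactly 50% is NOT quorum)."""
    remaining = total_nodes - failed_nodes
    return 2 * remaining > total_nodes


def nodes_needed(tolerated_failures: int) -> int:
    """Smallest cluster that keeps quorum after this many simultaneous failures."""
    return 2 * tolerated_failures + 1


def max_tolerated_failures(total_nodes: int) -> int:
    """Largest number of simultaneous failures that still leaves quorum."""
    return (total_nodes - 1) // 2


if __name__ == "__main__":
    for failures in (1, 2, 3):
        print(failures, "failure(s) ->", nodes_needed(failures), "nodes")
    # A 4-node cluster tolerates the same number of failures as a 3-node one.
    print(max_tolerated_failures(3), max_tolerated_failures(4))
```

Under these assumptions a 3-node and a 4-node cluster both tolerate exactly one simultaneous failure, which matches the conclusion above that a fourth node adds no availability on its own.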
June 24, 2015
by Peter Zaitsev
· 1,390 Views
Git for Windows, Getting Invalid Username or Password with Wincred
If you use HTTPS to communicate with your Git repository, e.g. GitHub or VisualStudioOnline, you usually set up a credential manager to avoid entering credentials for each command that contacts the server. With the latest versions of Git you can configure wincred with this simple command: git config --global credential.helper wincred This morning I started getting an error while trying to push some commits to GitHub. $ git push remote: Invalid username or password. fatal: Authentication failed for 'https://github.com/proximosrl/jarvis.documentstore.git/' If I remove the credential helper (git config --global --unset credential.helper) everything works: Git asks me for user name and password and I’m able to do everything, but as soon as I re-enable the credential helper, the error returns. This problem is probably caused by some corruption of the stored credentials, and usually you can simply clear the stored credentials; at the next operation you will be prompted for credentials and everything starts working again. The question is, where are the stored credentials for wincred? If you use wincred for credential.helper, Git stores your credentials in the standard Windows Credential Manager. You can simply open Credential Manager on your computer. Figure 1: Credential Manager in your Control Panel settings Opening Credential Manager, you can manage Windows and web credentials. Now simply have a look at both Web Credentials and Windows Credentials, and delete everything related to GitHub or the server you are using. The next time you issue a Git command that requires authentication, you will be prompted for credentials again and the credentials will be stored again in the store. Gian Maria.
June 23, 2015
by Ricci Gian Maria
· 20,958 Views
It's Time to Start Programming (for) Adults
This week we're in Boston at DevNation, an awesome, young (second ever), and relatively intimate (~500 attendees) conference on anything and everything hard-core, cool-and-hot (DevOps, big data, Angular, IoT, you name it), and of course -- since the conference is organized by Red Hat -- totally open-source. So far I've had in-depth conversations with five super-amazing engineers, attended several inspiring keynotes, and chatted with one skilled developer after another. We'll transcribe the deeper interviews shortly, including some on topics totally unrelated to this post. But meanwhile I'd like to offer some thoughts inspired by the first day of the event. The general theme is: we're just beginning to get serious about separation of concerns. The metaphor that keeps popping into my head comes from the first keynote: machines have finally grown up. Imperatives: telling really unintelligent agents what to do (and then they sort of do whatever they please) It is trivial to observe that computers are incredibly stupid. Turing's fundamental paper is about how to figure out whether a theoretical computer will keep calculating the values of a function until the heat death of the universe (okay that's a slight oversimplification). The fact that Edsger Dijkstra felt the need to rail gently against all goto statements in any higher-level language than machine code suggests that, in 1968, far too many computers needed instructions about how to read the instructions that tell them what to do in the first place. Richard Feynman's famous lecture on computer heuristics is the condescension of the man who conceived quantum computing to the level of functional composition (hmmm) and file systems (double sigh). Stupid agents need to be told exactly what to do. Then they need to be told to pay attention to the exact part of the command that tells them exactly what they have been told to do (dude, just goto line 1343 already and shut up). 
Then they don't do what you told them (optimistically we call this an 'exception'), and then you send them into time out / set a break point and try to figure out where the idiot state muted off the rails. They stare blankly at the wall / variable / register and either do nothing or repeat another unintelligibly wrong result until you notice that your increment is (apparently meaninglessly to you) one bracket too deep. You sigh and tell them what to do again, and after a while they hit age thirty (life-years/debug-hours) and maybe do something useful with their (process-)lives. Well, maybe I'm straining the metaphor a little here, but you get the point because it cuts too close to home. We spend far too much time fixing stupid mistakes that we didn't even know we were making because -- like all actual human beings -- we assumed that the agent we commanded will use their common sense to iron out those few whiffs of, admit it, frank nonsense that our step-by-step instructions will probably always contain. So, at least, goes the imperative programming paradigm. The machine does what you tell it to; and the universe collapses onto itself before the last real number is computed. Functions: reliable, predictable adults Time to give credit where it's due: I'm really just riffing on the metaphor Venkat Subramaniam offered in his highly enjoyable keynote on The Joy of Functional Programming yesterday morning. His not-so-smart agents -- the 'programmed' of imperative programming -- were toddlers. Since I don't have any kids, I can't presume to understand this experience fully (although I did grow up with three younger brothers). But the general idea is: imperative programming is tricky because, when you spell everything out super literally, it's very hard to tell exactly why what you thought should happen didn't. Venkat's talk was a whirlwind of functional concepts, from the thrill of immutability to the self-evident utility of memoization. For random (Myers-Briggs?) 
reasons, the object-oriented paradigm never seemed very intuitive to me -- I've gravitated towards functional style even when the problem domain wasn't actually modeled very well by functions -- but Venkat's side-by-side implementations of simple calculations in OO and functional Java showed the readability delta very clearly. Functional code is beautiful because it looks like its purpose. It tells you flat-out: here is what I do; and then it does it. But immutable functions are also beautiful because they do exactly the same thing every time. I couldn't count on my two year old brother very much at all because given a certain input I had pretty much no idea what would come out. But we all count on our grown-up collaborators to output exactly what they should, given a definite input, predictably and reliably every time. Of course, people also do more than expected -- every intervention of intelligence is an injection of creativity, not generated by the definition of the function -- but at least they do what you need them to do and no less. Containers: grown-ups with good boundaries I'm picking out just one aspect of the resurgent 'joy' of functional programming because the renaissance of containerization (another 'old' technology that is just now really taking off) is, I think, a part of the same shift toward, let's say, treating computers as adults. If functions are reliable agents, then applications in well-defined containers are self-sufficient agents who know exactly what they need from others and neither require nor demand anything more. 
If apps on dedicated VMs are teenagers negotiating personal boundaries by waking/booting up independently (and taking far too long -- and far too many resources -- to do so, given their meager output) -- or bubble boys, isolated in ways that are unfortunate in order to isolate in ways that are absolutely necessary -- then containerized applications are subway-riders who jam into the train without offending anyone or campers who can live anywhere with just a backpack of just the stuff they need. Of course, subway-riders and campers do more than just not-mess-up. But what's kind of neat about containers is that -- like an adult with good boundaries -- clearly defined bounds and interfaces free up the application / mind to do whatever world-changing thing the developer / human has cooked up. I'll come back to this metaphor in a later article. (Mesh networks, SDN, and ad-hoc computing are all part of the same picture, I think. Kubernetes probably is too, along with event-driven and reactive programming, the actor model, dreams of Smalltalk, and of course REST, at least of the HATEOAS flavor.) But maybe this isn't a good way to think about some of these recent sparks in devworld within a single paradigm -- and maybe my perpetual discomfort with OO is influencing me too much. What do you think?
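As a small illustration of the "same input, same output" point and of memoization, here is a minimal Python sketch. It is not taken from the keynote (which used Java); the function is a stock example chosen only because it is short.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Pure function: the result depends only on n, so caching results is safe."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    print(fib(30))           # same argument, same answer, every time
    print(fib.cache_info())  # each distinct n was computed only once
```

Because `fib` reads and writes no outside state, remembering earlier results cannot change its behavior; memoizing a function with side effects would offer no such guarantee.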
June 23, 2015
by John Esposito
· 1,935 Views · 1 Like
FusionExperience announces successful partnership with Cloud Consulting
London, UK – FusionExperience, the business and data solutions provider, today announces the success of its first salesforce.com partnership with Cloud Consulting Ltd. (CCL). CCL was working with an international airline client to migrate a legacy charter and group booking application from one Salesforce.com instance to a new one. Very early on in the project CCL discovered that there were considerable elements of unsupported custom code and that these had to be redesigned and redeveloped. The airline took the opportunity at this stage to request changes and improve the application in line with their new business processes. CCL worked with FusionExperience to migrate the application to the latest salesforce.com environment and re-architected the booking engine functionality and complex pricing algorithms using Apex and VisualForce. For business reasons the airline had a strict project deadline and despite all the unknowns involved the project timescales were maintained and FusionExperience delivered on time and to budget. The airline went live with the application on schedule without any post-production problems or warranty fixes required. They now have an up to date system that has achieved a game changing transformation in the way it does business. Robin James, Platform Evangelist for FusionExperience said: “The ability to seamlessly work with our partners on salesforce.com projects enables rapid scaling of resources and capabilities. This ensures that the client is delighted by the results, yet unaware of the complex extended ecosystem that has been involved. This is facilitated by the fact that we all speak the same salesforce.com language. 
Cloud Consulting is an ideal partner to work with in this way, as our delivery and technical strengths are well matched with their intimate client facing approach.” Tim Pullen, Managing Director of CCL added: “We already had a close relationship with FusionExperience and it was natural for us to turn to them for help with this suddenly extremely challenging project. The combination of cleaning, segmenting and splitting the data in Salesforce.com, extracting the system configuration and custom code and then creating a new system was tough enough to start but then having to redevelop the application from scratch took it to a new level. Right from the start Robin James and his team took everything in their stride and provided a level of comfort, reassurance, skill and professionalism that we’d never experienced before from other partners. Bear in mind that the old system had no user or technical documentation plus undocumented code and you begin to understand just how good the end result has been for the airline. Thank you Fusion!”
June 22, 2015
by Fran Cator
· 835 Views
Devnation Keynote 6/22 #2: The Future of Development with Kubernetes and Docker
From the DevNation Agenda site: You've probably heard a lot about Linux containers and the exciting potential they hold. In this presentation, Matt Hicks will cover how Docker and Kubernetes have evolved to fundamentally change how you will approach development and operations. If you are looking for an understanding of the technology and how it relates to the common roles in IT today, this is the talk to watch. Speaker: Matt Hicks -- Vice President of engineering, Red Hat Matt Hicks is a founding member of the OpenShift by Red Hat team. He has spent more than a decade in software engineering, with a variety of roles in development, operations, architecture, and management. His real expertise is in bridging the gap between developing code and actually running it in production. An expert in IT and cloud-based architectures, he spends his time these days evolving OpenShift to use the power of cloud and make developers more productive.
June 22, 2015
by N A
· 1,088 Views · 2 Likes
The Coming Donkey Apocalypse: DevOps Days Austin
After a bit of a gap I’m continuing my series from DevOps Days Austin. After Damon Edwards kicked off the event, Michael Cote of Pivotal took the stage. Cote presented “The coming donkey apocalypse: what happens when Devops goes mainstream.” Take a listen (you can find his slides below): Some of the ground that Cote covers: What DevOps as a community needs to focus on next to expand Unicorns (e.g. Uber and Netflix), Horses (e.g. top banks) and Donkeys (mainstream organizations) 3 key areas of DevOps to focus on today Culture and process Supporting legacy code Tools and technology Interviews on tap: Cameron Haight – Gartner John Willis – Docker Paul Read – Release Engineering Approaches Extra Credit reading Here’s how we can push DevOps mainstream – Cote Opening Keynote – Damon Edwards Pau for now…
June 21, 2015
by Barton George
· 851 Views
Long-Term Log Analysis with AWS Redshift
You will aggregate a lot of logs over the lifetime of your product and codebase, so it’s important to be able to search through them. In the rare case of a security issue, not having that capability is incredibly painful. You might be able to use services that allow you to search through the logs of the last two weeks quickly. But what if you want to search through the last six months, a year, or even further? That availability can be rather expensive or not even an option at all with existing services. Many hosted log services provide S3 archival support which we can use to build a long-term log analysis infrastructure with AWS Redshift. Recently I’ve set up scripts to be able to create that infrastructure whenever we need it at Codeship. AWS Redshift AWS Redshift is a data warehousing solution by AWS. It has an easy clustering and ingestion mechanism ideal for loading large log files and then searching through them with SQL. As it automatically balances your log files across several machines, you can easily scale up if you need more speed. As I said earlier, looking through large amounts of log files is a relatively rare occasion; you don’t need this infrastructure to be around all the time, which makes it a perfect use case for AWS. Setting Up Your Log Analysis Let’s walk through the scripts that drive our long-term log analysis infrastructure. You can check them out in the flomotlik/redshift-logging GitHub repository. I’ll take you step by step through configuring the whole setup of the environment variables needed, as well as starting the creation of the cluster and searching the logs. But first, let’s get a high-level overview of what the setup script is doing before going into all the different options that you can set: Creates an AWS Redshift cluster. You can configure the number of servers and which server type should be used. Waits for the cluster to become ready. Creates a SQL table inside the Redshift cluster to load the log files into. 
Ingests all log files into the Redshift cluster from AWS S3. Cleans up the database and prints the psql access command to connect into the cluster. Be sure to check out the script on GitHub before we go into all the different options that you can set through the .env file. Options to set The following is a list of all the options available to you. You can simply copy the .env.template file to .env and then fill in all the options to get picked up. AWS_ACCESS_KEY_ID AWS key of the account that should run the Redshift cluster. AWS_SECRET_ACCESS_KEY AWS secret key of the account that should run the Redshift cluster. AWS_REGION=us-east-1 AWS region the cluster should run in, default us-east-1. Make sure to use the same region that is used for archiving your logs to S3 to have them close. REDSHIFT_USERNAME Username to connect with psql into the cluster. REDSHIFT_PASSWORD Password to connect with psql into the cluster. S3_AWS_ACCESS_KEY_ID AWS key that has access to the S3 bucket you want to pull your logs from. We run the log analysis cluster in our AWS Sandbox account but pull the logs from our production AWS account so the Redshift cluster doesn’t impact production in any way. S3_AWS_SECRET_ACCESS_KEY AWS secret key that has access to the S3 bucket you want to pull your logs from. PORT=5439 Port to connect to with psql. CLUSTER_TYPE=single-node The cluster type can be single-node or multi-node. Multi-node clusters get auto-balanced which gives you more speed at a higher cost. NODE_TYPE Instance type that’s used for the nodes of the cluster. Check out the Redshift Documentation for details on the instance types and their differences. NUMBER_OF_NODES=10 Number of nodes when running in multi-node mode. CLUSTER_IDENTIFIER=log-analysis DB_NAME=log-analysis S3_PATH=s3://your_s3_bucket/papertrail/logs/862693/dt=2015 Database format and failed loads When ingesting log statements into the cluster, make sure to check the amount of failed loads that are happening. 
You might have to edit the database format to fit to your specific log output style. You can debug this easily by creating a single-node cluster first that only loads a small subset of your logs and is very fast as a result. Make sure to have none or nearly no failed loads before you extend to the whole cluster. In case there are issues, check out the documentation of the copy command which loads your logs into the database and the parameters in the setup script for that. Example and benchmarks It’s a quick thing to set up the whole cluster and run example queries against it. For example, I’ll load all of our logs of the last nine months into a Redshift cluster and run several queries against it. I haven’t spent any time on optimizing the table, but you could definitely gain some more speed out of the whole system if necessary. It’s just fast enough already for us out of the box. As you can see here, loading all logs of May (more than 600 million log lines) took only 12 minutes on a cluster of 10 machines. We could easily load more than one month into that 10-machine cluster since there’s more than enough storage available, but for this post, one month is enough. After that, we’re able to search through the history of all of our applications and past servers through SQL. We connect with our psql client and send off SQL queries against the “events” table. For example, what if we want to know how many build servers reported logs in May: loganalysis=# select count(distinct(source_name)) from events where source_name LIKE 'i-%'; count ------- 801 (1 row) So in May, we had 801 EC2 build servers running for our customers. That query took ~3 seconds to finish. 
Or let’s say we want to know how many people accessed the configuration page of our main repository (the project ID is hidden with XXXX): loganalysis=# select count(*) from events where source_name = 'mothership' and program LIKE 'app/web%' and message LIKE 'method=GET path=/projects/XXXX/configure_tests%'; count ------- 15 (1 row) So now we know that there were 15 accesses on that configuration page throughout May. We can also get all the details, including who accessed it when through our logs. This could help in case of any security issues we’d need to look into. The query took about 40 seconds to go through all of our logs, but it could be optimized on Redshift even more. Those are just some of the queries you could use to look through your logs, gaining more insight into your customers’ use of your system. And you get all of that with a setup that costs $2.50 an hour, can be shut down immediately, and recreated any time you need access to that data again. Conclusions Being able to search through and learn from your history is incredibly important for building a large infrastructure. You need to be able to look into your history easily, especially when it comes to security issues. With AWS Redshift, you have a great tool in hand that allows you to start an ad hoc analytics infrastructure that’s fast and cheap for short-term reviews. Of course, Redshift can do a lot more as well. Let us know what your processes and tools around logging, storage, and search are in the comments.
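Before starting a cluster it can help to check that the .env file described earlier is complete. The following is a rough Python sketch, not part of the redshift-logging scripts: the split into "required" options and options with defaults is an assumption based on which entries in the list above carry a value, and the function names are made up for illustration.

```python
# Options shown without a value in the list above (assumed to be required).
REQUIRED = (
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
    "REDSHIFT_USERNAME", "REDSHIFT_PASSWORD",
    "S3_AWS_ACCESS_KEY_ID", "S3_AWS_SECRET_ACCESS_KEY",
    "NODE_TYPE", "S3_PATH",
)

# Options shown with a value in the list above (assumed to be defaults).
DEFAULTS = {
    "AWS_REGION": "us-east-1",
    "PORT": "5439",
    "CLUSTER_TYPE": "single-node",
    "NUMBER_OF_NODES": "10",
    "CLUSTER_IDENTIFIER": "log-analysis",
    "DB_NAME": "log-analysis",
}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and comments, on top of the defaults."""
    options = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def problems(options: dict) -> list:
    """Return human-readable issues; an empty list means the config looks usable."""
    issues = [f"missing {key}" for key in REQUIRED if not options.get(key)]
    if options["CLUSTER_TYPE"] not in ("single-node", "multi-node"):
        issues.append("CLUSTER_TYPE must be single-node or multi-node")
    return issues


if __name__ == "__main__":
    sample = "AWS_ACCESS_KEY_ID=abc\n# comment\nCLUSTER_TYPE=multi-node\n"
    for issue in problems(parse_env(sample)):
        print(issue)
```

Running the check on a small single-node configuration first fits the advice above about debugging failed loads on a subset of the logs before scaling out.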
June 21, 2015
by Florian Motlik
· 1,449 Views
Spring XD 1.2 GA, Spring XD 1.1.3 and Flo for Spring XD Beta Released
Written by Mark Pollack. Today, we are pleased to announce the general availability of Spring XD 1.2, Spring XD 1.1.3 and the release of Flo for Spring XD Beta. 1.2.0.GA: zip 1.1.3.RELEASE: zip Flo for Spring XD Beta You can also install XD 1.2 using brew and rpm. The 1.2 release includes a wide range of new features and improvements. The release journey was an eventful one, mainly due to Spring XD’s popularity with so many different groups, each with their respective request priorities. However the Spring XD team rose to the challenge and it is rewarding to look back and review the amount of innovation delivered to meet our commitments toward simplifying big data complexity. Here is a summary of what we have been busy with for the last 3 months and the value created for the community and our customers. Flo for Spring XD and UI improvements Flo for Spring XD is an HTML5 canvas application that runs on top of the Spring XD runtime, offering a graphical interface for creating, managing, and monitoring streaming data pipelines. Here is a short screencast showing you how to build an advanced stream definition. You can browse the documentation for additional information and links to additional screencasts of Flo in action. The XD admin screen also includes a new Analytics section that allows you to easily view gauges, counters, field-value counters and aggregate counters. Performance Improvements Anticipating increased high-throughput and low-latency IoT requirements, we’ve made several performance optimizations within the underlying message-bus implementation to deliver several million messages per second transported between Spring XD containers using Kafka as a transport. With these optimizations, we are now on par with the performance from Kafka’s own testing tools. However, we are using the more feature rich Spring Integration Kafka client instead of Kafka’s high level consumer library. 
For anyone who is interested in reproducing these numbers, please refer to the XD benchmarking blog, which describes the tests performed and infrastructure used in detail. Apache Ambari and Pivotal HD To help automate the deployment of Spring XD on an Apache HadoopⓇ cluster, we added an Apache AmbariⓇ plugin for Spring XD. The plugin is supported on both Pivotal HD 3.0 and Hortonworks HDP 2.2 distributions. We also added support in Spring XD for Pivotal HD 3.0, bringing the total number of Hadoop versions supported to five. New Sources, Processors, Sinks, and Batch Jobs One of Spring XD’s biggest value propositions is its complete set of out-of-the-box data connectivity adapters that can be used to create real-time and batch-based data pipelines, and these require little to no user-code for common use-cases. With the help of community contributions, we now have MongoDB, VideCap, and FTP as source modules, an XSLT-transformer processor, and FTP sink module. The XD team also developed a Cassandra sink and a language-detection processor. Recognizing the important role in the Pivotal Big Data portfolio, we have also added native integration with Pivotal Greenplum Database and Pivotal HAWQ through gpfdist sink for real-time streaming and also support for gpload based batch jobs. Adding to our developer productivity theme and the use of Spring XD in production for high-volume data ingest use-cases, we are delighted to recognize Simon Tao and Yu Cao (EMC² Office of The CTO & Labs China), who have been operationalizing Spring XD data pipelines in production since 2014 and also for the VideCap source module contribution. Their use-case and implementation specifics (in their own words) are below. “There are significant demands to extract insights from large magnitude of unstructured video streams for the video surveillance industry. Prior to being analyzed by data scientists, the video surveillance data needs to be ingested in the first place. 
To tackle this challenge, we built a highly scalable and extensible video-data ingestion platform using Spring XD. This platform is operationally ready to ingest different kinds of video sources into a centralized Big Data Lake. Given the out-of-the-box features within Spring XD, the platform is designed to allow rich video content processing capabilities such as video transcoding and object detection, etc. The platform also supports various types of video sources—data processors and data exporting destinations (e.g. HDFS, Gemfire XD and Spark)—which are built as custom modules in Spring XD and are highly reusable and composable. With a declarative DSL, a video ingestion stream will be handled by a video ingestion pipeline defined as Directed Acyclic Graph of modules. The pipeline is designed to be deployed in a clustered environment with upstream modules transferring data to downstream ones efficiently via the message bus. The Spring-XD distributed runtime allows each module in the pipeline to have multiple instances that run in parallel on different nodes. By scaling out horizontally, our system is capable of supporting large scale video surveillance deployment with high volume of video data and complex data processing workloads.” Custom Module Registry and HA Support Though we have had the flexibility to configure shared network location for distributed availability of custom modules (via: xd.customModule.home), we also recognized the importance of having the module-registry resilient under failure scenarios—hence, we have an HDFS backed module registry. Having this setup for production deployment provides consistent availability of custom module bits and the flexibility of choices, as needed by the business requirements. 
Pivotal Cloud Foundry Integration

Furthering the Pivotal Cloud Foundry integration efforts, we have made several foundation-level changes to the Spring XD runtime, so we are able to run Spring XD modules as cloud-native apps in Lattice and Diego. We have aggressive roadmap plans to launch Spring XD on Diego proper. While studying Diego’s Receptor API (written in Go!), we created a Java Receptor API, which is now proposed to Cloud Foundry for incubation.

Next Steps

We have some very interesting developments on the horizon. Perhaps most important, we will be launching new projects that focus on message-driven and batch-oriented “data microservices”. These will be built directly on Spring Boot as well as Spring Integration and Spring Batch, respectively. Our main goal is to provide the simplest possible developer experience for creating cloud-native, data-centric microservice apps. In turn, Spring XD 2.0 will be refactored as a layer above those projects, to support the composition of those data microservices into streams and jobs as well as all of the “as a service” aspects that it provides today, but it will have a major focus on deployment to Cloud Foundry and Lattice. We will be posting more on these new projects soon, so stay tuned!

Feedback is very important, so please get in touch with questions and comments via:

  • StackOverflow spring-xd tag
  • Spring JIRA or GitHub Issues

Editor’s Note: ©2015 Pivotal Software, Inc. All rights reserved. Pivotal, Pivotal HD, Pivotal Greenplum Database, Pivotal Gemfire and Pivotal Cloud Foundry are trademarks and/or registered trademarks of Pivotal Software, Inc. in the United States and/or other countries. Apache, Apache Hadoop, Hadoop and Apache Ambari are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
June 21, 2015
by Pieter Humphrey
· 3,690 Views
How New Relic Introduced US Foods to DevOps
At last week’s New Relic User Group meetup in Chicago, David M. Kent, Senior Director, Technical Architecture, of giant food distributor US Foods told the “story of how New Relic introduced us to the concept of DevOps.” It’s a fascinating tale of transforming technology at a 162-year-old company, but let’s take a moment to put his presentation in context.

David’s talk was the highlight of a rewarding evening for some 50 New Relic users that also featured a spirited round of “Stump the Relics” Q&A as well as our own Sean Carpenter, a New Relic product manager, sharing tips on how to manage complexity with New Relic’s new Service Maps. New Relic has a definite outlook on how to monitor, Sean said: The key is to start by focusing on providing a good experience, not tossing a zillion features into version 1.0. Then you can work on continuous improvement as you learn more about what makes the most difference to actual users.

Step one in US Foods’ quest for DevOps

Formed in 1853 to sell provisions during the California Gold Rush, US Foods launched its first e-commerce site in 1999, and now books more than 50% of its annual e-commerce revenue from more than 107,000 active customers. But after deploying Release 2 of its e-commerce platforms in 2012, the company made the decision to rethink its e-commerce strategy, adopting an agile methodology as it moved toward Release 3 on a completely new architecture stack. (The company is now 40% done moving customers to the new platform, David said, after two years of development and planning.)

For Release 3, the company decided all new servers would be virtual machines, and brought in “agile coaches to teach us how to be agile,” David said. The team worked on automating their build, deployment, and testing processes. The long-term goal?
To create a DevOps model built on a culture of collaboration in which dev, infrastructure, and QA teams work together harmoniously, using advanced tools to automate traditionally manual processes.

Overcoming challenges

The biggest challenge the company faced, David noted, was the mindset of longtime managers on the operations side. As one of his slides noted:

  • Some business apps written 30 years ago are still in production
  • Some managers hired 30 years ago are still in production

“We wanted to refocus on culture and help people understand why DevOps is a good idea [but] they still don’t understand why we needed to go faster… ‘We’ve been doing fine, meeting the business needs for 30 years, why do I need to deploy daily or weekly?’ they’d ask.”

“We’re still trying to figure out how to fix that,” he said. “We are also trying to improve collaboration across teams with more video conferencing and collaboration tools.” But in a big, geographically distributed company (US Foods has its data center in Phoenix while its business development team is located in the Chicago suburbs), he believes travel is still important to “get face time with key people.”

The role of New Relic

US Foods brought in New Relic late in 2012: “We were having problems,” David said, and “we didn’t know how to troubleshoot them efficiently.” Although originally targeted to provide end-to-end transaction monitoring for Release 3, David said, US Foods has found surprising ways to get value out of New Relic in the meantime. “Any time an app was having a performance problem, we said, ‘Hey, let’s put New Relic on it,’ and we can find out what the problem is.”

Broader visibility is another nice thing about New Relic, David said. US Foods now has 121 people with visibility into all of these monitors.
“They get the same performance information that the engineers have!”

Going forward, the company is running a proof-of-concept trial with New Relic Mobile on the latest version of its mobile app, and is very interested in using New Relic Insights to incorporate nontechnical data. They want to track things like how many cases they’re shipping, David said. There’s a role for New Relic Synthetics, too. “Most of our traffic comes during the day,” David said. “Our customers are primarily ordering around noon… there’s no activity at night. But we also want to know if there’s a problem at night before our customers find it, and Synthetics can help with that.”

It was a great presentation and a great night. Big thanks to Sprout Social for hosting everyone in their awesome space!

DevOps Days discount

Finally, for more on DevOps in Chicago, Jerry Cattell told user group attendees that DevOpsDays is returning to Chicago on August 25 and 26. The conference brings development and operations together to solve problems, explore tools, and think big about DevOps concepts and philosophies, and New Relic users can get a 10% discount with the code NEWRELIC10. Also note that the call for proposals is open until June 30 if you want to share something about your company culture, development practices, deployment pipelines, metrics, monitoring, or interesting ways you are using New Relic.

Join our Meetup groups and stay connected

New Relic User Groups are a great opportunity to meet and learn from other New Relic users in your area, hang with the New Relic team, and get answers to lingering questions. If you’re in Chicago, be sure to join our Chicago New Relic User Group Meetup.com page and get notifications for future events. To see all the New Relic User Group events coming up, visit our Events page (just choose “NRUG” from the pull-down menu to see only the User Group events). Interested in speaking at our next event or setting up a New Relic User Group in your city?
Drop us a note at [email protected]. Note: Event dates, speakers, and schedules are subject to change without notice.
June 20, 2015
by Fredric Paul
· 1,417 Views
Why We Need Continuous Integration
Introduction

Continuous integration is a practice that helps developers deliver better software in a more reliable and predictable manner. This article deals with the problems developers face while writing, testing and delivering software to end users. Through exploring continuous integration, we will cover how we can overcome these issues.

The Problem

First, we will take a look at the source of the problem, which lies in the software development cycle. Next, we will cover some of the change conflicts that can take place during that process, and finally we will explore the main factors that can make these problems escalate, followed by an explanation of how continuous integration solves these issues.

The Source of the Problem

Let's take a look at what a traditional software development cycle looks like. Each developer gets a copy of the code from the central repository. The starting point is usually the latest stable version of the application. All developers begin at the same starting point, and work on adding a new feature or fixing a bug. Each developer makes progress by working on their own or in a team. They add or change classes, methods and functions, shaping the code to meet their needs, and eventually they complete the task they were assigned to do. Meanwhile, the other developers and teams continue working on their own tasks, changing the code or adding new code, solving the problems they have been assigned.

If we take a step back and look at the big picture, i.e. the entire project, we can see that all developers working on a project are changing the context for the other developers as they are working on the source code. As teams finish their tasks, they copy their code to the central repository. There are two scenarios that can take place at this point.

The code in the central repository is unchanged

The code is the same as the initial copy. If this is the case, things are simple, because the system is unchanged.
All the ideas we had about the system still stand. This is always the case if you are the only developer working on the application, or if you have finished your work before the other members of your team. Either way, things are looking good for you. The system you have created and tested can be delivered to users without additional changes.

The code in the central repository has changed

The second scenario is that the application you have been working on has changed, and you discover this at the point when you try to copy your code over to the central repository. Changes in the code may or may not be in conflict with the ones you've made. If there are conflicts, you need to resolve them in order to be able to successfully deliver your code to the users. In this case, things could get complicated. Next, we'll explore the types of conflicts that can happen and what you may need to do to resolve them.

Change Conflicts

There are several types of change conflicts that can occur when integrating code. Here are some of the most common ones. We'll start with the simplest scenarios, and gradually explore the more complex ones.

The implementation details have changed - You refactored a method, but so did the developer who has already integrated their code into the central repository. The behavior of the method is the same in all three implementations (the original, yours, and theirs). You will need to pick the version that will stay, and remove the other implementations. You can even come up with a fourth implementation. This is a simple type of conflict, which you can usually resolve within a few minutes.

The APIs you have been relying on have changed - For instance, the behavior of a certain method has changed. This could affect your code in a number of ways, from minor changes that you might need to make to major structural changes. There is no silver bullet in such cases. You will need to carefully study the changes and make all the fixes.
An entire subsystem of the application behaves in a different way - In such cases you will almost certainly be facing a partial, if not a full, rewrite of your solution. If this is the case, you will probably need to speak with all the developers working on the application, because such a significant change should not happen without letting the rest of the team know about it.

These and a number of other issues could come up, caused by various factors. Different versions of frameworks, libraries, and databases are another potential source of conflicts. Once you have updated your code so it can be compiled or interpreted, you also need to remember to repeat all the tests that you have previously run. These examples show that the amount of work needed to solve a problem that was initially assigned to a developer can easily double.

Escalating Factors

Here are some of the main factors that can make these problems escalate.

The size of the team working on the project. The number of changes that are being pushed back into the main repository is proportional to the number of people on the project. This makes the process of integrating code into the main repository significantly harder.

The amount of time passed since the developer got the latest version of the code from the central repository. As time passes, other people working on the same project are integrating more and more of their work, and changing the context in which your code needs to run. Sometimes the changes in the main repository are so big that it's easier to do a complete rewrite of your solution.

A large number of changes in the system makes integration events more complex and can have a huge effect on the productivity of the team. Such situations are even referred to as "integration hell". This process has a number of other negative consequences for your business. Testing and fixing bugs can take forever. Your releases are running late.
Teams are stressed out because of long and unpredictable release cycles, and morale deteriorates.

Solution: Integrate Continuously

The solution to the problem of managing a large number of changes in big integration events is conceptually simple. We need to split these big integration events into much smaller integration events. This way, developers need to deal with a much smaller number of changes, which are easier to understand and manage. To keep integration events small and easily manageable, we need them to happen often. A couple of times a day is ideal. The practice of doing small integrations often is called Continuous Integration. The idea is simple, but at the same time it often appears to be impossible to implement in practice. This is because changing the process requires us to change some of our own habits, and changing habits is difficult.

The Practice of Continuous Integration

In order to avoid the previously described issues, developers need to integrate their partially complete work back into the main repository on a daily basis, or even a couple of times a day. To accomplish this, they first need to pull in all the changes added to the main repository while they were working on the code. They also must make sure that their code will work once it is integrated into the main repository. The only way to ensure this is to test every feature of the application.

What first comes to mind when we start considering continuous integration is that the developers would need to spend half of their time every day testing the code in order not to break the code in the main repository for everyone else. This is why the prerequisite for continuous integration is having an automated test suite. Automated tests take away the burden of the manual, repetitive, and error-prone testing process from the developers. They also make the entire testing process much quicker. A computer can replace hours of manual testing with just minutes of automated testing.
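The integration routine described above (pull in everyone else's changes, run the automated tests, and only then publish your work) can be sketched as a small script. This is a minimal illustration rather than a prescribed workflow: the step list, the `integrate` function, and the choice of `git pull --rebase` and `python -m unittest discover` are assumptions made for the example, and a real project would substitute its own commands.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command and returns its exit code (0 means success).
Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command in the current directory and return its exit code."""
    return subprocess.run(list(cmd)).returncode


# Hypothetical integration steps, executed in order.
INTEGRATION_STEPS = [
    ("pull latest changes", ["git", "pull", "--rebase"]),
    ("run automated tests", ["python", "-m", "unittest", "discover"]),
    ("publish the integrated work", ["git", "push"]),
]


def integrate(run: Runner = run_command) -> List[str]:
    """Run each step in order and stop at the first failure.

    Returns the names of the steps that completed, so the caller can see
    how far the integration got. Nothing is pushed unless the tests pass.
    """
    completed = []
    for name, cmd in INTEGRATION_STEPS:
        if run(cmd) != 0:
            break
        completed.append(name)
    return completed
```

Because the command runner is passed in as a parameter, the ordering logic can be exercised without touching a real repository, for example with a stand-in runner that reports a test failure.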
Behavior-driven and test-driven development are techniques that help developers write clean, maintainable code while writing tests at the same time. Testing techniques are out of the scope of this article, and you can read more about them in other articles on Semaphore Community.

Tests make sense only if they are executed every time the source code changes, without exception. A continuous integration service such as Semaphore CI is a tool which can automate this process by monitoring the central code repository and running tests on every change in the source code. Apart from running tests, such services also collect test results and communicate those results to the entire team working on the project. The result of continuous integration is so important that many teams have a rule to stop working on their current task if the version in the central repository is broken. They join the team which is working on fixing the code until tests are passing again. The role of a continuous integration service is to improve the communication between developers by communicating the status of a project's source code.

How to Adopt Continuous Integration

Continuous integration as a practice makes a big contribution to improving the development process, but also calls for essential changes in the everyday development routine. Adopting it comes with challenges that are easy to overcome if the process is introduced gradually. One of the biggest challenges teams face is the lack of an automated test suite. A good recipe for overcoming this situation is to start adding automated tests for all new features as they are being developed. At the same time, the developer working on a bug fix should also work to cover the related code with tests. Whenever a bug is reported, the team should first write a failing test to demonstrate the existence of the bug. Once the fix is created, the tests should pass.
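As a small illustration of the "failing test first" recipe, consider a hypothetical bug report saying that an order total ignores item quantities. The function name `order_total`, the data shape, and the bug itself are invented for this sketch. The test below would have failed against the buggy version (which summed prices only) and passes once the fix is in place, after which it stays in the suite as a regression guard.

```python
import unittest


def order_total(items):
    """Return the total for a list of (price, quantity) pairs.

    Hypothetical history: the buggy version returned
    sum(price for price, _ in items), ignoring the quantity.
    """
    return sum(price * quantity for price, quantity in items)


class OrderTotalRegressionTest(unittest.TestCase):
    def test_quantity_is_included_in_total(self):
        # Written first to reproduce the reported bug: three items at 5 each.
        self.assertEqual(order_total([(5, 3)]), 15)

    def test_empty_order_is_zero(self):
        self.assertEqual(order_total([]), 0)
```

Saved in a test module, this can be run with `python -m unittest`, which is also the kind of command a continuous integration service would execute on every change.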
Over time, the automated test suite gradually becomes more comprehensive, and the developers begin relying on it more and more. Adopting a continuous integration service to communicate the status of the tests to the entire team in the early stages of a project is also important, because it raises awareness of the project status among team members.

Conclusion

Introducing continuous integration and automated testing into the development process changes the way software is developed from the ground up. It requires effort from all team members, and a cultural shift in the organization. Big changes in the workflow are not easy to pull off quickly. Changes have to be introduced gradually, and all team members and stakeholders need to be on board with the idea. Educating team members about the practice of continuous integration and building the automated test suite need to be done systematically. Once the first steps have been taken, the process usually continues on its own, as both developers and stakeholders begin seeing the benefits of automated test suites and the peace of mind that this practice brings to the entire team.

Article originally posted on the Semaphore Community.
June 20, 2015
by Darko Fabijan
· 1,195 Views
Building Microservices: Using an API Gateway
Learn about using the microservice architecture pattern to build microservices and API gateways--compared to the usage of monolithic application architecture.
June 16, 2015
by Patrick Nommensen
· 121,121 Views · 40 Likes
Why 12 Factor Application Patterns, Microservices and CloudFoundry Matter (Part 2)
Learn why 12 Factor Application Patterns, Microservices and CloudFoundry matter when trying to change the way your product is produced.
June 12, 2015
by Tim Spann
· 15,647 Views · 4 Likes