Using the Cloud as a Sandbox for Server Consolidation
This article discusses how to use the Cloud to streamline the design and planning of on-premise server consolidation projects.
A server consolidation project is a significant undertaking, and IT teams will need to answer several weighty questions while planning for one. Can servers be consolidated more efficiently? If so, what is the cost and timeline for doing so? How can we adjust parameters like CPU, memory and storage if our estimates turn out to be incorrect? How will we handle R&D and testing for the new systems? And, most importantly, if we answer any of these questions incorrectly, how much will that impact the timeline and budget for the project?
It’s possible to answer these questions by using the Cloud as a sandbox environment to test design assumptions and validate ways of re-organizing and consolidating servers. Ultimately the servers will remain hosted on-premise; the cloud is used only as a placeholder during the planning and design phase of the project. Here's how I recommend approaching the project.
Move Existing Resources to the Cloud Unchanged
I suggest a “lift and shift” that creates a clone of the final target system in the Cloud without re-engineering any components into cloud-native equivalents: the same number of LPARs, the same memory/disk/CPU allocations, the same file system structures, and the exact same IP addresses, hostnames, and network subnets. IT can now use cloud benefits like fast creation of duplicates, ephemeral lifetimes, software-defined networking, and API automation.
IT can reuse existing on-prem assets as the foundation for cloud components to speed this process along. IT can use tools like “alt_disk_copy” and “savevg” to take snapshots of root and data volume groups and move them to LPARs running in the cloud. They can also use existing on-prem mksysb images to build LPARs in the cloud.
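As a rough sketch of that reuse path, the following shell fragment shows how an AIX rootvg snapshot and mksysb image might be captured for transfer to a cloud staging host. The spare disk name (hdisk1), the staging hostname, and the paths are illustrative assumptions, not fixed names:

```shell
#!/bin/sh
# Sketch: capture the running AIX LPAR so it can be rebuilt as a cloud LPAR.
# hdisk1, /backup, and cloud-staging are assumptions for illustration.
HOST=$(hostname)
BACKUP="/backup/${HOST}.mksysb"

if command -v alt_disk_copy >/dev/null 2>&1; then
    # Clone the live rootvg onto a spare disk (leaves the source untouched).
    alt_disk_copy -d hdisk1
    # Build a bootable rootvg image, regenerating image.data first (-i).
    mksysb -i "$BACKUP"
    # Ship the image to the staging host that restores it into a cloud LPAR.
    scp "$BACKUP" cloud-staging:/images/
else
    # Not on AIX: just report where the image would be written.
    echo "alt_disk_copy not found; intended image path: $BACKUP"
fi
```

On a non-AIX machine the script only reports the intended image path; on the source LPAR it would produce the bootable image the cloud side restores from.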
Create Templates and Clone Environments
After moving the on-prem assets to be tested into the cloud, IT should assemble all the LPARs representing an “Environment” and save them as a single object called a Template. They can then create exact copies of the Template, identical down to hostnames, IP addresses, subnets, and disk allocations. They’ll need to implement some form of isolation to prevent collisions across duplicate environments: each environment must exist within its own software-defined networking space, invisible to other environments that are running at the same time. Using this mechanism, it is possible to create exact clones of multi-VM architectures with multiple subnets containing replicated address spaces. Each environment becomes a virtual private data center.
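There is no standard format for such a Template; purely as an illustration, a descriptor for one environment might capture something like the following (every field name and value here is hypothetical, not any provider’s real schema):

```yaml
# Hypothetical "environment template" descriptor -- illustrative only.
template: erp-consolidation-env
network:
  vpc: isolated            # each clone gets its own software-defined network
  subnets: [192.168.10.0/24, 192.168.20.0/24]
lpars:
  - hostname: db01
    ip: 192.168.10.5
    cpu: 4
    memory_gb: 64
    disks:
      - {name: rootvg, size_gb: 100}
      - {name: datavg, size_gb: 500}
  - hostname: app01
    ip: 192.168.20.7
    cpu: 2
    memory_gb: 32
```

Because the whole descriptor travels as one object, cloning it yields a complete environment with the duplicated hostnames and addresses intact.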
Now IT needs a way for identical network environments to coexist without breaking basic network constructs. One method for doing this is to have cloned environments communicate back to upstream on-prem resources via a single focal point called an “environment virtual router” (VR). The VR becomes the “jump-host” that allows operations like SSH into each unique environment. From on-prem, users first SSH to the jump-host, which exposes a unique IP address to on-prem, and then relays down to the VM within an individual environment. The VR also hides the lower VMs containing duplicate host names and IP addresses. This prevents individual hosts from needing to go through a risky and time-consuming “re-IP” process.
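In OpenSSH terms, the relay pattern described above maps onto a ProxyJump entry in a user’s ~/.ssh/config on the on-prem side; the hostnames and addresses below are assumptions:

```
# On-prem workstation ~/.ssh/config (illustrative names and addresses).
Host env1-vr
    HostName 10.50.1.10       # unique on-prem-facing IP of environment 1's VR

Host env1-db01
    HostName 192.168.10.5     # duplicated address, valid only inside env 1
    ProxyJump env1-vr         # relay through the environment's virtual router
```

With this in place, `ssh env1-db01` first connects to the VR and then hops to the duplicated host; an identically addressed VM in environment 2 is reached through a second pair of entries pointing at that environment’s VR.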
Allocate Templates to Users Based on Roles
Once Templates are created, they can be made available to users (typically engineering, development, and QA testing teams for a server consolidation project). Each of these teams can then work on their portion of the project in parallel on an exact duplicate of the target system.
Each cloud provider has a different mechanism for assigning users to roles and restricting their operations based on those roles. For example, developers don’t need access to components that are part of production, and ENG might not need visibility into the core QA environments.
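The exact syntax is provider-specific, but conceptually the assignment is a simple role-to-environment matrix. A hypothetical sketch (not any provider’s real policy format):

```yaml
# Hypothetical role-to-environment access map -- illustrative only.
roles:
  engineering:
    environments: [eng-int-1, eng-int-2]
    actions: [start, stop, clone]
  development:
    environments: [dev-sandbox-*]
    actions: [start, stop, clone, snapshot]
  qa:
    environments: [qa-core-*]
    actions: [start, stop, reset]
```

Whatever the provider’s mechanism, the goal is the same: each team sees and operates only on its own copies of the target system.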
Test and Reset
Once an application environment has been used extensively for testing the design of the new, consolidated servers, the test data may become stale, or automated test cases may have created data that must be reset or removed before the next test run can be executed. Another variation is that multiple, slightly different test datasets are needed to validate all configurations of the target system. Traditional enterprise thinking would use scripts and possibly other automation techniques to delete database data, reset configuration files, and remove log files, an approach that is risky and time-consuming.
This cloud sandbox method allows for a new way of resetting data. IT can save complete, ready-to-use VMs along with their network topology and data into an “image template.” If a database becomes corrupt or heavily modified from previous testing, instead of resetting the data via scripts, the cloud model replaces the entire database VM along with all of its data. This often takes just seconds or minutes versus hours or days. Imagine getting a complex multi-VM/LPAR environment back in minutes instead of having to build it all over again!
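To make the reset flow concrete, here is a shell sketch using a hypothetical cloud CLI. The `envctl` command is invented for illustration (a stub below keeps the sketch runnable); substitute whatever CLI or API your provider offers:

```shell
#!/bin/sh
# Hypothetical reset flow: replace a corrupted database VM from its image
# template instead of scrubbing its data. "envctl" does not exist; the stub
# stands in for a real provider CLI.
envctl() {
    # Stub so the sketch runs anywhere; a real CLI would do the work here.
    echo "envctl $*"
}

ENV=qa-env-3
VM=db01

envctl vm delete  --env "$ENV" "$VM"        # drop the dirty database VM
envctl vm restore --env "$ENV" "$VM" \
    --from-template erp-template            # minutes, not hours or days
```

The point of the sketch is the shape of the operation: a whole-VM replace from a saved image, rather than script-driven cleanup inside a live VM.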
This approach also improves QA and integration testing. The cloud can create multiple “Integration” environments based on the target goal. R&D and ENG subgroups can each have their own dedicated Integration environment where they can work independently without colliding with other system components. For example, all the teams working on database changes could first integrate their work into a localized Integration testing environment reserved for them. Once successful, they can promote the combined changes to a higher-level environment and start adding other system components.
The ability for multiple groups to work on exact copies of the target system at the same time is extremely powerful. It allows for more experimentation and more efficient work. As design principles are finalized based on research performed on the cloud version of the system, those findings can be applied to the final on-prem buildout. With this method, issues can be discovered before the project begins, rather than halfway through it. That means a more realistic idea of the cost, timeline, and final result of the consolidation, and a better final product for everyone.
Published at DZone with permission of Tony Perez. See the original article here.