In part 1, Peter introduced green computing with Azure. He discussed where power saving can be achieved and the four levels of optimizing your infrastructure. This week, in part 2, he looks in detail at how to save power in each of those levels. He also discusses how Azure differs from other clouds when it comes to being green, and he adds a word of caution.
Optimizing your solution
OK, so now that you know what the four levels of application maturity mean, you might want to optimize your solution to lift it straight from Level 0 or 1 to Level 3. Unfortunately, in many cases this isn’t possible, and you have to take an iterative approach to move your application from one level to the next. The good news is that every optimization you make to reach a certain level automatically prepares your application for the next one. The next part of this paper therefore looks at what you can do to lift your application to the next level.
Starting at level 0
Be a good power saving citizen. Most PCs, or at least their components, are not actively working 24 hours a day. They have idle times in which no useful work is done, yet the hardware is up and running anyway, whether because the owner wants the machine to be available quickly on returning to work or because a server system has to be generally available. A lot of energy can be saved by supporting sleep mode properly. A machine that is not in use can go into a low power standby mode and save significant amounts of energy. In the case of a server it might not be the whole machine, but certainly components of the machine, that can be turned off for extended periods of time.
Unfortunately, not all devices, drivers and applications behave well when it comes to transparent support for sleep and wake modes. Here are a few tips on what you can do to make your application a better citizen:
- Test that your application is not blocking the machine from entering sleep mode.
- Test whether the application can successfully resume functionality after returning from sleep mode.
- Do not hold the network unnecessarily.
- Do not require animated UI.
- Use disk access sparingly to allow HDD to enter a power saving state. Use memory disks wisely.
- Consider event driven architecture instead of regular polling.
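The last tip can be illustrated with a minimal Python sketch. It assumes a hypothetical producer that hands work items to a single consumer thread; the names `worker`, `run_demo` and `jobs` are illustrative and do not come from the original article. Instead of waking up on a timer to check for work, the consumer blocks on a queue and is woken only when an item arrives, so the CPU is free to idle in between.

```python
import queue
import threading


def worker(jobs: queue.Queue, results: list) -> None:
    """Consume jobs as they arrive; the thread sleeps while the queue is empty."""
    while True:
        job = jobs.get()         # blocks until an item is available, no polling loop
        if job is None:          # sentinel value: shut down cleanly
            break
        results.append(job * 2)  # stand-in for real work


def run_demo() -> list:
    """Feed three jobs to one worker and return the processed results."""
    jobs: queue.Queue = queue.Queue()
    results: list = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for n in (1, 2, 3):
        jobs.put(n)              # each put wakes the waiting worker
    jobs.put(None)
    thread.join()
    return results
```

A polling version would call something like `time.sleep(0.1)` in a loop and check for work each time, which keeps waking the machine even when there is nothing to do.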
Minimize your data. Storing data consumes a considerable amount of energy, because the devices have to stay powered up for random disk access. Consider moving old data into archival storage with lower energy consumption and lower cost. Also consider that an additional investment in memory or solid state disks can dramatically reduce your HDD usage and operating cost.
Make your applications leaner and meaner. In recent years we’ve seen a dramatic increase in the processing power and capabilities of computers, which led to the notion of a free lunch for developers: “I don’t need to write efficient code as there will be a new CPU next year executing this twice as fast.” While this attitude worked for a short period of time, it is coming to a grinding halt these days as the physical limits of speeding up CPUs make themselves felt. It’s about time that you start thinking again about how to make your code more efficient. Not only will it execute faster, it will also reduce the amount of power needed for the same result. Code profilers are a very effective tool for finding places that could use some optimization.
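As a rough illustration of the profiling step, the sketch below uses Python's standard `cProfile` and `pstats` modules. The function `slow_sum_of_squares` is a hypothetical stand-in for application code and `profile_call` is an illustrative helper, not part of any tool mentioned in the article.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive loop that serves as the code under test."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args) -> str:
    """Run func under the profiler and return a short text report."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile_call(slow_sum_of_squares, 100_000)
```

Printing `report` lists the most expensive functions first, which is usually where optimization effort pays off in both run time and power.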
Sharing Resources – Level I
Server Efficiency. All servers need a base amount of energy just to be powered on and idle. Putting more applications and load on one server therefore allows more code to share the same base load and automatically makes the overall solution more energy efficient. Another way to optimize energy consumption sounds kind of obvious: it is the good old “do more with less”. You can achieve this quite easily by moving to a new operating system that handles the underlying hardware more efficiently than a previous version and therefore reduces the amount of energy used for the basic survival of the server. The article “Microsoft Windows Server 2008 Power Savings” [3] shows how an optimized operating system can deliver a better ratio of workload to power usage. This is also shown in Figures 1 and 2.
Figure 1: Power usage of idle vs. active servers for Windows Server 2003 and Windows Server 2008
Figure 2: Power usage comparison between out-of-the-box installations of Windows Server 2003 and Windows Server 2008
Purists of high availability N-tier architecture have long advocated putting every single layer of the application on a separate machine and making each layer redundant with a second machine, just in case an outage or overwhelming success hits the application. Today we clearly see that this is a highly wasteful way of running your IT. It can be greatly improved by using the existing capacity of your servers to the maximum instead of designing for a maximum of spare resources. A single computer running at 80 percent CPU usage will use less power than two computers running at 40 percent each, because each machine draws its base load even when idle. It is therefore safe to say that single-application (or single-tier) deployments are not efficient, and that resources should ideally be shared to aim for the highest possible utilization levels.
For large organizations this will produce much bigger payoffs than the subsequent hardware virtualization of Level II. Not only will you save on hardware purchases and energy, but also on operational costs. The key to these cost savings is to factor out the most common components across your entire enterprise architecture and deploy them as shared services. This increases their load and the usage of the underlying hardware, and so improves the ratio of results to resources used.
From the above we can deduce two basic rules:
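The comparison between one busy server and two half-idle ones can be made concrete with a simple linear power model. This is only a sketch: the 200 W idle draw and 300 W peak draw are assumed figures chosen for illustration, not measurements from the article, and real servers do not scale perfectly linearly.

```python
def server_power_watts(utilization: float,
                       idle_watts: float = 200.0,
                       peak_watts: float = 300.0) -> float:
    """Estimate power draw as idle base load plus a share of the dynamic range.

    The default wattages are hypothetical values used only for this example.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (peak_watts - idle_watts) * utilization


# One consolidated server at 80 percent versus two servers at 40 percent each.
one_busy = server_power_watts(0.80)
two_half_idle = 2 * server_power_watts(0.40)
```

Under these assumptions the consolidated server draws about 280 W while the pair draws about 480 W, because the idle base load is paid twice.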
- Consolidate applications together onto a minimum number of servers, while guaranteeing a flawless execution of the workload.
- Develop a plan to rationalize and consolidate your application and platform portfolios. The higher your reuse level of common services the more efficient your IT infrastructure becomes. Development costs can also be significantly reduced by enforcing the sharing of services across the enterprise.
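The first rule resembles a bin packing problem. The sketch below uses the well-known first-fit decreasing heuristic as one possible approach; it is not a method prescribed by the article. Loads are expressed as a fraction of one server's CPU, and the 0.8 capacity limit is an assumed safety margin standing in for "guaranteeing a flawless execution of the workload".

```python
def consolidate(loads, capacity=0.8):
    """Assign application loads to as few servers as the heuristic finds.

    loads    -- CPU share each application needs (fraction of one server)
    capacity -- assumed maximum utilization per server, leaving headroom
    Returns a list of servers, each a list of the loads placed on it.
    """
    servers = []
    for load in sorted(loads, reverse=True):   # largest applications first
        if load > capacity:
            raise ValueError(f"load {load} does not fit on a single server")
        for server in servers:
            if sum(server) + load <= capacity + 1e-9:  # tolerance for float error
                server.append(load)
                break
        else:
            servers.append([load])             # no room anywhere: add a server
    return servers


# Six hypothetical applications that would otherwise occupy six machines.
example = consolidate([0.10, 0.30, 0.25, 0.40, 0.15, 0.20])
```

With these illustrative loads the six applications fit on two servers instead of six.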
Data Center Efficiency
The efficiency of a data center is usually measured in PUE (Power Usage Effectiveness). This measurement is the ratio of the total power entering the facility (IT equipment plus cooling, lighting, power distribution and so on) to the power used to run the IT equipment alone (servers, switches, routers and so on). The bigger a data center is, the more efficient its design usually is and the lower its PUE. A typical PUE today is 2.0, meaning only half of your power input is used for the actual computing while the other half is “lost” in the facility to cooling, power transformation, transmission losses and the like. A highly efficient modern data center has a PUE as low as 1.2, while older or smaller, less efficient data centers are often found running at a PUE of over 10.
Another important factor to consider is the modularity of the data center. If it has to be extended in large blocks at once, then there are a lot of unused and wasted resources until peak usage is reached. At that point the data center typically becomes very efficient. Unfortunately, reaching that point also triggers an extension of the data center to accommodate future growth, which again makes the data center very inefficient.
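PUE is commonly computed as total facility power divided by IT equipment power. The following sketch shows that arithmetic; the function names and the kilowatt figures in the comments are illustrative assumptions.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power includes the IT equipment power")
    return total_facility_kw / it_equipment_kw


def it_share(pue_value: float) -> float:
    """Fraction of the facility's power that actually reaches the IT equipment."""
    return 1.0 / pue_value


# Hypothetical facility drawing 1000 kW in total, 500 kW of it for IT equipment.
typical = pue(1000.0, 500.0)   # PUE of 2.0
```

For a PUE of 2.0, `it_share` gives 0.5, which matches the statement that only half of the input power is used for computing; for a PUE of 1.2 roughly 83 percent reaches the IT equipment.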
Sharing Resources – Level II
Eliminate dedicated infrastructure. The above mentioned Windows Server 2008 article also talks about virtualization of servers through technologies like Hyper-V. It clearly shows that multiple virtual machines can run on a single physical machine without consuming significantly more power than that machine would have consumed running as a standalone server. This is because most of the power a server uses is not spent on the actual computing workload but on keeping the machine alive and running with its base infrastructure. With virtualization, several servers share the hardware base (power supplies, fans and so on) that each standalone server would otherwise need, and therefore use the consumed resources more efficiently. “Running 4 virtual machines means saving the equivalent power output of three physical servers; running 10 virtual machines means saving the equivalent power output of 9 physical servers.” [4]
See Figure 3 for the details of the comparison.
Figure 3: Power consumption of 1 physical server in standalone and Hyper-V tests
Another benefit of hardware virtualization is that a virtualized environment takes up less space and incurs lower cooling, hardware and maintenance costs than a non-virtualized environment.
The same rules that apply to computing virtualization also apply to storage. A central storage area network will provide better energy and cost efficiency than many standalone servers with their own sets of hard drives. Hardware virtualization isn’t completely free. You have to plan for additional setup and management costs. Nevertheless, these should easily be offset by the savings that virtualization can achieve.
To maximize the use of existing resources, all virtual machines should support being paused and resumed so that only those machines that are really needed are running. Ideally you will have a dynamically scalable application in place that can handle the addition or removal of instances dynamically. This will allow you to run exactly the number of machines needed at that very moment to get the job done.
Be careful when designing your virtualization solution. Mistakes are easy to make here. There is also potential overhead in resource switching and especially in resource deadlocks between virtual machines on the same physical machine. It all depends on how well the system is designed: if the move to a virtualized environment is done wrong, the benefits might be minimal, or in the worst case the move might even pose a significant risk of business interruption.
As mentioned earlier in this paper, hardware virtualization still has a significant shortcoming: every virtual machine instantiates its own operating system and recreates services that could potentially be reused across multiple servers.
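A minimal sketch of the sizing decision behind such dynamic scaling follows. It assumes that load can be expressed as requests per second and that each instance has a known capacity; both the metric and the function names are hypothetical choices for illustration.

```python
import math


def instances_needed(requests_per_second: float,
                     capacity_per_instance: float,
                     minimum: int = 1) -> int:
    """Smallest number of instances that can carry the current load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity per instance must be positive")
    return max(minimum, math.ceil(requests_per_second / capacity_per_instance))


def scaling_action(running: int, needed: int) -> int:
    """Positive result: instances to resume. Negative result: instances to pause."""
    return needed - running
```

For example, `instances_needed(950, 200)` yields 5, and with 8 instances currently running `scaling_action(8, 5)` yields -3, meaning three machines can be paused.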
Optimization for Cloud Computing – Level III Virtualization
Cloud Computing leverages economies of scale like no other technology does. It therefore offers unmatched cost and energy efficiency compared with the other virtualization maturity levels. Cloud Computing vendors are very specialized companies and not many can be part of this business. It is extremely capital intensive to build data centers that can be shared seamlessly by many organizations and still achieve great PUE ratings. It is also necessary to shift your infrastructure to locations with access to low emission energy or other unique geographic, geopolitical and sociopolitical features. Not many companies have the capability to operate on a truly global basis and run a data center in a location that allows the lowest possible PUE.
Not all clouds are born equal. While there has been a lot of buzz in the IT press about cloud computing lately, it still does not seem to be clear what cloud computing actually means. The only thing that all definitions agree on is that your computing happens in the (internet) cloud, in a specialized, highly optimized infrastructure that is multitenant capable but inherently isolated for each user. Let’s look at two cloud computing vendors that are out there today: Microsoft and Amazon.
Comparing those two, we quickly realize that the Amazon Elastic Compute Cloud is actually more of a Level II solution on the way to a Level III cloud computing offering. It’s true that it has good economies of scale and runs in a highly specialized data center, but it is still mainly hardware virtualization. You provide a virtual machine which runs in their data center. This is definitely better (from an energy perspective) than running the virtual machine in your own small scale, energy wasting data center.
Now let’s look at Microsoft’s cloud computing offering: Windows Azure. It runs in a large scale, highly optimized, energy efficient data center as well, but the fundamental difference is that you as a customer do not need to provide virtual machines to be run in that data center. Yes, Microsoft runs its internal infrastructure as virtual machines on physical servers to get every possible benefit it can. The big difference is that what is exposed to the outside world is more similar to the principles of basic von Neumann computer architecture than to isolated VMs demanding their own dedicated resources.
Similar to an operating system, Windows Azure provides services that help you access the underlying hardware or, to be more precise, abstract services that are usually provided by hardware in a PC. Such a service can be disk access, computation, memory access, network access, messaging, a queue, a graphics engine and so on. Table 3 compares some of the items or features you will typically find in any operating system.
|Executable, DLLs||Service Package
|Local Data Store
|Applications, Windows Services||
||HTML generation in Web Role
||Live Services Identity
Remember, virtualization is about sharing! Now, why is this so much more efficient?
Virtualization is efficient because it allows multiple items to share the same base cost instead of replicating it. The same principle that has been applied in operating system design for decades is now being used for cloud computing.
When you are running an operating system like Microsoft Windows on your PC, you will typically have many applications running at the same time. Each of those applications has a 32-bit address space available to it on a 32-bit system (or a 64-bit address space on a 64-bit system). So each application believes it can use the maximum addressable memory space for its own process, yet several applications can coexist on a machine with less memory than the 32-bit address space can actually contain. This is a very well known architecture principle, commonly known as virtual memory.
In the same way that virtual memory allows for the most efficient usage of scarce hardware resources (there simply isn’t enough memory in a PC to give each application a physical 32-bit address space), efficient cloud computing platforms like Windows Azure enable their users to provision their software into a virtualized application environment that is shared with many other tenants. In this case Windows Azure acts like the Windows operating system you are running on your PC. It provides a kernel that deals with all low level interactions and hardware access. Within Windows Azure this kernel level is called the fabric. As a user of a cloud based application you will have about as much direct interaction with the fabric as you would have with kernel functions on your home PC.
On top of the fabric are higher level services that can be consumed by applications created specifically for this operating system. In Windows these might be, for example, the Windows API itself or the .NET WinForms or WPF libraries. In Windows Azure these are the Web Role, Worker Role, Blob Storage, Table Storage, Queue Storage and other mechanisms.
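The idea of virtual memory can be sketched with a toy page table. This is a deliberately simplified model, not how Windows implements memory management: the class `ToyMemory`, the 4 KB page size and the allocate-on-first-touch policy are assumptions made for illustration. Each process can use any address in a large virtual range, while physical frames are handed out only for the pages it actually touches.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyMemory:
    """Maps per-process virtual pages onto a small shared pool of frames."""

    def __init__(self, physical_frames: int) -> None:
        self.free_frames = list(range(physical_frames))
        self.page_tables = {}  # process id -> {virtual page number: frame number}

    def touch(self, pid: str, virtual_address: int) -> int:
        """Translate a virtual address, allocating a frame on first use."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        table = self.page_tables.setdefault(pid, {})
        if page not in table:
            if not self.free_frames:
                # A real operating system would page something out to disk here.
                raise MemoryError("no free physical frames")
            table[page] = self.free_frames.pop(0)
        return table[page] * PAGE_SIZE + offset


# Two processes use the same virtual address but land in different frames.
memory = ToyMemory(physical_frames=4)
address_a = memory.touch("A", 0x7FFF0000)
address_b = memory.touch("B", 0x7FFF0000)
```

Both processes believe they own address `0x7FFF0000`, yet only two of the four physical frames are in use, which is the sharing of a scarce resource described above.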
Those higher level services are used to build applications without having to deal with all the details of the virtualization and abstraction layers underneath. Some applications are used directly by users, while others expose another layer of services to be consumed by other applications. Examples of such high value services exposing a higher level of functionality are SQL Data Services, Live Mesh, SharePoint Services and others.
As you can see, virtualization is an inherent concept of great cloud architecture. Each layer of services builds on top of a virtualized layer underneath that not only deals with all the complexity but also guarantees as much reuse as possible. This design allows for maximum efficiency of the underlying hardware with minimum energy wasted on idle machines, or on replicating the same power and resource consumption over and over again to keep the base for all the services running.
Oh, and by the way: it also completely removes the hassle of writing scalable applications, as this method inherently forces the application to support scale-out scenarios. This is another hint that green application architecture also leads to a better performing application.
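To give a feel for how such services fit together, here is a conceptual sketch of a front end handing work to a background worker through a queue. It deliberately does not use any Windows Azure API: the in-process `deque` and `dict` are stand-ins for Queue Storage and Blob Storage, and all function names are hypothetical.

```python
from collections import deque

message_queue = deque()  # stand-in for Queue Storage
blob_store = {}          # stand-in for Blob Storage


def web_role_handle_request(order_id: str, payload: str) -> str:
    """Front end: enqueue the work and return immediately."""
    message_queue.append((order_id, payload))
    return f"accepted {order_id}"


def worker_role_process_once() -> bool:
    """Background worker: process one queued message if there is one."""
    if not message_queue:
        return False
    order_id, payload = message_queue.popleft()
    blob_store[order_id] = payload.upper()  # stand-in for real processing
    return True
```

Because the two roles communicate only through the queue and the store, more front ends or more workers can be added independently, which is the scale-out property the platform encourages.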
A word of caution. If you are bold enough to try to move from Level 0 or 1 directly to Level 3, I want to add a few words of caution at this point. Moving directly to Level 3 will force you to include a myriad of software and hardware proxies to augment your existing solution and make it cloud capable. Since the solution was not designed with the cloud in mind, it might be very hard and very resource intensive to move it to Level 3.
The range of problems you might run into is wide. It starts with additional elements that might consume more energy than you save by going to the cloud, and extends to difficulties that really endanger the business, such as performance bottlenecks on the network, identity integration issues and loss of governance. Not to mention the extra costs that might be incurred by a loss of efficiency in other parts of the business due to unforeseen service interruptions or undocumented dependencies.
Other energy savings of cloud computing. By moving your application from a local data center into the cloud you also affect the energy used for everything besides your computing load. This starts with no longer having to pay for your own cooling, energy distribution, networking components and other items needed for running a data center, but you also have to look at things like embodied energy.
Embodied energy is the energy inherent in equipment. Basically, it is the energy consumed in manufacturing your server. This includes the resources used for building the machine as well as the energy used during the manufacturing process. And this is where the biggest savings in energy consumption are hidden. About 80% of the total energy and resource consumption of a server is attributable to embodied energy and only 20% is actual usage related energy spending.
Cloud computing therefore offers a very compelling energy saving scenario by removing the need to run a large number of servers in your own data center. It’s true that the cloud data center has to buy servers to run your workload, and that this just shifts the energy consumption from one entity to another. But don’t forget in this line of argument that the larger a data center becomes, the more efficiently it can be operated. It can also use more efficient hardware. Where a smaller enterprise would use individual servers, the data center might use blade servers. Where a large enterprise would run racks full of blade servers, the cloud computing data center can use container based server groups that can be built more efficiently and are also built with future recycling in mind, thereby reducing the total footprint of embodied energy.
Besides those obvious manufacturing and operation related savings, there is a huge amount of potential savings in a cloud computing data center in the way the computing resources are shared between the tenants of the data center.
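The 80/20 split can be turned into a small calculation. The sketch below takes the share stated above as a parameter; the 1,000 kWh usage figure is a made-up example value, not data from the article.

```python
def lifecycle_energy(usage_kwh: float, embodied_share: float = 0.8):
    """Return (embodied_kwh, total_kwh) for a server.

    usage_kwh      -- energy consumed while operating the server
    embodied_share -- fraction of lifetime energy attributed to manufacturing
    """
    if not 0.0 <= embodied_share < 1.0:
        raise ValueError("embodied share must be in the range [0, 1)")
    total_kwh = usage_kwh / (1.0 - embodied_share)
    return total_kwh * embodied_share, total_kwh


# Hypothetical server that consumes 1,000 kWh while in operation.
embodied_kwh, total_kwh = lifecycle_energy(1000.0)
```

With an 80% embodied share, 1,000 kWh of usage corresponds to roughly 4,000 kWh of embodied energy, so every server that never has to be manufactured saves about four times what it would have consumed in operation.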
Next week, in the final part of this series, Peter will look at the financial benefits of green computing with Azure and programming models for developing on Azure.