Accelerate Innovation by Shifting Left FinOps, Part 3
FinOps is an evolving practice. In part 3 of this series, learn about the cost optimization techniques for infrastructure.
Now that we understand the importance of cost models and how to create and refine them, it's time to start optimizing our workload components. Any workload deployment and architecture contains the key components depicted in the architecture diagram given below. All the layers of the architecture provide opportunities for cost optimization. In this post, we will look at cost optimization techniques for infrastructure (the layer highlighted in blue).
Key architecture components for a cloud-deployed workload
We will break down infrastructure into three main components to focus on:
- Compute
- Storage
- Networking
Taking this approach allows us to focus on and optimize each specific component, but it also gives us the flexibility to replace that component entirely with a managed service or equivalent.
Compute
The optimization technique for compute involves first selecting the right type of compute resource for each component and then selecting the best attributes for that type of resource.
The types of compute resources available may depend on your vendor but are typically:
- Serverless
- Container
- Virtual Server
- Physical or Bare-Metal Server
Serverless
The compute resources are allocated dynamically by the vendor and may be shared securely across one or many customers. These resources are paid for only while they are actively performing workload processing; they typically bill in very small increments (~1 sec). Serverless offers automatic control of the start/stop of the resources and, thus, the pricing.
The attributes to select for optimization are the amount of compute and memory and the level of concurrency. Selecting a balance of these attributes, along with optimization of code, will provide the shortest initiation and execution time and, thus, the lowest cost.
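To make the memory/duration balance concrete, here is a minimal sketch of a serverless cost model in Python. All prices and workload numbers are assumptions for illustration, not any vendor's actual rates:

```python
# Illustrative serverless cost model: cost = per-request charge plus
# GB-seconds of compute consumed. Prices below are hypothetical.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $ per request (assumed)
PRICE_PER_GB_SECOND = 0.0000166667     # $ per GB-second (assumed)

def monthly_serverless_cost(requests: int, duration_s: float, memory_mb: int) -> float:
    """Estimate the monthly cost of one serverless function."""
    gb_seconds = requests * duration_s * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# Allocating more memory often shortens execution; when duration drops
# faster than the memory charge rises, total cost falls.
low_mem = monthly_serverless_cost(10_000_000, duration_s=1.2, memory_mb=128)
high_mem = monthly_serverless_cost(10_000_000, duration_s=0.25, memory_mb=512)
print(f"128 MB @ 1.2 s:  ${low_mem:,.2f}/month")
print(f"512 MB @ 0.25 s: ${high_mem:,.2f}/month")
```

With these assumed numbers, the larger memory setting is cheaper overall because the shorter duration more than offsets the higher per-second rate.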
Container
A piece of hardware (physical or virtual) is divided and can be used to run one or more customer workloads. If any workload is running, the hardware is incurring costs. Containers typically bill in small increments (~1 min).
The attributes to select for optimization are the container management system and the amount of compute, memory, and storage. Minimizing the size of your containers allows more flexibility and scalability in their placement and in overall infrastructure management.
Virtual Server
A virtual piece of hardware that is dedicated to a specific workload. Billing is similar to the container. The attributes for optimization are the type of compute (selecting the type that provides the best ratio of compute to memory), the amount of compute and memory, and minimizing overheads across all attributes.
Physical or Bare-Metal Server
A physical piece of hardware is dedicated to this workload. This may be required for consistency of physical compute architecture, such as the number of processors or cores, due to software licensing or supportability. They typically bill in small increments (~1 min).
The attributes for optimization are the same as for virtual servers.
Compute Type Selection
In general, the recommended compute types are serverless, container, virtual, and physical servers. The recommended pricing models are interruptible, commitment discount, and on-demand.
Pricing Models
Combined with the types of available resources is the way they can be paid for. Different pricing models are typically available across vendors:
- Interruptible/spot: Offers the highest discount, but resource availability may change, and you could lose capacity.
- Commitment discount: Offers medium discounts in exchange for committing to, or consistently using, resources for a period of time.
- On-demand: No discounts; the standard pay-as-you-go model.
This recommendation allows not only the lowest possible costs but lower costs over the workload's lifetime, as it allows greater flexibility in how the workload can be run and the ability to modify resources with high granularity if pricing changes or additional resource types become available.
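The tradeoff between these pricing models can be sketched with some illustrative arithmetic. The hourly rate and discount levels below are assumptions, not quoted prices:

```python
# Rough comparison of the three pricing models for one virtual server.
HOURS_PER_MONTH = 730
ON_DEMAND_RATE = 0.10        # $/hour baseline (assumed)
COMMITMENT_DISCOUNT = 0.40   # ~40% off for a term commitment (assumed)
SPOT_DISCOUNT = 0.70         # deep discount, but capacity can be reclaimed

def monthly_cost(model: str, utilization: float = 1.0) -> float:
    """Monthly cost of one instance under a given pricing model."""
    rate = {
        "on_demand": ON_DEMAND_RATE,
        "commitment": ON_DEMAND_RATE * (1 - COMMITMENT_DISCOUNT),
        "spot": ON_DEMAND_RATE * (1 - SPOT_DISCOUNT),
    }[model]
    # A commitment bills for the full term whether or not the capacity
    # is used, so low utilization can make on-demand the cheaper choice.
    hours = HOURS_PER_MONTH if model == "commitment" else HOURS_PER_MONTH * utilization
    return rate * hours

for util in (1.0, 0.3):
    costs = {m: round(monthly_cost(m, util), 2) for m in ("on_demand", "commitment", "spot")}
    print(f"utilization {util:.0%}: {costs}")
```

At full utilization the commitment wins over on-demand, but at 30% utilization on-demand is cheaper than paying for the unused committed capacity, which is why utilization data should drive the model choice.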
The optimization technique for compute involves selecting the right compute resource for each component as well as looking at ways to optimize the cost for the selected compute type. The first step is to select the right compute type for your components and services: for example, whether to run a microservice on a virtual machine or a container, or whether it can run as a serverless function. Depending on the service design, decide whether serverless computing services like AWS Lambda can be leveraged in the solution. Serverless helps lower the overall cost further because it is charged based on the number of requests and the duration of the function, so you only pay for the compute time you consume.
The solution architect needs to select the appropriate compute option and update the cost for each component. Depending on the selection of the compute type, explore the various optimization options.
- For example, if you select a virtual machine as the compute option, look for the most cost-effective and appropriate instance type for the workload.
- The number of instances should adjust automatically based on the application load. This is often achieved by enabling the architecture to leverage the auto-scaling capabilities of the cloud.
- For the selected compute, review the pricing model to determine whether there is any cost advantage in going for reserved instances or subscribing to savings plans, which lower the overall cost.
- Similarly, review the architecture to determine whether the jobs can be run on spot instances, which offer up to 90% savings compared to on-demand instances.
- Finally, optimize the rate for your rightsized infrastructure by seeing what discounts you can get by making commitments and reservations for consistent use of resources over a period of time. Typically, during workload planning, these resources may be rolled up at an enterprise level, so you get better savings without losing flexibility.
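The rightsizing step above can be sketched as a simple selection over an instance catalog: pick the cheapest instance whose capacity still covers observed peak utilization with some headroom. The catalog entries, prices, and headroom factor are hypothetical:

```python
# Hypothetical instance catalog: (name, vCPUs, memory GiB, $/hour).
CATALOG = [
    ("small",  2,  4, 0.05),
    ("medium", 4,  8, 0.10),
    ("large",  8, 16, 0.20),
]

def rightsize(peak_vcpu: float, peak_mem_gib: float, headroom: float = 1.2):
    """Return the cheapest instance covering peak usage times a headroom factor."""
    candidates = [
        (price, name)
        for name, vcpu, mem, price in CATALOG
        if vcpu >= peak_vcpu * headroom and mem >= peak_mem_gib * headroom
    ]
    # min() compares by price first; None means no catalog entry fits.
    return min(candidates)[1] if candidates else None

print(rightsize(peak_vcpu=1.5, peak_mem_gib=3.0))  # small instance suffices
print(rightsize(peak_vcpu=3.0, peak_mem_gib=5.0))  # needs the medium size
```

In practice the peak figures would come from monitoring data rather than guesses, and the headroom factor encodes how much burst capacity you want to keep.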
Storage
Data storage makes up the bulk of the cost for solutions that process a large amount of data, so choosing the right storage class for the different types of data is very important. The optimization technique for storage involves first selecting the right type of storage resource for each component and then selecting the best attributes for that type of resource.
The types of storage resources available may depend on your workload needs and vendor but are typically:
- Object Storage
- Block Storage
- Ephemeral Storage
Object Storage
Object storage typically provides the lowest cost and the longest access times. It is used for data that is infrequently accessed and is the least mobile. Object storage typically provides versioning features.
The attributes to select for optimization are the location, performance tier, access time, and level of durability of storage.
Block Storage
Block storage provides fast access and low access times. It is used where data is frequently accessed and needs to be available across system restarts/reboots. Block storage typically provides a snapshot feature.
The attributes to select for optimization are the size of the storage and the performance tier (number of IOPS).
Ephemeral Storage
Similar to block storage, it provides the fastest access. Access is, however, restricted to the specific compute hardware and cannot be accessed by or made available to other hardware or systems.
The attributes to select for optimization are the same as for block storage.
Storage Type Selection
In general, the recommendation for storage types is, in order: object, block, and ephemeral. The block-level storage required needs to be analyzed based on the use cases; these are volumes attached to the compute instances. There should be periodic checks for EBS volumes that are not attached to any compute instance; removing such unattached volumes saves money. If the data may be needed later, take a snapshot to back it up and restore it on demand. AWS Elastic Block Store (EBS) provides an incremental snapshotting feature, which means you pay only for the changes made since the last snapshot.
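The savings from incremental snapshots can be illustrated with a quick estimate. The volume size, daily change rate, and per-GB price below are assumptions for illustration:

```python
# Incremental snapshots store the initial full copy plus only the blocks
# changed since the previous snapshot. Price below is hypothetical.
SNAPSHOT_PRICE_PER_GB_MONTH = 0.05  # $ per GB-month (assumed)

def snapshot_storage_gb(volume_gb: float, daily_change_gb: float, days: int) -> float:
    """Total snapshot storage after daily incremental snapshots."""
    return volume_gb + daily_change_gb * (days - 1)

naive_full_copies = 500 * 30                   # a full 500 GB copy every day
incremental = snapshot_storage_gb(500, 5, 30)  # 500 GB base + 5 GB/day of changes
print(f"Daily full copies: {naive_full_copies} GB")
print(f"Incremental:       {incremental} GB "
      f"(~${incremental * SNAPSHOT_PRICE_PER_GB_MONTH:.2f}/month at the assumed rate)")
```

With a low daily change rate, a month of incremental snapshots stores a small fraction of what daily full copies would, which is the reason incremental snapshotting is the default cost-saving choice.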
For each resource type, the data should always be on the most optimal performance tier. The less frequently the data is accessed, the lower the performance tier that can be used. Services such as AWS S3 provide features to move data automatically based on access requirements, which reduces manual optimization effort and automatically optimizes storage costs.
Validating these requirements and building lifecycle management into the solution is an important check on the cloud cost model. It helps reduce storage costs by using the right storage class for each data type.
It is recommended to have automated lifecycle management on all storage where available.
Storage Class and Access Patterns
The pricing for storage by the cloud provider varies based on the storage class, so it is important to validate the data availability requirements of the solution and select the right storage class. Secondly, we need to evaluate how long the data needs to stay in that storage class based on its access patterns. Based on the solution requirements, we can use specific lifecycle rules to move the data across multiple storage classes. For example, some data can be kept easily accessible in the standard tier for a specific number of days and, after that period, transitioned to a different storage class based on its access patterns.
Wherever the access patterns are unclear, you can leverage the intelligent tiering capabilities provided by the cloud provider. For example, AWS provides the S3 Intelligent-Tiering storage class, which automatically moves objects between two access tiers based on changing access patterns. This automatic transition of objects to the most cost-effective storage class based on usage helps save costs for the solution.
Understanding the data requirements, from how much needs to be kept online to how much can be retrieved on demand, is important. Gathering requirements along these dimensions helps you determine the data retention periods. On expiry of the retention period, the solution should support automation or scripts to purge or safely delete the data. Any data required for long-term storage should be moved to appropriate low-cost storage. For instance, you can write rules to archive infrequently accessed data, or data older than 90 days, to a storage class like AWS Glacier. This provides low-cost storage, but retrieval can take a while.
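A back-of-the-envelope comparison shows why lifecycle transitions matter. The per-GB-month prices below are illustrative placeholders, not current provider rates:

```python
# Hypothetical per-GB-month prices for three storage classes.
PRICES = {"standard": 0.023, "infrequent": 0.0125, "archive": 0.0036}

def lifecycle_cost_per_gb(months_standard: int, months_ia: int, months_archive: int) -> float:
    """Cost of keeping 1 GB through a standard -> infrequent -> archive lifecycle."""
    return (months_standard * PRICES["standard"]
            + months_ia * PRICES["infrequent"]
            + months_archive * PRICES["archive"])

always_standard = lifecycle_cost_per_gb(12, 0, 0)
tiered = lifecycle_cost_per_gb(1, 2, 9)  # hot for a month, then tier down
print(f"12 months standard: ${always_standard:.4f}/GB")
print(f"Tiered lifecycle:   ${tiered:.4f}/GB")
```

At these assumed rates the tiered path costs roughly a third of keeping everything in the standard class; retrieval fees and minimum-duration charges, which real providers apply, would need to be added for a production estimate.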
Data Placement
Data should be kept as close as possible to the processing components. This minimizes data transfer costs and maximizes performance. Caching techniques can be applied to different types of storage by temporarily raising the performance level or attributes or by placing the data in a faster type of storage.
Another important step is to keep the data close to the compute function to avoid the cost and latency of cross-region data access or transfer. Data transfer acceleration also helps improve performance and reduce cost by spreading the data across distributed edge locations.
Network
Networking costs can be another significant portion of the overall workload cost. Network architecture and design can be critical with regard to cost, especially for hybrid and multi-cloud deployments, so the network deployment architecture needs to be reviewed carefully. Optimizing network cost can be achieved by implementing one or more of the following techniques.
Close placement of compute and data components helps manage the network interactions better and keep the chatter to a minimum. You should look to introduce some intelligent routing by leveraging Network and Application level Load Balancers to distribute traffic efficiently and manage network costs.
The fewer the number of components and the shorter the path between the compute and storage, typically, the more efficient and lower the cost will be.
For access between the cloud and other sites, such as on-premises data centers, dedicated connectivity, such as AWS Direct Connect, provides lower costs compared to internet VPN services as the amount of data transferred increases.
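The crossover point between a VPN and dedicated connectivity can be estimated with a simple break-even calculation. The port fee and per-GB rates below are assumptions for illustration, not quoted prices:

```python
# A VPN has no fixed fee but a higher per-GB rate; a dedicated link adds
# a fixed monthly port fee but moves data cheaply. Rates are hypothetical.
VPN_PER_GB = 0.09        # $/GB over the internet (assumed)
DX_PORT_MONTHLY = 220.0  # fixed monthly port fee for a dedicated link (assumed)
DX_PER_GB = 0.02         # $/GB over the dedicated link (assumed)

def monthly_cost_vpn(gb: float) -> float:
    return gb * VPN_PER_GB

def monthly_cost_dedicated(gb: float) -> float:
    return DX_PORT_MONTHLY + gb * DX_PER_GB

# Break-even volume: the fixed port fee divided by the per-GB saving.
break_even_gb = DX_PORT_MONTHLY / (VPN_PER_GB - DX_PER_GB)
print(f"Dedicated link pays off above ~{break_even_gb:,.0f} GB/month")
```

Below the break-even volume the VPN is cheaper; above it, the per-GB savings of the dedicated link outweigh its fixed fee, which is why this decision should be revisited as transfer volumes grow.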
For access between different services or workload components, internal data links should be used, such as AWS Transit Gateway, VPC endpoints, or AWS PrivateLink. This reduces internet traffic, which is more costly.
- Microsegmentation of the network helps with security and network isolation, but at the same time, you need to review the risks and make decisions to ensure you don't overdo microsegmentation and security groups.
- The solution should also provision for network monitoring tools to understand the access and usage patterns so that the network can be optimized in later phases.
In the next part of this series, we will discuss how to optimize the cost of your application.