Keys To Performance Optimization And Monitoring
The keys to performance optimization and monitoring are designing infrastructure and conducting real-time user monitoring.
To gather insights on the state of performance optimization and monitoring today, we spoke to 12 executives from 11 companies that provide performance optimization and monitoring solutions for their clients.
Here's what they told us when we asked, "What are the keys to performance optimization and monitoring?"
Design infrastructure and applications to handle adverse operating environments and degradation, such as unreliable connections and limited bandwidth. Understand whether the user base is using a distributed app that shares resources, which can adversely affect performance. Design with the worst-case scenario in mind.
Have a good test plan as part of pre-deployment. Look for dirty test beds. Understand the influence and impact of third-party relationships and services. Add them all up to determine end-user performance from the developer environment to the real world.
Active monitoring involves looking for a combination of viewpoints. Some test robots watch key systems to ensure they’re up and running well. Mix this with passive monitoring. True end-point monitoring is useful if you can find it just before jumping to the internet.
The end-user perspective is incredibly important. Real user monitoring from the end-point for video, web pages, and applications is a great way to ensure a great user experience (UX). The visibility has to be actionable: timely remediation is what actually improves the UX.
Based on hundreds of thousands of users, we’ve observed that performance does not exist in a vacuum. It must align with other business interests based on the business goals. In retail, application performance and latency can impact revenue.
Performance lessons must be fed back in through post-mortems and reviews to build a cadence of continuous learning. Different performance monitoring tools do different things well.
There is not one solution that does it all holistically. Step back and look at the whole picture to determine which solution will meet your needs. You want to be proactive so you can catch performance issues in pre-production or identify leading indicators before they hurt performance or lead to a poor customer experience (CX).
Application delivery infrastructure and other components. New services with new requirements are proliferating. Users in different geographic locations (road, home, virtual call centers) use different devices to access apps, and this has implications for the network. More services are critical for the business to run. IT must provide support and applications for different environments. Companies are using SaaS globally with naïve assumptions about what performance looks like in different parts of the world. Performance must be understood. It is affected by the architecture of the network as well as the performance of the network(s) on which the applications are accessed and run.
We work with our own app. Therefore, we have high visibility into what the problems are and what happens as it scales.
Monitoring and management get data from the full stack connecting to the web app, everywhere data could reside, to provide a full contextual picture. If the app is slow, you need to know whether the cause is bad code, resource allocation issues, a crowded cluster, services, or infrastructure. There should be a full-stack approach, from the application layer to the infrastructure, to pinpoint where the problem is. All of these factors should be interconnected so the data can be managed with contextual information.
Having information interconnected helps you solve the problem faster. It requires five tools to get all of the information. We focus on the full big data stack using AI and machine learning to uncover problems. We automate root cause analysis. Algorithms are constantly improving to figure out problems. Unfortunately, there is a lack of expertise and reliability that results in a lack of confidence in the system. Cloudera and Hortonworks clients are concerned with how to manage data as it scales.
The key here is to set the correct and achievable metrics to strive for. Based on that, try to approach these in a steady and effective way.
For a true performance monitoring and optimization solution to be effective, it must be a platform that can measure in real-time (not averages), monitor system-wide, and scale at the enterprise level. The keys are to offer true real-time performance visibility via IT infrastructure instrumentation (servers, network, storage) that monitors all IO transactions flowing across it, to provide accurate and actionable performance and optimization analytics that also enable collaborative cross-silo decision support, and to offer contextual reporting and dashboards to all constituencies, from the IT operators to the CIO.
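To illustrate why averages can mislead, here is a minimal sketch in Python. The latency samples and the `summarize` helper are hypothetical and not taken from any vendor's product. It compares the mean with the median and the 99th percentile for the same set of requests.

```python
import statistics


def summarize(latencies_ms):
    """Return the mean, median (p50), and 99th percentile of latency samples."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank percentile: the smallest sample with at least 99% of
    # all samples at or below it. Integer math avoids float rounding.
    rank = max(1, (99 * n + 99) // 100)
    return {
        "mean": statistics.fmean(ordered),
        "p50": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }


# Hypothetical data: 98 fast requests and 2 very slow ones.
samples = [20.0] * 98 + [2000.0] * 2
summary = summarize(samples)
print(summary)
```

With this made-up data, the mean lands well below what the two slowest users experienced, while the 99th percentile exposes them. That gap is one reason to look at distributions rather than a single averaged number.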
From my years of experience, the keys to performance optimization and monitoring are: understand your product, understand its target audience, and most definitely understand the load the product will have while in production. I like to give this advice with a slight smile, but a serious undertone: prepare for the worst and hope for the worst. You need to make sure you understand the load your product will have while in production, and you need to prepare your servers for it. Let's not forget that the assumption is that the actual load won’t fall short of your expectations since you want your product to be successful. Another word of advice: Keep performance improvements under control. An increase from 30% to 60% in performance could come cheap (just add some indexes), but to jump from 60% to 80% you might have to redesign big chunks of your application. Aim for low-hanging fruit and design your systems with horizontal scalability in mind.
The ability to understand as quickly as possible why performance issues happen is important. Monitoring low-hanging performance metrics (such as slow requests) for individual components or microservices is helpful in catching performance regressions early on. Then, analyzing the actual root cause of the problem can be very challenging. The analysis requires skills that many developers do not possess. So, teaching developers and DevOps how to think about performance issues and how to triage them is key to success here.
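As a small illustration of the "just add some indexes" kind of low-hanging fruit, the sketch below uses Python's built-in `sqlite3` module. The `orders` table, its columns, and the index name are hypothetical. It asks SQLite for its query plan before and after an index is created on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500, i * 1.5) for i in range(10_000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# Hypothetical index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a table scan and the second a search that uses the new index. Checking the plan this way is a cheap test of whether an index is actually being used before investing in a larger redesign.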
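One simple way to watch a metric like slow requests per component is sketched below in Python. The 500 ms threshold, the 5% tolerance, and the component names are all assumptions chosen for illustration. The code computes the share of slow requests for each component and flags those whose share has grown compared with a baseline.

```python
from collections import defaultdict

SLOW_THRESHOLD_MS = 500  # hypothetical cutoff for a "slow" request


def slow_request_rates(requests):
    """Map each component to its fraction of slow requests.

    `requests` is an iterable of (component, duration_ms) pairs.
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for component, duration_ms in requests:
        totals[component] += 1
        if duration_ms > SLOW_THRESHOLD_MS:
            slow[component] += 1
    return {component: slow[component] / totals[component] for component in totals}


def regressions(baseline, current, tolerance=0.05):
    """Return components whose slow-request rate rose by more than `tolerance`."""
    return sorted(
        component
        for component, rate in current.items()
        if rate - baseline.get(component, 0.0) > tolerance
    )


# Hypothetical samples from the previous release and the current one.
previous = [("checkout", 120)] * 9 + [("checkout", 900)] + [("search", 80)] * 10
latest = [("checkout", 120)] * 6 + [("checkout", 900)] * 4 + [("search", 80)] * 10

flagged = regressions(slow_request_rates(previous), slow_request_rates(latest))
print(flagged)
```

A check like this only says where a regression appeared, not why. Root cause analysis still requires the triage skills described above.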
We focus on keeping it simple so that we know it gets done. It would have been easy to create some all-encompassing mandate and elaborate process, but inevitably, these sorts of things often just get forgotten as people get busy or pressure is applied to get a new release, feature, or fix out the door. By implementing a plan that meshes simplicity and accountability, we are empowering each team to identify what’s important and using the larger team to challenge those ideas. Here’s our general process:
- Establish a list of KPIs for each BU and look for overlap.
- Focus at the start on the low-hanging fruit, common KPIs, and anything deemed high or critical by the stakeholders of each BU.
- Ensure QA processes are automated and capable of accurately measuring each KPI.
- Implement ways to measure each of the KPIs and assign accountability to make sure the products meet those goals.
- Finally, constantly review the metrics and work with the user community to make sure that the focus is kept on the right stuff. Some things are going to be obvious to everyone; others maybe not so much. We learn a lot in our forums and during our UIX interviews.
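The process above can be sketched in a few lines of Python. Everything in this example is hypothetical, including the business unit names, the KPI names, the limits, and the owners. It is only meant to show how overlap between BUs and accountability for each KPI might be made checkable in an automated QA step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    name: str
    owner: str    # team accountable for meeting the goal
    limit: float  # measured value must not exceed this


def shared_kpis(kpis_by_bu):
    """Return the KPI names that every business unit has listed."""
    name_sets = [{kpi.name for kpi in kpis} for kpis in kpis_by_bu.values()]
    return set.intersection(*name_sets) if name_sets else set()


def failing_kpis(kpis, measurements):
    """Return (kpi name, owner) pairs that are unmeasured or over their limit."""
    failures = []
    for kpi in kpis:
        value = measurements.get(kpi.name)
        if value is None or value > kpi.limit:
            failures.append((kpi.name, kpi.owner))
    return failures


# Hypothetical KPI lists for two business units.
kpis_by_bu = {
    "remote-support": [
        Kpi("page_load_ms", "web-team", 2000),
        Kpi("session_connect_ms", "relay-team", 1500),
    ],
    "automation": [
        Kpi("page_load_ms", "web-team", 2000),
        Kpi("script_start_ms", "agent-team", 800),
    ],
}

common = shared_kpis(kpis_by_bu)

# Hypothetical results from an automated QA run.
measured = {"page_load_ms": 2400, "session_connect_ms": 900}
failures = failing_kpis(kpis_by_bu["remote-support"], measured)
print(common, failures)
```

Treating a missing measurement as a failure is a deliberate choice in this sketch: a KPI that QA cannot measure is reported to its owner instead of passing silently.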
What are the keys to performance optimization and monitoring from your perspective?
By the way, here’s who we spoke to!
- Josh Gray, Chief Architect, Cedexis.
- Jeff Bishop, General Manager, ConnectWise Control.
- Bryan Jenks, CEO and Co-Founder, DropLit.io.
- Doru Paraschiv, Co-Founder, IRON Sheep TECH.
- Yoav Landman, Co-Founder and CTO, JFrog.
- Jim Frey, V.P. Strategic Alliances, Kentik.
- Eric Sigler, Head of DevOps, PagerDuty.
- Nick Kephart, Senior Director Product Marketing, ThousandEyes.
- Kunal Agarwal, CEO, Unravel Data.
- Len Rosenthal, CMO, Virtual Instruments.
- Alex Rysenko, Lead Software Engineer, and Eugene Abramchuk, Senior Performance Engineer, Waverley Software.