Address Non-Functional Requirements: How To Improve Performance
This article looks at specific performance issues that can emerge during a product's lifecycle and how to address them.
The performance of a software system is a measure of how fast or responsive that system is under a given workload on given hardware. By workload, I mean the data requirements in the backend and the volume of requests. Hardware is defined by the system's capacity, such as CPU and memory.
How To Identify a Performance Problem
Most performance problems stem from a queue that builds up somewhere in the system. The usual causes are an inefficient code block that processes requests slowly, serial access where concurrent access would be possible, or limited resources that cannot provide the capacity required for efficient processing.
What Is Latency?
Latency is the time a request spends within a system, from its arrival until its response is returned, measured in units of time. To improve performance, our goal should be to minimize latency.
What Is Throughput?
Throughput is a measure of how many requests a system can process in a given time. It is a rate, for example requests per second, and it depends on latency: the less time each request takes, the more requests the system can process in the same period. Our goal should be to maximize throughput.
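As a rough illustration of the two measures, the sketch below times a stand-in request handler. The `handle_request` function and the `measure` helper are hypothetical names used only for this example, and the handler simply does some arithmetic in place of real work.

```python
import time

def handle_request(payload):
    # Hypothetical request handler standing in for real work.
    return sum(range(payload))

def measure(handler, payloads):
    """Return (average latency in seconds, throughput in requests per second)."""
    latencies = []
    started = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        handler(payload)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    return sum(latencies) / len(latencies), len(latencies) / elapsed

avg_latency, throughput = measure(handle_request, [10_000] * 200)
print(f"average latency: {avg_latency * 1000:.3f} ms, throughput: {throughput:.0f} req/s")
```

Because this loop handles one request at a time, throughput can never exceed one divided by the average latency. Processing requests concurrently is what allows throughput to rise without lowering the latency of each request.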
How To Address Network Latency
A network latency bottleneck can be of two broad types, connection-related and data-transfer-related. We will see both of them now.
1. Connection Bottleneck
A connection-related problem can be addressed by using connection pools or persistent connections. Database connection pooling is a way to reduce the cost of opening and closing connections by maintaining a “pool” of open connections that can be passed from database operation to database operation as needed. A persistent connection, also known as a Hypertext Transfer Protocol (HTTP) persistent connection, refers to a network communication channel that remains open for further HTTP requests and responses instead of closing after a single exchange. Persistent connections are also called HTTP keep-alive and HTTP connection reuse.
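The pooling idea can be sketched in a few lines. This is a minimal illustration, assuming SQLite as the database and a hypothetical `ConnectionPool` class written for this example. A production system would normally rely on the pool provided by its database driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Tiny illustrative pool: opens `size` connections once and reuses them."""

    def __init__(self, database, size=2):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be handed to other threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # blocks if every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection instead of closing it

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()

pool = ConnectionPool(":memory:", size=2)
with pool.connection() as conn:
    result = conn.execute("SELECT 1 + 1").fetchone()[0]
with pool.connection() as conn:
    second = conn.execute("SELECT 2 * 3").fetchone()[0]
```

Each database operation borrows an already open connection and hands it back afterwards, so the cost of opening a connection is paid only when the pool is created.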
2. Data Transfer Bottleneck
A data transfer bottleneck can be addressed by caching, by using a more compact data format, and by compressing data. Data caching stores copies of data or files in a temporary storage location, or cache, so they can be accessed faster. A compact data format reduces the size of the data that has to be transferred over the network. Compression uses carefully designed algorithms to reduce the size of various kinds of data further.
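The effect of a compact format plus compression can be seen with a small experiment. The sketch below assumes a hypothetical JSON response made of 1,000 similar records and uses gzip from the Python standard library. The actual savings depend on how repetitive the real data is.

```python
import gzip
import json

# Hypothetical API response: 1,000 similar records.
records = [{"id": i, "status": "active", "region": "eu-west"} for i in range(1000)]

# Compact separators drop the whitespace that json.dumps adds by default.
raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")

# The receiver reverses the steps to get the original data back.
restored = json.loads(gzip.decompress(compressed))
```

Compression trades some CPU time for fewer bytes on the wire, so it tends to pay off most for large, repetitive payloads.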
How To Address Memory Latency
Memory latency can be addressed in four broad ways. We will discuss them one by one.
1. Avoid Bloating of Memory
A process should use as little memory and as little code as possible, because the code base is also loaded into memory before it is executed by the processor. The fewer instructions there are, the less back and forth there is between RAM and the processor, so a smaller code base is a good thing. The other concern with memory bloat is heap usage: the heap space we use should be as small as possible, because that creates less work for the garbage collector and lowers the chance of the process running out of memory.
2. Weak/Soft References
Whenever the runtime notices that a process is running short of memory, it can let the garbage collector reclaim objects that are reachable only through weak or soft references. This is especially useful for garbage collecting large objects.
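Here is a minimal sketch of the idea in Python, which offers weak references through its `weakref` module but has no direct equivalent of the JVM's soft references. The `Report` class is a hypothetical large object made up for this example.

```python
import gc
import weakref

class Report:
    """Hypothetical large object that is expensive to keep in memory."""
    def __init__(self, name):
        self.name = name
        self.rows = list(range(100_000))

# Entries disappear automatically once nothing else references the value.
cache = weakref.WeakValueDictionary()

report = Report("monthly")
cache["monthly"] = report
in_cache_while_referenced = "monthly" in cache   # a strong reference still exists

del report      # drop the only strong reference
gc.collect()    # CPython frees the object right away; collect() covers other runtimes
in_cache_after_release = "monthly" in cache
```

The cache never keeps an object alive on its own, so it cannot cause the process to run out of memory. The cost is that an entry may have to be rebuilt after it has been collected.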
3. Divide Large Batch Process
When a large batch process consumes a lot of memory, it is often advisable to apply the divide-and-conquer principle: split the processing into smaller units, each of which uses a small, manageable amount of memory.
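One possible way to split a batch is to pull items from the source in fixed-size chunks. In the sketch below, `chunks` and `process_batch` are hypothetical helpers, and the chunk size of 1,000 is an arbitrary value chosen for illustration.

```python
from itertools import islice

def chunks(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

def process_batch(batch):
    # Hypothetical per-batch work; here it just sums the values.
    return sum(batch)

# range() is lazy, so only one batch of 1,000 items is materialized at a time.
total = 0
batch_count = 0
for batch in chunks(range(10_000), 1_000):
    total += process_batch(batch)
    batch_count += 1
```

Peak memory now depends on the chunk size rather than on the size of the whole input.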
4. Garbage Collection Algorithm
The garbage collection algorithm should match the use case. Batch processes, where overall throughput matters more than pause times, typically use the Parallel collector, while real-time processing, where short pauses matter most, typically uses the CMS collector.
How To Address Disk Latency
Disk latency can be addressed in three broad areas. Let's discuss them in detail.
1. Logging
When logging, we write sequentially to a log file, and sequential IO is much faster than random IO. If we can log as much data as possible in one go, that helps reduce context-switching costs. Also, wherever possible, use asynchronous logging, which hands the data to be logged from the main thread to another thread.
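Asynchronous logging can be sketched with the `QueueHandler` and `QueueListener` classes from Python's standard `logging.handlers` module. The `ListHandler` class below is a hypothetical stand-in for a file handler, used here so the example stays self-contained.

```python
import logging
import logging.handlers
import queue

log_queue = queue.Queue()   # unbounded queue between the two threads
records = []                # stands in for a log file in this sketch

class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted messages in a list."""
    def emit(self, record):
        records.append(self.format(record))

# The listener runs in a background thread and does the slow IO there.
listener = logging.handlers.QueueListener(log_queue, ListHandler())
listener.start()

logger = logging.getLogger("app.async")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))  # cheap enqueue only

for i in range(3):
    logger.info("processed order %d", i)

listener.stop()   # drains the queue and joins the background thread
```

The calling thread only places a record on the queue, while the write itself happens on the listener's thread. In a real application the listener would wrap something like a `logging.FileHandler`.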
2. Web Content Files
Static content should be served from a reverse proxy, which can keep it in memory. Also make use of the page cache, which keeps pages that have already been read in RAM. In addition, zero copy can be used when sending files over the network: the data is moved in kernel mode without being copied through user mode, which makes the transfer faster. All these facilities are typically available in a reverse proxy.
3. DB Disk Access
One way to significantly improve DB response time is to keep denormalized fields on the primary table instead of joining multiple tables. Also, using indexes avoids full table scans, because an index lets the database pinpoint the disk location of a record.
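The effect of an index can be observed by asking the database for its query plan. The sketch below assumes SQLite and a hypothetical `users` table. Other databases expose similar information through their own `EXPLAIN` variants, with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT name FROM users WHERE email = ?"
params = ("user500@example.com",)

def plan(sql, args):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query, params)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, params)    # the lookup now goes through the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the table and the second names `idx_users_email`.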
How To Address CPU Latency
Inefficient Algorithms and Queries
Use efficient algorithms and efficient queries to improve performance. Load test complex algorithms with larger data volumes and simultaneous requests to check how the system performs under load.
Batch processing improves performance by combining multiple calls into a single call, which reduces CPU overhead. If your application has heavy inter-thread communication, try adding a short delay between the inter-thread checks so that they put a little less load on the CPU. Async IO in a separate thread improves performance by delegating the work without blocking the main thread: the IO is handed to an asynchronous thread, which does the processing and returns the result to the main thread. Choosing the right thread pool size is important for reducing context switching and CPU latency.
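Delegating blocking IO to a bounded thread pool might look like the sketch below. The `fetch` function is a hypothetical IO call simulated with a short sleep, and the pool size of 4 is an arbitrary value for illustration. The right size depends on the workload and the hardware.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(resource_id):
    # Hypothetical blocking IO call, simulated with a short sleep.
    time.sleep(0.05)
    return resource_id * 2

resource_ids = list(range(8))

# A bounded pool: enough workers to overlap the waits, but not one thread per task.
started = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fetch, resource_ids))
elapsed = time.perf_counter() - started

print(f"{len(results)} calls in {elapsed:.2f} s")
```

With four workers, the eight simulated calls overlap instead of running back to back. A much larger pool would not make IO-bound work like this finish meaningfully faster, but it would add more threads for the operating system to switch between.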
How To Select the Right Data Store for Performance
For ACID transactions where the data is relational and structured, a relational database is recommended. The question then is how to improve the performance of the relational database. Caching with a NoSQL key-value store is recommended to reduce the load on the relational database. An archival service that moves old data from the relational database to a NoSQL store is also recommended, because it reduces both the load on the relational database and its size.
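The caching recommendation is often implemented with a cache-aside read path, sketched below. A plain dictionary stands in for the external key-value store, SQLite stands in for the relational database, and the `products` table and `get_product_name` function are hypothetical names for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO products (id, name) VALUES (1, 'keyboard')")

cache = {}                 # stand-in for an external key-value store
stats = {"db_reads": 0}    # counts how often the database is actually queried

def get_product_name(product_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    if product_id in cache:
        return cache[product_id]
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT name FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[product_id] = name   # populate the cache for later reads
    return name

first = get_product_name(1)    # miss: reads the database
second = get_product_name(1)   # hit: served from the cache
```

Only the first read reaches the database. A real deployment also needs a policy for expiring or invalidating cached entries when the underlying rows change, which this sketch leaves out.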
A NoSQL document store is recommended for a write-heavy system with fewer reads. For a read-heavy system, a NoSQL columnar database is recommended. For caching requirements, use a NoSQL key-value store. For search use cases, use Elasticsearch, which has a document store and can perform fuzzy search. To enable analytics, events can be sent asynchronously to Kafka, which feeds a Spark Streaming cluster, and the results are stored in a Hadoop cluster from where AI/ML jobs can run. Images and videos can be stored in object storage, and a CDN service can be used to improve performance with the help of local caching.
Performance is an important non-functional requirement, and improving it is an interesting part of software development. Getting it right requires a broad spectrum of skills and should be an integral part of the software development lifecycle. This article outlines the broad areas to look at, which you can then address in your own projects.