Hardware Sizing for Java/Java EE Products
Many attributes contribute to the final sizing tabulation when it comes to hardware sizing on Java/JEE — especially when indexing frameworks like Lucene are involved.
For hardware sizing on Java/JEE, especially when indexing frameworks like Lucene are involved, many attributes contribute to the final sizing tabulation.
Before we get started with a sizing exercise, we need to understand that the following will impact the accuracy of our metrics:
The exact version of the runtime, frameworks, and servers used.
The experience of the senior engineer/architect doing the sizing exercise.
An understanding of the functional characteristics of the system being built.
An agreement upon the non-functional characteristics of the system being built.
An appreciation of the environments being sized (development, testing, production, UAT, etc.).
The future extensions or possible lifeline of the system being built.
The following are the most important standard guidelines and criteria for hardware or server sizing. Please note that these guidelines are for the server that hosts the application. They do not contain any database sizing guidelines.
The hardware component should operate at no more than 80% utilization.
Processors and memory resources should be allocated for maximum user load.
User think times and network latency should be taken into account.
Know the number of potential users and number of concurrent users.
Know the service time and average response time of your application.
If you are using Solr/Lucene or another disk-based indexing framework, it is important to estimate the full index size from the number of documents, the number of indexed fields, the number of stored fields, and the average size of each document.
Adding a buffer to that estimate gives you the disk space to provision. When estimating memory requirements for Solr/Lucene, the number of unique terms per field also needs to be considered. In the references below, I have provided a sheet (made publicly available by Lucidworks) that lists the attributes needed to tune your memory and disk space sizing for Solr/Lucene, especially with respect to the caching of query terms.
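The disk estimate described above can be sketched as a small calculation. This is only a rough illustration, not the Lucidworks sheet: the class name `IndexDiskEstimate`, the two ratio parameters, and the sample numbers in `main` are all assumptions made for this example. The ratios in particular should be replaced with values measured from a sample index built from your own documents.

```java
/** Rough disk estimate for a Lucene/Solr index. All ratios are illustrative assumptions. */
final class IndexDiskEstimate {

    /**
     * @param documents        expected number of documents
     * @param avgDocumentBytes average raw size of one document, in bytes
     * @param indexedRatio     assumed index bytes per raw byte for the indexed fields
     * @param storedRatio      assumed stored bytes per raw byte for the stored fields
     * @param bufferFraction   safety buffer, for example 0.25 for 25%
     * @return estimated disk space in bytes, buffer included
     */
    static long estimateBytes(long documents, long avgDocumentBytes,
                              double indexedRatio, double storedRatio,
                              double bufferFraction) {
        double rawBytes = (double) documents * avgDocumentBytes;
        double indexBytes = rawBytes * (indexedRatio + storedRatio);
        return (long) Math.ceil(indexBytes * (1.0 + bufferFraction));
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 100,000 documents of about 2 KB each, 25% buffer.
        long bytes = estimateBytes(100_000L, 2_048L, 0.25, 0.75, 0.25);
        System.out.printf("Estimated index size: %.1f MB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Keeping the ratios as explicit parameters makes it easy to rerun the estimate once real measurements are available.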
While doing the sizing exercise, you can present a separate tabulated result for each environment. Alternatively, you may present a single tabulated result, stating the environment it applies to and any additional constraints that apply across environments. It is important to add a buffer to each computed attribute, keeping cost and future extensibility in mind.
Many people conclude that, because hardware is inexpensive these days, we can simply recommend something well beyond the maximum expected load. Though this almost always works, it does not give us a minimum estimate at the least cost. Working out that minimum estimate equips us to better understand the issues that various functional and non-functional aspects may cause in the future, especially if we want maximum efficiency within the constraints across all possible loads.
For example, if we work to such a minimum estimate in the development, testing, or UAT environments, we may be able to spot a memory leak caused by an incorrect development practice or deployment strategy. With the over-provisioning approach, we may also end up giving an inflated estimate for an otherwise small system, and the extra resources may go unused.
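One way to make such a leak visible on a tightly sized development or test machine is to sample heap usage during a long-running test and look at the trend. The following is a minimal sketch under that assumption. The class name `HeapTrend` and the least-squares approach are choices made for this example, not something prescribed by the sizing exercise itself.

```java
import java.lang.management.ManagementFactory;

/** Sketch: detect steady heap growth from periodic samples. */
final class HeapTrend {

    /** Reads the heap currently in use by the running JVM, in bytes. */
    static long currentUsedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Least-squares slope of the samples, in bytes per sampling interval. */
    static double slopePerSample(long[] usedHeapBytes) {
        int n = usedHeapBytes.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long y : usedHeapBytes) {
            meanY += y;
        }
        meanY /= n;
        double numerator = 0;
        double denominator = 0;
        for (int i = 0; i < n; i++) {
            numerator += (i - meanX) * (usedHeapBytes[i] - meanY);
            denominator += (i - meanX) * (i - meanX);
        }
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Hypothetical samples taken at equal intervals during a soak test.
        long[] samples = {100L << 20, 120L << 20, 140L << 20, 160L << 20};
        System.out.printf("Heap growth: %.0f bytes per interval%n", slopePerSample(samples));
        System.out.println("Heap in use right now: " + currentUsedHeapBytes() + " bytes");
    }
}
```

A slope that stays clearly positive over many intervals, even after garbage collections, is a hint worth investigating. It is not proof of a leak on its own, since caches and warm-up can also cause growth.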
Before I take you to the tabulation, there are a few terms that need to be defined.
User think time: The time during which the user is not actually using the processor (the time between requests). It is often used interchangeably with user wait time. In practice, however, it has a slightly different meaning: it covers the time a user needs to think about and perform the next action in the application, whether in reaction to the response or otherwise.
Response time: The average response time measured at the client under load.
Concurrent users: The number of users measured on the server, taken in snapshots from the server status or server console.
Service time: The elapsed time to complete the operation measured for a single user.
Maximum user load: The maximum number of concurrent users that may be expected or for which the system is tested.
User wait time: The time elapsed between actions or clicks for a given user. It is often used interchangeably with user think time. In practice, however, it has a slightly different meaning: it covers the time a user spends analyzing or reading the data received between requests, as well as on other tasks such as reading email, using the telephone, chatting with a colleague, or working in other applications running at the same time. Software and performance testing can be put to good use here to improve the user experience and/or performance.
CPU utilization: Average of the total CPU utilization as a percentage.
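The terms above can be combined into a rough core-count estimate. The sketch below uses two standard queueing results, the interactive form of Little's Law (throughput = concurrent users / (response time + think time)) and the utilization law (busy cores = throughput × CPU time per request). This is not necessarily how the figures in this article were derived. The class name `CpuSizing` and the sample inputs are assumptions, and the sketch treats service time as CPU time per request, which is a simplification.

```java
/** Sketch: estimate processor cores from concurrency, timing, and a utilization ceiling. */
final class CpuSizing {

    /** Requests per second implied by the interactive form of Little's Law. */
    static double throughputPerSecond(int concurrentUsers,
                                      double responseTimeSec,
                                      double thinkTimeSec) {
        return concurrentUsers / (responseTimeSec + thinkTimeSec);
    }

    /** Cores needed so that average CPU utilization stays at or below the ceiling. */
    static int requiredCores(double throughputPerSec,
                             double cpuServiceTimeSec,
                             double maxUtilization) {
        double busyCores = throughputPerSec * cpuServiceTimeSec;
        return (int) Math.ceil(busyCores / maxUtilization);
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 500 concurrent users, 0.5 s response time,
        // 9.5 s think time, 50 ms of CPU per request, 80% utilization ceiling.
        double throughput = throughputPerSecond(500, 0.5, 9.5);
        int cores = requiredCores(throughput, 0.05, 0.80);
        System.out.printf("Throughput: %.1f req/s, cores needed: %d%n", throughput, cores);
    }
}
```

With these illustrative inputs the sketch arrives at 50 requests per second and 4 cores. Real inputs should come from measurements taken under load.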
The final tabulated hardware sizing recommendation for the Java/JEE product will look like the following.
The load balancing, data clustering, failover strategy, and backup strategy are not planned for due to the nature of the system.
FIELD NAME | FIELD TYPE |
Type of environment | Development [/testing] |
Type of machines | Physical [/virtual] |
Number of servers | 1x |
Operating system | Red Hat Enterprise Linux (Linux X.Y.ZZ-AAA.BB.C.eRR.xpp_bb OS) |
Application server | Weblogic ??c (Weblogic ??.?.?*) |
Load balancing | [NONE] |
Data clustering | [NONE] |
Failover strategy | [NONE] |
Database connections | 10 [ |
Backup strategy | [NONE] |
Processors | 4 Cores |
Concurrency | ~500 concurrent users [including think times] |
Memory/RAM | 4GB |
Garbage collection | Generational garbage collector [ |
Disk: Capacity | ~10GB SSD (/HDD) |
Disk: Reasons | [Logs, indexes, dependencies, + buffer] |
Disk: Lucene Indexing | [~300MB worst case, + buffer] |
Java Heap: Size | Dedicated machine [-Xms=??g -Xmx=??.?g] |
Java Heap: Lucene | ~100MB [worst case, + buffer] |
Java Heap: Second Level Caching | ~000MB [NONE] |
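Once a machine has been provisioned, it can be useful to confirm that the running JVM actually sees the processors, heap, and garbage collector that the sizing assumed. The following is a minimal sketch using standard JDK APIs. The class name `JvmSettingsReport` is an assumption for this example, and the heap values you pass on the command line (for instance `-Xms2g -Xmx2g`) are illustrative only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Sketch: report what the running JVM sees for cores, maximum heap, and collectors. */
final class JvmSettingsReport {

    /** Number of processors available to this JVM. */
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Cores:      " + availableCores());
        System.out.println("Max heap:   " + maxHeapMegabytes() + " MB");
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Running this with the same JVM options as the application server gives a quick check that the heap settings in the table were applied as intended.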
This recommendation is for the development environment, and it is best to use and emulate it for any development or testing. For production or UAT environments, the considerations related to the following (with our recommendations in brackets):
Storage capacity [500GB SSD].
Storage redundancy (RAID).
Processor cores [8+].
Total memory (RAM) [8GB+].
Application failover strategy [Active-Active with 4x Physical Servers].
...should best match other organizational or hardware tier standards.
Please take a look at the reference links, which can help you get the best results from your hardware sizing and capacity planning exercises. I plan to return in the near future with a hardware sizing calculator for Java/JEE products, which should serve as a help and reference when doing initial evaluations or estimations to report to customers or management.
Happy hardware sizing for Java/JEE products!
Note: Recently, I was a Software Development Architect at a software product company. This write-up is based on work done as part of a special product customization for a big logistics customer, as well as for later use in the product itself.
References
Published at DZone with permission of Sumith Puri. See the original article here.