The main point is that how effectively your application scales vertically or horizontally depends on several factors, and these go well beyond OS-level CPU and memory utilization.
Using the right tools properly and capturing application-specific metrics are crucial for identifying tuning opportunities. This approach also helps you determine the right initial and incremental infrastructure and middleware sizing for your on-premises or cloud production environment, reducing your client's hardware and hosting costs over the long term and improving ROI.
For example, consider your Java application's LIVE data (the OldGen footprint after a major collection). For some applications, the LIVE data depends mainly on the concurrent load and/or active users, e.g. session footprint and other long-lived cached objects. These applications benefit from vertical or horizontal scaling because the load is split across more JVM processes and/or physical VMs, reducing pressure on JVM fundamentals such as the garbage collection process.
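As a rough illustration of how the LIVE data figure can be sampled from inside a running JVM, here is a minimal sketch based on the standard `java.lang.management` API. The class name `LiveDataProbe` and the pool-name matching are assumptions made for this example: old generation pool names vary by collector (for example "PS Old Gen", "G1 Old Gen", "Tenured Gen"), and `getCollectionUsage()` reports usage after the most recent collection of that pool, which is only an approximation of the footprint after a true major collection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of any library.
final class LiveDataProbe {

    /** Heuristic (assumption): old generation pools contain "old gen" or "tenured" in their name. */
    static boolean isOldGenPool(String poolName) {
        String name = poolName.toLowerCase();
        return name.contains("old gen") || name.contains("tenured");
    }

    /**
     * Returns the bytes used in the old generation as of the most recent
     * collection of that pool, or -1 if no matching pool reports this value.
     * The value may be 0 if the pool has not been collected yet.
     */
    static long oldGenUsedAfterLastGc() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && isOldGenPool(pool.getName())) {
                // getCollectionUsage() may return null if the JVM does not support it for this pool.
                MemoryUsage afterGc = pool.getCollectionUsage();
                if (afterGc != null) {
                    return afterGc.getUsed();
                }
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long liveBytes = oldGenUsedAfterLastGc();
        if (liveBytes < 0) {
            System.out.println("No old generation collection data available.");
        } else {
            System.out.printf("OldGen used after last collection: %d MB%n",
                    liveBytes / (1024 * 1024));
        }
    }
}
```

Sampling this value at different load levels (for example idle, nominal and peak) gives a first indication of whether the LIVE data grows with concurrent load or stays roughly constant. GC logs and heap dump analysis remain the more reliable sources.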
In contrast, Java applications carrying a large LIVE data footprint due to excessive caching, memory leaks, etc. scale poorly, since this memory footprint is "cloned" entirely or partially in each new JVM process or physical VM. These applications benefit significantly from an application and JVM optimization project, which can improve both performance and scalability, reducing the need to "over-scale" your environment in the long term.
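The difference between the two application profiles can be made concrete with a simplified model. The sketch below assumes LIVE data splits cleanly into a fixed part (caches, leaked objects) that is fully cloned in every JVM and a load-dependent part (sessions) that is divided evenly across JVMs. The class `LiveDataScalingModel`, this two-part split and all the numbers are hypothetical. Real applications sit somewhere in between, and partial cloning is not modeled.

```java
// Hypothetical back-of-the-envelope model; all figures are made up for illustration.
final class LiveDataScalingModel {

    /**
     * Estimated LIVE data per JVM in MB.
     * Assumption: the fixed part is cloned in every JVM, while the
     * load-dependent part is split evenly across all JVMs.
     */
    static double perJvmLiveMb(double fixedMb, double loadDependentMb, int jvmCount) {
        if (jvmCount < 1) {
            throw new IllegalArgumentException("jvmCount must be >= 1");
        }
        return fixedMb + loadDependentMb / jvmCount;
    }

    /** Estimated LIVE data summed over all JVMs in MB. */
    static double totalLiveMb(double fixedMb, double loadDependentMb, int jvmCount) {
        return perJvmLiveMb(fixedMb, loadDependentMb, jvmCount) * jvmCount;
    }

    public static void main(String[] args) {
        int[] jvmCounts = {1, 2, 4};
        for (int n : jvmCounts) {
            // Application A: mostly load-dependent LIVE data (200 MB fixed, 1800 MB sessions).
            double perJvmA = perJvmLiveMb(200, 1800, n);
            // Application B: mostly fixed LIVE data (1800 MB caches, 200 MB sessions).
            double perJvmB = perJvmLiveMb(1800, 200, n);
            System.out.printf("JVMs=%d  A per JVM=%.0f MB  B per JVM=%.0f MB  B total=%.0f MB%n",
                    n, perJvmA, perJvmB, totalLiveMb(1800, 200, n));
        }
    }
}
```

With these made-up inputs, going from one to four JVMs drops application A from 2000 MB to 650 MB of LIVE data per JVM. Application B only drops from 2000 MB to 1850 MB per JVM, and its total footprint across the four JVMs grows to 7400 MB. Under this model, application B gets almost no GC relief from scaling out and pays for the cloned footprint in every new process.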