There has been a lot of discussion about multi-tenancy since the arrival of cloud computing. Multi-tenancy is the practice of hosting multiple unrelated resource allocations on the same hardware. With virtual machine instances, for example, multi-tenancy means multiple instances running on the same servers and sharing the CPUs, memory, and network adapters. With Software-as-a-Service (SaaS), multi-tenancy is represented by multiple clients all sharing the same application while their data is kept properly partitioned.
One consequence of multi-tenancy is that poorly behaved tenants can starve well-behaved tenants if controls are not put in place to limit resource consumption, and often they are not. Limits often go unimplemented or unenforced because the implementors believe they can scale up and out to meet demand. After all, this is the promise of cloud computing.
However, not all resources scale equally, and any reallocation out of the shared pool could negatively impact other tenants' ability to scale. One area where this is often forgotten is network bandwidth. While it's possible to control how much CPU time or memory a given machine instance can use, it is more difficult to control how much network bandwidth the processes of a particular tenant will use. Hence, one client could accidentally cause a denial of service for co-resident tenants if they all share the same network adapter or run on the same switch.
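One common way to cap a tenant's bandwidth at the application level is a token bucket. The following is a minimal sketch, not a description of how any particular provider enforces limits; the class name, the tenant names, and the rate and burst figures are all hypothetical. It also only throttles traffic the application itself sends, so it would not account for storage or hypervisor traffic generated on the tenant's behalf.

```python
import time


class TokenBucket:
    """Caps throughput at `rate_bytes_per_s`, allowing bursts up to `burst_bytes`."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.clock = clock                # injectable so the logic can be tested
        self.last = clock()

    def try_send(self, nbytes):
        """Return True and spend tokens if `nbytes` fits the budget, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# One bucket per tenant (hypothetical names and limits), so a noisy
# tenant only exhausts its own budget.
buckets = {
    "tenant-a": TokenBucket(rate_bytes_per_s=1_000_000, burst_bytes=2_000_000),
    "tenant-b": TokenBucket(rate_bytes_per_s=1_000_000, burst_bytes=2_000_000),
}
```

A caller would check `buckets[tenant].try_send(len(payload))` before writing to the network and queue or reject the payload when it returns False. Because each tenant has its own bucket, draining one has no effect on the others.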
What cloud service providers often forget is that everything in the cloud uses network bandwidth. Satisfying a client request consumes bandwidth, and that same request often generates additional traffic behind the scenes. This traffic may be communication with the storage network, database and application servers, or even the hypervisor substrate reallocating and rebalancing loads. At the end of the day, the network has physical limits, even with fiber-based connections, which are great for internal communication but rarely extend end-to-end. Rarely does anyone measure or understand the aggregate network load caused by a single tenant in a multi-tenant architecture, and in some cases it may not even be possible to attribute a particular network load to a single tenant's use.
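To make the idea of aggregate load concrete, here is a small sketch that sums one tenant's traffic across every network segment rather than only the client-facing one. It assumes flow records tagged by tenant are available, which, as noted above, is not always the case. The record layout, segment names, and byte counts are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical flow records: (tenant, network segment, bytes transferred).
flows = [
    ("tenant-a", "client", 40_000),
    ("tenant-a", "storage", 250_000),
    ("tenant-a", "database", 60_000),
    ("tenant-b", "client", 90_000),
    ("tenant-b", "storage", 10_000),
]


def aggregate_by_tenant(records):
    """Sum bytes per tenant across every segment, not just client traffic."""
    totals = defaultdict(int)
    for tenant, _segment, nbytes in records:
        totals[tenant] += nbytes
    return dict(totals)


def amplification(records, tenant):
    """Ratio of a tenant's total traffic to its client-facing traffic.

    Assumes the tenant has at least some client-facing traffic.
    """
    total = sum(n for t, _s, n in records if t == tenant)
    client = sum(n for t, s, n in records if t == tenant and s == "client")
    return total / client
```

In this made-up data set, tenant-a looks like the lighter tenant if you only watch client traffic, yet it moves far more bytes overall once storage and database traffic are counted.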
Beyond the obvious possible performance impact, why does this matter? When designing solutions for the cloud, it is imperative that you test your production environment's real operating performance and not rely on the specifications provided by the cloud provider. A machine instance with two cores operating at 2.5 GHz, 2048 megabytes of RAM, 500 gigabytes of disk space, and a 1-gigabit network adapter does not equate 1:1 in performance to dedicated hardware with the same specifications. You will need to instrument your application running in this environment rather than assume it will operate as it did in the test environment or, if migrating, the way it ran on your prior infrastructure. This means obtaining average IOPS for storage, average bandwidth over a reasonable usage period, and response times for key processes.
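As a starting point for that kind of instrumentation, the sketch below times a key operation repeatedly and reduces the samples to a few summary figures. The function names are hypothetical, and the workload in the last line is only a stand-in; in practice you would pass in a real request or storage operation and collect samples over a representative usage period.

```python
import statistics
import time


def time_call(fn, runs=50):
    """Call `fn` repeatedly and return the elapsed seconds for each call."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw samples to mean, median, 95th percentile, and maximum."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def average_rate(total_units, elapsed_seconds):
    """Average throughput, e.g. I/O operations or bytes per second."""
    return total_units / elapsed_seconds


# Stand-in workload; replace the lambda with a real key process.
report = summarize(time_call(lambda: sum(range(10_000))))
```

Tracking the 95th percentile and maximum alongside the average matters in a shared environment, because contention from other tenants tends to show up as occasional slow outliers rather than a uniform slowdown.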