Two Approaches on Multi-Tenancy in the Cloud
First of all, increasing efficiency through sharing is a fundamental value proposition underlying all cloud computing initiatives. There is no debate that ...
- We should "share resources" to increase utilization and hence improve efficiency
- We should accommodate highly dynamic growth and shrinkage in demand, rapidly and smoothly
- We should "isolate" each tenant so that no sensitive information leaks between tenants
But at which layer should we facilitate that? At the hypervisor level or at the DB level?
Hypervisor level Isolation
A hypervisor is a very low-level layer of software that maps the
physical machine to a virtualized machine on which a regular OS runs.
When the guest OS issues privileged operations to the VM, the
hypervisor intercepts them and maps them to the underlying hardware.
The hypervisor also provides some traditional OS functions, such as
scheduling, to determine which VM runs next. A hypervisor can be
considered a very lean OS that sits very close to the bare hardware.
Depending on the specific implementation, a hypervisor introduces an
extra layer of indirection and hence incurs a certain percentage of
overhead. If we need a VM with less capacity than a physical machine,
the hypervisor lets us partition the hardware at a finer granularity
and hence improves efficiency by running more tenants on the same
physical machine. For light-usage tenants, this gain in efficiency
should offset the loss from the overhead.
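The trade-off above can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the 10% overhead and the per-tenant demand figures are assumed numbers, not measurements, and `tenants_per_machine` is a hypothetical helper.

```python
def tenants_per_machine(overhead_pct: int, demand_pct: int) -> int:
    """How many tenants fit on one physical machine under a hypervisor.

    overhead_pct: share of the machine lost to virtualization (assumed figure).
    demand_pct:   share of a bare machine that each tenant actually needs.
    """
    usable_pct = 100 - overhead_pct
    return usable_pct // demand_pct


# Light-usage tenants: each needs 5% of a machine, the hypervisor costs 10%.
light = tenants_per_machine(overhead_pct=10, demand_pct=5)

# Heavy tenant: needs 95% of a machine, so the overhead leaves no room for it.
heavy = tenants_per_machine(overhead_pct=10, demand_pct=95)

print(light, heavy)
```

Without virtualization each tenant would occupy a whole machine, so packing many light tenants onto one box is a large gain despite the overhead, while a tenant that needs nearly a full machine is better served by dedicated hardware.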
Since the hypervisor focuses on low-level system primitives, it
provides the cleanest separation and hence lessens security concerns.
In addition, by intercepting at the lowest layer, the hypervisor
retains the familiar machine model that existing system and network
admins already know. Since the application is completely agnostic to
the presence of the hypervisor, this minimizes the changes required to
move existing apps into the cloud and makes cloud adoption easier.
Of course, the downside is that virtualization introduces a certain
percentage of overhead, and the tenant still needs to pay for the
smallest VM even when none of its users is using it.
DB level Isolation
Here is another school of thought: if tenants are running the same
kind of application, the only difference is the data each tenant
stores. Why can't we just introduce an extra attribute "tenantId" in
every table and then append a "where tenantId = $thisTenantId" to
every query? In other words, add a hidden column and modify each
submitted query.
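A minimal sketch of the hidden-column idea, using Python's built-in `sqlite3` module. The `orders` table, the `TenantScopedDB` wrapper, and the tenant names are all hypothetical, and `extra_where` is assumed to come from the provider's own application code rather than from end users.

```python
import sqlite3


def make_shared_db() -> sqlite3.Connection:
    """Create one shared in-memory database with a hidden tenantId column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (tenantId TEXT NOT NULL, item TEXT NOT NULL)")
    return conn


class TenantScopedDB:
    """Hypothetical wrapper that adds the tenant filter to every statement."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self.conn = conn
        self.tenant_id = tenant_id

    def add_order(self, item: str) -> None:
        # The tenantId value is filled in by the wrapper, never by the caller.
        self.conn.execute(
            "INSERT INTO orders (tenantId, item) VALUES (?, ?)",
            (self.tenant_id, item),
        )

    def find_orders(self, extra_where: str = "1=1", params: tuple = ()) -> list:
        # The caller's condition is parenthesized so that an OR inside it
        # cannot bypass the tenant filter.
        sql = (
            "SELECT item FROM orders "
            f"WHERE tenantId = ? AND ({extra_where}) ORDER BY rowid"
        )
        rows = self.conn.execute(sql, (self.tenant_id, *params))
        return [row[0] for row in rows]


conn = make_shared_db()
acme = TenantScopedDB(conn, "acme")
globex = TenantScopedDB(conn, "globex")
acme.add_order("widget")
acme.add_order("gadget")
globex.add_order("sprocket")

print(acme.find_orders())                           # only acme's rows
print(globex.find_orders("item = ?", ("widget",)))  # acme's widget is invisible
```

Both tenants share one table, yet each wrapper only ever sees its own rows. Any code path that reaches the connection without going through the wrapper would see everything, which is why the quality of the rewriting layer matters so much.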
In addition, the cloud provider usually needs to re-architect the
underlying data layer and move to a distributed and partitioned DB.
Some of the more sophisticated providers also need to invest in
developing intelligent data placement algorithms based on workload
patterns.
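As a rough illustration of workload-based placement, the sketch below assigns tenants to partitions greedily, heaviest tenant first, always onto the least-loaded partition. This is only one simple heuristic, not a description of any provider's actual algorithm; `place_tenants` and the load figures are made up for the example.

```python
import heapq


def place_tenants(workloads: dict, n_partitions: int) -> dict:
    """Greedy placement: heaviest tenant first, onto the least-loaded partition."""
    # Heap entries are (current load, partition index); ties go to the lower index.
    heap = [(0, p) for p in range(n_partitions)]
    heapq.heapify(heap)
    placement = {}
    # Sort by descending load, then by name so the result is deterministic.
    for tenant, load in sorted(workloads.items(), key=lambda kv: (-kv[1], kv[0])):
        current_load, partition = heapq.heappop(heap)
        placement[tenant] = partition
        heapq.heappush(heap, (current_load + load, partition))
    return placement


# Hypothetical observed load per tenant (for example, queries per second).
observed = {"a": 50, "b": 30, "c": 20, "d": 10}
placement = place_tenants(observed, n_partitions=2)
print(placement)
```

A real provider would also have to weigh data size, migration cost, and how workloads shift over time, which is where the investment mentioned above goes.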
In this approach, the degree of isolation is only as good as the
rewritten query. In my opinion, this doesn't seem to be hard to get
right, although it is less proven than the hypervisor approach.
The advantage of DB level isolation is that there is no VM overhead and no minimum charge to the tenant.
However, we should compare these two approaches not just from a
resource utilization and efficiency perspective, but from other
perspectives as well, such as ...
Freedom of choice on technology stack
Hypervisor isolation gives its tenants maximum freedom over the
underlying technology stack. Each tenant can choose the stack that
best fits its application's needs and in-house IT skills. The tenant
is also free to move to the latest technologies as they evolve.
This freedom of choice comes at a cost, though. The tenant needs to
hire system administrators to configure and maintain the technology
stack.
With DB level isolation, the tenants live within a set of predefined
data schemas and application flows, so their degree of freedom is
limited to whatever set of parameters the cloud provider exposes. The
tenants' applications are also "locked in" to the cloud provider's
framework, which creates a tight coupling and dependency between the
tenant and the cloud provider.
Of course, the advantage is that the tenant does not need to administer the technology stack.
Reuse of Domain Specific Logic
Since it focuses on the lowest layer of resource sharing, hypervisor
isolation provides no reuse at the app logic level. Tenants need to
build their own technology stack from the ground up and write their
own application logic.
In the DB isolation approach, the cloud provider pre-defines a set of
templates for DB schemas and application flow logic based on its
domain expertise (it is important that the cloud provider be a
recognized expert in that field). The tenant can leverage the cloud
provider's domain expertise and focus purely on business operations.
Conclusion
I think each approach will attract a very different (and clearly disjoint) set of audiences.
Notice that DB-level isolation commoditizes everything and makes it
very hard to create product feature differentiation. If I am a
technology startup trying to develop a killer product, then my core
value is my domain expertise. In this case, I won't go with DB-level
isolation, which imposes too many constraints for me to distinguish my
product from "anyone else". Hypervisor level isolation is much better
because I can outsource the infrastructure layer and focus on my core
value.
On the other hand, if I am operating a business rather than building a
product, then I would like to outsource all supporting functions,
including my applications. In this case, I would pick the best app
framework provided by the market leader and follow its best practices
(and be very willing to live by its constraints). DB level isolation
is more compelling in this case.