In the "Data Path": Dumping the Database
Ari Zilka has been doing road shows where he talks to CIOs at Fortune 500 companies about virtualization. "We're seeing a lot of traction around private clouds," he said. Customers are telling Terracotta, "I've played around in the cloud and my teams are telling me that the database is the problem." Zilka said that's because there's not a good controllable interface in the cloud, and since the disks are abstracted, you can't run large database server instances. This is a problem for stateless applications that rely on databases for all of their storage. Zilka reports that cloud and data grid vendors are saying, "Now is the time to redesign applications for the cloud. If you design them right this time, using grids and cloud-friendly distribution models of data, then you will have scalability in the cloud." The only problem, Zilka says, is that CIOs are not interested in funding a complete rewrite or even a small update of how they manage data unless it is absolutely necessary.
According to Zilka, there are two major bottlenecks to scalability. The first one is the database, which Terracotta addresses by offloading work from it. However, it wasn't easy to dump the database without drastic data management changes. "I think three or four years ago we all thought that object caches could replace databases for certain use cases, and when you replace a database you get rid of the impedance mismatch," said Zilka. "That impedance mismatch is very expensive in terms of overhead for every database. Hibernate and other ORMs attempt to solve the impedance mismatch problem."
However, there were problems with object caches as well. Database engineers didn't understand them. "I have a friend who taught me a lot about this," Zilka said. "His name is Ben Wang, and he used to work at JBoss and now he's at Alibaba.com." Wang told Zilka, "If you get into the 'data path', [object caching] is much easier to consume." With EHcache as a cache of databases and Hibernate using EHcache, Zilka says, "we're in the data path." The data path means an application with EHcache runs faster because it thinks it's talking to the database, but in reality, it's actually talking to its own memory. This makes caching easy for data engineers to understand. Zilka says, in the past, Terracotta circumvented the impedance mismatch by switching databases over to object-oriented caching. Today Terracotta says, "just slot me into that 'impedance' and I'll take out the overhead of the 'mismatch' without changing the programming or architecture environment."
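To make the "data path" idea concrete, here is a minimal read-through cache sketch in plain Java. It is not EHcache's or Hibernate's API; the names `ReadThroughCache`, `databaseLoader`, and `DataPathDemo` are hypothetical, and the "database" is simulated by a lambda. The point is only that the caller asks for data the same way every time, and the loader is invoked only on a miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: callers request a key as if they were
// querying the database; the loader runs only when the key is not in memory.
class ReadThroughCache<K, V> {
    private final Map<K, V> memory = new ConcurrentHashMap<>();
    private final Function<K, V> databaseLoader; // stands in for a real query
    private final AtomicInteger databaseHits = new AtomicInteger();

    ReadThroughCache(Function<K, V> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    V get(K key) {
        // computeIfAbsent calls the loader only on a cache miss.
        return memory.computeIfAbsent(key, k -> {
            databaseHits.incrementAndGet();
            return databaseLoader.apply(k);
        });
    }

    // Number of times the simulated database was actually consulted.
    int databaseHits() {
        return databaseHits.get();
    }
}

class DataPathDemo {
    public static void main(String[] args) {
        // Assumption: the lambda simulates a slow database lookup.
        ReadThroughCache<Integer, String> users =
                new ReadThroughCache<>(id -> "user-" + id);
        users.get(42); // miss: goes to the "database"
        users.get(42); // hit: served from memory
        System.out.println(users.databaseHits()); // prints 1
    }
}
```

In this sketch the calling code never changes whether the value comes from memory or from the loader, which is the property the article attributes to sitting in the data path.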
The second bottleneck is what Zilka calls "a change management elasticity bottleneck." Zilka explains, "When you add more nodes, people are finding that their infrastructure doesn't notice that they've added them." If there are no load balancers in the cloud, it is difficult to make new nodes share the work. To solve this bottleneck, Terracotta created solutions for deploying new application nodes, along with software load balancing that is compatible with the cloud. The solutions are based on HAProxy, an open source load balancer that ships with Terracotta's EC2 and private cloud deployments.
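The elasticity problem can be illustrated with a small software load balancer sketch. This is not HAProxy or Terracotta's implementation; `RoundRobinBalancer` and its methods are hypothetical names, and round-robin is just one simple policy chosen for illustration. It shows how a node registered at runtime starts receiving work without any change to the callers.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical round-robin balancer whose node list can change at runtime.
class RoundRobinBalancer {
    private final List<String> nodes = new CopyOnWriteArrayList<>();
    private final AtomicLong counter = new AtomicLong();

    // Register a newly started application node.
    void addNode(String address) {
        nodes.add(address);
    }

    // Deregister a node that is being shut down.
    void removeNode(String address) {
        nodes.remove(address);
    }

    // Pick the node that should handle the next request.
    String next() {
        // Take a snapshot so a concurrent add or remove cannot
        // invalidate the index computed below.
        Object[] snapshot = nodes.toArray();
        if (snapshot.length == 0) {
            throw new IllegalStateException("no nodes registered");
        }
        int index = (int) Math.floorMod(counter.getAndIncrement(), (long) snapshot.length);
        return (String) snapshot[index];
    }
}
```

Because `next()` reads the current node list on every call, a node added with `addNode` is included in the rotation from the following request onward.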
"Over 70% of apps use EHcache today," says Zilka. EHcache users can plug into Terracotta's cloud-ready server and keep all of their data in sync across the cloud. The server will run applications with EHcache just like a Tomcat instance and keep them consistent as the cloud grows and shrinks based on user demand. "It'll do all this transparently," says Zilka. "You don't have to re-design anything." According to Zilka, when CIOs learn about Terracotta they say, "I think I'm ready to go to cloud."