Three High-Performance Caching Methods

By Mitch Pronschinske · Feb. 11, 10 · Interview
Information systems performance and SOA performance are key concerns for architects who plan to implement an enterprise data virtualization layer. Data virtualization is a form of data integration that federates many data sources, both relational and non-relational, and organizes them in a logical, virtualized manner rather than a physically consolidated one. Performance matters most in environments with widely distributed data sources, where network latency cannot be controlled, and in real-time or near-real-time environments that demand fast responses from the participating data services. In practice, the data virtualization layer's perceived performance is dominated by response latency, and the bottlenecks that network latency creates in an overloaded virtualization layer can be reduced through high-performance caching.

Three factors contribute to the data virtualization layer's response latency: the network, the middleware, and the data sources. When all three are located on the same subnets, network latency will be relatively constant. In that scenario, architects can reduce the response latency of the slowest data source to lower latency for the entire solution. Another approach is high-performance caching: with a caching system in place, many client requests are fulfilled from cached data, which reduces the number of requests that hit the production data sources.
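The core idea can be sketched as a cache-aside pattern. This is a minimal illustration, not any specific vendor's middleware; the fetch callable and TTL value are assumptions standing in for a real production data source and its freshness policy:

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: serve repeated requests from cache,
    going to the (slow) data source only on a miss or after expiry."""

    def __init__(self, fetch, ttl_seconds=60.0):
        self._fetch = fetch          # callable that hits the real data source
        self._ttl = ttl_seconds
        self._store = {}             # key -> (value, expiry timestamp)
        self.source_hits = 0         # how often we had to go to production

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]          # fresh cached value: no source round-trip
        value = self._fetch(key)     # miss or expired: query the source
        self.source_hits += 1
        self._store[key] = (value, time.monotonic() + self._ttl)
        return value

# Hypothetical slow source: every call here is a production round-trip.
cache = CacheAside(fetch=lambda key: f"row-for-{key}")
for _ in range(100):
    cache.get("customer:42")        # 100 client requests...
print(cache.source_hits)            # ...but only 1 reaches the source
```

The production systems see one query instead of a hundred; everything else in this article is a variation on where this cache lives and what it stores.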

Single Cache Instance

A single cache instance is the most basic implementation in the data virtualization layer, and it is preferred for small or medium projects with low or moderate client load. The implementation team should place the cache on the same subnet as the data virtualization middleware to minimize network latency between the middleware and the cache. If the cached data is frequently accessed but relatively small, it can be even better to put the caching system on the same blade server as the middleware, eliminating that network hop entirely.

Caching raw table data is a good choice when one data source is significantly slower than the rest, because the middleware no longer sits idle waiting on the slow source. Materialized-view caching is best when many clients send identical requests and clog the production systems with queries that invoke identical responses: the middleware executes the first client request against the production systems, caches the returned result-set rather than discarding it, and serves subsequent identical requests from the cache. Procedural caching should be used when one of the data sources is a web service with long or unpredictable response latency; here the middleware optimizes overall performance by caching the web service's result-sets keyed on the passed parameters.
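Procedural caching amounts to memoizing the service call on its parameters. A minimal sketch, assuming a hypothetical `get_orders` web-service wrapper (the function name, parameters, and return values are illustrative):

```python
import functools

# The call_log stands in for actual round-trips to the slow web service.
call_log = []

@functools.lru_cache(maxsize=1024)
def get_orders(customer_id: str, status: str) -> tuple:
    call_log.append((customer_id, status))   # simulated web-service call
    return (f"order-1-{customer_id}", f"order-2-{customer_id}")

get_orders("c7", "open")
get_orders("c7", "open")      # same parameters: served from cache, no call
get_orders("c7", "shipped")   # different parameters -> new service call
print(len(call_log))          # 2
```

The cache key is the parameter tuple `(customer_id, status)`, so identical invocations never wait on the service's unpredictable latency twice; real middleware would add an expiry policy on top of this.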

Cluster Cache

This implementation suits more complex deployments. A cluster cache handles heavy client request loads by clustering the data virtualization middleware into multiple nodes. Be aware that middleware clustering by itself increases the load on the production data sources, because each node executes client requests against them. A caching system shared across the clustered environment can therefore have a significant impact on the solution's performance and on offloading stress from the production systems.
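The point is easiest to see in a sketch where several middleware nodes share one cache (all names are illustrative; in practice the shared store would be an external cache service, not an in-process dict):

```python
source_calls = 0

def query_production(sql):
    """Stands in for a round-trip to the production data source."""
    global source_calls
    source_calls += 1
    return f"result-of({sql})"

shared_cache = {}   # one cache shared by every node in the cluster

class MiddlewareNode:
    def __init__(self, shared):
        self.cache = shared

    def handle(self, sql):
        if sql not in self.cache:             # miss: one node fills the entry
            self.cache[sql] = query_production(sql)
        return self.cache[sql]                # every node sees that entry

cluster = [MiddlewareNode(shared_cache) for _ in range(4)]
for node in cluster:
    node.handle("SELECT * FROM orders")       # 4 nodes, same request
print(source_calls)                           # 1: production queried once
```

Without the shared cache, four nodes would mean four identical production queries; with it, scaling out the cluster does not multiply the load on the data sources.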

Distributed Cache

A distributed cache is best for environments with one or more remotely located clients. These systems usually pair a central cache repository with multiple edge caches that service requests from the remote clients without added latency. The edge caches need not be full copies of the central cache: each one monitors its remote clients' requests and replicates only the portions of the central cache relevant to those requests. Changes to the central cache are propagated dynamically to the affected edge caches without a wholesale re-sync.
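A minimal sketch of the edge-cache behavior, with illustrative names (a real deployment would replicate over the network and push invalidations from the central repository):

```python
# Central repository holds everything; an edge holds only what its
# own clients have asked for.
central_cache = {f"row:{i}": f"value-{i}" for i in range(1000)}

class EdgeCache:
    def __init__(self, central):
        self._central = central
        self._local = {}            # only keys requested at this edge

    def get(self, key):
        if key not in self._local:              # first request: one WAN trip
            self._local[key] = self._central[key]
        return self._local[key]                 # later requests stay local

    def invalidate(self, key):
        # A central-side change drops only the affected key here,
        # rather than forcing a full re-sync of the edge.
        self._local.pop(key, None)

edge = EdgeCache(central_cache)
edge.get("row:7")
edge.get("row:7")               # served locally, no trip to the center
print(len(edge._local))         # 1: only the requested portion replicated
```

Out of a thousand central entries, the edge holds exactly the one its client requested, which is why edge caches stay small even when the central repository is large.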


Using specialized data virtualization middleware with high-performance caching is quickly becoming a popular method for reducing performance bottlenecks.

Opinions expressed by DZone contributors are their own.
