
Implementing a Fast Data Access Layer and Breaking the Great (Fire)Wall of China

If you add the complexities of data peaks and data access restrictions in specific countries like China, meeting the real-time performance demands becomes challenging.

In today’s fast-paced digital world, enterprises struggle more than ever to deliver the access speed that their services and data require.

The problem is critical for global companies whose data resides in one location while clients around the world expect fast access to it.

Most modern enterprise IT organizations are concerned with replicating mission-critical data across sites, applications, or data centers. Large sites often serve users across continents, and to ensure consistently low latency it is common to maintain a copy of the data close to the users on each continent. A content delivery network (CDN) handles the distribution of static data. Real-time replication of changing data, however, poses many challenges, and enterprises have additional demanding requirements that must be part of any solution, including reliability, security, and filtering.

If you add the complexities of data peaks and data access restrictions in specific countries like China, meeting the real-time performance demands becomes even more challenging.

China is an important market for many large global companies and many of them have employees in China. Companies such as Avon, GE, and AT&T, for example, have been in China and manufacturing products for 20 to 30 years. Retailers like Walmart (NYSE: WMT) have thousands of locations across China. The situation in the athletic gear market is similar. Nike (NYSE: NKE) has a strong sales base in China, but so does Germany’s Adidas. General Motors (NYSE: GM) is the leader in the Chinese light truck and car market.

Just think of the frustration of an employee based behind the “great China firewall” who encounters a huge delay when trying to access crucial company data from an application.

3 Proven Steps to Overcoming the Fast Data Access Challenge

To show how the challenge of real-time data access can be addressed, we use the example of a successful deployment in which the following three steps were taken:

  1. Clarify the business requirements of the customer.
  2. Understand the existing architecture and why there was a technological challenge.
  3. Assess the priority of the data per business application and region.

1. Business Requirements

Our customer has a mobile application that provides services to both employees and clients of the company, allowing users to perform business transactions and get information. Our customer required the capability to provide services in real-time or near-real-time.

2. Technological Challenge

Our customer was facing two technical challenges:

  1. High latency and low performance: Long delays caused by the “great China firewall” when an application tries to access crucial company data.
  2. No existing capability for asynchronous calls from the Kinvey platform: All calls were synchronous and based on cold data. There was no way to do any kind of up-front buffering.

3. Assess Data Prioritization

Defining the data prioritization, that is, deciding which data is “hot,” and handling that data in an optimal manner increases the overall performance of the solution.
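
One way to make such a prioritization explicit is a small lookup table keyed by application, region, and entity type. The sketch below is illustrative only: `DataPriority`, `PriorityRegistry`, and the example keys are hypothetical names, not part of the GigaSpaces or Kinvey APIs, and treating unregistered data as cold is an assumption of this sketch rather than something the deployment is known to do.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical priority levels mirroring the hot/cold distinction. */
enum DataPriority { HOT, COLD }

/** Hypothetical registry mapping (application, region, entity type) to a priority. */
final class PriorityRegistry {
    private final Map<String, DataPriority> priorities = new ConcurrentHashMap<>();

    // Builds a composite key; the "|" separator is an arbitrary choice for this sketch.
    private static String key(String application, String region, String entityType) {
        return application + "|" + region + "|" + entityType;
    }

    void register(String application, String region, String entityType, DataPriority priority) {
        priorities.put(key(application, region, entityType), priority);
    }

    /** Unregistered combinations fall back to COLD (an assumption of this sketch). */
    DataPriority priorityOf(String application, String region, String entityType) {
        return priorities.getOrDefault(key(application, region, entityType), DataPriority.COLD);
    }
}
```

For example, registering `("mobile-app", "CHINA", "Country")` as `HOT` would let a request handler decide whether to serve the data from the local site or to schedule it for background loading.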

Deployed Solution

One of the key components of the solution is the ability to work synchronously and asynchronously on both LAN (between GigaSpaces and Kinvey) and WAN (between the USA and China) network environments.

The GigaSpaces solution combines synchronous and asynchronous product features, and by doing so covers the need for both hot and cold data accessibility on a global scale.

3 Steps to Implementing GigaSpaces’ Fast Data Access Layer

1. Set up 1…n sites globally

  • pu.xml WAN Gateway China side configuration snippet (Sink):
<beans
………...
<os-gateway:sink id="sink" local-gateway-name="CHINA" gateway-lookups="gatewayLookups" local-space-url="jini://*/*/SpaceChina" start-embedded-lus="false">
	<os-gateway:sources>
		<os-gateway:source name="USA"/>
	</os-gateway:sources>
</os-gateway:sink>
………..
  • pu.xml WAN Gateway USA side snippet (Delegator):
<beans
………...
<os-gateway:delegator id="delegator" local-gateway-name="USA" gateway-lookups="gatewayLookups" start-embedded-lus="false">
	<os-gateway:delegation target="CHINA" />
</os-gateway:delegator>
……….
</beans>

2. Set up communication between the sites using the GigaSpaces LRMI protocol.

  • pu.xml web API China side configuration snippet:
<beans
…………
<!-- LRMI connection to the local China space -->
<os-core:space id="chinaIJSpace" url="jini://*/*/SpaceChina" />
<os-core:giga-space id="chinaSpace" space="chinaIJSpace" />

<!-- LRMI connection to the remote US space -->
<os-core:space id="usIJSpace" url="jini://*/*/SpaceUS" />
<os-core:giga-space id="usSpace" space="usIJSpace" />

<!-- US space based remoting services -->
<bean id="kinveyDataRemotingService" class="org.openspaces.remoting.ExecutorSpaceRemotingProxyFactoryBean">
	<property name="gigaSpace" ref="usSpace" />
	<property name="serviceInterface" value="com.gigaspaces.fdal.service.remoting.IKinveyDataRemotingService" />
</bean>
……….
</beans>

3. Based on the LRMI communication channels, we can now transfer the requested data according to its priority:

  • High (hot): Get the data from the local space. If it does not exist, proceed to get it synchronously from a site that has the needed data.
    • CountryController Java class implementation code snippet taken from China web API:
// Hot path: try the local China space first.
Country[] countries = chinaSpace.readMultiple(new Country());

if (countries != null && countries.length > 0) {
    LOGGER.info(countries.length + " country objects have been read from the China space. Notifying Kinvey...");

    // Write an asynchronous request object to the US space.
    KinveyAsyncGetRequest kinveyNotifyRequest = new KinveyAsyncGetRequest(url, headers, user.getId(), Country.class);
    usSpace.write(kinveyNotifyRequest);

    return ok(Arrays.stream(countries).map(Country::getProperties).toArray());
} else {
    // Local miss: load the data synchronously through the remoting service bound to the US space.
    Response<Country[]> countriesResponse = kinveyDataRemotingService.load(new Request(url, headers), Country.class);
    LOGGER.warning(String.format("%d country objects have been read from Kinvey. Kinvey request: GET %s", countriesResponse.getEntity().length, url));
    return countriesResponse.toRestResponse();
}
  • Low (cold): Get the data from a remote site asynchronously, based on the user profile, on the prediction that the user will need this data later on.

    • KinveyDataAsyncService Java class implementation code snippet, taken from US space business logic:
@EventDriven
@Polling
public class KinveyDataAsyncService {
………..
    @EventTemplate
    public SQLQuery<KinveyAsyncGetRequest> template() {
        SQLQuery<KinveyAsyncGetRequest> query = new SQLQuery<>(KinveyAsyncGetRequest.class, "");
        query.setRouting(routing);
        return query;
    }

    @SpaceDataEvent
    public void eventProcess(KinveyAsyncGetRequest request) {
………
        if (Boolean.TRUE.equals(request.getSaveToSpace())) {
            Response<?> response = kinveyDataService.load(url, requestHeaders, request.getEntityType());
            if (request.isSessionData()) {
                PrivateData[] data = (PrivateData[]) response.getEntity();
                sessionDataManager.write(usSpace, data, request.isReplicable());
            }
            httpStatus = response.getCode();

By fetching the user details (profile) on the first access, we can asynchronously collect all of the data the user is predicted to need, so that it is already waiting locally when the user requests it.
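
The hot and cold paths can also be illustrated in plain Java, independent of any GigaSpaces or Kinvey API. The following is a minimal sketch under stated assumptions: `LocalFirstStore`, `remoteLoader`, `prefetch`, and `isLocal` are hypothetical names, a `ConcurrentHashMap` stands in for the local space, and a `Function` stands in for the synchronous remote call. Keeping the fallback result in the local map is a simplification of this sketch; in the deployment described here, data reaches the China space through the WAN gateway.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical local-first store: a map stands in for the local space. */
final class LocalFirstStore<K, V> {
    private final Map<K, V> localSpace = new ConcurrentHashMap<>();
    private final Function<K, V> remoteLoader; // stands in for the synchronous remote call

    LocalFirstStore(Function<K, V> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    /** Hot path: serve the local copy, or load synchronously on a miss and keep the result. */
    V get(K key) {
        return localSpace.computeIfAbsent(key, remoteLoader);
    }

    /** Cold path: load predicted keys in the background so that later reads are local. */
    CompletableFuture<Void> prefetch(List<K> predictedKeys) {
        CompletableFuture<?>[] loads = predictedKeys.stream()
                .map(key -> CompletableFuture.runAsync(() -> get(key)))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(loads);
    }

    /** Reports whether a key is already held locally. */
    boolean isLocal(K key) {
        return localSpace.containsKey(key);
    }
}
```

A caller would invoke `prefetch` with the keys predicted from the user profile right after the first access and use `get` for every read, so that a read pays the remote round trip only when the data has not arrived yet.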

Topics:
big data, fast data, data analytics, real-time data

Published at DZone with permission of
