API Development Using AWS Serverless Architecture

Look at API development using an AWS serverless architecture.

I recently had the opportunity to work on an AWS-based serverless architecture solution for processing ZIP files. At a high level, the requirements to be delivered on AWS are summarized below:

  1. Create a final output zip file from the contents of source zip files, arranged in a specific folder hierarchy. Two systems make the source ZIP files available in the S3 bucket.
  2. Delete the set of files requested by Pega.
  3. Transfer the output zip file to the external SFTP server.

AWS Architecture

We proposed the following AWS architecture to implement these requirements. AWS API Gateway will expose RESTful APIs. Pega BPM is going to be a client for RESTful APIs.

Architecture of Serverless Architecture

We defined the following set of AWS services to meet these requirements:

  1. AWS API Gateway: for creating and publishing RESTful APIs to be called from Pega.
  2. AWS Lambda service: Lambda function contains actual file processing logic. We used both sync and async lambda functions.
  3. NAT Gateway: for outbound internet access for Lambda functions e.g. to transfer the final ZIP file to the external SFTP server over the internet.
  4. S3 object service: object storage for source zip files, final zip file, and status files.
  5. Transfer service for SFTP: managed AWS service for secure file transfer functionality. This was used as an interim solution until the external SFTP server is set up by the customer.
  6. IAM service: roles (with policy permissions) for service-to-service communication
  7. VPC peering: to allow private communication between 2 AWS VPCs i.e. Pega VPC and file processing VPC. In our setup, both the Pega VPC and the file processing VPC were in the same region.
  8. CloudWatch Logs: for logging the main events in Lambda functions.

We chose Node JS as the programming language for the file processing logic, based on the following considerations:

  1. It is widely used in event-driven architectures and is a popular choice for coding Lambda logic.
  2. It's JavaScript on the server side, so there's less of a learning curve than with Java, PHP, or .NET Core.
  3. It provides many npm packages to leverage during development.

We used Node JS promises to avoid what is called "callback hell" in Node JS parlance. A function returns a promise, which is settled when the underlying asynchronous operation completes; the callback passed to the "then" method then receives the result. We used promise recursion for repetitive processing and promise chaining to run dependent steps in sequence.
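
To illustrate promise recursion and promise chaining, here is a minimal, self-contained sketch. The function names (delayedDouble, sumSequentially) and the use of setTimeout to stand in for an asynchronous call such as an S3 request are assumptions for illustration only; they are not part of the project code.

```javascript
// Hypothetical async operation: resolves with double the input after a short delay.
function delayedDouble(n) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(n * 2), 10);
  });
}

// Promise recursion: process one item per call until the array is empty.
function sumSequentially(items, total) {
  if (items.length === 0) {
    return Promise.resolve(total);
  }
  const [first, ...rest] = items;
  return delayedDouble(first).then((doubled) => sumSequentially(rest, total + doubled));
}

// Promise chaining: each "then" receives the result of the previous step.
const result = sumSequentially([1, 2, 3], 0)
  .then((sum) => delayedDouble(sum))
  .then((finalValue) => {
    console.log('final value:', finalValue);
    return finalValue;
  })
  .catch((err) => {
    console.error('processing failed:', err);
    throw err;
  });
```

Each recursive call handles exactly one item, so the items are processed strictly one after another rather than in parallel.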

We have designed 3 RESTful APIs for Pega to call into:

1. API to create the final ZIP file:
Pega sends the names of the source zip files, the source bucket, the expected hierarchy structure of the final zip file, the name of the final zip file, and the bucket for the final zip file in a JSON request to this API.

API Gateway receives the request and invokes a Lambda function synchronously. This Lambda function validates the Pega request and, if there are no validation issues, invokes another Lambda function asynchronously for the actual file processing. We took this sync-async approach because API Gateway times out after 29 seconds, whereas processing large files (~1 GB) takes longer than that.
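
The validation step described above could look roughly like the sketch below. Only project.hierarchy and project.finalzip appear in the snippets in this article; the other field names (sourceBucket, sourceFiles), the assumption that hierarchy is an array, and the function name validateCreateZipRequest are hypothetical, since the actual Pega request schema is not shown here.

```javascript
// Sketch of request validation for the "create final ZIP" API.
// Field names other than project.hierarchy and project.finalzip are hypothetical.
function validateCreateZipRequest(event) {
  const errors = [];
  const project = (event && event.project) || {};

  if (typeof project.sourceBucket !== 'string' || project.sourceBucket === '') {
    errors.push('project.sourceBucket is required');
  }
  if (!Array.isArray(project.sourceFiles) || project.sourceFiles.length === 0) {
    errors.push('project.sourceFiles must be a non-empty array');
  } else if (!project.sourceFiles.every((name) => /\.zip$/i.test(name))) {
    errors.push('every entry in project.sourceFiles must end with .zip');
  }
  if (!Array.isArray(project.hierarchy) || project.hierarchy.length === 0) {
    errors.push('project.hierarchy must be a non-empty array');
  }
  if (typeof project.finalzip !== 'string' || !/\.zip$/i.test(project.finalzip)) {
    errors.push('project.finalzip must be a .zip file name');
  }
  // Collect all problems so the caller can report them in a single response.
  return { valid: errors.length === 0, errors };
}
```

Returning the full list of errors, rather than failing on the first one, lets the responder Lambda give the client a complete picture within the short synchronous call.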

We had 2 solution options to inform Pega of the outcome of file processing:

  1. Pega hosts an API, which the async Lambda calls to report the success/failure status of processing.
  2. The Lambda function writes the file processing status to a CSV file and drops the file in S3. Pega has a file listener on the bucket to pick up the status.

We opted for the second approach, as the customer's InfoSec department had concerns about the hosting and security of option 1.
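
As a rough sketch of the second option, the status file content and the S3 putObject parameters could be built as shown below. The 'F' flag for failure matches a snippet later in this article; the 'S' flag for success, the ".status.csv" key suffix, the quoting of the message, and the function name buildStatusFileParams are assumptions for illustration.

```javascript
// Build S3 putObject parameters for a one-line status CSV.
// 'S' for success and the "<zip name>.status.csv" key are assumptions.
function buildStatusFileParams(bucket, finalZipName, succeeded, message) {
  const flag = succeeded ? 'S' : 'F';
  // Quote the message and double embedded quotes so commas do not split the field.
  const safeMessage = '"' + String(message).replace(/"/g, '""') + '"';
  return {
    Bucket: bucket,
    Key: finalZipName + '.status.csv',
    Body: flag + ',' + safeMessage + '\n',
    ContentType: 'text/csv'
  };
}

const statusParams = buildStatusFileParams(
  'output-bucket', 'final.zip', false, 'Source file missing, aborting');
console.log(statusParams.Body);
```

The returned object has the shape expected by the S3 putObject call, so the Lambda function can pass it on directly.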

Lambda has a limitation: at the time of writing, it provided 512 MB of temporary disk storage (/tmp) for each function. In our case, the source (and final) zip file sizes can easily exceed this limit, which rules out file processing using the /tmp disk storage. We investigated the npm repository and zeroed in on the ‘adm-zip’ node package, which supports in-memory file processing.

Below is the code snippet for the getMultipleSourceZIP function, which returns a promise. It handles the source system that provides multiple source files in S3. Note the use of promise recursion: processing repeats until all source zip files have been loaded using the S3 getObject API and their contents added to the final zip file using the adm-zip package. To limit memory use, the source zip files are loaded from S3 and added to the final zip file one at a time.

getMultipleSourceZIP = function(arr, orgHirArr, modHirArr, finalZip) {
  return new Promise((resolve, reject) => {
    if (arr.length > 0) {
      // Remove the first source ZIP params object from the array and fetch it from S3
      var params = arr.splice(0, 1)[0];
      S3.getObject(params, function(err, data) {
        if (err) {
          reject(err);
        } else {
          try {
            // ZIP is the adm-zip constructor; data.Body holds the zip file as a Buffer
            var zip = new ZIP(data.Body);
            var zipEntries = zip.getEntries();
            zipEntries.forEach(function(zipEntry) {
              // Logic to derive filepath from the hierarchy arrays is omitted here.
              // Copy the entry into finalZip using the addFile API of adm-zip.
              finalZip.addFile(filepath, zipEntry.getData());
            });
            // Promise recursion: repeat until the array has no elements
            resolve(getMultipleSourceZIP(arr, orgHirArr, modHirArr, finalZip));
          } catch (error) {
            reject(error);
          }
        }
      });
    } else {
      resolve(finalZip);
    }
  });
}

The code snippets below are for the invocation of this function. Note the use of promise chaining using the "then" method and call to another function (for another system), which returns another Node JS promise.

//Get Source system #1 zip files from S3 and shove them into final zip as per required structure
getMultipleSourceZIP(paramsArray, event.project.hierarchy, hierarchyArray, finalZip)
  .then((finalZip) => {
    //Get Source system #2 zip files from S3
    return getSingleSourceZip(input_bucket, sourcefile, event.project.hierarchy, hierarchyArray, finalZip);
  })
  .then((finalZip) => {
    //logic to put final zip file in S3 using S3.putObject API
    return putFinalZipFile(outputBucket, finalZipFileKey, finalZip);
  })
  .catch((err) => {
    //handle the error
    responseMessage = 'Error received while putting final ZIP file into S3, the error details: ' + err;
    dataToWrite = 'F,' + responseMessage;
    writeToCSV(outputBucket, dataToWrite, event.project.finalzip);
  });
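
For comparison, the same one-at-a-time processing can also be expressed with async/await instead of promise recursion. The sketch below is not the project code: mergeSequentially and the stub loader fakeLoad are hypothetical names, and the loader stands in for the "S3 getObject plus adm-zip addFile" step.

```javascript
// Sequentially merge sources into an accumulator using async/await.
// loadSource is a hypothetical stand-in for "S3 getObject + adm-zip addFile".
async function mergeSequentially(sourceKeys, loadSource, finalEntries) {
  for (const key of sourceKeys) {
    // Await each source before starting the next, so only one is loaded at a time.
    const entries = await loadSource(key);
    finalEntries.push(...entries);
  }
  return finalEntries;
}

// Stub loader used only to make the sketch runnable.
const fakeLoad = (key) => Promise.resolve([key + '/a.txt', key + '/b.txt']);

const merged = mergeSequentially(['src1.zip', 'src2.zip'], fakeLoad, [])
  .then((entries) => {
    console.log(entries);
    return entries;
  });
```

The loop keeps the memory-friendly property of the recursive version, because each source is awaited before the next one is requested.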

The code snippet below is for synchronous responder Lambda, which calls file processing Lambda asynchronously. Responder lambda returns immediately after request validation.

//Validate Pega request; if successful, call another Lambda asynchronously
let lambda = new aws.Lambda();
lambda.invoke({
  FunctionName: 'create-final-zip', //name of file processing lambda
  InvocationType: 'Event', //asynchronous invocation of create-final-zip Lambda fn
  Payload: JSON.stringify(event) //pass the validated request on as the payload
}, function (error, data) {
  if (error) {
    return callback(error);
  }
  //notify caller (API) that request is received successfully and being processed asynchronously
  callback(null, 'Final ZIP File creation request received successfully');
});

2. API for delete ZIP files:

Pega will send an API request with a list of zip files to be deleted from the source bucket.

The S3 headObject API is called to check whether the requested ZIP file exists in the bucket. This call returns the metadata of the object without returning the object itself; if the object is not available in the given bucket, it returns an error. Please see the code snippet below for the checkObject function, which creates and returns a Node JS promise. The promise resolves once every requested object has been checked, and the result is then available in the "then" callback.

checkObject = function(inputDataArray, outputDataArray) {
  return new Promise((resolve, reject) => {
    if (inputDataArray.length > 0) {
      var param = inputDataArray.splice(0, 1);
      let params = {
        Bucket: param[0].Bucket,
        Key: param[0].Key
      };

      //Returns metadata of object without returning object itself
      S3.headObject(params, function(err, data) {
        if (err) {
          outputDataArray.nfArr.Objects.push(params);
        } else {
          outputDataArray.deleteArr.Objects.push(params);
        }
        resolve(checkObject(inputDataArray, outputDataArray));
      });
    } else {
      resolve(outputDataArray);
    }
  });
}

We then called the S3 deleteObjects API to delete the objects collected in deleteArr from the bucket.
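
The deleteObjects API expects one bucket per request and a list of keys under Delete.Objects, with at most 1000 keys per call. A minimal sketch of turning the collected { Bucket, Key } entries into request parameters follows; the helper name buildDeleteRequests and the grouping by bucket are assumptions, not the project's actual code.

```javascript
// Group { Bucket, Key } entries by bucket and build deleteObjects parameters.
// S3 deleteObjects accepts at most 1000 keys per request, so larger lists are split.
function buildDeleteRequests(objects, batchSize = 1000) {
  const keysByBucket = new Map();
  for (const { Bucket, Key } of objects) {
    if (!keysByBucket.has(Bucket)) {
      keysByBucket.set(Bucket, []);
    }
    keysByBucket.get(Bucket).push({ Key });
  }

  const requests = [];
  for (const [Bucket, keys] of keysByBucket) {
    for (let i = 0; i < keys.length; i += batchSize) {
      requests.push({
        Bucket,
        Delete: { Objects: keys.slice(i, i + batchSize), Quiet: false }
      });
    }
  }
  return requests;
}
```

Each element of the returned array can then be passed as the params argument of a deleteObjects call.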

3. API for SFTP file upload:
Pega calls this API to transfer the final zip file from the S3 bucket to the EAI SFTP server using the SFTP protocol.

We used the ‘ssh2-sftp-client’ npm package, which provides many useful functions to create an SFTP client, connect to the SFTP server, put a file at a location on the server, etc. Each of the package methods returns a Node JS promise. We chained the promise invocations as follows:

// Arrow function keeps "this" bound to the enclosing object, so this.getObject resolves
fs.readFile(process.env.SFTP_PRIVATE_KEYPATH, (err, content) => {
  if (err) {
    // Without the private key we cannot connect, so report the error and stop
    console.log('error =' + err);
    return callback(err);
  }
  sftp.connect({
    host: process.env.SFTP_URL,
    port: process.env.SFTP_PORT,
    username: process.env.SFTP_USER,
    privateKey: content
  }).then(() => {
    console.log('SFTP server connected');
    return this.getObject(bucket, filename);
  }).then((data) => {
    return sftp.put(data.Body, process.env.REMOTE_FILE_PATH + filename);
  }).then((data) => {
    return sftp.end();
  }).catch((err) => {
    return callback(err);
  });
});

The this.getObject function returns a promise for getting the final zip file from the S3 bucket.

The SFTP server connection details are passed to the code using Lambda environment variables. We used the managed ‘SFTP transfer service’ from AWS until the customer made an external SFTP server available. Because the connection details come from environment variables, we could easily swap the ‘AWS managed SFTP service’ for the external SFTP server.
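
One way to keep that swap simple is to build the connection settings in a single place. The sketch below uses the environment variable names from the snippet above; the helper name buildSftpConfig, the check for missing variables, and the default port of 22 are assumptions for illustration.

```javascript
// Build the SFTP connection settings from environment variables.
// Variable names follow the snippet above; the default port 22 is an assumption.
function buildSftpConfig(env, privateKey) {
  const missing = ['SFTP_URL', 'SFTP_USER'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return {
    host: env.SFTP_URL,
    port: Number(env.SFTP_PORT || 22),
    username: env.SFTP_USER,
    privateKey: privateKey
  };
}
```

In the Lambda function, this would be called as buildSftpConfig(process.env, content), and the result passed to sftp.connect. Switching servers then only requires changing the Lambda environment variables.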

The details of some of the AWS services used to implement these functionalities are listed below:

  1. API Gateway:
    • Defined 3 HTTP POST APIs, each integrated with its respective Lambda function.
    • These are created as private APIs, meaning they are not exposed over the internet and can only be called from the peered Pega VPC. As they are private, no API authentication is needed.
    • An API resource policy is defined to allow access from the Pega VPC ID only and block all other access.
    • APIs are deployed to get private endpoints, which Pega will use.
  2. Lambda Service:
    • An execution role with appropriate policies is assigned so that the Lambda functions can call the S3 service APIs.
    • The Lambda functions are configured to run in private VPC subnets using VPC networking. They reach the EAI SFTP server via the NAT Gateway. The SFTP server whitelists the NAT Gateway's static IP to permit file uploads.
  3. S3 Service:
    • The source system uses S3 for storing source ZIP files
    • Lambda function creates the final zip file as per Pega requested structure into a specific location in S3
    • Status files for zip creation and SFTP transfer are generated at a specific location in S3
  4. AWS transfer for SFTP:
    • This is a managed service by AWS for file transfers
    • We used this as an interim solution until the EAI SFTP server is set up by the customer
    • Lambda function will authenticate to the SFTP server using an ssh-rsa private key
    • Use of environment variables for server connection

Takeaways

This article should provide some useful guidance on how to use various AWS services to implement RESTful APIs for clients. Node JS is a very popular server-side JavaScript runtime used in event-based architectures. In addition, you get many npm packages to benefit from in the development process.

Let me know your thoughts in the comments.
