BankNext Microservices: A Case Study
This article is a case study of “BankNext”: its digital transformation, its customer onboarding process, and its move to an event-driven architecture, along with the hidden landmines.
“BankNext” is in an ambitious digital transformation phase and wants to make its customer onboarding process seamless. After an elaborate functional flow analysis, BankNext implemented an orchestration architecture to coordinate its various microservices.
- Initiation: The prospective customer initiates the joining process on BankNext.
- Prechecks: BankNext first invokes a Screening Msvc and a Deduplication Msvc to study the prospect and ensure that this entity is not already present in the system.
- Processing (after prechecks pass): CustomerMgt Msvc creates the Customer entity and AccountMgt Msvc creates the Account entity for this customer.
- Compliance: Monitoring Msvc checks for suspicious activity during the initial period, and Recommendation Msvc tailors offers to the customer's preferences.
BankNext is quite happy with this architecture due to the following:
- Entity Orchestrator Msvc gives the system a single place to coordinate multiple microservices.
- The orchestrator is able to sequence and synchronize the invocations.
- The orchestrator adjusts the actions responsively according to responses/exceptions from the called microservices.
- The orchestrator uses completable futures to make asynchronous parallel calls wherever the calls are not interdependent.
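The orchestration described above can be sketched in a few lines. This is a minimal illustration, not BankNext's implementation: it uses Python's `asyncio` in place of Java completable futures, and `call_service`, `onboard`, the payload shape, and the simulated delay are all hypothetical names and values introduced for the example.

```python
import asyncio


# Hypothetical stand-in for a remote microservice call: it sleeps briefly
# to mimic network latency and returns a canned response.
async def call_service(name: str, payload: dict, delay: float = 0.01) -> dict:
    await asyncio.sleep(delay)
    return {"service": name, "status": "OK", "prospect": payload["prospect"]}


async def onboard(prospect: str) -> dict:
    payload = {"prospect": prospect}

    # The two prechecks do not depend on each other, so they run concurrently.
    screening, dedup = await asyncio.gather(
        call_service("Screening", payload),
        call_service("Deduplication", payload),
    )
    if screening["status"] != "OK" or dedup["status"] != "OK":
        # The orchestrator adjusts its actions based on the responses.
        return {"prospect": prospect, "result": "REJECTED", "calls": [screening, dedup]}

    # Account creation needs the customer entity, so these stay sequential.
    customer = await call_service("CustomerMgt", payload)
    account = await call_service("AccountMgt", payload)

    # The compliance services are again independent of each other.
    monitoring, recommendation = await asyncio.gather(
        call_service("Monitoring", payload),
        call_service("Recommendation", payload),
    )
    calls = [screening, dedup, customer, account, monitoring, recommendation]
    return {"prospect": prospect, "result": "ONBOARDED", "calls": calls}


if __name__ == "__main__":
    outcome = asyncio.run(onboard("prospect-001"))
    print(outcome["result"], [c["service"] for c in outcome["calls"]])
```

Note that the orchestrator names every service it calls, which is convenient for sequencing but means each new service requires a change here.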
BankNext quickly realizes: Houston, we have a problem (or several):
- Undesirable latency when invoking and collating responses from multiple microservices.
- Assuming each Msvc requires 300 ms, the total time required for all 7, if invoked one after another, would be a minimum of 2.1 seconds (i.e., 300 ms * 7).
- This wait makes the prospective customer unhappy.
- The introduction of any new Msvc forces the Orchestrator to add an explicit invocation call, which causes undesirable tight coupling.
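The latency arithmetic can be checked with a short calculation. The sketch below takes the article's flat 300 ms per call as given; the grouping of the named services into stages is an assumption made for illustration, as are the function names.

```python
PER_CALL_MS = 300  # flat per-call latency assumed in the article


def sequential_ms(call_count: int, per_call_ms: int = PER_CALL_MS) -> int:
    """Total latency when every call waits for the previous one."""
    return call_count * per_call_ms


def staged_ms(stages: list[list[str]], per_call_ms: int = PER_CALL_MS) -> int:
    """Total latency when calls inside a stage run in parallel.

    Each stage costs one call's latency because all calls are assumed to
    take the same time; the stages themselves still run one after another.
    """
    return len(stages) * per_call_ms


# Assumed grouping of the services named in the onboarding flow.
STAGES = [
    ["Screening", "Deduplication"],
    ["CustomerMgt"],
    ["AccountMgt"],
    ["Monitoring", "Recommendation"],
]

if __name__ == "__main__":
    print(sequential_ms(7))   # 2100
    print(staged_ms(STAGES))  # 1200
```

Even with the independent calls parallelized, the customer would still wait for every stage on the critical path under these assumptions.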
Back to the Drawing Board
After thorough analysis, the engineering and business teams together determine that the root cause of the latency is the longest-running services.
The business review concludes that the Screening and Deduplication prechecks are mandatory. Once they pass, the user is cleared to get an account, so a confirmation can be sent out. Monitoring and recommendation are secondary services and should not delay the user confirmation.
To implement this business vision, the system design had to evolve to split the flow so that the user confirmation is sent immediately after the prechecks. The remaining processes, i.e., customer creation, account creation, monitoring, and recommendation, should happen asynchronously. This would greatly enhance the user experience.
- The engineering team immediately realizes that an Event-Driven Choreography would be the right approach to solve this challenge.
- After successful prechecks, EntityMgt Msvc publishes this new entity transaction to the Kafka “new_entity_initiated_topic”.
- The EntityMgt Msvc sends back an initial confirmation to the user.
- CustomerMgt Msvc subscribes to the Kafka “new_entity_initiated_topic” and consumes this event.
- CustomerMgt Msvc creates the customer in the system and publishes this event to the Kafka “new_customer_created_topic”.
- AccountMgt Msvc subscribes to the Kafka “new_customer_created_topic” and consumes this event.
- AccountMgt Msvc creates the account for this customer in the system and publishes this event to the Kafka “new_account_created_topic”.
- The new entity is successfully created in the system.
- Supporting services, i.e., Monitoring Msvc and Recommendation Msvc, subscribe to the Kafka “new_account_created_topic”.
- Since this process is asynchronous, a new Notification Msvc is also required to send the account details to the customer.
- This Notification Msvc can be quickly hooked up into this new architecture by simply subscribing it to the Kafka “new_account_created_topic”.
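The choreography above can be illustrated with a toy in-memory event bus. This is a sketch only: `InMemoryBus` is a hypothetical stand-in for Kafka (it is not the Kafka client API), and `wire_services`, the event fields, and the IDs `C-1` and `A-1` are invented for the example. Only the three topic names come from the flow above.

```python
from collections import defaultdict, deque
from typing import Callable


class InMemoryBus:
    """Toy stand-in for Kafka: topics, subscribers, and a pending queue."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        self._pending: deque[tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Publishing only enqueues; the publisher never waits for consumers.
        self._pending.append((topic, event))

    def drain(self) -> None:
        # Deliver queued events, including ones published by handlers.
        while self._pending:
            topic, event = self._pending.popleft()
            for handler in list(self._subscribers[topic]):
                handler(event)


def wire_services(bus: InMemoryBus, log: list[str]) -> Callable[[str], str]:
    def entity_mgt_onboard(prospect: str) -> str:
        # Prechecks are assumed to have passed before this point.
        bus.publish("new_entity_initiated_topic", {"prospect": prospect})
        return f"confirmation sent to {prospect}"

    def customer_mgt(event: dict) -> None:
        log.append("customer created")
        bus.publish("new_customer_created_topic", {**event, "customer_id": "C-1"})

    def account_mgt(event: dict) -> None:
        log.append("account created")
        bus.publish("new_account_created_topic", {**event, "account_id": "A-1"})

    bus.subscribe("new_entity_initiated_topic", customer_mgt)
    bus.subscribe("new_customer_created_topic", account_mgt)
    # Supporting services only need to know the topic, not the publisher.
    for name in ("monitoring", "recommendation", "notification"):
        bus.subscribe(
            "new_account_created_topic",
            lambda event, name=name: log.append(f"{name} saw {event['account_id']}"),
        )
    return entity_mgt_onboard


if __name__ == "__main__":
    bus, log = InMemoryBus(), []
    onboard = wire_services(bus, log)
    print(onboard("prospect-001"))  # the confirmation comes back first
    print(log)                      # still empty: nothing downstream has run
    bus.drain()
    print(log)
```

The confirmation is returned before any downstream work runs, and the notification subscriber is attached with a single `subscribe` call, without touching the publishing services.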
New Architecture Positives
- Significant improvement in responsiveness to the customer.
- Elimination of tight coupling because explicit service calls are no longer needed. Each service only needs to know the topic to publish or subscribe to.
- Tremendous future flexibility as new services can quickly be incorporated by subscribing to the correct topic.
New Architecture Negatives: Hidden Landmines
- With improved flexibility, this architecture brings higher system complexity.
- Major infrastructure enhancement is needed to enable event-driven processing, including the introduction of robust messaging components.
- Observability, system debugging and tracing capabilities need to be significantly improved to troubleshoot failure scenarios.
- State management, retries, and system-wide rollback mechanisms add to the complexity.
- System atomicity requires an elaborate SAGA implementation (to be covered in the next article).
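To make the state-management and retry concern concrete: under at-least-once delivery, a consumer may receive the same event more than once, and a handler may fail transiently. The following is a minimal sketch of one common mitigation, an idempotent consumer with bounded retries and a dead-letter list. `IdempotentConsumer`, the `event_id` field, and the in-memory set of processed IDs are all assumptions for illustration; a real system would need a durable store and backoff, and this does not replace a SAGA.

```python
from typing import Callable


class IdempotentConsumer:
    """Wraps a handler with duplicate detection, bounded retries, and dead-lettering."""

    def __init__(self, handler: Callable[[dict], None], max_attempts: int = 3) -> None:
        self._handler = handler
        self._max_attempts = max_attempts
        self._processed_ids: set[str] = set()  # illustration only; not durable
        self.dead_letters: list[dict] = []

    def consume(self, event: dict) -> str:
        event_id = event["event_id"]  # hypothetical unique ID set by the publisher
        if event_id in self._processed_ids:
            return "skipped-duplicate"
        for attempt in range(1, self._max_attempts + 1):
            try:
                self._handler(event)
            except Exception:
                continue  # a real consumer would log and back off here
            self._processed_ids.add(event_id)
            return f"processed-on-attempt-{attempt}"
        # All attempts failed: park the event for manual or automated follow-up.
        self.dead_letters.append(event)
        return "dead-lettered"


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_create_account(event: dict) -> None:
        # Fails once, then succeeds, to mimic a transient error.
        calls["count"] += 1
        if calls["count"] == 1:
            raise RuntimeError("transient failure")

    consumer = IdempotentConsumer(flaky_create_account)
    event = {"event_id": "evt-1", "customer_id": "C-1"}
    print(consumer.consume(event))  # processed-on-attempt-2
    print(consumer.consume(event))  # skipped-duplicate
```

Every consuming service in the choreography would need something along these lines, which is part of the added complexity listed above.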
Summary: Architecture, TechStack, and Rationale
|Requirement|Architecture|TechStack|Rationale|
|---|---|---|---|
|Flexible & extensible integrations|Choreography|Topics|Eliminates centralized coordinator dependency|
|Event-driven capability|Messaging systems|Kafka|Eliminates tight coupling between svcs|
|Future capability efficiencies|Publish/Subscribe|Kafka|Eliminates explicit new svc call invocation|
|System-wide observability|Centralized logging/tracing|ELK/Sleuth|Async logging & observability|
|Heavy volume/TPS|Streaming|Kafka/Async operations|Extremely low latency|