
D-ASYNC: Journey to Code-First Cloud Native Apps


Read more about the need for a programming language that can run in the cloud, regardless of environment, and how D-ASYNC plays a role in that.

· Open Source Zone ·

I've been dreaming about the day when we can write regular code that simply runs in the cloud in a distributed manner, without any awareness of the environment. Don't get me wrong, we already have such technologies today — Microsoft's Azure App Services in combination with VSTS CI/CD, for example, or any application in a container that has an endpoint. That's great for one application, but what if we follow a microservice pattern or simply have more services? How do they communicate? How do we guarantee the resiliency of a workflow that involves several apps? Of course, there are plenty of options, but they go beyond invoking a program function as you would do in a monolith. That may sound obvious, but what if we could build more intelligent systems where the code itself becomes the first-class citizen of a cloud ecosystem?

Where Workflows Matter

You've probably seen architecture diagrams that use many FaaS components, like this one:

Serverless Map-Reduce Architecture

It's an example of a map-reduce design using AWS Lambda in conjunction with Amazon S3. It doesn't really matter how it works or what exactly it does; I just want to emphasize the complexity of a solution with many moving parts. On one hand, there are obvious benefits to the design, but on the other hand, it's hard to understand the workflow, especially when you have too many small pieces — similar to the harm of overly small functions in code. If you had an infinitely vertically scalable, fault-proof application, how would you do exactly the same thing in code without thinking about distributing the logic across multiple deployments? There are a few major ingredients needed in this example: events (pub-sub), invocation of the next step, and persistence of job state.

1. Events

To describe a publish-subscribe model in code, we usually use the Observer Pattern; however, some programming languages, such as C#, have built-in support for events (from now on, all the code will be demonstrated in C# — there is a reason for that, which will be highlighted later):

class Publisher
{
  public event EventHandler Change;
}

class Subscriber
{
  public void Subscribe(Publisher publisher)
  {
    publisher.Change += OnChange;
  }

  private void OnChange(object sender, EventArgs eventArgs)
  {
    // React to the change notification.
  }
}

2. Invocation of Next Steps

I hope that this is self-explanatory. An invocation of the next step is just a function call:

class Workflow
{
  void Run()
  {
    Step1();
    Step2();
    // ...and so on.
  }
}

class AsyncWorkflow
{
  async Task RunAsync()
  {
    await Step1Async();
    await Step2Async();
    // ...and so on.
  }
}

3. State Persistence

To be honest, you may not even need to persist any state in a non-distributed application; however, some form of persistence is present anyway. When you call a sub-function, the context of the caller is stored on a stack in RAM, or, in the case of async functions, you have a Finite State Machine (FSM) that holds the state and conveys the context:

class Workflow
{
  void Run()
  {
    int abc = 123;
    Step1();
  }

  void Step1()
  {
    // The state of 'Run' is on the stack in RAM.
  }
}

class AsyncWorkflow
{
  async Task RunAsync()
  {
    int abc = 123;
    // 'RunAsync' is compiled into a state machine,
    // where the state (like 'abc') is saved on
    // the instance of that state machine.
    await Step1Async();
  }
}



Another example of a workflow would be this diagram from Microsoft's Architecture reference:

Anti-patterns and patterns in communication between microservices

It shows that synchronous communication between microservices can lead to cascading failures, so an asynchronous communication mechanism is preferable. How would you lay that out in simple code? Well, we can use the events concept (pub-sub) from the previous example, and we need classes/objects (Object-Oriented Design) and async functions.

4. Service Definitions and Instances

Using the Abstraction Pattern, you can express a (micro-)service definition with an interface:

public interface IBasket
{
  Task Add(StoreItem item);
}

Then you can implement the service with an actual class, which can consume another (micro-)service and communicate with it directly using...

5. Async Functions

public class Basket : IBasket
{
  // Dependency on another service.
  private readonly IOrdering ordering;

  // Use Dependency Injection pattern.
  public Basket(IOrdering ordering)
  {
    this.ordering = ordering;
  }

  public async Task Add(StoreItem item)
  {
    // Call function on another service asynchronously.
    await ordering.GetShippingOptions(item);
    // ...then add the item to the basket.
  }
}

The async-await syntax implies that a function might not execute immediately; a Task is a completion promise which you can subscribe to or poll on.
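To illustrate "subscribe to or poll on," here is a minimal sketch; `Step1Async` and the printed messages are hypothetical, not part of the examples above:

```csharp
using System;
using System.Threading.Tasks;

public static class TaskDemo
{
    // A hypothetical async step: the Task it returns is a promise
    // of completion, not an immediate result.
    public static async Task<string> Step1Async()
    {
        await Task.Delay(10); // simulate asynchronous work
        return "done";
    }

    public static async Task RunDemoAsync()
    {
        Task<string> promise = Step1Async();

        // Subscribe to the completion via a continuation...
        Task continuation = promise.ContinueWith(
            t => Console.WriteLine($"Completed with: {t.Result}"));

        // ...or poll its status at any moment.
        Console.WriteLine($"Completed yet? {promise.IsCompleted}");

        await continuation;
    }
}
```

Awaiting the Task is just a more convenient form of the same subscription: the compiler registers the rest of the method as the continuation.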

D·ASYNC

Distributed services and workflows with C#-native language features

What is D·ASYNC? In short, it's everything described above but in reverse order — an open-source framework built on the idea of translating code into the architectural patterns of a distributed app. In other words, it makes the code the first-class citizen, where your classes (services) and functions (steps of a workflow) can run on different platforms, which becomes a matter of future optimization rather than an initial design choice.

Why Code-First Is Important

It's no secret that the microservice architecture pattern can be prohibitively expensive for new projects, especially for startups and SMBs, where a monolith is the preferable solution. At some point, a monolith becomes too "heavy," and we try to break it apart — we've seen it over and over again. A few questions that arise during this process include: "What do we use to host new services?", "What inter-communication mechanism do they use?", and "How do we ensure resiliency when calling multiple services?" While it can be fun to play with different technologies, they are just artifacts, not primary business objectives.

A monolith application already has the business logic, so why not keep using it instead of re-writing almost everything? Putting non-ideal design choices aside, code refactoring is natural to any project over its lifetime — you shape and organize the code, create new modules and components. These can be the first candidates for separation into their own domain — their own (micro-)service. Such separation can feel natural with D·ASYNC, because the code is treated as a first-class citizen that can be mapped to run in a separate deployment while preserving the same contract with an already-defined bounded context (DDD).

Adding more tools and libraries on top of existing programming language abstractions definitely makes development easier. However, such an approach will never get us to the point where programming truly feels natural and easy, fast and productive. For example, before async-await, we had to use thread pool libraries like the Task Parallel Library (TPL) for .NET. But the approach to parallel programming changed drastically when it became part of the language and framework themselves. That does not mean the async-await syntax yields the most efficient results in terms of performance, but it definitely does in terms of productivity — the code simply looks natural and hides the complexity of vertical scaling.

How D·ASYNC Helps

The core concept is to take a concrete case of a distributed application and try to express the same intent in code using just the syntax of a programming language, OOP paradigms, and Design Patterns, as demonstrated in the 5 code examples at the beginning of this article. Not everything can be done this way, though, because a general-purpose programming language might not have syntactic structures analogous to distributed design patterns. Nevertheless, the goal is to hide the complexity of distributed programming whenever possible — making it natural to write and read, not necessarily the most efficient performance-wise.

The code is the first-class citizen - D·ASYNC

To make those concepts work in a distributed environment, the statefulness of workflows is a prerequisite for resiliency and scalability (code example #3); otherwise, it would be just another RPC with cascading failures. The persistence can be achieved by saving and restoring the state of finite state machines — the ones auto-generated from async methods. Here is what the D·ASYNC engine does to the code from the previous example:

// This is a service, which is a part of a workflow.
// The interface defines the communication contract.
public class Basket : IBasket
{
  // Dependency on another (micro-)service.
  private readonly IOrdering ordering;

  // The Dependency Injection pattern translates to the
  // service discovery. The 'IOrdering' service can be
  // deployed separately on a different platform.
  public Basket(IOrdering ordering)
  {
    this.ordering = ordering;
  }

  // This is a routine of a workflow.
  // An async function is compiled into a state machine.
  public async Task Add(StoreItem item)
  {
    // This code block is the first state transition
    // of the auto-generated finite state machine.

    // Calling another async method results in saving
    // state of current routine (FSM) and scheduling
    // the execution of another routine. The ability to
    // persist the state differentiates this technology
    // from any form of Remote Procedure Call.
    // Communication with another service can be handled
    // by a service mesh for example.
    await ordering.GetShippingOptions(item);
    // When 'GetShippingOptions' is complete, it schedules
    // its continuation - the current 'Add' routine, which
    // resumes at the exact point from the saved state. For
    // example, the input argument 'item' is the data of
    // the state. You can look at this as 'Add' routine
    // subscribes to the completion event of the
    // 'GetShippingOptions' routine. That's what creates
    // a distributed event-driven workflow.

    // This code block is the second state transition
    // of the auto-generated finite state machine.
    // The 'await' keyword acts as a 'delimiter' between
    // state transitions.
  }
}

The D·ASYNC on Azure Functions blog post unveils a seamless demonstration of the technology where you simply push application code to Git hosted on VSTS, which triggers a CI/CD pipeline and automatically deploys it as a set of Azure Functions, where nothing in the code is hard-coded to that environment, similar to the example above. That means you can run the same application in a single process, on Windows or Linux, on Azure or AWS, without changing a single line.

How D·ASYNC Does Not Help

You should not think that D·ASYNC will magically turn any monolith into a distributed app where you can re-deploy any piece of code on demand. Instead, it's merely a language-integrated abstraction layer that helps you achieve exactly the same things you do today in a more friendly way. You are still responsible for the architecture and design, regardless of whether you implement microservices or just use Service-Oriented Architecture.

Another danger (as with any abstraction layer) is that you can easily miss performance, reliability, scalability, and security issues, as described by the fallacies of distributed computing. You can always blame the tools and frameworks, but I believe that the root cause lies in the programming language itself. A general-purpose programming language is not a cloud ecosystem programming language, so we try to project distributed computing concepts onto the abstractions of a simpler language. Then, when we read that code, we don't see the backward correlation.

Programming languages should evolve with the demands of the majority of their users. No, we will not see distributed computing syntax in the most popular languages any time soon. However, the .NET platform has a very neat feature: a compiler with the codename "Roslyn." It allows you to introduce new syntax or keywords to C#, which means you can extend the language to describe new concepts. For example, consider this code snippet:

service Basket
{
  routine void Add(StoreItem item)
  {
    trigger ordering.GetPrice(item);
  }
}

The new keywords service, routine, and trigger don't introduce any new functionality — they are transpiled into class, async Task, and await, respectively. For a developer, this increases visibility and awareness of the distributed environment and helps avoid issues at an early stage. This will be one of the D·ASYNC for .NET features in the near future, but for now, beware of the traps.
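Under those transpilation rules, the snippet above would come out roughly as the plain C# below. This is a sketch, not D·ASYNC output; the `StoreItem` type and `IOrdering` contract with a `GetPrice` method are assumed from the snippet, and the `Basket` name is reused from the earlier examples:

```csharp
using System.Threading.Tasks;

// Assumed supporting types, inferred from the snippet above.
public class StoreItem { }

public interface IOrdering
{
    Task GetPrice(StoreItem item);
}

// 'service Basket' becomes a plain class.
public class Basket
{
    private readonly IOrdering ordering;

    public Basket(IOrdering ordering) => this.ordering = ordering;

    // 'routine void Add' becomes an 'async Task' method.
    public async Task Add(StoreItem item)
    {
        // 'trigger' becomes 'await'.
        await ordering.GetPrice(item);
    }
}
```

The behavior is identical; the extra keywords would exist purely to signal to the reader that these constructs participate in a distributed workflow.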

The Seed

The rough idea came to me about 3 years ago after I was left unsatisfied by previous experiences with distributed workflow frameworks. The turning point was a StackOverflow question about scheduling a function on a different machine with Task.Run(), where the obvious answer was "not possible." Digging into the finite state machines auto-generated from async methods at the same time, I put the two things together to create the first concept of a workflow engine. It then took a very long time of following the latest trends around microservices, containers, serverless, and service mesh, along with building the first proof-of-concept application, to form a much bigger vision.

Thus the origin of the technology name: D for Distributed, and Async after the use of async functions to describe a workflow with underlying auto-generated finite state machines. C#/.NET were chosen due to technical feasibility in the first place.

Open-Sourcing Challenges

D·ASYNC was living in a private repository on GitHub for a couple of years in R&D mode but is now available to the public. The project is born, doing its first baby steps towards adulthood, and waiting for the community to shape its personality.

Personal Struggles

Having a full-time job, I find it can be very difficult to strike a balance between work, personal life, and a side project. It feels like a second full-time job that consumes most of my time, and where I sacrifice ordinary human communication. Investing in this project (including writing this article) involves waking up very early and staying up late on both workdays and weekends — day after day, month after month, coffee after coffee. It drains a lot of energy. It's not easy to stay self-motivated over a span of a few years of such physically and emotionally exhausting activity. Sometimes a side project can get swept under the rug, but personally, unfinished business keeps me awake at night and leaves me with a sense of disappointment, so I'm eager to pick it up again. This is not a complaint, but a description of a harsh reality when you dream big and try hard. The driver is a personal challenge to make something great and useful in the name of progress, even if it fails.

Open source code is like an avatar - there are live people behind it.

The Law

When you work on an open-source project, you should always think about its relation to your current employer. Laws differ from state to state and country to country, but there is usually a common set of rules to preserve your intellectual property rights:

  • Don't create a competitive project - this usually boils down to not stealing customers and not getting revenue from the same or similar channels.
  • Don't work on a project during working hours. Keep work and personal business separate.
  • Don't use work computers and devices to develop your project.
  • Don't use software paid by your employer. Get your own subscriptions or use freeware.
  • Don't use knowledge acquired during your employment; learn on the side instead. Your employer pays for that knowledge when you perform your duties and grow in your career path. Visiting prepaid conferences counts.
  • Don't implement anything related to research and development at your company. Your employer anticipates a return on such investment.
  • Don't use the help of your colleagues during working hours. They are paid by the employer as well. If any of your co-workers want to contribute to your project, make sure that they abide by the same rules and respect the employer.

While some of the points above are debatable and hard to prove, breaking any of them would simply be unethical, besides causing legal trouble. Just put yourself in your employer's shoes with every step you take and think about the possible implications from the other side.

My current employer is not ready for open-source culture yet, so I decided not to bring coworkers into my project, even though some of them expressed interest in contributing.

Build mutual trust and respect with your employer.

Depending on your project and objectives, you might think about avoiding personal liability. Even when you choose the right license agreement, there is always a danger of getting into a legal fight. An example would be a patent troll lurking in the shadows, waiting to feed on your success when you grow big. Personally, I decided to go with a limited liability company, even though the project is free to use and does not bring in any revenue.

Regarding patents, I'm on the side of not having them for software. They impede progress and don't bring a lot of value. Nowadays, a software patent does not guarantee you much protection, because it's ridiculously expensive to fight over — you'd probably pay a million US dollars in attorney fees. The approval process takes at least a year, and technology changes so fast that your patent can be obsolete by the time you get it. Acquiring a patent can be a gamble — high patent attorney fees, and it's hard to get approved. The best use of a patent is in a future valuation of your company, if you have serious intentions and only if you can afford it.

Promotion and Contribution

From my perspective, one of the biggest challenges is to promote the concept of the D·ASYNC project, not the actual implementation, which is still in preview and not recommended for production yet. The problem is the paradigm shift — trending technologies mostly revolve around containers as the first-class cloud citizens. With D·ASYNC, I'd like the community to start thinking about how to reach the next level of abstraction and build code-first applications on top of existing technologies.

Are you ready for the paradigm shift?

At this moment, I think the best contribution to D·ASYNC would be conceptual and ideological rather than adding new features to the code. This is slightly harder compared to other well-defined projects — such contributions usually require a long-term commitment with a high degree of scrutiny and do not give immediate value back. As a reader, you might be enlightened by the idea, but when it comes to utilizing the actual implementation, you may find that it does not look exactly as advertised or does not fit your needs; that's where most projects lose traction.

Afterword

This is my story of creating my most complex and challenging open-source project, spanning 3 years. I didn't want to focus solely on the project itself, but rather to remind you that there is much more to it than the code and features — an emotional part, legal aspects, and hurdles in perception. I hope you enjoyed this journey, and if you are interested in the project's development, you can find it on GitHub.
