Agile Design Automation With Logic Bank
Reduce backend code by 40X, preserve design flexibility, and enable agile iterations.
Join the DZone community and get the full member experience.
Join For FreeAutomation
As engineers, we are always mindful of the need to minimize cost and time. Experience has taught us that while automation is valuable in reducing code, we must also preserve design flexibility and enable agile iterations. Let's break that down a bit, and then illustrate with an example.
To warrant the learning curve, automation must significantly reduce routine coding. 25% reduction in code is not compelling enough.
And the opportunity is there - a simple "cocktail napkin specification" of 5 statements can result in hundreds of lines of code. The goal becomes making these specifications rigorous, and executable.
But to deliver the value, there are additional critical automation requirements:
- Extensibility— so we can use standard languages and approaches to build what is not automated
- Manageability— so we can use existing tools and procedures to manage our system (e.g., database rows are tough to push to git)
- Scalablility— so there is no significant performance tax
Design Automation (Not Just Code Reduction)
We’ve all seen automation tools that reduce code but force us into design choices that don't fit our needs. Such a "flexibility tax" can negate the value of the automation.
So the real requirement is Design Automation - automation that not only reduces code, but preserves our design flexibility.
Agile Design Automation
Agile advocates that projects begin implementing early, even in the face of unknowns such as data size and transaction volumes. We then are forced to make assumptions about things like performance denormalizations, which become baked into the applications we write.
The problem is that these unknowns are often not discovered until late "go-live" testing. And that's after the long phase of dev iterations.
Agile Design Automation reduces code, preserves our design flexibility, and enables us to make iterations without recoding.
Case Study: Agile Design Automation
For a typical database system, the backend is nearly half the effort. The diagram below illustrates some automation services that can help:
- Our design (left), perhaps a database diagram, and a "cocktail napkin specification" of our transaction logic.
- Resulting code (lower right), numerous modules that implement our design. As noted above, without automation, there is well over an order of magnitude explosion of design to code.
- Automation Services (upper right), including:
- DBMS
- Object Relational Mapping (ORM) systems, such as SQLAlchemy for Python
- Logic, such as Logic Bank (explained below)
- UI Builders, such as Flask App Builder (shown below)
Let's explore how these technologies can automate dramatic reductions in tedious coding, preserve design flexibility, and provide agility so we can make changes to existing systems.
Example: Only Sales Reps Have Orders
To make things definite, let's consider an example where we have orders, owned by employees. Our requirements:
- Some Employees are "commissioned"
- Orders can be related only to Commissioned Employees
Database Design
We can represent the database design as follows:
In order to avoid consistency issues, or perhaps because the schema is locked, let's imagine we don't want to store the order_count
(no performance denormalization), as suggested by using italics.
Logic Cocktail Napkin Design
We could implement logic in "add order", but we also want to address other Use Cases. For example, "update employee" for non-commissioned employees should fail if they have assigned orders.
Are there other Use Cases we might have overlooked? Hold that thought...
So, our cocktail napkin design for "add order" and "update employee" is:
- Define
Employee.order_count
- Constraint to ensure
order_count
can be > 0 only for commissioned
Example 1: Add Missing Indices
Our first example is well-known, but is a great illustration of Agile Design Automation.
Back in the day, pre-relational DBMS systems required that we explicitly name indices on each and every query, in each and every program. Then, we would discover, late in the project, that a certain access was common, and required an index.
It was reasonably quick to discover this, and add the index. The problem was that all the existing programs did not know about the index - they were not independent of this critical aspect of database design. So, we would have review all of them and make the required revisions. And this took a lot of time.
Relational: Access Data Independence
With the advent of modern database systems, this huge problem was eliminated. Knowledge of the index structure was extracted out of each application, replaced by the query optimizer.
In our example, perhaps we forgot to place an index on Order.EmployeeId
. Even though we might find it late, during go-live testing, no problem - we add the index, and all the programs we've already written use it automatically, and run faster.
Relational database has delivered Agile Design Automation: we can express our design intent to add indices, and the optimizer will use them for dramatic performance improvements. It’s agile, since we can make the design change without changing program code.
Example 2: Add Performance Denormalization
But index structure is not the only critical design decision we need to make early. Another is "performance denormalization". Consider a rollup, like Employee.order_count
:
- Should I store a rollup as a physical column?
This provides near-instant access (no SQL select sum
), but requires all accessing programs to maintain it, consistently. That means more work, more chances of error.
- Or derive it on the fly?
This approach eliminates the consistency issues, but could perform slowly.
The performance implication can be substantial, particularly in the case of chained rollups - one rollup depends on another. For example, a Customer Balance depends on another rollup, Order.AmountTotal (a rollup from OrderDetail.Amount). This results in a set of sum queries.
Such chained rollups can literally prevent deployment. I personally had a situation where the Service Level Agreement (SLA) was 3 seconds, but the observed result during go-live testing was 3 minutes. It was clear that jobs were on the line. The solution was clear: performance denormalization.
No Obvious Answer — Make a Call and Begin Implementation
As with most interesting questions, there's no right answer - it depends on data and transaction volumes, which we may not know when the project starts. Even in the face of these unknowns, development must start.
In our example, we made the call to "derive on the fly." Perfectly reasonable. This assumption is then coded into dozens of application modules, including web apps, web services, reports, etc.
Problem: Go-Live Testing Uncovers Performance Problem
Then, during go-live testing, we discover that one "sales rep" is the automated website, and it has thousands of orders. Summing these up is too slow. We need to denormalize, to avoid the sum query.
Without design automation, this is a problem, because we have to go through all our apps to discover where we assumed "derive on the fly," and change the logic. This is very analogous to our pre-relational need to recode to use a new index. Automation can help.
Solution: Information Hiding Using ORM Virtual Attributes
Many projects use ORMs (Object Relational Managers) to provide an abstraction layer for data access. ORMs can provide a better programmatic interface (better attribute / object names, type checking / code completion, etc). Many any also provide virtual attributes - the ability to define attributes that are not stored in the database, but materialized by the ORM.
SQLAlchemy is one such example, where such support is termed "hybrid attributes". For example, you can define our order_count
like this:
Some quick background orientation. In the screenshot above, models.py
contains the Python classes that map to our tables. Shown here is part of the Employee
class (note: this file can be generated). In particular, note:
- lines 110-115 define stored columns, so you can code
anEmployee.Salary
- line 119 defines a self relationship, so you can code
anEmployee.Manager
to get an employees manager row, andanEmployee.Manages
to get a list of employees managed
Lines 123-131 define our hybrid (virtual) attribute as Python code. Note you can set breakpoints (the red dot).
It references order_count_sql
, defined as follows:
Employee.order_count_sql = column_property(
select([func.count(Order.Id)]). \
where(Order.Id == Order.EmployeeId))
Agile: Denormalize, Without Application Changes
So now, all of our apps can access this property, without any knowledge of whether it is stored, or derived (aka "information hiding"). Why is that important?
Let's return to our need to denormalize and store the order_count
column.
We can quickly alter the schema to create the physical column, and make a minor change to the ORM model to replace lines 123-131 with order_count = Column(Integer)
. This takes almost no time. But what about all the code already written?
Importantly, the ORM model change does not affect its external interface. And since the apps are coded to the ORM model (not the schema), there are no changes for all application code we've already written.
SQLAlchemy has delivered Agile Design Automation: preserved our "store vs. derive" design flexibility, and kept us agile since we can change our design without recoding.
Example 3: Agile Collaboration - Basic Web App
Agile emphasizes early working software to foster business user collaboration to drive iteration cycles. So, instead of database diagrams (which don't mean much to business users), we want to engage them with working software. Like running screens, with real data.
But that takes a long time. More automation is needed.
Flask App Builder is layered over SQLAlchemy. And we've built a generator — fab-quick-start — which can create multi-page, multi-table basic web apps in minutes, like this:
These are super for prototyping, testing, and fleshing out logic requirements. For more information, you can read this article about Instant DB Web Apps.
Example 4: Business Logic
Now let's consider how exactly how to build the business logic that implements our "cocktail napkin" design, that only Commissioned Employees can have Orders.
Logic Bank Makes the Cocktail Napkin Spec Executable
There are many legacy approaches for implementing business logic: triggers and stored procedures, ORM events, or controllers in User Interfaces. But these are all pretty code intensive, and the stakes are high - backend logic is typically nearly half the system.
We'll use Logic Bank to automate update transaction logic - multi-table derivations, constraints, and actions such as sending mail or messages. Logic Bank is based on SQLAlchemy, handling events to enforce logic. Logic consists of both:
Rules — 40X more concise, using a spreadsheet-like paradigm (sum, count, formula etc), and
Python — extensibility and manageability, using standard tools and techniques (code editors, IDEs, source control etc)
Not only is the rule-based approach is 40X more concise, it addresses over 95% of your backend database logic. That means nearly half your system is 40X more concise.
Let's face it: this is quite a claim. You can check out Logic Bank here, for a running example of 40X conciseness (complete with actual legacy vs. logic code comparisons), and approaches for key challenges of extensibility, manageability, and scalable performance. The examples are available to explore and run on GitHub.
With Logic Bank, we declare our logic in Python. The spreadsheet-like rules approach enables us to input our cocktail-napkin design directly, as lines 68-72 (we could not resist showing our favorite rule, lines 74-82):
Best of all, this design is executable (automated) via the Logic Bank runtime system, operating as a SQLAlchemy event handler. This addresses the Use Cases we imagined:
- Add Order - the Logic Bank rule engine will:
- Adjust
Employee.order_count
, per the rule on line 72 - Verify our constraint on line 68
- Adjust
- Update
Employee.IsCommissioned
- the Logic Bank rule engine will:- Verify our constraint on line 68 (e.g., refuse to reclassify a commissioned employee with assigned orders, by raising an exception and rolling back the transaction)
In the same way relational database has provided declarative automation for data retrieval, Logic Bank provides declarative automation of data update logic. You can read more about the declarative approach in Chris Date's book, What, Not How.
Logic Bank has delivered Agile Design Automation: backend code reduced by 40X, preserving design choices such as denormalization, with agility to change business logic without application recoding (more on this below).
Summary: Agile Design Automation
This short example has illustrated how Agile Design Automation can provide benefits to you and your organization.
Win #1: Automation - 40X Reduction in Backend Code With Logic Bank
Logic Bank uses spreadsheet-like rules to make your cocktail napkin spec executable, reducing backend database logic by 40X. This article provides examples and some historical background on this proven technology.
Win #2: Preserves Design Flexibility (e.g., Performance Denormalizations)
Logic Bank rules are layered over the SQLAlchemy class model, not the database schema. So, your logic is the same, whether or not you denormalize, or change your mind.
But, if you denormalize, what about consistency? Recall that was a risk in legacy approaches. This risk is now eliminated, as follows.
Your logic is extracted out of application code, into Logic Bank. Logic Bank automatically enforces your derivations and constraints, for all applications using SQLAlchemy. And it does so automatically, requiring no application action - eliminating consistency errors.
Contemporary engineering focuses on achieving reuse; this introduces automated reuse.
Win #3: Logic Agility for Applications
Let's tweak our go-live performance scenario to imagine we discover a logic error - we overlooked the requirement for Commissioned Orders. In a legacy (pre-Logic Bank) approach, we might be missing logic in web services, web apps, etc. A lot of code to review, redesign and test.
Since Logic Bank extracts logic out of applications, you now have logic agility, not just denormalization agility. You can add the 2 rules shown above, and all of the apps will behave accordingly, without change. Automated reuse.
Agility is further promoted since the rule execution is automatically ordered, based on system discovered dependencies. Add the rule to your rules_bank.py
code (above) in any location; it will be active.
Win #4: Automatic Reuse Catches Missed Use Cases
And there's a bonus beyond logic agility. As often happens, there are Use Cases we forgot to consider. For example, we neglected to design what it means to move an order from Employee-1 to Employee-2. No problem, this is automatically addressed per the rules above:
- Move Order from Employee-1 to Employee-2 - the Logic Bank rule engine will:
- Adjust Employee-1.order_count down by 1 (not a select sum - Logic Engine respects our denormalization design decision)
- Adjust Employee-2.order_count up by 1; this change triggers...
- Verify our constraint on line 68
This is a pretty big deal. By using a declarative approach, our logic is automatically re-used over all transactions that touch the data - even ones we might have overlooked. Even code we've already written.
Conclusion
Automation often levies a "flexibility tax,” reducing the flexibility required to achieve project objectives. But done right, as in the case of SQLAlchemy and Logic Bank, you get the time-savings of automation, yet retain control of critical design flexibility. That's Design Automation.
And it's not just flexibility, it's data independence - the ability to make those design decisions after the apps are all written. That's Agile Design Automation.
You can try Logic Bank and SQLAlchemy now, to explore this and other examples. Getting Logic Bank will install both packages and fab-quick-start; Logic Bank is open source on GitHub.
Opinions expressed by DZone contributors are their own.
Comments