How to Create and Manage a Complete and Effective Product Backlog


A way to have INVEST user stories for your applications in a shorter time and along the right path.


The Product Backlog, the single source of truth for building the product, must include many kinds of items, such as non-functional requirements, risk mitigation actions, change requests, bugs, studies of solutions, etc. But, first of all, it contains the user stories that describe the product features, which, in the case of a software application, are the actions that users will perform on the data.

In this article, I would like to show an approach, starting from the data and their life cycle, for creating the initial version of the Product Backlog for a software application, as far as the user functionality to be implemented is concerned, and for enhancing it afterwards.

The Approach

There are three steps:

  1. First, identify the data entities that the product will have to deal with (i.e., those the software is meant to manage) and build a model of them, not necessarily complete, that shows the essential relationships between them. If necessary (and it almost always is), also identify some relevant attributes of these entities, particularly those that express a status or a classification of the entity and that users will check and modify through the solution.
  2. Then, identify the roles involved in the management of these data entities, possibly in the form of personas. Here, too, it is not essential to identify them all: a first set is enough to start.
  3. Once you have a first list of entities, some of their relevant attributes, and the main roles involved, apply a set of actions, established beforehand and fairly stable for the type of product, to each role/entity pair. The aim is to bring out the possible events that make up the life cycle of each entity within the product to be created.

The basic idea is as follows: each instance of the entities at stake will have to be created, searched, listed, read, modified, and eventually eliminated, logically or physically, and for each of these events of the entity's life cycle presumably you need a function to take care of it.

Each role/action/entity triad will then correspond to a possible user story. In fact, it is easy to see that the triad fits the structure "as a <role>, I want/can <action> <entity>" perfectly.
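The mapping from the three lists to candidate stories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the role and entity names are borrowed from the example later in the article, the actions are the life-cycle events named above, and `generate_triads` and `as_story` are hypothetical helper names, not part of any tool.

```python
from itertools import product

# Assumed starter lists: roles and entities are taken from the article's
# example, actions from the life-cycle events (create, search, list, ...).
ROLES = ["member", "trainer"]
ACTIONS = ["create", "search", "list", "read", "modify", "delete"]
ENTITIES = ["course", "course attendance"]


def generate_triads(roles, actions, entities):
    """Return every role/action/entity combination as a tuple."""
    return list(product(roles, actions, entities))


def as_story(triad):
    """Render a triad in the 'as a <role>, I want to <action> <entity>' form."""
    role, action, entity = triad
    return f"As a {role}, I want to {action} a {entity}"


triads = generate_triads(ROLES, ACTIONS, ENTITIES)
print(len(triads))          # 2 roles * 6 actions * 2 entities = 24
print(as_story(triads[0]))  # As a member, I want to create a course
```

The result is only a list of candidates: which of them make sense is a decision for people, not for the code.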

Many of the role/action/entity combinations resulting from the three lists will not make sense, particularly when a combination links a role with an action that the role will never perform or with an entity that falls outside the role's competence. So, to get the final set of valid combinations to be implemented, we proceed by eliminating what is unnecessary, which is easier than the traditional process of gradually adding user requirements. Consequently, the time needed to create the initial backlog is reduced, considerably so if measures are taken during the creation steps to limit the number of triads generated.
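The elimination step can be pictured as a filter over the candidate triads. The sketch below is a minimal illustration, assuming a hand-written table of competences (`ALLOWED`) that in practice would come out of the discussion with users; the specific permissions shown are invented for the example.

```python
from itertools import product

ROLES = ["member", "trainer"]
ACTIONS = ["create", "search", "list", "read", "modify", "delete"]
ENTITIES = ["course", "course attendance"]

# Hypothetical competences: the actions each role may perform on each entity.
# Any role/entity pair not listed here is considered out of scope.
ALLOWED = {
    ("trainer", "course"): {"create", "search", "list", "read", "modify", "delete"},
    ("member", "course"): {"search", "list", "read"},
    ("member", "course attendance"): {"create", "read", "delete"},
    ("trainer", "course attendance"): {"list", "read"},
}


def is_valid(triad):
    """Keep a triad only if the role is allowed that action on that entity."""
    role, action, entity = triad
    return action in ALLOWED.get((role, entity), set())


candidates = list(product(ROLES, ACTIONS, ENTITIES))
valid = [t for t in candidates if is_valid(t)]

print(len(candidates))               # 24 candidate triads
print(len(valid))                    # 14 triads survive the filter
print(len(candidates) - len(valid))  # 10 triads are discarded
```

Starting from everything and striking out what does not apply is the point of the approach: the filter makes the discarded combinations explicit instead of leaving them undiscovered.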

This is the list of the main actions I use to get the product backlog of an application that manages structured data:

Main action

By examining the resulting list of triads, both in meetings with users and stakeholders (story-writing workshops) and within the Dev Team (planning and refinement meetings):

  • the relevant triads are selected,
  • they are reviewed, adding the reason why the business requires that action by that role on that entity (the "why": "so that ..."),
  • they are prioritized, taking care of the dependencies between the data that follow from the business flow,

and a Product Backlog with a sufficient number of stories will be ready in a shorter time and development can start immediately and along the right path.
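The review and prioritization steps above can be sketched as follows. This is an assumption-laden illustration: the `Story` class, the "so that" texts, and the two ordering lists are hypothetical, chosen only to show how a triad gains its "why" and how data dependencies can drive the order.

```python
from dataclasses import dataclass


@dataclass
class Story:
    role: str
    action: str
    entity: str
    why: str = ""  # the business reason, filled in during the workshop

    def text(self):
        base = f"As a {self.role}, I want to {self.action} a {self.entity}"
        return f"{base}, so that {self.why}" if self.why else base


# Assumed orderings: entities follow the business flow (a course must exist
# before anyone can attend it); roles follow the value they generate.
ENTITY_ORDER = ["course", "course attendance"]
ROLE_ORDER = ["trainer", "member"]


def priority_key(story):
    """Sort first by entity position in the flow, then by role."""
    return (ENTITY_ORDER.index(story.entity), ROLE_ORDER.index(story.role))


stories = [
    Story("member", "create", "course attendance", "I can enroll in a course"),
    Story("member", "search", "course", "I can find courses that interest me"),
    Story("trainer", "create", "course", "members can enroll in it"),
]

backlog = sorted(stories, key=priority_key)
for story in backlog:
    print(story.text())
```

With these orderings the trainer's "create a course" story comes first, followed by the member's search, and only then the enrollment, which depends on both.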

An Example

The best thing is to show how the approach works with an example. This is an application to manage an association of professionals or practitioners who can achieve qualifications for a kind of skill and who, when they become trainers, can offer courses for other members.

After some meetings with users and stakeholders, a first data model can be reached in a short time. Below I show the part of it that covers courses held by internal trainers, with the attributes deemed relevant indicated (at the end of the article you can find the links to view the complete example, suitable for e.g. an association of golfers or chess players):

Data model

with these roles involved:

Roles and descriptions

If you now apply the actions indicated above to each role/entity or role/attribute pair, you get a few dozen combinations: of these, however, you can quickly come to identify only 22 role/action/entity triads to be considered valid and suitable, at least potentially, for the implementation (the colors better distinguish the roles; consider that a trainer is also a member of the association):

Role, entity, and attribute list

Based on this list, you can work with users to clarify how each role/action/entity triad should be interpreted in your context; the following list shows a possible outcome of this user story refinement work:

Role, entity, and attribute list

As you can see, many of the aspects to take care of while implementing the application come out immediately and naturally, even things that the customer might not have thought about and that would otherwise emerge only in a later iteration.

I am certainly not talking about a way to automatically generate user stories: this is obviously impossible. Instead, I mean a controlled way that allows you:

  • to get a valid track to identify the user stories that bring value, because they are real actions performed by the users on real data;
  • to think up, at no extra cost, stories that make the picture complete, so as not to forget anything related to the area and to provide useful suggestions for features that the customer may not even have thought about initially.


What are the advantages that I have experienced with this approach to create software applications that manage data? Many, really. Let's see them in more detail.

  1. In a short time you have a Product Backlog that guarantees coverage of the area (at least of the part identified up to that moment), giving the PO greater ability to plan effective iterations with clear content from the beginning and to build the client's trust.
  2. The user stories generated are 'INVEST' (according to Bill Wake), as they are:
    • Independent: since each role/action/entity combination is distinct from each other, the user stories are independent of each other by construction;
    • Negotiable: the "dry" structure of the role/action/entity combination is on the one hand understandable to everyone, including users and stakeholders, and on the other gives all the room needed to discuss and explain the business reasons for the action (the "why"), without the risk of going astray;
    • Valuable: the explicit presence of role and action associated with the entity makes the factors clear for establishing the value of the story compared to the others;
    • Estimable and Testable: since the actions correspond directly to development patterns well known to those who will build the software, the stories can be estimated and tested by their very nature, and the estimates need little contingency;
    • Small: since the actions are elementary, there is almost no need to decompose the stories (the triads do not create epics, except in very particular cases or with a data model that is too high-level), which saves further time.
  3. By ordering the entities according to the business flow, or by placing first the entities that are later used to generate and manage the others (in our example 'Course' comes before 'Course attendance'), and by ordering the roles from the point of view of the generated value (in our example the 'trainer' creates the courses and therefore precedes the 'member' who enrolls in them), you have a good guide for prioritizing the user stories in line with the intrinsic dependencies between the data, so that what is produced in each iteration has real value and is truly "working", because it fits perfectly with what has been produced previously.
  4. Since the approach is data-driven, it is easy to identify local contexts that are internally consistent, which provides a good guide:
    • to distribute the work across multiple teams with guarantee of minimum overlap and compliance with data dependencies;
    • to isolate contexts that can be placed separately, e.g. in a different database or in a remote environment.
  5. Since all the functional possibilities of the area in focus are proposed from the beginning, and the Product Backlog is built by setting aside the less urgent or less significant options rather than by gradually adding what emerges in the work sessions with the stakeholders:
    • you reach the goal more quickly,
    • it is practically impossible to forget aspects of the involved area, and moreover
    • this way provides, at no cost, material to offer the customer additional or alternative functions.
  6. Once the first lists of entities and roles have been written down (and at the beginning we will obviously talk with users and stakeholders about the most important things, which will not be set aside as the work goes on), it is already possible to quickly generate the track for a solid first list of user stories on which to base the first iterations.
  7. Thereafter, new entities, new attributes, even new roles will be added, and the Product Backlog will continue to be fed:
    • iteratively, when new details of the entities already involved emerge, and
    • incrementally, with new entities relating to new aspects of the solution.

Is the Approach Agile? Yes

One might think that the approach requires having a complete data model before starting, but this is not true at all and the approach preserves all the features that make it agile:

  • Quickly reaching a first model from which the triads can be generated depends mainly on the interaction between the participants and on the collaboration with the customer from the beginning;
  • The goal is "working software" in the shortest time possible, where "working" also means certainty of managing in a complete way the entities that the software must deal with, privileging the value and respecting the functional dependencies;
  • The approach allows responding promptly to change: any change or addition to the list of relevant entities and attributes means an immediate generation of new triads for new user stories that add to or replace those previously obtained;
  • The generation of triads in a guided way allows creating room for experimentation, with possibilities (that is: user stories) not initially thought of by the customer (experimentation over prescription).
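The response to change described above can be pictured as a set difference: when an entity is added to the model, only the triads that did not exist before need to be examined. The sketch below assumes the same illustrative roles, actions, and entities used in the example; `triads_for` is a hypothetical helper.

```python
from itertools import product

ACTIONS = ["create", "search", "list", "read", "modify", "delete"]


def triads_for(roles, entities):
    """Return the set of all role/action/entity triads for the given lists."""
    return set(product(roles, ACTIONS, entities))


roles = ["member", "trainer"]

# The model before and after a new entity emerges in a work session.
before = triads_for(roles, ["course"])
after = triads_for(roles, ["course", "course attendance"])

# Only the newly generated triads need to be reviewed with the users.
new_triads = sorted(after - before)

print(len(before))      # 12
print(len(new_triads))  # 12, all involving 'course attendance'
```

The triads already reviewed are untouched, so the backlog grows incrementally without reopening earlier decisions.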


An approach based on identifying the entities related to the software to be created, the roles involved, and the actions that must be performed on these entities during their life cycle offers significant advantages in time and effectiveness: you quickly obtain a solid initial product backlog and can maintain it over time.

The involvement of users and stakeholders is essential for this identification, as always. But, by reducing the work needed to create user stories and by guaranteeing complete coverage of the area, the approach allows you, on the one hand, to focus better on the specific content (the "why") of the user stories to be implemented, with more value for the business, and, on the other hand, to always have clearly in front of you the possibilities not yet explored for the product to evolve.

At the following address you will find a complete example of user stories created with the approach I propose, the one I referred to in the text:


The material is made available under a CC-BY license, so it can also be used commercially as long as my name is mentioned as the author of the stories.

To learn more about the approach, you can reach me at: .

agile development process, data driven development, data modeling, product backlog, user stories

Opinions expressed by DZone contributors are their own.
