Good and bad actors in experimentation.
At some point you have probably heard someone walk through a data analysis saying, "maybe this happened because of that, maybe users did not like the feature, or maybe we were wrong all along." Analysis and prediction in data analytics across industries suffer from what I call the maybe syndrome. Your best bet to overcome it is to have the right set of data and tools in your arsenal.
When we run A/B or A/A experiments to understand the impact of a new product or optimization on the customer base and the associated revenue growth, we often stumble into the maybe syndrome.
The expectation was a positive impact after a feature rollout, yet the numbers are down. Debugging this is especially challenging in a heterogeneous desktop traffic ecosystem: a mix of browsers, network conditions, and an end-to-end experience powered by different applications and network tiers.
That’s where 'Live Diagnostics' comes to the rescue.
For example, you may have made your end-to-end experience 10x faster, on the assumption that quicker is better, yet see a negative impact on revenue. Does that assumption hold when you are simply showing broken pages to the new user base faster?
Even today, with client- and server-side event tracking in place, only successful events posted after a page load capture the interaction; they do not tell you whether a user tried to do something and could not proceed. Most systems in the world track all users, those with good and bad experiences alike, as one huge cumulative data set and make decisions without deterministically isolating the good and bad actors in real time.
Live Diagnostics works on the principle of observing DOM mutations on the page against a pre-registered set of document ids, storing the result set in a local session object, and flushing it through the traditional RUM (real user monitoring) mechanism. It then uses the power of a software load balancer to control when and for whom the diagnostic module is delivered across the end-to-end pages, without application-side code changes in the different internal domains.
For instance, consider the example below. On clicking Action1, the expected successful state for the end user is State 3. However, because additional information is required, or because the session gets corrupted while swapping across domains and subdomains, end users may end up in State 2 or State 4. The closest you can get today is to track the successful call to action on the page and embed both server-side and client-side tracking, then decide over a large data set. If instead we solve the problem by tracking what the user actually sees, irrespective of what is tracked on the client or the server, we can instantly isolate good and bad actors in real time within the experiment, and then compare only the sessions that had no buggy experience in both A and B to measure the true value.
1. Register an observer with an associated callback, where LD_callback is the callback executed on mutations and LD_targetNode is the node of interest to observe for mutations.
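A minimal sketch of this registration in the browser, using the standard MutationObserver API; the '#main-content' selector and the observer options are illustrative assumptions, while LD_callback and LD_targetNode follow the names above:

```javascript
// LD_targetNode: the node of interest; '#main-content' is an illustrative selector.
const LD_targetNode = document.querySelector('#main-content') || document.body;

// LD_callback: executed on every batch of mutations (fleshed out in step 2).
function LD_callback(mutationList, observer) {
  // step 2: capture added/removed nodes from each mutation record
}

// Watch for nodes being added or removed anywhere under the target node.
const LD_observer = new MutationObserver(LD_callback);
LD_observer.observe(LD_targetNode, { childList: true, subtree: true });
```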
2. In the callback, capture the document nodes that are being added or removed.
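A hedged sketch of that callback; LD_session (and the sessionStorage key) is a hypothetical name for the local session object mentioned earlier, and this version replaces the stub from step 1:

```javascript
// LD_session: hypothetical local result set, flushed later via RUM (step 4).
const LD_session = { mutations: [], interactions: [] };

// Replaces the stub from step 1.
function LD_callback(mutationList) {
  for (const mutation of mutationList) {
    if (mutation.type !== 'childList') continue;
    LD_session.mutations.push({
      ts: Date.now(),
      targetId: mutation.target.id || mutation.target.nodeName,
      added: [...mutation.addedNodes].map((n) => n.nodeName),
      removed: [...mutation.removedNodes].map((n) => n.nodeName),
    });
  }
  // Keep a copy that survives soft navigations within the same tab.
  sessionStorage.setItem('LD_session', JSON.stringify(LD_session));
}
```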
3. Apart from the added and removed nodes, which help isolate changes to the DOM and to what the user sees on the page, it is mandatory to capture the level of interaction by the user.
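One way to do that, reusing the LD_session object from step 2, is to listen for clicks in the capture phase; the data-ld-cta attribute is an illustrative convention, not something prescribed here:

```javascript
// Record clicks on call-to-action elements so a repeated attempt (the "user
// tried thrice" case later on) sits next to the DOM mutations it triggered.
document.addEventListener('click', (event) => {
  if (!(event.target instanceof Element)) return;
  const cta = event.target.closest('[data-ld-cta], button, a');
  if (!cta) return;
  LD_session.interactions.push({
    ts: Date.now(),
    action: cta.getAttribute('data-ld-cta') || cta.id || cta.tagName,
  });
}, { capture: true }); // capture phase: recorded even if the page's own handler breaks
```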
4. Flush the result set from your session object to the respective RUM (real user monitoring) endpoint.
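A sketch of the flush, preferring navigator.sendBeacon where available; the /rum/ld endpoint is a placeholder for whatever your RUM collector exposes:

```javascript
// '/rum/ld' is a placeholder endpoint; swap in your RUM collector's URL.
function LD_flush() {
  const payload = JSON.stringify(LD_session);
  // sendBeacon survives page unloads, which is exactly when bad sessions get lost.
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/rum/ld', payload);
  } else {
    fetch('/rum/ld', { method: 'POST', body: payload, keepalive: true });
  }
}

// Flush whenever the page is hidden or the user navigates away.
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') LD_flush();
});
```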
Once the four steps above are in place, you have a standalone module that can attach to any page in your domain space and start emitting diagnostic metrics.
If this were shipped with an individual application's code base, you would have finer control over what the user does on a particular page but lose end-to-end insight into the funnel; if it were part of the global code base, you would lose control over when and for whom it runs during an experiment.
Hence, by leveraging the power of the Envoy software load balancer, where you can flush the gzipped script into the response stream (as part of encodeData) based on different control mechanisms, you now have an LD module that can be switched on and off as required.
5. Integrate the diagnostic module on the Envoy SLB.
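What the SLB-delivered piece might look like from the browser's side is sketched below, assuming the Envoy filter appends a small loader to HTML responses from its encodeData hook; the cookie name, script URL, and bucketing rule are all assumptions, the point being that the page's own code base never changes:

```javascript
// Loader snippet an Envoy response filter could inject for selected sessions.
(function () {
  // The SLB decides "when and who": only sessions bucketed into the experiment
  // receive this snippet, so LD is effectively an on/off switch.
  if (!/(?:^|;\s*)ld_enabled=1(?:;|$)/.test(document.cookie)) return;
  var s = document.createElement('script');
  s.src = '/static/live-diagnostics.min.js'; // the gzip-served standalone LD module
  s.async = true;
  document.head.appendChild(s);
})();
```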
For instance, the diagnostic result set immediately surfaces a user retrying the same call to action three times, or failing to complete a call to action at all, both clearly bad user experiences. More broadly:
- LD can be embedded in any static page to distinguish genuine user actions from bots.
- LD can give insights into cached page interactions.
- LD can isolate good and bad actors in experimentation.
- LD can split out the error messages shown to end users for issues that are isolated to, and reproducible only in, the production ecosystem.
- LD can refine the final analytics data used to make big product and customer decisions.
- LD can diagnose issues that users never report and alert on issues happening in production.
The next time you are building user tracking, or moving more and more content closer to users with edge computing, consider Live Diagnostics for your tool arsenal: to know what you don't know, and to know it before everyone else does.