Testing Chatbots for the Unexpected

This article shows how to design a robust test strategy for a mission-critical enterprise chatbot by using automated testing.

By Florian Treml · Updated Feb. 28, 22 · Opinion

We are often asked to design a robust test strategy for a mission-critical enterprise chatbot. How can you test a chatbot against every unexpected user behaviour that might occur in the future? And how can anyone make confident statements about quality when we have no idea what users will ask the chatbot?

Short-Tail vs Long-Tail Topics

We do not own a crystal ball that reveals future usage scenarios, but in our experience a systematic approach combined with a continuous feedback loop gives the best results. In almost every chatbot project, the use cases fall into three categories:

  • short-tail topics: the topics serving most of your users' needs (typically, the 90/10 rule applies: 90% of your users ask about only 10% of the topics)
  • long-tail topics: all the other, more exotic topics that the chatbot could also answer
  • handover topics: topics where, for whatever reason, handover to a human agent is required

Examples of short-tail topics in the telecom domain:

  • Opening hours of the flagship stores
  • WLAN connectivity issues
  • Questions on individual invoice lines

Examples of long-tail topics in the telecom domain:

  • iPhone availability in the flagship stores
  • International roaming conditions
  • Lost SIM card or phone

Examples of handover topics in the telecom domain:

  • Contract cancellation
  • Business products and business customers
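As a rough illustration of the short-tail/long-tail split, the sketch below ranks topics by how often they occur in a conversation log and treats the most frequent ones, which together account for about 90% of requests, as short-tail. The log format, the topic labels, and the function name `split_short_and_long_tail` are assumptions made for this example. Handover topics are a business decision and cannot be derived from frequency alone, so they are not covered here.

```python
from collections import Counter


def split_short_and_long_tail(topic_log, traffic_share=0.9):
    """Return (short_tail, long_tail) topic lists.

    short_tail holds the most frequent topics that together account for
    at least `traffic_share` of all logged requests.
    """
    counts = Counter(topic_log)
    total = sum(counts.values())
    short_tail, long_tail = [], []
    covered = 0
    for topic, count in counts.most_common():
        # Keep adding topics until the requested share of traffic is reached.
        if total and covered / total < traffic_share:
            short_tail.append(topic)
            covered += count
        else:
            long_tail.append(topic)
    return short_tail, long_tail


# Hypothetical log: one topic label per user conversation.
log = (
    ["opening_hours"] * 50
    + ["wlan_issues"] * 30
    + ["invoice_lines"] * 12
    + ["iphone_availability"] * 4
    + ["roaming"] * 3
    + ["lost_sim"] * 1
)

short_tail, long_tail = split_short_and_long_tail(log)
print("short-tail:", short_tail)
print("long-tail:", long_tail)
```

With this made-up log, three topics already cover more than 90% of the conversations, which is the set to focus training and testing on first.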

Getting Started

Our recommendations for the first steps in a chatbot project are always the same:

  1. Focus training and testing exclusively on the short-tail topics. For a customer support chatbot, which covers the majority of the chatbots we have worked with, there are only a handful of topics for which you have to provide good test coverage
  2. Leave the long-tail topics aside for now and design a clear human handover process

The challenge now is how to get good test coverage for the short-tail topics. This is the really hard work in a chatbot project.

Continuous Feedback

As soon as the chatbot is live, constant re-training is required. This process involves manual work: evaluating real user conversations that went wrong for some reason and deriving the required training steps. Test coverage increases along the way. For the short-tail topics, we can expect near-100% test coverage within several weeks after launch: these are the topics asked over and over again, and as complex as human language may be, there is only a finite number of ways to express an intent in a reasonably short form.
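One possible way to track this growing coverage is to compare the utterances seen in production with the utterances already present in the test set. The sketch below is a minimal illustration under stated assumptions: conversations are already labelled with a topic by a human reviewer, coverage means "this exact phrasing (ignoring case and whitespace) has a test case", and all names (`coverage_report`, `coverage_ratio`, the sample data) are hypothetical.

```python
def normalize(text):
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def coverage_report(observed, test_cases):
    """Group distinct observed utterances per topic into covered/missing."""
    tested = {(topic, normalize(text)) for topic, text in test_cases}
    report = {}
    for topic, text in observed:
        entry = report.setdefault(topic, {"covered": set(), "missing": set()})
        key = normalize(text)
        bucket = "covered" if (topic, key) in tested else "missing"
        entry[bucket].add(key)
    return report


def coverage_ratio(entry):
    """Share of distinct observed utterances that already have a test case."""
    total = len(entry["covered"]) + len(entry["missing"])
    return len(entry["covered"]) / total if total else 0.0


# Hypothetical data: (topic, utterance) pairs.
test_cases = [
    ("opening_hours", "when are you open"),
    ("opening_hours", "opening hours"),
    ("wlan_issues", "my wifi is not working"),
]
observed = [
    ("opening_hours", "When are you open"),
    ("opening_hours", "opening hours"),
    ("opening_hours", "are you open on sunday"),
    ("wlan_issues", "My WiFi is not working"),
]

report = coverage_report(observed, test_cases)
for topic, entry in report.items():
    print(topic, f"{coverage_ratio(entry):.0%}", "missing:", sorted(entry["missing"]))
```

The utterances listed as missing are the candidates to review manually and add to the training and test data in the next feedback cycle.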

Despite the title “Testing Chatbots for the Unexpected”, we suggest a very down-to-earth approach: do not test for the unexpected or the unknown, but test for “the most likely options” and aim for high test coverage of those cases (this is possible because human language is complex but finite).

Automated Testing

When it comes to test automation, the critical question is whether a test is worth the initial automation effort.

  • Establishing automated testing for the short-tail topics is a must
  • Ensure a smooth handover process to a human agent
  • Long-tail topics can be handled with manual testing in the beginning
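The first two points can be sketched as a small regression suite: a fixed list of short-tail utterances with their expected intents, plus a few cases that must end in a handover. Everything here is an assumption for illustration. In particular, `classify_intent` is a keyword-based stand-in for the call to the real chatbot, and the intent names are invented.

```python
HANDOVER = "handover"


def classify_intent(utterance):
    """Stand-in for the real chatbot (assumption): simple keyword rules."""
    text = utterance.lower()
    rules = [
        ("cancel", HANDOVER),
        ("business", HANDOVER),
        ("open", "opening_hours"),
        ("wlan", "wlan_issues"),
        ("wifi", "wlan_issues"),
        ("invoice", "invoice_lines"),
    ]
    for keyword, intent in rules:
        if keyword in text:
            return intent
    # Anything the bot does not recognise is handed over to a human agent.
    return HANDOVER


# Short-tail utterances and handover cases with their expected outcome.
REGRESSION_CASES = [
    ("When are your flagship stores open?", "opening_hours"),
    ("My WLAN keeps disconnecting", "wlan_issues"),
    ("What is this line on my invoice?", "invoice_lines"),
    ("I want to cancel my contract", HANDOVER),
    ("Do you have offers for business customers?", HANDOVER),
]


def run_regression(classify, cases):
    """Return a list of (utterance, expected, actual) for every failed case."""
    failures = []
    for utterance, expected in cases:
        actual = classify(utterance)
        if actual != expected:
            failures.append((utterance, expected, actual))
    return failures


failures = run_regression(classify_intent, REGRESSION_CASES)
print(f"{len(REGRESSION_CASES) - len(failures)}/{len(REGRESSION_CASES)} cases passed")
for utterance, expected, actual in failures:
    print(f"FAILED: {utterance!r} expected {expected}, got {actual}")
```

Running such a suite after every re-training step shows quickly whether a change for one topic broke another. Long-tail topics are deliberately absent from the case list, in line with the recommendation to test them manually at first.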

Published at DZone with permission of Florian Treml. See the original article here.
