Earlier this year, I joined Treasure Data as the team’s first UX Designer. I’m here to craft delightful experiences for data scientists, analysts, and fellow data lovers.
Since its inception, Treasure Data’s product strategy has been to roll with the punches by responding expertly to individual user needs and building solutions feature by feature. Although this approach was quick and responsive, it became clear that Treasure Data couldn’t advance to the next step of building exceptional experiences and products without dedicated designers. Now, as a proud member of a newly formed design team, I’m here to help Treasure Data graduate to that next step by shaping great user experiences with data-driven research and well-informed design recommendations.
All hopeful treasures of insight start with trawling through raw usage data. No detail is too small. Menu choices, clicks, field entries, and paths are being analyzed to improve the product’s experience. Let’s take a look at one example of this work.
The Pain in the Schedules
Data analysts often write and schedule queries to be executed regularly when automating data collection or report creation. However, if a query is scheduled to execute every day at midnight, for example, a data loader may be in the middle of collecting relevant data when the clock strikes 12:00am. Delaying the execution helps ensure that the query references the most recent snapshot of the data. Currently, users are required to specify this delay as a number of seconds in an open-ended text field. This, unfortunately, leads to a non-assistive conversation with Treasure Data, forcing users to judge and calculate values themselves.
Instead, we can imagine a more helpful experience. By recommending a reasonable initial value with some common alternatives, we may reduce the mental load and stress often experienced in situations requiring open-ended input. The following sections walk through how I quantitatively sleuthed and arrived at this improved conversation.
Counting Things Carefully
I started by collecting all scheduled queries (819) across all accounts. Scheduled queries without a delay or with a zero delay were ignored.
At first glance, our users specify an unreasonably wide range of delays, anywhere from 1 to 180,000 seconds (50 hours). 70% of delays fall below 1 hour (3,600 seconds). The average delay is around 1.5 hours (5,391 seconds), while the median delay is 40 minutes (2,400 seconds).
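These summary figures can be reproduced with a few lines of Python. The sketch below assumes the delays have already been exported as a plain list of seconds. The function name `summarize_delays` and the sample values are made up for illustration and are not the real data set.

```python
from statistics import mean, median

ONE_HOUR = 3600  # seconds


def summarize_delays(delays):
    """Summarize scheduled-query delays given in seconds."""
    # Ignore queries with no delay (None) or a zero delay, as in the analysis.
    used = [d for d in delays if d]
    return {
        "count": len(used),
        "min": min(used),
        "max": max(used),
        "mean": mean(used),
        "median": median(used),
        # Fraction of delays strictly shorter than one hour.
        "share_below_1h": sum(d < ONE_HOUR for d in used) / len(used),
    }


# Hypothetical sample, not the actual Treasure Data delays.
sample = [0, None, 60, 300, 1000, 2400, 3600, 10800, 180000]
summary = summarize_delays(sample)
print(summary)
```

A few very long delays pull the mean well above the median, which is why the median is a safer guide when choosing a default.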
These basic statistics suggest that a default delay of “1 hour” would work well. But digging deeper surfaced some common mistakes and led me to clear alternatives for customization.
Finding the Pain Points
Where there are pain points, users make mistakes. In this case, mistakes include nonsensical values and data input errors.
Users often specify round numbers of seconds such as 1,000 or 2,000, which initially look like good choices. But on closer inspection, these values convert to 16m:40s and 33m:20s, respectively. The lesson: a round number of seconds does not necessarily amount to a round time.
Some users even start with good intentions and perfect calculation, as evident from several instances of 10,800 seconds (3 hours). However, one of those same users specified a delay of 10,080 seconds (2h:48m) elsewhere, seemingly mistyping the value.
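The conversions are easy to check with a small helper. This is only an illustrative sketch, and `format_delay` is a hypothetical name rather than anything in the product.

```python
def format_delay(seconds):
    """Render a delay in seconds as hours, minutes, and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    # Show seconds when non-zero, or when the whole delay is zero.
    if secs or not parts:
        parts.append(f"{secs}s")
    return " ".join(parts)


# The values discussed above.
for value in (1000, 2000, 10800, 10080):
    print(value, "->", format_delay(value))
```

Showing a readable form like this next to the raw seconds field would already make slips such as 10,080 versus 10,800 easier to spot.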
We can minimize these pain points by eliminating unnecessary calculations and replacing error-prone data entry fields with multiple choice “buckets.”
Bucket All The Things
First, I categorized time values as either “common” or “uncommon.” A “common” time value is one the general public often uses (e.g., 30 seconds, 5 minutes, 45 minutes, 3 hours, etc.), while all others are not (e.g., 51 minutes).
When values were sorted into these buckets (41 in total), common delay time values represented 83% of existing values.
But 41 options are way too many. Instead, a smaller subset can represent most cases. The following 10 values can cover over half the cases (55%) without any changes.
- 1 minute
- 5 minutes
- 10 minutes
- 20 minutes
- 30 minutes
- 45 minutes
- 1 hour
- 2 hours
- 3 hours
- 5 hours
Delays that do not already fall into those categories can be migrated to one of these buckets.
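One possible migration rule is to snap each existing delay to the closest bucket. The original analysis does not specify how values are migrated, so treat the nearest-value rule and the names `BUCKETS` and `nearest_bucket` below as assumptions made for illustration.

```python
# The 10 recommended delays from the list above, in seconds.
BUCKETS = [60, 300, 600, 1200, 1800, 2700, 3600, 7200, 10800, 18000]


def nearest_bucket(delay_seconds):
    """Return the bucket closest to the given delay.

    Assumption: "closest" means smallest absolute difference, and a tie
    goes to the shorter bucket.
    """
    return min(BUCKETS, key=lambda b: (abs(b - delay_seconds), b))


# Delays mentioned in this post, plus the 51-minute example (3,060 seconds).
for delay in (1000, 3060, 10080, 180000):
    print(delay, "->", nearest_bucket(delay))
```

A different rule, such as always rounding up so that a migrated query never runs earlier than before, may be safer in practice. Which rule fits depends on how sensitive each query is to its delay.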
Change is Hard
Open up any UX design book and you’ll find many reasons why these types of recommendations are necessary: to reduce mental load on the user, to minimize mistakes, or to maximize user efficiency. Take your pick. However, we live in the real world with many stakeholders. When transitioning users from a flexible tool to a more restrictive, albeit easier-to-use, one, we will meet resistance. This is why a migration plan is sometimes more important than the design recommendation itself.
In my case, I had to adopt a conservative, multi-stage migration strategy to minimize friction and maximize backward compatibility, given the low priority of the feature in the context of a future overhaul of the entire product. The first stage will implement two controls: a numeric text entry and a dropdown for specifying units (second, minute, hour). By default, 1 hour will be specified. This stage asks users to adjust to very little while allowing some time to prepare them for future change. The second stage will implement a dropdown with a default of 1 hour and 10 recommended alternatives.
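The first stage can be sketched as a pair of conversions between the stored number of seconds and the two new controls. This is a minimal illustration, not the actual implementation. It assumes the delay is still stored in seconds, and the names `UNIT_SECONDS`, `delay_to_seconds`, and `seconds_to_value_unit` are hypothetical.

```python
# Units offered by the stage-one dropdown, with their length in seconds.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

# Default shown for a new schedule: 1 hour.
DEFAULT_DELAY = (1, "hour")


def delay_to_seconds(value, unit):
    """Convert the numeric entry plus unit into seconds for storage."""
    if unit not in UNIT_SECONDS:
        raise ValueError(f"unknown unit: {unit!r}")
    if value < 0:
        raise ValueError("delay must not be negative")
    return value * UNIT_SECONDS[unit]


def seconds_to_value_unit(seconds):
    """Display an existing delay using the largest unit that divides it evenly."""
    for unit in ("hour", "minute"):
        size = UNIT_SECONDS[unit]
        if seconds and seconds % size == 0:
            return seconds // size, unit
    # Anything else (including odd values like 1,000) stays in seconds.
    return seconds, "second"


print(delay_to_seconds(*DEFAULT_DELAY))
print(seconds_to_value_unit(10800))
print(seconds_to_value_unit(1000))
```

Because every existing value can still be displayed and saved unchanged, this stage keeps backward compatibility while letting users start thinking in minutes and hours.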
There are plenty of other features that require the same level of attention and care. Improving small components like table settings and output connector parameters will demand basic statistics, while addressing large themes like transparent data management and secure asset management will call for real user observations and research. Regardless of size, these UX problems have yet to be solved, and I welcome the challenge.
Elmer Kim is a UX designer at Treasure Data, a big data startup. Along the way, he has designed a Text Analytics tool at IBM and researched the future of sensorized prosthetics at the University of Virginia. He's classically trained in systems engineering, data science, applied mathematics, and neuroscience.