
Designing Human-Targeted Random IDs

Very usable human-targeted random IDs are short, only contain digits and ASCII letters, and are designed to prevent and detect typos.

Bertrand Florat · Apr. 18, 22 · Java Zone · Analysis

NOTE: This article does not cover technical IDs used as primary keys in relational databases. See my previous article if you are looking for a good way to generate those.

Context

During one of my recent projects, I was asked to design an ID scheme that is highly usable by humans. The main business requirement was to create pseudo-random values that can't be inferred or guessed, so that they can be used as a secret token printed on official documents for later verification.

Later on, we had a similar requirement with lower security concerns: generating human-readable file numbers that can be printed on the associated documents, read aloud over the phone, or typed when running searches.

Another well-known example (in France at least) is the ID (aka "SNCF number") that the French railway company attaches to each train trip, so that travelers can easily open the trip details from their smartphone without being fully authenticated.

Main Criteria

After comparing existing solutions and analyzing the business stakeholders' requirements, the following criteria emerged:

  • These IDs have to be short enough to be easily typed, read, or spoken over the phone by a human (no more than six to ten characters).
  • They have to include mechanisms that prevent and detect typos.
  • They don't have to be unique (and can't be, because of their small size and thus limited number of possible values). However, the system has to prevent collisions, either by coupling these IDs with some other value (like a person's last name) or by drawing a new value when a randomly generated one already exists (the solution we use). Keep in mind that closed items may hold the same ID as an active one (when searching by ID, for instance, make sure to take the status into account).
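
The retry strategy from the last criterion can be sketched as follows. This is a minimal illustration under assumptions, not the code we run in production: the class name `UniqueIdAllocator`, the `maxAttempts` limit, and the in-memory `Set` standing in for a real uniqueness check (such as a database constraint on active items) are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper: draws candidate IDs until one is not already in use. */
final class UniqueIdAllocator {
    // Stand-in for a real store, e.g. a table with a unique constraint on active IDs.
    private final Set<String> activeIds = new HashSet<>();
    private final Supplier<String> candidateSource;
    private final int maxAttempts;

    UniqueIdAllocator(Supplier<String> candidateSource, int maxAttempts) {
        this.candidateSource = candidateSource;
        this.maxAttempts = maxAttempts;
    }

    /** Returns a fresh ID, retrying when the drawn value collides with an active one. */
    String allocate() {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String candidate = candidateSource.get();
            // Set.add returns false when the value is already present.
            if (activeIds.add(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No free ID found after " + maxAttempts + " attempts");
    }

    /** Frees an ID when its item is closed, so the value may be drawn again later. */
    void release(String id) {
        activeIds.remove(id);
    }
}
```

Bounding the number of attempts is a deliberate choice in this sketch: if retries become frequent, the ID space is probably too small for the number of active items, and failing loudly is more useful than looping.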

How To Make These Values Truly Usable?

  • Limit the set of possible characters: go beyond base-10 (decimal) digits by adding lowercase and uppercase letters, but avoid other characters (punctuation marks, diacritics, ...) that are more difficult to read. Hence, in theory, we can generate values made of up to 10 digits + 26 lowercase ASCII letters + 26 uppercase ASCII letters = base-62 numbers.
  • Ease typing and reading as much as possible: the ID should be made of no more than four or five characters that are easily memorized as a whole, like aGty3. If it is longer, split it into groups using hyphens (rather than underscores, which can be difficult to read when the ID is displayed as a hyperlink).
  • Make sure that these values can be pasted in a single operation, even when the form splits them across clearly separated text fields.
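
The grouping and pasting advice above could be supported by two small helpers like the ones below. This is only a sketch: the class `IdFormat` and its method names are hypothetical, and the group size is left as a parameter.

```java
/** Hypothetical formatting helpers for displaying and parsing grouped IDs. */
final class IdFormat {
    private IdFormat() {}

    /** Splits a raw ID into hyphen-separated groups of groupSize characters. */
    static String group(String raw, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            // Insert a separator before each new group, but not before the first one.
            if (i > 0 && i % groupSize == 0) {
                out.append('-');
            }
            out.append(raw.charAt(i));
        }
        return out.toString();
    }

    /** Removes hyphens and whitespace so that a pasted or typed ID can be compared or split across fields. */
    static String normalize(String input) {
        return input.replaceAll("[-\\s]+", "");
    }
}
```

With these helpers, `IdFormat.group("aTy25fTkrp9z", 4)` yields `aTy2-5fTk-rp9z`, and `IdFormat.normalize` turns a pasted value back into the raw form. Case is deliberately left untouched, since uppercase and lowercase letters are distinct symbols in this scheme.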

How To Prevent And Detect Typos?

  • Exclude confusing characters. Keep in mind that similarity also depends on the font used: an 'l' can be easily distinguished from a '1' in a plain old monospaced font, but less so in a sans-serif one. We advise excluding the most problematic cases: 'O' and '0' (zero), 'Z' and '2', and 'l' and '1'. By dropping these six characters, we now deal with base-56 numbers.
  • Reserve some bits as a CRC or checksum in order to detect most typos early, on the frontend. Such systems have been used by banks for decades, on IBAN account numbers for instance (using the MOD 97 algorithm). Users will thank you for notifying them early, and this GUI-side sanity check avoids useless server-side queries and ugly error logs on the backend.

NOTE: Lightweight checksum schemes detect most, but not all, possible typos.
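
As an illustration of both points, here is a minimal sketch that builds the 56-symbol alphabet and appends one check character. The checksum used is the Luhn mod N algorithm, which is my choice for this example and not the MOD 97 scheme used by IBAN; the class `CheckedId` and its method names are hypothetical. With an even alphabet size such as 56, this scheme detects every single-character substitution and most (not all) swaps of adjacent characters.

```java
/** Hypothetical sketch: base-56 alphabet plus a Luhn mod N check character. */
final class CheckedId {
    // Digits and ASCII letters minus the confusing 'O', '0', 'Z', '2', 'l', '1': 56 symbols.
    static final String ALPHABET =
            "3456789" + "ABCDEFGHIJKLMNPQRSTUVWXY" + "abcdefghijkmnopqrstuvwxyz";

    private CheckedId() {}

    /** Sums the symbol values from right to left, doubling every other one (Luhn mod N). */
    private static int luhnSum(String text, int firstFactor) {
        int n = ALPHABET.length();
        int factor = firstFactor;
        int sum = 0;
        for (int i = text.length() - 1; i >= 0; i--) {
            int value = ALPHABET.indexOf(text.charAt(i));
            if (value < 0) {
                return -1; // character outside the alphabet
            }
            int addend = factor * value;
            factor = (factor == 2) ? 1 : 2;
            // Add the "digits" of the addend as expressed in base n.
            sum += addend / n + addend % n;
        }
        return sum;
    }

    /** Returns the payload followed by its check character. */
    static String withCheckChar(String payload) {
        int sum = luhnSum(payload, 2);
        if (sum < 0) {
            throw new IllegalArgumentException("Payload contains a character outside the alphabet");
        }
        int n = ALPHABET.length();
        return payload + ALPHABET.charAt((n - sum % n) % n);
    }

    /** Checks an ID (payload + check character); false for unknown characters or a bad checksum. */
    static boolean isValid(String id) {
        if (id.length() < 2) {
            return false;
        }
        int sum = luhnSum(id, 1);
        return sum >= 0 && sum % ALPHABET.length() == 0;
    }
}
```

A caller would store and print `CheckedId.withCheckChar(payload)` and run `CheckedId.isValid(input)` before querying the backend. The same logic is short enough to be ported to the frontend, which is where the early detection described above pays off.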

What About The Security?

  • If these human-readable IDs are used in serious matters dealing with money, security, or official documents, make sure to use a cryptographically secure pseudorandom number generator (CSPRNG) to generate the numbers that you will then convert to your base-56 number. For instance, on a Linux server, rely on the kernel CSPRNG: /dev/urandom is suitable once the entropy pool has been initialized, and /dev/random draws from the same generator on recent kernels. This makes the values unpredictable and avoids the repeated sequences that a weak or poorly seeded generator can produce (generating the same value twice in a short amount of time).
  • The ID length should be proportional to the required difficulty to guess it.
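
In Java, a CSPRNG is available through `java.security.SecureRandom`. The sketch below is one possible way to use it, assuming the base-56 alphabet discussed earlier; the class name `RandomIdGenerator` is hypothetical. Instead of generating one large number and converting it to base 56, it draws each symbol independently, which gives the same distribution and keeps the code short.

```java
import java.security.SecureRandom;

/** Hypothetical sketch: generates random IDs over the base-56 alphabet with a CSPRNG. */
final class RandomIdGenerator {
    // Digits and ASCII letters minus 'O', '0', 'Z', '2', 'l', '1'.
    static final String ALPHABET =
            "3456789" + "ABCDEFGHIJKLMNPQRSTUVWXY" + "abcdefghijkmnopqrstuvwxyz";

    // SecureRandom is thread-safe, so one shared instance is enough here.
    private static final SecureRandom RANDOM = new SecureRandom();

    private RandomIdGenerator() {}

    /** Returns an ID made of the given number of randomly drawn symbols. */
    static String generate(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        StringBuilder id = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            // nextInt(bound) is uniform over [0, bound), which avoids modulo bias.
            id.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }
}
```

A call such as `RandomIdGenerator.generate(12)` returns twelve random symbols, which can then be grouped with hyphens for display.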

Some Examples Please

Imagine that you only want to avoid '0'/'O' and '1'/'l' confusions and that you want to generate IDs with a collision risk as low as 1 in 2.6 × 10¹⁷. You can then generate values (using a CSPRNG) like:

aTy2-5fTk-rp9z

or

bUD5-64kP-hlA4

For less critical use cases, fewer characters may be enough:

aTy2-5fTk

or

64kP-hlA4

For short-lived and low-risk IDs, see what SNCF does for travel files (only six capital letters):

XSDTGE
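
To compare schemes like the ones above, it helps to compute how many values a given alphabet and length can produce. The helper below is a hedged sketch (the class `IdSpace` and its methods are hypothetical): the number of possible IDs is the alphabet size raised to the number of random characters, and the equivalent number of bits is the length times log2 of the alphabet size. Characters reserved for a checksum should not be counted, since they carry no randomness.

```java
import java.math.BigInteger;

/** Hypothetical helper: size of the value space for a given ID scheme. */
final class IdSpace {
    private IdSpace() {}

    /** Number of distinct IDs: alphabetSize raised to the number of random characters. */
    static BigInteger size(int alphabetSize, int randomChars) {
        return BigInteger.valueOf(alphabetSize).pow(randomChars);
    }

    /** Equivalent randomness in bits: randomChars * log2(alphabetSize). */
    static double entropyBits(int alphabetSize, int randomChars) {
        return randomChars * (Math.log(alphabetSize) / Math.log(2));
    }
}
```

For the six-capital-letter format, `IdSpace.size(26, 6)` gives 308,915,776 possible values, which is acceptable for short-lived, low-risk IDs but far too small for a secret token.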

Conclusion

Generating readable random IDs for humans can be easily achieved, but a number of requirements must be taken into account. The scheme has to vary according to the targeted usage, but keep in mind that changing an existing scheme is cumbersome and can require maintaining several ID schemes for a long time. I hope that this article will help you think about the not-so-obvious criteria, so that you can design your IDs right on the first attempt. I would be glad to get feedback if I have forgotten important or obvious points.

Published at DZone with permission of Bertrand Florat. See the original article here.
