DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
Building Scalable Real-Time Apps with AstraDB and Vaadin
Register Now

Trending

  • A Complete Guide to AWS File Handling and How It Is Revolutionizing Cloud Storage
  • Observability Architecture: Financial Payments Introduction
  • RBAC With API Gateway and Open Policy Agent (OPA)
  • How to LINQ Between Java and SQL With JPAStreamer

Trending

  • A Complete Guide to AWS File Handling and How It Is Revolutionizing Cloud Storage
  • Observability Architecture: Financial Payments Introduction
  • RBAC With API Gateway and Open Policy Agent (OPA)
  • How to LINQ Between Java and SQL With JPAStreamer
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. Permutation Distance

Permutation Distance

John Cook user avatar by
John Cook
·
Nov. 19, 19 · Tutorial
Like (2)
Save
Tweet
Share
12.85K Views

Join the DZone community and get the full member experience.

Join For Free

mountain-in-distance

If you have two sequences, and one is a permutation of the other, how do you measure how far apart the two sequences are?

This post was motivated by the previous post on anagram statistics. For a certain dictionary, the code used in that post found that the longest anagram pairs were the following:

certifications, rectifications
impressiveness, permissiveness
teaspoonsful, teaspoonfuls

Anagrams are more amusing when the two words are more dissimilar, at least in my opinion.

There are several ways to measure (dis)similarity. The words in the first two pairs above are dissimilar in meaning, but the words in the last pair are synonyms [1]. The pairs of words above are not that dissimilar as character strings.

It would be hard to quantify how dissimilar two words are in meaning, but easier to quantify how dissimilar they are as sequences of characters. We’ll look at two approaches: Hamming distance and transposition distance.

You may also like: Mathematical Functions and Converting Data Types in Groovy.

Hamming distance

The Hamming distance between two words is the number of positions in which they differ. We can see the Hamming distances between the words above by viewing them in a fixed-width font.

certifications
rectifications

impressiveness
permissiveness

teaspoonsful
teaspoonfuls


The Hamming distances above are 2, 5, and 4.

The anagrams with a maximum Hamming distance in my dictionary are “imperfections” and “perfectionism.” (There’s something poetic about these words being anagrams.) The Hamming distance is 13 because these 13-letter words differ in every position.

imperfections
perfectionism


Transposition Distance

Another way to measure the distance between two permutations is how many elements would have to be swapped to turn one list into the other. This is called the transposition distance. For example, the transposition distance between “certifications” and “rectifications” is 1 because you only need to swap the first and third characters to turn one into the other.

It’s not so obvious what the transposition distance is between “impressiveness” and “permissiveness.” We can find an upper bound on the distance by experimentation. The two words only differ in the first five letters, so the distance between “impressiveness” and “permissiveness” is no larger than the distance between “impre” and “permi.” The latter pair of words are no more than 4 transpositions apart, as the following sequence shows:

impre
pmire
peirm
perim
permi


But is there a shorter sequence of transpositions? Would it help if we used the rest of the letters “ssiveness” as the working space?

Computing transposition distance is hard, as in NP-hard. This is unfortunate since transposition has more practical applications than just word games. It is used, for example, in genetic research. It may be practical to compute short sequences, but, in general, one must rely on bounds and approximations.

Note that transposition distance allows swapping any two elements. If we only allowed swapping consecutive elements, the problem is much easier, but the results are not the same. When restricted to consecutive swaps, the distance between “certifications” and “rectifications” is 3 rather than 1. We can swap the “c” and the “r” to turn “cer” into “rec,” so we have to do something like

cer
ecr
erc
rec


I don’t know a name for the distance where you allow rotations. The words “teaspoonsful” and “teaspoonfuls” differ by one rotation, turning “sful” into “fuls.” While this can be done in one rotation, it would take several swaps.

Further Reading

  • Levenshtein distance.
  • Translating Robert Burns.
  • MachineX: Cosine Similarity for Item-Based Collaborative Filtering.
  • Introduction to Classification Algorithms.

[1] Although “teaspoonfuls” is far more common, I remember a schoolmarm teaching us that this is incorrect.

And I still recoil at “brother in laws” and say “brothers in law” instead; the majority is on my side on this one.

Measure (physics) POST (HTTP) Element Dictionary (software) application Data (computing) Strings IT Computing

Published at DZone with permission of John Cook, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • A Complete Guide to AWS File Handling and How It Is Revolutionizing Cloud Storage
  • Observability Architecture: Financial Payments Introduction
  • RBAC With API Gateway and Open Policy Agent (OPA)
  • How to LINQ Between Java and SQL With JPAStreamer

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: