Software Labels Translation Is Not So Easy

Creating clear, natural, and unambiguous labels and instructions for an application is hard enough. But, translating your prompts into another human language will present issues you may not have thought of.

Some developers have hardly ever touched software label translation; others do it on a day-to-day basis. It sure helps to work in a country with more than one language, official or de facto.

Even in the first case, it’s considered good practice to externalize labels in properties files. As for the second case, the languages involved are generally related.

In Java, the whole label translation mechanism is handled through a hierarchy of properties files. At the top of the hierarchy lies the root file, at the second level come the language-specific files, and at the bottom the country-specific files (let’s forget about lower levels, since I haven’t seen them used in 15 years). The translation of a message string is searched for along a specific locale, starting from the most specific level, the country, up to the root. As soon as the translation is found at any level, the resolution mechanism stops there and the label is returned.
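The lookup order described above can be sketched as follows. This is a minimal illustration, not the actual `ResourceBundle` implementation: the in-memory maps stand in for hypothetical `messages.properties`, `messages_fr.properties` and `messages_fr_CH.properties` files, and the keys and sample values are made up for the example. The only JDK piece involved is `ResourceBundle.Control.getCandidateLocales()`, which lists the locales from most specific to root. The real `ResourceBundle.getBundle()` also tries the JVM default locale before falling back to the root, which this sketch leaves out.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class LabelLookupSketch {

    // Hypothetical stand-ins for the properties files of a "messages" bundle.
    static final Map<Locale, Map<String, String>> FILES = Map.of(
        Locale.ROOT, Map.of("number.seventy", "seventy", "button.ok", "OK"),
        Locale.FRENCH, Map.of("number.seventy", "soixante-dix"),
        Locale.forLanguageTag("fr-CH"), Map.of("number.seventy", "septante"));

    // Walks the candidate locales (country, then language, then root)
    // and returns the first label found.
    static String resolve(String key, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        List<Locale> candidates = control.getCandidateLocales("messages", locale);
        for (Locale candidate : candidates) {
            String label = FILES.getOrDefault(candidate, Map.of()).get(key);
            if (label != null) {
                return label; // resolution stops at the first level that has the key
            }
        }
        throw new MissingResourceException("No label for " + key, "messages", key);
    }

    public static void main(String[] args) {
        Locale swissFrench = Locale.forLanguageTag("fr-CH");
        System.out.println(resolve("number.seventy", swissFrench)); // found at country level
        System.out.println(resolve("button.ok", swissFrench));      // falls back to the root
    }
}
```

A key defined only in the root file is shared by every locale, which is why the root usually holds the complete set of labels.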

As a simple example, let’s take a basic use case: displaying the number of items in a list. I’d probably need 3 labels:

  • No item found
  • One item found
  • Multiple items found

This is probably the resulting messages.properties file:

result.found.none=No item found
result.found.one=One item found
result.found.multiple={0} items found
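To show how these three keys might be used, here is a minimal sketch. The class and method names are hypothetical, and the labels are kept in a map instead of being loaded from the properties file so that the example stays self-contained. The `{0}` placeholder is filled in with `java.text.MessageFormat`.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;

public class ResultLabelSketch {

    // Same content as the messages.properties file above, inlined for the example.
    static final Map<String, String> LABELS = Map.of(
        "result.found.none", "No item found",
        "result.found.one", "One item found",
        "result.found.multiple", "{0} items found");

    // Picks one of the three keys from the item count and formats the label.
    static String label(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative");
        }
        if (count == 0) {
            return LABELS.get("result.found.none");
        }
        if (count == 1) {
            return LABELS.get("result.found.one");
        }
        MessageFormat format =
            new MessageFormat(LABELS.get("result.found.multiple"), Locale.ENGLISH);
        return format.format(new Object[] { count });
    }

    public static void main(String[] args) {
        for (int count : new int[] { 0, 1, 3 }) {
            System.out.println(label(count));
        }
    }
}
```

Note that the choice between the keys is hard-coded around English grammar: zero, one, and everything else.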

Things get interesting when, after the initial release, the customer wants the software translated into an unrelated language. Let’s not go as far as languages written with logographic characters, such as Mandarin Chinese, or right-to-left languages such as Arabic, but use Russian, a language I’m trying to learn (emphasis on try).

Russian has grammatical cases, as Latin does.

Case is a grammatical category whose value reflects the grammatical function performed by a noun or pronoun in a phrase, clause, or sentence. In some languages, nouns, pronouns, and their modifiers take different inflected forms depending on what case they are in.
[…]
Commonly encountered cases include nominative, accusative, dative, and genitive. A role that one of these languages marks by case will often be marked in English using a preposition.
— Wikipedia

So, what’s the fuss about? Just translate the file and be done with it! Well, Russian is an interesting language for counting. With one, you’d use the singular; from 2 to 4, the noun takes the genitive singular; and starting from 5, it takes the genitive plural. The genitive here indicates quantity.

Now the keys will look like the following (the messages are not very important by themselves):

  • result.found.none
  • result.found.one
  • result.found.twotofour
  • result.found.five
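One way to map a count to one of these four keys with the JDK alone is `java.text.ChoiceFormat`, which associates numeric ranges with strings. The sketch below is only an illustration under that assumption, with a hypothetical class name. It follows the ranges exactly as listed above, so a count such as 21 lands in the "five" bucket, even though Russian compound numbers follow their last word and 21 would actually call for the singular.

```java
import java.text.ChoiceFormat;

public class RangeKeySketch {

    // Each "limit#text" entry applies from its limit up to the next limit.
    static final ChoiceFormat KEY_BY_COUNT = new ChoiceFormat(
        "0#result.found.none"
            + "|1#result.found.one"
            + "|2#result.found.twotofour"
            + "|5#result.found.five");

    // Returns the label key for the given count, based on ranges only.
    static String keyFor(int count) {
        return KEY_BY_COUNT.format(count);
    }

    public static void main(String[] args) {
        for (int count : new int[] { 0, 1, 2, 4, 5, 10 }) {
            System.out.println(count + " -> " + keyFor(count));
        }
    }
}
```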

Is it OK to translate, now? Not quite. Russian is derived from Old Slavic, and Old Slavic had three grammatical numbers: singular, plural and dual. Russian has only singular and plural, but a remnant of the dual survives in the numeral two, which agrees with the gender of the noun. With a feminine noun, you’d use две instead of два.

This requires the following keys:

  • result.found.none
  • result.found.one
  • result.found.two.feminine
  • result.found.two.notfeminine
  • result.found.threetofour
  • result.found.five
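Once gender enters the picture, the count alone is no longer enough to choose a key: the caller also has to know the gender of the noun being counted. A minimal sketch of such a selector could look like this. The `Gender` enum, the class name and the method name are assumptions made for the example, and the ranges simply mirror the keys listed above.

```java
public class RussianKeySketch {

    // Hypothetical two-valued gender, matching the "feminine" / "notfeminine" keys.
    enum Gender { FEMININE, NOT_FEMININE }

    // Chooses one of the six keys from the count and the gender of the counted noun.
    static String keyFor(int count, Gender gender) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative");
        }
        if (count == 0) {
            return "result.found.none";
        }
        if (count == 1) {
            return "result.found.one";
        }
        if (count == 2) {
            // Only the numeral two needs the gender of the noun here.
            return gender == Gender.FEMININE
                ? "result.found.two.feminine"
                : "result.found.two.notfeminine";
        }
        if (count <= 4) {
            return "result.found.threetofour";
        }
        return "result.found.five";
    }

    public static void main(String[] args) {
        System.out.println(keyFor(2, Gender.FEMININE));
        System.out.println(keyFor(2, Gender.NOT_FEMININE));
        System.out.println(keyFor(3, Gender.FEMININE));
    }
}
```

The extra parameter is the real cost: every call site that displays a count now has to carry grammatical information about the thing it counts.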

And this covers only the things I know. I’m afraid there might be more rules that I don’t know about.

There are a couple of lessons to learn here:

  1. Translations are not straightforward, especially when the target is a language with different roots.
  2. Internationalization is much larger and harder than just translations. Think about dates: should month or day come first? And localization is even larger and harder than internationalization.
  3. The cost of translation is not zero, and will probably be higher than expected. Estimates are hard, and wrong most of the time.
  4. Never ever assume anything. Implicit is bad in software projects… 
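The date question raised in the second lesson is easy to observe with the JDK’s own locale data. The sketch below formats the same date for two locales with `java.time.format.DateTimeFormatter`. The class and method names are hypothetical, and the exact output depends on the locale data shipped with the JDK in use, so treat it as an illustration only.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class DateOrderSketch {

    // Formats a date in the short style that the given locale expects.
    static String shortDate(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT)
            .withLocale(locale)
            .format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2024, 3, 4); // 4 March 2024
        System.out.println(shortDate(date, Locale.US));     // month comes first
        System.out.println(shortDate(date, Locale.FRANCE)); // day comes first
    }
}
```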

Published at DZone with permission of Nicolas Frankel, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
