
Designing a Multi-Language Database

This article looks at three best-practice database designs to store data in multiple languages and easily scale to new markets.

By Antonello Zanini · Jul. 20, 22 · Tutorial

Reaching millions of users with an application is every developer's dream. Achieving this goal becomes easier if users from all over the world can use your application. Since not all users know English or your target language, you need to design your application to be multilingual. This would not be possible without a multi-language database.

Designing a multi-language database that can easily scale to new languages is not easy. This is why you should rely on best practices. Here, you will see why you need a multi-language database, three great multi-language database designs, and how to choose the best one for you.

Let's now learn everything you need to know about multi-language database design.

Why Do You Need a Multi-Language Database?

You never know how much your project will grow, and your small app for your friends could become an international service used by millions of people. If you want your product to be able to scale internationally, you need to design it so that you can easily adapt it to different cultures and markets. That is what internationalization is about.

Building a database ready for internationalization means designing a database that can store multilingual data. In other words, the backend should be able to provide data in multiple languages. To do this, the backend should connect and retrieve this data from a multi-language database.

Notice that giving users the possibility to switch between several languages is a nice feature to have. This is especially useful for polyglots or non-native speakers. So, even if your project is small and targets a local market, you should consider a multi-language database. After all, you cannot know in advance how successful your project will be and which users will use it. Let's now learn why choosing the right multi-language database design for your needs is crucial.

Why You Should Design Your Multi-Language Database Carefully

Changing a database is a very costly operation in terms of time and energy because it triggers a ripple effect. Modifying the structure of the database involves changing the backends connected to it. This may also require you to adapt the frontends that rely on those backends accordingly. As you can see, changing a database structure is not an operation without consequences. For this reason, you should design a database that can scale easily and that you do not have to change frequently.

As you can imagine, there are several ways to design a multi-language database, and each solution has its pros and cons. Considering the importance of the database structure for an application, you have to design your multi-language database wisely. This is why you should not start from scratch but rely on best practices.

Let's now take a look at some multi-language designs based on best practices.

3 Multi-Language Database Designs

Let's delve deeper into the pros and cons of three multi-language designs I have personally used as a full-stack web developer working for startups all over the world.

1. Column Approach

In this approach, each field in a multilingual table has a number of columns equal to the number of languages supported by the database.

Specifically, column names follow this template: columnName_languageCode (for example, name_en and name_it).

Pros
  • Simple: it is easy to implement.
  • Fast: it does not involve JOIN or slow queries.
  • Easy to deal with non-translated fields: if the translation for a field is missing, you can just fall back to the default language with COALESCE. E.g. COALESCE(name_it, name_en) returns name_it if it is not NULL, otherwise the name_en default value.
Cons
  • Hard to maintain: adding a new language requires updating all multi-language tables in the database. This also means that you need to change your ORM mappings accordingly.
  • Not scalable: the number of columns in tables grows with the number of languages supported by the application.
  • Complex SELECT conditions: forget about SELECT *. You need to specify each column in your SELECT clause.
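
The column approach can be sketched as follows. This is a minimal illustration rather than the author's schema: it assumes a hypothetical product table, English as the default language, and Python's built-in sqlite3 module as the database. It shows both the COALESCE fallback and the schema change that a new language requires.

```python
import sqlite3

# Hypothetical schema: one "name" column per supported language.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id      INTEGER PRIMARY KEY,
        price   REAL NOT NULL,
        name_en TEXT NOT NULL,  -- default language (assumption)
        name_it TEXT            -- NULL until a translation exists
    );
    INSERT INTO product (id, price, name_en, name_it) VALUES
        (1, 9.99, 'Chair', 'Sedia'),
        (2, 24.50, 'Table', NULL);
""")

SUPPORTED = {"en", "it"}


def product_names(lang):
    """Return (id, name) pairs in `lang`, falling back to English."""
    # Column names cannot be bound as query parameters,
    # so the language code is checked against an allow-list first.
    if lang not in SUPPORTED:
        raise ValueError(f"unsupported language: {lang}")
    column = f"name_{lang}"
    query = f"SELECT id, COALESCE({column}, name_en) FROM product ORDER BY id"
    return conn.execute(query).fetchall()


# Supporting a new language means altering every multi-language table.
conn.execute("ALTER TABLE product ADD COLUMN name_fr TEXT")
SUPPORTED.add("fr")
```

Calling product_names("it") returns the Italian name where one exists and the English name otherwise. Note that the ALTER TABLE step has to be repeated for each translated column in each table, which is the maintenance cost listed in the cons.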

2. Row Approach

In this approach, there is one row for each language. The key to identifying a multi-language entity is the following composite primary key:

<id, languageCode>


Pros
  • Simple: it is easy to implement.
  • Fast: retrieving the translated content only requires a WHERE condition on languageCode.
Cons
  • Complex keys: using a composite primary key makes identifying an element and JOIN queries more complex.
  • Duplicated content: to simplify things, non-translated columns generally store the same content saved in the columns of the default language row. This means that you will have a lot of duplicate content in multi-language tables.
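
A minimal sketch of the row approach, again using Python's built-in sqlite3 module. The product table, its columns, and the sample data are assumptions made for illustration. The price column is not translated, so it is repeated in every language row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id            INTEGER NOT NULL,
        language_code TEXT    NOT NULL,
        name          TEXT    NOT NULL,
        price         REAL    NOT NULL,  -- duplicated in every language row
        PRIMARY KEY (id, language_code)  -- composite key <id, languageCode>
    );
    INSERT INTO product (id, language_code, name, price) VALUES
        (1, 'en', 'Chair',  9.99),
        (1, 'it', 'Sedia',  9.99),
        (2, 'en', 'Table',  24.50),
        (2, 'it', 'Tavolo', 24.50);
""")


def get_product(product_id, lang):
    """Return (name, price) for one product in one language, or None."""
    return conn.execute(
        "SELECT name, price FROM product WHERE id = ? AND language_code = ?",
        (product_id, lang),
    ).fetchone()


def set_price(product_id, price):
    """Update the non-translated price; returns how many rows were touched."""
    cursor = conn.execute(
        "UPDATE product SET price = ? WHERE id = ?", (price, product_id)
    )
    return cursor.rowcount
```

Reading a translated entity needs only the extra condition on language_code, but set_price has to update one row per language, which is where the duplicated content shows up.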

3. Translation Table Approach

In this approach, each translatable column of a multi-language table is a foreign key to a translation table. In other words, there is a translation table for each table that involves multi-language fields.

Pros
  • Scalable: adding a new language does not involve changes to the database structure.
  • Supports historical data: you can use the translation table to keep track of the translation history for each field in an entity.
  • Centralized: all translations for an entity are stored in one place.
Cons
  • Complex queries: queries become more complex because the information associated with a multi-language entity is spread over two tables.
  • Slow queries: retrieving all the info associated with an entity requires as many JOINs as the number of translated columns. Considering the cost that JOINs have on performance, this can easily become a problem.
  • Duplicated tables: it increases the size of your database in terms of tables. Plus, you need two tables to define each multi-language entity.
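
One possible reading of the translation table approach is sketched below with Python's built-in sqlite3 module. The table names, the name_tid and description_tid columns, and the sample data are all hypothetical, and the references are kept logical (no FOREIGN KEY constraint is declared) to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One translation table for the product entity (assumed layout).
    CREATE TABLE product_translation (
        translation_id INTEGER NOT NULL,
        language_code  TEXT    NOT NULL,
        text           TEXT    NOT NULL,
        PRIMARY KEY (translation_id, language_code)
    );
    CREATE TABLE product (
        id              INTEGER PRIMARY KEY,
        price           REAL    NOT NULL,
        name_tid        INTEGER NOT NULL,  -- points at translation_id
        description_tid INTEGER NOT NULL   -- points at translation_id
    );
    INSERT INTO product_translation VALUES
        (10, 'en', 'Chair'),
        (10, 'it', 'Sedia'),
        (11, 'en', 'A wooden chair'),
        (11, 'it', 'Una sedia di legno');
    INSERT INTO product VALUES (1, 9.99, 10, 11);
""")


def get_product(product_id, lang):
    """Return (id, name, description, price); one JOIN per translated column."""
    return conn.execute(
        """
        SELECT p.id, n.text, d.text, p.price
        FROM product AS p
        JOIN product_translation AS n
          ON n.translation_id = p.name_tid AND n.language_code = :lang
        JOIN product_translation AS d
          ON d.translation_id = p.description_tid AND d.language_code = :lang
        WHERE p.id = :id
        """,
        {"id": product_id, "lang": lang},
    ).fetchone()


def add_translation(translation_id, lang, text):
    """Adding a language is a plain INSERT; the schema stays unchanged."""
    conn.execute(
        "INSERT INTO product_translation VALUES (?, ?, ?)",
        (translation_id, lang, text),
    )
```

Here a new language is added by calling add_translation for each translated field, with no ALTER TABLE involved. The price for that flexibility is visible in get_product, which joins the translation table once per translated column.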

What Is the Best Multi-Language Database Design for You?

As you can see, every multi-language database design comes with its own pros and cons, so there is no preferred approach in absolute terms. You have to choose carefully based on your requirements, needs, and goals. I have personally used all three approaches in different projects, and these are the lessons I learned.

The column approach is particularly useful when dealing with Big Data, that is, when your tables hold several million rows and you simply cannot afford JOIN queries or duplicated content. Although it is not the most scalable solution when it comes to adding languages, it may be the only viable design at that data volume.

On the other hand, the row approach is useful when your company's local branches have freedom of action in their own markets. In this case, duplication of content can even be an advantage, because each language row can diverge from the others.

Finally, the translation table approach is the right design if you are looking for an elegant and scalable solution and do not mind the performance drawback. It is best for projects where you know that the data will not grow too large.

Conclusion

Designing a database that can store multilingual data is essential to help your business scale internationally. Also, it enables non-native speakers to use your application. Changing a database is a complex and time-consuming operation with non-negligible consequences on the application architecture. This is why you should design your database as multi-language right from the start. Here, we have looked at three different approaches to designing a multilingual database, studying their pros and cons, and going into detail about which one is best for you.

Published at DZone with permission of Antonello Zanini. See the original article here.

Opinions expressed by DZone contributors are their own.
