Lost in Translation: Gaps of GPT-3.5 in South Asian and Middle Eastern Languages
This article reveals disparities in performance, including grammatical errors and inappropriate tone, in responses to non-English prompts in LLMs like GPT-3.5.
Large language models have made remarkable advancements in recent years. However, much of this progress has centered on English-language tasks, with less attention paid to non-English languages. This is a missed opportunity, as these languages are spoken in some of the fastest-growing economies. In our analysis of GPT-3.5's capabilities on non-English prompts, specifically in South Asian and Middle Eastern languages, we uncovered disparities. While performance on English-language prompts continues to impress, responses to prompts in other languages exhibit grammatical errors, inappropriate tone, and factual inaccuracies.
Our study shows that GPT-3.5 is 2x slower when the prompt is not in English. Response quality also degrades compared with the model's instruction-following performance in English. As large language models move toward broader multilingual availability, these deficiencies demand urgent attention if the models are to support business expansion worldwide and win broader customer adoption.
Excluding millions of non-English speakers from enjoying the benefits of these powerful models forfeits tremendous value. It inhibits education, business expansion, creativity, and human progress across much of the globe. Prioritizing multilingual equity aligns with business imperatives to serve the broadest possible audience and ethical principles of inclusive innovation.
Through expanded datasets, targeted model architecture changes, and refinement of the underlying linguistics, we can collectively work to make the promise of large language models genuinely universal. Our shared goal must be to empower all people, regardless of their native language, to harness these technologies for human flourishing.
In this article, we show that a simple manual translation layer improves LLMs' response quality to non-English prompts as well as English prompts. It can democratize information access, enhance LLM performance across diverse languages, and truly globalize the benefits of AI. Additionally, this article delves deep into the power of translation integrations, promising to revolutionize how we communicate and engage with AI, breaking linguistic barriers and bridging global communities like never before.
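One way to picture such a translation layer is the sketch below. It is a minimal illustration under stated assumptions, not the exact setup used in this article: `translate` and `ask_llm` are hypothetical callables that you would back with a real translation service and a real LLM client, and the `echo_*` stand-ins exist only so the example runs without network access.

```python
from typing import Callable

# Hypothetical signatures (assumptions, not a real library API):
#   translate(text, source_lang, target_lang) -> translated text
#   ask_llm(prompt) -> model response
Translate = Callable[[str, str, str], str]
AskLLM = Callable[[str], str]


def answer_via_english(prompt: str, lang: str,
                       translate: Translate, ask_llm: AskLLM) -> str:
    """Route a non-English prompt through English and translate the answer back."""
    if lang == "en":
        # English prompts go straight to the model.
        return ask_llm(prompt)
    english_prompt = translate(prompt, lang, "en")   # step 1: prompt -> English
    english_answer = ask_llm(english_prompt)         # step 2: query the model in English
    return translate(english_answer, "en", lang)     # step 3: answer -> user's language


# Stand-ins that only tag the text, so the data flow is visible without any API calls.
def echo_translate(text: str, source_lang: str, target_lang: str) -> str:
    return f"[{source_lang}->{target_lang}] {text}"


def echo_llm(prompt: str) -> str:
    return f"ANSWER({prompt})"


if __name__ == "__main__":
    print(answer_via_english("prompt in Hindi", "hi", echo_translate, echo_llm))
```

Passing the translator and the model in as arguments keeps the layer independent of any particular vendor, so either piece can be swapped without touching the routing logic.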
Millions of users across the globe use ChatGPT as an everyday companion. Because the tool converses in natural language, users feel as though they have a second brain that makes their tasks more efficient and provides human-like support. Some of the scenarios and questions for which ChatGPT is used are:
- General knowledge questions (What is the surface temperature of Mars?)
- Homework help (Help me write an essay about algae and fungi)
- Life advice (How do I improve my mental health?)
- Creative writing (Write a short story about dragons and unicorns)
- Procedural or how-to questions (Give me a detailed step-by-step method to brew green tea)
Linguistic Challenges in South Asian and Middle Eastern Languages
South Asian and Middle Eastern languages are rich in cultural nuances and unique linguistic features that pose significant challenges for machine learning models like GPT-3.5. These languages often have complex grammar structures, diverse dialects, and writing systems vastly different from those of English and other widely supported languages. As a result, GPT-3.5 struggles to accurately comprehend and translate these languages, leading to a loss of meaning and context.
One of the main challenges lies in the lack of training data available for South Asian and Middle Eastern languages. GPT-3.5's performance relies heavily on the vast amount of data it has been trained on, and the scarcity of high-quality training data for these languages hampers its ability to understand and generate text effectively. Additionally, the cultural diversity within these regions compounds the linguistic challenges, as GPT-3.5 may fail to capture the subtle nuances and context-specific meanings crucial for accurate translation.
The Importance of Accurate Translation in These Languages
Today, most LLMs are focused on the English language, but many South Asian and Middle Eastern languages are important for trade and the economy. These regions have populations in the millions and serve as critical markets for any business looking to expand worldwide. To communicate with customers and partners in these regions, it is essential to have effective and efficient translation tools for conversing and exchanging the intended information.
Given the sensitivity of these languages and the sentiment they carry, effective communication is crucial in the business world: inaccurate translations lead to degraded brand value, loss of trust, and legal concerns. LLMs must also understand the cultural nuances that come with these languages. Accurate translation and cultural awareness together will bridge the gap in handling non-English languages with LLMs.
Business Use Case of Non-English-Focused Large Language Models (LLMs) Across Various Sectors
Non-English LLMs have the potential to impact various industries worldwide significantly. The use of AI for generating text, engaging in conversations in local languages, translating content, and creating localized speech or video can facilitate personalized content creation. This expansion can foster increased cross-country trade collaboration, promote cross-cultural cooperation, and generate localization jobs that contribute to economic growth in developing nations. Here are a few examples:
- Media entertainment: AI can create personalized content, e.g., convert existing movies, TV shows, and books into local languages at low cost and expand the reach while increasing revenue.
- Retail e-commerce: AI solutions backed by LLMs can be used to create localized product listings, images that match local trends, and product descriptions in local languages. This will increase customer satisfaction and product sales in new and existing markets.
- Banking and finance: AI can convert financial documents, as well as services such as customer support and platforms, into local languages. This can help increase the global flow of capital.
- Healthcare: AI can convert medical documents and bring products such as predictive healthcare for common diseases within easy reach of underserved populations, increasing affordability and providing healthcare access to all. This will make it easier for health practitioners worldwide to provide support and care without language barriers.
- Education: LLMs can translate educational materials, create personalized learning, and provide quality education for everyone across the globe. This will enable innovation and technological advancement, as education can now reach all parts of the world without language barriers.
- Politics and government: AI solutions can be used to translate and cater to all the citizens of a given nation. This will enable transparency and provide diverse citizens access to government documents.
Limitations of GPT-3.5 in Understanding Cultural Nuances
Currently, GPT-3.5 is one of the most powerful LLMs. However, it still struggles to capture the cultural and ethnic nuances of South Asian and Middle Eastern languages. The culture of any region plays a critical role in language evolution. For example, proverbs are deeply rooted in a given region's day-to-day lifestyle and culture. When a model translates these expressions, they are likely to lose their authenticity and intended meaning. Hence, adding a layer of this nuanced culture along with the translation process is critical for the success of LLMs.
Performance of Non-English Prompts
While GPT-3.5 exhibits remarkable capabilities in English, it is less effective when dealing with languages such as Arabic, Hindi, Urdu, and Tamil. We used Google Translate to translate the English prompt “Write three paragraphs of 4 lines each about the solar system” into each of these languages and sent the translated prompts to GPT-3.5.
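A comparison like this can be timed with a small harness such as the one sketched below. It is an illustrative sketch, not the measurement code behind our numbers: `ask_llm` is a hypothetical callable wrapping whatever model client you use, the `repeats` parameter is an assumption, and the non-English prompt text is a placeholder you would replace with the actual translation.

```python
import time
from statistics import mean
from typing import Callable, Dict


def time_prompts(prompts: Dict[str, str], ask_llm: Callable[[str], str],
                 repeats: int = 3) -> Dict[str, float]:
    """Return the mean wall-clock latency in seconds for each language's prompt."""
    latencies = {}
    for lang, prompt in prompts.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            ask_llm(prompt)  # hypothetical model call; response is discarded here
            samples.append(time.perf_counter() - start)
        latencies[lang] = mean(samples)
    return latencies


def slowdown_vs_baseline(latencies: Dict[str, float],
                         baseline: str = "en") -> Dict[str, float]:
    """Express each language's latency as a multiple of the baseline language's."""
    base = latencies[baseline]
    return {lang: secs / base for lang, secs in latencies.items()}


if __name__ == "__main__":
    prompts = {
        "en": "Write three paragraphs of 4 lines each about the solar system",
        "hi": "<Hindi translation of the same prompt>",  # placeholder, not real text
    }
    # Stand-in model so the script runs offline; swap in a real client call.
    fake_llm = lambda prompt: prompt.upper()
    print(slowdown_vs_baseline(time_prompts(prompts, fake_llm)))
```

Averaging over several runs smooths out network jitter, and reporting latency as a multiple of the English baseline makes a slowdown such as 2x directly visible.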
Table 1: ChatGPT Response to Translated Prompt in South Asian and Middle Eastern Languages