shadowWalker's avatar

Multilingual database

Hi, So i'm about to start a multilingual project. I thought about making a translation table and model for each table and model, and i know that there is packages that done that. But for a big project with a lot of models and complex relations ... it's going to be a mess to add on top of all that translation tables and models.

How about storing a json string with all the data for each field for example:

articles table with columns

id | title | description ...

In the title field we store the json representation of this array

$article->title = json_encode(array(
    'fr' => "Titre de l'article",
    'en' => "Title of the article"
));
0 likes
11 replies
davorminchorov's avatar

I'd go with adding a 'language' field in the table and populate it with the right language when adding stuff to the table. It will also be easier to select stuff when I need only french or english articles.

return Article::where('language', 'fr')->get('title', 'description'); // or 'francais' for the long version. The query will be similar for english articles. 
kajetons's avatar
Level 3

Honestly, I'd just stick to your original approach. Store every entity as you normally would and retrieve translatable properties from a separate table with eager loading (or joins, if you're not using Eloquent). Might seem a bit messy for a large project but it works.

@Ruffles Your suggestion has several drawbacks. For starters, it requires a separate record for each entity with language specific properties. Seems fine in theory, but what if the URL needs to stay the same when switching between languages or change just a small part of it (like article slug while ID stays the same)? The ID now can't be used for identifying records since they don't share it anymore. Any other attributes that are translatable won't work either.

On top of that, it doesn't seem right to create multiple records where it is not necessary. Your approach is more suited to applications where the content is completely unique to each language (not just translated) and switching between languages is not used for translating a specific record and generally doesn't happen that often.

1 like
RemiC's avatar

For translations I use a master translation table with polymorphic relation to any table needing translation.

This table looks like this (simplified as it also implements versionning to keep track of outdated translations):

  • translatable_id
  • translation_id
  • translatable_type
  • language_id

This way I can add translation records without adding special column or dedicated tables. Only drawback is you have to scope on the translation table to filter out the 'translation record', but this logic can be encapsulated at the repository level, once you implemented it it's rather transparent.

shadowWalker's avatar

@Ruffles Like @kajetons said this approach isn't good for translating the same entry to multiple languages.

@desloc @kajetons How about i make a field for each language, for example

id | title_fr | title_en | title_ar | description_fr | description_en | description_ar

When writing the code all inserts, updates ... should be based on all the locales defined in the application. This way when i want to add a new language i will just add a new local to the list of locals and the code will work.

Now when it comes to the database structure, i would make a migration file for the shared fields like id, createad_at, updated_at ... And a seperate migration file for the translatable fields. This way when adding a new language i will just make a copy of the migration files of translatable fields, change the language prefix and run the migrations

shadowWalker's avatar

@RemiC I thought about that, but the only concern that i have is performance. Did you test this logic on a project with a lot of tables ... ? Thanks

RemiC's avatar

Having a lot of tables is not an issue. Having a huge number of record could be, but you can easily add up a translation table on a case by case basis, or use query caching.

kajetons's avatar

@shadowWalker It still doesn't seem right. The table itself shouldn't change unless the model needs extra functionality / behaviour. Extra languages don't fit this case. Besides, what if you eventually have to add several more languages? If there is a decent number of translatable fields, your table will end up with a huge number of columns and become an unmaintainable mess.

richardbishopme's avatar

Hi @shadowWalker,

Hope you're getting somewhere with it! I agree with @kajetons on this one, it wouldn't be easy to maintain if you wanted to add additional languages. Sure, if you think you might only need one or two languages, then it could work.

I'd probably take a similar approach to @RemiC. The example library I mentioned above does something similar, but you do have that extra table for each of the tables that you want to translate. However, it does support routing. You could say for example:

  • example.com/article/slug - default
  • example.com/en/article/slug
  • example.com/fr/article/slug

to get the translations.

1 like
shadowWalker's avatar

Well guys,

-JSON : is good, but when checking for the existance of a language, i need to read the fields.

-One master table for translations : a lot of rows in one table + a lot of joins, performance killer. caching data is going to be a nightmare.

-A field for each language in one table : well the used languages change from one project instance to another ...

I'm gona go with the a translation table for each table. I know it's gona add more files and tables but it has benifits compared to the others solutions:

  • no structure changes.
  • 1 more query with eager loading.

Thank you guys for you input.

Please or to participate in this conversation.