Proper way to translate a theme

Hi, I’m trying to translate my theme into different languages. I have already changed all of my strings from foo to {{ i18n "foo" }}, and now I just need to extract and translate them.

I see other themes (like Ananke) have an i18n folder where all translations are stored. I’d like to do the same thing, and as far as I understood from the docs, I need to use goi18n.

I ran goi18n extract, but all I got was an empty active.en.toml file.

I don’t know how other themes do it, and the documentation doesn’t cover this. What is the proper way to extract strings for translation? Any help is appreciated. Thanks.

The “proper” way is to have an en.toml in your i18n folder with translations for your strings, one file per language. I read about goi18n extract, but it did not work when I tried it, and there are posts here on the forum about it not working as expected. So it’s mostly manual work. I’m not sure whether the problem is the function name in Hugo or something upstream.

You can, however, run hugo and hugo server with a flag that reports any translation ID that could not be found:

hugo --i18n-warnings

This is probably the least bad solution, thanks. Honestly, I think Hugo should have some way to properly extract i18n strings and save them to a file. Having to extract them manually is not ideal.

The following conditions must be true for hugo --i18n-warnings to emit warnings:

  • For the defaultContentLanguage:

    • There must be at least one content page in the defaultContentLanguage
  • For other languages:

    • There must be at least one content page in the given language
    • There must be an i18n file for the given language (an empty file is fine)

Here’s a Bash script that creates or updates TOML i18n files based on languages defined in the site configuration. It parses output from hugo --i18n-warnings to create keys as needed.

There must be at least one content page for each language.

Example of created file:

date_created=" "
date_modified=" "
title=" "

The values are a single space instead of an empty string to prevent creation of duplicate keys if you were to run the script twice or more.

Bash script
#!/usr/bin/env bash

main() {
  declare -a languages
  declare language
  declare -a translations
  declare translation
  declare key

  # Create array of languages defined in site configuration.
  readarray -t languages < <(hugo config | grep "languages" | sed 's/languages = map\[/\]/' | sed 's/:map/\n/g' | head -n-1 | awk -F'[\\]]' '{print $2}' | sed 's/ //g')

  # Create i18n files.
  mkdir -p "i18n"
  for language in "${languages[@]}"; do
    touch "i18n/${language}.toml"
  done

  # Create array of missing translations. Example:
  #
  #   en|date_created
  #   en|title
  #   es|date_created
  #   es|title
  readarray -t translations < <(hugo --i18n-warnings | grep "i18n" | sort | awk -F'|' '{print $3 "|" $4}')

  # Update i18n files.
  for translation in "${translations[@]}"; do
    language=$(awk -F'|' '{print $1}' <<< "${translation}")
    key=$(awk -F'|' '{print $2}' <<< "${translation}")
    echo "${key}=\" \"" >> "i18n/${language}.toml"
  done
}

set -euo pipefail
main "$@"
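
To check what the grep/sort/awk step in the script does without running Hugo, here is a minimal sketch. The helper name parse_warnings and the sample lines are assumptions: the sample format is inferred from the fields the script reads (language in field 3, key in field 4) and was not captured from a real run. It also uses sort -u instead of sort, a small variation that drops repeated warnings.

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the script's grep/sort/awk pipeline.
# Reads warning lines on stdin and prints "language|key" pairs.
parse_warnings() {
  grep "i18n" | sort -u | awk -F'|' '{print $3 "|" $4}'
}

# Assumed sample of warning lines (not captured from a real Hugo run).
sample_output='i18n|MISSING_TRANSLATION|es|title
i18n|MISSING_TRANSLATION|en|title
i18n|MISSING_TRANSLATION|en|date_created
i18n|MISSING_TRANSLATION|en|title'

# Print one "language|key" pair per missing translation.
parse_warnings <<< "${sample_output}"
```

If the warning format in your Hugo version differs, only the field numbers in the awk command should need adjusting.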

This doesn’t look like the TOML format I use (or maybe I am doing something wrong, or being too explicit):

[contactform_email]
other = "Your Email Address"

My keys are in brackets, followed by an other entry for most of the items.

That script can be changed for my needs.
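
One possible adaptation, as a rough sketch only: instead of appending key=" ", append a [key] table with an other entry. The helper name add_table_entry is hypothetical, and the check for an existing [key] line is an assumption about how to avoid duplicates in this format.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append a "[key]" table with an "other" entry to an
# i18n file, unless a line that is exactly "[key]" is already present.
add_table_entry() {
  local file="$1"
  local key="$2"
  touch "${file}"
  # -x matches the whole line; -F treats the pattern as a literal string,
  # so the square brackets are not read as a character class.
  if ! grep -qxF "[${key}]" "${file}"; then
    printf '[%s]\nother = " "\n\n' "${key}" >> "${file}"
  fi
}

# Usage: the second call is a no-op because the table already exists.
demo_dir="$(mktemp -d)"
add_table_entry "${demo_dir}/en.toml" "contactform_email"
add_table_entry "${demo_dir}/en.toml" "contactform_email"
cat "${demo_dir}/en.toml"
```

In the original script, this would stand in for the echo line inside the final loop, called with "i18n/${language}.toml" and "${key}".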

Items without pluralization rules

i18n/es.toml

about = "acerca de"
help = "ayuda"

template

{{ i18n "about" }} --> acerca de
{{ T "help" }}     --> ayuda

Items with pluralization rules

i18n/pl.toml

[month]
one = "{{ . }} miesiąc"
few = "{{ . }} miesiące"
many = "{{ . }} miesięcy"
other = "{{ . }} miesiąca"

template

{{ T "month" 1 }}   --> 1 miesiąc
{{ T "month" 2 }}   --> 2 miesiące
{{ T "month" 5 }}   --> 5 miesięcy
{{ T "month" 1.5 }} --> 1.5 miesiąca

Reference: https://unicode-org.github.io/cldr-staging/charts/37/supplemental/language_plural_rules.html


We need to update the docs about this. I always used the "other" method. Probably overkill.