Maintaining heading anchor IDs across languages

Hello,

In some large-scale documentation projects from companies like Amazon, Google, and Microsoft, the documents are localized into multiple languages, but the anchor IDs of sections stay consistent across languages (derived from the English title, or in some other format).

(As far as I know, the Microsoft docs use Markdown as their underlying content format, although there might be some backend implementation on top of it.)

Is it possible to achieve similar functionality in Hugo, through configuration or other options?

The only idea that comes to mind is not straightforward and involves extra steps, such as the following:

  • Author Markdown content in English (preferably) first. Use # heading {#id} syntax to manually specify anchor IDs while authoring, or perform some post-processing on the Markdown file to add anchor ID syntax to headings after authoring.
  • Localize the Markdown files, while keeping the anchor ID syntax next to headings intact.
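The post-processing step mentioned above could look roughly like the following. This is only a sketch: the function names (slugify, add_heading_ids) and the slug rule (lowercase ASCII letters and digits joined by hyphens) are my own assumptions, not anything Hugo provides, and it does not deduplicate repeated headings.

```python
import re
import unicodedata

FENCE = "`" * 3  # marker that opens/closes a fenced code block
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
HAS_ID = re.compile(r"\{#[^}]+\}\s*$")


def slugify(title: str) -> str:
    """Hypothetical slug rule: lowercase ASCII letters/digits joined by hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def add_heading_ids(markdown: str) -> str:
    """Append {#slug} to every ATX heading that has no explicit ID yet."""
    out = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # skip headings-looking lines inside code
        match = None if in_fence else HEADING.match(line)
        if match and not HAS_ID.search(line):
            slug = slugify(match.group(2))
            if slug:
                line = f"{match.group(1)} {match.group(2)} {{#{slug}}}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if markdown.endswith("\n") else "")
```

Running it once on the English source before handing files to translators would mean the IDs are already in place when localization starts; headings that already carry an explicit ID are left untouched.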

I’d like to know if there is any other way to achieve this.

BTW, from a content management point of view, keeping the anchor IDs intact across languages seems to have advantages over localized anchor IDs, for the following reasons:

  • Translators (localizers) only need to change the language code in the links including the anchor IDs, without checking for localized anchor IDs.
  • The format of auto-generated anchor IDs for headings with Unicode characters varies across static site generators and their internal options/libraries, so we cannot expect consistent localized anchor IDs, which is unfavorable for a potential migration.
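To illustrate the first point: if the anchor IDs are identical in every language, localizing a link is a purely mechanical change. A minimal sketch, assuming links start with a language segment such as /en/ (that URL layout and the localize_link helper are assumptions for illustration):

```python
import re

# Leading language segment such as /en/ or /pt-br/ (assumed URL layout).
LANG_PREFIX = re.compile(r"^/([a-z]{2}(?:-[a-z]{2})?)/", re.IGNORECASE)


def localize_link(link: str, lang: str) -> str:
    """Swap the leading language segment; path and #anchor stay untouched."""
    return LANG_PREFIX.sub(f"/{lang}/", link, count=1)


print(localize_link("/en/docs/install/#prerequisites", "es"))
```

With localized anchor IDs, the same step would also require looking up what #prerequisites became in the Spanish page.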

So I came up with this question. I’d appreciate any advice or sharing of experience.

Specifying the anchor IDs manually is certainly the simplest approach, though enforcing consistent usage would be challenging.
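One way to make enforcement less painful might be a small check, run in CI, that compares the explicit IDs in each translated file against the English source. The sketch below is an assumption of how that could look, not an existing Hugo feature; the helper names are hypothetical, and it does not skip headings that appear inside fenced code blocks.

```python
import re

# ATX heading that ends with an explicit {#id} attribute.
EXPLICIT_ID = re.compile(r"^#{1,6}[ \t]+.*\{#([^}\s]+)\}[ \t]*$", re.MULTILINE)


def heading_ids(markdown: str) -> list[str]:
    """Return the explicit heading IDs in document order."""
    return EXPLICIT_ID.findall(markdown)


def compare_ids(source_md: str, translated_md: str) -> dict[str, list[str]]:
    """Report IDs dropped from, or added to, the translated file."""
    src = heading_ids(source_md)
    dst = heading_ids(translated_md)
    return {
        "missing": [i for i in src if i not in dst],
        "unexpected": [i for i in dst if i not in src],
    }


english = "## Cat {#cat}\n\n## Dog {#dog}\n"
spanish = "## Gato {#cat}\n\n## Perro\n"
print(compare_ids(english, spanish))
```

Here the Spanish file lost the {#dog} attribute, so "dog" is reported as missing and the build could be failed on that.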

Another approach…

content/es/foo.md

## Gato

## Perro

i18n/es.toml

[heading_ids]
gato = "cat"  # Note that this is a reverse translation
perro = "dog" # Note that this is a reverse translation

layouts/_default/_markup/render-heading.html

<h{{.Level}} id="{{ T (printf "heading_ids.%s" .Anchor) | safeURL }}">{{ .Text | safeHTML }}</h{{.Level}}>

And you can check for missing “reverse” translations with:

hugo --printI18nWarnings

However, with this approach you cannot use Hugo’s built-in .TableOfContents method. You will need to generate the TOC client-side (with JS) or with your own partial.

See https://github.com/gohugoio/hugo/issues/8383
