HUGO

Diacritics/accented characters in taxonomy names and terms

Hi,

I’m returning to this subject, as I see removePathAccents=true was removed from Hugo.

So, how do I correctly handle taxonomy names and terms that should show diacritics/accented characters on the final web page but not in URLs?

Example 1:

[taxonomies]
  "téma" = "témata"

Hugo will generate:

public/
public/sitemap.xml
public/index.html
public/css
public/js
public/about
public/about/index.html
public/index.xml
public/témata
public/témata/index.html
public/témata/česká-republika
public/témata/česká-republika/index.html
public/témata/česká-republika/index.xml
public/témata/index.xml
public/news
public/news/0001
public/news/0001/index.html
public/news/index.html
public/news/index.xml

But this is not what I want: I don’t want diacritics/accented characters in paths/URLs.

As I do not see any way to tell Hugo to keep those characters internally but “normalize” them for paths/URLs, I think the way to go is this:

[taxonomies]
  "tema" = "temata"

public/
public/sitemap.xml
public/index.html
public/temata
public/temata/index.html
public/temata/index.xml
public/temata/ceska-republika
public/temata/ceska-republika/index.html
public/temata/ceska-republika/index.xml
public/css
public/js
public/about
public/about/index.html
public/index.xml
public/news
public/news/0001
public/news/0001/index.html
public/news/index.html
public/news/index.xml

And to have the proper names rendered in web pages via templates, I think I should use two YAML data files: one to translate taxonomy names, the other to translate terms. An example:

$ cat data/taxonomy.yaml
temata: Témata

$ cat data/terms.yaml
ceska republika: Česká republika

And then in the respective templates I should probably use:

For taxonomy names - layouts/taxonomy/list.html

<h1>{{ .Site.Title }} - Taxonomy - {{ index $.Site.Data.taxonomy (printf "%s" .Data.Plural) }}</h1>

For term names - layouts/taxonomy/term.list

<h1>{{ .Site.Title }} - Term - {{ index $.Site.Data.terms (printf "%s" .Data.Term) }}</h1>

Am I overcomplicating things, or is there an easier way? BTW, my target user is not an IT guy, so I first wanted to write taxonomy names and term names with diacritics/accented characters, as they appear in my language, Czech.

Thank you for comments & help.

PS: Another way would be to create, for each taxonomy name and term name, a corresponding directory with an _index.md file that has title: <right name> in its front matter. But what would be the best way?
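
A minimal sketch of that _index.md alternative, assuming the "tema" = "temata" taxonomy from above and the single term ceska-republika. The two file paths in the comments are my assumption about where these files would live; only the title values come from the examples earlier in this post.

```yaml
# content/temata/_index.md (assumed path, page for the taxonomy itself)
---
title: Témata
---

# content/temata/ceska-republika/_index.md (assumed path, page for one term)
---
title: Česká republika
---
```

With files like these, the templates could presumably print {{ .Title }} instead of looking the name up in a data file, while the paths under public/ stay free of diacritics.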

Jiri

A third way would be to add these kinds of files only for tags with special characters, and to use no special characters in the front matter taxonomy. I am not sure about the Czech language in its entirety, but it might be less work.

I see there are more problems with taxonomy/term names. For example, the following tag values produce:

temata:
- f.o.o. => /temata/f.o.o./
- foo/bar => /temata/foo/bar/
- ../foo => error in hugo

For the first, I would expect the dots to be stripped; for the second, I would probably expect ‘/’ to be replaced with ‘-’; for the last one I don’t know, it was just a test.

Shouldn’t it be documented, i.e. that values should be sane?

I was under the impression that only a-z0-9 is allowed (or at least that is what I have used). The second example might be a lucky case of sliding by. The third case is definitely invalid.

Maybe having in mind a-z0-9 for the first character is ok… but I wouldn’t trust the / version.

That might be something issue-tracker-worthy.

For the archive: Slashes in a taxonomy term break path resolution · Issue #4090 · gohugoio/hugo · GitHub
