[SOLVED] German umlauts in tags

Hi there,
as I move from WordPress to Hugo, I am running into the same problems I had in the early days of WP. German umlauts and ß (the special characters ö, ä, ü, ß) in tags are not turned into valid URLs. The usual practice is to transliterate them: ä -> ae, ö -> oe, ü -> ue, ß -> ss.
Is there a way to do that?
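To make the desired mapping concrete, here is a minimal Python sketch of the transliteration described above. It is not a Hugo feature; the names `UMLAUT_MAP`, `transliterate_german` and `tag_to_slug` are hypothetical, and the lowercasing and space-to-hyphen steps are assumptions about what a slug should look like.

```python
import unicodedata

# Mapping taken from the convention named in the post (ä -> ae, ö -> oe, ...).
# The uppercase entries are an assumption added for completeness.
UMLAUT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def transliterate_german(text: str) -> str:
    """Replace German umlauts and ß with their ASCII digraphs."""
    # Normalize to NFC first so that "a" + combining diaeresis becomes "ä".
    composed = unicodedata.normalize("NFC", text)
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in composed)


def tag_to_slug(tag: str) -> str:
    """Hypothetical slug rule: transliterate, lowercase, spaces to hyphens."""
    return transliterate_german(tag).lower().replace(" ", "-")


print(tag_to_slug("Männer"))
```

For the tag "Männer" this yields the slug "maenner", which is the form the old WordPress URLs used.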

There’s an undocumented feature named removePathAccents (see this topic) that you can place in your configuration file; it should remove or replace accented characters in paths. It kind of ‘cleans’ URLs of those characters.

@Jura Thank you for your quick reply.

Unfortunately, that solves the problem only partly. We get URLs that browsers can read, which is good. BUT stripping the dots from ü, ö, ä changes the meaning of a word, or destroys it, in most cases. This is a huge SEO problem for many languages (Swedish and Danish, for example, not to mention Finnish or Hungarian).

Look at this example: my taxonomy is called ”stichwort” (tag) and the term is called ”Männer” (men). This is now rendered as ”/stichwort/manner” instead of ”/stichwort/maenner”. ”Manner” is an Austrian brand of cookies. Really not the desired outcome.
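The effect is easy to reproduce outside Hugo. The sketch below strips combining accent marks the generic Unicode way; it only illustrates the idea of accent removal and is not Hugo's actual implementation. The function name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (generic approach)."""
    # NFD splits "ä" into "a" plus a combining diaeresis (category Mn).
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


print(strip_accents("Männer").lower())
print(strip_accents("Straße"))
```

With this approach "Männer" becomes "manner", exactly the unwanted result. Note that ß has no decomposition, so this kind of stripping leaves it untouched rather than turning it into "ss".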

Finally, the search engines have known the ”right” URLs for a long time. It is not worth creating lots of redirects for existing terms and even more redirects for terms to come. One reason for switching to Hugo was to save time. That does not work if we cannot fix this vital item.

So I suggest considering one of the following solutions for future Hugo versions:

  1. A possibility to map slug and title to terms in taxonomies; or
  2. An automated process to replace the characters as stated above.

I am sorry to say that I am not a programmer, so I cannot provide solutions myself. But I think that this is a very basic feature which would be much appreciated all over the world.

I agree with what you’re saying. I know only a little German, but enough to know the importance of umlauts. :)

But I’m not a developer either, so if it’s currently not possible with Hugo I can’t help much. Perhaps @Leo_Merkel can help. He’s from Germany too and typically posts all kinds of insights here on the forum. :)

I did come across this topic about Greek characters in URLs, and from what I understand it was solved on the server side. Perhaps @alexandros knows more. I did see him post a Greek language tip, albeit not for URLs. But perhaps he struggled with Greek characters in a URL too.

Thanks for the mention. I just tried the following solution, which works (there might be a better one).

In the front matter I set a new URL for the page. No special settings are needed in config.toml.

---
title: Männer
slug: /maenner/

menu:
  main:
    weight: 25
---

Hope this helps.

@Jura Never used this workaround for URLs but it might work.

I always use Latin characters in my URLs so that UTF-8 characters never show up percent-encoded. That can happen, and it looks very ugly.

Also, @paw, the SEO value of URLs is debatable. Content is king, not URLs.

@Leo_Merkel I am aware that I can customize the slugs of single pages this way. What I am asking for is a solution for the taxonomy slugs that Hugo creates.
Actually, the function urlize would be the right place for such a solution.

@alexandros: We had the same annoying debates in the early days of WordPress. There is some ignorance around the web regarding languages other than English. It took many years of workarounds, plugins etc. Now WordPress has a well-working system for producing correct URLs and does not question the need for special characters in other languages.

Imagine the following:
Suppose Hugo replaced “y” with “ü” and “a” with “ä”. Would you be happy if a taxonomy you set up called “year” showed up as /täxonomü/üeär? That is exactly the reverse case.

Of course it is an SEO problem, since Google IS language sensitive. Hugo is not, at least not fully (yet :) ).

I created a feature request for this topic.

Do you have an example?
I get valid urls …
tag with ä
blog with ä

tags/hä

is encoded as:

tags/h%C3%A4

which looks OK to me
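For anyone who wants to check the encoding themselves: this is plain UTF-8 percent-encoding, which can be reproduced with Python's standard library. A small sketch using the path from the example above:

```python
from urllib.parse import quote, unquote

path = "tags/hä"

# quote() percent-encodes the UTF-8 bytes of non-ASCII characters;
# "/" is treated as safe by default and left as-is.
encoded = quote(path)
print(encoded)

# unquote() reverses the encoding, so the readable form is not lost.
print(unquote(encoded))
```

The character ä is two bytes in UTF-8 (C3 A4), which is where the %C3%A4 comes from.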

What @it-gro said above and what is mentioned in this post should answer your question.

Whether these characters are encoded correctly in URLs sometimes depends on your setup.

You could also try to do this replacement on your own with the technique that is mentioned in the first post of the same topic.

@it-gro, @alexandros
Thank you both. With your help I figured out how to make the site work.
Now I only need to produce a hell of a lot of rewrites for the old tags, like /maenner/ --> /m%C3%A4nner/
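The rewrite list does not have to be typed by hand. Below is a rough Python sketch that derives old/new path pairs from a list of terms. It assumes the old site used transliterated lowercase slugs and that the new paths are the lowercased term, percent-encoded; the taxonomy name comes from the example earlier in this thread, while the term list and the function names are made up for illustration. The output still has to be translated into whatever rewrite syntax your server uses.

```python
from urllib.parse import quote

# Transliteration used by the old URLs (assumption: lowercase slugs only).
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def old_slug(term: str) -> str:
    """Old-style slug: lowercase, umlauts written out as digraphs."""
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in term.lower())


def redirect_pair(taxonomy: str, term: str) -> tuple[str, str]:
    """Return (old_path, new_path) for one taxonomy term."""
    old = f"/{taxonomy}/{old_slug(term)}/"
    new = f"/{taxonomy}/{quote(term.lower())}/"
    return old, new


# Hypothetical term list; replace with the real tags of the site.
for term in ["Männer", "Größe"]:
    old, new = redirect_pair("stichwort", term)
    print(f"{old} --> {new}")
```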
