I have i18n working now with 2 languages, “en” and “zh-tw”.
I tried changing “zh-tw” to “tw” and it still works.
I’m curious: is that a good idea?
Is the language code just a string in Hugo, or does it have a specific meaning?
P.S. The reason I want to use tw is that it’s shorter, and I checked Apple.com, which uses it too.
Is it a good idea?
Probably not, but as always: it depends.
Your theme may be using this setting in the lang attribute of the html element to tell the browser which language your content is in (see “lang - HTML: HyperText Markup Language | MDN”). This may break if you try non-standard values here.
Apple is a big company, but big companies are not right in their choices 100% of the time.
I would stay with the standard way!
I think you are right, it’s not a good idea. I checked Apple’s HTML code, and it uses zh-tw as the language code.
But if I keep using “zh-tw”, is it possible to change the URL to “tw” only?
I just want to shorten the URL.
As per IANA (standards body), no.
The TW subtag has a type of region (Description: Taiwan, Province of China).
This means that it can only be used as language_code+REGION, so for Taiwan Chinese the correct language tag is zh-tw (as per W3C, it is case-insensitive; see next link).
The language subtag syntax is language-script-region, for example zh-Hant-TW:
- zh is the language
- Hant is the script (Hans and Hant are simplified and traditional Chinese respectively)
- TW is the region.
As a general rule, the shorter the tag, the better. So instead of zh-hant-tw just write zh-tw, but not tw alone as Apple does. Some language features of Hugo (and Go itself) also rely on correct tag usage (like datetime localisation).
The reason tw works fine is that major browsers chose to interpret it as zh-tw. However, you cannot guarantee that tw will work in other rendering engines and/or browsers, for example browsers used for special needs like Braille and voice/speech browsing. (IIRC, there was a time when Amazon’s gadgets didn’t interpret incorrect tagging either, but support was added later because webmasters/web developers of major sites, like Apple in your example, refused to fix their bug.)
If you want to learn more, I have a very old blog post about it here:
It’s geared towards Philippine languages (we can use up to the variant tag), but it’s practically the same.
Add this to your [languages] section for every available language (adjust it per language). Use either a subdirectory:
baseURL = "https://example.com/tw/"
or a subdomain:
baseURL = "https://tw.example.com/"
It works fine even with hugo server; the language selector adjusts accordingly.
However, if your content is already out there and indexed by search engines, you’ll have to either add aliases to your content .md files or set up a redirect at the DNS or server level.
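For reference, here is a minimal sketch of how the per-language baseURL setting described above might look in a full site config. The language names, weights, and the example.com URLs are assumptions for illustration, not taken from the OP's site. As far as I know, once one language sets its own baseURL, every language needs one and they must all differ.

```toml
# Hypothetical site config (e.g. hugo.toml or config.toml); values are illustrative.
defaultContentLanguage = "en"

[languages.en]
  languageCode = "en"                  # standard tag, used for the html lang attribute
  languageName = "English"
  baseURL = "https://example.com/en/"  # per-language base URL
  weight = 1

[languages.zh-tw]
  languageCode = "zh-TW"               # keep the standard tag here
  languageName = "繁體中文"
  baseURL = "https://example.com/tw/"  # short URL segment, independent of the tag
  weight = 2
```

The point of the sketch is that the URL segment (tw) and the language tag (zh-TW) are set separately, so shortening the URL does not require a non-standard language code.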
For URLs you’re free to choose what you want, as the path carries no meaning, besides maybe for SEO.
But in the HTML lang attribute you’d definitely want to stick to the standard code, because it’s meant for machine interpretation, for instance automatic translation of websites by the browser. It may also aid assistive technologies like screen readers.
As others have already mentioned, the subdirectory can be anything. The languageCode, on the other hand, should be set to standard values.
I have set up languages like this for one of my sites:
I have also set
This way the root of the site is Swedish and ‘/en/’ subdirectory is English.
The html lang attribute is however set from the languageCode variable, so it becomes “sv-SE” and “en-GB” respectively.
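The original config snippets did not survive in this post, so the following is only a sketch consistent with the description above, not the poster's actual file. The language keys sv and en, the language names, and the weights are assumptions.

```toml
# Hypothetical reconstruction; only the languageCode values come from the post.
defaultContentLanguage = "sv"
defaultContentLanguageInSubdir = false  # default language is served from the site root

[languages.sv]
  languageCode = "sv-SE"   # ends up in <html lang="sv-SE">
  languageName = "Svenska"
  weight = 1

[languages.en]
  languageCode = "en-GB"   # ends up in <html lang="en-GB">
  languageName = "English"
  weight = 2
```

With a setup along these lines, Swedish pages live at the root and English pages under /en/, while the lang attribute still carries the full standard tags.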
The language key (e.g., the en in [languages.en]) currently has four primary roles:
- It is prepended to the path portion of the published site's URLs.
- It links the site with a translation file in the i18n directory. To properly pluralize a translation, the language key must match one of the languages in:
- It is used to localize dates, numbers, percentages, and currencies. To properly localize these values, the language key must match one of the locales in:
- It is used to determine collation when sorting.
It’s a bit messy. I tried to provide a complete explanation here:
In the OP’s example, setting the language key to zh-hant-tw will work with items 1 through 3 above. I’m not sure if it will trigger the proper collation sequence (I am not yet familiar with this code).
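To illustrate the second role in the list above, here is a small sketch of a translation file whose name matches the language key. The file name i18n/en.toml and the readingTime entry are assumptions for illustration; for the OP's site the matching file would presumably be i18n/zh-tw.toml.

```toml
# i18n/en.toml (hypothetical): the file name matches the language key "en".
# Plural categories ("one", "other") are chosen according to the language's rules.
[readingTime]
one = "{{ .Count }} minute to read"
other = "{{ .Count }} minutes to read"
```

In a template this would typically be looked up with something like {{ T "readingTime" .ReadingTime }}. If the language key is not a recognised language, the plural form may not be selected as expected.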
I appreciate all of your suggestions. Obviously, it’s a bad idea to use “tw” instead of “zh-tw”.
After 2 days of trial and error, it works (it should have been simple, but I made a stupid mistake…).
But I still have 2 questions I need help with.
I put images in /static/images, and I now need to use ![ ](/en/images/xxx.jpg) and ![ ](/tw/images/xxx.jpg) to access them. Is that correct?
(I followed this doc: Static Files | Hugo)
At http://localhost:1314/en/, the page still tries to load http://localhost:1313/tw/css/style/***. This isn’t a problem in production because everything is served from the same port, but it breaks the local environment.