Minification breaks attributes

On localhost I have the following:

<html lang="de_DE" class="no-js">

On the live site with hugo --minify it gets transformed to

<html lang=de_de class=no-js>

Is there any way to configure the minification so that it does not change the letter case of attribute values? Lowercasing the language code is invalid and does not really make sense here.

why??

All attribute names on HTML elements in HTML documents get ASCII-lowercased automatically, so the restriction on ASCII uppercase letters doesn’t affect such documents.

see https://html.spec.whatwg.org/

Values, not names :wink: de_DE is correct, de_de is wrong. There is no logical reason why the language code should be lowercased for minification (DE takes the same number of bytes as de), and ONLY that part is lowercased. Think about what lowercasing would do to meta tags (description) or anything else that is case-sensitive.

Please check BCP 47 at the RFC Editor, Section 2.1 (Syntax).

You should use lang="de-DE"

At all times, language tags and their subtags, including private use
and extensions, are to be treated as case insensitive: there exist
conventions for the capitalization of some of the subtags, but these
MUST NOT be taken to carry meaning.
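
To make the quoted rule concrete, here is a minimal Go sketch of a case-insensitive tag comparison. `SameTag` is a hypothetical helper name, not something from Hugo or tdewolff/minify, and it only compares the strings without validating them as language tags.

```go
package main

import (
	"fmt"
	"strings"
)

// SameTag is a hypothetical helper: it reports whether two language tags
// are equal when letter case is ignored. It does no validation.
func SameTag(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(SameTag("de-DE", "de-de")) // true: only the case differs
	fmt.Println(SameTag("de-DE", "de_DE")) // false: the separator differs
}
```

So de-DE and de-de name the same language tag, while de_DE differs because of the underscore, not because of the case.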

From the same spec, further down the page:

Although case distinctions do not carry meaning in language tags, consistent formatting and presentation of language tags will aid users. The format of subtags in the registry is RECOMMENDED as the form to use in language tags. This format generally corresponds to the common conventions for the various ISO standards from which the subtags are derived. These conventions include:

  • [ISO639-1] recommends that language codes be written in lowercase (‘mn’ Mongolian).
  • [ISO15924] recommends that script codes use lowercase with the initial letter capitalized (‘Cyrl’ Cyrillic).
  • [ISO3166-1] recommends that country codes be capitalized (‘MN’ Mongolia).
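
As a rough illustration of these conventions, the following Go sketch re-applies the recommended capitalization to a tag that has been lowercased. `FormatTag` is a hypothetical helper written for this example. It assumes ASCII input, treats every subtag after the first single-character subtag (extension or private use) as lowercase, accepts an underscore as a separator purely for convenience, and does not check the tag against the subtag registry.

```go
package main

import (
	"fmt"
	"strings"
)

// FormatTag is a hypothetical helper that applies the capitalization
// conventions to a language tag: lowercase by default, two-letter subtags
// (not first, not after a singleton) uppercase, four-letter subtags
// (not first, not after a singleton) title case. It does not validate the tag.
func FormatTag(tag string) string {
	// Assumption: accept the underscore form (de_DE) and normalize it to hyphens.
	parts := strings.Split(strings.ReplaceAll(tag, "_", "-"), "-")
	seenSingleton := false
	for i, p := range parts {
		lower := strings.ToLower(p)
		switch {
		case i == 0 || seenSingleton:
			parts[i] = lower // primary language, extension, or private-use subtag
		case len(p) == 2:
			parts[i] = strings.ToUpper(p) // region code, e.g. DE
		case len(p) == 4:
			parts[i] = strings.ToUpper(lower[:1]) + lower[1:] // script code, e.g. Cyrl
		default:
			parts[i] = lower
		}
		if len(p) == 1 {
			seenSingleton = true
		}
	}
	return strings.Join(parts, "-")
}

func main() {
	fmt.Println(FormatTag("ru-cyrl-by")) // ru-Cyrl-BY
	fmt.Println(FormatTag("de_de"))      // de-DE
}
```

This only restores the recommended presentation; per the quote above, the lowercased form carries the same meaning.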

So yes, use a hyphen instead of an underscore, but I would also raise an issue with https://github.com/tdewolff/minify. Strictly speaking, the minifier is not breaking anything, but the official recommendation is being ignored.

Apart from the “rules”, I think it might be Hugo’s fault, not tdewolff’s.


<html lang="{{ .Site.LanguageCode }}" class="no-js">

I recently read some issues on Hugo’s GitHub saying that something is done to the language code, and minification does not change any other attribute. I guess the template should not use {{ .Site.LanguageCode }} but rather the code of the “currently loaded language”.

<html lang="{{ .Site.Language.Lang }}" class="no-js">

does the trick, with hat tips to @jmooring and @ju52.

If you hardcode <html lang="ru-Cyrl-BY"> into a template, --minify converts it to <html lang=ru-cyrl-by>. That’s a tdewolff/minify issue, not a Hugo issue.

But if I hardcode <html data-something="BlaFasel">, it comes out as <html data-something=BlaFasel>. Why only the lang attribute? Why at all? No bytes are saved by lowercasing things. This is a deeper issue for whoever is responsible :slight_smile:

(Let’s ignore it :wink: I solved it by using the “real” language code from the configuration.)

