I have finally found an answer to my problem!
Unless there’s some mistake, of course!
But first things first: a huge thank you to @chrillek for your help. THANKS!
Now to the point: I need to enclose all content semantically pertaining to a level 2 heading within section tags. That is, each h2 and its related content will be enclosed within a pair of section tags. Likewise with level 3 headings: each h3 and its related content will be enclosed within its own pair of div tags.
Below there are (really) thorough explanations about it, but first comes my solution to the problem:
- this is the full code inside layouts/_default/_markup/render-heading.html:
{{ if eq .Level 2 }}
{{ printf "</section>" | safeHTML }}
{{ printf "<section>" | safeHTML }}
{{ else if eq .Level 3 }}
{{ printf "</div>" | safeHTML }}
{{ printf "<div>" | safeHTML }}
{{ end }}
<h{{ .Level }} id="{{ .Anchor | safeURL }}">{{ .Text | safeHTML }}</h{{ .Level }}>
- this is the code to add into single.html (usually replacing {{.Content}}), with a few comments:
{{/* Next lines are about HTML «section» TAGS */}}
{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }} {{/* Removing first (unneeded) closing section tag */}}
{{ $end_section := "</section>" }}
{{ $content2 := printf "%s" $end_section | printf "%s%s" $content1 }} {{/* Now each h2 encloses its own related content, and any text preceding the first opening section tag is also included in this variable */}}
{{ with $content2 }}
{{if (findRE `(?s).*?<h3` . ) }}
{{ $pre_section := replaceRE `(?s)^(.*?)(?:<section>.*)` "$1" . 1 }} {{/* This is the content that doesn't belong to any section */}}
{{ $pre_section | safeHTML }}
{{ range (findRE `(?s)<section.*?</section>` . ) }} {{/* Iterate through sections */}}
{{ $first_div_removed := replaceRE `(?s)(</div>)` "" . 1 }} {{/* Removing first (unneeded) closing div tag */}}
{{ $last_ending_div_added := replaceRE `(?s)(</section>)$` "</div></section>" $first_div_removed }}
{{ $last_ending_div_added | safeHTML }}
{{ end }}
{{ else if (findRE `(?s).*?<h2` . ) }}
{{ . | safeHTML}}
{{ else }}
{{ $content1 | safeHTML }}
{{ end }}
{{ end }}
WARNING: if you intend to use this code in your own theme, you MUST check that it doesn’t have side effects, like introducing tags outside the rendered markdown content.
If you’re a seasoned Hugo programmer, then probably this is it for you. Otherwise below comes a long series of explanations about it all.
THE PROBLEM
Why would I want to add all those section and div tags?
Well, let’s start from an accessibility point of view: it appears that nowadays div and section are mostly equivalent tags if they are not given any id or ARIA attribute. That is, if they are included as plain tags (<div> and <section>), they will both be ignored by accessibility tools.
So here they are only used for styling and user experience reasons.
Why not only div tags?
Easy: when reading the html output, it is easier for me to see the structure of the document. It’s easier to tell apart «main» headings (h2) from subheadings (h3).
What has all this to do with user experience?
I will use those tags to highlight the matching ToC element, so wherever you are in the document, you will be able to see the current heading highlighted in the ToC.
THE CODE IN RENDER-HEADING
Why use this render-hook at all?
Because it will easily add necessary tags in the appropriate places. It would take a lot of work adding them later, after the markdown content has already been rendered.
Does it leave the code finished?
No. It adds unneeded tags which have to be removed afterwards. It also can’t add all the necessary tags, which will have to be included later on.
More precisely, it adds an extra closing </section> tag right before the first h2, which needs to be removed. There’s also an extra closing </div> tag before the first h3 inside each section. Likewise, a closing </section> tag needs to be added at the end of the content (to close the last opened section), and a closing </div> tag needs to be added right before closing each section.
Eeeerm… An example, please?
My render-hook code leaves the html output like this:
<article>
</section>
<section>
<h2>Heading level 2 text</h2>
...
</div>
<div>
<h3>Heading level 3 text</h3>
...
</div>
<div>
<h3>Heading level 3 text</h3>
...
</div>
<div>
<h3>Heading level 3 text</h3>
...
</section>
<section>
<h2>Heading level 2 text</h2>
...
</div>
<div>
<h3>Heading level 3 text</h3>
...
</article>
Why not also include headings 4, 5 and 6?
First, because in my template they won’t be included in the ToC, so no need.
Moreover, because most probably that would create a structure a la Bootstrap, which I don’t really want.
However, they can easily be added in the render-hook. I’m not so sure if the work to be done in single.html would be that simple. I haven’t checked.
Advice? Do not include headings below level 3 (h4, h5, h6).
THE CODE IN SINGLE.HTML
Main problems to face here: absolute control over the dot (context), and good knowledge of regular expressions in Hugo.
If you make some silly mistake, usually there won’t be any warning, just weird results or behaviours, unless there’s something Hugo evaluates as critical. Be advised.
There will be times when you write a regular expression that checks out in regex101.com and yet doesn’t give the desired result in Hugo, so the data you think you’re manipulating won’t be there, or won’t be the exact data you need. Hugo uses Go’s regular expression engine, so select the Golang flavor in regex101.com. Then triple-check your regular expression, and what has been passed as content or context to the regex.
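One concrete reason a pattern can pass in an online tester and still fail in Hugo: Hugo's regex functions are built on Go's regexp package, which follows RE2 syntax and rejects some constructs that PCRE accepts, lookaheads among them. The small Go sketch below (outside Hugo; the helper name compiles is made up for this illustration) checks two patterns against that engine.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package (RE2 syntax) accepts the pattern.
// The function name is hypothetical, chosen only for this sketch.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The section pattern used in this post is valid RE2.
	fmt.Println(compiles(`(?s)<section.*?</section>`))

	// A lookahead is accepted by PCRE-style testers but rejected by RE2.
	fmt.Println(compiles(`(?s)<section>.*?(?=<section>)`))
}
```

The first pattern compiles and the second does not, so a regex built around lookaheads needs to be rewritten before it can be used in a template.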
Debugging example
Consider
{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }}
It will search inside some data (.Content) for a regular expression (between backticks) and replace what has been found (if it has been found) by something else (between quotation marks). And it will do it only once.
.Content is the rendered html coming from the markdown.
In my code this line searches for the first (unwanted) closing section tag. Only the first one found in the document. That will leave my html as:
<article>
<section>
<h2>Heading level 2 text</h2>
...
But if I make some typing mistake in the regex (regular expression) that leaves it valid but no longer matching, Hugo will tell me nothing, and my custom variable $content1 will not hold what I expect (typically it will just be .Content left as is). So: check, check, and then check what is inside that variable. Just temporarily I may add this:
{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }}
{{ $content1 }}
and Hugo will print the content of the variable in an ugly way: as escaped plain text, with the tags shown literally, without newlines or styling. But usually it will be enough to check whether it has edited the right data.
If I want it printed as proper html, I will replace it with:
{{ $content1 | safeHTML }}
Giving all level 2 headings their own content
I need each h2 enclosed within section tags. With the previous step I have removed the first, unwanted closing tag, but now I lack a final closing tag. So here come the next 2 lines of code:
{{ $end_section := "</section>" }}
{{ $content2 := printf "%s" $end_section | printf "%s%s" $content1 }}
Most probably they can be merged, but for clarity’s sake I store the closing tag in a variable, and then I create another variable with all the data in $content1 followed by that closing tag. Note that the piped value ends up as the last argument of the second printf, so $content1 is written there explicitly and the closing tag arrives through the pipe.
Now I have all proper section tags inside $content2. If I check it with {{ $content2 | safeHTML }} and read the html output, I will see:
<article>
<section>
<h2>Heading level 2 text</h2>
...
</section>
<section>
<h2>Heading level 2 text</h2>
...
</section>
</article>
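The pipe order in that printf line can be checked outside Hugo with Go's own text/template package, which provides the same printf and pipeline behaviour. This is only a sketch: the render function and the Content1 map key are stand-ins invented for the example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render runs the same printf pipeline as the template line above.
// Content1 stands in for $content1; both names are hypothetical.
func render(content1 string) string {
	tmpl := template.Must(template.New("t").Parse(
		`{{ $end := "</section>" }}{{ printf "%s" $end | printf "%s%s" .Content1 }}`))
	var b strings.Builder
	if err := tmpl.Execute(&b, map[string]string{"Content1": content1}); err != nil {
		panic(err)
	}
	return b.String()
}

func main() {
	// The piped value becomes the LAST argument of the second printf,
	// so the closing tag is appended after the content.
	fmt.Println(render("<section><h2>A</h2>..."))
}
```

The closing tag lands after the content, which confirms that the value coming through the pipe fills the final %s.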
Let’s go for level 3 headings
First, I want to make sure I will work with the right content, so {{ with $content2 }} will force the data inside $content2 to be my context. REMEMBER: the dot will be the same as $content2.
Now, if you think about it, there will be pages with some text before the first h2 appears. So I have to treat it differently than content within headings. If I were to use just the data inside section tags, then that starting content would vanish and wouldn’t be there in the final html page.
So {{if (findRE `(?s).*?<h3` . ) }} and {{ else if (findRE `(?s).*?<h2` . ) }} will help me with that:
- if the first condition is true (there are level 3 headings), then something will be done with the context
- if not, it is checked whether level 2 headings exist, and if so the context will be the output
- otherwise, if there are no headings (both previous conditions are false), $content1 will be the output: with no headings there won’t be any section tags, so $content1 and .Content will be the same thing
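The three scenarios can be sketched in plain Go as a simple decision function. The function name pickBranch and the branch labels are made up for this illustration; in the template a non-empty findRE result counts as true, which corresponds to MatchString here.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	hasH3 = regexp.MustCompile(`(?s).*?<h3`)
	hasH2 = regexp.MustCompile(`(?s).*?<h2`)
)

// pickBranch mirrors the if / else if / else chain of the template.
// The returned labels are hypothetical names for the three scenarios.
func pickBranch(content string) string {
	switch {
	case hasH3.MatchString(content):
		return "fix-divs" // level 3 headings present: sections need div repair
	case hasH2.MatchString(content):
		return "sections-only" // only level 2 headings: output as is
	default:
		return "no-headings" // nothing to wrap: output the original content
	}
}

func main() {
	fmt.Println(pickBranch("<section><h2>A</h2></div><div><h3>a</h3></section>"))
	fmt.Println(pickBranch("<section><h2>A</h2></section>"))
	fmt.Println(pickBranch("<p>Just text</p>"))
}
```

The order matters: the h3 test has to come first, because a page with level 3 headings normally also contains level 2 headings.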
Now that all 3 scenarios are taken into account, let’s focus on the most complex of the 3: when there are level 3 headings.
We have to remove the first </div> inside each section block, and add a new </div> at the end of that section, so I loop over the sections to do that. But I have to remember there can be content before the first section, so I have to isolate it and then start the loop.
Finding the content before the first section
It appears there’s no «find a string of text and save it to my variable» function in Hugo. Instead I have to use a somewhat contrived replace function: {{ $pre_section := replaceRE `(?s)^(.*?)(?:<section>.*)` "$1" . 1 }}
- search inside $content2 once (the context, the dot, is $content2 here)
- use a regular expression to find the desired text ((?s)^(.*?)(?:<section>.*)), which matches everything from the start of the content to the end, but captures only the part up to the first <section> tag (not included)
- in the regular expression there’s a capturing group, which means this is what I need ((.*?))
- in addition to that, there’s a non-capturing group, which means find also this, but don’t save it ((?:<section>.*)). In this case the non-capturing group matches everything from the first <section> to the end of the content
- use the capturing group as the replacement string ("$1"): as the regular expression has matched all the content, but has only saved the capturing group into memory, it means find what I want and replace everything with that
- save the replaced text into my variable
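The steps above can be tried outside Hugo with Go's regexp package, using the same pattern and the same "$1" replacement. This is a minimal sketch: preSection is a hypothetical helper name and the sample strings are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in the template: group 1 captures everything before
// the first <section>, the non-capturing group swallows the rest.
var preSectionRE = regexp.MustCompile(`(?s)^(.*?)(?:<section>.*)`)

// preSection replaces the whole match with capture group 1, which keeps
// only the content preceding the first <section>. Hypothetical helper.
func preSection(content string) string {
	return preSectionRE.ReplaceAllString(content, "$1")
}

func main() {
	fmt.Println(preSection("<p>Intro</p><section><h2>A</h2>...</section>"))
	// Without any <section> the pattern does not match,
	// so the content comes back unchanged.
	fmt.Println(preSection("<p>No sections here</p>"))
}
```

The second call shows the edge case worth keeping in mind: when no section exists, the replacement does nothing and the whole content is returned.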
NOTE: the Hugo function findRE returns a slice (or array, if you prefer that) of matches, instead of a single string of text. If I use this function rather than the previous replacement, Hugo throws an error about mixing up a plain string and an array of strings.
After isolating it, the content before any section is added to the output ({{ $pre_section | safeHTML }}). And it’s added before the loop is started.
The loop
With {{ range (findRE `(?s)<section.*?</section>` . ) }} I create a loop that isolates each section block (the dot is still $content2 at this point).
On each iteration I remove the first </div> with {{ $first_div_removed := replaceRE `(?s)(</div>)` "" . 1 }} (inside the loop the context, the dot, is no longer $content2 but the current section block).
Then, using the previous variable, I add a </div> at the end of the section block: {{ $last_ending_div_added := replaceRE `(?s)(</section>)$` "</div></section>" $first_div_removed }}. Note that, to prevent adding the closing div after the closing section, I replace the closing section tag with </div></section>.
Finally I just add the contents of this last variable to the output: {{ $last_ending_div_added | safeHTML }}.
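Here is a rough Go sketch of the same loop, outside Hugo, with made-up sample content. One thing it adds that is NOT in the template above: a guard that skips sections without any h3. As far as I can tell, the template as written appends a </div> to every section, so a section with no level 3 heading would end up with an unmatched closing div. The names fixSection and fixAll are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	sectionRE      = regexp.MustCompile(`(?s)<section.*?</section>`)
	closeSectionRE = regexp.MustCompile(`(?s)(</section>)$`)
)

// fixSection repairs the div tags inside one section block.
func fixSection(section string) string {
	// Assumption, not part of the original template: a section without
	// an h3 has no div tags to repair, so it is returned untouched.
	if !strings.Contains(section, "<h3") {
		return section
	}
	// Drop only the first (unneeded) closing div.
	fixed := strings.Replace(section, "</div>", "", 1)
	// Close the last opened div right before the section ends.
	return closeSectionRE.ReplaceAllString(fixed, "</div></section>")
}

// fixAll iterates over every section block, like the range loop.
func fixAll(content string) string {
	var b strings.Builder
	for _, s := range sectionRE.FindAllString(content, -1) {
		b.WriteString(fixSection(s))
	}
	return b.String()
}

func main() {
	in := "<section><h2>A</h2></div><div><h3>a</h3>x</div><div><h3>b</h3>y</section>" +
		"<section><h2>B</h2>z</section>"
	fmt.Println(fixAll(in))
}
```

In a template the same guard could probably be expressed with another findRE check on the current section before the two replaceRE calls, but I have not tested that.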
And that’s it!
Now about the danger of creating section and div tags everywhere, even where I don’t want them: as far as I can tell in my template the previous code does not affect anything outside the rendered markdown (even when there are headings present). So for now I think it’s safe as it is. But suggestions will always be welcomed.