Headings enclosed in HTML section tags

Hi:

I’m new to the forum, not so new to Hugo, but still a newbie.

I’m not a programmer, but I have started to code a new theme for an open source project, and I wish to have a table of contents where the hugo section currently being read is highlighted, so the reader knows where he/she is.

My problem: I wish to enclose each heading and all their contents between html <section> tags, because I wish to use the intersection observer to highlight the current hugo section.

I’ve searched and found this post:

I’ve already tried it and it works!.. Sadly not as I need it…

As it is, the code provided encloses h2 headings within divs, and by doing so, it ignores any h3, h4, … section, although they are included within the h2 div tags. So the toc correctly highlights h2 headings, but none of the lower level headings.

I’ve managed to make hugo surround any heading level between divs by editing the render-heading.html into:

{{ if eq .Level 2 }}
    {{ printf "<!-- end-chunk -->" | safeHTML }}
    {{ printf "<!-- begin-chunk data-anchor=%q -->" .Anchor | safeHTML }}

        {{ else if eq .Level 3 }}
        {{ printf "<!-- end-chunk -->" | safeHTML }}
        {{ printf "<!-- begin-chunk data-anchor=%q -->" .Anchor | safeHTML }}

        {{ else if eq .Level 4 }}
        {{ printf "<!-- end-chunk -->" | safeHTML }}
        {{ printf "<!-- begin-chunk data-anchor=%q -->" .Anchor | safeHTML }}

{{ end }}
<h{{ .Level }} id="{{ .Anchor | safeURL }}">{{ .Text | safeHTML }}</h{{ .Level }}>

though the result is that now each section ends where a new heading starts, no matter its level.

That is:

  • I need:
    <section>
    h2
        <section>
        h3
        </section>
    </section>
  • and my new code gives:
    <section>
    h2
    </section>
    <section>
    h3
    </section>

I have edited the template code into:

{{ range (findRE `(?s)<!-- begin-chunk.*?(?:<!-- end-chunk -->|$)` .Content) }}
  {{ $anchor := replaceRE `(?s).+data-anchor="(.+?)".+` "$1" . }}
  {{ $block := replaceRE `(?s)<!-- begin-chunk.+?-->(.+?)(?:<!-- end-chunk -->|$)` "$1" . }}
  {{/* Remove leading and trailing newlines. */}}
  {{ $block = trim $block "\n" }}
  <section id="{{ $anchor }}">
  {{ $block | safeHTML}}
  </section>
{{ end }}

I just changed the html tag from div to section and the variable $section to $block to avoid confusion with the html tag.

Although the code uses variables and regular expressions that I still don’t completely understand, I specifically have no idea where begin-chunk and end-chunk come from. Moreover, I have no clue why HTML comments are used at all.

I guess such code must need to be expanded with nested variables, functions or new regular expressions, but no matter how hard I’ve searched, I haven’t found any documentation, tutorial or example that helps me.

I’m not looking for someone to send me the correct code (that would be warmly welcomed, though). Instead, what I’m looking for are links or explanations to know how to code it myself. That way I will be able to find solutions to future problems for myself.

Would anybody be kind enough to point me in the right direction, please?

Thanks

You could use a Page.Scratch in your render-heading.html to remember the last level and add section start/end tags depending on that and the current level.

However, your approach seems to be a bit brute force. Headings can happen everywhere, not just in content (think navigation, aside sections etc.). Your render-heading template would treat them all the same.

A list or single template might be a better place to deal with the section logic – and it’s also a lot easier there, I guess.
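A minimal, untested sketch of the Page.Scratch idea mentioned above, assuming the page content is rendered exactly once per build and that only h2 and h3 need wrapping. The keys "sectionOpen" and "divOpen" are made-up names for illustration, and depending on the Hugo version .Store may be the recommended replacement for .Scratch:

```go-html-template
{{/* layouts/_default/_markup/render-heading.html (sketch) */}}
{{/* "sectionOpen" and "divOpen" are hypothetical Scratch keys, not Hugo built-ins. */}}
{{ $s := .Page.Scratch }}
{{ if eq .Level 2 }}
  {{/* Close whatever the previous headings left open, then open a new section. */}}
  {{ if $s.Get "divOpen" }}</div>{{ end }}
  {{ if $s.Get "sectionOpen" }}</section>{{ end }}
  <section>
  {{ $s.Set "sectionOpen" true }}
  {{ $s.Set "divOpen" false }}
{{ else if eq .Level 3 }}
  {{ if $s.Get "divOpen" }}</div>{{ end }}
  <div>
  {{ $s.Set "divOpen" true }}
{{ end }}
<h{{ .Level }} id="{{ .Anchor | safeURL }}">{{ .Text | safeHTML }}</h{{ .Level }}>

{{/* In single.html, right after the content: close what is still open. */}}
{{ .Content }}
{{ if .Scratch.Get "divOpen" }}</div>{{ end }}
{{ if .Scratch.Get "sectionOpen" }}</section>{{ end }}
```

With this shape no stray closing tags are emitted in the first place, so no regular expression cleanup would be needed afterwards. It would still wrap headings wherever the render hook runs.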

Thanks. I will think about this and try to figure out the code.

If you mean that it will add a couple extra tags on each heading, well, yes. But I don’t see that as a bad behaviour, though.

Depending on where you read about this, using sections to enclose headings and their related content is either advised or discouraged. E.g., even though it’s an old document, the W3C text about HTML5 sections and ARIA seems to suggest it’s good practice for accessibility (I may be wrong). So, following that explanation, simply adding opening and closing HTML section tags is not bad per se, as they are officially part of HTML5 and won’t harm the document. If they are included but not used for anything, I guess you’re just adding a few kilobytes to the HTML page for nothing.

On the other hand, in my template all navigation elements (including the page toc) are simply list elements with some css styling. I have no aside, though, so I can’t say for sure it won’t be a problem there.

Be that as it may, I will also consider using custom code in the list or single template.

Thanks again.

P.S.: and even if I can’t figure out the code I need, I still can use the current custom code I showed in this thread, because with it I can highlight the current heading without gaps, in case of long documents. It’s just that I wish to highlight the current h2 heading AND the current lower level heading being read.

No, that’s not what I mean. But: using a heading render hook for that might add sections at places where they do not belong. For example, I use several h2 headings in an aside – and those should definitely not be enclosed in section tags.

In any case, it would be easier to see what you’re after and offer suggestions if you posted a link to your repository. For example, I don’t really understand if you want to highlight the TOC entries referring to the currently visible content or the content itself or both. In the first case (highlight the TOC entries), why would you even need the sections for the intersection observer? The observer could react on the headings only.

Ok, fine. I will seriously consider adding code into the single template. Thing is I just don’t know how. I’ll figure it out.

Erm… Sorry, no repository. I’m not a programmer and I have yet no clue about git. I’m still learning Hugo and slowly building a theme, so I’m working on a local server and a repo is not yet considered.

But I can share the code here or upload screenshots, which is what I’ll do now.

About the intersection observer: a web browser sees everything on a webpage as blocks, rectangles with content inside. The intersection observer, among other things, is able to know when a certain block is visible in the viewport (on screen), so a script can do things when that block is visible. If the block goes out of view, then nothing can be done depending on that block because it is no longer visible.

All of this is important because the block surrounding the heading tags sooner or later will get out of view, so the intersection observer can no longer tell the script to highlight the toc element.

The problem is shown in the post preceding the one I linked in my first post here: watch the toc and you will see that there are points where no toc element is highlighted. That’s because at that point there are no headings visible on screen. I already saw that problem some time ago, but didn’t find any solution to it.

Here you can see my current theme (with a bare-bones style applied to it) and what I wish to have in my page toc (the highlights have been added by hand with the browser dev tools):

The text is in Spanish, but that is not important. You can see an h3 on screen (hugo section number 5.1). With the original code from @jmooring I can highlight the whole h2 section, because the div (or section) surrounds everything up to where the next h2 appears. That’s one thing I need.

With my custom code I would be able to highlight the first h3 heading in the toc, but not the h2 heading (let’s say the h2 heading is no longer visible on the viewport; it currently really is because the upper quote is contained in the h2 section, but if you scrolled down a bit more, then the h2 block will no longer be on screen).

And this is really what I want to accomplish:

Here, the h2 block is still highlighted, because it still surrounds everything up to the next h2 heading. AND the currently visible block is also highlighted; it is a block that ends where the next h3 heading starts, so there are no gaps in the highlighting, even if the current h3 heading gets out of view.

Hope this helps.

I have finally found an answer to my problem!

Unless there’s some mistake, of course!

But first things first: a huge thank you to @chrillek for your help. THANKS!

Now to the point: I need to enclose all content semantically pertaining to a level 2 heading within section tags. That is, each h2 and its related content will be enclosed within a pair of section tags. Likewise with level 3 headings: each h3 will be enclosed within its own div tags.

Below there are (really) thorough explanations about it, but next comes my solution to the problem:

  • this is the full code inside layouts/_default/_markup/render-heading.html:
        {{ if eq .Level 2 }}
            {{ printf "</section>" | safeHTML }}
            {{ printf "<section>" | safeHTML }}
        {{ else if eq .Level 3 }}
            {{ printf "</div>" | safeHTML }}
            {{ printf "<div>" | safeHTML }}
        {{ end }}
        <h{{ .Level }} id="{{ .Anchor | safeURL }}">{{ .Text | safeHTML }}</h{{ .Level }}>
  • this is the code to add into single.html (usually replacing {{.Content}}), with a few comments:
        {{/* Next lines are about HTML «section» TAGS */}}
        {{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }} {{/* Removing first (unneeded) closing section tag */}}
        {{ $end_section := "</section>" }}
        {{ $content2 := printf "%s" $end_section | printf "%s%s" $content1 }} {{/* Now each h2 encloses its own related content, and any text preceding the first opening section tag is also included in this variable */}}

        {{ with $content2 }}
            {{if (findRE `(?s).*?<h3` . ) }}
                {{ $pre_section := replaceRE `(?s)^(.*?)(?:<section>.*)` "$1" . 1 }} {{/* This is the content that doesn't belong to any section */}}
                {{ $pre_section | safeHTML }}

                {{ range (findRE `(?s)<section.*?</section>` . ) }} {{/* Iterate through sections */}}
                        {{ $first_div_removed := replaceRE `(?s)(</div>)` "" . 1 }} {{/* Removing first (unneeded) closing div tag */}}
                        {{ $last_ending_div_added := replaceRE `(?s)(</section>)$` "</div></section>" $first_div_removed }}
                        {{ $last_ending_div_added | safeHTML }}
                {{ end }}
            {{ else if (findRE `(?s).*?<h2` . ) }}
                {{ . | safeHTML}}
            {{ else }}
                {{ $content1 | safeHTML }}
            {{ end }}
        {{ end }}

WARNING: if you intend to use this code in your own theme, you MUST check that it doesn’t have side effects, like introducing tags outside the rendered markdown content.

If you’re a seasoned Hugo programmer, then probably this is it for you. Otherwise below comes a long series of explanations about it all.

THE PROBLEM

Why would I want to add all those section and div tags?

Well, let’s start from an accessibility point of view: it appears that nowadays div and section are mostly similar tags if they are not given any id or ARIA attribute. That is, if they are included as plain tags (<div> and <section>), they will both be ignored by accessibility tools.

So here they are only used for styling and user experience reasons.

Why not only div tags?

Easy, because when reading the html output, it is easier for me to see the structure of the document. It’s easier to tell apart what are «main» headings (h2) from subheadings (h3).

What has all this to do with user experience?

I will use those tags to highlight the matching ToC element, so wherever you are in the document, you will be able to see the current heading highlighted in the ToC.

THE CODE IN RENDER-HEADING

Why use this render-hook at all?

Because it will easily add necessary tags in the appropriate places. It would take a lot of work adding them later, after the markdown content has already been rendered.

Does it leave the code finished?

No. It adds unneeded tags which have to be removed afterwards. It also can’t add all the necessary tags, which will have to be included later on.

More precisely, it adds an extra closing </section> tag right before the first h2, that needs to be removed. There’s also an extra closing </div> tag on the first h3 inside each h2. Likewise there’s a closing </section> tag that needs to be added at the end of the content (to close the last opened section), and a closing </div> tag that needs to be added right before closing each section.

Eeeerm… An example, please?

My render-hook code leaves the html output like this:

<article>

  </section>
  <section>
     <h2>Heading level 2 text</h2>
        ...
        </div>
        <div>
            <h3>Heading level 3 text</h3>
                ...
        </div>
        <div>
            <h3>Heading level 3 text</h3>
                ...
        </div>
        <div>
            <h3>Heading level 3 text</h3>
                ...
  </section>
  <section>
     <h2>Heading level 2 text</h2>
        ...
        </div>
        <div>
            <h3>Heading level 3 text</h3>
                ...

</article>

Why not also include headings 4, 5 and 6?

First, because in my template they won’t be included in the ToC, so no need.

Moreover, because most probably that would create a structure a la Bootstrap, which I don’t really want.

However, they can easily be added in the render-hook. I’m not so sure if the work to be done in single.html would be that simple. I haven’t checked.

Advice? Do not include headings below level 3 (h4, h5, h6).

THE CODE IN SINGLE.HTML

Main problems to face here: absolute control over the dot (context), and good knowledge about regular expressions in Hugo.

If you make some silly mistakes, usually there won’t be any warning, just weird results or behaviours, unless there’s something Hugo evaluates as critical. Be advised.

There will be times you write a regular expression that works on regex101.com but doesn’t give the desired result in Hugo, so the data you think you’re manipulating won’t be there, or won’t be the exact data you need. Hugo uses Go’s regular expression engine (RE2 syntax), so selecting the Golang flavor on regex101.com gives the closest match. Triple-check your regular expression, and what has been passed as content or context to the regex.

Debugging example

Consider

{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }}

It will search inside some data (.Content) for a regular expression (between backticks) and replace what has been found (if it has been found) by something else (between quotation marks). And it will do it only once.

.Content is the rendered html coming from the markdown.

In my code this line searches for the first (unwanted) closing section tag. Only the first one found in the document. That will leave my html as:

<article>

  <section>
     <h2>Heading level 2 text</h2>
        ...

But if I make some typing mistake in the regex (regular expression) and the pattern is still valid, Hugo will tell me nothing: the replacement just won’t happen, and my custom variable $content1 will hold .Content as is. So: check, check, and then check what is inside that variable. Just temporarily I may add this:

{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }}
{{ $content1 }}

and Hugo will print the content of the variable in an ugly way, that is, as escaped plain text, without newlines or styling. But it usually will be enough to check if it has edited the right data.

If I want it printed as proper html, I will swap it for:

{{ $content1 | safeHTML }}
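Another way to check the result without printing the whole page, sketched under the assumption that a count of closing tags is enough evidence: log how many </section> tags exist before and after the replacement. warnf and strings.Count are Hugo functions; the variable names $before and $after are only illustrative.

```go-html-template
{{ $content1 := replaceRE `(?s)(</section>)` "" .Content 1 }}
{{/* Temporary debugging: remove once the regex is confirmed. */}}
{{ $before := strings.Count "</section>" .Content }}
{{ $after := strings.Count "</section>" $content1 }}
{{ warnf "%s: closing section tags before=%d, after=%d" .RelPermalink $before $after }}
```

The message shows up in the terminal where hugo is running. If the page has at least one h2, the second number should be one lower than the first; equal numbers mean the regex matched nothing.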

Giving all level 2 headings their own content

I need each h2 enclosed within section tags. With the previous step I have removed the first, unwanted closing tag, but now I lack a final closing tag. So here come the next two lines of code:

{{ $end_section := "</section>" }}
{{ $content2 := printf "%s" $end_section | printf "%s%s" $content1 }}

Most probably they can be merged, but for clarity’s sake I store my needed closing tag in a variable, and then I create another variable with all the data in $content1 followed by that closing tag. Note that the piped value becomes the last argument of the second printf, so the variables appear in reverse order in the code.
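A possible merged form, as a sketch: since printf accepts literal text in its format string, the closing tag can be written there directly and the helper variable dropped.

```go-html-template
{{/* Same result as the two lines above, in a single step. */}}
{{ $content2 := printf "%s</section>" $content1 }}
```

Either way the tag is appended even when the page has no h2 at all, which is why the final output falls back to $content1 in that case.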

Now I have all proper section tags inside $content2. If I check it with {{ $content2 | safeHTML }} and read the html output, I will see:

<article>

  <section>
     <h2>Heading level 2 text</h2>
        ...
  </section>
  <section>
     <h2>Heading level 2 text</h2>
        ...
  </section>

</article>

Let’s go for level 3 headings

First, I want to make sure I will work with the right content, so {{ with $content2 }} will force the data inside $content2 to be my context. REMEMBER: the dot will be the same as $content2.

Now, if you think about it, there will be pages with some text before the first h2 appears. So I have to treat it differently than content within headings. If I were to use just the data inside section tags, then that starting content would vanish and wouldn’t be there in the final html page.

So {{if (findRE `(?s).*?<h3` . ) }} and {{ else if (findRE `(?s).*?<h2` . ) }} will help me with that:

  • if the first condition is true (level 3 headings exist), then something will be done with the context
  • if not, then it is checked whether level 2 headings exist, and if so the context will be the output
  • otherwise, if there are no headings (both previous conditions are false), then $content1 will be the output; with no headings there won’t be any section tags, so $content1 and .Content will be the same thing
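The three scenarios can be sketched as a skeleton. This variant swaps findRE for strings.Contains, a plain substring test; that substitution is an assumption for illustration and not the code from the solution above. The comments mark what each branch is for.

```go-html-template
{{ with $content2 }}
  {{ if strings.Contains . "<h3" }}
    {{/* Level 3 headings exist: fix the div tags section by section. */}}
  {{ else if strings.Contains . "<h2" }}
    {{/* Only level 2 headings: the section tags are already balanced. */}}
    {{ . | safeHTML }}
  {{ else }}
    {{/* No headings: output the content without the appended closing tag. */}}
    {{ $content1 | safeHTML }}
  {{ end }}
{{ end }}
```

The order matters: a page with h3 headings normally has h2 headings too, so the h3 test has to come first.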

Now that all 3 scenarios are taken into account, let’s focus on the most complex of the 3, when there are level 3 headings.

We have to remove the first </div> inside each section block, and add a new </div> at the end of that section, so I loop over the sections to do that. But I have to remember there can be content before the first section, so I have to isolate it and then start the loop.

Finding the content before the first section

It appears there’s no «find a string of text and save it to my variable» function in Hugo. Instead of that I have to use a somewhat contrived replace function: {{ $pre_section := replaceRE `(?s)^(.*?)(?:<section>.*)` "$1" . 1 }}

  • search inside $content2 once (the context, the dot, is $content2 here)
  • use a regular expression to find the desired text ((?s)^(.*?)(?:<section>.*)), which in this code matches everything from the start of the content to the end, but captures only the part before the first <section> tag (tag not included)
  • in the regular expression there’s a capturing group ((.*?)), which marks the part I need
  • in addition to that, there’s a non-capturing group ((?:<section>.*)), which means find this too, but don’t save it. In this case the non-capturing group matches everything from the first <section> to the end of the content
  • use the capturing group as the replacement string ("$1"): the regular expression has matched the whole content but has saved only the capturing group, so the whole content is replaced with just the part I want
  • save the replaced text into my variable

NOTE: the Hugo function findRE stores slices (or arrays, if you prefer that), instead of strings of text. If I use this function rather than the previous replacement, Hugo will throw an error telling me I am trying to save a plain string into something that expects an array of strings.
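For comparison, a sketch of how findRE could still be used here, by taking the first element of the returned slice with index. It assumes the same context as the line above (the dot is $content2) and is an alternative for illustration, not the code from the solution.

```go-html-template
{{/* Default mirrors replaceRE: with no match, the whole input is kept. */}}
{{ $pre_section := . }}
{{ with findRE `(?s)^.*?<section>` . 1 }}
  {{/* Inside "with", the dot is the slice; take its only element. */}}
  {{ $pre_section = strings.TrimSuffix "<section>" (index . 0) }}
{{ end }}
{{ $pre_section | safeHTML }}
```

The with block is skipped when findRE returns an empty slice, so index is never called on a slice with no elements.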

After isolating it, the content before any section is added to the output ({{ $pre_section | safeHTML }}). And it’s added before the loop is started.

The loop

With {{ range (findRE `(?s)<section.*?</section>` $content2 ) }} I create a loop that isolates each section block.

On each iteration I remove the first </div> with {{ $first_div_removed := replaceRE `(?s)(</div>)` "" . 1 }} (inside the range, the context is the current section block, no longer the whole $content2).

Then using the previous variable, I add a </div> at the end of the section block: {{ $last_ending_div_added := replaceRE `(?s)(</section>)$` "</div></section>" $first_div_removed }}, but note that to prevent adding the closing div after the closing section, I have replaced the closing section tag with </div></section>.

Finally I just add the contents of this last variable to the output: {{ $last_ending_div_added | safeHTML }}.
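One case the loop as written may not cover: on a page that has h3 headings, an h2 section without any h3 contains no stray </div>, yet still gets one added before its </section>. A possible guard, sketched with strings.Contains (a substitution for illustration, not part of the original solution), only touches sections that contain an h3:

```go-html-template
{{ range findRE `(?s)<section.*?</section>` . }}
  {{ if strings.Contains . "<h3" }}
    {{ $fixed := replaceRE `(?s)(</div>)` "" . 1 }}
    {{ $fixed = replaceRE `(?s)(</section>)$` "</div></section>" $fixed }}
    {{ $fixed | safeHTML }}
  {{ else }}
    {{/* No h3 here, so the render hook added no div tags to balance. */}}
    {{ . | safeHTML }}
  {{ end }}
{{ end }}
```

This still assumes the first </div> in a section is the one emitted by the render hook. A section whose content has its own div before the first h3 (for example a highlighted code block) would need a stricter pattern.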

And that’s it!

Now about the danger of creating section and div tags everywhere, even where I don’t want them: as far as I can tell in my template the previous code does not affect anything outside the rendered markdown (even when there are headings present). So for now I think it’s safe as it is. But suggestions will always be welcomed.
