Iterate through a page’s headings

Morteza_Karimi · December 7, 2021, 10:24pm

I’m wondering if headings in a page can be accessed and iterated over from within a shortcode.
Imagine a content page like below:

## Introduction

One morning, when Gregor Samsa woke from troubled dreams, he found himself transformed in his bed into a horrible vermin.

## My Heading

He lay on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment.

### My Subheading

A collection of textile samples lay spread out on the table - Samsa was a travelling salesman

Is it possible to access a dictionary of the headings and subheadings of this page?
I’m guessing .Page.Contents would be a good place to start, perhaps with a where filter to extract the headings and construct a dictionary?
I also see that there’s a .Page.TableOfContents variable that seems to be exactly what I need. However, it appears to be the rendered HTML version; what I need is a nested dictionary of the headings and subheadings within a page.
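
To make the "nested dictionary" idea concrete, here is a minimal Go sketch of the data shape being asked for. It is not a Hugo API: the `Heading` and `Node` types and the `BuildTree` function are hypothetical names used only for illustration, and the input is assumed to be a flat list of headings in document order.

```go
package main

import "fmt"

// Heading is one flat entry: the Markdown level (2 for "##") and its text.
type Heading struct {
	Level int
	Text  string
}

// Node is a heading together with the subheadings nested under it.
type Node struct {
	Heading
	Children []*Node
}

// BuildTree nests a flat, document-ordered list of headings by level.
// Each heading becomes a child of the nearest preceding heading that
// has a smaller level.
func BuildTree(flat []Heading) []*Node {
	root := &Node{} // synthetic root at level 0, never popped
	stack := []*Node{root}
	for _, h := range flat {
		n := &Node{Heading: h}
		// Pop until the top of the stack is a shallower heading.
		for len(stack) > 1 && stack[len(stack)-1].Level >= h.Level {
			stack = stack[:len(stack)-1]
		}
		parent := stack[len(stack)-1]
		parent.Children = append(parent.Children, n)
		stack = append(stack, n)
	}
	return root.Children
}

// printTree writes the tree with two spaces of indentation per depth.
func printTree(nodes []*Node, depth int) {
	for _, n := range nodes {
		fmt.Printf("%*s%s\n", depth*2, "", n.Text)
		printTree(n.Children, depth+1)
	}
}

func main() {
	tree := BuildTree([]Heading{
		{2, "Introduction"},
		{2, "My Heading"},
		{3, "My Subheading"},
	})
	printTree(tree, 0)
}
```

For the sample page above, "My Subheading" ends up as the only child of "My Heading", while "Introduction" has no children.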

jmooring · December 7, 2021, 11:54pm

There’s no easy way to do it, but you might have a look at this:
https://discourse.gohugo.io/t/splitting-content-into-sections-based-on-header-level/33749/6

chrillek · December 8, 2021, 10:29am

.Contents is already HTML. So you’d have to search for <h(\d)>([^<]+)</h\1> with a regular expression. Then the text of the headings would be accessible in the second capturing group, their level in the first one. However, afaik, a dictionary is a key-value map, so I don’t see how that will help you to retain the hierarchical structure of your headlines.
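
As a rough illustration of the regular-expression approach, here is a Go sketch that pulls the level and text out of rendered HTML. Two caveats: Hugo's regex functions use Go's regexp package (RE2 syntax), which does not support backreferences such as `\1`, so this sketch matches the closing tag with `[1-6]` instead; and rendered headings usually carry attributes such as `id`, so the opening tag allows for them. The names `headingRE` and `HTMLHeadings` are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// headingRE matches a rendered heading such as <h2 id="intro">Text</h2>.
// Group 1 is the level, group 2 the text. The closing tag uses [1-6]
// rather than \1 because RE2 has no backreferences.
var headingRE = regexp.MustCompile(`<h([1-6])[^>]*>([^<]+)</h[1-6]>`)

// HTMLHeadings returns {level, text} pairs in document order.
// Headings that contain inline markup (for example <code>) are skipped,
// because the text group refuses any '<'.
func HTMLHeadings(html string) [][2]string {
	var out [][2]string
	for _, m := range headingRE.FindAllStringSubmatch(html, -1) {
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	html := `<h2 id="introduction">Introduction</h2><p>One morning...</p>` +
		`<h2 id="my-heading">My Heading</h2><h3 id="my-subheading">My Subheading</h3>`
	for _, h := range HTMLHeadings(html) {
		fmt.Println(h[0], h[1])
	}
}
```

The result is still a flat list; the level in the first group is what would let you rebuild the hierarchy afterwards.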

The same goes for .Page.TableOfContents: like .Contents, it is rendered HTML.

What do you want to achieve?

alexandros · December 8, 2021, 11:27am

The variable is called .Content (there is no s at the end).

Morteza_Karimi · December 8, 2021, 10:02pm

I made some progress getting the headings using findRE. For some reason .Content was not available to me, but I was able to get .Page.RawContent.
The code below gets the headings and appends them to a list:

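{{/* Collect an anchorized ID for every Markdown heading in the raw page source. */}}
{{ $headings := slice }}

{{ range (findRE `\n#+([^{\n]+)` .Page.RawContent) }}
    {{ $headings = $headings | append (anchorize (trim . "\n# ")) }}
{{ end }}

{{/* Render the heading. Note that the index into $headings is hard-coded to 1. */}}
<h{{ $.Level }} id="{{ index $headings 1 }}">{{ $.Text | safeHTML }} <a href="#{{ index $headings 1 }}"></a></h{{ $.Level }}>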

To answer your broader question about what I’m trying to do: I’m trying to override the way Hugo generates heading IDs by using a render hook for headings.
I need heading IDs to be in English even if the heading text is not English. For example, for a given page in French, I can construct a list of that page’s (French) headings using the code above. I’m hoping to fetch the English version of the same page and do the same, then correlate the headings by order and apply the English IDs to the French headings.
The part where I’m stuck now is getting the English page; it seems that scope is not accessible in the heading hook.
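
The "correlate by order" step described above could look roughly like this in Go. This is only a sketch of the pairing logic, not of how to reach the English page from a hook. `PairIDs` is a hypothetical helper, and the French headings in `main` are invented sample data.

```go
package main

import (
	"errors"
	"fmt"
)

// PairIDs maps each translated heading to the ID of the English heading
// at the same position. It refuses to guess when the counts differ.
// Limitation: two identical heading texts on one page would collide.
func PairIDs(translated, englishIDs []string) (map[string]string, error) {
	if len(translated) != len(englishIDs) {
		return nil, errors.New("heading counts differ; cannot correlate by order")
	}
	ids := make(map[string]string, len(translated))
	for i, h := range translated {
		ids[h] = englishIDs[i]
	}
	return ids, nil
}

func main() {
	// Hypothetical French headings and the IDs taken from the English page.
	fr := []string{"Introduction", "Mon titre", "Mon sous-titre"}
	en := []string{"introduction", "my-heading", "my-subheading"}
	ids, err := PairIDs(fr, en)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ids["Mon titre"])
}
```

The length check matters here: if a translation adds or drops a heading, pairing by position would silently shift every ID after that point.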

chrillek · December 9, 2021, 9:10am

This will not find a heading on the first line of your file because of the leading \n. Also, the ‘#’ marker(s) are followed by a space in markup, which your RE gobbles up in the capturing group. Is that intended?

Morteza_Karimi · December 9, 2021, 7:01pm

You’re right. The following seems to be working better:
#+ ([^{\n]+)[ \n]
But is there a way to retrieve the capturing group in Hugo? As far as I know that’s not supported?

chrillek · December 10, 2021, 7:51am

You’ll probably have to do some tricks with replaceRE.
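
One way to read that suggestion: take each whole match and run a replace over it with `$1` as the replacement, which leaves only the captured group. The Go sketch below mirrors that idea with the regexp package, on the assumption that Hugo's findRE and replaceRE behave like `FindAllString` and `ReplaceAllString`. The pattern uses `(?m)^` instead of a leading `\n` so that a heading on the first line is also found. `mdHeading` and `HeadingTexts` are names invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// mdHeading matches an ATX heading at the start of any line.
// Group 1 is the heading text up to a '{' (attribute block) or line end.
var mdHeading = regexp.MustCompile(`(?m)^#+ ([^{\n]+)`)

// HeadingTexts finds whole matches first, then reduces each match to its
// first capture group by replacing the match with "$1".
func HeadingTexts(markdown string) []string {
	var texts []string
	for _, whole := range mdHeading.FindAllString(markdown, -1) {
		text := mdHeading.ReplaceAllString(whole, "$1")
		texts = append(texts, strings.TrimSpace(text))
	}
	return texts
}

func main() {
	src := "# First\n\nSome text\n## Second {#custom-id}\n### Third\n"
	for _, t := range HeadingTexts(src) {
		fmt.Println(t)
	}
}
```

Be aware that a pattern like this also matches lines starting with `#` inside fenced code blocks, so it is only a rough approximation of real Markdown parsing.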

alexandros · December 10, 2021, 9:18am

For an example using a captured group in the replaceRE function see:

Not sure how you could retrieve the captured group with findRE.

Also, there are a few more topics on the forum related to your question; look them up, perhaps you will find something else.

chrillek · December 10, 2021, 10:16am

You’d need something like a match list/map/dictionary that you can access. Go seems to have the required functionality, but it is apparently not exposed in Hugo.
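
For reference, this is roughly what that functionality looks like in plain Go, outside of Hugo templates: `FindAllStringSubmatch` returns every capture group, and `SubexpNames` lets you turn each match into a map keyed by group name. The group names `marks` and `text` and the function `MatchMaps` are choices made for this sketch only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedHeading captures the '#' markers and the heading text as named groups.
var namedHeading = regexp.MustCompile(`(?m)^(?P<marks>#+) (?P<text>[^{\n]+)`)

// MatchMaps returns one map per heading, keyed by capture group name.
func MatchMaps(markdown string) []map[string]string {
	names := namedHeading.SubexpNames()
	var out []map[string]string
	for _, m := range namedHeading.FindAllStringSubmatch(markdown, -1) {
		entry := make(map[string]string)
		for i, name := range names {
			if name != "" { // index 0 is the whole match and has no name
				entry[name] = strings.TrimSpace(m[i])
			}
		}
		out = append(out, entry)
	}
	return out
}

func main() {
	src := "## Introduction\n\nSome text\n### My Subheading\n"
	for _, e := range MatchMaps(src) {
		// The heading level is the number of '#' markers.
		fmt.Println(len(e["marks"]), e["text"])
	}
}
```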

There’s an example on how to work with capturing groups here:

That’s clumsy (to use a friendly term). More complete support for regular expressions in Hugo would probably be nice.

alexandros · December 10, 2021, 4:01pm

I see that you opened:

So you will most likely get an answer there from the maintainer, or perhaps someone else will look into the issue.
