Parsing Backlinks in Hugo

I’d love to get feedback on performance improvements in the backlinks.html partial I share here:

Cool little partial you have here.

As far as performance goes, you’re only iterating site.RegularPages once, so that seems fine.

I’m no regex expert – though, I hear you are, ha – but the regex seems like the next place I would look: ways to make it match earlier, or to run it against less text, etc.

Assuming we limit the scope to links in markdown, here is one idea:

  • In a render hook, find the page (site.GetPage) and add the current page to a Scratch “backlinks map” in the target page.
  • In a dummy output format that is rendered first, do a {{ range site.RegularPages }}{{ $tmp := .Content }}{{ end }}
  • Then you should have a ready-to-use backlink index in your Page.Scratch (or Page.Store depending on your requirements) when you render your HTML (or RSS).
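A rough, untested sketch of the first bullet, in case it helps. It assumes a link render hook at layouts/_default/_markup/render-link.html, that link destinations are paths site.GetPage can resolve (external URLs and anything else it cannot resolve are simply skipped), and a Store key named "backlinks", which is my own naming:

```go-html-template
{{- /* layouts/_default/_markup/render-link.html (assumed location) */ -}}
{{- /* Look up the link target; site.GetPage returns nil when nothing matches. */ -}}
{{- with site.GetPage .Destination -}}
  {{- /* Inside "with", "." is the target page and "$" is the render hook context.
         Keying by the linking page's RelPermalink keeps one entry per source page. */ -}}
  {{- .Store.SetInMap "backlinks" $.Page.RelPermalink $.Page -}}
{{- end -}}
<a href="{{ .Destination | safeURL }}"{{ with .Title }} title="{{ . }}"{{ end }}>{{ .Text | safeHTML }}</a>
```

The backlinks partial on the target page could then read the map with .Store.Get "backlinks" and range over it, provided every page's content has been rendered before that partial runs.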

Unfortunately, 99% of my cross-linking uses relref: it’s easy and detects broken links automatically. So the link render hooks won’t work here, right?

I’m no expert either. I came up with that regex to remove all the false matches for backlinks. I’ll keep looking for better ways to write that regex.

Thinking a bit more… I think I can do this in 2 places:

  • In a link render hook to handle normal md links, and
  • a custom relref shortcode that overrides the default one and does this Scratch update in the Page’s context.
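A minimal sketch of the second bullet, with several assumptions: that a layouts/shortcodes/relref.html in the project takes precedence over the embedded shortcode, that the reference is passed as the first positional argument, and that the Store key is "backlinks" (a name chosen here for illustration). It is untested:

```go-html-template
{{- /* layouts/shortcodes/relref.html (assumed to take precedence over the embedded shortcode) */ -}}
{{- $ref := .Get 0 -}}
{{- /* Drop any "#fragment" before the lookup; GetPage expects a page path. */ -}}
{{- $path := index (split $ref "#") 0 -}}
{{- with $path -}}
  {{- with site.GetPage . -}}
    {{- /* "." is now the target page; "$" is still the shortcode context. */ -}}
    {{- .Store.SetInMap "backlinks" $.Page.RelPermalink $.Page -}}
  {{- end -}}
{{- end -}}
{{- /* Delegate to the relref function so broken-link detection still applies. */ -}}
{{- relref .Page $ref -}}
```

Existing content should not need to change, since pages keep calling relref the same way; only the bookkeeping is added before the link is resolved.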

But I didn’t understand the step with the dummy output format that is rendered first.

What does this step do? Also how do we control which output format is rendered first?

It’s a way to trigger the render hooks. There is a weight attribute on the output format, and I suspect the ordering falls back to the name, so if you prefix it with “_”, e.g. “_preprocessformat”, you should be safe.
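For what it’s worth, the template for such a pre-processing format can be tiny. The sketch below assumes an output format named _preprocessformat is defined in the site config with a plain-text media type and enabled for the home page; the file name follows from those assumptions and is hypothetical. Whether this format really renders before HTML is only my suspicion, so verify it on your own site:

```go-html-template
{{- /* layouts/index._preprocessformat.txt (hypothetical; depends on the output format's name and suffix) */ -}}
{{- range site.RegularPages -}}
  {{- /* Evaluating .Content renders the page's markdown, which runs the link
         render hooks and any shortcodes as a side effect. The value is discarded. */ -}}
  {{- $tmp := .Content -}}
{{- end -}}
```

The file this produces is a throwaway; the only purpose is the side effect of rendering every page once before the real templates ask for the backlink index.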

Note that once I can get the partialCached thing to run only once, you could get a simpler setup: a function you include at the top of your template could do the loop above.

