Stripping footnotes from Summary

I’ve been using {{ .Summary }} to render either the auto-generated X-character summary or my manually split summary. However, it looks like the footnote markers are still included even though the footnotes themselves are not, leaving an unclickable [^1] in the summary.

This post mentions the same problem and that solving it may not be simple.

That said, what would be involved in stripping out the footnote tags e.g. [^1] when displaying .Summary?

The other alternative I’m considering is adding post-summary = "<p>stuff</p>" to the front matter, but since that needs markup and duplicates content, I’d prefer to use it only when I want a substantially different summary of a post.

Is this something .Summary should handle, or should I suck it up and use replaceRE on Summary to remove the footnote tags?

This should be fixed – and I suspect https://github.com/spf13/hugo/pull/2303 does – I haven’t tested it, though.

Seems to still be the problem in 2022.

I think you have to:

  • Split the summary manually using the <!--more--> tag, OR
  • Define the summary in front matter, OR
  • Use replaceRE to strip the tags

https://gohugo.io/content-management/summaries/#summary-splitting-options


Yes, these options are there (although the <!--more--> tag wouldn’t help much when the footnote is in the first sentence), but this is a clear enough case that it should, IMO, be considered a bug.

I mean, I don’t believe there’s a use case where that rogue “1” at the end of a word in the summary is actually desired. Hugo should strip footnotes automatically when generating .Summary, both when autosplitting and when <!--more--> is used. A front matter-defined summary might be an exception, but are there really any users who place a footnote there?

Actually, thinking more about this, can you provide an example?

With automatic summaries, HTML tags are stripped.
https://github.com/gohugoio/hugo/issues/8910#issuecomment-903158600

Oh, the tags are stripped alright, which actually makes the case worse.

You see, a footnote[1] is rendered as a link containing some text. In the case of Discourse it is replaced with some JS magic, but in Hugo, it would be just 1, wrapped in <a> and <sup>.

The Hugo automatic summary process (and likewise the <!--more--> split) strips those tags. The resulting summary looks like:

You see, a footnote1 is rendered as a link containing…

And since that “1” there is no longer wrapped in any tags, there’s no good way to replaceRE it. Definitely a bug IMO.
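
To illustrate the point, here is a rough sketch in Go. The tag-stripping regex is only a crude stand-in for whatever Hugo does internally (an assumption, not Hugo's actual code), and the helper name stripTags and the sample HTML are made up for the demo. Once the tags are gone, the footnote number is indistinguishable from ordinary text.

```go
package main

import (
	"fmt"
	"regexp"
)

// anyTag is a crude stand-in for tag stripping; it is NOT how Hugo does it.
var anyTag = regexp.MustCompile(`<[^>]*>`)

// stripTags (hypothetical helper) removes everything that looks like an HTML tag.
func stripTags(html string) string {
	return anyTag.ReplaceAllString(html, "")
}

func main() {
	// Sample markup in the shape Goldmark emits for a footnote reference.
	html := `<p>You see, a footnote<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> is rendered as a link.</p>`

	// The "1" survives with nothing left around it to match on.
	fmt.Println(stripTags(html))
}
```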


  1. like this one, yep. ↩︎

I understand.

This markdown:

A footnote[^1] here. A superscript<sup>®</sup> after it.

Is rendered by Goldmark as:

<p>A footnote<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> here. A superscript<sup>®</sup> after it.</p>

So the regex to strip it out is not simple.

There’s been discussion in the past related to using bluemonday to sanitize HTML, and it has a regex option. We might want to look at that before spending more time on what we have now, which is a little bit broken.

Yep, I wouldn’t want to try doing this with regex. Of course, in HTML terms the task is pretty simple: completely wipe out any node that has id prefixed with “fnref:” and anything inside it.

Of course, this needs to be done before the tags are stripped in the process of .Summary generation, which is AFAIK not something a user can currently do.

Do you want me to open a GitHub issue to document this?

Sure.

In the meantime…

{{ .RawContent | replaceRE `\[\^\d+\]` "" | .Page.RenderString | plainify | truncate 250 }}
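
Since replaceRE uses Go's regular expression syntax, the pattern in that one-liner can be tried outside a template. A minimal sketch, with a made-up helper name (stripFootnoteMarkers) and made-up sample strings. Two things to keep in mind: the pattern only matches numeric markers, and it would also strip the [^1] from a footnote definition further down the raw content, leaving the ": text" part behind (usually past the truncation point, but worth knowing).

```go
package main

import (
	"fmt"
	"regexp"
)

// footnoteMarker matches numeric Markdown footnote markers such as [^1].
var footnoteMarker = regexp.MustCompile(`\[\^\d+\]`)

// stripFootnoteMarkers (hypothetical helper) removes those markers from raw Markdown.
func stripFootnoteMarkers(markdown string) string {
	return footnoteMarker.ReplaceAllString(markdown, "")
}

func main() {
	// Numeric marker: removed.
	fmt.Println(stripFootnoteMarkers("A footnote[^1] here."))

	// Named marker: not matched by \d+, so it stays.
	fmt.Println(stripFootnoteMarkers("A named one[^like-this] stays."))
}
```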

Yeah, who needs those .Summary crutches anyway!

The GitHub issue 10503 above was closed, so I’m adding this here.

I’m using this instead of {{ .Summary }} to strip the footnote from summaries:

{{ replaceRE `<sup id="fnref:"*.+</sup>` "" .Summary | safeHTML }}

It matches substrings beginning with <sup id="fnref:" and ending with </sup>, and replaces it with nothing.

For the example above, this regex should result in this:

<p>A footnote here. A superscript<sup>®</sup> after it.</p>

Your regex does not work the way you intended. In fact it matches

  1. <sup id="fnref: - text
  2. "* - any number of double quotes (even zero)
  3. .+ - sequence of characters (at least one, longest match)
  4. </sup> - text

which will slurp up everything between the first <sup id="fnref: and the last </sup>, and return <p>A footnote after it.</p>.
It will also swallow any text between the first and second footnote.

try it out: Summary footnote - wrong

I guess that, assuming:

  • the footnote identifier is a number
  • the footnote (incl. end tag) is completely available in the summary

You want this one: <sup id="fnref:\d+".*?</sup>, which will match:

  1. <sup id="fnref: - text
  2. \d+ - a number of digits
  3. " - a double quote
  4. .*? - any number of characters (even zero, shortest match)
  5. </sup> - text

try it out: Hugo summary regex (working)
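
For anyone who wants to check both patterns locally, here is a small sketch in Go (same regex engine as replaceRE), run against the Goldmark output quoted earlier in the thread. The helper name strip and the variable names are just for this demo.

```go
package main

import (
	"fmt"
	"regexp"
)

// rendered is the Goldmark output quoted earlier in the thread.
const rendered = `<p>A footnote<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> here. A superscript<sup>®</sup> after it.</p>`

var (
	// greedy is the first attempt: .+ runs on to the LAST </sup>.
	greedy = regexp.MustCompile(`<sup id="fnref:"*.+</sup>`)
	// lazy is the corrected pattern: .*? stops at the FIRST </sup>.
	lazy = regexp.MustCompile(`<sup id="fnref:\d+".*?</sup>`)
)

// strip (hypothetical helper) deletes every match of re from html.
func strip(re *regexp.Regexp, html string) string {
	return re.ReplaceAllString(html, "")
}

func main() {
	fmt.Println(strip(greedy, rendered)) // <p>A footnote after it.</p>
	fmt.Println(strip(lazy, rendered))   // <p>A footnote here. A superscript<sup>®</sup> after it.</p>
}
```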

P.S. I remember there were lots of changes regarding summaries, so maybe it’s not relevant anymore.

maybe a candidate to archive

Thank you for the correction. It “worked” on my site (single footnote instances within summaries) but I clearly didn’t test it on the example string above.

If anybody plans to use that regex, I don’t believe it’ll work if you’re using named footnotes[^like-this], but this should work.[^1]
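
One possible way to cover non-numeric ids, sketched in Go: replace \d+ with [^"]+ so anything up to the closing quote is accepted. This is only a sketch under assumptions. The sample HTML with id="fnref:like-this" is hypothetical (whether the rendered id carries the footnote name or a number depends on the Markdown renderer), and the helper name stripFootnoteRefs is made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// footnoteRef accepts any non-empty id suffix after "fnref:", numeric or named,
// and stops at the first closing </sup>.
var footnoteRef = regexp.MustCompile(`<sup id="fnref:[^"]+".*?</sup>`)

// stripFootnoteRefs (hypothetical helper) removes rendered footnote references.
func stripFootnoteRefs(html string) string {
	return footnoteRef.ReplaceAllString(html, "")
}

func main() {
	// Hypothetical rendering of a named footnote; the id format is an assumption.
	named := `<p>Named<sup id="fnref:like-this"><a href="#fn:like-this">1</a></sup> and plain<sup>®</sup>.</p>`
	fmt.Println(stripFootnoteRefs(named))
}
```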