Content parsing based on a set of rules

I would like to implement custom content parsing rules for advanced typography and symbols. All content that matches a set of predefined patterns should be replaced automatically, for instance:

  • turn " - " into " — "
  • turn “CMD” into “⌘”
  • turn "..." into "…"
    etc.

What is the best approach to implement such functionality without breaking the source code?
I would love to create an advanced typography module and share it with the community. Any guidance is appreciated.

Thanks!

Hugo includes the typographer extension, which replaces three hyphens ("---") with the UTF-8 em dash and three plain dots ("...") with the UTF-8 ellipsis. It can also be configured to replace straight quotes with the curly quotes of a specific language (English by default).

It’s also possible to use the function replaceRE on the rendered content.
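As a rough sketch of what this could look like in a template: the snippet below post-processes `.Content` with `replaceRE`, which takes the pattern and the replacement as quoted strings and receives the piped content as its last argument. The file location, the `$content` variable name, and the two patterns are illustrative assumptions only. The replacements run on the rendered HTML string, so they would also touch matches inside code blocks or attributes.

```go-html-template
{{/* e.g. in layouts/_default/single.html (illustrative location) */}}

{{/* Whole-word CMD only; \b is a word boundary, so "CMDS" stays untouched. */}}
{{ $content := .Content | replaceRE `\bCMD\b` "⌘" }}

{{/* "$1" and "$2" refer to the capture groups around the hyphen. */}}
{{ $content = replaceRE `(\S) - (\S)` "$1 — $2" $content }}

{{/* The result is a plain string; safeHTML keeps it from being escaped. */}}
{{ $content | safeHTML }}
```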

Please bear in mind that all UTF-8 characters can be used directly in Markdown. The substitution of ASCII sequences is just a convenience for the most frequent ones.

Thank you for the quick help, Georg! The typographer extension is exactly what was needed.

I can’t wrap my head around using replaceRE, though. I tried {{ .Content | replaceRE CMD "$1" "⌘" }}, but it breaks the code. What confuses me is this: if the rendered content is going to be processed, doesn’t that mean the content has to be processed a second time?

The parameters of replaceRE must be strings, which CMD is not. And you don’t even need REs here, as you’re simply replacing one string with another.
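To make this concrete, here is a minimal sketch: quoting CMD turns it into a string, and since no pattern is involved, the plain `replace` function should be enough. Where the call goes in your layouts is up to you; the placement is not taken from the original post.

```go-html-template
{{/* "CMD" is quoted, so it is a string; replace takes INPUT, OLD, NEW. */}}
{{ replace .Content "CMD" "⌘" | safeHTML }}
```

`.Content` is already the rendered HTML at this point, so the replacement is a string operation during template execution and the Markdown is not rendered again.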

There is a collection of replacements for missing inline markup and a few typographic features in this module.

I’m using them in my upcoming theme: https://perplex.desider.at
