How I Extended Markdown for Hugo


#1

I extended Markdown beyond what is provided by Hugo’s various standard rendering options. My requirements were:

  • Be Markdown-flavored. Use generic Hugo with No Shortcodes
  • Automatically increment Heading Levels
  • Styled Blocks
  • Easy Media Insertion
  • Font Awesome Icons Insertion
  • Easy MathJax formatting

I achieved my goals using a layout partial template that is called in place of .Content. Below, I describe each of these features, discuss my concerns with this approach, and provide the code. I am very interested in your thoughts on this approach.

Be Markdown-flavored. Use generic Hugo with No Shortcodes

Requirement: Use only standard Markdown or new syntax that is inspired by the ideals of Markdown.

Rationale: As much as possible, writers should use John Gruber’s original Markdown. However, while Markdown covers the basics of writing for the web, it does not cover the needs of all writers. Hugo, via Blackfriday, supports various extensions to Markdown such as tables and definition lists. Still, there are many extensions that Blackfriday does not currently support. Additionally, one of the inspiring aspects of Markdown has been the ability of each writer or community to adapt it to their own needs.

The Hugo community’s approach to extending Markdown support has been to allow the use of a handful of external formats or shortcodes:

  • External formats require additional installation and configuration, and provide limited additional flexibility.
  • Shortcodes, while very powerful, are a “markup” syntax and are decidedly not in the spirit of Markdown.
  • While shortcodes can contain Markdown, they unfortunately lose the context of the current page, so features such as reference links break.

Finally, this should be done using a generic standard build of Hugo. It should work anywhere: your local machine, a server, a service such as netlify.com.

Increment Headings

Requirement: Increment all headings by one level, so that the Markdown # My Heading is rendered as <h2>My Heading</h2>.

Rationale: Content headings should be treated as relative to their context. The page front matter title becomes the <h1> heading, and the content headings sit under it. For simplicity for writers and compatibility with editing tools, Markdown content headings should start with # and be incremented by the renderer.
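Hugo's replaceRE is built on Go's regexp package, so the heading rule can be tried outside Hugo. The following is a minimal sketch, not the partial itself: the function name incrementHeadings is made up for illustration, and it mirrors the newline-padding trick used in the template further down.

```go
package main

import (
	"fmt"
	"regexp"
)

// headingRE matches a newline, a run of '#' characters, and one space.
var headingRE = regexp.MustCompile(`\n(#+) `)

// incrementHeadings (hypothetical helper) pads the text with newlines so a
// heading on the first line can match, then adds one '#' to every marker.
func incrementHeadings(md string) string {
	padded := "\n" + md + "\n"
	return headingRE.ReplaceAllString(padded, "\n#${1} ")
}

func main() {
	fmt.Print(incrementHeadings("# My Heading\n\nText with a # in it.\n\n## Details"))
}
```

A '#' in the middle of a line is left alone because the pattern requires a preceding newline. A line starting with '#' inside a code block would still be rewritten, which is one source of the code fence conflict mentioned under the issues.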

Styled Blocks

Requirement: Add Markdown syntax to provide styled <div> blocks. The content of these blocks will be rendered as Markdown. For example:

::: alert red
This is a **Red Alert!** Take cover.
:::

results in:

<div class="alert red">
<p>This is a <strong>Red Alert!</strong> Take cover.</p>
</div>

Rationale: Plain Markdown lacks options for anything except very basic formatting. This syntax provides almost unlimited flexibility for adding content-related container layout. The ::: syntax is taken from markdown-it-container and has been discussed as a potential extension to CommonMark.
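The styled blocks work in two passes: the ::: lines are swapped for placeholder paragraphs before Markdown rendering, and the rendered placeholders are swapped for <div> tags afterwards. A rough Go sketch of both passes follows. The helper names markBlocks and unmarkBlocks are assumptions for illustration, and the Markdown rendering step in between is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	closeRE     = regexp.MustCompile(`\n::: *\n`)
	openRE      = regexp.MustCompile(`\n::: ([a-z0-9 ]*)\n`)
	closeHTMLRE = regexp.MustCompile(`<p>«MDBLOCK-END»</p>`)
	openHTMLRE  = regexp.MustCompile(`<p>«MDBLOCK-class-([^»]*)»</p>`)
)

// markBlocks runs before Markdown rendering. Closing markers are replaced
// first, so the opening pattern never sees a bare ":::" line.
func markBlocks(md string) string {
	md = closeRE.ReplaceAllString(md, "\n\n«MDBLOCK-END»\n\n")
	return openRE.ReplaceAllString(md, "\n\n«MDBLOCK-class-${1}»\n\n")
}

// unmarkBlocks runs on the rendered HTML and turns the placeholder
// paragraphs into opening and closing <div> tags.
func unmarkBlocks(html string) string {
	html = closeHTMLRE.ReplaceAllString(html, "</div>")
	return openHTMLRE.ReplaceAllString(html, "<div class='${1}'>")
}

func main() {
	fmt.Print(markBlocks("\n::: alert red\nThis is a **Red Alert!**\n:::\n"))
	fmt.Println(unmarkBlocks("<p>«MDBLOCK-class-alert red»</p>\n<p>Hi</p>\n<p>«MDBLOCK-END»</p>"))
}
```

The placeholders sit in their own paragraphs so that the renderer wraps them in <p> tags that are easy to find again.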

Easy Media Insertion

Requirement: recognize common embeddable media content

> https://vimeo.com/1234567

results in:

<div class='embed video-player'><iframe src='https://player.vimeo.com/video/1234567' width='100%' height='385' frameborder='0' webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></div>

Rationale: Make media insertion easy for writers with minimal extra syntax. In fact, we leverage the standard Markdown blockquote, so it works nicely in Markdown-aware editors. Vimeo and YouTube are currently supported.
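As a rough illustration of the Vimeo rule, here is the same idea in plain Go. The names embedVimeo and vimeoEmbed are hypothetical, and the pattern uses [0-9]+ rather than [0-9]* so that a URL without a numeric ID is left alone; otherwise it follows the expression used in the partial below.

```go
package main

import (
	"fmt"
	"regexp"
)

// vimeoRE matches a blockquote line that contains only a Vimeo URL.
var vimeoRE = regexp.MustCompile(`\n> https?://vimeo[.]com/([0-9]+) *\n`)

// vimeoEmbed is the replacement; ${1} is the captured video ID.
const vimeoEmbed = "\n\n<div class='embed video-player'><iframe src='https://player.vimeo.com/video/${1}' width='100%' height='385' frameborder='0' webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></div>\n\n"

// embedVimeo (hypothetical helper) swaps matching blockquote lines for the
// player markup and leaves every other blockquote untouched.
func embedVimeo(md string) string {
	return vimeoRE.ReplaceAllString(md, vimeoEmbed)
}

func main() {
	fmt.Print(embedVimeo("\n> https://vimeo.com/1234567\n"))
	fmt.Print(embedVimeo("\n> An ordinary quotation.\n"))
}
```

Because the URL must be the only thing on the blockquote line, a quotation that merely mentions a Vimeo link in a sentence stays a normal blockquote.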

Font Awesome icons

Allow the insertion of Font Awesome icons. For example :fa-camera-retro: will result in:

<i class="fa fa-camera-retro" aria-hidden="true"></i>

You can add additional Font Awesome or other classes. For example, :fa-camera-retro fa-3x text-danger: will result in:

<i class="fa fa-camera-retro fa-3x text-danger" aria-hidden="true"></i>

Any required CSS or JS is automatically added to the page, with no need to add it site-wide or in the front matter.
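A small Go sketch of the icon rule and of the "only load the assets when needed" check. The helper names insertIcons and needsFontAwesome are invented for this example; the pattern is the one from the partial, where the leading capture group keeps an icon directly after a backtick from being converted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// iconRE captures the character before the icon (anything but a backtick)
// and the class list between the colons.
var iconRE = regexp.MustCompile("([^`]):(fa-[a-z0-9 -]*):")

// insertIcons (hypothetical helper) re-emits the captured leading character
// and replaces the :fa-...: token with an <i> element.
func insertIcons(md string) string {
	return iconRE.ReplaceAllString(md, "${1}<i class='fa ${2}' aria-hidden='true'></i>")
}

// needsFontAwesome reports whether the output contains at least one icon,
// which is the condition for adding the Font Awesome script to the page.
func needsFontAwesome(html string) bool {
	return strings.Contains(html, "<i class='fa fa-")
}

func main() {
	out := insertIcons("Photo :fa-camera-retro fa-3x text-danger: here")
	fmt.Println(out)
	fmt.Println(needsFontAwesome(out))
}
```

Since the pattern needs one character in front of the token, an icon at the very start of the text only matches because the partial pads the content with a leading newline.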

Easy MathJax formatting

Requirement: support for MathJax that is friendly to writers

Recognize math inline

`$ x^2 $`

and block

$$ x^2 $$

and apply the appropriate tags and JavaScript.

Any required CSS or JS is automatically added to the page, with no need to add it site-wide or in the front matter.
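The display-math rule can be sketched in Go as follows. The helper names wrapDisplayMath and needsMathJax are assumptions for illustration. The point to note is that "$" is special in a Go replacement string, so the pattern writes it as [$] and the replacement refers to the capture as ${1}; the captured math itself is copied through unchanged.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// displayMathRE matches a line that starts with $$ and ends with $$.
	displayMathRE = regexp.MustCompile(`\n([$][$].*[$][$])`)
	// mathMarkerRE finds the wrapper elements added by the math rules.
	mathMarkerRE = regexp.MustCompile(`<(div|span) class='mathFormat'>`)
)

// wrapDisplayMath (hypothetical helper) wraps each display formula in a div
// so the page can later detect that MathJax is required.
func wrapDisplayMath(md string) string {
	return displayMathRE.ReplaceAllString(md, "\n\n<div class='mathFormat'>${1}</div>\n\n")
}

// needsMathJax reports whether any math wrapper is present in the output.
func needsMathJax(html string) bool {
	return mathMarkerRE.MatchString(html)
}

func main() {
	out := wrapDisplayMath("\n$$ x^2 $$\n")
	fmt.Print(out)
	fmt.Println(needsMathJax(out))
}
```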

Issues with this approach

This is a hack to force Hugo to do things it is not set up to do.

  • Code Fences
    I want to be able to use Code Fences; however, my approach conflicts with them. Still working on it…

  • No Shortcodes Allowed
    Shortcodes are not processed by .RawContent or markdownify. While we are intentionally avoiding the common uses of Hugo’s shortcodes, this approach prevents you from using them in content for any reason. For my use cases, I don’t see this as an issue. However, I would prefer not to disable shortcodes entirely.

  • The regular expressions in Go are a little limited. By default, ^ and $
    match only at the start and end of the whole text (per-line matching needs
    the (?m) flag), so I am using \n to look for line breaks. There is no
    support for negative lookahead, which makes it difficult to catch cases
    such as escape codes. Some of this may be my lack of knowledge of the Go
    language.

  • Forces the use of safeHTML
    I am not entirely sure whether this introduces a safety issue, since it requires trusting the content. Ideally Hugo will implement a sanitizeHTML function.
    Need to track: https://github.com/spf13/hugo/issues/1457

  • Need to replace all instances of .Content in templates with my layout partial template.

  • How much of a performance hit is this? Should I be using partialCached? (Remember this is being applied to every page on the site.)
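On the regular-expression point in the list above: Go's regexp package does accept a (?m) flag that makes ^ and $ match at line boundaries, and the missing lookaround can often be worked around by capturing the neighbouring character and writing it back. The sketch below shows both ideas. The helper names, and the choice of a backslash as the escape character, are assumptions for illustration and are not part of the partial.

```go
package main

import (
	"fmt"
	"regexp"
)

// closeLineRE uses (?m) so that ^ and $ match at the start and end of each
// line, which removes the need to match "\n" explicitly.
var closeLineRE = regexp.MustCompile(`(?m)^::: *$`)

// unescapedIconRE stands in for a negative lookbehind: group 1 is either the
// start of the text or one character that is not a backslash.
var unescapedIconRE = regexp.MustCompile(`(^|[^\\]):(fa-[a-z0-9-]+):`)

// replaceClosers swaps every bare ":::" line for the end placeholder.
func replaceClosers(md string) string {
	return closeLineRE.ReplaceAllString(md, "«MDBLOCK-END»")
}

// renderIcons converts :fa-name: tokens unless a backslash precedes them.
func renderIcons(md string) string {
	return unescapedIconRE.ReplaceAllString(md, "${1}<i class='fa ${2}' aria-hidden='true'></i>")
}

func main() {
	fmt.Println(replaceClosers("text\n:::\nmore"))
	fmt.Println(renderIcons(`see :fa-camera: but not \:fa-camera:`))
}
```

The same (?m) prefix can be used inside the pattern string passed to replaceRE, since Hugo hands the pattern to Go's regexp package.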

How to use

In your templates, replace {{ .Content }} with a call to the partial, {{ partial "contentMarkdownCustom.html" . }}. Create the partial template layouts/partials/contentMarkdownCustom.html and add this code:

{{ $s := .RawContent }}
{{ $s :=  delimit (slice "\n" $s "\n") "" }}

{{ $s :=  $s | replaceRE "\n(#+) " "\n#$1 " }}

{{ $s :=  $s | replaceRE "\n::: *\n" "\n\n«MDBLOCK-END»\n\n" }}
{{ $s :=  $s | replaceRE "\n::: ([a-z0-9 ]*)\n" "\n\n«MDBLOCK-class-$1»\n\n" }}

{{ $s :=  $s | replaceRE "\n> https?://vimeo[.]com/([0-9]*) *\n" "\n\n<div class='embed video-player'><iframe src='https://player.vimeo.com/video/$1' width='100%' height='385' frameborder='0' webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></div>\n\n" }}
{{ $s :=  $s | replaceRE "\n> https?://youtu[.]be/([a-zA-Z0-9_-]*)" "\n\n<div class='embed video-player'><iframe class='youtube-player' type='text/html' width='640' height='385' src='https://www.youtube.com/embed/$1' allowfullscreen frameborder='0'></iframe></div>\n\n" }}

{{ $s :=  $s | replaceRE "\n([$][$].*[$][$])" "\n\n<div class='mathFormat'>$1</div>\n\n"}}
{{ $s :=  $s | replaceRE "\n[`]([$].*[$])[`]" "\n\n<span class='mathFormat'>$1</span>\n\n"}}

{{ $s :=  $s | replaceRE "([^`]):(fa-[a-z0-9 -]*):" "$1<i class='fa $2' aria-hidden='true'></i>" }}

{{ $s :=  $s | markdownify }}
{{ $s :=  $s | replaceRE "<p>«MDBLOCK-END»</p>" "</div>"  }}
{{ $s :=  $s | replaceRE "<p>«MDBLOCK-class-([^»]*)»</p>" "<div class='$1' >" }}

{{ $s :=  $s | safeHTML }}
{{- $s -}}


{{ $hasMathFormat := findRE "<(div|span) class='mathFormat'>" $s }}
{{ if ge (len $hasMathFormat) 1 }}
    <script type="text/javascript"
      src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
    </script>
    <script type="text/x-mathjax-config">
    MathJax.Hub.Config({
      tex2jax: {
        inlineMath: [['$','$'], ['\\(','\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']],
        processEscapes: true,
        processEnvironments: true,
        skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
      },
      TeX: { equationNumbers: { autoNumber: "AMS" },
             extensions: ["AMSmath.js", "AMSsymbols.js"] }
    });
    </script>
    <script type="text/x-mathjax-config">
      MathJax.Hub.Queue(function() {
        // Fix <code> tags after MathJax finishes running. This is a
        // hack to overcome a shortcoming of Markdown. Discussion at
        // https://github.com/mojombo/jekyll/issues/199
        var all = MathJax.Hub.getAllJax(), i;
        for(i = 0; i < all.length; i += 1) {
            all[i].SourceElement().parentNode.className += ' has-jax';
        }
    });
    </script>
{{ end }}


{{ $hasFontAwesome := findRE "<i class='fa fa-" $s }}
{{ if ge (len $hasFontAwesome) 1 }}
        <script src="https://use.fontawesome.com/53dd1f1274.js"></script>
{{ end }}
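On the safeHTML concern: in Go's html/template, which Hugo builds on, a plain string is escaped on output, while a value of type template.HTML is written as-is. safeHTML converts the string to that type, so whatever the content files contain reaches the page unescaped. The sketch below shows the difference in plain Go. The helper names are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// page prints its input, applying html/template's contextual escaping.
var page = template.Must(template.New("page").Parse("{{ . }}"))

func render(v interface{}) string {
	var buf bytes.Buffer
	if err := page.Execute(&buf, v); err != nil {
		panic(err)
	}
	return buf.String()
}

// renderEscaped passes a plain string, so markup is escaped.
func renderEscaped(s string) string { return render(s) }

// renderTrusted marks the string as template.HTML, so it is emitted
// verbatim, including any script tags it happens to contain.
func renderTrusted(s string) string { return render(template.HTML(s)) }

func main() {
	fmt.Println(renderEscaped("<b>hi</b>"))
	fmt.Println(renderTrusted("<b>hi</b>"))
}
```

In other words, the approach is only as safe as the Markdown files are trusted, which is normally fine for a site whose content you write yourself.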

#2

I would suspect the answer to that would be BIG.


#3

Any thoughts on using partialCached for this?

My sites are still too small for the performance impact to be bothersome, so I haven’t worked on any more optimization, but it is on my to-do list.


#4

Well, when I say BIG it is all relative. With small sites, it should be negligible, and compared to the rest of the bunch (Jekyll and friends) it would still be very fast.


#5

Interesting. I’m working on something similar (but a lot more limited) to make images turn into <figure> elements with the alt text in a <figcaption>.

In the single.html template, .Content is run through replaceRE (and safeHTML):

{{ .Content | replaceRE "<p><img src=\"([^\"]+)\" alt=\"([^\"]+)\" /></p>" "<figure><img src=\"$1\" alt=\"$2\"><figcaption>$2</figcaption></figure>" | safeHTML }}

Seems to work nicely and keeps the Markdown files clean (which I want because I also run them through Pandoc to make PDFs).
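For anyone who wants to test the figure rewrite outside Hugo, the same expression can be run with Go's regexp package. This is only a sketch and the function name figures is hypothetical. It also shows that an image with empty alt text is left as a plain paragraph, because the pattern requires at least one character in the alt attribute.

```go
package main

import (
	"fmt"
	"regexp"
)

// figureRE matches a paragraph that contains only an image with alt text.
var figureRE = regexp.MustCompile(`<p><img src="([^"]+)" alt="([^"]+)" /></p>`)

// figures (hypothetical helper) rewrites matching paragraphs into <figure>
// elements and reuses the alt text as the caption.
func figures(html string) string {
	return figureRE.ReplaceAllString(html,
		`<figure><img src="${1}" alt="${2}"><figcaption>${2}</figcaption></figure>`)
}

func main() {
	fmt.Println(figures(`<p><img src="cat.png" alt="A cat" /></p>`))
	fmt.Println(figures(`<p><img src="cat.png" alt="" /></p>`))
}
```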

