Short codes: api/template-design

I have been working on a PR for this issue: https://github.com/spf13/hugo/issues/565

It is basically a refactoring of when and how short codes are rendered, and it also includes the new syntax introduced in https://github.com/spf13/hugo/issues/480:

 {{< >}}: Typically raw HTML. Will not be processed.
 {{% %}}: Will be processed by the page's markup engine (Markdown or (in future) Asciidoctor)
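
To make the difference between the two forms concrete, here is a minimal sketch that only classifies a shortcode call by its delimiters. The names delimKind, classify and the shortcode name "myshortcode" are made up for illustration; this is not Hugo's code.

```go
package main

import (
	"fmt"
	"strings"
)

// delimKind is a hypothetical type naming the two shortcode forms discussed above.
type delimKind int

const (
	notShortcode    delimKind = iota
	rawShortcode              // {{< ... >}}: output is not processed further
	markupShortcode           // {{% ... %}}: output goes through the page's markup engine
)

// classify looks only at the opening and closing delimiters of s.
func classify(s string) delimKind {
	s = strings.TrimSpace(s)
	switch {
	case strings.HasPrefix(s, "{{<") && strings.HasSuffix(s, ">}}"):
		return rawShortcode
	case strings.HasPrefix(s, "{{%") && strings.HasSuffix(s, "%}}"):
		return markupShortcode
	}
	return notShortcode
}

func main() {
	fmt.Println(classify("{{< myshortcode >}}") == rawShortcode)
	fmt.Println(classify("{{% myshortcode %}}") == markupShortcode)
	fmt.Println(classify("plain text") == notShortcode)
}
```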

Enough introduction, on to the topic of this topic:

The current parsing code for short codes is effective and compact, but kind of hard to read and understand. Making changes to it (even rearranging it) seems like a scary task to me.

So, inspired by Rob Pike’s talk called “Lexical Scanning in Go” (Google it, video and slides are found online — very interesting!), I decided to try to rewrite the scanning/parsing parts related to short codes.
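
For readers who have not seen the talk, the core idea is a lexer built from state functions, where each state returns the next state. Below is a heavily simplified sketch of that idea applied to the {{% %}} form. It is not the code in the PR: all type and function names are invented for this example, it collects items in a slice instead of sending them over a channel as the talk does, and it does not split the inside of a shortcode into name and parameters.

```go
package main

import (
	"fmt"
	"strings"
)

type itemType int

const (
	itemText itemType = iota
	itemLeftDelim
	itemInner
	itemRightDelim
	itemError
)

type item struct {
	typ itemType
	val string
}

// lexer holds the scanning state; names are hypothetical.
type lexer struct {
	input      string
	start, pos int
	items      []item
}

// stateFn is a lexer state that returns the next state (nil means stop).
type stateFn func(*lexer) stateFn

const (
	leftDelim  = "{{%"
	rightDelim = "%}}"
)

// emit records input[start:pos] as an item, skipping empty spans.
func (l *lexer) emit(t itemType) {
	if l.pos > l.start {
		l.items = append(l.items, item{t, l.input[l.start:l.pos]})
	}
	l.start = l.pos
}

// lexText scans plain text up to the next left delimiter.
func lexText(l *lexer) stateFn {
	if i := strings.Index(l.input[l.pos:], leftDelim); i >= 0 {
		l.pos += i
		l.emit(itemText)
		return lexLeftDelim
	}
	l.pos = len(l.input)
	l.emit(itemText)
	return nil
}

func lexLeftDelim(l *lexer) stateFn {
	l.pos += len(leftDelim)
	l.emit(itemLeftDelim)
	return lexInside
}

// lexInside scans to the right delimiter, or reports an error item.
func lexInside(l *lexer) stateFn {
	i := strings.Index(l.input[l.pos:], rightDelim)
	if i < 0 {
		l.items = append(l.items, item{itemError, "unclosed shortcode"})
		return nil
	}
	l.pos += i
	l.emit(itemInner)
	l.pos += len(rightDelim)
	l.emit(itemRightDelim)
	return lexText
}

// lex runs the state machine until a state returns nil.
func lex(input string) []item {
	l := &lexer{input: input}
	for state := stateFn(lexText); state != nil; {
		state = state(l)
	}
	return l.items
}

func main() {
	for _, it := range lex("before {{% sc param %}} after") {
		fmt.Printf("%d %q\n", it.typ, it.val)
	}
}
```

Each state is a small function that can be tested on its own, which is a large part of why this style is easy to follow.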

I thought this would be a hard sell to @spf13 (I have deleted much of his code).

But some simple benchmarks (1000 big docs with lots of different short codes in the {{% form, the only syntax known to all versions; this is a very non-typical document set) show significant improvements in page throughput.

And it’s easier to understand (for me, anyway), easier to test in isolation and gives fairly good error messages.

Two questions (mostly) to @spf13 (you can use this coming PR or ignore it – that’s what the “request” part in pull request means):

  1. I notice the support for both HTML-encoded and non-HTML-encoded quotes. It’s not hard to implement, but I wonder if this is a real requirement or just “nice to have”?

  2. The parser fixes unbalanced quotes in parameters. I see some problems with this if we want to expand on the syntax in the future.

Looking at the Go language and template syntax, I would say a fairly strict parser is a good thing, as long as good error messages are in place. For browsers it makes sense to fix bad HTML, but Hugo users own the source, and they are smart and want to do it according to spec.

This flexibility comes at a cost; it has to be tested and one never knows if it will bite you in the foot in the future when wanting to add some more syntactic sugar.
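
As an illustration of the strict approach, here is a small sketch of a parameter parser that reports an unbalanced quote with its position instead of trying to repair it. The function parseParams is hypothetical and handles only space-separated positional parameters; it is not what Hugo or the PR does.

```go
package main

import (
	"fmt"
	"strings"
)

// parseParams splits s into positional parameters. Quoted parameters may
// contain spaces. An unbalanced or misplaced quote is an error, not
// something the parser silently fixes.
func parseParams(s string) ([]string, error) {
	var params []string
	for i := 0; i < len(s); {
		switch c := s[i]; {
		case c == ' ' || c == '\t':
			i++
		case c == '"':
			end := strings.IndexByte(s[i+1:], '"')
			if end < 0 {
				return nil, fmt.Errorf("unterminated quoted parameter starting at offset %d", i)
			}
			params = append(params, s[i+1:i+1+end])
			i += end + 2 // skip the content and both quotes
		default:
			j := i
			for j < len(s) && s[j] != ' ' && s[j] != '\t' {
				if s[j] == '"' {
					return nil, fmt.Errorf("unexpected quote at offset %d", j)
				}
				j++
			}
			params = append(params, s[i:j])
			i = j
		}
	}
	return params, nil
}

func main() {
	fmt.Println(parseParams(`a "b c" d`))
	fmt.Println(parseParams(`a "b c`))
}
```

The error carries an offset, so the message can point the user at the exact spot to fix.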

It’s great that you’ve decided to clean this up more after working on your short code PR. Rationalizing the interactions between various systems, especially ones that do essentially the same thing, like templating, can be a headache!

I would note that one of the proposed tags, {{%, is one of the Mustache tags for unescaped variables. I don’t know how that will affect your processing.

Perhaps tags or delimiters other than {{ should be used. For a tool I’ve been working on, I just used a simple ‘#’ prefix in front of mine so that I could process my template and leave the embedded Go template code intact. However, ‘#’ was a poor choice and I’ll probably change it on my next rewrite of it.

FYI, I’m writing my Mustache library using Go’s template design as the basis, of which Rob Pike’s lexer is a piece. Since it broadly follows the common lexer/AST parser design, with some unique engineering inside, I would expect it to be fairly performant. The design is also used wherever such functionality is needed within Go’s libraries. Scanners and tokenizers are used too, but for what I am doing the lexer/parser route seems more appropriate.

I just realized that someone shouldn’t be mixing Go templates and Mustache templates so nvm my concern about the delimiters, I’m slow…

Looking forward to seeing your new version.

Those tags never reach the chosen markup engine.

Nothing would make me happier than replacing my code with better code. No ego here at all. My entire goal of this project is to build a great website experience. A lot of the code for Hugo was written to be functional, but with the intent to optimize further later.

I believe only support for one format is critical. Initially we had not quite figured out the right flow for when to process shortcodes, so we wrote the parser flexibly.

As I’ve said on other posts I think the parsing and rendering of the shortcodes should happen independently of the rendering of the other content of the post and then merged as a final step. If we follow that approach we would only need to parse one of them.
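
A rough sketch of that flow, under stated assumptions: shortcodes are pulled out and replaced by placeholders, the remaining content is rendered, and the rendered shortcodes are merged back in as the last step. The placeholder format, the function names and the stand-in renderers are all invented here, only the {{< >}} form is handled, and it assumes the markup engine passes the placeholder text through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// extractShortcodes replaces every {{< ... >}} call with a placeholder and
// returns the rewritten content plus a placeholder-to-call map.
func extractShortcodes(content string) (string, map[string]string) {
	found := make(map[string]string)
	var out strings.Builder
	for n := 0; ; n++ {
		start := strings.Index(content, "{{<")
		if start < 0 {
			break
		}
		end := strings.Index(content[start:], ">}}")
		if end < 0 {
			break // unclosed call: left in place in this sketch
		}
		// Hypothetical placeholder format; the -END suffix keeps -1 and -10 distinct.
		key := fmt.Sprintf("SHORTCODE-PLACEHOLDER-%d-END", n)
		found[key] = strings.TrimSpace(content[start+3 : start+end])
		out.WriteString(content[:start])
		out.WriteString(key)
		content = content[start+end+3:]
	}
	out.WriteString(content)
	return out.String(), found
}

// render runs the markup engine over the content without shortcodes, then
// merges the independently rendered shortcodes in as the final step.
func render(content string, markup, shortcode func(string) string) string {
	body, found := extractShortcodes(content)
	html := markup(body)
	for key, call := range found {
		html = strings.Replace(html, key, shortcode(call), 1)
	}
	return html
}

func main() {
	// Stand-ins for a real markup engine and a real shortcode template.
	markup := func(s string) string { return "<p>" + s + "</p>" }
	shortcode := func(call string) string { return "<div>" + call + "</div>" }
	fmt.Println(render("Hello {{< box >}} world", markup, shortcode))
}
```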

Yes, that was the main task I set out to implement (handling the rendering independently). OK, then I’ll leave it as I have it now.

I have added a PR here: https://github.com/spf13/hugo/pull/586

Note the open, interesting and related discussion in https://github.com/spf13/hugo/issues/480; it could use more heads thinking.