Subtleties of .Plain and .PlainWords

I’m curious what the inaccuracies are, and I suggest that the full expected behavior of these methods be documented clearly and completely.

Using these methods on HTML output seems to have at least the described effect in the commit, but I’m not aware of the other effects or when using JSON.

@jmooring

Overview

To simplify this discussion and its examples, we will:

  1. Disable Goldmark’s typographer extension
  2. Render a page using two output formats, HTML and plain text
  3. Examine the files that Hugo creates, not how they appear in a browser

In support of #3, we will use screen captures instead of fenced code blocks in this post.

Configuration

[markup.goldmark.extensions]
typographer = false

[outputFormats.TXT]
isPlainText = true   # use pkg.go.dev/text/template
mediaType = 'text/plain'

[outputs]
home = ['HTML','TXT']

Layouts

layouts/
└── _default/
    ├── index.html  --> uses pkg.go.dev/html/template
    └── index.txt   --> uses pkg.go.dev/text/template

The layouts are identical:

image

Markdown

Regardless of output format, this markdown will be rendered to HTML by Goldmark.
Notice that the last two lines are different.

image

Result

Rendered by layouts/_default/index.txt

This is the TXT output format.

Comments:

  • The symbols emitted as template pipelines are untouched.
  • The value of .Plain is exactly what Goldmark produced after the HTML tags are stripped.
  • The left double quotation mark, right double quotation mark, and right single quotation mark are untouched as they pass through Goldmark, and then through Go’s text/template package.

Rendered by layouts/_default/index.html

This is the HTML output format.

Comments:

  • The symbols emitted as template pipelines are sanitized (escaped) by Go’s html/template package.
  • The value of .Plain is exactly what Goldmark produced after the HTML tags are stripped and it is sanitized by Go’s html/template package.
  • The left double quotation mark, right double quotation mark, and right single quotation mark are untouched as they pass through Goldmark, and then through Go’s html/template package.

Summary

How you use .Plain in your templates depends on the output format, specifically the value of the isPlainText property, which controls whether we use pkg.go.dev/text/template or pkg.go.dev/html/template.
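The difference can be reproduced outside Hugo with a minimal Go sketch. Everything here is an assumption made for illustration: the template source, the helper names renderText and renderHTML, and the sample value, which stands in for tag-stripped Goldmark output that still contains an entity. The "Plain" key is only a map key, not Hugo's page method.

```go
package main

import (
	"fmt"
	htmltemplate "html/template"
	"strings"
	texttemplate "text/template"
)

// tmplSrc is a hypothetical stand-in for the two identical layouts:
// one symbol emitted by a template pipeline, followed by the value.
const tmplSrc = `{{ "<" }} {{ .Plain }}`

// renderText executes the template with text/template (isPlainText = true).
func renderText(plain string) (string, error) {
	t, err := texttemplate.New("txt").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	err = t.Execute(&b, map[string]string{"Plain": plain})
	return b.String(), err
}

// renderHTML executes the same template with html/template (isPlainText = false).
func renderHTML(plain string) (string, error) {
	t, err := htmltemplate.New("html").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	err = t.Execute(&b, map[string]string{"Plain": plain})
	return b.String(), err
}

func main() {
	// Assumed sample: an entity left over from rendering, plus curly quotes.
	plain := "Tom &amp; Jerry “quoted”"

	txt, err := renderText(plain)
	if err != nil {
		panic(err)
	}
	htm, err := renderHTML(plain)
	if err != nil {
		panic(err)
	}

	fmt.Println(txt) // < Tom &amp; Jerry “quoted”
	fmt.Println(htm) // &lt; Tom &amp;amp; Jerry “quoted”
}
```

With text/template the value passes through unchanged. With html/template the ampersand of the existing entity is escaped again, while the curly quotation marks are left alone in both cases.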

You may wish to pipe .Plain through the htmlUnescape function depending on what you are trying to achieve.
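As a rough illustration of what unescaping does to such a value, here is a small Go sketch that uses the standard library's html.UnescapeString. I am assuming this is comparable to what Hugo's htmlUnescape does; the helper name unescapePlain and the sample string are hypothetical.

```go
package main

import (
	"fmt"
	"html"
)

// unescapePlain is a hypothetical helper that converts HTML entities
// back to the characters they represent.
func unescapePlain(s string) string {
	return html.UnescapeString(s)
}

func main() {
	// Assumed sample: tag-stripped text that still contains entities.
	plain := "Tom &amp; Jerry &lt;3 “quotes”"

	fmt.Println(unescapePlain(plain)) // Tom & Jerry <3 “quotes”
}
```

In an HTML output format the unescaped value is still escaped once by html/template when it is written, so the unescape step mainly avoids double escaping. In a plain text output format it yields the literal characters.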

Testing

Feel free to try it yourself.

git clone --single-branch -b hugo-forum-topic-41335 https://github.com/jmooring/hugo-testing hugo-forum-topic-41335
cd hugo-forum-topic-41335
hugo --quiet && cat public/index.txt public/index.html

So I see that the TXT and HTML output formats do not produce exactly the same thing from .Plain, but the docs currently mention none of this for any output format. What you’ve written here, as an expansion and clarification (replacement) of what I originally wrote in the PR, would solve the issue I was trying to solve. When I was playing with those methods there was no context for what to expect, so I got very unexpected results because the normal behaviour wasn’t described. You’ve described it clearly and in depth, so yes, this should be what the PR consists of. At the very least, the docs for .Plain and .PlainWords should link to this thread, with a backreference in the htmlUnescape docs.

This is about as much as I am prepared to add to the description of .Plain

https://github.com/gohugoio/hugoDocs/pull/1870

The description of .PlainWords references .Plain so I do not see a need to duplicate the information.

Finally, a description of the difference between output formats (text/template vs. html/template) belongs elsewhere, possibly near the description of the isPlainText property on this page.