Possible bug: dividers such as <!--more--> at the start of a line eat up the rest of it on rendering

[EDIT] Hugo v0.146.3

Context: I wrote an “extract” partial, which works similarly to a summary except that it uses two <!--extract--> dividers (and falls back to .Summary if it does not find them). This works (it properly extracts the content between the dividers), but in the course of testing it, I noticed that sometimes the full .Content was missing the extract, and this led me to test dividers without even using my partial.
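
To make the idea concrete, here is a minimal Go sketch of the extract logic described above. It is not the actual partial (which is a Hugo template); the function name extractBetween and the fallback parameter standing in for .Summary are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// extractBetween returns the text between the first two occurrences of
// divider in content. If fewer than two dividers are present, it returns
// fallback, which stands in for Hugo's .Summary here (an assumption).
func extractBetween(content, divider, fallback string) string {
	// Drop everything up to and including the first divider.
	_, rest, found := strings.Cut(content, divider)
	if !found {
		return fallback
	}
	// Keep everything before the second divider.
	extract, _, found := strings.Cut(rest, divider)
	if !found {
		return fallback
	}
	return strings.TrimSpace(extract)
}

func main() {
	const divider = "<!--extract-->"
	content := "Intro. <!--extract-->The interesting part.<!--extract--> The rest."

	fmt.Println(extractBetween(content, divider, "summary"))
	fmt.Println(extractBetween("No dividers here.", divider, "summary"))
}
```

The sketch works on a plain string, so it is independent of how the dividers end up being rendered.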

Hello,

I think I’ve come across a weird bug with dividers (e.g. annotations of the form <!--something-->).

The following markdown content showcases this bug:

Test 1 begin
Lines with <!--starting-->a pair of tags<!--ending--> in the middle are rendered OK.
Test 1 end

Test 2 begin
<!--starting-->Lines with a tag at the beginning and another in the middle<!--ending--> are rendered EMPTY.
Test 2 end

Test 3 begin
Lines with <!--starting-->a tag in the middle and another at the end are rendered OK.<!--ending-->
Test 3 end

Test 4 begin
<!--starting-->Lines with a pair of tags at the beginning and end are rendered EMPTY.<!--ending-->
Test 4 end

Test 5 begin
<!--starting-->Lines with a single tag at the beginning are rendered EMPTY.
Test 5 end

This content is rendered as:

Test 1 begin Lines with a pair of tags in the middle are rendered OK. Test 1 end

Test 2 begin

Test 2 end

Test 3 begin Lines with a tag in the middle and another at the end are rendered OK. Test 3 end

Test 4 begin

Test 4 end

Test 5 begin

Test 5 end

End of tests

When the line does not start with a divider, then the dividers are correctly removed before rendering, and the result is correctly integrated into the paragraph beginning with “Test N begin” and ending with “Test N end”.

But when the line starts with a divider, it is as though the whole line is considered empty for rendering purposes, which causes the preceding “Test N begin” to be considered an isolated paragraph, and the following “Test N end” as well.

I haven’t found anything in the documentation that forbids dividers within a line, and indeed, as the tests above show, the current Hugo code partially supports dividers within a line.

I would therefore tend to consider it a bug that a divider at the start of a line eats up said line, and I would start working on a fix, but before that I’d appreciate it if an active Hugo dev could confirm that it is indeed a bug.

Regards,
Albert.

Some notes…

Minimal example

1<!--x-->2<!--x-->3

<!--x-->4<!--x-->5

6<!--x-->7<!--x-->

<!--x-->8<!--x-->

<!--x-->9

CommonMark

https://spec.commonmark.org/dingus/?text=1<!--x-->2<!--x-->3 <!--x-->4<!--x-->5 6<!--x-->7<!--x--> <!--x-->8<!--x--> <!--x-->9

<p>1<!--x-->2<!--x-->3</p>
<!--x-->4<!--x-->5
<p>6<!--x-->7<!--x--></p>
<!--x-->8<!--x-->
<!--x-->9

Goldmark (unsafe)

main.go
package main

import (
	"bytes"
	"fmt"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/renderer/html"
)

func main() {
	md := goldmark.New(
		goldmark.WithRendererOptions(
			html.WithUnsafe(),
		),
	)

	input := `1<!--x-->2<!--x-->3

<!--x-->4<!--x-->5

6<!--x-->7<!--x-->

<!--x-->8<!--x-->

<!--x-->9
`

	var buf bytes.Buffer
	if err := md.Convert([]byte(input), &buf); err != nil {
		panic(err)
	}

	fmt.Println(buf.String())
}

<p>1<!--x-->2<!--x-->3</p>
<!--x-->4<!--x-->5
<p>6<!--x-->7<!--x--></p>
<!--x-->8<!--x-->
<!--x-->9

Hugo (unsafe)

<p>1<!--x-->2<!--x-->3</p>
<!--x-->4<!--x-->5
<p>6<!--x-->7<!--x--></p>
<!--x-->8<!--x-->
<!--x-->9

Goldmark (safe)

main.go
package main

import (
	"bytes"
	"fmt"

	"github.com/yuin/goldmark"
)

func main() {
	md := goldmark.New()

	input := `1<!--x-->2<!--x-->3

<!--x-->4<!--x-->5

6<!--x-->7<!--x-->

<!--x-->8<!--x-->

<!--x-->9
`

	var buf bytes.Buffer
	if err := md.Convert([]byte(input), &buf); err != nil {
		panic(err)
	}

	fmt.Println(buf.String())
}

<p>1<!-- raw HTML omitted -->2<!-- raw HTML omitted -->3</p>
<!-- raw HTML omitted -->
<p>6<!-- raw HTML omitted -->7<!-- raw HTML omitted --></p>
<!-- raw HTML omitted -->
<!-- raw HTML omitted -->

Hugo (safe)

<p>123</p>
<p>67</p>

Summary

  1. Goldmark (unsafe) matches the reference implementation
  2. Hugo (unsafe) matches Goldmark (unsafe)
  3. Goldmark (safe) eats characters (FAIL)
  4. Hugo (safe) matches Goldmark (safe)

Conclusion: this is an upstream issue.

https://github.com/yuin/goldmark/issues/499

The Goldmark project lead closed the corresponding issue, noting that it works as expected. He refers to the CommonMark spec, specifically:

https://spec.commonmark.org/0.31.2/#html-blocks

This part of the spec is complicated, and I should have spent time looking at it before logging the issue… start condition #2 in particular.
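
For reference, start condition 2 says that an HTML block begins on a line that starts with `<!--` (after at most three spaces of indentation) and ends on the first line containing `-->`. Blocks of this type can also interrupt a paragraph, which is why the text before and after ends up in separate paragraphs. Below is a rough Go sketch of that rule alone. The helper names are hypothetical, and it deliberately ignores everything else a real parser handles (fenced and indented code, list items, blockquotes, tabs).

```go
package main

import (
	"fmt"
	"strings"
)

// startsCommentBlock reports whether line meets HTML block start
// condition 2: at most three spaces of indentation, then "<!--".
func startsCommentBlock(line string) bool {
	indent := len(line) - len(strings.TrimLeft(line, " "))
	return indent <= 3 && strings.HasPrefix(line[indent:], "<!--")
}

// commentBlockLines marks the lines that belong to a type 2 HTML block.
// A block runs from its start line up to and including the first line
// that contains "-->" (which may be the start line itself), or to the
// end of the input if no such line exists.
func commentBlockLines(lines []string) []bool {
	marked := make([]bool, len(lines))
	inBlock := false
	for i, line := range lines {
		if !inBlock && startsCommentBlock(line) {
			inBlock = true
		}
		if inBlock {
			marked[i] = true
			if strings.Contains(line, "-->") {
				inBlock = false
			}
		}
	}
	return marked
}

func main() {
	// The minimal example from earlier in this topic.
	lines := []string{
		"1<!--x-->2<!--x-->3",
		"",
		"<!--x-->4<!--x-->5",
		"",
		"6<!--x-->7<!--x-->",
		"",
		"<!--x-->8<!--x-->",
		"",
		"<!--x-->9",
	}
	for i, isHTML := range commentBlockLines(lines) {
		fmt.Printf("%-5v %q\n", isHTML, lines[i])
	}
}
```

Only the lines that begin with <!--x--> are marked as HTML blocks, and a whole marked line is treated as raw HTML, text included. That is consistent with the outputs above: unsafe mode passes those lines through without wrapping them in <p>, and safe mode drops them.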

So… does this mean the behavior I see is normal and not a bug?

Yes. It follows the CommonMark spec.

Understood. I’ll just make sure to document that if I want an extract to start at the beginning of a paragraph, I’ll have to put the divider on its own line before the text rather than start the paragraph with it.

Thanks for your effort and feedback!
