Goldmark not closing img and hr tags?

PSG · July 14, 2023, 10:02pm

I’ve noticed a couple of problems with the goldmark markdown renderer, and I wonder whether other people have seen them?

It isn’t closing self-closing tags.

***

converts to
<hr> while I’m expecting <hr /> and so are my RSS parsers

![Notability Organisation](/images/notability-1.jpg)

converts to

<img src="/images/notability-1.jpg" alt="Notability Organisation">

Web browsers are forgiving, and can live with this. RSS readers on the other hand are much less forgiving, and break!

Do I have to replace all my markdown images with html img tags? (I’m already allowing html in goldmark).

Thanks for any advice

Tom_Durand · July 14, 2023, 10:16pm

There is no problem.
From "html - How to close <img> tag properly?" on Stack Overflow:

Actually, only the first one is valid in HTML5
<img src='stackoverflow.png'>  

PSG · July 14, 2023, 10:19pm

I know - but goldmark isn’t doing the second one like it should (look at my example)

The CommonMark spec says that it should create the self-closing version; in other words it should produce

<img src="/images/notability-1.jpg" alt="Notability Organisation" />

but it doesn’t.

For reference:

https://spec.commonmark.org/0.30/#example-577

PSG · July 14, 2023, 10:20pm

Similarly it should be producing <hr /> but it isn’t at the moment.

Tom_Durand · July 14, 2023, 10:21pm

Oh, understood. In that case it's a good thing the implementation apparently corrects the reference, which would be wrong here, since you need HTML.

Here is a list of tags that should not be closed in HTML5:

 <br>    <hr>    <input>

PSG · July 14, 2023, 10:24pm

I don’t follow you - the implementation isn’t following the commonmark spec, and as a result hugo is producing RSS feeds which are being rejected by readers because they don’t like parsing unclosed IMG tags which are missing the closing />

This seems like a problem in Goldmark to me, and I’m wondering whether other people have noticed it.

PSG · July 14, 2023, 10:25pm

That might be the case for HTML5, but unfortunately nobody has told the RSS parsers that (especially the one in Chrome!)

Tom_Durand · July 14, 2023, 10:25pm

Ok I didn’t register this:

RSS readers on the other hand are much less forgiving, and break!

It’s not that browsers are forgiving (in this case at least); they’re just accepting the norm. RSS just follows another norm, XML, with closing tags everywhere. So you’re right, it’s a Goldmark issue, one that a bit of regexp could solve, I think?

PSG · July 14, 2023, 10:29pm

Sorry, full story here

I’m rebuilding a site by using Hugo, which I now love
As part of this, I’m producing an RSS feed so that people can subscribe to new posts on my site
The RSS feed isn’t working in RSS readers. Loading the feed in Chrome gives specific errors to track down. The most recent errors object to <hr> because it wants <hr />, and now to <img> because it wants <img />

It isn’t the XML bit of RSS which is the problem, it is the parsing of the CDATA content within the xml <content> tag.

It seems that the short term solution is to go back through about 100 posts and convert all image references to HTML rather than use markdown for it.
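
A minimal sketch of why a strict, XML-style parser would reject the unclosed tags described above. It assumes the reader parses the embedded HTML as XML, which the thread does not establish for any particular reader. The helper name `is_well_formed_xml` is hypothetical; it relies only on Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        # A wrapper element lets us test a fragment that has no single root.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


# The two forms of the image tag discussed in the thread.
html5_img = '<img src="/images/notability-1.jpg" alt="Notability Organisation">'
xhtml_img = '<img src="/images/notability-1.jpg" alt="Notability Organisation" />'

print(is_well_formed_xml(html5_img))  # False: <img> is never closed
print(is_well_formed_xml(xhtml_img))  # True: self-closed, so valid XML
print(is_well_formed_xml("<hr>"))     # False
print(is_well_formed_xml("<hr />"))   # True
```

This only checks well-formedness, not feed validity, so it is a quick way to spot unclosed void elements rather than a replacement for a feed validator.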

Tom_Durand · July 14, 2023, 10:31pm

I suggest you wait for other people’s feedback before doing something that extreme…
But again, you should rather try to regexp your way out of these problems, in case it’s JUST a matter of (self-)closing tags. That seems way less time-consuming (and error-prone) than injecting a heapload of HTML.
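
The regexp idea is not spelled out in the thread, so here is one hypothetical way it could look: a post-processing pass over already rendered HTML that self-closes a fixed list of void elements. The function name `self_close_void_tags` and the tag list are assumptions for illustration, and the pattern is a sketch that does not handle attribute values containing `<` or `>`.

```python
import re

# Void elements mentioned in the thread; an assumed, non-exhaustive list.
VOID_TAGS = ("img", "hr", "br", "input")

# Match a void tag, its attributes, and an optional existing "/" before ">".
# The lookahead stops "hr" from matching inside longer names such as "hr-x".
_VOID_RE = re.compile(
    r"<(?P<tag>" + "|".join(VOID_TAGS) + r")(?=[\s/>])(?P<attrs>[^<>]*?)\s*/?>",
    re.IGNORECASE,
)


def self_close_void_tags(html: str) -> str:
    """Rewrite void tags as self-closing; tags already self-closed are normalized."""
    return _VOID_RE.sub(
        lambda m: f"<{m.group('tag')}{m.group('attrs')} />", html
    )


print(self_close_void_tags("<hr>"))
# <hr />
print(self_close_void_tags(
    '<img src="/images/notability-1.jpg" alt="Notability Organisation">'
))
# <img src="/images/notability-1.jpg" alt="Notability Organisation" />
```

Because an existing `/` is consumed and re-emitted, running the function twice gives the same result as running it once. Regex rewriting of HTML is fragile in general, so a renderer-level setting is preferable where one exists.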

jmooring · July 15, 2023, 5:51am

If the declaration at the beginning of your HTML file is <!DOCTYPE html>, you are telling the browser that you are feeding it HTML5, which doesn’t require forward slashes when closing void elements such as img and hr. In this case, with that declaration, the browser isn’t being forgiving… it’s doing what you told it to do.

If you want to generate XHTML instead of HTML, tell the renderer to generate XHTML.

[markup.goldmark.renderer]
xhtml = true

This:

![kitten](a.jpg "A kitten!")

Becomes this:

<img src="a.jpg" alt="kitten" title="A kitten!" />

But I am highly suspicious of your need to go this route. Although I’ve only tested a few RSS readers, I’ve never had a problem with HTML5 wrapped in CDATA tags.

ju52 · July 15, 2023, 8:33pm

https://validator.w3.org/feed/

Try validating your feed. If there are errors, there is something to fix.
