Issue with RSS Feed Generation in Hugo v0.111.2 and v0.138.0

Hi, I am using Hugo v0.111.2 in my production environment and v0.138.0 in my development environment. I’m facing an issue with RSS feed generation: the RSS feed XML files generated by the two versions are different.

The feed generated with v0.138.0 seems fine: when I validate it with the W3C Feed Validation Service (for Atom and RSS), it reports that the RSS feed is valid. However, the feed generated with v0.111.2 is marked as invalid.

Currently, I cannot upgrade Hugo in my production environment due to various constraints. Were there any major changes related to RSS feed generation in versions after v0.111.2? If so, could someone please explain what changed?

If these changes are significant, please let me know if upgrading to a newer version would be worthwhile.

Idea: post your actual generated feed with the valid and invalid code so we have an idea what the actual issue is. Or the RSS layout file if you have one. I understand you have an issue, but I do not know what the issue is. And asking for changes to Hugo between versions is more or less unhelpful, but…

Blackbox idea: Something changed with the way SUMMARIES are built. It might be that your feed is showing a summary, so it might have some encoding the validator doesn’t like.

A completely different question is why your production environment is so far behind. Can’t you update?

You can at least downgrade your local Hugo. You can download a binary from the releases page and then run ./hugo instead of hugo, so that you are using a local, version-pinned binary. This will at least give you identical results. A gap of 27 minor versions is a lifetime in Hugo terms, and you might run into many more issues.

A last-resort, full-blown solution would be to run hugo locally, commit the public directory, and then deploy that directory instead of running hugo on your server.

Hi @davidsneighbour,

Thanks for the reply!

Here are the details:

In v0.111.2, I get an RSS feed like the following:

<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Title of my blog</title>
        <link>https://linktomyblob.com/en/</link>
        <description>This blog is about my personal journey</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <atom:link href="https://linktomyblob.com/en/" rel="self" type="application/rss+xml" />
        <item>
            <title>Introduction</title>
            <link>https://linktomyblob.com/en/introduction/</link>
            <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
            <guid>https://linktomyblob.com/en/introduction/myself</guid>
            <description>I&rsquo;m a developer, and I&rsquo;m planning to contribute more to open-source projects. I&rsquo;m also watching out for more such opportunities.</description>
        </item>
    </channel>
</rss>

When I paste the above-generated RSS feed into W3C Feed Validator for validation, I get the following error:

In v0.138.0, I get an RSS feed like the following:

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Title of my blog</title>
        <link>https://linktomyblob.com/en/</link>
        <description>This blog is about my personal journey</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <atom:link href="https://linktomyblob.com/en/" rel="self" type="application/rss+xml" />
        <item>
            <title>Introduction</title>
            <link>https://linktomyblob.com/en/introduction/</link>
            <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
            <guid>https://linktomyblob.com/en/introduction/myself</guid>
            <description>&lt;h1 id=&#34;myself&#34;&gt;About Myself&lt;/h1&gt;&#xA;&lt;h3 id=&#34;introduction&#34;&gt;Introduction&lt;/h3&gt;&#xA;&lt;p&gt;I’m a developer, and I’m planning to contribute more to open-source projects.&lt;/p&gt;&#xA;&lt;p&gt; I’m also watching out for more such opportunities.&lt;/p&gt;</description>
        </item>
    </channel>
</rss>

When I paste the above-generated RSS feed into W3C Feed Validator for validation, I get the following result:

Congratulations!

This is a valid RSS feed.

(unable to upload the image directly as new users can only put one embedded media item in a post.)

Regarding the Hugo version, the frontmatter content is being written by multiple people. If I have to upgrade the version in the production environment, I need solid reasons to justify the upgrade. While I don’t mind downgrading the Hugo version in my local environment, I want to understand why the RSS feed generated by v0.111.2 differs from v0.138.0.

I think @davidsneighbour is right with his guess about the summary.

If you use the embedded RSS template, there has been an issue with escaping.

I found an issue with HTML escaping that was fixed in v0.121.0.

The RSS templates are different:

embedded RSS template for v0.111.2

  <description>{{ .Summary | html }}</description>

embedded RSS template for v0.121.0

  <description>{{ .Summary | transform.XMLEscape | safeHTML }}</description>

In release v0.134.0 the behavior of .Summary changed (see the summary comparison in the Hugo documentation on content summaries), but, at least regarding the summary, nothing has changed in the templates.

embedded RSS template for v0.138.0

  <description>{{ .Summary | transform.XMLEscape | safeHTML }}</description>

If that’s really what your raw RSS looked like, you were not using the embedded RSS template. The embedded RSS template for v0.111.2 would have rendered this instead:

<description>I&amp;rsquo;m a developer.</description>

So the problem you have with v0.111.2 is caused by a custom RSS template, which you need to share with us if you need additional assistance.

Hi @jmooring

Here is my custom RSS template:

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" >
    <channel>
        <title>{{ .Site.Title }}</title>
        <link>{{ .Site.BaseURL }}</link>
        <description>{{ .Site.Params.description | htmlEscape }}</description>
        <atom:link href="{{ .Site.BaseURL }}index.xml" rel="self" type="application/rss+xml" />
        {{ range .Pages }}
            <item>
                <title>{{ .Title | htmlEscape }}</title>
                <link>{{ .Permalink }}</link>
                <description>{{ .Summary | htmlEscape }}</description>
                <pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 MST" }}</pubDate>
                <guid>{{ .Permalink }}</guid>
            </item>
        {{ end }}
    </channel>
</rss>

Edit: Apologies, I had something different in mind and provided false advice. Please see the following response.


If you replace all occurrences of htmlEncode with transform.XMLEscape then it shouldn’t return any validation errors.

In fact htmlEncode is certain to cause issues by introducing the & character.

htmlEncode isn’t a function. I think you mean transform.HTMLEscape.

It doesn’t cause any problems; some form of escaping is required.

The & character is permitted in XML when used in an escape sequence.
For example, you must replace the > character with the &gt; escape sequence.
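
To see the difference between a valid escape sequence and a stray HTML entity outside Hugo, here is a minimal Python sketch. The helper name is_well_formed is made up for this example and is not part of Hugo or the validator. XML predefines only five named entities (amp, lt, gt, quot, apos), so an HTML entity such as &rsquo; is rejected by an XML parser unless its ampersand is escaped or it is written as a numeric reference. Note that this only checks well-formedness; the W3C validator checks more than that.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<description>a &gt; b</description>",                 # predefined entity
    "<description>I&rsquo;m a developer.</description>",     # HTML-only entity
    "<description>I&amp;rsquo;m a developer.</description>",  # ampersand escaped
    "<description>I&#8217;m a developer.</description>",     # numeric reference
]

for sample in samples:
    print(is_well_formed(sample), sample)
```

Only the second sample fails to parse, which matches the kind of description the v0.111.2 feed in this thread contains.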

Although transform.HTMLEscape is sufficient in most cases, there are some special characters that are allowed in HTML documents that are not allowed in XML documents, so you should use transform.XMLEscape instead.
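
As an illustration of that difference: a form feed (U+000C) is one character that is allowed in HTML text but not in XML 1.0. The Python sketch below compares an HTML-only escape with one that first drops characters outside the XML 1.0 Char range. The helpers is_xml_char, xml_safe_escape, and parses are hypothetical names for this example; they only approximate what an XML-aware escaper has to do and are not Hugo's implementation.

```python
import xml.etree.ElementTree as ET
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape


def is_xml_char(ch: str) -> bool:
    """True if ch falls inside the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml_safe_escape(text: str) -> str:
    """Drop characters XML 1.0 forbids, then escape &, < and >."""
    cleaned = "".join(ch for ch in text if is_xml_char(ch))
    return xml_escape(cleaned)


def parses(fragment: str) -> bool:
    """Check whether the fragment is well-formed inside a <description>."""
    try:
        ET.fromstring(f"<description>{fragment}</description>")
        return True
    except ET.ParseError:
        return False


raw = "Tom & Jerry\x0c<b>"            # contains a form feed
print(parses(html_escape(raw)))      # False: the form feed survives
print(parses(xml_safe_escape(raw)))  # True: the form feed was dropped
```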

Compare that with Hugo’s embedded RSS template, where we do this:

<description>{{ .Summary | transform.XMLEscape | safeHTML }}</description>

After escaping special characters, we pipe the result through the safeHTML function so that Go’s html/template package doesn’t escape it again.
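
The double escaping that safeHTML prevents can be mimicked outside Go. The following is only a Python analogy, not how html/template is implemented: render stands in for the template engine's automatic escaping, and the Safe marker class stands in for the trusted type that safeHTML produces. Both names are made up for this sketch.

```python
from html import escape


class Safe(str):
    """Marks text as already escaped (loosely analogous to safeHTML)."""


def render(value: str) -> str:
    """Auto-escape on output unless the value is marked as Safe."""
    if isinstance(value, Safe):
        return value
    return escape(value, quote=False)


summary = "<p>I&rsquo;m a developer.</p>"
escaped_once = escape(summary, quote=False)

# Without the marker the engine escapes a second time.
print(render(escaped_once))
# -> &amp;lt;p&amp;gt;I&amp;amp;rsquo;m a developer.&amp;lt;/p&amp;gt;

# With the marker the single-escaped text passes through unchanged.
print(render(Safe(escaped_once)))
# -> &lt;p&gt;I&amp;rsquo;m a developer.&lt;/p&gt;
```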

Using this construct, in Hugo’s default configuration, this markdown…

I'm a developer.

is rendered to:

&lt;p&gt;I&amp;rsquo;m a developer.&lt;/p&gt;

When testing, make sure you are viewing the source. In your browser, if you are visiting http://localhost:1313/index.xml, press Ctrl+U to view the actual source.


Another way to handle this is with a CDATA section:

<description>{{ printf "<![CDATA[ %s ]]>" .Summary | safeHTML }}</description>
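
One caveat with the CDATA approach, as far as I know: a CDATA section ends at the first ]]>, so a summary that happens to contain that sequence would break the feed. A common workaround is to split the sequence across two adjacent sections. The Python sketch below shows the idea; the cdata and parsed_text helpers are hypothetical, and whether your summaries can ever contain ]]> is something you would need to judge for your own content.

```python
import xml.etree.ElementTree as ET


def cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def parsed_text(inner: str):
    """Parse <description>inner</description>; return its text or None on error."""
    try:
        return ET.fromstring(f"<description>{inner}</description>").text
    except ET.ParseError:
        return None


summary = "a ]]> b <p>"

# Naive wrapping: the section ends early and the rest is parsed as markup.
print(parsed_text("<![CDATA[" + summary + "]]>"))

# Split wrapping: the parser hands back the original text.
print(parsed_text(cdata(summary)))
```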

Thanks for the detailed response @jmooring, I had something different in mind and provided false advice instead…

If I’m not hijacking the thread, does it make sense to markdownify the .Summary, or is it returned already rendered as HTML?

To make sure there is no HTML in the output (an opinionated choice, since some tags are valid within <description>), I have resorted to this sequence of transformations in my custom RSS template, but maybe it’s extraneous.

.Summary | markdownify | htmlUnescape | plainify | transform.XMLEscape | safeHTML

On a related note, the embedded template seems to have some English in it: <description>Recent content …

Maybe it shouldn’t hard-code any such content, so that it stays useful for international users? I could provide a PR if it helps.

No, it does not.
https://gohugo.io/content-management/summaries/#comparison

This should be sufficient:

<description>{{ .Summary | plainify | transform.HTMLEscape | safeHTML }}</description>

Site and theme authors should override the embedded template as needed using translation tables. We are not adding translation lookups to the embedded templates.
