RSS issue on a multilingual site

Hi,
I have the following rss.xml template:

{{- $pctx := . -}}
{{- if .IsHome -}}{{ $pctx = .Site }}{{- end -}}
{{- $pages := slice -}}
{{- if or $.IsHome $.IsSection -}}
{{- $pages = $pctx.RegularPages -}}
{{- else -}}
{{- $pages = $pctx.Pages -}}
{{- end -}}
{{- $limit := .Site.Config.Services.RSS.Limit -}}
{{- if ge $limit 1 -}}
{{- $pages = $pages | first $limit -}}
{{- end -}}
{{- printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
>
  <channel>
    <title>{{ if eq  .Title  .Site.Title }}{{ .Site.Title }}{{ else }}{{ with .Title }}{{.}} on {{ end }}{{ .Site.Title }}{{ end }}</title>
    <link>{{ .Permalink }}</link>
    <description>Recent content {{ if ne  .Title  .Site.Title }}{{ with .Title }}in {{.}} {{ end }}{{ end }}on {{ .Site.Title }}</description>
    <generator>Hugo -- gohugo.io</generator>
    {{ if not .Date.IsZero }}<lastBuildDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</lastBuildDate>{{ end }}
    {{- with .OutputFormats.Get "RSS" -}}
    {{ printf "<atom:link href=%q rel=\"self\" type=%q />" .Permalink .MediaType | safeHTML }}
    {{- end -}}


    <image>
    	<url>{{.Site.BaseURL}}{{ .Site.Params.logorss }}</url>
    	<title>{{ if eq  .Title  .Site.Title }}{{ .Site.Title }}{{ else }}{{ with .Title }}{{.}} on {{ end }}{{ .Site.Title }}{{ end }}</title>
    	<link>{{ .Permalink }}</link>
    	<width>75</width>
    	<height>75</height>
    </image>

    {{ range $pages }}

    {{ if ne .Params.sitemap_exclude true }}
    <item>
      <title>{{ .Title }}</title>
      <link>{{ .Permalink }}</link>
      
      {{ if .Params.featuredImage }}
        {{ $ftimgsrc := resources.Get .Params.featuredImage }}
        <description>{{- printf "<![CDATA[<img src=\"%s\" alt=\"%s\" />]]>" $ftimgsrc.Permalink .Title | html }}  {{ .Summary | html }}</description>
      {{ else }}
        <description>{{ .Summary | html }}</description>
      {{ end }}

      <pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</pubDate>
      <guid>{{ .Permalink }}</guid>
      {{ if .Params.categories }}{{ range $i, $e := .Params.categories }}<category>{{ $e | urlize }}</category>{{ end }}{{ end }}
      {{ if .Params.tags }}{{ range .Params.tags }}<category>{{ . | urlize }}</category>{{ end }}{{ end }}

      <dc:creator>{{ .Site.Author.name }}</dc:creator>

    </item>
    {{ end }}
    {{ end }}
  </channel>
</rss>

It works well on multiple sites, but on mine, in the multilingual part /en/, Microsoft Edge reports an error when I open localhost:1313/en/index.xml:

This page contains the following errors:
error on line 278 at column 185: PCDATA invalid Char value 31

There is no issue with the main content in the default language at localhost:1313/index.xml.

Any help?
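For context, "Char value 31" is the code point U+001F (a control character), which XML 1.0 does not allow anywhere in a document. The error can be reproduced outside the browser; below is a minimal Go sketch, assuming the feed text is available as a string. The function name `checkWellFormed` and the sample strings are illustrative only and are not part of Hugo.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// checkWellFormed reads every token of an XML document and returns the
// first error the decoder reports, or nil once the input ends cleanly.
func checkWellFormed(doc string) error {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		_, err := dec.Token()
		if err == io.EOF {
			return nil // reached the end without a syntax error
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Hypothetical fragments: the second one carries an invisible U+001F.
	good := "<description>Hugo 0.83 added WebP</description>"
	bad := "<description>Hugo \x1f0.83 added WebP</description>"

	fmt.Println("good:", checkWellFormed(good))
	fmt.Println("bad: ", checkWellFormed(bad))
}
```

The first call should report no error, while the second should fail with a syntax error about an illegal character, which mirrors what Edge shows for the /en/ feed.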

You are using a non-Latin character somewhere, aren't you?

This is not quite my expertise, but a casual Google search yielded the following: PCDATA invalid Char value 31 · Issue #39 · neitanod/forceutf8 · GitHub

Try isolating the offending string and try passing it through one of Hugo’s safe functions to sanitize it.
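One way to isolate the offending string is to scan the source text for characters outside the XML 1.0 `Char` production and print where they are. This is only a sketch: the `hit` type, the `findInvalid` function, and the sample text are made up for illustration, and in practice the text would be read from the content file or the generated index.xml.

```go
package main

import (
	"fmt"
	"strings"
)

// hit records where a character that XML 1.0 forbids was found.
type hit struct {
	Line, Col int // 1-based; Col counts characters, not bytes
	Char      rune
}

// allowedInXML reports whether r matches the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
func allowedInXML(r rune) bool {
	return r == 0x9 || r == 0xA || r == 0xD ||
		(r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) ||
		(r >= 0x10000 && r <= 0x10FFFF)
}

// findInvalid returns the position of every forbidden character in text.
func findInvalid(text string) []hit {
	var hits []hit
	for i, line := range strings.Split(text, "\n") {
		col := 0
		for _, r := range line {
			col++
			if !allowedInXML(r) {
				hits = append(hits, hit{Line: i + 1, Col: col, Char: r})
			}
		}
	}
	return hits
}

func main() {
	// Hypothetical content with an invisible U+001F on the second line.
	sample := "first line\nHugo \x1f<0.83\n"
	for _, h := range findInvalid(sample) {
		fmt.Printf("line %d, column %d: %U\n", h.Line, h.Col, h.Char)
	}
	// prints "line 2, column 6: U+001F"
}
```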

Thanks.

I investigated that and found that {{ .Summary | html }} contains characters that break the XML format.

I cannot use safeHTML in <description> (I am using it in <content:encoded>).

Is there any other way to remove strange characters from .Summary?

I tried plainify, but there was no change.
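plainify strips HTML tags, so it would not be expected to touch an invisible control character. A more direct approach is to delete the C0 control characters that XML 1.0 forbids, keeping tab, line feed, and carriage return. The Go sketch below shows the idea with a regular expression; the names `c0Controls` and `stripControls` and the sample summary are hypothetical. Hugo's replaceRE also takes Go regular expressions, so a similar pattern might work directly in the template, but that is an untested assumption here.

```go
package main

import (
	"fmt"
	"regexp"
)

// c0Controls matches U+0000-U+001F except tab (09), LF (0A) and CR (0D),
// which are the only control characters XML 1.0 permits in that range.
var c0Controls = regexp.MustCompile(`[\x00-\x08\x0B\x0C\x0E-\x1F]`)

// stripControls removes the forbidden control characters from s.
func stripControls(s string) string {
	return c0Controls.ReplaceAllString(s, "")
}

func main() {
	// Hypothetical summary with an invisible U+001F before the "<".
	summary := "at the time when Hugo \x1f<0.83 did not support it"
	fmt.Printf("%q\n", stripControls(summary))
}
```

This treats the symptom only; removing the stray character from the source file is still the cleaner fix.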

I identified which post summary is breaking it; it is the following:

When I moved to Hugo with my website, I looked to optimise everything and implement new techniques. Once Safari gain native WebP support back in 2020, I implement WebP following PawelGrzybek.com - WebP and AVIF images on a Hugo website. The post introduced not only how to implement WebP (at the time when Hugo <0.83 haven’t support it), but also shown how to go step further by implementing AVIF.

This method require you to have WebP/AVIF files stored along with PNG/JPG and not relay on rendering them when the site is build.

I was interested in implementing this as well, but after some tests in my environment I decided not to, and here I will explain why (and it is not about browser compatibility — Safari incompatibility).

From Not so fast with AVIF, WebP is still the way

So here we have a < symbol that could possibly break it, and a /.

Any advice is appreciated.

P.S. Theoretically I can change this character, but I would prefer a future-proof solution.

The above construct looks a bit weird. I am not aware of an html function. Are you sure it works?

Anyway, when I said that you should try the safe functions, I had htmlEscape in mind.

Yes, I struggled to find it in the functions documentation, but yes, it is working.

It replaces <p> with &lt;p&gt;, which is desirable in <description>, whereas the unescaped <p> is desirable in <content:encoded> when that element is used (and I use it with safeHTML):

<description>{{ .Summary | html }}</description>
        <content:encoded>{{ "<![CDATA["  | safeHTML  }}{{ printf "<img src=\"%s\" />" $ftimgsrc.Permalink | safeHTML }}{{ .Summary | safeHTML }}{{ "]]>" | safeHTML  }}</content:encoded>

Sadly, I cannot use htmlEscape, as the output needs to be HTML in the description as above.

But as you can see in <content:encoded>, this symbol also breaks the XML with safeHTML.
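That both variants fail is consistent with how escaping works: HTML escaping rewrites characters such as < and > but passes control characters through unchanged, and a CDATA section is subject to the same XML character restriction. A small Go sketch using the standard library's `html.EscapeString` illustrates the escaping half; the helper names and the sample string are assumptions for illustration, not Hugo's implementation.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// escapeForDescription applies standard HTML escaping, roughly what one
// wants for markup placed inside <description>.
func escapeForDescription(s string) string {
	return html.EscapeString(s)
}

// hasUnitSeparator reports whether s still contains U+001F.
func hasUnitSeparator(s string) bool {
	return strings.ContainsRune(s, 0x1f)
}

func main() {
	// Hypothetical summary with an invisible U+001F before "<0.83".
	summary := "<p>Hugo \x1f<0.83</p>"
	escaped := escapeForDescription(summary)

	fmt.Printf("%q\n", escaped)
	fmt.Println("still contains U+001F:", hasUnitSeparator(escaped))
}
```

The angle brackets come out as &lt; and &gt;, yet the control character survives, so the < in the summary was never the real problem.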

Found the issue.
It is none of the above, and the code is all correct.

I went back to the Markdown file in Visual Studio Code (I previously used Atom) and found a strange invisible character, highlighted in red, that was breaking the whole thing. Once I removed it, everything started working.

Thanks for your assistance.

P.S. In case somebody would like it, here is my finalised personal rss.xml template.

{{- $pctx := . -}}
{{- if .IsHome -}}{{ $pctx = .Site }}{{- end -}}
{{- $pages := slice -}}
{{- if or $.IsHome $.IsSection -}}
{{- $pages = $pctx.RegularPages -}}
{{- else -}}
{{- $pages = $pctx.Pages -}}
{{- end -}}
{{- $limit := .Site.Config.Services.RSS.Limit -}}
{{- if ge $limit 1 -}}
{{- $pages = $pages | first $limit -}}
{{- end -}}
{{- printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:webfeeds="http://webfeeds.org/rss/1.0"
  xml:base="{{ .Permalink }}"
>
  <channel>
    <title>{{ if eq  .Title  .Site.Title }}{{ .Site.Title }}{{ else }}{{ with .Title }}{{.}} on {{ end }}{{ .Site.Title }}{{ end }}</title>
    <link>{{ .Permalink }}</link>
    <description>Recent content {{ if ne  .Title  .Site.Title }}{{ with .Title }}in {{.}} {{ end }}{{ end }}on {{ .Site.Title }}</description>
    <generator>Hugo -- gohugo.io</generator>
    {{ if not .Date.IsZero }}<lastBuildDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</lastBuildDate>{{ end }}
    {{- with .OutputFormats.Get "RSS" -}}
    {{ printf "<atom:link href=%q rel=\"self\" type=%q />" .Permalink .MediaType | safeHTML }}
    {{- end -}}


    <image>
    	<url>{{.Site.BaseURL}}{{ .Site.Params.logorss }}</url>
    	<title>{{ if eq  .Title  .Site.Title }}{{ .Site.Title }}{{ else }}{{ with .Title }}{{.}} on {{ end }}{{ .Site.Title }}{{ end }}</title>
    	<link>{{ .Permalink }}</link>
    	<width>32</width>
    	<height>32</height>
    </image>

    {{ range $pages }}

    {{ if ne .Params.sitemap_exclude true }}
    <item>
      <title>{{ .Title }}</title>
      <link>{{ .Permalink }}</link>
      
      {{ if .Params.featuredImage }}
        {{ $ftimgsrc := resources.Get .Params.featuredImage }}
        <description>{{ .Summary | html }}</description>
        <content:encoded>{{ "<![CDATA["  | safeHTML  }}{{ printf "<img src=\"%s\" />" $ftimgsrc.Permalink | safeHTML }}{{ .Summary | safeHTML }}{{ "]]>" | safeHTML  }}</content:encoded>
        <webfeeds:cover image="{{ .Params.featuredImage | absURL}}" />
      {{ else }}
        <description>{{ .Summary | html }}</description>
        <content:encoded>{{ "<![CDATA["  | safeHTML  }}{{ .Summary | safeHTML }}{{ "]]>" | safeHTML  }}</content:encoded>
      {{ end }}

      <webfeeds:icon>{{.Site.BaseURL}}{{ .Site.Params.logo }}</webfeeds:icon>
      <webfeeds:logo>{{.Site.BaseURL}}{{ .Site.Params.logo }}</webfeeds:logo>
      <webfeeds:accentColor>137faa</webfeeds:accentColor>
      <webfeeds:related layout="card" target="browser" />
      <webfeeds:analytics id="UA-2374382-4" engine="GoogleAnalytics" />

      <pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</pubDate>
      <guid>{{ .Permalink }}</guid>
      {{ if .Params.categories }}{{ range $i, $e := .Params.categories }}<category>{{ $e | urlize }}</category>{{ end }}{{ end }}
      {{ if .Params.tags }}{{ range .Params.tags }}<category>{{ . | urlize }}</category>{{ end }}{{ end }}

      <dc:creator>{{ "<![CDATA["  | safeHTML  }}{{ .Site.Author.name }}{{ "]]>" | safeHTML  }}</dc:creator>

    </item>
    {{ end }}
    {{ end }}
  </channel>
</rss>
