Prevent content from being rendered/added to sitemap (single page sites)

I'm experimenting with a single-page site that uses content files to give the content a better structure.

The current structure looks like this:

content/index/team.md
content/index/team/$members.md
layouts/index/single.html <-- empty
layouts/index/list.html <-- empty
layouts/index.html <-- single page site magic

The page currently builds fine from the content files. This is done by ranging over all pages and keeping only those in the section “index”.
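As a rough illustration of that approach, the home page template could look something like the sketch below. The `<section>` markup and the `id` derived from the title are assumptions made for the example, not the actual layout; only the filter on the section “index” comes from the description above.

```go-html-template
{{/* layouts/index.html: assemble the single page from the "index" section. */}}
{{/* The surrounding markup is hypothetical; adapt it to the real design. */}}
{{ range where .Site.Pages "Section" "index" }}
  <section id="{{ .Title | urlize }}">
    <h2>{{ .Title }}</h2>
    {{ .Content }}
  </section>
{{ end }}
```

Because the content is pulled in by the home page template, the section's own single.html and list.html can stay empty.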

Now the question is how to keep these content files from being exposed to the public:
/index/team/ already returns 404s
/index/ still returns 200s
/sitemap.xml lists all pages (even empty or 404ing pages)

Any hints on what to do here?

The current workaround for the sitemap is to provide a sitemap.xml layout file similar to:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {{ range where .Site.Pages "Section" "!=" "index" }}
    {{ $perm := add $.Site.LanguagePrefix "/index/" }}
    {{ if ne .RelPermalink $perm }}
      <url>
        <loc>{{ .Permalink }}</loc>
        <lastmod>{{ safeHTML ( .Date.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>{{ with .Sitemap.ChangeFreq }}
        <changefreq>{{ . }}</changefreq>{{ end }}{{ if ge .Sitemap.Priority 0.0 }}
        <priority>{{ .Sitemap.Priority }}</priority>{{ end }}
      </url>
    {{ end }}
  {{ end }}
</urlset>

This already removes all content pages except the list/index page at /index/.

The only issue left seems to be that /index/ still renders instead of returning a 404. list.html and single.html are both empty, so the page should be empty (which it is), and as far as I know empty pages shouldn’t be generated at all, so I would expect a 404.

The list pages still return empty HTML (should be 404):

/index/
/de/index/

This seems to be caused by an index.xml file being created in these directories.
Unfortunately, using DisableRSS did not work, and even if it did, the option would apply globally, which is not what I want.

Any suggestions?

Current workaround to not “leak” content via /index/index.xml:

Overwrite rss.xml with something like this:

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    {{ if and (ne .RelPermalink "/index/index.xml") (ne .RelPermalink (add .Site.LanguagePrefix "/index.xml")) }}
      <title>{{ with .Title }}{{.}} on {{ end }}{{ .Site.Title }}</title>
      <link>{{ .Permalink }}</link>
      <description>Recent content {{ with .Title }}in {{.}} {{ end }}on {{ .Site.Title }}</description>
      <generator>Hugo -- gohugo.io</generator>{{ with .Site.LanguageCode }}
      <language>{{.}}</language>{{end}}{{ with .Site.Author.email }}
      <managingEditor>{{.}}{{ with $.Site.Author.name }} ({{.}}){{end}}</managingEditor>{{end}}{{ with .Site.Author.email }}
      <webMaster>{{.}}{{ with $.Site.Author.name }} ({{.}}){{end}}</webMaster>{{end}}{{ with .Site.Copyright }}
      <copyright>{{.}}</copyright>{{end}}{{ if not .Date.IsZero }}
      <lastBuildDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</lastBuildDate>{{ end }}
      <atom:link href="{{.URL}}" rel="self" type="application/rss+xml" />
    {{ end }}
    {{ range first 15 (where .Data.Pages "Section" "blog") }}
      <item>
        <title>{{ .Title }}</title>
        <link>{{ .Permalink }}</link>
        <pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</pubDate>
        {{ with .Site.Author.email }}<author>{{.}}{{ with $.Site.Author.name }} ({{.}}){{end}}</author>{{end}}
        <guid>{{ .Permalink }}</guid>
        <description>{{ .Content | html }}</description>
      </item>
    {{ end }}
  </channel>
</rss>

It increasingly feels like I am fighting Hugo more than I should have to.