HUGO

Do not index certain pages

Hi! Say I have content that looks like this:

content/
├── 2020
│   └── first-blog.md
└── internal-boring-do-not-index
    └── ref.md

How do I prevent internal-boring-do-not-index from appearing in:

  • index.xml
  • sitemap.xml
  • internal-boring-do-not-index/index.xml

I don’t want to use robots.txt because I don’t want to draw attention to this content.

Many thanks!

I do this with a parameter in front matter:

---
title: This page will be excluded from sitemap
robotsdisallow: true
---

Then I filter on that parameter in the template, e.g. in sitemap.xml:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{ range where .Site.RegularPages "Params.robotsdisallow" "!=" true }}
    ...
  {{ end }}
</urlset>

Live example here
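
For reference, a fuller version of that sitemap template might look like the sketch below. It assumes the `robotsdisallow` front matter parameter shown above and is modelled on Hugo's built-in sitemap template. The `<url>` body (just `<loc>` and `<lastmod>`) is an illustrative assumption, not the part elided in the original snippet.

```go-html-template
{{- /* layouts/_default/sitemap.xml: sketch that overrides Hugo's built-in sitemap template */ -}}
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{- /* Keep only regular pages that do not set robotsdisallow: true */ -}}
  {{ range where .Site.RegularPages "Params.robotsdisallow" "!=" true }}
  <url>
    <loc>{{ .Permalink }}</loc>
    {{- if not .Lastmod.IsZero }}
    <lastmod>{{ .Lastmod.Format "2006-01-02T15:04:05-07:00" | safeHTML }}</lastmod>
    {{- end }}
  </url>
  {{ end }}
</urlset>
```

Note that `.Site.RegularPages` contains only regular content pages, so the home page and section or taxonomy list pages drop out of the sitemap as well. If you want those listed, range over `.Data.Pages` instead and keep the same `where` filter.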

Also see


Really struggling to understand those confusing build options.

[hendry@t480s show]$ hugo version
Hugo Static Site Generator v0.73.0/extended linux/amd64 BuildDate: unknown
[hendry@t480s show]$ cat content/do-not/foo.md
---
title: Do no index me
_build:
  render: false
  list: never
  publishResources: false
---


Boo!
[hendry@t480s show]$ grep do-not public/index.xml
[hendry@t480s show]$ vim public/index.xml
[hendry@t480s show]$ grep not public/index.xml
[hendry@t480s show]$ grep not public/sitemap.xml
    <loc>http://example.org/do-not/</loc>

I’m probably better off creating a Go template.

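That is not the same URL as the one you “unlisted”: the sitemap entry is the section page /do-not/, while your _build options are set on content/do-not/foo.md.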

I’m sorry, why? Because it’s a section?

Because it’s not where you put the _build configuration. I’m not sure how to explain it differently.

Oh, so I need to use the _index.md with _build stanzas to exclude a section?

Did I understand correctly?

Thanks @martignoni … I ended up with

 ...
  {{ range .Data.Pages -}}
  {{- if or (eq .Section "services") (eq .Section "2020")}}
...

in my layouts/_default/sitemap.xml
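
Put together, the whole file might look roughly like the sketch below. The section names are the ones from the snippet above. The XML declaration and the `<url>` body are assumptions based on Hugo's built-in sitemap template, since the original elides those parts.

```go-html-template
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{ range .Data.Pages -}}
  {{- /* Allowlist: only pages in these sections are emitted */ -}}
  {{- if or (eq .Section "services") (eq .Section "2020") }}
  <url>
    <loc>{{ .Permalink }}</loc>
  </url>
  {{- end }}
  {{- end }}
</urlset>
```

Because this is an allowlist, any new section stays out of the sitemap until it is added to the condition. The home page, whose `.Section` is empty, is left out too.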
