I’m looking to generate a multilingual sitemap in my Hugo site, but without creating multilingual pages.

I’ll be using a translation proxy to translate the website itself, but I want to pre-generate the sitemap so that I don’t need to worry about keeping it up-to-date and valid with the alternate languages. Hugo can do that for me - so far so good.
However, for any page to be included in the multilingual sitemap, I need to create two copies of each content file (or more if I add more languages), which seems wasteful to me, since I only need the sitemap entries.
That setup, along with defaultContentLanguageInSubdir = true, generates more or less what I want.


It does include the Japanese pages, but I can just remove/ignore/not deploy those to hosting. More importantly, this makes sure that all entries in the sitemap XML will have both the Japanese and the English URL defined. If I remove the language-specific MD file, only the English one is added.

I know that a page with no language implied by its directory or file name is assigned to the default language. What I think I’m looking for is a way to have such a page included in all languages, at least as far as the sitemap goes. Generating the pages themselves is not necessary, but it is not much of an issue either: I can always exclude them from the deployment, and it’s good to simulate the folder structure.

Is there any way to achieve this “include in all languages implicitly” behavior?

I fail to understand what you want, because having a sitemap with Japanese items but not having these pages created will result in errors (or penalties?) in most search engines.

Having said that, I recently learned that Hugo automatically creates sitemap indexes for translated websites. One sitemap file per language and one sitemap index file.

Maybe create a sitemapindex layout that ranges through your structures however you want them to be organized?

Also have a look at @ju52’s repo in this topic.

That is true, but it won’t be the case. The purpose of a translation proxy is to create those pages virtually, in real-time for each request, so I don’t need the Japanese pages for deployment, only the English ones, which will get translated separately. Doing the translation process inside Hugo is counter-productive for several reasons, including difficulty in deployment and in proofreading the translations.

That said, I might give your idea a try. Right now, I have this:

{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\" ?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
{{- range .Data.Pages }}
{{- if not .Params.excludeFromSitemap }}
	<url>
		<loc>{{ .Permalink }}</loc>{{ if not .Lastmod.IsZero }}
		<lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>{{ end }}{{ with .Sitemap.ChangeFreq }}
		<changefreq>{{ . }}</changefreq>{{ end }}{{ if ge .Sitemap.Priority 0.0 }}
		<priority>{{ .Sitemap.Priority }}</priority>{{ end }}{{ if .IsTranslated }}{{ range .Translations }}
		<xhtml:link
				rel="alternate"
				hreflang="{{ .Lang }}"
				href="{{ .Permalink }}"
		/>{{ end }}
		<xhtml:link
				rel="alternate"
				hreflang="{{ .Lang }}"
				href="{{ .Permalink }}"
		/>{{ end }}
	</url>
{{- end -}}
{{ end }}
</urlset>

But because the path structure is easily predictable even after proxying (the proxy swaps /en/ for /ja/), I might be able to just repeat for each language.
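
For what it’s worth, a minimal sketch of that “repeat for each language” idea might look like the layout below. It assumes a hypothetical site parameter proxyLanguages (for example proxyLanguages = ["ja"] under [params]) listing the languages served by the proxy, and that the English permalinks contain the /en/ prefix. Neither the parameter name nor this layout comes from the Hugo docs, and the lastmod, changefreq and priority lines are left out for brevity.

```go-html-template
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\" ?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
{{/* proxyLanguages is a hypothetical site param, not a built-in Hugo setting */}}
{{ $proxied := .Site.Params.proxyLanguages }}
{{ range .Data.Pages }}
{{ if not .Params.excludeFromSitemap }}
	{{/* keep a handle on the page, because "." is rebound inside the inner range */}}
	{{ $page := . }}
	<url>
		<loc>{{ .Permalink }}</loc>
		{{/* the page's own language */}}
		<xhtml:link rel="alternate" hreflang="{{ .Lang }}" href="{{ .Permalink }}" />
		{{/* one alternate per proxied language, built by swapping the language prefix */}}
		{{ range $proxied }}
		<xhtml:link rel="alternate" hreflang="{{ . }}" href="{{ replace $page.Permalink (printf "/%s/" $page.Lang) (printf "/%s/" .) }}" />
		{{ end }}
	</url>
{{ end }}
{{ end }}
</urlset>
```

Note that replace swaps every occurrence of /en/ in the URL, so a path with another /en/ segment further down would need more careful handling. This sketch only adds the alternates to the English entries; it does not emit separate url entries for the Japanese URLs.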

I don’t know how you sort your languages, but you can range (line 3) through anything you have defined by a selection like taxonomies or slugs.

For the translations, I’m going the way the I18N docs on the site recommended, having a block like this in my config.toml:

DefaultContentLanguage = "en"
defaultContentLanguageInSubdir = true

[languages]
  [languages.en]
    weight = 1
  [languages.ja]
    weight = 2

Although having them defined like this creates both languages in content too. And having only one drops the language prefix (fairly obviously, I guess).