I have some normal content pages and some pages with the "build": {"list": "never"} parameter. After building the site, I need to get ALL URLs of ALL pages, including the ones with "build": {"list": "never"}.
I’ve tried all variations of methods like ".Site.AllPages", "site.Pages", etc., and tried getting the URLs via the "hugo list all" command after the build. The "debug" command option does not return a list of URLs either.
Pages with "build": {"list": "never"} seem to always be excluded from all results.
Is there any built-in way to get a list of all URLs that were built/published, including the ones with the "build": {"list": "never"} parameter?
What is your goal when setting list = 'never'? It sounds like you want to exclude a page from some page collections, but not all page collections? Is that accurate?
@irkode Thanks for the proposed workaround, it worked for me and I achieved what I needed.
@jmooring My goal is to hide these pages from everywhere on the site (sections, taxonomies, sitemaps, etc.) but still build them. So if I know the URL I can access the page, but if I don’t know the URL there is no way to find it through the website.
I was expecting Hugo to have some built-in way to list all pages that were built and published, including the ones with "build": {"list": "never"}. However, I haven’t found any option to do that (except the proposed workaround).
So even though Hugo builds the page (i.e. I have the output .html file and can access it if I know the URL), I need to either look through the filesystem with some script or use the mentioned workaround to get a list of ALL pages and URLs that were built.
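For the filesystem route, a post-build script could look roughly like the sketch below. It assumes the site was published to a directory such as `public` and that pages use Hugo's default "pretty" URLs (`.../page/index.html`). The function name `list_built_urls` and the example base URL are hypothetical. Note that it reports every `.html` file it finds, so alias redirects, paginator pages, and `404.html` will show up too and may need filtering.

```python
from pathlib import Path
from urllib.parse import quote


def list_built_urls(public_dir: str, base_url: str) -> list[str]:
    """Return a sorted list of URLs, one per .html file under public_dir."""
    root = Path(public_dir)
    base = base_url.rstrip("/")
    urls = []
    for html_file in root.rglob("*.html"):
        rel = html_file.relative_to(root)
        if rel.name == "index.html":
            # Pretty URL: section/page/index.html -> /section/page/
            parts = rel.parent.parts
            path = "/" + "/".join(parts) + ("/" if parts else "")
        else:
            # Ugly URL or standalone file: 404.html -> /404.html
            path = "/" + "/".join(rel.parts)
        # Percent-encode the path; "/" is left untouched by default.
        urls.append(base + quote(path))
    return sorted(urls)
```

Calling `list_built_urls("public", "https://example.org/")` after `hugo` has finished would then return every published HTML page, whether or not it was listed.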
@irkode It is related to my specific task. Technically these pages are still visible and can be accessed if you know the link, but I don’t want them to be discoverable through the website. For example, I can still get them indexed on Google via their Indexing API, but they cannot be crawled by bots or found by anyone else (because there are no references to these pages anywhere on the site).
@jmooring There are about 50k of these pages.
Yes, there is a URL pattern like site_url/section1/term_level_1/term_level_2/page. Does it matter?
I’m currently testing the workaround of adding the permalinks of the pages to the store via {{- site.Store.Add "AllSinglePages" (slice .Permalink) -}}. It looks like it increased build time, but I’m not sure yet by how much.
After testing, I found that using {{- site.Store.Add "AllSinglePages" (slice .Permalink) -}} in the single page template results in almost 2x memory usage.
In the end I found a compromise by putting site.Store.Add inside an if statement, so it only adds pages that have a particular parameter in front matter (the same pages that have the "build": {"list": "never"} param).
It would be nice if the "hugo list" command had a flag to list all pages, including the "build": {"list": "never"} pages, because technically they are still published, just not listed in sections, etc.
My example stored the complete page. You could try storing just the value(s) you need, which might reduce the memory footprint, but this depends on internals, so test it.
You could also use warnf in templates to print values to the log and parse that in a post-build step.
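As a rough sketch of that log-parsing idea: suppose the single page template emits something like `{{ warnf "BUILT_URL %s" .Permalink }}` and the build output is captured to a file. The `BUILT_URL` marker and the function name `urls_from_log` are made up for this example, and the exact prefix Hugo puts in front of warnings differs between versions, so the pattern below only looks for the marker itself.

```python
import re

# Hypothetical marker; it must match whatever string the template passes to warnf.
MARKER = "BUILT_URL"
_PATTERN = re.compile(rf"\b{MARKER}\s+(\S+)")


def urls_from_log(lines):
    """Extract URLs that follow the marker, keeping first-seen order without duplicates."""
    seen = set()
    urls = []
    for line in lines:
        match = _PATTERN.search(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            urls.append(match.group(1))
    return urls
```

With the build output saved to a file, for instance `build.log`, something like `urls_from_log(open("build.log", encoding="utf-8"))` would give the list. Whether 50k warning lines are cheaper than the store approach is something to measure rather than assume.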
If the pattern has something unique when compared to the other pages, you can control the build option in your site configuration (by environment) instead of in front matter. Then you can do:
hugo -e development list all
A common pattern could be a string used in all paths, or even the number of path segments.