Situationally removing trailing slashes in URLs (yes, I read every thread)

Hi,

This is the last technical hurdle for converting my 500+ post blog from Jekyll to Hugo. With a bit of elbow grease, scripting and help from this most excellent community, every other issue was solved.

I’ve searched everything here and on Google and I’m having cold feet on making the migration due to this URL issue. Having a ~10 year old domain’s URLs change or having a 301 redirect on every request is less than desirable.

For reference my intent is to host this on a server with nginx serving the content.

URL expectations and Hugo

Here’s a quick rundown of what I’m trying to accomplish and what Hugo does. Notice the trailing slash vs no trailing slash on some of the URL examples.

Desired URLs:

What Hugo generates with uglyURLs: false:

What Hugo generates with uglyURLs: true:

What I can deal with

  • Jekyll uses page15 whereas Hugo uses page/15 for paginated resources
    • I prefer Hugo’s format for this but is there a way to change it to support the other one?
    • In the worst case scenario I can 301 redirect these pages and not lose much sleep over it

What I can’t deal with

  • If I use ugly URLs then the HTML response returned by nginx will include .html in all of the URLs and crawlers will think that’s the source of truth
    • I do not want to 301 redirect all of those to pretty URLs
  • If I don’t use ugly URLs then all links produced by Hugo will have a trailing slash
    • I do not want to 301 redirect the ones that don’t have it
      • In my case, that’s 500+ posts and a few pages

In either case above:

  • It will result in either my SEO changing or most of my requests being a 301 before the content is served
  • I’d very much like to continue using relref and ref because it helps prevent broken links
    • I could technically hard code my preferred URL style for every link, but to be honest that is a big enough hit that I’d abort the migration over it. Link validation is too important to lose; I have hundreds of cross links.

Looking at Let’s Encrypt’s Hugo site

While researching this I found https://gohugo.io/showcase/letsencrypt/.

If you visit https://letsencrypt.org/ they are using pretty URLs and you can see the trailing slash on most pages.

However, if you visit a post on their blog, such as Intent to End OCSP Service - Let's Encrypt, you can see there’s no trailing slash. There’s also no trailing slash in the HTML.

There is no 301 redirect happening here, you can verify that with curl -I https://letsencrypt.org/2024/07/23/replacing-ocsp-with-crls which returns a 200. Curl would normally expect you to set -L to follow redirects.

I don’t see anything configured in a special way at the Hugo level to support this behavior.

Based on netlify.toml in the letsencrypt/website repository on GitHub, it looks like they are using Netlify, but I don’t see any special rules defined there either.

What are they doing to get this effect? Their blog is indeed using Hugo, based on content/en/post in the same repository; there are posts from a few days ago.

Content rewriting as a last resort

I know nginx has a non-default module that supports this: ngx_http_sub_module, with its sub_filter directive. I’d very much like to avoid it, as it requires a custom compiled version of nginx and does the rewriting at runtime on every request.

I could likely write a custom Python script to do this on the published directory as a final build step but this seems like a really brittle solution.
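For anyone weighing that option, here is a minimal sketch of what such a final build step might look like. It is only an illustration under several assumptions: the output directory, the KEEP_SLASH set and all function names are hypothetical, it only touches double-quoted, site-relative href attributes, and it leaves absolute URLs (such as canonical links) alone. Minified output with unquoted attributes would not be matched, which is part of why this approach is brittle.

```python
import re
from pathlib import Path

# Hypothetical set of paths that should keep their trailing slash.
KEEP_SLASH = {"/", "/blog/"}

# Double-quoted, site-relative hrefs only: the path must start with a single "/"
# (so protocol-relative "//host/..." links are skipped). An optional query
# string or fragment is captured separately so it can be re-attached.
HREF_RE = re.compile(r'href="(/(?!/)[^"#?]*)([#?][^"]*)?"')


def strip_slash(path, keep=KEEP_SLASH):
    """Remove trailing slashes from a path unless it is listed in keep."""
    if path in keep:
        return path
    # rstrip leaves "" for the root path, so fall back to "/".
    return path.rstrip("/") or "/"


def rewrite_html(html, keep=KEEP_SLASH):
    """Return html with trailing slashes trimmed from internal hrefs."""
    def repl(match):
        path = match.group(1)
        suffix = match.group(2) or ""
        return f'href="{strip_slash(path, keep)}{suffix}"'
    return HREF_RE.sub(repl, html)


def rewrite_tree(root, keep=KEEP_SLASH):
    """Rewrite every .html file under root in place; return how many changed."""
    changed = 0
    for html_file in Path(root).rglob("*.html"):
        old = html_file.read_text(encoding="utf-8")
        new = rewrite_html(old, keep)
        if new != old:
            html_file.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Running rewrite_tree("public") after the Hugo build would be the intended usage, assuming the site is published to public. The web server still has to serve /some/post from /some/post/index.html without redirecting, which this script does not address.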


Is there any way to accomplish what I’m trying to do with Hugo itself?

Thanks.

no

regarding trailing slashes I would say:

  • one style or another, don’t mix
  • SEO will recover over the time unless you keep duplicates (maybe some SOA tuning stuff can help)

UGLY: so consistent behavior - and yes the html file is the truth

NOUGLY: a consistent behavior - just bad for servers that handle it with redirects…
/ means in fact the webserver will decide what to do. Usually they scan for .html, .htm, a server page …

I would revisit your “desired” list.
IMHO your last desired URL does not match the layout: it’s a list page without a /

That said, I would go with the one or other - make a choice - don’t look back - you’re not going that way

Even if this were the case, the problem exists since Hugo will not let you use pretty urls without the trailing slash. I won’t be changing my canonical URLs to always have a trailing slash and having ugly URLs with .html being crawled as the definitive source isn’t something I want to commit to.

Fortunately I did come up with a solution in the end by overriding rel and relref to trim the trailing slash for all URLs except for the ones I wanted to keep the / on. The only pain point here is you can’t use shortcodes everywhere (layouts, etc.) so it requires duplicating this logic in a few spots and remembering to use this outside of the content directory.

I’m in the same situation, but with a site with around 100k pages. Adding trailing slashes to all URLs will seriously hamper the migration. I don’t care about “best practices” – I just need to replicate the old site EXACTLY, but now with Hugo. And the old site nowhere uses trailing slashes. In fact, a page with trailing slash will 301 to remove that slash.

@nickjanetakis would you please share how you did this?

It is possible to work without 301 redirects. For example, you can do it this way:

  1. Choose a canonical URL ending with / (the standard Hugo format) and specify it in the meta tag.
  2. Do not set up redirects, but display the new page “transparently.”

Something like this in Apache:

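RewriteEngine On
DirectorySlash Off
# Skip requests that map to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# Skip URLs that already end with a slash
RewriteCond %{REQUEST_URI} !(.*)/$
# Append the slash internally; no redirect is sent to the client
RewriteRule ^(.*)$ $1/ [L]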

This way, the “migration” will happen gradually.

Thanks for trying to help :) However, you have misunderstood me. In the quote above I was referring to what the old site is CURRENTLY doing.

I know how to migrate to trailing slashes, and I’m aware of the strongly held opinions of the Hugo creators (e.g. as expressed in gohugoio/hugo issue #492 on GitHub, “How can I remove the trailing slash in pretty URLs?”) that the trailing slashes MUST be used. But for this specific use case that is not an option, and I am required to exactly match the old site. Many others are in the same situation.

I disagree with that statement. It’s as simple as that.

That statement is true of a specific website that is specifically coded in PHP to do that. I’m not telling you about how Apache works, or how Nginx works, or how Hugo works, or how anything is supposed to work. I’m telling you how my legacy site works.

But let’s get back on topic, please. I’m interested in knowing what @nickjanetakis meant by “overriding rel and relref”:

I understand that you’re describing how your current website works. Treating people like they’re stupid never helps.

But you won’t be able to completely bypass the server to fix your problem. (Or else you’d have to generate HTML files without file extensions, etc.)


Otherwise, it should be a breeze for someone as sharp as you to modify the default relref shortcode and add a few filters to .Permalink and .RelPermalink.

Bye.

That was not my intention. Please forgive me. I was simply trying to be explicit in order to remove ambiguity.

Thanks for the link. It took me quite a while reading the docs to figure this out, so I’ll add a little summary of my findings here for future searchers.


  • The difference between a shortcode and a partial is that the shortcode is invoked from within the content, rather than from within a layout. Source.
  • We can override the RelRef shortcode by copying the source code to layouts/_shortcodes/relref.html
    • The source code is by default {{ relref . .Params }} which means that the shortcode is simply calling the relref function (which is not the same as the relref shortcode itself).
    • We could change this to any valid Go Template, it seems. For example: {{ or (strings.TrimRight "/" (relref . .Params)) "/" }} to strip trailing slashes from all but the root URL. Note the straight quotes; curly quotes will not parse.
  • Some sites may use link render hooks instead (meaning Rel and RelRef might not be used at all).
    • We can override these by creating layouts/_markup/render-link.html and putting any valid Go Template in there that we like.

I have yet to test this, so I’ll update the above if I find any mistakes.