Hugo support for URLs without a trailing slash?


#1

The issue has been discussed before, but maybe it is time to ask for the feature again.

URLs actually had no trailing slash until v0.12, when they were changed to end with one. The main reason given was that this works on any server out of the box. spf13 wrote:

"After a long debate, we discovered that the definitive correct behavior for a URL representing a directory was to end in a slash. Personally I liked the look of the nonslash ending urls better, but a compelling enough case was made to change my mind. Ending in a slash works on every web server out of the box without any additional configuration. Consequently we have not enabled a way to go back to the previous (erroneous) behavior.

If there is a compelling enough reason to restore the old behavior as a configurable option it wouldn’t be hard to do, I’m just not aware of one." [link]

But why do we need “compelling reasons” in order to add the behaviour as a configurable option? I guess it’s because it takes some effort to implement (because surely such an option would not be harmful). But then again, spf13 writes that it would not be hard to implement.

Here are three compelling reasons for implementing such a configurable option:

  1. Many people want this (this follows from the previous debate, but also from blog posts like these: [1][2]).
  2. When migrating an existing site, you usually want to maintain the URL structure. Most CMS systems have URLs without trailing slashes.
  3. Trailing slashes do not make sense semantically.

#2

I would say that adding such an option in Hugo < 0.20 would be too much work. Now it is doable.

But

  1. Many? This comes up now and then, but I read every issue and every thread here and on GitHub, and in the total picture, I would say it is a rare request.
  2. Do you have some data backing that “most CMS systems” claim?
  3. Yes they do. Semantically, /myblogpost/ points to a directory where “/myblogpost/index.html” lives. You could also turn on uglyURLs, which would give you /myblogpost.html.
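
To make the two layouts in point 3 concrete, here is a minimal sketch of how a page URL could map to a file on disk in each mode. The function name `output_path` and the `ugly_urls` flag are made up for illustration; this is not Hugo's actual code, only the mapping described above.

```python
def output_path(url_path: str, ugly_urls: bool = False) -> str:
    """Return the file a static generator would write for a page URL.

    Hypothetical helper: it mirrors the mapping described in the post,
    not Hugo's real implementation.
    """
    # Treat "/myblogpost/" and "/myblogpost" as the same page name.
    name = url_path.strip("/")
    if not name:
        # The site root is always served from index.html.
        return "index.html"
    if ugly_urls:
        # "Ugly" mode: one file per page, extension visible in the URL.
        return f"{name}.html"
    # Default mode: one directory per page with an index.html inside.
    return f"{name}/index.html"


print(output_path("/myblogpost/"))                  # myblogpost/index.html
print(output_path("/myblogpost/", ugly_urls=True))  # myblogpost.html
```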

#3

I manage content for a living and have never heard of this, but I could be wrong of course. This seems to me a server-related concern that is ultimately decoupled from a GUI/CMS layer.

How are trailing slashes not “semantic?”


#4

Sorry for making too bold statements based on assumptions rather than data. I shall try to make up for that.

1)
I surrender; bep has much better data to judge this than I do.

2)
Here is what I got:

  • WordPress is flexible and allows you to set a permalink structure with or without a trailing slash. So for some WordPress migrations it will be an issue, and for others it won’t.
  • Drupal does not use trailing slash [#]
  • Joomla does not have trailing slash either [#]
  • ExpressionEngine, it seems, went from trailing slash in EE1 to no trailing slash in EE2 [#]

This is not enough data to make any claims about “most CMS systems”. But it seems that “no trailing slash” is common in CMSes, and thus point (2) seems valid.

3)

@bep, you are right about the semantics, if we mean the semantics of URL resolution on most, if not all, web servers. I was not thinking about those, but rather about the semantics of URLs in general, decoupled from the technicalities of web servers.

RFC 1630 has the following to say about using slash:

If the URN has a hierarchical nature, then the slash delimiter shall be used in the URI encoding; If the URN has a hierarchical nature, the most significant part shall be encoded on the left in the URI encoding;

The “semantics” of a slash, according to this definition, is thus to delimit hierarchical content. Using these semantics strictly, it does not make sense to add a trailing slash.

However, a trailing slash has acquired a special meaning through usage: it has been used for directories. I guess we can call this meaning semantics too, even though it may exist only as a convention and not in any specification. In this sense, it makes semantic sense to use a trailing slash for indexes, such as a listing of all tags at the location “/tags/”, but not for individual posts.
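
As a rough illustration of that convention (listings get a trailing slash, individual posts do not), a sketch could look like the following. The name `canonical_path` and the `is_listing` flag are my own assumptions for the example, not something Hugo offers.

```python
def canonical_path(path: str, is_listing: bool) -> str:
    """Apply the 'slash for indexes, no slash for posts' convention.

    Hypothetical helper for illustration only.
    """
    # Collapse any leading/trailing slashes to a single leading one.
    trimmed = "/" + path.strip("/")
    if trimmed == "/":
        # The site root is a listing by nature and stays as "/".
        return "/"
    # Listings (e.g. all tags) keep the slash; single posts drop it.
    return trimmed + "/" if is_listing else trimmed


print(canonical_path("/tags", is_listing=True))          # /tags/
print(canonical_path("/myblogpost/", is_listing=False))  # /myblogpost
```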

4)
Perhaps a more compelling reason than the ones I came up with originally is flexibility: the freedom to have your URLs as you want them.

@bep, you say support for URLs without a trailing slash is doable now. Do you have a solution in mind that affects all URLs, or one that can be customized? A general option that can be overridden would be nice and flexible.

Cheers,
Bjørn


#5

I say doable, but I did not say that I have thought of a solution. This is not on my todo-list, so if someone wants it, that someone should create a spec, suggest it here and then implement it. You could wait for me to get interested, but that may never happen.

I would like to add this:

  • I think all of the CMSes listed above are backed by some kind of server platform, which has a role in actually serving these URLs, i.e. they have some kind of HTTP router. Hugo does not. We serve from the file system or from a file server. We must find common ground here, and I think the trailing slash is just that.
  • In my head, “http://bep.is/myblogpost” is a URL that points to a file. “http://bep.is/myblogpost/” is a URL that points to a directory, which most web servers have a way of handling (listing the directory or showing a default file, e.g. index.html).
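
The file-versus-directory distinction in the second point can be sketched as a tiny resolver over a directory tree. This is a simplified model of what many file-based web servers do, not any particular server's logic; the name `resolve` and the returned status strings are assumptions made for the example.

```python
from pathlib import Path


def resolve(root: Path, url_path: str) -> tuple[str, str]:
    """Resolve a request path against a static file tree.

    Returns a (status, value) pair, where status is "file", "redirect"
    or "not_found". Simplified model, hypothetical names.
    """
    target = root / url_path.lstrip("/")
    if target.is_dir():
        if not url_path.endswith("/"):
            # A directory requested without the slash: typical servers
            # answer with a redirect to the slash-terminated URL.
            return ("redirect", url_path + "/")
        index = target / "index.html"
        if index.is_file():
            # Directory URL: serve the default file inside it.
            return ("file", index.relative_to(root).as_posix())
        return ("not_found", url_path)
    if target.is_file():
        # Plain file URL: serve the file itself.
        return ("file", target.relative_to(root).as_posix())
    return ("not_found", url_path)
```

With a tree containing only `myblogpost/index.html`, a request for `/myblogpost/` resolves to that file directly, while `/myblogpost` first costs a redirect. That extra round trip is part of why the trailing slash works "out of the box".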

#6

I’d like to show my interest in this feature.

I’m reading up on “static CMSs” (or whatever they are called) and I’m wondering about this URL problem.

As you mention, a URL with a trailing slash should point to a directory. But when I create new content, I don’t usually want to create directories. I usually want to create “end resources”, and as I understand it, it’s the engine’s choice to present them as directories.

Following with your example, if I’m considering only the URL, I’d expect http://bep.is/myblogpost/ to have some form of listing, perhaps the list of my own blog posts. I’d expect http://bep.is/myblogpost to be concrete content.

Another point is that a file is not required to have an extension, so /myblogpost.html is as valid as /myblogpost. And in my mind the default Content-Type, if the web server doesn’t know better, should be HTML (that’s what HTTP was meant to transfer, and why those acronyms share the HyperText part).
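
To show why the extension matters in practice: file-based servers commonly guess the Content-Type from the file name, and an extensionless name gives them nothing to go on. The sketch below uses Python's standard `mimetypes` module as a stand-in for that guessing step. The fallback to `text/html` is the behaviour argued for above, not what servers generally do by default; what a real server sends depends on its configuration.

```python
import mimetypes


def content_type(filename: str, default: str = "text/html") -> str:
    """Guess a Content-Type from a file name, falling back to a default.

    The default of text/html is an assumption taken from the post,
    not a standard server setting.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None for names without a known extension.
    return guessed or default


print(content_type("myblogpost.html"))  # text/html (from the extension)
print(content_type("myblogpost"))       # text/html (from the fallback)
```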

So that’s my reasoning for requesting the possibility of removing trailing slashes.


#7

I would also like to show interest in this issue. It is the one thing preventing me from migrating some of my existing sites over to Hugo. It seems crazy to me that Hugo provides a vast number of options to support all sorts of niche use cases but no option for this.

It’s not particularly hard to configure a web server to support this, and static site hosting services like Netlify offer straightforward means of setting it up.
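
For anyone wondering what "configuring a web server to support this" boils down to, here is a sketch of the lookup order such a setup typically uses: try the exact path, then the path with `.html` appended, then the directory index. The function names and the in-memory set of published files are assumptions for the example; a real server or hosting service would express this in its own configuration.

```python
from typing import Optional


def candidates(url_path: str) -> list[str]:
    """List the files to try, in order, for a slashless URL."""
    name = url_path.strip("/")
    if not name:
        return ["index.html"]
    # Exact file, then "ugly" file, then directory index.
    return [name, f"{name}.html", f"{name}/index.html"]


def lookup(url_path: str, published: set[str]) -> Optional[str]:
    """Return the first candidate that exists in the published files."""
    for candidate in candidates(url_path):
        if candidate in published:
            return candidate
    return None


# Hypothetical site output: one pretty-URL page and one ugly-URL page.
published = {"myblogpost/index.html", "about.html"}
print(lookup("/myblogpost", published))  # myblogpost/index.html
print(lookup("/about", published))       # about.html
```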

Just my 2¢ anyway, would be nice to have an option to enable somewhere.


#8

There is a PR to add this, but sadly the issue starter did not take it the extra mile to the finish line: