Interaction between the baseUrl and relativeUrls settings

In a nutshell, here is the issue I’m experiencing:

  • Create a Hugo site with some content and a manually defined menu config at config/_default/menus.yaml, with some URLs pointing at your content. Also, enable relativeUrls: true in your config.yaml.
  • Run Hugo with --baseUrl /foo/bar
  • Place your rendered HTML on a web server in a subfolder structure matching /foo/bar

In this scenario, the links in the nav will be rendered as href="./foo/bar/pages/page-1", for example. Clicking the link will take you to /foo/bar/foo/bar/pages/page-1, which is incorrect.

I’m unsure if I’m just misunderstanding how baseUrl is supposed to be used or if there are limitations surrounding it, but I interpreted this setting as the one to use if your site is hosted in a subfolder and you need to prefix absolute URLs with it. With relativeUrls turned on, my expectation is that Hugo would look at the baseUrl setting and a URL like /foo/bar/pages/page-1, along with the current page context, and have enough information to render a proper relative link. If I pass in a baseUrl of /foo/bar, should Hugo not assume that my home page is at /foo/bar/index.html? When relativeUrls is post-processing that page, should it not be reconciling that?

I would think that the strategy here is to just not set baseUrl if you are using relativeUrls, but what if you need to link to content that is relative to the domain root and not the context root? Is the way to go to turn off baseUrl, keep relativeUrls turned on, and then manually render absolute, fully qualified URLs (which won’t get transformed by relativeUrls) for content at the domain root?
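To make the doubled path concrete, here is a minimal sketch of how a link like the one above resolves against the home page. It uses Python's urllib.parse.urljoin as a stand-in for the browser's resolution logic. The host example.org and the helper name resolve are assumptions for illustration only, and nothing here reflects how Hugo itself computes links.

```python
from urllib.parse import urljoin

# Hypothetical host; the home page location follows from --baseUrl /foo/bar.
HOME_PAGE = "https://example.org/foo/bar/index.html"


def resolve(href: str, page_url: str = HOME_PAGE) -> str:
    """Resolve an href against the page it appears on (RFC 3986 style)."""
    return urljoin(page_url, href)


# The href as currently rendered in the nav.
rendered = resolve("./foo/bar/pages/page-1")
# The href that a link relative to the home page would need to be.
expected = resolve("pages/page-1")

print(rendered)  # https://example.org/foo/bar/foo/bar/pages/page-1
print(expected)  # https://example.org/foo/bar/pages/page-1
```

The rendered href keeps the /foo/bar prefix and is then resolved relative to a page that already lives under /foo/bar, which is where the duplicated segment comes from.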

A more domain-specific explanation of the particular problem we are facing is written up here, if it’s helpful: “current --baseUrl setting is incompatible with relativeURLs: true”, Issue #1664 in mitodl/ocw-studio on GitHub.

Why are you setting relativeUrls: true?

I can only think of one scenario where relativeURLs = true is required—when you want to build a site that you can navigate from the file system (no server). And in that case, the baseURL must be /.

If you set it to false, instead of specifying url for each internal menu entry, specify pageRef.

[[menu.main]]
name = 'Home'
pageRef = '/'
weight = 10

[[menu.main]]
name = 'Posts'
pageRef = '/posts'
weight = 20

Use url for external entries.

[[menu.main]]
name = 'Hugo'
url = 'https://gohugo.io/'
weight = 30

When rendering the menu, there’s no need to transform the URL.

{{ with site.Menus.main }}
  <p>Menu:
    {{ range . }}
      <a href="{{ .URL }}">{{ .Name }}</a>
    {{ end }}
  </p>
{{ end }}

That’s exactly why we are setting relativeUrls to true. MIT OCW is a platform with a 20-year history, and we recently built a new publishing system for it. A custom CMS built in Django creates content for Hugo sites and publishes them to GitHub repos, and we orchestrate the publishing process using a CI/CD runner called Concourse. Each course site is built individually and deployed to an S3 bucket in a subfolder called “courses”, with a homepage that is another Hugo site at the root. Each site needs to be viewable on the web under a subfolder, but also downloadable and viewable on a local file system.

We organize our menus in a menus.yaml configuration file to simplify how sites are authored and built in our CMS, and to keep all of the menu entries in one spot that is easily machine-readable by simply parsing the menu configuration file. This project has been in development for a couple of years and was released about 10 months ago. I’ve read the Hugo documentation front to back many times, and read many a post here on the forums, on GitHub, and anywhere else I can. I’ve never seen pageRef referenced anywhere. When I Google it, all I see are some release notes, so to find out about it I’d have to have already known it existed or have read every single release’s notes. It seems interesting, though. I assume that it locates a Page object and can then generate a relative URL through RelPermalink or something like that?

Edit:

I see now that .PageRef is referenced as a property on menu item objects here. This is the first page that comes up when you Google “Hugo menus” and it’s not apparent what all the possible values are for the config file. You have to go into the page about menu item variables, then read the description about PageRef specifically to find out that it can be used in a config. I assume that’s the case with all of these variables? The different case changes trip me up sometimes with different parts of Hugo, like how the CLI arguments will be camelCase, then the property of the object that you access in a template is PascalCase. Is it always the case that properties from the objects referenced by a configuration file section are available to access in the configuration, or is there some kind of mapping?

The menu documentation needs some love. It should explicitly describe how to create menus:

  1. With individual entries in site configuration
  2. At the page level
  3. Using sectionPagesMenu in site configuration

With #1 and #2, each applicable key should be described. Expecting users to infer applicable keys from the menu entry variables is, as you have found, less than ideal.

Exactly, similar to the site.GetPage method. The .URL menu variable will use the page’s .RelPermalink. When iterating through menu entries, you can grab page properties, so you could (but do not have to) do something like:

{{ with site.Menus.main }}
  <p>Menu:
    {{ range . }}
      {{ with .Page }}
        <a href="{{ .RelPermalink }}">{{ .Title }}</a>
      {{ else }}
        <a href="{{ .URL }}">{{ .Name }}</a>
      {{ end }}
    {{ end }}
  </p>
{{ end }}

No. .Page and .Children are not menu entry keys; they are template variables. That’s why we need to modify the documentation as I described above.

There’s an open issue about this:
https://github.com/gohugoio/hugoDocs/issues/1594

@jmooring Thank you for your detailed reply! This clears a lot up. In the case of OCW, we’re putting a lot of effort into trying to make the content portable and reusable. We actually wanted the Hugo site content (markdown, etc.) to be publicly available on GitHub but ran into rate limiting issues with our CMS when doing a full site republish. So, the course content is unfortunately locked behind MIT’s GitHub Enterprise server because we can disable rate limiting there. However, the themes are open source at mitodl/ocw-hugo-themes on GitHub (a Hugo theme for building OCW websites). The Django-based CMS is also open source, at mitodl/ocw-studio on GitHub (an open source courseware authoring tool).

The current live site uses only what I like to call “root-relative” URLs, that is, URLs that begin with /. Each course site is built individually in a pipeline, and Hugo is run with --baseUrl /courses/course_id. I’ve found that the terminology we use to describe URLs is often confusing. According to the URL spec:

  • https://whatever.org is an absolute URL
  • relative URLs have several subtypes:
      • /a/b/c is a “relative url of subtype ‘path-absolute schemeless’”
      • ./a/b/c is a “relative url of subtype ‘path-relative schemeless’”
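The distinction in the list above can be sketched as a small classifier. This is only an illustration built on Python's urllib.parse.urlsplit, not the URL spec's own parsing algorithm. The function name classify and the labels it returns are my own shorthand for the three cases listed, and anything else (for example a URL starting with //) is lumped into "other".

```python
from urllib.parse import urlsplit


def classify(url: str) -> str:
    """Roughly sort a URL string into the categories listed above."""
    parts = urlsplit(url)
    if parts.scheme:
        # Has a scheme such as https: an absolute URL.
        return "absolute"
    if parts.netloc:
        # No scheme but a host (e.g. //example.org/a): not covered here.
        return "other"
    if parts.path.startswith("/"):
        # Starts at the host root: what I call "root-relative".
        return "relative, path-absolute schemeless"
    # Everything else is resolved against the current page's directory.
    return "relative, path-relative schemeless"


for candidate in ("https://whatever.org", "/a/b/c", "./a/b/c"):
    print(candidate, "->", classify(candidate))
```

Both of the last two are "relative" in URL terms, even though only the second one moves with the page it appears on.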

It gets extra confusing when you’re trying to render a site that is browsable offline from a hard drive without a server, because in macOS and other Unix-based operating systems, a path that starts with / is an “absolute path”, but in the HTML it’s still considered a “relative” URL. For single course downloads, we currently have a separate “offline” build that builds the site without --baseUrl so it will work offline. It also makes some UI changes, though, such as hiding UI elements that wouldn’t make sense to show in a downloaded course. Another thing that OCW offers is the concept of a “mirror drive”: a hard drive loaded with the entire OCW site that gets shipped off to countries with poor or no internet access. Our goal here is to make the “offline” build be only about the UI differences and to have a universal URL strategy for the whole site that renders relative URLs for both builds, meaning that creating a mirror drive would be as simple as syncing down the S3 bucket that the live site is served from.
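As a rough sketch of why page-relative links are the ones that survive the move to a mirror drive, the following compares a root-relative link with a computed page-relative one, resolved both on the web and from the file system. The helper relative_href, the host example.org, and the mount point /Volumes/mirror are all hypothetical, and this is not how Hugo's relativeURLs post-processing is implemented.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

# Site paths as they exist under the S3 bucket (course_id as in --baseUrl).
HOME = "/courses/course_id/index.html"
PAGE = "/courses/course_id/pages/page-1/index.html"

# Hypothetical locations of the same home page: live site vs. mirror drive.
WEB_HOME = "https://example.org" + HOME
LOCAL_HOME = "file:///Volumes/mirror" + HOME


def relative_href(current_page: str, target: str) -> str:
    """Return a link to target that is relative to current_page's directory."""
    start_dir = posixpath.dirname(current_page)
    return posixpath.relpath(target, start=start_dir)


href = relative_href(HOME, PAGE)
print(href)  # pages/page-1/index.html

# A root-relative link works on the web but escapes the mirror folder locally.
root_relative_local = urlsplit(urljoin(LOCAL_HOME, PAGE)).path
# The page-relative link stays inside the site in both cases.
relative_web = urljoin(WEB_HOME, href)
relative_local = urlsplit(urljoin(LOCAL_HOME, href)).path

print(root_relative_local)
print(relative_web)
print(relative_local)
```

With the page-relative href, the same rendered HTML resolves correctly whether the files sit behind a web server under /courses/course_id or in an arbitrary folder on disk.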

You’re welcome!

As you have learned, the relativeURLs option in site configuration has a number of open issues, as well as requested enhancements. For example:

And there are several more that were closed by stalebot due to lack of activity or interest.

As far as I know, the only scenario in which this option is useful is when building an offline site[1]. That use case is infrequent, which probably explains the low/no priority of the related issues.

Even if you set the baseURL to /, on a monolingual site, you will still run into challenges when trying to build an offline site (the home URL is ‘/’ instead of index.html, integrity attributes on link and script elements have to be removed, etc.).

Here’s another set of examples comparing relative vs. absolute URLs:

And in our context, this is further confused when serving from a subdirectory, or with a multilingual site, or both:

https://example.org/subdirectory/de/posts/post-1/

In this case the site root (each language is a site) is:
https://example.org/subdirectory/de/

The host root is:
https://example.org/

And I don’t know what to call this:

https://example.org/subdirectory/

  1. Usually, when I see it used, it is misused because site authors assume it does something different than it actually does.

subBaseUrl :joy:
