Pros and Cons of Ugly URLs?

The docs describe how “pretty” and “ugly” URLs work, but they don’t give any reasons why you would choose one over the other for a new site.

What are the pros and cons of each option?

For example, pretty URLs are shorter, but need an extra directory to be created for every page.
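To illustrate with a hypothetical content/about.md (assuming Hugo’s default settings otherwise):

    content/about.md with pretty URLs  ->  public/about/index.html  (served as /about/)
    content/about.md with ugly URLs    ->  public/about.html        (served as /about.html)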

1 Like

Just search “pretty urls seo”, and you will get hundreds of articles telling you why pretty urls are better.

Here is the top result: https://moz.com/learn/seo/url

I did use a search engine, and most of the results were like the site you mentioned, which compares dynamic URLs such as example.com/?page=52 with “SEO-friendly” ones such as example.com/category/page/

They don’t compare URLs ending in “page-name/” with those ending in “page-name.html”, which are the two default options available with Hugo.

In fact, the “optimal” format is shown as
http://www.example.com/category-keyword/subcategory-keyword/primary-keyword.html
which includes the file extension.

1 Like

In my head, the only pro for uglyURLs is if you’re running in a setting where you have no choice, i.e. serving directly from the file system (in combo with relativeURLs=true).
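For reference, a minimal config.toml sketch of that file-system setup (both keys are standard Hugo options; the comments describe what I understand them to do):

    uglyURLs     = true   # write public/foo.html instead of public/foo/index.html
    relativeURLs = true   # rewrite internal links as relative paths so they resolve over file://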

3 Likes

Okay, so far we have the following pros and cons for ugly URLs:

Pros:
No need to start a web server to test your local pages.
Fewer directories needed (so quicker FTP uploading).

Con:
They are slightly longer.

Pretty URLs are pretty ugly if you have multiple output formats. For example, instead of /foo/bar.json you’ll end up with /foo/bar/index.json. I’d love a way to have pretty URLs for HTML and ugly URLs for other stuff…

You can:

  1. Set uglyURLs = true in your site config.
  2. Set noUgly = true on the HTML output format, see the RSS default definition:
2 Likes

Nice, thanks!

In case someone finds this topic in the future, the relevant config.toml snippet looks like

uglyurls = true          # non-HTML output formats (e.g. JSON) get ugly /foo/bar.json

[outputFormats.HTML]
    nougly = true        # HTML keeps the pretty /foo/bar/index.html form
5 Likes

Does anyone know if there is a limit to how many directory levels a search engine spider will crawl?

For example, they might crawl as far as /level1/level2/level3/ but not as far as /level1/level2/level3/level4/

I ask, because if there is a limit, you’d reach it sooner with pretty URLs than ugly ones.

I’m also wondering if search engines consider pages in different directories to be less related to each other than ones in the same directory.

Does anyone know if there is a limit to how many directory levels a search engine spider will crawl?

Technically, it depends on the spider program and how it is implemented, though I doubt the limit is an easily attainable one, if one exists at all.

I’m also wondering if search engines consider pages in different directories to be less related to each other than ones in the same directory.

I think they wouldn’t pay much attention to that, especially with all of the dynamic servers out there. If I understand correctly, a spider basically reads a page, does what it can to parse the content (microdata helps a lot with that), and follows the links it finds on the site. A link inside an <article> element would be viewed as related to that article, while a link with rel="next" is interpreted as the page that comes after the current one, which is a different sort of relationship.

If you use microdata, the search engine can have a lot more information to go from, such as connecting that two articles on different sites were written by the same person.

My guess is that you’re worried about SEO, which almost always means “optimizing for Google.” Relatedness isn’t based on directory structure as much as on the cross-linking you do within the site, and even that matters less than developing really solid content and improving your “link juice”; i.e., getting other reputable sites to link to your content while you also link to other reputable sites. You can also add some structured data (JSON-LD) to make your content a bit more machine-readable. As far as controlling what’s crawled, I would recommend looking into Google’s webmaster tools:

https://google.com/webmasters/tools

You can schedule crawls, temporarily hide pages, etc.

FWIW, I agree with @bep that there is really no reason to use ugly URLs unless you absolutely have to.

Also @Wysardry, it’s good to get info directly from the source:

https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html?m=1

I’ve thought of another pro for pretty URLs: You can change the file extension to .php, .shtml or whatever after the site is live if you need to add dynamic features, without breaking any existing links (as the filename is not usually shown).