Builds running for a very long time (taking hours, when they previously took 2 minutes)

Background

I have a Hugo-based blog website that I rebuild nightly and deploy to an Azure Web App using an Azure DevOps pipeline (via an Ubuntu-based Microsoft-hosted build agent).

This has been working fine for months, with the Hugo build step taking around 3-4 minutes to generate the site.

Last week I updated the Hugo version used by the build agent from 0.122.0 to 0.131.0 and noticed that the Hugo build time jumped to around 10 minutes. That was interesting, but not an issue.

The Issue

In the past couple of days my Azure DevOps build has been timing out, first after 60 minutes and then, after I raised the limit, after 120 minutes.

Today, I have been able to reproduce the problem on a local PC using both Windows and Linux, and using Hugo Extended 0.131.0 and 0.133.

Analysis

If I set the Hugo logging to debug and do a basic build, I can see the following:

C:\tmp\BMBlogs-Hugo [main]> hugo --logLevel debug
Start building sites …
hugo v0.131.0-bfbee17932ff24009008aa94cdd75c0c41f59279+extended windows/amd64 BuildDate=2024-08-02T09:03:48Z VendorInfo=gohugoio

INFO  build:  step process substep collect files 1952 files_total 1952 pages_total 1950 resources_total 2 duration 250.5829ms
INFO  build:  step process duration 252.1844ms
INFO  build:  step assemble duration 58.2282ms
DEBUG Fetched remote resource: http://bit.ly/DDDNorth2
... many lines
DEBUG Fetched remote resource: http://blogs.blackmarble.co.uk/blogs/rhepworth/Coffee_thumb_01A4F9C6.jpg
INFO  static: syncing static files to \ duration 1m9.3318393s

After that the process appears to hang: there is no more output.

Using Process Monitor (Sysinternals), I can see that Hugo is trying to access https://g2a02-26f0-00da-0584-0000-0000-0000-356e.deploy.static.akamaitechnologies.com, which does not exist.

I suspect the issue is cache related, as the error occurs when I clone the repo to a new folder, but an old copy that has previously been built and used for local testing works.

Questions

The fact a resource might not exist is understandable as this web site is a blog site with some posts dating back many years, so pages reference resources that can be unavailable.

So my questions are:

  1. Is there any more debugging I can use to find the problem pages/images/styles without removing items from the site one by one?
  2. Is there a way to make the build process terminate when it cannot access a remote resource?
  3. Any idea why this should have started happening in the last few days?

You could try to get an overview of the situation using template metrics.

Error handling/reporting with resources.GetRemote:

https://gohugo.io/functions/resources/getremote/#error-handling

As always, you are more likely to receive a prompt and accurate response if you post a link to your project’s Git repository.

See Requesting Help.

For resources.GetRemote we do some retries for certain HTTP status codes, but we do eventually time out. That doesn’t help if you have lots of failing requests, though.

You can try to set timeout=5s or something short in your config. Default is 30s.

After much fiddling I think I have found the issue. As suspected, the problem was an old resource link in a blog post dating back to 2012.

The post contained this image link:

![Visual Studio](http://i.microsoft.com/visualstudio/11/images/visual_studio_logo.png)

I assume this URL worked until last weekend, but if you try to access it now you get a DNS failure.


This DNS failure appears to cause the Hugo build process to loop or pause while getting this resource. The build process neither fails nor progresses, hence my Azure DevOps build times out.

I found the problem page using the same technique I have used in the past for sorting out Hugo issues: repeatedly rebuilding the site, adding content pages back in blocks until I had isolated the page. A slow process, but it got me there in the end.

Thanks for the suggestions

Richard

In Hugo 0.133.0 we added the URL to the default error message in resources.GetRemote (better late than never), but you can also do as @jmooring said above and handle the error from resources.GetRemote yourself (or at least, most errors).

I have added the error logging to my template as suggested:

  {{ with $result := resources.GetRemote $file }}
    {{ with .Err }}
      {{ errorf "%s" . }}
    {{ else }}
    
      {{- $image = resources.GetRemote $file -}}
     .....

This is showing the expected errors for most resources e.g.

ERROR failed to fetch remote resource from 'https://www.microsoft.com/en-gb/developers/images/articles/content/471/01-04.jpg?v=1.0': Forbidden

But for http://i.microsoft.com/visualstudio/11/images/visual_studio_logo.png I did not get an error.

The ‘pause’ seemed to happen elsewhere in the render, hence the need for the manual process to find the problem markdown.

FYI

There’s no need to call resources.GetRemote twice as shown in your code snippet above. If the resource is found it’s assigned to the dot inside of the with block.
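
As a rough sketch of that advice, the snippet could be restructured along these lines. It assumes the Hugo versions discussed in this thread (0.131 to 0.133), where a fetched resource exposes `.Err`; newer releases may handle errors differently, so check the current resources.GetRemote documentation. The URL assigned to `$file` is a placeholder, and `$image` simply mirrors the variable name from the snippet above.

```go-html-template
{{/* Placeholder URL for illustration; in the real template $file comes from the page content. */}}
{{ $file := "https://example.org/images/a.jpg" }}
{{ $image := false }}

{{ with resources.GetRemote $file }}
  {{ with .Err }}
    {{/* The fetch returned an error: log it (errorf also fails the build). */}}
    {{ errorf "%s" . }}
  {{ else }}
    {{/* The fetch succeeded: the dot is already the resource, so reuse it. */}}
    {{ $image = . }}
  {{ end }}
{{ else }}
  {{/* Nothing was returned at all. */}}
  {{ errorf "Unable to get remote resource %q" $file }}
{{ end }}
```

Since `errorf` marks the build as failed, this pattern also speaks to the earlier question about terminating when a remote resource is unavailable, at least for failures that resources.GetRemote actually reports. As noted above, the DNS failure in this thread did not surface as an error here.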
