I have a Hugo-based blog website that I rebuild nightly and deploy to an Azure Web App using an Azure DevOps pipeline (via an Ubuntu-based Microsoft-hosted build agent).
This has been working fine for months, the Hugo build step taking around 3-4 minutes to generate the site.
Last week I updated the Hugo version used by the build agent from 0.122.0 to 0.131.0 and noticed that the Hugo build time jumped to around 10 minutes. Interesting, but not an issue.
The Issue
In the past couple of days my Azure DevOps build has been timing out, first after 60 minutes and then, after I increased the timeout, after 120 minutes.
Today, I have been able to repeat the problem on a local PC using both Windows and Linux and using Hugo Extended 0.131.0 and 0.133.
Analysis
If I set the Hugo logging to Debug and do a basic build I can see the following
C:\tmp\BMBlogs-Hugo [main]> hugo --logLevel debug
Start building sites …
hugo v0.131.0-bfbee17932ff24009008aa94cdd75c0c41f59279+extended windows/amd64 BuildDate=2024-08-02T09:03:48Z VendorInfo=gohugoio
INFO build: step process substep collect files 1952 files_total 1952 pages_total 1950 resources_total 2 duration 250.5829ms
INFO build: step process duration 252.1844ms
INFO build: step assemble duration 58.2282ms
DEBUG Fetched remote resource: http://bit.ly/DDDNorth2
... many lines
DEBUG Fetched remote resource: http://blogs.blackmarble.co.uk/blogs/rhepworth/Coffee_thumb_01A4F9C6.jpg
INFO static: syncing static files to \ duration 1m9.3318393s
and then the process appears to hang/stop, there is no more output.
I suspect the issue is cache related, as the problem appears when I clone the repo to a new folder, whereas an old copy that has previously been built and used for local testing works.
Questions
The fact a resource might not exist is understandable as this web site is a blog site with some posts dating back many years, so pages reference resources that can be unavailable.
So my questions are:
1. Is there any more debugging I can use to find the problem pages/images/styles without removing items from the site one by one?
2. Is there a way to make the build process terminate when it cannot access a remote resource?
3. Any idea why this should have started happening in the last few days?
For resource.GetRemote we do some retries for certain HTTP status codes, but we do eventually timeout. That doesn’t help if you have lots of failing requests, though.
You can try to set timeout=5s or something short in your config. Default is 30s.
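For reference, a minimal sketch of that setting, assuming the site uses a TOML configuration file (the file name hugo.toml is an assumption; use whichever config file your site already has, and the 5s value is only the illustration suggested above):

```toml
# hugo.toml (assumed file name)
# Top-level timeout setting; 5s is an illustrative short value, not a recommendation.
timeout = '5s'
```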
I am assuming this URL worked until last weekend, but if you try to access it now you get a DNS failure.
This DNS failure appears to cause the Hugo build process to loop or pause while getting this resource. The build process neither fails nor progresses, hence my Azure DevOps build times out.
I found the problem page using the same technique I have used in the past for sorting out Hugo issues, that is, repeatedly rebuilding the site, adding content pages back in blocks until I isolated the page. A slow process, but it got me there in the end.
In Hugo 0.133.0 we added the URL to the default error message in resources.GetRemote (better late than never), but you can also do as @jmooring said above and handle the error from resources.GetRemote yourself (or at least, most errors).
There’s no need to call resources.GetRemote twice as shown in your code snippet above. If the resource is found it’s assigned to the dot inside of the with block.
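A minimal sketch of that single-call pattern, assuming a Hugo version around 0.133 where the resource exposes `.Err` (later releases may handle errors differently, so check the documentation for your version). The URL and the `img` markup are placeholders, not taken from the site in question:

```go-html-template
{{/* Placeholder URL, for illustration only */}}
{{ $url := "https://example.org/images/a.jpg" }}
{{ with resources.GetRemote $url }}
  {{/* Inside this block the dot is the fetched resource */}}
  {{ with .Err }}
    {{/* warnf logs the problem and lets the build continue; errorf would fail the build */}}
    {{ warnf "Failed to get remote resource %q: %s" $url . }}
  {{ else }}
    <img src="{{ .RelPermalink }}" alt="">
  {{ end }}
{{ else }}
  {{/* GetRemote returned nil, for example because the resource was not found */}}
  {{ warnf "Unable to get remote resource %q" $url }}
{{ end }}
```

Swapping `warnf` for `errorf` is one way to make the build stop on the first unreachable resource rather than carry on.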