Interestingly, after the output is presented, the process hangs and does nothing for a while (i.e. I have no diagnostic data about what’s happening in this interval) before returning:
```
                   |  EN   
-------------------+-------
  Pages            |  161  
  Paginator pages  |    7  
  Non-page files   |    0  
  Static files     | 5969  
  Processed images |    0  
  Aliases          |   23  
  Sitemaps         |    1  
  Cleaned          |    0  

Total in 70603 ms
```
So that’s roughly a 1-minute-10-second build time, which seems unexpectedly long. I’ve tried various experiments, such as rm content/posts/*, to see whether the build improves, but gained no insight.
I see the same “silent period” of hanging when I run hugo server.
Notable factors:

- I recently added many small files to a top-level section.
- I am running this via the Docker image klakegg/hugo:latest since I’m on an older laptop; a sketch of the kind of invocation I mean follows this list.
- hugo server displays the same behavior on startup, but once serving begins on port 1313, edits to source files are reflected in the served site quickly.
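Roughly, the invocation follows the image’s documented pattern; the exact flags below are illustrative rather than copied from my shell history:

```sh
# One-off build: mount the site into the container's /src working directory.
docker run --rm -it -v "$(pwd)":/src klakegg/hugo:latest

# Local preview: same mount, plus the port hugo server listens on.
docker run --rm -it -v "$(pwd)":/src -p 1313:1313 klakegg/hugo:latest server
```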
I was unable to duplicate this delay, even using klakegg/hugo:alpine so that I could factor out the Docker startup time by shelling in and running time hugo --templateMetrics --templateMetricsHints:
```
...
                   |  EN    
-------------------+--------
  Pages            |  5635  
  Paginator pages  |  1137  
  Non-page files   |     8  
  Static files     |   100  
  Processed images |     0  
  Aliases          |  2848  
  Sitemaps         |     0  
  Cleaned          |     0  

Total in 161473 ms

real    2m41.634s
user    1m37.842s
sys     0m26.932s
```
There was no delay between the end of the build and the return to the container’s shell prompt. (12-inch MacBook running Catalina, Docker 3.1.0)
Note that this runtime is consistently 6 times longer than running Hugo locally on the same data. A lot of that may simply be the poor performance of Docker volumes on Macs; you didn’t include any information about your environment.
Clever thing to note, @bep! And please let me know if my workflow is somehow creating the confusion that’s hurting performance. Here’s how I investigated to get an answer to your question.
I took a sample of what’s in public and did some scratch investigation with find public -type f | sed 's/\(.*\/\).*/\1/' | sort | uniq -c | sort -n | grep -v big-image-dir.
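That one-liner is dense, so here it is spelled out; the only assumption in the comment is that grep -v big-image-dir excludes a directory I already knew was full of images:

```sh
# Strip each file path down to its directory, count files per directory,
# sort by count, and drop the one directory excluded above.
find public -type f \
  | sed 's/\(.*\/\).*/\1/' \
  | sort \
  | uniq -c \
  | sort -n \
  | grep -v big-image-dir
```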
What this showed is that it’s not a case of many files in a few directories; rather, there are lots of directories, each holding two files. This site hosts my Twitter export archive, so there are two files per tweet.
So the bulk of public appears to be the output of my last hugo run, plus the images and media needed to make two years’ worth of posts load locally.
Why Are Things This Way?
In my “local” config, I set staticDir = "public". Doing so means that /images/year/month/day/some_image.ext paths in my documents’ content resolve. Great.
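For clarity, the relevant line of that config (TOML; everything else omitted) is just this, with the self-referential part spelled out in the comment:

```toml
# Local config -- only the setting in question is shown.
# Hugo's publishDir defaults to "public", so the build output directory
# also becomes a source of static files for the next build.
staticDir = "public"
```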
When I “publish,” I push my git repo to my host and let a post hook build static files and sync them to the web-server-served directory via rsync. I also rsync all my images etc. in ./public/images.
Critically, the same image and media paths in my content work “in production” as well.
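For concreteness, that deploy step is shaped roughly like the hypothetical hook below; the real paths and flags on my host differ, so treat it as a sketch rather than my actual script:

```sh
#!/bin/sh
# Hypothetical post-receive hook on the host (paths are placeholders).
REPO=/srv/git/site.git          # bare repo that receives my push
WORKTREE=/srv/site-src          # working copy that Hugo builds from
DEST=/var/www/site              # directory the web server actually serves

# Check out the pushed revision, rebuild, then sync the result.
git --git-dir="$REPO" --work-tree="$WORKTREE" checkout -f
cd "$WORKTREE" && hugo                                   # regenerates ./public
rsync -av --delete "$WORKTREE/public/" "$DEST/"          # pages plus ./public/images
```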
I’m starting to wonder:

- Is my staticDir config forcing Hugo to crawl those ~6K files after it generates the new payload?
- If so, is there a better way to keep asset links working in both my production and local authoring contexts?
I’ll apologize for the verbosity here and hope that the thorough motivation and technical background help you diagnose this more easily. As ever, thanks.