What does --printMemoryUsage detail?

I have a repository with an (unhealthily) large number of modules that are hosted on GitHub (not sure if that is important, but I feel like it is). Starting hugo server takes a long time, long as in 6 to 12 minutes. I thought running with --printMemoryUsage might shed some light on what is going on. At the beginning of the run it displays blocks like the following:

Alloc = 151.2 MB
TotalAlloc = 262.9 MB
Sys = 229.6 MB
NumGC = 14

In the end, it grew to

Alloc = 319.1 MB
TotalAlloc = 55.9 GB
Sys = 886.1 MB
NumGC = 351

I am pretty sure that all modules and the website repo itself do not add up to 55 GB in file size. What exactly is TotalAlloc about? And what might the other values mean? Is this normal behaviour? The site has 108 posts, so it's not some large, data-heavy website.
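From the field names, my guess is that these are Go's runtime.MemStats numbers (Hugo being written in Go), so the figures would describe the Go runtime rather than file sizes. A minimal sketch of what I assume is being read and printed, not Hugo's actual code:

```go
// Sketch only: reading the same runtime.MemStats fields that the
// --printMemoryUsage output appears to show (my assumption).
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// Alloc: bytes of heap objects currently allocated (live, not yet freed by the GC).
	// TotalAlloc: cumulative bytes allocated over the whole process lifetime; it only
	// ever grows, so a large value means "allocated and mostly freed again",
	// not "held in memory at once".
	// Sys: total memory the Go runtime has obtained from the operating system.
	// NumGC: number of completed garbage-collection cycles.
	fmt.Printf("Alloc = %.1f MB\n", float64(m.Alloc)/1024/1024)
	fmt.Printf("TotalAlloc = %.1f MB\n", float64(m.TotalAlloc)/1024/1024)
	fmt.Printf("Sys = %.1f MB\n", float64(m.Sys)/1024/1024)
	fmt.Printf("NumGC = %d\n", m.NumGC)
}
```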

I did "vendor" the modules, which gained a little time, but startup stayed long. I remember that a while back, when Hugo was in its 0.8x days, I moved all modules into my theme (layouts directory), and the startup time improved to about 1.5 minutes. I haven't done that yet this time, but I somehow think I will.

Does anything here look weird? I have gotten used to the delay because, in the best case, the server stays up once it is running, but it seems to me that this is not normal behaviour, presumably due to the large number of modules used.

Once running, the allocation goes down to this:

Alloc = 183.9 MB
TotalAlloc = 1.9 GB
Sys = 421.3 MB
NumGC = 28

My repo is at github.com/davidsneighbour/kollitsch.dev (content and website setup for kollitsch.dev).

I'd like to know more too - I'm having a painful time trying to debug a Docker exit code 137 (out of memory) issue during a Hugo build. While the build can succeed (in around 52 minutes) on my local desktop (iMac, 3.8 GHz 8-core i7, 72 GB RAM), it fails repeatedly on Docker runners with memory limits of 32 or 64 GB RAM.

In my instance I've only added the Docsy theme, but I have around 50k pages to build.

Example stats from last successful build on local machine:

                   |   EN
-------------------+---------
  Pages            |  53897
  Paginator pages  |      0
  Non-page files   | 418260
  Static files     |     42
  Processed images |      0
  Aliases          |      0
  Sitemaps         |      1
  Cleaned          |      0

Total in 3111856 ms

… and my mem stats top out at

Alloc = 156.0 GB
TotalAlloc = 3.1 TB
Sys = 175.7 GB
NumGC = 965

though they drop a little on Alloc and Sys by the end.

I can't find any project documentation that addresses the output of --printMemoryUsage, so, like the OP here, I'm guessing at what the numbers imply.

On my Docker executor (image registry.gitlab.com/pages/hugo/hugo_extended:latest), running on a VM with 64 GB of RAM allocated, I get OOM errors and a final memory output of

Alloc = 69.7 GB
TotalAlloc = 2.9 TB
Sys = 72.9 GB
NumGC = 358
Killed

Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 137

If I limit the Docker memory size to 32 GB (I believe Go has a default behaviour of trying to step memory limits to twice the system memory size, and I'm presuming this carries over to Hugo), then I get as far as:

Alloc = 40.0 GB
TotalAlloc = 2.9 TB
Sys = 41.9 GB
NumGC = 374
Killed

Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 137

… and this is where the numbers and observed behaviours start to puzzle me, with differences between macOS (running on the base OS) and Ubuntu (the VM host for the Docker runtime) in how they handle memory. On the latter it looks like the build fails as soon as Alloc hits the RAM+swap limit; whereas on macOS, Alloc massively exceeds RAM+swap (in fact, RAM usage for the Hugo process doesn't appear to exceed 30 GB and swap doesn't grow past 4 GB according to the system utilities), yet the build continues and eventually completes.

I'm presently at a loss as to how to interpret this and how to proceed with a fix, so all advice is welcome.

Many thanks.

@marshalc this situation will greatly improve in the next version of Hugo. You cannot currently do a lot to improve it other than "doing less". We're going to introduce an LRU cache with a dynamic size (which we're going to use for all of the memory-intensive stuff in Hugo), which will effectively delegate to Go's garbage collector to keep the memory usage under control. I have done some initial tests and have seen some very impressive results.
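Roughly, the idea looks something like the sketch below: a least-recently-used cache that sheds entries when it grows past an item cap or a soft heap threshold, so the evicted values become unreachable and Go's GC can reclaim them. This is a simplified illustration, not the actual Hugo code, and all names in it are made up:

```go
// Illustrative LRU cache that evicts under memory pressure; not Hugo's code.
package cache

import (
	"container/list"
	"runtime"
	"sync"
)

type entry struct {
	key string
	val any
}

// LRU caps the number of entries and additionally sheds entries when the
// Go heap exceeds a soft threshold, handing the freed values to the GC.
type LRU struct {
	mu       sync.Mutex
	maxItems int
	softHeap uint64 // soft heap threshold in bytes
	order    *list.List
	items    map[string]*list.Element
}

func New(maxItems int, softHeapBytes uint64) *LRU {
	return &LRU{
		maxItems: maxItems,
		softHeap: softHeapBytes,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

func (c *LRU) Get(key string) (any, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).val, true
	}
	return nil, false
}

func (c *LRU) Add(key string, val any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).val = val
		c.order.MoveToFront(el)
	} else {
		c.items[key] = c.order.PushFront(&entry{key: key, val: val})
	}
	c.evict()
}

// evict drops least-recently-used entries while over the item cap; under
// heap pressure it sheds down to half the cap. A real implementation would
// sample memory stats far less often, since ReadMemStats is not free.
func (c *LRU) evict() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	target := c.maxItems
	if m.HeapAlloc > c.softHeap {
		target = c.maxItems / 2
	}
	for c.order.Len() > target {
		el := c.order.Back()
		delete(c.items, el.Value.(*entry).key)
		c.order.Remove(el)
	}
}
```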


I think I can safely report, after testing 0.114 for a while, that memory usage on the same site build has dropped by roughly 25-30%. Processing time has improved as a result, though more testing is needed here on a bigger machine (128 GB RAM) to see whether that has removed the main bottleneck.

Sample output from essentially the same site, now on Hugo 0.115.1 + Docsy 0.7.0, using the same iMac but with the RAM upgraded to 128 GB (from 72 GB):

                   |   EN
-------------------+---------
  Pages            |  54624
  Paginator pages  |      0
  Non-page files   | 415299
  Static files     |     34
  Processed images |      0
  Aliases          |      0
  Sitemaps         |      1
  Cleaned          |      0

Total in 2622088 ms

… That’s a (3111856 → 2622088) 52min → 44min improvement (15%?)

Final memory output shows

Alloc = 104.0 GB
TotalAlloc = 3.0 TB
Sys = 137.1 GB
NumGC = 963

… with that also being the maximum Sys value in the timeline (unlike previously, where it would peak about 80% of the way through before dropping near the end). I make that a Sys memory improvement of (175.7 GB → 137.1 GB) about 22% less?
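For what it's worth, the arithmetic behind those two figures works out as:

$$\frac{3111856 - 2622088}{3111856} \approx 0.157 \quad (\text{about } 15.7\%\ \text{less build time})$$
$$\frac{175.7 - 137.1}{175.7} \approx 0.220 \quad (\text{about } 22\%\ \text{less Sys memory})$$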

I'm in the process of putting together a VM with 128 GB RAM + 128 GB swap to compare the difference in processing time, now that we've found the upper limit of memory required to build successfully. The 64 GB + 250 GB swap Linux VM I tried previously was taking nearly 2 hours to do this build, however (but there are too many different variables there to compare reliably).

Could I request that the --printMemoryUsage output also include some form of counter (or a timestamp)? It would be helpful to see at a glance how many memory statements have been output in a build. My anecdotal observation is that the 0.114/0.115.1 builds show more memory steps than the earlier 0.112 build did, and as I'm not scraping all the outputs consistently, that summary number would help me understand durational differences, or report when in a build process the memory has jumped or changed notably.

I will try to remember that. I can add that in the next Hugo version (which I admit I have said a few times before now, but now I mean it) you will be able to set an upper memory limit (we will try to set a sensible default based on how much RAM your PC has), and Hugo will try to keep its memory usage below that limit by releasing stuff we don't need anymore to Go's GC. This is obviously great for big sites, but also for large, short-lived content (e.g. big JSON files).
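For the curious: a closely related building block already exists in Go 1.19+ as a soft memory limit, settable via the GOMEMLIMIT environment variable or runtime/debug.SetMemoryLimit. A minimal sketch of that mechanism follows; how Hugo will derive and expose its limit isn't shown here, and the 4 GiB default and env var name below are purely illustrative:

```go
// Minimal sketch (Go 1.19+): a soft memory limit the GC will try to respect.
// This shows the standard-library mechanism only; the env var name and the
// 4 GiB default are examples, not Hugo settings.
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
)

func main() {
	// Default to 4 GiB; allow an override via an illustrative env var.
	limit := int64(4 << 30)
	if v := os.Getenv("EXAMPLE_MEMORYLIMIT_BYTES"); v != "" {
		if n, err := strconv.ParseInt(v, 10, 64); err == nil {
			limit = n
		}
	}

	// SetMemoryLimit sets a soft limit on the total memory the Go runtime
	// uses; the GC works harder as usage approaches it. Equivalent to
	// setting the GOMEMLIMIT environment variable.
	debug.SetMemoryLimit(limit)

	fmt.Printf("soft memory limit set to %d bytes\n", limit)
	// ... run the memory-intensive work here ...
}
```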
