Is there an option to write to disk rather than memory?

Currently, we have around 70k static files and are thinking of expanding/scaling to the millions in the near future. We have an AWS EC2 instance with 16 CPUs, 240 GB of RAM, and 2 TB of disk space that we're running the hugo command on. Right now it takes ~175 GB of RAM to generate the public/ directory (we tested with 35k static files and it took ~91 GB of RAM). We've also tried hugo server --renderToDisk, but the memory cost is the same. Hence, I'm not sure how we can scale, since RAM is pretty expensive to extend into the TBs…

How does the build process work for Hugo? Is there a way to build using disk rather than memory? Any ideas or other options for improving the memory usage?

That does what it says (writes files to disk). This is also what you get when running hugo (writing to memory would not make much sense when you want to publish the site). If you don't see any effect when doing that, it tells me that something else is going on.

I have some plans to rework the build process into a streaming/partial/partitioned build, which you would need for “millions of pages” to be practical, but that is not happening next week.

That said, 70k files and 170 GB of RAM does not add up. I have done tests on my old MacBook (which had far less RAM) with 50K content files without much of a hiccup, but it's impossible for me to tell what is going on from a bird's-eye view.


Is there a set schedule or estimated timeline for when the streaming/partial/partitioned build will be released?

Even if each of these 50K content files references a few pages of JSON data? We have 70k front-matter files, each referencing a data file that holds a few pages of JSON.

OK, so that may be the reason. If you put a large amount of JSON inside /data, all of it is read into memory.
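
To illustrate, a typical lookup against /data looks something like this (the products data file and product_id param are hypothetical); everything under /data is loaded into site.Data up front, regardless of which keys a page actually uses:

```
{{/* layouts/_default/single.html (sketch, hypothetical data layout) */}}
{{/* Hugo loads the entire /data tree into site.Data at startup, */}}
{{/* so even one lookup like this means all of it sits in memory. */}}
{{ $record := index site.Data.products .Params.product_id }}
{{ with $record }}
  <h2>{{ .name }}</h2>
{{ end }}
```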

I suspect you will have a better memory situation if you store the JSON and similar data with the content (see page bundles), or put it in /assets, and use transform.Unmarshal to load it.
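
A minimal sketch of both approaches (file names here are just examples), where only the JSON a given page actually asks for is read and unmarshaled:

```
{{/* Option 1: JSON stored as a page resource in the page's bundle, */}}
{{/* e.g. content/products/foo/index.md + content/products/foo/data.json */}}
{{ with .Resources.GetMatch "data.json" }}
  {{ $data := . | transform.Unmarshal }}
  <h2>{{ $data.name }}</h2>
{{ end }}

{{/* Option 2: JSON stored under /assets and fetched as a global resource */}}
{{ with resources.Get "data/foo.json" }}
  {{ $data := . | transform.Unmarshal }}
  <h2>{{ $data.name }}</h2>
{{ end }}
```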


Also, Go's built-in JSON decoder is known to be very inefficient/memory hungry. We have an open issue about this.

This issue might also be worth tracking: