Is there an option to write to disk rather than memory?

Currently, we have around 70k static files and are thinking of expanding/scaling to the millions in the near future. We have an AWS EC2 instance with 16 CPUs, 240 GB of RAM, and 2 TB of disk space that we're running the hugo command on. Right now it takes ~175 GB of RAM to generate the public/ directory (we tested with 35k static files and it took ~91 GB of RAM). We've also tried hugo server --renderToDisk, but the memory cost is the same. Hence, I'm not sure how we can scale, since RAM is pretty expensive to extend into the TBs…

How does the build process work for Hugo? Is there a way to build using disk rather than memory? Any ideas or other options for improving the memory usage?

That does what it says (writes files to disk). This is also what you get when running hugo (writing to memory would not make much sense when you want to publish the site). If you don't see any effect when doing that, it tells me that something else is going on.

I have some plans to rework the build process into a streaming/partial/partitioned build, which you would need for “millions of pages” to be practical, but that is not happening next week.

That said, 70k files and 170 GB of RAM does not add up. I have done tests on my old MacBook (which had far less RAM) with 50K content files without much of a hiccup, but it's impossible for me to tell what is going on from a bird's-eye view.


Is there a set schedule or estimated timeline for when the streaming/partial/partitioned build will be released?

Even if each of these 50K content files references a few pages of JSON data? We have 70k front-matter files, each referencing a data file that holds a few pages of JSON.

OK, so that may be the reason. If you put a large amount of JSON inside /data, all of it is read into memory.
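
To illustrate, a typical lookup against /data looks something like this (the products data file and product_id param are hypothetical); everything under /data is loaded into site.Data up front, regardless of which keys a page actually uses:

```
{{/* layouts/_default/single.html (sketch, hypothetical data layout) */}}
{{/* Hugo loads the entire /data tree into site.Data at startup, */}}
{{/* so even one lookup like this means all of it sits in memory. */}}
{{ $record := index site.Data.products .Params.product_id }}
{{ with $record }}
  <h2>{{ .name }}</h2>
{{ end }}
```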

I suspect you will have a better memory situation if you store the JSON and similar data with the content (see page bundles), or put it in /assets, and use transform.Unmarshal to load it.
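
A minimal sketch of both approaches (file names here are just examples), where only the JSON a given page actually asks for is read and unmarshaled:

```
{{/* Option 1: JSON stored as a page resource in the page's bundle, */}}
{{/* e.g. content/products/foo/index.md + content/products/foo/data.json */}}
{{ with .Resources.GetMatch "data.json" }}
  {{ $data := . | transform.Unmarshal }}
  <h2>{{ $data.name }}</h2>
{{ end }}

{{/* Option 2: JSON stored under /assets and fetched as a global resource */}}
{{ with resources.Get "data/foo.json" }}
  {{ $data := . | transform.Unmarshal }}
  <h2>{{ $data.name }}</h2>
{{ end }}
```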


Also, Go's built-in JSON decoder is known to be very inefficient/memory hungry. We have an open issue about this.

This issue might also be worth tracking: