I work for a news organization and we are considering Hugo. I wanted to test Hugo's theoretical limits to make sure it can handle our current article count.
I created a quick script that generated 10,000 pages per month, spread over a number of years.
After generating 2 million pages in under 10 seconds with my script, I ran Hugo to see if the site would build. I was unable to complete a build; it froze my setup for hours. Does anyone have experience generating a site this large with Hugo?
Build time is all about theme complexity, and many of the available themes make assumptions that scale poorly (like including every article in sidebars and menus on every page, and not using partialCached).
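For illustration (the partial names here are hypothetical), the difference is between re-rendering a heavy partial on every one of those pages and rendering it once per build with partialCached:
{{/* Re-rendered from scratch on every single page of the site: */}}
{{ partial "sidebar.html" . }}

{{/* Rendered once per build (or once per variant key) and reused everywhere: */}}
{{ partialCached "sidebar.html" . }}
{{ partialCached "recent-articles.html" . .Section }}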
Yes, that does. That's about 18 ms per page since you have two languages, so with 2 million pages it really could take something like 10 hours (18 ms × 2,000,000 ≈ 36,000 s). I'm really not sure what happened with my setup. Was that a build from scratch, or did you rebuild an existing folder?
So 33% slower on a full rebuild. I'm going to give your theme a go and test its limits; maybe something in my theme just sucks. The company would kill me if I wiped out their SEO backlink history, otherwise I would have just migrated the last few years of articles.
A small script to generate an over-simplified 10,000-page site. You can tweak it to simulate a more realistic site.
Script
#!/bin/bash
# Create one dated stub per day for the last 10,000 days, append some filler text,
# then build the site (drafts included).
# Note: `date -v` is BSD/macOS syntax; on GNU/Linux use `date -d "-${i} days"`.
for i in {1..10000}; do
  date=$(date -v -${i}d +'%Y/%m/%d')
  hugo new "${date}.md"
  cat << EOF >> "content/${date}.md"
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean diam est, rutrum vel tincidunt eu, gravida vitae arcu. Donec quis dignissim libero. Mauris ultricies justo quis risus convallis, et malesuada elit mollis. Donec quis tortor nec massa eleifend gravida quis sed lectus. Praesent auctor leo aliquet, placerat enim a, eleifend nunc. Suspendisse congue massa eget tristique vestibulum. Nulla commodo lacus arcu, sit amet tempor tellus vulputate id. Maecenas non auctor eros, ut tincidunt sapien. Donec luctus, purus tincidunt pharetra viverra, lectus dolor euismod lorem, nec ullamcorper risus sem a tortor. Sed porta aliquet metus quis dapibus. Cras porta risus ut pharetra tincidunt. Nulla facilisi. Vivamus ut ante consequat, hendrerit purus nec, sagittis nisi. Quisque dignissim ullamcorper metus id tincidunt. Suspendisse condimentum nisl in commodo commodo.
EOF
done
hugo -D
Output
                   | EN
+------------------+-------+
  Pages            | 10031
  Paginator pages  |     0
  Non-page files   |     0
  Static files     |     0
  Processed images |     0
  Aliases          |     0
  Sitemaps         |     1
  Cleaned          |     0
Total in 4108 ms
The beautifulhugo theme doesn’t use partialCached, so it will definitely have scaling issues; applying it to my 8,500-article blog, for instance, increased the average build time from 15 seconds to 37 seconds. It also increased memory usage from 316 MB to 440 MB, so I suggest monitoring RAM usage next time you try to build, and seeing whether you have enough to handle two million articles.
When I originally ported my blog to Hugo, I wrote a Bash script to generate random Hugo articles using Wikipedia’s Special:Random URL, adding random dates, tags, authors, etc. It gave me a very realistic site to test against.
Hugo is fast, but is currently memory constrained when we’re talking millions of things. I’ve been doing some prototype work behind the scenes to make it scale to those levels. It’s not happening next week, but …
Hi. Hugo build time is quite slow for my website at this moment, and I don’t have millions of articles.
hugo server -D -F --renderToDisk -d "/local/build/path"
Building sites …
                   | FR
+------------------+------+
  Pages            |  277
  Paginator pages  |    9
  Non-page files   |  759
  Static files     | 1734
  Processed images |  444
  Aliases          |    3
  Sitemaps         |    1
  Cleaned          |    0
Total in 129974 ms
Like @brunoamaral, I have a custom theme with almost the same list of features (a bit fewer, in fact), and I use the partialCached function (website repo here). Image processing and static file copying seem to be responsible for most of the build time. The problem is even more pronounced if you have taxonomies enabled and a lot of taxonomy list pages generated (e.g. one list of articles for every tag). So if you process images, have a lot of static files in your website, and generate taxonomy pages, you should include that in your benchmark.
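As an illustration (the partial name, file pattern, and size below are hypothetical), image-processing markup can be wrapped in partialCached with the image's path as the variant key, so each image's markup is rendered once per build instead of once for every page that lists it:
{{/* layouts/partials/figure.html — hypothetical partial; "." is an image resource */}}
{{ $small := .Resize "800x" }}
<img src="{{ $small.RelPermalink }}" width="{{ $small.Width }}" height="{{ $small.Height }}">

{{/* In a list or single template, keyed by the image's path: */}}
{{ with .Resources.GetMatch "cover.*" }}
  {{ partialCached "figure.html" . .RelPermalink }}
{{ end }}
Note that this only avoids re-rendering the markup; the processed image files themselves are cached separately by Hugo on disk.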
It seems that on a Mac, the static file copy and the caching of processed images are better optimized, so the “first build time” drops drastically the second time you run the build command. I run on a Windows Surface 3, which is no lightning bolt and not a Mac, but a Mac user built my site twice on his machine and reported a drop from 21 seconds to 2 seconds. I talk about this issue on GitHub and on this forum.
@lamyseba I took a look at your site out of curiosity. I feel it’s rendering way too slowly for your use case, and I have a few suspicions about the reason why.
You have a lot of template files that I feel could be trimmed down using baseof.html; it would take a closer look to be sure of this, and it depends on your requirements for the homepage, list and single views.
You could cut down a lot of file size on your images using ImageOptim (see notes below).
I also managed to get minor improvements by using partialCached in some places (again, see the notes).
Hope this helps!
Before optimization, after resources have been cached
Thanks a lot for this reply, @brunoamaral!
You wrote
after resources have been cached
This is the key problem. It seems that on Windows at least, resources never get cached (except in fast render mode, while the server is running after the first build). When you stop the server and start it again, all resources are rebuilt, or at least copied once again to the destination, and I never get a first build time of 2 seconds when launching a hugo server command.
Still, I’ll gladly implement your suggestions. The only one I didn’t understand is:
You have a lot of template files that could be trimmed down using baseof.html
Do you mean I should cut the code out of those templates and put it in baseof.html, even if this makes baseof.html a huge file?
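Presumably the suggestion refers to Hugo’s standard base-template pattern, which works the other way around: baseof.html holds the shared page skeleton once, and each view template only defines the blocks it overrides, so the individual templates shrink and baseof.html stays small. A minimal sketch (the partial names are hypothetical):
{{/* layouts/_default/baseof.html — shared skeleton, defined once */}}
<!DOCTYPE html>
<html lang="{{ .Site.LanguageCode }}">
  <head>{{ partial "head.html" . }}</head>
  <body>
    {{ partialCached "header.html" . }}  {{/* cached: identical on every page */}}
    {{ block "main" . }}{{ end }}        {{/* filled in by each view template */}}
    {{ partialCached "footer.html" . }}
  </body>
</html>

{{/* layouts/_default/single.html — only overrides the "main" block */}}
{{ define "main" }}
  <article>
    <h1>{{ .Title }}</h1>
    {{ .Content }}
  </article>
{{ end }}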