Page generation performance expectations?


#1

I’ve been tasked with doing a quick POC with Hugo to see if it’s a candidate to replace one of our WordPress news sites. We currently have around 600,000 articles and add anywhere between 300 and 500 articles/day.

I stood up a couple of EC2 instances today with v0.14, but I kept running out of memory. I finally was able to get a c3.8xlarge (32 CPU, 60GB RAM Amazon Linux AMI 2015.03 (HVM), SSD Volume Type) to process all of the files. Each article was a lorem ipsum file with roughly 500 words per article.

Here’s the command I ran to generate the files:

hugo --theme=hyde --buildDrafts=true -v --stepAnalysis=true

and here’s the output

INFO: 2015/06/17 Using config file: /home/ec2-user/testsite/config.toml
INFO: 2015/06/17 syncing from /home/ec2-user/testsite/themes/hyde/static to /home/ec2-user/testsite/public/
INFO: 2015/06/17 syncing from /home/ec2-user/testsite/static/ to /home/ec2-user/testsite/public/
initialize & template prep:
	34.293011ms (34.809303ms)	    0.41 MB 	5590 Allocs
load data:
	737.475µs (35.655751ms)	    0.01 MB 	106 Allocs
import pages:
	8m13.569913231s (8m13.642244348s)	 74428.80 MB 	238142254 Allocs
INFO: 2015/06/18 found taxonomies: map[string]string{"tag":"tags", "category":"categories"}
build taxonomies:
	2.038774844s (8m15.740457992s)	  166.93 MB 	5400106 Allocs
render and write aliases:
	18.683899ms (8m15.816083371s)	    0.00 MB 	0 Allocs
render and write taxonomies:
	97.388µs (8m15.874773846s)	    0.00 MB 	27 Allocs
render & write taxonomy lists:
	260.392µs (8m15.933551392s)	    0.02 MB 	386 Allocs
ERROR: 2015/06/18 Site's .BaseUrl is deprecated and will be removed in Hugo 0.15. Use .BaseURL instead.
render and write lists:
	40.101730603s (8m56.09864924s)	 4363.72 MB 	89405746 Allocs
render and write pages:
	7m10.387532987s (16m6.584133592s)	 18082.48 MB 	359400988 Allocs
WARN: 2015/06/18 Unable to locate layout for 404 page: [404.html theme/404.html]
render and write homepage:
	1m17.422124837s (17m24.068511321s)	 24415.30 MB 	97805015 Allocs
render and write Sitemap:
	39.044707027s (18m3.190981739s)	 4477.36 MB 	106800359 Allocs
600000 of 600000 drafts rendered
0 future content
600000 pages created
0 paginator pages created
0 tags created
0 categories created
in 1083253 ms

Any input on whether or not these are reasonable numbers for my test case? We have a max generation time of less than 60 seconds.

Thanks,

Darin


#2

This is definitely the largest site I’ve ever heard of for Hugo. We had an earlier issue where someone triggered a bug with 10k posts, which at the time was the largest site I’d heard of for Hugo. He found a race condition that we fixed. He told us he did a POC with a bunch of other SSGs and Hugo was the only one that could finish in under an hour (Hugo took about 4 seconds IIRC). Of all the site generators he tried, only one of the others even finished.

600k is a lot of posts. I’m curious what the total size of just the content files is on disk. I think you will likely be pushing the limits of the IO on the disk to even read that number of files in 60 seconds.

I’m also curious how long it takes for you to just copy the content directory (time cp -r content content.test). That will at least give us a minimum benchmark of what the lower end could be.

These numbers don’t seem unreasonable, but they do look a bit longer than I would expect based on smaller tests I’ve run.

Hugo will run faster if step analysis and verbosity aren’t enabled.

There are also a few things we could do to make it even faster for this large use case. This is the first real case I’ve seen where doing partial renders may be worth it.

Thanks for considering Hugo for this. I don’t know if we are up for the challenge, but I’m personally excited to have a motivator for us to improve Hugo even further.


#3

We’re running on general purpose SSD. I can switch it out to a different instance type so I can use Provisioned IOPS tomorrow to see if that makes any difference or not.

The public directory is 9.7GB.

The raw content directory is 4.6GB.

Here’s the time:

time cp -r content content.test

real	9m9.339s
user	0m2.048s
sys	0m31.768s

#4

600K pages in 60 seconds sounds like a nice challenge. If a big news site is considering Hugo, I would be willing to join a brainstorming session about how that could be done. I imagine the current Hugo would eat a lot of memory, so some kind of stream processing could be considered.


#5

After reprocessing this morning on an i2.2xlarge with Provisioned IOPS:

hugo --theme=hyde --buildDrafts=true -v  --stepAnalysis=true
INFO: 2015/06/18 Using config file: /home/ec2-user/testsite/config.toml
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/themes/hyde/static to /home/ec2-user/testsite/public/
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/static/ to /home/ec2-user/testsite/public/
initialize & template prep:
	6.458431298s (6.780116173s)	    0.52 MB 	5636 Allocs
load data:
	883.541µs (6.781094318s)	    0.01 MB 	106 Allocs
import pages:
	1m51.682039682s (1m58.491692514s)	 74427.48 MB 	238142188 Allocs
INFO: 2015/06/18 found taxonomies: map[string]string{"tag":"tags", "category":"categories"}
build taxonomies:
	1.819535274s (2m0.367722608s)	  166.93 MB 	5400106 Allocs
render and write aliases:
	16.246169ms (2m0.440781078s)	    0.00 MB 	0 Allocs
render and write taxonomies:
	96.292µs (2m0.497578371s)	    0.00 MB 	27 Allocs
render & write taxonomy lists:
	232.835µs (2m0.554502627s)	    0.02 MB 	386 Allocs
ERROR: 2015/06/18 Site's .BaseUrl is deprecated and will be removed in Hugo 0.15. Use .BaseURL instead.
render and write lists:
	34.121322837s (2m34.730233144s)	 4363.44 MB 	89405709 Allocs
render and write pages:
	6m8.851459403s (8m43.642322315s)	 18254.55 MB 	362400629 Allocs
WARN: 2015/06/18 Unable to locate layout for 404 page: [404.html theme/404.html]
render and write homepage:
	58.727010135s (9m42.437117399s)	 24435.03 MB 	97805023 Allocs
render and write Sitemap:
	47.131823295s (10m29.634679433s)	 4772.51 MB 	106800379 Allocs
600000 of 600000 drafts rendered
0 future content
600000 pages created
0 paginator pages created
0 categories created
0 tags created
in 625858 ms

So quite a bit faster.

Here’s the time run:

time cp -r content content.test

real	0m18.809s
user	0m1.080s
sys	0m15.584s

Minutes down to 18 seconds.

Also, here’s the output from my lorem generator to see how fast it is to generate and write 600k docs to disk:

./bin/hugo-lorem /home/ec2-user/testsite/content/post/
11:53:41.705 [main] INFO  HugoLorem - processed 50000
11:53:50.854 [main] INFO  HugoLorem - processed 100000
11:53:58.418 [main] INFO  HugoLorem - processed 150000
11:54:07.814 [main] INFO  HugoLorem - processed 200000
11:54:17.222 [main] INFO  HugoLorem - processed 250000
11:54:28.521 [main] INFO  HugoLorem - processed 300000
11:54:35.847 [main] INFO  HugoLorem - processed 350000
11:54:42.031 [main] INFO  HugoLorem - processed 400000
11:54:48.329 [main] INFO  HugoLorem - processed 450000
11:54:55.489 [main] INFO  HugoLorem - processed 500000
11:55:01.847 [main] INFO  HugoLorem - processed 550000
11:55:08.611 [main] INFO  HugoLorem - processed 600000
11:55:08.613 [main] INFO  HugoLorem - 0:01:34.708

#6

Good. As expected, the IO was the biggest bottleneck. The provisioned IOPS is a welcome improvement. It’s down to 10 minutes.

18 seconds feels too fast. Something else must be going on here. The OS has lots of optimizations here and some very specific ones for the copy routine. It must be doing something beyond actually reading and copying the bytes. I can’t believe it can copy 10 GB of hundreds of thousands of files in 18 seconds. It’s not the size that I find hard to believe, but the number of files. I think it must be caching it. Hugo could possibly be benefiting from that cache, but we aren’t doing anything specific to take advantage of it.

Did you remove the destination directory first?

To do a proper benchmark you should remove the destination directories every time.

In your case for the copy this would be the content.test directory and for the generation it would be the public directory.


#7

I don’t think that was it. I started the new instance with a clean volume each time. I did try deleting content.test and public after the last run and I saw that the HDD was pegged based on how I had it configured.

With the last test I did (results below), I sized up the instance to an i2.4xlarge (16 vCPU and 122 GB RAM) and attached a 200GB Provisioned IOPS volume with the IOPS set to 6000. Here are the results:

./bin/hugo-lorem /home/ec2-user/testsite/content/post/

18:23:27.002 [main] INFO  HugoLorem - processed 50000
18:23:32.682 [main] INFO  HugoLorem - processed 100000
18:23:38.358 [main] INFO  HugoLorem - processed 150000
18:23:44.053 [main] INFO  HugoLorem - processed 200000
18:23:49.668 [main] INFO  HugoLorem - processed 250000
18:24:00.167 [main] INFO  HugoLorem - processed 300000
18:24:07.847 [main] INFO  HugoLorem - processed 350000
18:24:13.504 [main] INFO  HugoLorem - processed 400000
18:24:19.438 [main] INFO  HugoLorem - processed 450000
18:24:26.414 [main] INFO  HugoLorem - processed 500000
18:24:31.812 [main] INFO  HugoLorem - processed 550000
18:24:37.303 [main] INFO  HugoLorem - processed 600000
18:24:37.305 [main] INFO  HugoLorem - 0:01:16.430

time cp -r content content.test

real	0m16.882s
user	0m1.164s
sys	0m15.188s

hugo --theme=hyde --buildDrafts=true -v --stepAnalysis=true

INFO: 2015/06/18 Using config file: /home/ec2-user/testsite/config.toml
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/themes/hyde/static to /home/ec2-user/testsite/public/
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/static/ to /home/ec2-user/testsite/public/
initialize & template prep:
	351.394703ms (354.057907ms)	    0.54 MB 	5644 Allocs
load data:
	792.468µs (354.951988ms)	    0.01 MB 	106 Allocs
import pages:
	1m31.199485719s (1m31.582473998s)	 74427.64 MB 	238142156 Allocs
INFO: 2015/06/18 found taxonomies: map[string]string{"tag":"tags", "category":"categories"}
build taxonomies:
	1.734869883s (1m33.37217586s)	  166.93 MB 	5400106 Allocs
render and write aliases:
	14.78402ms (1m33.441452075s)	    0.00 MB 	0 Allocs
render and write taxonomies:
	85.644µs (1m33.496067107s)	    0.00 MB 	29 Allocs
render & write taxonomy lists:
	231.986µs (1m33.550878275s)	    0.02 MB 	386 Allocs
ERROR: 2015/06/18 Site's .BaseUrl is deprecated and will be removed in Hugo 0.15. Use .BaseURL instead.
render and write lists:
	33.474794023s (2m7.078339867s)	 4363.44 MB 	89405706 Allocs
render and write pages:
	1m11.194554477s (3m18.333621207s)	 18254.64 MB 	362400190 Allocs
WARN: 2015/06/18 Unable to locate layout for 404 page: [404.html theme/404.html]
render and write homepage:
	50.939685961s (4m9.344074395s)	 26227.58 MB 	97805016 Allocs
render and write Sitemap:
	39.851999698s (4m49.264834561s)	 4477.36 MB 	106800359 Allocs
600000 of 600000 drafts rendered
0 future content
600000 pages created
0 paginator pages created
0 tags created
0 categories created
in 288956 ms

time rm -rf content.test/

real	0m8.698s
user	0m0.312s
sys	0m7.976s

time rm -rf public/

real	0m38.465s
user	0m2.204s
sys	0m28.280s

time cp -r content content.test

real	0m16.820s
user	0m1.128s
sys	0m15.144s

hugo --theme=hyde --buildDrafts=true -v --stepAnalysis=true

INFO: 2015/06/18 Using config file: /home/ec2-user/testsite/config.toml
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/themes/hyde/static to /home/ec2-user/testsite/public/
INFO: 2015/06/18 syncing from /home/ec2-user/testsite/static/ to /home/ec2-user/testsite/public/
initialize & template prep:
	7.747208ms (8.303872ms)	    0.54 MB 	5646 Allocs
load data:
	73.204µs (8.489458ms)	    0.01 MB 	106 Allocs
import pages:
	1m33.815698001s (1m33.851815309s)	 74427.55 MB 	238142041 Allocs
INFO: 2015/06/18 found taxonomies: map[string]string{"tag":"tags", "category":"categories"}
build taxonomies:
	1.725997747s (1m35.63226383s)	  166.93 MB 	5400106 Allocs
render and write aliases:
	14.627662ms (1m35.701348611s)	    0.00 MB 	0 Allocs
render and write taxonomies:
	81.151µs (1m35.755811919s)	    0.00 MB 	32 Allocs
render & write taxonomy lists:
	234.993µs (1m35.810481172s)	    0.02 MB 	386 Allocs
ERROR: 2015/06/18 Site's .BaseUrl is deprecated and will be removed in Hugo 0.15. Use .BaseURL instead.
render and write lists:
	33.295439543s (2m9.158478923s)	 4363.44 MB 	89405708 Allocs
render and write pages:
	1m9.4628844s (3m18.681787678s)	 18254.63 MB 	362400291 Allocs
WARN: 2015/06/18 Unable to locate layout for 404 page: [404.html theme/404.html]
render and write homepage:
	51.041658401s (4m9.792705514s)	 24447.66 MB 	97805023 Allocs
render and write Sitemap:
	41.13710261s (4m50.997916102s)	 4477.36 MB 	106800359 Allocs
600000 of 600000 drafts rendered
0 future content
600000 pages created
0 paginator pages created
0 tags created
0 categories created
in 291029 ms

du -sh content

4.6G	content

du -sh public

9.7G	public

time cp -r public public.test

real	1m27.495s
user	0m3.104s
sys	0m39.152s

One thing I noticed on both hugo runs is that top was reporting total memory used up around 80GB and the hugo process was showing reserved memory at ~55GB. With the previous instance, we only had 60GB of RAM, so we were probably swapping.

It looks like throwing hardware at this helps a lot. I’ve got a couple of other tweaks I can try on the volume, but based on the volume monitoring during the runs, I don’t think it will make much difference because we weren’t hitting any caps during these runs.


#8

Wow… Thanks that’s very helpful.

On this new hardware it’s rendering in just under 5 minutes. A huge improvement from the initial 18 minutes and much closer to the goal of 1 minute.

I’m curious how long it would take if you didn’t enable --stepAnalysis and -v (verbosity).

I would think the difference is more noticeable with smaller sites, so it may not make much of a difference here, but I think it could improve things a bit.

These numbers look about right to me based on my smaller tests. 600k pages is a lot to process. Hugo is processing them at about 0.5 ms per page to hit the numbers you are getting right now.
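To make the per-page figure concrete, here is a small sketch that divides the total build times reported in this thread by the 600,000 pages, and does the same for the 60-second target. The totals are copied from the "in ... ms" lines above; the helper name msPerPage and the run labels are mine.

```go
package main

import "fmt"

// msPerPage returns the average wall-clock milliseconds spent per page.
func msPerPage(totalMs, pages float64) float64 {
	return totalMs / pages
}

func main() {
	const pages = 600000

	// Total build times in milliseconds, as reported in this thread.
	runs := []struct {
		label   string
		totalMs float64
	}{
		{"c3.8xlarge, general purpose SSD", 1083253},
		{"i2.2xlarge, Provisioned IOPS", 625858},
		{"i2.4xlarge, Provisioned IOPS", 291029},
	}
	for _, r := range runs {
		fmt.Printf("%-34s %.3f ms/page\n", r.label, msPerPage(r.totalMs, pages))
	}

	// The stated goal: everything in under 60 seconds.
	fmt.Printf("%-34s %.3f ms/page\n", "60-second target", msPerPage(60000, pages))
}
```

The 60-second target works out to 0.1 ms per page, so the latest run would still need to get roughly five times faster.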

Are the CPUs all being used? Are they pretty much pegged during this process?

There may be a few tweaks we could do there to help things without changing much to the codebase.

I’d also be curious how it would perform if we built Hugo with Go 1.5 pre-release.


#9

There’s very little difference:

hugo --theme=hyde --buildDrafts=true

ERROR: 2015/06/18 Site's .BaseUrl is deprecated and will be removed in Hugo 0.15. Use .BaseURL instead.
600000 of 600000 drafts rendered
0 future content
600000 pages created
0 paginator pages created
0 tags created
0 categories created
in 286420 ms

There were times when all CPUs were active, but they were never pegged. Most of the time, there was only one core running.

I turned stepAnalysis back on to try to track some of these down, but I didn’t do them all.

“render and write pages” only uses half the CPUs. I saw it blip all of them for a second or 2, but it was fairly consistent that it was only half and only the same 8, not a dance around all 16.

“render and write homepage” only used 1.

“render and write Sitemap” only used 1.


#10

The last two each run in a single goroutine for a reason. I notice the homepage generation is taking up a fair amount of time/memory (1 minute). I guess you use the default hyde template, which generates a summary of all 600K pages; I wouldn’t say that is a normal use case (but then again, you do not have any taxonomies either, so …).

The part about only using half of the 16 cores is interesting. The number of goroutines is calculated as GOMAXPROCS * 4. It would be interesting to know what’s reported as the GOMAXPROCS env variable on your system (I guess Go sets this at application startup).

It would be interesting to see the same test run with a Go 1.5 Hugo binary.


#11

There are very few static site generators with caching features. Here is one, and it’s not meant for production websites: http://hammerformac.com/docs/cache.html I think the only way to generate that many pages quickly is going to be to have a cache that only regenerates the pages that change. But it’s certainly nice to know that Hugo can get the job done.
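A minimal sketch of that idea, assuming a cache keyed by content path that stores a hash of each source file from the previous build. Nothing here is a Hugo feature; fingerprint and changedPages are hypothetical names, and the in-memory maps stand in for files on disk.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// fingerprint returns a hex SHA-256 digest of a page's source text.
func fingerprint(src string) string {
	sum := sha256.Sum256([]byte(src))
	return hex.EncodeToString(sum[:])
}

// changedPages compares the current sources with the fingerprints saved
// by the previous build, updates the cache in place, and returns the
// sorted paths that need to be rendered again.
func changedPages(cache, sources map[string]string) []string {
	var dirty []string
	for path, src := range sources {
		fp := fingerprint(src)
		if cache[path] != fp {
			dirty = append(dirty, path)
			cache[path] = fp
		}
	}
	sort.Strings(dirty)
	return dirty
}

func main() {
	cache := map[string]string{}
	sources := map[string]string{
		"post/a.md": "first article",
		"post/b.md": "second article",
	}
	fmt.Println("first build: ", changedPages(cache, sources))

	// One edit and one new article, similar to a day's publishing.
	sources["post/b.md"] = "second article, edited"
	sources["post/c.md"] = "third article"
	fmt.Println("second build:", changedPages(cache, sources))
}
```

This only covers the individual article pages. Anything that depends on every page (lists, the homepage, the sitemap) would still need its own invalidation rule, and deleted pages are not handled here.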

If you have Varnish in front of the stack to serve content while it’s generating, a few minutes “queue” for new published content doesn’t seem unreasonable and is quite common for very large websites.


#12

So your original test of 1083253 milliseconds is about 18 minutes to generate 600,000 Lorem ipsum posts, is that right?

I am amazed at how fast Hugo builds and renders static sites!

One question on your data though.

All the benchmarks I have seen on Hugo use “Lorem Ipsum” dummy text generated solely for testing purposes. How realistic are these posts, considering they are all text only, with no image or video embeds?

Also, the template used for the build does not have a whole lot of Go template logic.

So I want to know how/if the build performance will change on “real” data that

  1. Has realistic news / media content like images, gifs, video embeds etc and
  2. Uses a template that has if... else logic, loops etc

I did notice in my own builds, with 400-ish posts that I migrated from WordPress to Markdown, that the build performance changes significantly from template to template.


#13

It is fairly easy to write inefficient Go templates. I did some work in this department for 0.15, but there are still some cases where, if you do stuff with Page.Site.Pages in a page template for every page in your site, it will slow things down. Hugo will be smarter about this in the future.
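To illustrate why that pattern hurts, here is a small sketch using Go's text/template with stand-in types. Site and Page below are hypothetical and much simpler than Hugo's real types, and the visit function exists only to count how many times the template body runs.

```go
package main

import (
	"fmt"
	"io"
	"text/template"
)

// Site and Page are simplified stand-ins, not Hugo's actual types.
type Site struct{ Pages []*Page }

type Page struct {
	Title string
	Site  *Site
}

// buildSite creates a site with n pages that all point back to it.
func buildSite(n int) *Site {
	site := &Site{}
	for i := 0; i < n; i++ {
		site.Pages = append(site.Pages, &Page{Title: fmt.Sprintf("Post %d", i), Site: site})
	}
	return site
}

// renderAll executes tmplText once per page and returns how many times
// the template called the counting function "visit".
func renderAll(site *Site, tmplText string) (visits int, err error) {
	funcs := template.FuncMap{
		"visit": func() string { visits++; return "" },
	}
	tmpl, err := template.New("page").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return 0, err
	}
	for _, p := range site.Pages {
		if err := tmpl.Execute(io.Discard, p); err != nil {
			return visits, err
		}
	}
	return visits, nil
}

func main() {
	site := buildSite(1000)

	simple, _ := renderAll(site, `{{ .Title }}`)
	listing, _ := renderAll(site, `{{ range .Site.Pages }}{{ visit }}{{ end }}`)

	fmt.Println("visits with a plain page template:  ", simple)
	fmt.Println("visits when ranging over .Site.Pages:", listing)
}
```

The work grows with the square of the page count: 1,000 pages means 1,000,000 loop iterations, and at 600,000 pages the same template would run its loop body about 3.6 × 10^11 times.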