171.456 docs, 22 taxonomies, 20 minutes

bep · May 22, 2017, 6:25pm

Hugo build with 200K+ pages, see above.

bep · May 31, 2017, 7:46am

@JLKM @jgreely did you by any chance use TOML as page front matter?

If so, see https://github.com/spf13/hugo/issues/3464

And

https://github.com/spf13/hugo/issues/3541

If yes, it would also be interesting if you could repeat the test with YAML.

bep · May 31, 2017, 10:21am

Also see

https://github.com/spf13/hugo/pull/3545

jgreely · May 31, 2017, 7:32pm

Yes, for all of my tests. I don’t have time to set up the exact same tests again at the moment, but I do have a work-in-progress site with 50,000+ recipes in 67 sections and 782 categories (0-8 categories per article, with most having 1-2). Using 0.21 on a Mac, here’s the TOML versus YAML comparison.

TOML:

0 draft content
0 future content
0 expired content
56842 regular pages created
851 other pages created
0 non-page files copied
9198 paginator pages created
782 categories created
total in 572439 ms

YAML:

Built site for language en:
0 draft content
0 future content
0 expired content
56842 regular pages created
851 other pages created
0 non-page files copied
9198 paginator pages created
782 categories created
total in 560110 ms

Soon as I have a chance, I’ll replicate the torture test, since this one doesn’t show a huge difference. Oh, and before I forget to mention it, in all of these tests, I’ve been working in a directory that’s excluded from Spotlight indexing, which can seriously interfere with timing.

Amusing side note: when I started building the recipe site (which bulk-converts MasterCook MX2 files from various archives into Hugo content files), I grabbed the Zen theme for a simple, clean look, and watched the first build eat my memory and disk, because the theme embeds links to every content page in the navbar and sidebar. Every file in public was over 5 MB in size, and top reported it using 40GB of compressed pages when I killed it.

-j

bep · May 31, 2017, 7:39pm

@jgreely thanks, looking at your “monster test”, you have lots of different taxonomy terms (not very realistic, maybe?), which I have not tested well – I will add that variant to my benchmarks as well.

jgreely · May 31, 2017, 7:47pm

I’d have called it completely unrealistic if it weren’t for JKLM’s original site, which has 22 taxonomies ranging from 6 to 8,163 terms (mean 3,763, median 345).

-j

jgreely · June 1, 2017, 4:46pm

I replicated the big test and kicked it off just before going to bed last night. I took the same 1,000 randomly-generated articles, replicated them into a total of 20 sections, and then used my script to add 20 10,000-term taxonomies. I ended up with a total of four sites, generating the YAML versions with rsync and hugo convert --unsafe toYAML:

TOML, no additional taxonomies: 363945 ms
YAML, no additional taxonomies: 325886 ms
TOML, 20x big taxonomies: 3460447 ms
YAML, 20x big taxonomies: 3548300 ms

Yes, the baseline test took about 12% longer with TOML, while the really big test took 2.5% longer with YAML. I suspect that any parsing issues are small compared to the amount of time it spends in the single-threaded assemble() function, which is what the stack traces of my earlier big test showed.
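
For anyone who wants to check those percentages, here is a minimal sketch of the arithmetic in Go. The helper name `pctLonger` is made up for illustration; the millisecond figures are the four build times listed above.

```go
package main

import "fmt"

// pctLonger reports how much longer slowMs is than fastMs,
// expressed as a percentage of fastMs.
// Hypothetical helper, only here to show the arithmetic.
func pctLonger(slowMs, fastMs float64) float64 {
	return (slowMs - fastMs) / fastMs * 100
}

func main() {
	// Build times in milliseconds, copied from the four runs above.
	fmt.Printf("baseline: TOML took %.1f%% longer than YAML\n",
		pctLonger(363945, 325886))
	fmt.Printf("20x big taxonomies: YAML took %.1f%% longer than TOML\n",
		pctLonger(3548300, 3460447))
}
```

With these inputs the two figures come out at roughly 11.7% and 2.5%. A single run per configuration says nothing about variance, so treat both as rough.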

Ideally, I’d run each test 10 times to make sure the timing differences are real, but at the very least, I can say that YAML isn’t obviously faster on a big site with lots of taxonomies.

-j

bep · June 1, 2017, 4:51pm

Yes, I have reproduced it in my benchmarks now – the TOML issue does have a fair effect on “normal sites”, but not what you’re dealing with.

bep · June 3, 2017, 8:39am

I have looked into this, and for the “multiple tags per page” case, my surprising conclusion is that the front matter handling of lists is the big factor here.

I did a quick hack where I accepted tags as a CSV string and just split it into tags:

benchmark                                                                    old ns/op       new ns/op     delta
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=20-4     4008327269      68389054      -98.29%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=50-4     9946870164      80731086      -99.19%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=80-4     15994140132     87300135      -99.45%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=20-4     4004805422      77224226      -98.07%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=50-4     9886487339      87193850      -99.12%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=80-4     15525606788     101154964     -99.35%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=20-4     73795543        69456250      -5.88%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=50-4     94281466        76522987      -18.84%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=80-4     113422149       87748893      -22.64%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=20-4     84782504        75945331      -10.42%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=50-4     104024340       90037331      -13.45%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=80-4     126581039       109897153     -13.18%

benchmark                                                                    old allocs     new allocs     delta
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=20-4     32880961       325913         -99.01%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=50-4     81733175       356537         -99.56%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=80-4     130582482      387931         -99.70%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=20-4     32902873       338776         -98.97%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=50-4     81752074       369615         -99.55%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=80-4     130602150      400849         -99.69%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=20-4     439205         328285         -25.25%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=50-4     636468         359278         -43.55%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=80-4     832516         390919         -53.04%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=20-4     461346         341230         -26.04%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=50-4     668633         372878         -44.23%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=80-4     875490         404527         -53.79%

benchmark                                                                    old bytes      new bytes     delta
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=20-4     1279495480     37490478      -97.07%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=50-4     3165165576     40039393      -98.73%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=100,tags_per_page=80-4     5123007552     42866740      -99.16%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=20-4     1250653472     38725958      -96.90%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=50-4     3131357648     41197570      -98.68%
BenchmarkSiteBuilding/YAML,num_pages=500,num_tags=500,tags_per_page=80-4     5058255136     44584950      -99.12%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=20-4     38848232       36935079      -4.92%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=50-4     45143742       40513162      -10.26%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=100,tags_per_page=80-4     51166390       45180727      -11.70%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=20-4     41045064       38201983      -6.93%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=50-4     48266240       41843627      -13.31%
BenchmarkSiteBuilding/TOML,num_pages=500,num_tags=500,tags_per_page=80-4     56945800       47070758      -17.34%

Note that the benchmark above uses the BurntSushi TOML library (which is in the current Hugo master); it is much faster than the TOML library in the Hugo release version.

The work done in the above is the site building excluding rendering.
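
The CSV hack itself is not shown in the thread, so the following is only a rough sketch of the idea: take the tags as one comma-separated string and split it, rather than letting the front matter parser build a list. The function name `splitTags` and the trimming and empty-entry handling are assumptions, not Hugo code.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTags turns a comma-separated front matter value such as
// "go, hugo, performance" into a slice of trimmed, non-empty tags.
// Hypothetical helper: not taken from the Hugo code base.
func splitTags(csv string) []string {
	parts := strings.Split(csv, ",")
	tags := make([]string, 0, len(parts))
	for _, p := range parts {
		// Drop surrounding whitespace and skip empty entries
		// (for example from a trailing comma).
		if t := strings.TrimSpace(p); t != "" {
			tags = append(tags, t)
		}
	}
	return tags
}

func main() {
	fmt.Println(splitTags("go, hugo, performance"))
}
```

The point of the experiment is that a single string value is cheap for any front matter parser to decode, so the per-tag list handling drops out of the measurement.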

This is a comparison of the tags front matter handling of the different file types (this is not BurntSushi):

```
BenchmarkFrontmatterTags/JSON:1-4    1000000     2039.00 ns/op      912 B/op     20 allocs/op
BenchmarkFrontmatterTags/JSON:11-4    300000     5202.00 ns/op     1640 B/op     44 allocs/op
BenchmarkFrontmatterTags/JSON:21-4    200000     7993.00 ns/op     2392 B/op     65 allocs/op
BenchmarkFrontmatterTags/YAML:1-4     200000     9359.00 ns/op     5928 B/op     66 allocs/op
BenchmarkFrontmatterTags/YAML:11-4    100000    21218.00 ns/op     8408 B/op    140 allocs/op
BenchmarkFrontmatterTags/YAML:21-4     50000    32852.00 ns/op    10920 B/op    211 allocs/op
BenchmarkFrontmatterTags/TOML:1-4     100000    21505.00 ns/op     9231 B/op    173 allocs/op
BenchmarkFrontmatterTags/TOML:11-4     20000    82919.00 ns/op    19808 B/op    625 allocs/op
BenchmarkFrontmatterTags/TOML:21-4     10000   141847.00 ns/op    31200 B/op   1106 allocs/op
```

This is the same test with BurntSushi TOML lib:

```
benchmark                              iter        time/iter   bytes alloc          allocs
---------                              ----        ---------   -----------          ------
BenchmarkFrontmatterTags/JSON:1-4    500000    2161.00 ns/op      912 B/op    20 allocs/op
BenchmarkFrontmatterTags/JSON:11-4   200000    5565.00 ns/op     1640 B/op    44 allocs/op
BenchmarkFrontmatterTags/JSON:21-4   200000    7907.00 ns/op     2392 B/op    65 allocs/op
BenchmarkFrontmatterTags/YAML:1-4    200000    9511.00 ns/op     5928 B/op    66 allocs/op
BenchmarkFrontmatterTags/YAML:11-4   100000   21651.00 ns/op     8408 B/op   140 allocs/op
BenchmarkFrontmatterTags/YAML:21-4    50000   33415.00 ns/op    10920 B/op   211 allocs/op
BenchmarkFrontmatterTags/TOML:1-4    200000    7670.00 ns/op     2912 B/op    60 allocs/op
BenchmarkFrontmatterTags/TOML:11-4   100000   17948.00 ns/op     5184 B/op   138 allocs/op
BenchmarkFrontmatterTags/TOML:21-4    50000   28741.00 ns/op     7536 B/op   210 allocs/op
```

The tag count is 1, 11 and 21.
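
The benchmark source is not included in the thread. As a rough sketch of how a per-tag-count measurement like the JSON rows could be set up, the program below uses only the Go standard library (`encoding/json` and `testing.Benchmark`). The fixture generator `frontMatterJSON` and the helper `parseTags` are hypothetical, and the numbers it prints will not match the tables above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"testing"
)

// frontMatterJSON builds a small JSON front matter document with n tags.
// Hypothetical fixture; the one behind the numbers above is not shown.
func frontMatterJSON(n int) []byte {
	tags := make([]string, n)
	for i := range tags {
		tags[i] = fmt.Sprintf("%q", fmt.Sprintf("tag%d", i))
	}
	return []byte(`{"title": "Benchmark", "tags": [` + strings.Join(tags, ", ") + `]}`)
}

// parseTags decodes the front matter into a generic map and
// returns how many tags it found.
func parseTags(data []byte) (int, error) {
	var fm map[string]interface{}
	if err := json.Unmarshal(data, &fm); err != nil {
		return 0, err
	}
	tags, _ := fm["tags"].([]interface{})
	return len(tags), nil
}

func main() {
	// Same tag counts as in the tables above.
	for _, n := range []int{1, 11, 21} {
		data := frontMatterJSON(n)
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				if _, err := parseTags(data); err != nil {
					b.Fatal(err)
				}
			}
		})
		fmt.Printf("JSON:%d\t%d ns/op\t%d B/op\t%d allocs/op\n",
			n, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
	}
}
```

Covering YAML and TOML the same way would mean swapping in the respective third-party decoders, which are left out here to keep the sketch dependency-free.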
