171.456 docs, 22 taxonomies, 20 minutes

I’ve done some more structured testing (with default pagination). I created 1,000 random articles with simple tags/categories, and copied them into 20 different sections, for a total of 20,000 content files. Then I wrote a standalone script that would generate X taxonomies each containing Y terms, and insert 1-X random taxonomies into each article, each with 1-5 random terms.
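For anyone who wants to reproduce this, here is a minimal sketch of what such a generator could look like. It is not the script I actually ran: the function names, the synthetic `taxNN` naming (the real run used random dictionary words), and the TOML-style output lines are all assumptions made for illustration.

```python
import json
import random


def make_taxonomies(num_taxonomies, terms_per_taxonomy):
    """Build X taxonomies with Y terms each, using synthetic names."""
    return {
        f"tax{t:02d}": [f"t{t:02d}-term{i:05d}" for i in range(terms_per_taxonomy)]
        for t in range(num_taxonomies)
    }


def pick_terms(taxonomies, rng, min_terms=1, max_terms=5):
    """Choose 1..X taxonomies, then min_terms..max_terms terms from each."""
    names = sorted(taxonomies)
    chosen = rng.sample(names, rng.randint(1, len(names)))
    picked = {}
    for name in sorted(chosen):
        terms = taxonomies[name]
        # Never ask for more terms than the taxonomy actually holds.
        count = min(rng.randint(min_terms, max_terms), len(terms))
        picked[name] = sorted(rng.sample(terms, count))
    return picked


def front_matter_lines(picked):
    """Render the picks as TOML-style lines (assumed front matter format)."""
    return [f"{name} = {json.dumps(terms)}" for name, terms in picked.items()]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a rerun produces the same content
    taxonomies = make_taxonomies(num_taxonomies=4, terms_per_taxonomy=10_000)
    for line in front_matter_lines(pick_terms(taxonomies, rng)):
        print(line)
```

The printed lines would then be spliced into each article's front matter. Passing `min_terms=6, max_terms=10` corresponds to the heavier 6-10 terms/taxonomy variant.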

Without the random taxonomies, build time was 268956 ms.

With one 1,000-term taxonomy, it was 290217 ms.

With ten 1,000-term taxonomies, it was 332273 ms.

With one 10,000-term taxonomy, it was 316232 ms.

With four 10,000-term taxonomies, it was 427258 ms.

With four 10,000-term taxonomies and 6-10 terms/taxonomy, it was still only 511538 ms:

0 draft content
0 future content
0 expired content
20000 regular pages created
80088 other pages created
20 non-page files copied
75566 paginator pages created
10000 drove created
9999 pneumonographic created
10 categories created
10 tags created
9999 pleasingness created
9999 smidgen created
total in 511538 ms

So I decided to go for broke, and generated twenty 10,000-term taxonomies with 1-5 terms/taxonomy. The build has been running for 50 minutes so far, using only a single core and 4.2 GB of RAM, not spinning up the fans, and so far it has only written out the static files.

If I’m reading the stack trace correctly, only one thread is active, and it’s spending all of its time in assemble().

Update: Final results after 68 minutes:

0 draft content
0 future content
0 expired content
20000 regular pages created
371332 other pages created
0 non-page files copied
207794 paginator pages created
9253 formalesque created
9211 ankyloglossia created
9285 accidie created
9294 cholestanol created
9291 hala created
9280 undisgraced created
9273 brocho created
9270 subsist created
9252 featherless created
9275 turner created
9290 unawfully created
9280 overwalk created
9300 dicker created
9246 electoral created
9302 antalkali created
9296 overdaintily created
9284 tomeful created
9316 extrafloral created
9322 coruscation created
10 categories created
9283 scranny created
10 tags created
total in 4073392 ms

So, compared with the four-taxonomy run (427258 ms), that's 5x the number of taxonomies/terms for roughly 10x the runtime, and most of that was spent in a single-threaded routine that was neither reading from nor writing to the disk.
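The arithmetic behind that comparison can be checked with a few lines of Python. The timings are the ones reported above. Treating the no-extra-taxonomy run as a baseline to subtract is my own assumption, since the runs were not otherwise identical.

```python
# Build times in milliseconds, copied from the runs reported above.
BASELINE_MS = 268956          # no random taxonomies
FOUR_TAXONOMIES_MS = 427258   # four 10,000-term taxonomies, 1-5 terms each
TWENTY_TAXONOMIES_MS = 4073392  # twenty 10,000-term taxonomies, 1-5 terms each


def runtime_ratio(slow_ms, fast_ms):
    """How many times longer the slow build took."""
    return slow_ms / fast_ms


def overhead_ratio(slow_ms, fast_ms, baseline_ms):
    """Same comparison after subtracting the baseline from both runs."""
    return (slow_ms - baseline_ms) / (fast_ms - baseline_ms)


if __name__ == "__main__":
    total = runtime_ratio(TWENTY_TAXONOMIES_MS, FOUR_TAXONOMIES_MS)
    extra = overhead_ratio(TWENTY_TAXONOMIES_MS, FOUR_TAXONOMIES_MS, BASELINE_MS)
    print(f"taxonomy count ratio: {20 / 4:.0f}x")
    print(f"total runtime ratio:  {total:.1f}x")
    print(f"overhead-only ratio:  {extra:.1f}x")
    print(f"final run in minutes: {TWENTY_TAXONOMIES_MS / 60000:.1f}")
```

If the baseline subtraction is a fair comparison, the cost attributable to the taxonomies grows much faster than the taxonomy count, which fits with the time going into a single non-parallel step.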

-j