Slow build

Hi. For a project with 74k pages, the build process takes 2 hours to complete. How can I speed it up?
My test project - link

Build Environment:
OS: Windows 10
Theme version: uBlogger 2.0.2
Hugo version: v0.82.0-9D960784+extended windows/amd64
CPU: i7-8700k
RAM: 64 GB

Template Metrics:

      cache     cumulative       average       maximum
  potential       duration      duration      duration  count  template
      -----     ----------      --------      --------  -----  --------
         34 3h43m12.8652976s  258.300198ms    2.1508279s  51850  partials/paginator.html
          0 3h1m46.7718164s  294.475182ms      2.24354s  37038  taxonomy/list.html
          0 1h58m2.516989s  956.321494ms    3.6619671s   7406  index.html
          0 43m9.8747477s  349.699533ms    2.0080836s   7406  _default/section.html
          0 6m43.1075993s    5.442986ms    916.7331ms  74060  posts/single.html
         52 1m44.6540826s     831.161µs    470.5185ms  125913  partials/head/seo.html
        100  1m9.6202124s     552.923µs    415.3596ms  125913  partials/assets.html
        100   52.7127869s     711.757µs     259.022ms  74060  partials/single/footer.html
         81   49.6175927s     394.062µs     163.911ms  125913  partials/head/link.html
         98   38.0491164s     302.185µs    120.9722ms  125913  partials/header.html
        100   28.4732132s     226.134µs    204.6176ms  125913  partials/init.html
         66   20.9978555s     166.764µs     58.9209ms  125913  partials/head/meta.html
         77   20.8493734s      55.195µs     52.5245ms  377739  partials/plugin/style.html
         33   15.8955116s      71.495µs    121.1824ms  222330  partials/function/content.html
          0   14.2811206s     192.831µs     52.9344ms  74060  _default/summary.html
        100   13.4774627s     107.037µs     71.1422ms  125913  partials/footer.html
         50   11.8410908s       23.51µs     38.8065ms  503652  partials/plugin/script.html
        100    8.5253161s      16.926µs     31.7966ms  503652  partials/scratch/script.html
         81    4.2298641s      57.114µs     10.0154ms  74060  partials/breadcrumbs.html
        100    2.9802139s      23.668µs     40.3442ms  125913  partials/plugin/compatibility.html
         33     2.886856s      12.984µs     10.0083ms  222330  partials/function/checkbox.html
         33    2.3565852s      10.599µs     10.0206ms  222330  partials/function/escape.html
          0    1.1522854s    1.1522854s    1.1522854s      1  sitemap.xml
        100    1.1021682s       8.753µs      20.138ms  125913  partials/plugin/analytics.html
        100    757.2052ms      10.224µs      4.1271ms  74060  partials/plugin/share.html
          0    142.8965ms    142.8965ms    142.8965ms      1  index.rss.xml
        100    106.9933ms      14.446µs      8.5102ms   7406  partials/home/profile.html
        100     41.3375ms         558ns      1.0225ms  74060  partials/under-post.html
          0      1.0196ms       509.8µs       509.8µs      2  taxonomy/terms.html
          0            0s            0s            0s     15  _internal/alias.html
          0            0s            0s            0s      1  404.html
          7            0s            0s            0s     13  partials/function/path.html
         69            0s            0s            0s    163  partials/rss/item.html
          0            0s            0s            0s     15  taxonomy/rss.xml
          0            0s            0s            0s      1  posts/rss.xml


                   |  EN
-------------------+--------
  Pages            | 74095
  Paginator pages  | 51835
  Non-page files   |     0
  Static files     |    88
  Processed images |     0
  Aliases          |    15
  Sitemaps         |     1
  Cleaned          |     0

Total in 7175916 ms

First, your average duration is very high (at least compared to what I am used to seeing).
For me it takes 11 ms to build 8 pages. And I am very sure that on my machine (worse CPU and RAM than yours) I/O is the bottleneck, not the CPU or the RAM.

I am not sure, though, how good Hugo is when it comes to multicore support etc. But maybe you could give us some more information about:

  1. How heavily is the computer (CPU/RAM/IO) utilized when building?
  2. What storage device (HDD, SSD, NVMe) do you have?

I personally run everything related to the web and to servers on Linux, but running on Windows should not affect this too much.

Anyway, a “partial build” command would be nice: first check where changes have been made, and then render only those pages (and the related files such as the sitemap) instead of always rebuilding the whole website.

[Image]
This screenshot shows a test build of 400k pages in progress; the SSD is not heavily loaded during it.

CPU Usage:

This at least looks like it does not utilize all cores effectively.
More shocking to me is that it uses 46 GB of RAM. But that is still not a bottleneck, as you have more.


If you look at what your Template Metrics is saying:

The paginator partial is what is taking up the most time. Optimise this, and also try using partialCached instead.


Not sure why you are comparing processor usage now :wink: The posted metrics are quite clear:

         34 3h43m12.8652976s  258.300198ms    2.1508279s  51850  partials/paginator.html
          0 3h1m46.7718164s  294.475182ms      2.24354s  37038  taxonomy/list.html
          0 1h58m2.516989s  956.321494ms    3.6619671s   7406  index.html
          0 43m9.8747477s  349.699533ms    2.0080836s   7406  _default/section.html
          0 6m43.1075993s    5.442986ms    916.7331ms  74060  posts/single.html
         52 1m44.6540826s     831.161µs    470.5185ms  125913  partials/head/seo.html
        100  1m9.6202124s     552.923µs    415.3596ms  125913  partials/assets.html

The 100 in the left column tells you to replace all calls to partials/assets.html with partialCached. After that, most of the time is spent in taxonomy/list.html and partials/paginator.html. Increasing the “posts per page” value will decrease your pagination time, but pagination will always be the most time-hungry task. Try working with the internal tools like nextpage/prevpage instead of a nice numbered pagination (if you use one). I'm not sure about the taxonomy/list.html issues, but if you post the layout here we might be able to figure them out.
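
To make the effect of the “posts per page” value concrete, here is a rough Python sketch of how many pager pages a single paginated list produces. The function name `pager_pages` is made up for illustration; the post count is the posts/single.html count from the metrics posted above.

```python
def pager_pages(total_posts: int, per_page: int) -> int:
    """Number of pager pages needed to list total_posts at per_page each."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    # Integer ceiling division: a partly filled last page still counts.
    return (total_posts + per_page - 1) // per_page


# 74060 is the posts/single.html count from the template metrics.
for per_page in (10, 20, 40, 80):
    print(per_page, pager_pages(74060, per_page))
```

With 10 posts per page this gives 7406, which happens to match the index.html and _default/section.html counts in the metrics. Every paginated list (home page, section, each taxonomy term) adds its own pager pages on top of that.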

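The index.html shouldn’t take that long, because it’s just one page (the home page). However, its count of 7406 in the metrics suggests it is being rendered once per pagination page rather than once.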

And lastly: there might be something wrong with your SEO partial, because I would expect it to be 0% cacheable. It should differ from page to page; it won’t help with SEO if it is the same across multiple pages.


Thank you. Are there any Hugo themes with an implementation of this?

Thank you. Here’s a link to a build-ready test project - https://www.dropbox.com/s/lakg1vnrwhqbp2y/test_site.zip?dl=1

I pared the test site down to 5000 posts.

  paginate value in config.toml | Build time
 -------------------------------+------------
                             10 | 45s
                             20 | 24s
                             40 | 18s
                             80 | 11s

The paginator partial is called from:

  • themes/uBlogger/layouts/index.html
  • themes/uBlogger/layouts/_default/section.html
  • themes/uBlogger/layouts/post/posts-by-lastmod.html
  • themes/uBlogger/layouts/post/posts-by-views.html
  • themes/uBlogger/layouts/taxonomy/list.html

So that’s a lot of pagination for a large site, with very little functional return. I can’t imagine anyone paging through (75000/10 = 7500) pages to find the post they really, really want. That’s what search is for.

It seems like using a “recent posts” approach might be better. For example, limit the home page to the 30 most recent posts, and page through that (3 pages of posts instead of 7500).
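
The arithmetic behind the “recent posts” idea can be sketched as follows. This is only an illustration in Python: `capped_pager_pages` and its `limit` parameter are hypothetical names, and the numbers are the ones used in this post. In a Hugo template the cap would presumably be applied to the collection (for example with the `first` function) before it is paginated; that part is not shown here.

```python
def capped_pager_pages(total_posts, per_page, limit=None):
    """Pager pages for one list, optionally capped to the `limit` newest posts."""
    listed = total_posts if limit is None else min(total_posts, limit)
    # Integer ceiling division: a partly filled last page still counts.
    return (listed + per_page - 1) // per_page


full = capped_pager_pages(75000, 10)              # every post reachable by paging
recent = capped_pager_pages(75000, 10, limit=30)  # only the 30 most recent posts
print(full, recent)  # 7500 3
```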


Thank you all for the recommendations.

  1. For the test, I replaced “partial” with “partialCached” in all theme files. This reduced the build time by about 30%, but, as expected, the highlighting of the active pagination page stopped working.
  2. When building large projects, the build time does not increase linearly. When I enable debugging, I see that the first pagination pages are generated faster than the later ones, with no shortage of resources. With a large number of pages it may take up to several minutes to generate one pagination page. I think there is a problem with the processing logic.
  3. When the pagination size is increased the speed of building increases significantly, but it still takes a lot of time.

Is that what you meant to say? It “increases significantly” but “still takes a lot of time.” Did you mean it doesn’t increase significantly?


Here are the results of building a test project with 5k posts:
paginate=10 time=64sec
paginate=70 time=15sec

Here are the results of building a test project with 440k posts and Template Metrics:
paginate=70 time=21hour

Template Metrics:

      cache     cumulative       average       maximum
  potential       duration      duration      duration  count  template
      -----     ----------      --------      --------  -----  --------
          0 21h18m19.5846794s  12.207478064s   25.6204492s   6283  index.html
          0 6h30m6.1835117s  744.874248ms    8.9105094s  31423  taxonomy/list.html
         34 5h52m28.8094889s  480.774954ms    4.8092696s  43989  partials/paginator.html
          0 4h48m29.5687443s   39.358713ms    2.5501426s  439790  posts/single.html
          0 1h35m14.6294192s  909.538344ms    9.1915207s   6283  _default/section.html
         78 1h5m36.9829158s    8.137927ms    2.4541406s  483782  partials/head/seo.html
        100 54m42.9727038s    6.786057ms    2.3661334s  483782  partials/assets.html
         20 32m46.980674s    1.490565ms    2.3201311s  1319620  partials/function/content.html
          0  27m3.157284s    1.845377ms    2.3171313s  879580  _default/_markup/render-link.html
         94  22m1.954616s    2.732541ms    2.3491331s  483782  partials/head/link.html
          0 14m5.3290819s    1.922119ms    1.9221076s  439790  _default/_markup/render-image.html
        100 13m41.1202268s    1.867073ms     2.209125s  439790  partials/single/footer.html
        100  9m0.1503981s    1.116516ms    1.5960891s  483782  partials/init.html
         99  8m2.3640138s     997.068µs    438.0259ms  483782  partials/header.html
         93  7m8.6851108s     974.749µs    1.7610991s  439790  partials/plugin/image.html
          0 5m23.2277344s     734.959µs    435.0238ms  439790  _default/summary.html
         77 3m49.0796785s     157.839µs     86.0066ms  1451346  partials/plugin/style.html
         78 3m28.2154763s     430.391µs     20.0515ms  483782  partials/head/meta.html
          0 3m17.6425625s      112.35µs     18.9994ms  1759160  _default/_markup/render-heading.html
         50 2m54.3572148s      90.101µs     24.0001ms  1935128  partials/plugin/script.html
        100 2m46.5719862s     126.251µs    181.0098ms  1319370  partials/function/resource.html
        100 1m55.8300995s      59.856µs     57.0032ms  1935128  partials/scratch/script.html
         66 1m35.3392137s     108.391µs     17.9562ms  879580  partials/plugin/link.html
        100 1m28.9863911s     183.939µs     80.0047ms  483782  partials/footer.html
         81 1m22.9060416s     188.512µs     20.0123ms  439790  partials/breadcrumbs.html
         20   58.2277445s      44.124µs     31.0034ms  1319620  partials/function/checkbox.html
         20   53.7947319s      40.765µs    209.0125ms  1319620  partials/function/escape.html
        100   34.9459366s      72.234µs     18.9839ms  483782  partials/plugin/compatibility.html
          0   32.0748369s   32.0748369s   32.0748369s      1  sitemap.xml
        100   16.0309971s      33.136µs     15.0005ms  483782  partials/plugin/analytics.html
        100   13.2331704s      30.089µs     18.9489ms  439790  partials/plugin/share.html
          0    3.3352101s    3.3352101s    3.3352101s      1  index.rss.xml
          0    2.1439377s   85.757508ms     100.924ms     25  taxonomy/rss.xml
         66    1.6285396s     6.16871ms     11.0595ms    264  partials/rss/item.html
        100    1.0701647s       2.433µs     15.0005ms  439790  partials/under-post.html
        100    137.0349ms       21.81µs      1.0134ms   6283  partials/home/profile.html
          0     94.9295ms     94.9295ms     94.9295ms      1  posts/rss.xml
          0     42.9981ms    21.49905ms     34.9978ms      2  taxonomy/terms.html
          0      4.9858ms      4.9858ms      4.9858ms      1  404.html
          2      3.1557ms      63.114µs      1.0838ms     50  partials/function/path.html
          0       999.7µs      39.988µs       999.7µs     25  _internal/alias.html


                   |   EN
-------------------+---------
  Pages            | 439845
  Paginator pages  |  43964
  Non-page files   |      0
  Static files     |     89
  Processed images |      0
  Aliases          |     25
  Sitemaps         |      1
  Cleaned          |      0

Total in 77723218 ms
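
As a rough check on the non-linear growth, the two metrics tables in this thread can be compared on posts/single.html, whose per-page work should not depend on the paginate value. The numbers below are copied from those tables; the variable names are only for illustration. The two builds may not have used identical content (the larger build's metrics include render hooks that the smaller one's do not), so treat this as an indication rather than a measurement.

```python
# (posts rendered, average posts/single.html duration in ms), from the tables.
small = (74060, 5.442986)
large = (439790, 39.358713)

growth_in_posts = large[0] / small[0]
growth_in_avg_ms = large[1] / small[1]

print(f"posts grew {growth_in_posts:.1f}x")
print(f"average render time per post grew {growth_in_avg_ms:.1f}x")
```

If build time scaled linearly with the number of posts, the average per post would stay roughly constant. Here it grows about 7x while the post count grows about 6x.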

I understand the general point about speed vs. volume, and the general interest in this.
But just out of curiosity, @Andrey1, what is your rationale for keeping pagination for 440k posts?

I think this is actually a good challenge, because it exposes problems with scalability.
If big players like the New York Times ever wanted to switch to Hugo with their 20 million subsites, how long would it take to generate them, and how could that render/build time be optimized? Even for small sites, people should optimize their render/build times, as it saves time.

I always try to look at the extremes, very few pages and very many pages, to identify bottlenecks and to make things more effective and efficient. So I personally like the topic, and the fact that he could at least cut the build time by about 30%!

I think everyone agrees with you, especially Bep, who has already stated clearly that he has some good ideas for working on those areas, and some work in progress.

But the point of this post is: what is the use case for having full pagination for 750,000 pages?
Where is the human who will search for something this way?
So even if 750,000 pages with pagination built in under 10 seconds, I suspect this wouldn’t make sense from a design/UI/user point of view.


Clearly, the conclusion is that it’s easier to cut out pagination. But what about tags? A tag will also contain 10,000 posts, so we cut those out as well. In the end, we cut everything out and leave the user with nothing but search.

“Who will flip through that many posts?” sounds logical, but it is not, since most sites have full pagination.

A store may need it. Hugo is not particularly well suited to stores, but a store can still be built with it. On portfolio sites it would be nice to have pagination too. On illustrators’ websites, there is no need for search either.

I’m not arguing against pagination.
Pagination is good, and your examples are fine.

The point is: how many objects will you paginate?
A shop with 100 pages to search through for items? Mmmmm, not sure about its conversion rate.
A portfolio with 100 pages? Mmmmm, are you giving up before 90? Before 70?

“since most sites have full pagination.”
And are all those sites effective at achieving their goals?

That is like saying “it’s better not to transplant hearts anymore, they take too long to transplant.”

Your main aim should be the usability and accessibility of the resulting website (apart from interesting content) for the end-user, not how convenient or inconvenient the act of creating the site is.

If a page is a catalogue of, for instance, all Marvel characters, there is no need for pagination so that people can go 10 or 20 characters forward. Have a proper search so I can skip right to Korg.

You could sort your taxonomies like it’s done on wikia, with a small box for each item with a title and a link, instead of a full-page preview. This way you can put more items on one list page and reduce the time to compile pagination for that (50 per page, 100 per page).

And lastly: if it’s indeed a weblog with thousands of entries, why not try something along the lines of prev-page, next-page (for the timeline readers) and prev-month, prev-year, next-month, next-year (for the skimmers)? This would be pseudo-static for the month/year (partialCached per month/year), and prev/next might be easier on the compilation time.


In my experience, Google indexes sites, including online stores, better through pagination, tags, and filters than through other means. Again, this is an assumption based only on the results of practical experiments, and it may not hold in other cases.

As you may have noticed, @jmooring and @bep are currently working on speeding up builds, so maybe there will be a Hugo release in the near future. That’s great. Faster builds will allow me to move from Django to static sites on many projects and to improve, among other things, Core Web Vitals, which have recently become relevant.