New Collections.D Function vs generating five random numbers

I tested the new collections.D function against generating random numbers, as discussed here, and these are the results. The test site was generated with hugo new theme test --themesDir . and 1000 pages were added under /content/posts/ with a bash script. The following code was used in page.html, one variant at a time:

{{ $t := debug.Timer "TestRandom" }}
    {{- $pages := $.CurrentSection.RegularPages }}
    {{- range seq 3 }}
      {{- with math.Rand | mul $pages.Len | math.Floor | int }}
        {{- with collections.Index $pages . }}
            <a href="{{ .RelPermalink }}">
              <span>{{ with .Parent }}{{ .Title }}{{ else }}{{ .Title }}{{ end }}</span>
              <h2>
                {{ .Title }}
              </h2>
            </a>                    
        {{- end}}
      {{- end }}
    {{- end }}
{{ $t.Stop }}
{{ $t := debug.Timer "TestCollectionsD" }}
    {{- $p := $.CurrentSection.RegularPages }}
    {{ range collections.D $p.ByDate 3 ($p | len) }}
      {{ with (index $p .) }}
            <a href="{{ .RelPermalink }}">
              <span>{{ with .Parent }}{{ .Title }}{{ else }}{{ .Title }}{{ end }}</span>
              <h2>
                {{ .Title }}
              </h2>
            </a>                    
        {{- end}}
      {{- end }}
{{ $t.Stop }}

The results (run 3 times). Public folder deleted after each test.

  1. TestRandom
INFO  timer:  name TestRandom count 1000 duration 246.01199ms average 246.011µs median 198.349µs
INFO  timer:  name TestRandom count 1000 duration 231.67484ms average 231.674µs median 212.158µs
INFO  timer:  name TestRandom count 1000 duration 232.786682ms average 232.786µs median 213.086µs
  2. Collections.D
INFO  timer:  name TestCollectionsD count 1000 duration 687.046636ms average 687.046µs median 586.14µs
INFO  timer:  name TestCollectionsD count 1000 duration 715.324944ms average 715.324µs median 606.359µs
INFO  timer:  name TestCollectionsD count 1000 duration 711.437674ms average 711.437µs median 603.998µs

Your thoughts? @jmooring @irkode

My bad! collections.D does not shuffle. I misunderstood its use! :grimacing:

The example above generates the same random numbers each time it is called.

The first argument passed to the collections.D function is the seed value, not a sorted page collection. Not sure what you’re trying to do here.
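For illustration, here is a minimal sketch of the second code block from the initial post with an integer seed in the first position. The seed 42 is an arbitrary choice, and the sketch assumes the current section contains at least three regular pages:

```go-html-template
{{/* Sketch only: the first argument is an integer seed, not a page collection. */}}
{{ $p := $.CurrentSection.RegularPages }}

{{/* With a constant seed such as 42, the same three indices come back on every call. */}}
{{ range collections.D 42 3 ($p | len) }}
  {{ with index $p . }}
    <a href="{{ .RelPermalink }}">{{ .Title }}</a>
  {{ end }}
{{ end }}
```

No sorting of $p is needed for the call itself, because only the length of the collection is passed to collections.D.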

For comparison, I modified the example site we used in the referenced topic to test collections.D as well:

git clone --single-branch -b hugo-forum-topic-54636 https://github.com/jmooring/hugo-testing hugo-forum-topic-54636
cd hugo-forum-topic-54636
hugo --logLevel info

Result:

TestListRandomPagesMethodD count 10451 duration 1.636667303s average 156.603µs median 116.838µs
TestListRandomPagesMethodA count 10451 duration 3.615900262s average 345.986µs median 294.806µs
TestListRandomPagesMethodB count 10451 duration 5.182145369s average 495.851µs median 408.615µs
TestListRandomPagesMethodC count 10451 duration 12.536497344s average 1.19955ms median 778.693µs

In the results above, TestListRandomPagesMethodD uses the collections.D function, and is much faster than anything else we tried. By comparison, it’s about 8 times faster than using collections.Shuffle (TestListRandomPagesMethodC) in this example.

Note that the Method D implementation in this example uses now.UnixNano as the seed value, so the result of collections.D is very unlikely to be duplicated or cached.

Finally, the list of random pages created by Method D retains the same order as the original page collection. If you want to randomize the order of the list, pass the result through shuffle. Since we’re working with a very small slice, the performance impact is negligible.
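A rough sketch of that last step, assuming site.RegularPages holds at least five pages. The variable names $p and $indices are illustrative and are not taken from the test repository:

```go-html-template
{{/* Pick 5 unique indices; they follow the order of the page collection. */}}
{{ $p := site.RegularPages }}
{{ $indices := collections.D (int time.Now.UnixNano) 5 ($p | len) }}

{{/* Shuffle only the small slice of indices, not the full page collection. */}}
{{ range shuffle $indices }}
  {{ with index $p . }}
    <a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a>
  {{ end }}
{{ end }}
```

Shuffling five integers is cheap compared with shuffling thousands of pages, which is why the extra step should barely register in the timings.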

Right. That’s why you need to change the seed value.

Choosing an appropriate seed value depends on your objective.

| Objective | Seed example |
|---|---|
| Consistent result | 42 |
| Different result on each call | int time.Now.UnixNano |
| Same result per day | time.Now.YearDay |
| Same result per page | hash.FNV32a .Path |
| Different result per page per day | hash.FNV32a (print .Path time.Now.YearDay) |
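As a hedged example of one of these seeds in context, the sketch below uses the per-page, per-day seed in a page template. The variable names and the count of three are assumptions for illustration, and the site is assumed to have at least three regular pages:

```go-html-template
{{/* Seed from the page path plus the day of the year:
     each page gets its own selection, and it changes when the day changes. */}}
{{ $seed := hash.FNV32a (print .Path time.Now.YearDay) }}
{{ $p := site.RegularPages }}

{{ range collections.D $seed 3 ($p | len) }}
  {{ with index $p . }}
    <a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a>
  {{ end }}
{{ end }}
```

Swapping the $seed line for any other expression from the seed list changes the behavior without touching the rest of the template.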

The first one does not ensure unique numbers.
The second one, as pointed out, has an unnecessary sort.

I'm offline, so I have no way to try it out, but I trust @jmooring's measurements.

What do you mean by “first one”?

Can you explain?

What do you mean by “second one”?

Can you explain?

The test results are from the template code in layouts/page.html here:

git clone --single-branch -b hugo-forum-topic-54636 https://github.com/jmooring/hugo-testing hugo-forum-topic-54636
cd hugo-forum-topic-54636

They all point to @tyco's initial post.

The first code block uses random numbers as indices, with no check whether a number has been drawn before…

The second code block uses $p.ByDate as the seed. I guess this requires some sorting.

so nothing new here.

So the runtime (speed) is one thing. The difference gets even more prominent if you look at the memory allocation stats. There are some benchmarks in the Hugo code base that test the template function:

BenchmarkD2/n=5,max=100-10              35292430                31.91 ns/op            0 B/op          0 allocs/op
BenchmarkD2/n=50,max=1000-10            37702552                31.88 ns/op            0 B/op          0 allocs/op
BenchmarkD2/n=10,max=10000-10           37686912                31.92 ns/op            0 B/op          0 allocs/op
BenchmarkD2/n=500,max=10000-10          37796763                32.29 ns/op            0 B/op          0 allocs/op
BenchmarkD2/n=10,max=500000-10          37493311                31.97 ns/op            0 B/op          0 allocs/op
BenchmarkD2/n=5000,max=500000-10        37566424                31.88 ns/op            0 B/op          0 allocs/op

The last line picks 5000 random integers between 0 and 500000 in 32 nanoseconds (!) with 0 memory allocations. Try to do that with shuffle.

The Best Algorithm No One Knows About | Kerf blog uses “generate a list of random Tweets” as an example that’s not practical to solve via shuffle.

Error: error building site: render: failed to render pages: render of "/home/tifenak/sites/hugo-forum-topic-54636/content/10-pages/10-pages-01.md" failed: ""/home/tifenak/sites/hugo-forum-topic-54636/layouts/page.html:93:27": execute of template failed: template: page.html:93:27: executing "main" at <now>: wrong type for value; expected int; got int64

@bep did your pull request for this issue mess up UnixNano? v0.149.1 is also affected.

For now, pass the value through the int function. I’ve updated the test repo and documentation accordingly.


How is the per-day time period determined?

https://gohugo.io/methods/time/yearday/

That’s not what I intended with the question. What I meant to ask is whether the pages change if a build is triggered on each new day.

If the day of the year changes (e.g., 364 → 365) the result will change.
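To make that concrete, a small sketch with the day of the year as the seed, assuming a site with at least three regular pages. Because Hugo renders templates at build time, the selection can only change when a build actually runs on a different day:

```go-html-template
{{/* time.Now is evaluated during the build, so the seed is fixed
     until the site is rebuilt on a different day of the year. */}}
{{ $p := site.RegularPages }}

{{ range collections.D time.Now.YearDay 3 ($p | len) }}
  {{ with index $p . }}
    <a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a>
  {{ end }}
{{ end }}
```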
