Splitting post content for Algolia search index

Hi all. In case it’s helpful to anyone else who’s struggling with this, my solution is below.

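The template below gives the desired output: a batch of Algolia updateObject requests, with longer posts split across two records.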

{
  "requests": [
    {{ range $idx, $page := $.Site.Pages -}}
      {{ if $idx -}},{{- end }}
      { "action": "updateObject",
        "body": {
          "objectID": {{ print $page.RelPermalink "_1" | jsonify }},
          "order": 1,
          "title": {{ $page.Title | jsonify }},
          "href": {{ $page.Permalink | jsonify }},
          "content": {{ delimit (first 1000 $page.PlainWords) " " | jsonify }}
        }
      }{{ if gt (len $page.PlainWords) 1000 }},
      { "action": "updateObject",
        "body": {
          "objectID": {{ print $page.RelPermalink "_2" | jsonify }},
          "order": 2,
          "title": {{ $page.Title | jsonify }},
          "href": {{ $page.Permalink | jsonify }},
          "content": {{ delimit (first 1000 (after 1000 $page.PlainWords)) " " | jsonify }}
        }
      }{{- end -}}
    {{ end }}
  ]
}

It might be a horrible way of doing it, but I’ve added an if statement inside the loop that checks the word count and creates an additional record for longer posts. Each record holds up to 1,000 words. I’ve capped it at 2,000 words (anything beyond that is not indexed), but if your posts are longer you can add an extra if block for each further 1,000 words. Object IDs are unique (as required by Algolia), deduplication can be done on title or href, and order lets you sequence the records.
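
If the extra ifs get unwieldy, the same idea can probably be generalised with a loop. What follows is only a sketch, not something from the original solution: it assumes a Hugo version recent enough to allow reassigning template variables with =, and the names $size, $requests, $words, $total, $chunks, $start, $chunk and $body are made up for illustration. Unlike the template above, it skips pages that have no words at all, and it builds the whole structure with dict and slice before calling jsonify once, so there are no commas to manage by hand.

```go-html-template
{{- /* Words per Algolia record; $size is an illustrative name. */ -}}
{{- $size := 1000 -}}
{{- $requests := slice -}}
{{- range $page := $.Site.Pages -}}
  {{- $words := $page.PlainWords -}}
  {{- $total := len $words -}}
  {{- if gt $total 0 -}}
    {{- /* Number of records needed, rounded up using integer arithmetic. */ -}}
    {{- $chunks := div (add $total (sub $size 1)) $size -}}
    {{- range $n := seq $chunks -}}
      {{- /* Skip the words used by earlier records, then take one record's worth. */ -}}
      {{- $start := mul (sub $n 1) $size -}}
      {{- $chunk := first $size (after $start $words) -}}
      {{- $body := dict "objectID" (printf "%s_%d" $page.RelPermalink $n) "order" $n "title" $page.Title "href" $page.Permalink "content" (delimit $chunk " ") -}}
      {{- $requests = $requests | append (dict "action" "updateObject" "body" $body) -}}
    {{- end -}}
  {{- end -}}
{{- end -}}
{{ dict "requests" $requests | jsonify }}
```

With this approach a 2,500-word post should yield three records (_1, _2 and _3), the last holding the remaining 500 words. Note that jsonify writes the keys of each object in alphabetical order, which makes no difference to the JSON itself.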

POSTing this to Algolia as per this thread works as expected.
