How are you implementing site search

Thanks, @sebz, lunr.js looks great and your gist gave me a real boost on figuring out how to put everything together.

I’ve gone a slightly different direction with it though and am using a slightly hacky, but effective way to generate the JSON file with Hugo itself so I don’t need Grunt or any external tools like that.

Here’s what I’m doing:

I created a dummy content file like content/ that just has its type set to json.

Then I made layouts/json/single.html with something like this:

[{{ range $index, $page := .Site.Pages }}{{ if ne $page.Type "json" }}{{ if $index }},{{ end }}
    "href": "{{ $page.RelPermalink }}",
    "title": "{{ $page.Title }}",
    "tags": [{{ range $tindex, $tag := $page.Params.tags }}{{ if $tindex }}, {{ end }}"{{ $tag }}"{{ end }}],
    "content": "{{ $page.PlainWords }}"
}{{ end }}{{ end }}

When Hugo publishes the site, it will create public/json/index.html with content that is actually just JSON data. At that point, a simple cp or mv to public/static/js/lunr/PagesIndex.json and you’re in business (I just add that to my deploy script). Everything else should basically work the same.

Obviously, that’s an ugly hack of a template, and anywhere else on your site that you might loop over all the content, you need to be careful to exclude the dummy content. So far, it seems to be pretty reliable for escaping things and generating valid JSON, but it’s not ideal. The .PlainWords method on the Page object doesn’t seem to be documented anywhere and I suppose might disappear or change in future Hugo releases, but currently it’s the cleanest way I could figure out for getting the text of the page content with all markup/rendering stripped out.

With increasingly powerful front-end technology, I think there’s a lot of potential for extending sites this way, publishing a JSON dump of the content (or some subset of it) and letting JS do interesting things with it.

A couple features added to Hugo would make it a lot easier and cleaner:

  • built-in JSON output. Eg, if you could just do $page.JSON and get a nice JSON string out.
  • if not that, at least a cleaner way to get the raw, original text of the content without rendering it.
  • some way to have Hugo publish to a different path, or at least set the extension on the content (making the mv/cp unecessary)