Hugo + self-hosted Search (Elastic or similar) Integration - tips?

I’m not certain where the boundaries of this challenge lie (Hugo feature? Theme feature? Custom development?), so this is more of a request for help and pointers on the topic as my searching for prior examples hasn’t delivered anything suitable for me yet.

I’ve got a couple of “large” Hugo sites that are for internal workplace access only. Both contain personal or sensitive data, and thus are not internet accessible. By large: one build is currently ~60k pages, and the other (in development, because it is very hard to build!) is about 1.5 million pages. They’re both based on Hugo + Docsy (a setup we have running well on our GitLab Pages system).

Free-text searching of all content is a must-have feature for these sites, so fixing search is a bit of a priority for me at present.

My initial concern is that for the 60k site, the offline-search-index.nnnnn.json file is ~850MB at present, and that’s causing slow browser page load times. I therefore suspect that a dedicated search platform, rather than one that loads the whole index into browser memory via JS, would help.

My second goal: we have multiple Hugo+Docsy builds for a range of projects on the GitLab system, and I’d like an option to search across them all, so a single search index.

We have an Elasticsearch cluster set up for another project internally, which would seem like an ideal platform to try to utilise for these two goals, since we can’t use an off-site search-as-a-service platform such as Algolia.
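
To make the idea concrete, here is a rough sketch (not a tested integration) of how the records from each site's offline search index could be turned into a payload for the Elasticsearch `_bulk` API, with every document tagged by site so that one index can serve all builds. The field names `ref`, `title`, `description` and `body` are an assumption about what Docsy writes into the offline index and should be checked against a real file; the index name `hugo-sites`, the site label `handbook` and both function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def bulk_lines(site: str, records: Iterable[dict], index: str = "hugo-sites") -> Iterator[str]:
    """Yield NDJSON lines for the _bulk API: one action line, then one document line."""
    for rec in records:
        ref = rec.get("ref", "")  # assumed to hold the page's relative URL
        # Prefix the ID with the site name so pages from different builds never collide.
        action = {"index": {"_index": index, "_id": f"{site}:{ref}"}}
        doc = {
            "site": site,  # lets a query filter by site or search across all of them
            "url": ref,
            "title": rec.get("title", ""),
            "description": rec.get("description", ""),
            "body": rec.get("body", ""),
        }
        yield json.dumps(action)
        yield json.dumps(doc)


def bulk_payload(site: str, records: Iterable[dict], index: str = "hugo-sites") -> str:
    """Join the lines into one request body; every line must end with a newline."""
    return "".join(line + "\n" for line in bulk_lines(site, records, index))


if __name__ == "__main__":
    # Hypothetical sample record standing in for one entry of the offline index.
    sample = [{"ref": "/docs/intro/", "title": "Intro", "body": "Hello world"}]
    print(bulk_payload("handbook", sample), end="")
```

The resulting text would be sent as the body of a POST to the cluster's `/_bulk` endpoint with `Content-Type: application/x-ndjson`, for example from a CI job that runs after each Hugo build. For an index as large as the ones described here, the records would need to be sent in batches rather than as one request.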

Therefore I have essentially two requests here:

  1. Sanity check - am I correct in needing to look at a self-hosted search platform?
  2. If so, then what is recommended, and are there any guides you can point me at to help please?

I’ll note that I had previously seen a Bonsai SaaS guide (which is currently 404-ing), and I’ve seen a 5-year-old npm module for Hugo + Elasticsearch, but the latter doesn’t seem to be complete or relevant.

Thanks in advance!

As always, I highly recommend Pagefind.

Pagefind sounds really promising. The only thing that bugs me is the missing integration with hugo serve, as far as I understand it.

Did you solve this, or have I misunderstood?

I didn’t, but Régis Philibert did (see the 2022-07-31 update therein):

It’s been some time since I looked into this, but for self-hosting I would look at Bleve:

https://blevesearch.com/

Their site seems to be partly down (which is not great PR for Hugo since it’s built with … Hugo).

Hi, I just posted a simple tip as a new topic on how to use the Hugo server in local development with Pagefind.
