I'm trying to configure Pagefind to work across more than 1 million pages. Currently it crashes after indexing around 300,000 pages of our Hugo site. The idea would be to utilize Pagefind's multisite feature.
ChatGPT's suggestions:
UI strategy: The emitted search-init.js defers merging the shards until DOMContentLoaded. Change that logic to attach on the first keystroke if you want an even lighter landing page.
Glob limiting: Each shard's Pagefind call restricts indexing to q/{prefix}/**/*.html so your canonical URLs remain correct (we keep --site pointed at public/).
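A minimal sketch of that per-shard build step, as a Node helper. The --site, --glob, and --output-subdir flags are real Pagefind CLI options; the pagefind-q-{prefix} output directory naming and the RUN_SHARD_BUILD guard are my assumptions:

```javascript
// Hypothetical build helper: one Pagefind invocation per URL-prefix shard.
// Each shard indexes only q/<prefix>/**/*.html and writes its index to
// its own output subdirectory, so no single run sees all 1M+ pages.
function shardCommand(prefix, site = "public") {
  return (
    `npx pagefind --site ${site} ` +
    `--glob "q/${prefix}/**/*.html" ` +
    `--output-subdir pagefind-q-${prefix}`
  );
}

// Guarded so the file can be required/tested without running the builds.
if (process.env.RUN_SHARD_BUILD) {
  const { execSync } = require("node:child_process");
  for (const p of ["ab", "cc"]) { // extend to every prefix you use
    execSync(shardCommand(p), { stdio: "inherit" });
  }
}
```

Because --site stays at public/, the URLs Pagefind records for each shard remain the canonical ones.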
So…
The idea would be to treat each prefix of our Q&A URLs as an individual site, e.g.:
qeeebo.com/q/ab/
qeeebo.com/q/cc/
etc.
So we'd be sharding the content via the prefixes we are already using. Any ideas from others? Essentially the problem we're looking to solve is a simple search that works across 1 million+ static pages.
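On the client side, Pagefind's multisite API (pagefind.mergeIndex()) can stitch the shard indexes back together. A sketch of a lazier search-init.js, where nothing loads until the first keystroke; the /pagefind-q-{prefix}/ paths and the #search selector are assumptions matching the sharded build above:

```javascript
// Sketch: defer all Pagefind loading until the user actually types,
// then import the main bundle and merge in each shard's index.
const SHARD_PREFIXES = ["ab", "cc"]; // extend to every prefix you shard on

function shardIndexPaths(prefixes) {
  return prefixes.map((p) => `/pagefind-q-${p}/`);
}

async function initSearch() {
  // pagefind.mergeIndex() is Pagefind's multisite API for combining indexes.
  const pagefind = await import("/pagefind/pagefind.js");
  for (const path of shardIndexPaths(SHARD_PREFIXES)) {
    await pagefind.mergeIndex(path);
  }
  return pagefind;
}

if (typeof document !== "undefined") {
  // { once: true } runs initSearch on the first keystroke only.
  document
    .querySelector("#search")
    .addEventListener("input", () => initSearch(), { once: true });
}
```

One caveat worth checking: with hundreds of two-character shards, merging every index up front may itself get heavy, so measure how many shards the browser can comfortably merge.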
Also, in our context we are only searching page titles, not the content of the answers, to keep this scalable.
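One way to get title-only indexing with Pagefind is its data-pagefind-body attribute: once any page in the index carries it, Pagefind indexes only the marked elements and skips unmarked pages entirely. A sketch of how that could look in a Hugo template (the exact template placement is an assumption):

```html
<!-- Only this element is indexed; answer content below it is ignored,
     which keeps each page's contribution to the index tiny. -->
<h1 data-pagefind-body>{{ .Title }}</h1>
```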