Search Index .json-file for Lunr.js

#8

Thank you for the clarifications!

#9

Having custom (i.e. non-Google-based) search is the last stumbling block before fully migrating to Hugo, so I wonder if this is the recommended way to do it?

The two popular Python-powered static site generators have plugins that use Tipue (a jQuery plugin), so I wonder whether anyone has deployed it with Hugo, and how it compares with Lunr.js?

#10

Lunr.js is a vanilla JS plugin and it’s blazing fast (although it can be slower on larger sites).

Tipue is a jQuery plugin, so it’s not as fast as Lunr.js, since jQuery adds overhead on top of vanilla JS.

#11

Would it be sufficient for sites with several hundred blog posts?

Any other recommendation?

#12

I haven’t used Lunr.js with hundreds of pages yet. But I have found a way to keep the JSON file size in check by having Lunr.js search only titles and descriptions, not the whole content.
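A minimal sketch of that trimming idea, for anyone who wants to see the effect on file size. The page records and the `uri` field name are assumptions made up for illustration; in a real setup the trimming would more likely happen in the Hugo template that writes the JSON.

```javascript
// Hypothetical page records, shaped like what a Hugo JSON output might contain.
const pages = [
  { uri: '/posts/hello/', title: 'Hello', description: 'First post', content: 'Long body text. '.repeat(200) },
  { uri: '/posts/search/', title: 'Search', description: 'Adding search', content: 'More body text. '.repeat(300) },
];

// Keep only the ref (uri) and the fields the search should cover.
function toSearchRecords(records) {
  return records.map(({ uri, title, description }) => ({ uri, title, description }));
}

const fullJson = JSON.stringify(pages);
const trimmedJson = JSON.stringify(toSearchRecords(pages));

console.log(`full: ${Buffer.byteLength(fullJson)} bytes, trimmed: ${Buffer.byteLength(trimmedJson)} bytes`);
```

The trade-off is that a phrase appearing only in the body of a post will no longer match.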

If you need a full blown content search then look into Algolia. It’s a paid product but it has a free plan also. And people in this forum seem to love it.

#13

I’ve used Lunr and Tipue on larger sites. A few hundred posts shouldn’t be a problem, particularly if you limit your fields, as @onedrawingperday suggests. For larger sites they can be a slog and if search is important you’d probably want a server-side solution like Algolia.

#14

How did you like Tipue on larger sites?

[quote]For larger sites they can be a slog and if search is important you’d probably want a server-side solution like Algolia.
[/quote]

Well, for my private/non-profit and small business needs, paying $49/mo for Algolia is a bit pricey… my whole web hosting at Webfaction is $10/mo. :slight_smile:

#15

@gour Tipue is fine. I wouldn’t use it now because I try not to use jQuery, but I think it performs pretty well. You can see it in action by hitting this page, and I think there are 2-3k pages on this site: http://www.ndbooks.com/books/?q=animal&page=1

One thing I didn’t like (and I’m not sure if this is a general issue or just the implementation I was using) is that it reads from the Title meta tag, which means the results don’t look great.

Here’s an implementation with Lunr.js/vanilla JS, and the site has around 200 pages (though I’m querying a lot of text on this one): https://www.retroreport.org/search-results/?query=hillary

I hear you about price. I’m deciding right now between Algolia and building my own on a couple of new sites (actually, moving the two sites above to Hugo right now). I think it depends on how important search is to your users. If there’s no real ROI, then it’s certainly not worth it.

Here’s a presentation we had at our Meetup last year on some search options: https://www.thenewdynamic.org/event/2016/08/04/search-for-static-sites-by-aidan-feldman/

#16

Thanks for this discussion. Is there a guide on how to apply Lunr.js to Hugo-generated sites? I am not a JavaScript expert… Thank you very much. I would like to have a search field available on the generated sites.

#17

Use the forum search, and read about how to request help.

#18

Hey @Ladislav_Jech,

Sorry for the late reply; I feel I’m a main culprit in the Lunr discussions. @maiki is right, that’s how most of us have learned (while trying to ask only questions that haven’t been answered).

But as just might happen, I have the thread for you right here (check the links in my second post in the thread first).

#19

I am also looking for a complete lunr.js setup guide.

I had stumbled across the thread you linked, and I got the index.json created by Hugo working. But that’s only a small part of the whole setup. As I don’t know JS, I am lost on how to interface that .json file with lunr.js in the Hugo templates.

So I started collecting real-world examples where people use lunr.js + Hugo: How to add lunr.js to your site?

I haven’t yet found time to understand the sources linked there, reverse-engineer them, and extract only the lunr.js parts into my site.

I have also opened an issue on HugoDocs requesting to add a start-to-finish Hugo + lunr.js setup guide.

#20

Hey mate,
I utterly suck at js, you really only need to generate the searchindex and then make sure lunr.js points to it – there’s no “javascript coding” involved.

/Good luck

#21

You need to generate the search index with Lunr, so I don’t see how you can avoid the JS coding bit.

#22

@kaushalmodi, this is very similar to the solution I use – however make sure the index is generated on the fly.

I don’t remember where I found the JS that halfelf used, but I didn’t write it myself (although I wish I had time to).

#23

Building a search index is a fairly slow process, so I would guess that it would be faster to:

  • Serve a pre-built index
  • Add Gzip and some sensible cache headers to that index
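A rough sketch of the gzip part, assuming the index is compressed once at build time with Node’s built-in `zlib`. The index contents and the header values below are made up for illustration. Many hosts and CDNs compress on the fly, in which case only the cache headers need attention.

```javascript
const zlib = require('zlib');

// Hypothetical stand-in for a pre-built search index; a real one would be read from disk.
const indexJson = JSON.stringify(
  Array.from({ length: 200 }, (_, i) => ({
    uri: `/posts/post-${i}/`,
    title: `Post number ${i}`,
    description: 'A short description of the post',
  }))
);

// Compress once at build time so the server can send the compressed bytes as-is.
const gzipped = zlib.gzipSync(indexJson, { level: 9 });

// Example response headers (assumed values; tune max-age to how often the site is rebuilt).
const headers = {
  'Content-Type': 'application/json',
  'Content-Encoding': 'gzip',
  'Cache-Control': 'public, max-age=3600',
};

console.log(`raw: ${Buffer.byteLength(indexJson)} bytes, gzipped: ${gzipped.length} bytes`);
```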
#24

Kinda new here and I know this discussion is a few months old, but thought I’d mention I recently went through setting this up on my site. I wrote up a walk-through here: https://www.mattwalters.net/posts/hugo-and-lunr/ (I have very little on my site but you’ll get search results if you search for things like “hugo” or “spacex”)

After reading over this discussion I should probably make a few changes to my setup (particularly the parts talking about using alternate output formats).

My implementation is still evolving and has progressed some from that walk-through. For instance, I pre-build the actual Lunr index (not just the JSON of content) during the build process instead of having Lunr build it on page load. This means I don’t have to call lunr.add() on the individual documents that are in the JSON; that step is done as part of the build process rather than in the browser, which speeds up getting the search results page ready to show results.

@Rick if you look at the search.js that’s actually running on my site (instead of the one in the walk-through) you can see my solution for letting the user enter a search phrase on one page but perform the search on another page. Essentially the search form uses the GET method to submit to the search results page, and when the search results page loads, it checks to see if the query parameter for their search phrase is populated. If it is, then it performs a search using that phrase.
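For readers wondering what pre-building might look like, here is a hedged sketch and not the code from the post above. It assumes Lunr 2.x, where documents are added inside the builder function and a finished index can be serialized with `JSON.stringify` and restored with `lunr.Index.load`. The `uri`, `title`, and `content` field names and the file paths in the comments are hypothetical.

```javascript
// Builds a Lunr index from `docs` and returns it serialized as a JSON string.
// `lunr` is passed in as an argument; at build time it would come from
// `const lunr = require('lunr')` after `npm install lunr`.
function buildSerializedIndex(lunr, docs) {
  const idx = lunr(function () {
    this.ref('uri');        // the value Lunr hands back for each search hit
    this.field('title');
    this.field('content');
    docs.forEach((doc) => this.add(doc)); // arrow function keeps `this` as the builder
  });
  return JSON.stringify(idx); // the index provides toJSON(), so this serializes it
}

// Build-time usage (file names are hypothetical):
//   const fs = require('fs');
//   const lunr = require('lunr');
//   const docs = JSON.parse(fs.readFileSync('public/index.json', 'utf8'));
//   fs.writeFileSync('public/lunr-index.json', buildSerializedIndex(lunr, docs));
//
// In the browser, the pre-built index is restored without re-adding documents:
//   const idx = lunr.Index.load(JSON.parse(serializedIndexText));
```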
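The query-parameter check described above could look roughly like this. It is a sketch, not the search.js from the site; the parameter name `q` and the example URL are assumptions, and the name has to match the `name` attribute of the form’s input.

```javascript
// Extracts the search phrase from a URL's query string.
// `paramName` defaults to "q" (hypothetical; must match the form input's name).
function getSearchPhrase(href, paramName = 'q') {
  const value = new URL(href).searchParams.get(paramName);
  return value === null ? '' : value.trim();
}

// In the browser this would be called as getSearchPhrase(window.location.href).
const phrase = getSearchPhrase('https://example.org/search/?q=static+site+search');
if (phrase) {
  // This is where the actual search would be triggered.
  console.log(`would search for: ${phrase}`);
}
```

Because the form submits with GET, the results page URL can be bookmarked or shared with the phrase included.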

I’m hoping to put up a new post when my efforts settle some more to show how I pre-build the index, etc. But if someone is really interested and wants to look at the raw code without much explanation, then let me know and I can throw up a gist or something for the relevant parts. It does require a slightly more complex build process because you have to use Node to execute some custom JS that does the index building, but I was able to get it working with Netlify’s CI/CD pretty easily.

#25

Interesting, out traveling but will have a look as soon as possible!

#26

Looks good mate! What search parameters are you using, e.g. just title or entire content of a result?

#27

@Rick In my write-up, I’m putting title, summary, and content into a JSON file and then pulling that from the server and building the index in the browser. When a search happens, I get back the ref from Lunr and look up the Title and Summary from the same file to display to the user. I’ve changed the approach over the past couple of nights though.

In my current set up, I write out 2 files at build time:

File 1: Contains Title and Content
File 2: Contains Title and Summary

I have a little JavaScript file that gets run by Node during the build, and it pre-builds the index needed by Lunr from File 1. This creates File 3 (the fully built index).

When the user loads the search / search-results page, I load File 3 and File 2 from the server via Ajax. File 3 is what Lunr executes the search against (so it is searching Title and full Content), getting a ranked listing of ref fields. I then use File 2 to display the Title and Summary for those posts / pages.
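A small sketch of that last lookup step, assuming the hits have the `ref` and `score` properties that Lunr search results carry. The hit values, the `uri` key, and the summary records are made up for illustration.

```javascript
// Ranked hits in the shape a Lunr search returns: objects with `ref` and `score`.
const hits = [
  { ref: '/posts/spacex/', score: 1.8 },
  { ref: '/posts/hugo-and-lunr/', score: 0.7 },
];

// Hypothetical "File 2": title and summary for every page, keyed by the same ref.
const summaries = [
  { uri: '/posts/hugo-and-lunr/', title: 'Hugo and Lunr', summary: 'Adding search to a Hugo site.' },
  { uri: '/posts/spacex/', title: 'SpaceX', summary: 'Notes on a launch.' },
];

// Index the summaries by ref once, then map the hits in ranked order.
function toDisplayResults(rankedHits, pageSummaries) {
  const byRef = new Map(pageSummaries.map((page) => [page.uri, page]));
  return rankedHits
    .filter((hit) => byRef.has(hit.ref)) // skip refs missing from File 2
    .map((hit) => {
      const { title, summary } = byRef.get(hit.ref);
      return { url: hit.ref, title, summary };
    });
}

console.log(toDisplayResults(hits, summaries));
```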

Ideally, I could include the Title and Summary inside the pre-built index (File 3), but Lunr doesn’t seem to support this, and based on an issue I found on GitHub it doesn’t look like adding this kind of support is anywhere on the horizon.

Really, the “win” of my current process is that building the index in the browser can, apparently, take a good bit of time depending on the amount of content, which is obviously bad for visitors. Not really an issue on my site right now, but it might be if I ever import my old content.
