How are you implementing site search?

In my testing, when I used Plain, something about linebreaks in the result was making the JSON invalid. Switching to PlainWords fixed it. I didn’t really spend any time trying to figure out exactly why.
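
One plausible explanation (not verified against Hugo itself) is that JSON forbids raw line breaks inside a string. Here is a minimal Python sketch of that failure and two ways around it. The sample text is made up, and collapsing the text into space-separated words is only an assumption about what PlainWords effectively gives you:

```python
import json

# Hypothetical page text containing a line break, as a plain-text render might.
plain = "First line\nSecond line"

# Pasting the text between quotes by hand leaves a raw newline in the string,
# which the JSON grammar does not allow.
hand_built = '{"content": "' + plain + '"}'
try:
    json.loads(hand_built)
    hand_built_ok = True
except json.JSONDecodeError:
    hand_built_ok = False

# Fix 1: let a JSON encoder escape the newline as \n.
escaped = json.dumps({"content": plain})

# Fix 2: split into words and rejoin with single spaces, so no line break remains.
words_only = '{"content": "' + " ".join(plain.split()) + '"}'
```

Either fix yields a string that parses; the second loses the original line structure, which probably does not matter for a search index.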

Thanks @sebz!

I used your gist to integrate search. I wrote my own index generator that preprocesses my Markdown (gets rid of all of the words lunr doesn’t use, html elements, …). It’s way less generic than the solution @thraxil provided, but it works for me (read: low chance of it being useful for others). I then created a search.html partial template with some parameters I keep in my site config.toml file. Works great!
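
For anyone curious what that kind of preprocessing can look like, here is a rough Python sketch. It is not the actual generator: the stop-word list, the function names, and the `url`/`content` record shape are all assumptions made for illustration.

```python
import json
import re

# Small illustrative stop-word list; an assumption, not lunr's actual list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}

def clean_text(markdown: str) -> str:
    """Reduce Markdown to a space-separated string of lowercase search terms."""
    text = re.sub(r"<[^>]+>", " ", markdown)              # drop HTML elements
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)  # keep link text, drop the URL
    words = re.findall(r"[a-z0-9]+", text.lower())        # drop punctuation and markup
    return " ".join(w for w in words if w not in STOP_WORDS)

def build_index(pages: dict[str, str]) -> str:
    """Serialize {url: markdown} into a JSON array of {url, content} records."""
    records = [{"url": url, "content": clean_text(body)} for url, body in pages.items()]
    return json.dumps(records)
```

Calling `build_index({"/post/": "Hello <i>the</i> world"})` produces a one-record array that a client-side script could load and feed to lunr.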

I have everything deployed with a Fabric script, so it’s still as easy as hugo new .., edit the post, deploy.

See this:

It ends with a demo of a search prototype on the Hugo-built GopherCon site.

It seems to be building the index using https://github.com/blevesearch/hugoidx

http://www.blevesearch.com/news/Site-Search/

Just spent the afternoon experimenting with Bleve. Very impressed. The documentation is a little thin (I had to guess a bit at the document mapping), but that is coming.

But as it stands, given how large the optional language support is, I think it has to stay a side project of Hugo; a plugin, maybe.

@sebz I’m just looking at Hugo for the first time, but I’ve been developing a portfolio site for a bit now using Jekyll. I implemented search using Tipue, but I’m hoping to remove the jQuery dependency entirely and move to lunr.js. Although Hugo is obviously different, any chance you’d be willing to share some of the code showing how you implemented lunr.js? Do you have the ability to customize or “promote” certain results for certain terms? Any help would be greatly appreciated. I’m considering moving a rather large public-facing site to static, and Jekyll is a little too slow for my tastes. Thanks!

In an earlier post he linked this Gist. The comments should help to give you an idea of how the script works.

@digitalcraftsman thanks for sharing the gist :slight_smile:

@rdwatters, as I just commented in the gist, this solution is far from perfect… In my case the number of pages has dramatically increased and downloading the index on the client side is now an issue (here: https://doc.airvantage.net/av/).

One possibility would be a small nodejs webapp serving a search API based on the same lunrjs index. But that implies adding a server, managing index updates, monitoring this small webapp, and so on…

As we speak, I’m evaluating the integration of Google Custom Search instead of this lunrjs based solution… I know… It’s a shame :slight_smile:
So far it looks good, though I’m afraid of losing a bit of flexibility. Adding the small nodejs webapp is not ruled out yet.

I’ll post my results in this thread soon.

Cheers!

I have tested the Bleve search, and it is fairly easy to set up with Hugo. It needs a server, but that is mostly straightforward with Go. I don’t think it’s hostable out of the box on Google’s App Engine, but that is coming, I guess.

@bep, in fact, one of my coworkers (and Go fanboy :P) tested it too. He encountered some issues (solved by the Bleve devs) and finally got stuck on a problem with raw HTML files we have along with the Markdown content…

To be honest I did not take the time to investigate, too lazy to jump into the Go world… :frowning:

@bep and @sebz Thanks for the feedback. I guess my hope was that there was a way for me to write directly to a .json using Hugo (keep in mind I just barely discovered this SSG a couple days ago, although I love Jekyll and Hexo) so that I wouldn’t need to support two separate applications. The beauty of using something like Jekyll with GH pages is that everything is a little more self-contained. Very excited to start learning Go and this SSG. Cheers to all!

How big is your index? Are you including the full text of all of your content, or just a summary?

How are you generating your index?

If you’re generating an index with a build process you can minify and add a cache-busting fingerprint. The fact that the index is a static file means that the browser will cache it so you should only have to download it once (each time you update the index).
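
A minimal sketch of that build step in Python, assuming the index is already a list of records. The function names and the `index.<hash>.json` naming scheme are made up for illustration, not taken from any particular build tool:

```python
import hashlib
import json

def minify(records: list) -> bytes:
    """Serialize without the optional whitespace json.dumps adds by default."""
    return json.dumps(records, separators=(",", ":")).encode("utf-8")

def fingerprinted_name(payload: bytes, stem: str = "index") -> str:
    """Build a filename containing a short hash of the content.

    The name changes only when the content changes, so browsers can cache
    the file for a long time and still pick up a new index after a rebuild.
    """
    digest = hashlib.sha256(payload).hexdigest()[:8]
    return f"{stem}.{digest}.json"
```

The page that loads the index then has to reference the fingerprinted name, so the build needs to write that name into the template or config as well.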

I was kind of bummed that Hugo doesn’t support writing metafiles out of the box (looping through site content to create an index). It’s easy to get around if you have a build process, though!

That seems like a good feature for someone to add. I like the idea that Hugo could do this out of the box.

If you happen to be serving your Hugo site with Caddy, there’s a search add-on that automatically indexes your site and provides a search endpoint and search page (screenshot) that you can customize. The module is fairly new so it needs some proving, but it has a pretty solid foundation since it’s powered by bleve, which was mentioned in this thread earlier.

(The benefit here is that the search is built into your web server; Hugo generates the site and your web server automatically indexes the new pages.)

@spf13 In the future, wouldn’t it be possible to use bleve as the search engine for the Hugo website?

It’s possible today. They have an add-on that reads Hugo source.

You mean hugoidx. But I meant that this would be a great alternative to the Google search field in the docs of Hugo.

Yes, or the plugin for Caddy. I agree that the Hugo site should use a search engine closer to the engine, and it will happen … but it involves work.

Is there any plan for Hugo to embrace one or the other for a future release?