A simple JavaScript-based full-text search function

The full text search presented here by Tanja Becker (Javascript) and myself (integration in Hugo) is based on a JSON array that contains all of the website’s content and a Javascript search function that processes the array. The JSON array is generated automatically by HUGO. Integrating the search function into an existing HUGO website is very easy with the following steps:

  1. Create Template index.json
  2. Configure config.toml
  3. Insert Javascript search.js into website
  4. Create shortcode search.html and adjust if necessary
  5. Create Web page search.md

Example…


JSON array

Create a template index.json in the directory themes/<myTheme>/layouts/ :

index.json:

[ {{- $i := 0 -}}
{{- range where .Site.RegularPages "Section" "ne" "" -}}
   {{- if not .Params.noSearch -}}
      {{- if gt $i 0 }},{{ end -}}
      {"date":"{{ .Date.Unix }}", "url":"{{ .Permalink }}", "title":{{ .Title | jsonify  }}, "summary":{{ with .Description}}{{ . | plainify | jsonify }}{{ else }}{{ .Summary | plainify | jsonify }}{{ end }}, "content":{{ .Content | plainify | jsonify }},"tags":[ {{- $t := 0 }}{{- range .Param "tags" -}}{{ if gt $t 0 }},{{ end }}{{ . | jsonify }}{{ $t = add $t 1 }}{{ end -}} ], "section": {{ .Section | jsonify -}} }
      {{- $i = add $i 1 -}}
   {{- end -}}
{{- end -}} ]

This creates a JSON array that automatically contains the content of all pages from all sections. Pages with noSearch: true in the front matter are ignored. If a page has a Description, it is used instead of the Summary.
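For orientation, one item of the generated array should look roughly like the object below. This is only a sketch: all values are invented, and isSearchItem is a hypothetical helper that is not part of search.js. Note that the template emits date as a string containing the Unix timestamp.

```javascript
// Hypothetical example of one item in the generated index.json (all values invented).
const exampleItem = {
  date: "1603929600", // .Date.Unix as a string; this value is 2020-10-29 00:00 UTC
  url: "https://example.org/post/hello/",
  title: "Hello",
  summary: "A short description",
  content: "Plain text of the page ...",
  tags: ["hugo", "search"],
  section: "post"
};

// Hypothetical sanity check a consumer could run on each item after JSON.parse.
function isSearchItem(item) {
  const stringFields = ["date", "url", "title", "summary", "content", "section"];
  return (
    stringFields.every((field) => typeof item[field] === "string") &&
    Array.isArray(item.tags)
  );
}
```

A check like this can help to spot a broken template early, for example when a field is missing for some pages.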

Add the following setting in config.toml :

[outputs]
   home = [ "HTML", "JSON" ]

This ensures that a file index.json (based on the template index.json) is automatically created in the document root, which contains the entire content of the website.

Search function

  • Save the Javascript search.js in static/js/search.js
    Download search.js
  • Create a shortcode search.html in the directory themes/<myTheme>/layouts/shortcodes/ with the content shown under search.html below.
  • Include it with {{< search >}} in your search.md search page.

search.md:

---
title: "Site Search"
summary: "Search website for keywords"
date: 2020-10-29
---

{{< search >}}

search.html:

<form id="custom-search" name="custom-search" method="post" action="" onsubmit="customSearchResults(); return false;">
    <p>
    <input id="custom-search-field" type="text" name="search" value="" title="Search String" placeholder="SearchString">
    <!-- <input type="submit" value="Search"> -->
    </p>
    <!-- <p><em>Search Section:</em><br>
        <input type="checkbox" name="section[]" value="site" checked="checked"> all<br>
        <input type="checkbox" name="section[]" value="post"> Blog<br>
        <input type="checkbox" name="section[]" value="other-section"> Other Section
    </p> -->
    <p>
        <input type="radio" name="option" value="AND" checked="checked"> AND search<br>
        <input type="radio" name="option" value="OR"> OR search
    </p>
</form>

<div id="custom-search-results"></div>

<script>
// CUSTOM AREA
let params = {
    json_src       : '/index.json', // for multiple sources: comma-separated list of JSON arrays
    minlength      : 3,
    defaultsearch  : 'AND',
    sort_date      : 'DESC',
    autocomplete   : 1, // 0: form needs a submit button
    section_search : 0, // 1: needs checkboxes with name="section[]"
    badwords       : 'and,there,the,this,that,you', // ignore these words
    json_wait      : '<p><em>loading...</em></p>',
    json_ready     : '<p><em>Please enter a search term</em></p>',
    extern_icon    : ' (external link)', // marker for external links (optional)
    err_badstring  : '<p>Too short!</p>',
    err_noresult   : '<p>Unfortunately no search result. Please try again.</p>',
    err_norequest  : '<p style="text-align: center; color:red;">The full text search is currently not available.</p>',
    err_filefailed : '<p style="text-align: center;color: red;">Search file could not be retrieved.</p>',
    res_one_item   : '<p><em>[CNT] RESULT</em></p>',
    res_more_items : '<p><em>[CNT] RESULTS</em></p>',
    res_out_top    : '<ul>',
    res_out_bottom : '</ul>',
    res_item_tpl   : '<li><a href="[URL]">[TITLE]</a><br>[DATE]:[SUMMARY]<br><em>[SECTION][TAGS]</em></li>',
    add_searchlink : '<p><a href="https://duckduckgo.com/?q=site:yourdomain.com [QUERY]" target="_blank"><i>Not satisfied with the search results? External search via DuckDuckGo ...</i></a></p>'
};

// Translation of section name (optional)
let section_trans = {
    "post" : "Blog",
    // "other-section" : "Other Section"
};

let searchfield_weight = {
    "title"   : 5,
    "tags"    : 5,
    "summary" : 2,
    "content" : 1
};
// CUSTOM AREA END
</script>
<script type="text/javascript" src="/js/search.js"></script>

Download source code search.html

Notes

This example provides a simple search function that works in every HUGO website (the HTML code in search.html may have to be adapted to the existing theme):
test here …

  • The search begins as soon as at least 3 characters have been typed ( minlength ).
  • badwords contains a list of words to be ignored.
  • If you prefer to trigger the search via the submit button, you have to set autocomplete to “0”.
  • You can also call the search page via ?query=searchterm
  • The search results are sorted according to the number of hits for the search term, with searchfield_weight as a weighting. “5” means that the hit in the corresponding field is weighted with a factor of 5.
  • The search function searches through all Sections. If you want to show in the search results which section the result comes from, you can use section_trans to translate the internal section key.
  • It is also possible to link to an external search engine (in the example DuckDuckGo), if the internal search does not provide a satisfactory result in individual cases (add_searchlink)
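
The notes above can be illustrated with a small sketch. This is not the code from search.js, only one possible way to get the described behaviour: minlength and badwords filter the query, AND/OR decides which pages count as hits, and searchfield_weight multiplies the number of hits per field. All function names (queryFromUrl, parseTerms, countHits, scoreItem, search) are hypothetical, and the tie-break by date is an assumption based on sort_date: 'DESC'.

```javascript
// Sketch only: NOT the code from search.js.
const sketchOpts = { minlength: 3, defaultsearch: "AND", badwords: "and,there,the,this,that,you" };
const sketchWeights = { title: 5, tags: 5, summary: 2, content: 1 };

// Read the search term from a URL such as https://example.org/search/?query=bee
function queryFromUrl(href) {
  return new URL(href).searchParams.get("query") || "";
}

// Split the query into lowercase terms and drop the ignored words.
function parseTerms(query, opts) {
  const bad = new Set(opts.badwords.split(","));
  return query.toLowerCase().split(/\s+/).filter((t) => t !== "" && !bad.has(t));
}

// Count non-overlapping, case-insensitive occurrences of term in text.
function countHits(text, term) {
  return String(text).toLowerCase().split(term).length - 1;
}

// Weighted score of one item; 0 if the AND condition is not met.
function scoreItem(item, terms, mode, weights) {
  let total = 0;
  let matchedTerms = 0;
  for (const term of terms) {
    let termScore = 0;
    for (const field of Object.keys(weights)) {
      const value = Array.isArray(item[field]) ? item[field].join(" ") : item[field] || "";
      termScore += countHits(value, term) * weights[field];
    }
    if (termScore > 0) matchedTerms += 1;
    total += termScore;
  }
  if (mode === "AND" && matchedTerms < terms.length) return 0;
  return total;
}

// Return hits sorted by score; ties are ordered by date, newest first (assumption).
function search(items, query, opts, weights) {
  if (query.trim().length < opts.minlength) return [];
  const terms = parseTerms(query, opts);
  if (terms.length === 0) return [];
  return items
    .map((item) => ({ item, score: scoreItem(item, terms, opts.defaultsearch, weights) }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score || Number(b.item.date) - Number(a.item.date));
}
```

With these weights, a page that contains the term once in the title and once in the content gets a score of 5 + 1 = 6 for that term.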

More options

  • The search function can search different sections separately. To enable this, set section_search to “1” and add the corresponding checkboxes to the search form (commented out in the example).
  • It is possible to specify several JSON arrays as the source for the search. This is especially useful if you want a single search function to search multiple websites. To indicate in the results which hits are local and which are from an external website, each item in the array of the external source can contain the value "extern":"1". This causes the string from extern_icon to be added to the search result after [TITLE]. Example: bienenkiste.de/suchen/?query=Königin
  • With very large JSON files, it may take a moment for the file to load. During this time, json_wait is displayed. A spinner/loader icon can be inserted here, if the theme provides one. As soon as the file is loaded, json_ready is displayed.
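
The following sketch shows how the options above could fit together: several sources in json_src, a section filter, and the extern_icon marker for items with "extern":"1". It is an illustration under assumptions, not the implementation in search.js. The helper names are hypothetical, the arrays are assumed to be already downloaded and parsed, and the values are inserted without HTML escaping to keep the example short.

```javascript
// Sketch only: NOT the code from search.js.

// Split a comma-separated json_src value into single URLs.
function splitSources(jsonSrc) {
  return jsonSrc.split(",").map((s) => s.trim()).filter((s) => s !== "");
}

// Merge several already parsed JSON arrays into one list of items.
function mergeSources(arrays) {
  return arrays.flat();
}

// Keep only items from the checked sections; "site" stands for "all", as in the form.
function filterBySection(items, sections) {
  if (sections.length === 0 || sections.includes("site")) return items;
  return items.filter((item) => sections.includes(item.section));
}

// Fill [URL] and [TITLE]; the extern icon follows the title of external items.
// Function replacers are used so that "$" in a value is not treated specially.
function renderItem(item, template, externIcon) {
  const title = item.title + (item.extern === "1" ? externIcon : "");
  return template.replace("[URL]", () => item.url).replace("[TITLE]", () => title);
}
```

An external item with the title "B" would then be rendered as "B (external link)" when extern_icon is set as in the example configuration.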

(Translation of my German blog article Volltextsuche für Hugo)

24 Likes

Funny, I released a similar article recently, but with a different goal:

It is also a full-text search; the caveat is that it needs a database and a small REST-like script.
The SQL file for the database is generated when Hugo runs, and the JavaScript performs the search.

The downside of your solution is, of course, that it is keyword-only, that the download size grows as your blog grows, and that the search will become slower over time, especially with limited data rates in non-Western countries.

But, as we Germans say, you have to die one death.

3 Likes

Interesting solution! I’ll keep it in mind for other problems. Thanks!
You’re right: I had a different goal. I wanted to write a recipe for HUGO dummies that can be used without any programming or server admin knowledge.
I was also worried at first that it might be a bit slow. But I think for 90% of HUGO websites it will probably be enough and run super fast. The example https://bienenkiste.de/suchen/ searches almost 200 pages. The two JSON files together are ~650 KB in size. Many homepages are 10 times “heavier” today. And the JavaScript runs very fast.
The main problem with larger sites is that I have not provided pagination. But you can compensate for this a little by increasing minlength.
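
If someone wants to add pagination, a helper along these lines might be a starting point. It is only a sketch: paginate is a hypothetical function that is not part of search.js, and it assumes the results are already sorted.

```javascript
// Hypothetical helper: return one page of an already sorted result list.
// The page number is 1-based and is clamped to the valid range.
function paginate(results, page, perPage) {
  const totalPages = Math.max(1, Math.ceil(results.length / perPage));
  const current = Math.min(Math.max(1, page), totalPages);
  const start = (current - 1) * perPage;
  return { page: current, totalPages, items: results.slice(start, start + perPage) };
}
```

The search output would then render only the returned items and add links for the other pages.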

3 Likes

This sounds awesome, thanks for sharing it. I’m going to look into how we can integrate this into some starter sites we’re working on including in Site.js (https://sitejs.org).

I’d also be interested in seeing what the performance characteristics would be if it was to use JSDB (https://ar.al/2020/10/20/introducing-jsdb/) instead of JSON.

3 Likes

Just wanted to say thanks for this code! I was able to integrate it into my new site and it works great :slight_smile:

https://educationfunding.uk

3 Likes

Same here: this was a great article, and it was really fast and easy to set up. I’d much rather use this than have a gulp task that generates some JSON you have to remember to deploy as part of your release.

Thank you for the contribution.

2 Likes

I’ve just applied your code. As a programming dummy and Hugo novice, thank you for this beautiful piece of work! It worked flawlessly as soon as I had created all the files required by the instructions.
Thanks :smiley:

3 Likes

This is so simple and easy to set up. Thank you for your work!

I just wonder whether the searched keywords could be highlighted.

Qiang Lu

Superb. Works like a charm and can be integrated with existing search boxes on the site. Really impressed!

1 Like

I don’t think this feature is that important. Maybe someone else would like to extend the script? There are scripts for keyword highlighting available on the internet. What is missing is passing the keyword through to the displayed page.
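
For anyone who wants to try such an extension, here is a minimal sketch of both parts: wrapping matches in a mark element and appending the keyword to a result URL. The function names are hypothetical and none of this exists in search.js. The text passed to highlight is assumed to be plain text that has already been HTML-escaped, and the target page would still need its own script that reads the query parameter and applies the highlighting.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every case-insensitive occurrence of any term in <mark>...</mark>.
function highlight(text, terms) {
  const usable = terms.filter((t) => t !== "").map(escapeRegExp);
  if (usable.length === 0) return text;
  const pattern = new RegExp("(" + usable.join("|") + ")", "gi");
  return text.replace(pattern, "<mark>$1</mark>");
}

// Append the search term to an absolute result URL so the target page can read it.
function withQuery(url, query) {
  const target = new URL(url);
  target.searchParams.set("query", query);
  return target.toString();
}
```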

This works great for me locally. When I export the site, it just shows “loading” under the search. The following warning is displayed: “Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at file:///index.json. (Reason: CORS request not HTTP).”

I am still just getting started with Hugo. I think it has to do with the url it’s fetching it from, but I am not too sure where in the script it is calling the file up. Any guidance would be helpful.

The problem is the wrong URL ///index.json, which is generated automatically based on baseurl in config.toml.
There are two possible ways to solve the problem:

  1. Specify the correct baseurl (e.g. baseurl = "https://my-website.com/"); include the trailing slash!
  2. Insert the full URL of index.json in the CUSTOM AREA of search.html (e.g. json_src: 'https://my-website.com/index.json')

(No. 2 doesn’t work locally. I would prefer no. 1.)
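
To see where a URL like file:///index.json can come from, the standard URL class can be used to resolve json_src against the address of the page. This is only an illustration: resolveJsonSrc is a hypothetical helper, the paths are invented, and whether search.js resolves the address in exactly this way is an assumption.

```javascript
// Hypothetical helper: resolve json_src against the address of the current page.
function resolveJsonSrc(jsonSrc, pageUrl) {
  return new URL(jsonSrc, pageUrl).href;
}

// Page served by a web server: the JSON file is found on the same host.
const fromServer = resolveJsonSrc("/index.json", "https://my-website.com/search/");
// fromServer is "https://my-website.com/index.json"

// Page opened directly from disk: the root-relative path points to the file system root.
const fromDisk = resolveJsonSrc("/index.json", "file:///home/me/public/search/index.html");
// fromDisk is "file:///index.json", the address shown in the warning above
```

In a browser, the current page address is available as window.location.href.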

1 Like

Thanks for the reply- do you think a relative link might work?

Try it. (I don’t know what you specified for baseurl in config.toml.) See: Gohugo without baseurl ? or with relative url only?

How would you make this work on a multilingual site? This search works only for the default language when defaultContentLanguageInSubdir = false is set in the config.toml. Otherwise I get the following error message: Search file could not be retrieved.

I think the index.json needs to be adjusted for that but I am not able to figure out how.
Thank you in advance!

I am using this in two languages on my site, not a big issue.

Firstly, use i18n to translate strings in a template, like

[Search]
other = "Search"

Then, in the config, you may want to specify the search path manually for each language.
Example:

[languages]
  [languages.pl]
    search = "/szukaj/"

  [languages.en]
    search = "/en/search/"

You need to create two markdown files for the results.
For example, PL is my default, so I have search.md for it and search.en.md for EN.

Each markdown file specifies its URL, like url: /szukaj/ or url: /en/search/

You will need to replace the search form’s action to adjust it to multiple languages:

action="/search/?query="

This will call the shortcode {{< search >}}.
Then you need two shortcode files. In my example these are search.html and search.en.html.

Note that you call the shortcode using just search, NOT search.en.

Let me know if you get stuck and I can help you with that.

3 Likes

Thank you for your help. I eventually realized what needed to be adjusted. It is the search.html that has to be adapted to each language. So, for instance, search.en.html needs the following line:
json_src : '/en/index.json'
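
As an alternative to hard-coding the path in every language file, the address could be derived from the language code. The sketch below is hypothetical and not part of search.js. It assumes the usual Hugo layout, in which every language except the default one lives in a subdirectory named after its code, and the default language too when defaultContentLanguageInSubdir is true.

```javascript
// Hypothetical helper: build the path of the language-specific index.json.
function jsonSrcForLang(lang, defaultLang, defaultInSubdir) {
  if (lang === defaultLang && !defaultInSubdir) return "/index.json";
  return "/" + lang + "/index.json";
}

// In a browser, the language code could be read from the page itself, for example
// from document.documentElement.lang, if the theme sets the lang attribute (assumption).
```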

All solved now for me.
And thanks to the author of this search function!

3 Likes

Thank you very much! Flawless integration, it just works and is easily customizable. 10/10. :slight_smile:

1 Like