HUGO

A simple JavaScript-based full-text search function

The full-text search presented here, written by Tanja Becker (JavaScript) and integrated into HUGO by me, is based on a JSON array that contains all of the website’s content and a JavaScript search function that processes that array. The JSON array is generated automatically by HUGO. Integrating the search function into an existing HUGO website takes the following steps:

  1. Create the template index.json
  2. Configure config.toml
  3. Add the JavaScript file search.js to the website
  4. Create the shortcode search.html and adjust it if necessary
  5. Create the web page search.md

Example…


JSON array

Create a template index.json in the directory themes/<myTheme>/layouts/ :

index.json:

[ {{- $i := 0 -}}
{{- range where .Site.RegularPages "Section" "ne" "" -}}
   {{- if not .Params.noSearch -}}
      {{- if gt $i 0 }},{{ end -}}
      {"date":"{{ .Date.Unix }}", "url":"{{ .Permalink }}", "title":{{ .Title | jsonify  }}, "summary":{{ with .Description}}{{ . | plainify | jsonify }}{{ else }}{{ .Summary | plainify | jsonify }}{{ end }}, "content":{{ .Content | plainify | jsonify }},"tags":[ {{- $t := 0 }}{{- range .Param "tags" -}}{{ if gt $t 0 }},{{ end }}{{ . | jsonify }}{{ $t = add $t 1 }}{{ end -}} ], "section": {{ .Section | jsonify -}} }
      {{- $i = add $i 1 -}}
   {{- end -}}
{{- end -}} ]

This creates a JSON array that automatically contains the content of all regular pages that belong to a section. Pages with noSearch: true in the Front Matter are ignored. If a page has a Description, it is used for the summary field instead of the Summary.
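To make the shape of the generated array concrete, here is a minimal sketch of how a script might read it. The two entries are invented sample data (the URLs, titles and texts are assumptions, not output of a real site), and parseIndex is a hypothetical helper, not part of search.js. It only shows that the date field arrives as a Unix timestamp string and has to be converted before it can be shown or sorted as a date.

```javascript
// Hypothetical sample of what index.json could contain for two pages.
// All values are invented for illustration.
const sampleIndexJson = `[
  {"date":"1603929600", "url":"https://example.org/post/hello/", "title":"Hello",
   "summary":"A first post", "content":"Hello world, this is the first post.",
   "tags":["intro","hugo"], "section":"post"},
  {"date":"1604016000", "url":"https://example.org/docs/setup/", "title":"Setup",
   "summary":"How to set up", "content":"Install Hugo and create a site.",
   "tags":[], "section":"docs"}
]`;

// Parse the array and turn each Unix timestamp string (seconds) into a Date.
function parseIndex(jsonText) {
  return JSON.parse(jsonText).map((item) => ({
    ...item,
    date: new Date(Number(item.date) * 1000),
  }));
}

const indexedPages = parseIndex(sampleIndexJson);
console.log(indexedPages.map((p) => `${p.section}: ${p.title}`));
```

Every entry carries the same seven keys, so a search script can rely on title, tags, summary and content being present for each page.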

Add the following setting in config.toml :

[outputs]
   home = [ "HTML", "JSON" ]

With this setting, HUGO automatically generates a file index.json (from the template index.json) in the document root. It contains the entire content of the website.

Search function

  • Save the JavaScript file search.js as static/js/search.js
    Download search.js
  • Create a shortcode search.html in the directory themes/<myTheme>/layouts/shortcodes/ with the content listed under search.html below
  • Include it with {{< search >}} in your search page search.md.

search.md:

---
title: "Site Search"
summary: "Search website for keywords"
date: 2020-10-29
---

{{< search >}}

search.html:

<form id="custom-search" name="custom-search" method="post" action="" onsubmit="customSearchResults(); return false;">
    <p>
    <input id="custom-search-field" type="text" name="search" value="" title="Search String" placeholder="SearchString">
    <!-- <input type="submit" value="Search"> -->
    </p>
    <!-- <p><em>Search Section:</em><br>
        <input type="checkbox" name="section[]" value="site" checked="checked"> all<br>
        <input type="checkbox" name="section[]" value="post"> Blog<br>
        <input type="checkbox" name="section[]" value="other-section"> Other Section
    </p> -->
    <p>
        <input type="radio" name="option" value="AND" checked="checked"> AND search<br>
        <input type="radio" name="option" value="OR"> OR search
    </p>
</form>

<div id="custom-search-results"></div>

<script>
// CUSTOM AREA
let params = {
    json_src       : '/index.json', // for multiple sources: comma separated list of JSON arrays
    minlength      : 3,
    defaultsearch  : 'AND',
    sort_date      : 'DESC',
    autocomplete   : 1, // 0: form needs a submit button
    section_search : 0, // 1: needs checkboxes with name="section[]"
    badwords       : 'and,there,the,this,that,you', // ignore these words
    json_wait      : '<p><em>loading...</em></p>',
    json_ready     : '<p><em>Please enter a search term</em></p>',
    extern_icon    : ' (external link)', // marker for external links (optional)
    err_badstring  : '<p>Too short!</p>',
    err_noresult   : '<p>Unfortunately no search result. Please try again.</p>',
    err_norequest  : '<p style="text-align: center; color:red;">The full text search is currently not available.</p>',
    err_filefailed : '<p style="text-align: center;color: red;">Search file could not be retrieved.</p>',
    res_one_item   : '<p><em>[CNT] RESULT</em></p>',
    res_more_items : '<p><em>[CNT] RESULTS</em></p>',
    res_out_top    : '<ul>',
    res_out_bottom : '</ul>',
    res_item_tpl   : '<li><a href="[URL]">[TITLE]</a><br>[DATE]:[SUMMARY]<br><em>[SECTION][TAGS]</em></li>',
    add_searchlink : '<p><a href="https://duckduckgo.com/?q=site:yourdomain.com [QUERY]" target="_blank"><i>Not satisfied with the search results? External search via DuckDuckGo ...</i></a></p>'
};

// Translation of section name (optional)
let section_trans = {
    "post" : "Blog",
    // "other-section" : "Other Section"
};

let searchfield_weight = {
    "title"   : 5,
    "tags"    : 5,
    "summary" : 2,
    "content" : 1
};
// CUSTOM AREA END
</script>
<script type="text/javascript" src="/js/search.js"></script>

Download source code search.html

Notes

This example provides a simple search function that works in every HUGO website (the HTML code in search.html may have to be adapted to the existing theme):
test here …

  • The search begins as soon as at least 3 characters have been typed ( minlength ).
  • badwords contains a list of words to be ignored.
  • If you prefer to trigger the search via the submit button, you have to set autocomplete to “0”.
  • You can also call the search page via ?query=searchterm
  • The search results are sorted by the number of hits for the search term, weighted per field by searchfield_weight. A value of 5 means that a hit in that field counts five times.
  • The search function searches through all sections. If you want to show in the search results which section a result comes from, you can use section_trans to translate the internal section key.
  • It is also possible to link to an external search engine (DuckDuckGo in the example) in case the internal search does not provide a satisfactory result (add_searchlink).
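The behaviour described in these notes can be sketched roughly as follows. This is not the code of search.js, which is not reproduced in this post. It is a simplified illustration under my own assumptions: hits are counted as plain substring matches, and the helper names toTerms, countHits, termScore, search and queryFromUrl as well as the sample pages are hypothetical.

```javascript
// Settings mirrored from the CUSTOM AREA in search.html.
const params = { minlength: 3, defaultsearch: "AND", badwords: "and,there,the,this,that,you" };
const searchfieldWeight = { title: 5, tags: 5, summary: 2, content: 1 };

// Split the query into lowercase terms and drop the ignored words.
function toTerms(query) {
  const bad = new Set(params.badwords.split(","));
  return query.toLowerCase().split(/\s+/).filter((t) => t && !bad.has(t));
}

// Count non-overlapping occurrences of term in text (case-insensitive).
function countHits(text, term) {
  return text.toLowerCase().split(term).length - 1;
}

// Weighted score of one page for one term: hits per field times field weight.
function termScore(page, term) {
  let score = 0;
  for (const [field, weight] of Object.entries(searchfieldWeight)) {
    const value = Array.isArray(page[field]) ? page[field].join(" ") : String(page[field] ?? "");
    score += weight * countHits(value, term);
  }
  return score;
}

// AND: every term must hit somewhere; OR: one hitting term is enough.
function search(pages, query, mode = params.defaultsearch) {
  if (query.trim().length < params.minlength) return [];
  const terms = toTerms(query);
  if (terms.length === 0) return [];
  return pages
    .map((page) => {
      const scores = terms.map((t) => termScore(page, t));
      const matched = mode === "AND" ? scores.every((s) => s > 0) : scores.some((s) => s > 0);
      return { page, score: matched ? scores.reduce((a, b) => a + b, 0) : 0 };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score);
}

// Read the search term from a URL such as https://example.org/search/?query=searchterm
function queryFromUrl(href) {
  return new URL(href).searchParams.get("query") ?? "";
}

// Invented sample data in the shape of index.json entries.
const samplePages = [
  { title: "Queen bees", tags: ["bees"], summary: "About the queen", content: "The queen lays eggs." },
  { title: "Hive tools", tags: ["tools"], summary: "Useful tools", content: "A smoker calms the bees." },
];

const query = queryFromUrl("https://example.org/search/?query=queen%20bees");
console.log(search(samplePages, query, "AND").map((r) => r.page.title));
console.log(search(samplePages, query, "OR").map((r) => r.page.title));
```

With the weights above, a single hit in the title outweighs several hits in the content, which is why pages that name the term in their title or tags tend to end up at the top.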

More options

  • The search function can search different sections separately. To do this, set section_search to “1” and add the corresponding checkboxes to the search form (commented out in the example).
  • It is possible to specify several JSON arrays as the source for the search. This is especially useful if you want a single search function to search multiple websites. To indicate in the results which hits are local and which are from an external website, each item in the array of the external source can contain the value "extern":"1". This causes the string from extern_icon to be added to the search result after [TITLE]. Example: bienenkiste.de/suchen/?query=Königin
  • With very large JSON files, it may take a moment for the file to load. During this time json_wait is displayed. A spinner/loader icon can be inserted here, if the theme provides one. As soon as the file is loaded, json_ready is displayed.
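How several sources and the loading messages could fit together is sketched below. Again, this is an illustration under assumptions, not the actual search.js: loadSources and displayTitle are hypothetical helpers, and the fetch function and the status callback are passed in as parameters so that the sketch also runs outside a browser.

```javascript
// Load every source from a comma separated list and merge the arrays.
// fetchJson(url) must return a promise for the parsed JSON array;
// showStatus(html) stands in for writing into #custom-search-results.
async function loadSources(jsonSrc, fetchJson, showStatus) {
  showStatus("<p><em>loading...</em></p>"); // json_wait
  const urls = jsonSrc.split(",").map((u) => u.trim()).filter(Boolean);
  const arrays = await Promise.all(urls.map((u) => fetchJson(u)));
  showStatus("<p><em>Please enter a search term</em></p>"); // json_ready
  return arrays.flat();
}

// Append the extern_icon string to the title of items marked "extern":"1".
function displayTitle(item, externIcon = " (external link)") {
  return item.extern === "1" ? item.title + externIcon : item.title;
}
```

In a browser, fetchJson could be as simple as (url) => fetch(url).then((r) => r.json()), and showStatus could set the innerHTML of the results container.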

(Translation of my German blog article Volltextsuche für Hugo)


Funny, I have released a similar article lately, but with a different goal:

Mine is also a full-text search; the caveat is that it needs a database and a small REST-like script.
The SQL file for the database is generated when Hugo runs, and the JavaScript does the search.

The downside of your solution is of course that it is keyword-only, that the download size grows as your blog grows, and that the search will become slower over time, especially with limited data rates in non-Western countries.

But, as we Germans say, you have to die one death or another.


Interesting solution! I’ll keep it in mind for other problems. Thanks!
You’re right: I had a different goal. I wanted to write a recipe for HUGO dummies that can be used without any programming or server admin knowledge.
I was also worried at first that it might be a bit slow. But I think for 90% of HUGO websites it will probably be enough and run super fast. The example https://bienenkiste.de/suchen/ searches almost 200 pages. The two JSON files together are ~650 KB in size. Many homepages are 10 times “heavier” today. And the JavaScript runs very, very fast.
The main problem with larger sites is that I have not provided pagination. But you can compensate for this a little by increasing minlength.


This sounds awesome, thanks for sharing it. I’m going to look into how we can integrate this into some starter sites we’re working on including in Site.js (https://sitejs.org).

I’d also be interested in seeing what the performance characteristics would be if it were to use JSDB (https://ar.al/2020/10/20/introducing-jsdb/) instead of JSON.
