Building a JSON search index - trouble with .Plain

Hi all,

I’m trying to get Hugo to build a JSON search index for my site. I created an index.json template and set up my output formats correctly - the file is generated and there are no errors.

Here’s what it does:

{{- $.Scratch.Add "index" slice -}}
{{- range .Site.Pages -}}{{- $.Scratch.Add "index" (dict "title" .Title "tags" .Params.tags "content" .Plain "permalink" .Permalink) -}}{{- end -}}
{{- $.Scratch.Get "index" | jsonify -}}

Now the problem is that I have a bunch of pages .Plain returns nothing for, as they don’t have markdown content - their template pulls the content in from different smaller *.md files referenced as page resources.

Is there any function that could take the whole (generated) page including the content from the respective page resources, strip that of any HTML and add it to my index?

…or is my only option to add some more frontmatter that would be used for the index when it comes to those pages?

Any ideas appreciated, even up to building the index post-hugo with gulp before deploying (if that could work…?).

Hmm, I’m not sure. Does the .Content variable contain the right content that you look for? (Including your referenced assets.)

If that’s the case, you can pass .Content through the plainify function to strip any HTML formatting and get plain text: .Content | plainify.

That should be roughly the same as using .Plain, but this time hopefully with the referenced assets included. :)

Thanks for the suggestion :)

I tried .Content without plainify earlier and it showed the same behaviour if I’m not mistaken.

I’ll give it another shot tomorrow just to make sure though.

I’m guessing you have a single.html that, in addition to .Content, adds a lot of stuff from the template. What you could probably do is copy single.html to, say, mysingle.html and then replace the content of single.html with something like:

{{ $content := .Render "mysingle" }}
{{ $plain := $content | plainify }}
{{ $content }}

Thanks for your reply.

Correct, the problem is a single.html that pulls all of its content from the referenced page resources (that are needed, as parts of that content are also pulled by the single’s parent list page and the home page).

I tried your approach:

{{ $content := .Render "featuresCt" }}
{{ $plain := $content | plainify }}
{{ $content }}

The site built without errors and looked alright, but the JSON index is still blank for the respective pages. No other page variables (I tried .WordCount and .Summary) yield results for those pages either, as if they’re skipped because their index.md files contain nothing but front matter.

Using hugo version v0.58.0-64D8BF1E windows/amd64 BuildDate: 2019-09-04T15:43:46Z if that’s of any significance.

Sorry for the double post, just sorted it out myself.

What worked was adding an if that catches the pages in the 2 sections that pull their content from page resources and had returned an empty .Content so far.
Inside that if statement, I added another scratch and a range that goes through everything returned by .Resources.Match "**.md". That data then gets sent through plainify and added as the content for the respective page in my JSON index.
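
For anyone who wants to see the shape of it, here is a minimal sketch of that index.json template. It's not the exact code from my site: the section names "features" and "products" and the scratch key "resourceText" are placeholders I made up for this example, so swap in your own. It also assumes a Hugo version that allows reassigning template variables with = (that works on the 0.58 I'm using).

```go-html-template
{{- /* Sketch only: "features", "products" and "resourceText" are placeholder names. */ -}}
{{- $.Scratch.Add "index" slice -}}
{{- range .Site.Pages -}}
  {{- /* Default: use the page's own plain-text content. */ -}}
  {{- $content := .Plain -}}
  {{- if in (slice "features" "products") .Section -}}
    {{- /* These pages have no body of their own, so collect the text of their *.md page resources. */ -}}
    {{- $.Scratch.Set "resourceText" "" -}}
    {{- range .Resources.Match "**.md" -}}
      {{- $.Scratch.Add "resourceText" (printf "%s " (.Content | plainify)) -}}
    {{- end -}}
    {{- $content = $.Scratch.Get "resourceText" -}}
  {{- end -}}
  {{- $.Scratch.Add "index" (dict "title" .Title "tags" .Params.tags "content" $content "permalink" .Permalink) -}}
{{- end -}}
{{- $.Scratch.Get "index" | jsonify -}}
```

The "resourceText" scratch gets reset for every page before its resources are looped over, so text from one page doesn't leak into the next one's index entry.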

Might not be pretty, but it works and it doesn’t require redundant search keywords etc.

Here’s a more detailed piece on my workaround, just in case someone stumbles over this in the future: https://ttntm.me/blog/hugo-plain-function-ignores-page-resources/
