Omit shortcode content from wordcount

I want to include a shortcode that contains text within pages, but not have the words of that shortcode counted in the page's word count total. Is it possible to wrap the shortcode in a tag that tells Hugo not to count it?

Will you ever use the same shortcode more than once on the same page?

Not in my use case. I guess you could somehow subtract the shortcode’s word count from the page’s total when it’s used on a page?

shortcode

{{ printf "<!-- begin wordcount exclude -->" | safeHTML }}
{{ .Inner | .Page.RenderString }}
{{ printf "<!-- end wordcount exclude -->" | safeHTML }}
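For reference, a call to this shortcode in a content file might look like the sketch below. It assumes the shortcode is saved as layouts/shortcodes/wordcount-exclude.html; that file name is hypothetical, since the thread does not give one. The {{< >}} notation is used because the shortcode renders its inner Markdown itself with .Page.RenderString.

```text
This paragraph counts toward the total.

{{< wordcount-exclude >}}
This text is rendered on the page, but sits between the marker comments.
{{< /wordcount-exclude >}}
```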

template

{{ $begin := "<!-- begin wordcount exclude -->"}}
{{ $end := "<!-- end wordcount exclude -->"}}
{{ $regex := print "(?s)" $begin ".*?" $end }}
{{ $wordcount := replaceRE $regex "" .Content | countwords }}

Here is what I have for the template. How do I change it to show the proper word count?

	{{- if (.Param "ShowWordCount") }}

	{{ $begin := "<!-- begin wordcount exclude -->"}}
	{{ $end := "<!-- end wordcount exclude -->"}}
	{{ $regex := print "(?s)" $begin ".*?" $end }}
	{{ $wordcount := replaceRE $regex "" .Content | countwords }}
	
	<h5 id="wordcount"><center>Word Count: {{ $wordcount }}</center></h5>
	
	<h5 id="wordcount"><center>Word Count: {{ .WordCount }}</center></h5>
	
	
	{{- end }}

Seems like the values are wrong.

Word Count: 989 ← more with fewer??

Word Count: 910 ← .WordCount

Seems like .Content | countwords and .WordCount give different values.


It appears the issue is that the output of replaceRE $regex "" .Content contains something that is messing with the word count. Does it need to strip HTML entities first?

plainify doesn’t seem to help, I think.

I don’t know enough about this to say what is right. 951 is the word count that Scrivener and wordcounter.net both report for the text.

Can you share:

  • The content page (markdown)
  • The shortcode

The files

Thanks. I was able to reproduce the problem. I’ll have a look at it in a few hours.

For the moment, let’s ignore what other word counters return, and focus on the difference between .WordCount and countwords .Content.
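A quick way to see the two numbers side by side on any page is a sketch like the following. It uses only .WordCount, countwords, and sub; the paragraph markup is just for display.

```go-html-template
{{- /* Compare Hugo's built-in page count with countwords applied to the rendered HTML. */ -}}
<p>.WordCount = {{ .WordCount }}</p>
<p>countwords .Content = {{ countwords .Content }}</p>
<p>Difference = {{ sub (countwords .Content) .WordCount }}</p>
```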

                    .WordCount    countwords .Content
Original            910           1002
Replace with '      910           924
Replace with ...    910           910

Obviously, the countwords function doesn’t behave as expected with certain characters. I will log a bug. Our options at this point:

  1. Programmatically replace the troublesome characters in .Content before counting. Easy to do, but there are almost certainly other characters that cause problems.
  2. Use your original suggestion, and subtract the shortcode count from .WordCount. We still run the risk of encountering characters that countwords does not like, but it is (theoretically) much easier to control the content of the shortcode.

Let’s go with #2.

{{ $begin := "<!-- begin wordcount exclude -->"}}
{{ $end := "<!-- end wordcount exclude -->"}}
{{ $regex := print "(?s)" $begin ".*?" $end }}

{{ $excluded := findRE $regex .Content }}
{{ $excluded = apply $excluded "replace" "." $begin "" }}
{{ $excluded = apply $excluded "replace" "." $end "" }}
{{ $excluded = delimit $excluded " " }}

{{ $wordcount := sub .WordCount (countwords $excluded) }}

<p>.WordCount = {{ .WordCount }}</p>
<p>Word count excluding shortcode output: {{ $wordcount }}</p>

With your content page this produces:

.WordCount = 910
Word count excluding shortcode output: 897

This code will work with multiple instances of the shortcode on a page.
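Dropped into the template posted earlier in the thread, the same logic might look like the sketch below. It assumes the ShowWordCount param and the heading markup stay as the asker posted them, and it keeps only the adjusted count.

```go-html-template
{{- if (.Param "ShowWordCount") }}
  {{- $begin := "<!-- begin wordcount exclude -->" }}
  {{- $end := "<!-- end wordcount exclude -->" }}
  {{- $regex := print "(?s)" $begin ".*?" $end }}

  {{- /* Collect every marked region, then strip the marker comments themselves. */ -}}
  {{- $excluded := findRE $regex .Content }}
  {{- $excluded = apply $excluded "replace" "." $begin "" }}
  {{- $excluded = apply $excluded "replace" "." $end "" }}
  {{- $excluded = delimit $excluded " " }}

  {{- /* Subtract the words inside the marked regions from the page total. */ -}}
  {{- $wordcount := sub .WordCount (countwords $excluded) }}

  <h5 id="wordcount"><center>Word Count: {{ $wordcount }}</center></h5>
{{- end }}
```

On a page that never calls the shortcode, findRE finds nothing, so the subtraction should leave .WordCount unchanged.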

Finally, with respect to Hugo’s count vs. Scrivener and the others, try this on the page:

<p>.WordCount = {{ .WordCount }}</p>
<p>Length of .PlainWords = {{ len .PlainWords }}</p>
{{ range .PlainWords }}
  {{ . }}<br>
{{ end }}

Scrivener and the others may do it a bit differently, but this looks pretty good to me.

Great, thank you!
