Can you break a page summary by sentences, not paragraphs, as of v0.134?

Related to “summaryLength applies differently since 0.134.0” (Issue #12837): I want summaries of my pages that are whole sentences but not necessarily whole paragraphs. Previously I had set summaryLength = 10 in my config and used the auto summary to get back a summary which was suitably short. This no longer works, because the new summary code returns at least one whole paragraph.

Using the strings.Truncate function doesn’t work for me out-of-the-box because there’s no way to ensure that the summary returned will be a whole sentence.

For example, if the post starts off as:
"""
The sixth of January, 1482, is not, however, a day of which history has preserved the memory. There was nothing notable in the event which thus set the bells and the bourgeois of Paris in a ferment from early morning.

It was neither an assault by the Picards nor the Burgundians, nor a hunt led along in procession, nor a revolt of scholars in the town of Laas, nor an entry of “our much dread lord, monsieur the king,” nor even a pretty hanging of male and female thieves by the courts of Paris.
"""
then I would want the summary to be “The sixth of January, 1482, is not, however, a day of which history has preserved the memory.”, since that is the whole-sentence summary closest to 10 words.

Is there some way to achieve this, or to restore this to the summary functions?

No.

You could use manual summaries.

Shortening the automatic summary to a sentence breakpoint within the paragraph will be a little complex, as the paragraph may also contain HTML tags. Identifying whether a given dot actually ends a sentence might be a problem, too.

A simple brute-force approach, if the content is plain text without HTML tags, could be something like the following, which counts the words manually.

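It lacks a lot of checks, such as:

  • are there at least 10 words
  • handling multiple paragraphs in the first 10 words

      {{/* .PlainWords is the page content as a slice of words without HTML tags. */}}
      {{ $text := .PlainWords }}
      {{/* $i counts how many words to keep, starting from the first 10. */}}
      {{ $i := 10 }}
      {{/* Walk the words after the first 10 and stop at the first one ending with a dot. */}}
      {{ range after 10 $text }}
         {{ $i = add $i 1 }}
         {{ if findRE `\.$` . }}{{ break }}{{ end }}
      {{ end }}
      <p>{{ delimit (first $i $text) " " }}</p>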

A sentence end could be detected as a dot followed by a space and an uppercase letter (this will miss sentences starting with a digit, though). It also works only for certain languages.
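The dot, space, uppercase letter heuristic could be sketched in a template roughly as follows. This is only a sketch under a few assumptions: the content is close to plain text, `@@CUT@@` is a made-up marker that must not occur in the content, and `$min` is a hypothetical stand-in for the old summaryLength = 10 setting. It keeps appending whole sentences until the summary has at least `$min` words, so it returns the first sentence boundary at or past 10 words rather than the one strictly closest to 10.

```go-html-template
{{/* Minimum number of words wanted in the summary (hypothetical stand-in for summaryLength). */}}
{{ $min := 10 }}
{{/* Mark every "dot, whitespace, uppercase letter" boundary; $1 puts the uppercase letter back. */}}
{{ $marked := replaceRE `\.\s+(\p{Lu})` ".@@CUT@@$1" .Plain }}
{{ $summary := "" }}
{{/* Append whole sentences while the summary is still shorter than $min words. */}}
{{ range split $marked "@@CUT@@" }}
  {{ if lt (countwords $summary) $min }}
    {{ $summary = trim (printf "%s %s" $summary .) " \n" }}
  {{ end }}
{{ end }}
<p>{{ $summary }}</p>
```

With the example post above, the first sentence already has more than 10 words, so the loop should stop after it. Abbreviations such as “Mr. Smith” would be split incorrectly, and sentences ending in “?” or “!” are not detected, so the pattern would need adjusting for real content.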
