Automatic summary's `summaryLength` seems broken in the case of `plainify`

Since v0.134.0, `.Summary` is returned as HTML. If the summary contains picture elements, piping it through `plainify` can produce an empty string. Is there a workaround to bring the old behavior back?

It would be great to have a PlainSummary method; it would save users from having to modify their content.

I don’t know exactly how Hugo processes the summary, but here is what I found. My guess is at the bottom.


I tested with this markdown content.

First content.

<img src="https://xdaforums.com/data/assets/logo/xda-white-text.png">

Pellentesque quis neque non turpis scelerisque sodales. Suspendisse ultrices magna quis nisl malesuada, non porttitor dui eleifend. Duis ac purus vitae sapien semper eleifend. Nunc sit amet massa nec nisl mollis facilisis. Vivamus quis dui eget ex dignissim ultricies id eget neque. Vivamus pharetra nibh fringilla, varius ex in, facilisis arcu. Sed at dui sit amet orci sagittis ultrices. Nam facilisis justo et neque tincidunt euismod. Nam quis ligula ac ipsum auctor rutrum et non orci. Mauris pharetra elit sed nunc dapibus, quis porta ligula pulvinar. Sed bibendum ex id sagittis ullamcorper. Suspendisse eget eros pretium, congue lacus non, posuere lacus. Quisque id augue vitae metus facilisis sollicitudin at quis ipsum. Proin tristique justo eu tellus interdum maximus. Integer pretium velit iaculis, ullamcorper purus ac, pulvinar nibh. Vivamus pellentesque enim et sem varius, at mollis lectus elementum. Donec eget tortor cursus, tincidunt elit a, sagittis nisi. Nunc iaculis quam et tortor finibus, interdum sagittis leo lobortis. Aenean quis ex viverra, viverra diam vel, posuere tellus.

The generated meta tag is like this:

<meta name="description" content="First content.
Pellentesque quis neque non turpis scelerisque sodales. Suspendisse ultrices magna quis nisl malesuada, non porttitor dui eleifend. Duis ac purus vitae sapien semper eleifend. Nunc sit amet massa nec nisl mollis facilisis. Vivamus quis dui eget ex dignissim ultricies id eget neque. Vivamus pharetra nibh fringilla, varius ex in, facilisis arcu. Sed at dui sit amet orci sagittis ultrices. Nam facilisis justo et neque tincidunt euismod. Nam quis ligula ac ipsum auctor rutrum et non orci. Mauris pharetra elit sed nunc dapibus, quis porta ligula pulvinar. Sed bibendum ex id sagittis ullamcorper. Suspendisse eget eros pretium, congue lacus non, posuere lacus. Quisque id augue vitae metus facilisis sollicitudin at quis ipsum. Proin tristique justo eu tellus interdum maximus. Integer pretium velit iaculis, ullamcorper purus ac, pulvinar nibh. Vivamus pellentesque enim et sem varius, at mollis lectus elementum. Donec eget tortor cursus, tincidunt elit a, sagittis nisi. Nunc iaculis quam et tortor finibus, interdum sagittis leo lobortis. Aenean quis ex viverra, viverra diam vel, posuere tellus.
">

This seems to ignore what I declared in hugo.yml, though.

hasCJKLanguage: false
summaryLength: 25

But this is not that important for now.


Now I changed the content like this:

First content.

![](https://xdaforums.com/data/assets/logo/xda-white-text.png)

Pellentesque quis neque non turpis scelerisque sodales. ...

The Hugo theme I use modifies the output of ![]() and produces <p><picture><img class="img-fluid" src="https://xdaforums.com/data/assets/logo/xda-white-text.png" alt="" loading="lazy"></picture></p>.

But the summary is cut off like this:

<meta name="description" content="First content.
">

I put the raw HTML code into markdown like this:

First content.

<p>















  
  <picture>
  <img class="img-fluid" src="https://xdaforums.com/data/assets/logo/xda-white-text.png" alt="" loading="lazy">
</picture>
</p>

Pellentesque quis neque non turpis scelerisque sodales. ...

(HTML code is directly copied from the generated code above, without any modification)

The generated meta tag is good:

<meta name="description" content="First content.
Pellentesque quis neque non turpis scelerisque sodales. ...
">

I also have a shortcode that inserts HTML into the article:

First content.

{{< gallery/image src="https://xdaforums.com/data/assets/logo/xda-white-text.png" >}}

Pellentesque quis neque non turpis scelerisque sodales. ...

The result of the shortcode is <div class="gallery" align="center"><div style="width: 100%"><figure class="gallery-figure"><div class="gallery-container"><div class="gallery-inner" style="width: 100%"><picture><img class="img-fluid" src="https://xdaforums.com/data/assets/logo/xda-white-text.png" alt="" loading="lazy"></picture></div></div></figure></div></div>.

The meta tag is broken.

<meta name="description" content="First content.
">

But if I replace the shortcode with the same HTML written directly in the markdown, like this:

First content.

<div class="gallery" align="center"><div style="width: 100%"><figure class="gallery-figure"><div class="gallery-container"><div class="gallery-inner" style="width: 100%"><picture><img class="img-fluid" src="https://xdaforums.com/data/assets/logo/xda-white-text.png" alt="" loading="lazy"></picture></div></div></figure></div></div>

Pellentesque quis neque non turpis scelerisque sodales. ...

It works again!

<meta name="description" content="First content.
Pellentesque quis neque non turpis scelerisque sodales. ...
">

So here is my guess.

Hugo can exclude raw HTML from the automatic summary.
But it completely ignores every word after generated HTML (from the theme's image rendering or a shortcode).

Like I said before, I don’t know Hugo’s code well, so this is just an assumption.

But I hope these tests will help the devs diagnose the cause of this problem.
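
The post does not show the template that writes the meta tag. For readers following along, it is presumably something like the line below. This is an assumption based on the thread title, not the theme's actual code.

```go-html-template
{{/* Assumed theme code: strip tags from the automatic summary
     and use the result as the meta description. */}}
<meta name="description" content="{{ .Summary | plainify }}">
```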

I don’t understand what you mean. Please provide a concise example.

Here is an example. I think the Summary is truncated early (by summaryLength), so plainify doesn’t help in this case.

// index.html
{{ range .Pages }}
  {{ .Summary | plainify | warnf "%s" }}
{{ end }}

// content/foo.md
---
title: Foo
---

{{< gallery >}}

Bar

// shortcodes/gallery.html
<div>
  <picture>
    <img src="imgs/1.jpg" alt="1"/>
  </picture>
  <picture>
    <img src="imgs/2.jpg" alt="2"/>
  </picture>
  <picture>
    <img src="imgs/3.jpg" alt="3"/>
  </picture>
  <picture>
    <img src="imgs/4.jpg" alt="4"/>
  </picture>
  <picture>
    <img src="imgs/5.jpg" alt="5"/>
  </picture>
</div>

$ hugo
hugo v0.134.0-77df7bbbff8ce6b56ed693270088de973a87d5ce+extended windows/amd64 BuildDate=2024-09-03T09:54:22Z VendorInfo=gohugoio

WARN

$ hugo-0.129.0
hugo v0.129.0-e85be29867d71e09ce48d293ad9d1f715bc09bb9+extended windows/amd64 BuildDate=2024-07-17T13:29:16Z VendorInfo=gohugoio

WARN  Bar

$ hugo-0.132.0
hugo v0.131.0-bfbee17932ff24009008aa94cdd75c0c41f59279+extended windows/amd64 BuildDate=2024-08-02T09:03:48Z VendorInfo=gohugoio

WARN  Bar

I was unable to reproduce as described, so my example must be missing something:

git clone --single-branch -b hugo-forum-topic-51466 https://github.com/jmooring/hugo-testing hugo-forum-topic-51466
cd hugo-forum-topic-51466
hugo server

That’s strange, but I can replicate it by adding one more picture element to your example.

What is summaryLength in your site config?

I haven’t configured it; I created the example with `hugo new site`.

Here is the hugo.toml.

baseURL = 'https://example.org/'
languageCode = 'en-us'
title = 'My New Hugo Site'

In this case, `.Summary | plainify` produces an empty string or a much shorter text. Could a PlainSummary method be considered as a backwards-compatibility feature request? It would convert the HTML to plain text and then truncate it at the whole sentence closest to summaryLength.

No, that will not happen. Mostly because ContentWithoutSummary + PlainSummary = Content is not possible, but also because it’s not in the … budget.

I have, however, opened the issue above to see if it’s possible to tweak some of the corner cases mentioned above, but there’s a reason we have multiple summary types: if you’re not happy with the result of the automatic summary, add a summary divider or a front matter summary.
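
For the meta description case discussed in this thread, a template can also avoid the automatic summary entirely. The following is a minimal sketch, not an official recommendation: it prefers an explicit front matter description and otherwise falls back to the page's plain text. The partial path and the 160-character limit are assumptions chosen for illustration.

```go-html-template
{{/* Hypothetical partial: layouts/partials/meta-description.html */}}
{{ $description := "" }}
{{ with .Description }}
  {{/* An explicit front matter description wins when present. */}}
  {{ $description = . }}
{{ else }}
  {{/* Otherwise take the plain text of the full content and cut it to
       about 160 characters. 160 is an arbitrary choice, not a Hugo default. */}}
  {{ $description = .Plain | truncate 160 }}
{{ end }}
<meta name="description" content="{{ $description }}">
```

Because `.Plain` is derived from the whole page rather than from `.Summary`, HTML emitted by shortcodes near the top of the content should not shorten the result.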

OK, I see.

I think this patch should solve most issues. I will try to get out a patch release some time today:

Thank you very much.
