ReadingTime is computed as purely CJK or non-CJK, should this be changed?

DeinAlptraum · June 16, 2022, 9:41pm

Currently, the reading time is computed as (p.wordCount + 212) / 213 for non-CJK languages, or (p.wordCount + 500) / 501 for CJK languages. See here.
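To make the rounding concrete, here is a minimal Go sketch of the two formulas quoted above. The function name `readingTime` and the `cjk` flag are made up for illustration and are not Hugo's actual code; only the two integer expressions come from the post.

```go
package main

import "fmt"

// readingTime mirrors the two formulas from the post. Integer division
// after adding (divisor - 1) rounds up, so any non-empty page gets at
// least one minute. The function name and the cjk flag are hypothetical.
func readingTime(wordCount int, cjk bool) int {
	if cjk {
		return (wordCount + 500) / 501
	}
	return (wordCount + 212) / 213
}

func main() {
	for _, n := range []int{1, 213, 214, 1000} {
		fmt.Printf("%4d words: non-CJK %d min, CJK %d min\n",
			n, readingTime(n, false), readingTime(n, true))
	}
}
```

For a word count of 1000 this gives 5 minutes with the non-CJK formula but only 2 minutes with the CJK one, which is the size of the gap a mostly non-CJK page can fall into once it is classified as CJK.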

It seems a page is treated as CJK whenever hasCJKLanguage is set to true in the config and at least one CJK character is recognized in the page.

This obviously leads to incorrect reading times on pages with mixed CJK/non-CJK content: the reading time is computed as if the content were purely CJK, even if there is only a single Kanji somewhere in it.

The logic for recognizing both types of text and computing their respective word counts correctly is already there in the function I’ve linked above. One would just have to sum up the CJK and non-CJK counts separately, compute their read times independently via the CJK and the non-CJK formula respectively, and finally sum up those separate measures for the total word count and reading time, which should give us an accurate measure even for mixed text.
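A rough Go sketch of that idea, not taken from Hugo's source: `isCJK`, `countMixed`, and `readingTimeMixed` are hypothetical names, and treating Han, Hiragana, Katakana, and Hangul as "CJK" is my assumption. One deviation from the description above: rounding each part up separately would make a single Kanji cost a full extra minute, so this version sums the fractional minutes (213 words or 501 CJK characters per minute, as implied by the current formulas) and rounds up once.

```go
package main

import (
	"fmt"
	"math"
	"strings"
	"unicode"
)

// isCJK reports whether r is in a script counted as CJK in this sketch
// (an assumption; Hugo's own detection may differ).
func isCJK(r rune) bool {
	return unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana, unicode.Hangul)
}

// countMixed counts non-CJK words and CJK characters separately.
// Each CJK character counts on its own; a run of other letters or
// digits counts as one word. Punctuation is ignored.
func countMixed(text string) (words, cjkChars int) {
	for _, field := range strings.Fields(text) {
		inWord := false
		for _, r := range field {
			switch {
			case isCJK(r):
				cjkChars++
				inWord = false
			case (unicode.IsLetter(r) || unicode.IsDigit(r)) && !inWord:
				words++
				inWord = true
			}
		}
	}
	return words, cjkChars
}

// readingTimeMixed adds the fractional minutes of both parts and rounds
// up once, so a lone CJK character does not add a whole minute.
func readingTimeMixed(words, cjkChars int) int {
	minutes := float64(words)/213 + float64(cjkChars)/501
	return int(math.Ceil(minutes))
}

func main() {
	w, c := countMixed("Hello 世界, this is a mixed sentence.")
	fmt.Println(w, c, readingTimeMixed(w, c))
}
```

With this, 1000 non-CJK words plus one Kanji still come out at 5 minutes, the same as 1000 non-CJK words alone.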

Imo this would also make the differentiation depending on the setting of hasCJKLanguage unnecessary, since it can be treated correctly in mixed text already, though I guess this is up for debate.

Changing this seems simple enough and like a nice possible first contribution for me, and I’m also interested in changing this for my blog anyway. Are there any thoughts on this? Should I open an issue, and does this qualify as a bug?

chrillek · June 17, 2022, 10:48am

I’m wondering what the point of a “reading time” is anyway if it’s calculated from the word count. German, for example, tends to have longer words (and sentences) than English. And people have different reading speeds.

See

for reading speeds in different languages. Words per minute ranged from 138 (Arabic) to 228 (English). And reading speed drops about 20% after 60 years of age.
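To put those numbers in perspective, a small Go sketch (the helper `minutesAt` and the 1000-word page are made up for illustration) shows how much the estimate for the same text moves across the quoted range of speeds:

```go
package main

import (
	"fmt"
	"math"
)

// minutesAt returns the reading time in whole minutes, rounded up,
// for wordCount words read at wpm words per minute.
// The helper name is hypothetical.
func minutesAt(wordCount int, wpm float64) int {
	return int(math.Ceil(float64(wordCount) / wpm))
}

func main() {
	const words = 1000 // arbitrary example page length
	fmt.Println("138 wpm:", minutesAt(words, 138))
	fmt.Println("228 wpm:", minutesAt(words, 228))
	fmt.Println("228 wpm minus 20%:", minutesAt(words, 228*0.8))
}
```

For 1000 words that is 8 minutes at 138 wpm, 5 minutes at 228 wpm, and 6 minutes at 228 wpm reduced by 20%.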

tut · June 20, 2022, 12:00pm

Personally, I don’t take it at face value, but rather see it as the amount of time I should set aside to read a given piece of content (which may turn out to be more or less than the stated reading time).

idarek · June 20, 2022, 12:32pm

I am using this kind of code to present the read time:

<div class="readtime">
<div class="headico"><svg viewBox="0 0 1792 1792" xmlns="http://www.w3.org/2000/svg"><title>{{ T "ReadTime" }}</title><path d="M1024 544v448q0 14-9 23t-23 9h-320q-14 0-23-9t-9-23v-64q0-14 9-23t23-9h224v-352q0-14 9-23t23-9h64q14 0 23 9t9 23zm416 352q0-148-73-273t-198-198-273-73-273 73-198 198-73 273 73 273 198 198 273 73 273-73 198-198 73-273zm224 0q0 209-103 385.5t-279.5 279.5-385.5 103-385.5-103-279.5-279.5-103-385.5 103-385.5 279.5-279.5 385.5-103 385.5 103 279.5 279.5 103 385.5z"/></svg></div>
{{ $minutes := math.Round (div (countwords .Content) 400.0) }}{{ if lt $minutes 1 }}1{{ else }}{{ $minutes }}{{ end }} min.
</div>

I read my own articles, timed how long it took me to read them fully, and then adjusted the divisor so the result serves as a guide based on my own reading experience.

There is no hard rule. Work it out based on your experience. As @tut mentioned, this gives users a rough idea of how long it will take them to read.

If I see an article with a 40-minute read time, it lands on my reading list for later, but something that takes 2-3 minutes I may read as I browse.

bep · June 20, 2022, 3:35pm

@DeinAlptraum you are probably right, and I will take a closer look at the PRs on this that have surfaced recently.

As long as the reading time is a static value it will just be an indication as to “long read”, “a quick read” …

cshoredaniel · June 21, 2022, 1:59am

A bit of an aside: I wonder if there is a better way to capture the idea of long read, quick read, etc. without either requiring a legend (what does ‘long read’ mean anyway if that is all you say) or relying on a not particularly accurate ‘reading time’.

The only thing I can come up with is to use phrases that are relatively well understood without an explicit legend, like ‘quick read’ for ~1-3 regular paragraphs, ‘short letter’ (<= 500 words in English), ‘full page article’, ‘medium essay’, ‘long essay’, ‘short story’, ‘novella’, ‘short novel’, etc.
That would of course need some work to come up with a good set of descriptors, and would it really make much difference in the end?

Maybe a ‘length score’ would work, though that does, at least at first, require explaining the scale (which could be a footnote). (Adding to my to-do list: look up whether there is UX or other design research on this kind of topic; it would be unexpected if this were a new thought or topic.)
