Auto link keyword shortcode

I found the following shortcode on https://gohugohq.com:

{{ printf "<%s>" (.Get 0) | safeHTML }}
{{ $.Scratch.Set "content" .Inner }}
{{ range where $.Site.Pages.ByPublishDate.Reverse ".Params.seo" "ne" nil }}
    {{ $tmpPage := . }}
    {{ range .Params.seo }}
        {{ $.Scratch.Set "content" (($.Scratch.Get "content") | replaceRE . (printf "<a href=\"%s\" title=\"%s\">%s</a>" $tmpPage.RelPermalink $tmpPage.Title . ) )  }}
    {{ end }}
{{ end }}
{{ $.Scratch.Get "content" | safeHTML }}
{{ printf "</%s>" (.Get 0) | safeHTML }}

It works, but I would like to do something else and am too inept to adjust the code to my needs. Maybe someone can help me with that.

What I would like to do is:

  • range through the titles of a specific section
    {{ range where .Site.Pages.ByTitle "Section" "section-name" }}
  • if a {{ .Title }} is found within the {{ .Content }} <p>
  • change that text into an <a> that links to the page with that title
  • if it equals the {{ .Title }} of the current content, do not link anything

So if content with title “X” has a text which at one point states “Elon Musk likes anime” and there exists a content with title “Elon Musk”, the shortcode should do the following:

  <a href="link to Elon Musk" title="Elon Musk">Elon Musk</a> likes anime.

But if the content with title “Elon Musk” has “Elon Musk likes anime” in its text, that is not linked, since we are already in that content.
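To pin down the behavior described above, here is a minimal sketch in plain Go (not a Hugo template). The `page` type, the `linkTitles` function, and the URL are made up for illustration; only the linking rule itself comes from the description.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// page is a hypothetical stand-in for the two page fields needed here.
type page struct {
	Title string
	URL   string
}

// linkTitles turns every occurrence of another page's title in content
// into a link to that page. The page whose title equals currentTitle is
// skipped, so a page never links to itself.
func linkTitles(content, currentTitle string, pages []page) string {
	var pairs []string
	for _, p := range pages {
		if p.Title == "" || p.Title == currentTitle {
			continue
		}
		link := fmt.Sprintf(`<a href="%s" title="%s">%s</a>`,
			p.URL, html.EscapeString(p.Title), p.Title)
		pairs = append(pairs, p.Title, link)
	}
	// A Replacer scans the text once, so inserted links are not re-scanned.
	return strings.NewReplacer(pairs...).Replace(content)
}

func main() {
	pages := []page{{Title: "Elon Musk", URL: "/people/elon-musk/"}}

	// On page "X" the name becomes a link.
	fmt.Println(linkTitles("Elon Musk likes anime.", "X", pages))
	// On the "Elon Musk" page itself the text is left alone.
	fmt.Println(linkTitles("Elon Musk likes anime.", "Elon Musk", pages))
}
```

Note that this is plain string replacement: it knows nothing about HTML, so a title that also appears inside an existing tag or attribute would be replaced there too.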

What you ask is a pretty complex feature that is not built into Hugo. It can be done, albeit with a few caveats.

First of all, I assume that you are looking to cross link various single pages, so I am going to give you some pointers:

  • Hugo has a Related Content feature that, even though it does not index the Content of a page, can be used to store the Permalink and Title variables of every page in a project.

  • Content is a monolithic variable, meaning that you have to split it manually if it contains one of the Titles stored in the previous step. If the keyword appears somewhere in the middle, you split the Content into two parts and then output them by index with the cross link HTML in between.

  • You would probably need to use .Scratch to store the variables and retrieve them outside of the .Related range.

  • You would also need the printf function so that you can insert the variables into the .Content string without errors.

  • All of the above would be done directly in the single.html template and not from a shortcode.

  • However, this poor man’s search engine might slow down the build time of your project if you have tons of pages.

Anyway, I have outlined how to go about it, without the code. If someone more skilled than me has a simpler approach, maybe they could offer some advice.
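For anyone who wants the splitting step from the pointers spelled out, here is a minimal sketch in plain Go rather than Hugo template code. The function name `linkFirst` and the URL are made up for illustration, and it only handles the first occurrence of the keyword.

```go
package main

import (
	"fmt"
	"strings"
)

// linkFirst splits content into the part before and the part after the
// first occurrence of title, then puts the link HTML between the two.
// If title does not occur, content is returned unchanged.
func linkFirst(content, title, url string) string {
	before, after, found := strings.Cut(content, title)
	if !found {
		return content
	}
	return before + fmt.Sprintf(`<a href="%s">%s</a>`, url, title) + after
}

func main() {
	fmt.Println(linkFirst("I heard Elon Musk likes anime.", "Elon Musk", "/elon-musk/"))
}
```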

This sounds way beyond what I can do, but I will give it a try… it might take quite a while.

Currently looking at findRE and replaceRE and https://gist.github.com/tdegrunt/045f6b3377f3f7ffa408
and some other stuff.

I would avoid using RegEx for such a use case.
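One possible reason for caution (this is an illustration of a general regex pitfall, not something taken from the shortcode's source): in the shortcode at the top, the keyword itself is passed to replaceRE as the pattern, so any regex metacharacter in a keyword changes what gets matched. The sketch below shows the effect in plain Go; the function names and the sample title "A.I." are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesRaw uses the title directly as a regex pattern, so characters
// such as "." keep their special meaning. Invalid patterns return an error.
func matchesRaw(title, text string) (bool, error) {
	re, err := regexp.Compile(title)
	if err != nil {
		return false, err
	}
	return re.MatchString(text), nil
}

// matchesLiteral escapes the title first, so it only matches itself.
func matchesLiteral(title, text string) bool {
	return regexp.MustCompile(regexp.QuoteMeta(title)).MatchString(text)
}

func main() {
	raw, _ := matchesRaw("A.I.", "TAXIS")
	fmt.Println(raw)                                               // true: "." matches "X" and "S"
	fmt.Println(matchesLiteral("A.I.", "TAXIS"))                   // false
	fmt.Println(matchesLiteral("A.I.", "Notes on A.I. research")) // true
}
```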

Rather than doing all that, it might be best to have a look at the ref and relref shortcodes to create your cross links.

Also, my previous post outlined a pretty complex way of going about it, and the results are not guaranteed.

I have found a way to use Related Content for creating links across articles by detecting keywords, but unfortunately there is a limitation. See my other topic.

Bottom line: keyword detection is possible, so that .Content can be split and the link inserted. But you would need to use a front matter parameter for all the content files that do not have a cross link, unless this issue finds a solution.

If you don’t mind adding a front matter parameter to your content files I could post a working snippet for generating those links.

Would that require updating the front matter for all past content each time new content is added?

No.

I have refactored further. So here is the simplest solution that I could come up with. It should be blazing fast.

Configuration of Related Content in config.toml

[related]

threshold = 80
includeNewer = true
toLower = false

[[related.indices]]
name  = "crosslink"
weight = 100

You will need to add a front matter parameter to content files that you want to crosslink:
i.e. crosslink = "true"

Then in layouts/_default/single.html

{{ $related := .Site.RegularPages.Related . }}
{{ with $related }}
{{ range . }}
{{ $title := .Params.title }}
{{ $permalink := .Permalink }}          
{{- if and (strings.Contains $.Content $title) (ne $title $.Params.title) -}}
<li>{{ $content := split $.Content $title }}{{ index $content 0 | safeHTML }}<a href="{{ $permalink }}">{{ $title }}</a>{{ index $content 1 | safeHTML }}</li>
{{ end }}
{{ end }}
{{ end }}
{{ if not $related }}
<li>{{ .Content }}</li>
{{ end }}

With the above if the title parameter of a content file is found in the .Content then it will be turned into a link. If the keyword is in a content file with the same title parameter it will not be turned into a link.

Content files that do not belong in the related index crosslink will have their content displayed normally.

And that’s it.

I had a hard time figuring out the if not $related condition because I hadn’t removed a content file from the related index.

So this is my take on an automatic internal links generator based on keywords.

If anyone can improve this then please do.


CC / @bep @regis @rdwatters Thank you for your replies in the other topic and on GitHub.

A quick note: If you have several indices, you might want to use the RelatedTo method to make sure you use only one index (if that is what you’re after).

Thanks.

For some reason, when I use this method {{ $related := .Site.RegularPages.RelatedTo (keyVals "crosslink" "true") }}, the if not $related condition renders empty.

But {{ $related := .Site.RegularPages.RelatedIndices . "crosslink" }} works fine.

And I tested with another Related index defined and everything again works fine.

Thanks for the heads up.

Yes, and that was also what I meant (I remembered incorrectly).

It’s brilliant that Hugo’s Related Content can be leveraged into building internal links automatically.

So once again, thanks for developing all these cool Hugo features during the past year @bep ! I learn something new every day.

And @rokh thanks for creating the topic I wouldn’t have thought of doing this if you hadn’t asked.

This is probably not meant as a boolean but as a string? I added it to one piece of content.

Added to config:

    related:
      threshold: 80
      includeNewer: true
      toLower: false
      indices:
       - name: "crosslink"
       - weight: 100

Using {{ $related := .Site.RegularPages.RelatedIndices . "crosslink" }}

it goes to the if not $related part, and the default related partial (the example from the docs) stops working with the change in config.

Thank you for all the help so far, I will look into it more as soon as time permits.

Here is a working test repo with the above.

Ah, apologies, my YAML syntax was wrong. I managed to set it up properly now, I think. There is, however, a slight problem. I am not sure how to describe it, so I will use a real example.

First two paragraphs of text:

`Described as XYZ by Name One. Described as real life ABC by Name Two.

According to Name Two, events began on a…`

This turns the first occurrence of Name Two into a link. Name One is not set up yet. The second paragraph, where Name Two occurs again, is cut just before the name, and no further content is shown (every following paragraph is gone).

So it becomes:

`Described as XYZ by Name One. Described as real life ABC by Name Two.

According to`

If I add crosslink: "true" to Name One, it is not linked in the first paragraph. The first paragraph is then repeated after the second one, this time with Name One linked but Name Two not.

After that, the second paragraph is repeated in full with Name Two not linked, followed by the rest of the entire content.

Sorry if I am being annoying with this. I realize it might be rather complicated to do, or I am doing something wrong. I understand if there is no easy solution. It is not very important, it was just an idea about a nice thing to have.

It seems that you will be using multiple cross links within content files.

The example I posted accounts for just one crosslink that needs to have content before and after.

You will need to modify the example for your use case in order to split the content as needed.

Of course this can get pretty complex quite fast.

That’s why above I directed you to the ref shortcode.

Thank you, I will try to do it.