Get image URLs from content

Hi,
Is it possible to get links to all images used in the content (not in frontmatter)?

For example

---
title: Article with images
---

lorem ipsum
![alt](/images/22/image1.jpg)

lorem ipsum
lorem ipsum
![alt2](/images/21/image2.png)

lorem ipsum

lorem ipsum

![alt3](/images/22/image7.png)

then get

/images/22/image1.jpg
/images/21/image2.png
/images/22/image7.png

Ideally, I would like to use them as JSON inside <script type="application/ld+json"> per article:

    [{
      "@context": "https://schema.org/",
      "@type": "ImageObject",
      "contentUrl": "https://example.com/images/22/image1.jpg"
    },
    {
      "@context": "https://schema.org/",
      "@type": "ImageObject",
      "contentUrl": "https://example.com/images/21/image2.png"
    },
    {
      "@context": "https://schema.org/",
      "@type": "ImageObject",
      "contentUrl": "https://example.com/images/22/image7.png"
    }]

I'd appreciate any pointers on how to achieve this.

I was thinking about {{ .Resources.Match "images/*" }} or just {{ .Resources.ByType "image" }}, but that returns null.

{{ range findRE `<img.+?>` .Content }}
  {{ replaceRE `.*src="(.+?)".*` "$1" . }}
{{ end }}
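
Hugo's findRE and replaceRE take Go regular expression syntax, so the two steps above can be tried outside a template. Here is a minimal Go sketch of the same idea, assuming a single-line img tag as input; the helper name extractImageSrcs is made up for illustration and is not part of Hugo:

```go
package main

import (
	"fmt"
	"regexp"
)

// The same two patterns as in the template above, compiled once.
var (
	imgTagRE = regexp.MustCompile(`<img.+?>`)
	srcRE    = regexp.MustCompile(`.*src="(.+?)".*`)
)

// extractImageSrcs is a hypothetical helper: it finds every <img ...> tag
// and replaces each whole tag with the captured value of its src attribute.
func extractImageSrcs(html string) []string {
	var srcs []string
	for _, tag := range imgTagRE.FindAllString(html, -1) {
		srcs = append(srcs, srcRE.ReplaceAllString(tag, "$1"))
	}
	return srcs
}

func main() {
	html := `<p>
  <img src="http://localhost:1313/images/image.png" alt="my image">
</p>`
	for _, src := range extractImageSrcs(html) {
		fmt.Println(src) // prints http://localhost:1313/images/image.png
	}
}
```

Note that a tag without a src attribute would come back unchanged, because the second pattern would not match it.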

Thank you.

It is not working on my page when I try to use it (probably due to image processing), but it works as expected on my test site, so I will try to figure it out from there.

I wonder if you can assist me with this further.

It works on this:

<p>
  <img src="http://localhost:1313/images/image.png" alt="my image">
</p>

But it doesn’t work on this:

<picture>
 <source media="(max-width: 428px)" srcset="/images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_856x0_resize_q75_h2_lanczos.webp 2x,
                                        /images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_428x0_resize_q75_h2_lanczos.webp" type="image/webp">
 <source media="(max-width: 834px)" srcset="/images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_2048x0_resize_q75_h2_lanczos.webp 2x,
                                        /images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_856x0_resize_q75_h2_lanczos.webp" type="image/webp">
 <source media="(max-width: 1440px)" srcset="/images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_2048x0_resize_q75_h2_lanczos.webp 2x,
                                        /images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_1440x0_resize_q75_h2_lanczos.webp" type="image/webp">
 <source media="(min-width: 1441px)" srcset="/images/2022/11/image_-_8_hu451a2b91819a3d54456f55044d283390_230670_2048x0_resize_q75_h2_lanczos.webp" type="image/webp">

 <img src="http://localhost:1313/images/2022/11/image_-_8.jpg" alt="img alt" decoding="async" width="2048" height="1536" fetchpriority="high">

</picture>

Which is strange, as in both examples the regex is targeting <img.

Is that really what your rendered HTML looks like, or have you removed some newlines?

When you inspect an element with your browser’s dev tools, the HTML is formatted to make it easier to read.

View the actual source instead (typically Ctrl + U).

The above is copied from the source, but there are 2 differences between the sites.

Both outputs are rendered from a string like this in the content (example):

![my image](/images/image.png)

But the output

<p>
  <img src="http://localhost:1313/images/image.png" alt="my image">
</p>

is from the site that uses the built-in rendering in a clean Hugo install,

whereas the other is processed through render-image.html as follows:

{{- $image := resources.Get .Destination -}}

<picture{{ with .Title }} role="img" aria-labelledby="{{ printf "figcaption-%d" ($image.Permalink | crypto.FNV32a) }}"{{ end }}>

{{ $isJPG := eq (path.Ext .Destination) ".jpg" }}
{{ $isJPEG := eq (path.Ext .Destination) ".jpeg" }}
{{ $isPNG := eq (path.Ext .Destination) ".png" }}

{{ if or ($isJPG) ($isJPEG) ($isPNG) }}

        {{ $tinyw := "428x webp" }}
        {{ $smallw := "856x webp" }}
        {{ $mediumw := "1440x webp" }}
        {{ $largew := "2048x webp" }}
        {{ $originw := (printf "%dx webp" $image.Width) }}

        {{ $data := newScratch }}
        {{ if ge $image.Width 428 }}
                {{ $data.Set "tiny" ($image.Resize $tinyw) }}
        {{ else }}
                {{ $data.Set "tiny" ($image.Resize $originw) }}
        {{ end }}
        {{ if ge $image.Width 856 }}
                {{ $data.Set "small" ($image.Resize $smallw) }}
        {{ else }}
                {{ $data.Set "small" ($image.Resize $originw) }}
        {{ end }}
        {{ if ge $image.Width 1440 }}
                {{ $data.Set "medium" ($image.Resize $mediumw) }}
        {{ else }}
                {{ $data.Set "medium" ($image.Resize $originw) }}
        {{ end }}
        {{ if ge $image.Width 2048 }}
                {{ $data.Set "large" ($image.Resize $largew) }}
        {{ else }}
                {{ $data.Set "large" ($image.Resize $originw) }}
        {{ end }}

        {{ $tiny := $data.Get "tiny" }}
        {{ $small := $data.Get "small" }}
        {{ $medium := $data.Get "medium" }}
        {{ $large := $data.Get "large" }}

        <source media="(max-width: 428px)" 
                srcset="{{with $small.RelPermalink }}{{.}}{{ end }} 2x,
                        {{with $tiny.RelPermalink }}{{.}}{{ end }}"
                type="image/webp">

        <source media="(max-width: 834px)" 
                srcset="{{with $large.RelPermalink }}{{.}}{{ end }} 2x,
                        {{with $small.RelPermalink }}{{.}}{{ end }}"
                type="image/webp">

        <source media="(max-width: 1440px)" 
                srcset="{{with $large.RelPermalink }}{{.}}{{ end }} 2x,
                        {{with $medium.RelPermalink }}{{.}}{{ end }}"
                type="image/webp">

        <source media="(min-width: 1441px)" 
                srcset="{{with $large.RelPermalink }}{{.}}{{ end }}"
                type="image/webp">
 
{{ end }}
                
 <img
  src="{{ $image.Permalink | safeURL }}"
  alt="{{ .Text }}" {{ with .Title }} title="{{ . }}"{{ end }}
  decoding="async"
  {{ if or ($isJPG) ($isJPEG) ($isPNG) }}width="{{ $image.Width }}"
  height="{{ $image.Height }}"{{ end }}
  loading="lazy">
  {{ with .Title }}<span id="{{ printf "figcaption-%d" ($image.Permalink | crypto.FNV32a) }}" class="caption">{{ . }}</span>{{ end }}
</picture>

And this is where the problem appears.

It looks like findRE, when processing .Content, doesn’t see images that go through render-image.html, which is why it outputs no results.

Looking through the forum, I found a post mentioning that

(…) Render Hooks are not available from the outside (…)

If that’s the case, I need to find a different way to achieve that.

You removed the newlines from what you posted.

Look at your render hook—the img element has newlines.

Change the regex to handle newlines within the img element.

{{ range findRE `(?s)<img.+?>` .Content }}
  {{ replaceRE `(?s).*src="(.+?)".*` "$1" . }}
{{ end }}

See https://pkg.go.dev/regexp/syntax#hdr-Syntax
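
To see the effect of the (?s) flag in isolation, here is a minimal Go sketch using the same patterns on an img element that spans several lines, as the render hook produces. The helper names findImgTags and extractSrc are made up for illustration, and the sample HTML is a shortened stand-in for the real output:

```go
package main

import (
	"fmt"
	"regexp"
)

// findImgTags is a hypothetical helper. With dotAll set, it prefixes the
// pattern with (?s) so that "." also matches newline characters.
func findImgTags(html string, dotAll bool) []string {
	pattern := `<img.+?>`
	if dotAll {
		pattern = `(?s)` + pattern
	}
	return regexp.MustCompile(pattern).FindAllString(html, -1)
}

// extractSrc reduces one (possibly multi-line) img tag to its src value.
func extractSrc(tag string) string {
	return regexp.MustCompile(`(?s).*src="(.+?)".*`).ReplaceAllString(tag, "$1")
}

func main() {
	html := `<picture>
 <img
  src="/images/2022/11/image_-_8.jpg"
  alt="img alt"
  loading="lazy">
</picture>`

	fmt.Println(len(findImgTags(html, false))) // 0: "." stops at the newline after <img
	for _, tag := range findImgTags(html, true) {
		fmt.Println(extractSrc(tag)) // /images/2022/11/image_-_8.jpg
	}
}
```

Without the flag, the match fails right after <img because the next character is a newline, which is why the first version returned nothing for the render hook output.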


Wow, I didn’t know this was so fragile with newlines. It looks like my browser unifies the lines, so I didn’t see them broken up. Thanks again, I will give this a try.

Not fragile. The regex spec is quite clear.

Thanks, that’s my lack of knowledge in that area then :)

I will share shortly what I have been working on as it may be helpful for others.
