How to generate an index.json that outputs each page's image src URLs?

Hi, I’m having trouble figuring out how to get the index.json output to format correctly.

You can see my current output here:

Specifically, if you look at the “image_url” key, my goal is to grab every image listed on the page by its src value. A bonus would be to grab each image’s absolute URL.

Currently my output contains the full HTML code of each img tag instead.

See github here for the project files and code I’m currently using:

I have a shortcode that resizes images in posts to fit the screen, which I think may be affecting the output, but I’m unsure.

Here is my current index.json:

[
{{ range $index, $page := (where .Site.Pages "Kind" "page") -}}
{{- if ne $page.Type "json" -}}
{{- if and $index (gt $index 0) -}},{{- end }}
{
	"uri": "{{ $page.Permalink }}",
	"title": "{{ htmlEscape $page.Title}}",
	"categories": [{{ range $tindex, $tag := $page.Params.categories }}{{ if $tindex }}, {{ end }}"{{ $tag| htmlEscape }}"{{ end }}],
	"tags": [{{ range $tindex, $tag := $page.Params.tags }}{{ if $tindex }}, {{ end }}"{{ $tag| htmlEscape }}"{{ end }}],
	"description": "{{ htmlEscape .Description}}",
	"content": {{$page.Plain | jsonify}},
	"image_url": [{{ range $pageimgs := (findRE `<img[^>]*src="([^"]+)"[^>]*>` .Content )}}{{ $myvar := replaceRE `/resize.+?/` `/` $pageimgs }}{{ $imageslice := slice $myvar }}{{ range $imageslice }}"{{ . }}",{{ end }}{{ end }}]
}
{{- end -}}
{{- end -}}
]

Is what I’m trying to accomplish possible?

You could use findRESubmatch to capture the submatch (the src value) instead of the whole img tag.

{{/* ... */}}
{{- $imgs := slice -}}
{{- range findRESubmatch `<img[^>]*src="([^"]+)"[^>]*>` .Content -}}
  {{ $imgs = $imgs | append (index . 1) }}
{{- end -}}
"image_url": {{ $imgs | jsonify }}
{{/* ... */}}
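
For the bonus (absolute URLs), one option is to pipe each captured value through absURL before appending it. This is a minimal sketch, assuming the src values are site-relative paths such as /images/... and that baseURL is set to the address the site is actually served from:

```go-html-template
{{- $imgs := slice -}}
{{- range findRESubmatch `<img[^>]*src="([^"]+)"[^>]*>` .Content -}}
  {{- /* index 0 is the whole <img> tag, index 1 is the captured src value */ -}}
  {{- $imgs = $imgs | append (index . 1 | absURL) -}}
{{- end -}}
"image_url": {{ $imgs | jsonify }}
```

If you would rather keep the paths relative, drop the `| absURL` part.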

By the way, it would be better to build the data with slice and dict and let jsonify produce the JSON. The template is easier for others to read, and jsonify takes care of quoting, escaping and commas. For example:

{{- $imgs := slice "foo.jpg" "bar.png" -}}
{{- $data := slice -}}
{{- $data = $data | append (dict 
	"title" "foo"
	"content" "foo content"
	"imgs" $imgs)
-}}
{{- $data = $data | append (dict 
	"title" "bar"
	"content" "bar content"
	"imgs" $imgs)
-}}
{{- $data | jsonify -}}
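
Applied to the index.json from the question, the same pattern might look like the sketch below. The key names are copied from the original template; the `default (slice)` fallbacks are an assumption about how pages without categories or tags should be handled, and htmlEscape is dropped because jsonify does the escaping:

```go-html-template
{{- $data := slice -}}
{{- range where .Site.Pages "Kind" "page" -}}
  {{- if ne .Type "json" -}}
    {{- /* collect the captured src value of every <img> in the rendered content */ -}}
    {{- $imgs := slice -}}
    {{- range findRESubmatch `<img[^>]*src="([^"]+)"[^>]*>` .Content -}}
      {{- $imgs = $imgs | append (index . 1) -}}
    {{- end -}}
    {{- /* one dict per page; fall back to an empty slice when a param is missing */ -}}
    {{- $data = $data | append (dict
      "uri" .Permalink
      "title" .Title
      "categories" (default (slice) .Params.categories)
      "tags" (default (slice) .Params.tags)
      "description" .Description
      "content" .Plain
      "image_url" $imgs) -}}
  {{- end -}}
{{- end -}}
{{- $data | jsonify -}}
```

Because the whole array is serialized in one go, there are no hand-placed commas to get wrong. For indented output, the last line can be changed to `{{- $data | jsonify (dict "indent" "  ") -}}`.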

Thank you for the quick reply and the recommendations.

I updated my code and Hugo to try your suggestions.

I’m on version v0.111.3.

My new code is:

[
{{ range $index, $page := (where .Site.Pages "Kind" "page") -}}
{{- if ne $page.Type "json" -}}
{{- if and $index (gt $index 0) -}},{{- end }}
{{- $imgs := slice -}}
{{- range findRESubmatch `<img[^>]*src="([^"]+)"[^>]*>` .Content -}}
{{ $imgs = $imgs | append (index . 1) }}
{{- end -}}
{
        "uri": "{{ $page.Permalink }}",
        "title": "{{ htmlEscape $page.Title}}",
        "categories": [{{ range $tindex, $tag := $page.Params.categories }}{{ if $tindex }}, {{ end }}"{{ $tag| htmlEscape }}"{{ end }}],
        "tags": [{{ range $tindex, $tag := $page.Params.tags }}{{ if $tindex }}, {{ end }}"{{ $tag| htmlEscape }}"{{ end }}],
        "description": "{{ htmlEscape .Description}}",
        "content": {{$page.Plain | jsonify}},
        "image_url": {{ $imgs | jsonify }}
}
{{- end -}}
{{- end -}}
]

But when I run my hugo command to serve the site on localhost:

hugo serve -d -t hugo-w3-simple --bind "0.0.0.0" --baseURL http://"$ip":1313

I get this error:

Failed to acquire a build lock: %s open /home/israel-master-key/my-blog/quickstart/.hugo_build.lock: no such file or directory 

hugo v0.102.3+extended linux/arm64 BuildDate=2022-09-13T20:32:22Z VendorInfo=ubuntu:0.102.3-1ubuntu1

P.S. I’ll have to figure out how to make my JSON output prettier next. Thanks for the tips.

Have a look at the template that I presented for building an image sitemap.

The code can easily be adapted to output what’s required in a JSON file.

This seems to be a totally different issue. I’d recommend creating a separate topic so that more users notice it.

I have not encountered a similar problem and cannot provide more information.

Silly me, I had to exit my Screen session that was running in the background. I can confirm, @razon, that your solution worked perfectly. Here is the new output:

"image_url": ["/images/23/03/comptia/understanding-motherboards.png","/images/23/03/comptia/motherboard.png","/images/23/03/comptia/motherboard-chipsets.png","/images/23/03/comptia/front-and-side-panels.png","/images/23/03/comptia/cache-memory.png","/images/23/03/comptia/ram-slots.png","/images/23/03/comptia/bios-chip.png"]

Thank you, everyone, for all your help. This will help me a lot!
