Using GetRemote to get new entries from API by incrementing ID

I’m dealing with an API that doesn’t provide a get all endpoint. Instead you have to specify the specific entry ID you want to get.

For example,

https://example.com/api/data/entries/ID

I have the following working code for getting a single ID:

{{ printf "<!-- This page intentionally blank. -->" | safeHTML }}

{{/* Constants. */}}
{{ $contentSection := "entries" }}
{{ $contentTemplate := "templates/content.md" }}
{{ $url := "https://www.example.com/api/data/entries/1234" }}
{{ $token := getenv "HUGO_TOKEN" | string }}
{{ $auth := printf "%s %s" "Bearer" $token }}
{{ $opts := dict
  "headers" (dict "Authorization" $auth)
}}

{{/* Get data. */}}
{{ $data := dict }}
{{ with resources.GetRemote $url $opts}}
  {{ with .Err }}
    {{ errorf "%s" . }}
  {{ else }}
    {{ $data = . | transform.Unmarshal }}
    {{/* Debugging data shown in prebuild/public/index.html */}}
    {{print $data}}
  {{ end }}
{{ else }}
  {{ errorf "Unable to get remote resource %s" $url }}
{{ end }}

{{/* Publish pages. */}}
{{ with $r := resources.Get $contentTemplate }}
  {{ with $data }}
    {{/* Map data fields to content and front matter. */}}
    {{ $content := .Content }}
    {{ $frontMatter := dict
      "date" .Entry.DateCreated
    }}
    {{/* Publish a page. */}}
    {{ $ctx := dict "frontMatter" $frontMatter "content" $content }}
    {{ $path := printf "%s/%s.md" $contentSection (.Entry.Name | urlize) }}
    {{ ($r | resources.ExecuteAsTemplate $path $ctx).Publish }}
  {{ else }}
    {{ errorf "The data set retrieved from %s is empty" $url }}
  {{ end}}
{{ else }}
  {{ errorf "Unable to get resource %s" $contentTemplate }}
{{ end }}

What I’d like to do is:

  • Store the ID in a state file like state.json
  • Increment the ID and run GetRemote until a 404 is returned
  • Update the state file with the latest ID that returned 200

I know how to do this in code, but not sure if it’s possible in Hugo alone.

Thanks in advance!

Since v0.126.0 there’s a better way to create pages from data:
https://gohugo.io/content-management/content-adapters/
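As a rough sketch of what a content adapter could look like for a single entry: this assumes a file at content/entries/_content.gotmpl (a hypothetical location) and the same JSON field names used in the template above (.Content and .Entry.Name). It is an illustration under those assumptions, not a drop-in solution.

```go-html-template
{{/* content/entries/_content.gotmpl (hypothetical path) */}}
{{ $url := "https://www.example.com/api/data/entries/1234" }}
{{ $token := getenv "HUGO_TOKEN" | string }}
{{ $opts := dict "headers" (dict "Authorization" (printf "Bearer %s" $token)) }}

{{ with resources.GetRemote $url $opts }}
  {{ with .Err }}
    {{ errorf "%s" . }}
  {{ else }}
    {{ $data := . | transform.Unmarshal }}
    {{/* Field names are assumed to match the API response shown earlier. */}}
    {{ $page := dict
      "kind" "page"
      "path" ($data.Entry.Name | urlize)
      "title" $data.Entry.Name
      "content" (dict "mediaType" "text/markdown" "value" $data.Content)
    }}
    {{/* Use $ because "with" has rebound the dot. */}}
    {{ $.AddPage $page }}
  {{ end }}
{{ else }}
  {{ errorf "Unable to get remote resource %s" $url }}
{{ end }}
```

Pages added this way exist only in memory during the build; nothing is written to the content directory.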

As I understand it, you don’t have access to all entries in one API call. Is that correct?

If yes, is there another endpoint that would give you an array of IDs? Or, in the JSON returned by a single entry call, is there a field that contains the next ID?

If no, is the first ID always 1? How many entries in total?

Since v0.126.0 there’s a better way to create pages from data:
Content adapters | Hugo

I actually tried your example hugo-github-issue-12440 to compare it with the preload way and I don’t think the former works for my use-case. I want to get new entries from the API, then vet the contents manually before building the pages. AFAICT using a content adapter the API responses are not written to disk as files I can manipulate. Make sense?

As I understand it, you don’t have access to all entries in one API call. Is that correct?

Exactly!

If yes, is there another endpoint that would give you an array of IDs? Or, in the JSON returned by a single entry call, is there a field that contains the next ID?

Unfortunately, the answer to both is no.

If no, is the first ID always 1? How many entries in total?

No, the ID at the moment is 1023 and it increments sequentially for each new entry. The total number of entries from 1023 onward is unknown, as new entries arrive each day (one or more per day).

In other words, the only hacky way I can think of is, as I mentioned, to store the current ID (let’s say 1023) once I’ve completed this Hugo site. Then once a day I’ll build the site using GitHub Actions, and it will try to get ID 1024. If that returns a 200, it will keep incrementing the ID (1025, 1026, etc.). If 1027 returns a 404, the state file will be updated with ID 1026, which was the last entry for the day.

Yuck.

Perhaps you can write to a “last-id.txt” in the prebuild public directory during the prebuild phase, and make sure you don’t clear that file between builds. You can write this file using the resources.FromString function.

To read it, you’d need to mount the prebuild public directory to something you can read the file from during the prebuild phase, probably assets. Then you can use resources.Get "last-id.txt" and the resource’s .Content method to get the file contents. It’s ugly, but the best I can come up with at the moment.

The only directory in which you can create (publish) a file is the public directory.


Yuck
I agree!

Ok I’m trying to implement your suggestions.

I got the id stored in a file using:

    {{ $r := resources.FromString "last-id.txt" $id }}
    {{ $r.Publish }}

I’m trying to do the second half which is to read the contents of that file, but not sure how to get the contents.

I’m mounting the public folder into assets like this:

[[module.mounts]]
source = 'assets'
target = 'assets'

[[module.mounts]]
source = 'prebuild/public'
target = 'assets'

In my layouts/index.html I added this to read the file contents:

{{ $lastIDContent := resources.Get "last-id.txt" | .Contents}}

The error I get when building is:

executing "index.html" at <.Contents>: can't evaluate field Contents in type *hugolib.pageState

Searching through the docs I’m not sure what the correct syntax is to read the contents from the file.

Singular, not plural.
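A minimal sketch of reading the file, assuming last-id.txt is visible under assets through the mount. Judging by the error message, piping into .Contents evaluated the field against the current context (the page) rather than the resource, so calling .Content on the resource itself avoids that as well. The variable name follows the snippet above.

```go-html-template
{{ with resources.Get "last-id.txt" }}
  {{/* Inside "with", the dot is the resource, so .Content is the file body. */}}
  {{ $lastIDContent := .Content }}
  {{ warnf "last ID: %s" $lastIDContent }}
{{ else }}
  {{/* resources.Get returns nil when the file is not found. */}}
  {{ warnf "last-id.txt not found in assets" }}
{{ end }}
```

A one-line form would be {{ $lastIDContent := (resources.Get "last-id.txt").Content }}, though that fails if the file does not exist yet.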

Are you doing this in the prebuild project, or in the main project? My understanding is that the ID read/write would occur during the prebuild stage.


:person_facepalming:

You’re right! I made a mistake adding prebuild/public in the hugo.toml inside the prebuild directory. Changing it to public worked :smiley:
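For anyone following along, the mounts in the prebuild project’s hugo.toml would then look roughly like this. It is a sketch based on the description above, assuming source paths are resolved relative to the prebuild directory.

```toml
[[module.mounts]]
source = 'assets'
target = 'assets'

# The prebuild project's own public directory, where last-id.txt is published.
[[module.mounts]]
source = 'public'
target = 'assets'
```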

So this is my current solution to working with this :poop: API:

{{ printf "<!-- This page intentionally blank. -->" | safeHTML }}

{{/* Constants */}}

{{ $contentSection := "entries" }}
{{ $contentTemplate := "templates/content.md" }}
{{ $lastID := resources.Get "last-id.txt"}}

{{/* Authentication */}}

{{ $token := getenv "HUGO_API_TOKEN" | string }}
{{ $auth := printf "%s %s" "Bearer" $token }}
{{ $opts := dict
  "headers" (dict "Authorization" $auth)
}}

{{/* Get entries from API */}}

{{ $data := dict }}

{{ $nextID := add 1 ($lastID.Content | int)}}
{{range seq 50}}
  {{ warnf "nextID: %d" $nextID }}

  {{ $url := printf "%s%d" "https://www.someapi.com/api/entries/" $nextID }}
  {{ warnf "URL: %s" $url }}

  {{ with resources.GetRemote $url $opts}}
    {{ with .Err }}
      {{ errorf "%s" . }}
    {{ else }}
      {{ $data = . | transform.Unmarshal }}
      {{/* Debugging data shown in prebuild/public/index.html */}}
      {{print $data}}

      {{ $r := resources.FromString "last-id.txt" $nextID }}
      {{ $r.Publish }}

    {{ end }}
  {{ else }}
    {{ errorf "Unable to get remote resource %s" $url }}
    {{ break }}
  {{ end }}

  {{/* Publish pages. */}}
  {{ with $r := resources.Get $contentTemplate }}
    {{ with $data }}
      {{/* Map data fields to content and front matter. */}}
      {{ $content := .Content }}
      {{ $frontMatter := dict
        "date" .Entry.DateCreated
        "title" .Entry.Title
      }}
      {{/* Publish a page. */}}
      {{ $ctx := dict "frontMatter" $frontMatter "content" $content }}
      {{ $path := printf "%s/%s.md" $contentSection (.Entry.Title | urlize) }}
      {{ ($r | resources.ExecuteAsTemplate $path $ctx).Publish }}
    {{ else }}
      {{ errorf "The data set retrieved from %s is empty" $url }}
    {{ end}}
  {{ else }}
    {{ errorf "Unable to get resource %s" $contentTemplate }}
  {{ end }}

  {{ $nextID = add 1 $nextID }}

{{ end }}

One thing I couldn’t get to work is checking the HTTP status code and if it’s a 404, break.
At the moment it breaks after {{ errorf "Unable to get remote resource %s" $url }}.

I tried the following from looking at the hugo docs:

  {{ with resources.GetRemote $url $opts}}
    {{ with .Err }}
      {{ errorf "%s" . }}
    {{ else }}
      {{with .Data}}
          {{warnf .StatusCode}}
        {{if eq .StatusCode 404}}
          {{ errorf "Resource not found: %s" $url }}
          {{ break }}
        {{end}}
       
        {{ $data = . | transform.Unmarshal }}
        {{/* Debugging data shown in prebuild/public/index.html */}}
        {{print $data}}

        {{ $r := resources.FromString "last-id.txt" $nextID }}
        {{ $r.Publish }}
      {{end}}
    {{ end }}
  {{ else }}
    {{ errorf "Unable to get remote resource %s" $url }}
  {{ end }}

But I never get any output from {{warnf .StatusCode}}.
Otherwise, with your help @jmooring I have a working, albeit hacky, solution :stuck_out_tongue:

The resources.GetRemote function returns nil when the HTTP request receives a 404 status code in the response.

You can’t access .Data on nil (obviously), so you can’t check the status code. In your example, break from the loop when resources.GetRemote returns nil.
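A sketch of that loop, reusing the names from the template above. The fallback starting ID of 1023 and the cap of 50 requests per build are assumptions for illustration, and the page-publishing step is elided.

```go-html-template
{{ $token := getenv "HUGO_API_TOKEN" | string }}
{{ $opts := dict "headers" (dict "Authorization" (printf "Bearer %s" $token)) }}

{{/* Assumed fallback when last-id.txt does not exist yet. */}}
{{ $lastID := 1023 }}
{{ with resources.Get "last-id.txt" }}
  {{ $lastID = .Content | int }}
{{ end }}

{{ $nextID := add 1 $lastID }}
{{ range seq 50 }}
  {{ $url := printf "https://www.someapi.com/api/entries/%d" $nextID }}
  {{ with resources.GetRemote $url $opts }}
    {{ with .Err }}
      {{ errorf "%s" . }}
    {{ else }}
      {{ $data := . | transform.Unmarshal }}
      {{/* ...publish the page from $data here, as in the template above... */}}
      {{/* Record this ID as the latest one that was found. */}}
      {{ (resources.FromString "last-id.txt" (string $nextID)).Publish }}
    {{ end }}
  {{ else }}
    {{/* nil result, i.e. a 404: treat it as "no newer entries" and stop. */}}
    {{ warnf "No entry at %s; stopping" $url }}
    {{ break }}
  {{ end }}
  {{ $nextID = add 1 $nextID }}
{{ end }}
```

Using warnf rather than errorf in the nil branch keeps the build from failing on the expected 404 at the end of the sequence.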

Whether we should return an error with 404’s is debatable I guess, but the behavior is consistent with other resource retrieval methods and functions (e.g., resources.Get, etc.).

I’ve updated the documentation to clarify behavior with 404 responses.
https://gohugo.io/functions/resources/getremote/#error-handling