Currently GetRemote returns a limited set of response headers.
Some REST APIs implement pagination via the Link response header field.
Example from the GitHub REST API:
< Link:
<https://api.github.com/repositories/11180687/issues?page=1&per_page=2>; rel="prev",
<https://api.github.com/repositories/11180687/issues?page=3&per_page=2>; rel="next",
<https://api.github.com/repositories/11180687/issues?page=233&per_page=2>; rel="last",
<https://api.github.com/repositories/11180687/issues?page=1&per_page=2>; rel="first"
It would be nice if GetRemote exposed the Link header field when it exists.
This would make collecting paginated content much easier.
What do you think?
Did I miss it somewhere, and is it already possible?
Is it worth an issue?
Workarounds: count the returned items and construct the page links manually, or use
GraphQL, but that needs authentication even for public info.
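For reference, the manual workaround could look roughly like the sketch below. The helper name `pageURL` is made up for illustration, and the `page` and `per_page` parameter names are taken from the GitHub example above; other APIs may use different ones. The caller still has to decide when to stop, for example when a page returns fewer items than `per_page`.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// pageURL returns base with its "page" and "per_page" query parameters set.
// The parameter names follow the GitHub example; they are not universal.
func pageURL(base string, page, perPage int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("page", strconv.Itoa(page))
	q.Set("per_page", strconv.Itoa(perPage))
	u.RawQuery = q.Encode() // Encode sorts the parameters by key
	return u.String(), nil
}

func main() {
	for page := 1; page <= 3; page++ {
		u, err := pageURL("https://api.github.com/repositories/11180687/issues", page, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(u)
	}
}
```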
I don’t think that we have access to the headers on a GetRemote request, and yes, I think it’s worth a feature request in the repo. It’s not nice of the API in question to “hide” this information in the headers, but that is a completely different topic.
But I guess it’s important too for things like rate limiting (which in all my projects is done via headers) and some more exotic features like content type… yeah, let’s have the headers in an array to range through if required.
+1 from me.
bep
May 19, 2024, 5:57pm
3
irkode:
worth an issue?
Yes.
I originally just added what I thought would be “useful” in .Data, but we could certainly expand on that.
This is what we have today:
```go
func responseToData(res *http.Response, readBody bool) map[string]any {
	var body []byte
	if readBody {
		body, _ = io.ReadAll(res.Body)
	}

	m := map[string]any{
		"StatusCode":       res.StatusCode,
		"Status":           res.Status,
		"TransferEncoding": res.TransferEncoding,
		"ContentLength":    res.ContentLength,
		"ContentType":      res.Header.Get("Content-Type"),
	}

	if readBody {
		m["Body"] = string(body)
	}

	return m
}
```
irkode
May 20, 2024, 10:24am
4
Created an issue:
Maybe others want to chime in there if they also need a new field in the response.
opened 10:00PM - 19 May 24 UTC
NeedsTriage
Proposal
# Support Link Response fields in GetRemote Result object
[Discourse 49894](https://discourse.gohugo.io/t/support-for-link-header-field-in-getremote-used-for-api-pagination/49894/3)
## Issue
Different APIs use different response header fields to return additional data about the response.
Paginating REST APIs (GitHub, GitLab, and I guess many others) used in combination with GetRemote are especially affected.
Hugo currently only provides access to the _Content-Type_ field through its Data interface in
the resource object returned from GetRemote.
I would like to see the Link header fields for pagination on the resource:
- start
- first
- next
- prev(ious)
- last
- self
## Link Response Header Field (RFC 5988)
The RFC has a complete grammar in [section 5](https://datatracker.ietf.org/doc/html/rfc5988#section-5).
Just the key points for the parts used in pagination:
```text
Link = "Link" ":" #link-value
link-value = "<" URI-Reference ">" *( ";" link-param )
link-param = ( "rel" "=" relation-types ) <-- there are more keywords than "rel" and more types
relation-types = relation-type
| <"> relation-type *( 1*SP relation-type ) <">
```
Relation types are defined in
[section 6.2.2](https://datatracker.ietf.org/doc/html/rfc5988#section-6.2.2).
Commonly used types are:
- self - the link to the page itself
- help - usually links to the API documentation
- stylesheet - for XML delivery or transformation
- start - slightly different definition than first (I am not a native speaker), and there is no _end_ type ;-)
So a single _rel_ parameter can carry multiple _types_ that all link to the same URL, and there are two different types
for the same thing (prev and previous). Hurray.
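A small sketch of how the value of one _rel_ parameter could be normalized, assuming it has already been extracted from the header. `relTypes` is a hypothetical name, and folding "previous" into "prev" is a convenience assumed here, not something the RFC prescribes.

```go
package main

import (
	"fmt"
	"strings"
)

// relTypes splits the value of a rel parameter into its relation types and
// lower-cases them. Mapping "previous" to "prev" is an assumption made for
// convenience so that callers only need to look up one name.
func relTypes(rel string) []string {
	var out []string
	for _, t := range strings.Fields(strings.Trim(rel, `"`)) {
		t = strings.ToLower(t)
		if t == "previous" {
			t = "prev"
		}
		out = append(out, t)
	}
	return out
}

func main() {
	fmt.Println(relTypes(`"self stylesheet"`)) // [self stylesheet]
	fmt.Println(relTypes(`previous`))          // [prev]
}
```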
### Example from GitHub
> This is a one line string, newlines added for readability
>
> ```text
> Link:
> <https://api.github.com/repositories/11180687/issues?page=1&per_page=2>; rel="prev",
> <https://api.github.com/repositories/11180687/issues?page=3&per_page=2>; rel="next",
> <https://api.github.com/repositories/11180687/issues?page=233&per_page=2>; rel="last",
> <https://api.github.com/repositories/11180687/issues?page=1&per_page=2>; rel="first",
> <https://www.example.com>; rel="self",
> <https://www.example.com>; rel="self stylesheet"
> ```
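To illustrate what parsing such a value involves, here is a minimal sketch that maps each relation type to its URL. `parseLinks` is a hypothetical function, not an existing Hugo or standard library API. It only looks at the `rel` parameter, assumes no other parameter contains a `;` inside a quoted string, and lets a later link overwrite an earlier one with the same type.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLinks maps each relation type found in a Link header value to its
// target URL. Only the rel parameter is interpreted. ok is false when a
// link-value does not start with "<" or has no closing ">".
func parseLinks(header string) (links map[string]string, ok bool) {
	links = make(map[string]string)
	rest := strings.TrimSpace(header)
	for rest != "" {
		if !strings.HasPrefix(rest, "<") {
			return nil, false
		}
		end := strings.Index(rest, ">")
		if end < 0 {
			return nil, false
		}
		target := rest[1:end]

		// The parameters run up to the next comma outside double quotes.
		params := rest[end+1:]
		rest = ""
		inQuotes := false
		for i, r := range params {
			if r == '"' {
				inQuotes = !inQuotes
			} else if r == ',' && !inQuotes {
				rest = strings.TrimSpace(params[i+1:])
				params = params[:i]
				break
			}
		}

		for _, p := range strings.Split(params, ";") {
			key, val, found := strings.Cut(strings.TrimSpace(p), "=")
			if !found || !strings.EqualFold(strings.TrimSpace(key), "rel") {
				continue
			}
			// One rel value may list several space-separated types.
			for _, t := range strings.Fields(strings.Trim(strings.TrimSpace(val), `"`)) {
				links[strings.ToLower(t)] = target
			}
		}
	}
	return links, true
}

func main() {
	header := `<https://api.github.com/repositories/11180687/issues?page=1&per_page=2>; rel="prev", ` +
		`<https://api.github.com/repositories/11180687/issues?page=3&per_page=2>; rel="next"`
	links, ok := parseLinks(header)
	fmt.Println(ok, links["next"])
}
```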
## Implementation thoughts
Implementation effort depends on the degree of Link attribute support:
1. only rel=
1.1 all types
1.2 list of types (minimal: next, prev(ious), first, last, start)
2 complete Link RFC
3 all or some other headers
I would vote for 1.2, or maybe 1.1 so it's fully supported. I guess you already have some Go libraries in
use. The grammar for the Link attribute looks manageable with regex captures.
### Easy access
Access should be dead simple: $resource.Links.First or $resource.Links "first" ... no puzzling around with
walking a structure or using the index function.
### Should not fail
If the link cannot be parsed completely, it should not fail but be stored in a string representation,
maybe Resource.Header.LinkRaw ...
## Fields that may be worth adding
- _Content-Language_ might be interesting, so one can handle processing according to the Hugo site
languages.
## Further considerations
For simplicity, all exposed fields are directly available on the .Data field.
They could be moved to a .Data.Headers field storing the response headers. There is currently just one field for the
response headers, so it could be deprecated now and duplicated in or linked to .Data.Headers, or exposed via an interface or
getter ... I'm not into Go. The new Link field might be widely used because of paginating APIs
used with GetRemote and Content Adapters. IMHO, if there is going to be a change, earlier is better ;-)
I don't know if this will become a conflict later, but it is at least a point to consider.
## Resources
- [MDN Web Docs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link)
- [Wikipedia](https://en.wikipedia.org/wiki/List_of_HTTP_header_fields)
- [RFC 5988](https://datatracker.ietf.org/doc/html/rfc5988)