Use HTTP cache mechanism for efficient cache update

Hello,
First, thanks a lot for your work on Hugo, and thanks to this forum's contributors!

I am using a lot of remote content for a Hugo website and I ran into two cache issues:

  1. It seems that empty responses are cached. A resource was temporarily unavailable at build time and Hugo cached the empty response.
  2. I can set an expiration date for my cache, but files that never change and files that change frequently are then refreshed at the same rate.

For bandwidth efficiency and up-to-date HTTP resources, Hugo could:

  • Make a HEAD request to check whether the online resource is newer than the cached one (and update the cache if needed)
  • Use the cached resource if the online one is down or cannot be fetched

If this check ran only once the current cache expiry has elapsed, it should not break any expected behavior, should it?
Do you think this would be a helpful feature to add? If not, I could try to do it within a function or partial.

Thanks for your time!

  • With empty responses I assume you mean 404 responses, which to me isn’t a temporary state, but I guess it could be.
  • You can create your own key, which allows you to create a more coarse-grained expiration setup (e.g. by using year-day-hour or something similar as part of the key).

But yes, you are right, we could/should improve this, and it would be great if you could create a GitHub issue with the above text.

With empty responses I assume you mean 404 responses, which to me isn’t a temporary state, but I guess it could be.

I am not assuming anything precise. I think that in my example the HTTP server either did not reply at all or returned a 5XX error (hence the vague “empty” phrasing).
I guess a 4XX HTTP error could be considered definitive, but that may drag this cache validator into corner-case handling (which I’m afraid of 😉). “May” because I have no idea how it is implemented…

You can create your own key, which allows you to create a more coarse-grained expiration setup (e.g. by using year-day-hour or something similar as part of the key).

I read various posts on this forum where you explained this mechanism, but it did not fit my needs because I don’t know beforehand which content will be updated frequently.