Currently, Hugo uses the Go-native http.DetectContentType function to determine the media type of a remote resource when using resources.GetRemote. However, sometimes the algorithm is unable to determine the exact content type and returns application/octet-stream instead. This leads to getRemote failing (media.FromContent returns zero in this case):
// from resource_factories/remote.go
...
	// Now resolve the media type primarily using the content.
	mediaType = media.FromContent(c.rs.MediaTypes, extensionHints, body)
}
if mediaType.IsZero() {
	return nil, fmt.Errorf("failed to resolve media type for remote resource %q", uri)
}
...
This is problematic, because it limits the usefulness of getRemote. Could we potentially just use the content type in the header if we can’t determine the media type from the content? E.g., doing this:
if mediaType.IsZero() {
	mediaType, _ = media.FromString(contentType)
}
Google Protobuf should be “application/x-protobuf”. The format is (among other things) used for vector map tiles. Admittedly, it’s a bit obscure but shouldn’t getRemote be format agnostic?
There are three approaches, either separate or in combination, to determine content type:
Sniff the content
Use the response header
Use the file extension (if present in the URL)
Which of these can you trust?
The payload can be altered to pass a sniff test
The response header can be wrong, either intentionally or unintentionally
The file extension can be wrong, either intentionally or unintentionally
The current implementation is, for a Very Good Reason, cautiously restrictive. And there have not been many (any?) reported cases where this behavior has been a show-stopper.
I’m not criticising the current approach, and being cautious is definitely not a bad thing. However, I feel like there should be a way to override this behavior. Maybe just a flag for getRemote to ignore the content would be enough?
I think the conclusion then, which I agree with, is that we could add a mediaType option to resources.GetRemote. Template authors are trusted to do good things, so this should be safe. If someone points me to a GitHub issue, I will get it into the next Hugo release (scheduled for the middle of next week).