My Hugo site has images, but I don’t want to put those images into source control, since they are binary files. Instead I would like to put the images themselves into S3 (or Dropbox or whatever) and version-control a manifest in git: something like a map of each file to its SHA-1.
Then, when I build the site, I can pull down any changes to the files from remote storage as part of my build steps; a sketch of what I mean follows the list below. (The files themselves are git-ignored in this scheme.) In this way, I:
only have to upload the files listed in the manifest
keep them out of my repository while still leaving them visible to Hugo
keep my repository small
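For concreteness, here’s a minimal sketch of the build step I have in mind (Python with boto3; the manifest filename, bucket name, and key scheme are all made up):

```python
# build_assets.py -- hypothetical sketch: fetch git-ignored images from S3
# using a version-controlled manifest of path -> SHA-1 entries.
import hashlib
import json
from pathlib import Path

import boto3  # assumed: AWS credentials come from the environment

MANIFEST = Path("assets.json")  # hypothetical manifest file, committed to git
BUCKET = "my-site-assets"       # hypothetical bucket name


def sha1_of(path: Path) -> str:
    """Return the SHA-1 hex digest of a local file."""
    return hashlib.sha1(path.read_bytes()).hexdigest()


def pull_changed() -> None:
    """Download every manifest entry whose local copy is missing or stale."""
    s3 = boto3.client("s3")
    manifest = json.loads(MANIFEST.read_text())  # {"static/img/a.jpg": "<sha1>", ...}
    for rel_path, expected_sha in manifest.items():
        local = Path(rel_path)
        if local.exists() and sha1_of(local) == expected_sha:
            continue  # already up to date
        local.parent.mkdir(parents=True, exist_ok=True)
        # Objects are stored under their SHA-1, so the manifest alone
        # identifies the right version of every file.
        s3.download_file(BUCKET, expected_sha, str(local))


if __name__ == "__main__":
    pull_changed()
```

Keying objects by content hash means renames cost nothing and stale local copies are detected by re-hashing.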
Is this sort of thing a workflow that others use? Is there something that will do some or all of this for me?
Have you looked into the Cloudinary free tier? It solved a similar problem for me. Whether it works for you will obviously depend on how much traffic you get, how many images are involved (and how big their files are), etc.
I think the manifest part is important: without it, once images change there’s no way to keep the repository’s state synchronized with the object store’s. But I like this approach; maybe it’s a good starting point.
I still think that’s a really good idea, and it could certainly be expanded to support “folders”. From the Hugo side it’s not hard to implement, but I have quietly been hoping that some bigger player would pick up this ball and make it more “standard”.
Yes, I think you’re right – it would be ideal if this were a standard-by-convention (e.g. having an .assets file the way one might have a README, a Makefile, or a .gitignore).
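To illustrate, a purely hypothetical .assets file could be as simple as one line per tracked file, in the same shape as sha1sum output:

```
# .assets -- hypothetical: <sha1>  <path relative to the site root>
3f786850e387550fdab836ed7e6dc881de23001b  static/images/header.jpg
89e6c98d92887913cadf06b2adb97f26cde4849b  static/images/gallery/01.jpg
```

Anything that can hash a file could read or regenerate it, which is what would make a convention like this easy to standardize.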
This is something I’ve really been meaning to look at with Hugo. My primary use case is a photography/documentary-storytelling/articles/photoblog-style feed to replace Instagram, which means a lot of image assets, and that adds up fast. Hosting them elsewhere would ideally be a good way out as things grow; I was kind of waiting for Cloudflare R2 to go into open beta (it did last week).
But the other reason I chose Hugo was its image processing: resizing, metadata support, and WebP output.
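That’s the part I’d hate to lose; in a template it looks roughly like this (assuming a page-bundle image named cover.jpg, which is made up here, and a reasonably recent Hugo for WebP output):

```go-html-template
{{/* Grab an image from the page bundle and emit a resized WebP copy. */}}
{{ with .Resources.GetMatch "cover.jpg" }}
  {{ $small := .Resize "800x webp" }}
  <img src="{{ $small.RelPermalink }}" width="{{ $small.Width }}" height="{{ $small.Height }}" alt="">
{{ end }}
```

Any remote-storage scheme would still have to leave the originals somewhere Hugo can process them.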
It is almost certain that I am missing something, but isn’t this just as easy as adding entries for the images to your .gitignore and then using rsync to sync only the images to your web server? I did that for a non-Hugo website and it worked very well. And since a Hugo site is really just a plain website, it should work just as well.
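Something like this, with made-up paths and host:

```
# .gitignore
static/images/

# deploy step: copy only the images to the server
rsync -av --delete static/images/ user@example.com:/var/www/mysite/images/
```

rsync only transfers files that changed, so repeated syncs stay cheap.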
That sounds a lot like git-lfs-on-S3.
I have some sites on Netlify Large Media, which offers on-the-fly resizing. But being able to use S3 or any other storage backend would be great.
I found this post detailing how a seven-year-old Node.js application still worked two years ago (although the writer says he had to patch the code here and there). I’m tempted to see if I can get it to run on demand on Heroku or on GCP’s free-tier Cloud Run, and have Git LFS track the resources dir.
edit: this should be even easier to get to work. Or this.
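On the repo side, the tracking part would just be standard Git LFS, pointed at whatever server ends up hosting the objects (the URL here is a placeholder):

```
git lfs install
git lfs track "resources/**"
git add .gitattributes
# point LFS at a custom server instead of the git host's default
git config -f .lfsconfig lfs.url https://my-lfs-server.example.com/myorg/myrepo.git
```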
I’ve tried a few of the git-lfs S3 backend implementations, and they’re all a pain to work with. But that might be because Git LFS itself isn’t a great pleasure to work with, I guess.
Also, trying to get a seven-year-old Node app to work when I don’t know much about Node proved to be a bit too much. I ended up with Giftless, written in Python. It’s relatively easy to set up and pretty stable, and it handles sleeping and waking up on Heroku well.
I’ve put an almost completely configured, ready-to-clone fork here. It needs only the bucket name in a config file; the rest is env vars.