Image processing not making use of existing generated resources

I’ve got a fairly large site which uses Hugo’s image processing to generate several different sizes of images.

What I’m noticing is that every time I build my site using hugo --gc --minify, Hugo seems to process more or less the same number of images even though nothing has changed. I believe this is why deployments to Netlify are slow: ~1000 new files are uploaded on every deploy, with no changes to the site’s content.

I’ve attempted to troubleshoot as best I can, starting with the Hugo Image Processing documentation, which states:

Hugo caches processed images in the resources directory. If you include this directory in source control, Hugo will not have to regenerate the images in a CI/CD workflow (e.g., GitHub Pages, GitLab Pages, Netlify, etc.). This results in faster builds.

I committed the full resources directory and it made no difference as Netlify still reports uploading ~1000 new files.

My second attempt was to use the Netlify Plugin - Hugo Cache. Although it caches the resources as shown below, it doesn’t seem to make any difference:

1. netlify-plugin-hugo-cache-resources (onPreBuild event) 
Checking if resources exist at "resources"
Restored cached resources folder. Total files: 2    

...

3. netlify-plugin-hugo-cache-resources (onPostBuild event) 
Saved resources folder to cache. Total files: 7690

To try and isolate if this is a Hugo or Netlify issue, I removed public/ and resources/ and built my site locally three times as shown below. Each time Hugo is processing thousands of images.

Output (no previous public or resources directories):

hugo v0.95.0-9F2E76AF linux/arm64 BuildDate=2022-03-16T14:20:19Z VendorInfo=gohugoio

                   |  EN    
-------------------+--------
  Pages            | 10613  
  Paginator pages  |   246  
  Non-page files   |  1360  
  Static files     |     6  
  Processed images |  8542  
  Aliases          |    28  
  Sitemaps         |     1  
  Cleaned          |     0  

Second run (existing public and resources directories):

hugo v0.95.0-9F2E76AF linux/arm64 BuildDate=2022-03-16T14:20:19Z VendorInfo=gohugoio

                   |  EN    
-------------------+--------
  Pages            | 10613  
  Paginator pages  |   246  
  Non-page files   |  1360  
  Static files     |     6  
  Processed images |  7846  
  Aliases          |    28  
  Sitemaps         |     1  
  Cleaned          |     2  

Third run:

hugo v0.95.0-9F2E76AF linux/arm64 BuildDate=2022-03-16T14:20:19Z VendorInfo=gohugoio

                   |  EN    
-------------------+--------
  Pages            | 10613  
  Paginator pages  |   246  
  Non-page files   |  1360  
  Static files     |     6  
  Processed images |  7781  
  Aliases          |    28  
  Sitemaps         |     1  
  Cleaned          |     2  

To my untrained eyes it appears that Hugo is not making use of the cached processed images in resources/ and that’s the reason for the thousands of new file uploads, but I’m not sure how to troubleshoot further.
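One way to narrow this down is to check which files in public/ actually differ between two consecutive builds, since those are the files a deploy would treat as new. The following is only a rough sketch: the helper names snapshot and changed_files are made up for illustration, and it assumes GNU coreutils' sha256sum is installed.

```shell
#!/bin/sh
# snapshot DIR: print "checksum  path" for every file under DIR, sorted by path.
# Assumes GNU coreutils' sha256sum is available (hypothetical helper).
snapshot() {
  (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# changed_files OLD NEW: print the snapshot lines that differ between two runs.
# Lines starting with "<" come from OLD, lines starting with ">" from NEW.
changed_files() {
  diff "$1" "$2" | grep '^[<>]' || true
}

# Example usage (paths are placeholders):
#   hugo --gc --minify && snapshot public > /tmp/build1.txt
#   hugo --gc --minify && snapshot public > /tmp/build2.txt
#   changed_files /tmp/build1.txt /tmp/build2.txt
```

If the second command prints nothing, the two builds produced byte-identical output, and the extra uploads are more likely a deploy-side issue than a Hugo one.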

Any thoughts from the community are much appreciated :slight_smile:

UPDATE 1: I removed fingerprint throughout my project and also built locally with hugo (no garbage collection or minification) but I still see the same results.

With v0.97.3 I am unable to reproduce the behavior that you saw when building locally.

The test site has 629 images (JPEG, 1920x1080, 720 MB total).
Each image is resized to 200x, converted to Webp, and rendered on the home page.

First run
$ hugo --gc --minify

Start building sites … 
hugo v0.97.3-078053a43d746a26aa3d48cf1ec7122ae78a9bb4+extended linux/amd64 BuildDate=2022-04-18T17:22:19Z VendorInfo=gohugoio

                   | EN   
-------------------+------
  Pages            |   9  
  Paginator pages  |   0  
  Non-page files   |   0  
  Static files     |   0  
  Processed images | 629  
  Aliases          |   0  
  Sitemaps         |   1  
  Cleaned          |   0  

Total in 37175 ms

$ git status

On branch temp
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        resources/

nothing added to commit but untracked files present (use "git add" to track)

$ ls resources/_gen/images/images/ | wc -l

629

$ git add resources/ && git commit -m "Add resources"

Second run
$ hugo --gc --minify

Start building sites … 
hugo v0.97.3-078053a43d746a26aa3d48cf1ec7122ae78a9bb4+extended linux/amd64 BuildDate=2022-04-18T17:22:19Z VendorInfo=gohugoio

                   | EN   
-------------------+------
  Pages            |   9  
  Paginator pages  |   0  
  Non-page files   |   0  
  Static files     |   0  
  Processed images | 629  
  Aliases          |   0  
  Sitemaps         |   1  
  Cleaned          |   0  

Total in 79 ms

$ git status

On branch temp
nothing to commit, working tree clean

It would be helpful if you could share your repository.

The variation in “processed images” count may be related to:
https://github.com/gohugoio/hugo/issues/7498

  • Assuming you have set up your cache to survive the rebuilds, I’m pretty sure this works as designed.
  • What may be a little confusing, though, is that when Hugo says “Processed Images 7781”, that number may mean “we read 7781 images from cache”.
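Since the "Processed images" count does not distinguish cache hits from real work, a rough alternative is to check whether a build wrote any new files into the image cache. This is a sketch only: count_newer is a hypothetical helper, the resources/_gen/images path is taken from the listing earlier in this thread, and it assumes Hugo does not rewrite a cached file when it is reused.

```shell
#!/bin/sh
# count_newer DIR MARKER: number of files under DIR modified after MARKER.
# (Hypothetical helper; relies only on POSIX find, wc and tr.)
count_newer() {
  find "$1" -type f -newer "$2" | wc -l | tr -d ' '
}

# Example usage (marker path is a placeholder):
#   touch /tmp/before-build
#   hugo --gc --minify
#   count_newer resources/_gen/images /tmp/before-build
```

Under that assumption, a result of 0 after the second build would suggest every image came from the cache, while a large number would suggest they were regenerated.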

Thank you @jmooring @bep for your replies!

Now that I know what “processed images” actually means, it makes sense not to use it as a metric for whether caching works :slight_smile:

Is there a way to see, during Hugo’s build, that it is looking for an existing resources directory and reading existing images? I tried hugo --debug --verbose but I don’t see anything related to reading from or writing to resources.

No, but it should be obvious from the build time.
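To make the build-time comparison concrete, something like the following could be used to time a cold and a warm build. It is a minimal sketch: elapsed_seconds is a hypothetical helper, and it only reports whole seconds, which is enough when the difference is as large as the 37175 ms versus 79 ms seen earlier in this thread.

```shell
#!/bin/sh
# elapsed_seconds CMD [ARGS...]: run CMD quietly and print the wall-clock
# time it took, in whole seconds. (Hypothetical helper for illustration.)
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Example usage:
#   rm -rf public resources
#   elapsed_seconds hugo --gc --minify   # cold build, cache is empty
#   elapsed_seconds hugo --gc --minify   # warm build, cache should be reused
```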

I notice you’re using the “netlify build cache plugin”; I must admit I have had no great success getting that to work, so I would recommend a much simpler setup:

You could either set these in your Netlify environment:

export HUGO_CACHES_IMAGES_DIR=':cacheDir/images'
export HUGO_CACHES_IMAGES_MAXAGE='1440h' 

Or something like that. If you have not set :cacheDir to something explicit, it will be adjusted to a path that survives rebuilds on Netlify. You can also configure this in your site settings, see


Can you please clarify whether setting the options above is necessary? I came across one of your older posts that suggests Hugo will do this automatically on Netlify.

Hugo detects if you’re running on Netlify and sets the cache dir (if not set by you) to a Netlify path that survives builds.