I’m using Hugo to generate a blog-type site with many images. Publishing is done through a GitLab pipeline, and the site is hosted on a dedicated instance. I chose this setup to keep control over the build process, and more specifically to properly handle image generation and caching, which can be time-consuming.
My pipeline has a first test-build stage, which has the following script command:
script:
- hugo --cacheDir /home/gitlab-runner/hugocache
and a second stage, which builds again when merging to the main branch and actually deploys by copying the contents of the public directory:
script:
- hugo --gc --minify --cacheDir /home/gitlab-runner/hugocache
- echo "DEPLOYING FROM '$CI_PROJECT_DIR' to /usr/share/nginx/html"
- sudo rm -Rf /usr/share/nginx/html/*
- sudo mv $CI_PROJECT_DIR/public/* /usr/share/nginx/html/
- sudo chown -R nginx:nginx /usr/share/nginx/html/*
As you can see, I’m using the --cacheDir option, and I was hoping it would reduce the time required to build the site. I’ve also pointed it at a directory outside the main build directory because resources and public are untracked by git, so they are removed each time the pipeline runs.
Here is the output of the last job, for metrics:
hugo v0.143.1-0270364a347b2ece97e0321782b21904db515ecc+extended linux/amd64 BuildDate=2025-02-04T08:57:38Z VendorInfo=snap:0.143.1
                   | EN
-------------------+-------
  Pages            | 194
  Paginator pages  | 50
  Non-page files   | 956
  Static files     | 22
  Processed images | 3366
  Aliases          | 39
  Cleaned          | 0
Total in 435625 ms
My execution time seems a bit long, and I was wondering whether I’m doing something wrong. Maybe I have misunderstood a feature and it doesn’t work the way I think it does. Can anybody point out something that is not properly configured?
Thanks for your quick reply.
No, I didn’t; I assumed that would not be necessary when using the --cacheDir option.
Just to fully understand: if I do so, will the cacheDir in the config be replaced with the value given on the command line? And would the maxAge be preserved?
Because if it falls back to the default, it would be under the same root directory as the repo and then removed when the pipeline starts.
The cacheDir config setting is the path to the cache directory. The :cacheDir token that can be used in the caches portion of the site config is expanded to the value of cacheDir.
Just try it, and temporarily include some debug command (ls, etc.) to verify the processed images are cached in the correct location.
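As a rough sketch of the kind of configuration being discussed, assuming the site config is a YAML file (hugo.yaml) and assuming the --cacheDir flag sets the cacheDir value that the token expands to, the image cache could be pointed at the cache directory like this. The images sub-path is only an illustrative choice.

```yaml
# hugo.yaml (assumed file name; adapt if the site uses TOML or JSON)
caches:
  images:
    # :cacheDir expands to the value of the cacheDir setting
    dir: ":cacheDir/images"
    # -1 means cached entries never expire
    maxAge: -1
```

With the pipeline above, the token would then be expected to resolve under /home/gitlab-runner/hugocache, which is what the debug listing should confirm.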
To be sure, I altered the test job to add debug information:
script:
- hugo --cacheDir /home/gitlab-runner/hugocache
- echo "Debugging commands follows"
- ls -la $CI_PROJECT_DIR # Debug
- ls -la /home/gitlab-runner/hugocache
The first run failed at the end because I initially had ls -la $CI_PROJECT_DIR/resources/ in the script. That failure is fine, since it means there is no resources directory, as expected.
Its execution time was Total in 432963 ms
On the next run, after fixing the test script, the execution time shrank to Total in 22693 ms, and I can see the cache is actually correct:
$ ls -la /home/gitlab-runner/hugocache
total 0
drwxr-xr-x. 5 gitlab-runner gitlab-runner 55 Feb 7 14:15 .
drwx------. 6 gitlab-runner gitlab-runner 63 Feb 5 14:25 ..
drwxr-xr-x. 3 gitlab-runner gitlab-runner 23 Feb 7 14:15 _images
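To go one step beyond listing the directory, a couple of extra debug lines could report how much is actually cached. This is only a sketch for the same test job: du and find are standard tools, and the path is the one already passed to --cacheDir.

```yaml
script:
  - hugo --cacheDir /home/gitlab-runner/hugocache
  # Debug: total size of the cache directory
  - du -sh /home/gitlab-runner/hugocache
  # Debug: number of files currently stored in the cache
  - find /home/gitlab-runner/hugocache -type f | wc -l
```

Comparing these two numbers across consecutive runs gives a quick indication of whether the cache is being reused or rebuilt.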
Also, about the GitLab CI cache: stages are actually isolated, and data is passed between them through artifacts. These two points led me to use a dedicated instance, to be sure the job output is preserved between jobs and not transmitted through artifacts, as my site is growing and exceeding the artifact size allowed on the SaaS offering. Even if I managed to build only the increments, at some point I would no longer be able to rebuild the site from scratch with the same pipeline if something bad happened, hence this solution.