I have thought long and hard about how to organize my content. The prominent statement in the Hugo docs about content organization caught my eye:
Hugo believes that you organize your content with a purpose.
For a static site generator this is crucial and has set Hugo apart from most other SSGs. Over two years ago I wrote a blog post about content management and how great Hugo already was back then.
Since then I have created some larger multi-editor websites with Hugo, and I have ended up with a website and content structure that I could explain in 10 minutes, even to programming newbies!
Try to do this with a WordPress file structure and its database content…
[quote="bep, post:12, topic:6338"]
that is the downside of becoming popular - the site variations in the wild become very big[/quote]
As some recent discussion has made clear, people organize their content in Hugo in various ways, and it's not always obvious to those maintaining and advancing the Hugo code that it is used in such ways.
In Search of a Best Practice to Organize Content
Let's make visible how and why we organize our content the way we do. Such discussions might be relevant for decisions about the future direction of things like …
I will share some structures I have used and I would love to learn from you the good and bad you see in it. Please also share your Hugo structure if you found something that is clear and could work for others.
Classic structure with assets in the static folder.
Pros
Clear separation of content files that are processed and assets that are static.
Flat content tree.
Cons
Managing assets in the static folder can be a pain: If we put all images in
one folder it is difficult to tell which images are still in use and which images
could be deleted. We could replicate the entire structure of the content for
images but this is also a lot of work to maintain.
References to images are long and may lead to mistakes
(for example /images/blog/firstpost/post-image-1.jpg).
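To make the "which images are still in use" problem more concrete, here is a minimal Python sketch of the kind of helper script one could write for it. The function name `find_unused_images`, the list of image suffixes, and the plain substring matching are my own assumptions, not anything Hugo provides. It only looks at markdown files, so images referenced from templates or CSS would be reported as unused.

```python
from pathlib import Path

# Assumption: these are the image types kept in the static folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp"}


def find_unused_images(content_dir, static_dir):
    """Return site-root URLs of static images that no markdown file mentions."""
    content_root = Path(content_dir)
    static_root = Path(static_dir)

    # Read all markdown sources once; fine for small and medium sites.
    markdown_text = "\n".join(
        path.read_text(encoding="utf-8")
        for path in sorted(content_root.rglob("*.md"))
    )

    unused = []
    for image in sorted(static_root.rglob("*")):
        if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
            # static/images/foo.jpg is served as /images/foo.jpg
            url = "/" + image.relative_to(static_root).as_posix()
            if url not in markdown_text:
                unused.append(url)
    return unused
```

Calling `find_unused_images("content", "static")` from the site root would list candidates for deletion. I would still check them by hand before removing anything.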
This folder is ignored by Hugo. It is only used for preprocessors to generate files
into the static folder.
An alternative to this can be found in the Victor Hugo boilerplate by Netlify.
It puts the Hugo folder in a site subfolder and uses Gulp + Webpack as an
asset pipeline.
All assets are in the static folder. Here I replicate the content folder structure
for images. We could also create folders by date (like WordPress does) or put all
images in one folder.
Content files and corresponding assets are in one folder. This is very clean
from a content perspective since I consider images of a blog post also as
content (not processed, but still content). For example, we can create a zip
file from a blog post folder with markdown and images and we have everything
that belongs to that blog post.
Images in the content folder are also much easier to maintain.
For example, if we delete a blog post, we can safely delete the blog post's
images as they live in the same folder.
References to images are short (for example post-image-1.jpg). This also
means that GitHub finds and displays the images when the markdown file is opened.
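As a small illustration of the "zip a whole blog post" point, here is a Python sketch under the assumption that each post lives in its own folder. The helper name `archive_post` and the output location are hypothetical. It uses `shutil.make_archive` from the standard library.

```python
import shutil
from pathlib import Path


def archive_post(post_dir, out_dir):
    """Zip one post folder (markdown plus images) and return the archive path."""
    post = Path(post_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Write the archive outside the post folder so it does not include itself.
    archive = shutil.make_archive(str(out / post.name), "zip", root_dir=str(post))
    return Path(archive)
```

For example, `archive_post("content/blog/firstpost", "backups")` would produce `backups/firstpost.zip` containing everything that belongs to that post.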
Cons
Content contains a mix of files to be processed (markdown) and static files
(images). If we did some image optimization/resizing/cropping, we could argue
that images are processed as well and thus belong to the content folder.
A lot of folders to manage in the content directory (comment by @alexandros).
The second structure with corresponding images under each post is a bit of a pain in my eyes. I can understand the logic behind it. But still it's too many folders to manage. IMO.
Personally I prefer to keep things simple and have everything in /static/images/ and create subfolders in there if I feel like it.
Thanks a lot for sharing @marcojakob! Structure B all the way. I experimented with both, and clearly having everything in the static folder is a pain unless you write some tools to manage your pictures. As you add sections, posts and so on to your website, which is what's happening to me, having all your pictures in the static folder doesn't scale well. To manage the complexity, my img folder just replicated the content folder's structure at first.
It's very handy to work in a single folder for each post. Last but not least, if having all the pictures mixed up with the index.md file is an issue for you, just put them all in an img subfolder.
I must say, I'm not a fan of the flexibility with content organization. When I got started with Hugo, as a beginner, I just messed up my website. Not that flexibility itself is bad, but I think strong guidelines would be helpful for newcomers. I'd recommend your structure B to any beginner!
With a little customization one can generate any kind of shortcode, or even skip the Photoswipe plugin (if it's not needed).
I am currently managing a couple of image heavy sites with Hugo and for me, like I stated above, it works best to have everything organized with subfolders under /static/images/
I guess my site would fall into Structure C: most binary assets are hosted on S3. This means that I have to use external scripts to calculate image sizes and generate gallery code, but I was doing that before I migrated to Hugo, so it's not a big deal for me.
The primary benefit of not storing binary assets inside your Hugo site is that it significantly reduces build time. For instance, just now I moved about 256 MB of TIFFs, PDFs, and JPGs out of /static, eliminating copies that were responsible for about a third of the total runtime of hugo. Now it only takes 13 seconds on my laptop, or 30 on a virtual.
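For anyone wondering what would be worth moving out of /static, here is a minimal Python sketch that totals file sizes per extension. The function name `size_by_suffix` is hypothetical, and this is not the script used for the numbers above, just one way to get a quick overview.

```python
from collections import defaultdict
from pathlib import Path


def size_by_suffix(static_dir):
    """Return total bytes per lowercased file suffix, largest total first."""
    totals = defaultdict(int)
    for path in Path(static_dir).rglob("*"):
        if path.is_file():
            totals[path.suffix.lower()] += path.stat().st_size

    # Dicts keep insertion order, so the biggest suffix comes first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

Running `size_by_suffix("static")` shows at a glance whether a few file types (TIFFs, PDFs, JPGs, and so on) account for most of what Hugo has to copy on every build.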
@jgreely Yes, externally hosting binary assets is a valid case that I have also been thinking about using for some of my larger projects. Like you said, let's call this Structure C:
Structure C
Keeping most binary assets in external storage (like S3, Cloudinary, etc.) instead of keeping them inside the Hugo folder structure.
Pros
Keep Git repo small: Most Hugo projects are stored in a Git repo. Even though Git is not ideal for storing binary assets, this is no problem for smaller projects. But if a project gets larger (let's say over ~600MB), keeping everything in a Git repo might slow everything down.
Very fast build process (especially helpful for things like continuous deployment - Netlify and the like).
Asset pipeline (like resizing images) can be handled by scripts or completely outsourced to platforms like Cloudinary or imgix.
Cons
Setup is complicated. Instead of one place for the entire site, content is spread out over two platforms.
Managing assets in an external storage can be a pain (same as with the static folder in Structure A): If we put all images in one storage, it is difficult to tell which images are still in use and which images could be deleted. We could replicate the entire structure of the content for images, but this is also a lot of work to maintain.
References to images are long and may lead to mistakes
(for example https://my-external-storage.on-amazon-s3.com/my-bucket/12341232344234.jpg).
Very interesting discussion. I like option B, but it seems that this prevents using archetypes. If this is true, it should be added to the "minus" section.