Roadmap to Hugo v1.0

The ability to auto-publish future posts on the given date and time without using complicated third-party tools or scripts would be awesome.

Scheduling posts in a simple way is essential for most people including me.

Also an import script from Tumblr to Hugo would be really cool.

You could achieve this with the publishdate variable in the frontmatter. According to the docs:

publishdate If in the future, content will not be rendered unless hugo is called with --buildFuture

Combining this with deployment tools like Wercker would give you exactly what you want. The remaining question is, how do you define complicated third-party tools? Would Wercker be one of them?


Also an import script from Tumblr to Hugo would be really cool.

I agree that it would be awesome to offer more support for migration tools. They’re already on the wishlist. But until now I haven’t found a third-party tool/script on GitHub.

Preferably, the people who migrate(d) from other platforms should write these migration functions, since most of them are more familiar with the structure of those platforms. A good example would be the migration command for Jekyll that was added with the latest release.

I’ve updated the summary in the post according to the latest posts.

After defining which bullet points should be considered milestones and (roughly) what they should look like, we should split this thread into separate ones, each specific to one milestone. Otherwise it could become hard to follow a single discussion in the near future.

I am not particularly confident about using Wercker or Caddy Server for that matter.

I would prefer a solution like having Hugo running locally as a process that can --watch a directory and check post dates so that it can auto-publish at the right time. Spf13 mentioned something about partial/incremental building earlier.

As for Tumblr there is an API for retrieving stuff from their server https://www.tumblr.com/docs/en/api/v2
(I am unable to write a script since I am not a programmer)

With auto-publishing, do you mean the deployment of the generated content to a server?

Partial or incremental builds only re-render pages whose content changed, and those that are associated with them. At the moment, Hugo generates the whole site even if you changed only a single line in a post. For optimization purposes we would only need to regenerate the page of the post and associated pages like the homepage that shows the latest posts. But you would still need a trigger for generating the site.
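As a rough illustration of the idea (not of how Hugo works internally), the set of pages to regenerate can be computed by walking a "who depends on whom" map starting from the changed files. The map contents and the name `pagesToRebuild` are assumptions for this sketch. Building that dependency map correctly is the hard part and is not shown.

```go
package main

import (
	"fmt"
	"sort"
)

// pagesToRebuild returns the changed pages plus every page that directly or
// transitively depends on them. dependents maps a page to the pages that
// list or embed it (for example a post to the homepage).
func pagesToRebuild(changed []string, dependents map[string][]string) []string {
	seen := map[string]bool{}
	queue := append([]string(nil), changed...)
	for len(queue) > 0 {
		p := queue[0]
		queue = queue[1:]
		if seen[p] {
			continue
		}
		seen[p] = true
		queue = append(queue, dependents[p]...)
	}

	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	dependents := map[string][]string{
		"post/a.md": {"index.html", "tags/go/index.html"},
		"post/b.md": {"index.html"},
	}
	// Only post/a.md was edited, so post/b.md is left alone.
	fmt.Println(pagesToRebuild([]string{"post/a.md"}, dependents))
}
```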

Incremental builds are an incredibly hard problem to tackle. It will happen, eventually, but …

“Incremental publishing” on the other hand is easier, even when the timestamps change on every build. Just keep track of the MD5 hashes (or similar) of all the generated files.
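For those wondering what tracking hashes would look like, here is a minimal sketch in Go. It hashes every generated file, compares the result with the manifest from the previous build, and reports only the paths that are new or different. The files are held in memory to keep the example self-contained, where a real tool would read them from the publish directory. The function names are made up for illustration, and deleted files are not handled.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashFiles maps each output path to the MD5 hex digest of its content.
func hashFiles(files map[string][]byte) map[string]string {
	manifest := make(map[string]string, len(files))
	for path, content := range files {
		sum := md5.Sum(content)
		manifest[path] = hex.EncodeToString(sum[:])
	}
	return manifest
}

// changedPaths lists the paths in cur that are new or whose digest differs
// from the one recorded in prev. File timestamps play no role.
func changedPaths(prev, cur map[string]string) []string {
	var out []string
	for path, digest := range cur {
		if prev[path] != digest {
			out = append(out, path)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	previous := hashFiles(map[string][]byte{
		"index.html":        []byte("<h1>Home</h1>"),
		"post/a/index.html": []byte("<p>old text</p>"),
	})
	current := hashFiles(map[string][]byte{
		"index.html":        []byte("<h1>Home</h1>"),
		"post/a/index.html": []byte("<p>new text</p>"),
		"post/b/index.html": []byte("<p>brand new</p>"),
	})
	fmt.Println("upload only:", changedPaths(previous, current))
}
```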

@digitalcraftsman
Yes that is what I mean by partial or incremental build. But @bep has grasped what I am talking about. Even though I’ve no clue whatsoever about tracking MD5 hashes. LOL!

Thanks for your answers.

Theme Standardization

Something that hasn’t been mentioned that I think is incredibly important is metadata standardization. One of the reasons that WordPress is so popular is that you can easily swap between themes and have the bulk of your data / settings come with you. While Hugo supports that at the most basic level, there isn’t a Hugo standard for metadata regarding authors, social media profiles, images, videos, etc. Before we lock down an API for 1.0, let’s get this set.

While Hugo is still pre-1.0, there is more flexibility than there ever will be again. While we definitely don’t want to break BC entirely, what about redesigning the layout and providing an upgrade function that auto-reorganizes content into proper directories?

Having a standard for metadata regarding authors, social media profiles, images, videos, etc. would be great, since this is common data used on nearly every website.

But how far do you want to standardize the theme structure? Many of my themes have their own default config file. Others require the user, as part of the setup, to copy some files into the data folder in the root directory. Those files are usually needed to support i18n or to add metadata for projects in portfolios.

what about redesigning the layout and providing an upgrade function that auto-reorganizes content into proper directories?

In this case auto-reorganization can become complicated (depending on the flexibility of the tool).

There’s a discussion dedicated to that over here, alongside the already mentioned GH issue. One of them should be closed and direct readers to the other so that discussion can continue in one place. There’s already been some overlap.

I’ve been stewing on this for quite a bit, but here is the (unsolicited) opinion of a content manager. Keep in mind this is mostly just spitballing. I also think Hugo is the best SSG out there and the only one that seems viable for publishing sites. I should distinguish that when I say “publishing,” I mean “more than just blogs, documentation, and portfolio sites,” which is the pigeonhole most SSGs seem to fall into. I also agree with @DerekPerkins: in a semver world, pre-1.0 means breaking changes should (theoretically) be expected with the upcoming major release. Anyways, here are some thoughts:

  • Other SSGs
  • What is Hugo’s Focus?
  • Source Organization and Content Sections
  • What Might Hugo Pull from Other SSGs for V1?

Other SSGs

I’ve looked heavily into Hexo, Jekyll, Middleman, and Webhook. Jekyll and Hexo are great, but there are reasons you really only see them being used for portfolios, blogs (mostly for devs if we are being honest), small marketing sites, and documentation. Webhook is the best SSG I’ve used (here and here are some cool WH examples), but for less obvious reasons than the shiny UI and multi-level permissions:

  1. It forces you to follow best practice by content modeling (aka content typing, aka archetyping) first.
  2. It uses Swig, a powerful Django-based templating language (Jinja, Liquid [Jekyll], Twig [Craft CMS (a digital content manager’s wet dream), Drupal 8], Nunjucks [Hexo]) with easy methods for content relationships, template inheritance (blocks), etc.

But Webhook also has shortcomings (slow builds on larger sites, complex roll-your-own infrastructure, monthly costs) that are already obviated by Hugo.

What is Hugo’s Focus?

My first impression of Hugo was that its strength was its focus on content. Page content will always be updated more frequently than layout or user interactions. Also, content should be done first, with presentation and interaction coming later in any professional site. That said, I’d be interested in hearing what the owners of the project think about such a single-minded philosophy. @bep @spf13

  • Is the content aspect of Hugo established enough to start incorporating presentation or behavior (css, js, preprocessors)? Eg, this pull request.
  • Are such features necessary (ie, will Hugo be easier or faster than setting up a pipeline with Grunt, Gulp, or Webpack)? Do these facets of the build need to take place every time content is updated?
  • Are variables and theming in CSS better handled via NodeSASS for statically generated websites (ie, content websites like those built with Hugo)?

@spf13 Treating each page as a folder with multiple files (ie, thereby making yaml/toml act more like an XML sidecar than EXIF in a JPEG) is smart. I also like @bep’s previous comments about “a page being just a page,” but I think these propositions betray some contentious (NB, not necessarily incorrect) philosophies about managing content at scale:

  1. content != data
  2. single-page != multi-content (This is easy to negate by just looking at the average WordPress site)

To expand on #1, here’s an empirical (ie, not scientific) review of different content habits:

<!---average writer using Markdown-->
# My First Article
Jon Doe
February 16, 2016

## My Heading

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
<!--blogger,devs creating documentation,technical writer-->
---
title: My First Article
date: 2016-02-16
description: This is a description of my first article.
author: Jon Doe
tags: [first,article]
categories: [publishing]
---
## My Heading

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
<!--content manager, web publisher, web editor-->
<!--content types, aka archetypes, aka content modeling; sorta as if archetypes are classes, and content are objects; or maybe thinking of it as archetypes as SQL tables and templating as joins? (Keep in mind that I am NOT a developer:) -->
---
title: My First Article
date: 2016-02-16
description: This is the description for My First Article
description_long: This is a longer description of My First Article that I can use in summary views or in list pages in different areas of the site where I don't want to be limited to a less-than-170-characters-for-seo sensibility.
authors: [Jon Doe]
categories: 
- publishing
    - web
    - content-first
tags: [markdown, content management]
audience: 
- publishers
    - web
- writers
    - technical
content: "## My Heading

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
---
<!--And then a sample from a controlled vocabulary at authors/jon-doe.md-->
---
name_first: Jon
name_last: Doe
name_middle: Joe
credentials: [BA, MA, PhD]
bio_short: This is a short bio of me, Jon Doe.
bio_long: Jon Doe was born in Anytown, USA. Blah blah blah blah blah blah blah blah.
image: ryan-watters.jpg
---
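The "archetypes are classes, content files are objects" analogy from the samples above can be sketched in a few lines of Go. The `Archetype` type and its `Missing` method are hypothetical and only show the idea: a content type declares which fields it expects, and each content file's front matter is checked against it. The front matter is shown as an already-parsed map.

```go
package main

import (
	"fmt"
	"sort"
)

// Archetype is a hypothetical content type definition: a name plus the
// front matter fields every content file of that type must provide.
type Archetype struct {
	Name     string
	Required []string
}

// Missing returns the required fields absent from the given front matter.
func (a Archetype) Missing(frontMatter map[string]interface{}) []string {
	var missing []string
	for _, field := range a.Required {
		if _, ok := frontMatter[field]; !ok {
			missing = append(missing, field)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	author := Archetype{
		Name:     "author",
		Required: []string{"name_first", "name_last", "bio_short"},
	}
	// Front matter as it might look after parsing authors/jon-doe.md.
	jonDoe := map[string]interface{}{
		"name_first": "Jon",
		"name_last":  "Doe",
	}
	fmt.Println("missing fields:", author.Missing(jonDoe))
}
```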

Source Organization and Content Sections

So, for example, a site might have the following structure/hierarchy when built:

index
	+-- cats
	|		...
	|
	+-- dogs
	|		...
	|		
	+-- birds
	|		...
	|
	+-- locations

But use the following source organization when organized by content type and built with the ideal - let me qualify that this is just my ideal - SSG:

+--animals
|
+--stores
|
+--articles
|
+--authors
|
+--reviews

The templating in these cases is the sorta “join” between each self-contained piece of structured data/content. Jekyll’s approach to collections could provide insight into how Hugo might render individual content types, although it’s far from perfect and still documented as an experimental feature. That said, the current folder-as-global-navigation-item-as-site-section approach is, by default, simpler than the above example, but it can also be more limiting (cf. my introductory comment re: blog generators vs publishing solutions).

What Might Hugo Pull from Other SSGs for V1?

  1. This is brazen, but Go templates deviate considerably from their more popular Django-and-mustache-based counterparts. I really think the simplicity of Liquid, and just sheer age and user base (obviously), is the only reason people use Jekyll more than Hugo. More robust templating could include a wider selection of:
    • Filters (eg, | jsonify)
    • Functions (eg, {{capture "capture_var"}})
    • Inheritance (eg, {{ block "content" }} {{ block extends "content" }})
    • Easier management of relationships between content types. This is a tough one, but maybe something to the effect of a {{ archetype.relationship}} method.
  2. The ability to read to and write from all types of structured content (xml,json,toml,yaml,csv). Much of this is already supported in Hugo, which is awesome!
  3. Content management and utilities (eg, search, indexing, linkcheckers, etc), again, a lot of which is already being rolled out.
  4. Better documentation. This might mean a complete reorganization of the documentation structure and will come as a result of narrowing the focus - a la UNIX philosophy - of the project, IMHO. Since this is what I do for a living, it’s also the one improvement I’m happy to help out with.
  5. Rather than embedding shortcodes or partials into Hugo itself, it might be better for a library of the aforementioned to be decoupled and maintained by the community. Basically a repo of common use cases, which is sometimes the nature of questions on this forum (eg, “How do I pull up the most recently published article’s title based on the first tag used for such and such page?”). Otherwise, functions/partials/shortcodes are built out based on more narrow content models (eg, Disqus, which is ultimately for blogging, or social-media content types, which might not fit many publishing goals), and then you find yourself with a blogging engine rather than a publishing solution.

I’d love to hear feedback from the owners of this project. Thanks much to all. Hugo is freakin’ awesome.

Great ideas. Quick response to the bottom 5 suggestions:

  1. We’ve extended the Go templates quite a bit. We can add more filters and functions as people suggest them. Inheritance would be good and @bep has done some great work on that already. I don’t know what you mean by relationships between types.
  2. Personally I kinda hate XML and CSV but wouldn’t have any problem with others adding support. We already support the others.
  3. This would be great.
  4. Absolutely. More and more I’m of the mindset that we need to dramatically improve the docs and a complete reorg is needed. We would love help there. If you want to take the reins and drive it that would be great.
  5. Yes, we need to do this. I’ve always thought of both of those being community driven things. We need to figure out an easy way to have a repository, manage it, and install things from it.

I have been fiddling with Hugo on and off since October, with the hope of migrating my WordPress site over. The biggest blocker is handling photo galleries.

My preferred workflow would be to export photos from Lightroom into a sub-subdirectory of /content, e.g. a subdirectory galleryname of /content/photo/. Hugo would then use the EXIF or XMP metadata in the JPEGs as the front matter, and use standard Hugo templates to build galleries, photo details pages and so on.

I started tinkering with the Go code, with little to show thus far because I am still getting my head round the data model and processing flow inside Hugo. My approach is to create a basicFileHandler for JPEGs, but a basicPageHandler might be a better hook for implementation. Generating thumbnails or intermediate resolution files could be done by template functions called by the templates that need them, e.g. the section list for /content/photo/galleryname which would be the gallery page. The Go template system as implemented in Hugo should be flexible enough to handle most photo gallery use cases, and taxonomies are incredibly powerful (I can imagine extracting the XMP People metadata from LR face recognition to automatically build an index of photos for each person, or GPS and location metadata to build one gallery page per city, and so on).

Photos really should be managed in Lightroom or equivalent; uploading to a CMS and clunkily copying captions or other metadata is incredibly backwards. Hugo would take a major step ahead if it handled photos, videos, music and other types as first-class content types just like Markdown.

Oh, and of course, we would only want the enhanced JPEG handling for files under specific directories, but not for those under /static or /themes
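To show what I mean by taxonomies built from photo metadata, here is a small Go sketch. It assumes the EXIF/XMP values have already been extracted into a struct, since the extraction itself would need a third-party library and is not shown. `Photo`, `inGallery` and `buildIndex` are names invented for this example, and the `content/photo/` prefix is simply the directory from my workflow above.

```go
package main

import (
	"fmt"
	"strings"
)

// Photo holds metadata assumed to be already extracted from a JPEG.
type Photo struct {
	Path   string
	City   string
	People []string
}

// inGallery reports whether a file should get the enhanced JPEG handling:
// only JPEGs under content/photo/, never those under static/ or themes/.
func inGallery(path string) bool {
	lower := strings.ToLower(path)
	return strings.HasPrefix(path, "content/photo/") &&
		(strings.HasSuffix(lower, ".jpg") || strings.HasSuffix(lower, ".jpeg"))
}

// buildIndex groups gallery photo paths by taxonomy term. The terms function
// picks which metadata acts as the taxonomy (city, people, and so on).
func buildIndex(photos []Photo, terms func(Photo) []string) map[string][]string {
	index := map[string][]string{}
	for _, p := range photos {
		if !inGallery(p.Path) {
			continue
		}
		for _, t := range terms(p) {
			index[t] = append(index[t], p.Path)
		}
	}
	return index
}

func main() {
	photos := []Photo{
		{Path: "content/photo/trip/001.jpg", City: "Chicago", People: []string{"Jon"}},
		{Path: "content/photo/trip/002.jpg", City: "Chicago", People: []string{"Jon", "Ann"}},
		{Path: "static/img/logo.jpg", City: "Chicago"},
	}
	byCity := buildIndex(photos, func(p Photo) []string { return []string{p.City} })
	byPerson := buildIndex(photos, func(p Photo) []string { return p.People })
	fmt.Println(len(byCity["Chicago"]), "photos for the Chicago gallery page")
	fmt.Println(len(byPerson["Ann"]), "photo(s) in the index for Ann")
}
```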

Also I advise that shortcodes should be removed from the themes area. They are more attached to the site than the theme. If you use shortcodes attached to a theme for any amount of time, you’re now locked into that theme. One could just copy the shortcodes over assuming there isn’t a lot of css and variable modification needed.

@spf13 I am happy to take the lead on reworking the documentation. However, taking a standard content strategy approach would mean revising the content first according to the focus of Hugo for V1.0. Restructuring site architecture would come second and would reflect the major components/sections of, eg, the data model. Some of the changes I would make to the docs are pretty radical, so it’s better to be strategic about it.

As for relationships between content types, some of this can already be accomplished by Hugo templating. I’m probably talking more about syntactic sugar. Here is an example of relationships in the Webhook SSG. Relationships between content types are more complex than “taxonomies” (ie, in the way the word taxonomies is used in Hugo).

Taxonomies usually create list pages of content files based on matches in front matter. Relationships are more complex.

This is probably best illustrated with some samples. The syntax doesn’t currently exist in Hugo, but it should illustrate the concept.

<!--articles/my-article.md-->
---
title: This is my Article
date: 2016-02-19
description: This is a description of my article.
categories: [lettuce]
authors: [Jon Doe]
---

### Article Heading

Lorem ipsum dolor sit amet, consectetur adipisicing elit. Repellat distinctio dignissimos, libero alias sint recusandae. Obcaecati earum, vitae illum quo nulla, numquam, voluptatem dignissimos molestias facilis illo consectetur est rerum.

<!--categories/lettuce.md-->
---
title: Lettuce
description: Lettuce is a pretty boring vegetable. You must be really bored reading about this.
---
<!--authors/jon-doe.md-->
---
name_first: Jon
name_last: Doe
bio: Jon Doe was born in 1980 in Chicago, IL. 
bio_long: Jon doe was blah blah blah...
email: jondoe@example.com
---
<!--layouts/article.html-->
<h1>{{.Title}}</h1>
<p class="date">{{.Date}}</p>
{{.Content}}
<div class="article-authors">
{{ range relationship.authors }}
<h3>{{relationship.name_first}} {{relationship.name_last}}</h3>
<p class="author-bio">{{relationship.bio}}</p>
<div class="contact">Email me: {{relationship.email}}</div>
{{ end }}
</div>
{{range relationship.categories }}
<div class="category-info">
<h3>This article was published in <strong>{{relationship.title}}</strong></h3>
<p>{{relationship.description}}</p>
</div>
{{end}}

Final Output:

<!--mysite/articles/my-article/-->

<h1>This is my Article</h1>
<p class="date">2016-02-19</p>
<h3>Article Heading</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit. Repellat distinctio dignissimos, libero alias sint recusandae. Obcaecati earum, vitae illum quo nulla, numquam, voluptatem dignissimos molestias facilis illo consectetur est rerum.</p>
<div class="article-authors">
  <h3>Jon Doe</h3>
  <p class="author-bio">Jon Doe was born in 1980 in Chicago, IL.</p>
  <div class="contact">Email me: {{relationship.email}}</div>
</div>
<div class="category-info">
  <h3>This article was published in <strong>Lettuce</strong></h3>
  <p>Lettuce is a pretty boring vegetable. You must be really bored reading about this.</p>
</div>

Does that make sense?
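If it helps, the lookup I have in mind can be approximated in plain Go. The sketch below resolves the names listed in an article's front matter against separate author and category records and then renders them with the standard html/template package. All types and the `resolve` and `render` functions are hypothetical. This is not Hugo's data model, and the `relationship` syntax from my samples does not exist.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// Hypothetical records, one per content file in authors/ and categories/.
type Author struct{ FirstName, LastName, Bio, Email string }
type Category struct{ Title, Description string }

// Article only stores keys that point at other content files.
type Article struct {
	Title      string
	Authors    []string
	Categories []string
}

// View is the article with its relationships resolved into full records.
type View struct {
	Title      string
	Authors    []Author
	Categories []Category
}

// resolve replaces each key in the article with the record it refers to.
// Keys without a matching record are skipped.
func resolve(a Article, authors map[string]Author, cats map[string]Category) View {
	v := View{Title: a.Title}
	for _, key := range a.Authors {
		if rec, ok := authors[key]; ok {
			v.Authors = append(v.Authors, rec)
		}
	}
	for _, key := range a.Categories {
		if rec, ok := cats[key]; ok {
			v.Categories = append(v.Categories, rec)
		}
	}
	return v
}

const page = `<h1>{{.Title}}</h1>
{{range .Authors}}<h3>{{.FirstName}} {{.LastName}}</h3>
{{end}}{{range .Categories}}<p>Published in {{.Title}}</p>
{{end}}`

// render executes the page template against a resolved view.
func render(v View) (string, error) {
	t, err := template.New("article").Parse(page)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, v); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	authors := map[string]Author{"Jon Doe": {FirstName: "Jon", LastName: "Doe", Email: "jondoe@example.com"}}
	cats := map[string]Category{"lettuce": {Title: "Lettuce"}}
	article := Article{Title: "This is my Article", Authors: []string{"Jon Doe"}, Categories: []string{"lettuce"}}

	out, err := render(resolve(article, authors, cats))
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

The point is the `resolve` step: a taxonomy gives you a list of pages that share a term, whereas a relationship pulls the related record's own fields into the page being rendered.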