Work-around for making pages from data?

After looking at the archetypes docs… I think I know what you mean, but I’m not sure exactly how that works: when creating the content file from the archetype, it wouldn’t know what values to search the data file with… or does it? And are you saying that’s a way to pull data into the content file instead of using a template/partial to do it?

And something else I thought about… can taxonomies only be added to posts if they exist in that post’s front matter at build time? For example, I couldn’t list the taxonomies I want for a particular item in that item’s data structure and have them assigned to it, unless that’s some function added to the archetype and you do the getJSON as you described, right? So, if I want my posts/products to have various taxonomies, then those have to be in the front matter, full stop, yeah?

No way to add taxonomies to the post given your exact example though right? (I mean, unless you have your script generate those frontmatter taxonomy entries based on the data file?)

Also, why JSON? Why not TOML or YAML?

You have to tell it. Basically 2 options:

  1. By convention (random element, use the day number, whatever)
  2. Passing what you want as an OS environment variable, e.g. QUERY="name = 'foo'" hugo new "post/foo.md", and using getenv in your archetype template
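
A note on option 2: the VAR=value prefix sets the variable for that one command only, so it is the shell, not Hugo, that does the passing. A minimal sketch, in which an sh child process stands in for hugo new (the stand-in and the name `child` are illustrative, not part of Hugo):

```shell
# Stand-in for `hugo new`: a child process that reports the QUERY it receives.
child='printf "%s\n" "${QUERY:-unset}"'

# The assignment prefix exports QUERY to this one command only.
QUERY="name = 'foo'" sh -c "$child"   # prints: name = 'foo'

# The next command does not see it
# (assuming QUERY is not already exported in your shell).
sh -c "$child"                        # prints: unset
```

Inside the archetype, getenv "QUERY" would read the same value that the child process prints here.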

Ha! Thanks so much for that @bep :+1:

I wouldn’t have thought of using Hugo environment variables this way!


@gaetawoo

Here is a working example based on the code snippets given in this excellent topic: [Solved] CSV data key lookup

Contents of /data/abc.csv

id, description, color, number, cat, image
110, blue pearl, blue, 10, pearls, bpearl.jpg
111, green pearl, green, 12, pearls, gpearl.jpg
112, orange peel, orange, 19, fruit, opeel.jpg

Contents of archetype under /archetypes/post.md

{{- range $i, $r := getCSV "," "data/abc.csv" -}}
{{- $id := index $r 0 -}}
{{- if eq (getenv "HUGO_TITLE") $id }}
---
title: "{{ index $r 0 | title }}"
---
{{ end -}}
{{- end -}}

Basically, with the above we are telling Hugo to create a markdown file only if the value of the environment variable equals one of the cells in the first column of the CSV.

And here is the way to create separate markdown files from a CSV with Hugo only:

Just paste the entire list of commands into the terminal and Hugo will magically create the separate files (you will only need to press ENTER for the last file):

env HUGO_TITLE="110" hugo new "post/110.md"
env HUGO_TITLE="111" hugo new "post/111.md"
env HUGO_TITLE="112" hugo new "post/112.md"

Enjoy! :tada:

P.S. Of course you could do something similar with a JSON.

Thanks! That shows a lot of power. I guess you could actually do the content that way too if one so chose. I’ll have to think through my options now that I have a few ways to approach.

I seem to recall @budparr writing a slick blog post on how to use csplit to create individual markdown files from data as well, but I can’t find it on The New Dynamic for some reason. Maybe he has some recommendations as well.

Mentioned here, btw:

The above technique will output various markdown files from a single CSV file when using the command hugo new. After that you will have a pretty conventional content directory.

Also you could use some automated RegEx in your favorite editor to generate the list of terminal commands. :wink:
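
If you would rather not use an editor for this, the same list can be generated in the shell. This is only a sketch: it assumes the first line of the CSV is a header, that the id sits in the first column, and that ids contain no commas or quotes. The function name gen_commands and the section default are made up for the example.

```shell
# Print one `hugo new` command per data row of a CSV file.
# $1 = path to the CSV, $2 = content section (defaults to "post").
gen_commands() {
  awk -F',' -v section="${2:-post}" '
    NR == 1 { next }                      # skip the header row
    {
      id = $1
      gsub(/^[ \t]+|[ \t\r]+$/, "", id)   # trim spaces and a trailing CR
      if (id != "")
        printf "env HUGO_TITLE=\"%s\" hugo new \"%s/%s.md\"\n", id, section, id
    }
  ' "$1"
}

# Usage: gen_commands data/abc.csv
```

The output can be reviewed and pasted into the terminal just like the hand-written list.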

That article is 404. The other technique I was using up until today involved the csplit command indeed and the article that documented it is over here:

But the above technique with Hugo environment variables and archetypes feels way more natural, since it involves fewer commands and less messing about with the CSV, and it can be adapted for use with JSON very easily.

Anyway @rdwatters, Hugo has become quite powerful and we haven’t quite realized it.

I just want to say that I have used the above technique with a complex CSV to split it in various markdown files and it works flawlessly.

I did encounter a couple of caveats that are unrelated to Hugo but I managed to fix them so I’m posting these fixes here in case someone else encounters them:

  1. When exporting a CSV from Excel all UTF-8 characters turn into question marks. There are involved hacks circulating the web that seem like overkill.

Solution: Ditch Excel and use Sublime Text’s Advanced CSV Package

  2. However, when working with Advanced CSV, it stripped quotation marks whenever I justified the columns.

Solution: Disable auto_quotes in settings of Advanced CSV.

Tip 1: To use a comma in a description cell of your CSV you need to escape it by quoting the entire cell.

Tip 2: You can use Hugo functions outside the getCSV range to fetch whatever resources you may need and have them rendered as front matter parameters.

Tip 3: The Hugo new command can create markdown files in the specified PATH (Big time saver)
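
To illustrate Tip 1, here is a small sketch of the quoting rule. It assumes the usual CSV convention (wrap the cell in double quotes, double any quotes inside it), which is what Go's CSV reader expects as far as I know. The helper name csv_quote is hypothetical.

```shell
# Quote a CSV cell only when it contains a comma or a double quote.
csv_quote() {
  case "$1" in
    *\"*|*,*)
      # Double any embedded quotes, then wrap the whole cell in quotes.
      printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
      ;;
    *)
      printf '%s' "$1"
      ;;
  esac
}

printf '%s,%s,%s\n' 113 "$(csv_quote 'pearl, baroque')" white
# prints: 113,"pearl, baroque",white
```

Note that the opening quote comes directly after the comma here. A space before it may stop the parser from treating the cell as quoted, so that is worth checking against your own data.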

I am very happy with the above. It’s pretty fast and efficient and it shows the power of Hugo with Data.

Again my big thanks to @bep for suggesting Environment Variables.

What exactly do you mean by this (other than what you’ve written in the code example above)?

For example if you have an Archetype called orange.md

Then you can do:

hugo new "/orange/varieties/belladonna.md"

And belladonna.md will be created in the above PATH.

Ah, yea that’s what I thought you meant. Thank you!

Nice.

Of course, you could take this further with a simple BASH, PowerShell or Node.JS script (OK, or pretty much any other scripting language too!). It would be pretty easy to automate the whole thing.
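
A rough sketch of what such a script could look like in shell, assuming the CSV layout and the HUGO_TITLE archetype from earlier in the thread. HUGO_BIN and CONTENT_DIR are made-up knobs for this example (not Hugo settings), and create_posts is a hypothetical name.

```shell
# Run `hugo new` once per CSV row, with HUGO_TITLE set to the row's id.
# $1 = path to the CSV (first line is a header, first column is the id).
create_posts() {
  csv="$1"
  hugo_bin="${HUGO_BIN:-hugo}"          # hypothetical override, e.g. for a dry run
  content_dir="${CONTENT_DIR:-content}" # hypothetical override of the content folder
  tail -n +2 "$csv" | cut -d',' -f1 | tr -d ' \r' |
    while IFS= read -r id || [ -n "$id" ]; do
      [ -n "$id" ] || continue
      # Skip files that already exist instead of letting the command fail.
      if [ -e "$content_dir/post/$id.md" ]; then
        echo "skip post/$id.md"
        continue
      fi
      # </dev/null keeps the command from consuming the loop's input.
      HUGO_TITLE="$id" "$hugo_bin" new "post/$id.md" </dev/null
    done
}

# Usage (from the site root): create_posts data/abc.csv
# Dry run:                    HUGO_BIN=echo create_posts data/abc.csv
```

Running it with HUGO_BIN=echo first shows which files would be created without touching the content directory.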

Instead, you could (to redress the Excel bash), use Excel’s PowerQuery ETL features to generate the list of commands directly from the CSV, Excel file, JSON, a database, …

Excel is out of the question, it turns UTF-8 characters to ??? on CSV export.

One could import the .xlsx into LibreOffice Calc and then export the CSV. (Isn’t it ironic that free and open source software can handle UTF-8 whereas the proprietary and expensive counterpart cannot?)

But as I said above I’m using Sublime Text’s Advanced CSV Package and I have automated the generation of the console commands with the RegReplace package.

The main reason I prefer this setup to what I was doing before with Bash (link posted above) is the ability to generate the content files on the fly with Hugo archetypes.


I am using resources.Get and .Resize from within the Archetype to populate some old-fashioned image front matter parameters.

It seems to me that using a .Params.image in the templates, is faster than any .resources.Get, .Resize or Page Bundle.


Of course, if the above statement regarding speed is wrong, I’m sure that someone more knowledgeable will chime in to correct me. :wink:

Sorry, I wasn’t clear. The PowerQuery tool wouldn’t use a CSV export. It would create a shell script for you, which you can save directly to a file or copy/paste from Excel. Though to be fair, I’d need to double-check extended characters, but they should work OK. It needs a newish version of Excel though, and I don’t know if the same facilities are available in Excel for Mac. Still, PowerQuery in Excel is designed specifically as an Extract, Transform & Load (ETL) tool.

If you have some example data, I’d be happy to give it a go to check.

Mostly, I’d probably use either PowerShell or Node.JS to do this. Others might prefer Python or BASH :slight_smile:

That would still be used. I’m only suggesting that you could automate the whole thing based on the source data. Only worth it if you have LOTS of posts to create of course.

Thanks for offering to help. :four_leaf_clover:

But I’m not using Windows Tools for Development.

Haha, well I’ll try not to feel too sorry for you :smile:

Anyway, just thought I’d chip in with some other ideas in case they spark some thoughts in others.

Hey,
I wrote a simple tool that solves this problem (using a JS-wrapper): kidsil/hugo-data-to-pages .

Would definitely appreciate some feedback.
