Using CSV file to create individual posts on build

Currently I create an individual file for each book I read. This results in a huge amount of front matter per book and makes it complicated to add a new book. Since there is no way to automate filling in this front matter using hugo new, I am looking at running my Goodreads export file (in CSV format) through a template. But I don’t want to just create a listing; I want each entry to have its own page/post.

The workflow I envision is to add an entry to the CSV file, then have hugo create the resulting html file during the build process (or using LiveReload locally).

Is this a workable scenario? Will it create issues using this with Netlify? Obviously I need to dig into the getCSV part of Data Templates, but most of that seems to talk about pulling all the data into one file…

What you want is not yet possible, but there is an open GH issue about it: https://github.com/gohugoio/hugo/issues/5074

Now, if you were okay with having all your book info on a single page, that is very doable.
It’s the “build a new page for each book” that is not yet possible.

I would create a separate script in your favourite language to read the CSV file and generate separate content files with the needed frontmatter filled in.

Hugo would then see the content files and generate the site as you want.

You can (probably easily) convert the CSV to JSON, and Hugo accepts JSON for the build right now.

I like the JSON idea, but in the docs it says this:

“Data files aren’t used to generate standalone pages; rather, they’re meant to be supplemental to content files”

So I’m not sure if that would allow what I’m trying to do…

@frjo I already have a separate file for each book, generated from the CSV using WordPress, a plugin, and Advanced Custom Fields. I’m just trying to automate the workflow a bit going forward.

We’re not there yet. That would be an ideal scenario, but as @zwbetz mentioned, hopefully this will be possible once the “Build pages from data source” issue (#5074 in gohugoio/hugo) is addressed.

Currently the way I am creating separate Markdown files from a CSV is outlined here:

In retrospect, even though it wouldn’t take long, I’ve decided that rebuilding 140+ book pages every time I change something on the site feels pretty clunky.

As a workaround, I added ‘newContentEditor’ to my config file so any new book page created auto-opens in VS Code. I have the archetype set up so I can start entering data and move to the next field with one quick down-arrow. This is about as automated as it’s going to get, and that’s fine.

Just for clarification, how do you trigger the file creation? Do you have to run ‘hugo new’? Normally you have to provide a filename with that…

By JSON I mean creating a Markdown file with JSON front matter in it. The extension will still be .md, but the front matter in the file is JSON (as opposed to YAML or TOML). This is for your posts in your content folder. Whatever is in the data folder is something else entirely; that’s what you are referring to, @dixonge.
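
To make the JSON suggestion concrete, here is a minimal sketch of turning one CSV row into the text of a content file with JSON front matter. It assumes the row arrives as a dict (as csv.DictReader would yield it) and that a "body" column holds the main content; the function name render_post and the sample row are hypothetical.

```python
import json


def render_post(row):
    """Return the text of a Markdown file whose front matter is JSON."""
    # Every column except "body" becomes a front matter field.
    front_matter = {key: value for key, value in row.items() if key != 'body'}
    # JSON front matter is a single JSON object at the very top of the file.
    head = json.dumps(front_matter, indent=2, ensure_ascii=False)
    return head + '\n\n' + row.get('body', '') + '\n'


# Hypothetical row, shaped like the output of csv.DictReader.
row = {'title': 'Dune', 'author': 'Frank Herbert', 'body': 'My review.'}
print(render_post(row))
```

Writing the returned text to a .md file in the content folder should give Hugo a regular post, since the front matter format is detected from the opening brace.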

Here is a simple example of a Python script that converts each row in a CSV file into a Hugo post.

It uses the first row as headers and writes them as params in the front matter.

If a file already exists, it will not be regenerated. This speeds up the script when you are adding to the CSV file and only want to output the new content.

#!/usr/bin/env python3
"""
The csv file must contain a "title" column and any main content should be in a "body" column.
"""

import csv
import re
import unicodedata

from pathlib import Path

# Set the path to the source CSV file.
source = Path('/Users/frjo/Desktop/test.csv')


def slugify(value):
    value = str(value)
    value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', '-', value)


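def write(path, row):
    # Write every column except "body" as front matter, then the body itself.
    with open(path, 'w', encoding='utf-8') as file:
        print('---', file=file)
        for key, value in row.items():
            if key != 'body':
                # Escape backslashes and double quotes so the YAML stays valid.
                value = str(value).replace('\\', '\\\\').replace('"', '\\"')
                print(f'{key}: "{value}"', file=file)
        print('---\n', file=file)
        if 'body' in row:
            print(row['body'], file=file)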


def main():
    # The "with" block closes the CSV file once all rows are processed.
    with open(source, newline='', encoding='utf-8') as csv_file:
        for row in csv.DictReader(csv_file):
            filename = slugify(row['title'])
            path = Path(f'{filename}.md')
            # Skip rows whose post already exists.
            if not path.is_file():
                write(path, row)


if __name__ == '__main__':
    main()

Indeed.

Typically I paste a list of hugo new commands in the console and the files are created sequentially.

env HUGO_TITLE="110" hugo new "post/110.md"
env HUGO_TITLE="111" hugo new "post/111.md"
env HUGO_TITLE="112" hugo new "post/112.md"
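
A list of commands like the one above could also be generated rather than typed by hand. The following is only a sketch: the helper name hugo_new_commands and the default section "post" are hypothetical, and it assumes the archetype is what reads the HUGO_TITLE environment variable.

```python
import shlex


def hugo_new_commands(titles, section='post'):
    """Build one `hugo new` command line per title."""
    commands = []
    for title in titles:
        path = f'{section}/{title}.md'
        # shlex.quote keeps titles with spaces or quotes safe for the shell.
        commands.append(
            f'env HUGO_TITLE={shlex.quote(title)} hugo new {shlex.quote(path)}'
        )
    return commands


for line in hugo_new_commands(['110', '111', '112']):
    print(line)
```

The printed lines can then be pasted into the console as before.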

That’s a pretty neat script @frjo :grinning:

How can I add categories and tags with this script? Could somebody help?

For example, in TOML:
categories = ["fruits"]
tags = ["banana", "apel"]

Thanks
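
One possible way to get list values such as categories and tags out of a CSV is sketched below. It assumes the CSV has "categories" and "tags" columns whose values are separated by semicolons; that convention, the LIST_COLUMNS constant, and the front_matter_line helper are all hypothetical. The output is YAML-style front matter, matching the script earlier in the thread; for TOML the separator would be " = " instead of ": ".

```python
import json

# Hypothetical names of the CSV columns that hold several values.
LIST_COLUMNS = ('categories', 'tags')


def front_matter_line(key, value):
    """Format one front matter line, splitting list columns on ';'."""
    if key in LIST_COLUMNS:
        items = [item.strip() for item in value.split(';') if item.strip()]
        # A JSON array is also a valid YAML flow sequence.
        return f'{key}: {json.dumps(items, ensure_ascii=False)}'
    # json.dumps adds the quotes and escapes any embedded ones.
    return f'{key}: {json.dumps(value, ensure_ascii=False)}'


row = {'title': 'Fruit salad', 'categories': 'fruits', 'tags': 'banana; apel'}
for key, value in row.items():
    print(front_matter_line(key, value))
# title: "Fruit salad"
# categories: ["fruits"]
# tags: ["banana", "apel"]
```

In the script above, a helper like this could replace the single print call inside the loop of write().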

way ahead of you, sir - I converted all my sites weeks ago! :slight_smile: