Pelican Blog Migration -- How to reduce manual effort

I’m migrating my blog https://tonym.us/ from pelican to hugo. I’d like to minimize the manual changes needed.

My pelican content is geared toward a chronological blog with all the posts in a /content/ folder e.g.

./content
   - post1.md
   - post2.md 
   - etc

Here are the areas I could use help with:

  • migrating frontmatter to yaml with Hugo schema – i have a rough script (below)
  • updating pelican tags & categories to hugo tags & genre – tags are not currently indexing despite using `taxonomy` frontmatter params
  • chronological & paginated indexing & blog post previews – using m10c theme i get indexing but not previews. I may need to write a new template to generate blurbs.
  • migrating pelican single-directory to hugo multiple-directory structure. No solution yet.
  • preserving slugs / permalinks. i have a solution for post permalinks using hugo.yaml but not a good solution to preserve tags, category & rss permalinks.

questions

  1. Is it possible to simplify hugo config to keep this simple structure?
  2. Are there tools to help with the above migration steps, e.g. frontmatter, tagging, etc.?

Here are some scripts I’ve found that I’m working to modify to fit my needs

This is a trivial scripting exercise.

They should be if your front matter looks like this:

title: foo
date: 2024-10-31T12:15:56-07:00
tags:
  - Lorem
  - ipsum
  - dolor

If not:

  • Check your site configuration to make sure you haven’t excluded “tags” from the taxonomy system . For example, this tells Hugo that the only taxonomy is colors:

    [taxonomies]
    color = 'colors'
    
  • Check your site configuration to make sure you are not disabling the taxonomy or term page kinds. For example, don’t do this:

    disableKinds = ['taxonomy','term']
    

If none of those apply, you’ll need to share your site’s project repository with us.

You may need to override one or more of your theme’s templates. You override a theme template by copying it into the same path relative to the layouts directory in the root of your project.

I don’t know what that means or why it is required.

I think we would need specific examples. You can override URLs by defining permalinks in your site configuration, by setting the url field in front matter, or by setting the slug field in front matter.

https://gohugo.io/content-management/urls

thanks your guidance was helpful.

Simple structure

I was able to update list.html in my theme to reflect my current directory structure

-      {{ range where .Paginator.Pages "Type" "!=" "page" }}
+      {{ range .Paginator.Pages.ByDate.Reverse }}

Tags, categories, etc

Thanks to your guidance, I updated frontmatter with the following and it’s improved the indexing

---
title: Three Pillars
date: 2021-07-14
categories: leadership
tags: management, information
params:
    author: Tony Metzidis
---

Permalinks

Thank you for your help with permalinks.

Frontmatter conversion

You’re right that sed can help with some of the frontmatter, but there’s still date formatting and other transforms that could benefit from a 1-200 line script. Every little bit helps.

No. The tags and categories fields must be arrays.
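
To find posts that still have this problem, a minimal Python sketch like the one below can flag `tags` or `categories` values written as a bare string. It is a line-based check rather than a real YAML parser, and the function name `scalar_taxonomy_fields` is made up for illustration.

```python
import re

# Front matter fields that Hugo treats as taxonomies by default.
TAXONOMY_FIELDS = ("tags", "categories")

def scalar_taxonomy_fields(markdown_text):
    """Return taxonomy fields whose YAML value is a bare string, not a list."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no YAML front matter found
    flagged = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter
        match = re.match(r"^(\w+):\s*(.*)$", line)
        if not match:
            continue  # indented or list-item line
        key, value = match.group(1), match.group(2).strip()
        # An empty value means a block list follows; "[" starts a flow list.
        if key in TAXONOMY_FIELDS and value and not value.startswith("["):
            flagged.append(key)
    return flagged
```

Running it over the text of each file in content/ and printing the non-empty results gives a list of posts to fix.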

Some scripts that helped with the translation. Mostly I used vim macros for the transforms.

slugs

# slugs
grep -h 'title:' *.md | cut -c 8-120 | tr '[:upper:]' '[:lower:]' | sed 's/[^0-9a-z -]//g;s/ /-/g;s/--*/-/g' | sed 's/^/slug: /' | sort

titles

# Titles
echo 'Fighting GCP & Firebase Cloud Client CLI and SDK Bloat' | tr '[:upper:]' '[:lower:]'| sed 's/[^a-z -]//g;s/ /-/g;s/--/-/g'
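
For anyone who prefers a script over a pipeline, here is a rough Python equivalent of the slug commands above. It is only an approximation of the same rules (lowercase, keep letters, digits, spaces and hyphens, then hyphenate), not Pelican's actual slug routine, and `slugify_title` is a made-up name. Unlike the `echo` example, it keeps digits.

```python
import re

def slugify_title(title):
    """Lowercase, keep only a-z, 0-9, space and hyphen, then hyphenate."""
    cleaned = re.sub(r"[^0-9a-z -]", "", title.lower())
    # Collapse any run of spaces/hyphens into one hyphen and trim the ends.
    return re.sub(r"[ -]+", "-", cleaned).strip("-")

print(slugify_title("Build 100kB Docker Images from Scratch"))
# prints: build-100kb-docker-images-from-scratch
```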

frontmatter

# frontmatter
sed 's/Title:/title:/g;s/Date:/date:/g;s/Category:/categories:/g;s/Authors:/author:/g' content/wsl2-backup-to-onedrive.md|head
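
The sed command only renames keys, so here is a sketch of a fuller conversion in Python. It assumes the usual Pelican Markdown header (`Key: value` lines ending at the first blank line) and a few date layouts. The key mapping, the `convert` name, and the choice to put unknown keys under `params` are all assumptions for illustration, so adjust them to your posts.

```python
import json
from datetime import datetime

# Pelican key -> Hugo key (assumed mapping, following the sed command above).
KEY_MAP = {"title": "title", "date": "date", "slug": "slug",
           "category": "categories", "tags": "tags"}
LIST_KEYS = {"categories", "tags"}
# Date layouts to try, in order (assumed; extend as needed).
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d")

def quote(value):
    # A JSON string is also a valid YAML double-quoted scalar.
    return json.dumps(value, ensure_ascii=False)

def normalize_date(value):
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        # Date-only values stay date-only; others become ISO 8601 (no time zone).
        return parsed.date().isoformat() if fmt == "%Y-%m-%d" else parsed.isoformat()
    return value  # leave unrecognized dates untouched

def convert(text):
    """Turn a Pelican-style header into YAML front matter; the body is unchanged."""
    header, _, body = text.partition("\n\n")
    top, params = [], []
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep or not key:
            continue
        if key in ("author", "authors"):
            params.append(f"  author: {quote(value)}")
        elif key not in KEY_MAP:
            params.append(f"  {key}: {quote(value)}")  # unknown keys go under params
        elif KEY_MAP[key] in LIST_KEYS:
            items = [quote(v.strip()) for v in value.split(",") if v.strip()]
            top.append(f"{KEY_MAP[key]}: [{', '.join(items)}]")
        elif key == "date":
            top.append(f"date: {normalize_date(value)}")
        else:
            top.append(f"{KEY_MAP[key]}: {quote(value)}")
    if params:
        top += ["params:"] + params
    return "\n".join(["---"] + top + ["---", "", body])
```

Values are always quoted so that titles containing a colon stay valid YAML, and tags and categories always come out as lists.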

thanks that helped

Although not required (yet), the idiomatic way to define custom front matter fields is to place them under a params key.

https://gohugo.io/content-management/front-matter/#parameters

example yaml:

---
date: 2024-02-02T04:14:54-08:00
draft: false
params:
  author: John Smith
title: Example
weight: 10
---

Again, what you have works fine, but at some point we may require this front matter structure.

great tip. that sed script was a precursor for some final vim macros. here’s an example of the final product

---
title: Three Pillars
slug: three-pillars
date: 2021-07-14
categories: [leadership]
tags: [management, information]
params:
    author: Tony Metzidis
---

The final site went live. It took about 40 commits across my site and the theme template.

Here’s a diff of the template changes that were needed to preserve my layout

Here’s a sample of the local hugo layout & config needed

Highlights

  1. layout for archives list in /archives.html
  2. Dockerfile for CI
  3. hugo config for color theme, footer, uglyURLs

Some good results

  • New site network transfer is 58% smaller
  • Builds are much faster: about 60ms with hugo vs 5 seconds with pelican
  • Docker images for CI are smaller

Biggest Challenges

  • Understanding template targeting with content type, layout, kind dimensions
  • Some manual labor for frontmatter
  • slug generation is hard-coded in hugo at build time. Maybe a config option is needed to customize slugs? (pelican uses long slugs, hugo uses 10-char slugs)

Overall very nice experience learning Hugo templating.

I’m not quite sure what that means. This is perfectly valid:

+++
title = 'Post 1'
slug = '012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'
+++

I’m referring to slug generation from the post content. Pelican auto-generates the slugs from the title. I would have liked to configure this routine to match the previous behavior.

func dateAndSlugFromBaseFilename(location *time.Location, name string)

slug := strings.Trim(withoutExt[10:], " -_")

I’m trying to be helpful in documenting the migration to make the migration path easier. This is good for the project. Please don’t take this as criticism. It’s been a good experience.

Here are areas where I could help

  • updating docs on pelican to hugo migration (currently docs have other vendors)
  • adding compatibility for the slugs to preserve urls
  • proper front-matter transforms (e.g. transform schema, date formats)

I’ve long been a hugo fan (I think I contributed the original docker work). I really enjoy volunteering.

The code you are pointing to is applicable when your site configuration looks like this:

[frontmatter]
date = [':filename', ':default']

Which tells hugo that a filename such as this:

2024-04-01-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.md

Should have date set to 2024-04-01 and the slug set to “xxxxxxxxxx… (no limit)”.
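
To make the behavior described above concrete, here is an illustrative Python sketch of the same idea: the first 10 characters of the filename are tried as a date, and whatever follows becomes the slug. This is not Hugo's Go code, and the function name is made up. The point is that the "10" is the length of the date prefix, not a limit on the slug.

```python
import os
from datetime import datetime

def date_and_slug_from_filename(name):
    """Split 'YYYY-MM-DD-some-slug.md' into (date, slug); (None, '') if no date prefix."""
    stem, _ext = os.path.splitext(os.path.basename(name))
    if len(stem) < 10:
        return None, ""
    try:
        day = datetime.strptime(stem[:10], "%Y-%m-%d").date()
    except ValueError:
        return None, ""  # the filename does not start with a date
    # Everything after the date is the slug, minus separator characters.
    return day, stem[10:].strip(" -_")

print(date_and_slug_from_filename("2024-04-01-my-long-post-title.md"))
print(date_and_slug_from_filename("100kb-docker-image.md"))
```

A filename without a date prefix, such as 100kb-docker-image.md, yields no date and no slug from this step.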

I think I’m missing something here.

Example File

  • filename (in) = /content/100kb-docker-image.md
  • front-matter:
---
title: Build 100kB Docker Images from Scratch

Pelican (old)

url (out) = /build-100kb-docker-images-from-scratch.html

Hugo (default, no slug)

url (out) = /100kb-docker-image.html

To clarify, you want the URL to be a “slugified” version of the title as defined in front matter? Is that correct?

Yes, as an option, to preserve URLs during migration. My workaround was adding “slug” to the content.

site config:

uglyURLs = true
[permalinks]
'/' = '/:title'

content structure

content/
├── 100kb-docker-image.md <-- front matter title = 'Build 100kB Docker Images from Scratch'
└── _index.md

published site

public/
├── build-100kb-docker-images-from-scratch.html
├── favicon.ico
└── index.html
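
One way to confirm that URLs were preserved is to compare the old Pelican build output against Hugo's public/ directory. The sketch below lists every .html path that exists in the old output but not in the new one. The function name `missing_pages` is made up, and the directory names in the usage note are assumptions.

```python
from pathlib import Path

def missing_pages(old_output, new_output):
    """Return sorted relative .html paths present in old_output but not in new_output."""
    old_root, new_root = Path(old_output), Path(new_output)
    old_pages = {p.relative_to(old_root).as_posix() for p in old_root.rglob("*.html")}
    new_pages = {p.relative_to(new_root).as_posix() for p in new_root.rglob("*.html")}
    return sorted(old_pages - new_pages)
```

For example, `missing_pages("output", "public")` would compare a Pelican build in output/ with the Hugo build in public/. Any tag, category, or post pages in the result still need a permalink, slug, or alias.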