Wordpress Migration: URL Rewriting

I would like to migrate my existing website from Wordpress to Hugo.

A lot of my traffic is driven from old posts on Reddit, etc, which use Wordpress’s permalink format:

http://www.penwatch.net/cms/?p=387

If I break these links when migrating to Hugo, I will lose the majority of my inbound traffic. I also don’t want to lose any Google-juice from broken links.

The existing tools, e.g. wordpress-to-hugo-exporter, don’t address this.

Other documentation, e.g. Justin Dunham’s guide, explains how Apache’s mod_rewrite (i.e. .htaccess files) can be used to redirect from old URLs to new URLs. But the example given in Justin’s guide is for a friendlier scheme than I am using: his URLs included the post name, whereas mine are just sequential numbers like 387.

I Googled various keywords like wordpress, static site, rewrite, permalink, migrate, htaccess, and so on, but didn’t find anything that looked suitable for my circumstances. (Or, possibly, I am looking at the problem the wrong way, and passed over an actual solution without recognising it.)


How can I migrate my existing Wordpress site, with horrible permalinks like http://www.penwatch.net/cms/?p=387, to Hugo, without breaking links to the old Wordpress site?

I am happy to write up the successful procedure and add that to the Hugo documentation, when done.

This approach might be tedious but you can define aliases in the frontmatter of your post. Those will generate empty HTML files that redirect users to the new url.

I would agree with @digitalcraftsman: if scripted, it wouldn’t be too tedious.

Aliases sound like a reasonable solution.

I could generate aliases so that the content which formerly lived at:

http://www.penwatch.net/cms/?p=387

is aliased to:

http://www.penwatch.net/cms/old_permalinks/387 

Then use mod_rewrite to do the translation.

This is better than abusing Hugo’s “permalink” attribute, which affects other aspects of the page generation.

It should be easy to adapt the existing Wordpress-to-Hugo converter to output the relevant information. (Famous last words??)

I’ll give it a go, and report back.

I have a working solution for this now.

I am writing up a larger post about migrating from Wordpress to Hugo, start to finish.

It will live on my new Hugo website, when it comes up. A draft (subject to tweaks and changes) is below. It requires some polish and proofreading. Please let me know if it can be improved.

This solution is currently live at http://www.penwatch.net/cms2, mirroring the live Wordpress site at http://www.penwatch.net/cms/. Try http://www.penwatch.net/cms2/?p=24 for an example of a redirect.

Note the following limitations:

  • Wordpress draft posts are not exported.
  • Wordpress comments are not exported. These can be exported from the Wordpress admin interface in Wordpress Extended RSS (WXR) format, along with all posts, etc. You could then import the comments into a third-party service like Disqus. You would need to edit the WXR file to change all the URLs from old post locations to new post locations. I haven’t looked into this yet.

Get data out of Wordpress

  1. Export your data from Wordpress using wordpress-to-hugo-exporter.

    • Install the wordpress-to-hugo-exporter plugin on your Wordpress blog.

    • From the Wordpress dashboard, activate the plugin.

    • In the Wordpress Dashboard, use Tools > Export To Hugo.

      If you get PHP errors, you might try disabling some or all of your Wordpress plugins.

    • You will get a zip file named hugo-export.zip.

  2. Unzip hugo-export.zip.

    For some reason, the wordpress-to-hugo-exporter plugin creates the zip file with a nameless parent folder inside. So Windows Explorer thinks it’s an empty zip file.

    Opening the zip file with a more advanced tool, e.g. 7-Zip, reveals the actual contents.

    Unzip the contents to a folder like C:/mysite/. You should have folders like C:/mysite/post/ containing markdown files, and C:/mysite/wp-content/ containing images.

  3. Make the Hugo data into a git repository

    cd C:/mysite/
    git init .
    git add .
    git commit -am "Initial commit from wordpress-to-hugo-exporter"
    
  4. Create a .gitignore file containing the following:

    public/
    

    Commit that:

    git add .
    git commit -am "Add .gitignore"
    

Install a theme and get an initial config

  1. Install the Blackburn theme:

    mkdir themes
    cd themes
    git submodule add https://github.com/yoshiharuyamashita/blackburn.git
    git commit -am "Imported Blackburn theme"
    cd ..
    
  2. Add a Hugo configuration file.

    Discard the existing config.yaml file generated by wordpress-to-hugo-exporter, which is pretty useless.

    Create a new config.toml file, using the Blackburn theme’s template:

    rm config.yaml
    cp themes/blackburn/exampleSite/config.toml ./config.toml
    git add .
    git commit -am "Add config file"
    

Move files into the correct folders

wordpress-to-hugo-exporter doesn’t put files into the folders that Hugo expects. We will need to move the files into the correct directory structure, with static/, content/, etc.

  1. Move images to static/: The wp-content folder, containing your images (e.g. wp-content/uploads/139.png), should actually be static/wp-content/.

    mkdir static
    mv wp-content static
    git add .
    git commit -am "Move images to static/ folder"
    
  2. Move posts to content/: Similarly, the post/ folder, containing markdown files, should actually be content/post/.

    mkdir content
    mv post content
    git add .
    git commit -am "Move post/ to content/ folder"
    
  3. Move pages to content/: If you had any “pages” in Wordpress, e.g. an “About” page or a “Disclaimer” page, then wordpress-to-hugo-exporter puts those into their own directories.

    For example, my “Disclaimer” page in WordPress became disclaimer/index.md.

    Move these into the content/ folder as well:

    mv disclaimer/index.md content/disclaimer.md
    mv public-bookmarks/index.md content/bookmarks.md
    git add .
    git commit -am "move pages into content/ directory"
    

Edit front matter

The front matter of each post needs to be changed, to accommodate a) the Blackburn theme, and b) redirection of Wordpress permalinks to the new Hugo pages.

Perform the following find and replace operations in all the content/*.md and content/post/*.md files.

I used Notepad++ to do this, using “Find and Replace in Files” in “Regular Expression” mode.

  • Find and replace

    url: /\?p=
    

    with

    aliases:\n    - /wp-permalink/p
    

    This is needed to redirect from the permalinks later. Note that the question mark in url: /?p= is a special character in regular expressions, so it needs to be escaped: \?.

  • Find and replace

    categories:
    

    with

    tags:
    

    This is needed for the Blackburn theme, which prints tags at the top of every post, but not categories.

  • Find and replace

    layout: post
    

    with nothing (empty string “”). This is needed because the Blackburn theme doesn’t have any specific template for a “post”.

Front matter before:

---
title: How to draw an inductor symbol
author: Li-aung Yip
layout: post
date: 2011-11-20T17:18:17+00:00
url: /?p=35
categories:
  - Uncategorized

---

Front matter after:

---
title: How to draw an inductor symbol
author: Li-aung Yip

date: 2011-11-20T17:18:17+00:00
aliases:
    - /wp-permalink/p35
tags:
  - Uncategorized

---
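If you would rather script the three replacements than run them by hand in an editor, something like the following should work. This is only a sketch: the function names rewrite_front_matter and rewrite_tree are made up for illustration, and it assumes every file begins with a YAML front matter block delimited by --- lines, as in the example above. Unlike a plain find-and-replace in files, it leaves the body of each post alone.

```python
import re
from pathlib import Path


def rewrite_front_matter(text: str) -> str:
    """Apply the three replacements to the front matter of one Markdown file."""
    # Only touch the block between the opening '---' and the closing '---'.
    if not text.startswith("---"):
        return text
    end = text.find("\n---", 3)
    if end == -1:
        return text
    front, rest = text[:end], text[end:]

    # url: /?p=35  ->  aliases: list with one entry, /wp-permalink/p35
    front = re.sub(r"^url: /\?p=", "aliases:\n    - /wp-permalink/p", front, flags=re.M)
    # The Blackburn theme prints tags, not categories.
    front = re.sub(r"^categories:", "tags:", front, flags=re.M)
    # Blank out the layout line (this leaves an empty line behind).
    front = re.sub(r"^layout: post$", "", front, flags=re.M)
    return front + rest


def rewrite_tree(root: Path) -> int:
    """Rewrite every .md file under root in place; return how many changed."""
    changed = 0
    for path in sorted(root.rglob("*.md")):
        old = path.read_text(encoding="utf-8")
        new = rewrite_front_matter(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Calling rewrite_tree(Path("content")) from the site folder covers both content/*.md and content/post/*.md. Since everything is already in git, you can review the result with git diff before committing.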

Commit your changes:

git commit -am "Apply find and replace operations to front matter"

Test run the site

At this point, we have enough done that we can try generating the site.

PS C:\mysite\> hugo --buildDrafts
Started building site
1 of 1 draft rendered
0 future content
0 expired content
46 pages created
0 non-page files copied
5 paginator pages created
16 tags created
0 topics created
in 703 ms

The website is generated into the public/ folder, i.e. C:/mysite/public/.

(If you want to see a live preview, try hugo server --buildDrafts and open your web browser to http://localhost:1313.)

Take a look inside and notice that there is a folder called wp-permalink, which contains one file for each of your posts. For example, public/wp-permalink/p24/index.html. These files contain HTML that re-directs the browser from the alias,

http://www.penwatch.net/cms/wp-permalink/p24/index.html

to the actual post location,

http://www.penwatch.net/cms/post/2011-11-20-my-thesis-online-empirical-mode-decomposition/.
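To check where an alias page points without opening it in a browser, you can read the redirect target out of the generated HTML. The sketch below assumes the alias page redirects via a meta http-equiv="refresh" tag, which is how Hugo's built-in alias pages work as far as I know; the helper name redirect_target is hypothetical.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class _RefreshFinder(HTMLParser):
    """Records the URL of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.target: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.org/post/"
        match = re.search(r"url=(.*)", attributes.get("content") or "", re.I)
        if match:
            self.target = match.group(1).strip()


def redirect_target(html: str) -> Optional[str]:
    """Return the meta-refresh target of an alias page, or None if absent."""
    finder = _RefreshFinder()
    finder.feed(html)
    return finder.target
```

For example, passing in the text of public/wp-permalink/p24/index.html should return the URL of the actual post, if the assumption about the page format holds.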


Set up .htaccess to redirect URLs

We need to redirect the old URLs, e.g.

http://www.penwatch.net/cms/?p=24

to the alias:

http://www.penwatch.net/cms/wp-permalink/p24/index.html

which contains the redirect to the actual post location:

http://www.penwatch.net/cms/post/2011-11-20-my-thesis-online-empirical-mode-decomposition/

Create a file named static/.htaccess and put the following code into it (adapting for your URL scheme):

RewriteEngine On
RewriteCond %{QUERY_STRING} ^p=(\d+)$
RewriteRule ^$ http://www.penwatch.net/cms/wp-permalink/p%1? [L,R=301,NC]

For Apache webservers, this will cause the URL

http://www.penwatch.net/cms/?p=100

to be redirected to:

http://www.penwatch.net/cms/wp-permalink/p100

and the web browser will handle things from there.
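To sanity-check which old URLs the rule will catch before deploying, the matching logic can be mimicked offline. This is a rough model, not a replacement for testing against the real server: it assumes the .htaccess file ends up in the /cms/ directory, so that the ^$ pattern only matches requests for /cms/ itself, and the function name expected_redirect is made up for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Same pattern as the RewriteCond; re.ASCII keeps \d to the digits 0-9.
QUERY_RE = re.compile(r"^p=(\d+)$", re.ASCII)

# Assumptions taken from the example above; adapt for your URL scheme.
SITE_PATH = "/cms/"
ALIAS_BASE = "http://www.penwatch.net/cms/wp-permalink/p"


def expected_redirect(old_url: str) -> Optional[str]:
    """Return the alias URL the rule should redirect to, or None if no match."""
    parts = urlsplit(old_url)
    # RewriteRule ^$ : only the directory root itself is rewritten.
    if parts.path != SITE_PATH:
        return None
    # RewriteCond : the query string must be exactly p=<digits>.
    match = QUERY_RE.match(parts.query)
    if match is None:
        return None
    # The trailing '?' in the rule drops the query string from the target.
    return ALIAS_BASE + match.group(1)
```

Note that because the pattern is anchored with ^ and $, a URL with extra query parameters (e.g. ?p=24&preview=true) is not redirected.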

Cleanup

  • Replace ’ (Smart quote ’) with '
  • Replace – (en dash –) with -
  • Replace “ and ” (Smart quotes “ ”) with "
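These replacements can also be scripted. A minimal sketch, covering only the characters listed above (the function name clean_typography is hypothetical):

```python
# Map each typographic character from the list above to its plain ASCII form.
REPLACEMENTS = str.maketrans({
    "\u2019": "'",   # right single quotation mark (smart apostrophe)
    "\u2013": "-",   # U+2013, the dash character shown in the list above
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def clean_typography(text: str) -> str:
    """Return text with the smart quotes and dash replaced by ASCII."""
    return text.translate(REPLACEMENTS)
```

Apply it to the text of each file under content/ and write the result back. Other typographic characters, such as the left single quote, are left untouched.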

Quick note:

If wordpress-to-hugo-exporter fails when using Chrome with a “Network Error”, try using Firefox instead.

I suspect that wordpress-to-hugo-exporter is a bit rough and ready with the way it creates and sends the zip file, and Firefox is more liberal than Chrome in what it accepts.