Best way of parsing HTML to Hugo MD files

luca · February 5, 2023, 1:26pm

Hello,

I’m planning to migrate a website of 50+ pages to Hugo. Most of the HTML pages are built from templates, and the corresponding archetypes in Hugo have already been created.

To proceed with the migration I have to manually copy the text from the H1, the H2s, and the contents of various meta tags into the md files in Hugo.

As I am working with content that is structured on both sides of the fence, it would be great if I had a way of mapping those areas in the HTML to specific areas in the md files.

Where there is a gallery of photos, it would be great if I could use a loop to map those images into the images array in the md files.

Just to clarify: I am pretty much ready to migrate, and I only need to transfer the content (simple text and some HTML tags with text) to the md files. I am looking for a script or a tool that would spare me the hassle of doing it by hand.

Your contribution will be greatly appreciated.
Regards
Luca

fekete-robert · February 6, 2023, 8:39am

Have you checked how good (or bad) the output of a html>markdown conversion with pandoc is?

luca · February 6, 2023, 5:58pm

Hello Robert, thanks for the suggestion, but I feel that Pandoc is way too complex and general in purpose. I only need to map a few tags to a few front-end bits, not the entire page.
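
For a mapping this narrow, a short script may be enough. Below is a minimal sketch in Python using only the standard library, not a tested migration tool. It assumes the H1 becomes `title`, the `description` meta tag becomes `description`, every `<img>` on the page goes into an `images` array, and each H2 becomes a `##` heading in the body. Those front matter keys, the class name `PageFields`, and the functions `html_to_hugo_md` and `convert_directory` are all hypothetical and would need adjusting to the actual templates and archetypes.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class PageFields(HTMLParser):
    """Collect the <h1>, the <h2>s, named <meta> contents and <img> sources."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.headings = []
        self.meta = {}
        self.images = []
        self._current = None  # heading tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2"):
            self._current = tag
            self._buffer = []
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"]] = attrs["content"]
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join the captured text and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if tag == "h1":
                self.title = text
            else:
                self.headings.append(text)
            self._current = None


def html_to_hugo_md(html):
    """Return a Markdown document with YAML front matter for one HTML page."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    # Assumed front matter keys; rename them to match your archetypes.
    front_matter = {
        "title": parser.title,
        "description": parser.meta.get("description", ""),
        "images": parser.images,
    }
    lines = ["---"]
    for key, value in front_matter.items():
        # JSON strings and arrays are also valid YAML values.
        lines.append(f"{key}: {json.dumps(value)}")
    lines += ["---", ""]
    for heading in parser.headings:
        lines += [f"## {heading}", ""]
    return "\n".join(lines)


def convert_directory(src_dir, dest_dir):
    """Convert every *.html file in src_dir to a .md file in dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.html")):
        markdown = html_to_hugo_md(src.read_text(encoding="utf-8"))
        (dest / (src.stem + ".md")).write_text(markdown, encoding="utf-8")
```

Calling something like `convert_directory("old-site", "content/pages")` (both paths are placeholders) would write one md file per HTML page. Restricting the images to a specific gallery container, or copying paragraph text, would need extra logic on top of this.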

andrewd72 · February 6, 2023, 7:29pm

I haven't tried any, but there are bound to be some HTML-to-Markdown scripts out there, e.g.:
node-html-markdown - npm (npmjs.com)
Convert HTML to Markdown (davidwalsh.name)
