Best way of parsing HTML to Hugo MD files


I’m planning to migrate a 50+ page website to Hugo. The majority of the HTML pages are built from templates, and their corresponding archetypes in Hugo have already been created.

To proceed with the migration I have to manually copy text from the H1, the H2s, and various meta tag contents into the md files in Hugo.

As I am working with content that is structured on both sides of the fence, it would be great if I had a way of mapping those areas in the HTML to specific areas in the md files.

Where there is a gallery of photos, it would be great if I could use a loop and map those images into the images array in the md files.

Just to clarify: I am pretty much ready to migrate, and I only need to transfer the content (simple text and some HTML tags with text) to the md files. I am looking for a script or a tool that would spare me the hassle of doing it by hand.

Your contribution will be greatly appreciated.

Have you checked how good (or bad) the output of an HTML-to-Markdown conversion with pandoc is?

Hello Robert, thanks for the suggestion, but I feel that Pandoc is way too complex and general in purpose. I only need to map a few tags to a few front-end bits, not the entire page.

I have not tried any of them, but there are bound to be some HTML-to-Markdown scripts out there, e.g.:
node-html-markdown - npm
Convert HTML to Markdown
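
If only a handful of tags need to be mapped, a small script may be enough. Below is a minimal sketch in Python using only the standard library's `html.parser`. It is not a tested migration tool: the names `PageExtractor` and `to_hugo_markdown` are made up for illustration, and the front matter keys (`title`, `description`, `images`) are assumptions that would have to be changed to match the fields in your own archetypes. It takes the first H1 as the title, reads the `description` meta tag, collects every `<img>` src into the images array, and writes the H2s as headings in the body.

```python
import json
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the first <h1>, all <h2>s, named <meta> tags and <img> sources."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.subheadings = []
        self.meta = {}
        self.images = []
        self._current = None  # heading tag whose text is being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2"):
            self._current = tag
            self._buffer = []
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"]] = attrs["content"]
        elif tag == "img" and attrs.get("src"):
            # This is the "loop over the gallery": every image ends up here.
            self.images.append(attrs["src"])

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._buffer).split())
            if tag == "h1" and not self.title:
                self.title = text
            elif tag == "h2":
                self.subheadings.append(text)
            self._current = None


def to_hugo_markdown(html):
    """Return a Markdown document with YAML front matter built from `html`."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()

    # Assumed field names; adjust them to match your archetypes.
    front_matter = {
        "title": parser.title,
        "description": parser.meta.get("description", ""),
        "images": parser.images,
    }

    lines = ["---"]
    for key, value in front_matter.items():
        # JSON strings and arrays are also valid YAML, so json.dumps
        # handles the quoting and escaping of each value.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    lines.append("")
    for heading in parser.subheadings:
        lines.append(f"## {heading}")
        lines.append("")
    return "\n".join(lines)


SAMPLE = """<html><head><meta name="description" content="A demo page"></head>
<body><h1>Hello</h1><h2>First</h2>
<img src="/img/a.jpg"><img src="/img/b.jpg"></body></html>"""

print(to_hugo_markdown(SAMPLE))
```

To run it over the whole site you would loop over the HTML files, call `to_hugo_markdown` on each file's contents, and write the result to the matching path under `content/`. If the gallery images need to be told apart from other images on the page, the `img` branch would need an extra check, for example on a class or a parent element, which depends on how your templates are marked up.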