Generate Hugo website as a PDF

Hey!

I’m using the https://github.com/matcornic/hugo-theme-learn theme for a documentation website. I’m looking for a way to provide an offline version of the website as a PDF. This would require converting Markdown to PDF and, more importantly, putting the pages in the right order before converting them.

Has anyone tried this yet?
Thanks

I don’t believe there is a way to do this automatically with Hugo. Hugo passes some default arguments to external helpers like Pandoc, but last time I checked I couldn’t find out how to change them:

Once those can be changed, you should be able to include PDF generation arguments for Pandoc.

There are a couple of alternative options.

  1. The easiest would probably be to create a wrapper script around Pandoc that adds these arguments. I don’t have the time to write it for you, so you would need to do that yourself or get help from someone else.

  2. PDF documents can be created from a custom output format template, but this is probably far more complicated than it is worth, which is why it hasn’t been done.

You can watch for updates here:

Thank you for pointing me to this issue. I’ll follow it!
I could write a script with Pandoc and try to sort the Markdown files, but I’m not sure I want to take this path…
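
For anyone weighing that path, here is a minimal sketch of the sorting half. It assumes YAML front matter with a numeric `weight:` field and file paths without tabs or newlines; `list_pages_by_weight` is a hypothetical helper name, and Hugo's own ordering rules are more involved than this.

```shell
#!/bin/sh
# Print the Markdown files under a content directory, ordered by the numeric
# "weight:" value in their front matter. Files without a weight are treated
# as weight 0 and therefore come first in this sketch.
list_pages_by_weight() {
  content_dir=$1
  find "$content_dir" -type f -name '*.md' | while IFS= read -r file; do
    # Take the first "weight: N" line; fall back to 0 when there is none.
    weight=$(sed -n 's/^weight:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
    printf '%s\t%s\n' "${weight:-0}" "$file"
  done | sort -t "$(printf '\t')" -k1,1n -k2,2 | cut -f2
}
```

The resulting list could then be handed to Pandoc, for example `pandoc $(list_pages_by_weight content) -o site.pdf`, which needs a PDF engine such as LaTeX installed and will not render Hugo shortcodes.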

I did this recently, and this is how:

My Hugo site has a page with its own layout that combines all pages into a single HTML file suitable for PDF conversion. This is the [simplified] build script I run each time I want to update my website:

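# Build the site, removing stale files from ./public first
hugo --cleanDestinationDir --minify
# Convert the combined page to PDF ("-" makes wkhtmltopdf read HTML from stdin)
wkhtmltopdf --outline-depth 2 --enable-internal-links - ./public/downloads/all-content.pdf < ./public/all-content/index.html
# Remove the source HTML page so it is not published
rm -r ./public/all-content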

I’ve also excluded the source HTML page from the sitemap.

Most of the styles in the PDF file will come from the HTML file, but some that are specific to PDF, like margins, can be set using a YAML settings file for wkhtmltopdf.
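
As a rough illustration of the PDF-specific settings, the sketch below passes margins as command-line flags instead of a settings file, since the exact file format used above isn't shown. The margin values, the A4 page size, and the `html_to_pdf` helper name are assumptions.

```shell
#!/bin/sh
# Convert an HTML file to PDF with explicit page size and margins.
# WKHTMLTOPDF can be overridden, for example with "echo" for a dry run.
html_to_pdf() {
  input=$1
  output=$2
  "${WKHTMLTOPDF:-wkhtmltopdf}" \
    --page-size A4 \
    --margin-top 20mm --margin-bottom 20mm \
    --margin-left 15mm --margin-right 15mm \
    --outline-depth 2 --enable-internal-links \
    "$input" "$output"
}
```

It would be called as `html_to_pdf ./public/all-content/index.html ./public/downloads/all-content.pdf` after the Hugo build.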

I’ve also piped a few sed commands into the script above to adjust the HTML content for PDF conversion (e.g., replacing iframes with fixed images), but you can also do this in Hugo when setting up that page.
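
A sed step of that kind might look like the sketch below. It is not the exact command from the script above: it assumes each iframe opens and closes on the same line, and the default placeholder path is hypothetical.

```shell
#!/bin/sh
# Replace every <iframe ...>...</iframe> on stdin with a static image tag.
# The placeholder path must not contain "|" or "&", which are special to sed here.
replace_iframes() {
  placeholder=${1:-/images/embed-placeholder.png}
  sed "s|<iframe[^>]*>[^<]*</iframe>|<img src=\"$placeholder\" alt=\"Embedded content\">|g"
}
```

In the build script it would sit between the HTML file and wkhtmltopdf, for example `replace_iframes < ./public/all-content/index.html | wkhtmltopdf - ./public/downloads/all-content.pdf`.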

Really interesting @Nader_K_Rad!

How do you get this single HTML page, and what is its URL?

Thanks

It’s a section ("/all-content/") with its own layout. This is a simplified version of the layout:

<!DOCTYPE html>
<html lang="en">

<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# product: http://ogp.me/ns/product#">

  <meta charset="utf-8">

  <xbase href="file:///mnt/websites/xxxxxxxx/public">

  <title>All the Posts!</title>

  <style>
    
  </style>

</head>

<body>

  <h1>All the Posts!</h1>

  <p>This is a copy of all posts on the website. To download the latest version, visit...</p>

  <p>TOC:</p>

  <ul>
    {{ range .Site.RegularPages }}
      <li><a href="#{{ substr .RelPermalink 1 -1 }}">{{ .Title }}</a></li>
    {{ end }}
  </ul>

  {{ range .Site.RegularPages }}

    <h2 id="{{ substr .RelPermalink 1 -1 }}">{{ .Title }}</h2>
    <p style=""><a href="{{ .Permalink }}">{{ .Permalink }}</a> - {{ .Params.sdate }}</p>
    {{ .Content }}

  {{ end }}

</body>
</html>

I should warn you, however, that I’m not an expert in Hugo; others in this forum may be able to give you much better solutions. This is just something I came up with, and it works fine for me.

All right! Understood: I need to create a specific page that renders all the Markdown as HTML and then convert it to PDF. Thanks for the hint!

Hi there,
I’m working on Paged.js, an HTML-to-PDF solution that follows the W3C standards, to make printing books easy. Incidentally, I made our new website using Hugo and set up a Paged.js example on it. Have a look at the site: https://www.pagedjs.org

There is a button at the top right of this page to set up the preview. Once that’s done, you only need to print from the browser, and you’re all set. Browser-side printing as it should be 🙂

http://localhost:1313/posts/2020-02-18-return-from-prague-the-future-of-web-to-print/

I’ll make a reusable version of that. In the meantime, you can see how it’s done here:


(calling the Paged.js script and adding a button for that)

and here: https://gitlab.pagedmedia.org/tools/pagedjs-website/blob/master/themes/pagedjs/assets/js/print.js
(the JS that sets up the print button)

Tell me if that helps!

This is great! I’ve been meaning to pull this into my own site, and this work will be very helpful! 🙏

One thing: the print preview button/link is not showing up on all the pages…

If you’re talking about the Paged.js website, that’s intentional: printing a single page makes sense, but I’m not sure printing list pages does. So there is no bug here 🙂

The links return a 404 for me. The Paged.js site project is here:

Hello @Nader_K_Rad,
Could you explain a little how you specify the “all-content” layout so that the page is generated in the all-content directory?

Thanks in advance !

  1. Create a new directory in the content directory. I’ve called it “all-content”.
  2. Add a simple index.md file to that directory. The content is not important.
  3. Right now, this new directory will use the default layout, which is not suitable for our purpose. Create a directory with the same name in layouts; e.g., /layouts/all-content/. Now you can define a new layout for this page.
  4. Create /layouts/all-content/list.html, with a layout that includes all pages on the website. Use my second reply to the original post above as a starting point and then refine it to match your preferences.
  5. Create a build script that first builds the website using Hugo and then converts /public/all-content/index.html to PDF. Use my first reply above as a starting point and then adjust it to your needs. You may need to install a few things, such as wkhtmltopdf, too.
  6. Refine the rest of the website by excluding that page from the sitemap and RSS feed, etc.
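
The first four steps could be scripted roughly as follows. This is a sketch under assumptions, not the exact setup described above: it writes `_index.md` rather than `index.md` so that Hugo treats the directory as a section and picks up `list.html`, it relies on the `sitemap.disable` front matter setting that recent Hugo versions support, and `scaffold_all_content` is a hypothetical helper name.

```shell
#!/bin/sh
# Scaffold the "all-content" section inside an existing Hugo site directory.
scaffold_all_content() {
  site=$1
  mkdir -p "$site/content/all-content" "$site/layouts/all-content"

  # Section page; the body is not important.
  cat > "$site/content/all-content/_index.md" <<'EOF'
---
title: "All content"
# Assumption: honored by recent Hugo versions to keep the page out of the sitemap.
sitemap:
  disable: true
---
EOF

  # Minimal layout: a heading plus the rendered content of every regular page.
  cat > "$site/layouts/all-content/list.html" <<'EOF'
<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>{{ .Title }}</title></head>
<body>
  {{ range .Site.RegularPages }}
    <h2 id="{{ substr .RelPermalink 1 -1 }}">{{ .Title }}</h2>
    {{ .Content }}
  {{ end }}
</body>
</html>
EOF
}
```

Run it once from the site root with `scaffold_all_content .`, then refine the generated layout by hand.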

Thanks a lot for the very quick response, @Nader_K_Rad! I understand better now. It works very well!

Hello all,

Just to let you know that I have created a small Node library, website2pdf (GitHub: jgazeau/website2pdf), that crawls a website based on the sitemap protocol (which Hugo uses by default) and generates a PDF for each URL.

Maybe it will help you or others with the same aim.
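
To illustrate the sitemap-driven idea in general terms (this is not how website2pdf itself is implemented), the sketch below pulls the page URLs out of a sitemap with standard tools. The `sitemap_urls` helper name is hypothetical.

```shell
#!/bin/sh
# Print every URL listed in a sitemap read from standard input.
# grep -o emits one match per line, so minified single-line sitemaps work too.
sitemap_urls() {
  grep -o '<loc>[^<]*</loc>' | sed -e 's|<loc>||' -e 's|</loc>||'
}

# Example use (not run here): one PDF per URL, numbered in sitemap order.
#   n=0
#   sitemap_urls < public/sitemap.xml | while IFS= read -r url; do
#     n=$((n + 1))
#     wkhtmltopdf "$url" "page-$n.pdf"
#   done
```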
