Asciidoc + Hugo performance

We opted for Asciidoc over Markdown in our new documentation site. There are many good reasons to go with Asciidoc, mostly human-readability and extensibility (without ending up cramming HTML directly into your content). It is basically what people came up with after they saw the problems with many markdown sites.

However, as the content grew we soon ran into a performance problem, which my friend @dirtyf alerted me to. We’re basically killing Hugo’s spectacular performance with Asciidoc, which is much slower than Markdown because it calls out to a slow Ruby executable.

I found some posts on the Internet about this, which basically say Asciidoc needs a faster implementation (possibly in Go). There are open-source projects attempting it, but they are very old, stale, and incomplete. It is a daunting task.

My question here is this: is there any architectural mitigation possible, in Hugo, to help with this? For example, is Hugo calling the Asciidoctor engine once per file, and would it be faster to somehow call it only once for everything? Is that even possible?

Or any other idea? I’m really trying to avoid converting everything back to markdown… thanks!
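
To make the once-per-file question concrete, here is a rough cost model. It is only a sketch under a stated assumption: every call to the external converter pays a fixed process start-up cost on top of the actual rendering time. The function names and the numbers in the usage note are made up for illustration and are not measurements of Hugo or Asciidoctor.

```python
def per_file_cost_ms(pages: int, startup_ms: float, render_ms: float) -> float:
    """Total time when a new converter process is started for every page."""
    return pages * (startup_ms + render_ms)


def batched_cost_ms(pages: int, startup_ms: float, render_ms: float) -> float:
    """Total time when a single converter process renders every page."""
    return startup_ms + pages * render_ms


def saving_ms(pages: int, startup_ms: float, render_ms: float) -> float:
    """Time saved by batching: one start-up is still paid, the rest are avoided."""
    return per_file_cost_ms(pages, startup_ms, render_ms) - batched_cost_ms(
        pages, startup_ms, render_ms
    )
```

With hypothetical values of 10 pages, 100 ms start-up and 25 ms rendering, the per-file total is 1250 ms and the batched total is 350 ms. Whether batching is worth it in practice depends on how large the start-up share really is, which this model does not tell you.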

The real solution is a native Asciidoc (Golang) processor. But that is a big project in itself. There have been some efforts to improve the current solution, but with marginal gains and too much added complexity.

Asciidoc with Hugo is only recommended if you can live with the performance drawbacks.

Thanks for the clarifications. I’m reaching out to the authors of existing projects to see if one of them is worth picking up and following through.

In my Hugo build I get these statistics:

Started building sites ...

Built site for language en:
0 draft content
0 future content
0 expired content
93 regular pages created
55 other pages created
3 non-page files copied
0 paginator pages created
1 tags created
0 categories created

Built site for language fr:
0 draft content
0 future content
0 expired content
0 regular pages created
6 other pages created
0 non-page files copied
0 paginator pages created
0 tags created
0 categories created
total in 19269 ms

I don’t know the difference between “regular pages” and “other pages”, but in total I seem to have 93+55+6=154 pages.

This gives an average page processing time of 125 ms.

However, when the server is running in Fast-render mode, and I change a single .adoc page, the server detects the change and I get a rebuild time of 2076 ms for that single page.
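
The arithmetic behind these figures can be checked with a few lines. This is just a sketch using the numbers from the build output above; it assumes the whole build time is spread evenly over all pages, which is a simplification.

```python
def average_page_ms(total_ms: float, pages: int) -> float:
    """Average build time per page, assuming time is spread evenly."""
    return total_ms / pages


# Page counts taken from the build statistics above.
pages_en = 93 + 55   # regular + other pages, language en
pages_fr = 0 + 6     # regular + other pages, language fr
total_pages = pages_en + pages_fr

avg_ms = average_page_ms(19269, total_pages)

# Single-page rebuild time reported by the server in fast-render mode.
rebuild_ms = 2076
ratio = rebuild_ms / avg_ms

print(f"{total_pages} pages, {avg_ms:.1f} ms per page on average")
print(f"single-page rebuild is about {ratio:.1f}x the average")
```

So the single-page rebuild takes roughly 16 to 17 times the per-page average of the full build.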

@bep does this overhead seem reasonable to you, because of other things the server has to do in fast-render mode? Or is that also an Asciidoc delay? If so, why so much bigger than the average 125 ms?

I never had the experience of using Hugo with Markdown so I don’t have the sensitivity for evaluating and comparing these delays…

BTW, this is the repo and this is the rendered site, in case it helps.

It is Asciidoc. But I cannot spend any time debugging this. This is a known limitation.

Ok, no problem, it’s quite understandable and I can only thank you for the many things you already do for Hugo.

I am 95% sure I won’t embark on the Asciidoc engine rewrite myself, but I will consider just one possibility:

  • if this project can be run with Hugo
  • if it has a minimum support for most Asciidoc features I’m using in my files
  • if I can somehow work with the two engines simultaneously, so that I can have the quick Asciidoc engine for the files it can handle, and the slow one when I need some fancy feature

then I might try slowly adding support for more Asciidoc features, as needed by my project, and hope that others end up contributing also.

@bep I am weighing our possibilities of helping out with a new asciidoc Go engine.

I read about “external helpers”, and I read some of the Hugo code where it selects the engine based on file extension.

When looking at the code in this existing Go asciidoc project, I see it relies heavily on regexp processing, which I found strange because my intuition tells me that’s not the way to write a parser if you need it to be very efficient. It’s using a generic parser to write a specific parser…

I checked the BlackFriday code that Hugo uses, which I know for sure is efficient, and I see it uses a rather different approach. There is some regexp use, but most of it is just for loops going through strings.
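
To illustrate the difference between the two styles, here is a minimal sketch that handles one inline element, bold text written as `*text*`, once with a regexp and once with a plain loop over the string. It is written in Python only to keep it short; it is not taken from BlackFriday or from the other project, the function names are hypothetical, and the bold rule is a simplification of what real Asciidoc does. It shows the structure of the two approaches, not their speed, which would have to be measured in Go.

```python
import re

# Simplified rule: a bold span is *...* with at least one character inside.
BOLD_RE = re.compile(r"\*([^*\n]+)\*")


def bold_with_regexp(line: str) -> str:
    """Regexp style: let the regexp engine find and replace every span."""
    return BOLD_RE.sub(r"<strong>\1</strong>", line)


def bold_with_scanner(line: str) -> str:
    """Loop style: one left-to-right pass over the characters of the line."""
    out = []
    i = 0
    n = len(line)
    while i < n:
        if line[i] == "*":
            close = line.find("*", i + 1)
            if close > i + 1:  # closing marker found and span is not empty
                out.append("<strong>" + line[i + 1:close] + "</strong>")
                i = close + 1
                continue
        # Ordinary character, or a "*" that does not open a valid span.
        out.append(line[i])
        i += 1
    return "".join(out)
```

Both functions give the same result on simple inputs such as `"a *b* c"`. The loop version is longer, but every decision it makes is explicit, which is roughly the trade-off between the two code bases described above.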

So I’m starting to wonder: what if, instead of developing that asciidoctor engine, we could approach this task as a kind of “fork” from BlackFriday, editing the way it parses each language element, and the way it renders it in HTML?

So…

  1. What do you think about this approach I’m suggesting, does it look promising to you? Better than developing from that other project?
  2. Do you see a future asciidoc Go engine as something that goes inside Hugo, like BlackFriday currently, or do you see it as an external helper?

Thanks in advance for your (always precious) insights.

I think you greatly underestimate the work required to get native Asciidoc.

I have briefly looked into it, but this has to either

  1. Be backed by a company with lots of available developer resources (similar to how Google does it with many open source projects, Go, Kubernetes etc.)
  2. Find a great developer with WAY too much spare time on his/her hands.

Let us say that this is 4-5 times the amount of work compared to Blackfriday. Totally guessing here, but you cannot start with a talk about using regexps or not.

Many companies and developers out there really, really want this. If it were easy to implement, it would have happened a long time ago.

I have some notion that it’s a big task, although I didn’t have a notion of it being possibly 4-5 times more complicated than BlackFriday :frowning:

But, to be honest, I wasn’t thinking of finishing it, I was thinking of starting it :slight_smile:

Basically now, in our site, we use:

  • text!
  • text emphasis (bold, etc)
  • links
  • headings
  • images

And only the very simplest versions of these:

  • tables
  • numbered lists

So all I am saying is that if we can get this started, we could have a very partial implementation of Asciidoc, and use it partially in our sites, while keeping the Ruby Asciidoc engine for the heavy-lifting of a handful of more complex pages.

We could use a different file extension to choose from the two engines, like .adocish for our own internal very partial Asciidoc engine.
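
The extension-based choice could look something like the sketch below. It is not Hugo’s actual code: the engine labels, the `.adocish` extension and the function name are all hypothetical, and a real integration would call the engines instead of returning a label.

```python
import os

# Hypothetical mapping from file extension to the engine that should render it.
ENGINE_BY_EXTENSION = {
    ".adocish": "native-partial",  # fast engine, supports only a few features
    ".adoc": "asciidoctor",        # slow external Ruby engine, full feature set
    ".md": "blackfriday",          # Markdown engine
}


def pick_engine(filename: str) -> str:
    """Return the engine label for a content file, based only on its extension."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return ENGINE_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"no engine registered for extension {ext!r}")
```

With this mapping, `pick_engine("docs/intro.adocish")` selects the partial engine and `pick_engine("docs/tables.adoc")` keeps using the Ruby one, so a page can be moved between engines just by renaming the file.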

The point of all this would be to break the ice, to provide a way for people to then add features bit by bit: “ok, now I need to use nested lists, I can add that feature and move on”. This could unblock the issue, and instead of waiting indefinitely for 1 big company to come in and do 100 things for this, we’d have 100 people doing 1 thing each, progressively.

But I know I might be dreaming :crazy_face: so don’t feel you need to answer this, I am content with your current appraisal which, I have to say, sounds realistic. Thanks for reading this far - see you on other threads!

I’m just guessing here – but the Asciidoc spec (if there even is a spec) seems … big.

So, if you only need the most basic features, why not write a script that converts to Markdown and then sit and wait … It will happen (I hope and think). Writing a script to convert back to Asciidoc should not be too hard … Methinks.

Sure, converting from one to the other is relatively easy, and pandoc takes care of 95% of the effort.

I am resisting going that route (though eventually it’s likely to happen) just because I want my users to collaborate on the site. We have a big community (80,000+) willing to participate, and I wanted to give them one single kind of mark-up, and I wanted to give them the best there is.

Sigh… :neutral_face:

That is a great number. Yes, I understand. Writing a bare-bones Asciidoc processor should be doable, but don’t expect me to do it … But if you can get something good up and running, I would be happy to add it to Hugo in some way (add some other extension, maybe).

… and the smart way of doing this is to reuse Blackfriday’s HTML renderer.
