Asciidoc + Hugo performance

pgr · February 19, 2018, 7:33pm

We opted for Asciidoc over Markdown in our new documentation site. There are many good reasons to go with Asciidoc, mostly human-readability and extensibility (without ending up cramming HTML directly into your content). It is basically what people came up with after they saw the problems with many markdown sites.

However, as the content grew we soon started to have a performance problem, which my friend @dirtyf alerted me to. We’re basically killing Hugo’s spectacular performance with Asciidoc which is much slower than markdown, because it calls out to a slow Ruby executable.

I found some posts on the Internet about this, which basically say Asciidoc needs a faster implementation (possibly in Go). There are open-source projects attempting it, but they are very old, stale, and incomplete. It is a daunting task.

My question here is this: is there any architectural mitigation possible, in Hugo, to help with this? For example, is Hugo calling the Asciidoctor engine once per file, and would it be faster to somehow call it only once for everything? Is that even possible?

Or any other idea? I’m really trying to avoid converting everything back to markdown… thanks!

bep · February 19, 2018, 7:37pm

The real solution is a native Asciidoc (Golang) processor. But that is a big project in itself. There have been some efforts to improve the current solution, but with marginal gains and too much added complexity.

Asciidoc in Hugo is only recommended if you can live with the performance drawbacks.

pgr · February 20, 2018, 2:25pm

Thanks for the clarifications. I’m reaching out to the authors of existing projects to see if one of them is worth picking up and following through.

In my Hugo build I get these statistics:

Started building sites ...

Built site for language en:
0 draft content
0 future content
0 expired content
93 regular pages created
55 other pages created
3 non-page files copied
0 paginator pages created
1 tags created
0 categories created

Built site for language fr:
0 draft content
0 future content
0 expired content
0 regular pages created
6 other pages created
0 non-page files copied
0 paginator pages created
0 tags created
0 categories created
total in 19269 ms

I don’t know the difference between “regular pages” and “other pages”, but in total I seem to have 93+55+6=154 pages.

This gives an average page processing time of 125 ms.

However, when the server is running in Fast-render mode, and I change a single .adoc page, the server detects the change and I get a rebuild time of 2076 ms for that single page.

@bep does this overhead seem reasonable to you, because of other things the server has to do in fast-render mode? Or is that also an Asciidoc delay? If so, why so much bigger than the average 125 ms?
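To put numbers on that comparison, here is the arithmetic as a small Python sketch. The figures are copied from the build output and the rebuild time above; the variable names are mine, and the average assumes every page costs about the same, which is probably not true.

```python
# Page counts and timings copied from the Hugo output quoted above.
regular_en, other_en, other_fr = 93, 55, 6
total_ms = 19269          # full build
single_rebuild_ms = 2076  # one changed .adoc page in fast-render mode

pages = regular_en + other_en + other_fr
avg_ms = total_ms / pages
ratio = single_rebuild_ms / avg_ms

print(pages)            # 154
print(round(avg_ms))    # 125
print(round(ratio, 1))  # 16.6
```

So the single-page rebuild costs roughly as much as 16 to 17 "average" pages of the full build.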

I have never used Hugo with Markdown, so I don’t have a feel for evaluating and comparing these delays…

BTW, this is the repo and this is the rendered site, in case it helps.

bep · February 20, 2018, 3:04pm

It is Asciidoc. But I cannot spend any time debugging this. This is a known limitation.

pgr · February 20, 2018, 3:33pm

Ok, no problem, it’s quite understandable, and I can only thank you for the many things you already do for Hugo.

I am 95% sure I won’t embark on the Asciidoc engine rewrite myself, but I will consider just one possibility:

- if this project can be run with Hugo
- if it has a minimum support for most Asciidoc features I’m using in my files
- if I can somehow work with the two engines simultaneously, so that I can have the quick Asciidoc engine for the files it can handle, and the slow one when I need some fancy feature

… then I might try slowly adding support for more Asciidoc features, as needed by my project, and hope that others end up contributing also.

pgr · February 23, 2018, 4:55pm

@bep I am weighing our options for helping out with a new Asciidoc Go engine.

I read about “external helpers”, and I read some of the Hugo code where it selects the engine based on file extension.

When looking at the code in this existing Go asciidoc project, I see it relies heavily on regexp processing, which I found strange because my intuition tells me that’s not the way to write a parser if you need it to be very efficient. It’s using a generic parser to write a specific parser…

I checked the BlackFriday code that Hugo uses, which I know for sure is efficient, and I see it uses a rather different approach. There is some regexp use, but most of it is just for loops going through strings.
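To make the contrast concrete, here is a rough sketch of the two styles for a single element, a section title line such as `== Title`. It is Python for brevity, not Go, and neither function is taken from BlackFriday or from the existing Go project; the names are made up for illustration. It says nothing about speed in Python, it only shows the shape of the code.

```python
import re

# Regexp style: one pattern per language element.
HEADING_RE = re.compile(r"^(={1,6}) +(\S.*)$")

def heading_regexp(line):
    """Return (number of '=' markers, title), or None if not a title line."""
    m = HEADING_RE.match(line)
    if m is None:
        return None
    return len(m.group(1)), m.group(2).rstrip()

# Loop style: walk the characters by hand, as a hand-written parser would.
def heading_loop(line):
    """Same result as heading_regexp for ordinary title lines."""
    i = 0
    while i < len(line) and line[i] == "=":
        i += 1                      # count leading '=' markers
    if i == 0 or i > 6:
        return None                 # no markers, or too many
    if i >= len(line) or line[i] != " ":
        return None                 # markers must be followed by a space
    title = line[i:].strip()
    if not title:
        return None                 # markers with no title text
    return i, title

print(heading_regexp("== Installing"))  # (2, 'Installing')
print(heading_loop("== Installing"))    # (2, 'Installing')
print(heading_loop("plain text"))       # None
```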

So I’m starting to wonder: what if, instead of developing that asciidoctor engine, we could approach this task as a kind of “fork” from BlackFriday, editing the way it parses each language element, and the way it renders it in HTML?

So…

1. What do you think about this approach I’m suggesting, does it look promising to you? Better than developing from that other project?
2. Do you see a future asciidoc Go engine as something that goes inside Hugo, like BlackFriday currently, or do you see it as an external helper?

Thanks in advance for your (always precious) insights.

bep · February 23, 2018, 5:01pm

I think you greatly underestimate the work required to get native Asciidoc.

I have briefly looked into it, but this has to either

- be backed by a company with lots of available developer resources (similar to how Google does it with many open source projects, Go, Kubernetes etc.), or
- find a great developer with WAY too much spare time on his/her hands.

Let us say that this is 4-5 times the amount of work compared to Blackfriday. Totally guessing here, but you cannot start with a talk about using regexps or not.

Many companies and developers out there really, really want this. If it was easy to implement, it would have happened a long time ago.

pgr · February 23, 2018, 5:49pm

I have some notion that it’s a big task, although I didn’t have a notion of it being possibly 4-5 times more complicated than BlackFriday.

But, to be honest, I wasn’t thinking of finishing it, I was thinking of starting it.

Basically now, in our site, we use:

- text!
- text emphasis (bold, etc)
- links
- headings
- images

And only the very simplest versions of these:

- tables
- numbered lists

So all I am saying is that if we can get this started, we could have a very partial implementation of Asciidoc, and use it partially in our sites, while keeping the Ruby Asciidoc engine for the heavy lifting of a handful of more complex pages.

We could use a different file extension to choose from the two engines, like .adocish for our own internal very partial Asciidoc engine.
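Roughly what I have in mind, as a Python sketch. The engine labels and the `.adocish` extension are hypothetical, and this is not how Hugo's own extension lookup is written; it only shows the idea of routing files to one engine or the other.

```python
from pathlib import Path

# Hypothetical mapping from file extension to rendering engine.
ENGINE_BY_EXT = {
    ".md": "blackfriday",
    ".adoc": "asciidoctor (external Ruby helper)",
    ".adocish": "partial native engine",
}

def pick_engine(filename):
    """Choose an engine label from the file extension, case-insensitively."""
    ext = Path(filename).suffix.lower()
    return ENGINE_BY_EXT.get(ext, "unknown")

print(pick_engine("content/intro.adocish"))  # partial native engine
print(pick_engine("content/tables.adoc"))    # asciidoctor (external Ruby helper)
```

A page that outgrows the partial engine would just be renamed from `.adocish` to `.adoc`.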

The point of all this would be to break the ice, to provide a way for people to then add features bit by bit: “ok, now I need to use nested lists, I can add that feature and move on”. This could unblock the issue, and instead of waiting indefinitely for 1 big company to come in and do 100 things for this, we’d have 100 people doing 1 thing each, progressively.

But I know I might be dreaming so don’t feel you need to answer this, I am content with your current appraisal which, I have to say, sounds realistic. Thanks for reading this far - see you on other threads!

bep · February 23, 2018, 6:09pm

I’m just guessing here – but the Asciidoc spec (if there even is a spec) seems … big.

So, if you only need the most basic features, why not write a script that converts to Markdown and then sit and wait … It will happen (I hope and think). Writing a script to convert back to Asciidoc should not be too hard … methinks.
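A minimal sketch of such a script, in Python, assuming only two constructs matter: section titles (`== Title`) and URL links with text (`https://…[text]`). The function name is made up, and everything else (lists, tables, emphasis, blocks) is passed through untouched, so treat it as a starting point only.

```python
import re

# "== Title" style section titles: one or more '=' followed by spaces.
HEADING = re.compile(r"^(=+) +(.*)$")
# URL immediately followed by [link text].
LINK = re.compile(r"(https?://[^\s\[\]]+)\[([^\]]+)\]")

def adoc_to_md(text):
    """Convert headings and URL links; leave every other line as-is."""
    out = []
    for line in text.splitlines():
        m = HEADING.match(line)
        if m:
            # Same number of '#' as there were '=' markers.
            line = "#" * len(m.group(1)) + " " + m.group(2)
        line = LINK.sub(r"[\2](\1)", line)
        out.append(line)
    return "\n".join(out)

print(adoc_to_md("== Setup\nSee https://gohugo.io[Hugo] docs."))
# ## Setup
# See [Hugo](https://gohugo.io) docs.
```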

pgr · February 23, 2018, 6:23pm

Sure, converting from one to the other is relatively easy, and pandoc takes care of 95% of the effort.

I am resisting going that route (though eventually it’s likely to happen) because I want my users to collaborate on the site. We have a big community (80,000+) willing to participate, and I wanted to give them one single kind of mark-up, the best there is.

Sigh…

bep · February 23, 2018, 6:27pm

That is a great number. Yes, I understand. Writing a bare-bones Asciidoc processor should be doable, but don’t expect me to do it … But if you can get something good up and running, I would be happy to add it to Hugo in some way (add some other extension, maybe).

bep · February 23, 2018, 6:29pm

… and the smart way of doing this is to reuse Blackfriday’s HTML renderer.

stiobhart · December 23, 2022, 8:07pm

I know this thread is prehistoric. But I just came across it while searching myself to see if any progress is being made on this. And I felt I had to correct something the OP said, just for the record:

“We’re basically killing Hugo’s spectacular performance with Asciidoc which is much slower than markdown, because it calls out to a slow Ruby executable.”

Asciidoctor is not the bottleneck here. I wrote a piece myself a while back, when I converted my site from Hugo + Markdown to Hugo + AsciiDoc and suffered similar slowdowns:

As part of that process, I ended up having to run AsciiDoctor ‘standalone’ on my site because Hugo was refusing to build it. And I found that AsciiDoctor was easily capable of converting a couple of hundred pages in a few seconds. Similar in speed to what I was previously seeing with Hugo’s build times and a Markdown site.

So, in my experience, the problem is not with Hugo alone [which we know can build sites lightning fast] nor with AsciiDoctor alone [which, in my test was equally nippy] but lies somewhere in the communication between Hugo and AsciiDoctor. There’s something in there that is slowing everything to a crawl.

Unfortunately, 5 years on from this original thread, I’m still finding that Hugo takes an age to build my AsciiDoc site. There doesn’t seem to have been any progress made at all on this. Which may be because the finger has [erroneously in my opinion] been pointed at AsciiDoctor as the culprit and so filed under ‘Someone else’s problem’ by the Hugo dev team.

Which, I suppose, does justify this bit of thread archaeology!

hybras · September 10, 2023, 8:21pm

It seems the overhead is due to there being one asciidoctor process spawned per file. Running asciidoctor in batch mode is fast.
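A rough sketch of the difference, in Python. It only builds the argument lists and does not run anything; the file names are made up, and I am not claiming this is exactly how Hugo invokes asciidoctor internally. The point is the number of processes: each list would be one spawn (for example via `subprocess.run`), and each spawn pays the Ruby startup cost again.

```python
def per_file_commands(files):
    """One asciidoctor process per file: len(files) spawns."""
    return [["asciidoctor", "-o", "-", f] for f in files]

def batch_command(files):
    """All files handed to a single asciidoctor process: one spawn."""
    return [["asciidoctor", "-o", "-", *files]]

# Hypothetical file names, for illustration only.
files = ["intro.adoc", "install.adoc", "faq.adoc"]

print(len(per_file_commands(files)))  # 3
print(len(batch_command(files)))      # 1
print(batch_command(files)[0])
# ['asciidoctor', '-o', '-', 'intro.adoc', 'install.adoc', 'faq.adoc']
```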

My benchmarking attempts, with:

- hyperfine (benchmarking tool)
- laptop: 2020 ARM M1 MacBook Air, 8 GB RAM, macOS Ventura 13.5.1
- hugo version: v0.111.3+extended

Site summary:

| type | count |
| --- | --- |
| Pages | 121 |
| Non-page files | 29 |
| Static files | 9 |

Bench results:

| bench name | command | time (sec) |
| --- | --- | --- |
| stock | `hugo` | 8.9 |
| shim | `alias asciidoctor=cat` | 0.7 |
| standalone | `fd -e adoc -x asciidoctor -o -` | 6.2 |
| batch | `fd -e adoc -X asciidoctor -o -` | 0.38 |
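For what it is worth, the arithmetic on those numbers as a small Python sketch. The times are copied from the bench results; the variable names are mine, and reading "stock minus shim" as the cost of the asciidoctor calls is my interpretation, not a measured breakdown.

```python
# Times in seconds, copied from the bench results above.
stock, shim, standalone, batch = 8.9, 0.7, 6.2, 0.38

# Build time that disappears when asciidoctor is replaced by a no-op.
asciidoctor_cost = stock - shim
# How much faster one batch process is than one process per file.
batch_speedup = standalone / batch

print(round(asciidoctor_cost, 1))  # 8.2
print(round(batch_speedup, 1))     # 16.3
```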
