Using furigana/ruby with Markdown


#1

First, a little context! In many East Asian languages, a technique is used to make it easier to read characters that people might not recognise. For example, the Japanese word for tea is pronounced “ocha”. It contains a kanji character and, to help people learn, the phonetic reading is written above it.


This is written in HTML like so:

お
<ruby>
    茶 <rp>(</rp><rt>ちゃ</rt><rp>)</rp>
</ruby>

You can also use it to show a ‘romanised’ pronunciation, such as for my name:

<ruby>
    トー<rp>(</rp><rt>tō</rt><rp>)</rp>
    マ <rp>(</rp><rt>ma</rt><rp>)</rp>
    ス <rp>(</rp><rt>su</rt><rp>)</rp>
</ruby>

There is no official way to do this in Markdown, as far as I know.

djfun/furigana_markdown, a Python Markdown extension, uses this:

[a](-b)

Which becomes:

<ruby><rb>a</rb><rp>(</rp><rt>b</rt><rp>)</rp></ruby>
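To make that mapping concrete, here is a minimal Python sketch of the conversion. It is not the code from furigana_markdown; the regular expression and the function name `ruby_from_link_syntax` are my own assumptions, and it only reproduces the input and output shown above.

```python
import html
import re

# Hypothetical pattern for the [base](-reading) form described above.
# Ordinary Markdown links are left alone because they lack the leading "-".
RUBY_LINK = re.compile(r"\[([^\]]+)\]\(-([^)]+)\)")


def ruby_from_link_syntax(text):
    """Replace every [base](-reading) in text with a <ruby> element."""
    def build(match):
        # Escape both parts so stray "<" or "&" cannot break the markup.
        base = html.escape(match.group(1))
        reading = html.escape(match.group(2))
        return f"<ruby><rb>{base}</rb><rp>(</rp><rt>{reading}</rt><rp>)</rp></ruby>"

    return RUBY_LINK.sub(build, text)


print(ruby_from_link_syntax("お[茶](-ちゃ)"))
```

A real extension would hook into the Markdown parser rather than run a regex over the raw text, so treat this only as an illustration of the syntax.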

amclees/furigana-markdown, for Discourse, uses this:

[世界]^(せかい)
[世界]{せかい}
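For comparison, a rough Python sketch that accepts both of these forms. This is not how the Discourse plugin is implemented, and I have not checked what HTML it actually emits; the pattern, the function name `ruby_from_bracket_syntax`, and the output shape (copied from the hand-written HTML earlier in this post) are assumptions.

```python
import html
import re

# Hypothetical pattern: [base] followed by either ^(reading) or {reading}.
FURIGANA = re.compile(r"\[([^\]]+)\](?:\^\(([^)]+)\)|\{([^}]+)\})")


def ruby_from_bracket_syntax(text):
    """Replace [base]^(reading) and [base]{reading} with <ruby> elements."""
    def build(match):
        base = html.escape(match.group(1))
        # Exactly one of groups 2 and 3 is set, depending on which form matched.
        reading = html.escape(match.group(2) or match.group(3))
        return f"<ruby>{base}<rp>(</rp><rt>{reading}</rt><rp>)</rp></ruby>"

    return FURIGANA.sub(build, text)


print(ruby_from_bracket_syntax("[世界]^(せかい)"))
print(ruby_from_bracket_syntax("[世界]{せかい}"))
```

Both calls produce the same markup, which is the point: the two notations differ only in how the reading is delimited.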

So it seems there isn’t an established way of doing it, but it would be great to see whether some sort of support for this is possible.


#2

Do you happen to know how HTML editors use ruby in those languages? Is the formatting prescribed at https://github.com/amclees/furigana-markdown a standard of a sort? :slight_smile:


#3

I just tested this in a normal Japanese Markdown page in Hugo, and it renders fine as is. I pasted in <ruby>茶<rp>(</rp><rt>ちゃ</rt><rp>)</rp></ruby> inline, like a span, to test, and it works fine mixed in with the text. It did not appear to disrupt the Markdown processing.

Edit: one thing that does not look good is that when you mix a kanji character like this into English text, the line height is affected and the page looks kind of messy. Adding ruby makes it worse. On an all-Japanese page, it’s probably not as much of a problem.


#4

No, I don’t actually! Mozilla has a few references, but it would be interesting to know what others do.


#5

Yeah, it has its downsides when you’re making a page that’s in English, such as when you’re teaching Japanese, but I guess that’s fairly niche and can be worked around. When I’ve used it, I tend to separate them into blocks rather than doing it in-line. If you’re teaching kanji on a Japanese page then I guess it’s probably more reasonable.

Good point that it works well with just the HTML! I think I’d just assumed it would be messy without even trying, but it’s not so bad looking at it.

ありがとう (thank you)


#6

Well, if you’re making a page that needs ruby then I imagine you can format it how you need to, either way.

どういたしまして! (You’re welcome!)