Parsing BibTeX


I store bibliographic references in .bib files (BibTeX file format) within page bundles.
I use Hugo to concatenate them all into a single file for the whole website (using a template with a custom .bib output format), which I then convert manually to CSL-JSON with pandoc and copy to assets/.
Then with Hugo getJSON I can parse the CSL-JSON file to display proper in-text references and bibliographies across the website.
That works pretty well.

Now what I’m trying to do is automate the conversion of the BibTeX files to CSL-JSON with Hugo.
So I created a custom output format for CSL-JSON and a template (index.csljon.json) which basically ranges over all the pages and retrieves the content of all the .bib files like this:

{{ range site.Pages }}
  {{ with .Resources.GetMatch "*.bib" }}
    {{ $bib := `^@(.*?){(.*?),((.|\n)*?)+(}}|\n})+$` }}
    {{ $csl := `{\"id\":\"${2}\",\"type\":${1},\"content\":[{${3}}]}` }}
    {{ .Content | replaceRE $bib $csl }}
  {{ end }}
{{ end }}

The idea being that, by rendering CSL-JSON before HTML, I should be able to query the CSL-JSON while rendering the HTML.

I’m not really familiar with regex, but I came up with the pattern above, which should more or less do the job (with further processing), at least according to regex101. In Hugo, however, it does not match: the content is rendered as is. Am I missing something?

Here’s the basic structure of a .bib file:

key = {value}

And the desired output:

   "id": "id",
   "type": "type",
   "key": "value"
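
To make the intended mapping concrete, here is a minimal sketch in plain Go (run outside Hugo). It assumes each entry has the flat shape `@type{id, key = {value}}` with the closing brace on its own line and no nested braces in values. The names `toCSL`, `entryRE` and `fieldRE` are made up for illustration, and real CSL-JSON would additionally need BibTeX types and fields mapped to their CSL equivalents.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

var (
	// One entry: @type{id, ...fields... \n}  ((?s) lets . match newlines).
	// Assumption: the closing brace of an entry sits on its own line.
	entryRE = regexp.MustCompile(`(?s)@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}`)
	// One field: key = {value}  (assumption: no nested braces in the value).
	fieldRE = regexp.MustCompile(`(\w+)\s*=\s*\{([^{}]*)\}`)
)

// toCSL turns the simplified BibTeX text into a JSON array of flat objects,
// one per entry, with "id", "type" and every key/value pair found.
func toCSL(bib string) ([]byte, error) {
	items := []map[string]string{}
	for _, m := range entryRE.FindAllStringSubmatch(bib, -1) {
		item := map[string]string{"id": m[2], "type": m[1]}
		for _, f := range fieldRE.FindAllStringSubmatch(m[3], -1) {
			item[f[1]] = f[2]
		}
		items = append(items, item)
	}
	return json.Marshal(items)
}

func main() {
	out, err := toCSL("@type{id,\n  key = {value}\n}\n")
	if err != nil {
		panic(err)
	}
	// json.Marshal sorts map keys alphabetically.
	fmt.Println(string(out)) // [{"id":"id","key":"value","type":"type"}]
}
```

Splitting the work into an entry pattern and a field pattern keeps each regex small, which is easier to debug than one pattern that tries to capture everything at once.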

Any help is much appreciated.

Sorry, don’t get me wrong, but a static site generator is not meant to convert between content types.

The easiest thing you can do is run a converter on your content before you use it in Hugo.

Like this one (just googled, did not try to understand what both formats mean):

Replace this:

{{ $bib := `^@(.*?){(.*?),((.|\n)*?)+(}}|\n})+$` }}
{{ $csl := `{\"id\":\"${2}\",\"type\":${1},\"content\":[{${3}}]}` }}
{{ .Content | replaceRE $bib $csl }}

With this:

{{ $bib := `@(.*?){(.*?),((.|\n)*?)+(}}|\n})+` }}
{{ $csl := `{\"id\":\"${2}\",\"type\":${1},\"content\":[{${3}}]}` }}
{{ .Content | replaceRE $bib $csl | safeHTML }}

And then you can start troubleshooting your regex values, which is “off topic” for this forum.
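
For anyone who wants to check the pattern outside Hugo: `replaceRE` is built on Go's `regexp` package, so a few lines of Go reproduce the behaviour. The sketch below is only an illustration. The sample entry and the variable names are made up, and the `(?m)` variant is an alternative not mentioned above. In Go, `^` and `$` without the `m` flag match only at the very start and very end of the text, so a trailing newline after the last `}` is enough to make the anchored pattern fail.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical sample entry; the trailing newline mimics a typical file.
const sample = "@book{doe2020,\n  title = {Example}\n}\n"

var (
	// Pattern from the question: anchored to start and end of the whole text.
	anchored = regexp.MustCompile(`^@(.*?){(.*?),((.|\n)*?)+(}}|\n})+$`)
	// Pattern from the reply: same expression without the anchors.
	unanchored = regexp.MustCompile(`@(.*?){(.*?),((.|\n)*?)+(}}|\n})+`)
	// Alternative: keep the anchors but make them line-based with (?m).
	multiline = regexp.MustCompile(`(?m)^@(.*?){(.*?),((.|\n)*?)+(}}|\n})+$`)
)

func main() {
	fmt.Println(anchored.MatchString(sample))   // false: "$" needs end of text
	fmt.Println(unanchored.MatchString(sample)) // true
	fmt.Println(multiline.MatchString(sample))  // true: "$" matches at end of line

	// The first two capture groups hold the entry type and the citation key.
	m := unanchored.FindStringSubmatch(sample)
	fmt.Println(m[1], m[2]) // book doe2020
}
```

The third capture group sits inside a repeated group, so what it ends up holding is harder to predict. It is worth printing it on real data before relying on `${3}` in the replacement.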
