Regular expressions: (sub)matching groups not supported?

Hi,

it looks like the findRE function returns only whole regex matches, not capture groups.
I don’t see any reference to groups in the docs.

What I mean for example:

findRE "%%(.*?)%%" "%%customerNumber%%"

will give me:

[%%customerNumber%%]

I want only the group match(es), i.e. the string customerNumber, as shown here

Aside from processing findRE output with some other function like substr, am I missing something?

Thanks
S.

Have you seen the RE2 docs (which is the syntax used by Go’s regexp package)? Perhaps the examples there may be helpful to you.

Hi and thanks for the reply.
The syntax is OK; it is similar to other languages. The problem is that the findRE function in Hugo does not return capturing groups (named or unnamed).

I had a look at the Hugo code for the regexp functions and there isn’t any reference to the Go function FindAllStringSubmatch, which looks like what I need.
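For reference, here is a minimal Go sketch of the difference I mean, outside of Hugo. The names placeholderRE and captureGroups are made up for illustration; FindAllString and FindAllStringSubmatch are the standard regexp package methods.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRE matches a %%name%% placeholder; group 1 is the name.
// (Hypothetical name, used only for this illustration.)
var placeholderRE = regexp.MustCompile(`%%(.*?)%%`)

// captureGroups returns only the first capture group of every match in s.
func captureGroups(s string) []string {
	var names []string
	for _, m := range placeholderRE.FindAllStringSubmatch(s, -1) {
		// m[0] is the whole match, m[1] is the first capture group.
		names = append(names, m[1])
	}
	return names
}

func main() {
	// Whole matches only, which is what findRE gives me today.
	fmt.Println(placeholderRE.FindAllString("%%customerNumber%%", -1)) // [%%customerNumber%%]
	// Capture groups only, which is what I am after.
	fmt.Println(captureGroups("%%customerNumber%%")) // [customerNumber]
}
```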

I guess I have two choices:

  • write some longer code with the available functions; I am thinking of removing the “%%” from each match using substr or something similar;
  • learn Go and make a pull request, hoping to see it live soon. Maybe it would be good to have a third parameter on the findRE function.

Ah, I see your dilemma now.

I propose a 3rd option. I don’t know all your requirements, but going off your example, try this:

{{ findRE "[^%%].*[^%%]" "%%customerNumber%%" }}

Which returns:

[customerNumber]

Yes, that works if the string is exactly that one “%%customerNumber%%”. But, my fault, I haven’t properly explained my use case.

Imagine you have strings like this one:

There are %%customersNumber%% customers in %%storesNumber%% stores.

and you want to extract “customersNumber” and “storesNumber”. That is, you want to extract whatever sits between two %% delimiters, in a non-greedy way.
In other contexts I’ve always used capture groups (submatches) for this, but as far as I understand, there’s no support for that in Hugo’s templating functions.

As a workaround, for now I clean each match with replace $match "%%" "" to remove the “%%” delimiters:

{{ range $match := findRE "%%(.*?)%%" $theString }} 
    {{ $cleanMatch := replace $match "%%" "" }}
    <!-- some other stuff down here -->
{{ end }}
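
The same workaround, sketched in plain Go for anyone who wants to check the behaviour outside a template. This is only an approximation of what the template does, not Hugo’s implementation; delimitedRE and cleanMatches are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// delimitedRE matches anything between two %% delimiters, non-greedily.
var delimitedRE = regexp.MustCompile(`%%(.*?)%%`)

// cleanMatches mimics the template workaround: take the whole matches,
// then strip the %% delimiters from each one.
func cleanMatches(s string) []string {
	var cleaned []string
	for _, match := range delimitedRE.FindAllString(s, -1) {
		cleaned = append(cleaned, strings.ReplaceAll(match, "%%", ""))
	}
	return cleaned
}

func main() {
	s := "There are %%customersNumber%% customers in %%storesNumber%% stores."
	fmt.Println(cleanMatches(s)) // [customersNumber storesNumber]
}
```

Note that this only works because the text between the delimiters never contains “%%” itself.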

I’m looking for this feature too. It would be useful for collecting information from HTML, such as the headings in TableOfContents:

[
{
  "anchor": "#foo",
  "heading": "Foo"
},
{
  "anchor": "#bar",
  "heading": "Bar"
}
]
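
To show what I have in mind, here is a rough Go sketch that builds that structure with two capture groups. It assumes the table of contents is rendered as plain <a href="#id">Heading</a> links, which may not match the real markup exactly; tocEntry, linkRE and tocEntries are names I made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// tocEntry is a hypothetical shape for one heading in the table of contents.
type tocEntry struct {
	Anchor  string `json:"anchor"`
	Heading string `json:"heading"`
}

// linkRE captures the fragment (group 1) and the link text (group 2).
// Assumption: links look like <a href="#foo">Foo</a> with no nested tags.
var linkRE = regexp.MustCompile(`<a href="(#[^"]+)">([^<]+)</a>`)

// tocEntries collects one entry per link found in the given HTML.
func tocEntries(html string) []tocEntry {
	var entries []tocEntry
	for _, m := range linkRE.FindAllStringSubmatch(html, -1) {
		entries = append(entries, tocEntry{Anchor: m[1], Heading: m[2]})
	}
	return entries
}

func main() {
	// Sample input only; real TableOfContents output may differ.
	toc := `<nav id="TableOfContents"><ul><li><a href="#foo">Foo</a></li><li><a href="#bar">Bar</a></li></ul></nav>`
	out, err := json.MarshalIndent(tocEntries(toc), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A regex is fragile for HTML in general; it is only reasonable here because the table of contents markup is simple and predictable.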