replaceRE not working for me [SOLVED]


#1

I’m trying to use replaceRE from the docs:

{{ "http://gohugo.io/docs" | replaceRE "^https?://([^/]+).*" "$1" }} → “gohugo.io”

So source string, pipe symbol, replaceRE, match pattern, and replacement.

In my theme, I use replaceRE like this:

{{ .Plain | replaceRE "(\bReferences:.*$)" "wordNotInTextNatively" }}

According to an online validator, my regex matches all text from the References: heading to the end of the string (which is what I want).

However, Hugo doesn’t do anything: there’s no error message, and the matched text is not removed from the .Plain output.

What am I missing? Thanks!


#2

If I don’t use word boundaries, replaceRE works for me. I’m not sure whether that is the intended behaviour.

If I run the following through a validator, they all match my word ‘References:’ and the remainder of the string:

  • \bReferences: \b.*$
  • \bReferences:.*$

But these do nothing in Hugo. Instead, I need to use:

  • References: .*$
  • References:.*$ (or References:.* simplified).

Please let me know if I misunderstood replaceRE (then I need to do some more studying).


replaceRE also behaves oddly from my standpoint:

{{ .Plain | replaceRE "References:.*" "" }}
–> only replaces the word ‘References:’ but not any of the characters following that (despite .*).

{{ "copy paste of the article content here" | replaceRE "References:.*" "" }}
–> replaces the word ‘References:’ and anything after that till the end of the string.

From my understanding, replaceRE doesn’t behave consistently. (Or perhaps by design?)


#3

Hugo uses the Go regular expression package, which uses the RE2 syntax. This is not PCRE or JavaScript, so whatever syntax validator you’re using online may behave differently.

I suspect your problem is that you are trying to match a multi-line input string and need to turn on the multi-line flag with (?m). See this Go playground for an example of using the flag and what’s happening inside of Hugo.
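
As a rough sketch of what the (?m) flag changes, the snippet below runs the pattern with and without it against a made-up multi-line string, using Go's regexp package directly. The input text and the stripFrom helper are assumptions for illustration only; this is not Hugo's own code.

```go
package main

import (
	"fmt"
	"regexp"
)

// stripFrom is a hypothetical helper: it deletes every match of pattern from input.
func stripFrom(pattern, input string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(input, "")
}

func main() {
	// Made-up stand-in for a multi-line .Plain value.
	input := "Intro text.\nReferences: first\nsecond line\n"

	// No flags: . does not match \n and $ only matches at the very end
	// of the text, so the pattern cannot match at all.
	fmt.Printf("%q\n", stripFrom(`References:.*$`, input))

	// (?m): $ also matches at the end of each line, so the rest of the
	// References: line is removed. Later lines are left alone.
	fmt.Printf("%q\n", stripFrom(`(?m)References:.*$`, input))
}
```

Without the flag the string comes back unchanged. With (?m) only the remainder of the References: line is removed, because . still stops at the newline.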


#4

Thanks @moorereason for the clarification and additional information. I didn’t realise that Hugo used a specific regex for this.

I’m a bit closer to the solution but not fully there yet. With an online regex tester for Go I verified that both (?m)(\bReferences:.*$) and (\bReferences:.*$) are valid regex matches for my content.

However, this doesn’t work for me yet:

{{ .Plain | htmlEscape | replaceRE "(?m)(\bReferences:.*$)" ""}}

To see why, I used the Go playground example and set the input variable to the output of .Plain | htmlEscape. And there it works; ‘References:’ and everything that follows that word is replaced by ‘wordNotInTextNatively’.

But in Hugo, it doesn’t work; no error and the text from ‘References:’ onward is not replaced.


I’m using:

PS I:\site> hugo env
Hugo Static Site Generator v0.18.1 BuildDate: 2016-12-30T10:03:28+01:00
GOOS="windows"
GOARCH="amd64"
GOVERSION="go1.7.4"

#5

@Jura,
You need to escape the backslash in your pattern:

{{ .Content | replaceRE "(?m)(\\bReferences:.*$)" "" }}
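
The reason, as far as I understand it, is that a double-quoted string in a template follows Go's string literal rules, where \b is the backspace character rather than a regex word boundary. That would also explain the missing error message: the pattern is valid, it just looks for a backspace in front of ‘References:’. A minimal sketch in plain Go, with a made-up input and a hypothetical stripFrom helper:

```go
package main

import (
	"fmt"
	"regexp"
)

// stripFrom is a hypothetical helper: it deletes every match of pattern from input.
func stripFrom(pattern, input string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(input, "")
}

func main() {
	// Made-up single-line input.
	input := "Body text. References: a, b"

	// "\b" in an interpreted string literal is the backspace character (U+0008),
	// so the regexp engine looks for a literal backspace and finds none.
	fmt.Printf("%q\n", stripFrom("\bReferences:.*$", input))

	// "\\b" reaches the regexp engine as \b, the word boundary assertion.
	fmt.Printf("%q\n", stripFrom("\\bReferences:.*$", input))
}
```

The first call leaves the input untouched; the second removes everything from ‘References:’ onward.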

#6

Thanks @moorereason! With that, and your comment on GitHub saying that “the Go . pattern doesn’t match newlines. You have to pass the s flag to the group,” I completed my regex. :smile:

{{ .Plain | htmlEscape | replaceRE "(?m)(?s:\\bReferences:.*$)" "" }}
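
For anyone who wants to check this outside Hugo, here is a small sketch of the difference the s flag makes. It uses Go's regexp package with a made-up multi-line input; the stripFrom helper is hypothetical and not part of Hugo.

```go
package main

import (
	"fmt"
	"regexp"
)

// stripFrom is a hypothetical helper: it deletes every match of pattern from input.
func stripFrom(pattern, input string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(input, "")
}

func main() {
	// Made-up stand-in for a multi-line .Plain value.
	input := "Intro.\nReferences: one\ntwo\n"

	// Only (?m): . stops at the newline, so just the rest of the
	// References: line is removed.
	fmt.Printf("%q\n", stripFrom("(?m)\\bReferences:.*$", input))

	// (?m) plus (?s:...): inside the group . also matches \n, so the
	// greedy .* runs to the end of the text.
	fmt.Printf("%q\n", stripFrom("(?m)(?s:\\bReferences:.*$)", input))
}
```

The first pattern leaves the line "two" in place, while the second removes everything from ‘References:’ to the end of the text.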

Thanks again for your help; without guidance I wouldn’t have come up with this in a year.