Hi,
I want to make a list of all internet references on a page, apart from the pictures.
I intended to run a regexp on `.Content` with `findRE`, but it looks like the Go flavor of RE has neither lookahead nor lookbehind?
How can I extract a substring from a match without them?
I would need to make an array of all strings starting with `http` or `www`, following either `:` or `](`, and ending at the first `)` or a new line/carriage return.
I could usually do that, but with these limitations I can't.
I know about capturing groups, but how do I say "extract that group"?
An example would help us to help you.
These (in bold characters) would match:
[AAAAAAA](**http://www.aol.fr**)
[BBBBBBB](**www.google.fr**)
[CCCCCCC](**www.google.fr** " sdf
sdfds")
[XXXXXX](**www.youtube.fr** " sdf\
While none of these would qualify: no images, remote or local, and no local links:
![local image](Images/images_article_instincto/illustration_instincto.webp “The answer is to observe wild animals, to discover our original food range.”)
[local_link](philosophy)
![remote link](https://yyz2.discourse-cdn.com/flex036/user_avatar/discourse.gohugo.io/tom_durand/32/14460_2.png)
and the whole of that would make the list (after removing duplicates and prefixes):
[“aol.fr”, “google.fr”, “youtube.fr”]
which looped over would produce the text file consisting of:
aol.fr
google.fr
youtube.fr
Ah, I'm sorry, I only just saw the thread below, which popped up at the top of the list.
It seems to say that the only option, when it comes to capturing groups, is to proceed in two passes.
Then I make a new array with only what includes `//` or `www` and does not include `!`.
This should do it:
{{ $array1 := union (findRE (?s)((![.](.(//)|(www).)|(/s+:/s+![.](.(//)|(www).)) .Content) (findRE (?s)cite="." .Content) (findRE (?s)quote="." .Content) }}
then I loop over this and apply:
findRE (?s)((![.](.(//)|(www).)|(/s+:/s+![.](.(//)|(www).)) .Content
What do you think? Are there cases in markdown that it would miss?