Troubles with ranging over a string

Hi, I am working on a small tokenizer to replace “split” in case the string contains quotes " or \ (so that substrings can include spaces).
But this is not my code. I do get the gist of it of course, but not enough to troubleshoot.

{{- define "_partials/tokenize.html" -}}
{{- $tokens := slice -}}
{{- $buf := "" -}}
{{- $inQuote := false -}}
{{- $escape := false -}}

{{- range $i, $r := . }}
  {{- $ch := printf "%c" $r -}}
  {{- if $escape }}
    {{- $buf = printf "%s%s" $buf $ch -}}
    {{- $escape = false -}}
  {{- else if eq $ch "\\" }}
    {{- $escape = true -}}
  {{- else if eq $ch "\"" }}
    {{- $inQuote = not $inQuote -}}
  {{- else if and (not $inQuote) (eq $ch " ") }}
    {{- if gt (len (trim $buf " ")) 0 }}
      {{- $tokens = $tokens | append $buf -}}
    {{- end -}}
    {{- $buf = "" -}}
  {{- else }}
    {{- $buf = printf "%s%s" $buf $ch -}}
  {{- end }}
{{- end }}
{{- if gt (len (trim $buf " ")) 0 }}
  {{- $tokens = $tokens | append $buf -}}
{{- end }}
{{- return $tokens -}}
{{- end }}

Called with

{{- $fields := cond (strings.ContainsAny $line "\"\\") (partial "tokenize.html" $line) (split $line " ") -}}

$line is a string:

{{- range $line := (split (strings.TrimSpace .Inner) "\n") }}
  {{- $line = trim $line " " -}}

All the code before and after that call worked before, so it should not be erroneous.
Right now I get

calling partial: template: _shortcodes/list.html:96:20: executing “_partials/tokenize.html” at <.>: range can’t iterate over NVVV k moy

I read that ranging over a string in Go produces not individual characters but numeric values, the corresponding Unicode code points:

For a string value, the “range” clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point.

… So what gives ? The rest probably has errors too, but I’ll come to them after this. Easier when stuff actually reaches the page.

Hmm, are you sure that you can range over a string in Hugo?

Have you tried {{ range "abc" }} {{ . }} {{ end }}? That errors out.
If you want to range over the characters, you could use {{ range (split "abc" "") }}{{ . }} - {{ end }} (the docs say split returns a list of strings).
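
Applied to the partial from the question, the split suggestion could look like the following. This is an untested sketch: it keeps the original names and logic and only changes the loop header, on the assumption that split with an empty separator yields one single-character string per iteration, which also makes the printf "%c" conversion unnecessary.

```go-html-template
{{- define "_partials/tokenize.html" -}}
{{- $tokens := slice -}}
{{- $buf := "" -}}
{{- $inQuote := false -}}
{{- $escape := false -}}

{{- /* Iterate over single-character strings instead of ranging over the string itself. */ -}}
{{- range $ch := split . "" }}
  {{- if $escape }}
    {{- /* The previous character was a backslash: take this one literally. */ -}}
    {{- $buf = printf "%s%s" $buf $ch -}}
    {{- $escape = false -}}
  {{- else if eq $ch "\\" }}
    {{- $escape = true -}}
  {{- else if eq $ch "\"" }}
    {{- /* Toggle quote mode; the quote character itself is dropped. */ -}}
    {{- $inQuote = not $inQuote -}}
  {{- else if and (not $inQuote) (eq $ch " ") }}
    {{- /* An unquoted space ends the current token. */ -}}
    {{- if gt (len (trim $buf " ")) 0 }}
      {{- $tokens = $tokens | append $buf -}}
    {{- end -}}
    {{- $buf = "" -}}
  {{- else }}
    {{- $buf = printf "%s%s" $buf $ch -}}
  {{- end }}
{{- end }}
{{- /* Flush the last token, if any. */ -}}
{{- if gt (len (trim $buf " ")) 0 }}
  {{- $tokens = $tokens | append $buf -}}
{{- end }}
{{- return $tokens -}}
{{- end }}
```

The call site stays the same. Note that a quoted token consisting only of spaces is still discarded by the trim check, as in the original.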

On the other hand, a regular expression may be a solution.

This is without any error checking, and hoping I matched the spec for the allowed strings:

  {{ $string := `he\"llo you "H\ello\n\" World"`}}
  <h4>{{ $string }}</h4>
  {{ $match := apply (findRESubmatch `(?:\w|\\")+|"(?:\\.|[^"\\])*"` $string) "index" "." 0}}
  <ul>
  {{ range $match }}<li>{{.}}</li>{{end}}
  </ul>

Hints

  • (?:\w|\\")+ matches a word character or \"
  • "(?:\\.|[^"\\])*" matches a string enclosed in " that may contain anything but ", including backslash escapes (which includes \").
  • the apply around flattens the list of lists returned by findRESubmatch to a list of matches.

Result:

<h4>he\"llo you "H\ello\n\" World"</h4>
  
  <ul>
  <li>he\"llo</li>
  <li>you</li>
  <li>"H\ello\n\" World"</li>
  </ul>
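
As a possible variation on the regular expression approach, findRE returns the full matches as a flat list of strings, so the apply/index step can be skipped. The sketch below is untested and makes a few assumptions: the variable names $pattern, $fields and $tok are only illustrative, the enclosing quotes are stripped from quoted tokens, and backslashes are left in place. Because \w only covers letters, digits and underscore, unquoted tokens containing other punctuation would be split or dropped.

```go-html-template
{{- /* Sample input from the thread; in the shortcode this would be one trimmed line. */ -}}
{{- $line := `he\"llo you "H\ello\n\" World"` -}}
{{- $pattern := `(?:\w|\\")+|"(?:\\.|[^"\\])*"` -}}
{{- $fields := slice -}}

{{- range findRE $pattern $line -}}
  {{- $tok := . -}}
  {{- /* Only a match of the quoted alternative can start with a quote,
         so it is safe to strip one quote from each end here. */ -}}
  {{- if hasPrefix $tok "\"" -}}
    {{- $tok = $tok | strings.TrimPrefix "\"" | strings.TrimSuffix "\"" -}}
  {{- end -}}
  {{- $fields = $fields | append $tok -}}
{{- end -}}

<ul>
{{ range $fields }}<li>{{ . }}</li>{{ end }}
</ul>
```

For the sample string this should give three fields: he\"llo, you, and H\ello\n\" World.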

Well no, it doesn’t seem we can, and that’s another reason I asked the question!
However, I just remembered you can do split $var "" to the same effect.
So… pointless question.
I read that Hugo’s template language doesn’t follow Go 100% for reasons of type safety and performance. I find it bothersome that there isn’t a complete manual we can refer to, like the Ada Reference Manual for instance, at least with regard to basic programming logic and types. But I suppose it evolves too fast.

A question never is pointless - it will help others avoid some pitfalls or adjust their expectations.

to mention:

  • this is a three-way cascade: Go → Go templates (html/text) → Hugo
    it might be that the templating engine already adds some abstraction here.

  • the docs for range state

    Syntax: range COLLECTION

    the docs state the types for arguments and return values, maybe with some gaps where ANY is used, but …
    if one goes into the technical details, a Go string is a read-only slice of bytes … with all the gory details.

  • maybe I should not have been that brief: range "abc" will fail :)

  • and range (split "abc" "") is probably what you remembered.

  • byte or rune looping seems a little heavy so I showed an alternative.
