Special Chars/Emoji in plain Text files

I have some .txt files with the MIME type “text/plain”. I still want them to display some special characters, such as:

  1. © (special char)
  2. ❤ (emoji, UTF-8 compatible)

I want to place the copyright symbol in my “robots.txt”
and the heart in the “humans.txt”.

When I edit the files in the browser (dev console) and add the chars/emojis it works. This indicates I can place them into the files.
But every time I place them (as they are) in Hugo and generate the new files it breaks, and I end up with:

= ❤

My humans.txt is set up like this:

[outputs]
    home = ["HTML", "humanstxt"]

[outputFormats.humanstxt]
    baseName = "humans"
    isPlainText = true
    mediaType = "text/plain"

[mediaTypes."text/plain"]
    suffixes = "txt"

So I thought the reason the output is not UTF-8 encoded was isPlainText = true, so I tried all combinations of isPlainText and isHTML, but without success.

Has anyone managed to place special chars into txt files and still have them displayed correctly in the browser? Or is there a way to define the encoding for that file?

That’s how it should look:
image

That’s how it looks:
image

I don’t think that it is possible to define the encoding for a Plain Text Output Format with Hugo.

Perhaps you can do this with some kind of a Post Process script to save *.txt files under the publishDir as Unicode.
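
One possible reading of the post-process idea above, as a rough sketch rather than a tested recipe: prepend a UTF-8 byte order mark (BOM) to every .txt file under the publish directory, since browsers generally treat a leading BOM as a strong hint that the file is UTF-8. The function name `add_utf8_bom` is made up for this example, and `public` is assumed to be the publishDir (Hugo's default). Whether every consumer of robots.txt tolerates a BOM is not something this sketch checks.

```python
import codecs
from pathlib import Path


def add_utf8_bom(publish_dir: str) -> list[Path]:
    """Prepend a UTF-8 BOM to each .txt file under publish_dir that lacks one.

    Returns the list of files that were modified.
    """
    changed = []
    for path in sorted(Path(publish_dir).rglob("*.txt")):
        data = path.read_bytes()
        # Skip files that already start with the BOM so the script is re-runnable.
        if not data.startswith(codecs.BOM_UTF8):
            path.write_bytes(codecs.BOM_UTF8 + data)
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Assumption: "public" is the publishDir; adjust if yours differs.
    for modified in add_utf8_bom("public"):
        print(f"added BOM: {modified}")
```

It would be run after `hugo` has finished building. Fixing the charset on the web server side, if that is an option, avoids touching the generated files at all.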

I pasted both the heart and the copyright symbol into layouts/index.humanstxt.txt then built the site.

The resulting humans.txt file contains both the heart and the copyright symbol.

Which editor are you using?

Thanks, then I will continue to search. Maybe you can give me some deeper insight into that file?
What MimeType does it have? What is the configuration in Hugo for this file? Are you able to share a link to this file so I can inspect it?

My source looks like this:

/* NOTE */
	Built with ❤ ❤ ❤.

… so I’ve tried multiple versions of the same special char.

I use Atom (see “Poll 2021. Which code editors do you use when writing Hugo's files (markdown and templates)?”), in UTF-8 mode for this file.

The configuration is identical to what you posted:

[outputs]
    home = ["HTML", "humanstxt"]

[outputFormats.humanstxt]
    baseName = "humans"
    isPlainText = true
    mediaType = "text/plain"

Here’s a simple example:

git clone --single-branch -b hugo-forum-topic-33052 https://github.com/jmooring/hugo-testing hugo-forum-topic-33052
cd hugo-forum-topic-33052
hugo
cat public/humans.txt

Thanks. I noticed the header does have a slight difference:

my header:
encoding: text/plain

your header:
encoding: text/plain; charset=utf-8
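
The two header values above differ only in the `charset` parameter. As a small illustration (the helper name `charset_of` is made up for this example), Python's standard library can pull that parameter out of a Content-Type value. When it is missing, the client has no declared encoding and has to fall back to a default or guess.

```python
from email.message import Message


def charset_of(content_type: str):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# The two header values quoted in the post above.
for value in ("text/plain", "text/plain; charset=utf-8"):
    print(f"{value!r} -> {charset_of(value)!r}")
```

The first value yields `None` and the second yields `'utf-8'`, which matches the difference observed between the two responses.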

I tried to modify my [outputFormats.humanstxt] to:

[outputFormats.humanstxt]
    baseName = "humans"
    isPlainText = true
    mediaType = "text/plain; charset=utf-8"

But that obviously did not work. I think not being able to set the charset/encoding directly is a lack of flexibility.

While searching I noticed that the “solution” provided HERE only works for HTML, and will never work for text files, as they don't have any header in the content itself.
Any other ideas on how to configure this so that it works?

On which file?
How are you viewing the header?

https://github.com/jmooring/hugo-testing/blob/hugo-forum-topic-33052/layouts/index.humanstxt.txt

I just clicked on “RAW” then GitHub gives you a web preview of the content:
https://raw.githubusercontent.com/jmooring/hugo-testing/hugo-forum-topic-33052/layouts/index.humanstxt.txt

There everything is displayed correctly. I know… that's not generated with Hugo, but it works, and the difference is the encoding in the header.

So what happened when you cloned my repo and ran hugo?

The output from cat was:
© ❤
but when I open it in a browser it looks like this:
© ❤
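
A minimal sketch of why the same file can look fine in `cat` and broken in a browser: the bytes on disk are identical, and only the decoding differs. Latin-1 is used here purely as an assumed fallback for illustration; which encoding a browser actually falls back to when no charset is declared depends on the browser and its locale.

```python
text = "© ❤"

# What Hugo writes to disk: the UTF-8 bytes of the two symbols.
raw = text.encode("utf-8")
print(raw)  # b'\xc2\xa9 \xe2\x9d\xa4'

# A UTF-8 terminal (e.g. cat) decodes those bytes back to the original text.
right = raw.decode("utf-8")

# A client that assumes a single-byte encoding turns every byte into its own
# character, so 3 characters become 6. Latin-1 is an assumption for this demo.
wrong = raw.decode("latin-1")

print(len(right), len(wrong))
print(repr(wrong))
```

Nothing is wrong with the generated file in this scenario; the reader of the bytes just needs to be told they are UTF-8.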

Does that mean you ran hugo server to see it in a browser?

No, I build static assets and serve them with my webserver.
Debian 10.

OK, then run hugo server with my repository, then tell me what you see when you visit http://localhost:1313/humans.txt.

I don't have my setup running locally, so I cannot open this. Also, I have all ports blocked besides the ones for web traffic (80, 443).

But I have found the solution. I had to force Nginx to default to “UTF-8”. It was done with:

charset UTF-8;

I just added it to the server block; the location block would do the trick as well.
I thought I had already done this, but I had done it in the .htaccess and not in the nginx.conf, and static files are served by Nginx on my setup.

Sorry for the trouble and thank you for the fast help!!