Google Webmaster Tools cannot find the robots.txt


#1

config.toml
enableRobotsTXT = true

/layouts/robots.txt
User-agent: *

Here is the structure of my multi-language site. I have two languages, English and
German. public is the folder built by Hugo.

public/ 
├── de
│   ├── robots.txt
│   ├── sitemap.xml
│
└── en
    ├── robots.txt
    ├── sitemap.xml

I can reach the robots.txt under https://ssl.webpack.de/example.com/de/robots.txt
and under https://ssl.webpack.de/example.com/en/robots.txt

ssl.webpack.de is an SSL proxy; example.com is the domain.


#3

Do you have a single domain with /de/robots.txt and /en/robots.txt? If so, that’s incorrect; the robots.txt has to be a single file at the root of the site.
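To make the "single file at the root" point concrete, here is a minimal Python sketch. The helper name `robots_url` is made up for illustration; it only shows which URL a crawler would consult for a given page, assuming the usual rule that robots.txt is looked up at the root of the scheme and host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url.

    Hypothetical helper: it keeps the scheme and host (including any port)
    and replaces the whole path with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Both language subfolders map to the same root-level file.
print(robots_url("https://example.com/de/robots.txt"))
print(robots_url("https://example.com/en/beer/"))
```

With a single domain, both calls point at the same file, which is why /de/robots.txt and /en/robots.txt are never read.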

From that tree you shared, I think the expectation would be to upload those into two separate folders that are the root of the de and en sites, mapping to http://example.de and http://example.com.


#4

Hello Rick,

Thank you for your answer. Which solution would you prefer? example.de shows the German site and example.com the English site. For the French and Swiss sites I would order two new domains, the .ch and .fr domains. At the moment I have the first solution: example.com/de and example.com/en. example.de moves to example.com.


#5

In Japan, people accept the local .co.jp domains as canonical for companies, so I prefer separating. It’s up to you, but it looks like you’re set for separate domains. The point is, if you’ve configured Hugo as multihost, you need to put each of the subfolders of public in its own domain. I can’t see what you are doing because you haven’t linked the repo, so I cannot say for sure.


#6

I have configured Hugo as multihost. The domain example.com is linked to the
public folder. The example.de domain is linked to the example.de domain. I have
changed the structure to:

public/
│
├── robots.txt
├── sitemap.xml
│
├── de
│   ├── index.html 
│
└── en
    ├── index.html

Your idea is to move the folder /de to example.de and the /en folder to
example.com?

The problem is the sitemap.xml. Hugo produces this sitemap.xml:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <url>
    <loc>https://ssl.webpack.de/example.com/en/beer/</loc>
    <xhtml:link
                rel="alternate"
                hreflang="de"
                href="https://ssl.webpack.de/example.com/de/bier/"
                />
    <xhtml:link
                rel="alternate"
                hreflang="en"
                href="https://ssl.webpack.de/example.com/en/beer/"
                />
  </url>

  <url>
    <loc>https://ssl.webpack.de/example.com/en/categories/</loc>
    <priority>0</priority>
    <xhtml:link
                rel="alternate"
                hreflang="de"
                href="https://ssl.webpack.de/example.com/de/categories/"
                />
    <xhtml:link
                rel="alternate"
                hreflang="de"
                href="https://ssl.webpack.de/example.com/de/categories/"
                />
    <xhtml:link
                rel="alternate"
                hreflang="en"
                href="https://ssl.webpack.de/example.com/en/categories/"
                />
  </url>

and so on

This is the result in the browser at https://ssl.webpack.de/example.com/sitemap.xml.
This sitemap is not valid, and Google doesn’t accept it.


#7

If you configure Hugo as multihost, it expects a different domain per language. You need to publish your public/de to your German domain, and your public/en to your English domain, however you do that. There should be a robots.txt and sitemap.xml in each of those folders. Whatever is in de is accessible at the German domain, and the same for en.
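As a quick way to check that layout before uploading, something like the following Python sketch could be used. The function name `missing_files` and the `REQUIRED` tuple are assumptions for illustration; the folder names public, de and en come from the tree in this thread.

```python
from pathlib import Path

# Files each per-language folder should contain in a multihost build.
REQUIRED = ("robots.txt", "sitemap.xml")


def missing_files(public_dir, languages):
    """Return "lang/file" entries for required files that are absent.

    Hypothetical helper: it only inspects the local build output and
    does not contact any server.
    """
    missing = []
    for lang in languages:
        for name in REQUIRED:
            if not (Path(public_dir) / lang / name).is_file():
                missing.append(f"{lang}/{name}")
    return missing


# An empty list means every language folder has both files.
print(missing_files("public", ["de", "en"]))
```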

I don’t understand where the ssl.webpack.de domain is coming in. Your domains and baseURL are example.de and example.com, right? If so, the files related to those are what you should be submitting to Google.


#8

Hello Rick,

Thank you for your help. It works so far. Great. ssl.webpack.de is an SSL proxy. Have you seen the output from the sitemap.xml?


#9

I am not sure what the purpose of the SSL proxy is supposed to be vis-à-vis your other sites, or what the problem is.

If you need to produce a site that has to be accessible under a different domain, you can specify both the config file and the baseURL at the command line when you generate it using hugo.
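A rough sketch of that, written in Python so it can be scripted: it builds the hugo command line with the `--config`, `--baseURL` and `--destination` flags. The helper name `hugo_command` and the output folder `public-proxy` are made up for illustration, and how a command-line baseURL interacts with per-language baseURL settings in a multihost config is not covered here.

```python
import subprocess


def hugo_command(config_file, base_url, destination=None):
    """Build the argument list for a Hugo build with an overridden baseURL."""
    cmd = ["hugo", "--config", config_file, "--baseURL", base_url]
    if destination is not None:
        # Write to a separate folder so the regular build is not overwritten.
        cmd += ["--destination", destination]
    return cmd


# "public-proxy" is a hypothetical output folder for the proxied variant.
cmd = hugo_command("config.toml", "https://ssl.webpack.de/example.com/", "public-proxy")
print(" ".join(cmd))

# To actually run the build (requires hugo on PATH), uncomment:
# subprocess.run(cmd, check=True)
```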


#11

The sitemaps are alright (it’s XML that the browser renders, check the source and it will look alright).
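If you want something more than eyeballing the source, a small Python sketch like this parses the sitemap with the standard library and lists each `<loc>` with its hreflang alternates. The function name `sitemap_entries` is an assumption, and the sample is cut down from the output posted above. It only checks that the XML is well-formed and readable, not that Google will accept it.

```python
import xml.etree.ElementTree as ET

# Namespaces used in the sitemap posted in this thread.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def sitemap_entries(xml_data):
    """Return a list of (loc, {hreflang: href}) tuples, one per <url>.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the
    document is not well-formed XML.
    """
    root = ET.fromstring(xml_data)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        alternates = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NS)
        }
        entries.append((loc, alternates))
    return entries


SAMPLE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://ssl.webpack.de/example.com/en/beer/</loc>
    <xhtml:link rel="alternate" hreflang="de"
                href="https://ssl.webpack.de/example.com/de/bier/" />
    <xhtml:link rel="alternate" hreflang="en"
                href="https://ssl.webpack.de/example.com/en/beer/" />
  </url>
</urlset>
"""

for loc, alternates in sitemap_entries(SAMPLE):
    print(loc, sorted(alternates))
```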

The robots.txt MUST be in the site root, https://ssl.webpack.de/robots.txt for instance. Placed in a subfolder, the way you have it, the robots.txt is not accepted.

The way you host your domains will result in your robots.txt not being available. I am not sure that the SSL is configured correctly; it should be https://developer.kwpse.com instead of the ssl.webpack.de in it.


#12

I suggest you drop this portion for the purposes of troubleshooting your issue. That is confusing folks, and isn’t really needed, correct? We are just checking files, so it is okay if it isn’t over an encrypted connection.