Viewing 5 replies - 1 through 5 (of 5 total)
  • Plugin Author Sybre Waaijer

    (@cybr)

    Hello!

    TSF shouldn’t add any sitemap links other than its own to the robots.txt. It should only add a link to the base sitemap and the optional Google News sitemap. The sitemap_n.xml links must come from another plugin or theme.

    Unfortunately, since I do not have a link to your site, I can’t inspect where these might be coming from.

    Please note that those links in the robots.txt output are mere hints and should not affect the website’s pages’ crawlability or validity.
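
    To see which sitemap links a robots.txt output actually lists, something like the following sketch may help. It uses Python's standard-library robots.txt parser; the robots.txt body, the `sitemap_1.xml` entry, and the helper names `listed_sitemaps` and `unexpected_sitemaps` are made up for illustration and are not TSF's real output.

```python
from __future__ import annotations

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. The sitemap_1.xml line stands in for the
# unexpected entries discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap_1.xml
"""


def listed_sitemaps(robots_body: str) -> list[str]:
    """Return every URL named on a 'Sitemap:' line, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap lines exist.
    return parser.site_maps() or []


def unexpected_sitemaps(robots_body: str, expected: set[str]) -> list[str]:
    """Return listed sitemap URLs that are not in the expected set."""
    return [url for url in listed_sitemaps(robots_body) if url not in expected]


if __name__ == "__main__":
    # Assumption: the base sitemap lives at /sitemap.xml on this site.
    expected = {"https://example.com/sitemap.xml"}
    for url in unexpected_sitemaps(ROBOTS_TXT, expected):
        print("Not expected:", url)
```

    Any URL this reports would then have to be traced back to another plugin, the theme, or a physical robots.txt file.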

    Thread Starter kelvinow

    (@kelvinow)

    Hi Sybre,

    I seem to have solved the robots.txt issue by manually creating a new one.

    I also added 4 different versions of my site (https, http, www, non-www) to Google Search Console.

    I have uploaded the new sitemap to all 4 versions, using 1 sitemap. May I know if this is correct?

    Also, if I open the sitemap from Search Console via the versions without the www, it shows a blank white page.

    But if I open the sitemap via the versions with the www, it shows the correct sitemap.

    I went on to see the errors on the console and below is the error.

    “Unsafe attempt to load URL https://example.com/sitemap.xsl from frame with URL https://example.com/sitemap.xml. Domains, protocols and ports must match.”

    May I know what is wrong? When I visit the .xsl link, it shows a different sitemap with just the header and no URLs (also generated by the TSF plugin). That would mean there are 2 different sitemaps generated by the TSF plugin.

    Please kindly advise.

    Thank you.

    Plugin Author Sybre Waaijer

    (@cybr)

    Hello!

    TSF only generates one sitemap for each language of your site. If you don’t use a translation plugin, you should have at most one sitemap.

    This sitemap is generated virtually and is not stored in your filesystem. It can be reached through several endpoints of your website, for example by varying the www subdomain, the protocol (HTTP or HTTPS), or the query string.

    Only the root endpoint of your website should be prevalent. It’s up to you to choose a version; I prefer using HTTPS (for privacy, performance, and security reasons) without the archaic ‘www’ subdomain and with a trailing slash since the RFCs are all over the place. This will create https://example.com/.

    Google does not need to be aware of any other root endpoints that may still be accessible, and TSF automatically chooses the correct version for you via canonical URLs as configured in WordPress’s general settings — provided you’ve not fiddled with wp-config.php constants like WP_SITEURL. If you do manage other versions of your website, you may run into duplicated content issues.
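
    As a rough illustration of "one canonical version", the sketch below maps the four site variants onto a single root. It is not TSF's or WordPress's code; `CANONICAL_ROOT` and `to_canonical` are hypothetical names, and the root is assumed to match the Site Address in WordPress's general settings.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: this is the Site Address configured in WordPress.
CANONICAL_ROOT = "https://example.com/"


def to_canonical(url: str, canonical_root: str = CANONICAL_ROOT) -> str:
    """Rewrite a URL so it uses the canonical scheme and host.

    The path and query string are kept; an empty path becomes "/".
    This sketch does not check that the URL belongs to the site at all.
    """
    root = urlsplit(canonical_root)
    parts = urlsplit(url)
    return urlunsplit(
        (root.scheme, root.netloc, parts.path or "/", parts.query, "")
    )


if __name__ == "__main__":
    variants = [
        "http://example.com/sitemap.xml",
        "http://www.example.com/sitemap.xml",
        "https://example.com/sitemap.xml",
        "https://www.example.com/sitemap.xml",
    ]
    for variant in variants:
        print(variant, "->", to_canonical(variant))
```

    All four variants resolve to the same sitemap URL, which is the only one Google needs to know about.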

    About the stylesheet
    The .xsl script expects content to style. When no expected content is available, it won’t be rendered. That’s why you’ll see a different ‘sitemap’ when you visit the stylesheet script directly.

    Since the .xsl file is a script, it’s protected by the CORS protocol for your security. We use the canonical URL of your site to generate its valid endpoint, and we’ll never roguishly change the endpoint based on the URL the visitor entered. This is why you see the error message when you access the sitemap from an unregistered endpoint.

    If you read the error message you got carefully, focus on this part: “Domains, protocols and ports must match.” You’ve changed the real domains noted in this error message, but they did not match originally.
    – Domains: This means that even the subdomain must match, such as www or non-www. We could work around this, but that’d defeat the purpose of CORS.
    – Protocols: HTTPS vs. HTTP. We could work around this… but that’d invalidate privacy policies.
    – Ports: 443 vs. 80. The equivalent of ‘protocols’ in this case.
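
    The three checks above can be sketched as a comparison of (protocol, domain, port) triples. This is only an approximation of what a browser does, not its actual implementation; `origin`, `same_origin`, and `DEFAULT_PORTS` are illustrative names.

```python
from urllib.parse import urlsplit

# Default ports a browser assumes when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (protocol, domain, port) triple of a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(page_url: str, resource_url: str) -> bool:
    """True only if protocol, domain, and port all match."""
    return origin(page_url) == origin(resource_url)


if __name__ == "__main__":
    # Assumption: the stylesheet is served from the canonical endpoint.
    stylesheet = "https://example.com/sitemap.xsl"
    for sitemap in (
        "https://example.com/sitemap.xml",      # all three match
        "https://www.example.com/sitemap.xml",  # domain differs
        "http://example.com/sitemap.xml",       # protocol and port differ
    ):
        print(sitemap, same_origin(sitemap, stylesheet))
```

    Only the first sitemap URL shares an origin with the stylesheet, which matches the blank page on the other versions.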

    For everything else, please refer to our KB about sitemaps.

    Thread Starter kelvinow

    (@kelvinow)

    Hi Sybre,

    Noted with thanks. In other words, this is totally fine, right? There’s no issue with the sitemap at all.

    Also, are you familiar with Japanese Keyword Hack? Can we talk in a more private channel regarding this?

    Awaiting your response.

    Thank you.

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi again 🙂

    Correct, there aren’t any breaking issues, judging from what you’ve told me.

    Well, now I am aware of that! https://developers.google.com/web/fundamentals/security/hacked/fixing_the_japanese_keyword_hack

    I’m not sure what I’m supposed to do with that, though. It’s quite peculiar, and I doubt its relation to TSF. Nevertheless, you can find various channels to my private inbox around the web. I prefer not to link them since I’m already overwhelmed — I barely have time to eat.


The topic ‘Robots.txt’ is closed to new replies.