Getting errors 403 with Google Bots

Resolved massimopaolini
(@massimopaolini)

13 years ago

I’ve seen a sharp rise in Googlebot 403 (Forbidden) errors in Google Webmaster Tools. After looking into it, I found that if I “Empty all caches” and then fetch as Googlebot, it works properly.

Is there a setting I need to adjust, is this a glitch, or am I barking up the wrong tree?

The site is http://online-sales-marketing.com

Thanks, Massimo

http://ww.wp.xz.cn/extend/plugins/w3-total-cache/

Viewing 14 replies - 1 through 14 (of 14 total)

Thread Starter massimopaolini
(@massimopaolini)

13 years ago
Here is the fetch by Google Bot:
```
------------Start Fetch as Google - 403 error----------------------

<blockquote>Fetch as Google
This is how Googlebot fetched the page.
URL: http://online-sales-marketing.com/stupid-seo-service-slip-ups
Date: Monday, May 13, 2013 at 3:33:02 PM PDT
Googlebot Type: Web
Download Time (in milliseconds): 51
HTTP/1.1 403 Forbidden
Date: Mon, 13 May 2013 22:33:02 GMT
Server: Apache
Vary: Accept-Encoding,User-Agent
X-Powered-By: W3 Total Cache/0.9.2.9
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
Expires: Mon, 13 May 2013 23:33:02 GMT
Content-Encoding: gzip
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
</blockquote>
------------End Fetch as Google - 403 error----------------------
Then I emptied all caches and refetched:
------------Start Fetch as Google - OK----------------------

<blockquote>Fetch as Google
This is how Googlebot fetched the page.
URL: http://online-sales-marketing.com/stupid-seo-service-slip-ups
Date: Monday, May 13, 2013 at 3:33:13 PM PDT
Googlebot Type: Web
Download Time (in milliseconds): 553
HTTP/1.1 200 OK
Date: Mon, 13 May 2013 22:33:13 GMT
Server: Apache
Vary: Accept-Encoding,User-Agent
X-Powered-By: W3 Total Cache/0.9.2.9
X-Pingback: http://online-sales-marketing.com/xmlrpc.php
X-W3TC-Minify: On
Link: <http://online-sales-marketing.com/?p=5212>; rel=shortlink
Last-Modified: Mon, 13 May 2013 22:33:13 GMT
Expires: Mon, 13 May 2013 23:33:13 GMT
Content-Encoding: gzip
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
</blockquote>
------------End Fetch as Google - OK----------------------
```
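Comparing the two responses above header by header makes the difference easier to spot. Here is a minimal Python sketch, assuming the raw header text is pasted in as strings. The helper names `parse_headers` and `diff_headers` are made up for illustration and are not part of W3 Total Cache or Google's tools.

```python
def parse_headers(raw):
    """Split a raw HTTP response head into (status_line, {header: value})."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    status_line, headers = lines[0], {}
    for line in lines[1:]:
        # Split on the first colon only, so URLs and dates in the value survive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


def diff_headers(bad, good):
    """Return headers missing from `bad` and headers whose values differ."""
    missing = sorted(set(good) - set(bad))
    changed = {
        k: (bad[k], good[k])
        for k in sorted(set(bad) & set(good))
        if bad[k] != good[k]
    }
    return missing, changed


# Header text copied from the two fetches above (trimmed to a few lines).
FORBIDDEN = """HTTP/1.1 403 Forbidden
X-Powered-By: W3 Total Cache/0.9.2.9
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/html
"""

OK = """HTTP/1.1 200 OK
X-Powered-By: W3 Total Cache/0.9.2.9
X-Pingback: http://online-sales-marketing.com/xmlrpc.php
X-W3TC-Minify: On
Last-Modified: Mon, 13 May 2013 22:33:13 GMT
Content-Type: text/html; charset=UTF-8
"""

bad_status, bad = parse_headers(FORBIDDEN)
good_status, good = parse_headers(OK)
missing, changed = diff_headers(bad, good)
print(bad_status, "vs", good_status)
print("missing from 403:", missing)
print("changed:", sorted(changed))
```

The 403 response has a `Last-Modified` of the Unix epoch and lacks `X-Pingback` and `X-W3TC-Minify`, which the 200 response includes. That may suggest the 403 was not produced by a normal page render, but on its own it does not identify the cause.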
mbrsolution
(@mbrsolution)

13 years ago

Hi, you might want to read this thread or even this other thread. I hope they help you with your question.

Kind regards

Thread Starter massimopaolini
(@massimopaolini)

13 years ago

mbrsolution, thank you for the referral to those posts.
I’ve read the threads, and if I understand them properly, they point to a blocking plugin as the culprit.

I use an IP blocker on my site; IPs only get added when they try to log in and fail. Just to be safe, I checked the Google IPs against the list and they are not blocked.

Besides, Googlebot works fine when I clear the cache, so the IPs cannot be blocked.
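A check like this can be scripted instead of done by eye. The sketch below uses Python's standard `ipaddress` module. The addresses and blocklist entries are hypothetical placeholders, not an authoritative list of Googlebot IPs and not the real contents of any plugin's blocklist, and `blocked_matches` is a name made up for this example.

```python
import ipaddress


def blocked_matches(candidates, blocklist):
    """Return {candidate_ip: matching blocklist entry} for each blocked candidate."""
    # Single addresses become /32 (or /128) networks, so one loop handles both forms.
    networks = [ipaddress.ip_network(entry, strict=False) for entry in blocklist]
    hits = {}
    for ip_text in candidates:
        ip = ipaddress.ip_address(ip_text)
        for net in networks:
            if ip.version == net.version and ip in net:
                hits[ip_text] = str(net)
                break
    return hits


# Hypothetical data: replace with crawler IPs from the server access log
# and the entries exported from the blocking plugin.
crawler_ips = ["66.249.66.1", "66.249.73.135"]
blocklist = ["203.0.113.7", "198.51.100.0/24", "66.249.73.0/24"]

print(blocked_matches(crawler_ips, blocklist))
```

An empty result means none of the candidate addresses fall inside any blocklist entry, including CIDR ranges that are easy to miss when comparing by hand.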

I just deactivated Spam Free WordPress, which automatically blocks spam comments. This was a recent addition. I’ll post the results here tomorrow when I check GWMT again.

Plugin Contributor Frederick Townes
(@fredericktownes)

13 years ago

@massimopaolini, are you seeing any improvement?

Thread Starter massimopaolini
(@massimopaolini)

13 years ago

The short answer is no. The details are below.

Steps I took:
I made one adjustment per day:
1. Disabled “Spam Free WordPress”
2. Waited 3 days for Google Webmaster Tools to report on crawl errors
3. Disabled “IP Blacklist Cloud”
4. Checked GWMT for errors – errors found
5. Turned on error notification for minify
6. Checked GWMT for errors – errors found and an email from W3TC about minify not working
7. Turned off JS minify
8. Checked GWMT for errors – errors found and an email from W3TC
9. Turned off CSS minify
10. Checked GWMT for errors – errors found and an email from W3TC
11. Turned off Minify completely
12. Checked GWMT for errors – errors found and an email from W3TC
13. Turned off all other cache items
14. Disabled plugin
15. Checked GWMT for errors – none and an error email from W3TC(?)

Hope this helps resolve the issue, please feel free to contact me for more details.

mdp8593
(@mdp8593)

13 years ago

I’m getting the same problem, and it all started with the latest plugin update on 5/22/2013. I got a message from Google saying:

“Google detected a significant increase in the number of URLs we were blocked from crawling due to authorization permission errors.”

Tried to fetch our home page using Google Webmasters Fetch tool:
https://www.google.com/webmasters/tools/googlebot-fetch

This was the report:

HTTP/1.1 403 Forbidden
Date: Tue, 28 May 2013 17:43:36 GMT
Server: Apache/2.2.20 (Unix) mod_ssl/2.2.20 OpenSSL/0.9.8m DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Expires: Tue, 28 May 2013 18:43:36 GMT
Pragma: public
Cache-Control: max-age=3600, public, must-revalidate, proxy-revalidate
Etag: 15d365532833901733dcaa5628b1d377
X-Powered-By: W3 Total Cache/0.9.2.11
Content-Encoding: gzip
Vary: Accept-Encoding,User-Agent
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html

Turned off W3TC completely, and the problem went away. Googlebot fetched the page with no problem.
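For quicker retesting between setting changes, a request carrying Googlebot's User-Agent string can be sent from any machine. The responses in this thread include `Vary: Accept-Encoding,User-Agent`, so a normal browser request may not exercise the same cached entry. The sketch below uses only Python's standard library. It imitates the header only, not Google's IP addresses, so it is an approximation of Fetch as Google rather than a replacement for it. The function names are illustrative.

```python
import urllib.error
import urllib.request

# User-Agent string published by Google for its web crawler. Sending it from
# another machine copies the header only, not Google's source IPs.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a GET request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Return (status_code, headers_dict); 4xx/5xx responses are returned, not raised."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, dict(resp.headers.items())
    except urllib.error.HTTPError as err:
        # urllib raises on 403 and similar; the status and headers are still available.
        return err.code, dict(err.headers.items())
```

Calling `fetch_status` with a page URL from the affected site, once with the default User-Agent and once with a browser string, shows whether the 403 depends on the User-Agent. Headers such as `X-Powered-By` and `Last-Modified` in the returned dictionary can then be compared with the reports above.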

mdp8593
(@mdp8593)

13 years ago

Update: the problem is related to the CDN. I deactivated it (Amazon CloudFront), and the problem went away.

Will continue investigating.

mdp8593
(@mdp8593)

13 years ago

Update 2: Looks like I spoke too soon. The problem still exists with only Page Cache and Browser Cache activated.

mdp8593
(@mdp8593)

13 years ago

Update 3: I did a complete W3TC uninstall. Deleted all files, then reinstalled. So far, the problem seems to be gone. Will keep an eye on it and update if anything changes.

Plugin Contributor Frederick Townes
(@fredericktownes)

13 years ago

@mdp8593, any update?

mdp8593
(@mdp8593)

13 years ago

No update. Still working fine.

TimWRoberts
(@timwroberts)

12 years, 3 months ago

Curious if anyone has solved the problem with W3TC? I’m dealing with the same issue this morning, so I went back to Quick Cache and I’m able to fetch my site via Google Webmaster Tools just fine. However, it would be nice to know what the root cause is.

Thread Starter massimopaolini
(@massimopaolini)

12 years, 3 months ago

Still no update.

markgelo
(@markgelo)

12 years, 2 months ago

Any solution for this? I have the same issue. Please help!


The topic ‘Getting errors 403 with Google Bots’ is closed to new replies.