Hi,
Has anyone had issues blocking the MJ12bot?
MJ12bot belongs to https://majestic.com. Have you used, or are you using, their services on your site?
No, not at all. My host, Cloudways, put me on to it. I can see that sometimes their bot visits the site hundreds of times an hour. I’ve put MJ12bot in the agent blacklist, but it doesn’t seem to stop them.
Hi, are you able to find out what IP address is used by the bot?
Hi,
It uses multiple IP addresses. Here is what Majestic says:
“MJ12bot adheres to the robots.txt standard. If you want the bot to prevent website from being crawled then add the following text to your robots.txt:
User-agent: MJ12bot
Disallow: /
Please do not block our bot via IP in htaccess – we do not use any consecutive IP blocks as we are a community based distributed crawler. Please always make sure the bot can actually retrieve robots.txt itself. If it can’t then it will assume that it is okay to crawl your site.
If you have reason to believe that MJ12bot did NOT obey your robots.txt commands, then please let us know via email: [email protected]. Please provide URL to your website and log entries showing bot trying to retrieve pages that it was not supposed to.”
Hence my question: do you create a robots.txt?
Thank you,
Matthew
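For anyone who wants to check the rule before relying on it, here is a minimal sketch that parses the two lines Majestic quotes and asks whether MJ12bot would be allowed to fetch a page. It only uses Python's standard `urllib.robotparser`. The `example.com` URL, the `SomeOtherBot` agent, and the `is_blocked` helper are placeholders made up for illustration, and this checks the rules as Python interprets them, not how the bot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule quoted by Majestic above.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are hypothetical placeholders.
blocked_mj12 = is_blocked(ROBOTS_TXT, "MJ12bot", "https://example.com/any-page/")
blocked_other = is_blocked(ROBOTS_TXT, "SomeOtherBot", "https://example.com/any-page/")

print("MJ12bot blocked:", blocked_mj12)
print("Other bots blocked:", blocked_other)
```

To test a live site instead, `RobotFileParser` also has `set_url()` and `read()`, which fetch the file over HTTP. That doubles as a check that robots.txt is actually retrievable, which Majestic says the bot needs.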
I’ve tried creating a robots.txt to see if that will work, but I’m surprised All In One doesn’t seem to create one.
Hi,
I’ve tried creating a robots.txt to see if that will work,
Are you able to block the bot using a robots.txt file?
Whether it adheres or not… I don’t think any bot has the right to fetch over 1000 URLs in one day :(
How do we lower the crawl rate? It’s doing thousands of page views a day.
Is there any way to lower the crawl rate for my entire server by IP?
I run a hosting company, and it’s killing us…
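On lowering the crawl rate: one common approach is a `Crawl-delay` line in robots.txt. This is a non-standard directive, so whether MJ12bot honours it, and up to what value, should be confirmed with Majestic. The sketch below only shows what such a rule looks like and how Python's standard `urllib.robotparser` reads it back. The 10-second value and the `requested_delay` helper are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule asking MJ12bot to wait 10 seconds between requests.
ROBOTS_TXT_WITH_DELAY = """\
User-agent: MJ12bot
Crawl-delay: 10
"""


def requested_delay(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) set for user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


delay = requested_delay(ROBOTS_TXT_WITH_DELAY, "MJ12bot")
print("Requested delay for MJ12bot:", delay)
```

Note that robots.txt applies per host name, so on a shared server the rule would need to be present on each site rather than set once per IP.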
@ravetildon, can you please start a new support thread?
Thank you