• Resolved lickthespoon

    (@lickthespoon)


    Has anyone had issues blocking the MJ12bot?
    It’s overloading the processor on my host with constant requests.
    I’ve added it to the user-agent blacklist and can see it’s in .htaccess, but it doesn’t seem to work.
    Does the AIW create a robots.txt file?
    Thank you,
    Matthew

Viewing 9 replies - 1 through 9 (of 9 total)
  • Plugin Contributor mbrsolution

    (@mbrsolution)

    Hi,

    Has anyone had issues blocking the MJ12bot?

    MJ12bot belongs to https://majestic.com. Have you used, or are you using, their services on your site?

    Thread Starter lickthespoon

    (@lickthespoon)

    No, not at all. My host, Cloudways, put me on to it. I can see that their bot sometimes visits the site hundreds of times an hour. I’ve put MJ12bot in the user-agent blacklist, but it doesn’t seem to stop them.

    Plugin Contributor mbrsolution

    (@mbrsolution)

    Hi, are you able to find out which IP address the bot is using?

    Thread Starter lickthespoon

    (@lickthespoon)

    Hi,
    It uses multiple IP addresses. Here is what Majestic says:
    “MJ12bot adheres to the robots.txt standard. If you want the bot to prevent website from being crawled then add the following text to your robots.txt:

    User-agent: MJ12bot
    Disallow: /
    Please do not block our bot via IP in htaccess – we do not use any consecutive IP blocks as we are a community based distributed crawler. Please always make sure the bot can actually retrieve robots.txt itself. If it can’t then it will assume that it is okay to crawl your site.

    If you have reason to believe that MJ12bot did NOT obey your robots.txt commands, then please let us know via email: [email protected]. Please provide URL to your website and log entries showing bot trying to retrieve pages that it was not supposed to.”
    Hence my question: do you create a robots.txt?
    Thank you,
    Matthew
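
    For anyone who wants to sanity-check the two lines Majestic quotes before uploading them, here is a minimal sketch using Python’s standard `urllib.robotparser`. It only tests the rule text itself, not whether the server actually serves the file, which Majestic says the bot must be able to retrieve. The helper name `is_allowed` and the `example.com` URL are placeholders for illustration, not anything from the plugin or from Majestic.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted from Majestic in the post above.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-page/"  # placeholder URL
    print("MJ12bot allowed:", is_allowed(ROBOTS_TXT, "MJ12bot", url))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

    If the rule is written correctly, the first check should report that MJ12bot is not allowed, while other crawlers are unaffected. Whether the bot then honours the file on a live site can only be confirmed from the server’s access logs.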

    Thread Starter lickthespoon

    (@lickthespoon)

    I’ve tried creating a robots.txt to see if that will work, but I’m surprised that All In One doesn’t seem to create one.

    Plugin Contributor mbrsolution

    (@mbrsolution)

    Hi,

    I’ve tried creating a robots.txt to see if that will work,

    Are you able to block the bot using a robots.txt file?

    Whether it adheres or not, I don’t think any bot has the right to fetch over 1000 URLs in one day :((((

    ravetildon

    (@ravetildon)

    How do we lower the crawl rate? It’s doing thousands of page views a day.

    Is there any way to lower the crawl rate for my entire server by IP?

    I run a hosting company, and it’s killing us…

    Plugin Contributor mbrsolution

    (@mbrsolution)

    @ravetildon, can you please start a new support thread?

    Thank you

The topic ‘MJ12bot’ is closed to new replies.