• Resolved dev

    (@devksec)


    Hello,

    We’re on A2 Hosting and they’ve been unable to get the crawler working on subsites. As a workaround, I’ve added the subsites’ sitemap URLs to the primary site, which works, but the crawler isn’t running automatically and is stuck at “Start watching…”.

    When run manually, it sometimes completes without an issue, but at other times it ends with “stopped_reset”. It often seems to be stuck at position 1, despite crawlers 1-4 being disabled (we don’t use the LiteSpeed WebP ones). The last interval doesn’t update on its own, but cron appears to show the next run time of “litespeed_task_crawler” changing as expected.

    Start watching...
    25 Jul 2023 02:01:47     Size: 853     Crawler: #5     Position: 1     Threads: 3     Status: crawling, prepare running
    26 Jul 2023 10:57:28     Size: 853     Crawler: #1     Position: 1     Threads: 3     Status: stopped_reset
    26 Jul 2023 10:57:28     Size: 853     Crawler: #5     Position: 1     Threads: 3     Status: crawling, prepare running
    26 Jul 2023 11:00:29     Size: 853     Crawler: #5     Position: 10     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:00:43     Size: 853     Crawler: #5     Position: 16     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:00:58     Size: 853     Crawler: #5     Position: 25     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:01:13     Size: 853     Crawler: #5     Position: 34     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:01:26     Size: 853     Crawler: #5     Position: 46     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:01:37     Size: 853     Crawler: #5     Position: 58     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:01:48     Size: 853     Crawler: #5     Position: 70     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:01     Size: 853     Crawler: #5     Position: 85     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:12     Size: 853     Crawler: #5     Position: 97     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:23     Size: 853     Crawler: #5     Position: 109     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:36     Size: 853     Crawler: #5     Position: 124     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:47     Size: 853     Crawler: #5     Position: 136     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:02:58     Size: 853     Crawler: #5     Position: 148     Threads: 3     Status: crawling, updated position
    26 Jul 2023 11:03:09     Size: 853     Crawler: #5     Position: 160     Threads: 3     Status: crawling, updated position
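
For anyone comparing runs, here is a minimal sketch of how watch output like the above could be summarised to count resets and find the longest pause between entries. It is not part of the LiteSpeed plugin: the function names are hypothetical, and the regular expression simply assumes the line layout pasted above.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, copied from the pasted watch output; not an official format.
LINE_RE = re.compile(
    r"(?P<ts>\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2})\s+"
    r"Size: (?P<size>\d+)\s+"
    r"Crawler: #(?P<crawler>\d+)\s+"
    r"Position: (?P<position>\d+)\s+"
    r"Threads: (?P<threads>\d+)\s+"
    r"Status: (?P<status>.+)"
)


def parse_watch_line(line):
    """Return a dict for one watch line, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["ts"], "%d %b %Y %H:%M:%S"),
        "size": int(m["size"]),
        "crawler": int(m["crawler"]),
        "position": int(m["position"]),
        "threads": int(m["threads"]),
        "status": m["status"].strip(),
    }


def summarise(lines):
    """Count parsed entries, collect resets, and find the longest gap between entries."""
    entries = [e for e in (parse_watch_line(l) for l in lines) if e is not None]
    resets = [e for e in entries if e["status"] == "stopped_reset"]
    longest_gap = timedelta(0)
    for prev, cur in zip(entries, entries[1:]):
        longest_gap = max(longest_gap, cur["time"] - prev["time"])
    return {"entries": len(entries), "resets": resets, "longest_gap": longest_gap}


SAMPLE = """\
Start watching...
25 Jul 2023 02:01:47     Size: 853     Crawler: #5     Position: 1     Threads: 3     Status: crawling, prepare running
26 Jul 2023 10:57:28     Size: 853     Crawler: #1     Position: 1     Threads: 3     Status: stopped_reset
26 Jul 2023 10:57:28     Size: 853     Crawler: #5     Position: 1     Threads: 3     Status: crawling, prepare running
26 Jul 2023 11:00:29     Size: 853     Crawler: #5     Position: 10     Threads: 3     Status: crawling, updated position
""".splitlines()

summary = summarise(SAMPLE)
print(summary["entries"], len(summary["resets"]), summary["longest_gap"])
```

On the first four lines of the paste, this reports one reset (logged against crawler #1) and a gap of more than a day between the first entry and the next, which matches the description of the crawler not restarting on its own.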

    Reviewing the reset logs, there are some API queries from third-party integrations, but we’re unsure whether any of these are the issue. Here are some examples:

    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] 💓 ------POST HTTP/1.1 (HTTPS) /
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] Query String: wc-api=wc_shipstation&auth_key=REDACTED
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] User Agent: RestSharp/106.3.1.0
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] Accept: application/json, application/xml, text/json, text/x-json, text/javascript, text/xml
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] Accept Encoding: gzip
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] X-LSCACHE: true
    07/24/23 14:35:22.541 [34.243.223.231:54066 4 xe8] X-LiteSpeed-Purge: public,1344_Po.25407,1344_W.recent-posts-4,1344_FD,1344_A.0,1344_F,1344_H,1344_PGS,1344_PGSRP,1344_D.202307,1344_REST => LiteSpeed\LSC->send_headers()@616 => WP_Hook->apply_filters(,ARRAY)@308 => WP_Hook->do_action(ARRAY)@332 =>
    /home/ksc5577p/REDACTED.co.uk/wp-includes/load.php@517
    
    07/24/23 12:40:32.965 [=CLI=ksc5577p 3 4BO] X-LiteSpeed-Purge: public,1343_Po.24879,1343_URL./,1343_W.recent-posts-4,1343_FD,1343_A.1,1343_F,1343_H,1343_PGS,1343_PGSRP,1343_D.202307,1343_REST => LiteSpeed\LSC->send_headers()@616 => WP_Hook->apply_filters(,ARRAY)@308 => WP_Hook->do_action(ARRAY)@332 =>
    /home/ksc5577p/REDACTED.ksec.co.uk/wp-includes/load.php@517
    
    
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] 💓 ------GET HTTP/1.1 (HTTPS) /
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] Query String: wc-api=wc_shipstation&REDACTED
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] User Agent: ShipStation
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] Accept:
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] Accept Encoding: gzip
    07/24/23 12:33:46.651 [34.241.67.207:56998 3 j1l] X-LSCACHE: true
    07/24/23 12:33:46.652 [34.241.67.207:56998 3 j1l] X-LiteSpeed-Purge: public,1343_Po.24878,1343_W.recent-posts-4,1343_FD,1343_A.2629,1343_F,1343_H,1343_PGS,1343_PGSRP,1343_D.202307,1343_REST => LiteSpeed\LSC->send_headers()@616 => WP_Hook->apply_filters(,ARRAY)@308 => WP_Hook->do_action(ARRAY)@332 =>
    /home/ksc5577p/REDACTED.co.uk/wp-includes/load.php@517
    
    
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] 💓 ------POST HTTP/1.1 (HTTPS) /wp-json/wc/v2/stock-sync-batch
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] Query String: consumer_key=REDACTED
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] User Agent: WooCommerce API Client-PHP/2.0.1
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] Accept: application/json
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] Accept Encoding: gzip
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] X-LSCACHE: true
    07/24/23 11:24:14.569 [185.62.138.109:33528 5 RXe] X-LiteSpeed-Purge: public,1345_Po.162,1345_URL./product/REDACTED/,1345_W.recent-posts-1,1345_FD,1345_A.451,1345_F,1345_H,1345_PGS,1345_PGSRP,1345_D.202207,1345_REST,1345_Po.158,1345_T.4,1345_T.19,1345_T.52,1345_T.43,1345_T.23,1345_T.47,1345_T.28,1345_T.29,1345_T.51,1345_T.30,1345_T.31,1345_PT.product,1345_product => LiteSpeed\LSC->send_headers()@616 => WP_Hook->apply_filters(,ARRAY)@308 => WP_Hook->do_action(ARRAY)@332 =>
    /home/ksc5577p/REDACTED.co.uk/wp-includes/load.php@517
    

    Here is our Report number: XLANDMHU

    We’re happy either to keep the primary site as the crawler or to try to get the subsites to crawl instead, but we’re unsure why it’s not running automatically.

Viewing 11 replies - 1 through 11 (of 11 total)
  • Plugin Support qtwrk

    (@qtwrk)

    If a manual run works, then it’s most likely a cron issue.

    Please make sure your WP-Cron is always triggered on time. I would suggest setting up a system cron to trigger it.
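
As an illustration of the system cron suggestion, here is a minimal sketch of a script that a system cron job could call to request `wp-cron.php` for one or more sites. The script itself and the site URLs passed to it are hypothetical; `wp-cron.php` and its `doing_wp_cron` query parameter are standard WordPress. In a multisite, each site keeps its own cron schedule, so this assumes each subsite URL would need its own request.

```python
import sys
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def wp_cron_url(site_url):
    """Build the wp-cron.php URL for a site, keeping any subdirectory path."""
    parts = urlsplit(site_url)
    path = parts.path.rstrip("/") + "/wp-cron.php"
    return urlunsplit((parts.scheme, parts.netloc, path, "doing_wp_cron", ""))


def trigger_wp_cron(site_url, timeout=30):
    """Request wp-cron.php for one site and return the HTTP status code."""
    with urlopen(wp_cron_url(site_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Site URLs are passed on the command line, one per site to trigger.
    for url in sys.argv[1:]:
        print(url, trigger_wp_cron(url))
```

When a system cron is used this way, WordPress installs commonly also set `DISABLE_WP_CRON` to `true` in `wp-config.php` so that page visits no longer trigger cron themselves.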

    Thread Starter dev

    (@devksec)

    Hello,

    Our cron is working as expected and is a system cron. Is there a specific one we can run just for the crawler? WP Crons shows the crawler being scheduled as expected.

    We thought it could just be being reset rather than a cron issue, as WordPress cron doesn’t have any other issues. It runs every 10 minutes for the site.

    The primary management site is often in maintenance mode, while the subsites are not. Would this disable the crawler?

    Plugin Support qtwrk

    (@qtwrk)

    Please go to the crawler settings and set the crawler interval to 61. Then, in the sitemap settings, set Drop Domain to OFF, and make sure your server load is within the limit.

    After you set the interval to 61, the crawler should ideally start itself every 61 seconds if it is not already running. Give it an hour or two and see how it behaves.

    Thread Starter dev

    (@devksec)

    At the moment the subsites can’t run the crawler (A2 Hosting can’t seem to resolve this; it shows as “crawler_disabled”), so all of the sitemap URLs are within the multisite primary. If drop domain is added, it then won’t have any pages to crawl.

    We would need to resolve the crawler_disabled issue for subsites first. Any ideas on how to resolve that?

    Your support with this is much appreciated!

    Plugin Support qtwrk

    (@qtwrk)

    Please relay to A2 Hosting support that adding

        <IfModule Litespeed>
         CacheEngine on crawler
        </IfModule>

    to your Apache virtual host context will resolve that error.

    Thread Starter dev

    (@devksec)

    Hello,

    They have done that, which has enabled the crawler on the multisite primary site, but all subsites are still disabled.

    Is there a multisite-specific config required? They’ve not been able to get it enabled on any subsite yet.

    Plugin Support qtwrk

    (@qtwrk)

    Hmmm, no, this should work. You can create a phpinfo page for each of these subsites and look for $_SERVER['X-LSCACHE']; it should contain the word crawler, which indicates that the directive has been added.

    Thread Starter dev

    (@devksec)

    On the subsite’s LiteSpeed toolbox settings, it shows as enabled and has a config:

    Server Variables
    SERVER_SOFTWARE = LiteSpeed
    DOCUMENT_ROOT = /home/ksc5577p/REDACTED/
    X-LSCACHE = on,crawler
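
The check described earlier in the thread (the X-LSCACHE value should contain the word crawler) can be sketched as follows. This is only an illustration that works on the text pasted above; the helper names are hypothetical and it does not query the server itself.

```python
def parse_server_variables(text):
    """Parse 'KEY = value' lines, as pasted from the toolbox report, into a dict."""
    result = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def crawler_flag_present(x_lscache):
    """Return True if the comma-separated X-LSCACHE value includes 'crawler'."""
    flags = [flag.strip().lower() for flag in x_lscache.split(",")]
    return "crawler" in flags


REPORT = """\
Server Variables
SERVER_SOFTWARE = LiteSpeed
X-LSCACHE = on,crawler
"""

variables = parse_server_variables(REPORT)
print(crawler_flag_present(variables.get("X-LSCACHE", "")))
```

For the values pasted above the flag is present, so the server-level directive looks to be in place for this subsite even though the crawler still reports crawler_disabled.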

    There’s a config there from testing as well.

    )
    img_optm-webp_replace_srcset = true
    img_optm-jpg_quality = 82
    crawler = true
    crawler-usleep = 500
    crawler-run_duration = 400
    crawler-run_interval = 600
    crawler-crawl_interval = 302400
    crawler-threads = 3
    crawler-timeout = 30
    crawler-load_limit = 3
    crawler-sitemap = https://REDACTED/sitemap_index.xml
    crawler-drop_domain = true
    crawler-map_timeout = 120
    crawler-roles = array (
    )

    The subsite doesn’t show any recent crawl activity. The crawler does appear to be enabled, yet when it’s run manually it reports the following:

    Current sitemap crawl started at: 07/22/2023 01:09:30
    
    The next complete sitemap crawl will start at: 08/08/2023 20:59:30
    
    Last complete run time for all crawlers: 1237800 seconds
    
    Current crawler started at: 1s ago
    
    Current server load: 0.63
    
    Last interval: 1s ago
    
    Ended reason: crawler_disabled
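
As a quick sanity check on the timing figures above, here is a small worked calculation. It assumes, purely from the numbers in this post, that the next full crawl is the crawl start time plus the last complete run time plus the configured `crawler-crawl_interval`; that relationship is inferred from the paste, not taken from the plugin source, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %H:%M:%S"


def predict_next_crawl(started, last_run_seconds, crawl_interval_seconds):
    """Add the last complete run time and the crawl interval to the start time."""
    start = datetime.strptime(started, FMT)
    return start + timedelta(seconds=last_run_seconds + crawl_interval_seconds)


# Values copied from the status report and config pasted above.
next_crawl = predict_next_crawl("07/22/2023 01:09:30", 1237800, 302400)
print(next_crawl.strftime(FMT))

# The same figures expressed in days, for readability.
print(302400 / 86400)               # crawl interval in days
print(round(1237800 / 86400, 2))    # last complete run time in days
```

With these inputs the result is 08/08/2023 20:59:30, which matches the "next complete sitemap crawl" time shown. The crawl interval of 302400 seconds is 3.5 days, and the last complete run time of 1237800 seconds is a little over 14 days.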

    • Start watching…
    • 01 Jan 1970 01:00:00     Size: 473     Crawler: #5     Position: 1     Threads: 0     Status: crawler_disabled
    • ..
    Plugin Support qtwrk

    (@qtwrk)

    Are you visiting the phpinfo page via the subsite domain, rather than the main site domain?

    Thread Starter dev

    (@devksec)

    This is from LiteSpeed’s settings/toolbox on the subsite; the crawler config etc. is different from the main primary site.

    Plugin Support qtwrk

    (@qtwrk)

    Please create a ticket by emailing support at litespeedtech.com with a reference link to this topic, and we will investigate further.


The topic ‘Multisite crawler issue’ is closed to new replies.