• Resolved Pixelwunder

    (@pixelwunder)


    After the crawler ran on the test site yesterday (https://ww.wp.xz.cn/support/topic/crawler-and-http-auth/page/2/#post-17736156), I updated the live site to 6.2.

    Unfortunately, the crawler does not run.

    Ended reason: stopped_maxtime. The crawler hangs at status: crawling, updated position. I have tried to start it manually twice in a row, nothing happens.

    I have set Crawl Interval to 61 seconds. Nothing happens.

    The WP-Cron calls the litespeed_task_crawler correctly.

    I’m running out of ideas, especially because the crawler ran on the test site yesterday, even if only once. Sometimes the crawler starts again after an hour. With 6.1 it took 6 hours and the crawl finished completely.

    Why these problems with 6.1 and 6.2?

    JBPBJAUI

Viewing 8 replies - 1 through 8 (of 8 total)
  • Thread Starter Pixelwunder

    (@pixelwunder)

    Here is an update. The crawler has run through and it took a total of 17 hours (6.2). This is because the crawler kept stopping for an hour or two.

    Under 6.0 the crawler started, ran for 6 hours and the page was crawled completely.

    I monitored this yesterday. The crawler ran for 600 seconds, as configured. It then waited correctly for 100 seconds and started again. After a few seconds it stopped and was terminated with stopped_maxtime. It then started again an hour later.

    17 hours after a complete purge is far too long.

    Plugin Support qtwrk

    (@qtwrk)

    stopped_maxtime -> this indicates the crawler has been running for the configured run duration; it will wait for the interval to pass and then run again.

    If you think the wait before the next run is too long, you can lower the interval in the crawler settings.

    Thread Starter Pixelwunder

    (@pixelwunder)

    As I wrote, the interval setting is 100 seconds! The crawler starts again after 100 seconds, but then simply stops and only runs again after an hour. According to the settings, this should not happen. Run Duration is 600 seconds; I have also tested 1800. The configured values are ignored.

    Plugin Support qtwrk

    (@qtwrk)

        crawler-run_duration = 600
        crawler-run_interval = 61
        crawler-crawl_interval = 3600

    From your report, this means the following. Let’s say the crawler starts at 00:00 and there are no other factors such as a purge or server overload:

    00:00 -> run for 600 seconds -> 00:10 -> stop for 61 seconds -> 00:11 -> run for another 600 seconds -> 00:21 -> stop for 61 seconds -> 00:22 -> run for another 600 seconds, and so on, until the current crawl finishes at, let’s say, 02:22.

    Then at 02:22 -> wait for 3600 seconds -> 03:22 -> the crawler starts again and repeats the process above from 00:00.
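The timeline described above can be sketched as a small simulation. This is only a model of the explanation in this reply, not the plugin's actual code: the function names and the `total_work_s` parameter (how many seconds of crawling a full pass needs) are hypothetical, and server load, purges and WP-Cron timing are ignored.

```python
def fmt(seconds):
    """Format a number of seconds since the crawl start as hh:mm:ss."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def simulate_crawl(total_work_s, run_duration=600, run_interval=61,
                   crawl_interval=3600):
    """Return (time, event, length) tuples for one full crawl pass.

    total_work_s is a hypothetical figure: the seconds of actual crawling
    needed to get through the whole sitemap once.
    """
    events = []
    t = 0
    remaining = total_work_s
    while remaining > 0:
        # One run lasts at most run_duration, then ends with stopped_maxtime.
        chunk = min(run_duration, remaining)
        events.append((t, "run", chunk))
        t += chunk
        remaining -= chunk
        if remaining > 0:
            # Pause between two runs of the same crawl pass.
            events.append((t, "wait run_interval", run_interval))
            t += run_interval
    # The pass is complete: wait crawl_interval before starting over.
    events.append((t, "wait crawl_interval", crawl_interval))
    t += crawl_interval
    events.append((t, "next crawl starts", 0))
    return events


if __name__ == "__main__":
    # Values taken from the report: 600 / 61 / 3600 seconds.
    for time_s, event, length in simulate_crawl(total_work_s=1800):
        print(fmt(time_s), event, length)
```

In this model a pause of about an hour should only appear once a full pass has finished; between runs of the same pass the pause is `run_interval`.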

    Thread Starter Pixelwunder

    (@pixelwunder)

    I understand the theory, and that is how it works under 6.0. But under 6.2 the crawler does not follow this example:

    00:00 -> run 600 seconds -> 00:10 -> stop for 61 seconds -> 00:11 -> run another 600 seconds -> 00:21 -> stop for 61 seconds -> 00:22 -> run another 600 seconds, and so on

    Previously, 70,000 pages were crawled in 6 hours (6.0). Now it takes 17 hours (6.2)!

    Instead, the crawler stops after 30 minutes or after an hour and then continues somewhere else. It should continue until the end and not wait for hours after maxtime before it resumes crawling.
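To put the figures above side by side, here is a rough calculation of the throughput under each version and of the share of wall-clock time the crawler would spend running if only the 600-second run and 61-second pause applied. It assumes the same 70,000 pages in both cases; the helper names are made up for illustration.

```python
def pages_per_hour(pages, hours):
    """Average crawl throughput over the whole wall-clock time."""
    return pages / hours


def duty_cycle(run_seconds, pause_seconds):
    """Fraction of time spent crawling in a run/pause cycle."""
    return run_seconds / (run_seconds + pause_seconds)


if __name__ == "__main__":
    pages = 70_000  # pages crawled, as reported in this thread
    rate_60 = pages_per_hour(pages, 6)    # 6 hours under 6.0
    rate_62 = pages_per_hour(pages, 17)   # 17 hours under 6.2
    print(f"6.0: {rate_60:.0f} pages/hour")
    print(f"6.2: {rate_62:.0f} pages/hour")
    print(f"slowdown factor: {17 / 6:.2f}")
    print(f"expected duty cycle (600 s run, 61 s pause): {duty_cycle(600, 61):.1%}")
```

With a 61-second pause the crawler would be active roughly 91% of the time, so the pauses between runs alone would not account for a crawl that takes almost three times as long.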

    Plugin Support qtwrk

    (@qtwrk)

    Crawling also depends on your server performance.

    The crawler itself is the same: it sends requests to the pages it grabs from the sitemap. Once a request is sent, it is the server’s job to process it.

    As far as the crawler’s functionality is concerned, it does the same job in all versions. I can’t really think of any reason why the crawl would take three times as long. The only possible case is that the crawler did not send the requests, but in that case you will need to enable the debug log and see how the crawler behaves.

    Thread Starter Pixelwunder

    (@pixelwunder)

    As you can see, the crawler ended with stopped_maxtime and then did not start again, although only 42676 pages were crawled.

    https://www.prad.de/images/log.txt

    This is certainly not a server problem: the load is never above 8 and the setting is 12.

    After a break of one hour the crawler started again (crawl interval 3600). But the Interval Between Runs is 100 seconds!

    Then it is stopped again at position: 42866 pages (maxtime), i.e. just 390 pages.

    • This reply was modified 2 years, 1 month ago by Pixelwunder.
    Thread Starter Pixelwunder

    (@pixelwunder)

    This is a managed LiteSpeed server. Support asks me where to find the variable. Is it not set by default?


The topic ‘Crawler stops’ is closed to new replies.