Enabling and disabling the crawlers on schedule is working fine, but when it comes to running them, even with litespeed-crawler r or litespeed-crawler run, all I get is:
Success: Start crawling. Current crawler #1 [position] 0 [total] 1884
I need to go back into the website and manually click the “Run Crawler” button. Only then will it actually start processing as usual. Report number: DSRERPTK
This topic was modified 1 year, 6 months ago by arithdevlpr. Reason: added enabling/disabling is working fine
Please try adding wp litespeed-crawler reset before the run, or better yet: reset, wait 60 seconds, then run it.
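A minimal sketch of that reset, wait, run sequence as a shell function, in case it helps to see it written out. WP_CMD and WAIT_SECONDS are hypothetical variables introduced only for this example (they default to plain wp and the 60 seconds suggested above); the two subcommands are the ones already named in this thread.

```shell
# Sketch only: reset the crawler position, pause, then start a run.
# WP_CMD and WAIT_SECONDS are hypothetical names for this example.
WP_CMD="${WP_CMD:-wp}"
WAIT_SECONDS="${WAIT_SECONDS:-60}"

reset_then_run() {
    # Stop early if the reset itself fails.
    $WP_CMD litespeed-crawler reset || return 1
    # Give the reset time to settle before starting the run.
    sleep "$WAIT_SECONDS"
    $WP_CMD litespeed-crawler run
}
```

Calling reset_then_run from the script that the server cron invokes would then replace the bare run command.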
Also, your crawler interval is 302400 seconds, which is 3.5 days, so the cron may not trigger another run until 3.5 days later. I’d suggest setting it to 61 for a quick test first.
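For anyone checking their own setting, the conversion is just seconds divided by 86400. A small sketch, where seconds_to_days is a hypothetical helper name and not part of the plugin or WP-CLI:

```shell
# Convert a Crawl Interval value (seconds) into days, to one decimal place.
# seconds_to_days is a hypothetical helper for this example.
seconds_to_days() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 86400 }'
}

seconds_to_days 302400   # prints 3.5
```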
Perfect, it’s working now. Also, I thought the crawl interval setting only applied to how long you want to wait before a fresh crawl of the entire sitemap? Since I’m using the server cron, I thought running wp litespeed-crawler run was enough.
So when I increase the crawl interval, it doesn’t seem to work and just prints the same Success: Start crawling. Current crawler #1 [position] 0 [total] 1884 as earlier.
This is what’s confusing me, because I thought “Crawl Interval” means “how long to wait before the job crawls the entire sitemap again”, not “how long to wait before the job runs normally”.
Hi, it’s me again. I must be doing something wrong, because out of 8 crawlers it never seems to reach the 3rd one; it always resets after the end of the 1st or 2nd and just loops those two crawlers throughout the entire night. I have also tried setting separate turn-on/turn-off CLI server cron jobs to at least try to get the other crawlers to start, but to no avail.
Is there better documentation around the CLI crawler commands and the front-end settings?
For example, if I have Crawler set to ON on the frontend but I am using CLI cron to enable, run and disable it, is having it set to ON here still necessary?
And what happens if I have this setting ON but am also using the CLI cron job?
My debug log doesn’t show anything, but it seems like setting the crawler interval to 61 is what is causing the position reset. These are my latest cron job lines. I do not have any reset added, but I feel like the frontend settings are causing a clash?
#Enable crawlers at 7:30 PM NZDT (6:30 AM UTC) with logging
Alright, yes, it’s working now. May I make a suggestion on updating the crawler documentation? It would be a good idea to post an article on the subject too, as I notice a lot of other users are having issues with the WP CLI.
Please correct anything I’ve got wrong. I’m happy for you to share this with the team for ideas and feedback.
The crawler travels through your site, refreshing pages that have expired in the cache. This makes it less likely that your visitors will encounter uncached pages.
The crawler must be enabled at the server-level or the virtual host level by a site admin. Please see: Enabling the Crawler at the Server or Virtual Host Level
Learn more about crawling on our blog.
If you are <a href="https://developer.ww.wp.xz.cn/plugins/cron/hooking-wp-cron-into-the-system-task-scheduler/">hooking WP-Cron into the System Task Scheduler</a>, you must be comfortable using the crawler's <a href="https://docs.litespeedtech.com/lscache/lscwp/cli/">WordPress CLI commands</a> to manually enable, run, reset position and disable the crawlers.
Learn more about this on our blog (insert blog post article on the subject)
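As a rough sketch of what enabling and disabling from a script could look like: this assumes the enable and disable subcommands each take a single crawler number, which should be checked against the linked CLI documentation. WP_CMD and CRAWLER_IDS are hypothetical variables introduced only for this example.

```shell
# Sketch only. WP_CMD and CRAWLER_IDS are hypothetical names for this example.
WP_CMD="${WP_CMD:-wp}"
CRAWLER_IDS="${CRAWLER_IDS:-1 2 3}"

# Enable each listed crawler (assumes "enable" takes one crawler number).
enable_crawlers() {
    for id in $CRAWLER_IDS; do
        $WP_CMD litespeed-crawler enable "$id" || return 1
    done
}

# Disable the same crawlers once the crawl window is over.
disable_crawlers() {
    for id in $CRAWLER_IDS; do
        $WP_CMD litespeed-crawler disable "$id" || return 1
    done
}
```

One cron entry could call enable_crawlers at the start of the crawl window and another could call disable_crawlers at the end.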
This determines how long, in seconds, the crawler waits before it starts crawling again (re-initiates the crawling process). You might want to change this depending on how long it takes to crawl your site. The best way to figure this out is to run all the crawlers a few times and keep track of the "Last complete run time for all crawlers". Once you've got that figure, set Crawl Interval to slightly more than that. For example, if your last complete run time for all crawlers is 4 hours, you could set this value to 5 hours (18000 seconds).
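The arithmetic behind that example, as a small sketch. suggest_interval is a hypothetical helper, and the one-hour margin is simply the figure used in the example above, not a recommended value.

```shell
# Last complete run time plus a safety margin, both in seconds.
# suggest_interval is a hypothetical helper for this example.
suggest_interval() {
    echo $(( $1 + $2 ))
}

# 4 hours of crawling plus a 1 hour margin.
suggest_interval $((4 * 3600)) 3600   # prints 18000
```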
This setting is also reliant on the Run Duration setting. If your Run Duration is lower than the Crawl Interval, the crawler will not re-initiate until the Crawl Interval has been reached.
For example, with the default values (Run Duration 400, Crawl Interval 302400), if your site has not finished crawling, then once the crawler has started and 400 seconds have passed, it will be another 302000 seconds before the crawler is re-initiated.
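A small sketch of that calculation. It models only the behaviour as described in this draft and has not been checked against the plugin code; remaining_wait is a hypothetical helper name.

```shell
# Seconds left to wait after one run, under the reading described above.
# remaining_wait is a hypothetical helper for this example.
remaining_wait() {
    run_duration=$1
    crawl_interval=$2
    if [ "$crawl_interval" -gt "$run_duration" ]; then
        echo $(( crawl_interval - run_duration ))
    else
        # The interval has already elapsed by the time the run ends.
        echo 0
    fi
}

remaining_wait 400 302400   # prints 302000
```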
If you are using server cron to schedule the crawler, it is recommended to set this value to something lower so the crawler can be re-initiated by the cron accordingly. Learn more from our article (insert article)
Thanks for the suggestion. We are also refactoring the crawler feature in an upcoming version, and we need to rework the documentation as well. I must say, even for me it was somewhat confusing until I dug into the code directly.
The topic ‘Server Cron job for Crawler CLI Issues’ is closed to new replies.