• Resolved dadishome

    (@dadishome)


    Hello,

    I am experiencing a lot of cache misses, which causes a long average response time for Google, and I would like to know how to fix it.

    Google is crawling a lot of URLs with query parameters (I checked the logs). These parameters are used to manage product variations (e.g., color, size), in URLs like the following:

    https://ezstrap.fr/produit/bracelet-apple-watch-silicone-a-surface-rainuree/?attribute_pa_couleur=gris-transparent&attribute_pa_taille-de-lapple-watch=38mm-a-41mm

    However, these kinds of URLs are not in the cache, so when Google crawls my website I get a lot of misses and PHP has to work all the time. The result is a long TTFB, and Google is not really happy about it.

    As you can see, the static and dynamic cache hit rates are pretty low:

    Here is the view inside Google Search Console (you can focus on the part circled in green, as I had another hosting company before). I have 1300 ms on average (200 ms is recommended by Google):

    As you can see, this is the kind of request Google is making on my website. I have a lot of variations per product and I need a way to cache this kind of URL:

    Also, I cannot block these URLs via robots.txt because they are needed in Google Merchant Center.

    Here are my questions:
    -Is there a way to automatically cache my shop’s URLs with query parameters, so Google hits the cache directly every time it requests a variation URL?

    -Is there a way to prewarm the Edge Cache (QUIC.cloud) for variation URLs so that even the first request is a HIT?

    -Can I provide a custom crawler-map.txt with all variation combinations (e.g. color+size) to ensure the URLs with query parameters are also cached (because right now the crawler only has the normal URLs in the Sitemap List)?

    -If I can provide a custom crawler to cache the query parameter URLs, do you know the best way to do it? (Is this a problem only I am having? Should I go for a custom-coded solution, or does a simpler solution exist?)

    -Is “Google” in Guest Mode User-Agents enough for LiteSpeed to treat Googlebot as guest + cache-eligible? My long average response time clearly suggests that the cache is not being served to Google.

    I have been struggling with this problem for weeks, so I would be really happy to have your feedback on it. Thank you.

    Regards,

    • This topic was modified 10 months, 2 weeks ago by dadishome.
Viewing 15 replies - 1 through 15 (of 20 total)
  • Plugin Support litetim

    (@litetim)

    @dadishome please generate a report via LSC => Toolbox => Report => click on “Send to LiteSpeed”, and share the generated report ID.

    Thank you

    Thread Starter dadishome

    (@dadishome)

    Hello,

    Here is the report ID: DBCWIAPP

    Thank you

    Plugin Support qtwrk

    (@qtwrk)

    -Is there a way to automatically cache my shop’s URLs with query parameters, so Google hits the cache directly every time it requests a variation URL?

    -Is there a way to prewarm the Edge Cache (QUIC.cloud) for variation URLs so that even the first request is a HIT?

    -Can I provide a custom crawler-map.txt with all variation combinations (e.g. color+size) to ensure the URLs with query parameters are also cached (because right now the crawler only has the normal URLs in the Sitemap List)?

    You need to customize your sitemap to add these query-string URLs, so the plugin’s crawler will know about them and crawl them to pre-cache them.

    -If I can provide a custom crawler to cache the query parameter URLs, do you know the best way to do it? (Is this a problem only I am having? Should I go for a custom-coded solution, or does a simpler solution exist?)

    That probably depends on your sitemap generator plugin.

    -Is “Google” in Guest Mode User-Agents enough for LiteSpeed to treat Googlebot as guest + cache-eligible? My long average response time clearly suggests that the cache is not being served to Google.

    It’s not about the user agent. These query-string URIs are simply not pre-cached, so the response time is longer.

    Thread Starter dadishome

    (@dadishome)

    Thank you for the answer,

    I checked with my sitemap generator plugin, but it does not do this, and neither do other plugins. That’s why I was wondering if you had any tips for that…

    Also, if I try to check the cache status of a URL, it always shows “miss” the first time, even though the page is supposed to be already cached. This behavior doesn’t make sense to me.

    I tested the URLs in many different scenarios. The first request is always a MISS, and only the second request becomes a HIT. But if I always have to load a page twice to benefit from the cache, what’s the point of having a cached version in the first place?

    For example, if I take this URL (it is supposed to be in the cache) and test it, it doesn’t work:

    I tested cache behavior for this URL using all of the following methods:

    • The URL tester tool from QUIC.cloud
    • Using cURL via the command line
    • Opening the page in a private browser window

    No matter how I test it, I consistently get a MISS on the first load, and only a HIT on the second.
    This happens even when the page is confirmed to be cached.

    Am I doing something wrong, or is this expected behavior even for cached pages?

    Regards,

    Plugin Support qtwrk

    (@qtwrk)

    Well, for the sitemap part, sadly I don’t have any further tips. If the sitemap plugin doesn’t support it, you may need to find a way to create one manually, for example by attaching the query strings to each product URL via a shell script or something similar.
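As a rough illustration of the "create one manually" idea, here is a minimal Python sketch that builds every variation URL for a product and writes them as a plain sitemap. The function names and the attribute values other than those quoted in this thread are hypothetical placeholders; in practice the product URLs and attribute values would have to come from the shop's own data, and whether the crawler accepts the resulting file depends on the plugin's sitemap settings.

```python
from itertools import product
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_variation_urls(base_url, attributes):
    """Return one URL per combination of attribute values.

    `attributes` maps a query parameter name to its list of possible values.
    """
    names = list(attributes)
    urls = []
    for combo in product(*(attributes[name] for name in names)):
        query = urlencode(list(zip(names, combo)))
        urls.append(f"{base_url}?{query}")
    return urls


def build_sitemap(urls):
    """Serialize the URLs as a minimal sitemap document (one <loc> per URL)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder data: only the first value of each attribute is taken from
    # the thread, the other values are made up for the example.
    variation_urls = build_variation_urls(
        "https://ezstrap.fr/produit/bracelet-apple-watch-silicone-a-surface-rainuree/",
        {
            "attribute_pa_couleur": ["gris-transparent", "rouge"],
            "attribute_pa_taille-de-lapple-watch": ["38mm-a-41mm", "42mm-a-49mm"],
        },
    )
    print(build_sitemap(variation_urls))
```

Note that the number of URLs grows multiplicatively with the number of attributes and values, so the crawler will have correspondingly more pages to warm.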

    As for the cache miss: at the moment I am writing this reply, I opened the home page in a private window, and it gives me x-litespeed-cache: bkn along with x-qc-cache: miss.

    The curl and URL tester tools are NOT sending requests exactly as a real browser does.

    Most likely, after the crawler finished and before you opened the page, a purge took place due to some event, e.g. you edited or published something, you modified something, etc.

    A simple way to check this: use a shorter/smaller sitemap with only 10 to 20 URLs, run the crawler on them manually, then immediately open any of these URLs in a private window after the crawler finishes. This will at least confirm whether the crawler works or not.
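One possible way to get such a small test sitemap is to trim an existing one. The sketch below is only an assumption about how this could be done with Python's standard library; `shrink_sitemap` is a hypothetical helper name, and it expects a plain `urlset` sitemap rather than a sitemap index.

```python
from xml.etree import ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def shrink_sitemap(sitemap_xml, limit=10):
    """Return a copy of the sitemap that keeps only the first `limit` <url> entries."""
    # Keep the sitemap namespace as the default namespace when serializing.
    ET.register_namespace("", NS)
    root = ET.fromstring(sitemap_xml)
    for extra in root.findall(f"{{{NS}}}url")[limit:]:
        root.remove(extra)
    return ET.tostring(root, encoding="unicode")
```

The trimmed file can then be given to the crawler for a quick run, and a few of its URLs opened in a private window right afterwards.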

    Thread Starter dadishome

    (@dadishome)

    Hey qtwrk,

    Thank you for the answer, it helps a lot. I made a small sitemap file and checked whether I got a hit right after the crawler did its job, and yes, it worked. So it was probably what you were saying.

    After careful consideration, I realized it would be better to drop all the query parameters from the cache, so that I have only one cached version per page. The only difference for users will be that the proper variation image is not selected when they land on the page, but that is not a big deal.

    To drop all query parameters I added “*”, as you can see in the image:

    Then I tested like this: I warmed up the cache (loading the URL twice so that I get a HIT) with this URL, for example:

    https://ezstrap.fr/produit/bracelet-apple-watch-tissu-tresse-boucle-metallique-type-alpine/?attribute_pa_couleur=rouge&attribute_pa_taille-de-lapple-watch=38mm-a-41mm&utm_source=Google%20Shopping&utm_campaign=Google%20Shopping%20Feed&utm_medium=cpc&utm_term=53777

    Then I removed part of the query parameters and tested again (I should get a hit, as I am dropping all the query parameters) with this URL:

    https://ezstrap.fr/produit/bracelet-apple-watch-tissu-tresse-boucle-metallique-type-alpine/?attribute_pa_couleur=rouge

    But then I get misses, which does not make sense, as I warmed up the cache with the first URL.

    I tried many times. Sometimes I warmed up using the canonical URL and then tried the same URL with query parameters, and every time I change parameters in the URL I get a miss instead of a hit.

    Here is my question:
    -Did I drop all the query parameters the correct way by adding * inside Cache > Drop Query String?

    Maybe there is a better way to always serve the same cache no matter the query parameters?

    Regards,

    Thank you,

    Plugin Support qtwrk

    (@qtwrk)

    No, don’t do this. If you drop query strings, every query string will return the same page.

    For example, ?attribute_pa_couleur=black and ?attribute_pa_couleur=white will return the same content, and this is NOT what you would want.
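To make the consequence concrete, here is a conceptual Python sketch of what dropping query parameters does to a cache key. It is not the plugin's implementation: `cache_key` is a hypothetical helper, and the shell-style wildcard matching via `fnmatchcase` is an assumption made purely for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_key(url, drop_patterns):
    """Return the URL with every query parameter whose name matches a pattern removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not any(fnmatchcase(name, pattern) for pattern in drop_patterns)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    base = "https://example.com/produit/bracelet/"
    # With "*" every parameter is dropped, so both colors share one cache entry.
    print(cache_key(base + "?attribute_pa_couleur=black", ["*"]))
    print(cache_key(base + "?attribute_pa_couleur=white", ["*"]))
    # Dropping only tracking parameters keeps one entry per variation.
    print(cache_key(base + "?attribute_pa_couleur=white&utm_source=x", ["utm_*"]))
```

In this model, a narrower pattern that removes only tracking parameters would still reduce the number of cache entries without merging different variations into one page.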

    Thread Starter dadishome

    (@dadishome)

    Yes, I understand that. The only difference it makes is the default image selected when someone arrives on the page, but the page is the same. They will have to select the variation they want themselves.

    The way I see it, I have only two ways of reducing the response time for Google: either I cache all the URLs with query parameters, as we discussed earlier, or I serve the same cached version of the page no matter what parameters are in the URL.

    Why do you say it’s a bad thing if I do this?

    Plugin Support qtwrk

    (@qtwrk)

    Okay, if you understand the consequences and are okay with them, you can do that.

    Thread Starter dadishome

    (@dadishome)

    Thank you,

    But the thing is that it doesn’t even work. I still get misses when I should get hits. I prewarm the cache with: https://ezstrap.fr/produit/bracelet-apple-watch-tissu-tresse-boucle-metallique-type-alpine/

    Then I test with https://ezstrap.fr/produit/bracelet-apple-watch-tissu-tresse-boucle-metallique-type-alpine/?attribute_pa_couleur=orange

    But I get a miss again, and it always happens, so it’s clearly not working and I have no idea why. If the query parameters are supposed to be irrelevant for the cached version, I should get a hit no matter what query parameter I add after warming the URL https://ezstrap.fr/produit/bracelet-apple-watch-tissu-tresse-boucle-metallique-type-alpine/

    Do you know why it doesn’t work?

    Regards

    Plugin Support qtwrk

    (@qtwrk)

    It works for me. What headers did you get?

    Thread Starter dadishome

    (@dadishome)

    Here is an example. As you can see, I am just changing the color, so it should be the same cached version, as I am only editing the query parameters:

    It makes absolutely no sense…

    Thread Starter dadishome

    (@dadishome)

    But I don’t really understand what you mean when you say it works for you, as you have a miss. For me, a miss means the cache was not hit, right? I am probably confused about something…

    Plugin Support qtwrk

    (@qtwrk)

    No, you need to look at x-litespeed-cache: bkn as well. bkn stands for backend, which means the cache was missed at the CDN node but hit at the origin server, which is also considered a cache hit.
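The way of reading the two headers described above can be sketched as a small Python helper. `describe_cache_status` and its return labels are hypothetical; the handling of `x-qc-cache: miss` together with `x-litespeed-cache: bkn` follows this thread, while treating a plain `x-litespeed-cache: hit` as an origin hit is an added assumption.

```python
def describe_cache_status(headers):
    """Classify a response as 'cdn-hit', 'origin-hit' or 'miss' from its cache headers."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    cdn = normalized.get("x-qc-cache")
    origin = normalized.get("x-litespeed-cache")
    if cdn == "hit":
        return "cdn-hit"
    # Per the thread, "bkn" means the CDN node missed but the origin cache answered.
    # Treating a plain "hit" the same way is an assumption of this sketch.
    if origin in ("hit", "bkn"):
        return "origin-hit"
    return "miss"


if __name__ == "__main__":
    # The header pair quoted earlier in the thread.
    print(describe_cache_status({"x-litespeed-cache": "bkn", "x-qc-cache": "miss"}))
```

The headers themselves can be taken from the browser's network tab or from any HTTP client, then passed in as a plain dictionary.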

    Thread Starter dadishome

    (@dadishome)

    Ahhh OK, I didn’t know about that. So x-qc-cache: miss means only the CDN missed, not the LiteSpeed cache on the server. So I only need to pay attention to bkn, that’s good.

    Then, to get x-qc-cache: hit, I need to warm up the CDN node, but this is only possible with the crawler, if I understand correctly?

The topic ‘Query parameter inside URL in cache’ is closed to new replies.