Hello!
I found various cached URLs with ?query-* and Yoast SEO active (example). So, this issue isn’t isolated to TSF.
TSF’s Advanced Query Protection works by testing broken queries from registered endpoints. The canonical URL generator ignores everything that isn’t registered. Together, they help with indexing only real pages.
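To illustrate the idea of a canonical URL generator that ignores unregistered parameters, here is a minimal Python sketch. It is not TSF's actual code (TSF is a PHP plugin), and the allow-list below is a hypothetical subset of WordPress's public query variables chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical allow-list for illustration only; a real WordPress site
# registers many more public query variables than these.
REGISTERED_QUERY_VARS = {"p", "page_id", "s", "cat", "paged"}


def canonical_url(url: str) -> str:
    """Rebuild the URL, keeping only query parameters on the allow-list."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in REGISTERED_QUERY_VARS
    ]
    # Drop the fragment as well; it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With this approach, an unregistered parameter such as `query-1-page` simply never appears in the canonical URL, so the fake page points back at the real one.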
However, in your case, something is “correctly” registering the ?query-* URLs or is otherwise incorrectly forcing them into WordPress’s query APIs. This is out of our control.
If you look closely at those pages, every link is rewritten to carry the ?query-* parameter, whether it initially had one or not.
It isn’t the Twenty Twenty-Four theme doing this; it is a plugin.
I recommend disabling caches and inspecting the source of such a ?query-* page. Then, deactivate plugins one by one until the ?query-* stuff disappears; the last plugin you deactivated is probably the cause of this mayhem.
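If reading the page source by hand gets tedious, a small script can count the affected links. This is a rough Python sketch; it assumes the rogue parameter names all start with `query-`, and `find_query_links` is a made-up helper name. Paste the saved page source into it before and after deactivating each plugin.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qsl


class QueryLinkFinder(HTMLParser):
    """Collects <a href> values that carry a parameter starting with 'query-'."""

    def __init__(self):
        super().__init__()
        self.total_links = 0
        self.query_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        self.total_links += 1
        keys = [k for k, _ in parse_qsl(urlsplit(href).query, keep_blank_values=True)]
        # Assumption: the rogue parameters all share the 'query-' prefix.
        if any(k.startswith("query-") for k in keys):
            self.query_links.append(href)


def find_query_links(html: str):
    """Return (number of links found, list of links with a query-* parameter)."""
    finder = QueryLinkFinder()
    finder.feed(html)
    return finder.total_links, finder.query_links
```

If the second value comes back empty after you deactivate something, you have likely found the culprit.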
Thanks, Sybre. The issue wasn’t caused by any plugin, let alone The SEO Framework. I deactivated all plugins, but the issue wasn’t resolved until I reverted from the new theme to Twenty Twenty-Three.
I don’t know why, but the new WP theme is causing this.
I scanned the theme for the query-thing; maybe you had an older version or one that was altered somehow. I’ll test it out and report back with a workaround and a fix to help regain your indexing.
I tested it more thoroughly now. It appears that Twenty Twenty-Four allows for a block called “Query Loop,” which doesn’t honor anything WordPress has to offer. I’m not sure if the theme causes this issue or if the theme toggles something broken in WordPress itself. Either way, that block should’ve never existed — at least not in its current form.
In my testing environment, TSF won’t allow indexing of those ?query-*-pages because they aren’t registered correctly.
Yoast SEO, on the other hand, does allow indexing of these pages by assigning its SEO metadata to them. This is one of many critical things TSF is better at.
Since Yoast SEO never focused on fixing its code (I reported 30 bugs, none of which are fixed) but on extracting money from its users, bugs like these still linger.
So, the workaround is using TSF. But it won’t work magically within a day: Google has to recrawl all pages and find that the links to those rogue pages aren’t canonical. This takes longer than it took Google to discover them in the first place: usually a couple of weeks, sometimes months.
To speed up reindexing, after you activate TSF and clear all your caches (also at Cloudflare), you can ask Google to recrawl the homepage. See https://developers.google.com/search/docs/crawling-indexing/ask-google-to-recrawl.
If, after three weeks, you still find that Google indexed thousands of fake query pages, let me know, and we should see if we can start cleaning up the index using robots.txt.
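To spot-check what Google will see on a recrawl, you could save the source of a ?query-* page and read its canonical and robots tags. The Python sketch below is only an illustration: `read_seo_tags` and `declares_itself_canonical` are hypothetical helper names, and the check is a rough heuristic, not how Google decides what to index.

```python
from html.parser import HTMLParser


class SeoTagReader(HTMLParser):
    """Picks up <link rel="canonical"> and <meta name="robots"> from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots = attr.get("content")


def read_seo_tags(html: str):
    """Return (canonical URL or None, robots directive or None)."""
    reader = SeoTagReader()
    reader.feed(html)
    return reader.canonical, reader.robots


def declares_itself_canonical(page_url: str, html: str) -> bool:
    """True only if the page is not noindexed and names itself as canonical."""
    canonical, robots = read_seo_tags(html)
    if robots and "noindex" in robots.lower():
        return False
    return canonical == page_url
```

For a ?query-* page, a result of `False` is what you want: the page either has no canonical tag, points at the real page, or is marked noindex.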
Thank you, Sybre, for the detailed investigation. I have been using the Twenty Twenty-Three theme since its launch, and now I regret it. The Query Loop block is perhaps the only way to display the latest posts in full site editing themes. You can also customize it, for example by filtering categories or setting other parameters to include or exclude certain taxonomies when displaying posts. I was using the SEO Framework back then, which is probably why such pages didn’t get indexed.
However, when I switched to the Twenty Twenty-Four theme, I deactivated the plugin for a while, intending to reactivate it once all the customization was done. Before I could reactivate it, those artificial pages started appearing. I had no choice but to test various plugins to see if one could fix the issue.
I have to admit that the SEO Framework handles this issue better than other plugins. At least it doesn’t assign those faux pages any canonical value. Yoast, unfortunately, sets these pages as canonical, and the cycle never ends.
I have activated the SEO Framework again and switched to the Blocksy theme to avoid such issues in the future, even though I like the customizability of FSE themes.
My question is, how can I know when WP has fixed it? I do have a WP testing environment, so any idea how I can check if they have fixed the query loop block?
Thanks again for your reply and time, despite the fact that the issue wasn’t caused by your plugin.
Hi Sybre. Hope you’re doing well.
Google has deindexed some of my pages, but since then there has been silence. Meanwhile, my traffic has dropped drastically. Can you please tell me how to fast-track the deindexing of the invalid pages?
Thank you very much.