Override WordPress content import (custom preprocessing)
-
Hi Maxwell,
I’m using mxchat with complex Elementor-built pages that contain heavily nested structures (e.g. vc-rows, shortcodes, nested containers). The current tag-stripping approach helps, but the resulting content isn’t reliable or structured enough for RAG use.
I also tried the URL and sitemap import options, but in our setup (preloader + JS-rendered content) they only capture placeholder HTML rather than the final rendered page. Not a blocker for me right now, just noting it in case it matters.
What actually worked well was manually having ChatGPT transform page content into clean, structured knowledge-base text for RAG.
My question: is there a way to override or hook into the WordPress content import step in mxchat so I can inject a custom preprocessing function (e.g. HTML → structured Markdown/clean text) before indexing? I think hooking into mxchat_before_process_post could do the trick.
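To make the goal concrete, here is a minimal Python sketch of the kind of preprocessing I have in mind. It runs outside WordPress and uses only the standard library. The names `StructuredTextExtractor` and `html_to_structured_text` are hypothetical and not part of mxchat, and the shortcode regex is a crude heuristic rather than a real shortcode parser.

```python
import re
from html.parser import HTMLParser


class StructuredTextExtractor(HTMLParser):
    """Flatten nested page-builder markup into headings, paragraphs and list items."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### ", "h4": "#### "}
    BLOCKS = {"p", "div", "section", "ul", "ol", "li", "br"} | set(HEADINGS)
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []      # finished output blocks
        self.buffer = []     # text fragments of the block being read
        self.prefix = ""     # Markdown prefix for the block being read
        self.skip_depth = 0  # > 0 while inside script/style/noscript

    def _flush(self):
        # Collapse whitespace and emit the buffered text as one block.
        text = re.sub(r"\s+", " ", "".join(self.buffer)).strip()
        if text:
            self.lines.append(self.prefix + text)
            self.prefix = ""
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCKS:
            self._flush()
            if tag in self.HEADINGS:
                self.prefix = self.HEADINGS[tag]
            elif tag == "li":
                self.prefix = "- "

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCKS:
            self._flush()
            if tag in self.HEADINGS or tag == "li":
                self.prefix = ""  # never leak a prefix into the next block

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.buffer.append(data)


def html_to_structured_text(html):
    # Heuristic: drop shortcode tags like [vc_row] or [/vc_column], keep their inner content.
    html = re.sub(r"\[/?[a-zA-Z][\w-]*[^\]]*\]", "", html)
    parser = StructuredTextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(parser.lines)
```

For example, `html_to_structured_text('[vc_row]<div><h2>Opening hours</h2><div><p>Open <strong>Mon-Fri</strong>.</p></div></div>[/vc_row]')` gives a `## Opening hours` heading followed by the paragraph `Open Mon-Fri.`, with the wrapper containers and shortcode tags gone. That is roughly the shape of output I would like to hand to the indexer instead of the tag-stripped version.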
Any pointers to relevant hooks or extension points would be appreciated.
Thanks,
Samy