Exclude html tags with class from highlighting
-
Hello,
I’m using Relevanssi feature ‘Highlight query terms in documents’ which is great.
For code snippets I’m using the plugin Enlighter. Unfortunately, if the query term is part of the code snippet the html is outputted instead of highlighting the term.
Example for search term ‘system’:
get <span style="background-color: #dbeff2">system</span> ....
It’s OK that there is no highlighting, but the HTML code is not OK. Is there a way to tell Relevanssi not to highlight, e.g., between <pre class="EnlighterJSRAW"> and </pre>?
Best regards
saNNNy
-
No, not really… Currently there are few good solutions for this. The only way I can come up with right now is to look at the excerpts before they are printed out on the search results template and remove the offending highlights there. That’s a bit of a hack, but should work.
Something like this:
$excerpt = preg_replace( '/<\/?span.*?>/', '', $post->post_excerpt );
echo $excerpt;
But there should probably be another way. Perhaps some added filter hooks… maybe one for modifying the highlighted excerpts?
In the excerpts in the search template everything is fine. When I open a page (with highlight parameter in url) the highlighting ‘doesn’t work correctly’.
When I put your code in the search template there is no more highlighting and when I open a page there is no difference.
Some filter hooks would be nice, yes. Maybe one for modifying highlighted excerpts and another for modifying highlighted documents.
Ah, missed that you talked about highlighting in documents. My solution won’t help there. It’s not unusual to have problems with the highlighting in documents, it’s a very rough method.
Actually, you don’t need a specific filter hook for modifying highlighted content. The highlighting is added with the_content on priority 11, so you can add your own function on a later priority that cleans out the unwanted highlights. That’s one solution.
For another solution, I’ll add this filter:
$content = apply_filters( 'relevanssi_clean_excerpt', $content, $start_emp_token, $end_emp_token );
It’ll be added in relevanssi_highlight_terms(), before these lines:
$content = str_replace( $start_emp_token, $start_emp, $content );
$content = str_replace( $end_emp_token, $end_emp, $content );
$content = str_replace( $end_emp . $start_emp, '', $content );
It’ll allow you to modify the highlight marker tokens before they are converted into whatever tags your chosen highlighting method uses. The tokens look like this: **{}[ and ]}**, but since they may change at some point, the actual tokens are passed as parameters.
This is probably the cleanest way to remove unwanted highlights. Slightly before those lines in excerpts-highlights.php there’s some code that cleans out highlights from inside tags.
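As a rough sketch of the two hook-based options described above (this is not code from Relevanssi itself): the helper below unwraps highlight markers found inside pre and code elements, with or without attributes. All function names starting with my_ are made up for illustration. The span pattern assumes the highlight markup looks like the example in the first post, and the second hook assumes the relevanssi_clean_excerpt filter is added with the three parameters shown above.

```php
<?php
/**
 * Hypothetical helper: inside every <pre>/<code> element, replace each
 * marker pair matched by $pair_regex with its first capture group.
 */
function my_unwrap_in_code_blocks( $content, $pair_regex ) {
    $result = preg_replace_callback(
        // A whole <pre ...>...</pre> or <code ...>...</code> element.
        // "s" lets "." span line breaks, "i" ignores tag case.
        '~<(pre|code)\b[^>]*>.*?</\1>~is',
        function ( $m ) use ( $pair_regex ) {
            // Keep the block unchanged if the inner replace fails.
            return preg_replace( $pair_regex, '$1', $m[0] ) ?? $m[0];
        },
        $content
    );
    // preg_replace_callback() returns null on a regex error.
    return null === $result ? $content : $result;
}

// Option 1: clean up finished HTML after the highlighting has run.
// Assumption: highlights are spans with an inline background colour.
function my_clean_document_highlights( $content ) {
    $pair = '~<span style="background-color:[^"]*">(.*?)</span>~is';
    return my_unwrap_in_code_blocks( $content, $pair );
}

// Option 2: remove the marker tokens before they become tags.
// The tokens are taken from the filter parameters, never hard-coded.
function my_clean_highlight_tokens( $content, $start_emp_token, $end_emp_token ) {
    $pair = '~' . preg_quote( $start_emp_token, '~' )
        . '(.*?)'
        . preg_quote( $end_emp_token, '~' ) . '~s';
    return my_unwrap_in_code_blocks( $content, $pair );
}

// Register the hooks only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    // Priority 12 runs after the highlighting on priority 11.
    add_filter( 'the_content', 'my_clean_document_highlights', 12 );
    // Assumed filter name and argument count, as quoted in this thread.
    add_filter( 'relevanssi_clean_excerpt', 'my_clean_highlight_tokens', 10, 3 );
}
```

Either hook alone should be enough; both are registered here only to show the two variants side by side. Highlights outside pre and code elements are left untouched.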
Actually, now that I look at it, the easiest solution is actually even easier. Relevanssi already cleans out highlights from inside script, style, object and embed tags. I’ll just add pre and code to that list. That should solve your problem here as well.
That’s a simple change, you can try it out now. Just replace this:
if ( preg_match_all( '/<(style|script|object|embed)>.*<\/(style|script|object|embed)>/U', $content, $matches ) > 0 ) {
with this:
if ( preg_match_all( '/<(style|script|object|embed|pre|code)>.*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
in excerpts-highlights.php. Does that solve your problem?
Thank you very much for your help.
In code and pre tags without any class the highlighting already worked fine. The problem only appears in combination with the Enlighter plugin, which adds a class to the tags.
I tried your last solution. Relevanssi now cleans out highlights from inside code and pre tags, but only from those without any class.
So I tried to change the code to
if ( preg_match_all( '/<(style|script|object|embed|pre.*|code.*)>.*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
Now highlights from inside code and pre tags with or without a class are cleaned. This works great!
If I only want to clean them out between the tags with the Enlighter class I can change it to:
if ( preg_match_all( '/<(style|script|object|embed|pre class="Enlighter.*|code class="Enlighter.*)>.*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
This also works great (I’m not sure what I prefer)!
Will you add something for this? Or do I need to update the file after every Relevanssi update?
How about this:
if ( preg_match_all( '/<(style|script|object|embed|pre|code).*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
for a clean solution? I guess style and script tags can also have attributes.
Yes, this sounds good and works like a charm.
Good, I’ll change it to that in the next version.
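To see why the ">" directly after the tag name matters, here is a small standalone check of the two patterns from this thread against a pre tag with and without a class. The helper name and the sample strings are made up for illustration.

```php
<?php
// Pattern with ">" directly after the tag name, as first suggested.
$strict = '/<(style|script|object|embed|pre|code)>.*<\/(style|script|object|embed|pre|code)>/U';
// Pattern without that ">": the ungreedy ".*" also consumes any attributes.
$loose = '/<(style|script|object|embed|pre|code).*<\/(style|script|object|embed|pre|code)>/U';

// Hypothetical helper: true if the pattern finds at least one block in $html.
function my_pattern_matches( $pattern, $html ) {
    return preg_match_all( $pattern, $html, $matches ) > 0;
}

$plain   = '<pre>get system</pre>';
$classed = '<pre class="EnlighterJSRAW">get system</pre>';

var_dump( my_pattern_matches( $strict, $plain ) );   // bool(true)
var_dump( my_pattern_matches( $strict, $classed ) ); // bool(false)
var_dump( my_pattern_matches( $loose, $plain ) );    // bool(true)
var_dump( my_pattern_matches( $loose, $classed ) );  // bool(true)
```

Both patterns are used here exactly as quoted. Without the "s" modifier, "." does not match line breaks in PCRE, so whether multi-line blocks are caught depends on how Relevanssi prepares $content before this check, which the thread does not show.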
Great, thank you very much.
Hi Mikko,
I was glad to see that there is an update for Relevanssi today. But unfortunately, it doesn’t solve my problem.
I looked at the excerpts-highlights.php file and saw that you changed the if clause to
if ( preg_match_all( '/<(style|script|object|embed|pre|code)>.*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
instead of
if ( preg_match_all( '/<(style|script|object|embed|pre|code).*<\/(style|script|object|embed|pre|code)>/U', $content, $matches ) > 0 ) {
With your code, tags with attributes aren’t matched, because the pattern expects the tag to be closed directly after its name.
This has been fixed now and the fix will be included in the next version (4.0.12 or 4.1).
Thank you.
It seems the problem also persists in version 4.1. Or is there a setting in the plugin for omitting specific classes?
Please see our page https://www.sketchengine.eu/documentation/api-documentation/?highlight=API#toggle-id-1 where the word “API” is highlighted. Within the Enlighter plugin code it creates this:
<span style="background-color: #ffaf75">api</span>
Any news about this issue?
You may need to open a new topic. I think Mikko doesn’t read this one anymore because the issue is marked as resolved.
In my version 2.2.5 (premium) everything looks okay.