img are processed multiple times
-
Hi,
In some cases Lazy Load XT processes img tags multiple times. I noticed this issue with a post grid component that I use (Beaver Builder).
This occurs because the post grid uses the_post_thumbnail(), which fires the post_thumbnail_html filter, so the img tags are processed by Lazy Load XT. Then, when the the_content filter runs, Lazy Load XT processes the post grid’s img tags a second time. To avoid this I slightly modified the regex you use to grab img tags.
So I turned this:
preg_match_all('/<'.$tag.'[\s\r\n]+.*?(\/|\/'.$tag.')>/is',$content,$matches);
into this:
preg_match_all('/<'.$tag.'[\s\r\n]([^<]+)(\/|\/'.$tag.')>(?!<noscript>|<\/noscript>)/is',$content,$matches);
Hope this helps, and thanks for this great lazy loading implementation.
Best regards
-
Interesting. So does your theme create some HTML that includes get_the_post_thumbnail(), then run apply_filters('the_content', $that_html)?
Either way, I’m making some revisions to the regex for next version, so I’ll include your (?!<noscript>|<\/noscript>) bit.
I’ll be releasing the next version shortly, so feel free to update to it when you see it come through.
Interesting. So does your theme create some HTML that includes get_the_post_thumbnail(), then run apply_filters('the_content', $that_html)?
Yes, exactly, but it’s not the theme that calls get_the_post_thumbnail(). It’s the Beaver Builder plugin, a drag-and-drop frontend editor that lets you drop a post grid module anywhere in a page. I don’t know how other page builder plugins behave, but I suspect they do the same.
I’ll be checking your next version to see how it plays with Beaver Builder.
Regards
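For readers following along, here is a minimal sketch of why the lookahead stops the double processing described above. The lazy_wrap() function and the placeholder.gif name are hypothetical stand-ins, not the plugin's actual code; the pattern is the one proposed later in this thread.

```php
<?php
// Hypothetical, simplified stand-in for the plugin's filter; NOT its real code.
function lazy_wrap(string $html): string
{
    // An <img ... /> tag that is not already followed by <noscript>
    // (the lazy placeholder) or </noscript> (the fallback copy).
    $pattern = '/<img[\s\r\n]+([^<]+)\/>(?!<noscript>|<\/noscript>)/i';

    return preg_replace_callback($pattern, function (array $m): string {
        $original = $m[0];
        // Move the real src to data-src and put a placeholder in src.
        $lazy = preg_replace('/\ssrc=/', ' src="placeholder.gif" data-src=', $original, 1);
        return $lazy . '<noscript>' . $original . '</noscript>';
    }, $html);
}

$html  = '<p><img src="photo.jpg" alt="Photo" /></p>';
$once  = lazy_wrap($html);  // e.g. run on post_thumbnail_html
$twice = lazy_wrap($once);  // e.g. run again on the_content

// The second pass finds nothing left to process.
var_dump($once === $twice);
```

Without the lookahead, the second pass would match both the placeholder img and the copy inside noscript and wrap them again.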
I have a development version of the plugin here that incorporates the regex change you suggested. I’ll be releasing the next version soon.
I checked out revision 1135603. Is this the right one?
There is a problem with the regex, and I think you should change this:
preg_match_all('/<'.$tag.'[\s\r\n]+.*?'.$tag_end.'>(?!<noscript>|<\/noscript>)/is',$content,$matches);
to this:
preg_match_all('/<'.$tag.'[\s\r\n]([^<]+)'.$tag_end.'>(?!<noscript>|<\/noscript>)/is',$content,$matches);
By replacing +.*? with ([^<]+) we match everything except an opening tag, which is necessary with the (?!<noscript>|<\/noscript>) addition.
To see what happens, you can try both regexes against the following HTML:
<img width="155" height="300" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /><noscript><img width="155" height="300" src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /></noscript> <img class="fl-photo-img" src="http://local.wordpress.dev/wp-content/uploads/2015/04/5527cc7c08d02_input_1.jpg" alt="5527cc7c08d02_input_1" itemprop="image" />
1135603 is correct.
I understand the basics of regex, but I’m no pro. I tested both expressions against the HTML you provided, and they both accurately matched the second img but not the first.
I recognize that they both work, so I don’t understand the purpose of changing +.*? to ([^<]+). Can you explain it?
Does ([^<]+) begin here? …
<img class="fl-photo-img" src="http://local.wordpress.dev/wp-content/uploads/2015/04/5527cc7c08d02_input_1.jpg" alt="5527cc7c08d02_input_1" itemprop="image" />
Whereas +.*? begins here? …
<img class="fl-photo-img" src="http://local.wordpress.dev/wp-content/uploads/2015/04/5527cc7c08d02_input_1.jpg" alt="5527cc7c08d02_input_1" itemprop="image" />
(Look for the bold characters in those img tags. They’re kinda hard to see.)
I’m not a regex pro either, so we speak the same language.
Did you use the s modifier when you tested both regexes?
My understanding is that with the s modifier turned on, the first regex will match the whole string:
<img width="155" height="300" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /><noscript><img width="155" height="300" src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /></noscript> <img class="fl-photo-img" src="http://local.wordpress.dev/wp-content/uploads/2015/04/5527cc7c08d02_input_1.jpg" alt="5527cc7c08d02_input_1" itemprop="image" />
Indeed, if we split the regex <img[\s\r\n]+.*?\/>(?!<noscript>|<\/noscript>) into its parts:
<img[\s\r\n]+ matches the start of the img tag followed by one or more whitespace characters (carriage returns and newlines included).
.*? matches any character (newlines included when the s modifier is on), zero or more times, as few times as possible.
\/>(?!<noscript>|<\/noscript>) matches the end of the tag if it is not followed by <noscript> or </noscript>.
So this regex will not match just:
<img width="155" height="300" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /><noscript>
as the end of the img tag is followed by <noscript>.
However, it will match the whole string. It starts at the first opening img tag and, since .*? places no restriction on what comes in between, it runs to the first "/>" that is not followed by <noscript> or </noscript>, which is the closing "/>" of the second image (the last img tag) in our example:
<img width="155" height="300" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /><noscript><img width="155" height="300" src="http://local.wordpress.dev/wp-content/uploads/2013/03/featured-image-vertical-155x300.jpg" class="attachment-medium wp-post-image" alt="Horizontal Featured Image" /></noscript> <img class="fl-photo-img" src="http://local.wordpress.dev/wp-content/uploads/2015/04/5527cc7c08d02_input_1.jpg" alt="5527cc7c08d02_input_1" itemprop="image" />
Now if we replace .*? with ([^<]+), we still match everything except a new opening tag, which prevents the match from spreading across multiple tags.
So the resulting regex should be:
<img[\s\r\n]+([^<]+)\/>(?!<noscript>|<\/noscript>)
With this you can even remove the s modifier, as a negated character class always matches newline characters (see http://php.net/manual/en/reference.pcre.pattern.modifiers.php).
You’ll also notice that I added back the + after [\s\r\n] that was missing in my previous version.
What do you think?
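The comparison above can be reproduced with a short script. This is only a sketch: the markup is a shortened version of the sample in this thread, the file names are placeholders, and the variable names are made up for illustration.

```php
<?php
// Shortened version of the markup from this thread (file names are placeholders).
$html = '<img src="placeholder.gif" data-src="a.jpg" />'
      . '<noscript><img src="a.jpg" /></noscript> '
      . '<img class="fl-photo-img" src="b.jpg" />';

$lazy_dot    = '/<img[\s\r\n]+.*?\/>(?!<noscript>|<\/noscript>)/is';
$no_open_tag = '/<img[\s\r\n]+([^<]+)\/>(?!<noscript>|<\/noscript>)/i';

preg_match_all($lazy_dot, $html, $a);
preg_match_all($no_open_tag, $html, $b);

// .*? keeps expanding past the first two "/>" because each one is followed
// by <noscript> or </noscript>, so the single match is the whole string.
var_dump($a[0][0] === $html);

// [^<]+ cannot cross into another tag, so only the unprocessed img matches.
var_dump($b[0] === ['<img class="fl-photo-img" src="b.jpg" />']);

// A negated character class matches newlines even without the s modifier.
$multiline = "<img\n  src=\"c.jpg\"\n/>";
var_dump(preg_match($no_open_tag, $multiline) === 1);
```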
Ahah. So my original regex would match this:
<img src="" /><noscript></noscript>Blah blah<br />
because it looks for a “/>”?
Whereas yours looks until a “>” and then checks for the <noscript>?
I think that makes sense. And yeah, that would be a good amendment to the code. I’ll incorporate it and release a new version this weekend.
Thanks for your help!
Yes, exactly, the original regex would match all of this.
The version I proposed doesn’t match this:
<img src="" /><noscript></noscript>
because “<img…” is followed by “<noscript…”.
And it doesn’t match this either:
<img src="" /><noscript></noscript>Blah blah<br />
because there is a "<" between "<img..." and "<br />". So in fact it breaks at the "<" of "<noscript>".
Glad I could help :)
Just released 0.4 with this implemented in it. Thanks for your help! I sincerely appreciate it.
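For anyone landing on this thread later, the small "Blah blah" case discussed above can be checked as follows. This is a sketch using the two patterns as written out in the thread with $tag set to img, not the code shipped in any particular plugin version.

```php
<?php
$sample = '<img src="" /><noscript></noscript>Blah blah<br />';

$original = '/<img[\s\r\n]+.*?\/>(?!<noscript>|<\/noscript>)/is';
$proposed = '/<img[\s\r\n]+([^<]+)\/>(?!<noscript>|<\/noscript>)/i';

// The original pattern skips the first "/>" (it is followed by <noscript>)
// and runs on to the "/>" of <br />, swallowing everything in between.
preg_match($original, $sample, $m1);
var_dump($m1[0] === $sample);

// The proposed pattern cannot cross the "<" of <noscript>, so the already
// processed img is left alone and nothing matches.
$count = preg_match($proposed, $sample, $m2);
var_dump($count === 0);
```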