• Resolved Weston Ruter

    (@westonruter)


    I just encountered an issue (via support topics) where Aruba HiSpeed Cache corrupts a site's markup when its HTML Optimizer feature is enabled. For example, consider the following block markup in the editor:

    <!-- wp:paragraph -->
    <p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>
    <!-- /wp:paragraph -->

    <!-- wp:paragraph -->
    <p>This is a Script:</p>
    <!-- /wp:paragraph -->

    <!-- wp:code -->
    <pre class="wp-block-code"><code>&lt;script>
    /* example script */
    &lt;/script></code></pre>
    <!-- /wp:code -->

    <!-- wp:html -->
    <p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>
    <!-- /wp:html -->

    <!-- wp:code -->
    <pre class="wp-block-code"><code>&lt;style>
    /* example style */
    &lt;/style></code></pre>
    <!-- /wp:code -->

    Without the HTML Optimizer enabled, this is rendered on the frontend as:

    <p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>

    <p>This is a Script:</p>

    <pre class="wp-block-code"><code>&lt;script>
    /* example script */
    &lt;/script></code></pre>

    <p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>

    <pre class="wp-block-code"><code>&lt;style>
    /* example style */
    &lt;/style></code></pre>

    However, when the HTML Optimizer is enabled, this is the resulting output:

    <p>This is an XPath: <code>/HTML/BODY &lt;/script&gt;</code></pre> <p data-od-xpath="/HTML/BODY/ &lt;/style&gt;</code></pre>

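    The following markup is being erroneously stripped out. In the listing below, everything from the /* inside each XPath through the */ that closes the next example comment (shown in bold in the original post) is removed: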

    <p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>

    <p>This is a Script:</p>

    <pre class="wp-block-code"><code>&lt;script>
    /* example script */
    &lt;/script></code></pre>

    <p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>

    <pre class="wp-block-code"><code>&lt;style>
    /* example style */
    &lt;/style></code></pre>

    The problematic code is this line:

     $buffer = preg_replace( '@\/\*(.*?)\*\/@s', ' ', $buffer); //remove  comment
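To see why this pattern is a problem, the sketch below runs it against the frontend markup from this post. Only the regular expression comes from the plugin line quoted above; the variable names `$pattern`, `$spans`, and `$output` are illustrative assumptions.

```php
<?php
// Frontend markup from the example above (as rendered without the HTML Optimizer).
$buffer = <<<'MARKUP'
<p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>

<p>This is a Script:</p>

<pre class="wp-block-code"><code>&lt;script>
/* example script */
&lt;/script></code></pre>

<p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>

<pre class="wp-block-code"><code>&lt;style>
/* example style */
&lt;/style></code></pre>
MARKUP;

// The same pattern the plugin uses to "remove comments".
$pattern = '@\/\*(.*?)\*\/@s';

// Collect every span the pattern treats as a comment.
preg_match_all( $pattern, $buffer, $matches );
$spans = $matches[0];

// Apply the replacement the same way the plugin does.
$output = preg_replace( $pattern, ' ', $buffer );

// Print each matched span so the over-matching is visible.
foreach ( $spans as $i => $span ) {
	echo "Span {$i}:\n{$span}\n\n";
}
```

On this input the pattern finds two spans. Each one starts at the `/*` inside an XPath and runs to the `*/` that closes the next comment in a code block, which is consistent with the corrupted output shown earlier.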

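    Using regular expressions to manipulate HTML is highly dangerous: this pattern removes everything from any /* to the next */ anywhere in the page, including text and attribute values, not just comments in scripts and styles. I strongly recommend you switch to something more robust, such as the HTML API, which is available as of WordPress 6.2. In particular, WordPress 6.7 introduced the ability to safely manipulate the text nodes in tags via the ::set_modifiable_text() method.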
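As a rough illustration of the HTML API approach, the sketch below confines comment removal to the contents of STYLE elements. The function name `example_strip_css_comments_in_style_tags()` is hypothetical and not part of the plugin or of WordPress; it assumes WordPress 6.7 or later for `set_modifiable_text()` and simply returns the markup unchanged when the HTML API is unavailable.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): strips CSS comments only from
 * the contents of STYLE elements and leaves all other markup untouched.
 * Assumes WordPress 6.7+ for WP_HTML_Tag_Processor::set_modifiable_text().
 */
function example_strip_css_comments_in_style_tags( $html ) {
	// Outside WordPress, or before 6.7, leave the markup unchanged.
	if (
		! class_exists( 'WP_HTML_Tag_Processor' ) ||
		! method_exists( 'WP_HTML_Tag_Processor', 'set_modifiable_text' )
	) {
		return $html;
	}

	$processor = new WP_HTML_Tag_Processor( $html );

	// Visit only STYLE elements; text nodes and attributes are never touched.
	while ( $processor->next_tag( array( 'tag_name' => 'STYLE' ) ) ) {
		$css      = $processor->get_modifiable_text();
		$stripped = preg_replace( '@/\*.*?\*/@s', ' ', $css );

		// Write back only when the replacement succeeded and changed something.
		if ( is_string( $stripped ) && $stripped !== $css ) {
			$processor->set_modifiable_text( $stripped );
		}
	}

	return $processor->get_updated_html();
}
```

The comment pattern here is still naive about CSS itself (for example, a `/*` inside a quoted string), so a real implementation would likely want a proper CSS tokenizer. The point of the sketch is only that the HTML API scopes the change to stylesheet contents instead of the whole page.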

Viewing 7 replies - 1 through 7 (of 7 total)
  • Thread Starter Weston Ruter

    (@westonruter)

    This is liable to break the Optimization Detective plugin from the WordPress Core Performance team, which includes such XPath references.
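The corruption does not depend on code blocks in the post content. As a minimal sketch, the hypothetical page fragment below pairs an element carrying an XPath attribute (the XPath form is taken from the example in this thread) with an inline stylesheet containing an ordinary CSS comment; the fragment itself is an assumption made for illustration.

```php
<?php
// Hypothetical page fragment: an XPath in a data attribute, followed by an
// inline stylesheet that contains a regular CSS comment.
$buffer = '<div data-od-xpath="/HTML/BODY/*[1][self::DIV]">Hello</div>'
	. '<style>/* theme styles */ body { margin: 0; }</style>';

// The plugin's pattern, applied as in the line quoted in the first post.
$output = preg_replace( '@\/\*(.*?)\*\/@s', ' ', $buffer );

echo $output, "\n";
```

The match begins inside the attribute value and ends at the CSS comment's closing `*/`, so the attribute's closing quote, the element's content, and the opening `<style>` tag are all removed.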

    Plugin Contributor Aruba Support

    (@arubasupport)

    Dear Customer,
    We have checked the regular expression in the code in question and found no evidence that it removes any tags in a way that could cause the difficulties you have reported.
    We suggest checking the other optimiser plugins active on the site, as they may conflict with each other.
    Best regards

    Thread Starter Weston Ruter

    (@westonruter)

    @arubasupport First of all, I’m not a customer but rather a maintainer of other plugins. I’m also a WordPress core committer.

    Secondly, here is the evidence that shows how your regular expression is inadvertently corrupting HTML: https://3v4l.org/IEJ0P

    Plugin Contributor Aruba Support

    (@arubasupport)

    Dear Weston Ruter,
    We would like to inform you that the plugin in question can only be used on WordPress hosting at Aruba Spa.
    Best regards

    Thread Starter Weston Ruter

    (@westonruter)

    @arubasupport Yes, I know. I’ve opened this support topic on behalf of a user of your plugin who is hosted on Aruba Spa. See the related support topic.

    dolceremy

    (@centoasa)

    I confirm that I am a user and customer of Aruba WordPress managed hosting.

    Plugin Contributor Aruba Support

    (@arubasupport)

    Dear dolceremy,
    With regard to what you have reported, so that we can carry out the necessary analysis, we invite you to request assistance through this channel: https://assistenza.aruba.it/home.aspx
    Best regards


The topic ‘HTML Optimizer corrupts output’ is closed to new replies.