• Resolved LucasRolff

    (@lucasrolff)


    Hello,

    Is there anything that can be done to reduce the amount of data Updraft writes to disk when doing batched backups? When BinZip is used, every time Updraft “adds” files to a given zip, the zip is completely rewritten to a temporary file, the batched data is added at the end of it, and the process then starts over for the next batch of files.

    2577.809 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (19.9 MB, 173597 files batched, 1001 (148946) added so far); re-opening (prior size: 0.7 KB)
    2578.606 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (18.8 MB, 173597 files batched, 1001 (149947) added so far); re-opening (prior size: 20604.2 KB)
    2580.990 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (22.5 MB, 173597 files batched, 1001 (150948) added so far); re-opening (prior size: 40013.2 KB)
    2584.437 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (24.1 MB, 173597 files batched, 1001 (151949) added so far); re-opening (prior size: 63221.6 KB)
    2590.148 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (19.3 MB, 173597 files batched, 1001 (152950) added so far); re-opening (prior size: 88060.9 KB)
    2594.383 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (18.7 MB, 173597 files batched, 1001 (153951) added so far); re-opening (prior size: 107994.8 KB)
    2600.179 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (23.6 MB, 173597 files batched, 1001 (154952) added so far); re-opening (prior size: 127341.8 KB)
    2607.114 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (20.1 MB, 173597 files batched, 1001 (155953) added so far); re-opening (prior size: 151697.1 KB)
    2614.763 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (32.9 MB, 173597 files batched, 1001 (156954) added so far); re-opening (prior size: 172478.1 KB)
    2624.164 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (54.7 MB, 173597 files batched, 1001 (157955) added so far); re-opening (prior size: 206395.2 KB)
    2636.662 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (50.7 MB, 173597 files batched, 1001 (158956) added so far); re-opening (prior size: 262615.3 KB)
    2650.977 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (39.6 MB, 173597 files batched, 1001 (159957) added so far); re-opening (prior size: 314734.6 KB)
    2666.409 (0) Adding batch to zip file (UpdraftPlus_BinZip): over 1000 files added on this batch (39.5 MB, 173597 files batched, 1001 (160958) added so far); re-opening (prior size: 355394.4 KB)
    2683.314 (0) Adding batch to zip file (UpdraftPlus_BinZip): possibly approaching split limit (12.5 MB, 205 (161163) files added so far); last ratio: 1; re-opening (prior size: 395955.6 KB)

    In the above example, the final size is effectively ~400 megabytes (as configured in the plugin), but to create this 400 megabyte file, roughly 3.4 gigabytes are written in total, a write amplification of 8.5x in this case.

    For another zip being created, a total of 4.15 gigabytes is written to disk for a 400 megabyte file, a write amplification of 10.3x.

    Overall, it seems to average out at around 8.9x for a given website of 30 gigabytes, resulting in 267 gigabytes of data being written over the course of the backup.

    I know I can “improve” this by reducing the batch/split size from 400MB to 50MB, for example. This greatly reduces the write amplification, but it shouldn’t be needed in the first place.

    This has been a “problem” for many years, but as more and more people use Updraft, it has a great impact: systems end up writing multiple terabytes of data each day due to write amplification caused by a backup plugin.
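
Why a smaller split size helps can be illustrated with a simple model. The sketch below assumes that every batch adds the same amount of compressed data and that adding a batch rewrites the whole archive built so far, as described in the post above. Both are simplifications, not measured UpdraftPlus behaviour. The function name `write_amplification` and the 25 MB batch size are illustrative assumptions.

```python
import math


def write_amplification(split_mb: float, batch_mb: float) -> float:
    """Estimate write amplification for one archive under the rewrite model.

    Assumptions (hypothetical model, not taken from the plugin source):
    each batch adds `batch_mb` of compressed data, and adding a batch
    rewrites the whole archive, including the newly added batch.
    """
    batches = math.ceil(split_mb / batch_mb)
    # After batch i the archive holds min(i * batch_mb, split_mb) MB,
    # and under the model that many MB are written to disk for that batch.
    written_mb = sum(min(i * batch_mb, split_mb) for i in range(1, batches + 1))
    return written_mb / split_mb


# Compare split sizes for an assumed 25 MB of compressed data per batch.
for split in (400, 100, 50):
    print(f"{split} MB split: {write_amplification(split, 25):.1f}x")
```

With the assumed 25 MB per batch, the model gives 8.5x for a 400 MB split, 2.5x for 100 MB and 1.5x for 50 MB. In general, an archive built from k equal batches comes out at (k + 1) / 2, so the amplification grows with the number of batches per archive rather than with the archive size as such.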

Viewing 7 replies - 1 through 7 (of 7 total)
  • Plugin Contributor bcrodua

    (@bcrodua)

    Hi,

    You’re absolutely right that backups can generate enormous disk I/O due to what’s called “write amplification.” As you’ve noticed, creating a 400 MB ZIP can result in several gigabytes of actual disk writes—a lot more than you’d expect.

    To reduce this impact, you can lower the split archive size in the Expert Settings of UpdraftPlus. Reducing it from the default (e.g. 400 MB) to something like 50–100 MB will significantly cut down on write-amplification and improve backup performance.

    Additionally, running backups during low-traffic periods—such as overnight—helps both performance and reliability, especially if your hosting resource availability fluctuates during the day.

    Best Regards,
    Bryle

    Thread Starter LucasRolff

    (@lucasrolff)

    Hello,

    Running backups during low-traffic periods (such as overnight) does not fix the write amplification. And if all customers run backups during low-traffic periods, you’ll end up having servers writing several gigabytes per second (combined). We’re seeing several terabytes of data being written to disk per day due to UpdraftPlus’s write amplification issue.

    While we use expensive enterprise drives, disk writes are (still) finite; that’s simply the reality of NAND chips. Many other plugins seemingly can do backups just fine without causing write amplification issues, but Updraft is different. As a hosting provider we can’t simply go and blacklist a given plugin. If we could, Updraft would be on the list, in all honesty, due to the fact that this is even an issue.

    I ask that you bring this to your developers so they fix the issue instead of simply acknowledging that your plugin indeed has a write amplification issue. If the “fix” is to reduce the default chunk size to something smaller, then go for it. But a plugin should not cause 8x write amplification out of the box.

    Plugin Support vupdraft

    (@vupdraft)

    It used to be that UpdraftPlus created the whole backup on your server before the backup was uploaded to your remote storage.

    Now, it creates a chunk, uploads it, then immediately deletes it, so we have made significant improvements in limiting the amount of disk space that the plugin uses.

    With regards to your initial question, this is simply a limitation of php, it’s not limited to UpdraftPlus.

    You may find this article interesting: https://teamupdraft.com/documentation/updraftplus/topics/backing-up/faqs/how-much-free-disk-space-do-i-need-to-create-a-backup

    Thread Starter LucasRolff

    (@lucasrolff)

    “With regards to your initial question, this is simply a limitation of php, it’s not limited to UpdraftPlus.”

    The way you add batches of files to a zip is not a limitation of PHP, but rather how UpdraftPlus decides to implement adding files to the zip in batches.

    As I’ve already pointed out, this doesn’t seem to be an issue with a bunch of other WordPress backup plugins. If it were a PHP limitation, all backup plugins would face the same issue, but they do not.

    While the “this is a limitation of PHP” excuse may work for your average customer who doesn’t know any better, it ain’t gonna work in this case. Once again, you should really look into reducing the write amplification in the plugin. While it’s currently likely done to prevent “limits” from being hit, you’re causing backups to become slower and slower, because files are being rewritten for every single “batch” of 1000 files (for example) that’s done. The result is backups that take significantly longer than they should, because you’re likely to hit disk IO throttling as the bottleneck (especially with certain providers out there that limit people to 5-10MB/s).

    One way could, for example, be to batch more files at a time, which reduces the amplification quite significantly. The number of files could be determined based on e.g. the available memory_limit and max_execution_time settings, i.e. try to “push” it to find a nice balance. The constant of 1000 files, for example, makes no sense in many cases.

    Batching things into smaller files like 400MB zip files is fine, I’m not against it, and it does save in terms of required storage. However, that does not justify the write amplification caused by UpdraftPlus. And creating smaller zip files rather than 1 big file does not equal high write amplification. What equals high write amplification is the way you batch add files to those zip files.

    So no, it’s not a limitation of PHP, and you know that.
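
The suggestion in the post above, deriving the batch size from memory_limit and max_execution_time instead of a fixed 1000, could look roughly like the sketch below. It is only an illustration of the idea: the function names, the per-file memory cost, the files-per-second rate and the upper bound are all made-up assumptions, and nothing here reflects how UpdraftPlus actually chooses its batch size. Only the PHP shorthand notation (K, M, G, and -1 for unlimited memory, 0 for unlimited time) is standard.

```python
def parse_php_size(value: str) -> int:
    """Convert PHP shorthand such as '256M' to bytes; '-1' means unlimited."""
    value = value.strip()
    if value == "-1":
        return -1
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


def suggest_batch_files(memory_limit: str, max_execution_time: int,
                        bytes_per_file: int = 64 * 1024,
                        files_per_second: int = 200,
                        minimum: int = 1000, maximum: int = 50000) -> int:
    """Pick a batch size bounded by memory and time.

    All default values are guesses for illustration only. The result never
    drops below `minimum` (the current fixed batch size of 1000), even when
    the memory or time budget would suggest a smaller number.
    """
    limit = parse_php_size(memory_limit)
    # Budget at most half of the memory limit and half of the time limit.
    by_memory = maximum if limit < 0 else (limit // 2) // bytes_per_file
    by_time = (maximum if max_execution_time <= 0
               else (max_execution_time // 2) * files_per_second)
    return max(minimum, min(maximum, by_memory, by_time))


print(suggest_batch_files("256M", 30))
```

Under these assumed costs, a site with a 256M memory limit and a 30 second execution limit would get a batch of 2048 files, while a site with no limits would be capped at the assumed maximum. Real values would have to be measured on the target system.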

    Plugin Support vupdraft

    (@vupdraft)

    Hi,

    You can experiment with the batch size yourself. Go to the UpdraftPlus.php file.

    On line 62, you can find this:

    if (!defined('UPDRAFTPLUS_MAXBATCHFILES')) define('UPDRAFTPLUS_MAXBATCHFILES', 1000);

    Just change the 1000 to the number that you would prefer.

    Thread Starter LucasRolff

    (@lucasrolff)

    Hello,

    Considering it’s a constant, I should be able to modify this in wp-config.php? Or is there any particular reason you’re suggesting we go change hundreds of websites’ files, only for those changes to be overwritten upon the next upgrade?

    Plugin Contributor bcrodua

    (@bcrodua)

    Hi,

    You can add the line below to the wp-config.php file:

    define('UPDRAFTPLUS_MAXBATCHFILES', 2000);

    NOTE: You must insert this BEFORE /* That’s all, stop editing! Happy blogging. */ in the wp-config.php file.

    Thanks,
    Bryle

The topic ‘Improve disk IO for backups’ is closed to new replies.