Backups are an exercise in trade-offs between ease of running, ease of restoring, time to run, and storage to be used. Only you can choose the relative priorities for each of those things.
I'm with aljames2 on my backup methods. For me, daily, versioned backups were a requirement. Initially, I was doing local copies, just to a different disk. Then I slowly moved to "push" backups over the network and, after seeing all the malware and crypto-ware problems that versioned backups could protect against, I switched to "pull" backups. The client systems don't have any access at all to the backup storage on the server.
If they did, then malware could corrupt all the versioned backups, which defeats the purpose. That also means that using network storage is a bad idea for backup storage, so don't use CIFS- or NFS-mounted storage. Choose tools that use a non-storage client/server architecture.
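To make the "pull" idea concrete, here is a minimal sketch meant to run on the backup server. It assumes rdiff-backup is installed on both machines and that the server has key-based SSH access to the client. The host name, the `backup` user, and the `/backups` path are placeholders for illustration, and the script only prints the command unless told to run it. It uses the classic `user@host::path` syntax; newer rdiff-backup releases also have a sub-command form, so check the man page for your version.

```shell
#!/bin/sh
# Run on the BACKUP SERVER. The server connects to the client over SSH;
# the client never holds credentials for the backup storage.
# Placeholder names (assumptions): host "deneb", user "backup", /backups root.
CLIENT="deneb"
SRC_DIR="/etc"
REPO="/backups/$CLIENT/etc"

# Print the classic rdiff-backup remote-source command (user@host::path).
pull_command() {
    printf 'rdiff-backup backup@%s::%s %s\n' "$CLIENT" "$SRC_DIR" "$REPO"
}

# Dry run by default; set RUN=1 in the environment to execute for real.
if [ "${RUN:-0}" = "1" ]; then
    rdiff-backup "backup@${CLIENT}::${SRC_DIR}" "$REPO"
else
    pull_command
fi
```

The point of the design is the direction of the connection: nothing on the client can reach the repository, so malware on the client cannot touch older versions.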
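In the profession, we start by asking a few questions: what are the RPO (Recovery Point Objective, how much recent data you can afford to lose) and RTO (Recovery Time Objective, how long a restore is allowed to take) requirements?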
Answering those questions will determine how much money needs to be spent and the complexity of the backup methods.
Typically, backups once a day are just as cheap as once a week or once a month, so there's little reason to do anything besides daily backups.
For my needs, my RTO was 1 hour for core capabilities. So, my recovery process needs to take less than 1 hour until the system is back up and working with the most important capabilities. Any computer here with less than 100GB of data can be restored from scratch in less than 1 hour. It won't be exactly the same, since I don't back up everything, but it will work the same. Because I use snapshots, sometimes I can recover from logical issues in just a few minutes. LVM and ZFS snapshots are extremely powerful, but they have to be set up BEFORE they are needed and, most often, they need to be selected during the OS install. There's no easy way to go back and add them post-install.
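The RTO arithmetic is worth doing once for your own numbers. A rough sketch, assuming decimal units (1 GB = 1000 MB) and ignoring the time needed to reinstall the OS and packages; the function name is made up for illustration:

```shell
#!/bin/sh
# Minimum sustained restore throughput, in MB/s, needed to move
# SIZE_GB of data within RTO_MIN minutes.
# Assumes 1 GB = 1000 MB and that copying data is the only cost.
required_mb_per_s() {
    awk -v gb="$1" -v min="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / (min * 60) }'
}

required_mb_per_s 100 60   # 100 GB inside a 1 hour RTO
```

For 100 GB in 60 minutes this prints 27.8, so roughly 28 MB/s sustained. A single spinning disk or a gigabit link can usually manage that for sequential copies, which fits the "under 1 hour" figure.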
If you are new, it is likely that "pull" backups are beyond your ability. So, start with what you can wrap your head around and plan to move to "pulled" backups perhaps in 6-12 months after you have more understanding.
Some links to get you started:
An important point of consideration is exactly how much trouble you are willing to have at restore time. 1-click restores waste 100x more storage, so they aren't really viable for an entire system. Remember why we are doing backups: it is 10% about backups and 90% about restoring files, so be certain you test the restore or you'll never know whether it actually works. When I was first creating my new backup techniques (around 2010), it took me 5 attempts before I solved all the chicken-and-egg issues and had enough practice doing the restore process that, for most systems here, a restore takes less than 30 minutes. It is only systems with lots and lots of data that take more. I have 2 systems with over 20TB of data ... just copying the backup data to new HDDs (assuming I had those HDDs available) would take about 3 days. However, the core services those 2 computers provide can be up and working in about 45 minutes, just without all that data.
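Testing a restore can be as simple as restoring into a scratch directory and comparing it with the live data. A minimal sketch follows. The paths in the comment are hypothetical, and the restore command shown there uses classic rdiff-backup syntax as one example only. Files that changed since the last backup will legitimately show up as differences.

```shell
#!/bin/sh
# Step 1 (tool-specific, shown for illustration with hypothetical paths):
#   rdiff-backup -r now /backups/deneb/etc /tmp/restore-test
# Step 2: compare the restored tree with the source.

# Print a verdict and return 0 only if both trees have identical contents.
# Note: diff -r checks file contents, not ownership or permissions.
verify_restore() {
    if diff -r "$1" "$2" >/dev/null 2>&1; then
        echo "restore OK"
    else
        echo "restore MISMATCH"
        return 1
    fi
}

# Example: verify_restore /etc /tmp/restore-test
```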
What do they say? Begin with the end in mind.
Just so you don't think TB and TB of data are needed for every versioned backup, here's the massively cut version of my email gateway server backups:
Code:
# rdiff-backup --list-increment-sizes spam2
Time                        Size        Cumulative size
-----------------------------------------------------------------------------
Wed May 22 00:03:28 2024    3.90 MB     3.90 MB   (current mirror)
Tue May 21 00:03:21 2024    2.95 KB     3.90 MB
Mon May 20 00:03:29 2024    1.60 KB     3.91 MB
....
Thu May 25 00:03:19 2023    14.0 KB     4.43 MB
Wed May 24 00:03:17 2023    1.27 KB     4.43 MB
Tue May 23 00:03:15 2023    1.25 KB     4.43 MB
I keep 1 yr of daily backups because it is a high-risk system. But it doesn't have any data on it and it uses extremely standard tools. The backups are really just a list of installed packages and the configuration files for those packages. Less than 4.5MB of storage to have 1 yr of backups? Seems like a bargain to me.
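For a box like this, the "list of installed packages" part can be captured with one command. This is a sketch that assumes a Debian/Ubuntu system with `dpkg`. The output path is a made-up example, and this is not necessarily how the backups above are built.

```shell
#!/bin/sh
# Write the current package selections to a file that the nightly backup
# can pick up along with the configuration files.
# Assumes dpkg (Debian/Ubuntu); returns non-zero on other systems.
save_package_list() {
    out="$1"
    if ! command -v dpkg >/dev/null 2>&1; then
        echo "dpkg not found; use your distro's package tool instead" >&2
        return 1
    fi
    dpkg --get-selections > "$out"
}

# Example (hypothetical path): save_package_list /root/backups/packages.list
```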
Of course, other systems need more storage and I don't keep as many versions. Here's my desktop computer:
Code:
# rdiff-backup --list-increment-sizes deneb
Time                        Size        Cumulative size
-----------------------------------------------------------------------------
Wed May 22 00:07:48 2024    11.4 GB     11.4 GB   (current mirror)
Tue May 21 00:11:32 2024    257 MB      11.6 GB
Mon May 20 00:06:43 2024    13.7 MB     11.6 GB
Sun May 19 00:07:20 2024    1.72 MB     11.6 GB
...
Thu Jan 25 00:05:37 2024    1.44 MB     15.8 GB
Wed Jan 24 00:05:46 2024    147 KB      15.8 GB
Tue Jan 23 00:07:02 2024    209 MB      16.0 GB
I have just 120 days of versioned backups for it. 16GB for a desktop. Seems like a bargain to me. Of course, most of the data that desktop uses is on network storage accessible via NFS. Locally, I only have active files for current programming and presentations. Reference materials are on the network storage.
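Retention like "1 year for spam2, 120 days for deneb" can be written down as a small table. This sketch only prints the pruning commands. It assumes the classic `--remove-older-than` option and that the repositories are reachable under the names used in the listings above. `--force` is included because rdiff-backup otherwise declines to delete more than one increment at a time.

```shell
#!/bin/sh
# repo:days pairs; the numbers come from the post, the layout is an assumption.
RETENTION="spam2:365 deneb:120"

# Print one prune command per repository (nothing is deleted by this script).
prune_commands() {
    for entry in $RETENTION; do
        repo=${entry%%:*}    # text before the colon
        days=${entry##*:}    # text after the colon
        printf 'rdiff-backup --remove-older-than %sD --force %s\n' "$days" "$repo"
    done
}

prune_commands
```

Review the printed commands before running any of them, since pruning cannot be undone.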
The time needed to back up each system daily matters just as much as the total storage used. I get a daily report for each system:
Code:
=== Time for Backups to spam2 ===
StartTime 1716350608.00 (Wed May 22 00:03:28 2024)
EndTime 1716350613.77 (Wed May 22 00:03:33 2024)
ElapsedTime 5.77 (5.77 seconds)
TotalDestinationSizeChange 10344 (10.1 KB)
=== Time for Backups to deneb ===
StartTime 1716350868.00 (Wed May 22 00:07:48 2024)
EndTime 1716351017.72 (Wed May 22 00:10:17 2024)
ElapsedTime 149.72 (2 minutes 29.72 seconds)
TotalDestinationSizeChange 252629230 (241 MB)
My daily backups take less than 45 minutes total across all systems here. Before I switched to rdiff-backup, I needed over 8 hrs every night to back up the systems and didn't have the storage to keep so many versions. I was using rsync. Before that, I was using image-based backups, and before that I was hoping that I wouldn't need any backups, ever. Hope is never a plan.
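Totalling the nightly run time from reports like the ones above takes one line of awk. A sketch, assuming the report text is fed on stdin and that each system contributes one `ElapsedTime` line in the format shown; the function name is invented for the example:

```shell
#!/bin/sh
# Sum the second field of every ElapsedTime line and print total seconds.
total_elapsed() {
    awk '$1 == "ElapsedTime" { sum += $2 } END { printf "%.2f\n", sum }'
}

# Example with the two reports from this post:
printf '%s\n' \
    'ElapsedTime 5.77 (5.77 seconds)' \
    'ElapsedTime 149.72 (2 minutes 29.72 seconds)' | total_elapsed
```

For the two systems shown this prints 155.49, a bit over two and a half minutes.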