Saturday, January 29, 2011

Backup filling up

We have a 400GB tape drive and use Symantec Backup Exec 12.5. Looking at the history from two months ago (before I was here), the jobs were finishing at ~355GB, but in that time they've jumped by 40GB to ~395GB, which is in the danger area for running out of space.
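
To put a rough number on "danger area", here is a minimal sketch of the arithmetic. It assumes growth is linear, which is a guess from only two data points, and it treats 400GB as a hard limit even though real tape capacity varies with compression. The function name is purely illustrative.

```python
def months_until_full(capacity_gb, current_gb, previous_gb, months_elapsed):
    """Estimate months of headroom left, assuming linear growth (an assumption)."""
    growth_per_month = (current_gb - previous_gb) / months_elapsed
    if growth_per_month <= 0:
        return float("inf")  # not growing, so it never fills up
    return (capacity_gb - current_gb) / growth_per_month


# Figures from the job history: ~355GB two months ago, ~395GB now, 400GB tape.
headroom = months_until_full(400, 395, 355, 2)
print(f"{headroom:.2f} months of headroom")
```

At 20GB per month with 5GB left, that works out to about a week.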

I'm looking for ideas for splitting this job up into a few pieces, identifying selections in the job that are growing more than others, or anything else that might get this under control.

  • IMHO, that's the nature of backups. Data accumulates and the backups grow larger. It's a natural evolution. Why is it a problem? Can you not use multiple tapes for the backup? Splitting the job isn't going to reduce the amount of data that needs to be backed up.

    From joeqwerty
  • Can your software back up to hard drives? I'd seriously consider replacing tapes with 2 TB hard drives and worrying about the problems with that approach a few years later. I know the cost is higher per item (or hope it is) but hard drives can be read on any computer, and if your tape drive fails, you lose access to all of your tapes until you get another one.

    And tapes wear out.

    David Mackintosh : And hard drives are less portable, thus tending to burn down with the server they are attached to.
    John Gardeniers : A hard drive will be dead long before a modern tape wears out.
    joeqwerty : Personally I would never use hard drives for permanent or semi-permanent backup storage. I only use them for intermediate backup storage: back up to disk, then back up the disk backups to tape.
    Michael Graff : I've had the opposite, I find tapes far less reliable than drives. Perhaps I'm living in the past. It won't be the first time...
  • I'll guess that this is a single, non-robotic drive directly attached to some server. And there's no budget, or practically no budget. Your options are:

    Go through the source data and see if you can delete or exclude parts from your backup set.

    • PRO: costs and backup job definitions stay what they are.
    • CON: whatever is deleted is gone; whatever is excluded doesn't get backed up; if data grows in "excluded" areas, it will also be silently excluded which might not be what you want

    Figure out how to split the backup into two jobs and run the two jobs on different nights on different tapes.

    • PRO: everything that needs backing up gets backed up.
    • CONS: cost (tape footprint doubles); having two jobs means someone has to put the right tape in the drive at the right time twice as frequently; two jobs means you have to be careful how things grow to ensure both that A) neither job grows disproportionately; and B) new things are actually caught by one, and only one, of the two jobs
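
    One way to keep the two jobs balanced is a simple greedy split: take the selections largest first and always add the next one to whichever job is currently smaller. The sketch below is only an illustration; the selection names and sizes are made up (they merely add up to ~395GB), and this is not something Backup Exec does for you.

```python
def split_into_jobs(sizes_gb, job_count=2):
    """Greedy partition: largest selection first, into the currently smallest job."""
    jobs = [{"total_gb": 0.0, "selections": []} for _ in range(job_count)]
    ordered = sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        smallest = min(jobs, key=lambda job: job["total_gb"])
        smallest["selections"].append(name)
        smallest["total_gb"] += size
    return jobs


# Hypothetical selections and sizes in GB, for illustration only.
example_sizes = {"Users": 150, "Finance": 90, "Exchange": 80, "SQL": 50, "Shared": 25}
for number, job in enumerate(split_into_jobs(example_sizes), start=1):
    print(f"Job {number}: {job['total_gb']:.0f}GB {job['selections']}")
```

    Because every selection is assigned to exactly one job, rerunning the split after adding a new selection is also a cheap way to check that nothing is missed or backed up twice.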

    Buy a bigger tape drive (hello, LTO-4!)

    • PRO: puts off the job-splitting or data-deletion decision for a while.
    • CON: cost++; possibly not reverse-compatible with the media you already have history on which means you have to keep the old drive around and working just in case.

    Buy a robot that will auto-change tapes for you.

    • PRO: if your software can handle tape-spanning jobs, you no longer have to worry about such pedestrian concerns, just keep shoveling media into it; such software can probably handle your existing drive as well, which gives you potential access to your history; this solution will scale up much better than any single-drive solution (i.e. it solves the problem for the next three years, not just the next six months)
    • CON: cost+++ (at the very least you have to buy the robot, which is expensive; you probably will have to buy robotic slot licenses for your software, which are expensive; you might have to totally replace your software with something that can handle tape-spanning jobs and robots, which is expensive and risks you not having access to your current history)

    Find some backup-over-the-internet scheme.

    • PRO: your hardware, media, storage and retention issues become someone else's problem.
    • CON: your internet connection needs to be able to upload at a reasonable data rate so that your backup can finish in a reasonable time frame; your internet connection should have a transfer limit which will permit you to perform these backups; backups will be slow; restores will very probably be just as slow; cost (since the service provider is undoubtedly doing this to a modern backup facility, which means you are paying for all that plus a profit for the backup service provider)
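
    To get a feel for the upload-rate concern, here is a back-of-the-envelope sketch. The 10 Mbit/s uplink and the 80% efficiency factor are assumptions picked for illustration, not measurements, and the calculation ignores compression and incremental uploads.

```python
def upload_hours(data_gb, uplink_mbps, efficiency=0.8):
    """Hours needed to upload data_gb over an uplink of uplink_mbps (megabits/s).

    efficiency is an assumed fraction of the line rate actually achieved.
    """
    data_megabits = data_gb * 1000 * 8  # decimal GB -> megabits
    seconds = data_megabits / (uplink_mbps * efficiency)
    return seconds / 3600


# A full ~395GB backup over an assumed 10 Mbit/s uplink.
print(f"{upload_hours(395, 10):.1f} hours")
```

    Under those assumptions a single full backup takes well over four days, which is why such services usually rely on an initial seed plus incrementals.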

    Backups are something that many people have written essays on.

    You have to know what the window of recovery is -- it ranges from "I want to recover the file I deleted ten minutes ago (or six months ago) NOW!" to "eh, if I have to wait a week for a tape to be delivered from offsite, that's cool."

    You have to know if these are backups for disaster recovery (ie the building burned down!) or auditing (proof of state on a certain date) or user-level recovery (someone deleted account X instead of Y) or file-level recovery (some secretary accidentally pasted a Facebook entry over the company's financials). The answers to these questions will drive how backups are done.

    If there are rules to backups, they are usually:

    • back up everything
    • keep it forever
    • keep it offsite

    Budgets will limit how much of each you can actually do, but the people in charge of the company need to know what is and is not being backed up and what the implications of those facts are. Backups are important to a company's ability to survive technical (and sometimes personnel) failures. They need to be approved by someone delegated this responsibility from a C-level.

    You can't win. When you are asking for robots, media, and software, they will say "we don't need to back up X." However in an emergency, they will be yelling at you "why didn't you back up X?"

    At my customers, every six months I prepare a document that details exactly what is being backed up, and I note every important set of objects (that I know about, heh) that are NOT being backed up, and I email it to the customer and then spend 30 minutes with them to review it. That way they know what is being backed up, what I think isn't being backed up, backup run time and data requirements, data retention policies, offsite details, a review of the major problems (and hopefully solutions) from the past six months, plus anything else I think is important. I ask them if they are aware of anything else that needs backing up. If any changes are required, I make them, re-send the document, and file those emails away as proof I sent them the documents. Realistically if something important isn't being backed up my ass will still be in a sling, but in most cases it will have company.

    Backups are a huge, expensive, complex can of worms.

    Good luck.

    David Mackintosh : Interesting. A long answer about robots, and I get challenged to prove myself a human.
    joeqwerty : +1. This is a well thought out and in depth answer and certainly presents some solid ideas to consider. Personally, I like to keep things as simple as possible. My recommendation in this scenario is: Leave the backups as they are and buy more tapes. If the current backup exceeds the capacity of a single tape then it's time to allocate 2 tapes to the job. A better solution would be to purchase a large enough USB HDD for the current job and expected future job growth, configure the current backup to back up to this USB HDD, and then back up the USB HDD to tape.
    pplrppl : David, Your answer would fit nicely in my question http://serverfault.com/questions/88016/if-lto-3-full-backup-takes-more-than-one-tape-what-is-my-next-step-hardware-wise
    Zypher : RE: Delete or exclude part - personally I think the best approach is to not exclude, but do a one-off backup of data older than x years or data that hasn't been accessed in x amount of time, then delete it. This way you can always get the data back from the one-off backup if needed, and you don't have to worry about excluded dirs growing or having data that needs to be backed up put in them
    Joel Coel : I knew most of this, but it's weird the first time you run into a system that's graduating from where everything fits easily on a single tape to needing to completely re-think the strategy... especially as I started here just a couple weeks ago.
  • Agree with David that backups usually end up being incredibly complex, especially when you begin scaling up. A few additional ideas for potential solutions:

    • Consider using disk snapshots for some of your lower priority systems. You can do this at the disk level if the data is on a SAN, or at the filesystem level (VSS for Windows, LVM for Linux and various Unixes). The downside is that snapshots only protect you from accidental file removal; they do not protect you against disk failure. You could potentially do a hybrid as well, using snapshots for incremental backups, and keeping full backups on tapes.
    • Backing up to a separate disk array and server might also be a legitimate strategy, at least for some of your systems. Most backup systems support storing backups to an arbitrary disk. Disk is cheap, so it may buy some time. You lose portability, as well as longevity, as hard drives don't last more than a few years. Also, disk space management on the backup system becomes an issue as backups grow.
    • In the end, you will need more capacity, and that translates into more money. When considering the cost of losing your important business data, the additional cost of good backup infrastructure can be easily justified. How many man-hours would be consumed by manually reentering financial data, customer information, and the like?
    From SteveM
  • The first question I'd ask is whether there's any reason you can't use a second tape. If your backup fills a tape, Backup Exec will simply prompt (via an alert) for another overwritable tape and the backup will continue on to it.

    If you absolutely can't use a second tape, then you could either go for larger drives and tapes or an autoloader ($$$), or you could consider backing up some of your data on a full/differential backup policy.

    To find out which resources are increasing in size you can go to File, New Restore Selection List, and in the View by Resource window you can browse through the cataloged backups and view the sizes of resources, folders etc.
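
    If you would rather measure the source data directly instead of browsing the catalogs, a small script can total each top-level folder and compare the result with an earlier run. This is a minimal sketch that does not use any Backup Exec API; the function names are illustrative, and it assumes the account running it can read the whole share.

```python
import os


def folder_sizes_gb(root):
    """Return {top-level folder name: total size in GB} for folders under root."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total_bytes = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for filename in filenames:
                try:
                    total_bytes += os.path.getsize(os.path.join(dirpath, filename))
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        sizes[entry.name] = total_bytes / 1e9
    return sizes


def growth(previous, current):
    """Return (name, growth_gb) pairs, largest growth first.

    Folders that are new in `current` count as growing from zero.
    """
    deltas = {name: size - previous.get(name, 0.0) for name, size in current.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)
```

    Run folder_sizes_gb against the share root once, save the result, run it again a few weeks later, and pass both results to growth to see which folders account for the increase.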

    If you want to split the jobs up into separate sections then it's a good idea to use Backup Exec's policies.

    From hmallett
  • For anyone who is interested, what we are going to do right now is create a new file share/mapped drive for users for "Archive" data. I could also call it "Static" data, or really anything that changes infrequently. This file share will only be backed up once per week.

    The idea is that it will work for a longer term because as the share grows we can easily divide up the archive section into several different chunks and back up one of the chunks each night, so that the full archive is backed up each week. We will then move anything that hasn't changed in the last year to the new location, and follow up once per quarter for new archivable files.
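
    For the quarterly follow-up, something like the sketch below could list the candidates to move. It assumes "hasn't changed" means the file's modification time is more than 365 days old, and the function name is just for illustration; it only reports files and moves nothing.

```python
import os
import time


def stale_files(root, max_age_days=365, now=None):
    """Yield paths under root whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                if os.path.getmtime(path) < cutoff:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

    Reviewing the list before moving anything is worthwhile, since some files (templates, reference documents) are old but still in daily use.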

    From Joel Coel
