Saturday, January 29, 2011

SQL Server Partial Backups

We currently distribute snapshots of our database to our stores by using rdiff on compressed backups of the entire database.

In order to optimize how long it takes to both create these backups, and each store download them, we're looking to leave out some of the less important tables/rows.

The two largest tables are already in their own filegroups, but they aren't read-only since they are still used.

I've looked at horizontal partitioning for separating out the older data to a read-only filegroup, and I can exclude those from a backup using partial backups; however, I'm unclear on how to horizontally partition a table that already contains data, most of what I find is about creating a new partitioned table.

Is partitioning the recommended way of restricting what gets included in a full backup, or is there another way that I'm unaware of?

We are running SQL Server 2005 Enterprise.

  • Assuming that you want to still change your 'less important rows', put that data in a different (but still writeable) filegroup. Then leave that filegroup out of your backup set.

    ...and you can use ALTER TABLE to put a table into a particular partition scheme.

    ALTER TABLE: http://msdn.microsoft.com/en-us/library/ms190273.aspx

    Matt : The database in question is in Simple mode as we don't need/want log backups or to have to deal with truncating them since we don't use them. Simple mode won't allow you to specify the primary filegroup for a file-only backup, it wants us to use READ_WRITE_FILEGROUPS, which won't let us exclude the tables we want to.
    Rob Farley : Oh, that changes things somewhat. Have you considered using the full recovery model instead?
    Matt : Yes, but we don't use or have plans to use log backups, and it seems wasteful to run in full recovery mode if we just have to truncate the logs every so often to keep them from getting too large.
    From Rob Farley
  • Have you consider replication or log shipping? Sounds like a good case for it.

    Also, I don't know your specific environment but if your only reason NOT to run the database in Full-Recovery Mode is that you don't want to deal with t-log backup's then I would check with the business users whether or not being able to recover only to the last backup is "good enough" for them.

    Matt : Our daily transactions are stored in a separate database that is in full recovery mode. The database in question stores everything else that we don't care if we lose the current day's changes. We are trying to get a local copy of the database to each location to help with queries against that data, and to provide a source for that data in the event their WAN link goes down for any period of time. The only replication model that fits what we need is snapshot or merge, and those don't seem scalable to the number of remote databases we will have. We're looking at about 200-300 locations ATM.
    Frank Kalis : Ah, okay. Thanks for explanation!

0 comments:

Post a Comment