Rebalancing data across files in a filegroup
August 3, 2011
Question: I’ve done some performance testing and determined that I need to add a couple more files to one of my filegroups (currently two files) to get better performance. The two existing files are very full so I’d like to add two more files and then have SQL Server rebalance the data across all four files. How can I do that?
Answer: Unfortunately there is no easy way to rebalance data across new files in a filegroup.
As some background for readers, it has long been known that for some workloads you can get a performance increase by adding multiple data files to a filegroup. It’s not guaranteed though – you have to perform some relevant testing to make sure you’ll see a performance boost in production. I’ve conducted tests of my workload on both SSDs and spinning disks, and found that a performance boost was possible in both cases.
Blog posts: Benchmarking: Do multiple data files make a difference? and Benchmarking: Multiple data files on SSDs (plus the latest Fusion-io driver)
SQL Server’s allocation system uses a round-robin mechanism where allocations are spread over all the data files in a filegroup – an allocation from file 1, then an allocation from file 2, and so on, then back to file 1 again. Combined with that, it also uses an algorithm called proportional fill that aims to allocate data from the files according to how much free space they have relative to the other files in the filegroup.
The basic premise of proportional fill is that each file has a weighting: files with less free space have a higher weighting, and files with lots of free space have a lower weighting. The files with the lower weightings are allocated from more frequently – i.e. the files with the most free space receive the most allocations.
This means that if you add a new data file to a filegroup that has mostly full data files, the proportional fill weightings will be such that the new file will be the one where allocations come from until it fills up to the same level as the older files. The new file in essence becomes an allocation hot spot.
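You can see the imbalance that drives this behavior by comparing used and free space per file. Here's a sketch of such a query (the filegroup name is an example, not from a real system):

```sql
-- Sketch: per-file space usage for one filegroup; N'MyFilegroup' is a placeholder name.
-- The file with the most free space gets the lowest proportional-fill weighting,
-- so a newly added, empty file will receive the bulk of new allocations.
SELECT
    f.name                                              AS logical_file_name,
    f.size / 128                                        AS size_mb,   -- size is in 8KB pages
    FILEPROPERTY (f.name, 'SpaceUsed') / 128            AS used_mb,
    (f.size - FILEPROPERTY (f.name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files AS f
WHERE f.type_desc = N'ROWS'
    AND FILEGROUP_NAME (f.data_space_id) = N'MyFilegroup';
```

Running this right after adding a new file will show one file with a large free_mb and the existing files nearly full – exactly the situation that makes the new file an allocation hot spot.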
But is there a way to prevent this happening or work around it? Unfortunately not. The proportional fill algorithm cannot be circumvented, and the weightings cannot be changed. Rebuilding any indexes using the same filegroup won’t work either, as the space allocated to build the new index will mostly come from the new file.
If you’d like to rebalance over a certain number of files, the easiest method is to create a new filegroup with the desired number of files, then rebuild indexes using CREATE INDEX … WITH (DROP_EXISTING=ON) plus specifying the new filegroup, and then drop the old filegroup. That unfortunately means you need to provision even more space (easier said than done) but it will do the trick. Note that this method won’t work if you have any LOB data, as that cannot be moved without creating a new table and exporting the data into it.
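The workflow described above can be sketched as follows. All of the database, filegroup, file, table, and index names here are examples – substitute your own, and size the new files appropriately for your data:

```sql
-- Sketch only: every name below is a placeholder.
-- 1) Create the new filegroup with the desired number of equally sized files.
ALTER DATABASE MyDB ADD FILEGROUP NewFG;
ALTER DATABASE MyDB ADD FILE
    (NAME = N'NewFG_1', FILENAME = N'D:\Data\NewFG_1.ndf', SIZE = 10GB, FILEGROWTH = 1GB)
    TO FILEGROUP NewFG;
-- ...repeat the ADD FILE statement for NewFG_2 through NewFG_4...

-- 2) Move each table by rebuilding its clustered index on the new filegroup.
--    DROP_EXISTING = ON drops and recreates the index in a single operation,
--    and the ON clause places the rebuilt index in the new filegroup.
CREATE UNIQUE CLUSTERED INDEX CIX_MyTable
    ON dbo.MyTable (MyKeyColumn)
    WITH (DROP_EXISTING = ON)
    ON NewFG;

-- 3) Repeat for the nonclustered indexes, then, once the old files are empty,
--    remove them and drop the old filegroup.
ALTER DATABASE MyDB REMOVE FILE OldFG_1;
ALTER DATABASE MyDB REMOVE FILE OldFG_2;
ALTER DATABASE MyDB REMOVE FILEGROUP OldFG;
```

Remember the LOB caveat above: this rebuild approach won't move LOB data, so tables with LOB columns need a new table created in the new filegroup and the data exported into it instead.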