Thursday, March 22, 2012

Adding a column to VLDB 200GB table

The column I'm adding needs to be part of the clustered PK (it will be the
last of three columns) so I need to recreate all the indexes.
My DB is set for FULL recovery mode ALLOW_SNAPSHOT_ISOLATION ON. I've tried
two methods so far.
Method 1:
BEGIN TRANSACTION
CREATE TABLE dbo.Tmp_copyoftablewithnewfield
(
) ON PRIMARY
IF EXISTS(SELECT * FROM dbo.originaltable)
EXEC('INSERT INTO dbo.Tmp_copyoftablewithnewfield (<original fields>)
SELECT <original fields> FROM dbo.originaltable WITH (HOLDLOCK TABLOCKX)')
GO
DROP TABLE dbo.originaltable
GO
EXECUTE sp_rename N'dbo.Tmp_copyoftablewithnewfield', N'originaltable',
'OBJECT'
GO
<recreate PK constraint>
<rebuild indexes>
COMMIT
Pro's: Lets me add the new field in the spot I'd like it (not a big deal)
Con's: Tons of wasted space and time. It took about 15 hours.
Method 2:
SET XACT_ABORT ON
GO
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
<drop PK constraint>
<drop indexes>
ALTER TABLE [dbo].[originaltable] ADD
[newfield] [tinyint] NOT NULL CONSTRAINT [DF_originaltable_newfield] DEFAULT
((1))
<recreate PK constraint>
<rebuild indexes>
COMMIT TRANSACTION
Pro's: No making a copy of the entire table taking up 200GB more space in
the db data file
Con's: My tempdb grew to accomodate the row versioning info for every row in
the 200GB table. It took over 30 hours.
A lot of time and disk space is wasted with both.
Since the db is going to be unavailable to users I have some flexibility
here. I was considering turning ALLOW_SNAPSHOT_ISOLATION OFF and then trying
method 2 again which should stop the versioning in tempdb and then turning it
back on.
I was also curious if setting the database recovery mode to SIMPLE would cut
down on db log usage and then I could set it back to FULL when done.
Do these really need to be in a transaction? If there's some hardware
failure or something unexpected I can just restore from backup and do the
conversion again. If the presence of the transaction itself is causing more
disk usage for logging or any other slowdown, I think I'd rather do without.
Given the amount of time this conversion takes, I wanted to get some
feedback other than "just try it" before doing any new tests.
Thanks.Demi
> Do these really need to be in a transaction? If there's some hardware
> failure or something unexpected I can just restore from backup and do the
> conversion again. If the presence of the transaction itself is causing
> more
> disk usage for logging or any other slowdown, I think I'd rather do
> without.
It is always worth taking backup of the database or snapshot as you are in
SQL Server 2005.
I'd use ALTER TABLE... (set it up with SIMPLE and ALLOW_SNAPSHOT_ISOLATION
OFF ) approach and there is no need BEGIN TRAN ....
"Demi" <Demi@.discussions.microsoft.com> wrote in message
news:D0A356E6-BFB7-498F-B5FA-871B6D3AFD75@.microsoft.com...
> The column I'm adding needs to be part of the clustered PK (it will be the
> last of three columns) so I need to recreate all the indexes.
> My DB is set for FULL recovery mode ALLOW_SNAPSHOT_ISOLATION ON. I've
> tried
> two methods so far.
> Method 1:
> BEGIN TRANSACTION
> CREATE TABLE dbo.Tmp_copyoftablewithnewfield
> (
> ) ON PRIMARY
> IF EXISTS(SELECT * FROM dbo.originaltable)
> EXEC('INSERT INTO dbo.Tmp_copyoftablewithnewfield (<original fields>)
> SELECT <original fields> FROM dbo.originaltable WITH (HOLDLOCK
> TABLOCKX)')
> GO
> DROP TABLE dbo.originaltable
> GO
> EXECUTE sp_rename N'dbo.Tmp_copyoftablewithnewfield', N'originaltable',
> 'OBJECT'
> GO
> <recreate PK constraint>
> <rebuild indexes>
> COMMIT
> Pro's: Lets me add the new field in the spot I'd like it (not a big deal)
> Con's: Tons of wasted space and time. It took about 15 hours.
> Method 2:
> SET XACT_ABORT ON
> GO
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> GO
> BEGIN TRANSACTION
> <drop PK constraint>
> <drop indexes>
> ALTER TABLE [dbo].[originaltable] ADD
> [newfield] [tinyint] NOT NULL CONSTRAINT [DF_originaltable_newfield]
> DEFAULT
> ((1))
> <recreate PK constraint>
> <rebuild indexes>
> COMMIT TRANSACTION
> Pro's: No making a copy of the entire table taking up 200GB more space in
> the db data file
> Con's: My tempdb grew to accomodate the row versioning info for every row
> in
> the 200GB table. It took over 30 hours.
> A lot of time and disk space is wasted with both.
> Since the db is going to be unavailable to users I have some flexibility
> here. I was considering turning ALLOW_SNAPSHOT_ISOLATION OFF and then
> trying
> method 2 again which should stop the versioning in tempdb and then turning
> it
> back on.
> I was also curious if setting the database recovery mode to SIMPLE would
> cut
> down on db log usage and then I could set it back to FULL when done.
> Do these really need to be in a transaction? If there's some hardware
> failure or something unexpected I can just restore from backup and do the
> conversion again. If the presence of the transaction itself is causing
> more
> disk usage for logging or any other slowdown, I think I'd rather do
> without.
> Given the amount of time this conversion takes, I wanted to get some
> feedback other than "just try it" before doing any new tests.
> Thanks.sql

No comments:

Post a Comment