Thursday, February 9, 2012

Actual storage of Varchar values

I have a table with a definition similar to the following.
create table sizetest (pk int identity(1,1)
, V1 Varchar(1000) , V2 varchar(1000) , V3 varchar(1000)
, V4 Varchar(1000) , V5 varchar(1000) , V6 varchar(1000)
, V7 Varchar(1000) , V8 varchar(1000) , V9 varchar(x)
I have varied the length of column V9 between 16 and 40. When X <= 16 the
table is created without warning. When x = 40 I am told that the row length
is 8084 or 24 bytes too big. X= 20 says 8064 or 4 bytes too big. I have not
been able to come up with a definition for varchar storage that matches the
results I am seeing. This Reference
(http://msdn2.microsoft.com/en-us/library/ms176089(SQL.100).aspx) says that n
+ 2 is the answer but that does not work with any of this data I have
generated. For Example (8 * 1002) + 42 + 4 (for the pk) = 8062 which is not
equal to 8084 and is more than 8060 which BOL says is the max allowed. So I
understand the warning but not the numbers. On the other hand (8 * 1002) + 20
+ 4 = 8040 which is not equal to 8064 and is less than 8060. So why is this
one getting the warning?
I changed the table definition to 3 varchar fields 4000, 4000, and x and got
similar results but different numbers. Still no predicatble results.
How can I determine exactly what potential strorage is required if all of
the varchar fields are full.
My C++ developer won't believe me unless I can give him the details and show
a way to reliably predict the row length for any combination of column
types/sizes.
Thanks
Ray Herring
n+2 is the storage for the varchar fields themselves, but there is other
overhead required in each row.
You did not say what version you are using, but the documentation you
referred to is for SQL Server 2008, which is not in production yet, and the
docs are incomplete.
Take a look here for some information about additional overhead bytes:
http://msdn2.microsoft.com/en-us/library/ms189124.aspx
HTH
Kalen Delaney, SQL Server MVP
www.InsideSQLServer.com
http://DVD.kalendelaney.com
"Ray" <Ray@.discussions.microsoft.com> wrote in message
news:5B7540FA-BBC4-4B77-BC73-EA6FFE1524BE@.microsoft.com...
> I have a table with a definition similar to the following.
> create table sizetest (pk int identity(1,1)
> , V1 Varchar(1000) , V2 varchar(1000) , V3 varchar(1000)
> , V4 Varchar(1000) , V5 varchar(1000) , V6 varchar(1000)
> , V7 Varchar(1000) , V8 varchar(1000) , V9 varchar(x)
> I have varied the length of column V9 between 16 and 40. When X <= 16 the
> table is created without warning. When x = 40 I am told that the row
> length
> is 8084 or 24 bytes too big. X= 20 says 8064 or 4 bytes too big. I have
> not
> been able to come up with a definition for varchar storage that matches
> the
> results I am seeing. This Reference
> (http://msdn2.microsoft.com/en-us/library/ms176089(SQL.100).aspx) says
> that n
> + 2 is the answer but that does not work with any of this data I have
> generated. For Example (8 * 1002) + 42 + 4 (for the pk) = 8062 which is
> not
> equal to 8084 and is more than 8060 which BOL says is the max allowed. So
> I
> understand the warning but not the numbers. On the other hand (8 * 1002) +
> 20
> + 4 = 8040 which is not equal to 8064 and is less than 8060. So why is
> this
> one getting the warning?
> I changed the table definition to 3 varchar fields 4000, 4000, and x and
> got
> similar results but different numbers. Still no predicatble results.
> How can I determine exactly what potential strorage is required if all of
> the varchar fields are full.
> My C++ developer won't believe me unless I can give him the details and
> show
> a way to reliably predict the row length for any combination of column
> types/sizes.
> Thanks
> --
> Ray Herring
|||Thanks Kalen
I thought about the version a little after I posted. I am using 2000 right
now.
I found an additional reference (I think 2k5) that mentions a 24 byte over
head that occurs when a column overflows a page and gets forwarded.
When I add piece in I come within +/- 1 byte of the correct numbers.
Ray Herring
"Kalen Delaney" wrote:

> n+2 is the storage for the varchar fields themselves, but there is other
> overhead required in each row.
> You did not say what version you are using, but the documentation you
> referred to is for SQL Server 2008, which is not in production yet, and the
> docs are incomplete.
> Take a look here for some information about additional overhead bytes:
> http://msdn2.microsoft.com/en-us/library/ms189124.aspx
> --
> HTH
> Kalen Delaney, SQL Server MVP
> www.InsideSQLServer.com
> http://DVD.kalendelaney.com
>
> "Ray" <Ray@.discussions.microsoft.com> wrote in message
> news:5B7540FA-BBC4-4B77-BC73-EA6FFE1524BE@.microsoft.com...
>
>
|||This overflow doesn't happen on SQL 2000 though. Row overflow is new in
2005. And you are getting your errors because the overflow is not allowed;
if you were using 2005, you wouldn't get these errors.
HTH
Kalen Delaney, SQL Server MVP
www.InsideSQLServer.com
http://DVD.kalendelaney.com
"Ray" <Ray@.discussions.microsoft.com> wrote in message
news:41896F83-7882-4045-9174-E239F1D25308@.microsoft.com...[vbcol=seagreen]
> Thanks Kalen
> I thought about the version a little after I posted. I am using 2000
> right
> now.
> I found an additional reference (I think 2k5) that mentions a 24 byte over
> head that occurs when a column overflows a page and gets forwarded.
> When I add piece in I come within +/- 1 byte of the correct numbers.
> --
> Ray Herring
>
> "Kalen Delaney" wrote:
|||This is SQL 2000. I have about 100 client sites that have not moved to 2005.
The tables are being created and/or altered from scripts run in Isqlw.
This is the warning message.
"Warning: The table 'xxxxx' has been created but its maximum row size (8065)
exceeds the maximum number of bytes per row (8060). INSERT or UPDATE of a row
in this table will fail if the resulting row length exceeds 8060 bytes."
I have not received a runtime error on an insert or update so far but I
figure that is only a matter of time.
The table definition has 13 colums including a BigInt(primary key), 2 ints
, 1 datetime, 2 varchar(500) and 7 varchar(1000).
The pk is a clusterd index. The pk and the 2 ints are also foreign keys.
There are no other constraints or indexes.
So using what I have found the row size should be 8 + 4 + 4 + 8 + (7*1002) +
(2*502) = 8042. This should fit according to the documentation I have on 2K.
So where do the additional 23 bytes come from?
If you add the 24 bytes for a column overflow (I understand that is 2K5)
then the total is 8066 which is over the limit but not equal to the total
cited in the warning message.
Ray Herring
"Kalen Delaney" wrote:

> This overflow doesn't happen on SQL 2000 though. Row overflow is new in
> 2005. And you are getting your errors because the overflow is not allowed;
> if you were using 2005, you wouldn't get these errors.
> --
> HTH
> Kalen Delaney, SQL Server MVP
> www.InsideSQLServer.com
> http://DVD.kalendelaney.com
>
> "Ray" <Ray@.discussions.microsoft.com> wrote in message
> news:41896F83-7882-4045-9174-E239F1D25308@.microsoft.com...
>
>
|||I imagine I got your table script wrong because there is an extra byte you
didn't tell us about? When I run this I get the same message but 8065, not
8064 (this is on 8.00.2039):
CREATE TABLE dbo.splunge
(
BigID BIGINT,
FooID INT,
BarID INT,
EventDate DATETIME,
vc500_1 VARCHAR(500),
vc500_2 VARCHAR(500),
vc1000_1 VARCHAR(1000),
vc1000_2 VARCHAR(1000),
vc1000_3 VARCHAR(1000),
vc1000_4 VARCHAR(1000),
vc1000_5 VARCHAR(1000),
vc1000_6 VARCHAR(1000),
vc1000_7 VARCHAR(1000)
);
GO
DROP TABLE dbo.splunge;
Warning: The table 'splunge' has been created but its maximum row size
(8064) exceeds the maximum number of bytes per row (8060). INSERT or UPDATE
of a row in this table will fail if the resulting row length exceeds 8060
bytes.
Some interesting stuff about this here, including 14 bytes of overhead that
are not documented in 2000 BOL or the Inside SQL Server 2000 book (though
Steve says the content was presented on Kalen's site
www.insidesqlserver.com, but I cannot find it now, since it has been
replaced with the 2005 version):
http://www.sqlservercentral.com/articles/Miscellaneous/pagesize/497/
So if we use Steve's information, we have:
2 bytes for status bits
2 bytes for # of columns
24 bytes for fixed length data
2 bytes for storing the length of fixed length data
8000 bytes for variable length data
18 bytes for variable length overhead
2 bytes for null bitmap (9-16 nullable columns)
8050
+14 bytes for reserved pointer space = 8064
|||>I imagine I got your table script wrong because there is an extra byte you
>didn't tell us about? When I run this I get the same message but 8065, not
>8064 (this is on 8.00.2039):
Sorry, I got that backwards, obviously. You claim 8065 but I get 8064.
And it made no difference if I made BigID PK and/or identity, or the ints
nullable, or the datetime nullable...
|||Thanks Aaron
I don't know why there is a one byte difference in our numbers. Actually my
calculations also differ by one from the warning message.
The columns in the table are in a different order. Do you think that might
be related?
Anyway the detail you have provided is sufficient for what I need to pass on
to the developer.
Thanks again.
Ray Herring
"Aaron Bertrand [SQL Server MVP]" wrote:

> Sorry, I got that backwards, obviously. You claim 8065 but I get 8064.
> And it made no difference if I made BigID PK and/or identity, or the ints
> nullable, or the datetime nullable...
>
|||There are another two bytes needed for the pointer in the slot array at the
end of the page, but this is counted against the size of the row.
And another two bytes for the first variable length column. I can't get the
numbers to add up right now either, but I'm not going to worry about it.
Also, I don't think the space for the pointer is counted against the total.
With Snapshot Isolation in SQL 2005, the max row size is just increased.
FYI, the null bitmap has a bit for every column, not just the nullable ones.
HTH
Kalen Delaney, SQL Server MVP
www.InsideSQLServer.com
http://DVD.kalendelaney.com
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
news:6C3D5975-0885-4BE5-94EB-33D1B5312130@.microsoft.com...
>I imagine I got your table script wrong because there is an extra byte you
>didn't tell us about? When I run this I get the same message but 8065, not
>8064 (this is on 8.00.2039):
> CREATE TABLE dbo.splunge
> (
> BigID BIGINT,
> FooID INT,
> BarID INT,
> EventDate DATETIME,
> vc500_1 VARCHAR(500),
> vc500_2 VARCHAR(500),
> vc1000_1 VARCHAR(1000),
> vc1000_2 VARCHAR(1000),
> vc1000_3 VARCHAR(1000),
> vc1000_4 VARCHAR(1000),
> vc1000_5 VARCHAR(1000),
> vc1000_6 VARCHAR(1000),
> vc1000_7 VARCHAR(1000)
> );
> GO
> DROP TABLE dbo.splunge;
> --
> Warning: The table 'splunge' has been created but its maximum row size
> (8064) exceeds the maximum number of bytes per row (8060). INSERT or
> UPDATE of a row in this table will fail if the resulting row length
> exceeds 8060 bytes.
> Some interesting stuff about this here, including 14 bytes of overhead
> that are not documented in 2000 BOL or the Inside SQL Server 2000 book
> (though Steve says the content was presented on Kalen's site
> www.insidesqlserver.com, but I cannot find it now, since it has been
> replaced with the 2005 version):
> http://www.sqlservercentral.com/articles/Miscellaneous/pagesize/497/
> So if we use Steve's information, we have:
> 2 bytes for status bits
> 2 bytes for # of columns
> 24 bytes for fixed length data
> 2 bytes for storing the length of fixed length data
> 8000 bytes for variable length data
> 18 bytes for variable length overhead
> 2 bytes for null bitmap (9-16 nullable columns)
> 8050
> +14 bytes for reserved pointer space = 8064
|||Thanks Kalen and Arron.
I had just found a reference
"http://msdn2.microsoft.com/en-us/library/ms189124.aspx" (2005 again) that
mentions the null bit map and also other over head bytes.
I guess the row size limit is not simply 8K, 8060, 8092, xxxx but rather
depends on how you define it. It seems that MS would have rolled all of the
basic overhead (null bit map, etc) into the basic page size so that we would
only need to work out the variable and fixed lenght columns.
I guess I will start falling back on the =/-10% sort of estimate. It sounds
like even the column over flow is a mixed blessing performance wise.
Thanks for the assistance guys.
Ray Herring
"Kalen Delaney" wrote:

> There are another two bytes needed for the pointer in the slot array at the
> end of the page, but this is counted against the size of the row.
> And another two bytes for the first variable length column. I can't get the
> numbers to add up right now either, but I'm not going to worry about it.
> Also, I don't think the space for the pointer is counted against the total.
> With Snapshot Isolation in SQL 2005, the max row size is just increased.
> FYI, the null bitmap has a bit for every column, not just the nullable ones.
> --
> HTH
> Kalen Delaney, SQL Server MVP
> www.InsideSQLServer.com
> http://DVD.kalendelaney.com
>
> "Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
> news:6C3D5975-0885-4BE5-94EB-33D1B5312130@.microsoft.com...
>
>

No comments:

Post a Comment