Have You Seen This SQL Server Bug?

Dec.01, 2013 in development

I may have found a bug regarding how SQL Server calls CLR functions when using columns from tables joined using non-indexed columns. I’m including a SQL script, with comments on what to do with it, and hopefully anyone reading this will try it for themselves and tell me if what I’m seeing happens for them and whether I’m misinterpreting things. I’m willing to admit if I’m misunderstanding how things are supposed to work but so far everyone that I have spoken with about this is as perplexed as I am. And yet none of us can imagine this has never been noticed before and also is working as intended.

For starters, I discovered this in 64 bit SQL Server 2008 R2. One of my coworkers, our DBA, also reproduced this on 64 bit SQL Server 2012. I posted the issue at stackoverflow but no one even commented on it. My summary of the problem as I posted there is as follows:

SQL Server (at least on 2008 R2, 64 bit), when processing a query that includes a CLR function call in the SELECT clause and when using at least one column from a table that is included via JOIN on non-indexed columns, appears to invoke that CLR function for every row not constrained by either an indexed column in the JOIN or by an explicit WHERE clause item.

Now for some more details. This all has to do with the Microsoft spatial objects CLR functionality. Suffice to say I have a table full of extents, latitude/longitude pairs marking the opposing corners of a region. Additionally I have an SRID (spatial reference ID) to indicate what coordinate system was used. Some use meters offset, some use degrees. Also, the data I have is dirty. It has NULL values, invalid SRID values, coordinates that don’t match the SRID (i.e. meter offsets recorded for a supposedly degree based SRID). Finally, not only did I need to filter out potential offending values I also had additional filters being joined in from other tables. So my original SQL was actually doing a pre-filter pass to generate a valid set of ID values followed by the actual SELECT which joined this list of pre-filtered keys in a table variable with the original data table.

Part of the SELECT was to make use of the geography::STGeomFromText() call to return a geography object as a column in the result set. No sweat, my tests showed only valid rows were returning. But when I added that call, I began getting exceptions from the underlying CLR complaining about invalid geographies, invalid SRID values, insufficient numbers of points in the ring, etc. After a considerable amount of debugging and testing, I came up with the script listed below.

The first chunk of the script is commented out and includes two sections. One section creates a FUNCTION that takes a VARCHAR and an INT and just passes them directly to geography::STGeomFromText(). It is a simple wrapper. More on that in a bit.

The second section in the commented out area sets up the data table. The data table happens to include 7 test records but could probably work with fewer. I hit the exact results I was hunting down after I had added the 7th record and as much out of superstitious need to cling to my voodoo doll as anything else, I did not alter the data set. In any case, the CREATE TABLE is most notable for NOT indexing the ID column. The column is also not an IDENTITY but honestly making it an IDENTITY column did not impact the outcome and just added needless complexity.

The uncommented code that follows declares a table variable to hold valid keys, populates the table variable, and then performs a SELECT, returning not only all columns from the original data table but also additional columns which directly invoke the CLR function as well as call the UDF created up top. There is also a commented out WHERE clause.

I’ll summarize what I saw before I dump the script.

If I do not include the CLR call as a column in the SELECT, the SELECT works and I only see records which have valid values that would work in a call to the CLR function.
If I call the UDF, a wrapper around the CLR, I again only see the expected records and I get valid geography objects returned for each.
If I call the CLR, I receive an exception about an invalid SRID (being NULL). Depending on the results of the full table scan, you may also receive errors about invalid geography point values. Regardless, the query fails.
If I uncomment the WHERE clause, I can now include the CLR call as a column with no errors.
If the WHERE clause is commented out but the original data table is instead recreated such that the id field is a PRIMARY KEY field (or really, indexed in any way but the obvious approach would be to make it a PRIMARY KEY), the CLR function may also be included in the SELECT without issue.

What I believe is happening is that the main query is joining two tables, the data table and the table with a simple list of keys. It joins these tables on the ID column. When there is an INDEX on the data table’s ID column, SQL Server first joins on the results of the INDEX lookup, then fetches only the rows corresponding to the joined keys. Since the key table only has keys to rows with valid data, no invalid data is ever even seen. But when there is no INDEX, a table scan must be performed on the data table. While it is true that the result set will only end up including rows with valid data since that’s the only keys in the key table, it seems as though the CLR function is still being called for each row in the scan, regardless of whether it will end up in the final result set.

The interesting thing about the UDF is that it seems to mask whatever behavior is going on with the CLR function call. That is, the UDF does not get called for every row of a table scan, only for the rows which would be returned in the result set.

Aside from simple stating the obvious, that the lack of an INDEX forces a full table scan and the database is giving the CLR function a shot on each row before it is filtered out by the JOIN restriction (though, oddly enough, after any WHERE clause), I can’t see why this would happen. Wrapping it in a UDF seems to use a slightly different logic path. Perhaps it is a bug in the CLR assembly from Microsoft (this is Microsoft’s geography implementation) but I don’t have any other assemblies to test and our environment doesn’t allow us to install anything else.

Anyhow, here’s the script to reproduce the problem.

/*
 * This part of the script makes a UDF available to wrap the CLR call
 *
 * Only run this once
--
--
--
-- BEGIN RUN ONCE
--
--
--
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[fLPPTestMakeGeography]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [dbo].[fMakeGeography]
GO
CREATE FUNCTION [dbo].[fLPPTestMakeGeography]
(
    @p1 varchar(max)
    , @p2 int
)
RETURNS geography
AS
BEGIN
    DECLARE @Result geography
    -- just a wrapper around the geography::STGeomFromText call
    SELECT @Result = geography::STGeomFromText(@p1,@p2)
    RETURN @Result
END
GO
--
--
--
-- END RUN ONCE
--
--
--
*/
/*
 * Likewise this sets up the test table.
 * Note that this all goes away when a PRIMARY KEY is placed on the id column
 * Commentary: by making id a PRIMARY KEY, it causes the join in the main query
 * to do an index lookup rather than doing a full scan of the data table. Without
 * that PRIMARY KEY setting, even though the table variable is joined to the data
 * table in such a way as to eliminate any potential invalid rows, every row of
 * the data table gets examined. The fact that the CLR call throws an exception
 * in this case but the UDF call does not suggests the following:
 * For each row of a table that is not filtered in the JOIN clause via an index
 * and not filtered in the WHERE clause directly, any CLR call in the SELECT
 * clause will be called if it involves any columns from that row.
 * Note that this behavior differs from that of UDFs, where the UDF call only
 * takes place once *all* filtering is done, including non-INDEXed portions of
 * any JOIN statements.
--
--
--
-- BEGIN RUN ONCE
--
--
--
drop table dbo.LPPTestEnvelope
create table dbo.LPPTestEnvelope
(
id int not null
, geog int
, minx decimal(38,16)
, maxx decimal(38,16)
, miny decimal(38,16)
, maxy decimal(38,16)
)
insert into dbo.LPPTestEnvelope (id, minx, maxx, miny, maxy, geog)
select 1,-80,-70,20,30,4326 union all
select 2,-80,-80,20,20,NULL union all
select 3,-80,-75,20,25,4326 union all
select 4,NULL,NULL,NULL,NULL,4326 union all
select 5,-85,-70,25,40,4326 union all
select 6,NULL,NULL,NULL,NULL,4326 union all
select 7,-100000,-90000,3000000,3100000,4326
--
--
--
-- END RUN ONCE
--
--
--
*/
-- select * from dbo.LPPTestEnvelope
set nocount on;
-- in a more complex setting this table variable
-- would be used to do some complex pre-queries and hold the
-- key values which would be joined to the original data
-- table to get the actual rows desired
--
-- for this example it's just dumbed down but the concept holds
declare @envtbl table
(
    eid int not null primary key
)
INSERT INTO @envtbl (eid)
SELECT
    id
FROM
    dbo.LPPTestEnvelope env
WHERE
    -- exclude invalid SRID values
    env.geog=4326
    -- exclude invalid lat/long values (e.g. UTM values with improper SRID assignments)
    AND env.minx > -180 AND env.minx < 180 AND env.maxx > -180 AND env.maxx < 180
    AND env.miny > -90 AND env.miny < 90 AND env.maxy > -90 AND env.maxy < 90
    -- avoid precision bug
    AND ABS(env.maxx-env.minx)>1E-7 AND ABS(env.maxy-env.miny)>1E-7
    -- avoid crossing hemispheres
    AND SIGN(env.maxx)=SIGN(env.minx) AND SIGN(env.maxy)=SIGN(env.miny)
SELECT
    -- display the raw data from the table
    env.*
    -- directly invoke the CLR to create a geography object
    -- NOTE: COMMENT THIS COLUMN OUT TO MAKE THE QUERY WORK
    ,geography::STGeomFromText('POLYGON(('
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR)
            + '))', env.geog)
    ,[dbo].[fLPPTestMakeGeography]('POLYGON(('
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR)
            + '))',env.geog)
    ,'SELECT geography::STPolyFromText(''POLYGON(('
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.miny AS VARCHAR) + ', '
            + CAST(env.maxx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR) + ', '
            + CAST(env.minx AS VARCHAR) + ' ' + CAST (env.maxy AS VARCHAR)
            + '))'',' + CAST(env.geog AS VARCHAR) + ')'
FROM
    -- join our data table to our table variable to fetch only the desired rows
    dbo.LPPTestEnvelope env
    INNER JOIN @envtbl et on et.eid=env.id
-- NOTE: UNCOMMENT THE WHERE CLAUSE TO MAKE THE QUERY WORK REGARDLESS OF INDEX
--WHERE
--  -- exclude invalid SRID values
--  env.geog=4326
--  -- exclude invalid lat/long values (e.g. UTM values with improper SRID assignments)
--  AND env.minx > -180 AND env.minx < 180 AND env.maxx > -180 AND env.maxx < 180
--  AND env.miny > -90 AND env.miny < 90 AND env.maxy > -90 AND env.maxy < 90
--  -- avoid precision bug
--  AND ABS(env.maxx-env.minx)>1E-7 AND ABS(env.maxy-env.miny)>1E-7
--  -- avoid crossing hemispheres
--  AND SIGN(env.maxx)=SIGN(env.minx) AND SIGN(env.maxy)=SIGN(env.miny)

pyehouse

Have You Seen This SQL Server Bug?

Leave a Reply