RMJCS Logo
 

T/SQL Functions

 
 

T/SQL functions are a useful way of improving code readability by encapsulating complex or repetitive logic. There are a number of limitations to T/SQL functions, the main two being that they must be deterministic, and they can not modify the database. One of the most important points to be aware of though is that T/SQL functions are (relatively speaking) slow - this is discussed in more detail below.

 
     
 

Relative Function Type Performance

 
 

T/SQL functions are (relatively speaking) slow, calling a T/SQL function in a large set based operation will have a significant impact on performance. This performance issue is not a fault or a bug, it's the nature of T/SQL. T/SQL is not a compiled language - a considerable amount of overhead is involved in every call to a function. The only way around this is not to use T/SQL functions in large set based operations where performance is important. Possible alternatives are to inline the body of the function into the main T/SQL statement (not always practical or desireable) or use a CLR function (not an option if CLR isn't enabled on the server).

To demonstrate the performance difference between inline code, T/SQL functions and CLR functions I've prepared a very simple test case. My 'function' takes an integer i and returns 2i. Below are the timings for each approach over 1/2/4/8 million rows.  The code to try this for yourself is shown below.

Relative Function Type Performance
# rows Inline
Expression (ms)
T/SQL
Function (ms)
T/SQL Vs Inline CLR
Function (ms)
CLR Vs Inline
1,000,000 266 2220 8.3 1246 4.7
2,000,000 530 4420 8.3 2466 4.7
4,000,000 1076 8843 8.2 4953 4.6
8,000,000 2156 17750 8.2 9873 4.6

In this case, with the simplest of functions, the T/SQL function is over eight times slower than the inline expression, whilst the CLR function is over four times slower. The relative performance of each approach will vary with the nature of the function so if performance is critical then you need to test for yourself then weigh up each approach against other factors such as code manageability and available skills.

The timing code used was as follows;

--
-- Create lots of rows.
--
DECLARE @intNumRows INT; SET @intNumRows = 1000000

IF (OBJECT_ID('tempdb..#tmp') IS NOT NULL) DROP TABLE #tmp
CREATE TABLE #tmp(RowNum INT NOT NULL PRIMARY KEY CLUSTERED)

INSERT #tmp (RowNum)
   SELECT TOP (@intNumRows) 
      ROW_NUMBER () OVER (ORDER BY A.message_id ) AS RowNum
   FROM 
      sys.messages A 
         CROSS JOIN sys.messages B
            CROSS JOIN sys.messages C

--
-- Call function lots of times.
--
DECLARE @dtStart DATETIME;
DECLARE @dtFinish DATETIME;
DECLARE @x BIGINT

SET @dtStart = GETDATE()
SELECT 
   @x = AVG(CAST(RowNum * 2 AS BIGINT))                -- Inline expression
   --@x = AVG(CAST(dbo.TimesTwo(RowNum) AS BIGINT))      -- T/SQL Function
   --@x = AVG(CAST(dbo.clrTimesTwo(RowNum) AS BIGINT))   -- CLR Function
FROM
   #tmp WITH (NOLOCK) OPTION (MAXDOP 1)
SET @dtFinish = GETDATE()

SELECT @x AS Result, @dtStart AS StartTime, @dtFinish AS FinishTime, DATEDIFF(ms, @dtStart, @dtFinish) AS DurationMs

The T/SQL function used was as follows;

CREATE FUNCTION dbo.TimesTwo (@int INT) RETURNS INT WITH RETURNS NULL ON NULL INPUT AS
BEGIN
   RETURN @int * 2;
END

The CLR function used was as follows;

public static SqlInt32 clrTimesTwo(SqlInt32 i)
{
   return i * 2;
}
 
     
Copyright 2007-2015 RMJCS Ltd