thoughts on programming and computer related stuff RSS 2.0
# Monday, December 08, 2008
Last week I had to make a SQL Procedure go faster and a new join in the same operation. From the execution plan I could see that the thing that cost most time was a call to an external function made as a .NET CLR that got all the groups a person was member of, and returned it as a comma separated string. This tok a couple of milliseconds, each time, and with some of the users input resulting in over 8000 calls in a single statement, the procedure took to long time. I also had the requirement to add an other column in the result, showing a comma separated list of functions each persons had. Doing this the same way as before, would turn the procedure almost twice as sluggish as before.

To get the result wanted, a small trick was used. Here is a simplified version of what I wrote:
SELECT TOP 10
        substring(O.Functions,2,500)
FROM    dbo.Tp_Person P
CROSS APPLY ( SELECT    ',' + CAST(FunctionName AS varchar) AS [text()]
              FROM      dbo.Tp_Function F
              WHERE     F.Personid = P.Personid
              FOR XML PATH('')
             ) O( Functions )


This trick actually reduced the time of the procedure on 5000 people from about 30 seconds to 1,5 seconds. Made my day.

Monday, December 08, 2008 6:55:56 PM (GMT Standard Time, UTC+00:00)  #    Comments [0] - Trackback
T-SQL
# Thursday, October 23, 2008

Checking if a email addresses is valid in SQL can be very convinient from time to time. Here are two ways to do it in SQL, and a bit about them.

 

Here is a full SQL way to do this:

 

Create function VALIDEMAIL(@s varchar(255)) 
--Returns true if the string is a valid email address.
returns bit
as
BEGIN
DECLARE @u VARCHAR(60), @v VARCHAR(60), @x VARCHAR(60), @i int, @j int, @result bit
SET @result=1
set @s = rtrim(ltrim(@s))
SET @i=len(@s)-CHARINDEX('@',reverse(@s))+1
if @i <= 1 or len(@s) < 5
begin
  set @result = 0
  goto done
end
SET @u=LEFT(@s,@i-1)
SET @j=len(@s)-CHARINDEX('.',reverse(@s))+1
SET @v=RIGHT(@s,LEN(@s)-@j)
if @j < 3 or @j < @i
begin
  set @result = 0
  goto done
end
SET @x=substring(@s,@i+1,@j-@i-1)
IF LEN(@x)<2
 BEGIN
 SET @result=0
 GOTo done
 END
IF (LEN(@x)=3) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z][a-zA-Z]'Collate  Latin1_General_BIN)
 BEGIN
 SET @result=0
 GOTo done
 END
IF (LEN(@x)=2) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z]' Collate  Latin1_General_BIN)
 BEGIN
 SET @result=0
 GOTo done
 END
SET @i=1
WHILE (@i<LEN(@u))
 BEGIN
 IF SUBSTRING(@u,@i,1) NOT LIKE '[a-zA-Z0-9_\-.+=/!#$&''*%?^`{|}"@\\ ]' escape '\' Collate  Latin1_General_BIN
  BEGIN
  SET @result=0
  GOTo done
  END
 SET @i=@i+1
 END
SET @i=1
WHILE (@i<LEN(@v))
 BEGIN
 IF SUBSTRING(@v,@i,1) NOT LIKE '[a-zA-Z]' Collate Latin1_General_BIN
  BEGIN
  SET @result=0
  GOTo done
  END
 SET @i=@i+1
 END
SET @i=1
WHILE (@i<LEN(@x))
 BEGIN
 IF SUBSTRING(@x,@i,1) NOT LIKE N'[a-zA-Z\-.0-9]' escape '\' Collate  Latin1_General_BIN
  BEGIN
  SET @result=0
     goto done
  END
 SET @i=@i+1
 END
done:
return @result
END
GO

Here is another way, where you insert .NET CLR code in the database. Here I use the .NET RegEx.Ismatch, so I can use it for tons of other stuff as well with the right input.


using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

namespace SQLSignature
{
    public class SqlFunctions
    {

        [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
        public static SqlBoolean fn_RegexIsMatch(SqlString input, SqlString pattern)
        {
            return (SqlBoolean)Regex.IsMatch(input.Value, pattern.Value);
        }

    }
}
In SQL:
CREATE ASSEMBLY SQLSignature
FROM 'C:\CLR Assemblies\Performance\SQLFunctions\bin\SQLFunctions.dll';
GO
-- Create fn_SQLSigCLR and fn_RegexReplace functions
CREATE FUNCTION dbo.fn_RegexIsMatch(@input AS NVARCHAR(MAX),@pattern AS NVARCHAR(MAX))RETURNS BIT 
WITH RETURNS NULL ON NULL INPUT 
EXTERNAL NAME SQLSignature.SQLFunctions.fn_RegexIsMatch;
GO
This can be run with a command like:

    select *
    from [AW].[person].emailaddress as p
    where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))),
'^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1

I did some timings on the two functions and as you can see below there are big differences. The .NET CLR one is almost 8 times faster(cpu time) than the T-SQL one. It's also a whole lot easier to read, and can be used for all other RegEx's needed. You do however need to get out of management studio to make it, and 1.5 seconds is for processing almost 20000 email addresses still should be fast enough for most uses.

SET STATISTICS TIME ON

    select *
    from [AW].[person].emailaddress as p
    where dbo.VALIDEMAIL(P.emailaddress)=1

    select *
    from [AW].[person].emailaddress as p
    where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))),
'^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1 (19929 row(s) affected) SQL Server Execution Times: CPU time = 1466 ms, elapsed time = 1911 ms. (19972 row(s) affected) SQL Server Execution Times: CPU time = 203 ms, elapsed time = 380 ms.
It's pretty cool to see that .NET performs good within SQL as well. I am primarily a .NET developer but do have to use SQL quite a bit as well, but the fact that .NET outperforms SQL both in readability and speed inside SQL on some actions make my job a lot easier.


Thursday, October 23, 2008 8:48:20 AM (GMT Standard Time, UTC+00:00)  #    Comments [1] - Trackback
.NET | T-SQL
# Friday, August 22, 2008

Last week I had a problem where I needed a hierarchical chain of results. I had to get results of all organisations above my organsiation in a hierarcy. I knew how I would do this by recursion in C#, but had no idea how to get it done in SQL. But asking a bit around, and searching the web I got a really nice solution using the to me new WITH-command. And it seems quite fast aswell. Here is a SQL script with some testdata:

Create table #Relations 
(
  ID int PRIMARY KEY,
  ChildId int,
  name varchar(50)
)
insert into #Relations values (5, 4, 'Me')
insert into #Relations values (4, 3, 'Father')
insert into #Relations values (3, 2, 'Grandpa')
insert into #Relations values (2, 1, 'Great Grandpa')
insert Into #Relations  values (1, null, 'Great Old Grandpa')

declare @startid int
set @startid = 5

;WITH Names AS
(
--Gets the starting line
SELECT * FROM #Relations where id = @startid
UNION ALL
SELECT r.* FROM #Relations r
INNER JOIN names n ON n.ChildId=r.id
)
select * from names

Here are the results. As you can see, it works;)

ID ChildId name
---------- ----------- --------------------------------------------------
5 4 Me
4 3 Father
3 2 Grandpa
2 1 Great Grandpa
1 NULL Great Old Grandpa
(5 row(s) affected)

Friday, August 22, 2008 11:38:13 AM (GMT Standard Time, UTC+00:00)  #    Comments [0] - Trackback
T-SQL
Navigation
Archive
<September 2010>
SunMonTueWedThuFriSat
2930311234
567891011
12131415161718
19202122232425
262728293012
3456789
About the author/Disclaimer

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010
gaute
Sign In
Statistics
Total Posts: 17
This Year: 0
This Month: 0
This Week: 0
Comments: 1
Themes
Pick a theme:
All Content © 2010, gaute
DasBlog theme 'Business' created by Christoph De Baene (delarou)