thoughts on programming and computer related stuff RSS 2.0
# Thursday, October 23, 2008

Checking if a email addresses is valid in SQL can be very convinient from time to time. Here are two ways to do it in SQL, and a bit about them.

 

Here is a full SQL way to do this:

 

Create function VALIDEMAIL(@s varchar(255)) 
--Returns true if the string is a valid email address.
returns bit
as
BEGIN
DECLARE @u VARCHAR(60), @v VARCHAR(60), @x VARCHAR(60), @i int, @j int, @result bit
SET @result=1
set @s = rtrim(ltrim(@s))
SET @i=len(@s)-CHARINDEX('@',reverse(@s))+1
if @i <= 1 or len(@s) < 5
begin
  set @result = 0
  goto done
end
SET @u=LEFT(@s,@i-1)
SET @j=len(@s)-CHARINDEX('.',reverse(@s))+1
SET @v=RIGHT(@s,LEN(@s)-@j)
if @j < 3 or @j < @i
begin
  set @result = 0
  goto done
end
SET @x=substring(@s,@i+1,@j-@i-1)
IF LEN(@x)<2
 BEGIN
 SET @result=0
 GOTo done
 END
IF (LEN(@x)=3) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z][a-zA-Z]'Collate  Latin1_General_BIN)
 BEGIN
 SET @result=0
 GOTo done
 END
IF (LEN(@x)=2) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z]' Collate  Latin1_General_BIN)
 BEGIN
 SET @result=0
 GOTo done
 END
SET @i=1
WHILE (@i<LEN(@u))
 BEGIN
 IF SUBSTRING(@u,@i,1) NOT LIKE '[a-zA-Z0-9_\-.+=/!#$&''*%?^`{|}"@\\ ]' escape '\' Collate  Latin1_General_BIN
  BEGIN
  SET @result=0
  GOTo done
  END
 SET @i=@i+1
 END
SET @i=1
WHILE (@i<LEN(@v))
 BEGIN
 IF SUBSTRING(@v,@i,1) NOT LIKE '[a-zA-Z]' Collate Latin1_General_BIN
  BEGIN
  SET @result=0
  GOTo done
  END
 SET @i=@i+1
 END
SET @i=1
WHILE (@i<LEN(@x))
 BEGIN
 IF SUBSTRING(@x,@i,1) NOT LIKE N'[a-zA-Z\-.0-9]' escape '\' Collate  Latin1_General_BIN
  BEGIN
  SET @result=0
     goto done
  END
 SET @i=@i+1
 END
done:
return @result
END
GO

Here is another way, where you insert .NET CLR code in the database. Here I use the .NET RegEx.Ismatch, so I can use it for tons of other stuff as well with the right input.


using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

namespace SQLSignature
{
    public class SqlFunctions
    {

        [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
        public static SqlBoolean fn_RegexIsMatch(SqlString input, SqlString pattern)
        {
            return (SqlBoolean)Regex.IsMatch(input.Value, pattern.Value);
        }

    }
}
In SQL:
CREATE ASSEMBLY SQLSignature
FROM 'C:\CLR Assemblies\Performance\SQLFunctions\bin\SQLFunctions.dll';
GO
-- Create fn_SQLSigCLR and fn_RegexReplace functions
CREATE FUNCTION dbo.fn_RegexIsMatch(@input AS NVARCHAR(MAX),@pattern AS NVARCHAR(MAX))RETURNS BIT 
WITH RETURNS NULL ON NULL INPUT 
EXTERNAL NAME SQLSignature.SQLFunctions.fn_RegexIsMatch;
GO
This can be run with a command like:

    select *
    from [AW].[person].emailaddress as p
    where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))),
'^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1

I did some timings on the two functions and as you can see below there are big differences. The .NET CLR one is almost 8 times faster(cpu time) than the T-SQL one. It's also a whole lot easier to read, and can be used for all other RegEx's needed. You do however need to get out of management studio to make it, and 1.5 seconds is for processing almost 20000 email addresses still should be fast enough for most uses.

SET STATISTICS TIME ON

    select *
    from [AW].[person].emailaddress as p
    where dbo.VALIDEMAIL(P.emailaddress)=1

    select *
    from [AW].[person].emailaddress as p
    where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))),
'^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1 (19929 row(s) affected) SQL Server Execution Times: CPU time = 1466 ms, elapsed time = 1911 ms. (19972 row(s) affected) SQL Server Execution Times: CPU time = 203 ms, elapsed time = 380 ms.
It's pretty cool to see that .NET performs good within SQL as well. I am primarily a .NET developer but do have to use SQL quite a bit as well, but the fact that .NET outperforms SQL both in readability and speed inside SQL on some actions make my job a lot easier.


Thursday, October 23, 2008 8:48:20 AM (GMT Standard Time, UTC+00:00)  #    Comments [1] - Trackback
.NET | T-SQL
Navigation
Archive
<July 2010>
SunMonTueWedThuFriSat
27282930123
45678910
11121314151617
18192021222324
25262728293031
1234567
About the author/Disclaimer

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010
Gaute Magnussen
Sign In
Statistics
Total Posts: 17
This Year: 0
This Month: 0
This Week: 0
Comments: 1
Themes
Pick a theme:
All Content © 2010, Gaute Magnussen
DasBlog theme 'Business' created by Christoph De Baene (delarou)