Checking if a email addresses is valid in SQL can be very convinient from time to time. Here are two ways to do it in SQL, and a bit about them.
Here is a full SQL way to do this:
Create function VALIDEMAIL(@s varchar(255)) --Returns true if the string is a valid email address.returns bitasBEGINDECLARE @u VARCHAR(60), @v VARCHAR(60), @x VARCHAR(60), @i int, @j int, @result bitSET @result=1set @s = rtrim(ltrim(@s))SET @i=len(@s)-CHARINDEX('@',reverse(@s))+1if @i <= 1 or len(@s) < 5begin set @result = 0 goto doneendSET @u=LEFT(@s,@i-1)SET @j=len(@s)-CHARINDEX('.',reverse(@s))+1SET @v=RIGHT(@s,LEN(@s)-@j)if @j < 3 or @j < @ibegin set @result = 0 goto doneendSET @x=substring(@s,@i+1,@j-@i-1)IF LEN(@x)<2 BEGIN SET @result=0 GOTo done ENDIF (LEN(@x)=3) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z][a-zA-Z]'Collate Latin1_General_BIN) BEGIN SET @result=0 GOTo done ENDIF (LEN(@x)=2) AND (@x NOT LIKE '[a-zA-Z][a-zA-Z]' Collate Latin1_General_BIN) BEGIN SET @result=0 GOTo done ENDSET @i=1WHILE (@i<LEN(@u)) BEGIN IF SUBSTRING(@u,@i,1) NOT LIKE '[a-zA-Z0-9_\-.+=/!#$&''*%?^`{|}"@\\ ]' escape '\' Collate Latin1_General_BIN BEGIN SET @result=0 GOTo done END SET @i=@i+1 ENDSET @i=1WHILE (@i<LEN(@v)) BEGIN IF SUBSTRING(@v,@i,1) NOT LIKE '[a-zA-Z]' Collate Latin1_General_BIN BEGIN SET @result=0 GOTo done END SET @i=@i+1 ENDSET @i=1WHILE (@i<LEN(@x)) BEGIN IF SUBSTRING(@x,@i,1) NOT LIKE N'[a-zA-Z\-.0-9]' escape '\' Collate Latin1_General_BIN BEGIN SET @result=0 goto done END SET @i=@i+1 ENDdone:return @resultENDGO
Here is another way, where you insert .NET CLR code in the database. Here I use the .NET RegEx.Ismatch, so I can use it for tons of other stuff as well with the right input.
using System; using System.Collections.Generic; using System.Text; using Microsoft.SqlServer.Server; using System.Data.SqlTypes; using System.Text.RegularExpressions; namespace SQLSignature { public class SqlFunctions { [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)] public static SqlBoolean fn_RegexIsMatch(SqlString input, SqlString pattern) { return (SqlBoolean)Regex.IsMatch(input.Value, pattern.Value); } } }
CREATE ASSEMBLY SQLSignature FROM 'C:\CLR Assemblies\Performance\SQLFunctions\bin\SQLFunctions.dll'; GO -- Create fn_SQLSigCLR and fn_RegexReplace functions CREATE FUNCTION dbo.fn_RegexIsMatch(@input AS NVARCHAR(MAX),@pattern AS NVARCHAR(MAX))RETURNS BIT WITH RETURNS NULL ON NULL INPUT EXTERNAL NAME SQLSignature.SQLFunctions.fn_RegexIsMatch; GO
select * from [AW].[person].emailaddress as p where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))), '^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1
I did some timings on the two functions and as you can see below there are big differences. The .NET CLR one is almost 8 times faster(cpu time) than the T-SQL one. It's also a whole lot easier to read, and can be used for all other RegEx's needed. You do however need to get out of management studio to make it, and 1.5 seconds is for processing almost 20000 email addresses still should be fast enough for most uses.
SET STATISTICS TIME ON select * from [AW].[person].emailaddress as p where dbo.VALIDEMAIL(P.emailaddress)=1 select * from [AW].[person].emailaddress as p where dbo.fn_RegexIsMatch(RTRIM(LTRIM(lower(P.emailaddress))), '^[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$')=1 (19929 row(s) affected) SQL Server Execution Times: CPU time = 1466 ms, elapsed time = 1911 ms. (19972 row(s) affected) SQL Server Execution Times: CPU time = 203 ms, elapsed time = 380 ms.
Disclaimer The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.