DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
  1. DZone
  2. Data Engineering
  3. Databases
  4. T-SQL vs. CLR – A Comparison

T-SQL vs. CLR – A Comparison

Nick Haslam user avatar by
Nick Haslam
·
Jul. 04, 12 · Interview
Like (0)
Save
Tweet
Share
7.26K Views

Join the DZone community and get the full member experience.

Join For Free

i’ve recently been looking at optimizing a data load that i’m working on, and have got a quite a nice example of how the performance can be improved significantly by creating a clr object.

to do this, i created a set of data (which is downloadable from here , is approx. 3mb and is extractable using 7-zip ) which has approx. 200k records in it. the file has two columns (pipe delimited), which are a row id, and a set of events in a string (in the format “1,2,3,4,5”). each string has a variable number of events in it (between 1 and 75), which looks something like this.

image

what i’m wanting to do is split this out, to have an event id, sequence_number and event with the eventlist above being split over multiple records, and a sequence number showing the position of the event code in the list. sounds straightforward.

image

i’m effectively looking for a quick way to generate this data.


t-sql – custom_split

a custom script to split the table was created, looking like this:

create function [dbo].[split] 
( 
    @string nvarchar(4000), 
    @delimiter nvarchar(10) 
) 
returns @table table 
( 
    [value] nvarchar(4000) 
) 
begin 
    declare @nextstring nvarchar(4000) 
    declare @pos int, @nextpos int 
 
    set @nextstring = '' 
    set @string = @string + @delimiter 
 
    set @pos = charindex(@delimiter, @string) 
    set @nextpos = 1 
    while (@pos <> 0) 
    begin 
        set @nextstring = substring(@string, 1, @pos - 1) 
 
        insert into @table 
        ( 
            [value] 
        ) 
        values 
        ( 
            @nextstring 
        ) 
 
        set @string = substring(@string, @pos + len(@delimiter), len(@string)) 
        set @nextpos = @pos 
        set @pos = charindex(@delimiter, @string) 
    end 
    return 
end 
-- this code was taken from here: http://stackoverflow.com/questions/314824/t-sql-opposite-to-string-concatenation-how-to-split-string-into-multiple-recor/314833#314833

this was executed in management studio, with this script:

select eventid,
   identity (int, 1, 1) as sequenceno,
   value as event
   into dbo.sql_events   
      from dbo.largefile
   cross  apply  dbo.split(dbo.largefile.eventlist, ‘,’) as t
   order  by dbo.largefile.eventid

running this generated a table with 9.8 million rows and took 6 minutes 8 seconds, so handling approx. 26.6k records a second.


clr – splitevents

to split the eventlist in a clr object, i did the following in visual studio 2010.

1. choose the .net framework, clr project

image

2. right click the project when it’s created, and choose ‘add –> stored procedure’.

in the body of the class, i used the following code:

using system;
using system.data;
using system.data.sqlclient;
using microsoft.sqlserver.server;public partial class storedprocedures
{
    [microsoft.sqlserver.server.sqlprocedure]
    public static void clr_split()
    {
        // put your code here
         char[] delimiters = new char[] { ‘,’, ‘-’ };

        using (sqlconnection conn = new sqlconnection(“context connection=true”))
        {
            using (sqlcommand cmd = new sqlcommand(“select eventid, eventlist from largefile”, conn))
            {
                sqlpipe pipe = sqlcontext.pipe;
                sqlmetadata[] cols = new sqlmetadata[3];
                cols[0] = new sqlmetadata(“eventid”, sqldbtype.int);
                cols[1] = new sqlmetadata(“sequence_no”, sqldbtype.int);
                cols[2] = new sqlmetadata(“event”, sqldbtype.int);

                conn.open();
                using (sqldatareader reader = cmd.executereader())
                {
                    int irow = 0;
                    if (reader.hasrows)
                    {
                        sqldatarecord rec = new sqldatarecord(cols);
                        pipe.sendresultsstart(rec);
                        while (reader.read())
                        {
                            int ictiqueueid = convert.toint32(reader[0]);
                            string sincoming = reader[1].tostring();
                            if (sincoming.length > 0)
                            {
                                string[] ssplit = sincoming.split(delimiters);
                                int isplitcount = 1;
                                foreach (string s in ssplit)
                                {
                                    try
                                    {
                                        int isplitval = convert.toint32(s);
                                        rec.setsqlint32(0, ictiqueueid);
                                        rec.setsqlint32(1, isplitcount);
                                        rec.setsqlint32(2, isplitval);
                                        pipe.sendresultsrow(rec);
                                    }
                                    catch (exception)
                                    {
                                    }
                                    isplitcount++;
                                }
                            }
                            irow++;
                        }
                        pipe.sendresultsend();
                    }
                    else
                            pipe.send(“”);
                }
                conn.close();
            }
        }
    }

};

having compiled and deployed this to sql server, it could be run using this :

insert into clr_events (eventid, sequence_no, event)
exec clr_split

notes

when running these tests, i was running sql server 2008 r2, on a i5-430m laptop, with 4gb ram.

the database had plenty of space pre-allocated to it, so there was no performance hit based on growing the database.

Common Language Runtime Database Comparison (grammar)

Published at DZone with permission of Nick Haslam, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Required Knowledge To Pass AWS Certified Solutions Architect — Professional Exam
  • Microservices Testing
  • Cucumber.js Tutorial With Examples For Selenium JavaScript
  • Low-Code Development: The Future of Software Development

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: