SSIS – Using Bulk Insert
Bulk Insert is a very important tool for any SSIS developer and is used by nearly every developer to load the Flat file into the staging table. Here we aren't going to describe Bulk Copy in full detail, but will discuss some common factors when using Bulk Copy.
Join the DZone community and get the full member experience.
Join For FreeBulk Insert is a very important tool for any SSIS developer and is used by nearly every developer to load the Flat file into the staging table. Here we aren't going to describe Bulk Copy in full detail, but will discuss some common factors when using Bulk Copy.
Common Factors
Some of the common factors that we faced with BULK COPY are:
1. When the Import Table has the Identity Columns
2. When we do not want to Insert some records in a specified columns of a Table
3. When data type mismatch occurs
4. When Truncation Error occurs
Sometimes in the ETL process the flat file size is so heavy that we are unable to open it.
How to Solve This Problem
This type and many other types of problem with BULK COPY can be solved by using Format File. So here in this article we are going to describe the BULK COPY using format file.
Format file is a Metadata of table where we are importing the data from flat file. There are two types of format files: one is in text format and the other is in xml format. Generally, I prefer the xml format, but it just depends on the developer—there are no hard and fast rules.
Demonstration
Step 1 — The Flat File Format
Step 2 — The Destination Table
IF OBJECT_ID(N'[dbo].[tbl_StudentDetails]', N'U')IS NOT NULL
DROP TABLE [dbo].[tbl_StudentDetails];
GO
CREATE TABLE [dbo].[tbl_StudentDetails]
(
StdRoll INT NOT NULL PRIMARY KEY,
Stdname VARCHAR(50) NULL,
StdClass INT NULL,
Section CHAR(1) NULL
);
GO
Step 3 — Creating the Format File
The XML Format File
bcp PracticeDB.dbo.tbl_StudentDetails
format nul -c -f
E:\Preparation\SSIS\BCP\FmtStudentDetails.xml
-x -S JOYDEEP-LAPTOP\JOYDEEPSQL12 -T -q -t;
Here the XML format file name is FmtStudentDetails.xml and the Instance of the SQL Server name is JOYDEEP-LAPTOP\JOYDEEPSQL12.
<?xml version="1.0" ?>
- <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50"
COLLATION="SQL_Latin1_General_CP1_CI_AS" />
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1"
COLLATION="SQL_Latin1_General_CP1_CI_AS" />
</RECORD>
- <ROW>
<COLUMN SOURCE="1" NAME="StdRoll" xsi:type="SQLINT" />
<COLUMN SOURCE="2" NAME="Stdname" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="3" NAME="StdClass" xsi:type="SQLINT" />
<COLUMN SOURCE="4" NAME="Section" xsi:type="SQLCHAR" />
</ROW>
</BCPFORMAT>
The TEXT Format File
bcp PracticeDB.dbo.tbl_StudentDetails
format nul -c -f E:\Preparation\SSIS\BCP\FmtStudentDetails.fmt
-S JOYDEEP-LAPTOP\JOYDEEPSQL12 -T -q -t;
10.0
4
1 SQLCHAR 0 12 ";" 1 StdRoll ""
2 SQLCHAR 0 50 ";" 2 Stdname
SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 12 ";" 3 StdClass ""
4 SQLCHAR 0 10 "\r\n" 4 Section
SQL_Latin1_General_CP1_CI_AS
If we want we can edit both the files. Suppose the StdRoll is an IDENTITY column and we do not want to insert records in the table for that, then we have to just edit it.
<?xml version="1.0" ?>
- <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
<FIELD ID="0" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50"
COLLATION="SQL_Latin1_General_CP1_CI_AS" />
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1"
COLLATION="SQL_Latin1_General_CP1_CI_AS" />
</RECORD>
- <ROW>
<COLUMN SOURCE="0" NAME="StdRoll" xsi:type="SQLINT" />
<COLUMN SOURCE="2" NAME="Stdname" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="3" NAME="StdClass" xsi:type="SQLINT" />
<COLUMN SOURCE="4" NAME="Section" xsi:type="SQLCHAR" />
</ROW>
</BCPFORMAT>
10.0
4
1 SQLCHAR 0 12 ";" 0 StdRoll ""
2 SQLCHAR 0 50 ";" 2 Stdname
SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 12 ";" 3 StdClass ""
4 SQLCHAR 0 10 "\r\n" 4 Section
SQL_Latin1_General_CP1_CI_AS
Here is an image of the code with highlighted lines, for you to better understand:
Step 4 — Using Format File for Bulk Insert
BULK INSERT tbl_StudentDetails
FROM 'E:\Preparation\SSIS\BCP\test.txt'
WITH (FORMATFILE = 'E:\Preparation\SSIS\BCP\FmtStudentDetails.xml',
FIELDTERMINATOR = ';',
ROWTERMINATOR = '0x0A'
);
BULK INSERT tbl_StudentDetails
FROM 'E:\Preparation\SSIS\BCP\test.txt'
WITH (FORMATFILE = 'E:\Preparation\SSIS\BCP\FmtStudentDetails.fmt,
FIELDTERMINATOR = ';',
ROWTERMINATOR = '0x0A'
);
And, another image showing the highlighted lines:
Please note that the Format file is your Metadata of the table, so if the format file is generated properly the BULK INSERT is done successfully. We can Edit the format file according to our needs and by controlling the BULK INSERT.
Hope this helps!
Published at DZone with permission of Joydeep Das, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Trending
-
Testing Applications With JPA Buddy and Testcontainers
-
MLOps: Definition, Importance, and Implementation
-
What Is JHipster?
-
Clear Details on Java Collection ‘Clear()’ API
Comments