Thursday, 19 November 2015

SSIS Getting Started - Extract, Transform & Load the data




In this post we will learn how to use the SQL Server Integration Services 2012 to Load the data. Here I created a sample CSV data file "Employee" which we will load into a table dbo. Employee in the database. During this exercise we will use some of the basic SSIS transformations like Source Adapter, Derived Column transformation, Lookup transformation etc.

You can download this paper with Source file, Script & SSIS solution from my Technet link given
below

https://gallery.technet.microsoft.com/SSIS-Getting-Started-Guide-d8fb22ac

Open SSDT from Start-->Microsoft SQL Server 2012-->SQL Server Data Tools


Click on File-->New-->Project as in the image below



In the new project window click on the Integration Services Project, and enter the details like name of the project, Location where you want to create SSIS solution etc as in the image below.


After creating solution we will get the window as in the image below, where we have Solution Explorer Window on right side, here one ssis package is already created package.dtsx. 
I have given SSIS solution physical location here is "D:\MSBI\SSIS\Learning\SSIS" and Source Csv file location is "D:\MSBI\SSIS\Learning\SourceFiles"



Rename this package as SSISLoad.dtsx by right clicking on package.dtsx.
Now right click on Connection Managers pane at bottom, then click on new flat file connection as in the image below


Flat File Connection Manager Editor window will appear, Fill in the details as below

Connection Manager Name : SourceFileConnection (you can choose whatever you want)
File Name : D:\MSBI\SSIS\Learning\SourceFiles\Employee.csv (give the path where source file                                                                                                                   exists)



Leave it as is the rest of the setting.

Now click on Column Tab on the left pane of the Flat File Connection Manager Editor window, you see here row & column  delimiter are defined as {CR}{LF} & Comma {,}. Leave it as is. Here you can preview first 100 rows of your source file data.


Now click on Advanced Tab on the left pane of the Flat File Connection Manager Editor window, it is the important part as we can define our source data type according to our destination table data types here. Now the destination table data type are as below

CREATE TABLE [dbo].[Employee](
[EmployeeKey] [int] NOT NULL,
[ParentEmployeeKey] [int] NULL,
[EmployeeNationalIDAlternateKey] [nvarchar](15) NULL,
[SalesTerritoryKey] [int] NULL,
[EmployeeFullName] [nvarchar](500) NOT NULL,
[FirstName] [nvarchar](50) NOT NULL,
[LastName] [nvarchar](50) NOT NULL,
[MiddleName] [nvarchar](50) NULL,
[NameStyle] [bit] NOT NULL,
[Title] [nvarchar](50) NULL,
[HireDate] [date] NULL,
[BirthDate] [date] NULL,
[EmailAddress] [nvarchar](50) NULL,
[Phone] [nvarchar](25) NULL,
[MaritalStatus] [nchar](1) NULL
) ON [PRIMARY] 

We need to run above query to crate this table on the database where we will be loading the data, so create above table on your database.

So we need to define the data type on advance tab as in the image below i change EmployeeKey from DT_Str to DT_Numeric 


Repeat above steps for all the columns and then click on ok. Now you will see that your source connection "SourceFileConnection" is created as in the image below


Now we will create database connection, right click again on the Connection Manager pane and choose New Oledb Connection, a new window will pop up "Configure Oledb Connection Manager" as in the image below, in this window click on new button then you will get Connection Manager Window where you can select your SQL Server name, login mechanism and the database name where you need to load the data


Once you will fill in these details then click ok and your destination connection will be created as in the image below






Now we will design our data flow. First drag Execute SQL Task by Clicking on SSIS Toolbox on the left window pane as image below


Now double click on Execute SQL Task and define its properties, first click on connection drop down and select your destination connection as in the image below



Now click on sql statement property and paste the following query

Truncate table [dbo].[Employee]

This query truncates the data every time you will run the package.


Click ok. Now Drag a Data Flow Task from SSIS toolbox window and connect the Execute Sql Task to this Data Flow Task with the green arrow as in the image below.


Now we will design our data flow which will extract data from file and insert into the destination table. Double click on Data Flow Task and drag Source Assistant from SSIS Toolbox window to the design window as in the image below, it will pop up a window named "Source Assistant - Add New Source"




Now click on Flat File in Select Source Type section on the left of the window and select SourceFileConnection (source connection which we created above) on the right pane of the window and click ok.


If you check the destination table which the package will create has FullName as one column but in the source file we does not have this column, so we will create this column values by SSIS package so to do this we will drag drived column transformation from SSIS Toolbox window and join it with the Source Assistant we just dragged.

Now double click Derived Column Transformation which will pop up a window "Derived Column Transformation Editor". Click on Columns on the left pane of the window and copy & paste below expression under the Expression tab

[FirstName]+" "+ [MiddleName]+" "+ [LastName]

Give a name to this column under Derived Column Name and click ok


Drag Destination Assistant and double click to configure it. Select DestinationDb (destination connection which we created above) on the right pane of  Destination Assistant window and click ok



Now double click on the OLEDB Destination which will pop up OLEDB Destination window, here select your table (here dbo.Employee) as in the image below


Now click on the mappings on the left pane of the window, it will map all the source columns with the destination columns automatically if both have the same name. Now if you see for EmployeeFullName on the destination side is not mapped with the source column, so click on the <ignore value> on the source column and select FullName which we created in the Derived Column Transformation above and click ok.



Fun time, package development is complete, now we will execute it. Right click on SSISLoad.dtsx on the Solution Explorer window and select Execute Package.


Cheers, It runs successfully and load 296 rows into the dbo.Employee database.

Happy Learning.


















No comments:

Post a Comment