SSIS Interview Prep Guide

The document discusses various SSIS interview questions and their answers. It covers topics like: - What is ETL and Business Intelligence - How to find the version of an SSIS package from its .dtsx file - The difference between Control Flow and Data Flow in SSIS - How many tasks can execute in parallel in an SSIS package and the MaxConcurrentExecutables property - The EngineThreads property of the Data Flow task - What are precedence constraints and examples of their use

Uploaded by

Ganesh Kamthe

MSBI Interview Question

SSIS Interview Questions

What is ETL?
What is Business Intelligence?
SSIS - How to Find The Version Of SSIS Package From Dtsx File
SSIS - What Is The Difference Between Control Flow and Data Flow In SSIS?
What Is Parallel Execution In SSIS, How Many Tasks A SSIS Package Can Execute In Parallel?
What is the MaxConcurrentExecutables property on a Package level?
What is the Engine Thread property of Data Flow Task?
What are the Precedence Constraints in SSIS, and where and why have you used them?
What is the difference between the Success and the Completion value of Precedence Constraint?
What is the DelayValidation property of Data Flow Task? Why does one use this property?
DelayValidation Property:
Real Time Examples for Using DelayValidation Property in SSIS Package:
What is RetainSameConnection Property on Connection Manager in SSIS Package? Why is it used?
RetainSameConnection Properties on SSIS
If we create a temp table in SSIS Package and want to use it in other tasks, which properties do we need to use?
SSIS - How To Create / Use Temp Table In SSIS Package
What is data Viewer in SSIS? Is data viewer available in ControlFlow or Data Flow?
What is the difference between Checkpoint and Breakpoint in SSIS?
Breakpoints in SSIS
Creating Simple package
Let's see how we use Breakpoints

SSIS Interview Questions

What is ETL?

ETL stands for Extract, Transform and Load. It is a process that extracts data from different source systems,
transforms it (for example by applying calculations or concatenations) and finally loads it into the data
warehouse.

It is tempting to think that creating a data warehouse simply means extracting data from multiple sources and
loading it into the warehouse database. This is far from the truth: a data warehouse requires a complex ETL
process. That process is technically challenging and needs active input from various stakeholders, including
developers, analysts, testers and top executives.

To maintain its value as a tool for decision-makers, a data warehouse needs to change as the business changes.
ETL is a recurring activity (daily, weekly, monthly) of a data warehouse system and needs to be agile, automated
and well documented.

What is Business Intelligence?

BI (Business Intelligence) is a set of processes, architectures and technologies that convert raw data into
meaningful information that drives profitable business actions. It is a suite of software and services that
transforms data into actionable intelligence and knowledge.

BI has a direct impact on an organization's strategic, tactical and operational business decisions. BI supports
fact-based decision making using historical data rather than assumptions and gut feeling.

BI tools perform data analysis and create reports, summaries, dashboards, maps, graphs, and charts to provide
users with detailed intelligence about the nature of the business.

SSIS - How to Find The Version Of SSIS Package From Dtsx File

Scenario: 

Let's say you have just started working for a company and they point you to a folder that holds SSIS packages.
You need to find out the version of each SSIS package and schedule it on SQL Server 2008 or SQL Server 2012
according to its version.

Solution:

To find out the version of an SSIS package, we need to read the .dtsx file itself. We can open the file with
different programs such as Internet Explorer, Notepad or WordPad. The .dtsx files are XML files, and the
property we need to look for is "PackageFormatVersion".

For SSIS 2008

<DTS:Property DTS:Name="PackageFormatVersion">3</DTS:Property>

For SSIS 2012

<DTS:Property DTS:Name="PackageFormatVersion">6</DTS:Property>
Fig 1: SSIS Package with different versions

As you can see, I have two SSIS packages, but I can't tell from the files alone whether they are SSIS 2008 or
SSIS 2012.

Right click on Package.dtsx and go to Open With. You can choose the program you want to use for this. I
opened it with Notepad.
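
When there are many packages in the folder, the same check can be scripted instead of opening each file by hand. The following is a minimal Python sketch, not part of SSIS itself. It assumes the property is stored as a DTS:Property element in the "www.microsoft.com/SqlServer/Dts" namespace, and it only maps the two values mentioned above (3 and 6); the function names and the SAMPLE string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: .dtsx files declare the DTS prefix with this namespace URI.
DTS_NS = "www.microsoft.com/SqlServer/Dts"

# Only the two values discussed in the text are mapped here.
RELEASES = {"3": "SSIS 2008", "6": "SSIS 2012"}


def package_format_version(dtsx_xml):
    """Return the PackageFormatVersion value from .dtsx content (str or bytes), or None."""
    root = ET.fromstring(dtsx_xml)
    for prop in root.iter("{%s}Property" % DTS_NS):
        if prop.get("{%s}Name" % DTS_NS) == "PackageFormatVersion":
            return (prop.text or "").strip()
    return None


def describe_package(dtsx_xml):
    """Translate the format version into a release name, or 'unknown' if it is not mapped."""
    return RELEASES.get(package_format_version(dtsx_xml), "unknown")


# Hypothetical, heavily shortened package content used only for demonstration.
SAMPLE = (
    '<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts">'
    '<DTS:Property DTS:Name="PackageFormatVersion">6</DTS:Property>'
    '</DTS:Executable>'
)
print(describe_package(SAMPLE))
```

For a real file, the content could be read with pathlib.Path("Package.dtsx").read_bytes() and passed to describe_package.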

SSIS - What Is The Difference Between Control Flow and Data Flow In SSIS ?

Control Flow: 

Control Flow is the part of a SQL Server Integration Services package where you handle the flow of operations
or tasks.
Let's say you are reading a text file from a folder by using a Data Flow Task. If the Data Flow Task completes
successfully, you want to run a File System Task to move the file from the source folder to the archive folder.
If the Data Flow Task fails, you want to send an email to your users by using a Send Mail Task. Precedence
constraints are used to control this execution flow.

Data Flow: 

Data Flow is the part of a SQL Server Integration Services package where data is extracted by using Data Flow
Sources (OLE DB Source, Raw File Source, Flat File Source, Excel Source etc.). After the data is extracted, Data
Flow Transformations such as Data Conversion, Derived Column, Lookup, Multicast, Merge etc. are used to
implement the business logic, and finally the data is written to Data Flow Destinations (OLE DB Destination,
Flat File Destination, Excel Destination, DataReader Destination, ADO NET Destination etc.).

What Is Parallel Execution In SSIS, How Many Tasks A SSIS Package Can Execute In Parallel?

In simple words, if you place more than one task on the Control Flow pane and do not connect them with
precedence constraints, the tasks will run in parallel.

This can help speed up the process when we load data from a source database to a staging database and it does
not matter which table is loaded first.

This is great! So if I need to load 100 staging tables from the source database, can I run all of them in parallel?

Yes, you can place all of them in parallel, but the package will not execute all 100 at the same moment.

In this post I am considering the default settings, which means our SSIS package will only be able to execute
 Total Tasks = Number of processors of machine + 2
at the same time.

How would I know how many processors are on my machine?


There are a couple of ways to find that out quickly.

1-Connect to SQL Server by using SSMS if it is installed on the machine, right click on the instance name, go to
Properties and then General, and you will be able to see the number of processors.
Fig 1: Find Number of Processors from SQL Server Instance

2-Click on Start, type "Device Manager" in Search and open Device Manager. Click on Processors and you will
see them listed there.

Fig 2: Find out the Number of Processors on Computer by using Device Manager
My machine has 4 processors, so the maximum number of tasks that an SSIS package can execute at the same time
on my machine is 4 (processors) + 2 = 6 with the default setting.

What is the MaxConcurrentExecutables property on a Package level?

MaxConcurrentExecutables, a package-level property in SSIS, determines the number of control flow items that
can be executed in parallel. The default value is -1, which is equivalent to the number of processors (logical
and physical) plus 2.

For example, in the package below, running on my machine with 4 processors and MaxConcurrentExecutables =
-1, you can see that 6 tasks have completed execution and 6 are currently running. It executes 6 at a time
because 4 processors + 2 = 6 threads.

This applies to all versions of SSIS. Parallelism is powerful when your goal is to complete a process as quickly
as possible, especially when the tasks in a control flow are independent of each other.

If you're thinking of increasing this setting to infinity hoping to achieve a Nobel prize in performance tuning…
slow down. If the words throughput, threading and multi-tasking scare you, you should be careful with this
property. In most cases, the default setting gets the job done just fine.

What is the Engine Thread property of Data Flow Task?

Parallel execution improves performance on computers that have multiple physical or logical processors. To
support parallel execution of different tasks in the package, SSIS uses two properties:
MaxConcurrentExecutables and EngineThreads.

MaxConcurrentExecutables Property

The MaxConcurrentExecutables property is a property of the package. It defines how many tasks can run
simultaneously by specifying the maximum number of SSIS threads that can execute in parallel per package.
The default value is -1, which equates to the number of physical or logical processors plus 2.

Take a package that calls other packages through 10 Execute Package tasks (the same applies to other task
types as well). With MaxConcurrentExecutables left at its default value of -1 and a server with 8 processors,
all 10 tasks are executed in parallel, as shown below:

If MaxConcurrentExecutables is changed to 4 in the above package and the package is run on the same server,
then only 4 tasks will run in parallel at a time. (Note that the image below shows the tasks being executed in
batches of 4: once 4 tasks have executed, another batch of 4 tasks is executed.)
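
The arithmetic behind the default value can be written down as a small Python sketch. This only illustrates the rule described above and is not code that SSIS exposes; the function name is hypothetical, and rejecting 0 and values below -1 is an assumption of this sketch.

```python
import os


def effective_max_concurrent(setting, processors):
    """Number of executables that may run at once for a given MaxConcurrentExecutables value."""
    if setting == -1:
        # Default: number of processors plus 2.
        return processors + 2
    if setting < 1:
        # Assumption of this sketch: 0 and values below -1 are treated as invalid.
        raise ValueError("MaxConcurrentExecutables must be -1 or a positive number")
    return setting


# The two examples from the text: 4 processors and 8 processors with the default of -1.
print(effective_max_concurrent(-1, 4))   # 4 + 2
print(effective_max_concurrent(-1, 8))   # 8 + 2
# An explicit setting caps the parallelism regardless of the processor count.
print(effective_max_concurrent(4, 8))
# The same rule applied to the machine running this script.
print(effective_max_concurrent(-1, os.cpu_count() or 1))
```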

What are the Precedence Constraints in SSIS, and where and why have you used them?

Executables in SSIS are tasks or containers. A precedence constraint links 2 executables: the precedence
executable and the constrained executable. See an example below.

The running logic of the above example is shown below.


The precedence executable runs first, and then the precedence constraint is evaluated. If it returns true, the
constrained executable runs; otherwise the process is over. The result of a precedence constraint depends on
whether the execution result of the precedence executable matches the precedence constraint setting, on the
result of an expression, or on both. An executable produces either a Success or a Failure result after it
runs. A precedence constraint can be set to one of 3 values: Success, Failure and Completion. The following
table shows each setting and the result of the precedence constraint.

Look & Feel    Setting       Result

Green Arrow    Success       Returns true only if the precedence executable runs successfully

Red Arrow      Failure       Returns true only if the precedence executable fails

Blue Arrow     Completion    Always returns true no matter what result the precedence executable returns

The result of a precedence constraint can also be evaluated by an expression you define, or by both the
expression and the execution result. See the example below.

1. Create a new package in the project LearnSSIS1 and rename the package to
PrecedenceContraint.dtsx.
2. Open the package and define a variable V of type Int32 with 1 as its default value.

3. Drag and drop a Script Task onto the package and rename the task to "Source". Then copy and paste the
task and rename the copy to "Destination". Finally, link them together with a precedence constraint as
follows.

4. Run the package and you will get the following result.

5. Click "Stop Debugging" and open the Source or the Destination script task. You will find the default source
code below.
6. Change the code in line 95 of the "Source" task to the following and click the "OK" button to save it.

Dts.TaskResult = (int)ScriptResults.Failure;

7. Run the package again and you will get the result below. Then click "Stop Debugging".

The "Source" task returns "Failure", but the precedence constraint was defined to continue only if the
source returns "Success". In this case, the package stopped running after the "Source" task executed.

8. Right click the green arrow and choose "Failure".

9. Run the package again. You will see that the "Destination" task runs successfully because the "Source"
task returns Failure, which meets the precedence constraint setting.
10. Right click the red failure arrow and choose "Edit..." to open precedence constraint editor.

In Constraint options, you can choose one of the 4 evaluation operations.

o Constraint
o Expression
o Expression and Constraint
o Expression or Constraint

The default one is Constraint, which we already tried in the previous steps. The value is the constraint
value we set, and the current value is Failure. That means we run the "Destination" task only if the
"Source" task returns Failure.
11. Select "Expression and Constraint", click the "..." button next to Expression to open Expression Builder,
create the expression @[User::V] == 1 and click the OK button.

12. Leave "Multiple constraints" as the default "Logical AND" and click OK to save the setting. Then run the
package.

You can see that the "Destination" task ran because both the constraint and the expression return true.
"fx" in the above picture means that the precedence constraint contains an expression condition.

13. Right click the green arrow and choose "Failure".


14. Click "Stop Debugging" and change the value of variable V to 2 in the Variables window. Run the package
again. You will see that the "Destination" task has not run because the expression @[User::V] == 1 returns
false.

15. Right click the precedence constraint, change the evaluation operation to "Expression or Constraint"
and click OK.
16. Run the package again and you will find that the "Destination" task runs successfully because the
constraint and the expression are combined with a logical OR ( True || False = True ).
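
The evaluation rules used in the walkthrough above can be summarised in a short Python sketch. It is only a model of the logic described in this section, not SSIS code, and the function names are made up for illustration.

```python
def constraint_matches(setting, executable_result):
    """Success/Failure must match the result of the precedence executable; Completion always matches."""
    if setting == "Completion":
        return True
    return setting == executable_result


def precedence_passes(operation, setting, executable_result, expression_result):
    """Decide whether the constrained executable runs, for the 4 evaluation operations."""
    constraint_ok = constraint_matches(setting, executable_result)
    if operation == "Constraint":
        return constraint_ok
    if operation == "Expression":
        return expression_result
    if operation == "Expression and Constraint":
        return constraint_ok and expression_result
    if operation == "Expression or Constraint":
        return constraint_ok or expression_result
    raise ValueError("unknown evaluation operation: " + operation)


# Replaying the steps: "Source" returns Failure and the constraint is set to Failure.
for v in (1, 2):
    expression = (v == 1)  # stands in for @[User::V] == 1
    print(v, precedence_passes("Expression and Constraint", "Failure", "Failure", expression))
print(precedence_passes("Expression or Constraint", "Failure", "Failure", False))
```

With V = 2 the AND combination blocks "Destination", while the OR combination lets it run because the constraint part is still satisfied.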

What is the difference between the Success and the Completion value of Precedence Constraint?

Precedence constraints are the arrows that we use in the Control Flow pane to connect tasks. They control the
execution flow of tasks, that is, under what condition execution control passes to which task.

The default constraint is Success, which is represented by a green arrow between tasks.
Fig 1: Precedence Constraint on Success

In Fig 1, the Execute SQL Task has to execute successfully to pass execution control to the Data Flow Tasks. If
the Execute SQL Task fails, the Data Flow Tasks will not execute.

Fig 2: On Successful Execution of Execute SQL Task

If the Execute SQL Task fails, control will not pass to the Data Flow Tasks, as shown in Fig 3.

Fig 3: On Failure of Execute SQL Task

There can be requirements where we always want to execute the Data Flow Tasks, whether the Execute SQL Task
succeeds or fails. For this requirement, we need to set the precedence constraint to Completion.
Double click on the green arrow between the tasks and then configure it as shown below in Fig 4.
Fig 4: Configure Precedence Constraint for Completion

The Data Flow Tasks will execute on Completion of Execute SQL Task ( Completion can be success or failure
status).

Fig 5: Execution of SSIS Package (Precedence Constraint  Completion configuration)


What is the DelayValidation property of Data Flow Task? Why does one use this property?

DelayValidation Property:

The DelayValidation property is available at the Task, Connection Manager, Container and Package level. By
default the value of this property is False, which means that when the package starts execution, it validates
all the tasks, containers, connection managers and the objects (tables, views, stored procedures etc.) used by
them. If any object such as a table or a destination file is not available, package validation fails and the
package stops execution.

By setting this property to True, we tell our SSIS package not to validate that task, connection manager or
entire package at the start but to validate it at run time. Let me explain with some real time examples.

Real Time Examples for Using DelayValidation Property in SSIS Package:

Example 1: Make Use of Temp Table in SSIS Package

Let's say that instead of creating permanent staging tables we decide to use temp tables in our SSIS package.
We want to load the data from a flat file source into a temp table and then use this temp table in other tasks.
Before we can use the temp table in a Data Flow Task, we have to create it with an Execute SQL Task that runs
before the Data Flow Task. If we leave DelayValidation=False, the package will try to validate the temp table
in the Data Flow Task at the start. As the temp table is not available at that point, the package will fail. To
avoid this, we can set the DelayValidation property to True so that the package skips validation at the start.
By the time the package reaches the Data Flow Task, the temp table will have been created by the Execute SQL
Task in the step above, so the Data Flow Task will validate and load successfully.

Example  2:  Create Excel File with DateTime

To create an Excel file with a datetime in its name, you have to create an empty Excel destination file and
keep it as a template. The steps involved are

1-- Copy the template file to the required folder. While copying, you can rename the file with the datetime.

As the file will not be available when the package starts, the package will fail to validate the connection
manager and the Data Flow Task. You can set DelayValidation=True for both in their properties. By doing that
you skip the pre-validation. By the time the package reaches the Data Flow Task to load the data, you will
have created the file with the datetime by using a File System Task.
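
The copy-and-rename step can be illustrated outside SSIS with a small Python sketch. In the package itself this work is done by the File System Task; the function name, the file names and the timestamp format below are assumptions chosen for the example.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def copy_template_with_timestamp(template, target_dir, now):
    """Copy the template into target_dir, adding a datetime stamp to the file name."""
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed naming convention
    target = target_dir / (template.stem + "_" + stamp + template.suffix)
    shutil.copyfile(template, target)
    return target


# Demonstration with a throwaway folder and a placeholder template file.
work = Path(tempfile.mkdtemp())
template = work / "Template.xlsx"
template.write_bytes(b"placeholder")
created = copy_template_with_timestamp(template, work, datetime.now())
print(created.name)
```

The destination file only exists after this step has run, which is why validation of anything that points at it has to be delayed.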

Example 3:  How to Create Multiple Files Dynamically From a SQL Server Table

In another example, I read data from a table and create the file names dynamically by using that data. As the
file names are created later and no file is available when the package starts, I used the DelayValidation
property.

To set the DelayValidation property, you can right click on the Task/Connection Manager, go to Properties and
set it to True. You can also click on any Task/Connection Manager/Container and hit the F4 key to go to
Properties.
Fig 1: Package Level DelayValidation Property

Fig 2: How to Set DelayValidation Property on SSIS Task


Fig 3: How to Set DelayValidation Property to True for Connection Manager

What is RetainSameConnection Property on Connection Manager in SSIS Package? Why is it used?

RetainSameConnection Properties on SSIS

We can see a connection manager as a factory that creates connections. Each time a connection
manager is used by an SSIS component, a new connection is created. If we set
RetainSameConnection to True, we tell the connection manager to create only one connection and
hold on to it as long as the package runs. This prevents temporary tables from being dropped and
transactions from being lost between tasks.

RetainSameConnection is a property of an OLEDB Connection Manager. The default value of this
property is FALSE. With this default value, the SSIS execution engine opens a new OLEDB connection
for each task and closes that connection when the task is complete.

We can set the property value to TRUE, and then it will open just one OLEDB connection with a server
and keep it alive till the end of the package execution. The property can be set via the Properties
window for the OLEDB Connection Manager.

Here is an example where I use this property. I take a Foreach Loop Container. A folder holds more
than 1,400 files, and I want to insert a record with each file name into a table.

Take a Foreach Loop Container.

Set the enumerator to the file enumerator.

Configure the collection: select the folder and the file extension, and check "Name only" because we
are storing only the name. Select "Traverse subfolders", which means that if the folder contains
subfolders, they will be traversed as well.

We need to map the variable.

Click OK.

Now I take an Execute SQL Task to insert the file name into the table.

Now I configure the Execute SQL Task.

Map the variable and click OK.

Remember that at this point I have still not set the RetainSameConnection property. Its default value
is False.

Let's run the package and see the result.

Completed successfully. Let's see the time taken by it.

It took around 25 seconds.

Now I set the RetainSameConnection property to True.

Now I execute the package.

When we set the property to True, the package takes less time to complete, because in the previous
case it opened and closed the connection every time. In the second case it opens the connection
once and closes it only after the work is complete. If we have a large transaction, it is better to
set RetainSameConnection to True. For a small transaction, leave it as False.

If we create a temp table in SSIS Package and want to use it in other tasks, which properties do we need to use?

SSIS - How To Create / Use Temp Table In SSIS Package


Scenario:

We have to create an SSIS package for Upsert (Insert/Update). We get a csv file with millions
of records (Id, Name, Address columns). If a record comes with a new Id, we need to
insert that record into the dbo.Customer table (id, name, address), and for existing IDs we need
to update the records.

After doing some analysis, we got to know that at least 100,000 records need to be
updated per day. To perform the above task we can
use a Lookup Transformation to find existing and non-existing records. Any non-existing
IDs can be directly inserted into the dbo.Customer table, but for the update we have to
use the OLE DB Command transformation. The OLE DB Command transformation is slow: it
updates one row at a time, and for 100,000 records it will take a long time.

How about inserting the records into a staging table and writing a TSQL statement to
insert/update the records? Good idea! It will be fast and easy to do. But my architect does not
want to create a new table :(

Solution:

Ok, How about we create Temp table and then use it in our package to perform the
above task and once done, the Temp table will be gone!

Let's start with step by step approach

Step 1:

Prepare Source.csv file on desktop by using below data


Id,Name,Address
1,Aamir,ABC ADDRESS
2,Raza,Test Address
3,July, 123 River Side CA
4,Robert,540 Rio Rancho NM

Step 2:
Create the dbo.Customer table by using the script below

USE TestDB
GO

CREATE TABLE dbo.Customer
  (
     ID      INT,
     Name    VARCHAR(100),
     Address VARCHAR(100)
  )

Step 3: 

Create an SSIS package to load the csv file into the dbo.Customer table (insert new records and
update existing ones).
Create an OLE DB Connection to the database where your dbo.Customer table exists.
Right click on the connection and then click Properties, or click on the connection and press F4
to go to Properties.
Set RetainSameConnection=True.

Fig 1: Set RetainSameConnection to True for OLE DB Connection

Step 4: 

Create ##Temp table by using Execute SQL Task as shown below by using 
Create Table ##Temp(ID INT, Name VARCHAR(100),ADDRESS VARCHAR(100))
Fig 2: Create ##Temp table by using Execute SQL Task

Step 5: 

Bring a Data Flow Task to the Control Flow surface and then connect the Execute SQL Task to it.
Inside the Data Flow Task, bring in a Flat File Source and connect it to the Source.csv file that
you created in Step 1.
Drag in a Lookup Transformation and configure it as shown below. Our goal is to insert any
record whose Id does not exist in the dbo.Customer table, and if the ID exists we want to update
that record. Instead of using the OLE DB Command Transformation, we will insert the records
that need to be updated into the ##Temp table inside the Data Flow Task.
Fig 3: Configure Lookup Transformation ( Redirect rows to no match output)
Fig 4: Choose Id from dbo.Customer for lookup
Fig 5: Map the Source Id to dbo.Customer.ID for lookup

Step 6:

Bring in an OLE DB Destination from Data Flow Items as shown. Connect the No
Match Output (new records) of the Lookup to the OLE DB Destination and choose the destination
table (dbo.Customer).
Fig 6: Insert new records by using No Match Output of Lookup Transformation

As we do not want to use the OLE DB Command transformation for updates inside the Data Flow
Task, let's write all records that need to be updated into the ##Temp table by using an OLE
DB Destination. We will not be able to see the ##Temp table in the drop-down list of the OLE DB
Destination. Here are the two steps we need to take:
i) Create a variable named TableName with the value ##Temp, as shown below
Fig 7: TableName variable holding Temp Table Name

ii) Go to SSMS and create the ##Temp table (if you do not create this table, you will not
be able to map the columns in the OLE DB Destination)
Create Table ##Temp(ID INT, Name VARCHAR(100),ADDRESS VARCHAR(100))

Bring in the OLE DB Destination and map it to the TableName variable as shown below.

Fig 8: Configure OLE DB Destination to use TableName variable for Destination Table
Name.
Fig 9: Map the Source Columns to ##Temp Table Columns

After all the configuration, our Data Flow will look like the figure below. I renamed the
transformations to give a better picture of what we are doing in this Data Flow Task.

Fig 10: Data Flow Task with ##Temp Table Destination.


Step 7:

Go to Control Flow Surface and Drag Execute SQL Task to write update statement.

UPDATE DST 
SET DST.Name=SRC.Name
,DST.ADDRESS=SRC.ADDRESS
FROM  dbo.Customer DST
INNER JOIN ##Temp SRC
ON DST.ID=SRC.ID

Fig 11: Execute SQL Task to Update Dbo.Customer from ##Temp 

Our final SSIS Package will look like below


Fig 12: Insert/Update Package by using Temp Table for Updates

If we try to run the SSIS package, it might complain that ##Temp does not exist. Go to the
package properties by right clicking in the Control Flow pane and set DelayValidation=True.
By setting DelayValidation we are asking the package not to validate any objects at the start, as
the ##Temp table does not exist at this point and will be created later in the package.

Fig 13: Set Delay Validation=True

Run the package a couple of times and check the data in the dbo.Customer table. The data
should be loaded. Now let's go to the Source.csv file, change some values in the Name and
Address columns and run the package one more time to make sure the update logic is
working fine.
Here is the data after the update.
Id,Name,Address
1,Aamir1,Test  ADDRESS
2,Raza1,Test Address
3,July, 123 River Side CA USA
4,Robert,540 Rio Rancho NM

Fig 14: Package Execution After Updating Records in Source.csv file

As we can see, the records are updated wherever we made changes to the Name and
Address values.
Fig 16: dbo.Customer data after Upsert
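
The logic of the package above (Lookup split, direct insert of new rows, staged update of existing rows) can be modelled in a few lines of Python. This is only a sketch of the data movement, with an in-memory dictionary standing in for dbo.Customer and a plain list standing in for ##Temp; the function names are hypothetical.

```python
def split_by_lookup(source_rows, existing_ids):
    """Mimic the Lookup: rows with an unknown Id go to 'no match', the rest to 'match'."""
    no_match, match = [], []
    for row in source_rows:
        (match if row["Id"] in existing_ids else no_match).append(row)
    return no_match, match


def upsert(customer, source_rows):
    """customer maps ID -> {"Name": ..., "Address": ...} and is modified in place."""
    existing_ids = set(customer)  # snapshot taken before any insert
    new_rows, staged_rows = split_by_lookup(source_rows, existing_ids)
    # No Match Output: insert directly into the target.
    for row in new_rows:
        customer[row["Id"]] = {"Name": row["Name"], "Address": row["Address"]}
    # Match Output: staged rows are applied as one set-based update afterwards.
    for row in staged_rows:
        customer[row["Id"]].update(Name=row["Name"], Address=row["Address"])
    return len(new_rows), len(staged_rows)


customer = {}
print(upsert(customer, [{"Id": 1, "Name": "Aamir", "Address": "ABC ADDRESS"}]))
print(upsert(customer, [{"Id": 1, "Name": "Aamir1", "Address": "Test  ADDRESS"}]))
print(customer)
```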

What is data Viewer in SSIS? Is data viewer available in ControlFlow or Data Flow?

SSIS - How to Use Data Viewer in SSIS Package

Scenario:

Let's say we are developing a package that extracts some records from a source,
implements some business logic by using different transformations and finally loads them into a
destination (table/file). When we look at the destination, a record is incorrect, but we are not
sure what happened to the source record. We want to see how the record(s) change after
each transformation to find out which logic is not working correctly.

Solution:

SQL Server Integration Services (SSIS) provides the Data Viewer in the Data Flow Task. A Data
Viewer can be placed between two transformations to see the data. When we execute
our package, the Data Viewer pop-up window shows the data so we can see what changed
from input to output.

In this example we are extracting a few records from the source, and we want to see what we are
extracting. We have used an Aggregate transformation that groups by CountryName
and performs a Sum operation on SaleAmount. We can create a second data viewer after the
Aggregate transformation to see the data.
To use a Data Viewer between transformations, double click on the green connection that
exists between two transformations. It will open the Data Viewer Editor (Data Flow Path
Editor).

There are three options in the left pane:


General: Provides general information
MetaData: Provides metadata information about the columns
Data Viewer: This is the tab where we can select the columns that we want to
include in the Data Viewer.

In SSIS 2008/R2 and previous versions, other options were available in the Data
Viewer. Those options have been removed. Only the Grid option is left in SSIS 2012 and later
versions, and it is not even called Grid anymore but simply Data Viewer.
Data Viewer configuration window in SSIS 2008 R2 and earlier

Data Viewer Editor in SSIS 2012 and later versions


Data Viewer Editor window with only the Data Viewer tab in SSIS 2012 and later versions

Once the data viewers are created, we can execute our package. We will be able to see the
data at different stages of execution.
How to use Data Viewer in SSIS Package to view data while debugging SSIS Package
We can hit the Play button in the Data Viewer output window to go to the next Data Viewer. The
data can also be copied from the Data Viewer and used for testing.

What is the difference between Checkpoint and Breakpoint in SSIS?

Integration Services Checkpoints to restart package from failure

Problem

We have a number of SSIS packages that routinely fail for various reasons such as a
particular file is not found, an external FTP server is unavailable, etc. In most cases
these error conditions are just a temporary situation and we can simply rerun the
package at a later time and it will be successful. The issue, however, is that we do
not want to rerun the tasks in the package that have already completed
successfully. Is there a way that we can restart an SSIS package at the point of
failure and skip any tasks that were successfully completed in the previous execution
of the package?

Solution

SSIS provides a Checkpoint capability which allows a package to restart at the point
of failure. The Checkpoint implementation writes pertinent information to an XML file
(i.e. the Checkpoint file) while the package is executing to record tasks that are
completed successfully and the values of package variables so that the package's
"state" can be restored to what it was when the package failed. When the package
completes successfully, the Checkpoint file is removed; the next time the package
runs it starts executing from the beginning since there will be no Checkpoint file
found. When a package fails, the Checkpoint file remains on disk and can be used
the next time the package is executed to restore the values of package variables and
restart at the point of failure.
The starting point for implementing Checkpoints in a package is with the SSIS
package properties. You will find these properties in the Properties window under the
Checkpoints heading:

• CheckpointFileName - Specify the full path to the Checkpoint file that the package
uses to save the value of package variables and log completed tasks. Rather than
using a hard-coded path as shown above, it's a good idea to use an expression that
concatenates a path defined in a package variable and the package name.
• CheckpointUsage - Determines if/how checkpoints are used. Choose from these
options: Never (default), IfExists, or Always. Never indicates that you are not using
Checkpoints. IfExists is the typical setting and implements the restart at the point of
failure behavior. If a Checkpoint file is found it is used to restore package variable
values and restart at the point of failure. If a Checkpoint file is not found the package
starts execution with the first task. The Always choice raises an error if the
Checkpoint file does not exist.
• SaveCheckpoints - Choose from these options: True or False (default). You must
select True to implement the Checkpoint behavior.

After setting the Checkpoint SSIS package properties, you need to set these
properties under the Execution heading at the individual task level:

• FailPackageOnFailure - Choose from these options: True or False (default). True
indicates that the SSIS package fails if this task fails; this implements the restart at
the point of failure behavior when the SSIS package property SaveCheckpoints is
True and CheckpointUsage is IfExists.
• FailParentOnFailure - Choose from these options: True or False (default). Select
True when the task is inside of a container task such as the Sequence container; set
FailPackageOnFailure for the task to False; set FailPackageOnFailure for the
container to True.
Keep in mind that both the SSIS package Checkpoint properties and the individual
task properties need to be set appropriately (as described above) in order to
implement the restart at the point of failure behavior.

Before wrapping up the discussion on Checkpoints, let's differentiate the restart from
the point of failure behavior with that of a database transaction. The typical behavior
in a database transaction where we have multiple T-SQL commands is that either
they all succeed or none of them succeed (i.e. on failure any previous commands
are rolled back). The Checkpoint behavior, essentially, is that each command (i.e.
task in the SSIS package) is committed upon completion. If a failure occurs the
previous commands are not rolled back since they have already been committed
upon completion.

Let's wrap up this discussion with a simple example to demonstrate the restart at the
point of failure behavior of Checkpoints. We have an SSIS package with Checkpoint
processing setup to restart at the point of failure as described above. The package
has two Execute SQL tasks where the first will succeed and the second will fail. We
will see the following output when running the package in BIDS:

Task 1 is green; it executed successfully. Task 2 is red; it failed. If we run the
package a second time we will see the following output:

Notice that Task 1 is neither green nor red; in fact it was not executed. The package
began execution with Task 2; Task 1 was skipped because it ran successfully the
last time the package was run. The first run ended when Task 2 failed. The second
run demonstrates the restart at the point of failure behavior.

Caveats:

• SSIS does not persist the value of Object variables in the Checkpoint file.
• When you are running an SSIS package that uses Checkpoints, remember that when
you rerun the package after a failure, the values of package variables will be restored
to what they were when the package failed. If you make any changes to package
configuration values the package will not pick up these changes in a restart after
failure. Where the failure is caused by an erroneous package configuration value,
correct the value and remove the Checkpoint file before you rerun the package.
• For a Data Flow task you set the FailPackageOnFailure or FailParentOnFailure
properties to True as discussed above. However, there is no restart capability for the
tasks inside of the Data Flow; in other words you can restart the package at the Data
Flow task but you cannot restart within the Data Flow task.

Breakpoints in SSIS
A breakpoint is an intentional stop marked in the code of an application where execution pauses for
debugging. This allows the programmer to inspect the internal state of the application at that point.
When we are developing a package in SSIS, we need to test it and troubleshoot issues, and it is helpful
to know the status of the data at certain points in the execution of the package.

In other words, breakpoints let us debug the SSIS package and view the values of variables. They
enable us to stop a package during execution and view the status of these items.
We can see the value of a variable immediately before or after the execution of a task.

Breakpoints can be set on the package, on a control flow task, or on a container.

Creating a simple package

Here is a simple example that uses a For Loop container.

Open SSDT.

Add a For Loop container.

Create a variable named Count and set its value to 1.

Now configure the For Loop properties.

Click OK.

Now add a Script task inside the For Loop container to display the values.

Assign the values.

Edit the script.

Write simple code to display the Count value.

Now the package is ready to execute. Execute the package.

The package executed successfully.

Let’s see how to use breakpoints.


To use breakpoints, right-click the package and click Edit Breakpoints.

The Set Breakpoints window opens.

Select the check box for each event at which we want to see the values.

Each option in the Set Breakpoints window stops the package execution at a different point during
the task:

• OnPreExecute - Just before the task executes
• OnPostExecute - Directly after the task completes
• OnError - When an error occurs in the task
• OnWarning - When a warning occurs in the task
• OnInformation - When the task provides information
• OnTaskFailed - When the task fails
• OnProgress - To update progress on task execution
• OnQueryCancel - When the task can cancel execution
• OnVariableValueChanged - When the value of a variable changes (the
RaiseChangedEvent property of the variable must be set to True)
• OnCustomEvent - When the custom task-defined events occur
• Loop Iteration - At the beginning of each loop cycle

The most commonly used events in breakpoints are OnPreExecute, OnPostExecute, OnError,
OnWarning and Loop Iteration.

The other properties in the Set Breakpoints window are Hit Count and Hit Count Type.

Four Hit Count Types exist:

• Always - the breakpoint stops the package every time the breakpoint fires.
• Hit Count Equals - the breakpoint stops the package when the breakpoint has fired the number
of times listed in Hit Count.
• Hit Count Greater than or Equal to - the breakpoint stops the package when the breakpoint
reaches the number listed in Hit Count and every time afterwards.
• Hit Count Multiple - the breakpoint stops the package when the breakpoint reaches the
number listed in Hit Count and every multiple of the Hit Count number; a Hit Count of 2 stops the
package on every other breakpoint event.

Here I am selecting "Break at the beginning of every iteration of the loop".

Hit Count Type: as discussed above, you can select the hit count type and hit count as
needed.

Now I am executing the package.

To see the result, open the Locals window.

Click the Locals window.

To continue, click the Continue button.

We see the next value.

The example above shows how to use breakpoints on a container. Similarly, we can set breakpoints on
a Script task as well as on Control Flow and Data Flow tasks.

Setting breakpoints on a Script task

Setting breakpoints on the control flow

Right-click anywhere in the control flow and select Edit Breakpoints.

Select the breakpoint condition.

Click OK.

On the control flow you will see a red circle.

To view values at execution time in the Data Flow pane, we use a Data Viewer.
Will my package run successfully by using SQL Server Agent if I have data viewers and
Breakpoint enabled?

Yes. Breakpoints and Data Viewers are only artifacts that have meaning within the debugger.
If running your package from SQL Agent fails, there is a whole host of things that
could be wrong, generally permission related, but a data viewer or a breakpoint will not
be one of them.

What are different ways to execute your SSIS Package?

An SSIS package can be executed in multiple ways. Here are some of them.

1) By using BIDS/SSDT

You can run your SSIS package by using Business Intelligence Development Studio
(BIDS, available for SSIS 2005 and 2008/R2) or SQL Server Data Tools (SSDT, for
SSIS 2012 and SSIS 2014). These are the tools you use while developing, testing and
debugging your SSIS packages.

2) DtExecUI
The Execute Package Utility (DtExecUI) is a graphical interface for running SSIS packages.
The utility can run packages from different locations, such as a SQL Server database, the
SSIS Package Store, or the file system.

When you connect to an SSIS instance by using SSMS and run a package, it
starts DtExecUI. The graphical interface provides options to change the
values of variables, connection managers, etc.

If your packages are stored in the file system and you double-click the .dtsx file, it
opens with DtExecUI. It is a stand-alone utility.

DtExecUI can be started from the command line as well.

3) Dtexec.exe
Dtexec.exe is the command-line way to run your package. You have to provide information
such as the package path. You can also provide the values of variables or connection
managers on the command line to run the package with specific requirements.

4) SQL Server Agent Job
SQL Server Agent can be used to create a job that runs the SSIS package on demand
or on a schedule. The SQL Server Agent job can be a single step calling one SSIS package,
or it can consist of multiple steps calling more than one SSIS package. In most
companies the packages are scheduled by using SQL Server Agent. SQL Server Agent
can access packages that are stored in SQL Server or in the file system.

5) Windows Scheduler or any third-party scheduler

One way is to create a batch file in which you use dtexec.exe with the required parameters.
The batch file can be executed by Windows Scheduler or any third-party scheduler.

6) Run SSIS Package Programmatically

You can run the package programmatically; sample code is available at
http://msdn.microsoft.com/en-us/library/ms136090.aspx

Another option is to run an SSIS package from Excel: a button created with VBA calls
dtexec.exe to execute the package. Any program that can start dtexec.exe, including a
custom application, can run your SSIS package.

Can I run a SSIS Package by using a Stored Procedure?

Executing a SSIS Package from Stored Procedure

There are basically two ways you can execute your SSIS package from a user stored
procedure. First, you can create a SQL Server Agent job that has a step to
execute the SSIS package, and wherever you want to execute it, you can use the
sp_start_job system stored procedure to kick off the job, which in turn executes the
package. The job does not need a schedule attached if you only want to start it from
the user stored procedure rather than at a defined time or interval. The disadvantage of
this approach is that there is no easy way to pass values (assign values to SSIS
package variables) at run time. You might need a metadata table from which your SSIS
package retrieves the runtime values.
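
The first approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the table dbo.PackageRuntimeValue, the procedure dbo.usp_StartDataTransfer and the job name 'Run DataTransfer Package' are hypothetical names, and the package itself would have to be built to read its runtime values from that table.

```sql
-- Hypothetical metadata table the package reads its runtime values from.
CREATE TABLE dbo.PackageRuntimeValue
(
    VariableName  SYSNAME       NOT NULL PRIMARY KEY,
    VariableValue NVARCHAR(400) NOT NULL
);
GO

-- Hypothetical wrapper procedure: store the value, then start the Agent job.
CREATE PROCEDURE dbo.usp_StartDataTransfer
    @ServerName NVARCHAR(200)
AS
BEGIN
    SET NOCOUNT ON;

    -- Save the runtime value where the package can pick it up.
    UPDATE dbo.PackageRuntimeValue
       SET VariableValue = @ServerName
     WHERE VariableName = N'ServerName';

    IF @@ROWCOUNT = 0
        INSERT INTO dbo.PackageRuntimeValue (VariableName, VariableValue)
        VALUES (N'ServerName', @ServerName);

    -- sp_start_job only requests the start; it does not wait for the package.
    DECLARE @ReturnCode INT;
    EXEC @ReturnCode = msdb.dbo.sp_start_job
         @job_name = N'Run DataTransfer Package';  -- assumed job name

    IF @ReturnCode <> 0
        RAISERROR(N'SQL Server Agent could not start the job.', 16, 1);
END;
GO
```

Because sp_start_job returns as soon as the job has been asked to start, the caller would need to check the job history separately to learn whether the package succeeded.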

Second, you can enable xp_cmdshell extended stored procedure, and using it you can
execute DTEXEC utility to execute your SSIS package. The disadvantage of using this
approach is that enablement of xp_cmdshell poses security threats (operating system
level access) and hence by default it's disabled. However using this approach provides
finer level control of passing SSIS package variables' runtime values easily. In this
article, I am assuming the first approach is simple and straightforward and thus I will
jump directly to the second approach.
Please make sure you have enabled the xp_cmdshell extended stored
procedure, or else you will get an exception like this:

Msg 15281, Level 16, State 1, Procedure xp_cmdshell, Line 1
SQL Server blocked access to procedure 'sys.xp_cmdshell' of component 'xp_cmdshell'
because this component is turned off as part of the security configuration for this server.
A system administrator can enable the use of 'xp_cmdshell' by using sp_configure. For
more information about enabling 'xp_cmdshell', see "Surface Area Configuration" in SQL
Server Books Online.

Enabling xp_cmdshell

To enable xp_cmdshell, use the sp_configure system stored procedure with
advanced options like this:

Enabling xp_cmdshell component

USE master
GO
-- To allow advanced options to be changed.
EXEC sp_configure 'show advanced options', 1
GO
-- To update the currently configured value for advanced options.
-- WITH OVERRIDE disables the configuration value checking if the value is valid
RECONFIGURE WITH OVERRIDE
GO
-- To enable the xp_cmdshell component.
EXEC sp_configure 'xp_cmdshell', 1
GO
RECONFIGURE WITH OVERRIDE
GO
-- Revert back the advance option
EXEC sp_configure 'show advanced options', 0
GO
RECONFIGURE WITH OVERRIDE
GO
Executing a SSIS package stored in SQL Server from the user stored procedure

Once xp_cmdshell extended stored procedure is enabled, you can execute any operating
system command as a command string. In our case, we will be using DTEXEC utility of
SSIS to execute the package. Since the SSIS package is stored in SQL Server, we need
to use /SQL switch with DTEXEC command and /SET switch to set the values of SSIS
variables as shown below:

DTEXEC calls SSIS package from SQL Server

DECLARE @SQLQuery AS VARCHAR(2000)

DECLARE @ServerName VARCHAR(200) = 'ARSHAD-LAPPY'

SET @SQLQuery = 'DTExec /SQL ^"\DataTransfer^" '


SET @SQLQuery = @SQLQuery + ' /SET \Package.Variables[ServerName].Value;^"'+
@ServerName + '^"'
EXEC master..xp_cmdshell @SQLQuery
GO

Figure 1 - DTEXEC calls SSIS package from SQL Server

Executing a SSIS package stored in file system from the user stored procedure

As our SSIS package is now stored in the file system, we need to use the /FILE or /F switch
with the DTEXEC command and the /SET switch to set the values of SSIS variables as shown
below. The xp_cmdshell extended stored procedure returns the output of the command, if
there is any, as rows of text. If you don't want this output to be
returned, you can use the second parameter (NO_OUTPUT) of this stored procedure:

DTEXEC calls SSIS package from File System

DECLARE @SQLQuery AS VARCHAR(2000)

DECLARE @ServerName VARCHAR(200) = 'ARSHAD-LAPPY'

SET @SQLQuery = 'DTExec /FILE ^"E:\DataTransfer.dtsx^" '


SET @SQLQuery = @SQLQuery + ' /SET \Package.Variables[ServerName].Value;^"'+
@ServerName + '^"'

EXEC master..xp_cmdshell @SQLQuery


GO
Figure 2 - DTEXEC calls SSIS package from File System
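
As a small sketch of the NO_OUTPUT option mentioned above, the call below suppresses the rows of text and checks the return code instead. The package path is the one used in Figure 2; treating any non-zero return code as a failure is an assumption made for this illustration.

```sql
DECLARE @Command    VARCHAR(2000) = 'DTExec /FILE "E:\DataTransfer.dtsx"';
DECLARE @ReturnCode INT;

-- NO_OUTPUT suppresses the result rows; the return code is still captured.
EXEC @ReturnCode = master..xp_cmdshell @Command, NO_OUTPUT;

-- Assumption for this sketch: anything other than 0 means the run failed.
IF @ReturnCode <> 0
    RAISERROR('DTExec reported a failure (return code %d).', 16, 1, @ReturnCode);
```

This keeps the calling stored procedure quiet on success while still surfacing a failed run as an error.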

Notes

• xp_cmdshell is an extended stored procedure that lets you execute operating
system commands and returns any output as rows of text. This extended stored
procedure runs in synchronous mode; that means control will not return until the
command completes. By default only members of sysadmin can execute this extended
stored procedure, but permission can be granted to other users with a proxy account.
• DTEXEC is a command line utility for SSIS configuration and SSIS package
execution from SQL Server, SSIS service and file system sources. After
execution, this utility returns different exit codes that indicate what happened
during execution.

Conclusion

In this article I talked about executing an SSIS package from a user-defined stored
procedure. In the first approach, we created a job, making the SSIS package call a job
step, and executed it by calling the sp_start_job system stored procedure from the
user-defined stored procedure. In the second approach, we enabled xp_cmdshell to execute
the DTEXEC command line utility from the user-defined stored procedure.

What types of deployment are available for a SSIS Package? Explain all.

In our previous article we saw an overview of SQL Server Integration Services (SSIS)
and its benefits with a small and simple project. That article also touched on the
deployment models.

In this article we’ll discuss the deployment models available in SQL Server Integration
Services (SSIS) and the steps needed to deploy packages in both models.

So, let’s begin.
 
Integration Services supports two deployment models:

1. Package Deployment Model
2. Project Deployment Model

Prior to SSIS 2012, all versions (SSIS 2005, 2008 and 2008 R2) had only the ‘Package
Deployment Model’. SQL Server 2012 introduced a new deployment model named the
‘Project Deployment Model’.
 
Let’s see both deployment models in details.
 

Package Deployment Model


 
SQL Server Integration Services (SSIS) 2005, 2008 and 2008 R2 used this
deployment model.

• In this model, the package is the unit of deployment. We deploy one package at a
time, not a whole project as in SSIS 2012.

• Packages and configurations are saved in the file system: packages with the extension
‘.dtsx’ and configuration files with the extension ‘.dtsConfig’.

• Packages can be deployed either to the file system or to the msdb database.

• Packages are validated just before execution. You can also validate a package
with dtexec or managed code.

• Packages are run in a separate Windows process.

• During execution, events that are produced by a package are not captured
automatically. A log provider must be added to the package to capture events.

• Under this model, a package configuration is required for each and every package
in the project.

• When we deploy the same package again, it overwrites the old one, so there is no
way to check the history of how many times a package has been deployed.
The following example shows how to configure a package created in SSIS 2008
for deployment on another computer.

In this deployment model, to deploy any package, we need to go through the following four
steps:

• Step 1: Create a package configuration file.
• Step 2: Create a deployment utility.
• Step 3: Copy the Deployment folder to the destination.
• Step 4: Install the package.

Let’s take a look at each step.
 
Step 1: To create Package Configuration File:
 
In this step, create a package configuration file that updates properties of package elements
at runtime. To create a package configuration file, right-click anywhere in the Control Flow
area and select ‘Package Configurations’ as shown below:
 

 
 
This opens the Package Configurations Organizer. Check the ‘Enable package
configurations’ option and the Add button becomes available, as shown below.
 
 
 
Click on the Add button and you’ll get Package configuration Wizard as shown below.
 
 
 
Click on Next and select type of configuration you want in your package.
 
 
 
I’m selecting XML configuration file and will save the file at desired location.
 
 
 
Click on Next and select the properties you want to be exported to the configuration file.
 
  
 
Click on Next and you’ll get a summary of your configuration file.
 
 
 
Click on Finish to finalize your configuration file creation. After doing the above steps, you’ll
get something like below.
 
 
 
Now we’re done with Step 1 where we created the package configuration file.
 
Step 2: To create a deployment utility
 
At this step, we require a package deployment utility for our project, which contains the
package that we want to deploy.

To create the deployment utility, right-click your project and select Properties; you’ll get
the following window:

In this window, set CreateDeploymentUtility to ‘True’ and click OK.

Now build your project as shown below.
 

  
 
On successful build, you’ll get the following message in the output window.

------ Build started: Project: Project1_BIDS (package deployment model), Configuration: Development ------
Build started: SQL Server Integration Services project: Incremental...
Creating deployment utility...
Deployment Utility created.
Build complete -- 0 errors, 0 warnings
========== Build: 1 succeeded or up-to-date, 0 failed, 0 skipped ==========

If you go to \bin folder of your project you’ll find a newly created directory named
‘Deployment’ with some files.
 

 
 
These files are your configuration file, the manifest file and the packages you want
to deploy.

Step 3: Copy Deployment folder

Copy the Deployment folder created by the build to the target computer
on which you want to deploy your package.
 
Step 4: Package Installation
 
Install your package with help of the ‘Package Installation Wizard’ to the file system or to
an instance of SQL Server.
 
As we can see, we’ve added 3 packages in our project and if we want to deploy those
packages, we need to do it one-by-one.
 
So, this was some information regarding package deployment model. Now, let’s move to the
next deployment model. 
 

Project Deployment Model


 
The Project Deployment Model was introduced in SQL Server 2012 Integration Services.
This model provides more features than the Package Deployment Model.

• In this model, the project is the unit of deployment. This means we can
deploy the whole project.

• Projects containing packages and parameters are deployed to the SSISDB catalog on a
SQL Server instance.

• During execution, events that are produced by the package are captured
automatically and saved in the catalog.

• One disadvantage of this deployment model is that you cannot deploy
one or more packages without deploying the whole project. However, SSIS 2016
introduced an Incremental Package Deployment feature that allows you to deploy
one or more packages without deploying the whole project.

• Deployment versions and deployment history are easily maintained in this model.

• When you build a project under this deployment model, it creates a deployment file
with the .ispac extension as shown below.

The project deployment file shown above is a self contained unit of deployment that includes
only the essential information about the packages and the parameters of the project.
 
Project Deployment Model doesn’t require any package configuration file as we did in
Package Deployment. Here you can build and deploy your package under your catalog.
Let’s deploy our package which we have created under Project deployment model.
 
After successful build, you’ll get ‘Integration Services Project Deployment File (.ispac)’
file under a bin/Deployment folder. Double click on that file and it’ll open Deployment Wizard
as shown below.
 
 
 
Click next and select your Project Deployment File as Source.
 
 
 
 
Click Next to choose Destination where you want to deploy your SSIS packages.
 
I’ve already created my Integration Service Catalog and I’ll deploy my package over there.
 
 
 
 
Click Next and you’ll get the summary of the deployment i.e. Source, Destination, etc.
 
 
 
Click on Deploy to finalize the setup. On successful deployment you’ll get the following
window.
 
 
 
 
We’ve successfully deployed our package in this Model which seems very simple as
compared to Package Deployment Model.
 
Now, if you open Integration Services Catalog under SQL Server, you’ll find your deployed
package as shown below.
 
  
 
So with this, we’re done with Deployment models available in SSIS.

What is the difference between Package Deployment and Project Deployment?

Difference between Package and Project Deployment Model


Below is the difference between the package and project deployment models.

Project deployment model | Package deployment model

Project is deployed. | Package is deployed.

Project uses parameters. | Package uses configuration files/tables.

Project is located in an .ispac file and packages have .dtsx extensions. | Packages have .dtsx extensions.

Project is deployed to the Integration Services catalog. | Packages are deployed to the MSDB or file system.

CLR integration is required. | CLR integration is not required.

New environments in the SSIS catalog can be used with parameters. | System environment variables can be used with configurations.

Projects and packages can be validated before execution with T-SQL or managed code. | Packages are validated just before execution and can be validated with dtexec or managed code.

Packages are executed with T-SQL. Parameters and environments can be set with T-SQL. | Packages are executed with dtexec and dtexecui. Command parameters can be passed to the command prompt.

Robust logging is built in with several reports. | Logging is built in with no reports.
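
To illustrate the "executed with T-SQL" row for the project deployment model, here is a minimal sketch that uses the SSISDB catalog stored procedures. The folder name ETL, the project name DataTransferProject and the package parameter ServerName are assumptions made for the example; substitute the names used in your own catalog.

```sql
DECLARE @execution_id BIGINT;

-- Create an execution for a package deployed to the SSISDB catalog.
-- Folder, project and package names are placeholders.
EXEC SSISDB.catalog.create_execution
     @folder_name     = N'ETL',
     @project_name    = N'DataTransferProject',
     @package_name    = N'DataTransfer.dtsx',
     @use32bitruntime = 0,
     @reference_id    = NULL,   -- no environment reference in this sketch
     @execution_id    = @execution_id OUTPUT;

-- object_type 30 targets a package parameter (20 would be a project parameter).
EXEC SSISDB.catalog.set_execution_parameter_value
     @execution_id    = @execution_id,
     @object_type     = 30,
     @parameter_name  = N'ServerName',      -- assumed parameter name
     @parameter_value = N'ARSHAD-LAPPY';

-- Starts the package; by default the call returns without waiting for it to finish.
EXEC SSISDB.catalog.start_execution @execution_id = @execution_id;

-- The status of the run can be checked afterwards in the catalog.
SELECT execution_id, status
FROM SSISDB.catalog.executions
WHERE execution_id = @execution_id;
```

Unlike the xp_cmdshell approach, this needs no operating system access, and the run is logged in the catalog automatically.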

Which version of SSIS can track versions of a SSIS Package deployed to the Server?
Deployed versions can be tracked from SSIS 2012 onwards with the Project Deployment
Model, which keeps the deployment history in the SSISDB catalog as described above.
To check the package format version, open the .dtsx file itself. If
PackageFormatVersion is 3 the package is SSIS 2008, if it is 6 the package is SSIS 2012,
and if it is 8 the package is SSIS 2014.
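
For projects deployed to the SSISDB catalog, the stored versions can be listed with a query along these lines. This is a sketch that assumes the catalog.object_versions view and the column names shown; check them against your SSISDB instance before relying on it.

```sql
-- List the versions the catalog has kept for each deployed project, newest first.
SELECT object_name,          -- project name
       object_version_lsn,   -- identifier of the stored version
       created_by,
       created_time
FROM SSISDB.catalog.object_versions
ORDER BY object_name, created_time DESC;
```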

What are the different ways to run your SSIS package on a schedule?

The options are the same as those listed under "What are different ways to execute your
SSIS Package?": BIDS/SSDT, DtExecUI, dtexec.exe, a SQL Server Agent job, Windows
Scheduler or a third-party scheduler, and running the package programmatically
(http://msdn.microsoft.com/en-us/library/ms136090.aspx). Two of them run a package on a
schedule:

1) SQL Server Agent Job

SQL Server Agent can be used to create a job that runs the SSIS package on demand
or on a schedule. The SQL Server Agent job can be a single step calling one SSIS package,
or it can consist of multiple steps calling more than one SSIS package. In most
companies the packages are scheduled by using SQL Server Agent. SQL Server Agent
can access packages that are stored in SQL Server or in the file system.

2) Windows Scheduler or any third-party scheduler

One way is to create a batch file in which you use dtexec.exe with the required parameters.
The batch file can be executed by Windows Scheduler or any third-party scheduler.
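
A scheduled SQL Server Agent job for a package can also be created with T-SQL rather than through SSMS. The sketch below is illustrative only: the job name, schedule name and package path are assumptions, and the @command string is the dtexec-style argument list passed to the SSIS subsystem.

```sql
USE msdb;
GO

-- Job with a single step that runs the package through the SSIS subsystem.
EXEC dbo.sp_add_job
     @job_name = N'Nightly DataTransfer';            -- assumed job name

EXEC dbo.sp_add_jobstep
     @job_name  = N'Nightly DataTransfer',
     @step_name = N'Run package',
     @subsystem = N'SSIS',
     @command   = N'/FILE "E:\DataTransfer.dtsx"';   -- assumed package path

-- Daily schedule at 02:00:00 (times are given as HHMMSS integers).
EXEC dbo.sp_add_schedule
     @schedule_name     = N'Daily 2 AM',
     @freq_type         = 4,       -- daily
     @freq_interval     = 1,       -- every day
     @active_start_time = 20000;   -- 02:00:00

EXEC dbo.sp_attach_schedule
     @job_name      = N'Nightly DataTransfer',
     @schedule_name = N'Daily 2 AM';

-- Target the local server so that the Agent actually runs the job.
EXEC dbo.sp_add_jobserver
     @job_name = N'Nightly DataTransfer';
GO
```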
Let’s say you have configured an Event Handler to send an email to report an error for a Data
Flow Task inside a For Each Loop. If an error occurs in the data flow task, you will get multiple
emails. Why is that? How can we prevent that series of emails for one error?
Scenario:
I have created an SSIS package and configured an Event Handler at the package level.
If an error occurs in any task, I want to send an email with the error code, error description,
etc. It is working great, but each time the SSIS package fails, it sends more than one email.
How can we get only a single email when an error occurs in the SSIS package?

Solution:
If the Event Handler is configured on the package-level OnError event, then when any task
fails a sequence of calls is sent to the Event Handler, because the error event is raised
by the failing task and again by each of its parent containers. Let's say we have a Data
Flow Task inside a Sequence Container. If the Data Flow Task fails, the first call is sent
to the Event Handler by the Data Flow Task and the second by the Sequence Container. That
is why the tasks inside the Event Handler (in our case, Send Email) run multiple times.

To handle this, we create a variable "ErrorCnt" of Integer type with a value of 0. After
sending the email in the Event Handler, we increase the value of the variable from 0 to 1.
We use this variable in the precedence constraint so that the Send Email task, or any
other task, runs only if the value is 0. Because the value is increased from 0 to 1 after
the first call, the tasks will not run the next time.

Step 1:
Create an SSIS package. I have created an SSIS package with two Data Flow Tasks inside
a Sequence Container as shown below.

Fig 1: Data Flow Tasks inside Sequence Container in SSIS Package


Step 2:
Create the variables as shown below. I have created several variables that are
used in the Event Handler to send the email by using an Execute SQL Task.
Fig 2: Create Variables in SSIS Package
The variable we use to prevent multiple emails (or repeated execution of tasks) is
ErrorCnt; set its value to 0 as shown in Fig 2.

Step 3:
Now go to the Event Handler pane and configure it. Add a Script Task to the Event
Handler pane. The Script Task is a dummy: its only purpose is to carry the precedence
constraint so that the tasks below it run only when ErrorCnt=0. I have used
an Execute SQL Task to send the email. Connect the Script Task to this Execute SQL Task.
The last task is another Execute SQL Task, in which we change the value of the
ErrorCnt variable from 0 to 1.

Fig 3: Event Handler in SSIS Package

Fig 3 shows how the Event Handler should look. Now let's see
how it is configured. The first item to configure is the
precedence constraint between the Script Task and the Execute SQL Task (the one used to
send the email). The condition is ErrorCnt=0, which means the task runs only when the
value of ErrorCnt is 0. Here is how it is configured:
Fig 4: Configuring Precedence Constraint to use ErrorCnt Variable

Next, change the value of the ErrorCnt variable from 0 to 1 so that multiple calls can be handled.
Fig 5: Set the value of variable ErrorCnt to 1

Fig 6: Save 1 to ErrorCnt variable

You are all done. This sounds like a lengthy package because I had to create some Data Flow
Tasks, but it is short and simple. You create a variable with the value 0 and use it
in the precedence constraint so that the other tasks run only when the value is 0. Once
the tasks in the Event Handler have run, you set the variable to 1, so on the next
call the expression evaluates to false and no task runs.
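
The statement inside the Execute SQL Task that sends the email is not shown here. One possible sketch, assuming Database Mail is configured, uses msdb.dbo.sp_send_dbmail. The profile name, the recipient address and the local variables (which stand in for values mapped from the System::ErrorCode, System::ErrorDescription and System::SourceName variables) are all assumptions made for illustration.

```sql
-- Stand-ins for values the Execute SQL Task would receive from SSIS system variables.
DECLARE @ErrorCode        INT            = -1;                          -- placeholder
DECLARE @ErrorDescription NVARCHAR(2000) = N'Sample error description'; -- placeholder
DECLARE @SourceName       NVARCHAR(200)  = N'Data Flow Task';           -- placeholder

DECLARE @Body NVARCHAR(MAX) =
      N'Task: ' + @SourceName + NCHAR(13) + NCHAR(10)
    + N'Error code: ' + CAST(@ErrorCode AS NVARCHAR(20)) + NCHAR(13) + NCHAR(10)
    + N'Description: ' + @ErrorDescription;

EXEC msdb.dbo.sp_send_dbmail
     @profile_name = N'ETL Alerts',             -- assumed Database Mail profile
     @recipients   = N'etl-team@example.com',   -- assumed recipient
     @subject      = N'SSIS package error',
     @body         = @Body;
```

With the ErrorCnt=0 precedence constraint in place, this statement would run only for the first OnError call raised by a failure.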
