SQL Server Integration Services (SSIS) is a platform for data integration and workflow applications. The SSIS architecture includes packages, tasks, containers, variables, connections and event handlers. Packages contain control flow elements, like tasks and containers, that prepare data. Data flow elements in packages extract, transform and load data. The control flow engine manages task execution while the data flow engine moves data between sources and destinations.
Architecture of IntegrationServices
Native Managed
Object Model
Integration Services Runtime
Task
Task Container
Task Task
Data Flow Task
Task
Custom Applications
Command Line Utils
SSIS Designer
SSIS Wizards
Custom
Tasks
Tasks
Log
Providers
Data
Sources
Integration
Services
Service
.dtsx File
msdb
Database
Enumerators
Connection
Managers
Event
Handlers
Integration Services Data Flow
Data Flow Task
Object Model
Source
Transformation
Destination
Source
Transformation
Destination
Custom Data Flow
Components
Data Flow Components
Control Flow Engine
Data Flow Engine
Package
Control Flow
The control flow is the workflow engine that contains control flow tasks,
containers, and precedence constraints, which manage when tasks and
containers execute.
Runtime engine
The Integration Services runtime saves the layout of packages, runs packages,
and provides support for logging, breakpoints, configuration, connections,
and transactions.
Integration Services Service
The Integration Services service lets you use SQL Server Management Studio
to monitor running Integration Services packages and to manage the storage
of packages.
Integration Services Package
A package is an organized collection of connections, control flow elements, data flow
elements, event handlers, variables, and configurations, that you assemble using either
the graphical design tools that SQL Server Integration Services provides, or build
programmatically.
Integration Services Task
Tasks are control flow elements that define units of work that are performed in a package
control flow. An SQL Server Integration Services package is made up of one or more tasks.
If the package contains more than one task, they are connected and sequenced in the
control flow by precedence constraints.
Integration Services Containers
Containers are objects in SQL Server Integration Services that provide structure to
packages and services to tasks. They support repeating control flows in packages, and they
group tasks and containers into meaningful units of work. Containers can include other
containers in addition to tasks.
Integration Services Designer
SSIS Designer is a graphical tool that you can use to create and maintain Integration
Services packages. SSIS Designer is available in Business Intelligence Development Studio
as part of an Integration Services project. SSIS includes additional tools, wizards, and
command prompt utilities for running and managing Integration Services packages.
API or object model
The SSIS object model includes managed application programming interfaces (API) for
creating custom components for use in packages, or custom applications that create, load,
run, and manage packages. Developer can write custom applications or custom tasks or
transformations by using any common language runtime (CLR) compliant language.
Data Flow Engine
The Data Flow task encapsulates the DF engine. DF engine provides the in-memory buffers
that move data from source to destination, and calls the sources that extract data from
files and databases. The DF engine also manages the transformations that modify data,
and the destinations that load data or make data available to other processes.
Data Flow Components
Integration Services data flow components are the sources, transformations, and
destinations that Integration Services includes. You can also include custom components in
a data flow.
Architecture of IntegrationServices Package
Package
Task
Task Container
Task Task
Data Flow Task
Task
Variables
Log
Providers
Data
Sources
Precedence
Constrains
Connection
Managers
Event
Handlers
Data Flow
Source
Transformation
Destination
Source
Transformation
Destination
Control Flow Components
Data Flow Components
Control Flow Elements
The control flow elements—tasks and containers—for building the control flow in a
package. Control flow elements prepare or copy data, interact with other processes, or
implement repeating workflow.
Variables
The variables that can be used in expressions to dynamically update column values and
property expressions, control execution of repeating control flows, and define the
conditions that precedence constraints apply.
Precedence Constrains
Precedence constraints sequence the control flow elements into an ordered control flow
and specify the conditions for executing tasks or containers.
Log Providers
The log providers that support logging of package run-time information such as the start
time and the stop time of the package and its tasks and containers.
Connection Managers
The connection managers that connect to different types of data sources to extract and
load data.
Data Sources
The Data Sources that connect to different types of data sources to extract and load data.
Event Handlers
The event handlers that run in response to the run-time events that packages, tasks, and
containers raise.
Data Flow Components
The data flow components—sources, transformations, and destinations—for building data
flows in a package that extract, transform, and load data. Paths sequence the data flow
components into an ordered data flow.
ETL Framework andLogical Architecture
Extract Data
Load Staging
Extract from
Staging
Transform
Data
Load
Dimensions
Load Facts
Check System
State
Process Cube
Log ETL
Process
Send
Notification
OLTP
ETL Schema
STAGING Schema
DWH Schema
9.
Resources
Architecture of IntegrationServices -http://msdn.microsoft.com/en-us/library/bb522498.aspx
Architecture of an Integration Services Package - http://msdn.microsoft.com/en-
us/library/cc645924.aspx
Editor's Notes
#6 These workflow elements might involve:running a stored proceduremoving a file from an FTP server to a destination foldersend an e-mail message when an error occursPackages also contain connections to data sources and data destinations. You set up these connections to connect to different external systems such as databases, files, File Transfer Protocol (FTP) servers, Simple Mail Transfer Protocol (SMTP) servers, and so on. Connections are used for the SSIS data processing engine (called the data flow) as well as the workflow engine (called the control flow).
#7 The objects that compose an Integration Services package include the following:The package itself The package, the unit of work that is retrieved, executed, and saved, and the most important Integration Services object.For more information, see Integration Services Packages.Control flow elements The control flow elements—tasks and containers—for building the control flow in a package. Control flow elements prepare or copy data, interact with other processes, or implement repeating workflow. Precedence constraints sequence the control flow elements into an ordered control flow and specify the conditions for executing tasks or containers. For more information, see Control Flow Elements.Data flow components The data flow components—sources, transformations, and destinations—for building data flows in a package that extract, transform, and load data. Paths sequence the data flow components into an ordered data flow. For more information, see Data Flow Elements.Connection managers The connection managers that connect to different types of data sources to extract and load data. For more information, see Integration Services Connections.Variables The variables that can be used in expressions to dynamically update column values and property expressions, control execution of repeating control flows, and define the conditions that precedence constraints apply. For more information, see Integration Services Variables.Event handlers The event handlers that run in response to the run-time events that packages, tasks, and containers raise. For more information, see Integration Services Event Handlers.Log providers The log providers that support logging of package run-time information such as the start time and the stop time of the package and its tasks and containers. For more information, see Integration Services Log Providers.