
Data Transformation Manager Guide

The Data Transformation Manager (DTM) process runs sessions in a workflow. It reads session information, expands variables, creates log files, validates data, runs pre- and post-session commands, extracts, transforms, and loads data using multiple threads, and allocates memory for sessions. Log files are written to default locations, and their level of detail depends on the tracing level you set. Session log events are categorized by severity, and message codes help determine the cause of session problems.

Uploaded by

Vikas Sinha
Copyright
© Attribution Non-Commercial (BY-NC)

What Is the Data Transformation Manager Process?

When the workflow reaches a session, the Load Manager starts the DTM process. The DTM process is the process associated with the session task. The Load Manager creates one DTM process for each session in the workflow. The DTM process performs the following tasks:

- Reads session information from the repository.
- Expands the server and session variables and parameters.
- Creates the session log file.
- Validates source and target code pages.
- Verifies connection object permissions.
- Runs pre-session shell commands, stored procedures and SQL.
- Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
- Runs post-session stored procedures, SQL, and shell commands.
- Sends post-session email.

The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. The default memory allocation is 12,000,000 bytes. The DTM uses multiple threads to process data. The main DTM thread is called the master thread.
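As a rough illustration of how buffer memory is divided, the sketch below computes how many whole buffer blocks fit in the default allocation. The block size of 64,000 bytes is an assumption (this guide does not state it), so substitute the value configured for your session.

```python
# Sketch: how many whole buffer blocks fit into the DTM buffer memory.
DEFAULT_DTM_BUFFER_BYTES = 12_000_000  # default allocation stated in this guide
ASSUMED_BLOCK_BYTES = 64_000           # assumption: not given in this guide


def buffer_block_count(buffer_bytes: int = DEFAULT_DTM_BUFFER_BYTES,
                       block_bytes: int = ASSUMED_BLOCK_BYTES) -> int:
    """Return the number of whole blocks that fit in the buffer memory."""
    if buffer_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    return buffer_bytes // block_bytes


print(buffer_block_count())  # 12,000,000 // 64,000 = 187
```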

Writing to Log Files


The following table shows the default location for each type of log file and the associated service process variables:

Tracing Levels
The amount of detail that logs contain depends on the tracing level that you set.

Message Severity
The Log Events window categorizes workflow and session log events into severity levels. It prioritizes error severity based on the embedded message type.

Session Log Codes


You can use the session log to determine the cause of session problems. You can find session-related server messages in the UNIX server log (default name: pmserver.log) or in the Windows Event Log (viewed with the Event Viewer).

Message code | Description
BLKR | Messages related to reader process, including Application, relational, or flat file.
CNX | Messages related to the Repository Agent connections.
CMN | Messages related to databases, memory allocation, Lookup and Joiner transformations, and internal errors.
DBG | Messages related to PowerCenter Server loading and debugging.
DBGR | Messages related to the Debugger.
EP | Messages related to external procedures.
ES | Messages related to the Repository Server.
FR | Messages related to file sources.
FTP | Messages related to File Transfer Protocol operations.
HIER | Messages related to reading XML sources.
LM | Messages related to the Load Manager.
NTSERV | Messages related to Windows server operations.
OBJM | Messages related to the Repository Agent.
ODL | Messages related to database functions.
PETL | Messages related to pipeline partitioning.
PMF | Messages related to caching Aggregator, Rank, Joiner, or Lookup transformations.
RAPP | Messages related to the Repository Agent.
REP | Messages related to repository functions.
RR | Messages related to relational sources.
SF | Messages related to server framework, used by Load Manager and Repository Server.
SORT | Messages related to the Sorter transformation.
TE | Messages related to transformations.
TM | Messages related to Data Transformation Manager (DTM).
TT | Messages related to transformations.
VAR | Messages related to mapping variables.
WRT | Messages related to the Writer.
XMLR | Messages related to the XML Reader.
XMLW | Messages related to the XML Writer.
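When scanning a session log, it can help to translate a message code prefix into its category automatically. The sketch below is one possible way to do that. The categories are copied from the listing above, while the "PREFIX_NUMBER" code shape (for example "TM_6014") and the function name describe_code are assumptions made for illustration.

```python
# Sketch: map the prefix of a session log message code to its category.
# Assumption: codes look like "PREFIX_NUMBER", for example "TM_6014".
MESSAGE_CATEGORIES = {
    "BLKR": "reader process, including Application, relational, or flat file",
    "CNX": "Repository Agent connections",
    "CMN": "databases, memory allocation, Lookup and Joiner transformations, "
           "and internal errors",
    "DBG": "PowerCenter Server loading and debugging",
    "DBGR": "the Debugger",
    "EP": "external procedures",
    "ES": "the Repository Server",
    "FR": "file sources",
    "FTP": "File Transfer Protocol operations",
    "HIER": "reading XML sources",
    "LM": "the Load Manager",
    "NTSERV": "Windows server operations",
    "OBJM": "the Repository Agent",
    "ODL": "database functions",
    "PETL": "pipeline partitioning",
    "PMF": "caching Aggregator, Rank, Joiner, or Lookup transformations",
    "RAPP": "the Repository Agent",
    "REP": "repository functions",
    "RR": "relational sources",
    "SF": "server framework, used by Load Manager and Repository Server",
    "SORT": "the Sorter transformation",
    "TE": "transformations",
    "TM": "Data Transformation Manager (DTM)",
    "TT": "transformations",
    "VAR": "mapping variables",
    "WRT": "the Writer",
    "XMLR": "the XML Reader",
    "XMLW": "the XML Writer",
}


def describe_code(code: str) -> str:
    """Return the category for a message code, keyed by the text before '_'."""
    prefix = code.strip().split("_", 1)[0].upper()
    return MESSAGE_CATEGORIES.get(prefix, "unknown message prefix")


print(describe_code("TM_6014"))
```

A dictionary lookup keeps the mapping easy to extend if your server version emits prefixes that are not listed here.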

Thread
DIRECTOR MANAGER MAPPING READER_1_1_1 WRITER_1_*_1

List of Error tables created by Informatica


PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source row.
PMERR_MSG. Stores metadata about an error and the error message.
PMERR_SESS. Stores metadata about the session.
PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

-- Replace each <...> placeholder with a quoted value before running the query.
select sess.FOLDER_NAME     as 'Folder Name',
       sess.WORKFLOW_NAME   as 'WorkFlow Name',
       sess.TASK_INST_PATH  as 'Session Name',
       data.SOURCE_ROW_DATA as 'Source Data',
       msg.ERROR_MSG        as 'Error MSG'
from ETL_PMERR_SESS sess
left outer join ETL_PMERR_DATA data
       on data.WORKFLOW_RUN_ID = sess.WORKFLOW_RUN_ID
      and data.SESS_INST_ID = sess.SESS_INST_ID
left outer join ETL_PMERR_MSG msg
       on msg.WORKFLOW_RUN_ID = sess.WORKFLOW_RUN_ID
      and msg.SESS_INST_ID = sess.SESS_INST_ID
where sess.FOLDER_NAME = <Project Folder Name>
  and sess.WORKFLOW_NAME = <Workflow Name>
  and sess.TASK_INST_PATH = <Session Name>
  and sess.SESS_START_TIME = <Session Run Time>
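To try the query without access to a repository database, the following sketch runs the same joins against an in-memory SQLite database, with bound parameters in place of the placeholders. It is only an approximation: the miniature tables contain just the columns the query touches, every value is made up, and the helper names open_demo_db and fetch_errors are hypothetical.

```python
import sqlite3

# Hypothetical miniature copies of the error tables; only the columns used by
# the query are created, and all sample values are invented for this demo.
SCHEMA = """
CREATE TABLE ETL_PMERR_SESS (
    WORKFLOW_RUN_ID INTEGER, SESS_INST_ID INTEGER, FOLDER_NAME TEXT,
    WORKFLOW_NAME TEXT, TASK_INST_PATH TEXT, SESS_START_TIME TEXT);
CREATE TABLE ETL_PMERR_DATA (
    WORKFLOW_RUN_ID INTEGER, SESS_INST_ID INTEGER, SOURCE_ROW_DATA TEXT);
CREATE TABLE ETL_PMERR_MSG (
    WORKFLOW_RUN_ID INTEGER, SESS_INST_ID INTEGER, ERROR_MSG TEXT);
INSERT INTO ETL_PMERR_SESS VALUES
    (101, 7, 'Finance', 'wf_daily_cc', 's_load_cc', '2024-01-15 02:00:00');
INSERT INTO ETL_PMERR_DATA VALUES (101, 7, '4111|0.00|2024-01-14');
INSERT INTO ETL_PMERR_MSG VALUES (101, 7, '0 (Zero) Transaction Amount');
"""

# Same joins and filters as the query above; "?" markers are bound at run time.
QUERY = """
SELECT s.FOLDER_NAME, s.WORKFLOW_NAME, s.TASK_INST_PATH,
       d.SOURCE_ROW_DATA, m.ERROR_MSG
FROM ETL_PMERR_SESS s
LEFT OUTER JOIN ETL_PMERR_DATA d
  ON d.WORKFLOW_RUN_ID = s.WORKFLOW_RUN_ID AND d.SESS_INST_ID = s.SESS_INST_ID
LEFT OUTER JOIN ETL_PMERR_MSG m
  ON m.WORKFLOW_RUN_ID = s.WORKFLOW_RUN_ID AND m.SESS_INST_ID = s.SESS_INST_ID
WHERE s.FOLDER_NAME = ? AND s.WORKFLOW_NAME = ?
  AND s.TASK_INST_PATH = ? AND s.SESS_START_TIME = ?
"""


def open_demo_db() -> sqlite3.Connection:
    """Create the in-memory demo database with one failed row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def fetch_errors(conn, folder, workflow, session, start_time):
    """Return the error rows for one session run."""
    return conn.execute(QUERY, (folder, workflow, session, start_time)).fetchall()


conn = open_demo_db()
for row in fetch_errors(conn, "Finance", "wf_daily_cc", "s_load_cc",
                        "2024-01-15 02:00:00"):
    print(row)
```

Binding the filter values as parameters avoids quoting mistakes when folder or workflow names contain special characters.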

Informatica Functions Used


We are going to use two functions provided by Informatica PowerCenter to define our user-defined error capture logic. Before we get into the coding, let's understand the functions we are going to use.

1. ERROR()
2. ABORT()

ERROR(): Causes the PowerCenter Integration Service to skip a row and issue an error message that you define. The error message is displayed in the session log or written to the error log tables, depending on the error logging type configured in the session.

ABORT(): Stops the session and issues a specified error message, which is written to the session log file or to the error log tables, depending on the error logging type configured in the session. When the PowerCenter Integration Service encounters an ABORT function, it stops transforming data at that row. It processes any rows read before the session aborts.

Note: Use the ERROR and ABORT functions for both input and output port default values. You might use these functions for input ports to keep null values from passing into a transformation, and for output ports to handle any kind of transformation error.

Informatica Implementation

For the demonstration, let's consider a workflow that loads daily credit card transactions and applies the two user-defined data validation checks below.

1. Should not load any transaction with 0 (zero) amount, but capture such transactions into error tables.
2. Should not process any transaction without a credit card number, and should stop the workflow.

Mapping Level Changes

To handle both exceptions, let's create an Expression transformation and add two variable ports:

TEST_TRANS_AMOUNT as Variable Port
TEST_CREDIT_CARD_NB as Variable Port

Add the expressions below to the two ports. The first expression takes care of user-defined data validation check 1 and the second takes care of check 2.

TEST_TRANS_AMOUNT :- IIF(TRANS_AMOUNT = 0, ERROR('0 (Zero) Transaction Amount'))
TEST_CREDIT_CARD_NB :- IIF(ISNULL(LTRIM(RTRIM(CREDIT_CARD_NB))), ABORT('Empty Credit Card Number'))
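The difference between skipping a row and stopping the session can be sketched outside PowerCenter. The Python simulation below is not Informatica code: it only mimics the described behavior, where an ERROR-style failure records the row and continues, and an ABORT-style failure stops processing while keeping the rows handled so far. The class and function names are hypothetical, the checks run in the same order as the two variable ports, and treating a blank card number like a missing one is an assumption.

```python
class RowError(Exception):
    """Mimics ERROR(): the current row is skipped and the message is logged."""


class SessionAbort(Exception):
    """Mimics ABORT(): processing stops at the current row."""


def validate(row: dict) -> None:
    # Check 1: zero-amount transactions are rejected but do not stop the run.
    if row["TRANS_AMOUNT"] == 0:
        raise RowError("0 (Zero) Transaction Amount")
    # Check 2: a missing card number stops the run.
    # Assumption: a blank string counts as missing, like None.
    card = row.get("CREDIT_CARD_NB")
    if card is None or card.strip() == "":
        raise SessionAbort("Empty Credit Card Number")


def run_session(rows):
    """Return (loaded rows, skipped rows with messages, abort message or None)."""
    loaded, errors = [], []
    for row in rows:
        try:
            validate(row)
        except RowError as exc:
            errors.append((row, str(exc)))  # skip this row, keep going
            continue
        except SessionAbort as exc:
            return loaded, errors, str(exc)  # stop; earlier rows are kept
        loaded.append(row)
    return loaded, errors, None


sample = [
    {"TRANS_AMOUNT": 10, "CREDIT_CARD_NB": "4111"},
    {"TRANS_AMOUNT": 0, "CREDIT_CARD_NB": "4222"},
    {"TRANS_AMOUNT": 5, "CREDIT_CARD_NB": None},
    {"TRANS_AMOUNT": 7, "CREDIT_CARD_NB": "4333"},
]
loaded, errors, abort_message = run_session(sample)
print(len(loaded), len(errors), abort_message)
```

In this sample the fourth row is never processed, because the third row triggers the abort.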

Save the workflow/session log in ASCII format


Edit your session --> Properties tab --> Click "Write Backward Compatible Session Log File".

List of Values of Row Indicators:

Row Indicator | Indicator Significance | Rejected By
0 | Insert | Writer or target
1 | Update | Writer or target
2 | Delete | Writer or target
3 | Reject | Writer
4 | Rolled-back insert | Writer
5 | Rolled-back update | Writer
6 | Rolled-back delete | Writer
7 | Committed insert | Writer
8 | Committed update | Writer
9 | Committed delete | Writer
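For scripted analysis of reject files, the row indicator values can be held in a small lookup. The sketch below copies the ten values listed above. The function name describe_row_indicator is hypothetical.

```python
# Sketch: row indicator -> (significance, rejected by), as listed above.
ROW_INDICATORS = {
    0: ("Insert", "Writer or target"),
    1: ("Update", "Writer or target"),
    2: ("Delete", "Writer or target"),
    3: ("Reject", "Writer"),
    4: ("Rolled-back insert", "Writer"),
    5: ("Rolled-back update", "Writer"),
    6: ("Rolled-back delete", "Writer"),
    7: ("Committed insert", "Writer"),
    8: ("Committed update", "Writer"),
    9: ("Committed delete", "Writer"),
}


def describe_row_indicator(value: int) -> str:
    """Return a readable description, or raise ValueError for unknown values."""
    if value not in ROW_INDICATORS:
        raise ValueError(f"unknown row indicator: {value!r}")
    significance, rejected_by = ROW_INDICATORS[value]
    return f"{significance} (rejected by: {rejected_by})"


print(describe_row_indicator(3))
```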

After the row indicator come the column data values, each followed by a column indicator that describes the data quality of the corresponding column.

List of Values of Column Indicators:


Column Indicator | Type of data | Writer Treats As
D | Valid data. | Good data. Writer passes it to the target database. The target accepts it unless a database error occurs, such as finding a duplicate key while inserting.
O | Overflowed numeric data. Numeric data exceeded the specified precision or scale for the column. | Bad data, if you configured the mapping target to reject overflow or truncated data.
N | Null value. The column contains a null value. | Good data. Writer passes it to the target, which rejects it if the target database does not accept null values.
T | Truncated string data. String data exceeded a specified precision for the column, so the Integration Service truncated it. | Bad data, if you configured the mapping target to reject overflow or truncated data.
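The following sketch shows how a single reject-file row could be split into its row indicator and its value/indicator pairs. It makes several assumptions that should be checked against your own reject files: a simplified comma-separated layout with the row indicator first and then alternating column value and column indicator, no delimiters inside the data, and the single-letter codes D, O, N, and T for valid, overflowed, null, and truncated data. The function name parse_reject_line is hypothetical.

```python
# Assumed single-letter column indicators for the four data types above.
COLUMN_INDICATORS = {
    "D": "valid data",
    "O": "overflowed numeric data",
    "N": "null value",
    "T": "truncated string data",
}


def parse_reject_line(line: str, delimiter: str = ","):
    """Split one reject row into (row_indicator, [(value, code, meaning), ...]).

    Assumed layout: row indicator, then value/indicator pairs.
    """
    fields = line.rstrip("\n").split(delimiter)
    # One leading row indicator plus complete pairs means an odd field count.
    if len(fields) < 3 or len(fields) % 2 == 0:
        raise ValueError("expected a row indicator followed by value/indicator pairs")
    row_indicator = int(fields[0])
    columns = []
    for value, code in zip(fields[1::2], fields[2::2]):
        if code not in COLUMN_INDICATORS:
            raise ValueError(f"unknown column indicator: {code!r}")
        columns.append((value, code, COLUMN_INDICATORS[code]))
    return row_indicator, columns


row_indicator, columns = parse_reject_line("0,1921,D,Nelson,T,,N")
print(row_indicator, columns)
```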

Session Variable
$PMSessionName EndTime ErrorCode ErrorMsg FirstErrorCode FirstErrorMsg PrevTaskStatus SrcFailedRows SrcSuccessRows StartTime Status TgtFailedRows TgtSuccessRows TotalTransErrors

User-Defined Session Parameters


Parameter Type | Naming Convention | Description
Session Log File | $PMSessionLogFile | Defines the name of the session log between session runs.
Number of Partitions | $DynamicPartitionCount | Defines the number of partitions for a session.
Source File* | $InputFileName | Defines a source file name.
Lookup File* | $LookupFileName | Defines a lookup file name.
Target File* | $OutputFileName | Defines a target file name.
Reject File* | $BadFileName | Defines a reject file name.
Database Connection* | $DBConnectionName | Defines a relational database connection for a source, target, lookup, or stored procedure.
External Loader Connection* | $LoaderConnectionName | Defines external loader connections.
FTP Connection* | $FTPConnectionName | Defines FTP connections.
Queue Connection* | $QueueConnectionName | Defines database connections for message queues.
Source or Target Application Connection* | $AppConnectionName | Defines connections to source and target applications.
General Session Parameter* | $ParamName | Defines any other session property. For example, you can use this parameter to define a table owner name, table name prefix, FTP file or directory name, lookup cache file name prefix, or email address. You can use this parameter to define source, lookup, target, and reject file names, but not the session log file name or database connections.

* Define the parameter name using the appropriate prefix, for example, $InputFileMySource.
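Because the parameter type is determined by the name prefix, a name can be checked before it goes into a parameter file. The sketch below classifies a name using the prefixes from the naming conventions above. The function name classify_parameter is hypothetical, and requiring at least one character after the prefix is an assumption.

```python
# Fixed names that are used as-is, without a user-chosen suffix.
FIXED_NAMES = {
    "$PMSessionLogFile": "Session Log File",
    "$DynamicPartitionCount": "Number of Partitions",
}

# Prefixes taken from the naming conventions above (the part before "Name").
PREFIXES = [
    ("$InputFile", "Source File"),
    ("$LookupFile", "Lookup File"),
    ("$OutputFile", "Target File"),
    ("$BadFile", "Reject File"),
    ("$DBConnection", "Database Connection"),
    ("$LoaderConnection", "External Loader Connection"),
    ("$FTPConnection", "FTP Connection"),
    ("$QueueConnection", "Queue Connection"),
    ("$AppConnection", "Source or Target Application Connection"),
    ("$Param", "General Session Parameter"),
]


def classify_parameter(name: str):
    """Return the parameter type for a name, or None if no convention matches."""
    if name in FIXED_NAMES:
        return FIXED_NAMES[name]
    for prefix, parameter_type in PREFIXES:
        # Assumption: a valid name has at least one character after the prefix.
        if name.startswith(prefix) and len(name) > len(prefix):
            return parameter_type
    return None


print(classify_parameter("$InputFileMySource"))  # Source File
```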

Built-in Session Parameters


Parameter Type | Naming Convention | Description
Folder name | $PMFolderName | Returns the folder name.
Integration Service name | $PMIntegrationServiceName | Returns the Integration Service name.
Mapping name | $PMMappingName | Returns the mapping name.
Repository Service name | $PMRepositoryServiceName | Returns the Repository Service name.
Repository user name | $PMRepositoryUserName | Returns the repository user name.
Session name | $PMSessionName | Returns the session name.
Session run mode | $PMSessionRunMode | Returns the session run mode (normal or recovery).
Source number of affected rows* | $PMSourceQualifierName@numAffectedRows | Returns the number of rows the Integration Service successfully read from the named Source Qualifier.
Source number of applied rows* | $PMSourceQualifierName@numAppliedRows | Returns the number of rows the Integration Service successfully read from the named Source Qualifier.
Source number of rejected rows* | $PMSourceQualifierName@numRejectedRows | Returns the number of rows the Integration Service dropped when reading from the named Source Qualifier.
Source table name* | $PMSourceName@TableName | Returns the table name for the named source instance.
Target number of affected rows* | $PMTargetName@numAffectedRows | Returns the number of rows affected by the specified operation for the named target instance.
Target number of applied rows* | $PMTargetName@numAppliedRows | Returns the number of rows the Integration Service successfully applied to the named target instance.
Target number of rejected rows* | $PMTargetName@numRejectedRows | Returns the number of rows the Integration Service rejected when writing to the named target instance.
Target table name* | $PMTargetName@TableName | Returns the table name for the named target instance.
Workflow name | $PMWorkflowName | Returns the workflow name.
Workflow run ID | $PMWorkflowRunId | Returns the workflow run ID.
Workflow run instance name | $PMWorkflowRunInstanceName | Returns the workflow run instance name.

* Define the parameter name using the appropriate prefix and suffix. For example, for a source instance named Customers, the parameter for source table name is $PMCustomers@TableName. If the Source Qualifier is named SQ_Customers, the parameter for source number of affected rows is $PMSQ_Customers@numAffectedRows.
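The prefix-and-suffix rule for instance-specific parameters can be expressed as simple string construction. The helper names below are hypothetical. The two printed names are the examples given in the footnote above.

```python
# Row-count suffixes from the naming conventions: numAffectedRows,
# numAppliedRows, and numRejectedRows.
ROW_COUNT_KINDS = ("Affected", "Applied", "Rejected")


def table_name_param(instance_name: str) -> str:
    """Build the table-name parameter for a source or target instance."""
    return f"$PM{instance_name}@TableName"


def row_count_param(instance_name: str, kind: str) -> str:
    """Build a row-count parameter for a Source Qualifier or target instance."""
    if kind not in ROW_COUNT_KINDS:
        raise ValueError(f"kind must be one of {ROW_COUNT_KINDS}")
    return f"$PM{instance_name}@num{kind}Rows"


print(table_name_param("Customers"))                # $PMCustomers@TableName
print(row_count_param("SQ_Customers", "Affected"))  # $PMSQ_Customers@numAffectedRows
```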
