Talend Data Integration Advanced
Talend Data Integration provides an extensible, highly scalable set of tools for
accessing, transforming, and integrating data from any business system. This course
enables you to use the more advanced features of Talend Data Integration as quickly as
possible. Participants learn to work in teams on projects shared on a remote repository
and to monitor Jobs and database changes.
Duration
1 day (7 hours)
Target audience
Anyone who wants to use Talend Data Integration to perform data integration
and management tasks: software developers and development managers
Prerequisites
Completion of Talend Data Integration Basics and knowledge of computing,
including familiarity with Java or another programming language, SQL, and
general database concepts
Course objectives
After completing this course, you will be able to:
  Start Talend Studio and connect it to a remote repository
  Use SVN branches in Studio
  Run a Job in Studio or on a remote JobServer
  Monitor host CPU and JVM memory in real time during Job execution
  Use debugging features in Studio
  Configure a Talend project to capture statistics and logs, and monitor them from Activity Monitoring Console (AMC)
  Implement several methods of parallel execution in a Talend Job
  Create Joblets
  Create a unit test from a working Job
  Configure a database to monitor and log changes in a separate change data capture (CDC) database
  Use the CDC database to perform incremental updates between the source and target
  Set up a reference project in order to use items from another project
Course agenda
Connecting to a remote repository
  Creating a remote connection
SVN in Studio
  Copying a Job to a branch
  Comparing Jobs
  Resetting a branch
Reference project
  Setting up and using a reference project
Remote Job execution
  Creating and running a Job remotely
Resource usage and basic debugging
  Using Memory Run to view real-time resource usage
  Debugging Jobs using Debug Run
Activity Monitoring Console (AMC)
  Configuring statistics and logging
  Using Activity Monitoring Console (AMC)
Parallel execution
  Writing large files
  Writing to databases
  Automatic parallelization
  Partitioning
Joblets
  Creating a Joblet from an existing Job
  Creating a Joblet from scratch
  Triggering Joblets
Unit test
  Creating a unit test
Change data capture
  Examining databases
  Configuring the CDC database
  Monitoring changes
  Updating a warehouse
  Resetting the database