AutoSys User Guide
4.0
This documentation and related computer software program (hereinafter referred to as the Documentation) is for the end user's informational purposes only and is subject to change or withdrawal by Computer Associates International, Inc. (CA) at any time.

This documentation may not be copied, transferred, reproduced, disclosed or duplicated, in whole or in part, without the prior written consent of CA. This documentation is proprietary information of CA and protected by the copyright laws of the United States and international treaties.

Notwithstanding the foregoing, licensed users may print a reasonable number of copies of this documentation for their own internal use, provided that all CA copyright notices and legends are affixed to each reproduced copy. Only authorized employees, consultants, or agents of the user who are bound by the confidentiality provisions of the license for the software are permitted to have access to such copies.

This right to print copies is limited to the period during which the license for the product remains in full force and effect. Should the license terminate for any reason, it shall be the user's responsibility to return to CA the reproduced copies or to certify to CA that same have been destroyed.

To the extent permitted by applicable law, CA provides this documentation "as is" without warranty of any kind, including without limitation, any implied warranties of merchantability, fitness for a particular purpose or noninfringement. In no event will CA be liable to the end user or any third party for any loss or damage, direct or indirect, from the use of this documentation, including without limitation, lost profits, business interruption, goodwill, or lost data, even if CA is expressly advised of such loss or damage.

The use of any product referenced in this documentation and this documentation is governed by the end user's applicable license agreement.

The manufacturer of this documentation is Computer Associates International, Inc.
Provided with Restricted Rights as set forth in 48 C.F.R. Section 12.212, 48 C.F.R. Sections 52.227-19(c)(1) and (2) or DFARS Section 252.227-7013(c)(1)(ii) or applicable successor provisions.
Contents
Event Processor
High-Availability Option: Shadow Event Processor
Remote Agent
Explanation
Interface Components
AutoSys Machines
AutoSys Instance Alarms
User Authentication
Event Processor Authentication
AutoSys User and Database Administrator Passwords
AutoSys Job-Level Security
AutoSys Job Ownership
AutoSys User Types
AutoSys Permission Types
Granting Permissions
AutoSys Superuser Privileges
Edit Superuser
Exec Superuser
Restricting Access to AutoSys Jobs
Remote Agent Security
Job States and Status
Example State Diagram: Simple Jobs
Example State Diagram: Box Jobs
Starting Parameters
Starting Parameters and Boxes
Date/Time Dependencies
TZ Environment Variable
Custom Calendars
Job Dependencies Related to Job Status
Cross-Instance Job Dependencies
Event Processors
Event Servers
Example Job Dependencies
Managing Job Status
Job Dependencies Based on Exit Codes
Using Exit Codes and Batch Files with Jobs Running On Windows
Job Dependencies Based On Global Variables
Job Run Numbers and Names
Defining Jobs in AutoSys
AutoSys Graphical User Interface Components
Essential Job Attributes
Attributes Common to All Job Types
Job Name
Job Type
Job Owner
Command Jobs Attributes
File Watcher Job Attributes
Machine to Run On
File to Watch For
Optional Job Attributes
Common Job Starting Attributes
Start Date/Time Dependence
Days of the Week
Days to Run through a Custom Calendar
Specific Times of Day to Run
Time of Day Not to Run
Specific Times Every Hour to Run
Job Dependencies (Starting Conditions)
Common General Attributes
Description
Box Name
Minimum Runtime Alarm
Maximum Runtime Alarm
Terminate Due to Runtime
Send Alarm if the Job Fails
Terminate the Box if the Job Fails
Terminate the Job if the Box Fails
Number of Times to Restart a Job
Time Zone for Job
Delete Job After Completion
Autohold
Permissions
Command Job Attributes
Profile
Redirection of the Standard Input File
Redirection of the Standard Error File
Queue Priority
Job Overrides
Maximum Exit Code for Success
Average Runtimes
Heartbeat-Interval
Resource Check: File Space
File Watcher Job Attributes
Watch File Minimum Size
Watch Interval
Resource Check: File Space
Box Job Attributes
Box Successful Completion
Box Failure
Date and Time Attributes and Time Changes
The Time Change
AutoSys Behavior During Time Change
Spring Time Change
Fall Time Change
When You Should Not Use a Box
What Happens When a Box Runs
Simple Box Job
Box Job Attributes and Terminators
Attributes in a Box Job Definition
Example of a Non-Default Success Condition
Attributes in a Job Definition
Time Conditions in a Box
Force Starting Jobs in a Box
Examples
How Job Status Changes Affect Box Status
Advanced Conditions in Box Jobs
Using the Box Terminator Attribute
Using the Job Terminator Attribute
Advanced Job Streams
Scenario On the First of the Month
Scenario On the Second of Month
Scenario II on First of the Month
Date/Time Options Dialog
Job Definition Advanced Features Dialog
Creating a Simple Command Job
Creating a File Watcher Job
Creating a Dependent Command Job
Creating a Box Job
Changing a Job
Setting Time Dependencies
Additional Time Setting Features
Deleting a Job
Deleting a Box Job
Specifying One-Time Job Overrides
Setting Job Overrides
Customizing the Job Definition GUI
Database Connection Time-out Interval
Job Definition Title Bar Text and Icon Text
Rule 1
Rule 2
Rule 3
Rule 4
Rule 5
Rule 6
Rule 7
JIL Sub-commands
Submitting Job Definitions
Running JIL
Creating a Simple Command Job
Creating a File Watcher Job
Creating a Dependent Command Job
Creating a Box
Adding Machines
Changing a Job
Setting Time Dependencies
Additional Time Setting Features
Deleting a Job
Setting Job Overrides
Example JIL Script
Edit Menu
Tools Menu
Options Menu
Calendar Display
Date States
Navigation Controls
Shift Months Area
Skip Button Area
Dates Prior to Today's Date
Calendar Selection Dialog
Term Calendar Rule Dialog
Rule Specification
Action Area
Date Range Area
Date Selection Rule Area
Example
Control
Term Calendar Viewer
Combining Calendars
Printing Calendars
Import/Export File Name Dialog
Importing Calendar Text Files
Exporting Calendars
Customizing the Calendar Facility
Font Selection Resources
Object Color
Print Command
Calendar Title Bar Text and Icon Text
Defining Machines to AutoSys
Specifying Machine Load (max_load)
Job Attributes and Load Balancing and Queuing
Using max_load and factor
Defining a Real Machine
Deleting Real Machines
Defining a Virtual Machine
Deleting Virtual Machines
Force Starting Jobs
Queuing Jobs
Queuing and Simple Load Limiting
Queuing with Priority
Subsets: Individual Queues
Load Units and Virtual Machines
Multiple Machine Queues
Operator Console Screens
Starting the Operator Console
Job Activity Console
Menu Bar
Job List
Currently Selected Job
Starting Conditions
Reports
Control Area
Action Buttons
Send Event Dialog
Canceling a Sent Event
Control Buttons
Job Path (History) Dialog
Alarm Button
Exit Button
Resizing Regions of the Job Activity Console
Job Selection Dialog
Specifying Jobs by Status
Specifying Jobs by Machine
Selecting Machines
Sorting the Specified Jobs
Alarm Manager Menu Bar
Alarm List
Currently Selected Alarm
New Alarm Button
Registering Responses and Changing Alarm States
Alarm Selection Dialog
Select by Type
Select by State
Select by Time
Customizing the Operator Console
Refresh Time Interval
Changing Fonts
Freeze Frame at Start Up
Font Selection Resources
Label Font Resources
List Font Resources
Object Color
Currently Selected Job Name Field
Background Color of Variable Fields
Border Colors
Primary Interface Color
Toggle Button Color
Job List Column Widths
Atomic Condition Fields
Operator Console Size
Default Report Type
Alarm List Column Width
Operator Console Title Bar Text and Icon Text
User-Configurable Action Buttons
Accessing InfoReports from the Operator Console
Configuring InfoReports Viewer for Printing
Chapter Organization
Essential Monitor/Report Attributes
Common Essential Attributes: General
Monitor/Report Name
Mode
Common Essential Attributes: Events
Alarms
All Job Status Events
Individual Job Status Events
Job Filter
Essential Report Attributes
Current Run Only
Events After a Certain Date/Time
Optional Monitor/Report Attributes
Optional Monitor Attributes
Sound
Verification Required for Alarms
Optional Report Attributes
Defining Monitors and Reports Using the GUI
The Monitor/Browser Dialog
Defining an Example Monitor and Report
Defining a Monitor
Defining a Report
Running a Monitor
Customizing the Monitor/Browser
Database Connection Time-Out Interval
Monitor/Browser Title Bar Text and Icon Text
Starting in Global Auto Hold Mode
Monitoring the Event Processor
Stopping the Event Processor
$AUTOTESTMODE = 1
$AUTOTESTMODE = 2
AutoSys Maintenance Commands
chase
AutoSys Database Overview
Event Server Overview
Using Dual Event Server Mode
Database Storage Requirements
Database Architecture
General Database Maintenance
Daily Database Maintenance
DBMaint Script
Modifying the DBMaint Script
Event Server Rollover Recovery
Event Server Crash
Synchronizing the Event Servers
Improving Database Performance
Improving Sybase Database Performance
Improving Oracle Database Performance
Maintaining Bundled Sybase SQL Servers
Sybase Architecture
Sybase Environment
Default Sybase Users
Starting Sybase
Stopping Sybase
Accessing Sybase
Identifying Processes Connected to the Database
Displaying the Database Date and Time
Defining a Dump Device
Sybase Backup Server
Dumping the Database
Loading the Database
Recovering a Bundled Sybase Database
EDNumErrors, EDErrTimeInt
Machines to Check for Running Event Processors
EDMachines
Third Machine for Event Processor Contention
ThirdMachine
Event Processor Log Disk Space
Internal Database Maintenance
DBMaintTime, DBMaintCmd
Event Transfer
EvtTransferWaitTime
Sendevent Retries
SendeventMaxRetries
SendeventRetryInterval
Heartbeats
Check_Heartbeat
Shadow Event Processor Pings
ShadowPingDelay
Remote Agent Log Files Directory
AutoRemoteDir
File Maintenance
CleanTmpFiles
RemoteProFiles
MaxRestartTrys
Calculating the Wait Time Between Restarts
RestartConstant, RestartFactor, MaxRestartWait
MachineMethod
KILLJOB Signals
KillSignals
Port Number for Remote Agent
AutoSysAgentSupport
AutoSysAgentDebug
Instance Wide Append Parameter
AutoInstWideAppend
Job Starting Interval
InetdSleepTime
Unicenter Event Management Integration
UnicenterEvents
The auto.profile File
Sample Default auto.profile
Remote Agent Database Connection Settings
Sybase
Oracle
Remote Agent Socket Connection
Running Two AutoSys Versions of Remote Agents
Configuring Remote Authentication
The /etc/.autostuff File
Client-Side Security
Library Path
svload Requirements
ServerVision Configurations
Monitoring Job Resource Usage
User-Defined Alarm Callbacks
Notification Example
Event Server Will Not Start (Bundled Sybase)
Event Processor Troubleshooting
Event Processor Is Down
Database Verification
Remote Agent Will Not Start
Remote Agent Will Start - Command Will Not Run
Remote Agent Starts, Command Runs - No RUNNING Event Is Sent
xql Will Not Start (Sybase Only)
Remote Agent Not Found
Jobs Run Twice
Appendix A: Integrating with the Mainframe and AutoSys Agents for AS/400 and OpenVMS
Definition of Terms
Related Documentation
Job Scheduling for the Enterprise
Prerequisites
Stop the Event Processor
Configure the AutoSys Machine
Set the AutoSysAgentSupport Parameter
Set the AutoSysAgentDebug Parameter
Create the config.EXTERNAL File
Example of config.EXTERNAL
License Keys
Restart the Event Processor
About asbIII
ASB_PING_INTERVAL
ASB_RECV_INTERVAL
ASB_TRKARRAY_SIZE
ASB_CKP_PATH
PRIMARYCCISYSID
Bi-Directional Scheduling
Running Jobs on AutoSys on Behalf of a Workload Manager
AutoSys and AutoSys Connect Cross-Platform Dependencies
Job Scheduler Interdependencies
Notation for Cross-Platform Job Dependencies
AutoSys and AutoSys Connect Cross-Platform Dependency Example
Naming Conventions for AutoSys Connect Cross-Platform Jobs
Agent Job Names and User IDs
Running Jobs on Agent Managed Machines
Defining Agent Machines to AutoSys
Job Definition Examples
AutoSys Agent Machine in an AutoSys Job Definition
Log and Trace Information
AutoSys Connect and AutoSys Agent Job Statuses
Unsupported Attributes for AutoSys Connect or AutoSys Agent Jobs
traceroute
ccinet
CCI Command Line Controls
cci show
cci semashow and cci semaclear X
cci shutdown
cci debugon and cci debugoff
Chapter 1
Introduction to AutoSys
This guide is for users responsible for defining jobs to AutoSys and for monitoring and managing these jobs. It assumes familiarity with the operating system on which jobs will be run, and it assumes that you have already installed and are running AutoSys using the procedures described in the Unicenter AutoSys Job Management for UNIX Installation Guide.
AutoSys is an automated job control system for scheduling, monitoring, and reporting on jobs. These jobs can reside on any AutoSys-configured machine that is attached to a network. An AutoSys job is any single command, executable, script, or Windows batch file. Each AutoSys job definition contains a variety of qualifying attributes, including the conditions specifying when and where a job should be run.
As with most control systems, there are many ways to correctly define and implement jobs. It is likely that the way you use AutoSys to address your distributed computing needs will evolve over time. As you become more familiar with both the features of AutoSys and the characteristics of your own jobs, you will also refine your use of AutoSys. However, before you install and use AutoSys, it is important to understand the basic AutoSys system, its components, and how these components work together. This chapter provides a brief overview of AutoSys, its system architecture, and features.
AutoSys Jobs
In the AutoSys environment, a job is a single action that can be performed on a valid AutoSys client machine. On UNIX, this action can be any single command or shell script, and on Windows, this action can be any single command, executable, or batch file. In addition, job definitions include a set of qualifying attributes. For information on defining, running, managing, monitoring, and reporting on jobs, see the corresponding chapters in this guide.
Defining Jobs
Using AutoSys utilities, you can define a job by assigning it a name and specifying the attributes that describe its associated behavior. These specifications make up the AutoSys job definition. These are the two methods you can use to create job definitions:
- Using the AutoSys Graphical User Interface (GUI).
- Using the AutoSys Job Information Language (JIL) through a command-line interface.
AutoSys Graphical User Interface
The AutoSys GUI allows you to interactively set the attributes that describe when, where, and how a job should run. You create job definitions using the GUI Control Panel and the dialogs you can launch from it. The fields in the GUIs correspond to the AutoSys JIL sub-commands and attributes. In addition, from the GUI Control Panel, you can open applications that allow you to define calendars, monitors, and reports, and allow you to monitor and manage AutoSys jobs.
Job Information Language
AutoSys JIL is a specification language, with its own syntax, that is used to describe when, where, and how a job should run. When you enter the jil command, you get the jil command prompt, at which you can enter the job definitions one line at a time using this special language. When you exit the jil command-line interface, the job definition is loaded into the AutoSys database. Alternatively, you can enter the definition as a text file and redirect the file to the jil command. In this case, the jil command activates the language processor, interprets the information in the text file, and loads this information into the AutoSys database.
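As a rough illustration of the text-file route, the following Python sketch builds a job definition as plain text that could then be redirected to the jil command. The insert_job, machine, command, and owner keywords follow common JIL usage but are not shown in this chapter, so treat them, the job name, and the values as assumptions and check the Reference Guide for the exact syntax.

```python
def render_jil(job_name, attributes):
    """Return a JIL-style job definition as text.

    Assumption: a definition starts with an "insert_job" sub-command and is
    followed by one "keyword: value" line per attribute.
    """
    lines = [f"insert_job: {job_name}"]
    for keyword, value in attributes.items():
        lines.append(f"{keyword}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical job: the name, machine, and owner are illustrative only.
definition = render_jil(
    "cleanup_tmp",
    {
        "machine": "WorkStation_2",
        "command": "rm /tmp/mystuff/*",
        "owner": "tarzan@jungle",
    },
)
print(definition)
```

Saved to a file (for example cleanup_tmp.jil, a hypothetical name), this text could be loaded by redirecting the file to the jil command, as in jil < cleanup_tmp.jil.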
System Components
The following are the main AutoSys system components:
- Event server (AutoSys database)
- Event processor
- Remote agent
In addition, AutoSys provides utilities to help you define, run, and maintain AutoSys instances and jobs. The included utilities are platform-specific; however, all platforms include the AutoSys GUI components and JIL. Both the GUI and JIL allow you to define, manage, monitor, and report on jobs. The following figure illustrates the AutoSys system components in a basic configuration. In addition, this figure illustrates the communication paths between the components.
Event Server
The event server or AutoSys database (the RDBMS) is the data repository for all system information and events as well as all job, monitor, and report definitions. Event server refers to the database where all the AutoSys information, events, and job definitions are stored. Occasionally, the database is called a data server, which actually describes a server instance. That is, it is either a UNIX or Windows process, and its associated data space (or raw disk storage), that can include multiple databases or tablespaces.
Note: The AutoSys database refers to the specific server instance and the autosys database for that instance. Some utilities, such as isql (Sybase), allow you to specify a particular server and database.
High-Availability Option: Dual-Event Servers
AutoSys can be configured to run using two databases, or dual-event servers. This feature provides complete redundancy. Therefore, if you lose one event server due to hardware, software, or network problems, AutoSys operations can continue on the second event server without loss of information or functionality. For various reasons, database users often run multiple instances of servers that are unaware of the other servers on the network. When implementing AutoSys, the database can run stand-alone for AutoSys only, or it can be shared with other applications. For more information on using dual-event servers, see Dual Event Servers in the chapter Introduction to AutoSys in the Unicenter AutoSys Job Management for UNIX Installation Guide.
Event Processor
The event processor is the heart of AutoSys; it interprets and processes all the events it reads from the AutoSys database. Sometimes called the event_demon, the event processor is the program, running either as a UNIX process or as a Windows service, that actually runs AutoSys. It schedules and starts jobs. After you start it, the event processor continually scans the database for events to be processed. When it finds one, it checks whether the event satisfies the starting conditions for any job in the database. Based on this information, the event processor first determines what actions are to be taken, then instructs the appropriate remote agent process to perform the actions. These actions may be the starting or stopping of jobs, checking for resources, monitoring existing jobs, or initiating corrective procedures.
High-Availability Option: Shadow Event Processor
AutoSys lets you set up a second event processor, called the shadow event processor. This second processor should run on a separate machine to avoid a single point of failure. The shadow event processor remains in an idle mode, receiving periodic messages (pings) from the primary event processor. Basically, these messages indicate that all is well. However, if the primary event processor fails for some reason, the shadow event processor will take over the responsibility of interpreting and processing events. For more information on running a shadow event processor, see Shadow Event processor in the chapter Introduction to AutoSys in the Unicenter AutoSys Job Management for UNIX Installation Guide.
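The idle-until-silent behavior described above can be pictured with a small Python sketch. It is only a model of the idea: the class name, the ping_timeout_seconds setting, and the timing values are hypothetical, and the real takeover logic is likely more involved than a single timeout.

```python
class ShadowEventProcessor:
    """Idle standby that takes over when pings from the primary stop arriving."""

    def __init__(self, ping_timeout_seconds):
        self.ping_timeout_seconds = ping_timeout_seconds  # hypothetical setting
        self.last_ping = None
        self.active = False

    def receive_ping(self, now):
        # A ping from the primary event processor means all is well.
        self.last_ping = now

    def check(self, now):
        # Take over once the primary has been silent for longer than the timeout.
        if self.last_ping is not None and now - self.last_ping > self.ping_timeout_seconds:
            self.active = True
        return self.active


shadow = ShadowEventProcessor(ping_timeout_seconds=60)
shadow.receive_ping(now=0)
print(shadow.check(now=30))   # False: the primary pinged recently
print(shadow.check(now=120))  # True: no ping for more than 60 seconds
```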
Remote Agent
On a UNIX machine, the remote agent is a temporary process started by the event processor to perform a specific task on a remote (client) machine. On a Windows machine, the remote agent is a Windows service running on a remote (client) machine that is directed by the event processor to perform specific tasks. The remote agent starts the command specified for a given job, sends running and completion information about a task to the event server, then exits. If the remote agent is unable to transfer the information, it waits and tries again until it can successfully communicate with the database.
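The wait-and-retry behavior in the last sentence can be sketched in Python as follows. This is a minimal model, assuming a send function that raises ConnectionError when the database cannot be reached; the function names, the event layout, and the five-second wait are illustrative, not AutoSys internals.

```python
import time


def send_with_retry(send, event, wait_seconds=5.0, sleep=time.sleep):
    """Call send(event) until it succeeds and return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            send(event)
            return attempts
        except ConnectionError:
            # Database unreachable: wait, then try to deliver the event again.
            sleep(wait_seconds)


# Simulated event server that is unavailable for the first two attempts.
outcomes = [False, False, True]
delivered = []


def flaky_send(event):
    if not outcomes.pop(0):
        raise ConnectionError("event server unavailable")
    delivered.append(event)


completion = {"job": "cleanup_tmp", "status": "SUCCESS", "exit_code": 0}
attempts = send_with_retry(flaky_send, completion, sleep=lambda seconds: None)
print(attempts)  # 3
```

Passing the sleep function as a parameter keeps the sketch testable without real delays.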
Note: Understanding this example will help you answer many questions that may arise during your experiences with AutoSys.
Note: In this example, the three primary components are shown running on different machines. Typically, the event processor and the event server run on the same machine.
Explanation
The following numbered steps explain the interactions in the example scenario:
1. From the event server, the event processor reads a new event, which is a STARTJOB event with a start time condition that has been met. Then the event processor reads the appropriate job definition from the database and, based on that definition, determines what action to take. In the example, it runs the rm /tmp/mystuff/* command on WorkStation_2.
2. The event processor communicates with the remote agent on WorkStation_2. As soon as the remote agent receives the instructions from the event processor, the connection between the two processes is dropped. After the connection is dropped, the job will run to completion, even if the event processor stops running.
3. The remote agent performs resource checks, such as ensuring that the minimum specified number of processes are available, then forks a child process that will actually run the specified command.
4. The command completes and exits, and the remote agent captures the command's exit code.
5. The remote agent communicates the event (exit code, status, and so forth) directly to the event server. If the AutoSys database is unavailable for any reason, the remote agent will go into a wait and resend cycle until it can deliver the message.
Only two AutoSys processes need to be running: the event processor and the event server. When these two components are running, AutoSys is fully operational. The remote agent is started on a client machine once per job. As soon as the job ends and the remote agent sends a completion event to the database, the remote agent exits.
Note: The remote agent is started on the client machine by the event processor talking to the internet demon (inetd) on the client machine. For this to happen, inetd must also be running. However, since UNIX is responsible for starting this demon, it is not considered an AutoSys process.
Interface Components
To define, monitor, and report on jobs, you can use either the AutoSys GUI or AutoSys JIL. In addition, the Operator Console and its dialogs provide a sophisticated method of monitoring AutoSys jobs in real time. This feature lets you view all jobs that are defined to AutoSys, whether or not they are currently active. If you purchased AutoSys/Xpert, you can use the Xpert product to monitor, analyze and forecast (simulate) AutoSys jobs. For more information, see the Unicenter AutoSys Job Management AutoSys/Xpert User Guide.
AutoSys Machines
From a hardware perspective, the AutoSys architecture is composed of the following two types of machines attached to a network:
Server Machine
The AutoSys server is the machine on which the event processor, the event server (database), or both, reside. In a basic configuration, both the event processor and the event server reside on the same machine.
Client Machine
The AutoSys client is the machine on which the remote agent software resides, and where AutoSys jobs are to be run. A remote agent must be installed on the server machine, and it can also be installed on separate physical client machines.
AutoSys Instance
An AutoSys instance is one licensed version of AutoSys software running as an AutoSys server with one or more clients, on a single machine or on multiple machines. An AutoSys instance is defined by the instance ID, which is a capitalized three-letter identifier defined by the $AUTOSERV environment variable. An instance uses its own event server and event processor and operates independently of other AutoSys instances. You may want to install multiple AutoSys instances. For example, you may want to have one instance for production and another for development. Multiple instances can run on the same machine, and can schedule jobs on the same machines, without interfering with or affecting one another.
Events
AutoSys is completely event-driven; that is, for a job to be activated by the event processor, an event must occur on which the job depends. For example, a prerequisite job has completed running successfully or a required file has been received. Events can come from a number of sources, including the following:
- Jobs changing states, such as starting, finishing successfully, and so forth.
- Internal AutoSys verification agents, such as detected errors.
- Events sent with the sendevent command, sent from the Send Event dialog, the command line, or user applications.
As each event is processed, the event processor scans the database for jobs that are dependent on that event in some way. If the event satisfies another job's starting condition, that job is either started immediately or, if necessary, queued for the next qualified and available machine. The completion of one job can cause another job to be started, and in this way, jobs progress in a controlled sequence.
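The dependency scan can be sketched in Python as shown here. It is a simplified model, assuming that each job's starting conditions are a set of (job name, status) events that must all have occurred; the job names and the data layout are hypothetical, and queuing for an available machine is not modeled.

```python
def process_event(event, occurred, conditions):
    """Record an event and return the jobs it releases.

    occurred:   set of (job_name, status) events seen so far (updated in place).
    conditions: maps each job name to the set of events it depends on.
    """
    occurred.add(event)
    ready = []
    for job, required in conditions.items():
        # Only jobs that depend on this event are candidates, and they start
        # only when every other required event has also occurred.
        if event in required and required <= occurred:
            ready.append(job)
    return sorted(ready)


conditions = {
    "load_data": {("fetch_file", "SUCCESS")},
    "report": {("load_data", "SUCCESS"), ("fetch_file", "SUCCESS")},
}
occurred = set()
print(process_event(("fetch_file", "SUCCESS"), occurred, conditions))  # ['load_data']
print(process_event(("load_data", "SUCCESS"), occurred, conditions))   # ['report']
```

The second call shows the controlled sequence: report is released only after both of the events it depends on have been seen.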
Alarms
Alarms are special events that notify operations personnel of situations requiring their attention. Alarms are integral to the automated use of AutoSys. That is, jobs can be scheduled to run based on a number of conditions, but some facility is necessary for addressing incidents that require manual intervention. For example, a set of jobs could be dependent on the arrival of a file, and the file is long overdue. It is important that someone investigate the situation, make a decision, and resolve the problem.
These are some important aspects of alarms:
- Alarms are informational only. Any action to be taken due to a problem is initiated by a separate action event.
- Alarms are system messages about a detected problem.
- Alarms are sent through the system as an event.
Alarms have special monitoring features to ensure they will be noticed. For more information about these features, see the chapters The Operator Console and Monitoring and Reporting Jobs, in this guide.
Utilities
To help you define, control, and report on jobs, AutoSys has its own specification language called Job Information Language, or JIL, for defining jobs, machines, monitors, and reports. This language is processed by the jil command, which reads and interprets the JIL statements that you enter and then performs the appropriate actions, such as adding a new job definition to the database. AutoSys also provides a set of commands that run essential utility programs for defining, controlling, and reporting on jobs. For example, the autorep command allows you to generate a variety of reports about job execution, and the sendevent command allows you to manually control job processing. Additional utility programs are provided to assist you in troubleshooting, running monitors and browsers, and starting and stopping AutoSys and its components. AutoSys also provides a database maintenance utility that runs daily by default.
Explanation
1. The event processor scans the event server for the next event to process. If no event is ready, the event processor scans again in five seconds.
2. The event processor reads from the event server that an event is ready. If the event is a STARTJOB event, the job definition and attributes are retrieved from the event server, including the command and the pointer (full path name on the client machine) to the profile file to be used for the job. In addition, for jobs running on Windows machines, the event processor retrieves from the database the user IDs and passwords required to run the job on the client machine.
3. The event processor processes the event. If the event is a STARTJOB, the event processor attempts to establish a connection with the remote agent on the client machine, and passes the job attributes to the client machine. The event processor sends a CHANGE_STATUS event marking in the event server that the job is in STARTING state.
4. On a UNIX machine, the inetd invokes the remote agent. On a Windows machine, the remote agent logs onto the machine as the user defined as the job's owner, using the user IDs and passwords passed to it from the event processor.
5. The remote agent sends an acknowledgment back to the event processor indicating that it has received the job parameters. The socket connection is terminated. At this point, the event processor resumes scanning the event server database, looking for events to process.
6. The remote agent starts a process and executes the command in the job definition.
7. The remote agent issues a CHANGE_STATUS event marking in the event server that the job is in RUNNING state.
8. The client job process runs to completion, then returns an exit code to the remote agent and quits.
9. The remote agent sends the event server a CHANGE_STATUS event corresponding to the completion status of the job and passes back an exit code, using the communications facilities of the database. If the return status is SUCCESS, the remote agent deletes the log file in its temporary file directory (usually tmp) on the client machine (if so specified in the AutoSys configuration file on UNIX or with the AutoSys Administrator on Windows). The remote agent quits.
The event processor, which is scanning the event server, sees the process completion status, determines if there are dependent jobs, and evaluates the rest of the dependent jobs' starting conditions. For each job found whose remaining conditions are satisfied, the event processor sends a STARTJOB command to the event server, which it will then process in the next cycle.
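The sequence of CHANGE_STATUS events in this walkthrough can be summarized with a short Python sketch. It models only the order of status changes for one job. The function name and the event layout are hypothetical, and the mapping of exit code 0 to SUCCESS and any other code to FAILURE is an assumption made for illustration.

```python
def run_job(job_name, run_command):
    """Return the CHANGE_STATUS events that one STARTJOB would produce, in order."""
    events = []
    # Sent by the event processor once it has passed the job to the remote agent.
    events.append({"event": "CHANGE_STATUS", "job": job_name, "status": "STARTING"})
    # Sent by the remote agent when it starts the command.
    events.append({"event": "CHANGE_STATUS", "job": job_name, "status": "RUNNING"})
    exit_code = run_command()  # the client job process runs to completion
    # Assumption: exit code 0 means SUCCESS, anything else means FAILURE.
    status = "SUCCESS" if exit_code == 0 else "FAILURE"
    events.append(
        {"event": "CHANGE_STATUS", "job": job_name, "status": status, "exit_code": exit_code}
    )
    return events


for event in run_job("cleanup_tmp", run_command=lambda: 0):
    print(event["status"])
```

With a command that returns 0, the statuses printed are STARTING, RUNNING, and SUCCESS, in that order.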
Chapter 2
AutoSys Security
This chapter describes AutoSys security. To set up AutoSys correctly, you should understand the security features that control where and by whom a job can be edited or executed. If you are installing AutoSys on both UNIX and Windows, you must understand how security is implemented on both systems. For information about AutoSys security on Windows, see the chapter AutoSys Security in the Unicenter AutoSys Job Management for Windows User Guide.
Overview
AutoSys security includes:
- System-level security
- Job-level security
- Superuser privileges
- UNIX and Windows file permissions (See Restricting Access to AutoSys Jobs in this chapter.)
AutoSys security is initiated when either a user sends events that affect the running of a job or the event processor sends events that affect a job.
If you start a job by sending an event, the job permissions are checked as shown in the following figure.
The previous figure shows how AutoSys checks for the following when a user starts a job by sending an event:
1. Checks the database to determine if the job definition was tampered with. If so, the job definition is invalid, and the job is not run.
2. Does the user match the owner as indicated in the job definition?
3. Is the user the exec superuser as defined with autosys_secure?
4. Does the user have job execute permissions as indicated in the job definition?
5. Is there a machine name in the owner value of the job definition? The edit superuser can remove this portion of the owner.
6. Does the machine portion of the user logon match the job owner machine portion?
7. Does the job have machine permission as indicated by the job definition?
The following figure shows the permissions and security checks that occur on a UNIX machine before a job is allowed to start on the machine.
Note: In the figure, an asterisk indicates checks that are made only if the specific method of remote authentication is enabled (see Remote Agent Authentication in this chapter).
The previous figure shows how AutoSys checks for the following when the event processor sends a STARTJOB event to a remote agent machine:
1. Checks the database to determine if the job definition was tampered with. If so, the job definition is invalid, and the job is not run.
2. Checks the DES encrypted job definition to determine if the event processor can connect to the remote agent machine.
3. Does the user who is defined as the job owner (user@machine) have a logon account on the remote agent machine?
4. If user authentication is enabled, is the user a trusted user (as defined in the /etc/hosts.equiv and $HOME/.rhosts files)?
5. If event processor authentication is enabled, does the requesting event processor have permission to run jobs on this remote agent machine?
Note: The edit superuser can enable remote authentication by using the autosys_secure utility.
System-Level Security
The AutoSys security scheme prevents unauthorized access to AutoSys facilities, which in turn prevents unauthorized access to AutoSys jobs. The following features handle system security in AutoSys:
- AutoSys database field verification
- Job definition encryption
- Remote agent authentication
- AutoSys user and database administrator passwords
Note: On UNIX, the database field and control string encryption features provide a level of security comparable to the security provided in the native UNIX environment.
By default, both user authentication and event processor authentication are disabled. The edit superuser must enable them by using the autosys_secure command.
User Authentication
This remote authentication method uses UNIX ruserok() authentication to verify that a user has permission to start a job on an AutoSys client machine. It accomplishes this by telling the client's remote agent to call the UNIX ruserok() library function, which checks the client machine's /etc/hosts.equiv file and the user's .rhosts file to validate that the requesting user is registered in that environment. This function call performs a local verification, and it is not related in any way to rshd or rlogind. To activate this type of remote authentication, use the autosys_secure command.
The hosts.equiv or .rhosts file entries must match the job owner and machine name field exactly. For example, if the owner is tarzan@jungle, the hosts.equiv or .rhosts file must contain jungle. Similarly, if the owner is tarzan@jungle.vine.com, the hosts.equiv or .rhosts file must contain jungle.vine.com. If they do not match, jobs will fail to run on that machine when ruserok() remote authentication is in use.
For information on enabling this type of remote authentication, see autosys_secure in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
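The exact-match requirement can be illustrated with a small Python sketch. It is not a reimplementation of ruserok(), which does more than compare host names; it only shows why jungle and jungle.vine.com are treated as different entries. The function name and the set of trusted hosts are hypothetical.

```python
def owner_host_is_trusted(owner, trusted_hosts):
    """Return True if the machine part of owner appears verbatim in trusted_hosts."""
    user, separator, machine = owner.partition("@")
    if not separator or not machine:
        return False  # no machine portion to compare
    # Exact string comparison: "jungle" and "jungle.vine.com" are different names.
    return machine in trusted_hosts


print(owner_host_is_trusted("tarzan@jungle", {"jungle"}))           # True
print(owner_host_is_trusted("tarzan@jungle.vine.com", {"jungle"}))  # False
```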
Event Processor Authentication
When event processor authentication is enabled, the remote agent verifies that it has permission to process requests from the requesting event processor before starting each job. It does this by reading the /etc/.autostuff file on the machine on which the remote agent is running. For information on enabling event processor authentication, see autosys_secure in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
Note: Before enabling event processor authentication, you must set up and properly configure the /etc/.autostuff file on every client machine that will participate in this authentication method, as described in Configuring Remote Authentication in the chapter Configuring AutoSys, in this guide.
Who can edit, override, or delete a job definition.
Who can execute the UNIX command specified in a job.
The owner of a job can allow other users to edit and execute the job by setting the permissions in the job definition (discussed in the following section).
Granting Permissions
The owner of a job cannot override his or her ownership designation; only the edit superuser has the authority to change the owner job attribute. However, the owner can grant other users edit and execute permissions for a job by using the GUI or JIL to set the permission job attribute in the job definition. The following table shows the permissions that you can set by using JIL or the Permission toggle buttons on the Job Definitions Advanced Features dialog.

GUI                 JIL  Meaning
Group Execute       gx   Users assigned to the job owner's primary group can execute the job if logged onto the machine where the job was created (the machine specified in the owner attribute, that is, user@machine).
Group Edit          ge   Users assigned to the job owner's primary group can edit the job if logged onto the machine where the job was created (the machine specified in the owner attribute, that is, user@machine).
All Hosts Execute   mx   Users, regardless of the machine logged onto, can execute the job (otherwise, the user must be logged onto the machine specified in the owner attribute, that is, user@machine).
All Hosts Edit      me   Users, regardless of the machine logged onto, can edit the job (otherwise, the user must be logged onto the machine specified in the owner attribute, that is, user@machine).
World Execute       wx   Users can execute the job if logged onto the machine where the job was created (the machine specified in the owner attribute, that is, user@machine).
World Edit          we   Users can edit the job if logged onto the machine where the job was created (the machine specified in the owner attribute, that is, user@machine).
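One way to read the six permission codes is sketched below in Python. This is an interpretation for illustration only, not the AutoSys implementation. It assumes that mx and me merely lift the same-machine requirement of the group and world permissions, it ignores the edit and exec superusers, and it ignores the Windows-specific handling of group permissions. The names Job, owner_group, and is_allowed are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    owner: str          # user@machine, as in the owner attribute
    owner_group: str    # primary group of the owner (assumed to be known)
    permission: set = field(default_factory=set)  # subset of gx, ge, mx, me, wx, we


def is_allowed(job, user, machine, groups, action):
    """action is "x" (execute) or "e" (edit)."""
    owner_user, _, owner_machine = job.owner.partition("@")
    same_machine = machine == owner_machine
    if user == owner_user and same_machine:
        return True  # the owner needs no extra permission
    # Assumption: mx/me only remove the machine restriction of g*/w*.
    machine_ok = same_machine or ("m" + action) in job.permission
    group_ok = ("g" + action) in job.permission and job.owner_group in groups
    world_ok = ("w" + action) in job.permission
    return machine_ok and (group_ok or world_ok)


job = Job("tarzan@jungle", "apes", {"gx", "wx", "mx"})
print(is_allowed(job, "jane", "jungle", {"humans"}, "x"))   # True (world execute)
print(is_allowed(job, "jane", "jungle", {"humans"}, "e"))   # False (no edit permission)
print(is_allowed(job, "cheeta", "river", {"apes"}, "x"))    # True (gx combined with mx)
```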
Note: A job and the command it executes will always run as the user specified in the owner attribute of the job definition. Execute permissions determine who can execute events against the job, but not who the job runs as. Even if World Execute permissions are granted, the job will still run as the user specified in the owner attribute.
When defining a job to run on a Windows machine, you can set group permissions, but they will be ignored. Group permissions will be used if a job is edited or executed on a UNIX machine. When editing a job from a Windows machine, the group edit permission is ignored. In this case, the user editing the job must be the owner of the job, or World Edit permissions must be specified for the job. When executing a job from a Windows machine, the group execute permission is ignored. In this case, the user executing the job must be the owner of the job, or World Execute permissions must be specified for the job.
Edit Superuser
Only the edit superuser has permission to:
Edit or delete any job regardless of its owner or its permissions.
Change the owner attribute of a job.
Change the AutoSys database password, change the remote authentication method, and add and change Windows user IDs and passwords by using the autosys_secure command.
The edit superuser can override user authentication (if enabled) on a job-by-job basis by changing the owner of the job from the form user@machine to the form user. User authentication of the job at execution time is not performed on the client machine. For more information about changing the job owner, see owner attribute in the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide. Note: The purpose of the user@machine form is to prevent users from running jobs on machines where they do not have the appropriate permission. For example, root@machine prevents root on any machine from running root jobs on all machines.
The edit superuser must enter valid Windows user IDs and passwords into the AutoSys database. These user IDs and passwords are required for AutoSys to log onto and run jobs on Windows client machines. When a remote agent runs a job on a machine, it logs on as the user defined in the owner attribute for the job. To do this, the event processor retrieves encrypted versions of the IDs and passwords for the user@host_or_domain and the user@machine from the event server and passes them to the remote agent. For information about entering and changing Windows user IDs and passwords, see autosys_secure in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide. Note: Any AutoSys user who knows an existing user ID and password can change that password or delete that user and password.
Exec Superuser
Only the exec superuser has permission to:
Issue commands that affect the running or the state of any AutoSys job, either using the sendevent command or the Send Event dialog.
Stop the event processor by issuing the following command:
sendevent -E STOP_DEMON
Note: Exec superuser privileges are usually granted to the night operator.
Only authorized users can use AutoSys. Any user can view jobs and reports about jobs, such as using autorep to see the status of a job, but only authorized users can create jobs and calendars or make changes to them.
If you want only authorized users to access AutoSys, ensure that only those users have execute permissions on the files in the AutoSys bin directory. If you want all users to view reports about jobs, but only authorized users to create and edit jobs and calendars, ensure that the following AutoSys files in the $AUTOSYS/bin directory are executable only by the authorized users. This will also prevent unauthorized users from making changes to the AutoSys configuration. Secure the following files: DBMaint, clean_files, jobscape*, archive_events, dbspace, sendevent, autocal, dbstatistics, timescape*, autocal_asc, gatekeeper, xql, autocons*, hostscape*, zappls, autotimezone, jil, and zql. Note: An asterisk (*) indicates files that can be executable by all users as long as sendevent and jil are not executable. This allows users to use the GUI to view job states, but does not allow them to add new jobs or calendars or start jobs (even if the job has world execute permissions). You should also protect the files in the $AUTOUSER directory from modification by ensuring that only users authorized to change the AutoSys configuration have write permission on the files. Read permission is necessary to source the AutoSys environment files.
Chapter
AutoSys Jobs
All activity controlled by AutoSys is based on jobs. Other AutoSys objects, such as Monitors, Browsers (Reports), and the Operator Console, serve to track job progress. A job is the basic building block upon which the entire operations cycle is built. An AutoSys job is any single command or executable, UNIX shell script, or Windows batch file. Each AutoSys job definition contains a variety of qualifying attributes, including the conditions specifying when and where a job should be run. As with most control systems, there are many ways to correctly define and implement jobs. It is likely that the way you utilize AutoSys to address your distributed computing needs will evolve over time. As you become more familiar with both the features of AutoSys and the characteristics of your own jobs, you will also refine your use of AutoSys. Note: Before continuing with this chapter, read the chapter Maintaining AutoSys for details on starting the AutoSys event processor, which must be running before you start any AutoSys processes.
As their names imply, command jobs execute commands, box jobs are containers that hold other jobs (including other boxes), and file watcher jobs watch for the arrival of a specified file.
Command Jobs
The command job is commonly thought of (and referred to) as a job. The command can be a shell script, an executable program, a file transfer, and so forth. When this type of job is run, the result is the execution of a specified command on a client machine. When all the starting conditions are met, AutoSys runs this command and captures its exit code upon completion. The exit event (either SUCCESS or FAILURE) and the exit code value are stored in the database.
In addition to the primary functionality described previously, a command job has the following supporting features:

Command Job Options   Action
Resource criteria     AutoSys will check that a certain amount of free file space is available before starting a process. If it is not available, an alarm is sent and the job is rescheduled to start after a suitable delay.
Profile script        For each job, you can specify a script to be sourced before the execution of the command that defines the environment in which the command is to be run. All commands are run under the Bourne shell (/bin/sh). Therefore, all statements in the profile must use /bin/sh syntax.

For each job, you can specify the standard input, standard output, and standard error files. To do this, use the JIL std_* commands, or use the Job Definition Advanced Features dialog.
Box Jobs
In the AutoSys environment, the box job (or box) is a container of other jobs. A box job can be used to organize and control process flow. The box itself performs no actions, although it can trigger other jobs to run. An important feature of this type of job is that boxes can be put inside of other boxes. When this is done, jobs related by like starting conditions (not by similar application types) can be grouped and operated on in a logical way. Note: Box jobs are very powerful tools for organizing, managing, and administering large numbers of jobs that have similar starting conditions or have complex logic flows. Knowing how and when to use boxes is often the result of some experimentation. For more information on box jobs, see the chapter Box Job Logic, in this guide.
Starting Conditions for Box Jobs
If no other starting conditions are specified at the job level, a job within a box will run as soon as the starting conditions for the box are satisfied. If several jobs in a box do not have job-level starting conditions, they will all run in parallel. Each time any job in a box changes state, the other jobs are checked to see if they are eligible to start running. If jobs in a box have a priority attribute setting, they will be processed in order of priority, highest to lowest.
Jobs inside of boxes will be run only once per box execution. If you do specify multiple start times for a job during one box processing cycle, only the first start time will be used. This prevents jobs in boxes from inadvertently running multiple times. AutoSys starts a job if the current time matches, or is later than, the start time.
In addition to explicit starting conditions, jobs inside of boxes have the following implicit condition: the box job itself is running. This means that jobs inside of a box will start only if the box job itself is running. However, if a job inside a box starts and the box job is stopped, the started job runs to completion.
Note: Some caution must be exercised when placing a job with more than one time-related starting condition in a box. For example, a job that runs at 15 and 45 minutes past the hour is placed in a box that runs every hour. The first time the box starts, the job runs at 15 minutes past the hour. A future start is then issued for 45 minutes past the hour, by which time the box has completed. As a result, the job will not run until the box is running again at the top of the next hour. At that time, the job runs as soon as the box starts because it is past the start time. The job runs, another future start is issued for 15 minutes past the hour, the box completes, and the cycle repeats itself.
ACTIVATED
STARTING
RUNNING
SUCCESS
Status       Meaning
FAILURE      The job exited with an exit code greater than the maximum exit code for success. By default, any number greater than zero is interpreted as failure. If the job is a box job, a FAILURE status means either that at least one job within the box exited with the status FAILURE (the default), or that the Exit Condition for Box Failure evaluated to true. AutoSys issues an alarm if a job fails.
TERMINATED   The job terminated while in the RUNNING state. A job can be terminated if a user sends a KILLJOB event or if it was defined to terminate if the box it is in failed. If the job itself fails, it has a FAILURE status, not a TERMINATED status. A job may also be terminated if it has exceeded the maximum run time (term_run_time attribute, if one was specified for the job), or if it was killed from the command line through a UNIX kill command. AutoSys issues an alarm if a job is terminated.
RESTART      The job was unable to start due to hardware or application problems, and has been scheduled to restart.
QUE_WAIT     The job can logically run (that is, all the starting conditions have been met), but there are not enough machine resources available.
ON_HOLD      This job is on hold and will not be run until it receives the JOB_OFF_HOLD event.
ON_ICE       This job is removed from all conditions and logic, but is still defined to AutoSys. Operationally, this condition is like deactivating the job. It will remain on ice until it receives the JOB_OFF_ICE event.
The difference between on hold and on ice is that when an on hold job is taken off hold, if its starting conditions are already satisfied, it will be scheduled to run, and it will run. On the other hand, if an on ice job is taken off ice, it will not start, even if its starting conditions are already satisfied. This job will not run until its starting conditions reoccur.
The other major distinction is that jobs downstream from a job that is on ice will run as though the job succeeded, whereas dependent jobs do not run when a job is on hold; nothing downstream from that job will run. For details on how on ice affects boxes, see the chapter Box Job Logic, in this guide.
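The two differences between on hold and on ice can be summarized in a small Python sketch. It only restates the rules described in this section; the function names are hypothetical and nothing here is part of AutoSys.

```python
def starts_when_released(previous_status: str, conditions_already_met: bool) -> bool:
    """Is the job scheduled right after JOB_OFF_HOLD or JOB_OFF_ICE?"""
    if previous_status == "ON_HOLD":
        # Taken off hold: runs if its starting conditions are already satisfied.
        return conditions_already_met
    if previous_status == "ON_ICE":
        # Taken off ice: waits until its starting conditions reoccur.
        return False
    raise ValueError("expected ON_HOLD or ON_ICE, got " + previous_status)


def downstream_may_run(upstream_status: str) -> bool:
    """Can dependent jobs proceed while the upstream job is held or iced?"""
    if upstream_status == "ON_ICE":
        return True   # dependents behave as though the job succeeded
    if upstream_status == "ON_HOLD":
        return False  # nothing downstream runs
    raise ValueError("expected ON_HOLD or ON_ICE, got " + upstream_status)


print(starts_when_released("ON_HOLD", True))  # True
print(starts_when_released("ON_ICE", True))   # False
print(downstream_may_run("ON_ICE"))           # True
print(downstream_may_run("ON_HOLD"))          # False
```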
Note: In the following diagrams, a state is depicted using the following box drawing:
STATE
The following diagram depicts the simplest state transition for a job, in which an event satisfies the starting conditions for the job. The job starts, processes, and completes with either a failure or success exit code.
In the case of a box, the box always goes into the RUNNING state as soon as all its starting conditions are met. This RUNNING event usually triggers jobs within the box to also start. If the job has a priority associated with it, all its starting conditions have been met, and there are not enough machine resources available, it goes into the QUE_WAIT state. Once the resources become available, it goes into the STARTING state, then runs.
The value of status reflects the AutoSys event processing. Therefore, a job may have actually completed on a machine, but if the event processor has not yet processed that event, AutoSys will still show the job's status as RUNNING. By displaying the detail of the job (either in the Job Activity Console, or in the output of the autorep command), you can see all the events for a job, including those that have not been processed yet. In addition, the status always reflects the most recent event that was processed. Therefore, after a job has completed, the status will remain as it is on completion. If it ended successfully, the status will remain as SUCCESS until the job is run again.
Note: When a box job starts, all jobs within the box change state to ACTIVATED before they run. Jobs will then run immediately, unless other conditions apply. If a box completes before a job is run, the job is set to INACTIVE at the time of box completion. As a result, jobs do not retain their statuses from previous box processing cycles once a new box cycle has begun.
Starting Parameters
AutoSys determines whether to start or not to start a job based on the evaluation of the starting conditions (or starting parameters) defined for the job. These conditions can be one or more of the following:
Date and time scheduling parameters are met (the specified date and time has arrived or has passed).
Starting conditions specified in the job definition evaluate to true.
For jobs in a box, the box must be in the RUNNING state.
The current status of the job is not ON_HOLD or ON_ICE.
Every time an event changes any of the above conditions, AutoSys finds all the jobs that may be affected by this change, and determines whether or not to start them. Note: It is very important to keep in mind the above four conditions. In order for a job to start, all defined starting conditions must be true.
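The way the four conditions combine can be sketched in Python as a simple conjunction. The JobState class and its field names are hypothetical; the sketch illustrates the rule that every defined condition must hold and does not describe how the event processor is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobState:
    status: str                        # for example INACTIVE, ON_HOLD, ON_ICE
    time_reached: bool = True          # True when no date/time condition is defined
    conditions_true: bool = True       # True when no starting condition is defined
    box_status: Optional[str] = None   # None when the job is not in a box


def should_start(job: JobState) -> bool:
    """All defined starting parameters must be true at the same time."""
    in_running_box = job.box_status is None or job.box_status == "RUNNING"
    not_suspended = job.status not in ("ON_HOLD", "ON_ICE")
    return (job.time_reached and job.conditions_true
            and in_running_box and not_suspended)


print(should_start(JobState(status="INACTIVE")))                         # True
print(should_start(JobState(status="ON_HOLD")))                          # False
print(should_start(JobState(status="ACTIVATED", box_status="SUCCESS")))  # False
```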
Date/Time Dependencies
AutoSys jobs can be automatically scheduled to start at a certain date and time, based on the information you supply using JIL statements or the GUI. You define these dependencies by specifying the days or dates and times for time-based job starts. AutoSys then calculates a matrix of these values and starts jobs at those times. A time range cannot span more than 24 hours. You can also specify a time zone to apply to your starting times.
For example, if you define a job to be started on Monday, Wednesday, and Friday at 8:00 a.m. and 5:00 p.m., it will be started 6 times a week: Monday at 8:00 a.m. and 5:00 p.m., Wednesday at 8:00 a.m. and 5:00 p.m., and Friday at 8:00 a.m. and 5:00 p.m.
You can specify days of the week or actual dates. However, you cannot specify both. You can specify days of the week using JIL or the GUI, but you can only specify actual dates through the use of custom calendars, which you can define using the Graphical Calendar Facility.
You can specify times as certain times of the day, or hourly, denoted in minutes past the hour. Again, the two formats are mutually exclusive. You can specify either form using JIL or the GUI (you do not have to create custom calendars).
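The matrix of start times in the Monday, Wednesday, and Friday example is the cross product of the days and the times. The following Python sketch shows the arithmetic only; the function name weekly_starts and the string formats for days and times are illustrative assumptions.

```python
from itertools import product


def weekly_starts(days, times):
    """Pair every scheduled day with every scheduled time."""
    return list(product(days, times))


starts = weekly_starts(["Monday", "Wednesday", "Friday"], ["08:00", "17:00"])
for day, time in starts:
    print(day, time)
print(len(starts))  # 6
```

Three days and two times yield 3 x 2 = 6 starts per week, which matches the count given in the example.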
TZ Environment Variable
By default, jobs with time-based starting conditions that do not specify a time zone are scheduled to start based on the time zone of the TZ environment variable (the same time zone under which the event processor runs). Before you start the event processor, ensure that the TZ environment variable is set. The 3.4.4 event processor must be started once after you upgrade your database to insert the value of the TZ environment variable into the database. Do this before executing jil, autosc, autocons, or autorep.
Custom Calendars
Using the Graphical Calendar Facility or the autocal_asc utility, you can define any number of custom calendars, each with a unique name and containing any number of dates or date/time combinations. You can use these calendars in one of two ways: as days on which to run the jobs with which they are associated, or as days on which to not run the jobs with which they are associated. Calendars exist independently of any jobs that may be associated with them; they are referenced by jobs through job definitions.
Based on the current AutoSys status of other jobs
Based on the UNIX exit codes of other jobs
Based on AutoSys global variables
where:
status       Is one of the following:
  success      Indicates that the status condition for job_name is SUCCESS.
  failure      Indicates that the status condition for job_name is FAILURE.
  done         Indicates that the status condition for job_name is SUCCESS, FAILURE or TERMINATED.
  terminated   Indicates that the status condition for job_name is TERMINATED.
  notrunning   Indicates that the status condition for job_name is anything except RUNNING.
job_name     Is the job on which the new job is dependent.

You can abbreviate the status condition identifiers with the first letter, using s, f, d, t, and n. You can also abbreviate the dependency specification exit code with the letter e and VALUE (of a global variable) with the letter v. These abbreviations can be uppercase or lowercase.
You can control the value of the SUCCESS status by using the Maximum Exit Code for Success attribute, which can be set for a job. If you specify this attribute, any job that exits with an exit code less than or equal to the specified value will be treated as a success. A FAILURE status means the job exited with an exit code higher than this value. The convention (and the default) for normal job completion is 0. A TERMINATED status means the job was killed.
Note: Either uppercase or lowercase can be used to specify a status. However, the case cannot be mixed in either of the forms described.
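The abbreviation rule and the effect of the Maximum Exit Code for Success attribute can be sketched in Python as follows. The function names expand and completion_status, and the killed flag, are hypothetical; the sketch only restates the rules in this section.

```python
ABBREVIATIONS = {"s": "success", "f": "failure", "d": "done",
                 "t": "terminated", "n": "notrunning"}


def expand(keyword: str) -> str:
    """Accept a full keyword or its first letter, all uppercase or all lowercase."""
    if keyword not in (keyword.lower(), keyword.upper()):
        raise ValueError("mixed case is not allowed: " + keyword)
    word = keyword.lower()
    word = ABBREVIATIONS.get(word, word)
    if word not in ABBREVIATIONS.values():
        raise ValueError("unknown status keyword: " + keyword)
    return word


def completion_status(exit_code: int, max_exit_success: int = 0,
                      killed: bool = False) -> str:
    """Map an exit code to SUCCESS or FAILURE; a killed job is TERMINATED."""
    if killed:
        return "TERMINATED"
    return "SUCCESS" if exit_code <= max_exit_success else "FAILURE"


print(expand("S"))                                # success
print(completion_status(0))                       # SUCCESS
print(completion_status(2))                       # FAILURE
print(completion_status(2, max_exit_success=3))   # SUCCESS
```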
Cross-Instance Job Dependencies
Cross-instance job dependencies can be implemented among different instances of AutoSys. An AutoSys instance is one licensed version of AutoSys software running as an AutoSys server, and as an AutoSys server/client, on a single machine or on multiple machines. It uses its own event server and event processor and operates independently of other AutoSys instances.
Multiple instances of AutoSys are not inherently connected, but they can communicate with each other. You can define jobs to have cross-instance dependencies, and multiple instances can send events to each other. For example, multiple instances of AutoSys can send events to each other by way of a sendevent command line like:
sendevent -E STARTJOB -J job_name -S autoserv
The job_name argument is a job defined for the instance indicated by the autoserv argument, which is the instance's unique, capitalized three-character identifier. In addition, jobs can be associated with more than one instance of AutoSys. For example, a job defined to run on one instance of AutoSys could have as a starting condition the successful completion of a job running on a different instance of AutoSys. The specification for such a job dependency may look like:
condition: success(jobA) AND success(jobB^PRD)
The success(jobB^PRD) condition specifies the successful completion of a job named jobB running on a different instance of AutoSys specified with the three-letter ID of PRD. If the dependency specification does not include a caret (^) and a different instance ID, the current instance will be used, by default. Each time a cross-instance dependency is encountered, an EXTERNAL_DEPENDENCY event is sent from the requesting instance. If the target instance cannot be reached, an INSTANCE_UNAVAILABLE alarm is issued. The following figure shows two instances of AutoSys, each with a single event server, exchanging cross-instance job dependencies.
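The rule for resolving the instance of a dependency target can be sketched in Python. The function name split_dependency and the instance ID ACE are hypothetical and used only for illustration.

```python
def split_dependency(target: str, current_instance: str) -> tuple:
    """Split job_name^INSTANCE; fall back to the current instance without a caret."""
    job_name, caret, instance = target.partition("^")
    return job_name, (instance if caret else current_instance)


print(split_dependency("jobB^PRD", "ACE"))  # ('jobB', 'PRD')
print(split_dependency("jobA", "ACE"))      # ('jobA', 'ACE')
```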
Different instances of AutoSys can run from the same executables and can have the same values for $AUTOSYS and $AUTOUSER, both on the event processor machine and on machines running remote agents. However, they must have a different value for $AUTOSERV. For information on configuring AutoSys for cross-instance job dependencies, see Running Cross-Instance Job Dependencies in the chapter Introduction to AutoSys in the Unicenter AutoSys Job Management for UNIX Installation Guide.
Event Processors
When cross-instance dependencies are implemented, different event processors can do the following:
Be run on different server machines or on the same server machine.
Access the same client machines to start jobs.
Send events to other AutoSys instances.
Note: If the event server of a target instance is down, the event processor will try to resend an event (or events) every five minutes until the other instance's event server can be reached.
Event Servers
Event servers keep track of the cross-instance job dependencies. Each time a job definition with a cross-instance job dependency is submitted to the AutoSys database, the following entries are made:
An entry to the ext_job table of the issuing instance. The entries in this table specify the status of jobs in other instances in which this instance has an interest.
An entry to the req_job table of the receiving instance. The entries in this table specify the jobs that have been specified as a job dependency in a job definition on the source AutoSys instance.
In both tables, jobs are entered using the job name, a caret symbol (^), and the instance name, as in the following example:
jobB^PRD
The use of multiple databases is completely independent of instances using cross-instance dependencies. You can have multiple instances of AutoSys, each using dual-event servers. Note: When communicating with event servers, event processors can only connect to those instances with like event servers. That is, instances with Sybase data servers can only connect with other instances having Sybase data servers. The same holds true for instances with Oracle databases.
Example Job Dependencies
For a job that runs only if the job named DB_BACKUP succeeds, the job dependency specification would be written as follows:
success(DB_BACKUP)
or:
s(DB_BACKUP)
You can configure more complex conditions by combining a series of conditions with the AND or the OR logical operators. You can enter these operators in uppercase or lowercase, but not in mixed case. In addition, you can use the pipe symbol (|) instead of the word OR, and the ampersand (&) instead of the word AND. Spaces between conditions and delimiters are optional.
You can specify conditions that are more complex by grouping the expressions in parentheses. The parentheses do not imply any sort of precedence; they are simply used for grouping. For example, if JobC should only be started when both JobA and JobB complete successfully or when both JobD and JobE complete, regardless of whether they failed, succeeded, or terminated, you would specify the following dependency in the job definition for JobC:
(success(JobA) AND success(JobB)) OR (done(JobD) AND done(JobE))
or:
(s(JobA)&s(JobB))|(d(JobD)&d(JobE))
As indicated in this example, you can use any job status as part of the specification for a specific job's starting conditions. With this latitude, you can program branching paths that must be taken and that will provide alternate actions for error conditions. For example, if JobB fails after processing only partially, you may want to call a routine titled Backout that backs out of the changes that were made. You would specify the following job dependency in the job definition for Backout:
failure(JobB)
or:
f(JobB)
You use the notrunning operator to keep multiple jobs from running simultaneously (that is, running one job is exclusive of any others). For example, it may be best not to run a database dump (DB_DUMP) and a file backup (BACKUP) at the same time, because this would cause the hard disk to be accessed very frequently. However, you may have a smaller job that can run as long as both of these resource-intensive jobs are not running. You would specify the smaller job's dependency as follows:
notrunning(DB_DUMP) AND notrunning(BACKUP)
Note: If you have jobs that you want to run exclusively, use the virtual machine and job queuing feature that is described in the chapter Load Balancing and Queuing Jobs, in this guide.
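To make the status conditions in these examples concrete, the following Python sketch evaluates such an expression against a table of current job statuses. It is an illustration, not the AutoSys parser. It handles only status conditions (not exit code or global variable conditions), and because parentheses are described as grouping only, it assumes that operators outside parentheses are applied strictly left to right. All function names are hypothetical.

```python
import re

ATOM = re.compile(r"([A-Za-z]+)\s*\(\s*([\w^]+)\s*\)")
SATISFIED_BY = {
    "s": {"SUCCESS"},
    "f": {"FAILURE"},
    "t": {"TERMINATED"},
    "d": {"SUCCESS", "FAILURE", "TERMINATED"},
}


def atom_true(keyword, job, statuses):
    key = keyword[0].lower()          # s, f, d, t or n
    if key == "n":                    # notrunning
        return statuses[job] != "RUNNING"
    return statuses[job] in SATISFIED_BY[key]


def evaluate(condition, statuses):
    # Replace every status(job) atom by 1 or 0, then combine the results.
    text = ATOM.sub(
        lambda m: " 1 " if atom_true(m[1], m[2], statuses) else " 0 ", condition)
    tokens = re.findall(r"\(|\)|&|\||[A-Za-z]+|[01]", text)
    pos = 0

    def operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = group()
            pos += 1                  # skip the closing parenthesis
            return value
        return token == "1"

    def group():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos].upper()
            pos += 1
            right = operand()
            if op in ("&", "AND"):
                value = value and right
            elif op in ("|", "OR"):
                value = value or right
            else:
                raise ValueError("unexpected token: " + op)
        return value

    return group()


statuses = {"JobA": "SUCCESS", "JobB": "FAILURE",
            "JobD": "TERMINATED", "JobE": "SUCCESS"}
print(evaluate("(s(JobA)&s(JobB))|(d(JobD)&d(JobE))", statuses))  # True
print(evaluate("f(JobB)", statuses))                              # True
```

With these statuses, the first group is false because JobB failed, but the second group is true because both JobD and JobE have completed, so the whole condition is satisfied.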
Managing Job Status
Starting conditions that are based on job status use the current (or most recent) completion status of the job. The current completion status is defined by the last execution of the job, regardless of when it last ran. However, if you wish to enforce the concept of time-based processing cycles, where the completion status of a job for some previous time period should not affect the processing of this time cycle, there are several options you can use to control statuses.
When a box job is started, all the jobs within the box have their status changed to ACTIVATED. Therefore, downstream jobs in the box that depend on the completion of jobs upstream in the same box will use only the completion statuses from this run of the box. Placing the jobs in one processing cycle inside a top-level box and setting the box to start at the beginning of the processing cycle will prevent time-critical jobs from being affected by invalid information.
When a job is first entered into the database, and prior to its being run for the first time, its status is set to INACTIVE. By changing to INACTIVE the status of jobs that have completed, but whose completion status should no longer be used in dependent job conditions, the completion status from the last run will no longer be the current status, and it will not be used.
To change a job status to INACTIVE, use the GUI (Send Event dialog), or use the sendevent command. Of course, you can create an AutoSys job to accomplish this as well. If you change the status of a top-level box to INACTIVE, all the jobs in the box are recursively set to INACTIVE.
Deleting and reinserting the job using JIL will accomplish the same thing. However, the past reporting history on the job will no longer be available. (Updating a job using JIL does not change the status of the job.)
where:
job_name   Is the name of the job upon which the new job is dependent.
operator   Is one of the following exitcode comparison operators: =, != (not equal), <, >, <=, or >=
value      Is any numeric value.

You can abbreviate the dependency specification exitcode with the letter e (uppercase or lowercase). For this example, you would enter the following for the job dependency specification for the JobB redial job:
e (JobA) = 4
You can use any job status or exit codes as part of the specification for starting conditions. With this latitude, you can program branching paths that will provide alternative actions for all types of error conditions.
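The six comparison operators available for exit code conditions can be sketched in Python with the standard operator module. The function name exitcode_condition is hypothetical; the sketch shows only how a recorded exit code would be compared with the value in the condition.

```python
import operator

COMPARATORS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
               ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def exitcode_condition(actual_exit_code: int, op: str, value: int) -> bool:
    """Compare the exit code recorded for a job with the condition's value."""
    return COMPARATORS[op](actual_exit_code, value)


# A condition that requires an exit code equal to 4:
print(exitcode_condition(4, "=", 4))   # True
print(exitcode_condition(0, "=", 4))   # False
print(exitcode_condition(5, ">=", 4))  # True
```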
Using Exit Codes and Batch Files with Jobs Running On Windows
When you are defining jobs that will run batch files on Windows, you should be aware of, and account for, the Windows-specific behavior. Windows programs return an exit value that is programmed within the executable code. This exit value is the last thing returned to Windows when the program terminates. Generally, a zero exit code indicates success, while a nonzero exit code indicates an error. The expected error values should be documented with each individual program, but some programs can return unexpected exit codes. You should modify these programs so that they return expected values. Use these values when specifying exit code dependencies.
AutoSys jobs are created using standard Windows process creation techniques. After the job has been created, the Remote Agent waits for the job to complete. When the job completes, AutoSys gets the program exit code from Windows and stores it in the AutoSys database for later use.
When launching programs directly from AutoSys, the exit codes are returned and put in the database. However, there are some exit code behaviors that you must take into consideration when using AutoSys to start *.BAT batch files. The exit code returned from a batch file is the return code from the last operation executed from within that particular batch file. Consider the following example:
REM test batch file
test
if errorlevel 1 goto bad
goto good
:bad
del test.tmp
:good
exit
This example batch file will return a 0 exit code as long as test.tmp exists. If test.tmp does not exist, the return code is from the del line and not from the line that executes test. Therefore, this batch file will return a 0 (successful) exit code to AutoSys, even if test failed to execute as intended.
To help handle situations like this, AutoSys supplies a program called FALSE.EXE. This program is located in the AutoSys for Windows %AUTOSYS%\bin directory and takes only one parameter, which is the exit code you want false to return on completion. You can use false in the previous example batch file, like:
REM test batch file
test
if errorlevel 1 goto bad
exit
:bad
del test.tmp
false 1
When test fails with errorlevel 1, this batch file will return an exit code of 1 from false, whether the test.tmp file exists or not.
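The same last-command rule applies to Bourne shell scripts run on UNIX clients, so the pattern is worth sketching there as well. The following is a minimal illustration only, not AutoSys code: failing_step, masked, and preserved are hypothetical names, and failing_step stands in for the test program in the batch file above.

```shell
#!/bin/sh
# Hypothetical step that always fails; it stands in for "test" above.
failing_step() { return 1; }

# Masked: the cleanup command runs last, so its status (0) is what
# the caller sees, and the failure of failing_step is lost.
masked() {
    failing_step
    rm -f "/tmp/example_cleanup.$$"
}

# Preserved: save the status of the step, clean up, then return the
# saved status explicitly (the role FALSE.EXE plays in the batch file).
preserved() {
    failing_step
    status=$?
    rm -f "/tmp/example_cleanup.$$"
    return "$status"
}
```

Calling masked reports success to its caller, while preserved reports the original failure.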
where:
VALUE: Can be uppercase or lowercase.
global_name: Is the name of the global variable upon which the job is dependent.
operator: Is one of the following: =, != (not equal), <, >, <=, or >=
value: Is any numeric value or text string (no quotes or spaces).
Note: global_name and the value can each be a maximum of 30 characters.
When using the Job Definition dialog to define a job, enter the expression shown here, in the Starting Condition field. When using JIL, enter the above expression in the appropriate JIL script using the condition attribute.
You can abbreviate the dependency specification VALUE with the letter v (uppercase or lowercase). In the example cited above, you would enter the following for the job's condition statement:
VALUE(manager-ok) = OK
or:
v (manager-ok) = OK
Job Definition dialog
Used to define jobs. The Job Definition dialog and its related dialogs allow you to create, view, edit, and delete job definitions for command jobs, box jobs, and file watcher jobs.
Graphical Calendar Facility
Used to define calendar definitions. The Graphical Calendar Facility and its related dialogs allow you to create calendars in order to simplify job scheduling. It allows you to create custom rules, block certain dates, set up conflict resolution, build calendars based on combinations of other calendars, and preview calendar definitions before assigning them to jobs. Then, you can assign them to a certain job, using the Job Definition dialog.
Operator Console
Used to monitor and manage AutoSys jobs. The Operator Console consists of the following components: Job Activity Console, Job Selection dialog, Alarm Manager dialog, and Alarm Selection dialog. The Job Activity Console allows you to monitor AutoSys jobs. To filter the jobs that it displays, you can use the Job Selection dialog. The Alarm Manager allows you to browse and handle alarms. To filter the alarms that the Alarm Manager displays, you can use the Alarm Selection dialog.
Monitor/Browser
Used to define monitors and reports. The Monitor/Browser allows you to define filters by which you can screen AutoSys system information. Monitors provide real-time views of the system. Browsers (reports) provide historical views of system information.
Note: You can also access AutoSys/Xpert from the GUI Control Panel. For information on using AutoSys/Xpert, see the Unicenter AutoSys Job Management AutoSys/Xpert User Guide.
Chapter
Job Attributes
This chapter describes the essential and optional job attributes used to define jobs in AutoSys. These attributes determine what a job does, as well as when and where it will run.
where:
job_name: Is a unique job name.
attribute_keyword: Is one of the legal JIL attributes.
value: Is the setting to be applied to the attribute.
Chapter Organization
In this chapter, job attributes are organized into two categories: essential and optional. Essential attributes are those that must be specified in order for the job definition to be accepted. As the name implies, optional attributes are not necessarily required for a job definition to be accepted. For each attribute described in this chapter, we indicate its name, its JIL attribute keyword, its corresponding GUI object, or GUI field name, and a description of its use. In the previous example, the Job Type attribute, which specifies whether a job is a command, file watcher, or box, is specified with the JIL keyword job_type and is identified in the GUI as the field with the name Job Type.
Because the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide is organized alphabetically by JIL keywords, the keywords in this chapter can act as pointers to more detailed descriptions about a particular attribute in the reference chapter. The heading on each reference page contains the same JIL keyword on the left and GUI field name on the right as in the example previously shown.
Job Name
JIL Keyword: insert_job
GUI Field Name: Job Name
Description: The job name is used to identify the job to AutoSys, and must be unique within AutoSys. It can be from 1 to 30 alphanumeric characters, and is terminated with white space. Embedded blanks and tabs are illegal. Command, file watcher, and box jobs cannot use the same name.
Job Type
JIL Keyword: job_type
GUI Field Name: Job Type
Description: The job type specifies the type of job: command (c), file watcher (f), or box (b).
Job Owner
JIL Keyword: owner
GUI Field Name: Owner
Description: The job owner specifies whose user ID the command will be run under on the client machine. This attribute is automatically set to the user who invoked jil or the GUI to define the job, and cannot be changed except by the edit superuser.
Command
JIL Keyword: command
GUI Field Name: Command to Execute
Description: The command attribute can be the name of any command, executable, UNIX shell script or batch file, and its arguments. When issuing commands that are to be run on a different operating system, you must use the syntax appropriate to the operating system of the client machine. The job's owner must have execute permission for this command on the client machine.
Input and output redirection cannot be part of the command. Redirection is specified by other job attributes.
AutoSys global variables can be used as part of the command name itself, or as part of the command's runtime arguments. To set a global variable, use the sendevent command, or use the Send Event dialog in the GUI.
This command will be executed in the environment defined in the profile script: either the AutoSys default /etc/auto.profile, or the one specified in the job definition (which you can define, and which preempts the default file). Therefore, if $PATH is assigned in that script, that path will be searched to find the executable. For more information on specifying a profile, see Profile in Command Job Attributes in this chapter.
The full path name can be specified, in which case, variables exported from the profile script can be used in the path name specification. If variable substitution is used, enclose the variable in curly braces, as in ${DIR}.
These are additional points to keep in mind with regard to the command attribute:
- Since AutoSys performs an exec to run the command, multiple commands separated by a semicolon are not allowed.
- Piping or redirection of standard input, output, and error files is not allowed in the command attribute. To get this functionality, invoke a shell script that executes the piped commands, and use attributes such as std_in_file to redirect standard input.
- You cannot use the ampersand character (&) in the command attribute. A shell script can be called to provide that functionality.
- All commands are run under the Bourne shell (/bin/sh). Therefore, all statements in the profile must use /bin/sh syntax. For example:
Variable=value; export Variable
Do not use:
export Variable=value
or
setenv Variable Value
- If you are running a C-Shell (csh) script, the system will attempt to source a .cshrc file when it begins interpreting the file. Although this may be desired, the .cshrc file will also overwrite any variables defined in the profile script (the default profile is /etc/auto.profile). If you do not wish to have the .cshrc file sourced, you must invoke the csh script with the -f option. For example, this should be the first line of the script:
#!/bin/csh -f
- Only one file is sourced: either the default /etc/auto.profile or the profile file specified in the job definition. Therefore, the entire environment needed for the command must be defined in the profile file that will be sourced.
- Command line arguments can be passed using global variables.
Note: If a command works properly when issued at a shell prompt, but fails to run, or does not run properly, when specified as a command attribute, the shell and AutoSys environments are probably different. If this is the case, ensure that all required command variables are specified in the AutoSys profile script, either the default one or the one you have specified.
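Because pipes, redirection, and multiple commands are not allowed in the command attribute, that logic is usually moved into a script that the job then runs. The following is a minimal sketch of such a wrapper, assuming a hypothetical function name (count_error_lines) and a hypothetical log format in which problem lines contain the word ERROR.

```shell
#!/bin/sh
# Hypothetical wrapper script: the command attribute would name this
# script, and the pipeline lives here instead of in the job definition.

# Print the number of lines in a log file that contain "ERROR".
# Returns 2 if the log file cannot be read.
count_error_lines() {
    log_file=$1
    [ -r "$log_file" ] || return 2
    # The pipe is legal here because /bin/sh interprets the script.
    grep 'ERROR' "$log_file" | wc -l | tr -d ' '
}
```

The job definition would then reference only the path of the script, with any input file supplied through std_in_file or a command-line argument.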
Machine to Run On
JIL Keyword: machine
GUI Field Name: Execute on Machine
Description: This attribute specifies the client machine on which the command should be run. The job's owner must have permission to access this machine and to execute the specified command at this machine. The machine can be a specific real machine (as listed in the /etc/hosts file of the AutoSys server machine), a set of real machines, or a virtual machine.
Note: If you have implemented the shadow event processor feature, you should never set the machine attribute to localhost. The localhost value implies: run on the machine on which the event processor is currently running. The job may run normally on the primary event processor machine, and yet fail on the shadow event processor machine.
You can also specify the svload program or your own custom load-balancing program in place of a machine name. In this case, the event processor will run the program at runtime to select the best-suited machine to run the job.
For more information about virtual machines, and how AutoSys chooses a machine to run on when you specify multiple machines or a load-balancing program, see the chapter Load Balancing and Queuing Jobs, in this guide.
Machine to Run On
JIL Keyword: machine
GUI Field Name: Execute on Machine
Description: This attribute specifies the client machine on which the File Watcher should run. For a File Watcher, this attribute must specify a single real machine, defined in the /etc/hosts file on the AutoSys server machine.
JIL Keyword: watch_file
GUI Field Name: File To Watch For
Description: The name of the file to watch for must be a legal file name and must include the full path to the file. All directories in the path must exist, but the file itself does not have to exist at the time the job is defined. Environment variables defined and exported in the profile file (the specified or default), as well as global variables, can be used in the path. Wildcards cannot be used in the file name. When using the GUI, this field only appears when the File Watcher type has been selected.
This attribute is used in combination with the Watch File Minimum File Size and Watch Interval attributes, to determine when a file is considered to have arrived.
JIL Keyword: date_conditions
GUI Field Name: Is the Start Date/Time Dependent?
Description: The start date/time dependencies attribute is a toggle, which specifies whether there are date conditions, time conditions, or both, required for starting the job. If the attribute is set to no, the remainder of the related date/time attributes, described below, will be ignored.
JIL Keyword: days_of_week
GUI Field Name: Days of the Week
Description: The days of the week attribute specifies the days on which the job should be run. You can specify one or more days, or all for every day.
JIL Keyword: run_calendar
GUI Field Name: Run on Days in Calendar
Description: The days on which a job should be run can be specified by way of a custom calendar, rather than through a list of days of the week. Custom calendars, specified through the AutoSys Graphical Calendar Facility, or the autocal_asc command, can include any number of dates on which the job should be run. Each calendar is stored in the database as a separate object with a unique name, and a calendar can be associated with one or more jobs, using this attribute or the exclude_calendar attribute.
JIL Keyword: exclude_calendar
GUI Field Name: Do NOT Run on Days in Calendar
Description: The days on which a job should not be run can be specified by way of a custom calendar. Custom calendars, specified through the AutoSys Graphical Calendar Facility, or the autocal_asc command, can include any number of dates on which the job should not be run. Each calendar is stored in the database as a separate object with a unique name, and a calendar can be associated with one or more jobs, using this attribute or the run_calendar attribute.
JIL Keyword: start_times
GUI Field Name: Times of Day
Description: This attribute specifies one or more specific times of day when the job should be started. The job will be started at each specified time of day, on every day specified in the associated date attributes. This attribute and the Specific Times Every Hour to Run (start_mins) attribute are mutually exclusive.
JIL Keyword: run_window
GUI Field Name: Run Window
Description: This attribute specifies a time range (or time window) during which a job can be started. When the starting conditions for a job have been met, AutoSys checks if the current time is within the specified run window. The job will not start outside of the specified window. This attribute controls only when the job will start, not when it will stop running.
This attribute is particularly useful when, for example, it is not known when a watched-for file will arrive, and there are certain times when jobs dependent on that file should not run. This setting can prevent a late-arriving file from causing a job to run at an inopportune time.
The run window range cannot span more than 24 hours. Jobs that are not in a box must have starting conditions in addition to the run_window attribute in order for the job to be automatically started.
Note: You can also block out times of day when you do not want a job to start by putting the job on hold, then taking it off hold later. The sendevent command can be used to accomplish this, executed either from the command line, through the Send Event dialog, or from within a shell script or batch file in another job.
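The window test described above can be sketched in Bourne shell. This is an illustration of the idea only, not the AutoSys implementation: the function names are hypothetical, times are assumed to be in 24-hour HH:MM form, both ends of the window are assumed to be inclusive, and a window whose start is later than its end is assumed to span midnight.

```shell
#!/bin/sh
# Convert HH:MM to minutes past midnight. A single leading zero is
# stripped so that values such as 08 are not read as octal.
to_minutes() {
    hh=${1%%:*}
    mm=${1##*:}
    hh=${hh#0}; hh=${hh:-0}
    mm=${mm#0}; mm=${mm:-0}
    echo $(( $hh * 60 + $mm ))
}

# in_run_window NOW START END
# Succeeds if NOW falls inside the window START-END (assumed inclusive).
in_run_window() {
    now=$(to_minutes "$1")
    start=$(to_minutes "$2")
    end=$(to_minutes "$3")
    if [ "$start" -le "$end" ]; then
        [ "$now" -ge "$start" ] && [ "$now" -le "$end" ]
    else
        # The window spans midnight, for example 23:00 to 02:00.
        [ "$now" -ge "$start" ] || [ "$now" -le "$end" ]
    fi
}
```

For example, with a window of 23:00 to 02:00, a job whose conditions are met at 01:15 would be allowed to start, while one whose conditions are met at 12:00 would not.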
JIL Keyword: start_mins
GUI Field Name: Every Hour at
Description: This attribute specifies one or more specific times per hour when the job should be started. Each time is specified in minutes past the hour. The job will be started at each specified time every hour of the day, on every day specified in the associated date attributes. This attribute and the Specific Times of Day to Run (start_times) attribute are mutually exclusive.
JIL Keyword: condition
GUI Field Name: Starting Condition
Description: Any number of job dependencies can be specified; however, every dependency must evaluate to true before the dependent job will be run. Examples of job dependencies include successful completion of a job, failure of a job, a job's exit code, and the value of a global variable. Various combinations of conditions may also be specified. Job dependencies can reference jobs residing on different AutoSys instances.
If a condition is specified for an undefined job, the condition will be evaluated as FALSE, and any jobs dependent on this condition will not run. To check for this type of invalid condition statement, you can use the chk_cond stored procedure (see chk_cond (SP) in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide).
Note: You should study Starting Parameters in the chapter AutoSys Jobs and review the JIL examples provided in the chapter Defining Jobs Using JIL. Starting conditions offer many ways to control jobs, and those sections provide information that will help you when creating your own job definitions.
Description
JIL Keyword: description
GUI Field Name: Description
Description: This attribute provides a comment field, used for documentation purposes only. When entering a description using JIL, you should enclose the string in double quotes to ensure JIL properly interprets it. The GUI adds quotes for you automatically.
Box Name
JIL Keyword: box_name
GUI Field Name: Name of the Box this Job is IN
Description: Boxes allow a set of jobs to be manipulated as a group. This feature is particularly useful for setting starting conditions at the box level, to gate the jobs inside the box, then specifying their starting conditions relative to each other individually, if necessary. This attribute specifies the name of the box in which the job is to be placed. The specified box must already exist before you can place jobs in it.
JIL Keyword: min_run_alarm
GUI Field Name: Minimum Runtime
Description: A minimum runtime (in minutes) can be specified for a job; the job should not end in less than the specified time. This may prevent an inadvertent truncation of the file being processed before it is complete. If the job does end prior to this time, an alarm is generated to alert someone to investigate the situation and take corrective action.
Alarms are informational, and they do nothing on their own. A monitor or the Operator Console must be running and tracking alarms in order for them to be seen and acted upon in real time.
JIL Keyword: max_run_alarm
GUI Field Name: Maximum Runtime
Description: A maximum runtime can be specified for a job. If a maximum runtime is specified, the job should not take longer than the specified time to finish. This reasonability test may catch an error, such as the application being stuck in a loop, or the application waiting for additional data that may never arrive. If the job runs longer than this time, an alarm is generated to alert someone to investigate the situation and take corrective action.
Alarms are informational, and they do nothing on their own. A monitor or the Operator Console must be running and tracking alarms in order for them to be seen and acted upon in real time.
The attribute Terminate this job Mins after starting (term_run_time) can be used to automatically terminate a job that has been running for too long. If term_run_time is not set, the job will continue running until manually interrupted, or it completes by itself.
JIL Keyword: term_run_time
GUI Field Name: Terminate this job Mins after starting
Description: A maximum runtime (in minutes) can be specified for a job; the job should not take longer than the specified time to finish. This feature allows the job to be automatically terminated if it runs longer than the allotted time.
JIL Keyword: alarm_if_fail
GUI Field Name: Send ALARM if this Job Fails?
Description: This attribute specifies whether or not an alarm should be generated when the job fails. Failure is defined as the job completing with a FAILURE or TERMINATED status. (The Maximum Exit Code for SUCCESS attribute determines what codes are interpreted as FAILURE for a job, and the Box Failure Condition attribute determines what constitutes a box failure.)
Alarms are informational, and they do nothing on their own. A monitor or the Operator Console must be running and tracking alarms in order for them to be seen and acted upon in real time.
JIL Keyword: box_terminator
GUI Field Name: If this Job fails should the Box it is IN be Terminated?
Description: This attribute specifies whether or not the box containing this job should be terminated if the job fails or terminates. By using this attribute in combination with the Terminate the Job if the Box Fails attribute, you can control how nested jobs react when a job fails. This attribute only applies if the job is being placed in a box.
JIL Keyword: job_terminator
GUI Field Name: If the Box fails should this Job be Terminated?
Description: This attribute specifies whether or not the job should be terminated if the box it is in fails or terminates. By using this attribute in combination with the Terminate the Box if the Job Fails attribute, you can control how nested jobs react when a job fails. This attribute only applies if the job is being placed in a box.
JIL Keyword: n_retrys
GUI Field Name: Number of Times to Restart this Job after a FAILURE
Description: This attribute specifies how many times, if any, the job should be restarted after exiting with a FAILURE status. The default is 0, which means the job will not be automatically restarted after an application failure.
This attribute applies to application failures (for example, AutoSys is unable to find a file or a command, or permissions are not properly set); it does not apply to system or network failures (for example, machine unavailability, the socket connect timed out, the fork in the Remote Agent failed, or the file system space resource check failed). The number of restarts after system or network failures is specified using the MaxRestartTrys parameter in the configuration file.
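The restart count can be pictured as a simple retry loop. The sketch below is an illustration in Bourne shell, not how AutoSys itself restarts jobs (AutoSys restarts are driven by its own events); run_with_retries is a hypothetical name. It shows that a restart count of N allows up to N + 1 runs in total.

```shell
#!/bin/sh
# run_with_retries N CMD [ARGS...]
# Runs CMD once; after each nonzero exit, reruns it until it succeeds
# or N restarts have been used. Returns the status of the last run.
run_with_retries() {
    retries_left=$1
    shift
    while :; do
        "$@" && return 0
        status=$?
        if [ "$retries_left" -le 0 ]; then
            return "$status"
        fi
        retries_left=$((retries_left - 1))
    done
}
```

With a count of 0, the command runs exactly once, which matches the default behavior described above.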
JIL Keyword: timezone
GUI Field Name: Time Zone
Description: This attribute allows you to schedule a job based on a chosen time zone. When this attribute is used, the time settings in the job are based on the specified time zone. For example, if you define a start time of 01:00 for a job running on a machine in Denver, and enter San Francisco in the Time Zone field, the job will start at 1:00 a.m. Pacific time, which is 2:00 a.m. in Denver.
If you specify a time zone that includes a colon, you must quote the time zone name if you are using JIL. For example:
timezone: "IST-5:30"
If you do not quote a time zone specification that contains a colon, JIL will interpret the colon as a delimiter, producing unexpected results. Jobs with time-based starting conditions that do not specify a time zone will have their start event scheduled based on the TZ environment variable, which specifies the time zone under which the event processor is running.
JIL Keyword: auto_delete
GUI Field Name: Delete Job after completion?
Description: This attribute indicates whether or not the job definition should be automatically deleted after successful completion. A number of hours can be specified (including 0 for immediately), or the attribute can be turned off by specifying a negative value (for example -1), which is the default.
If auto_delete is set to 0, AutoSys will immediately delete job definitions only if the job completed successfully. If the job did not complete successfully, AutoSys will keep the job definition for seven days before automatically deleting it.
This attribute is useful for letting AutoSys schedule and run a one-time batch job.
Autohold
JIL Keyword: auto_hold
GUI Field Name: Autohold On?
Description: This feature is only for jobs in a box. When a job is in a box, it inherits the box's starting conditions. This means that when a box goes into the RUNNING state, the box job will start all the jobs within it (unless other conditions are not satisfied). This is typically the desired behavior; however, there are occasions when it is not. For example, you may want to place a job in a box, but not start the job until a non-job (that is, operating system level) event arrives.
By specifying yes to Autohold On, AutoSys automatically changes the job state to ON_HOLD when the box it is in begins RUNNING. At this point, the job is in exactly the same state as if it were manually placed on hold. To start the job, take the job off hold by sending the JOB_OFF_HOLD event through the Send Event dialog or the sendevent command.
Permissions
JIL Keyword: permission
GUI Field Name: Permissions
Description: In UNIX, there are three levels of permission by user ID (uid), and within AutoSys, two levels of job permission. AutoSys permissions are based on the UNIX user ID scheme. This scheme contains three types: owner, group, and world (or public). The owner is the user who originally created the job. The group is the primary group of which the owner is a member, and the world is every user with a valid logon for the system.
AutoSys also allows designation of an exec superuser and an edit superuser, by way of the autosys_secure command. The edit superuser can edit any job definition, change the owner of a job, and change the permissions assigned to a job; no other user has this authority. The exec superuser can shut down the event processor, and can issue sendevent commands to any job, regardless of its owner.
This is the permission scheme used in AutoSys:
Execute: If execute permissions are enabled for the user's group, allows the user to issue events that affect the running (starting, stopping, and so on) of the job. Users in primary and secondary groups have this permission.
Edit: If edit permissions are enabled for the user's group, allows the user to edit or delete the job definition. Only users in the primary group have this permission.
The AutoSys permission scheme is based on the same permissions used in the native UNIX environment. AutoSys uses the user ID (uid) and group ID (gid) specified in the UNIX environment to do the following:
- Control access to the job definitions themselves.
- Determine the execution permissions to be used when executing the actual UNIX command specified in the job.
Profile
JIL Keyword: profile
GUI Field Name: Job Environment Profile
Description: The profile attribute specifies the file to be sourced by the Bourne shell before the specified command is executed. The AutoSys remote agent always spawns a process and starts the Bourne shell in that process, passing it the name of the profile to be sourced. This profile typically includes definitions and exports of environment variables, which can be referenced in the job's command. The primary environment variable in the profile is $PATH.
If a profile is not specified, the default AutoSys profile, /etc/auto.profile, is used. If the profile attribute is specified, that profile is searched for on the machine on which the command is to run.
If a command that normally executes when entered at the command line fails when run as an AutoSys job, it is usually due to the incomplete specification of the required environment for the command in the AutoSys profile file.
Note: It is essential that no Korn shell or C-shell statements appear in the profile file, because the Bourne shell that AutoSys runs will not be able to process them. If you include these types of statements, unexpected results will occur, often interfering with the proper redirection of the stdin, stdout, and stderr files.
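As a minimal sketch of what a Bourne-compatible profile can look like, the example below writes a small profile file and then sources it before running a command. The names APP_HOME, write_example_profile, and run_with_profile, and the paths used, are hypothetical and chosen only for illustration; the exact way the remote agent sources the profile is not shown in this guide, so the subshell here is an approximation.

```shell
#!/bin/sh
# Write a hypothetical profile in Bourne shell syntax: assign first,
# then export, and use ${NAME} for variable substitution.
write_example_profile() {
    cat > "$1" <<'EOF'
APP_HOME=/opt/example_app; export APP_HOME
PATH=${APP_HOME}/bin:/usr/bin:/bin; export PATH
EOF
}

# run_with_profile PROFILE CMD [ARGS...]
# Source the profile with "." in a subshell, then run the command in
# the resulting environment.
run_with_profile() {
    profile=$1
    shift
    ( . "$profile" && "$@" )
}
```

Running a command through run_with_profile is also a quick way to check whether a profile on its own supplies everything a command needs, since only the profile's settings are added to the environment.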
JIL Keyword: std_in_file
GUI Field Name: File to Redirect to Standard Input
Description: The standard input file can be redirected to any file to which the job owner has read permission on the client machine. The full path name must be specified, although variables exported from the default /etc/auto.profile or the job's profile file, as well as global variables, can be used in the path name specification. The default is /dev/null.
JIL Keyword: std_out_file
GUI Field Name: File to Redirect to Standard Output
Description: The standard output file can be redirected to any file on the client machine to which the job owner has write permission. The full path name must be specified, although variables exported from the default /etc/auto.profile or the job's profile file, as well as global variables, can be used in the path name specification. The default is /dev/null.
By default, new information is appended to the file. By placing the following notation as the first characters in the std_out_file specification, you can specify whether the output file should be appended to or overwritten:
>     Overwrite file
>>    Append file
This setting overrides the instance-wide setting for the AutoInstWideAppend parameter in the AutoSys configuration file. It also overrides the machine-specific setting for the AutoMachWideAppend parameter in the /etc/auto.profile file.
Note: If you are running jobs across platforms, note that the event processor of the issuing instance controls the default behavior. If the issuing instance is Windows, the default is to overwrite this file.
JIL Keyword: std_err_file
GUI Field Name: File to Redirect to Standard Error
Description: The standard error file can be redirected to any file on the client machine to which the job owner has write permission. The full path name must be specified, although variables exported from the default /etc/auto.profile or the job's profile file, as well as global variables, can be used in the path name specification. The default is /dev/null.
By default, new information is appended to the file. By placing the following notation as the first characters in the std_err_file specification, you can specify whether the error file should be appended to or overwritten:
>     Overwrite file
>>    Append file
This setting overrides the instance-wide setting for the AutoInstWideAppend parameter in the AutoSys configuration file. It also overrides the machine-specific setting for the AutoMachWideAppend parameter in the /etc/auto.profile file.
Note: If you are running jobs across platforms, note that the event processor of the issuing instance controls the default behavior. If the issuing instance is Windows, the default is to overwrite this file.
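The prefix notation for std_out_file and std_err_file can be read as a small parsing rule. The sketch below is an assumption-laden illustration, not AutoSys code: parse_redirect_spec is a hypothetical name, the prefix is assumed to be followed immediately by the path with no intervening space, and the second argument stands in for whatever default the instance-wide and machine-wide settings produce.

```shell
#!/bin/sh
# parse_redirect_spec SPEC [DEFAULT_MODE]
# Prints "<mode> <path>", where mode is "append" or "overwrite".
# DEFAULT_MODE is used when SPEC has no prefix; it defaults to append.
parse_redirect_spec() {
    spec=$1
    default_mode=${2:-append}
    case $spec in
        '>>'*)
            path=${spec#>>}
            echo "append $path"
            ;;
        '>'*)
            path=${spec#>}
            echo "overwrite $path"
            ;;
        *)
            echo "$default_mode $spec"
            ;;
    esac
}
```

Checking for the two-character prefix before the one-character prefix matters, because every value that starts with >> also starts with >.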
Job Load
JIL Keyword: job_load
GUI Field Name: Job Load
Description: Machines can be assigned maximum job loads, which is a measure of the CPU load that is desirable for a machine at any given time. Similarly, jobs can be assigned loads, indicating the relative amount of processing power they consume. This scheme allows for machine loading to be controlled, and prevents a machine from being overloaded.
If a job is ready to run on a designated machine, but the current load on that machine is too large to accept the new job's load, the job will be queued for that machine, to be run when sufficient resources are available. For load balancing to function properly, all jobs to be run on a controlled machine must have job loads specified; otherwise, their impact on a machine cannot be measured.
Note: If you force a job to start, it will run even if its load exceeds the machine's max_load. Also, if job_load is specified for a job and no priority attribute (described below) is set, AutoSys uses the default priority of 0, which means ignore the job_load and run the job immediately.
For information about load balancing on machines, see the chapter Load Balancing and Queuing Jobs, in this guide.
Queue Priority
JIL Keyword: priority
GUI Field Name: Que Priority
Description: The queue priority establishes the relative priority of all jobs queued for a given machine, with the lower number indicating higher priority. If a job is ready to run on a designated machine, but the current load on that machine is too large to accept the new job's load, the job will be queued for that machine.
The priority attribute only influences the starting of jobs that are queued, unless the jobs are in a box. If jobs in a box have a priority attribute setting, they will be processed in order of priority, highest to lowest.
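The interaction between job_load, a machine's maximum load, and priority can be summarized as a small decision rule. The sketch below is a simplified model under stated assumptions, not the AutoSys scheduler: start_decision is a hypothetical name, and the test "current load plus job load must not exceed the maximum load" is an assumed reading of "too large to accept the new job's load".

```shell
#!/bin/sh
# start_decision CURRENT_LOAD MAX_LOAD JOB_LOAD [PRIORITY]
# Prints "run" or "queue". PRIORITY defaults to 0, which, as described
# above, means the job_load is ignored and the job runs immediately.
start_decision() {
    current_load=$1
    max_load=$2
    job_load=$3
    priority=${4:-0}
    if [ "$priority" -eq 0 ]; then
        echo run
    elif [ $((current_load + job_load)) -le "$max_load" ]; then
        echo run
    else
        echo queue
    fi
}
```

In this model, a job with a load of 10 arriving at a machine with a current load of 95 and a maximum load of 100 is queued when its priority is 1, but runs at once when no priority is set.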
Job Overrides
JIL Keyword: override_job
GUI Field Name: Edit One Time Over-Rides?
Description: You can specify a one-time job override for the next run of a particular job. An override lets you change the behavior of a job the next time the job runs. The following attributes can be modified in a job override:
auto_hold          min_run_alarm   std_in_file
command            n_retrys        std_out_file
condition          profile         term_run_time
date_conditions    run_calendar    watch_file
days_of_week       run_window      watch_file_min_size
exclude_calendar   start_mins      watch_interval
machine            start_times
max_run_alarm      std_err_file
For a description of how to use the GUI to specify job overrides, see Specifying One-Time Job Overrides in the chapter Defining AutoSys Jobs Using the GUI, in this guide.
JIL Keyword: max_exit_success
GUI Field Name: Maximum Exit Code for SUCCESS
Description: The maximum exit code for success attribute indicates what exit codes will be considered by AutoSys as a success. It is used when a command can exit with more than just a single exit code, indicating either degrees of success, or other conditions that may not indicate a failure. This attribute lets you define complex branching logic based on specific exit code values.
AutoSys reserves exit codes greater than 120 for internal use, so do not use exit codes of 120 or greater.
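The effect of this attribute can be sketched as a threshold comparison. The function below is an illustration only: exit_status_for is a hypothetical name, and the default threshold of 0 is an assumption based on the general convention that a zero exit code indicates success.

```shell
#!/bin/sh
# exit_status_for EXIT_CODE [MAX_EXIT_SUCCESS]
# Prints SUCCESS if EXIT_CODE is at or below the threshold, otherwise
# FAILURE. The threshold is assumed to default to 0.
exit_status_for() {
    exit_code=$1
    max_exit_success=${2:-0}
    if [ "$exit_code" -le "$max_exit_success" ]; then
        echo SUCCESS
    else
        echo FAILURE
    fi
}
```

For example, with a threshold of 4, a command that exits with code 2 to report warnings is treated as successful, while an exit code of 5 is treated as a failure.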
Average Runtimes
JIL Keyword: avg_runtime
GUI Field Name: (JIL only)
Description: The avg_runtime attribute is used to provide an average runtime (in minutes) for a job that is newly submitted to the AutoSys database; it establishes this value in the absence of the job having been run multiple times. This attribute is used solely to establish an average runtime for the new job in the avg_job_runs table, which in turn can be used for projections and simulations in AutoSys/Xpert.
Heartbeat-Interval
JIL Keyword: heartbeat_interval
GUI Field Name: Heartbeat Interval (mins)
Description: In AutoSys, heartbeats are a means of monitoring a job's progress. They automate the common practice of outputting characters, similar to displaying progress asterisks across the screen as a process runs. If a job does not send a heartbeat within this specified interval, a HEARTBEAT alarm is generated. The heartbeat interval is specified in minutes.
To send a heartbeat from a C program, call the routine found in the following source file:
$AUTOSYS/code/heartbeat.c
To send a heartbeat from a Bourne shell script, execute the code found in the following file:
$AUTOSYS/code/heartbeat.sh
The event processor must be configured to check for heartbeats. To do this, modify the configuration file, which has the following name:
$AUTOUSER/config.$AUTOSERV
For more information about the AutoSys configuration file, see Sample Configuration File in the chapter Configuring AutoSys, in this guide. For information on sending heartbeats, see Sending Heartbeats in the chapter AutoSys API in the Unicenter AutoSys Job Management Reference Guide.
JIL Keyword: chk_files
GUI Field Name: Resource Check - File System Space...
Description: This attribute specifies a minimum amount of file space that must be available on the designated file systems for the job to be started. One or more file systems, specified with full path names or directory names, and their corresponding sizes, can be specified. If multiple file systems are specified, separate them with a single space. When the remote agent is preparing to start the job on the client machine, it checks whether the required space is available before starting the job. If the requirements are not met, an alarm is generated and AutoSys automatically reschedules the job to start again after a delay. It will perform the same resource check the next time it attempts to start. This feature is intended to prevent a job that is known to require large amounts of file space from failing due to a shortage of space during processing time.
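The resource check described above amounts to comparing the free space on each listed file system with a required minimum. The following Python sketch shows that comparison using the standard library; it is not the remote agent's implementation, the function name `enough_space` is hypothetical, and kilobytes are assumed as the unit because the text above does not state one.

```python
import shutil


def enough_space(requirements):
    """Return True when every listed file system has the required free space.

    requirements maps a path on the file system to the free space needed,
    in kilobytes (an assumed unit for this sketch).
    """
    for path, required_kb in requirements.items():
        free_kb = shutil.disk_usage(path).free // 1024
        if free_kb < required_kb:
            return False  # the job would not start; an alarm would be raised
    return True


# Check the file system that holds the current directory.
print(enough_space({".": 100}))
```

A scheduler built around this check would retry after a delay when the function returns False, which mirrors the reschedule behavior described above.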
JIL Keyword: watch_file_min_size
GUI Field Name: Minimum File Size (in Bytes)
Description: The watch file minimum size determines when enough data has been written to the file to consider it complete. This attribute is specified in bytes. You should specify a reasonable file size to ensure that a nearly empty file is not assumed to be complete. Use caution with this attribute. If you specify a large file size, AutoSys will wait for the file to reach that size, even if the file has reached a steady state and is no longer growing.
Watch Interval
JIL Keyword: watch_interval
GUI Field Name: Time Interval (secs) to Determine Steady State
Description: The watch interval specifies (in seconds) how often the File Watcher should check the current file size to ascertain whether data is still being written to the file. The default is every 60 seconds.
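The interaction between the watch interval and the minimum file size can be sketched as follows. This is a simplified model, not the File Watcher's code: the function name `file_is_complete` is hypothetical, and a steady state is assumed to mean that two consecutive size checks return the same value.

```python
def file_is_complete(size_samples, min_size):
    """Decide whether a watched file can be considered complete.

    size_samples holds the file size in bytes at each watch interval check,
    oldest first. The file is complete when the last two checks agree
    (steady state) and the size has reached min_size.
    """
    if len(size_samples) < 2:
        return False
    previous, current = size_samples[-2], size_samples[-1]
    return previous == current and current >= min_size


print(file_is_complete([10000, 40000, 50000], 50000))  # still growing
print(file_is_complete([40000, 60000, 60000], 50000))  # steady and large enough
print(file_is_complete([100, 100], 50000))             # steady but too small
```

The last case shows why a large minimum size needs care: the file has stopped growing, but the watcher keeps waiting because the size threshold has not been met.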
JIL Keyword: chk_files
GUI Field Name: Resource Check - File System Space...
Description: This attribute specifies a minimum amount of file space that must be available on designated file systems for a command job to be started. One or more file systems, specified with full path names or directory names, and their corresponding sizes can be specified. When the remote agent is preparing to start the job on the client machine, it checks whether the required space is available before starting the job. If the requirements are not met, an alarm is generated. File watcher jobs will still be started.
JIL Keyword: box_success
GUI Field Name: SUCCESS Conditions
Description: The default condition required for a box to be considered successful is that every job in the box must have completed with a success condition. A box can contain complex branching logic, which can take a number of different paths, all of which constitute a success. In this case, some jobs in the box may never need to run; but if the default box behavior is applied, the jobs that did not run would prevent the box from ever completing. This attribute can be used to specify what is considered a success, which could be as simple as the success of a single job, or as complex as necessary. This attribute is only displayed in the GUI when you select a box job type.
Box Failure
JIL Keyword: box_failure
GUI Field Name: FAILURE Condition
Description: The default condition required for a box to complete with a FAILURE status is that all jobs in the box have completed and one or more jobs in the box completed with a failure condition. A box can contain complex branching logic, which may take a number of different paths, one of which may include recovery from a failed job. In this case, you may want the box to be considered successful, even though a job within it failed. This attribute can be used to specify what will be considered a failure, which could be as simple as the failure of a single job, or as complex as necessary. This attribute is only displayed in the GUI when you select a box job type.
When this change occurs, time runs 1:58 ST, 1:59 ST, 3:00 DT, 3:01 DT, and the 2:00 to 2:59 hour is lost.
In the fall, at 2 a.m., the clocks fall back to 1 a.m. In most of the United States, this happens on the fourth Sunday in October. The following figure illustrates the change from daylight saving time to standard time.
When this change occurs, time runs 1:58 DT, 1:59 DT, 1:00 ST, 1:01 ST,..., 2:00 ST, 2:01 ST, and the 1:00 to 1:59 hour is repeated.
Relative times are specified with respect to either the current time, or relative to the start of the hour. For example, start a job at 10 and 20 minutes after the hour, or terminate a job after it has run for 90 minutes. Relative time dependent job attributes include:
During the time change, absolute time attributes behave differently from relative time attributes, as described in the following sections.
Spring Time Change
During the change to daylight saving time in the spring, the 2:00-2:59 hour is lost; therefore, AutoSys cannot schedule any jobs during that non-existent hour. The AutoSys solution is to schedule jobs with absolute time dependencies for the missing hour to start within the first minute of the 3:00 DT hour. For example, a job scheduled to run on Sundays at 2:05 will run at 3:00:05 that day; a job scheduled to run every day at 2:45 will run at 3:00:45. Although it may not be possible to start a large number of jobs within the first minute of the hour, this feature does somewhat preserve the scheduling order.
If you schedule a job to run more than once during the missing hour, for example, at 2:05 and 2:25, only the first scheduled start runs. Any additional start times for the same job in the missing hour are ignored.
Relative time dependencies, such as start_mins, will run as you would expect. For example, a job specified to run at 0, 20, and 40 minutes after the hour will be scheduled for 1:00 ST, 1:20 ST, 1:40 ST, 3:00 DT, 3:20 DT, and 3:40 DT.
Relative interval calculations, such as max_run_alarm, min_run_alarm, term_run_time, and watch_interval, are still calculated in minutes from when the job started. For example, if the Sunday 2:05 job has a term_run_time of 90 minutes, the job will start shortly after 3:00 and its term_run_time will expire at 4:30.
Therefore, two jobs that appear to have the same times specified, but use start_times versus start_mins, will not behave the same way. For example, job Jrel has start minutes of 10 and 20 minutes after the hour, and job Jabs has start times of 1:10, 1:20, 2:10, 2:20, 3:10, and 3:20. Jrel will run at 1:10, 1:20, 3:10, and 3:20. Jabs will run at 1:10, 1:20, 3:00, 3:10, and 3:20.
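The spring rules for start_times and start_mins can be modeled with a short Python sketch. It is an illustration of the behavior described above, not AutoSys code; the function names are hypothetical, and times are represented as (hour, minute, second) tuples for the day of the change.

```python
def parse_hhmm(text):
    """Convert "H:MM" into an (hour, minute) tuple."""
    hour, minute = text.split(":")
    return int(hour), int(minute)


def spring_start_times(start_times):
    """Absolute start times: a time in the missing 2:00 hour moves to 3:00:MM,
    and only the first such time is kept."""
    runs = []
    missing_hour_used = False
    for hour, minute in sorted(parse_hhmm(t) for t in start_times):
        if hour == 2:
            if not missing_hour_used:
                runs.append((3, 0, minute))  # 2:MM runs at 3:00:MM
                missing_hour_used = True
            continue  # later start times in the missing hour are ignored
        runs.append((hour, minute, 0))
    return sorted(runs)


def spring_start_mins(start_mins, hours=(1, 3)):
    """Relative start minutes simply run in every hour that exists."""
    return [(hour, minute, 0) for hour in hours for minute in sorted(start_mins)]


jabs = spring_start_times(["1:10", "1:20", "2:10", "2:20", "3:10", "3:20"])
jrel = spring_start_mins([10, 20])
print(jabs)
print(jrel)
```

Jabs gains a run in the first minute of the 3:00 hour in place of its 2:10 start, while Jrel skips the missing hour entirely, which matches the Jrel and Jabs comparison above.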
Run Windows
Run windows are treated a bit differently. If the specified closing of the run window falls within the missing hour, its recalculated closing time is moved forward an hour, so that the effective duration of the run window remains the same. For example, a run window of 1:00 - 2:30 will have its closing time moved to 3:30, so that the run window still remains open for an hour and a half.
If the specified opening of the run window falls within the missing hour, its opening time is moved to 3:00. The closing time is not altered; therefore, the run window is shortened. For example, a run window of 2:45 - 3:45 will become 3:00 - 3:45, and the actual elapsed time of the run window will be 15 minutes shorter.
If both the specified opening and closing of the run window are within the missing hour, its opening time is moved to the first minute after 3:00, and its closing time is pushed forward one hour. Therefore, the resultant run window may be lengthened. For example, a run window of 2:15 - 2:45 will become 3:00 - 3:45, or 15 minutes longer.
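The three run window cases can be expressed as one small function. This Python sketch only restates the rules above; the names are hypothetical, and times are given as minutes after midnight on the specified clock.

```python
MISSING_START = 2 * 60  # 2:00, first minute of the missing hour
MISSING_END = 3 * 60    # 3:00, first minute after the missing hour


def to_minutes(text):
    """Convert "H:MM" into minutes after midnight."""
    hour, minute = text.split(":")
    return int(hour) * 60 + int(minute)


def in_missing_hour(minute_of_day):
    return MISSING_START <= minute_of_day < MISSING_END


def spring_run_window(open_text, close_text):
    """Return the adjusted (open, close) of a run window, in minutes."""
    open_min, close_min = to_minutes(open_text), to_minutes(close_text)
    opens_in_gap = in_missing_hour(open_min)
    closes_in_gap = in_missing_hour(close_min)
    if opens_in_gap:
        open_min = MISSING_END   # opening moves to 3:00
    if closes_in_gap:
        close_min += 60          # closing is pushed forward one hour
    return open_min, close_min


for window in (("1:00", "2:30"), ("2:45", "3:45"), ("2:15", "2:45")):
    print(window, spring_run_window(*window))
```

The results are 1:00 to 3:30, 3:00 to 3:45, and 3:00 to 3:45 respectively (60 to 210, 180 to 225, and 180 to 225 in minutes), matching the three examples above.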
Fall Time Change
During the change from daylight saving to standard time in the fall, there are two 1:00-1:59 hours. Jobs with start_times set between 1:00 and 1:59 will run only in the second, or Standard Time, hour. Jobs with start_mins settings will run in both hours. For example, a job scheduled to run on Sundays at 1:05 will run only at the second 1:05. A job scheduled to run every 30 minutes will run at 1:00 DT and 1:30 DT, then again at 1:00 ST and 1:30 ST, and so on (as shown in the following figure).
Jobs that are not time-based, but have other dependencies, will still run during the first hour. Relative interval calculations, such as max_run_alarm, min_run_alarm, term_run_time, and watch_interval are still calculated in minutes out from when the job started. For example, if a job is scheduled to run on Sunday at 0:30, and has a term_run_time of 120 minutes, the job would normally be terminated at 2:30. On the day of the fall time change, it will terminate at 1:30 Standard Time, which is 120 minutes after the job started.
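The fall arithmetic is easier to follow when elapsed minutes are kept separate from wall-clock labels. The sketch below is a simplified model of the day of the change, assuming the clock falls back after 120 elapsed minutes; the function name `fall_wall_clock` is hypothetical.

```python
def fall_wall_clock(elapsed_minutes):
    """Wall-clock label for a number of minutes elapsed since midnight
    on the day of the fall time change."""
    if elapsed_minutes < 120:       # before the clock falls back at 2:00 DT
        zone, clock = "DT", elapsed_minutes
    else:                           # one hour has been repeated
        zone, clock = "ST", elapsed_minutes - 60
    return f"{clock // 60}:{clock % 60:02d} {zone}"


# A job that runs every 30 minutes, covering both 1:00 hours.
print([fall_wall_clock(m) for m in (60, 90, 120, 150)])

# A job started at 0:30 with a term_run_time of 120 minutes.
print(fall_wall_clock(30 + 120))
```

The first line lists 1:00 DT, 1:30 DT, 1:00 ST, and 1:30 ST; the second gives 1:30 ST, which is 120 elapsed minutes after the 0:30 start even though the clock reads only one hour later.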
Testing the Fall Time Change
During the fall time change from daylight saving time to standard time, your operating system automatically falls back one hour from 2 a.m. to 1 a.m., causing the hour from 1 a.m. to 2 a.m. to be repeated.
When testing this time change, you must set the clock to a time before 1 a.m. and allow the entire hour to pass before you can observe the time change. If you manually set the time to a period within the 1 a.m. to 2 a.m. window, the system will assume that the time change has already occurred and will not reset at 2 a.m.
Run Windows
Run windows are treated a bit differently. If the specified opening of a run window is before the time change, and its specified closing falls within the repeated hour, it will close during the daylight saving, or first, hour. For example, a run window of 11:30 - 1:30 will have a closing time of 1:30 DT, not 1:30 ST, which means that the run window remains open for its specified two hours. This may be a problem if the job also has start times that occur during the repeated hour. In the example above, if the job also had a start time of 1:15, the start time would be calculated for 1:15 ST, and the job would not run on the day of the time change.
If the specified opening of the run window falls within the repeated hour, its opening time is moved to the second, Standard Time, hour. The closing time is not altered; therefore, the length of the run window remains the same. For example, a run window of 1:45 - 2:45 will become 1:45 ST to 2:45, or the same hour in length.
If both the specified opening and closing of the run window are within the repeated hour, the run window will be open during the second, Standard Time, hour.
Chapter
This chapter explains how box jobs work, including default box behavior and how to override the default behavior. It also explains what types of jobs should and should not be placed in a box. To illustrate box logic, several examples of box job definitions and job streams are provided in Examples.
Jobs run only once per box execution. Jobs in a box will start only if the box itself is running. Boxes should be used primarily for jobs with the same starting conditions. A box used to group sequential jobs is limited to 1,000 jobs. As long as any job in a box is running, the box remains in RUNNING state; the box cannot complete until all jobs have run. By default, a box will return a status of SUCCESS only when all the jobs in the box have run and the status of all the jobs is success. Default SUCCESS is described in Default Box Success and Box Failure.
By default, a box will return a status of FAILURE only when all jobs in the box have run and the status of one or more of the jobs is failure. Default FAILURE is described in Default Box Success and Box Failure.
Unless otherwise specified, a box will run indefinitely until it reaches a status of SUCCESS or FAILURE. For a description of how to override this behavior, see Box Job Attributes and Terminators. Changing the state of a box to INACTIVE (through the sendevent command) changes the state of all the jobs in the box to INACTIVE.
Simple Box Job
In this example, a box named simple_box contains three jobs: job_a, job_b, and job_c. job_a and job_b have no starting conditions; the starting condition for job_c is the success of job_b.
When simple_box starts running, all jobs change to state ACTIVATED. Because job_a and job_b have no additional starting conditions, they will start running. After job_b completes successfully, job_c will start. When job_c completes successfully, the box completes with status of SUCCESS. If job_b fails, job_c will not start; it will remain in ACTIVATED state. Because no contingency conditions have been defined, simple_box will continue running indefinitely, waiting for the default completion criteria to be met, namely that all jobs in the box ran.
Example of a Non-Default Success Condition
Using the above simple box example, assume you defined the following success condition for simple_box:
box_success: success(job_a)
If job_a runs successfully, and job_b is still running, job_c would pass from ACTIVATED state directly to INACTIVE state without ever running because the box it is in would no longer be running. When overriding default box terminators, be careful that you do not define conflicting success and failure conditions.
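The difference between the default completion rule and a user-defined success condition can be sketched in Python. This is a simplified model, not the event processor's logic: only the statuses named in this chapter are considered, the function name `evaluate_box` is hypothetical, and it is an assumption of the sketch that the default FAILURE rule still applies when a custom success condition is not met.

```python
FINISHED = {"SUCCESS", "FAILURE"}


def evaluate_box(job_status, box_success=None):
    """Return the box status for a dict of job name -> job status.

    box_success is an optional function standing in for a box_success
    condition; when omitted, the default rule is used.
    """
    statuses = list(job_status.values())
    all_ran = all(status in FINISHED for status in statuses)
    if box_success is not None:
        if box_success(job_status):
            return "SUCCESS"
    elif all_ran and all(status == "SUCCESS" for status in statuses):
        return "SUCCESS"
    if all_ran and "FAILURE" in statuses:
        return "FAILURE"
    return "RUNNING"


# job_b failed, so job_c never leaves ACTIVATED and the box keeps running.
stuck = {"job_a": "SUCCESS", "job_b": "FAILURE", "job_c": "ACTIVATED"}
print(evaluate_box(stuck))

# With box_success: success(job_a), the box completes as soon as job_a does.
early = {"job_a": "SUCCESS", "job_b": "RUNNING", "job_c": "ACTIVATED"}
print(evaluate_box(early, box_success=lambda s: s["job_a"] == "SUCCESS"))
```

The first call returns RUNNING and the second returns SUCCESS, which is the point at which job_c would pass from ACTIVATED to INACTIVE without running.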
At 3:00 a.m., bx_stat starts running, which causes job_a to start running. If job_a is successful, job_report runs and all goes as expected. However, if job_a fails, it will not be able to run again until the next time the box starts, because jobs run only once per box execution. job_report will still be ACTIVATED waiting for the success of job_a, and the status of the box will be RUNNING. The box will remain in this state indefinitely.
You can also execute this by selecting the Force Start Job button in the Job Activity Console. If you force start a job in a box, the state of the box influences whether or not other jobs in the box will run as expected, as shown in the following example.
In the previous figure, if the job run_stats fails, the bx_report box job will terminate because run_stats has a box_terminator attribute. If you force start run_stats, and it completes successfully, report_stats would still not start because the box it is in is not running. The next section discusses how job status changes influence the status of the container box.
If another AutoSys job is dependent on the status of the box, the status change could trigger the job to start. If the box status does not change, dependent jobs are not affected. If the box contains other jobs in addition to the job that changed status, the status of the box will be evaluated again according to the success or failure conditions assigned to the box (either the default or user-assigned). Any jobs in the box with a status of INACTIVE are ignored when the status of the box is being reevaluated. For example, consider an INACTIVE box that contains four jobs, all with a status of INACTIVE (this is typical of a newly created box). If one of the jobs is force started and completes successfully, the status of the box will change to SUCCESS even though none of the other jobs ran.
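The reevaluation rule, in which INACTIVE jobs are ignored, can be sketched as follows. The function name `reevaluate_box` is hypothetical, and returning None to mean "leave the box status unchanged" is an assumption of this sketch rather than documented behavior.

```python
def reevaluate_box(job_status):
    """Reevaluate a box after one of its jobs changes status.

    Jobs that are INACTIVE are ignored. Returns "SUCCESS" or "FAILURE"
    under the default rules, or None when no completion status applies.
    """
    considered = [s for s in job_status.values() if s != "INACTIVE"]
    if not considered:
        return None
    if any(s not in ("SUCCESS", "FAILURE") for s in considered):
        return None  # something is still running or waiting
    return "FAILURE" if "FAILURE" in considered else "SUCCESS"


# A new box with four INACTIVE jobs, one of which was force started.
box = {"job_1": "SUCCESS", "job_2": "INACTIVE",
       "job_3": "INACTIVE", "job_4": "INACTIVE"}
print(reevaluate_box(box))
```

Because the three INACTIVE jobs are skipped, the single successful job is enough for the box to be marked SUCCESS, as in the four-job example above.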
Examples
Spend some time studying the examples in this section. They will help explain the logic of job flow in a box and reduce your chances of creating unexpected box behavior.
The box job named bx_daily_update has date and time conditions specified for its starting conditions; it runs every day of the week at 3:00 a.m. This box contains three command jobs whose overall purpose is to update files and generate a report.
The command job named job_update updates a set of files. It is defined as being inside bx_daily_update. It will run as soon as bx_daily_update starts because it has no other starting conditions. This job has a box_terminator attribute; therefore, if this job fails, the box containing this job will be terminated.
The command job named job_run_stats runs statistics on the updated files. It is defined as being inside bx_daily_update. It will run only on the successful completion of the job named job_update. This job has a box_terminator attribute; therefore, if this job fails, the box containing this job will be terminated.
The command job named job_report_stats reports on the statistics generated by job_run_stats. It is defined as being inside bx_daily_update. It has a job dependency condition specified for its starting parameter. It will run only on the successful completion of the job named job_run_stats.
The command job named job_trigger_msg has a job dependency condition specified for its starting parameter. It will run only on the FAILURE of the box job named bx_daily_update. This job pages an operator so that the problem can be investigated.
Scenario on the Second of the Month
On days of the month other than the 1st, job_Fwatch and job_monthly do not run. They still have a status of SUCCESS in the AutoSys database from the previous run on the first of the month. As a result, job_daily will still run.
Scenario I on the First of the Month
On the first of the next month (for example, February 1), the file from the mainframe fails to arrive; therefore, job_monthly does not run for the month. However, its event status in the AutoSys database is still SUCCESS from the previous month, and as a result, job_daily runs in error.
Scenario II on the First of the Month
To fix statuses that are time-related, you can use a sendevent command to change them to INACTIVE at the end of their valid time period. You can create another job to do this automatically.
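The stale status problem in these scenarios can be modeled with a small status table. This Python sketch is illustrative only: it assumes that the starting condition of job_daily is the success of job_monthly, and the helper names are hypothetical. It does not issue any sendevent commands; it only shows the effect of resetting the statuses.

```python
def job_daily_can_start(status):
    """Assumed starting condition for job_daily: success(job_monthly)."""
    return status.get("job_monthly") == "SUCCESS"


def reset_to_inactive(status, job_names):
    """Model the end-of-period reset of time-related statuses."""
    for name in job_names:
        status[name] = "INACTIVE"


# Statuses left over from last month's successful run.
status = {"job_Fwatch": "SUCCESS", "job_monthly": "SUCCESS"}
print(job_daily_can_start(status))   # stale SUCCESS still satisfies the condition

# After the valid period ends, the statuses are changed to INACTIVE.
reset_to_inactive(status, ["job_Fwatch", "job_monthly"])
print(job_daily_can_start(status))   # job_daily now waits for a fresh SUCCESS
```

Once the reset has been applied, a missing file on the first of the month leaves job_monthly INACTIVE, so job_daily no longer runs in error.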
Scenario III on the First of the Month
Instead of issuing a sendevent command to change the status of the jobs, you could put the monthly process in a box, and set box_failure or box_terminator appropriately.
Chapter
Ops Console: Displays the Job Activity Console, used to monitor AutoSys jobs and alarms.
Job Definition: Displays the Job Definition dialog, used to define AutoSys jobs.
Calendars: Displays the Calendar Definition window, used to define AutoSys run and exclude calendars.
Monitor/Browser: Displays the Monitor/Browser dialog, used to define and run monitors and reports (or browsers).
HostScape: Displays the AutoSys/Xpert HostScape window, used for viewing AutoSys machines (and their jobs) in real-time or simulated mode.
JobScape: Displays the AutoSys/Xpert JobScape window, used for viewing job progressions (and their dependencies) in real-time or simulated mode.
TimeScape: Displays the AutoSys/Xpert TimeScape window, used for viewing job progressions (across time) in real-time, real-time with projection, or simulated mode.
Exit: Exits the GUI.
Note: The three AutoSys/Xpert buttons are disabled if you have not purchased the AutoSys/Xpert product.
The buttons at the top of the dialog perform the following actions:
Clear: Clears the dialog without affecting the database. Use this button to clear all fields in the dialog (and in memory).
Delete: Deletes the currently displayed job from the database.
Save: Stores the currently displayed job in the database, either modifying a pre-existing job, or creating a new one. It also clears the dialog in preparation for another job definition.
Adv Features: Displays the Job Definition Advanced Features dialog, which is used for all but the simplest of job definitions.
Exit: Closes the Job Definition dialog. If Exit is pressed without first pressing Save, the latest changes will not be saved.
The fields in the Job Definition dialog are context sensitive based on the type of job being defined. When you select a Job Type, only the fields appropriate to that type of job are displayed and activated, and the other fields are disabled. In this dialog, there is a special field named Edit One-Time Overrides? which is used to specify that certain job attributes be applied for the very next run of the job. When the Yes radio button in this field is selected, only those fields that can be used for overrides are active, and the remaining fields are disabled.
The buttons at the bottom of the dialog perform the following actions:
Calendars: Accesses the AutoSys Graphical Calendar facility.
Dismiss: Closes the Date/Time Options dialog.
In addition to setting specific date and time starting conditions, at this dialog you can specify (and search for) any run or exclude calendars that have been defined for the job. To do this, you use one of the provided Search buttons. All entries made in the dialog are maintained in memory after you close the dialog. They are saved only when you select Save at the Job Definition dialog.
The Job Definition Advanced Features dialog has fields for all the additional features that can be specified for a job. Many of these fields are organized by job type, and they only pertain to the type of job on which you are working.
Note: The External Application field is disabled. It is a placeholder for future functionality.
These are the control buttons in this dialog:
Dismiss: Closes the dialog; any entries made in the dialog are maintained in memory, so that a subsequent Save will save this information.
Save&Dismiss: Closes the dialog and saves the job, including the advanced features information, to the database.
In the Job Type field, click the Command radio button. In the Execute on Machine field, enter the name of the machine on which the command will be executed. You should enter your own valid, licensed client machine name. In the Command to Execute field, enter the command to be executed:
/bin/echo "AUTOSYS install test run"
Your entries in the Job Definition dialog should look like the following:
Note: The Owner field for the job defaults to the currently logged on user, not the user shown in these examples.
Save the example job: At the top of the Job Definition dialog, click the Save button. Leave the Job Definition dialog on your screen to use for the next example.
In the Job Type field, click the File Watcher radio button. In the Execute on Machine field, enter the name of the machine on which the command will be executed. You should enter your own valid, licensed client machine name. In the File To Watch For field, enter the file name:
/usr/common/EOD_trans_file
Your entries in the Job Definition dialog should look like the following:
Set the file watching criteria: Click the Adv Features button at the top of the Job Definition dialog, and the Job Definition Advanced Features dialog appears.
At the top-center of the dialog are the File Watching Criteria. In this region of the dialog, enter the following information:
1. In the Time Interval (secs) to Determine Steady State field, enter the time interval:
60
AutoSys will check for the file's existence every 60 seconds, and it will check whether the file has grown between checks; if it has not changed in size, this is called a steady state.
2. In the Minimum File Size (in BYTES) field, enter the minimum file size, which should be reached before the file can be considered complete:
50000
The file must have reached this minimum size, and must have reached a steady state, before the file watcher job will complete with a SUCCESS condition.
3. To save the job and dismiss the Job Definition Advanced Features dialog, click the Save&Dismiss button.
In the Job Type field, click the Command button. In the Starting Condition field, enter the only starting condition, in this case the successful completion of the file watcher job:
success(EOD_watch)
In the Execute on Machine field, enter the name of the machine on which the file watcher will run. You should enter your own valid, licensed client machine name. In the Command to Execute field, enter the command that is to run when the file watcher completes, like this:
$HOME/post
The environment variable $HOME means that the post executable is located in the job owner's home directory.
Note: The job's execution environment is determined exclusively by the profile, which is sourced immediately before the job is started. By default, the file /etc/auto.profile is sourced. This can be overridden by specifying another profile in the File to Define Job Environment field in the Job Definition Advanced Features dialog. For information on the profile job attribute, see the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide.
The filled-in Job Definition dialog should look like the following:
Save the job: Click the Save button at the top of the Job Definition dialog.
In the Job Type field, click the Box button. When you select the box job type, the lowest section of the Job Definition dialog changes from Command & File Watch Information to Box Completion Conditions.
In the Starting Condition field, enter the only starting condition, in this case the successful completion of the file watcher job:
success(EOD_watch)
To save the job: Click the Save button at the top of the Job Definition dialog.
Changing a Job
This exercise will change an existing job. You should make sure a job is not running before you modify or delete it. In this exercise, you will place the EOD_post job, created previously, in the newly created box.
To load an existing job into the Job Definition dialog, you can either enter its name explicitly in the Job Name field and click the Search button, or you can use the Search Facility. For this exercise, you will use the search facility. You can enter some portion of the job name, followed by the % wildcard character. The percent (%) character will match any string of one or more characters in the job name. For instance, %box% will match any job name with the string box anywhere in the name.
To use the search facility, in the Job Name field, enter:
EOD%
Click the Search button below the Job Name field. A Selection List Box similar to the one shown in the following graphic appears containing all the jobs currently defined in the database that start with the string EOD. (The list box shown below may not match yours exactly, since you can have other jobs defined.)
Typing just the percent (%) wildcard character will display all the jobs defined in the database.
Double-click the desired job's name, in this case, EOD_post. This will automatically dismiss the Selection List Box and display the requested job in the Job Definition dialog. If the job you wanted was not in the list, you could click the Cancel button to dismiss the Selection List Box without making a selection.
Place the EOD_post job in the box: 1. In the Name of Box this Job is IN field, enter the box name:
EOD_Box
This field also has a search facility, which works the same as the Job Name search facility, complete with wildcarding using the % character. 2. In the Starting Condition field, delete the string:
success(EOD_watch)
This starting condition has been assigned to EOD_Box. Now that this job is in a box, it will inherit the starting condition of the box. 3. Click the Save button at the top of the dialog to save the changes.
Click the Search button to display the specified job. In the Starting Parameters region of the Job Definition dialog, locate the Is the Start Date/Time Dependent? field and click the Yes button. Click the Date/Time Options button. The Date/Time Options dialog appears:
In the Date region of the dialog, select the days on which the job is to run. In this case, click the Monday, Wednesday, and Friday buttons. These buttons can be toggled on and off and are not mutually exclusive.
In the Time region of the dialog, select the times when the job is to run. In this case, click the Times of day field, then enter:
10:00, 14:00
Unlike in JIL, the times do not have to be enclosed in quotes when they are entered in the GUI. 8. To close the Date/Time Options dialog, click the Dismiss button. The information is retained in memory until the job is either saved to the database or cleared. To save the new start date/time information for this job in the database click the Save button in the Job Definition dialog.
Deleting a Job
Now you will delete the test_run job, which you just modified. You delete all jobs, regardless of type, in exactly the same way. To delete a job, you can either enter its name explicitly in the Job Name field, or use the search facility. In this final exercise, you will use the search facility. You can enter some portion of the job name, followed by the percent (%) character, or just enter the percent (%) character alone, for a global search.
To delete a job, first locate it. In the Job Name field, enter:
%
Click the Search button to initiate the search. A Selection List Box appears, allowing you to select the desired job.
Double-click the desired job's name, in this case, test_run. This will automatically dismiss the Selection List Box and display the requested job in the Job Definition dialog. Verify that the test_run job is displayed in the Job Definition dialog.
Note: No confirmation dialog appears when you delete a job using the GUI, so ensure that you are deleting the right job.
To delete the job from the database, at the top of the Job Definition dialog, click the Delete button.
Edit Onetime OverRides? Starting Condition Is Start Date/Time Dependent? Execute on Machine Command to Execute File To Watch for
Region Alarms
Field
Minimum Runtime Maximum Runtime Time Interval (secs) to Determine Steady State Minimum File Size (in BYTES) Terminate this job Mins after starting Job Environment Profile File to Redirect to Standard Input File to Redirect to Standard Output File to Redirect Standard Error External Application Number of Times to Restart this Job after a FAILURE Delete Job hours after Completion AutoHold On? (all date settings) Run on Days in Calendar Do NOT Run on Days in Calendar (Exclude) (all time settings) Run Window
Terminators
Command Information
Misc. Features
Date/Time Options
DATE
TIME
Save the job overrides: Click the Save button at the top of the Job Definition dialog. This action returns you to job definition mode.
Delete the job overrides:
1. At the Job Definition dialog, enter the name of the job for which you want to delete the overrides and click the Search button.
2. In the Edit OneTime Over-Rides? field, click the Yes button.
3. Click the Delete button. This action deletes the job overrides.
The time interval after which the Monitor/Browser GUI will drop the connection to the database. The Monitor/Browse GUI icon text and the Monitor/Browse title bar text.
Descriptions of the resources that can be customized follow. All of these can be set by modifying the X resource file Autosc. The X resource files reside in the local app-defaults directory, which varies across platforms. It is usually /usr/lib/X11/app-defaults or /usr/openwin/lib/app-defaults. If you are not sure which directory these files are in, ask your system administrator. Individual users may have their own copy of the X resource files in their $HOME directory, which will take precedence over the app-defaults files. For most operating systems, if you are exporting the display to another machine, you must edit the appropriate files in the app-defaults directory on the local machine. For Solaris, you must edit the files in both the /usr/lib/X11/app-defaults and /usr/openwin/lib/app-defaults directories. The files in /usr/lib/X11/app-defaults control the resources when you export the display.
If DBDropTime is set to 0, the connection is dropped immediately after the database query has completed. A value greater than or equal to 5 means that the GUI will automatically drop all database connections if the database has not been accessed in the last DBDropTime minutes. (Values of 1 to 4 are invalid). A new database connection will subsequently be established when required. If DBDropTime is greater than 360, the connection to the database is maintained until the GUI screens are exited.
Note: When changing icon text, be sure the length of the new text string does not exceed the recommended maximum length for icon title text for your windowing system. Some window managers can display long icon text strings, while others will truncate them. Ensure the text string you specify for your icons displays appropriately. Also, some window managers allow you to change the size of icons and icon text font.
Chapter
This chapter describes how to define jobs using the AutoSys Job Information Language or JIL. It also provides information about creating various types of AutoSys jobs. It discusses changing and deleting a job, and how to set time dependencies. An example JIL script is provided.
where:
sub_command    Is one of the sub-commands listed in the table in JIL Sub-commands in this chapter.
job_name       Is the user-specified name of the job to receive action.
Rule 2 Each sub-command may be followed by one or more attribute statements. These statements may occur in any order, and are applied to the job specified in the preceding sub-command. A subsequent sub-command begins a new set of attributes for a different job. The attribute statements have the following form:
attribute_keyword: value
where:
attribute_keyword    Is one of the legal JIL attributes.
value                Is the setting to be applied to the attribute.
Rule 3 Multiple attribute statements can be entered on the same line, but the statements must be separated by at least one space.
Rule 4 A box must be defined before the jobs can be placed in it.
Rule 5 Legal value settings can include any of the following characters: uppercase and lowercase letters, hyphens, underscores, numbers, colons (if the colon is escaped with quotes or a preceding backslash), and the at character (@).
Rule 6 Any colons used in an attribute statement's value setting must be escaped, because JIL parses on the combination of keyword followed by a colon. Colons can be escaped by enclosing the value in quotes. For example, to specify the time to start a job, specify "10:00". The colon may also be escaped with a preceding backslash (\) as in 10\:00.
Rule 7 Comments are indicated using one of the following two methods:
An entire line can be commented by placing a pound sign (#) in the first column.
The C programming syntax used for beginning a comment with a slash star (/*) and ending it with a star slash (*/) may be used; this allows comments to span multiple lines. The following is an example:
/* this is a comment */
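The rules above can be seen together in a short sketch. The job name rule_demo and its command are hypothetical and used only for illustration; the machine name tibet is borrowed from the examples later in this chapter.

```jil
# A full-line comment: the pound sign is in the first column (Rule 7).
/* A C-style comment,
   which may span several lines (Rule 7). */

/* A sub-command and an attribute statement on the same line,
   separated by at least one space (Rules 2 and 3). */
insert_job: rule_demo job_type: c
machine: tibet
command: /bin/date
date_conditions: yes
days_of_week: all
/* The colon in the time value is escaped with quotes (Rule 6). */
start_times: "10:00"
```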
JIL Sub-commands
JIL sub-commands are used to create, modify, override, or delete a job definition. These sub-commands are listed in the following table.

Sub-command       Action
insert_job        Add a new job to AutoSys.
insert_machine    Add a new machine to AutoSys.
update_job        Edit fields on an existing job.
delete_job        Delete an existing job from the AutoSys database.
delete_box        Delete an existing box job, and recursively delete all the jobs that are contained in the box.
override_job      Apply overrides on indicated job attributes for the next run of this job.
Submit it by redirecting a JIL script file to the jil command, for example:
jil < my_jil_script
Interactively submit it by issuing the jil command and pressing Enter; then entering JIL statements at the provided command prompts:
jil>>
To exit interactive mode, enter exit at the prompt, or press Control+D.

Both of these methods are analogous to saving a job definition in the GUI, using the Job Definition dialog. To specify the instance of AutoSys to which definitions are to be sent and applied, you can use the -S autoserv_instance argument to the jil command. For single instance AutoSys environments, the command will default to the only available AutoSys instance.

For the jil command to work properly, the correct AutoSys environment variables must be assigned. For more information about these variables, see the Unicenter AutoSys Job Management for UNIX Installation Guide. For more information about the jil command, see its definition in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
Running JIL
After a job definition has been submitted to the AutoSys database, it will be started according to the starting parameters specified in its JIL script. That is, the Event Processor will continually poll the database and when it determines that the starting parameters have been met, it will run the job. If a JIL script does not specify any starting parameters for a job, the job will not be started automatically by the event processor; it will start only if you issue the sendevent command. For example, assume a job named test_install has no starting parameters specified in its JIL script. The only way to start it would be to issue the following command:
sendevent -E STARTJOB -J test_install
This command tells the event processor to start the job named test_install. For more information about the sendevent command, see its definition in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
The JIL script that defines this job instructs AutoSys to do the following:
Add a new job named test_run.
Define the new job as a command job.
Run the job on the client machine named tibet.
Execute the UNIX /bin/touch command on the file named /tmp/test_run.out.
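A minimal sketch of a JIL script consistent with this description might look like the following. It is reconstructed from the description only, and the explicit job_type line is an assumption about how the command job type was stated.

```jil
/* Sketch: a command job that touches a file on the machine tibet */
insert_job: test_run
job_type: c
machine: tibet
command: /bin/touch /tmp/test_run.out
```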
The JIL script that defines this job instructs AutoSys to do the following:
Add a new job named EOD_watch.
Define the new job as a file watcher job.
Run the job on the client machine named tibet.
Watch for an end of day transaction file named EOD_trans_file in the /tmp directory.
Check the file every 60 seconds.
Determine if the file has reached the minimum file size of 50,000 bytes.
Until the minimum file size of 50,000 bytes has been reached, the file will not be considered by AutoSys as complete. When the file reaches this minimum size and does not change between check intervals (60 seconds in this example) it is considered complete (also known as steady state). When this occurs, the file watcher job will end with a SUCCESS condition.
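A file watcher definition matching this description could be sketched as follows. It is a reconstruction from the description, not a verbatim script, and it assumes the watch_interval and watch_file_min_size attributes carry the 60-second check interval and the 50,000-byte minimum size.

```jil
/* Sketch: file watcher that succeeds once the file reaches
   50,000 bytes and is unchanged between two 60-second checks */
insert_job: EOD_watch
job_type: f
machine: tibet
watch_file: /tmp/EOD_trans_file
watch_interval: 60
watch_file_min_size: 50000
```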
The JIL script that defines this job instructs AutoSys to do the following:
Add a new job named EOD_post.
Define the new job as a command job.
Run the job on the client machine named tibet.
Run the job only if the file watcher job named EOD_watch completes with a SUCCESS status.
Source the /etc/auto.profile file (AutoSys sources this file by default), and then run the job named post located in the job owner's home directory.

Note: The job's execution environment is determined exclusively by the profile, which is sourced immediately before the job is started. By default, the file /etc/auto.profile, on the client machine, will be sourced. This can be overridden by specifying another profile by using the profile attribute. For information on the profile job attribute, see its entry in the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide.
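A dependent command job of this kind could be sketched as shown here. The use of $HOME to locate the post program is an assumption; it relies on the sourced profile or the owner's environment defining that variable.

```jil
/* Sketch: command job that runs only after EOD_watch succeeds */
insert_job: EOD_post
job_type: c
machine: tibet
condition: success(EOD_watch)
command: $HOME/post
```

Because no profile attribute is given, the default /etc/auto.profile on the client machine would be sourced before the command runs.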
Creating a Box
Box jobs are a convenient way to start multiple jobs. When you put jobs in a box, you only have to start a single job (the box) in order for all the jobs in the box to start running. Assume you want to schedule a group of jobs to all start running once the file watcher completes successfully. Rather than make each job dependent on the file watcher, you can create a box that is dependent on the file watcher, and place all of the jobs in the box. Now you will create a box, then change the job you just created to put it in the box, and then make it no longer individually dependent on the file watcher. The JIL script required to define a box job named EOD_box is given below:
insert_job: EOD_box
job_type: b
condition: success(EOD_watch)

This JIL script instructs AutoSys to do the following:
Add a new job named EOD_box.
Define the new job as a box job.
Run the job only if the file watcher job named EOD_watch completes with a SUCCESS status.
For information on box jobs, see the chapter Box Job Logic, in this guide.
Adding Machines
The insert_machine sub-command adds a new machine definition to the AutoSys database for one of the following:
Real machine
Virtual machine
NSM
Universal Job Management Agent
AutoSys Connect

The machine type can be specified as r for real, v for virtual, n for Windows, t for NSM or a Universal Job Management Agent, or c for AutoSys Connect. The component real machines in a virtual machine definition must all be of the same type, for example, all UNIX machines or all Windows machines (not a mix). If the machine being defined is a virtual machine, the insert_machine sub-command is followed by one or more machine attributes that specify real machines.
jil
insert_machine: bocovic    (the machine name is bocovic)
type: t
exit
Database change was successful.

machine_name    The unique name of the machine to be defined. It can be from 1-30 alphanumeric characters, and is terminated with white space; embedded blanks and tabs are illegal.

The default type is UNIX (n or v), if no type is specified.

Any machine accessible through the TCP/IP protocol can be specified in the machine attribute of a job; it need not be explicitly defined using the insert_machine command. However, any undefined machine will have a default factor of 1.0 and no max_load, meaning that there will be no limit on the job load assigned to it.
Any machine defined in the /etc/hosts file on the machine running the Event Processor can be specified in the machine attribute of a job; it need not be explicitly defined using the insert_machine command. However, any undefined machine will have a default factor of 1.0 and no max_load, meaning that there will be no limit on the job load assigned to it.
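As an illustration of explicit machine definitions, the sketch below defines one real machine with a load limit and one virtual machine built from two real machines. The names lisbon and unix_pool and the numeric values are hypothetical, and the sketch assumes tibet has also been defined as a real machine of the same type.

```jil
/* Hypothetical real machine with an explicit max_load and factor */
insert_machine: lisbon
type: r
max_load: 100
factor: 1.0

/* Hypothetical virtual machine; each machine attribute names
   a component real machine of the same type */
insert_machine: unix_pool
type: v
machine: lisbon
machine: tibet
```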
Changing a Job
To place an existing job in a box, you need to change the EOD_post command job that was created previously. You will place the EOD_post job in the newly created box. You should make sure a job is not running before you modify or delete it.

To change a job, you can either use the update_job sub-command, or you can delete the job definition, using the delete_job sub-command, then redefine the job using the insert_job sub-command. The latter scenario is particularly useful when many non-default attributes have been specified, and you want to unset them rather than reset them; in other words, you want to deactivate them. However, you'll have to respecify any of the attributes that need to remain the same. So, in the following example, you'll use the update method. The JIL script required to change the EOD_post job and to put it in the EOD_box is given here:

update_job: EOD_post
condition: NULL
box_name: EOD_box

This JIL script instructs AutoSys to do the following:
Update the job named EOD_post.
Remove the starting condition from the job definition, because the job will inherit the starting condition of the box in which it is placed.
Put the job named EOD_post in the box named EOD_box.

The EOD_post command job is now in the EOD_box box job, and has inherited the box's starting parameters.
The JIL script that sets these time dependencies instructs AutoSys to do the following:
Update the job named test_run.
Activate the conditions based on date.
Set the job to run on Mondays, Wednesdays, and Fridays.
On each of these three days, start the job at 10:00 a.m. and 2:00 p.m.
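A script consistent with this description might be sketched as follows. It is reconstructed from the description, and the day abbreviations mo, we, and fr are assumed to be the values the days_of_week attribute accepts.

```jil
/* Sketch: run test_run at 10:00 and 14:00 on Mon, Wed, and Fri */
update_job: test_run
date_conditions: yes
days_of_week: mo, we, fr
start_times: "10:00, 14:00"
```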
The times shown in the script above are quoted, since they contain a colon. They could also have been escaped by using backslashes, as shown here:
start_times: 10\:00, 14\:00
If you do not quote a time zone that contains a colon, the colon will be interpreted as a delimiter, producing unexpected results.
If you had wanted to run the job every day, rather than only on specific days, you could have specified the all value, instead of listing the individual day values. Or, if you had wanted to schedule the job for specific dates, rather than specific days of the week, you could have specified a custom calendar. First, you would have had to define the calendar, using the Graphical Calendar Facility, or the autocal_asc command. Then, you would specify the calendar name, weekday_cal, using the following JIL statement:
run_calendar: weekday_cal
Alternatively, you could have specified a custom calendar specifying the days on which the job was not to be run, holiday_cal, using the following JIL statement:
exclude_calendar: holiday_cal
If you wanted the job to run at specific times every hour, as opposed to specific times of day, the minutes past every hour could have been specified. For example, to run a job at a quarter after and a quarter before each hour, use the following JIL statement:
start_mins: 15, 45
Deleting a Job
Now you will delete the test_run job, which you specified at the beginning of this chapter. To delete the test_run job, enter the following JIL sub-command:
delete_job: test_run
The delete_job sub-command checks the job_cond table and notifies you if dependent conditions for the deleted job exist. This functionality only works when JIL is in job verification mode (which is the default mode).
To delete a box but leave its contents intact, use the delete_job sub-command on the box, as follows:
delete_job: EOD_box
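By contrast, the delete_box sub-command listed in the JIL Sub-commands table removes the box together with every job it contains. A minimal sketch, reusing the EOD_box name from this chapter:

```jil
/* Deletes EOD_box and, recursively, all jobs contained in it */
delete_box: EOD_box
```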
JIL provides a number of other attributes that you can configure. These attributes are described in detail in the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide. This reference chapter provides complete information on all job attributes specified using JIL.
To cancel the job overrides specified in the script above, you would enter the following JIL script:
override_job: RunData delete
Note: Once you have submitted a JIL script to the AutoSys database, you cannot view the JIL script and edit a job override. If you want to change the override values, you must submit another JIL script with new values, or use the GUI. However, the original override (that is, the first over_num) remains stored in the overjob table in the AutoSys database.
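For orientation, a one-time override script for the RunData job might look like the sketch below. The attributes and values shown are hypothetical and chosen only for illustration; they are not the overrides from the original example.

```jil
/* Hypothetical one-time overrides, applied only to the next run of RunData */
override_job: RunData
n_retrys: 2
term_run_time: 60
```

Submitting a second override_job script for the same job with different values would replace these settings for the next run.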
A file named /download/mainframe/sales.raw is expected to arrive from the mainframe sometime after 2:00 a.m.
When the file arrives, it is processed by the command file named filter_mainframe_info, and the results are placed in the file named /download/mainframe/sales.sql.
When the above functions are completed, the file named /download/mainframe/sales.sql (containing SQL statements) is executed.

# Example of Jobs
insert_job: Nightly_Download
job_type: b
date_conditions: yes
days_of_week: all
start_times: "02:00"

insert_job: Watch_4_file
job_type: f
box_name: Nightly_Download
watch_file: /download/mainframe/sales.raw
machine: gateway

insert_job: filter_data
job_type: c
box_name: Nightly_Download
condition: success(Watch_4_file)
command: filter_mainframe_info
machine: gateway
std_in_file: /download/mainframe/sales.raw
std_out_file: /download/mainframe/sales.sql
std_err_file: /log/filter_mainframe_info.err

insert_job: update_DBMS
job_type: c
box_name: Nightly_Download
condition: success(filter_data)
machine: gateway
command: isql -U mutt -P jeff
std_in_file: /download/mainframe/sales.sql
An example of the output generated by the autorep command for the above job definition is provided in Examples for the autorep command in the chapter AutoSys Commands of the Unicenter AutoSys Job Management Reference Guide.
Chapter
This chapter describes the AutoSys Graphical Calendar Facility and how to use it to create, view, and maintain AutoSys calendars. It also describes how to preview a calendar before applying its dates and how to apply custom rules to a calendar. An AutoSys calendar is simply a collection of dates grouped as a single entity. The Calendar Facility lets you define and maintain AutoSys calendars using a point and click approach on graphical displays of a conventional calendar. Once these dates are defined in a calendar, you can use the calendar to schedule jobs. In the job definition, you can specify that a job start (or not start) on the dates defined in a calendar. Calendars do not in themselves convey any rules about when a job should or should not start; this meaning is assigned exclusively through the job definition. The Calendar Facility lets you:
Define calendars.
Apply custom rules to a calendar, such as the first weekday of every month, rather than selecting the individual dates by hand.
Block out certain dates, such as holidays, when editing calendars.
Select options that will automatically reschedule conflicting dates when applying a rule. A conflicted date is a date that is blocked out but also meets the qualifications of the rule being applied. A number of alternatives for rescheduling are provided.
Build a new calendar by overlaying multiple, pre-existing calendars, and further customize the new calendar manually.
Preview a calendar before applying it to another calendar.
Import and export text definitions for calendars.
With JIL, enter a calendar name in the run_calendar or exclude_calendar attribute.
In the Job Definition Date/Time Options dialog, enter a calendar name in the Run on Days in Calendar field or the Do NOT Run on Days in Calendar (Exclude) field.
For example, you could create a calendar called holidays containing the dates of all corporate holidays. For jobs that you want to start on holidays, you would define this using the attribute:
run_calendar: holidays
For jobs that you do not want to start on holidays, you would define this using the attribute:
exclude_calendar: holidays
Jobs scheduled with the run_calendar attribute are scheduled to start on every day specified in the calendar, at the times specified in the calendar or in the start_times or start_mins attribute. If present in the job definition, the start_times or start_mins attribute overrides the times specified in a run calendar. If no start time is specified, calendar-scheduled jobs start at midnight, by default.

Note: Times can be assigned to calendars only when using the command-line calendar definition tool, autocal_asc.

Jobs scheduled with the run_calendar attribute are scheduled on the next available date from that calendar. Dates previous to the current date are ignored.
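As a sketch of how a run calendar and an explicit start time combine, the fragment below schedules a hypothetical existing job named holiday_report on every date in the holidays calendar. The job name and the 06:00 start time are assumptions made for illustration.

```jil
/* Hypothetical job: starts at 06:00 on each date in the holidays calendar.
   The start_times value overrides any times stored in the calendar. */
update_job: holiday_report
date_conditions: yes
run_calendar: holidays
start_times: "06:00"
```

Without the start_times line, the job would start at the times stored in the calendar or, if none are stored, at midnight.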
Jobs scheduled with the exclude_calendar attribute can make use of other start conditions in the job definition. In this case, AutoSys evaluates the start conditions and, if they are true, checks if the date is set in the exclude_calendar. If it is in the exclude_calendar, the job will not be started, and its status will be changed to INACTIVE. For more details on these job attributes, see their reference pages in the chapter JIL/GUI Job Definitions in the Unicenter AutoSys Job Management Reference Guide.
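The interaction between an exclude calendar and other start conditions can be sketched with a hypothetical job named nightly_report that is scheduled daily but skipped on the dates in the holidays calendar. The job name and start time are assumptions.

```jil
/* Hypothetical job: eligible to start every day at 02:00,
   but not started on dates set in the holidays calendar */
update_job: nightly_report
date_conditions: yes
days_of_week: all
start_times: "02:00"
exclude_calendar: holidays
```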
To open the Calendar Definition window, do one of the following:
Single-click on the Calendars button in the GUI Control Panel.
Enter the following command at the UNIX prompt:
autocal &
The Calendar Definition window appears, as shown in Calendar Definition Window in this chapter.
Calendar Selection
Import/Export File Name
This dialog is used to specify a calendar text file (ASCII format) to import from or export to.
The Calendar Definition Window is divided into the following regions:
Menu Bar: At the top of the window.
Calendar Display: At the center of the window.
Navigation and Legend Controls: At the right of the window.
Menu Bar
At the top of the Calendar Definition window is the menu bar, containing four pull-down menus: File, Edit, Tools, and Options.
File Menu
The File menu contains the following options:

File Menu Option    Action
New        Displays the New Calendar Name dialog for specifying a name for a new calendar to be created.
Open       Displays the Calendar Selection dialog for choosing an existing calendar to be edited.
Save       Saves the calendar currently being edited, using its current name.
Save As    Displays the Save As dialog, asking you for the new name under which the calendar currently being edited will be saved.
Delete     Displays a verification dialog, asking you to confirm that you want to delete the calendar currently being edited.
Rename     Displays the New Calendar Name dialog, asking you to specify a new name for the calendar currently being edited.
Import     Displays the Import File Name dialog so you can select the directory and filename of the text file containing calendar definitions that you want to import.
Export     Displays the Export File Name dialog so you can select the directory and name of the file to which you want to save all the calendars in the database, in text form.
Print      Prints the calendar currently being edited, using the print command specified in the application defaults or the print command specified using the Set Print Command option from the Options menu.
Exit       Displays a verification dialog, asking you to confirm that you want to exit the application. If you have made changes that have not been saved, you will be notified and given an opportunity to save your changes. If you indicate that you want to exit, the application is exited.
Edit Menu
The Edit menu contains the following options:

Edit Menu Options    Action
Apply Rule    Displays the Term Calendar Rule dialog, which allows you to set multiple dates at once using a variety of rule options.
Revert        Resets the state of all the dates in the current calendar to those last saved to the AutoSys database.
Clear         Resets the state of all the dates in the current calendar to the Unset state, regardless of their current states.
Tools Menu
The Tools menu contains the following options:

Tool Menu Options    Action
Term Calendar Viewer             Displays the Term Calendar Viewer window, which allows you to preview various calendars prior to applying their dates to the calendar currently being edited.
Job Definition Reference List    Displays a list of all the jobs that reference the calendar currently being edited, either as their run calendar or exclude calendar. This list indicates which jobs will be affected by any changes you make to the current calendar.

The following is the Job Definition Reference List:
Note: Whenever a calendar is updated, AutoSys recomputes the starting times for all jobs that use that calendar.
Options Menu
The Options menu contains the following options:

Option Menu Options    Actions
Date Range    Displays the Date Range pull-down menu, through which you may specify the date range of the calendar currently being edited. The default date range is the prior year through the current year (Current Year). You may extend that to include additional years, using these options:
    Thru Next Year
    Three Calendar Years
    Four Calendar Years
    Five Calendar Years
    Ten Calendar Years
    The Date Range option limits how far into the future you can set dates in the current calendar. You can increase the date range of a calendar, and you can decrease it through the previous year. When you open an existing calendar, either the date range of the calendar or the current date range, whichever is greater, becomes the new date range for the current Calendar Facility session.
Set Print Command    Displays a dialog box prompting you to specify the full path to the print command to be executed when printing the calendar currently being edited. For example, the print command might be /bin/qprt -Plp3. The print command could also be set using the application default settings, as described in Print Command in this chapter.
Calendar Display
The Calendar Display region of the window displays six months of the calendar currently being edited. By default at startup, two quarters of the current year display, one of which contains the current day (indicated by a box). Using the navigation controls at the right side of the window, you can advance or move backward through the calendar, one quarter at a time. The Calendar Name field at the top left of the Calendar Display region displays the name of the calendar currently being edited. You can change this name using the Rename option from the File menu. You cannot edit the Calendar Name field.
Date States
Each date in the calendar is a selectable button. By clicking with the mouse, you can set each date to one of the following states:
Unset: The date is not set.
Set: The date is set.
Blocked: The date is ineligible for setting when applying a rule.
You can cycle through these three states by clicking the mouse button additional times. The current state of each date is indicated by its color, as listed in the Color Key area of the Navigation Controls region.

Note: Calendars are stored in the AutoSys database. Only the Set dates are stored. Dates designated as Blocked are only in effect while editing a calendar. The Blocked state is only useful while applying rules. Blocked dates are not saved in the AutoSys database, nor are the rules.

In addition to the three user-selectable states above, there is a fourth, system-generated state called the Conflict state. This state occurs when both of the following are true:
A date that qualifies for setting by the rule has been previously set to the Blocked state and rescheduling has not occurred. (Rescheduling may not have occurred if either a reschedule rule has not been applied, or an applied reschedule rule cannot find a non-conflicting date to move to.)
For example, you may have marked all holidays as Blocked, one of which was January 1. Now you want to apply a rule to the first day of each month, which would conflict with the January 1 Blocked date. If there is no reschedule rule in effect, such as move to the next weekday, or if the reschedule rule specifies to move backwards, which would end up on a date in the previous year, a Conflict state would result. In this case, you must manually correct the situation before attempting to save the calendar to the database. For more information about Rescheduling Rules, see Rescheduling Rule in this chapter.
Navigation Controls
The Navigation Controls region of the window contains controls that do the following:
Let you specify which six months (or two quarters) are to be displayed in the window.
Display additional information relevant to the calendar currently being edited.
Shift Months Area
The Shift Months area has two push buttons (with up and down arrows) to advance or move backward through the calendar, one quarter at a time. You can change the calendar display to any two quarters within the calendar's date range. (For information about the date range, see Menu Bar in this chapter.) When shifting backward, you can move from the current year into the prior year. Only one prior year is viewable through the Graphical Calendar Facility (that is, if the current year is 2002, you could shift back to 2001, but not to 2000).
Note: Rules cannot be applied to dates prior to the current date.

Skip Button Area
To the right of the Shift Months area is a row of push buttons to skip to a particular date whose state has been set. The following buttons are available:
First Entry: Changes focus to the first date in the calendar that has been set.
Next Entry: Changes focus to the next date in the calendar that has been set relative to the date that currently has the focus.
Next Conflict: Changes focus to the next date in the calendar that has been set to the Conflict state by the system.
Last Entry: Changes focus to the last date in the calendar that has been set.
Note: The date with the focus is designated with a darkened box around the date. When you use the skip buttons, if the focus is changed to a date that is not currently displayed, the calendar display will shift to bring that date into view.

Number of Conflicts Area
Below the Shift Months area is the Number of Conflicts area displaying how many dates with the Conflict state currently exist in the calendar. This field is updated each time you apply a rule, or manually change a Conflict date to another state.
Note: You cannot save a calendar containing unresolved conflicts.

Color Key Area
Immediately below the Number of Conflicts area is the Color Key area, which displays the colors that indicate the states of each date. These colors are user-configurable, as described in Object Color in this chapter. The following colors are used by default:
Unset dates: Background color of the Graphical Calendar Facility
Set dates: Green
Conflict dates: Red
Blocked dates: Black
Press Enter.
4. In the Calendar Definition window, set the holiday dates by clicking the left mouse button on each date. If the desired dates are not displayed, use the Shift Months arrows to bring them into view.
5. Choose File, Save. Your calendar will resemble the following:
When this dialog appears, it contains a scroll list of all the calendars currently in the AutoSys database. In the Filter field of this dialog, you may specify any string, including the asterisk (*) wildcard character, to show only those calendars whose names match the string. The default filter (*) matches all calendar names. For example, to display only those calendar names starting with the string us_hol, such as us_hol_97 and us_hol_98, you would specify the filter us_hol*, then click the Filter button. The list will display only the matching calendar names. You can select any calendar in the list by clicking it with the mouse. Click OK to open the calendar and dismiss the dialog. Click Cancel to dismiss the dialog without selecting a calendar.
The Term Calendar Rule dialog is divided into the following three regions:
Rule Specification: The top portion of the dialog.
Rescheduling Rule: Center of the dialog.
Control: The bottom portion of the dialog.
Rule Specification
The Rule Specification region provides a variety of options for specifying which dates in the currently selected calendar you want to affect, and what state to set for those dates. This region consists of the following three areas:
Action Area
The Action Area lets you apply one of the following states on the selected dates:
Set Dates: Change the state to Set.
Unset Dates: Change the state to Unset.
Block Dates: Change the state to Blocked. Blocked dates will not be set during any subsequent rule applications.
Note: These actions are mutually exclusive; only one of these actions can be selected.

Date Range Area
In the Date Range area, you specify the date range over which the rule should apply. By default, the date range of the currently selected calendar is in effect. However, you can change the date range either by selecting an option from the All in Year pull-down menu, or by specifying a date range in the All in Range edit fields. When entering a date range, the dates must fall within the range set in the Options Date Range pull-down menu.
Note: Rules are not applied to any date prior to today's date.
Date Selection Rule Area
The Date Selection Rule area contains the following three options:

Date Selection Options    Action
Occurrences    Lets you specify the occurrence of a day for which the rule should be applied. Select one or more options by pressing the corresponding toggle button.
Day            Lets you specify the days of the week to which the rule should be applied. Select one of the following: Day (Any), Weekday, or Specific Days. If Specific Days is selected, one or more of the specific days of the week must be chosen as well.
Period         Lets you specify the period during which the rule should be applied. Only one of the following options can be chosen: No Period, Monthly, Quarterly, Every n weeks, or the days in a specified Calendar. The No Period option is used for non-repeating periods, such as Every/Monday; this option is generally used with the Every or Every th option in the Occurrences sub-area.
If the Calendar option is selected, only the dates specified in the indicated calendar will be used when applying the rule. You can either enter the calendar name directly, or press the Calendar button to display the Calendar Selection dialog, from which you can choose a calendar for the period. Note: The rules used to set a calendar are not saved after the calendar has been created and saved.
Setting Dates
To set the 3rd Tuesday of every month throughout the entire currently selected calendar:
1. In the Action area, select Set Dates.
2. Keep the default date range, which is the currently selected calendar's entire range.
3. In the Occurrences sub-area, select either the Third option or enter 3 in the th option.
4. In the Day sub-area, select Tuesday (which automatically selects the Specific Days option).
5. In the Period sub-area, select Monthly.
6. Click OK or Apply to apply the rule to the calendar you are currently editing.
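To see which dates a rule such as "third Tuesday, Monthly" selects, the selection can be reproduced outside the GUI. The following is a minimal Python sketch and is not part of AutoSys; the function name nth_weekday_of_month and the 1998 date range are assumptions made for illustration, and the sketch does not model the restriction that rules skip dates prior to today.

```python
import calendar
from datetime import date

def nth_weekday_of_month(year, month, weekday, n):
    """Return the n-th occurrence (1-based) of a weekday in a month, or None."""
    days = [d for d in calendar.Calendar().itermonthdates(year, month)
            if d.month == month and d.weekday() == weekday]
    return days[n - 1] if n <= len(days) else None

# Third Tuesday of every month in a hypothetical 1998 calendar range.
third_tuesdays = [nth_weekday_of_month(1998, m, calendar.TUESDAY, 3)
                  for m in range(1, 13)]
```

Each entry of third_tuesdays corresponds to one date that the rule would mark as Set.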
Blocking Dates
To block every holiday date and prevent those days from being scheduled, follow these steps (assuming the holiday dates already exist in a calendar named us_hol_98):
1. In the Action area, select Block Dates.
2. Keep the default date range, which is the currently selected calendar's entire range.
3. In the Occurrences sub-area, select Every.
4. In the Day sub-area, select Day (Any).
5. In the Period sub-area, press the Calendar button. When the Calendar Selection dialog appears, select the calendar named us_hol_98, and click OK. The calendar name will appear in the Calendar field in the Period sub-area.
6. Click Apply to apply the rule to the calendar you are currently editing.
Following on with this example, assume that you want to change the state of the first day of every month to Set.
To change the state of the first day of every month to Set:
1. In the Action area, select Set Dates.
2. Keep the default date range, which is the currently selected calendar's entire range.
3. In the Occurrences sub-area, select First.
4. In the Day sub-area, select Day (Any).
5. In the Period sub-area, select Monthly.
6. Click Apply to apply the rule to whatever calendar you are currently editing.
When you apply this rule, a Conflict state is assigned to January 1, because this date is defined in us_hol_98 and was marked as Blocked in the previous example. To address this, you must either unset the date manually (and, if desired, set another date in its place) or specify a Rescheduling Rule to accommodate this type of conflict. Rescheduling Rules are described in the next section.
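The way a Conflict arises from the Blocked and Set states can be sketched as a small state function. This is an illustrative Python model only, not AutoSys code; the function name apply_action is hypothetical, the Unset action is left out because its effect on Blocked dates is not described here, and keeping an existing Conflict unchanged is an assumption.

```python
from datetime import date

def apply_action(state, action):
    """Return the new state of one date after a Set or Block action."""
    if action == "set":
        # A Blocked date is not set; the attempt is flagged as a conflict.
        # Assumption: a date already in Conflict stays in Conflict.
        return "Conflict" if state in ("Blocked", "Conflict") else "Set"
    if action == "block":
        # Blocking a date that is already Set also produces a conflict.
        return "Conflict" if state in ("Set", "Conflict") else "Blocked"
    raise ValueError("unsupported action: " + action)

calendar_states = {date(1998, 1, 1): "Unset", date(1998, 2, 1): "Unset"}

# Block the holiday first, then set the first day of every month.
calendar_states[date(1998, 1, 1)] = apply_action(
    calendar_states[date(1998, 1, 1)], "block")
for day in calendar_states:
    calendar_states[day] = apply_action(calendar_states[day], "set")
```

After both rules, January 1 is in the Conflict state and February 1 is Set, which matches the situation described above.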
Rescheduling Rule
The Rescheduling Rule region lets you control how date conflicts should be resolved when applying a rule. To specify a rescheduling rule, you must indicate a move direction and the day to which the newly scheduled Set state should be moved when a conflict occurs. You do this in the Move Direction and To Day areas. Note: Conflicts can also occur if a date is Set, then Blocked. Whenever a conflict occurs, the Set state is moved when rescheduling.
Move Direction
You can select one of the following Move Direction options:
To Previous: Moves the Set state backward in the calendar.
To Following: Moves the Set state forward in the calendar.
To Day
In addition to specifying the Move Direction, you must specify one of the following To Day options:
Any Day: Next available date.
Weekday: Next available weekday date.
Calendar-based: Next available date in the specified calendar. You must choose either In Calendar or Not In Calendar, and specify a calendar name, either by entering it directly, or by pressing the Calendar button, then selecting a calendar from the Calendar Selection dialog that appears.
Example
To remedy the conflict situation in the last example from the previous section, assume that you want to apply a Rescheduling Rule that specifies any day following the date in conflict that is not also a holiday.
To apply a Rescheduling Rule that specifies any day following the date in conflict that is not also a holiday:
1. In the Action area, select Set Dates.
2. Keep the default date range, which is the currently selected calendar's entire range.
3. In the Occurrences sub-area, select First.
4. In the Day sub-area, select Day (Any).
5. In the Period sub-area, select Calendar. When the Calendar Selection dialog appears, select the calendar named us_hol_98, and click OK. The calendar name will appear in the Calendar field.
6. In the Move Direction area, select To Following.
7. In the To Day area, select the Not in Calendar option and enter the calendar named us_hol_98.
8. Click OK or Apply to apply the rule to the calendar you are currently editing.
Notes:
Multiple conflict dates can be rescheduled to the same new date. For example, if you block all weekend dates, then apply a rule to set every day, with rescheduling to the previous weekday, both the Saturday and Sunday conflicts will be resolved to the preceding Friday. If you want separate runs for Saturday and Sunday, turn off the Rescheduling Rule and resolve the conflicts manually.
If you intend to use a Rescheduling Rule, you should set it up at the same time you set up the Date Selection Rule, rather than wait for conflicts to occur.
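The effect of a Rescheduling Rule can be sketched as a search in one direction for the nearest acceptable date. The Python below is an illustration under stated assumptions, not AutoSys code: the function name reschedule, the search limit, and the contents of us_hol_98 are hypothetical, and "available" is modeled only by the To Day test passed in as is_allowed.

```python
from datetime import date, timedelta

def reschedule(conflict, direction, is_allowed, limit=366):
    """Move a conflicting Set date to the nearest allowed date in one direction."""
    step = timedelta(days=1 if direction == "following" else -1)
    candidate = conflict
    for _ in range(limit):
        candidate += step
        if is_allowed(candidate):
            return candidate
    return None  # no allowed date found within the search limit

us_hol_98 = {date(1998, 1, 1)}  # hypothetical contents of the holiday calendar

# To Following + Not In Calendar us_hol_98
moved = reschedule(date(1998, 1, 1), "following", lambda d: d not in us_hol_98)

# To Previous + Weekday: Saturday and Sunday both land on the preceding Friday
def is_weekday(d):
    return d.weekday() < 5

sat = reschedule(date(1998, 1, 3), "previous", is_weekday)
sun = reschedule(date(1998, 1, 4), "previous", is_weekday)
```

The last two calls show why multiple conflict dates can end up on the same new date: both searches stop at the first weekday they reach.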
Control
At the bottom of the Term Calendar Rule dialog, there are three buttons:
OK: Applies the current rule and dismisses the dialog.
Apply: Applies the rule but does not dismiss the dialog.
Cancel: Dismisses the dialog without applying the rule.
The Term Calendar Viewer is divided into the following regions:
Calendar Display: At the left and center of the window. The dates in this calendar are read only; they cannot be edited. You can only edit dates using the Calendar Definition window.
Navigation Controls: At the right of the window.
Note: Unlike the Calendar Definition window, there is no Next Conflict button in the Navigation Controls region of this window. This is because calendars cannot be stored in the database if they contain conflicts. Also note that Blocked days are not stored in the database; all dates saved in a calendar are Set dates. However, calendars may be used to block out dates by way of the Term Calendar Rule dialog.
When the Term Calendar Viewer first appears, it displays six months of the current year. You must then select a calendar by clicking its name in the Calendar Selection list in the lower-right corner of the window. Navigation is exactly the same as in the Calendar Definition window. As each calendar is selected, it completely replaces the previously viewed calendar, and its name is displayed in the Calendar Name field at the top of the window. When you are finished viewing calendars, click the Dismiss button to close the window.
Combining Calendars
Calendars can be combined in a number of ways. For example, you can create a calendar that includes all the dates that are in either one calendar or another. Or you can create a calendar that includes all the dates that are in one calendar but not in another. In fact, you can combine any number of calendars in these ways. For example, suppose you have a calendar named us_hol_98 containing all United States holidays, and you have a calendar named corp_hol_98 containing all your corporation holidays. In this scenario, you want to create a new calendar for all combined holidays called all_hol_98. This new calendar will include all the dates that are in either of the previously defined calendars. The best way to accomplish this is described in the following steps.
To combine calendars:
1. Choose File, New.
2. In the New Calendar Name dialog, enter:
all_hol_98
3. Click OK.
4. Choose Edit, Apply Rule.
5. In the Term Calendar Rule dialog, select the Set Dates Action, the Every Occurrence, and the Day (Any) Day settings.
6. In the Period sub-area, click the Calendar button, then select us_hol_98 from the Calendar Selection dialog, and click OK.
7. In the Term Calendar Rule dialog, click Apply.
8. Repeat steps 5 and 6, selecting the calendar corp_hol_98.
9. Choose File, Save.
This process of combining calendars can be repeated for any number of calendars. Note: When combining calendars, any dates prior to the current date will be dropped from the new calendar.
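Conceptually, combining calendars behaves like set operations on dates. The following Python sketch illustrates that idea only; it is not how the Calendar Facility is implemented. The function names and the holiday dates are hypothetical, and treating the current date itself as kept (rather than dropped) is an assumption.

```python
from datetime import date

def combine_either(cal_a, cal_b, today):
    """Dates that are in either calendar, dropping dates before today."""
    return {d for d in cal_a | cal_b if d >= today}

def combine_excluding(cal_a, cal_b, today):
    """Dates that are in cal_a but not in cal_b, dropping dates before today."""
    return {d for d in cal_a - cal_b if d >= today}

# Hypothetical calendar contents for illustration.
us_hol_98 = {date(1998, 1, 1), date(1998, 7, 4), date(1998, 12, 25)}
corp_hol_98 = {date(1998, 7, 3), date(1998, 12, 25)}

all_hol_98 = combine_either(us_hol_98, corp_hol_98, today=date(1998, 6, 1))
us_only = combine_excluding(us_hol_98, corp_hol_98, today=date(1998, 6, 1))
```

In this sketch the January 1 holiday is absent from all_hol_98 because it falls before the assumed current date.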
Printing Calendars
You can print calendars to obtain a hard copy for your files. Before you attempt to print a calendar, be sure the print command is correctly specified, either through your X resources file, or by way of the Set Print Command option from the Options menu. The print command must include the full path to the print facility to be used, plus all appropriate arguments. Once this is set, you can open an existing calendar, or create a new calendar, then select the Print option from the File menu.
Comments may be included following a space after the calendar name or after each date. If you export a calendar, the output will be in this same format. To access the Import File Name dialog: Choose File, Import.
At this dialog, you can either enter the full path to the calendar in the Selection field, or use the Filter field, Directories list, and Files list to select the file. When using a filter, the Filter field must contain a directory path name, which is everything up to the last slash (/) followed by the file pattern containing an asterisk (*). For example, to search the directory /home/my_dir for all file names with the string cal, you would specify: /home/my_dir/*cal*. Then, click Filter, and the Files list will display all the files containing that string. Then, click on the file you wish to import, and the full path name will appear in the Selection field. Finally, click OK to import the file. Note: If the text file being imported contains calendars with the same names as calendars already existing in the database, these calendars will not be imported. A warning dialog will notify you that a calendar name is duplicated, and that the import of this calendar will not occur. Also, after the import has completed, an information dialog will tell you how many calendars were imported.
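The split of a filter into a directory path and a file pattern can be illustrated with shell-style matching. This Python sketch assumes the dialog's filter behaves like a standard shell pattern, which is not confirmed here; the helper names and the file names are hypothetical.

```python
import fnmatch
import posixpath

def split_filter(filter_text):
    """Split a filter into (directory path, file pattern) at the last slash."""
    return posixpath.split(filter_text)

def matching_files(filter_text, file_names):
    """Return the file names that match the pattern part of the filter."""
    _, pattern = split_filter(filter_text)
    return [name for name in file_names if fnmatch.fnmatchcase(name, pattern)]

directory, pattern = split_filter("/home/my_dir/*cal*")
files = matching_files("/home/my_dir/*cal*",
                       ["us_cal.txt", "notes.txt", "calendar98"])
```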
Exporting Calendars
You can export all the calendars from the AutoSys database to an ASCII text file. To export, select File, Export. The Export File Name dialog will display, allowing you to specify a file name for the calendar text file. Note: When you use the export facility, all the calendars in the AutoSys database are exported to a single ASCII text file. After the export has completed, an information dialog will tell you how many calendars were exported. You cannot export just a single calendar; however, you can edit the ASCII file after you export all the calendars. The Export File Name dialog uses the same filtering as described in Importing Calendar Text Files in this chapter.
The fonts that are used in the Calendar Definition window as well as the Calendar Facility dialogs. The choice of fonts will affect the size of the various windows and dialogs.
The colors used to indicate whether dates are Set, Unset, or Blocked.
The print command to be executed when you want to print a calendar.
The Calendar GUI icon text and the Calendar title bar text.
Descriptions of each of the resources that can be customized are given below. All of these can be set by modifying the X resource file Autocal. The X resources files reside in the local app-defaults directory, which varies across platforms. It is usually /usr/lib/X11/app-defaults or /usr/openwin/lib/app-defaults. If you are not sure which directory these files are in, ask your system administrator. Individual users may have their own copy of the X resources files in their $HOME directory, which will take precedence over the app-defaults files. For most operating systems, if you are exporting the display to another machine, you must edit the appropriate files in the app-defaults directory on the local machine. For Solaris, you must edit the files in both the /usr/lib/X11/app-defaults and /usr/openwin/lib/app-defaults directories. The files in /usr/lib/X11/app-defaults control the resources when you export the display. The resource names are listed here as a reference only. For the default values, see the Autocal file. The provided resource file values work well for a majority of platforms.
Object Color
The following resources affect the colors of the various objects within the Calendar Facility. Note: Several of the following colors come in pairs, such as setColor and setLabelColor. The LabelColor is the color of the numbers that appear on the date buttons and should be chosen so that they are easily readable with the color of the button on which they display. For example, do not specify setColor as blue and setLabelColor also as blue; the numbers will not be visible on the dates. Select another color, such as black or white for the label.
! general application background color
Autocal*background:
! color to represent "set" dates
Autocal*setColor:
! color for label (foreground numbers) to represent
! "set" dates
Autocal*setLabelColor:
! color to represent "blocked" dates
Autocal*blockedColor:
! color for label (foreground) to represent
! "blocked" dates
Autocal*blockedLabelColor:
! color to represent "conflict" dates
Autocal*conflictColor:
! color for label (foreground) to represent
! "conflict" dates
Autocal*conflictLabelColor:
The following resource sets the color for a grid that appears between the dates in the Calendar Viewer window, to help differentiate it from the Calendar Definition window. If you do not want the grid to appear, select the same color as the background.
Autocal*viewerColor:
Date Range
The following resource sets the number of years in the date range of the calendar, as a default at start up. This can be overridden manually by way of the Date Range option from the Options menu, or by loading a calendar that has a larger date range than the default. These are the legal settings:
! 1 = prior year plus current year,
! 2 = thru next year, 3 = thru third year,
! 4 = thru fourth year, 5 = thru fifth year,
! 10 = thru tenth year
Autocal.dateRange:
Window Size
The following resources help keep the size of the window small, and it is recommended that you do not change their settings.
Autocal*calViewForm*month_form*marginHeight: 0
Autocal*calViewForm*month_form*marginWidth: 0
Autocal*calViewForm*month_form.marginWidth: 2
Print Command
The following resource specifies the print command, and its arguments, which will be executed when you request that a calendar be printed. Be sure to specify the full path name of the executable, or it may not be located and the print will fail.
Autocal.printCommand:
Note: When changing icon text, be sure the length of the new text string does not exceed the recommended maximum length for icon title text for your windowing system. Some window managers can display long icon text strings, while others will truncate them. Ensure the text string you specify for your icons displays appropriately. Also, some window managers allow you to change the size of icons and icon text font.
Chapter
This chapter describes the use of real and virtual machines in the AutoSys environment to provide load balancing and queuing functionality. It provides information about load balancing jobs across multiple machines, as well as queuing jobs to real and virtual machines.
Real Machines
In the AutoSys environment, a real machine is any physical CPU that has:
Been identified in the appropriate network database (for example, /etc/hosts) so that AutoSys can access it.
Undergone a client software installation (and is licensed) so that AutoSys can run jobs on it.
The above two conditions are required for a real machine to run AutoSys jobs. However, for AutoSys to perform intelligent load balancing and queuing while executing jobs, it needs to know the relative processing power of the various real machines. AutoSys provides both load balancing and queuing by way of the logical construct called virtual machines.
Virtual Machines
A virtual machine is comprised of one or more real machines, in whole or in part (or a combination of both). All real machines within a virtual machine must be of the same type, either Windows or UNIX. Virtual machines cannot be a mix of both UNIX and Windows machines. By defining virtual machines to AutoSys, and then submitting jobs to run on those machines, you can specify:
Runtime resource policies (or constraints) at a high level.
That AutoSys automatically execute those policies in a multiple machine environment.
The following JIL machine attributes are used when defining machines:

Machine Attribute    Description
type    Specifies a machine type, which can be one of the following:
machine    Specifies a real machine name to be inserted in a virtual machine.
max_load    For real machines only, and used for load balancing.
factor    For real machines only, and used for load balancing.
Real machines only need to be defined to AutoSys if they meet one of the following criteria:
Require a max_load or factor attribute to be set for them. (These attributes are discussed in the next section.)
Are to be included in a virtual machine.
Virtual machines must be defined before you can use them. Load balancing and queuing can be done only if real and virtual machines have been defined to AutoSys using these machine attribute statements. The following two attributes, used when defining real machines, are key for load balancing and queuing: max_load and factor. Note: Real and virtual machines can only be defined using JIL. There is no GUI interface for defining machines. For more information about the JIL subcommands and attributes pertaining to machines, see the chapter JIL Machine Definitions in the Unicenter AutoSys Job Management Reference Guide.
Job Attributes and Load Balancing and Queuing
For load balancing to work, every defined job that will impact the load on a machine must be assigned a job_load job attribute, which defines the relative load the job will place on a machine. Thus, a machine's current load can be tracked, and overloading of a machine can be prevented. For example, if the max_load on a machine is 100 and the job_load for one job is 10, then that job will use 10 percent of the machine's resources. In addition, for job queuing to take place, the priority job attribute must also be assigned in the job definition. The priority attribute specifies the relative priority of all jobs queued for a given machine. Without this attribute set, a job will run immediately on a machine, and it will not be placed in the queue.
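The relationship between job_load and max_load is simple arithmetic, sketched below in Python. The function names and the sample job loads other than 10 are hypothetical, and the sketch only illustrates the bookkeeping; it is not AutoSys code.

```python
def share_of_machine(job_load, max_load):
    """Percentage of a machine's load units that one job consumes."""
    return 100.0 * job_load / max_load

def remaining_units(max_load, running_job_loads):
    """Load units still free after subtracting the loads of running jobs."""
    return max_load - sum(running_job_loads)

share = share_of_machine(job_load=10, max_load=100)

# Hypothetical set of running jobs on a machine with max_load 100.
free = remaining_units(100, [10, 25, 40])
```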
Note: A virtual machine is comprised of real machines. Therefore you do not specify max_load and factor attributes explicitly in a virtual machine definition. They are specified in the definitions of the real machines that make up the virtual machine.
Machine Definitions
AutoSys can infer whether a machine being defined is a real or a virtual machine based solely on the attributes in the definition. Any machine definition containing a max_load or factor attribute must be a real machine definition, because only real machines can have these attributes. Any machine definition containing a list of machine attributes is a virtual machine definition. Because of this, you can omit the type attribute when defining a UNIX machine. For Windows NT, however, the type attribute is required. Compare the following definitions:

Real UNIX
insert_machine: toad
max_load: 100
factor: .8

Real Windows
insert_machine: tiger
type: n
max_load: 100
factor: .8
To help you understand virtual machines and their capabilities, the following sections provide a series of examples that demonstrate the different combinations of real machines that can constitute a virtual machine. These examples include the JIL statements used to define these machines.
The following demonstrates the definition of a real UNIX machine named jaguar with a max_load of 100 and a factor of 1.0.
insert_machine: jaguar
type: r
max_load: 100
factor: 1.0
To define a real Windows NT machine named tiger, enter the following JIL statements:
insert_machine: tiger
type: n
max_load: 100
factor: 1.0
The following demonstrates the definition of a virtual machine named modena, which is composed of two real machines named ferrari and lambo. Because the real machines do not specify a max_load and factor, they will have the default values for these attributes: a factor of 1.0 and unlimited load units.
insert_machine: modena
type: v
machine: ferrari
machine: lambo
The following JIL statements define two real machines named fiat and lotus, and a virtual machine named capri, which is composed of the two real machines. The virtual machine is a superset of the two previously defined real machines. (Because the real machines are defined first to AutoSys, the virtual machine will use the max_load and factor attributes specified for them.)
insert_machine: fiat
type: r
max_load: 100
factor: 1

insert_machine: lotus
type: r
max_load: 80
factor: .9

insert_machine: capri
type: v
machine: fiat
machine: lotus
The following JIL statements define a virtual machine named mustang which is composed of slices, or subsets, of the real machines named fiat and lotus. Even though the real machines have been previously defined, only the reduced load portion (or slices) will be used in the virtual machine mustang.
insert_machine: mustang
type: v
machine: fiat
max_load: 10
machine: lotus
max_load: 9
If the machine lambo had been individually defined outside of the virtual machine, its individual definition still remains in effect. To delete the entire virtual machine, you don't have to specify any of the component real machines. The real machines are still defined; only the virtual machine they were in is deleted. The following JIL statement deletes the virtual machine named mustang:
delete_machine: mustang
Because the real machines fiat and lotus had been individually defined outside of the virtual machine, their individual definitions remain in effect.
Load Balancing
By specifying a virtual machine or a list of real machines in a job's machine attribute, rather than a single real machine, you can implement simple load balancing. That is, you can cause the workload to be spread across multiple machines, based on each machine's capabilities. In addition to load balancing, this feature is a useful way to ensure reliable job processing. For example, if one of the machines is down, load balancing will run the job on another machine. When a job is ready to start, AutoSys will determine which of the specified machines is best suited to run the job. The following JIL example shows the job definition statements for such a job:
insert_job: test_load
machine: modena
command: echo "Test Load Balancing"
job_load: 50
priority: 1
where:
modena    Is a virtual machine.

Alternatively, you can specify a list of real machines in the job's machine attribute, as shown below:
machine: ferrari, lambo
If the max_load attribute was not defined for either real machine (as in our example), or both machines had ample load units available, AutoSys would choose the machine to run on based solely on available processing power. To accomplish this, AutoSys does the following:
1. Determines the percentage of CPU cycles available on each real machine in the specified virtual machine. This is accomplished by one of the following actions:
4. Chooses the machine with the largest result (that is, the machine with the most relative processing cycles available).
In the example machine list previously shown, the factor attribute is not specified for either machine, and thus the default factor value for each machine is 1.0. If the machines have equal max_load and factor values, it is equivalent to defining a job and specifying the following in the machine field:
machine: ferrari, lambo
The advantages of building a virtual machine are that it can be changed, and the new construct is immediately applied globally. Also, the values can vary between machines. Even when a set of real machines that have not been explicitly defined to AutoSys are specified in a job's machine attribute, the available CPU cycles are used to determine which machine will run the job. In all likelihood, your system configuration will include machines of varying processing power, so you will need to specify the factor attribute value for each real machine. The following illustrates three machines having different capabilities, which are grouped into a virtual machine.
To start a job on this virtual machine, simply specify italia as the machine attribute for the job. The event processor will perform the necessary calculations to determine on which machine to run the job, and reflect these calculations in its output log. The output is similar to this:
EVENT: STARTJOB JOB: test_mach
Checking Machine usages using RSTATD :
<ferrari=78*[1.00]=78> <alfa_romeo=80*[.80]=64> <lambo=2*[.30]=06>
[ferrari connected]
EVENT: CHANGE_STATUS STATUS: STARTING JOB: test_mach
Note that even though ferrari reported fewer available CPU cycles than alfa_romeo (78 versus 80), ferrari was picked because of the factors (78 * 1.0 > 80 * 0.8). Thus, the factors weight each machine to account for variations in processing power.
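The weighting shown in the log can be reproduced with a few lines of arithmetic. The Python below is a sketch of the calculation only, using the values from the log output; the function name pick_machine is hypothetical, and how the event processor obtains the availability figures is not modeled.

```python
def pick_machine(available_pct, factors):
    """Return the machine with the largest (available cycles x factor) product."""
    # A machine without an explicit factor uses the default factor of 1.0.
    weighted = {name: available_pct[name] * factors.get(name, 1.0)
                for name in available_pct}
    best = max(weighted, key=weighted.get)
    return best, weighted

# Values taken from the event processor log shown above.
available = {"ferrari": 78, "alfa_romeo": 80, "lambo": 2}
factors = {"ferrari": 1.0, "alfa_romeo": 0.8, "lambo": 0.3}

best, weighted = pick_machine(available, factors)
```

Here best is ferrari, because its weighted result of 78 exceeds the result of roughly 64 for alfa_romeo.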
Is the svload executable.
Is the method to be used by ServerVision to select the best machine. This algorithm must be defined in the profile file that you reference with the -p argument. For example, the algorithm you define could provide one method that would be used for I/O-bound jobs and a different method that would be used for CPU-bound jobs.
Is a virtual machine that has been defined to AutoSys.
Is a comma-separated list of real machines.
Is a file containing basic information and the performance metrics that apply to each possible algorithm that can be chosen. You can supply a path and filename for this option. To know what the ServerVision profile file may look like, see Examining the ServerVision Profile File.
2> filename    Specifies to redirect any standard error messages to a file.
where:
[ServervisionConfiguration]    Is a literal that must appear exactly as shown in this example.
Timeout    Is the number of seconds ServerVision should attempt to connect to a machine. The recommended setting is 12 seconds.
InstanceType=unix    Indicates the instance type. For this implementation, the value should be unix. For more information.
honda=honda    Are examples of names of the real machines defined to AutoSys with the name of the corresponding ServerVision instance. It is in the form of machine_name=ServerVision_instance. Typically, the ServerVision instance will have the machine name also.

These specify examples of names and definitions of the algorithms. You can create your own definitions, using any name, and you can have as many definitions in this file as you want. Each algorithm definition must use the Scan Groups as defined by ServerVision. The syntax must be a colon-separated string with the following elements, in the following order:
Scan Type: Indicates the Scan Type as defined by ServerVision.
Scan Object: Indicates the Scan Object that is associated with the Scan Type as defined by ServerVision.
maximum or minimum: Indicates the value that is desired, either the maximum number or the minimum number.
Weight: Indicates the item's importance in relationship to other items that make up the algorithm. For example, in the CPUMemory definition above, the CPU item is indicated as having twice the weight of the memory item.
This command prints to the screen the result metrics. You can use this information when you are creating the profile file and want to test your algorithm definitions.
WARNING! The -y argument is for testing only. Do not use it when using this command as the value for the machine job attribute. When testing the algorithm, do not include the 2> filename argument.
Queuing Jobs
Queuing jobs in AutoSys is a mechanism for ordering jobs that are unable to be run immediately. You can also issue a change priority event to change the priority of a job in the queue. There is no actual queue entity. Instead, jobs are chosen based on queuing policies. Queuing policies in AutoSys are established through the use, and subsequent interaction, of the two job attributes job_load and priority, and the two machine attributes max_load and factor. The following sections discuss queuing jobs and give examples of how load balancing and queuing are used to optimize job processing in your AutoSys environment.
The words "in the queue" refer to an actual AutoSys QUE_WAIT job status, and the job will stay in this state until the necessary load units become available. When the necessary load units become available, AutoSys again checks all the job's starting conditions to ensure it is still okay to run the job. If any of the starting conditions are no longer true, the following message is generated:
Job: job_name Starting Conditions are no longer TRUE. De-Queuing this Job and setting to ACTIVATED.
Note: In order for any queuing to take place, all jobs must have their priority attribute set. By default, the priority attribute is set to 0 indicating that the job should not be queued, but be run immediately. When this is the case, even jobs whose job load would push the machine over its load limit will be run. However, it is important to note that even when jobs have a priority of 0, job loads will still be tracked on each machine. This is done so that jobs that do have nonzero priorities will still be queued. Using a previously defined machine named fiat with a max_load of 100, a simple queuing example would be as follows:
insert_job: jobA
machine: fiat
job_load: 80
priority: 1

insert_job: jobB
machine: fiat
job_load: 90
priority: 1
If jobA was running when jobB started, jobB would be in a QUE_WAIT state until jobA completed and jobB could run. Note: If a job is in the QUE_WAIT state and you want to run it immediately, do not force start the job. To change the job queue priority, use the sendevent command with the -E CHANGE_PRIORITY option.
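The decision for jobB can be expressed as a short check against the machine's tracked load. This Python sketch is illustrative only: the function name start_decision and the "RUN" label are hypothetical (QUE_WAIT is the actual AutoSys status), and the priority 0 branch follows the note above that such jobs are never queued.

```python
def start_decision(job_load, priority, current_load, max_load):
    """Return 'RUN' or 'QUE_WAIT' for a job whose starting conditions are met."""
    if priority == 0:
        # Priority 0 jobs are never queued, even if they exceed the load limit.
        return "RUN"
    if current_load + job_load <= max_load:
        return "RUN"
    return "QUE_WAIT"

# fiat has a max_load of 100; jobA (job_load 80) is already running.
job_b_now = start_decision(job_load=90, priority=1, current_load=80, max_load=100)

# Once jobA completes, the tracked load on fiat drops back to 0.
job_b_later = start_decision(job_load=90, priority=1, current_load=0, max_load=100)
```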
The following illustrates a situation where a machine has 80 load units, and multiple jobs are waiting to start. In this example, JobB and JobC are executing while JobA and JobD are queued (in the QUE_WAIT state), waiting for available load units. The numbers in the figure indicate the job_load assigned to each job, and the max_load of the machine. The JIL statements provided below define the machine and the jobs.
insert_machine: ferrari
max_load: 80

insert_job: JobA
machine: ferrari
job_load: 50
priority: 60

insert_job: JobB
machine: ferrari
job_load: 50
priority: 50

insert_job: JobC
machine: ferrari
job_load: 30
priority: 80

insert_job: JobD
machine: ferrari
job_load: 30
priority: 70
In the above scenario, JobB and JobC are already running because their starting conditions were satisfied first. After JobB or JobC completes, JobA or JobD will start. Which of the two starts is determined by a combination of the priority and job_load attributes of each job, and the max_load machine attribute. The resulting scenario will differ, based on which job finishes first. If JobB finishes first, 50 load units become available, so either JobA or JobD could be run. Since JobA has a higher priority (lower value = higher priority), it will run first. However, if JobC finishes first, only 30 load units become available, so only JobD could be run.
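The two outcomes described above can be checked with a small selection function. This is a Python sketch of the described policy, not the event processor's actual logic; the function name next_job and the dictionary layout are hypothetical.

```python
def next_job(queued, free_units):
    """Pick the queued job that fits in free_units and has the best priority."""
    fitting = [job for job in queued if job["job_load"] <= free_units]
    if not fitting:
        return None
    # A lower priority value means a higher priority.
    return min(fitting, key=lambda job: job["priority"])["name"]

queued = [
    {"name": "JobA", "job_load": 50, "priority": 60},
    {"name": "JobD", "job_load": 30, "priority": 70},
]

if_b_finishes = next_job(queued, free_units=50)  # JobB frees 50 load units
if_c_finishes = next_job(queued, free_units=30)  # JobC frees 30 load units
```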
Subsets: Individual Queues
One variety of virtual machine can be considered a subset of a real machine. Typically, this type of virtual machine is used to construct an individual queue on a given machine. One use for this construct might be to limit the number of jobs, of a certain type, that will run on a machine at any given time. For example, you have created three different print jobs, but you want only one job to run on a machine at a time. You can accomplish this by using a combination of the max_load attribute for the virtual machine and the job_load attribute for the jobs themselves. The following illustration depicts a virtual machine functioning as a queue. The JIL statements to define the queue, called ferrari_printQ follow the graphic. Note that ferrari is a real machine.
To implement the schema in the previous illustration, you first create the virtual machine named ferrari_printQ, as follows:
insert_machine: ferrari_printQ
machine: ferrari
max_load: 15
Using this definition, only one of the jobs would run on ferrari at one time, since each job requires all of the load units available on the specified machine.
Load Units and Virtual Machines
It is important to note that the load units associated with a virtual machine have no interaction with the load units for the real machine. In the previous example, this means that the virtual load of 15 does not subtract from the 80 load units of the real machine. Load units are simply a convention that allows the user to restrict the number of jobs running concurrently on any one machine.
To implement the previous schema, you would first create the virtual machine named printQ, then you would specify two real machines, ferrari and lambo, as shown in the following example:
insert_machine: printQ
type: v
machine: ferrari
max_load: 15
machine: lambo
max_load: 15
As a job is logically ready to start on printQ, AutoSys will determine if there are enough load units available on either machine. If there are not, it will place the job in the QUE_WAIT state, and start it when there are enough load units. If there are enough units on only one machine, it will start it on that machine. In the case that there are enough available load units on both machines, AutoSys will determine the usage on each, and start the job on the machine with the most available CPU resources.
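The decision described above can be sketched as follows. This is a simplified model, not AutoSys code: choose_real_machine, the used_load bookkeeping, and the cpu_free figure (standing in for whatever usage measurement AutoSys actually takes) are all assumptions made for illustration.

```python
def choose_real_machine(job_load, machines):
    """Pick the real machine for a job queued on a virtual machine.

    machines maps a machine name to a dict with max_load, used_load, and
    cpu_free (an assumed 0.0-1.0 measure of available CPU resources).
    Returns None when no machine has enough load units (QUE_WAIT).
    """
    candidates = [
        name
        for name, info in machines.items()
        if info["max_load"] - info["used_load"] >= job_load
    ]
    if not candidates:
        return None  # the job would wait in the QUE_WAIT state
    # With several candidates, prefer the one with the most available CPU.
    return max(candidates, key=lambda name: machines[name]["cpu_free"])


print_q = {
    "ferrari": {"max_load": 15, "used_load": 15, "cpu_free": 0.9},
    "lambo": {"max_load": 15, "used_load": 0, "cpu_free": 0.4},
}
print(choose_real_machine(15, print_q))  # prints: lambo
```

Here ferrari has no free load units, so the job goes to lambo even though ferrari reports more available CPU; the CPU comparison only matters when both machines have enough load units.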
At runtime, the script /usr/local/bin/pick_free_mach is run on the event processor machine. The standard output will be substituted for the name of the machine, and the job will be run on that machine.
Note: If you specify a user-defined load balancing script in the machine attribute, you cannot use the priority or job_load job attributes.
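The only contract stated above is that the script writes a machine name to standard output. A minimal sketch of such a script, written in Python, might look like the following. The pick_free_machine function and the hard-coded load table are hypothetical; a real script would obtain load figures from its own measurements.

```python
import sys


def pick_free_machine(load_by_machine):
    """Return the name of the machine with the lowest reported load."""
    return min(load_by_machine, key=load_by_machine.get)


if __name__ == "__main__":
    # Hypothetical static load table, for illustration only.
    loads = {"ferrari": 0.75, "lambo": 0.20}
    # The machine name on standard output is what gets substituted.
    sys.stdout.write(pick_free_machine(loads) + "\n")
```

The script prints nothing but the chosen name, since any extra output would become part of the substituted machine name.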
Chapter 10
This chapter describes how to use the AutoSys Operator Console to monitor and control job activity in real-time. It also describes the job selection and reporting features of the console, as well as the Alarm Manager. Customizing the Operator Console is also covered.
The Operator Console provides a sophisticated method of monitoring AutoSys jobs in real-time. It lets you view any jobs that are defined to AutoSys, whether they are currently active or not. Job selection criteria, which you can change dynamically, allow you to control which jobs you want to view based on various parameters, such as the current job state, the job name (with wildcarding), and the machine on which the job runs.
You can select any job and view more detailed information about it, including its starting conditions, dependent jobs, and autorep reports. You can even invoke the Job Definition dialog directly from this window and change the job, if the correct permissions are set.
Alarm Manager
In addition to its job monitoring capabilities, the Operator Console provides an Alarm Manager, which lets you monitor alarms as they are generated. You can manage alarms by doing the following:
Entering responses directly at the Alarm Manager dialog.
Setting the alarm's state to either acknowledged or closed. (If an alarm's state is open and you simply acknowledge it without closing it, it will be set to acknowledged.)
Alarms and their responses are stored in the AutoSys database, from which they can be retrieved for viewing, or for adding additional responses. You can dynamically select which alarms you want to view based on such criteria as alarm type, alarm state, and the date and time range in which the alarm was generated.
InfoReports
AutoSys provides a number of reports with details on AutoSys jobs, which you can view using the InfoReports GUI (either the one bundled with AutoSys or your own InfoReports installation). The following types of reports are available:
Job List: This report allows you to select the list of jobs to be reported on or report on all jobs in the database. It shows almost everything associated with a job.
Job Find: This report allows you to enter a pattern for the job name, and report on all jobs that match the pattern. This is very similar to the Job List report.
Last Run: This report allows you to enter the job name and get information regarding the last run of the specified job.
Last n Run: This report allows you to enter the job name and get information on the nth to last run of the job.
You access these reports by clicking on one of the user-configurable action buttons in the Operator Console. (Enabling these buttons is described in Accessing InfoReports from the Operator Console.) The buttons can be configured to open any of the available reports.
Once you have defined these buttons, click one to run the report. When you do this, the InfoReports Database Login dialog appears. In this dialog, you enter information in the following fields:
1. In the Source Type drop-down menu, specify which database you are using, either Sybase or Oracle.
2. In the User field, enter the AutoSys database user name.
3. In the Password field, enter the password for the AutoSys database user.
4. In the Connection field, enter the Sybase database name (for Sybase) or the TNS name (for Oracle).
5. Click OK after entering all the required information.
The dialog that appears next depends on the type of report to be displayed. You will be asked to supply information, such as a job name or a job run number. After processing these dialogs, the requested information is read from the database and is displayed in an InfoReports report.
Setting a Scrolling Buffer
To set the number of panels (pages) buffered for the InfoReports Print Preview, you must modify the InfoReports prorep.prf file in your home directory ($HOME). Open the prorep.prf file and add the following entry at the end of the file:
[Buffer-Control]
Print-Preview-Panels=50
where:
50
Is any integer greater than zero. It is the limit on the number of panels to buffer. The default value is 3.
Refer to your InfoReports documentation for more information on using InfoReports.
[Figure: diagram with the labels Job Selection, Alarm Manager, and Alarm Selection]
The Job Activity Console has a menu bar and the following three regions:
Job List: Displays a list of all jobs stored in the database, subject to the job selection criteria currently in effect.
Currently Selected Job: Displays more detailed information about the currently selected job.
Control Area: The bottom portion of the Job Activity Console is the Control area. The left side of this area contains buttons that act on the currently selected job; the middle contains buttons that act on the console screen; the right side contains the Alarm button, which displays the Alarm Dialog, and Exit to exit the Job Activity Console.
By default, the Job Activity Console starts up in Freeze Frame mode, which prevents the display from regularly updating and refreshing. To observe changes as they occur, click Freeze Frame to toggle the mode off.
Menu Bar
At the top of the Job Activity Console is the menu bar, containing three menus: File, View, and Options.
File Menu: Contains the Exit option, which functions exactly like the Exit button in the Control area: it displays a verification dialog asking you to confirm the exit. If you confirm, the Job Activity Console is closed. If the Alarm Manager was open, it will be closed as well.
View Menu: Contains the Select Jobs option, which displays the Job Selection dialog, discussed in Job Selection Dialog in this chapter.
Options Menu: Contains the Console Clock Perspective option. You have three choices: Server Time, Current Job Time, or Local Machine Time. This option controls the time perspective of the display, as discussed in the 10.
Job List
The Job List region displays a list of all the jobs that are defined to AutoSys, subject to the job selection criteria currently in effect. Each entry in the Job List contains the following information about a single job:
Job name.
Description.
Current status.
Command that is defined for this job, if it is a command job. If it is a file watcher job, the file to watch for appears in the Command column. If it is a box job, the Command column is empty.
Machine on which the job ran or is currently running.
The entries in the Job List provide a snapshot of the entire system, across multiple machines.
When you select a job from the list, the highlighted job becomes the currently selected job, and more detailed information about the job appears in the Currently Selected Job region.
To select a job: Click the job in the Job List display.
When you select multiple jobs, you can perform actions on all those jobs at the same time.
To select contiguous multiple jobs: Press and hold the mouse button and drag to select a group of jobs.
To select noncontiguous multiple jobs: Hold the Control key and click each job you want to select. Holding the Control key and clicking a selected job will deselect that job.
To deselect all the currently selected jobs: Click anywhere in the job list.
Note: The Job List region has a scroll bar along the right side for scrolling through the job list. Using the X resource file, you can configure the relative sizes of the columns in the Job List, as well as the length of each field and the spacing between fields.
Next Start: If the job has date and time starting conditions, this field shows when the next run of the job is scheduled to start.
Machine: The name of the machine on which the job ran or is currently running. If a job is defined to run on a virtual machine, the name of the real machine component on which it actually ran will appear here.
Queue Name: If the job is queued to start on a machine, the name of that machine appears here.
Priority: If the job is queued to start on a machine, its priority in the queue appears here.
Num. of Tries: If the job had to be restarted, the number of times it was started appears here.
Starting Conditions
The Starting Conditions area displays the job's entire starting condition, as specified in its job definition, as well as the atomic conditions, which are the most basic components of an overall condition. This information is very useful when troubleshooting a job. For example, in the sample Job Activity Console, shown in Job Activity Console in this chapter, the job named DripCoffee has one overall starting condition.
This starting condition is specified in the job's definition. However, it is actually composed of the following two atomic conditions:
SUCCESS(Grinder)
SUCCESS(BoilWater)
In the Starting Conditions area, each atomic condition is displayed with the Current State of the job upon which it is based; in our example, Grinder and BoilWater, respectively. Also, a True/False flag is provided that indicates whether or not that atomic starting condition has been satisfied.
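The breakdown into atomic conditions, each paired with a current state and a True/False flag, can be sketched as follows. This is an illustrative model only: the combined condition string is an assumed example (the exact expression for DripCoffee is not reproduced in this section), the job states are made up, and the flag is computed by simple equality, which is adequate for SUCCESS conditions but is a simplification for other condition types.

```python
import re

# Matches KEYWORD(job_name), for example SUCCESS(Grinder).
ATOMIC_PATTERN = re.compile(r"(\w+)\(\s*([^()\s]+)\s*\)")


def atomic_conditions(condition):
    """Split an overall starting condition into (keyword, job) pairs."""
    return [(kw.upper(), job) for kw, job in ATOMIC_PATTERN.findall(condition)]


def condition_report(condition, current_state):
    """Return (atomic condition, current state, satisfied?) rows."""
    rows = []
    for keyword, job in atomic_conditions(condition):
        state = current_state.get(job, "UNKNOWN")
        rows.append((f"{keyword}({job})", state, state == keyword))
    return rows


# Assumed example expression and job states, for illustration only.
condition = "SUCCESS(Grinder) AND SUCCESS(BoilWater)"
states = {"Grinder": "SUCCESS", "BoilWater": "RUNNING"}
for row in condition_report(condition, states):
    print(row)
```

With these assumed states, the row for Grinder is flagged True and the row for BoilWater is flagged False, which is the kind of information that points to the upstream job holding up a run.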
If a job has not run within the time frame it was expected to, you would select the job from the Job List and check its starting conditions to quickly determine what upstream job might be preventing it from running. The atomic condition list is selectable. By clicking any one of the atomic conditions, the job associated with that condition will become the currently selected job, and its details will be displayed in the middle region of the screen. This feature allows you to quickly step through upstream dependencies, checking out each job along that path.
Reports
The Reports area displays a real-time report, and it is also included in the Currently Selected Job region. This report presents job run information in the same format as that produced by the autorep command. You can choose from the following report types:
Summary: A one-line synopsis of the last or current execution of the job showing the job name, timestamp of the last start and last end of the job, status, job run number and number of tries (separated by a slash), and priority (if the job is in QUE_WAIT status) or exit code (if the job completed).
Event: A detailed report listing all the events and statuses from the last or current execution of the job. The screen shown in Job Activity Console in this chapter shows an Event Report.
None: Does not display a report.
Summary and Event reports will be run automatically each time the dialog is refreshed. The default refresh interval is every five seconds, but the interval is user-configurable. If the Event report is chosen, you can watch the real-time progression of a job, observing, as they occur, the arrival of the various events, such as the job starting, running, completing, and restarting.
Control Area
The bottom region of the Job Activity Console is the Control area.
Action Buttons
On the left side of this region is a group of push buttons that can be pressed to initiate certain actions on the currently selected job or jobs. By pressing the appropriate button, you can issue an event that will:
Start a job.
Kill a job.
Force a job to start.
Place a job on hold.
Take a job off hold.
In each of these cases, a dialog box asks you to confirm, after which the action is taken immediately, without requiring you to perform any further actions. If you have initiated an action on multiple jobs, the dialog will display the following:
Ready to send event event for # jobs
where:
event
Is the action you chose.
#
Is the number of jobs that are selected.
When you click Send Event, the Send Event dialog displays, and it allows you to send any type of event.
In the last column are buttons that are user configurable. You can associate any command with these buttons, and specify your own button labels. This process is explained in User-Configurable Action Buttons in this chapter.
Send Event Dialog
The Send Event dialog box provides you the means to do the following:
Send any event that can be sent manually in AutoSys.
Select the various event parameters you want to specify when sending the event.
Cancel an event that has been scheduled to occur in the future.
Note: The fields of the Send Event dialog correspond to sendevent command options. For a complete description of these options, see the description of the sendevent command in the chapter AutoSys Commands of the Unicenter AutoSys Job Management Reference Guide. The following is the Send Event dialog:
You specify an event using one of the radio buttons at the top of the dialog. Just below these buttons is the Job Name field, which by default contains the name of the currently selected job. You can change this field if desired.
You can specify when the event is to take effect, either Now (the default), or at some future time and date. (The current time and date are provided as examples of the required format.) Use the A.M. and P.M. radio buttons if you want to specify the time using a 12-hour format. If the time field contains an hours setting that is less than 13, it is considered a.m., while any larger value is considered p.m.
The Comment field is a free-form field in which you can enter any text you want to associate with this event in the database; this field is for documentation purposes only. For example, if you force a job to start, you might provide an explanation about why this was necessary.
The AUTOSERV instance field displays the current AutoSys server identifier; only when events need to be sent to a different AutoSys instance should this field be changed.
The Global Name and Global Value fields are used when you have specified a SET_GLOBAL event. Global Name and Global Value can each be a maximum of 30 characters.
The Signal field is used to specify the signal number if you specified a SEND_SIGNAL event.
The Queue Priority field is used only when you have specified a CHANGE_PRIORITY event. This affects the run priority of a job that is in QUE_WAIT state.
You can only make a selection from the Status pull-down menu if the Change Status event type (radio button) has been selected. This menu lets you select a new status for the currently selected job.
Use the Send Priority radio buttons to specify whether the event is to be sent with normal priority (the default), placing the event in the queue with all system-generated events, or with high priority, placing it at the top of the event queue. The latter is normally reserved for emergencies, such as killing a job.
The Execute button of the dialog executes, or sends, the event. The Cancel button cancels the event that was about to be sent. When either button is pressed, the Send Event dialog is dismissed.
Canceling a Sent Event
At the Send Event dialog, you can cancel one or more events scheduled to occur sometime in the future. You can do this in one of two ways: by canceling a specific event or by canceling a specific event type for a specific scheduled time.
Note: You should use this feature to cancel events that you have sent from the Send Event dialog. If you want to override a scheduled starting condition for a job, you should use the one-time override job attribute, either from the Job Definition dialog or from JIL.
To cancel a specific event:
1. In the Event Type region, specify an event type by selecting one of the radio buttons.
Note: You can select multiple jobs in the Job Activity Console before you open the Send Event dialog. If you do so, the Send Event dialog will send this cancel event for all of the selected jobs that meet the Event Type criteria.
2. Select the Cancel Previously Sent Event radio button.
3. In the Job Name field, enter the job name.
4. Click the Execute button. This process cancels all pending events of the specified Event Type for the selected jobs.
To cancel a specific event by its scheduled time:
1. Make sure you have the appropriate job in the Job Name field.
Note: You can select multiple jobs in the Job Activity Console before you open the Send Event dialog. If you do so, the Send Event dialog will send this cancel event for all of the selected jobs that meet the Event Type and Time criteria.
2. In the Event Type region, specify an event type by selecting one of the radio buttons.
3. Select the Cancel Previously Sent Event radio button.
4. Select the Match on Time radio button.
5. In the Time field (of the Future region), specify the time the event is scheduled to occur.
6. Click the Execute button. This cancels all pending events of the specified Event Type at the specified Time for the selected jobs.
The Cancel Previously Sent Event feature is designed to be used primarily on events that you have sent from the Send Event dialog. If you want to override a scheduled starting condition for a job, you should use the one-time override job attribute, either from the Job Definition dialog or from JIL.
If you cancel a future Start Job event for a time-dependent job with no other starting conditions, the job may never run again unless you start it manually with a Send Event command. For example, jobA is scheduled to run daily at 11:00. jobA starts at 11:00 on Monday and completes at 11:30, at which time the next future Start Job event is sent for 11:00 Tuesday. At 9:00 on Tuesday, you cancel the 11:00 Start Job event. The job not only does not run at 11:00 on Tuesday, but it will not be scheduled to run again. To restart the job, you can either update its job definition, or manually issue a Start Job event from the Send Event dialog.
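The reason a canceled Start Job event stops all later runs is that each run is what schedules the next one. The following Python sketch models that chain. It is a deliberately simplified illustration, not AutoSys code: DailyJob and its methods are hypothetical names, the dates are arbitrary, and the model schedules the next start as soon as a run is processed rather than at the moment the run completes.

```python
from datetime import datetime, timedelta


class DailyJob:
    """Toy model of a time-dependent job with one pending Start Job event."""

    def __init__(self, name, first_start):
        self.name = name
        self.pending_start = first_start  # the single future start event
        self.runs = []

    def cancel_pending_start(self):
        # Models canceling the previously sent future Start Job event.
        self.pending_start = None

    def advance_to(self, now):
        # Each processed run schedules the next day's start event.
        while self.pending_start is not None and self.pending_start <= now:
            started = self.pending_start
            self.runs.append(started)
            self.pending_start = started + timedelta(days=1)


monday_11 = datetime(2024, 1, 1, 11, 0)  # arbitrary Monday
job_a = DailyJob("jobA", monday_11)

job_a.advance_to(datetime(2024, 1, 1, 11, 30))  # Monday run happens
job_a.advance_to(datetime(2024, 1, 2, 9, 0))    # Tuesday 9:00, nothing due
job_a.cancel_pending_start()                    # cancel Tuesday's 11:00 start
job_a.advance_to(datetime(2024, 1, 5, 12, 0))   # Friday: still only one run

print(len(job_a.runs), job_a.pending_start)     # prints: 1 None
```

Once the pending event is removed, nothing remains to trigger a run, so no further event is ever scheduled; assigning a new pending_start corresponds to manually sending a Start Job event or updating the job definition.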
Control Buttons
In the middle of the Control area, there are several push buttons that provide you with control functions for the Console screen.
The Job Definition button displays the Job Definition dialog with the currently selected job already displayed. This allows you to quickly review the job's definition, and change it if necessary (and if permissions allow).
The Dependent Jobs button displays the Dependent Jobs dialog that contains a list of all the jobs directly dependent on the currently selected job. This allows you to quickly see which jobs will be affected by the current job; in particular, which jobs will not run until the current job completes. This is useful if the currently selected job is running late and you need to determine which other jobs will be affected. As with the atomic conditions list, any job in this Dependent Jobs List can be selected, making it the currently selected job (and dismissing the dialog).
The Dependent Jobs dialog can be dismissed by pressing the Close button, or by selecting another job in the Job Activity Console. The following is the Dependent Jobs dialog for the job BoilWater:
This feature allows you to follow the chain of job dependencies arbitrarily far downstream, and to determine which jobs are in some way dependent on another job.
The Freeze Frame button freezes the Console display, which otherwise is regularly updated. By default it is updated every five seconds; this refresh interval is user-configurable. In Freeze Frame mode, all processing continues normally, but the screen is not refreshed. This feature is useful, for instance, when you are viewing the Event Report's output and the display has scrolled through some of the output. A refresh operation would reset the report display to the first line of output, forcing you to scroll back to the area that you were viewing. When the Freeze Frame button is toggled back off, the Console once again reflects the current state of the system.
There is a user-configurable X resource that causes the Operator Console to start in Freeze Frame mode; this is the installation default. For more information about customizing the Operator Console, see Customizing the Operator Console in this chapter.
The three Report buttons let you choose the type of report you want to view, as described earlier in this section. You can specify a default report type using the X resources; see Default Report in this chapter.
Job Path (History) Dialog
When you single-click on a job in the atomic conditions list or in the dependent jobs list (thereby changing which job is the currently selected one), the Job Path (History) dialog appears. This dialog contains a list of all the jobs selected since the last time a job was selected directly from the Job List, in the order in which they were selected.
Assuming that you had clicked the atomic condition for BoilWater while displaying the DripCoffee job, the new currently selected job would be BoilWater, and the Job Path (History) dialog would display with the appropriate entries for both of the jobs you have traversed since your last selection from the Job List. The following shows a Job Path (History) dialog box with these entries.
Using this dialog, you can quickly return to any previously selected job by clicking on the job's name.
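The bookkeeping behind the Job Path (History) list can be pictured with a short Python sketch. The JobPath class and its method names are hypothetical, and the model covers only the two behaviors stated above: a direct selection from the Job List starts a new path, and each selection made through the atomic conditions list or the dependent jobs list is appended in order.

```python
class JobPath:
    """Toy model of the Job Path (History) list."""

    def __init__(self):
        self.entries = []

    def select_from_job_list(self, job):
        # A direct selection from the Job List starts a new path.
        self.entries = [job]

    def follow_link(self, job):
        # Selecting via an atomic condition or the dependent jobs list
        # appends to the path, preserving the order of selection.
        self.entries.append(job)


path = JobPath()
path.select_from_job_list("DripCoffee")
path.follow_link("BoilWater")
print(path.entries)  # prints: ['DripCoffee', 'BoilWater']
```

The printed list corresponds to the DripCoffee and BoilWater example in the text; what the real dialog does to the list when you click an earlier entry is not described here and is therefore left out of the model.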
Alarm Button
The large Alarm button serves both as an indicator that a new alarm has been detected and as a way to display the Alarm Manager dialog. When a new alarm occurs, the Alarm button turns red. When you then press the button, its color returns to green, and the Alarm Manager dialog is displayed. If the Alarm Manager dialog is already on the screen, but is obscured by the Job Activity Console, pressing the Alarm button will bring the Alarm Manager dialog to the top of the display. The Alarm button can also be used to update the Alarm Manager dialog, even if Freeze Frame is in effect.
Exit Button
The Exit button is used to close the Job Activity Console, including the Alarm Manager. When this button is pressed, a verification dialog displays asking you to confirm the exit.
When selecting jobs based on box name, each level of box/job will be indented two spaces to indicate the nesting. The following shows this convention in the Job List:
Selecting Machines
To choose a single machine: Click on that machine.
To choose a range of machines: Click and hold on the first machine name, drag the cursor to the last name, then release the mouse button.
To choose additional machines after the initial selection: Hold down the Control key and perform the actions previously shown.
To choose all machines: Select the All Machines option.
Job Status
Machine Name
Unsorted
or:
Clicking the Cancel button dismisses the Job Selection dialog without changing the selection criteria.
The Alarm Manager dialog has a menu bar and the following three regions:
Alarm List: At the top of the dialog.
Currently Selected Alarm: In the middle of the dialog.
Control: At the bottom of the dialog.
Options Menu
Alarm List
The Alarm List region of the dialog displays a list of all the alarms that are currently in the system, and that meet the viewing criteria specified by the user, which may include closed alarms. The default is to display all Open and Acknowledged alarms, of any type, regardless of the time they were generated. Each entry in the Alarm List contains the following information about a single alarm:
Alarm type.
The job for which the alarm was generated.
Date and time at which the alarm was generated.
The alarm's current state.
Any comment associated with the alarm at the time it was generated.
Alarms are displayed in reverse order of occurrence; the newest alarms appear at the top of the list and older ones appear farther down. An alarm is made the currently selected alarm by clicking anywhere on the line on which it is displayed. Selecting multiple alarms allows you to perform actions on multiple alarms at once.
To select multiple alarms: Do one of the following:
Press and hold the mouse button and drag to select a group of alarms.
Hold the Control key and click each alarm that you want to select. Holding the Control key and clicking a selected alarm will deselect the alarm.
Clicking anywhere in the alarm list will deselect all currently selected alarms. Note: You can configure the widths of the columns in the Alarm List using the X resource file, described in Alarm List Column Width in this chapter.
Control
The third region of the Alarm Manager dialog is the Control region, at the bottom of the dialog. In this region, there are several buttons used to control the Alarm Manager. They are:
Freeze Frame Button
As in the Job Activity Console, the Alarm Manager is automatically updated at regular intervals. The Freeze Frame button suspends this automatic refreshing of the screen, keeping the Alarm List static while you are working on it. The currently selected alarm will remain displayed until another alarm is selected, even when the dialog is being updated; as a result, responses can be entered uninterrupted. The Freeze Frame feature is particularly useful when scrolling through the Alarm List, since newly arriving alarms are added at the top of the list, and the list scrolls back to the top each time the display is refreshed.
Select Job Button
The Select Job button causes the job associated with the currently selected alarm (if there is one) to become the currently selected job on the Job Activity Console. This is useful when you want to review the details of the job for which the alarm was generated. In the example in Alarm Manager Dialog, the currently selected alarm is associated with the job AlarmClock. Therefore, pressing the Select Job button in the Alarm Manager will cause AlarmClock to become the currently selected job in the Job Activity Console. This operation will also update the Job Path (History) dialog (discussed in Job Path (History) Dialog in this chapter) in the process.
New Alarm Button
The New Alarm button serves the same purpose as the Alarm button on the Job Activity Console. This button turns red when a new alarm arrives, which is particularly useful when the Alarm Manager is not refreshing (when the Freeze Frame feature is in effect). When you press either the New Alarm button on this dialog, or the Alarm button on the Job Activity Console, the Alarm List is updated and both of these buttons are reset to green, even if the Freeze Frame feature is on. Even when this dialog is refreshing regularly, the New Alarm button can serve as an indicator that a new alarm has arrived.
When the alarm selection criteria are restrictive enough to filter out the new alarm, a warning dialog displays when the New Alarm button is pressed. This dialog offers to reset the selection criteria to the defaults; this will ensure that any new alarms that are generated will be displayed. Regardless of whether or not you accept this option, the New Alarm button's color will be reset to green.
Registering Responses and Changing Alarm States
To register a response or change the state of an alarm in the AutoSys database, you must explicitly save the alarm.
To save an alarm to the database: Do one of the following:
Click Apply. This action does not dismiss the Alarm Manager.
Click Cancel.
Typically, you would use Apply to register changes, since the Alarm Manager would probably be running on a continual basis.
The Alarm Selection dialog is divided into three regions, described in the next sections:
Select by Type
In the Select by Type region of the dialog, a list of all possible alarm types is displayed. From this list, you can select one, several, or all types of alarms. The default is All alarm types.
To choose a single alarm from the list: Click the alarm's name.
To choose a range of alarms: Click and hold the mouse button on the first alarm name, drag the cursor to the last alarm in the range, and release the mouse button.
To choose noncontiguous alarms: Hold down the Control key and click the desired alarms. Hold down the Control key and click a selected alarm to deselect the alarm.
To choose all alarm types: Select the All Types option, which overrides any more-specific settings.
Select by State
You can also select alarms by the state of the alarm. You can select any or all of the states by toggling on the appropriate buttons, or the All States toggle. The default is to display all Open and Acknowledged alarms.
Select by Time
By default, alarms are shown regardless of the time they were generated. You can choose to display only alarms that were generated during a specific date and time window. Fields are provided to specify a From Date, From Time, To Date, and To Time. You can specify dates without times. However, you cannot specify times without dates. The current system date and time are automatically filled in for your convenience. You use a 24-hour format when specifying times.
To make the alarm selection take effect: Do one of the following:
Click OK. This sets your selections and dismisses the Alarm Selection dialog. or:
Click Apply. This sets your selections without dismissing the dialog.
Click Cancel.
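The three selection criteria (type, state, and time window) combine as a filter over the alarm list. The Python sketch below models that filter under stated assumptions: the Alarm class, select_alarms, and build_bound are hypothetical names, the alarm type labels and dates are illustrative, and treating a date given without a time as covering the whole day is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from datetime import date, datetime, time


@dataclass
class Alarm:
    alarm_type: str
    job: str
    generated: datetime
    state: str  # "open", "acknowledged", or "closed"


def build_bound(day, clock, default_clock):
    """Combine a date and an optional time; a time alone is rejected."""
    if day is None:
        if clock is not None:
            raise ValueError("a time cannot be specified without a date")
        return None
    return datetime.combine(day, clock if clock is not None else default_clock)


def select_alarms(alarms, types=None, states=("open", "acknowledged"),
                  from_date=None, from_time=None, to_date=None, to_time=None):
    """Filter by type, state, and time window; newest alarms first."""
    start = build_bound(from_date, from_time, time.min)
    end = build_bound(to_date, to_time, time.max)
    chosen = [
        alarm for alarm in alarms
        if (types is None or alarm.alarm_type in types)
        and alarm.state in states
        and (start is None or alarm.generated >= start)
        and (end is None or alarm.generated <= end)
    ]
    return sorted(chosen, key=lambda alarm: alarm.generated, reverse=True)


alarms = [
    Alarm("JOBFAILURE", "AlarmClock", datetime(2024, 1, 1, 8, 0), "open"),
    Alarm("JOBFAILURE", "DripCoffee", datetime(2024, 1, 2, 9, 30), "closed"),
    Alarm("MAXRUNALARM", "BoilWater", datetime(2024, 1, 3, 7, 15), "acknowledged"),
]
for alarm in select_alarms(alarms):
    print(alarm.job, alarm.state)
```

With the default arguments the sketch mirrors the stated defaults: all types, Open and Acknowledged states only, and no time restriction, listed newest first.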
The interval between database reads to refresh the displays.
The fonts and colors that are used in the Job Activity window as well as the Operator Console dialogs.
How many characters of certain fields should be displayed.
Whether or not the Operator Console starts up in Freeze Frame mode.
The Operator Console GUI icon text and the Operator Console title bar text.
User-configurable action buttons.
Descriptions of each of the resources that can be customized follow. All of these can be set by modifying the X resource file Autocons. The X resource files reside in the local app-defaults directory, which varies across platforms. It is usually /usr/lib/X11/app-defaults or /usr/openwin/lib/app-defaults. If you are not sure which directory these files are in, ask your system administrator. Individual users may have their own copy of the X resource files in their $HOME directory, which will take precedence over the app-defaults files.
For most operating systems, if you are exporting the display to another machine, you must edit the appropriate files in the app-defaults directory on the local machine. For Solaris, you must edit the files in both the /usr/lib/X11/app-defaults and /usr/openwin/lib/app-defaults directories. The files in /usr/lib/X11/app-defaults control the resources when you export the display.
We have listed the various resource names here as a reference only. For the default values, see the Autocons file. The provided resource file values work well for a majority of platforms.
Changing Fonts
The fonts used in the Operator Console fall into the following three categories:
Those you can choose, independent of other fonts.
Bold fonts, which must correlate with normal fonts.
Normal fonts, which must correlate with bold fonts.
Several of the lists in the Operator Console present information in columns, such as the Job List, which displays the job name, command, and so forth. In order to maintain the appropriate spacing for these columns, all characters in each field must be the same width. Therefore, fixed-format fonts must be specified for all column-oriented lists. In order for the labels at the top of the columns to align with the columns themselves, those labels must also use fixed-format fonts, in the same size and style. We recommend that a bold-faced font be used for the labels, so that they are consistent with the other, non-list field labels (unless, of course, you have chosen a non-bold font for everything). The font chosen for the column data should be a non-bold version of the same font used in the labels. Note: The model X resources file provided specifies values that work well for each of the SunOS and AIX platforms.
The following resources do not necessarily relate to other fonts, but should match the non-bold font of the list fonts shown previously, for consistency across the application.
*pathList*fontList:
*alarmCurrentText*fontList:
Object Color
The following resources affect the colors of the various objects within the Operator Console. We recommend that the default colors be used, since they match those used in the main AutoSys graphical user interface.
Currently Selected Job Name Field
The following resource controls the currently selected job name field, and should be a different color from the rest of the interface, for emphasis.
*workAreaForm*XmForm*currJobName.background:white
Background Color of Variable Fields
The following resources control the background color of the variable fields found in the interface, such as lists and text values, which change based on user interactions.
*workAreaForm*XmForm*XmText.background:
*workAreaForm*XmForm*XmTextField.background:
*workAreaForm*XmForm*XmList.background:
*workAreaForm*XmForm*XmScrolledList.background:
*workAreaForm*XmForm*XmScrolledText.background:
*workAreaForm*XmPanedWindow*XmForm*XmScrolledWindow.background:
*XmDialogShell*XmTextField.background:
*XmDialogShell*XmText.background:
*XmDialogShell*XmList.background:
*selection_dialog*XmTextField.background:
*selection_dialog*XmList.background:
*depend_dialog*XmList.background:
*path_dialog*XmList.background:
*alarm_dialog*XmTextField.background:
*alarm_dialog*XmText.background:
*alarm_dialog*XmList.background:
Border Colors
The following resources set the decorative dark blue borders of the interface:
*workAreaForm.background:
*workAreaForm*XmPanedWindow.background:
*workAreaForm*controlForm*XmSeparator.background:
Primary Interface Color
The following resources set the primary interface color, medium gray:
*workAreaForm*XmForm*background:
*menuBar*background:
*exitDialog*background:
*XmDialogShell*background:
*selection_dialog*background:
*depend_dialog*background:
*path_dialog*background:
*alarm_dialog*background:
Toggle Button Color
The following resource sets the toggle button color used to indicate that the toggle button is on:
*selectColor:
Atomic Condition Fields
The following resources apply to the atomic conditions in the Job Activity Console, that is, the fields that describe a starting condition at its most basic level:
#length of condition (such as SUCCESS(JOBA)):
*atomicCondLength:
#length of current state (eg, SUCCESS):
*atomicStateLength:
#length of true/false flag - whether the current
#state satisfies the condition:
*atomicTFLength:
Note: When changing icon text, be sure the length of the new text string does not exceed the recommended maximum length for icon title text for your windowing system. Some window managers can display long icon text strings, while others will truncate them. Ensure the text string you specify for your icons displays appropriately. Also, some window managers allow you to change the size of icons and icon text font.
By default, these resources are commented out. Be sure to remove the preceding exclamation point (!) when you edit these specifications. Button labels can be up to 20 characters, including spaces.
You must specify the full path to the command (environment variables are not resolved). After the path to the command, environment variables are allowed. $JOB will take the value of the currently active job. We recommend that you run the command in the background by placing an ampersand (&) at the end of the command. Otherwise, the Operator Console will pause until the command has completed. The following example shows how to use this resource to customize button1 to delete the AutoSys job that is currently selected in the Job Activity Console, and button2 to run a command called printScreen:
Autocons*userButton1Label: Delete Job
Autocons*userButton1Command: /usr/local/bin/deleteJob $JOB&
Autocons*userButton2Label: SnapShot
Autocons*userButton2Command: /usr/local/bin/printScreen&
Accessing InfoReports from the Operator Console
This section explains how to configure an Operator Console action button to display reports on AutoSys jobs using InfoReports. The Autocons X resource file contains the following example InfoReports button label and command string, which are commented out by default:
!Autocons*userButton1Label:NthfromLastRun
!Autocons*userButton1Command:/autosys/inforeports/xproview rep/autosys/inforeports/reports/lastnrun.rep &
To enable the display of reports by clicking a user-configurable button, open the Autocons file in a text editor and follow these steps:
1. Remove the comment characters (!) from the beginning of the lines.
2. If desired, change the button label text to any text string of 20 characters or less.
3. The default example assumes the InfoReports executable file (xproview) is located in the /autosys/inforeports directory. If necessary, replace this path with the actual path on your system. If you have the full InfoReports product installed on your machine, you can replace this path with the path to your InfoReports installation.
4. The default example assumes the reports (.rep files) are located in the /autosys/inforeports/reports directory. If necessary, replace this path with the actual path on your system.
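Step 1 can also be done without a text editor. The following is a minimal sketch, assuming the commented-out lines begin exactly with "!Autocons*userButton"; the function name enable_user_buttons is hypothetical. It prints the modified file to standard output rather than changing it in place, so you can review the result first.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): print a resource file with the
# leading "!" removed from userButton lines only. Other comment lines and
# all other resources pass through unchanged.
enable_user_buttons() {
    sed 's/^!\(Autocons\*userButton\)/\1/' "$1"
}

# Example call (review the output before replacing the original file):
#   enable_user_buttons Autocons > Autocons.new
```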
The example command line is configured to display the Last n Run report (lastnrun.rep). You can change this to display any of the following reports:
Report          Description
joblist.rep     Creates a job list report based on a list of jobs you select, or on the entire list of jobs in that instance's database.
jobfind.rep     Creates a job report that includes the jobs that match the pattern for a job name that you enter.
lastrun.rep     Creates a last run report on the job that you specify. The report contains information on the last run of the job.
lastnrun.rep    Creates a last specified run (n) report on the job that you specify. You can also enter the last, second to last, or the number for the last run.
Configuring InfoReports Viewer for Printing
If you want to enable printing of the InfoReports reports, you must first configure the InfoReports printer. To do this, you run the InfoReports prfgen.sh script.
1. Change to the $AUTOSYS/install directory.
2. Enter the following command to run the printer configuration script:
./prfgen.sh
Server is the server machine time.
Machine is the time zone of the machine running the Operator Console.
Job is the time zone specified in the job definition (using the timezone attribute). If no time zone is set for the job, then the server machine time is used.
Chapter 11
This chapter describes how to define AutoSys monitors and reports using essential and optional monitor/report attributes. It also explains how to define monitors and reports using both the GUI and JIL.
Monitors and reports enable you to filter and screen only the information you are interested in from a vast collection of data. That is, they are tools that give you the information that is meaningful to you. Both monitors and reports filter events by the following:
In addition, reports also filter by time. Monitors do not filter by time because they provide real-time information. Note: Monitors can provide a picture of the system's state in real time. If the event processor is down, monitors will not provide any information. On the other hand, reports provide a picture of the system's state from a historical view, not in real time.
Monitors
Monitors provide a real time view of the AutoSys system. These are the two steps necessary to use a monitor:
A running monitor is an application that polls the database for new events that meet the selection criteria. Monitors are strictly informational. They provide an up-to-the-minute window to AutoSys events as they occur. For box jobs, all job levels can be observed, if desired.
Reports
A report (or browser) is a query run against the database, based on the selection criteria defined for that report. Its primary function is to enable you to quickly get very specific information, such as the finish time of the database backup for the last two weeks or all jobs that have an alarm associated with them. In addition, all job levels in box jobs can be reported if desired. Like monitors, a report definition is stored in the database, enabling reports to be run at any time, without redefining the criteria to AutoSys. Reports can only display events that are still in the database. Archived events are inaccessible and cannot be displayed.
Through the Graphical User Interface (GUI).
By passing Job Information Language (JIL) statements to the jil command.
In either case, the monitor or report specification is stored in the database, and the attributes you specify are virtually the same. You define a new monitor or report by assigning it a name and specifying any number of attributes that further define its behavior.
Using JIL
To define a monitor or report using JIL: Issue the jil command, and pass it the insert_monbro subcommand followed by a set of attribute: value pairs.
Chapter Organization
In this chapter, monitor and report attributes fit into two categories (sections): essential and optional. Essential attributes must be specified in order for a definition to be valid, and optional attributes are not required. For each attribute described in this chapter, the following is indicated:
Its name
Its JIL attribute keyword
Its corresponding GUI object or GUI Field Name
A description of its use
Monitor/Report Name
JIL Keyword      insert_monbro: monbro_name
GUI Field Name   Name
Description      The monitor or report name is used to identify the monitor or report to AutoSys, and must be unique within AutoSys. A monitor and report cannot have the same name, but a monitor or report can have the same name as a job. A monitor or report name can be from 1 to 30 alphanumeric characters; embedded blanks and tabs are illegal.
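The naming rules can be checked before a definition is submitted. The sketch below is a hypothetical helper, not an AutoSys command. It checks only the two rules stated explicitly above, the 1 to 30 character length and the ban on blanks and tabs; it deliberately does not restrict the character set further, because the example names in this chapter (such as Alarm_Rep) contain underscores.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): succeed if the candidate
# monitor/report name is 1 to 30 characters long and contains no whitespace.
is_valid_monbro_name() {
    name=$1
    len=${#name}
    if [ "$len" -lt 1 ] || [ "$len" -gt 30 ]; then
        return 1
    fi
    case $name in
        *[[:space:]]*) return 1 ;;   # embedded blanks and tabs are illegal
    esac
    return 0
}

# Example call:
#   is_valid_monbro_name "Alarm_Rep" && echo "name is acceptable"
```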
Mode
JIL Keyword      mode
GUI Field Name   Mode
Description      The mode attribute indicates whether a monitor or report is being specified.
All Events
JIL Keyword      all_events
GUI Field Name   ALL EVENTS
Description      The all_events attribute specifies whether any event filtering is in effect. If it is set to yes, the other event filtering attributes are ignored, and all events, regardless of source, will be reported for the selected jobs. These events include job status events and alarms. Note: If you wish to monitor all the events for all jobs, you should not run a monitor. Instead, you should display the event processor log in real time, using the following command:
autosyslog -e
Running a monitor adds another connection to the database, and establishes another process that is continually polling the database. This process will have a significant impact on system performance. Moreover, the information logged by the event processor contains much more diagnostic information than a monitor provides.
Alarms
JIL Keyword      alarm
GUI Field Name   Alarms
Description      This attribute specifies whether AutoSys-generated alarms should be tracked. Alarms can be tracked in addition to job status events (described below).
JIL Keyword      all_status
GUI Field Name   ALL Job Status Events
Description      This attribute specifies whether all job status events should be tracked. Job status events occur whenever a job's status changes. If this attribute is set to yes, the individual job status events shown below, and a few AutoSys-internal job status events, will be tracked. Alarms can also be tracked in addition to job status events.
Individual Job Status Events
The following table contains individual job status events:
JIL Keyword   Field Name
running       Running
success       Success
failure       Failure
terminated    Terminated
starting      Starting
restart       ReStarting
Job Filter
JIL Keyword      job_filter
GUI Field Name   Job Filter
Description      The job filter attribute determines which AutoSys jobs are to be monitored or reported. Monitors and reports can track events based on selected jobs. The events to be tracked are determined by the combination of the various event filters and the job filter. The job filter can be set to one of three settings: track all jobs (no job filtering), track a single box with the jobs it contains, or track a single job. If either of the latter two is selected, the name of the job is required.
JIL Keyword      currun
GUI Field Name   Current Run Only
Description      This attribute specifies that only events in the current or most recent execution of the specified jobs will be reported. This feature is useful for getting a sense of what is happening right now. For example, you could select the job status event restarting, turn off job filtering, and set this attribute to yes to see all the jobs that have been automatically restarted by AutoSys in their current or latest run.
JIL Keyword      after_time
GUI Field Name   Events After Date/Time
Description      This attribute specifies that only events occurring after a certain date and time for the specified jobs will be reported. This attribute cannot be used in a monitor definition because monitors only show events as they occur.
Sound
JIL Keyword      sound
GUI Field Name   Sound
Description      This attribute specifies whether the sound facility should be used. If the workstation running the monitor has sound capabilities, AutoSys will use them to announce the events as they occur. The announced message is pieced together from pre-recorded sound clips. Note: We strongly recommend that you use the sound attribute for monitoring AutoSys, especially alarms. It frees you from needing to look through output files to see if there are any problems. For details on recording sound and a list of machines for which AutoSys supports sound, see the record_sounds command in Chapter 1, AutoSys Commands, in the Unicenter AutoSys Job Management Reference Guide.
JIL Keyword      alarm_verif
GUI Field Name   Verification Required for Alarms
Description      This attribute specifies whether personnel must respond to alarms before they are turned off. This verification feature prompts the user for their initials and a comment in the running window. This information is timestamped and recorded in the database, along with the alarm event. This approach provides an account of which alarms were responded to, and when. An important feature of this attribute is that if the response is not given within 20 seconds, the message is repeated. Therefore, if the operator momentarily steps out of the room and there is an alarm, the monitor keeps writing to the window and playing the sound clip (if specified) until someone responds.
The buttons at the top of the dialog are the dialog's control buttons. They perform the following actions:
Dialog Control Button   Action
Clear                   Clears the dialog without affecting the database. Use this button to clear all fields (in the dialog and memory), before you begin defining a new monitor or report.
Delete                  Deletes the currently displayed monitor or report from the database.
Save                    Stores the currently displayed monitor or report in the database, either modifying a pre-existing object, or creating a new one. It also clears the dialog in preparation for another monitor or report definition.
Run MonBro              Runs the monitor or report and displays output in an xterm window.
Exit                    Closes the Monitor/Browser dialog and displays the GUI Control Panel. If Exit is pressed without pressing Save first, the recent changes are not saved. Exit only exits the dialog.
As indicated above, monitors and reports can be run from the Monitor/Browser dialog. A monitor must be saved in the database before it can be run, but reports do not need to be saved before they can be run. When a monitor or report is run from the dialog, a new terminal window appears to display the output. To close this window, press Control+C.
Defining a Monitor
First, you will define a monitor with the name Regular. This monitor will monitor all alarms, plus job status events when a job changes state to running, success, failure, or terminated for the current job run.
To open the Monitor/Browser dialog: In the GUI Control Panel, click the Monitor/Browser button. The Monitor/Browser dialog appears.
To define the example monitor:
1. In the Name field, enter the monitor's name:
Regular
2. In the Mode field, click the Monitor button.
3. In the Monitor/Browse these Types of Events region of the dialog, click Alarms.
4. In the Job CHANGE_STATUS Events region, click the Running, Success, Failure, and Terminated buttons.
5. In the Job Selection Criteria region of the dialog, click ALL Jobs.
To save the monitor definition in the database: At the top of the Monitor/Browser dialog, click Save. To dismiss the Monitor/Browser dialog: You can either dismiss the dialog using Exit, or you can leave it open to do the next exercise. Note: If you want to run the monitor immediately, you must save it first, then click the Run MonBro button. When running a monitor or a report from the GUI, an xterm window is created to display the monitor or report output.
Defining a Report
Next, you will define a report with the name Alarm_Rep. This report will report all alarms on any job, from December 22, 1997, at 12:00 to the present.
To define the example report:
1. Click Clear to clear the dialog and begin a new report.
2. In the Name field, enter the report's name:
Alarm_Rep
3. In the Mode field, click Browser.
4. In the Monitor/Browse these Types of Events region of the dialog, click the Alarms button only.
5. In the Job Selection Criteria region of the dialog, single-click ALL Jobs.
6. In the Browser Time Criteria region of the dialog, click the No button next to the words Current Run Only, and enter the date and time in the Events After Date/Time text field as follows (or you can enter a different, appropriate time):
12/22/1997 12:00
To save the report definition in the database: At the top of the Monitor/Browser dialog, click Save. To dismiss the Monitor/Browser dialog: You can either dismiss the dialog using the Exit button, or you can leave it open to create other definitions.
where
monbro_name
Is the monitor's or report's name. It can be from 1 to 30 alphanumeric characters and is terminated with white space; embedded blanks and tabs are illegal. The only difference between defining monitors or reports and defining jobs is that different subcommands are used. For defining monitors or reports, these are the JIL subcommands:
The following JIL script creates a monitor that will sound an alarm whenever a job finishes successfully:
insert_monbro: Job_Success
mode: monitor /* "m" may be specified instead */
sound: yes
success: yes
job_filter: all
Note: The JIL examples in the following sections include the monitor and report definitions presented in the GUI section previously shown. For a list of machines for which AutoSys supports sound, see the record_sounds command in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
Defining Monitors
First, you will define a monitor with the name Regular. This monitor will monitor all alarms, plus job status events when a job changes state to running, success, failure, and terminated. It sounds an audible alarm whenever any of the events occur (for a list of machines for which AutoSys supports sound, see the record_sounds command in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.) These are the JIL statements for this monitor definition:
/* Monitor for all ALARMS, and
 * Job EVENTS: RUNNING,SUCCESS,FAILURE & TERMINATED
 *
 * Sound is ON!
 */
insert_monbro: Regular
mode: m
sound: y
alarm: y
running: y
success: y
failure: y
terminated: y
The $AUTOSYS/install/data/monbro.jil file contains the example JIL statements shown previously. It also contains the following JIL statements, which define a sample monitor:
/* Monitor for JUST ALARMS!
 * Verification Required is ON so someone must type in a response.
 */
insert_monbro: Alarm
mode: m
sound: y
alarm: y
alarm_verif: y
This set of statements defines a monitor that catches alarms, generates an audible alarm, and continually repeats the alarm until someone responds.
Defining a Report
Next, you will define a report with the name Alarm_Rep. This report will report all alarms on any job, from June 1, 1997, at 2:00 a.m., to the present.
insert_monbro: Alarm_Rep
mode: b
alarm: y
after_time: "06/01/1997 2:00"
Notice that quotes are required in the previous example, because the time contains a special character, the colon. Note: Reports can only display events that are still in the database. Archived events are inaccessible and cannot be displayed.
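One common way to pass such statements to the jil command is to keep them in a script file and redirect that file to jil's standard input. The sketch below writes the report definition shown above to a file; the function name write_alarm_rep_jil and the file path are hypothetical, and the jil invocation is left as a comment because it requires an AutoSys installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): write the Alarm_Rep report
# definition from this section to the file named by the first argument.
write_alarm_rep_jil() {
    cat > "$1" <<'EOF'
insert_monbro: Alarm_Rep
mode: b
alarm: y
after_time: "06/01/1997 2:00"
EOF
}

# On a machine with AutoSys installed, the script could then be submitted:
#   write_alarm_rep_jil /tmp/alarm_rep.jil && jil < /tmp/alarm_rep.jil
```

Keeping definitions in files makes them easy to review and to resubmit later.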
Running a Monitor
To run a monitor: Do one of the following:
Enter the name in the Name field of the Monitor/Browser dialog, and click the Run MonBro button. or:
Run the monitor by executing the following AutoSys command at the UNIX command line:
monbro -N monitor_name
The time interval after which the Monitor/Browser GUI will drop the connection to the database.
The Monitor/Browser GUI icon text and the Monitor/Browser title bar text.
Descriptions of the customizable resources are given below. All of these can be set by modifying the X resource file Autosc. The X resource files reside in the local app-defaults directory, which varies across platforms. It is usually /usr/lib/X11/app-defaults or /usr/openwin/lib/app-defaults. If you are not sure which directory these files are in, ask your system administrator. Individual users may have their own copy of the X resource files in their $HOME directory, which will take precedence over the app-defaults files. For most operating systems, if you are exporting the display to another machine, you must edit the appropriate files in the app-defaults directory on the local machine. For Solaris, you must edit the files in both the /usr/lib/X11/app-defaults and /usr/openwin/lib/app-defaults directories. The files in /usr/lib/X11/app-defaults control the resources when you export the display.
If DBDropTime is set to zero, the connection is dropped immediately after the database query has completed. A value from 5 through 360 means that the GUI will automatically drop all database connections if the database has not been accessed in the last DBDropTime minutes. (Values of one to four are invalid.) A new database connection will subsequently be established when required. If DBDropTime is greater than 360, the connection to the database is maintained until you exit the GUI screens.
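The value ranges above can be summarized in a small sketch. The function name describe_db_drop_time and the result strings are hypothetical and exist only to illustrate the rules; AutoSys itself reads DBDropTime from the X resource file.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): classify a DBDropTime value
# according to the rules described in this section.
describe_db_drop_time() {
    value=$1
    case $value in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;   # not a non-negative integer
    esac
    if [ "$value" -eq 0 ]; then
        echo "drop-after-each-query"
    elif [ "$value" -lt 5 ]; then
        echo "invalid"                             # one to four are invalid
    elif [ "$value" -le 360 ]; then
        echo "drop-after-idle-minutes"
    else
        echo "keep-until-exit"
    fi
}

# Example call:
#   describe_db_drop_time 30
```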
Note: When changing icon text, be sure the length of the new text string does not exceed the recommended maximum length for icon title text for your windowing system. Some window managers can display long icon text strings, while others will truncate them. Ensure the text string you specify for your icons displays appropriately. Also, some window managers allow you to change the size of icons and icon text font.
Chapter 12
Maintaining AutoSys
This chapter describes the procedures for maintaining AutoSys and the AutoSys database.
This command performs a number of consistency checks, and then starts the event_demon program. You can start the shadow event processor at the same time as the primary event processor by specifying the -M option followed by the name of the machine on which you want the shadow event processor to run, like this:
eventor -M machine_name
WARNING! Do not try to start the event processor by invoking the event_demon binary at the command line. The eventor script is required to properly check and configure the environment for the event processor.
Verifies that no other event processor is running on that machine.
Invokes a restart procedure that looks for any events that are hung in the processing state. Because events are processed one at a time, there can only be one event hung in the processing state at a time. If there is an event in this state, it is queued again for processing. Normally, events will only be in this state if the event processor was stopped while it was processing an event. Note: If the event being processed was a STARTJOB event, and the job it started is still alive, it will not be started again. Also, if an event is in the processing state for a long time, this does not imply that the job associated with that event is in some unusual state. It could be that the command is lengthy, or is waiting for machine resources.
Invokes the chase command. chase inspects the AutoSys database to see what AutoSys should be running, then checks each machine to verify that the appropriate jobs are running. If chase sees any anomalies, it sends an alarm, and, if the job definitions permit, it restarts any missing jobs. The purpose of the chase feature is to verify the state of AutoSys and detect any problems upon startup.
This command displays the output of event_demon on the screen. To break out of the display, press Control+C. This terminates the tail command only, making the window available; the event_demon continues to run in the background. Note: If you do not want eventor to run the tail command, you can use the -q option when you start the event processor. Several command line options allow you to start the event processor without some of the checks being performed; however, only an experienced AutoSys administrator should use these options, and only in special circumstances. For more information about the eventor command, see its entry in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
Starting in Global Auto Hold Mode
If you are restarting the event processor after a period of downtime, you can specify the -G option to start the event processor in Global Auto Hold mode. This prevents the system from being overloaded with job starts for the numerous jobs that were scheduled to run during the downtime. When the event processor is in Global Auto Hold mode, it evaluates all jobs whose starting conditions have passed and are eligible to run. Instead of starting the jobs, however, the event processor puts the jobs ON_HOLD. It does this for all types of jobs (box, command, and file watcher). This allows you to decide which jobs should run and to start them selectively with the Force Start Job button in the Operator Console, or with the following command:
sendevent -E FORCE_STARTJOB
This FORCE_STARTJOB event is the only way to start a job put ON_HOLD with Global Auto Hold; it overrides the Global Auto Hold. To turn off Global Auto Hold, you must shut down the event processor, and then start it again without the -G option. You can start both the primary and shadow event processors with the -G option.
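When several held jobs need to be released, it can help to generate the commands first and review them before running anything. The sketch below only prints the commands. It assumes the job is named with sendevent's -J option, which is not shown in this section, so confirm the syntax against the sendevent entry in the Reference Guide. The function name and the job names in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): print, without running, one
# FORCE_STARTJOB command for each job name given as an argument.
# Assumption: the job name is passed to sendevent with the -J option.
force_start_commands() {
    for job in "$@"; do
        printf 'sendevent -E FORCE_STARTJOB -J %s\n' "$job"
    done
}

# Example call (review the output, then run the commands you want):
#   force_start_commands nightly_backup load_reports
```

Printing rather than executing keeps the decision about which jobs to start with the operator, which is the purpose of Global Auto Hold mode.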
This log file contains a record of all the actions taken by the event processor, including startup and shutdown information. If the $AUTOUSER directory is NFS mounted, you can view this output from any machine on the network. To view the log file: Do one of the following: 1. Type the following UNIX command:
tail -f $AUTOUSER/out/event_demon.$AUTOSERV
When you execute this command, the last ten lines of the log file are displayed, and then all additions to the log are automatically displayed as they occur. To terminate the autosyslog process: Press Ctrl+C. Note: We recommend that you use the autosyslog -e command to follow the behavior of the event processor. It displays the log file, which generates information on all event processor activity.
Event Processor Log File Size
At startup, the event processor checks the size of its log file, and if the file is 250 KB or more, the event processor deletes the file. The event processor log has a file system threshold setting. The event processor shuts down if there is less than 8 KB of disk space available. However, if the amount of available disk space falls below that specified by the FileSystemThreshold parameter in the configuration file, the event processor issues warnings in the event processor log file. For information on the FileSystemThreshold parameter, see Event Processor Log Disk Space in the chapter Configuring AutoSys.
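Because the log is deleted at startup once it reaches 250 KB, you may want to check its size, and keep a copy if needed, before restarting the event processor. The following is a minimal sketch of such a check. The function name log_would_be_deleted is hypothetical, and the sketch assumes that 1 KB means 1024 bytes, which this guide does not state.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoSys): succeed if the given log file
# is at or above the 250 KB size at which the event processor deletes it
# on startup. Assumption: 1 KB = 1024 bytes.
log_would_be_deleted() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -ge $((250 * 1024)) ]
}

# Example call, using the log path given in this chapter:
#   log="$AUTOUSER/out/event_demon.$AUTOSERV"
#   if log_would_be_deleted "$log"; then cp "$log" "$log.saved"; fi
```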
This method allows the event processor to complete gracefully any processing it is performing. You can assign a high priority to the sendevent -E STOP_DEMON command by including the -P 1 argument.
When you issue the sendevent command, the STOP_DEMON event is sent to the AutoSys database. The event processor then reads the STOP_DEMON event, goes into an orderly shutdown cycle, and exits. There might be a delay between when you send the STOP_DEMON event and when the event processor reads it and shuts down. If the event processor does not shut down immediately, do not send another STOP_DEMON event, because the event processor will process that event the next time it starts, and it will promptly shut down. For more information about the sendevent command, see sendevent in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
WARNING! Do not attempt to stop the event processor (the event_demon process) by using the UNIX kill command. This method stops the event processor no matter what it is doing; it might be in the middle of processing an event. Also, if you are using dual-event servers and use these methods, the databases can lose synchronization.
If it cannot connect to the third machine, the shadow event processor shuts down.
If it can connect but cannot locate the .dibs file, the shadow event processor creates the file, attempts to signal the primary event processor to stop, and takes over processing the events.
If it can connect and the .dibs file already exists, the shadow event processor shuts down.
Similarly, if the primary event processor cannot locate and signal the shadow event processor, the primary processor checks the third machine for the .dibs file, and follows the same procedure as the shadow event processor (as described previously). If the primary event processor and an event server are on the same machine, the event processor failure could also mean an event server failure. In this situation, if dual event servers are configured, AutoSys will roll over to the shadow event processor and to single-server mode. AutoSys uses the third machine and the existence of the .dibs file to resolve contentions and to eliminate the case where one processor takes over because its own network is down. However, the shadow event processor is not guaranteed to take over in 100% of the cases. For example, in the case of network problems, AutoSys might not be able to determine which event processor is the healthy one, and it will shut down both processors. Note: You can specify the shadow event processor and the third machine by modifying the tunable parameters in the AutoSys configuration file. For information, see the chapter Configuring AutoSys, in this guide.
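The three outcomes described for the third-machine check can be written as a small decision function. This is only a model of the documented behavior for reading purposes; the function name shadow_action, its yes/no arguments, and the result strings are hypothetical and do not correspond to any AutoSys interface.

```shell
#!/bin/sh
# Hypothetical model (not part of AutoSys) of the third-machine check.
# $1: "yes" if the third machine can be reached, otherwise "no"
# $2: "yes" if the .dibs file already exists there, otherwise "no"
shadow_action() {
    if [ "$1" != yes ]; then
        echo "shut-down"                    # cannot reach the third machine
    elif [ "$2" = yes ]; then
        echo "shut-down"                    # another processor already has dibs
    else
        echo "create-dibs-and-take-over"    # create the file and take over
    fi
}

# Example call:
#   shadow_action yes no
```

The same table applies when the primary event processor performs the check, as described above.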
Restoring the Primary Event Processor
To restore the primary and the shadow event processor:
1. Stop the shadow event processor by logging on as the AutoSys exec superuser, and issuing the following command:
sendevent -E STOP_DEMON
2.
This command starts the primary and shadow event processors at the same time. Note: If you attempt to start the primary and shadow event processors without having a third machine specified in the AutoSys configuration file, the shadow event processor will not start.
If the event processor and the remote agents are installed and configured properly. Running in test mode uses the same mechanisms of starting jobs and sending events that AutoSys uses in its normal mode.
If the conditional logic for jobs, including nested boxes, is functioning correctly.
You can run the event processor at two levels of test mode. You do this by setting the $AUTOTESTMODE environment variable before starting the event processor. The levels of test mode are determined by the value of the $AUTOTESTMODE variable. These are the values, which are discussed in the following sections:
$AUTOTESTMODE = 1
$AUTOTESTMODE = 2
WARNING: The event processor cannot run partially in test mode; AutoSys does not provide a test mode for the database. Therefore, you should exercise extreme caution when you run in test mode on a live production system.
$AUTOTESTMODE = 1
At the first level of test mode, each job that you specify runs with the following test mode variations:
The command /bin/date is executed on the remote machine instead of the command specified in the job definition.
Standard output and standard error for the command are redirected to the /tmp/autotest.$AUTO_JOB_NAME file, where $AUTO_JOB_NAME is the job name as defined to AutoSys.
If the job being run in test mode is a file watcher job, it is not disabled; it runs as it would in real mode.
Minimum and Maximum Run Alarms
Sourcing a user-specified .profile file
All resource checks
$AUTOTESTMODE = 2
The second level of test mode runs with the same behaviors as the first level with the addition of the following procedures:
Resource checks are performed.
A user-defined .profile file is sourced.
Output from the /bin/date command goes to the user-defined standard output and standard error files, if they are defined; otherwise, output goes to the file named /tmp/autotest.$AUTO_JOB_NAME.
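The test mode settings described above can be sketched in shell as follows. This is a minimal illustration, not part of AutoSys: the helper name autotest_output_file is hypothetical, and the sketch only computes the default output file name given in the text and sets the variable for an event processor that is started later from the same shell.

```shell
#!/bin/sh
# Sketch only: autotest_output_file is a hypothetical helper, not an AutoSys command.
# It prints the default file that receives the /bin/date output in test mode.
autotest_output_file() {
    # $1 is the job name as defined to AutoSys (the value of $AUTO_JOB_NAME).
    printf '/tmp/autotest.%s\n' "$1"
}

# Select the first level of test mode for an event processor
# that is started later from this shell.
AUTOTESTMODE=1
export AUTOTESTMODE
```

For example, autotest_output_file nightly_load prints /tmp/autotest.nightly_load, which is the file to inspect after a test run of a job with that (hypothetical) name.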
chase
The chase command verifies that the expected jobs are running. It goes to every machine that should be running a job and verifies that the following are true:
Errors detected by chase are sent to standard output. The options used with chase further determine what actions are taken when error conditions are detected. chase can send alarms to AutoSys to alert you to the problems it finds (by using the -A option). In addition, it can automatically restart jobs that are missing in action and that are defined for restart (by using the -E option). For more information about the chase command, see the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide. Note: There is no way for chase to tell if a machine is down; therefore, it cannot tell if jobs on that machine are running, or if the network connection to the machine is down. If you run chase with the -E option, jobs that have already run, or are running on the machine with the failed network connection might be restarted if the network connection is established again.
clean_files
The clean_files command deletes old remote agent log files. It performs this task by searching the database for all machines that have had jobs started on them, and then sending a command to each machine to purge all remaining log files from that machine's Remote Agent Log directory (specified by AutoRemoteDir in the AutoSys configuration file). To remove only the log files older than a specific number of days, use the following command:
clean_files -d days
where:
days
Specifies that files older than this number of days should be deleted. For more information about the clean_files command, see the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
calendar definitions
job definitions
machine definitions
monitor and browser definitions
global variables
For information about restoring the backed up definitions, see Restoring AutoSys Definitions in this chapter. We recommend that you keep a copy of your AutoSys license keys in case you need to reinstall them.
To back up AutoSys definitions:
1. To save your calendar definitions:
a. Open a Calendar Definition window.
b. Choose File, Export. The Export File Name dialog is displayed.
c. In the Export File Name dialog, select a directory that is outside of the AutoSys directory structure and select or enter a file name.
d. Click OK.
Note: The calendar definitions are saved as text.
2. To save your job definitions, from a UNIX command prompt, execute the following command:
autorep -J ALL -q > /directory/autosys.jil
where:
directory
Is a directory outside of the AutoSys directory structure. We recommend that you save to the same directory where you saved your calendar definitions. This command saves your job definitions to a file named autosys.jil.
3. To append your machine definitions to the same file that contains your job definitions, from the UNIX command prompt, execute the following command:
autorep -M ALL -q >> /directory/autosys.jil
where:
directory
Is the same directory where you saved your job definitions, a directory outside of the AutoSys directory structure.
Note: To append definitions to an existing file, you enter >> instead of >. We recommend that you append your job, machine, and monitor and browser definitions to the same file. Then you have only one file to restore following a system failure.
4. To append your monitor and browser definitions to the same file that contains your job and machine definitions, from the UNIX command prompt, execute the following command:
monbro -N ALL -q >> /directory/autosys.jil
where:
directory
Is the same directory where you saved your job definitions, a directory outside of the AutoSys directory structure.
5. To save your global variables to their own file, from the UNIX command prompt, execute the following command:
autorep -G ALL > /directory/globals.jil
where:
directory
Is a directory outside of the AutoSys directory structure. We recommend that you save to the same directory where you saved your other AutoSys definitions. This command saves your global variables to a file named globals.jil. This file is simply a record of what you must redefine following a system failure.
Note: You can create a job that runs periodically to back up your definitions automatically.
6. To save your license keys, run the gatekeeper command to print your current license keys to a file.
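The command-line steps of the backup procedure (steps 2 through 5) could be collected into one shell function, for instance as the command of a periodically scheduled job. The following is a sketch under stated assumptions: the function name backup_autosys_defs is hypothetical, the autorep and monbro commands are assumed to be on the PATH of the calling user, and the calendar export and license key steps are not covered because the text describes them only through the GUI and the gatekeeper command.

```shell
#!/bin/sh
# Sketch only: backup_autosys_defs is a hypothetical wrapper around the
# commands shown in steps 2 through 5. It assumes autorep and monbro are on PATH.
backup_autosys_defs() {
    dir=$1    # a directory outside of the AutoSys directory structure

    # Job definitions start a new file; > overwrites any previous backup.
    autorep -J ALL -q >  "$dir/autosys.jil" || return 1
    # Machine definitions are appended to the same file with >>.
    autorep -M ALL -q >> "$dir/autosys.jil" || return 1
    # Monitor and browser definitions are appended as well.
    monbro -N ALL -q  >> "$dir/autosys.jil" || return 1
    # Global variables go to their own file, which is only a record
    # of what must be redefined by hand after a failure.
    autorep -G ALL    >  "$dir/globals.jil" || return 1
}
```

A call such as backup_autosys_defs /backup/autosys_defs (the directory name is an example and must already exist) leaves autosys.jil and globals.jil in that directory. The function stops at the first command that fails, so a partial backup is reported through a non-zero return status.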
b. Choose File, Import. The Import File Name dialog is displayed.
c. In the Import File Name dialog, select the directory and file name of the text file that contains your calendar definitions.
d. Click OK.
2. To restore your job, machine, and monitor and browser definitions, from a UNIX command prompt, execute the following command:
jil < /directory/autosys.jil
where:
directory
Is the directory where you saved your definitions.
3. To restore your global variables, refer to your backup file and redefine each global variable.
Job definitions
Events
Monitor and report (browser) definitions
Calendar information
Machine definitions
For a list of the database tables and views as well as the event and alarm codes used in the database, see the chapter Database Tables and Codes in the
Using Dual Event Server Mode
When you configure AutoSys with dual-event servers, all of the data is duplicated on two event servers. In dual-server mode both servers are peers, and the event processor is responsible for keeping the databases synchronized. The event processor continually reads from both databases as it processes events. For information about installing a second event server, see Installing Dual-Event Servers in the chapter Advanced Configurations in the Unicenter AutoSys Job Management for UNIX Installation Guide.
Database Storage Requirements
The standard, recommended sizes for AutoSys databases are 64 MB (Sybase) and 100 MB (Oracle). If your job load is large, you should create a larger database. The size requirements for your database depend on the following:
The number of jobs you define.
How many of the jobs have dependencies.
How often the jobs are run.
How often the database is cleaned.
(Every time a job runs, it generates at least three events and an entry in the job_runs table.)
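The sizing factors above can be turned into a rough lower bound on the number of event rows that accumulate between cleanups. The following shell sketch uses only the figure stated in the text (at least three events per job run); the function name estimate_min_events and the example numbers are hypothetical, and the result is a row count, not a size in megabytes.

```shell
#!/bin/sh
# Sketch only: estimate_min_events is a hypothetical helper.
# It prints a lower bound on the event rows accumulated between database
# cleanups, using the figure of at least three events per job run.
estimate_min_events() {
    jobs=$1            # number of defined jobs
    runs_per_day=$2    # average runs per job per day
    days_kept=$3       # days between cleanups (archiving)
    echo $(( jobs * runs_per_day * days_kept * 3 ))
}
```

For example, estimate_min_events 500 4 7 prints 42000: 500 jobs that each run four times a day produce at least 42,000 event rows in a week, in addition to one job_runs entry per run.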
Database Architecture
The following figure shows the layout of databases in an AutoSys environment, and it will help you understand AutoSys configuration options. It depicts how AutoSys determines which database to use, and how the three primary AutoSys components (the event processor, the AutoSys database, and the remote agent) interact.
The previous figure shows one instance of AutoSys that is configured with dual-event servers, which are used only by this one instance. Both the event processor and the remote agent ensure that events are written to the appropriate databases. The controlling variable in the architecture is the environment variable named AUTOSERV. This variable is the instance name that indicates, among other things, the name of the configuration file to be used. The configuration file name is determined by expanding the environment variable in the following way:
$AUTOUSER/config.$AUTOSERV
This configuration file contains information about which database to use, and in what capacity. All AutoSys commands access this file, unless they are overridden at the command line with an argument that states an instance name. For information on configuring AutoSys instances, see the chapter Configuring AutoSys, in this guide. Note: Different instances of AutoSys can start jobs on the same machine. The remote agent receives instructions from the event processor at runtime, and the remote agent can send events to the necessary databases.
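The way the instance name selects a configuration file can be sketched in shell. This is an illustration only: the function name autosys_config_file is hypothetical, and the optional argument merely mimics the idea of naming an instance explicitly instead of relying on $AUTOSERV; it does not reproduce any particular AutoSys command-line option.

```shell
#!/bin/sh
# Sketch only: autosys_config_file is a hypothetical helper that prints the
# configuration file path obtained by expanding $AUTOUSER/config.$AUTOSERV.
autosys_config_file() {
    # Use the instance name given as $1, or fall back to $AUTOSERV.
    instance=${1:-$AUTOSERV}
    printf '%s/config.%s\n' "$AUTOUSER" "$instance"
}
```

With AUTOUSER=/opt/autouser and AUTOSERV=ACE (both example values), autosys_config_file prints /opt/autouser/config.ACE, and autosys_config_file PRD prints /opt/autouser/config.PRD.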
DBMaint Script
By default, AutoSys executes the $AUTOSYS/bin/DBMaint script during its daily maintenance cycle. This script runs the dbstatistics and archive_events commands. DBMaint runs the dbstatistics command to perform the following tasks:
Update statistics in the database for optimal performance. For a Sybase database, it updates statistics for the event, job, job_status, and job_cond tables. For Oracle, it computes statistics for all of the AutoSys tables.
Run the AutoSys dbspace command to check the available space in the database. If the amount of free space is insufficient, it issues warning messages and generates a DB_PROBLEM alarm.
Note: If you use an Oracle database, running DBMaint may report that your database is close to full when this is not the case. This can occur because DBMaint calculates how much space is not allocated for extents. The extents may be nearly empty, but DBMaint reports the whole extent as used space.
Calculate and update the average job run statistics in the avg_job_run table. This information is used by AutoSys/Xpert. When dbstatistics is run, old data is overwritten with the new data.
DBMaint runs the archive_events command to remove old information from the various AutoSys database tables. Specifically, archive_events removes the following:
Events and any alarms associated with them from the event table
Job run information from the job_runs table
autotrack log information from the audit_info and audit_msg tables
ServerVision audit information from the svarchive_tbl table
The output from DBMaint, $AUTOUSER/out/DBMaint.out, tells you how much space is left in your database, so that you can check (and monitor) if the event tables are filling up. This is a good way to calculate how many events in a single day can be maintained safely in the database before they should be archived.
For more information on the dbstatistics and archive_events commands, see their entries in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide. For a list of the database tables and views, as well as the event and alarm codes used in the tables, see the chapter Database Tables and Codes in the Unicenter AutoSys Job Management Reference Guide.
WARNING! If you are archiving large event tables, your SQL connection might time out, causing the DBMaint script to core dump. If this occurs, change the -t argument of the archive_events command to a higher value.
Modifying the DBMaint Script
You can modify the $AUTOSYS/bin/DBMaint script. For example, you might want to modify the script to perform database backups also. For information on backing up bundled Sybase, see Bundled Sybase Backup and Recovery in this chapter. Before you modify the script, copy it, and then add your enhancements to the copied version. Keep a backup copy of your enhanced script so that you do not lose your changes when you upgrade; you can then use it to modify the newly installed script. The script name is specified by the DBMaintCmd parameter in the AutoSys configuration file.
The connection to the database is lost and, after the configured number of reconnection attempts, the database is still unreachable.
A database has had an unrecoverable error (for example, a corrupt database or media failure).
Upon event server rollover, the event processor edits the $AUTOUSER/config.$AUTOSERV configuration file on only the event processor machines. The event processor comments out the database that has been taken off-line and marks the remaining database as being in single server mode. The event processor makes these changes so that AutoSys utilities attempting to access the database will write to or read from only the running event server. Note: On an event server rollover, the configuration file is edited on the event processor machines only; configuration files on client machines are not modified.
run in dual server mode. Before starting the down server, you must make sure that the two event servers are synchronized, following the instructions in Synchronizing the Event Servers.
After the event processor is stopped, no additional jobs will be started.
4. Synchronize the databases using the autobcp script. This is described in detail in Run the autobcp Script in the chapter Advanced Configurations of the Unicenter AutoSys Job Management for UNIX Installation Guide.
5. Edit the $AUTOUSER/config.$AUTOSERV configuration file on the server machine. In the configuration file, remove the rollover comment from the EventServer line that defines the server that went offline. The event processor commented out this line during the rollover to single-server mode.
6. Start the event processor by using the eventor command on the AutoSys event processor machine:
eventor
Or, if you are running a shadow event processor, start the event processors like:
eventor -M shadow_machine
The event processor should print a message indicating that it is in dual-server mode. Note: If AutoSys is configured to run in dual-server mode, the event processor will not start unless both databases are available.
Upgrade your processor, memory, or hard disks.
Install the Sybase server and the event processor on a dedicated machine or machines. Do not share machine resources with other processes.
Use the Sybase server for AutoSys only.
Tune the kernel for optimal Sybase performance. For information on how to do this, see your Sybase and operating system documentation.
For unbundled Sybase only, put your data on a raw partition. This improves access time.
WARNING! Only the database administrator should put your data on a raw partition.
Tune the shared pool size. Make changes to the shared pool size by altering the init.ora value of SHARED_POOL_SIZE. To determine if you need to increase the shared pool size, enter the following query in SQL*Plus:
select sum(pins) "Executions", sum(reloads) "Cache Misses while Executing", ((sum(reloads)/sum(pins))*100) "Ratio of Misses" from v$librarycache;
The ratio of misses should be less than 1%. (The ratio of misses number is displayed as a percentage.) If it is higher than 1%, you should increase the value of SHARED_POOL_SIZE incrementally until the ratio of misses approaches zero.
Tune the buffer cache. Make changes to the buffer cache by altering the init.ora value of DB_BLOCK_BUFFERS. To determine if you need to allocate more memory, enter the following query as the sys user in SQL*Plus:
select name, value from v$sysstat where name in ('db block gets', 'consistent gets', 'physical reads');
Monitor the statistics from the query while AutoSys is running. Calculate the hit ratio for the buffer cache by using this formula:
Hit Ratio = 1 - (physical reads / (db block gets + consistent gets))
Maximize disk I/O by separating the data files. If you have disk contention, place the autodata and autoindexes data files on separate disk drives, and if possible, different drive controllers.
Tune the sort area. A sort area in memory sorts records before they are written to disk. Increasing the size of the sort area by increasing the init.ora value of SORT_AREA_SIZE improves sort efficiency.
To determine if sorting is affecting the performance of your system, monitor the sorting disk activity in your system by entering the following query in SQL*Plus:
select name, value from v$sysstat where name in ('sorts (memory)', 'sorts (disk)');
If disk sorts are greater than 1% of memory sorts, then increase the value of SORT_AREA_SIZE.
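The arithmetic behind the buffer cache and sort area checks can be sketched as two small shell helpers that take values read from v$sysstat. Both function names are hypothetical. The buffer cache formula used here, 1 - physical reads / (db block gets + consistent gets), is the commonly documented Oracle hit ratio and is an assumption as far as this guide is concerned; the sort check follows the 1% rule stated above.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical; feed them values read from v$sysstat.

# Buffer cache hit ratio, printed as a percentage with one decimal place.
# Assumed formula: 1 - physical reads / (db block gets + consistent gets).
# Arguments: db_block_gets consistent_gets physical_reads
buffer_hit_ratio() {
    awk -v g="$1" -v c="$2" -v p="$3" \
        'BEGIN { printf "%.1f\n", (1 - p / (g + c)) * 100 }'
}

# Disk sorts as a percentage of memory sorts, with two decimal places.
# Arguments: sorts_memory sorts_disk
disk_sort_pct() {
    awk -v m="$1" -v d="$2" 'BEGIN { printf "%.2f\n", d / m * 100 }'
}
```

For example, buffer_hit_ratio 8000 2000 500 prints 95.0, and disk_sort_pct 20000 100 prints 0.50, which is below the 1% threshold. Neither helper guards against a zero denominator, so they are meaningful only after the instance has been running under load.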
Sybase Architecture
The Sybase database is based on a client/server architecture, with the communications between clients and server built into the product. The server portion is called the Sybase SQL Server. This server is a multi-threaded, single process that runs on one machine. It listens on a specific port for a request from a client, fulfills that request, and then returns the information to the client. The client communicates with the server using a C library known as Open Client, or the DB Library. This library handles the communications between the client application and the Sybase SQL Server as well as sending requests and parsing results for the use of the application.
This means that all AutoSys commands and processes, including the event processor, the remote agent, and monitors, are DB Library applications that connect to the AutoSys databases. Because all AutoSys commands are merely Sybase clients, you can execute those commands from any machine that has access to the event server and is a licensed AutoSys client. Note: The DB Library allows a client to maintain multiple connections to the same server or multiple servers. It is through this mechanism that the dual-event servers are maintained.
Sybase Environment
If you are using Sybase, the following environment variables are used:
DSQUERY: Defines the name of the Sybase data server.
SYBASE: Specifies the complete path to the Sybase software directory.
The Sybase software directory contains the Sybase configuration file, which on UNIX is the interfaces file and on Windows is the SQL.INI file. AutoSys uses the Sybase configuration file to look up database information. It is the means by which the network is navigated to find the Sybase data server.
Database Users
When using the bundled Sybase version of AutoSys, there are two users defined by default in the AutoSys database: the system administrator and the AutoSys user. Each of these users has a default user name and password.
User                   User Name   Default Password
System Administrator   sa          sysadmin
AutoSys User           autosys     autosys
For information on changing the sa password, see the next section, Changing the System Administrator Password. For information on changing the autosys password, see autosys_secure in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide.
Changing the System Administrator Password
You can change the system administrator password from its initial value of sysadmin to a new value.
To change the system administrator password:
1. Execute the following xql command:
xql -Usa -Psysadmin
Note: If you are not using the default settings, the name of your data server is substituted for AUTOSYSDB. 2. Enter the following command sequence (replacing newPassword with the new password):
xql>>[AUTOSYSDB][master] 1> sp_password
xql>>[AUTOSYSDB][master] 2> sysadmin, newPassword;
Notice the semicolon at the end of the command line. It is an end-of-statement delimiter, replacing the Sybase go.
WARNING! After you change the sa password, the old password is unrecoverable. Make sure that the new password is recorded; otherwise, the entire database will have to be re-installed.
Starting Sybase
Starting Sybase involves executing the command that will bring up the AutoSys database on the machine where AutoSys has been installed. The following example is for Sybase-bundled versions of AutoSys. If there is a pre-existing data server at your site, consult your local database administrator about how to start the database.
To start the bundled Sybase:
1. As the user who installed AutoSys, log onto the machine where the data server is to run. This is usually done on the machine where AutoSys has been installed, but not always. If the machine has a windowing system running, use the console window. (You do this so all AutoSys database messages will be displayed in that window.)
2. Source the $AUTOUSER/autosys.env file. This sets the $SYBASE environment variable and the alias start_autodb.
3. Enter one of the following commands:
start_autodb
$SYBASE/install/RUN_AUTOSYSDB
Both of these commands start the server in the background, then output startup messages.
Note: The most common problem encountered when starting up the AutoSys database is the permission assignments on the user who owns the files that AutoSys uses. If you experience some difficulty, ensure that you have the proper permissions to execute the operation. If you do have the necessary permissions and the database does not start, contact Computer Associates Technical Support.
4. To verify that the database is up and accessible, enter the following xql command:
xql -Uautosys -Pautosys -c "select getdate()"
Stopping Sybase
You must shut down Sybase before shutting down or rebooting the machine. Before you shut down Sybase, however, you should shut down the event processor.
To stop Sybase:
1. If the event processor is running, stop it. To do this, log on as the exec superuser and enter the following command:
sendevent -E STOP_DEMON
Note: If you are not sure whether the event processor is running, do not send the STOP_DEMON event. If the event processor is not running and you send the event, it will be queued and processed when the event processor is started again. Before you attempt to stop the event processor, ensure that it is running by using the chk_auto_up command.
2. Stop the Sybase service by entering the following command (the sa_password is initially installed as sysadmin):
xql -Usa -Psa_password -c "shutdown"
This command allows any database processes to complete, and then shuts the database down. If you must shut down the database immediately, use this command:
xql -Usa -Psa_password -c "shutdown with no_wait"
For information on changing the sa password, see Changing the System Administrator Password in this chapter.
Accessing Sybase
AutoSys comes with a utility named xql, which resides in the $AUTOSYS/bin directory. Use this utility for interactive database queries and for shell-level database access. For a detailed description of how to use xql, see its entry in the chapter AutoSys Commands in the Unicenter AutoSys Job Management Reference Guide. Some examples of using xql in a shell script can be found in the following scripts:
$AUTOSYS/dbobj/create_table
$AUTOSYS/bin/chk_auto_up
Note: The xql utility functions only with the Sybase database. If you use an Oracle database, use sqlplus instead.
Before you change the autosys user password or shut down AutoSys, you should ensure that no processes are connected to the database. You can identify the active processes, then shut them down. Many GUI processes connected to the database can slow it down. You can see how many and what type of processes are connected to the database, and then ask users to shut down the GUIs they are not currently using.
To see what processes are connected to the database and their status: 1. At the xql prompt, enter the following:
xql>>[AUTOSYSDB][master] 1> select program_name, hostname,
xql>>[AUTOSYSDB][master] 2> hostprocess,
xql>>[AUTOSYSDB][master] 3> status from sysprocesses;
event_demon   Michelle   13448
jil           Erik       12272
In the example list of processes, xql is the process running your query to display the processes, and event_demon is the event processor. The processes without names are internal Sybase processes, which you can ignore. The following are the most common processes you will see (there are others): autocal, event_demon, sendevent, autocons, hostscape, timescape, autosc, jil, xql, auto_remote, and jobscape.
To define a disk as a dump device 1. Display the xql prompt, by entering the following command (assuming that the sa password is sysadmin):
xql -Usa -Psysadmin
2.
where:
dump_device
Is an arbitrary name; this name will be used for subsequent database dumps and loads.
"path name"
Is the full path name of the file where the database dump will be written. Ensure that this file can be created to the appropriate size for your database, and that it can be mounted on the machine that hosts the second event server.
For example, the following command will define a disk dump device named autodump, and the database will be dumped to a file named /backup/autodumpdata (for this example to work, the backup directory must already exist).
xql>sp_addumpdevice "disk", autodump, "/backup/autodumpdata", 2;
To define a tape device as a dump device: 1. Display the xql prompt, by entering the following command (assuming that the sa password is sysadmin):
xql -Usa -Psysadmin
2.
where:
dump_device
Is an arbitrary name; this name will be used for subsequent database dumps and loads.
"physical_device"
Is the device name of the actual tape device on your machine.
size
Is the capacity of the tape device (in MB).
For example, the following command will define a tape dump device named autodumptape, and the database will be dumped to a tape with a 40 MB capacity loaded in the device named /dev/rmt8:
xql>sp_addumpdevice "tape", autodumptape, "/dev/rmt8", 3, skip, 40;
Sybase Backup Server
If you are using bundled Sybase, you must create and start a backup server.
Note: If you are running on Dynix, NCR, Pyramid, SCO UNIX, SCO UnixWare, or Solaris, skip the following section and see Creating a Backup Server Using autotli.
Creating a Backup Server
To create a backup server:
1. Edit the $SYBASE/interfaces file.
2. Copy the entire entry for AUTOSYSDB to a new entry.
3. In the new entry, change AUTOSYSDB to SYB_BACKUP.
When you are finished, you will have two entries in your interfaces file: one for AUTOSYSDB and another for SYB_BACKUP. Your SYB_BACKUP entry should look similar to the following:
SYB_BACKUP
	query tcp sun-ether fiji 5335
	master tcp sun-ether fiji 5335
	console tcp sun-ether fiji 5336
The previous example creates a backup server on a host named fiji, using port number 5335. The backup server port can be the same as the port used by the data server. Note: The format of the interfaces file requires that a single tab (do not use spaces) precede the first word of every line that follows the SYB_BACKUP line (which should not have any spaces before it). A single space is used to delimit each element in an entry line. Incorrect formatting will prevent communication with the database.
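Because the tab and space rules for the interfaces file are easy to get wrong in an editor, the entry can also be generated. The following shell sketch prints an entry in the layout of the example above. The function name make_backup_entry is hypothetical, the tcp and sun-ether fields are copied from the example and may differ on your platform, and the console port is assumed to be the server port plus one, as in the example (5335 and 5336).

```shell
#!/bin/sh
# Sketch only: make_backup_entry is a hypothetical helper. The tcp and sun-ether
# fields are copied from the example entry and may differ on your platform.
make_backup_entry() {
    host=$1
    port=$2
    # Assumption: the console service uses the next port, as in the example.
    console_port=$(( port + 1 ))

    # The entry name starts in column one, with no leading white space.
    printf 'SYB_BACKUP\n'
    # Each following line starts with exactly one tab; single spaces
    # separate the elements within the line.
    printf '\tquery tcp sun-ether %s %s\n' "$host" "$port"
    printf '\tmaster tcp sun-ether %s %s\n' "$host" "$port"
    printf '\tconsole tcp sun-ether %s %s\n' "$host" "$console_port"
}
```

Running make_backup_entry fiji 5335 prints the four lines of the example entry. Review the output before appending it to a copy of the interfaces file, and use >> rather than > so that the existing entries are not overwritten.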
Creating a Backup Server Using autotli
For this format, instead of editing the interfaces file directly, you run the autotli command to define the backup server and append the output to the interfaces file. To create a backup server using autotli: Enter the following command:
$AUTOSYS/install/autotli -s SYB_BACKUP -h host -p port >> $SYBASE/interfaces
where:
SYB_BACKUP
Specifies the name of the backup server.
host
Specifies the host name.
port
Specifies the backup server port. The backup server port can be the same as the port used by the data server, as long as the backup server and data server are on different machines.
>>
Appends the information to the $SYBASE/interfaces file. (Be sure to use two redirect symbols ( >> ); a single redirect ( > ) symbol will overwrite the existing file.)
For example, the following command creates the backup server SYB_BACKUP on host fiji, using port 5335:
$AUTOSYS/install/autotli -s SYB_BACKUP -h fiji -p 5335 >> $SYBASE/interfaces
The previous command will create the following entry in the $SYBASE/interfaces file:
# SYB_BACKUP on fiji (192.186.244.21) using tcp
# services: query (5335) master (5335) console (5336)
#
SYB_BACKUP
	query tli /dev/tcp x:00>214d7c0a8f4150000000000000000
	master tli /dev/tcp x:00>214d7c0a8f4150000000000000000
	console tli /dev/tcp x:00>214d8c0a8f4150000000000000000
If you already defined a dump device to Sybase, continue with the next section. Otherwise, you must define a disk or tape dump device to Sybase, as described in Defining a Dump Device in this chapter.
Dumping the Database
Verify that SYB_BACKUP (the backup server) is running before you dump the database. If it is not running, start the backup server by entering the following command:
$SYBASE/install/RUN_SYB_BACKUP
where:
database_name
Is the name of the database that contains your AutoSys data. By convention, this is autosys.
dump_device
Is the name of the Sybase dump device that you already defined to Sybase.
For example, the following writes to a dump device named autodump:
xql>dump database autosys to autodump;
Loading the Database
To load the database:
1. Sign on to the server that will contain the second event server.
2. Enter the following xql command:
xql -Usa -Psysadmin
3. To place the database in single user mode and guarantee that no transactions can occur while the load is in progress, add the following database options:
xql>sp_dboption autosys, "no chkpt on recovery", TRUE;
xql>sp_dboption autosys, "dbo use only", TRUE;
xql>sp_dboption autosys, "read only", TRUE;
4. 5.
where:
database_name2
Is the name of the database that contains your AutoSys data. By convention, this is autosys.
dump_device
Is the name for the Sybase dump device that you already defined to Sybase.
For example, the following will load a disk dump named autodump of the database named autosys:
xql>load database autosys from autodump;
6. After the load has succeeded, enter the following to unset the single user options that you set before executing the load:
xql>sp_dboption autosys, "no chkpt on recovery", FALSE;
xql>sp_dboption autosys, "dbo use only", FALSE;
xql>sp_dboption autosys, "read only", FALSE;
7.
Recovering a Bundled Sybase Database
To recover a damaged AutoSys database using the backup file:
1. Stop the event processor.
2. Drop the damaged database.
3. Re-create the database.
4. Reload the database.
5. Restart the event processor.
WARNING! You should drop the database and re-create it only in extreme situations. Before using this process to recover a damaged database, investigate all other options.
Stopping the Event Processor
To stop the event processor: Log on as the exec superuser and issue the following command:
sendevent -E STOP_DEMON
You can drop the damaged AutoSys database by using the drop database command. Before you drop the damaged database, however, ensure that you have a database dump file to use to restore the database.
To drop the damaged database:
1. Enter the following xql command (assuming that the sa password is sysadmin):
xql -Usa -Psysadmin
2.
If the AutoSys database is so damaged that drop database does not work, contact Computer Associates Technical Support.
Note: The semicolon at the end of the command line is an end-of-statement delimiter in xql, replacing the Sybase go.
Re-Creating the AutoSys Database
After dropping the damaged database, you must create a new AutoSys database. The database dump file that you are recovering (for example, the autodump file) is loaded into this new database. Use the create database command to create the new AutoSys database.
To re-create the AutoSys database:
1. To create the new database, at the xql prompt, enter the following:
xql>>[AUTOSYSDB][master] 1> create database
xql>>[AUTOSYSDB][master] 2> autosys on default
xql>>[AUTOSYSDB][master] 3> = 50;
Note: The above size of 50 MB is only an example.
The output generated by this command will look similar to this:
Msg 1805, Level 0, State 1, Line 1
CREATE DATABASE: allocating 15360 pages on disk default
2. To change the owner of the new AutoSys database to autosys, enter the following at the xql prompt:
xql>>[AUTOSYSDB][master] 1> use autosys;
xql>>[AUTOSYSDB][autosys] 1> sp_changedbowner
xql>>[AUTOSYSDB][autosys] 2> autosys;
When the procedure has completed successfully, a message similar to this should be returned:
Database owner changed.
You are now ready to execute the load database command to restore the database you originally dumped to autodump.
To reload the database and put it online:
Enter the following at the xql prompt:
xql>>[AUTOSYSDB][master] 1> load database autosys
xql>>[AUTOSYSDB][master] 2> from autodump;
xql>>[AUTOSYSDB][master] 1> online database autosys;
The AutoSys database is now restored and online.
Note: For Sybase System 11, you must put the database online. Previous versions of Sybase did not require this.
To exit xql:
Enter the following at the xql prompt:
xql>>[AUTOSYSDB][master] 1> exit
When you complete this process, AutoSys will be in the state it was in when the database was dumped to the backup file.
Chapter 13
Configuring AutoSys
The runtime behavior of AutoSys is controlled by the parameters in the AutoSys configuration file and the environment variables set in the /etc/auto.profile file. This chapter describes these files.
Note: If you are running AutoSys on Windows, the configuration parameters are set through the AutoSys Administrator. For more information on the AutoSys Administrator, see the Unicenter AutoSys Job Management for UNIX User Guide.
The file is instance-specific, and the $AUTOSERV value is the name of the instance of AutoSys with which the configuration file is associated. The $AUTOSERV variable must be three uppercase alphabetic characters and must be unique to each instance of AutoSys.
Note: Events have a unique ID called eoid, which is prefixed by the first three letters of $AUTOSERV. This ensures an event's uniqueness and traceability across multiple instances.
DBMaintTime=03:30
# Command to Run to perform Maintenance
#
DBMaintCmd=$AUTOSYS/bin/DBMaint
#
# For Dual Server Mode - transfer events timeout
EvtTransferWaitTime=5
#
# Check Heartbeat every 2 minutes
#Check_Heartbeat=2
#
# Output Directory for the Remote Agent
#
# Note: Some OSs have problems with file locks in /tmp
# If so use some directory other than /tmp.
#
AutoRemoteDir=/tmp
#
# Clean Remote Agent files: 1=Remove files if No
# Problems!
CleanTmpFiles=1
#
# Create Remote Agent Output File for Sourcing the
# Environment
# In UNIX: capture std_out & std_err from sourcing
# /etc/auto.profile
# In NT: output the Environment to the file
#
RemoteProFiles=1
#
# Host machines to send SNMP traps to. (Specifying
# a machine ENABLES traps)
#SnmpManagerHosts=host1,host2
#
# Snmp community. This is almost always "public"
#SnmpCommunity=public
# Enable sending HP Operations Center messages (opcmsg)
# for AutoSys alarms.
# This defines the message group under which
# messages will be sent.
#OpcMessageGroup=Job
#
# This parameter sets the amount of time that the
# event_demon will wait for the OpCenter message
# to be sent.
#
#OpcWaitTime=4
#
# RESTART configuration stuff
#
# Max number of times to RESTART a job due to system
# errors
MaxRestartTrys=10
#
# Formula for computing the Wait time between
# restart attempts:
# WaitTime = RestartConstant+(Num_of_Trys *
# RestartFactor)
# if (WaitTime > MaxRestartWait) then
# WaitTime = MaxRestartWait
#
RestartConstant=10
RestartFactor=5
MaxRestartWait=300
# Preferred method of Load Balancing
# Can be: vmstat | rstatd (default is vmstat)
#MachineMethod=rstatd
#
# List of Signals to Send to a Job for the KILLJOB
# event
KillSignals=2,9
#
# Port number of auto_remote
AutoRemPort=5280
#
# Specify if standard error and standard output
# files should be appended to or overwritten.
# 0 overwrites the file.
# 1 appends the file.
AutoInstWideAppend=1
#
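The sample file mixes comment lines (beginning with #) and Name=Value parameters, and a parameter is disabled by commenting it out. As a minimal sketch of that layout, the following Python reads only the active parameters. The parse_config function and the shortened sample text are illustrative assumptions; they are not part of AutoSys.

```python
def parse_config(text):
    """Collect active Name=Value parameters, skipping comments and blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # comment, commented-out parameter, or blank line
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


# Shortened excerpt of the sample file, for illustration only.
sample = """\
# Check Heartbeat every 2 minutes
#Check_Heartbeat=2
MaxRestartTrys=10
KillSignals=2,9
"""

settings = parse_config(sample)
print(settings)
```

Values are kept as strings, and variables such as $AUTOSYS are not expanded; a real reader would need to decide how to treat both.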
Or, if you are running AutoSys with a shadow event processor, you can start both the primary and the shadow event processors at the same time, like:
eventor -M shadow_machine
Typically, the database should never time out. However, if a database does time out, AutoSys will attempt to reconnect to the database the number of times specified by the DBEventReconnect parameter. If database connections time out often, it probably indicates a machine or data server contention problem.
Note: Setting DBLibWaitTime=0 means that no time-out value is applied; the connection is continuous. Because this can cause the event processor to hang, this setting is not recommended.
Database Connections
AutoSys can be configured to attempt to connect, and reconnect, to databases a specified number of times. This behavior applies when the first attempt to connect to the database is made, and when a database connection has been lost and a reconnect is attempted. This database connection behavior also sets the rollover criteria for dual-server mode.
DBEventReconnect
The DBEventReconnect parameter controls the number of times an event processor should attempt to connect (or reconnect) to an event server before shutting down, or before rolling over to single-server mode. This parameter is used on startup and when there is a connection problem during runtime. In single-server mode, this parameter is set to a simple number, like:
DBEventReconnect=50
This setting specifies that the event processor should attempt a connection with the event server 50 times. If it cannot connect after 50 attempts, it shuts down.
In dual-server mode, this parameter contains two values describing the connection and rollover behaviors, like:
DBEventReconnect=50,5
This setting specifies that the event processor should attempt five connections with the event servers. If it cannot connect after five attempts, it rolls over to single-server mode, marking the other event server as down. Once in single-server mode, the event processor attempts a connection 50 times, and if it is unsuccessful, the event processor shuts down.
Upon startup, the event processor attempts to connect to the event servers five times. If the event processor is unable to connect to both databases, it assumes there is a connection or configuration problem and shuts down gracefully.
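The two forms of the parameter can be summarized in code. The following Python sketch is illustrative only: the function name and the dictionary keys are hypothetical, and it models just the value layout described above (the first number is the attempt count before shutdown; the optional second number is the attempt count before rollover to single-server mode).

```python
def parse_db_event_reconnect(value):
    """Split a DBEventReconnect value into its one or two attempt counts."""
    counts = [int(part) for part in value.split(",")]
    if len(counts) == 1:
        # Single-server mode: only the shutdown threshold is given.
        return {"shutdown_after": counts[0], "rollover_after": None}
    if len(counts) == 2:
        # Dual-server mode: shutdown threshold, then rollover threshold.
        return {"shutdown_after": counts[0], "rollover_after": counts[1]}
    raise ValueError("expected one or two comma-separated numbers")


single = parse_db_event_reconnect("50")
dual = parse_db_event_reconnect("50,5")
print(single)
print(dual)
```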
EDNumErrors, EDErrTimeInt
The EDNumErrors parameter specifies the maximum number of errors that can occur within the time, in seconds, specified by the EDErrTimeInt parameter. If there are more than EDNumErrors errors within EDErrTimeInt seconds, AutoSys shuts down the event processor as a safety measure. The default settings shut down the event processor if more than 20 errors occur within 60 seconds, and the entries in the configuration file look like this:
EDNumErrors=20
EDErrTimeInt=60
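One way to picture the rule "more than EDNumErrors errors within EDErrTimeInt seconds" is a sliding time window. The Python sketch below is an assumption about how such a check could work, not a description of how the event processor implements it; the ErrorWindow class and its method names are hypothetical.

```python
from collections import deque


class ErrorWindow:
    """Track error timestamps and report when the threshold is exceeded."""

    def __init__(self, max_errors=20, interval=60):
        self.max_errors = max_errors  # plays the role of EDNumErrors
        self.interval = interval      # plays the role of EDErrTimeInt (seconds)
        self.times = deque()

    def record(self, now):
        """Record an error at time `now`; return True if shutdown is warranted."""
        self.times.append(now)
        # Discard errors that fall outside the current window.
        while self.times and now - self.times[0] >= self.interval:
            self.times.popleft()
        return len(self.times) > self.max_errors


# Twenty-one errors, one per second, using the default thresholds.
window = ErrorWindow()
results = [window.record(second) for second in range(21)]
print(results[19], results[20])
```

With the defaults, the twentieth error does not trip the threshold, and the twenty-first (still inside the 60-second window) does.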
Note: The third machine must have a remote agent installed on it, and it must have a valid client license. In addition, the third machine must be the same type of machine as the primary and shadow event processors, either Windows or UNIX.
For more information on running AutoSys with a shadow event processor, see the shadow event processor section in Chapter 1, Introduction to AutoSys, of the Unicenter AutoSys Job Management for UNIX Installation Guide.
The default FileSystemThreshold setting is 32 KB. Valid settings must be less than 10 MB and greater than 8192 bytes. If the amount of disk space falls below 8 KB, the event processor will issue an EP_SHUTDOWN alarm and shut down, issuing messages similar to the following:
ERROR: No disk space left to write Event Processor log
EVENT: STOP_DEMON
The Event STOP_DEMON has just been received. We are going down!
Calculates and updates the average job runs statistics for AutoSys/Xpert.
Updates statistics for the optimizer, checks the available space in the database, and sends a DB_PROBLEM alarm if the amount of free space is insufficient.
Cleans out old information from the AutoSys database tables using the archive_events command.
The DBMaint file is installed in the $AUTOSYS/bin directory. We recommend that you configure your system to back up the database during this maintenance cycle. For more information on using the DBMaint script and backing up a bundled Sybase database, see DBMaint Script and Bundled Sybase Backup and Recovery in this chapter.
DBMaintTime, DBMaintCmd
The following entries in the configuration file instruct the event processor to begin its maintenance cycle at 3:30 a.m. every day, and to execute the $AUTOSYS/bin/DBMaint script at that time:
DBMaintTime=03:30
DBMaintCmd=$AUTOSYS/bin/DBMaint
Event Transfer
When in dual-server mode, the event processor copies a missing event from one event server to the other event server after a time-out delay. This time-out delay is specified by the EvtTransferWaitTime parameter in the configuration file. The default setting of five seconds does not usually need to be modified.
EvtTransferWaitTime
To set the default behavior of five seconds for the time-out, the configuration file contains the following line:
EvtTransferWaitTime=5
Sendevent Retries
The following two configuration-file parameters control how many times and how frequently the sendevent command will attempt to send an event to the event server database.
SendeventMaxRetries
Specifies the maximum number of times that the sendevent command will attempt to send an event to the event server database. To set the number of retry attempts to 10, enter the following line in the configuration file:
SendeventMaxRetries=10
SendeventRetryInterval
Specifies the interval, in seconds, between attempts to send an event to the event server database. There is no default value. To set the interval to 15 seconds, enter the following line in the configuration file:
SendeventRetryInterval=15
Heartbeats
Heartbeats offer a method by which the continued progress of an application can be automatically monitored. That is, you can program your user applications to send heartbeats that AutoSys monitors. A heartbeat is a signal (SIGUSR2) sent from the application to the remote agent that started the application. That remote agent then sends a HEARTBEAT event to the AutoSys event servers. The event processor checks that the HEARTBEAT event has occurred within the heartbeat interval specified for the job.
Note: The event processor, not the remote agent, checks whether a HEARTBEAT event has arrived from the remote agent at the event servers. If the HEARTBEAT is absent, there is a problem, and an alarm is sent. Therefore, the HEARTBEAT option also provides a good indication of the stability of the network.
Check_Heartbeat
The Check_Heartbeat parameter specifies the interval value (in minutes) that you want the event processor to use when checking for heartbeats. If there are no applications sending heartbeats, you do not have to set this parameter. By default, this parameter is commented out in the AutoSys configuration file. For example, to instruct the event processor to check for missing heartbeats every two minutes, you would uncomment the following line in the configuration file:
Check_Heartbeat=2
ShadowPingDelay
Specifies the interval, in seconds, after a successful ping of the shadow event processor before another ping is attempted. The default is 60 seconds. To set the interval to 30 seconds, enter the following line in the configuration file:
ShadowPingDelay=30
AutoRemoteDir
The following entry in the configuration file specifies that the remote agents should use the /tmp directory for enterprise-wide logging:
AutoRemoteDir=/tmp
File Maintenance
CleanTmpFiles
For every job that AutoSys runs, it creates a file in the remote agent Log directory on the machine where the job runs. If CleanTmpFiles is turned off, these files remain on each machine until they are removed with the clean_files command. As an alternative to using the clean_files command, you can set the value for the CleanTmpFiles variable in the configuration file to be equal to 1, like:
CleanTmpFiles=1
Then, upon the successful completion of its tasks, the remote agent will remove the /tmp/auto_rem* file (assuming the default /tmp directory is specified by the AutoRemoteDir parameter). This is the format of the remote agent log filename, the auto_rem* file that is removed:
auto_rem.joid.run_number.ntry
If the remote agent cannot run the job successfully, the files are not removed because they are useful for diagnosing the problem.
Notes: To view the remote agent output file, use the autosys_log command on the client machine, like:
autosys_log -J job_name
Regardless of how you set the CleanTmpFiles parameter, you should run the clean_files command on a periodic basis to remove files from unsuccessful remote agent processes.
RemoteProFiles
When the RemoteProFiles parameter is turned on, any stderr and stdout output generated when the /etc/auto.profile file is sourced is redirected to a file. This parameter is on by default, and the entry in the configuration file looks like:
RemoteProFiles=1
The name of the file to which the profile output is written is based on the log file name. This is the form of the auto_rem_pro* filename:
auto_rem_pro.joid.run_number.ntry
This output file will contain entries if anything specified in the profile file failed (for example, environment variables or definitions were not set). For example, if the profile file attempts to set an environment variable using setenv, the Bourne shell that AutoSys runs would not be able to process this C Shell syntax, and the output file would contain the following line:
setenv: not found
Non-fatal errors when a profile file is sourced are not recorded and will not appear in the output file.
To view the profile output file, use the autosys_log command on the client machine, like:
autosys_log -J job_name -p
This command will display the log file first, appending the profile output, if there is any. If no profile output file exists, the log file will display:
File: profile_output_file Does Not Exist.
Note: If CleanTmpFiles is turned on (set to 1), the output file will be removed when the job completes successfully, and the profile log information will not be available. If CleanTmpFiles is turned off, the file will remain until it is removed with the clean_files command.
Note: This parameter governs retries that result because of system or network problems. It is different from the n_retrys job definition attribute, which controls restarts when a job fails due to application failure (for example, AutoSys is unable to find a file or a command, or permissions are not properly set).
where:
WaitTime
Is the calculated interval in seconds to wait before attempting the next restart of a job.
RestartConstant
Is a constant value you specify.
Num_of_Trys
Is the AutoSys counter tracking the number of times the job has already tried to start.
RestartFactor
Is a factor you specify.
MaxRestartWait
Is the maximum amount of time (in seconds) AutoSys will wait before it attempts to restart a job. If the calculated WaitTime is greater than the specified value for MaxRestartWait, then the resulting WaitTime is set to MaxRestartWait.
For example, the entry in the configuration file might look like:
RestartConstant=10
RestartFactor=5
MaxRestartWait=300
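As a worked example, the formula given in the sample configuration file comments, WaitTime = RestartConstant + (Num_of_Trys * RestartFactor), capped at MaxRestartWait, can be evaluated for the example values. The function name below is hypothetical; only the arithmetic comes from this chapter.

```python
def restart_wait_time(num_of_trys, restart_constant=10, restart_factor=5,
                      max_restart_wait=300):
    """Return the wait, in seconds, before the next restart attempt."""
    wait_time = restart_constant + num_of_trys * restart_factor
    # If the calculated value exceeds MaxRestartWait, use MaxRestartWait.
    return min(wait_time, max_restart_wait)


# Wait times after the first, second, and tenth start attempts.
for tries in (1, 2, 10):
    print(tries, restart_wait_time(tries))
```

With these values the wait grows by five seconds per attempt (15, 20, ... 60 at the tenth try), so the 300-second cap would only apply after 58 or more attempts, well beyond the MaxRestartTrys=10 shown in the sample file.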
If rstatd is chosen, you must ensure that this demon is running on all client machines. To ensure that the demon is running, do the following:
Edit the internet demon configuration file (/etc/inetd.conf) on all client machines, and un-comment the rstatd entry.
Send a SIGHUP signal (kill -1) to reset the running inetd process.
Sometimes, a kill -1 command is not sufficient to reset inetd. If rstatd fails, you might have to issue a kill -9 command, then restart inetd. If necessary, check with your systems administrator.
Note: rstatd is not currently supported on NCR or Pyramid client machines.
KILLJOB Signals
KillSignals
The KillSignals parameter specifies a single signal or a comma-separated list of UNIX signals to send to a job whenever the KILLJOB event is sent. If a list is specified, the signals are sent in the order listed, with a five-second sleep between each one. If the job to be killed is running on a Windows machine, this list is ignored and the job is simply terminated. To preserve AutoSys backward compatibility, the entry in the configuration file is:
KillSignals=2,9
We recommend that you set this parameter to 15,9. In most cases, this will lead AutoSys to return a TERMINATED state for a job that was killed. If it does not, as might happen for some shells on some operating systems, set the KillSignals parameter to 9.
Note: The KillSignals listed in the configuration file are overridden when issuing the sendevent command with the -k option.
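The ordered delivery described above (each signal in turn, with a five-second sleep in between) can be sketched as follows. This is an illustration of the described behavior, not the remote agent's code: the function name is hypothetical, and the send and sleep parameters exist only so the example can be exercised without signaling a real process.

```python
import os
import time


def send_kill_signals(pid, kill_signals="2,9", delay=5,
                      send=os.kill, sleep=time.sleep):
    """Send each listed signal to pid in order, pausing between signals."""
    sent = []
    for index, number in enumerate(int(s) for s in kill_signals.split(",")):
        if index > 0:
            sleep(delay)  # pause before every signal after the first
        try:
            send(pid, number)
        except ProcessLookupError:
            break  # the process has already exited
        sent.append(number)
    return sent


# Dry run with stand-ins for os.kill and time.sleep; no real signal is sent.
calls = []
sent = send_kill_signals(
    12345, "15,9",
    send=lambda pid, sig: calls.append(("signal", pid, sig)),
    sleep=lambda seconds: calls.append(("sleep", seconds)),
)
print(calls)
```

The dry run uses the recommended 15,9 setting and a made-up process ID.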
Note: If you are using NIS or NIS+ and want to change AutoRemPort, you must modify /etc/services on your NIS or NIS+ master, push it to all client machines, and then send a SIGHUP signal (kill -1) to the inetd process.
Cross-Platform Scheduling
AutoSysAgentSupport
The AutoSysAgentSupport parameter specifies whether the event processor should start the asbIII process for cross-platform scheduling. If the value is 1, cross-platform scheduling is enabled and the event processor sends jobs to AutoSys Connect and AutoSys Agent managed machines. If the value is 0, the event processor does not send jobs to these machines. By default, AutoSys Connect and AutoSys Agent job support is off, and the entry in the configuration file looks like:
AutoSysAgentSupport=0
For more information, see the appendix Integrating with the Mainframe and AutoSys Agents for AS/400 and OpenVMS, in this guide.
AutoSysAgentDebug
The AutoSysAgentDebug parameter specifies whether the asbIII process writes verbose trace information to the event processor log file. If the value is 1, verbose trace logging is enabled. If the value is 0, it is disabled. By default, AutoSys Connect and AutoSys Agent job debug is off, and the entry in the configuration file looks like:
AutoSysAgentDebug=0
For more information, see the appendix Integrating with the Mainframe and AutoSys Agents for AS/400 and OpenVMS, in this guide.
Each client machine can override the instance-wide setting by using the AutoMachWideAppend variable in the /etc/auto.profile file. If specified, this variable would appear as shown in the following example:
#AUTOENV#AutoMachWideAppend=YES
Note: If you are running jobs across platforms, the event processor of the issuing instance controls the default behavior. For Windows, the default is to overwrite this file.
An individual job definition can override either the instance-wide or the machine setting by placing the following notation as the first characters in the standard output and standard error file specifications:
> Overwrite file
>> Append file
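The override order described above (job definition first, then the machine setting, then the instance-wide setting) can be sketched in Python. This is an illustration under stated assumptions: the function name is hypothetical, the prefixes are taken to be > for overwrite and >> for append, and any AutoMachWideAppend value other than YES is treated as overwrite.

```python
def resolve_output_mode(file_spec, inst_wide_append=1, mach_wide_append=None):
    """Return (mode, path), where mode is 'append' or 'overwrite'."""
    # 1. A prefix in the job definition takes precedence over everything.
    if file_spec.startswith(">>"):
        return "append", file_spec[2:]
    if file_spec.startswith(">"):
        return "overwrite", file_spec[1:]
    # 2. Otherwise the machine-level AutoMachWideAppend value, if present.
    if mach_wide_append is not None:
        mode = "append" if mach_wide_append.upper() == "YES" else "overwrite"
        return mode, file_spec
    # 3. Otherwise the instance-wide AutoInstWideAppend value (1 = append).
    return ("append" if inst_wide_append == 1 else "overwrite"), file_spec


print(resolve_output_mode("/tmp/job.out"))
print(resolve_output_mode(">/tmp/job.out"))
print(resolve_output_mode("/tmp/job.out", inst_wide_append=0,
                          mach_wide_append="YES"))
```

The /tmp/job.out path is a made-up example.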
Note: Setting InetdSleepTime too low for your hardware could adversely affect performance. In addition, you must ensure your machine has a processor fast enough to handle starting jobs at a faster interval. Otherwise, there will be frequent socket connection failures, which will cause numerous job restarts.
where:
0: To report no events to Unicenter (default setting).
1: To log all alarms.
2: To log only alarms and job completions.
3: To log all events that are generated to the Unicenter Console.
It specifies a number of descriptors that set the environment for the remote agent on the client machine. These descriptors are environment variables that are preceded by the characters:
#AUTOENV#
It specifies default settings for AutoSys jobs that do not have a profile specified in the job definition. A job profile sets environment variables for a job immediately before the job is started.
You may want to view the /etc/auto.profile file in a text editor to familiarize yourself with the environment settings in this file. For information about other settings that can be added to the auto.profile, see the following sections in this chapter:
AutoMachWideAppend: See Instance Wide Append Parameter.
AUTOSV and AUTOSV_DIR: See ServerVision Environment.
DENY_ACCESS: See Client-Side Security.
The remote agent looks for the #AUTOENV# descriptors and reads the variables that follow to set its environment. Do not remove the #AUTOENV# descriptors from the file. They are required to enable the remote agent to communicate with the AutoSys database.
Note: The #AUTOENV# descriptor is not a comment. Do not remove the # character from the beginning of the line.
The #AUTOENV#SYBASE descriptor defines where the remote agent looks for the Sybase interfaces file, which is specified by the SYBASE environment variable. If the interfaces file is not present in this location, the remote agent will not be able to write job statuses to the database.
Note: A common symptom of the remote agent being unable to write to the database is that jobs remain in the STARTING state.
The #AUTOENV#TNS_ADMIN descriptor defines where the remote agent looks for the tnsnames.ora file, if the TNS_ADMIN environment variable is set. If the TNS_ADMIN variable is not set, the remote agent looks in the default directory, which is operating system specific.
The #AUTOENV#ORACLE_HOME descriptor defines the top-level Oracle directory, defined by the ORACLE_HOME environment variable.
In the AutoSys configuration file, the socket connection is defined by the AutoRemPort variable, like:
# Port number of auto_remote
AutoRemPort=5280
In the /etc/services file, the auto_remote entry defines the socket connection, like:
auto_remote 5280/tcp # AutoSys Version 4.0
The internet services demon (inetd) on the client machine uses the port number defined in the AutoSys configuration file to point to the name of the auto_remote service found in the /etc/services file. The service name is then located in the internet demon configuration file (usually /etc/inetd.conf), where it finds the path to the remote agent binary, like:
auto_remote stream tcp nowait root /usr/vendor/autotree/auto_remote auto_remote
On the same machine, you could have a 3.4 remote agent using port number 4500. The entry in its AutoSys configuration file (for example, config.DEV) would look like:
AutoRemPort=4500
If you do this, you must also modify the /etc/services file like:
auto_remote_33 3500/tcp # AutoSys Version 3.3
auto_remote_34 4500/tcp # AutoSys Version 3.4
Then, you would modify the internet demon configuration file (/etc/inetd.conf) like:
auto_remote_33 stream tcp nowait root /usr/vendor/autotree_33/auto_remote auto_remote
auto_remote_34 stream tcp nowait root /usr/vendor/autotree_34/auto_remote auto_remote
Unix ruserok() Authentication
When using this method, AutoSys instructs a client's remote agent to make the UNIX system ruserok() call. This function checks the client machine's /etc/hosts.equiv file and the user's .rhosts file to validate that the requesting user is registered in that environment.
AutoSys remote agent event processor Authentication
When using this method, a specific remote agent can be bound to one or more event processors. In other words, a remote agent must verify its permission to process an event processor's requests before starting each job. The remote agent does this by reading the /etc/.autostuff file on the machine where it is running.
Using the autosys_secure command, the AutoSys edit superuser can enable (or disable) remote authentication. By default, remote authentication is disabled. If you enable remote agent event processor authentication, you must configure AutoSys to support it.
Enable remote agent event processor authentication.
Create an ASCII file named .autostuff in the /etc directory of every client machine that will participate in this authentication method.
If both are present, the remote agent will only run jobs submitted by event processors listed in the .autostuff file.
The /etc/.autostuff File
The /etc/.autostuff file should have read and write permissions for root only. Entries in this file must be in the following form:
AUTOSERV:hostname
where:
AUTOSERV
Is an AutoSys instance name.
hostname
Is the name of the machine on which the event processor is running. This must be a real machine (if using DNS, this should be a fully-qualified name).
The file should contain an entry for each event processor you want authorized to run jobs on the remote agent machine. The entries cannot contain spaces within the event processor specification. You can use pound signs (#) for comments. These are example file entries:
PRD:curly #Production Instance
DEV:moe #Development Instance
In this example, PRD is an AutoSys instance name and the event processor for this instance is running on the machine curly. DEV is an AutoSys instance name and the event processor for this instance is running on the machine moe. These event processors are authorized to issue jobs to the remote agent.
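To make the entry format concrete, the following Python sketch reads text in the AUTOSERV:hostname form, dropping # comments and rejecting entries that contain spaces. The parse_autostuff function is hypothetical and only models the file layout described above; it is not how the remote agent itself reads the file.

```python
def parse_autostuff(text):
    """Return the set of (instance, hostname) pairs listed in the text."""
    authorized = set()
    for raw_line in text.splitlines():
        entry = raw_line.split("#", 1)[0].strip()  # drop trailing comment
        if not entry:
            continue
        instance, sep, hostname = entry.partition(":")
        has_space = any(ch.isspace() for ch in entry)
        if not sep or not instance or not hostname or has_space:
            raise ValueError(f"malformed entry: {raw_line!r}")
        authorized.add((instance, hostname))
    return authorized


sample = """\
PRD:curly #Production Instance
DEV:moe #Development Instance
"""

authorized = parse_autostuff(sample)
print(("PRD", "curly") in authorized)
print(("TST", "larry") in authorized)
```

The TST:larry pair is a made-up example of an event processor that is not listed and therefore would not be authorized.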
Client-Side Security
The AUTOENV environment variable DENY_ACCESS restricts access to the remote agent machine. In the auto.profile file for the remote agent machine, you can specify a list of users whose jobs are prohibited from running on that machine. The list is a comma-delimited list of user names, with no spaces. The maximum number of characters is 512. For example:
########################################################
# auto_remote environment variables
# DO NOT REMOVE
#AUTOENV#DENY_ACCESS=root,demon,admin
In this example, jobs owned by root, demon, and admin will not be launched by the remote agent. If a job owned by one of these users is submitted to run on the remote agent, the job fails as if the job's owner did not have a valid account on the machine. There will be no job restart attempts, regardless of the MaxRestartTrys setting in the AutoSys configuration file. When this occurs, the following error appears in the event processor log, as a STARTJOBFAIL alarm on the job:
Permission ERROR: Could not SET uid=uid on Host: host
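The DENY_ACCESS format (an #AUTOENV# descriptor followed by a comma-delimited list with no spaces, at most 512 characters) can be illustrated with a short Python sketch. The function names are hypothetical, and the code only models the file syntax shown above, not the remote agent's actual checks.

```python
AUTOENV_PREFIX = "#AUTOENV#"


def read_autoenv(profile_text):
    """Collect NAME=value pairs from lines that start with #AUTOENV#."""
    env = {}
    for raw_line in profile_text.splitlines():
        line = raw_line.strip()
        if line.startswith(AUTOENV_PREFIX):
            name, sep, value = line[len(AUTOENV_PREFIX):].partition("=")
            if sep:
                env[name] = value
    return env


def denied_users(env):
    """Return the set of user names listed in DENY_ACCESS."""
    value = env.get("DENY_ACCESS", "")
    if len(value) > 512:
        raise ValueError("DENY_ACCESS list exceeds 512 characters")
    return set(value.split(",")) if value else set()


profile = """\
########################################################
# auto_remote environment variables
# DO NOT REMOVE
#AUTOENV#DENY_ACCESS=root,demon,admin
"""

env = read_autoenv(profile)
print("root" in denied_users(env))
print("alice" in denied_users(env))
```

The user name alice is a made-up example of an owner that is not on the list.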
Library Path
A ServerVision library is shipped with this AutoSys version. It is installed in $AUTOSYS/install/data. You must set your library path as appropriate for your operating system to point to this library.
svload Requirements
To use the svload load-balancing executable, the following is required:
1. The ServerVision uvcfgref environment variable must be set in the environment for the AutoSys event processor. This must be set to the directory that contains the ServerVision instances file.
2. The ServerVision instances file must contain entries for all machines that you will use with the svload executable. For example, to use a machine named mack, the instances file must contain the following entry:
[mack]
type=unix
node=mack
3. The ServerVision uvroot environment variable must be set as required by ServerVision.
4. The ServerVision xuv_net_svc binary must be in your path.
YES: Writes AutoSys job information to a file for ServerVision. This enables a user to view AutoSys jobs from the ServerVision GUI.
YesWithArchive: Writes AutoSys job information to a file for ServerVision and expects a file in return to place in the archive database table (called svarchive_tbl). This creates an archive of a job's resource usage to generate reports for capacity planning and UNIX process auditing (chargeback).
AUTOSV_DIR must be set to tell AutoSys where to write the files for ServerVision. This must be a full path name and it must be the same path that you specify in your ServerVision instance configuration file.
ServerVision Configurations
To view AutoSys jobs in the ServerVision GUIs:
1. The ServerVision product must already be installed as instructed in the ServerVision documentation.
2. View the ServerVision configuration file, typically $uvunix/instance/uv_unix.cfg (where instance is the ServerVision instance name). In this configuration file, locate the app_* scan types (scan types are shown in square brackets) and look for the line that begins with the string command, such as:
command exec $uvbin/unix/appd -d /tmp
The directory specified with the -d argument must match the directory you specify with the AUTOSV_DIR variable in the AutoSys /etc/auto.profile file.
3. Restart the ServerVision instance. From the XUV GUI, select Control, Instance, Stop. Then select Control, Instance, Start.
4. Ensure the app_group scan types are on. From the XUV GUI, select Control, Configure, Scan.
One file is created for each job run. These files are stored in the directory specified by the AutoRemoteDir parameter in the AutoSys configuration file. You must run the svarchive utility to process these files and place the information in the svarchive database table. After the process is complete, the svarchive utility deletes the files.
Note: You may want to create an AutoSys job to run the svarchive utility automatically on a regular schedule.
The information in the svarchive database table can be referenced to generate reports for capacity planning and UNIX process auditing (chargeback).
DB_ROLLOVER: AutoSys has rolled over from dual-server to single-server mode.
DB_PROBLEM: There is a problem with one of the AutoSys databases.
EP_ROLLOVER: The shadow event processor is taking over processing.
EP_SHUTDOWN: The event processor is shutting down. This might be due to a normal shutdown, or due to an error condition.
EP_HIGH_AVAIL: The third machine for resolving contentions between two event processors cannot be reached, one of the event processors is shutting down, or there are other event processor take-over problems.
To specify what executable should be invoked as a user-defined callback for one of the above alarms, a file named notify.$AUTOSERV must be created in the $AUTOUSER directory. An example of this file is provided in the $AUTOSYS/install/data/notify.ACE file, which contains the following entries:
# Notify for certain CRITICAL ALARMS
#
# Form is: ALARM executable
# We pass in $1 = numeric code
#            $2 = Text Message
# Only have 1 space between the ALARM and the executable
#
# The environment is inherited from the Event Processor
# The following is executed: system( <executable>
#                                    $1 $2 & );
#
#DB_ROLLOVER $AUTOUSER/notify_db
#DB_PROBLEM $AUTOUSER/notify_db
#EP_ROLLOVER $AUTOUSER/notify_ep
#EP_SHUTDOWN $AUTOUSER/notify_ep
#EP_HIGH_AVAIL $AUTOUSER/notify_ep
Notification Example
To have AutoSys call the program /usr/local/bin/pager when the event processor shuts down:
1. Copy the sample notification file to:
$AUTOUSER/notify.$AUTOSERV
Then, AutoSys will pass pager a numeric code and a text message. The pager itself must be coded to accept these parameters.
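The callback contract described above can be sketched as a small shell script. Following the form given in the sample file (ALARM, one space, executable), the entry for this example would presumably read EP_SHUTDOWN /usr/local/bin/pager. The helper name format_alarm, the message layout, and the mail command in the comment are illustrative assumptions, not part of AutoSys.

```shell
#!/bin/sh
# Hypothetical pager callback sketch. Per the sample notify file, the event
# processor runs the configured executable as: <executable> $1 $2 &
#   $1 = numeric alarm code
#   $2 = text message

# format_alarm CODE TEXT
# Illustrative helper (not an AutoSys utility): builds one line of text
# from the two arguments the callback receives.
format_alarm() {
    code=$1
    text=$2
    printf 'AutoSys alarm %s: %s\n' "$code" "$text"
}

# A real script would pass the formatted line to whatever paging or mail
# command the site uses, for example (assumed command and address):
#   format_alarm "$1" "$2" | mailx -s "AutoSys alarm" oncall@example.com
```

Because the environment is inherited from the event processor, a script like this should not rely on variables from an interactive logon profile.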
Chapter 14
Troubleshooting
Problems with AutoSys usually involve the interactions between the major AutoSys components, rather than the individual components themselves. This chapter presents a number of common AutoSys problems, their symptoms, and how to resolve them, with a focus on troubleshooting the primary AutoSys components.
To troubleshoot AutoSys more effectively, it is essential that you understand the stages in the life of a job, the order in which they occur, and the roles played by the three major AutoSys components (that is, event server, event processor, and remote agent). When a job is defined, its starting conditions are saved to the event server (database), and the following occurs:
When its starting conditions are met, the event processor initiates a remote agent on the client machine to execute the job.
The remote agent runs the job and sends the exit status of the job back to the event server.
After the job completes, it is not run again until its starting conditions are met.
This is the basic cycle for all jobs. For more information about job processing, see Basic AutoSys Functionality in the chapter Introduction to AutoSys, in this guide. For information about troubleshooting CCI, see the appendix Troubleshooting CCI in this guide.
2. When running programs like autorep or autosc, you get a message like:
Client ERROR:
or:
Unable to connect: SQL Server is unavailable or does not exist.
3. 4.
Resolution
This indicates that either the data server is down, or the process in question is unable to access it. To confirm that the data server is down, log on to the server machine and run the chk_auto_up utility. You can also look at the process table (using the UNIX ps command) for the process name dataserver. If the database is indeed down, you must restart it. For directions on how to do this, see Starting Sybase in the chapter Maintaining AutoSys, in this guide.
If the database is running, the problem could be that you are pointing to the wrong data server. The DSQUERY environment variable points to the name of the data server (typically AUTOSYSDB). If it is not set properly and you are not specifying a data server name to xql (using the -S server option), then xql will fail.
The AutoSys environment variables point to the configuration file $AUTOUSER/config.$AUTOSERV. This file contains the name of the event server. Check to make sure that the environment variables and the configuration file point to the proper location and event server. Enter the following command for the instance ID:
echo $AUTOSERV
Enter the following command for the event server and database name:
get_server $AUTOSERV
where:
E  Means that this is an event server.
AUTOSYSDB  Is the name of the event server.
autosys  Is the name of the database (for Sybase and Microsoft SQL Server).
If the database service is up, and the environment variable is set properly, then the Sybase interfaces file might not have the proper form, or be in the proper location. This typically occurs on servers if the environment has been changed, or, on clients, if the interfaces file has not been correctly installed. For more information about the Sybase interfaces file and connecting to databases, see the chapter Introduction to AutoSys in the Unicenter AutoSys
Sybase Deadlock
Symptom
A message similar to the following appears in the event processor log when viewed with the autosyslog -e command or in the Sybase error log ($SYBASE/install/errorlog_EventServer):
Your server command (process id #11) was deadlocked with another process and has been chosen as deadlock victim. Re-run your command.
Resolution
A deadlock is a Sybase condition that occurs when two users have a lock on separate objects, and they each want to acquire an additional lock on the other user's object. The first user is waiting for the second user to let go of the lock, but the second user will not let go until the lock on the first user's object is freed. The data server detects the situation and chooses the user whose process has accumulated the least amount of CPU time as the victim. The data server rolls back the victim's transaction, notifies the application with the above error message, and allows the other user's processes to move forward. sendevent will try to rerun the command until it is successful or until it reaches the maximum number of tries specified by the -M option.
Resolution
These messages occur because there are more users who want to run jobs simultaneously than there are user connections; there are not enough connections available to the database. By default, the bundled Sybase installation of AutoSys has a limit of 25 user connections. You can increase the number of user connections, but first you must determine the maximum number of user connections your system can support.
To determine the maximum number of user connections you can set for your system:
1. Log into the database as the sa.
2. At the isql or xql prompt, enter:
1> select @@max_connections;
3.
4. Enter:
select count(*) from master..sysdevices where mirrorname is not NULL;
5. Enter:
1> select count(*) from master..sysservers where srvname != @@servername;
The maximum number of user connections that you can set is @@max_connections minus the sum of the results of the last three queries. With the example results of the above queries, the maximum number of user connections is 249.
To increase the number of user connections:
1. At the isql or xql prompt, enter the following command to specify the number of user connections you want:
1> sp_configure "user connections", number;
where:
number
WARNING! If you set the number of user connections too high, the database will be unusable. At this point, you might not be able to rerun sp_configure to lower the number of user connections. To return the database to working order, you must run buildmaster or recover the database from backups.
2. Stop and restart the event server. Changes will not take effect until you stop and restart the event server.
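The limit calculation described earlier in this procedure (@@max_connections minus the sum of the three count(*) results) can be sketched as shell arithmetic. The function name and the sample inputs are hypothetical; the inputs were chosen only so that the result matches the 249 quoted above, and they are not the guide's own example values.

```shell
#!/bin/sh
# max_user_connections MAX Q1 Q2 Q3
# Illustrative helper: subtracts the sum of the three count(*) query
# results (Q1, Q2, Q3) from the @@max_connections value (MAX).
max_user_connections() {
    echo $(( $1 - ($2 + $3 + $4) ))
}

# Hypothetical inputs: @@max_connections = 256, query results 3, 2, and 2.
# Prints the highest value that could be passed to sp_configure.
max_user_connections 256 3 2 2
```

Given the warning above about setting the value too high, it seems prudent to choose a number comfortably below the computed ceiling.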
2. 3.
If trunc. log on chkpt is not set, enter one of the following commands depending on your version of Sybase:
1> sp_dboption autosys, "trunc log on chkpt", true;
or:
1> sp_dboption autosys, "trunc. log on chkpt.", true;
4.
Resolution
To resolve the full transaction log problem:
1. Log into the database server as the sa by entering the following command (if you changed the sa password, use that one instead of sysadmin):
xql -Usa -Psysadmin
2. 3. 4.
Repeat step 4 and each time increase the rowcount incrementally. To begin with, try doubling the number. If this fails at any time, decrease the rowcount and try again. When the rowcount is large, log out of the data server.
5. Log out of the data server, and then run archive_events. Begin with a large number of days and work down until you reach your target number of days. For example:
archive_events -n 30
archive_events -n 10
Resolution
You are not the same user who last started Sybase, and do not have permission to write to the Sybase error log file. Become that user, change the error log file ($SYBASE/install/errorlog_$DSQUERY) permissions, or delete the old error log file.
Everything that the event processor does, in the order it was done, is in this file. Network problems are usually reflected in this file as well. This file is very useful for reconstructing what happened when a problem occurs.
3. The event processor log has not registered a date and timestamp for a period of time. The event processor log should register date and timestamps every minute.
Resolution
Confirm that the event processor is down by performing one of the following actions:
Run the chk_auto_up utility.
Perform a tail on the log file and check for date stamps.
Look for the event_demon process using ps.
If the event processor is indeed down, log on as the exec superuser and run the eventor command to restart it.
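The ps check listed above can be sketched as a small filter over ps output. The function name ep_running is hypothetical, and the ps options to use depend on the platform; only the process name event_demon comes from this guide.

```shell
#!/bin/sh
# ep_running
# Illustrative helper: reads process-table output on standard input and
# succeeds if a line mentions event_demon. The [e] bracket keeps a grep
# command line that contains this pattern from matching itself.
ep_running() {
    grep -q '[e]vent_demon'
}

# Typical use (ps options vary by platform):
#   if ps -ef | ep_running; then
#       echo "event processor is running"
#   else
#       echo "event processor appears to be down"
#   fi
```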
Resolution
You are not the same user who last started the event processor. Either become that user, or change permissions on the event processor output file, $AUTOUSER/out/event_demon.$AUTOSERV.
autoping
autoping is used to test the connections between the event processor and the remote agent. If you use the autoping -m client_hostname -D command, and it does not return an error, the remote agent should start properly. The remote agent writes RUNNING and completion statuses directly to the event server.
Database Verification
Use autoping to verify the remote agent database connection. To check the database connections on machine, enter:
autoping -m machine -D
Instead of a single machine, you can type -m ALL to check all machines. This command captures the output from the attempted database connection, displays it, and includes it in the alarm, if one is generated (use the -A argument to generate an alarm if problems are found).
autoping -m venice -D
AutoPinging Machine [venice] AND checking the Remote Agent's DB Access.
AutoPing WAS SUCCESSFUL!
Symptom
The event processor's $AUTOUSER/out/event_demon.$AUTOSERV log file contains a message similar to the following:
Attempting to connect to AutoSys Remote Agent Service on socket=5280: Connection Refused
Attempting to connect to inetd on socket=5280: Interrupted system call
The connection to machine: spartacus TIMED OUT.
Either that machine, or the network to it, is having problems.
ERROR trying to start job: test
Error: Connect to socket FAILED.
Command: sleep 1 Machine: spartacus
Resolution
Either there is a network communication problem, or the client machine is down. The network or machine problems must be resolved before jobs can be run on that machine.
Symptom
The event processor's $AUTOUSER/out/event_demon.$AUTOSERV log file contains a message similar to:
Attempting to connect to AutoSys Remote Agent Service on socket=5280: Connection Refused
Attempting to connect to inetd on socket=5280: Interrupted system call
Could NOT connect to machine: spartacus
The internet demon may not be configured properly.
ERROR trying to start job: test
Error: Connect to socket FAILED.
Command: sleep 1 Machine: spartacus
Resolution
There is a problem with the /etc/services file. To locate this problem, check the following:
1. Check the /etc/services file to see if the following entry is there:
auto_remote 5280/tcp
2. Confirm that the remote agent port number, specified with AutoRemPort in the AutoSys configuration file ($AUTOUSER/config.$AUTOSERV), is the same number as used in the /etc/services file.
3. If you are using NIS/NIS+, remake the services and push it out. This refers to modifying the NIS/NIS+ master machine and propagating the information to all NIS/NIS+ clients. You should contact your NIS system administrator for instructions on how to do this.
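The comparison between the /etc/services entry and the AutoRemPort setting can be sketched as follows. The function names are hypothetical, and the sketch assumes the configuration file holds a line of the form AutoRemPort=5280 with no surrounding spaces, in line with the parameter=value settings shown elsewhere in this guide.

```shell
#!/bin/sh
# services_port FILE
# Prints the TCP port listed for auto_remote in a services-format file.
services_port() {
    awk '$1 == "auto_remote" && $2 ~ /\/tcp$/ { sub(/\/tcp$/, "", $2); print $2; exit }' "$1"
}

# config_port FILE
# Prints the AutoRemPort value; assumes an "AutoRemPort=<number>" line.
config_port() {
    awk -F= '$1 == "AutoRemPort" { print $2; exit }' "$1"
}

# ports_match SERVICES_FILE CONFIG_FILE
# Succeeds only if both files name the same, non-empty port.
ports_match() {
    svc=$(services_port "$1")
    cfg=$(config_port "$2")
    [ -n "$svc" ] && [ "$svc" = "$cfg" ]
}

# Typical use:
#   ports_match /etc/services "$AUTOUSER/config.$AUTOSERV" || echo "port mismatch"
```

On NIS/NIS+ clients the local /etc/services file may not be the effective source, so a check like this would need to read the propagated map instead.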
Symptom
In the event processor's $AUTOUSER/out/event_demon.$AUTOSERV log file, you see a message similar to the following:
[spartacus connected]
*** Remote Agent Process not started. ***
socket read <>, rc=0
ERROR trying to start job: test
Error: Auto Remote process did not start.
Command: sleep 1 Machine: spartacus
Resolution
Check that the internet demon is properly configured on the client (remote agent) machine. There should be an entry in inetd.conf for auto_remote. If present, this line points to the remote agent's executable. Assuming that auto_remote is in the /usr/local/bin directory, and will be run by the user root, the entry should look like this (on one line):
auto_remote stream tcp nowait root /usr/local/bin/auto_remote auto_remote
Note: The auto_remote command must be run as root, because it does a setuid. For more information about configuring the internet demon on the client machine, see the Unicenter AutoSys Job Management for UNIX Installation Guide.
If the auto_remote entry exists in the inetd.conf file, follow these steps:
1. Make sure that the auto_remote binary (the remote agent executable) exists in the specified directory and has execute permission.
2. If the inetd.conf file needs to be modified, a SIGHUP (kill -1) needs to be sent to the inetd process to cause the configuration to be re-read. To do this, sign on as root and execute one of these commands:
$AUTOSYS/install/touch_inetd
Or execute:
kill -1 pid
where:
pid  Is the process ID of the internet demon (inetd).
This command forces the internet demon to re-read its configuration file, and to reset itself.
Note: On most AIX machines, the kill -1 command fails to reset the inetd. Instead, you must kill the inetd process using kill -9, and then restart the inetd manually.
3. Make sure the remote agent can write to the AutoRemoteDir directory (usually /tmp), and that this directory has available space.
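The inetd.conf checks in this resolution (an auto_remote entry exists, it runs as root, and the named binary is executable) can be sketched as one function. The function name and messages are hypothetical; the field positions follow the one-line entry shown above.

```shell
#!/bin/sh
# check_inetd_entry FILE
# Illustrative helper: inspects the auto_remote line of an inetd.conf-style
# file. Fields are: service, socket type, protocol, wait flag, user,
# server path, arguments.
check_inetd_entry() {
    conf=$1
    line=$(awk '$1 == "auto_remote" { print; exit }' "$conf")
    if [ -z "$line" ]; then
        echo "no auto_remote entry in $conf"
        return 1
    fi
    # Split the entry into positional fields.
    set -- $line
    if [ "$5" != "root" ]; then
        echo "auto_remote user field is '$5', expected root"
        return 1
    fi
    if [ ! -x "$6" ]; then
        echo "$6 is missing or not executable"
        return 1
    fi
    echo "auto_remote entry looks consistent"
}

# Typical use:
#   check_inetd_entry /etc/inetd.conf
```

A commented-out entry (one starting with #) is treated as absent, because its first field is not exactly auto_remote.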
where:
AutoRemoteDir  Is the remote agent log directory specified in the configuration file (usually specified as the /tmp directory).
pid  Is the process ID of the remote agent (the suffix in auto_rem.pid).
When the remote agent receives its instructions from the event processor, it renames this file in order to give it a unique name. This is the form of the new filename:
AutoRemoteDir/auto_rem.joid.run_num.ntry
where:
joid  Is the job object ID (the job's number in the database).
run_num  Is the job run number.
ntry  Is the number of tries or restarts.
This file contains all the instructions passed to the remote agent by the event processor, the results of any resource checks, and a record of all actions it took. Any problems experienced by the remote agent are reported here, including the inability to send events to the databases, which is the most common problem.
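The renamed log filename can be split back into its parts with plain shell parameter expansion. This is a minimal sketch; the function name and output layout are hypothetical, and only the auto_rem.joid.run_num.ntry form comes from this guide.

```shell
#!/bin/sh
# parse_agent_log PATH
# Illustrative helper: prints the joid, run_num, and ntry fields of a
# remote agent log named auto_rem.joid.run_num.ntry. Fails for names that
# do not have that shape (for example, the initial auto_rem.pid name).
parse_agent_log() {
    name=${1##*/}              # drop the directory (AutoRemoteDir)
    case $name in
        auto_rem.*.*.*) ;;
        *) return 1 ;;
    esac
    rest=${name#auto_rem.}     # joid.run_num.ntry
    joid=${rest%%.*}
    rest=${rest#*.}            # run_num.ntry
    run_num=${rest%%.*}
    ntry=${rest#*.}
    echo "joid=$joid run_num=$run_num ntry=$ntry"
}

# Example with a made-up filename:
parse_agent_log /tmp/auto_rem.1234.56.1
```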
To find the most recent instance of the remote agent log for a given job, you issue the following command on the machine where the job last ran:
autosyslog -J job_name
Note: If the configuration file specifies that the remote agent log files are to be cleaned up at the completion of a job, and the job completed normally, the file will have been removed. If the job failed for some reason, the file will not be deleted, regardless of the configuration file setting. To turn off automatic deletion of the remote agent log files, set the CleanTmpFiles parameter in the configuration file to 0. For more information about the AutoSys configuration file and the CleanTmpFiles parameter, see Configuration File Parameters in the chapter Configuring AutoSys, in this guide.
Symptoms
1. The job is stuck in either the STARTING or RUNNING state as seen in either the event processor log or the output resulting from issuing the following command:
autorep -J job_name
2.
Resolution
If you have gotten to this point, the AutoRemoteDir/auto_rem* file should be present. By looking in this file, you will see how far AutoSys was able to get. You should check and verify the following items:
1. The correct file is being sourced before running the job command. The default file is /etc/auto.profile. A job-specific file may be sourced instead of the default profile. For more information, see the description of the profile attribute in the chapter Job Attributes.
2. The $PATH variable is set up properly, and includes the proper location for all required executables. Variables, such as $PATH, must be exported for the job to see them.
3. The file system that the job command is on is accessible from this machine.
4. The permissions are correct on the job command to be executed.
5. The permissions are correct on any standard input/output files specified for re-direction.
6. The profile is a Bourne shell script. (Korn shell and C shell scripts will not work.)
Note: A valuable debugging technique is to specify a file to be used for standard output and standard error for the job that will not run. If there are any shell level command problems, all error messages will be in that file.
Resolution
This is a common problem and is nearly always the result of the remote agent being unable to contact the event server. First, ensure that network problems are not preventing communication between the remote agent and the event server machines. If this is not the problem, then check the following database-specific solutions.
With Sybase, this problem usually occurs because the interfaces file is not set up properly on the machine running the remote agent. With Oracle, this problem usually occurs because the SQL*Net V2 connections are not set up properly. The remote agent must be able to connect to the event server in order to send the RUNNING, SUCCESS, FAILURE, or TERMINATED status events.
To verify the problem, look in the AutoRemoteDir/auto_rem* file for this job. You can accomplish this by issuing the following command on the machine where the job is supposed to have run:
autosyslog -J job_name
where:
job_name
If the remote agent cannot send the event back to the database, it will write a message to that effect, plus some diagnostics, into this file. (The output from the autosyslog command could provide a helpful DBMS error number from the connect attempt.)
If you are using Sybase, check the following:
1. Check that the Sybase interfaces file exists and is readable by all users. The location of the interfaces file is pointed to by an entry in the /etc/auto.profile, which looks like this:
#AUTOENV#SYBASE=/usr/home/sybase
2. Check that the Sybase interfaces file has an entry for the data server that contains the AutoSys event server. The format of the Sybase interfaces file requires that:
Each data server name begins at the left margin with no preceding spaces or tabs.
Each entry line has a single preceding tab.
Each element in an entry line is separated by a single space.
Incorrect formatting will cause the remote agent to be unable to communicate with the database.
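The three formatting rules above lend themselves to a mechanical check. The following awk-based sketch flags lines that break them; the function name, the messages, and the decision to skip blank lines and lines starting with # are assumptions made for illustration.

```shell
#!/bin/sh
# check_interfaces FILE
# Illustrative helper: reports lines of a Sybase interfaces file that break
# the formatting rules described above. Exits non-zero if any are found.
check_interfaces() {
    awk '
        BEGIN { bad = 0 }
        /^$/ || /^#/ { next }      # blank and comment lines skipped (assumption)
        /^[^ \t]/ { next }         # data server name at the left margin
        /^\t[^ \t]/ {              # entry line: exactly one leading tab
            body = substr($0, 2)
            if (body ~ /\t/ || body ~ /  / || body ~ / $/) {
                printf "line %d: separate elements with a single space\n", NR
                bad = 1
            }
            next
        }
        {
            printf "line %d: entry lines need a single leading tab\n", NR
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Typical use, with SYBASE set to the value from /etc/auto.profile:
#   check_interfaces "$SYBASE/interfaces"
```

Because tabs and spaces look alike in most editors, a check of this kind can catch problems that are easy to miss by eye.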
If you are using Oracle, check for the following:
1. Check that the Oracle TNS names file, tnsnames.ora, exists, is readable, and contains the correct information for the event server. By default, the TNS names file is in one of the following locations:
On most systems, it is in /etc/tnsnames.ora
On some System V systems (for example, Solaris), it is in /var/opt/oracle/tnsnames.ora, or in /var/opt/tnsnames.ora
If the tnsnames.ora file is not in one of these locations, you must define it using the $TNS_ADMIN environment variable. Set the $TNS_ADMIN variable for the remote agent with an entry in /etc/auto.profile that looks like this:
#AUTOENV#TNS_ADMIN=/usr/home/oracle
/etc/auto.profile must be readable by all users.
2. Check that the Oracle TNS names file has a SQL*Net V2 formatted entry for the event server.
To test that everything is set up properly, try to log onto the event server from the client machine, using the xql utility (for Sybase), or using sqlplus (for Oracle). When you log onto the event server, use the autosys user and password. When testing Sybase using xql, be sure that your user environment is looking at the same interfaces file as the auto_remote (remote agent). Set SYBASE to the same value that is in /etc/auto.profile. Note that the auto_remote only attempts to read the interfaces file once. After a bad interfaces file has been read, correcting it will not allow a running auto_remote to connect. After you correct the interfaces file, you will have to kill the auto_remote and restart the job. For Sybase, try to log onto the event server from the remote machine using xql, like:
xql -U autosys -P autosys -S AUTOSYSDB
When testing Oracle using sqlplus, be sure that your user environment is looking at the same tnsnames.ora file as the auto_remote (remote agent). Set TNS_ADMIN to the same value that is in /etc/auto.profile.
Note that the auto_remote only attempts to read the tnsnames.ora file once. After a bad tnsnames.ora file has been read, correcting it will not allow a running auto_remote to connect. After you correct the tnsnames.ora file, you will have to kill the auto_remote and restart the job. For Oracle, try to log onto the event server from the remote machine using sqlplus with a V2 connect descriptor, like:
sqlplus autosys/autosys@AUTOSYSDB
Resolution
Check the following:
1. Determine whether the data server is started and running. If it is not, start the data server.
2. Verify that the DSQUERY environment variable is set to the proper data server, or run xql with the -S and -D options to specify the correct data server and database.
3. If a fully-qualified xql statement still fails, then it is a problem with the interfaces file. For more information about dealing with this file, see the resolution in Remote Agent Starts, Command Runs - No RUNNING Event is Sent in this chapter.
Check the AutoSys configuration file, $AUTOUSER/config.$AUTOSERV ($AUTOSERV is the name of the AutoSys instance). A space after the machine name is hard to see. Use an editor, such as vi (with the :set list option), to edit the configuration file and remove anything after the name of the machine and before the $ that marks the end of the line.
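As an alternative to inspecting the file by eye in vi, trailing whitespace can be located with grep. This is a minimal sketch; the function name is hypothetical, and it reports any line ending in a space or tab rather than only lines that hold a machine name.

```shell
#!/bin/sh
# trailing_ws_lines FILE
# Illustrative helper: prints, with line numbers, every line of FILE that
# ends in whitespace. Succeeds if at least one such line exists.
trailing_ws_lines() {
    grep -n '[[:space:]]$' "$1"
}

# Typical use:
#   trailing_ws_lines "$AUTOUSER/config.$AUTOSERV"
```

The reported lines can then be corrected in an editor as described above.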
Severe performance problems on the client are the main reason this occurs. For example, the following might affect performance:
Running a full system backup on the client at the same time jobs are starting might slow down the system so that it cannot respond to the server.
Network problems. If a job's home directory is on an NFS drive and there are bandwidth problems, the job might take so long to start that the socket times out.
Because socket time-out is not a customizable parameter, there is little you can do to avoid this situation from an AutoSys perspective. However, you can analyze the performance of the client by asking these questions:
Are there too many processes running on the client when you run jobs?
Are you having network problems?
Are you using NFS-mounted directories?
Do you need more memory or processors on the client?
Resolution
This problem is nearly always caused by the shell environment where the job runs. The following are the possible reasons for the problem:
1. The profile in the job definition is not a Bourne shell (sh) type profile. If this is the case, the profile fails.
2. The default AutoSys profile does not produce the proper environment for the job to run. The default profile for all AutoSys jobs is /etc/auto.profile, not the job owner's logon profile $HOME/.profile. If the job owner's profile is not specified in the job definition, it is never sourced.
To check the difference between the job definition and the user environment, do the following:
3. Write the current owner's environment to a file. Log in as the owner of the job on the machine where the job will run and enter the following command:
env >user.env
4. Write the remote agent environment to a file by entering the following JIL command:
insert_job: auto_env
machine: client_hostname
owner: owner
command: env
std_out_file: /tmp/auto.env
std_err_file: /tmp/auto.err
where:
client_hostname  Is the hostname of the machine where the problem job runs.
owner  Is the owner of the job that will not run.
5. Check the two files for differences by entering the following command:
diff /tmp/auto.env user.env
This shows you where the AutoSys environment and the user environment differ.
6. Make the necessary changes in the job definition and the user profile.
Also, it is useful to define the std_err_file for the job that fails, because you can check the errors from the shell for a clue about what is missing.
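The env command does not guarantee any particular ordering, so a plain diff of the two files can report differences that are only a matter of line order. A minimal sketch that sorts both files first is shown here; the function name is hypothetical.

```shell
#!/bin/sh
# env_diff FILE_A FILE_B
# Illustrative helper: sorts both environment dumps into temporary files and
# diffs the sorted copies, so only real differences in variables are shown.
# Returns the exit status of diff (0 = same, 1 = different).
env_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    sort "$1" > "$a"
    sort "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Typical use, with the files created in the preceding steps:
#   env_diff /tmp/auto.env user.env
```

Lines marked with < come from the first file (the remote agent environment) and lines marked with > come from the second (the owner's logon environment).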
Appendix
Integrating with the Mainframe and AutoSys Agents for AS/400 and OpenVMS
This appendix describes how to integrate AutoSys with the mainframe, AutoSys Agents for AS/400 and OpenVMS, Unicenter NSM Job Management Agents and Unicenter Universal Job Management Agents for an advanced AutoSys configuration. AutoSys enterprise-wide scheduling lets you integrate AutoSys jobs with AutoSys Agents for AS/400 and OpenVMS, and with various scheduling products on the mainframe. The following types of integration are supported:
AutoSys jobs can be defined to conditionally start based on the status of jobs running on OS/390, AS/400, OpenVMS, and any Unicenter NSM Job Management Agent node.
AutoSys can schedule jobs on any of those machines as well.
AutoSys can receive work from other agents.
Using cross-platform job dependency notation, AutoSys jobs can be defined to conditionally start based on the status of a job running on the included set of agent machines. You can also create AutoSys jobs that will run on any of the agent machines (if the agent machine is defined to AutoSys). The term agent machine is defined in the next paragraph.
Definition of Terms
The following terms are used in this appendix:
AutoSys: A job scheduler that runs on UNIX and Windows.
AutoSys Connect: Software that enables AutoSys to communicate with legacy OS/390 schedulers.
CCI: Common Communication Interface.
AutoSys Agent: Any remote agent in the set of agents supported by AutoSys 4.0. These are AutoSys, AS/400, AutoSys Connect, scheduling agents for VMS, and any Unicenter NSM Job Management agent.
Agent Machines: Any machine that supports an agent.
Unicenter NSM Job Management Workload Agent: A small set of programs that execute on each target machine where jobs are processed. The agent performs the following functions:
Receives job requests from one or more managers, such as Unicenter NSM Job Management Workload Server and AutoSys Server, and initiates the requested program, script, JCL or other unit of work.
Collects status information about job execution and file creation.
Sends status information to the requesting workload manager.
Cross-instance job dependency: A dependency between jobs running on different instances of AutoSys.
Cross-platform job dependency: A dependency between jobs running on different platforms. For example, an AutoSys job running on a UNIX or Windows machine can be dependent on a job running on a mainframe.
Related Documentation
The information presented in this appendix supplements the following documents:
Unicenter AutoSys Job Management for UNIX Installation Guide
Unicenter AutoSys Job Management Connect Option User Guide
Note: AutoSys can also drive your legacy mainframe scheduler by way of the AutoSys Connect product. See the Unicenter AutoSys Job Management Connect Option User Guide for instructions on how to install and use this product.
Prerequisites
Before you can implement enterprise job scheduling, you must install and configure the basic AutoSys software as instructed in the Unicenter AutoSys Job Management for UNIX Installation Guide. You must also install and configure AutoSys Connect or AutoSys Agents, or both, as instructed in the documentation for these components. The required software and version levels are listed in the following table:
AutoSys: AutoSys version 4.0 for UNIX
CCI: Version 16123032000 (Minimum). To determine your version of CCI, enter the following command:
$CAIGLBL0000/cci/bin/ccinet release
AutoSys Connect (OS/390)
AutoSys Agent for AS/400 or OpenVMS
Unicenter NSM Job Management Workload Agent (for UNIX versions such as AIX, HP, OSF, SCO, DGI, NCR, and Windows)
asbIII: An AutoSys process that communicates with AutoSys Connect or any supported AutoSys Agent.
CCI: The Common Communication Interface. See the appendix Introducing CCI in the Unicenter AutoSys Job Management for UNIX Installation Guide for more information.
When this parameter is set to 1, the asbIII process is started which allows AutoSys to schedule jobs to NSM Job Management, Universal Job Management Agents, AutoSys Connect and various mainframe schedulers.
Set the AutoSysAgentDebug Parameter
To debug jobs directly on an AutoSys Agent, enable AutoSys Agent job debug by setting the following parameter in the AutoSys instance configuration file ($AUTOUSER/config.$AUTOSERV):
AutoSysAgentDebug=1
When this parameter is set to 1, it indicates that an AutoSys instance can dispatch jobs to an AutoSys Agent.
where:
1
0  To omit trace information from the logs (default setting).
1  To enable verbose trace information.
2  To enable logging of CCI messages.
4  To log communication between asbIII and asbIIIsr processes.
Note: You can add these values together. Setting the parameter to 3, for example, enables verbose logging and CCI message logging.
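Because the values 1, 2, and 4 can be added, a combined setting can be read as a set of bit flags. The following sketch decodes a setting into the options it enables; the function name and the wording of the output lines are hypothetical.

```shell
#!/bin/sh
# decode_agent_debug VALUE
# Illustrative helper: lists the trace options enabled by an
# AutoSysAgentDebug value, treating 1, 2, and 4 as additive flags.
decode_agent_debug() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo "no trace information"
        return 0
    fi
    if [ $(( v & 1 )) -ne 0 ]; then echo "verbose trace"; fi
    if [ $(( v & 2 )) -ne 0 ]; then echo "CCI messages"; fi
    if [ $(( v & 4 )) -ne 0 ]; then echo "asbIII/asbIIIsr communication"; fi
    return 0
}

# Example: 3 = 1 + 2
decode_agent_debug 3
```

Under this reading, a value of 7 would enable all three kinds of logging.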
Set the AutoSysAgentSupportReceiveSubmit Parameter
To enable bi-directional support, set the following parameter in the AutoSys instance configuration file ($AUTOUSER/config.$AUTOSERV):
AutoSysAgentSupportReceiveSubmit=1
When this parameter is set to 1, the asbIIIsr1 process is started. For more information see the section Bi-Directional Scheduling in this appendix.
Create the config.EXTERNAL File
To enable cross-platform dependencies, create a file named config.EXTERNAL in the $AUTOUSER directory. In config.EXTERNAL, add an entry similar to the following for each agent for which cross-platform dependencies will be exchanged:
INS:AGT=REMOTE_HOST
where:
INS
Is the three-letter uppercase identifier for the AutoSys Connect instance; for example, CA7. This is the name by which the AutoSys Connect application will be known to AutoSys.
AGT
CNCT: Represents an AutoSys Connect machine.
NSM Job Management: Represents a Unicenter NSM Job Management machine.
REMOTE_HOST
where:
FLO TNG
Is a three-character instance name, or INS. Is a REMOTE_HOST that is a Unicenter NSM Job Management Workload Agent machine on AS/400, UNIX, VMS, or Windows, CA-7, CA-Jobtrac, or CA-Scheduler
FRU:CNCT=fruit
where:
FRU  Is a three-character instance name, or INS.
CNCT  Is the agent type, or AGT.
fruit  Is a REMOTE_HOST that is an AutoSys Connect machine on OS/390.
Note: The config.EXTERNAL file can contain a maximum of 249 entries.
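The INS:AGT=REMOTE_HOST form and the 249-entry limit can be checked mechanically before the event processor is restarted. This is a sketch under stated assumptions: the function name is hypothetical, the instance name is taken to be three uppercase letters or digits (as in CA7), and blank lines and lines starting with # are skipped as comments.

```shell
#!/bin/sh
# check_external FILE
# Illustrative helper: validates config.EXTERNAL-style entries of the form
# INS:AGT=REMOTE_HOST and enforces the 249-entry maximum.
check_external() {
    awk '
        BEGIN { bad = 0; n = 0 }
        /^[ \t]*$/ || /^#/ { next }    # blank and comment lines (assumption)
        !/^[A-Z0-9][A-Z0-9][A-Z0-9]:[^:=]+=[^= \t]+$/ {
            printf "line %d: expected INS:AGT=REMOTE_HOST\n", NR
            bad = 1
            next
        }
        { n++ }
        END {
            if (n > 249) {
                print "more than 249 entries"
                bad = 1
            }
            exit bad
        }
    ' "$1"
}

# Typical use:
#   check_external "$AUTOUSER/config.EXTERNAL"
```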
Ensure Consistent Integration Settings
When the event processor starts up, it checks the setting of the AutoSysAgentSupport parameter in the AutoSys configuration file. If the parameter is not set to 1, the event processor does not start asbIII or communicate with it. It is important, therefore, to ensure the settings are consistent. That is, to enable communication with any AutoSys Agent, the following must be true:
AutoSysAgentSupport is set to 1 in the AutoSys configuration file.
AutoSys Connect instances are defined in the config.EXTERNAL file (if you want to communicate with AutoSys Connect).
If you want to disable both AutoSys Agent and AutoSys Connect communications, you must do the following:
Set the AutoSysAgentSupport parameter to 0 in the AutoSys configuration file.
Comment out any AutoSys Connect instances in the config.EXTERNAL file.
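The consistency rule above can be expressed as a small check. This is only a sketch under stated assumptions: the function name integration_problems is hypothetical, and the caller is assumed to have already read the AutoSysAgentSupport value and the active (not commented out) AutoSys Connect entries from the two files.

```python
def integration_problems(agent_support, connect_entries):
    """List inconsistencies between the two settings described above.

    agent_support   -- value of AutoSysAgentSupport (0 or 1)
    connect_entries -- active AutoSys Connect entries from config.EXTERNAL
    """
    problems = []
    # With support disabled, leftover Connect instances are inconsistent:
    # the text above says they should be commented out.
    if agent_support == 0 and connect_entries:
        problems.append(
            "AutoSysAgentSupport is 0 but config.EXTERNAL still defines "
            f"{len(connect_entries)} AutoSys Connect instance(s)"
        )
    return problems
```

An empty result means the two settings agree. Having AutoSysAgentSupport set to 1 with no AutoSys Connect entries is not reported, because the entries are needed only if you want to communicate with AutoSys Connect.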
Configure the Communication Components
See the appendix Introducing CCI in the Unicenter AutoSys Job Management
License Keys
AutoSys client license keys are required for agent machines in order for AutoSys to run jobs on those machines. The client license key is tied to a specific machine. The client license key is based on the host name and host id of the client. The Computer Associates TLC (Total License Care) group is responsible for generating the license key. For agent machines, the host name is the REMOTE_HOST and the host id is always zero (0). When you supply the above information to the TLC group, they will provide you with a client key. You install the client key in the AutoSys database using the gatekeeper command, as shown in the following example:
gatekeeper Enter
Utility to Add/Delete or Print KEYs.
Add (A) or Delete (D) or Print (P) ? a
KEY Type: [(c)lient, (s)erver, (t)ime, (x)pert]: c
Hostname: REMOTE_HOST
Hostid: 0
KEY: IIJJKKLLMMNNOOPP
***** New Key ADDED! *****
KEY Type: [(c)lient, (s)erver, (t)ime, (x)pert]: Enter
About asbIII
Communication between AutoSys and AutoSys Connect or an AutoSys Agent is facilitated by asbIII, a necessary component for cross-platform communication. Letting asbIII handle that exchange frees the event processor to handle other events.
When the event processor is started, it checks whether the AutoSysAgentSupport parameter in the AutoSys configuration file is enabled. If it is, the event processor starts asbIII automatically. If the event processor is stopped, or goes down for any reason, asbIII also stops running. The only way to bring up asbIII is to restart the event processor.
If the event processor and asbIII are down when an AutoSys Connect job finishes, the completion event is lost, because the information is sent only once and is not saved. (This does not happen to an AutoSys Agent job, because missed events are resent during a checkpoint restart at asbIII startup.) If this happens, the only way to change the status of the job is to change it manually. (You must have execute permission on a job in order to change its status.) To change the status of a job, use the AutoSys sendevent command, as shown in the following example:
sendevent -E CHANGE_STATUS -J job_name -s status
You can also use the Send Event dialog to change the status of a job. This dialog is accessed from the AutoSys Operator Console.
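When status corrections are scripted, it can help to assemble the sendevent command line in one place. The sketch below only builds the argument list using the flags shown in the example above; it does not run the command, and the helper name change_status_command is hypothetical.

```python
import shlex


def change_status_command(job_name, status):
    """Build argv for: sendevent -E CHANGE_STATUS -J job_name -s status"""
    if not job_name or any(ch.isspace() for ch in job_name):
        raise ValueError("job_name must be non-empty and contain no whitespace")
    if not status or any(ch.isspace() for ch in status):
        raise ValueError("status must be non-empty and contain no whitespace")
    # Statuses are written in uppercase in this guide (for example, SUCCESS).
    return ["sendevent", "-E", "CHANGE_STATUS", "-J", job_name, "-s", status.upper()]


argv = change_status_command("JOBA", "success")
# Show the command as it would be typed at a shell prompt.
print(shlex.join(argv))  # sendevent -E CHANGE_STATUS -J JOBA -s SUCCESS
```

Returning a list rather than a single string keeps the job name as one argument if the list is later handed to a process launcher.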
autosys.ksh.hostname, if you use ksh (Korn shell)
autosys.sh.hostname, if you use sh (Bourne shell)
autosys.csh.hostname, if you use csh (C shell)
PRIMARYCCISYSID
PRIMARYCCISYSID = cci_system_id
The PRIMARYCCISYSID environment variable is key to providing AutoSys Broker failover support within the Unicenter AutoSys Job Management environment, should the Primary EP shut down or become unreachable. On the Shadow Machine, it must be set to the CCI system ID of the primary machine so that, if the Shadow takes over, the Shadow can notify the remote nodes (mainframe or Unicenter Workload Agents) that it will now receive messages intended for the Primary Machine. Should the Primary EP fail over, all AutoSys Broker communication under the Shadow EP takes place as normal. Any statuses currently residing on the remote agent machines (mainframe or Unicenter Workload Agents) are then dispatched to the Shadow EP machine, rather than the primary, for processing.
The PRIMARYCCISYSID environment variable must be set in /etc/auto.profile on the Shadow EP machine. It should be set to the CCI system ID of the machine running the Primary EP.
On Windows, this environment variable is set in either of the following ways:
Automatically, at installation time, if the Event Processor component of Unicenter AutoSys is installed.
Through the AutoSys Administrator System selection.
Bi-Directional Scheduling
Running Jobs on AutoSys on Behalf of a Workload Manager
With AutoSys 4.0, it is possible to extend the Workload Manager's capabilities into AutoSys. You can define a job in the AutoSys database, destined to run on any agent supported by AutoSys (including the agents defined in this appendix), with a full-fledged AutoSys job definition, so that the job can be initiated by a Workload Manager. The statuses of these jobs are reported back to the Workload Manager as if AutoSys were the Workload Agent.
You can start jobs in other AutoSys instances running on other nodes through their respective asbIII processes. It is also possible for a later instance to run jobs defined by an earlier instance, since multiple instances can be thought of as full peers of one another.
Note: There is no restriction on platforms, databases, or number of instances when running in this broker-to-broker mode. For example, a Linux Sybase instance can initiate jobs in an NT MSSQL environment, or an NT MSSQL instance can initiate jobs in a Linux Sybase environment. In either case, a Solaris Sybase and an AIX Oracle or HP environment can be added. Any AutoSys instance can initiate jobs in, or receive jobs from, any other AutoSys instance, regardless of platform or database, provided the instances run on distinct servers. These instances can, however, share the same database.
The broker-to-broker function can be combined with the Workload function described previously, and the scheduling load can be distributed across the network. To enable this feature, you must set the AutoSysAgentSupportReceiveSubmit parameter to 1 in the AutoSys configuration file. For more information on AutoSysAgentSupport, see the section Configure the AutoSys Machine, in this chapter.
[Figure: AutoSys and AutoSys Connect communication. Labels: Event Server, Event Processor, asbIII, CCI, CCI, CONNECT, Job.]
Note: CCI components are listed in the appendix Introducing CCI in the Unicenter AutoSys Job Management for UNIX Installation Guide.
In the previous figure, communication between AutoSys and AutoSys Connect Option is handled by asbIII and the communication components. The AutoSys event processor communicates with AutoSys Connect through an AutoSys process called asbIII, which communicates with any supported agent through CCI. In addition, a file named config.EXTERNAL is present on the machine on which the AutoSys event processor was installed. In this file, the mainframe has been identified with a three-letter uppercase instance name, as described in Create the config.EXTERNAL File in this appendix.
where:
JOB_NAME
Is the name of the job.
INS
Is a three-letter uppercase identifier of the instance on which the job is running.
^
(Caret symbol before the instance name) Indicates that the job resides on a different instance of AutoSys.
Note: Job names for cross-platform dependencies must be all uppercase.
From JIL, enter this in the condition job attribute, as shown in the following example:
condition: success(JOB_NAME^INS)
From the AutoSys GUIs, enter this information in the Starting Condition field of the Job Definition screen. Use the following statuses in the condition attribute of an AutoSys job definition dependent on an AutoSys Connect job:
AutoSys and AutoSys Connect Cross-Platform Dependency Example
Below is an example of an AutoSys job that will start only upon the successful completion of JOBA, an OS/390 scheduler defined job that runs on a mainframe:
condition: success(JOBA^CA7)
where:
success(JOBA^CA7)
Specifies the successful completion of an OS/390 scheduler defined job named JOBA running on a mainframe specified with the three-letter ID of CA7.
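To make the parts of a cross-instance condition explicit, the sketch below splits a single condition of the form status(JOB_NAME^INS) into its three pieces. It is an illustration only: the function name is hypothetical, it handles one condition at a time, and it is not how AutoSys itself parses the condition attribute.

```python
import re

# status(JOB_NAME^INS): a status keyword, a job name, a caret, and a
# three-character uppercase instance identifier such as CA7.
CONDITION_RE = re.compile(r"(\w+)\(([^()^\s]+)\^([A-Z0-9]{3})\)")


def parse_cross_instance_condition(condition):
    """Split 'success(JOBA^CA7)' into ('success', 'JOBA', 'CA7')."""
    match = CONDITION_RE.fullmatch(condition.strip())
    if match is None:
        raise ValueError(f"not a single cross-instance condition: {condition!r}")
    return match.groups()
```

For the example above, the result is the status success, the job name JOBA, and the instance CA7. A condition without the caret and instance name, such as success(JOBA), is rejected because it does not refer to another instance.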
The first character of a job name must be an uppercase alphabetic character or one of the following characters: a pound sign (#), an at symbol (@), or a dollar sign ($).
The remaining characters in the job name can be any combination of uppercase alphabetic characters, numbers, or the #, @, and $ characters.
Job names can be no longer than eight characters.
All alphabetic characters must be in uppercase.
Note: These limitations do not apply to all AutoSys jobs, only to jobs that will be referenced by AutoSys Connect.
For more information on cross-instance job dependencies, see the topic Cross-Instance Job Dependencies in the chapter AutoSys Jobs, in this guide.
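The naming rules above translate directly into a pattern check. The following is a minimal sketch; the function name is_valid_connect_job_name is hypothetical and the check is not part of AutoSys.

```python
import re

# First character: uppercase letter, #, @, or $.
# Up to seven more characters: uppercase letters, digits, #, @, or $.
# Total length is therefore one to eight characters.
CONNECT_JOB_NAME_RE = re.compile(r"[A-Z#@$][A-Z0-9#@$]{0,7}")


def is_valid_connect_job_name(name):
    """True when name satisfies the rules for jobs referenced by AutoSys Connect."""
    return CONNECT_JOB_NAME_RE.fullmatch(name) is not None
```

Under these rules JOBA and #PAY01 are accepted, while 1JOB (starts with a digit), joba (lowercase), and LONGJOBNAME (more than eight characters) are not.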
[Figure: AutoSys and agent communication. Labels: Event Server, Event Processor, asbIII, CCI, CCI, AGENT, Job.]
In the previous figure, communication between AutoSys and all agents is handled by asbIII and CCI.
The AutoSys event processor communicates with an agent through an AutoSys process called asbIII. The communication components running on the AutoSys machine receive information from the agent and pass it to asbIII. An agent (which in this example is running on the AS/400 platform) is analogous to an AutoSys remote agent.
where:
REMOTE_HOST
MACHINE_TYPE
t indicates a machine running Unicenter NSM, CA-7, or CA-Jobtrac.
c indicates a machine running AutoSys Connect.
If AutoSys Connect is running on the same machine as CA-7, CA-Jobtrac, or any OS/390 scheduling system, the machine type should be c.
Note: Agent managed machines cannot be part of a virtual machine. The following attributes are not supported for agent managed machines: job_load, max_load, and factor.
For example, to define the machine shown in Running AutoSys Jobs on Agents in this appendix, which has a REMOTE_HOST of ZASYS400, you would specify the following:
insert_machine: ZASYS400
type: t
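The choice between the two machine types, and the resulting definition, can be sketched as follows. Both helper names are hypothetical; the code only produces JIL text, one attribute per line, and does not submit it.

```python
MACHINE_TYPES = {"t", "c"}  # the two type codes described above


def machine_type_for(runs_autosys_connect):
    """Type c when AutoSys Connect runs on the machine, otherwise t."""
    return "c" if runs_autosys_connect else "t"


def machine_definition(remote_host, machine_type):
    """Return JIL text that defines an agent managed machine."""
    if machine_type not in MACHINE_TYPES:
        raise ValueError(f"type must be one of {sorted(MACHINE_TYPES)}")
    if not remote_host or any(ch.isspace() for ch in remote_host):
        raise ValueError("remote_host must be non-empty and contain no whitespace")
    return f"insert_machine: {remote_host}\ntype: {machine_type}\n"
```

Calling machine_definition with ZASYS400 and the type returned by machine_type_for(False) reproduces the definition shown above.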
Job Definition Examples
The following job definition example is for AS/400:
insert_job: as400_a1
job_type: c
command: DLYJOB DLY(15)
machine: usprncax
owner: user1@usprncax
permission: gx,wx
date_conditions: 1
days_of_week: all
start_mins: 30
Note: A job that executes successfully on an OpenVMS machine returns an exit code of 1. By default, AutoSys interprets an exit code of 1 as a failure unless the max_exit_success attribute is used in the job definition. For AutoSys to consider the job successful when it exits with an exit code of 1 or less, enter:
max_exit_success: 1
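The effect of max_exit_success can be pictured as a simple threshold comparison. This is a sketch of the rule as described in the note, not AutoSys code; the function name is hypothetical, and the default threshold of 0 is inferred from the statement that an exit code of 1 is a failure by default.

```python
def job_succeeded(exit_code, max_exit_success=0):
    """True when exit_code is at or below the max_exit_success threshold."""
    return exit_code <= max_exit_success
```

With the default threshold, an OpenVMS job that exits with 1 is treated as failed; with max_exit_success set to 1, the same exit code counts as success, while an exit code of 2 still counts as failure.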
The owner identified in the owner attribute of the job definition must have an account on the target agent machine. The account must match the owner name exactly in order for the job to run. The owner of the job definition must be specified as user@machine.
The AutoSys edit superuser must use the autosys_secure binary to add valid user IDs and passwords using option 4, as follows:
1. Start the AutoSys Security Utility:
$ autosys_secure
2. Select option 4.
3. Enter the user name, user host or domain, and password information when prompted:
Enter user name : bob
Enter user Host or Domain : ZASYS400
Enter new password:
Enter new password again:
User Create successful.
RUNNING
SUCCESS
FAILURE
Any AutoSys Connect or AutoSys Agent managed job without a return code of zero will be considered TERMINATED.
Cross-Platform Limitations
When you are running across platforms, keep the following in mind:
If you are running a shadow event processor, cross-platform dependencies will be lost if the shadow event processor takes over.
The chase and autoping commands cannot return any information on AutoSys Connect or AutoSys Agent jobs and machines.
Remote authentication is not supported for AutoSys Connect or AutoSys Agent jobs.
The following events cannot be executed on an AutoSys Connect or AutoSys Agent job:
CHANGE_PRIORITY
SEND_SIGNAL
Appendix
Troubleshooting CCI
netstat
The netstat command allows you to check TCP/IP statistics:
netstat -a | grep caic
Shows all connections to the local host involving a port that can be resolved to caic(ci). The important connection states are ESTABLISHED and LISTEN. If LISTEN is present, the kernel accepts connections on behalf of the ccirmtd process, which means that a remote host attempting to connect to this host should reach the TCP/IP connected state. ESTABLISHED connections are important because CCI transactions cannot take place between the hosts in question unless a TCP/IP ESTABLISHED connection exists. It is important to understand that netstat output is of the form:
ip-address:port
where the local host is listed to the left of the remote host. One side will always have a port that resolves to caicci and the other side will have a numeric port. The latter side is that which initiated the connection.
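The rule for reading a connection line (the side with the caicci port accepted the connection; the side with the other port initiated it) can be sketched as below. The function names are hypothetical, and the numeric port 1721 is taken from the grep example later in this appendix as the number that caicci resolves to. Real netstat output varies by platform, so this handles only the ip-address:port form described above.

```python
# Service name, plus the numeric port used with the grep example in this appendix.
CAICCI_PORTS = {"caicci", "1721"}


def split_endpoint(endpoint):
    """Split 'ip-address:port' at the last colon into (host, port)."""
    host, sep, port = endpoint.rpartition(":")
    if not sep or not host or not port:
        raise ValueError(f"expected ip-address:port, got {endpoint!r}")
    return host, port


def connection_initiator(local, remote):
    """Return 'local' or 'remote': the side whose port is not the caicci port."""
    _, local_port = split_endpoint(local)
    _, remote_port = split_endpoint(remote)
    local_is_cci = local_port in CAICCI_PORTS
    remote_is_cci = remote_port in CAICCI_PORTS
    if local_is_cci == remote_is_cci:
        raise ValueError("exactly one side should use the caicci port")
    return "remote" if local_is_cci else "local"
```

For a line whose local side is 10.0.0.5:caicci and whose remote side is 10.0.0.9:40321, the remote host initiated the connection. The addresses here are made-up examples.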
Sometimes netstat -a does not return, or takes a long time to return with very little information. This usually indicates name resolution problems. You can issue:
netstat -an | grep 1721
With the -n option, netstat skips name resolution and displays information about connections using numeric addresses and ports.
netstat -i
Shows information about the network interfaces on the local host. You can use the netstat -i command to determine whether the host has more than one network card and to determine the host names or IP addresses of these cards. The netstat -i command also provides valuable statistics about network collisions. A collision occurs when two hosts simultaneously attempt to send on an Ethernet. The important thing to look for is a high ratio of collisions to outgoing or incoming packets.
ping
ping allows you to establish that a remote host can be reached. It is important to ping by IP address as well as host name. If you cannot ping a host, CCI cannot establish a connection to that host.
nslookup
nslookup allows you to be sure that the name of the host to which you wish to connect, as well as the IP address, is resolvable. If there is a question as to the integrity of the DNS environment, you can use nslookup to verify the IP address of the host to which you need to communicate. You then enter the IP address back into nslookup and verify that the same host name is returned. Verify the IP address and hostname for both hosts.
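The forward-then-reverse check described above can also be scripted. The sketch below uses Python's standard socket lookups, which go through the system resolver rather than through nslookup itself, so results can differ from nslookup when local host files are consulted first. The function name forward_reverse_check is hypothetical.

```python
import socket


def forward_reverse_check(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an address, then resolve the address back.

    Returns (address, returned_name, consistent). The lookup functions are
    parameters so the check can be exercised without a live name server.
    """
    address = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, address_list).
    returned_name = reverse(address)[0]
    consistent = returned_name.lower() == hostname.lower()
    return address, returned_name, consistent
```

Run the check for both hosts involved in the CCI connection. A result of False for consistent does not always mean DNS is broken; the reverse lookup may return a fully qualified name when a short name was supplied, so compare the two names by eye before drawing conclusions.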
traceroute
The traceroute command on UNIX and the tracert command on Windows allow you to determine the route taken between two hosts. If a client cannot ping a host, this command may show where the network path is failing.
ccinet
ccinet may be used to pass commands to the ccirmtd demon on UNIX. On Windows, this is the rmtcntrl binary. This may be used as follows:
ccinet ping
Can be used to send a special CCI test packet across the CCI connection. This does not use the native ping command, nor does it operate in quite the same way.
ccinet status
Allows you to determine the status of the CCI connections.
ccir/ccis/ccic/ccii
Provides a suite of test binaries to test CCI communication. (ccinet ping tests remote process to remote process communication.)
cci show
cci semashow and cci semaclear X
cci shutdown
cci debugon and cci debugoff
cci show
This allows you to view the shared memory segment where CCI stores the RVT list. This is useful to determine general CCI information, such as:
The number of free and active RVTs
The key used to create the CCI resources
The identifiers for the CCI resources
The process IDs of the CCI demons
The time the shared memory was created
You can also use this command to display information about a specific receiver, such as:
The existence of a specific receiver
The number of pending messages for a specific receiver
The PID of the process that created the receiver
The PID of the process that holds the semaphore for this receiver
The last send and receive time
The number of sends and receives
When there is not a problem with the CCI semaphores, only the last line is returned.
where:
X
The particular semaphore identifier in the CCI semaphore group.
YYYY
The semaphore group.
Z
The process ID of the process holding the semaphore.
To use this to troubleshoot a hanging condition, execute the command and note the process IDs of the processes holding a CCI semaphore. It is always a good idea to issue ps -ef in conjunction with this command. If the process holding the semaphore is defunct, contact the group responsible for supporting that application, because CCI does not release resources held by defunct processes. Next, issue the following for each held semaphore in the semashow output:
cci semaclear X
The semaclear command releases the semaphore and can allow AutoSys to continue normal operations.
cci shutdown
Tells the main demon to shut down. The use of this command is not advised if AutoSys is still running.
ccinet show
Outputs data concerning the hosts to which the remote demon is or should be connected. It also outputs information about the receivers available on those remote platforms, similar to the RVT information displayed by cci show. The output from this command is written to the ccirmtd trace file, so enable traces before you execute the command if you want to capture the output. This data is also output to the system console. The output from this command is usually important for solving issues involving remote communication.
ccinet debugon and ccinet debugoff
Used to enable or disable remote traces without recycling the remote demon. Trace data is written to:
$CAIGLBL0000/cci/logs/ccirmtd_<pid>.log
ccinet status
Displays information concerning the remote hosts to which the remote demon is connected or to which it should be connected. This data is displayed in tabular form on stdout. If you are receiving a "no receiver online" type error, check this output as it may show that you are not connected to the host in question.
ccinet release
The release of the ccirmtd is displayed to stdout.
Note: The cci 666 command is no longer supported in NSM.
The command gives the release as follows:
xxxyyzzzz
where:
xxx
Is the source code version (for example, 1.137)
yy
Is the genlevel of Unicenter (for example, 21 for TNG 2.1)
zzzz
Is the release of Unicenter (for example, 9708)
ccinet disconnect sysid
Causes the local remote demon to issue a disconnect command to the specified sysid and close down the connection. This has the effect of severing the connection between these hosts. Neither side attempts to reestablish the connection.
ccinet reconnect sysid
If hosts are connected, this causes them to disconnect and then reconnect. If hosts are not connected, the local remote demon will attempt to connect to the remote host.
ccinet ping sysid
A useful diagnostic tool which causes the local remote demon to send a special CCI packet to which the other host responds. This command allows you to determine if the CCI connection is useful at the most basic level. Upon successful completion, the roundtrip time is displayed.
ccinet echo sysid message
If successful, the message is displayed on the target system's console. This is another useful tool for determining how well the CCI connection between two hosts is functioning.
ccinet retry sysid N
Affects the retry time interval as follows:
Set the retry time interval to N, if N > 0
Set the retry time interval to 2, then double on each successive failure, if N = -1
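The two retry behaviors can be illustrated with a short generator. This is a sketch of the rule as stated, not of the ccirmtd implementation: the function name is hypothetical, the time unit is not given in this appendix, and any upper limit on the doubling is unknown and therefore not modeled.

```python
from itertools import islice


def retry_intervals(n):
    """Yield successive retry intervals for 'ccinet retry sysid N'."""
    if n > 0:
        while True:
            yield n  # fixed interval of N on every retry
    elif n == -1:
        interval = 2
        while True:
            yield interval  # start at 2, then double after each failure
            interval *= 2
    else:
        raise ValueError("only N > 0 and N = -1 are described")


# First five intervals when N = -1.
print(list(islice(retry_intervals(-1), 5)))  # [2, 4, 8, 16, 32]
```

A positive N gives a steady retry rhythm, while -1 backs off progressively, which reduces connection attempts to a host that stays unreachable.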
Reinstalling CCI
If you need to reinstall CCI for any reason, reinstall AutoSys. When the AutoSys installation asks whether you want to install CCI, answer yes.
Note: Before you reinstall CCI, unset CAIGLBL0000. Log in as root at a UNIX prompt and enter:
unset CAIGLBL0000
For more information about CCI installation, see the appendix Introducing CCI in the Unicenter AutoSys Job Management for UNIX Installation Guide.
Index
A $
Adding Machines - JIL, 7-9 $AUTORUN, 3-26 $AUTOSYS/bin/chk_auto_up, 12-31 advanced configuration, 13-1 after_time report attribute, 11-8 alarm callbacks, 13-38 Alarm Manager about, 10-2 acknowledging alarms, 10-25 Alarm List, 10-25 Alarm Selection dialog, 10-31 Select by State region, 10-32 Select by Time region, 10-33 Select by Type region, 10-32 changing alarm states, 10-30 closing alarms, 10-25 Control Region Freeze Frame button, 10-28 New Alarm button, 10-29 Select Job button, 10-29 Currently Selected Alarm acknowledging, 10-28 closing, 10-28 Response edit box, 10-28 menu bar, 10-26 registering responses, 10-30 alarm monitor/report attribute, 11-6 alarm_if_fail job attribute, 4-15 alarm_verif monitor attribute, 11-10
/
/bin/date command, 12-10 /etc/.autostuff file, 13-31, 13-32 /etc/auto.profile file. See auto.profile file, 13-26 /etc/services file remote agent port number, 13-29 /tmp/autotest.$JobName, 12-10
Index1
alarms about, 1-12 DB_PROBLEM, 13-38 DB_ROLLOVER, 13-38 EP_HIGH_AVAIL, 13-38 EP_ROLLOVER, 13-38 EP_SHUTDOWN, 13-38 all_events monitor/report attribute, 11-6 all_status monitor/report attribute, 11-7 archive data for ServerVision, 13-35 atomic starting conditions, 10-10 attributes job alarm_if_fail, 4-15 auto_delete, 4-17 auto_hold, 4-17 avg_runtime, 4-24 basic, 3-6 box_failure, 4-27 box_name, 4-13 box_success, 4-27 box_terminator, 4-15 chk_files, 4-25, 4-26 command, 4-4, 4-5 date_conditions, 4-9 days_of_week, 4-9 description, 4-12 essential all jobs, 4-4 Box Jobs, 4-8 Command Jobs, 4-5 File Watcher Jobs, 4-8 heartbeat_interval, 4-24 job_load, 4-22 job_name, 4-4 job_terminator, 4-15 machine, 4-7, 4-8 max_exit_success, 4-23 max_run_alarm, 4-14 min_run_alarm, 4-13 n_retrys, 4-16 optional
Box Jobs, 4-27 Command Jobs, 4-19 File Watcher Jobs, 4-25 non-starting parameters, 4-12 starting parameters, 4-9 override_job, 4-23 permission, 4-18 priority, 4-22, 9-4 run_calendar, 4-10 std_err_file, 4-21 std_in_file, 4-20 term_run_time, 4-14 timezone, 4-16 watch_file, 4-8 watch_file_min_size, 4-25 watch_interval, 4-26 job dependencies, 3-16 machine factor attribute, 9-4 max_load, 9-3 monitor alarm_verif, 11-10 sound, 11-9 monitor/report alarm, 11-6 all_events, 11-6 all_status, 11-7 essential, 11-5 job_filter, 11-7 mode, 11-5 name, 11-5 report after_time, 11-8 currun, 11-8 starting conditions, 3-16 authentication remote, 13-31 auto.profile file AutoMachWideAppend variable, 13-23 AUTOSV variable, 13-35 AUTOSV_DIR variable, 13-35 DENY_ACCESS, 13-33 remote agent settings, 13-26 remoteProFiles, 13-16
Index2
Installation Guide
auto_delete, 4-17 auto_remote. See, 13-26 autocal, 8-3 autocons, 10-4 autohold job attribute, 4-17 AutoInstWideAppend, 13-23 AutoMachWideAppend variable, 13-23 autoping, 14-12 AutoRemoteDir, 13-14 AutoRemPort, 13-21, 13-29 autosc, 11-4 AUTOSERV environment variable, 13-1 autostuff file, 13-31, 13-32 AUTOSV environment variable, 13-35 AUTOSV_DIR environment variable, 13-35 AutoSys components, 1-4 database defined, 1-4 Graphical User Interface see GUI, 1-2 instances defined, 1-10 machines, 1-10 security, 2-1 AutoSys Agent software requirements, A-4 AutoSys Agent support, A-1 AutoSys Connect and AutoSys Agent support, A-1 AutoSys/Xpert, 4-24, 6-2 autosys_secure, 4-18 AutoSysAgentDebug, 13-22 Cross-Platform Scheduling, 13-22 AutoSysAgentDebug Configure the AutoSys Machine, A-6
AutoSysAgentDebug Parameter, A-6 AutoSysAgentSupport, 13-22 Cross-Platform Scheduling, 13-22 AutoSysAgentSupport Configure the AutoSys Machine, A-5 AutoSysAgentSupport Parameter, A-5 AutoSysAgentSupportReceiveSubmit Configure the AutoSys Machine, A-6 AutoSysAgentSupportReceiveSubmit Parameter, A-6 autosyslog command, 12-4, 14-17 avg_runtime, 4-24
B
backups bundled Sybase, 12-33 calendar definitions, 12-13 global variables (using autorep), 12-14 job definitions (using autorep), 12-13 machine definitions (using autorep), 12-14 monitor and browser definitions (using monbro), 12-14 batch files and exit codes, 3-24 Box Jobs, 3-4, 5-1 basic job definition, 3-7 default behavior, 5-1 diagram, 3-12 examples, 5-10 force starting jobs in a box, 5-8 guidelines, 5-2 non-default terminators, 5-5 placing job in GUI, 6-15 placing job in JIL, 7-10 starting conditions, 3-4, 3-14 status changes, 5-9 box_failure, 4-27
Index3
box_name, 4-13 box_success, 4-27 box_terminator, 4-15 browsers backing up definitions, 12-14 defined, 11-1 restoring definitions from backup file, 12-15 CCI
command line controls, B-4 reinstalling, B-8 troubleshooting tools, B-1 ccinet, B-3 charge back reporting, 9-16
C
Calendar Selection dialog, 8-14 calendars, 8-1 backing up definitions, 12-13 blocked dates, 8-10 blocking dates, 8-16 Calendar Definition window, 8-5 color key, 8-12 combining, 8-23 conflicting dates, 8-10, 8-12 creating example, 8-13 custom, 3-16 customizing Calendar Facility, 8-28 date range, 8-16 date states, 8-10 Edit menu, 8-7 exporting, 8-25, 8-27 exporting definitions to file, 12-13 File menu, 8-6 importing, 8-25 importing definitions from file, 12-15 Job Definition Reference List, 8-8 merging, 8-23 Options menu, 8-9 printing, 8-24 rescheduling rules, 8-20 restoring definitions, 12-15 rule specification, 8-16 selecting, 8-14 setting dates, 8-16 Term Calendar Rule, 8-15 Term Calendar Viewer, 8-8, 8-22
chase command, 12-11 check file space, 4-25, 4-26 Check_Heartbeat, 13-13 chk_auto_up, 13-8 chk_files, 4-25 clean_files, 12-12, 13-15 CleanTmpFiles, 13-15 client machine, 1-10 command job attribute, 4-4, 4-5 Command Jobs, 3-3, 3-6 command line controls CCI, B-4 components Event Processor, 1-5 Event Server, 1-4 Remote Agent, 1-6 conditions starting, 3-16 config.EXTERNAL creating the file, A-7 configuration file, 13-2 AutoInstWideAppend, 13-23 AutoRemoteDir, 13-14 AutoRemPort, 13-21 Check_Heartbeat, 13-13 CleanTmpFiles, 13-15 DBEventReconnect, 13-7 DBLibWaitTime, 13-5 DBMaintCmd, 13-11
Index4
Installation Guide
EDErrTimeInt, 13-8 EDMachines, 13-8 EDNumErrors, 13-8 EvtTransferWaitTime, 13-12 FileSystemThreshold, 13-10 InetdSleepTime, 13-24 KillSignals, 13-20 MachineMethod, 13-19 MaxRestartTrys, 13-17 MaxRestartWait, 13-18 RemoteProFiles, 13-16 RestartConstant, 13-18 RestartFactor, 13-18 sample, 13-2 ThirdMachine, 13-9 WaitTime, 13-18 XInstanceDBDropTime, 13-6 Control Panel GUI, 6-1 cpu using available cycles to select machine to run on, 9-10 crash recovery database, 12-22, 12-39 cross-instance database connection, 13-6 Cross-Platform Scheduling, 13-22 Currently Selected Job region, 10-8 currun report attribute, 11-8 custom calendars overview, 3-16 customizing Calendar Facility, 8-28 Job Definition, 6-25 Operator Console, 10-34, 11-20
accessing interactively, 12-31 administration, 12-27 architecture, 12-18 backup bundled Sybase, 12-33 changing the, 12-28 checking if up, 12-29 connection to Job Definition GUI time-out interval, 6-26 connection to Monitor/Browser GUI time-out interval, 11-21 connections, 13-6 crash recovery, 12-22 defined, 12-16 defining which to use, 12-18 dumping, 12-33 identifying connected processes, 12-32 maintenance, 13-11 maintenance script, 13-11 maintenance time, 12-19 passwords, 2-10 recovery, 12-22, 12-39 rollover, 12-22 shutdown, 12-30 starting, 12-29 stopping, 12-30 stopping service, 12-30 storage requirements, 12-17 time-out period, 13-5 unrecoverable error, 12-22 verifying connection, 14-12 dataserver defined, 1-4 date dependency, 3-15 date range in calendars, 8-16 date/time job dependencies, 3-15 Date/Time Options dialog example, 6-5 date_conditions, 4-9 days to run job setting GUI, 6-19
D
database
Index5
JIL, 7-11 days_of_week, 4-9 DB Library, 12-26 DB_PROBLEM, 13-38 DB_ROLLOVER, 13-38 DBDropTime Job Definition GUI, 6-26 Monitor/Browser GUI, 11-21 DBEventReconnect, 13-7 DBLibWaitTime, 13-5 DBMaint script, 12-20 DBMaintCmd, 13-11 dbstatistics script, 12-20 default owner of job, 2-11 delete job GUI, 6-21 JIL, 7-13 DENY_ACCESS AUTOENV setting, 13-33 dependency job date/time, 3-15
E
EDErrTimeInt, 13-8 edit permissions, 2-12 Edit Superuser, 2-15 EDMachines, 13-8 EDNumErrors, 13-8 environment variables AUTOSERV, 13-1 DSQUERY, 12-27 See also profiles, 4-6 SYBASE, 12-27 user-defined, 4-6 EP_HIGH_AVAIL, 13-38 EP_ROLLOVER, 13-38 EP_SHUTDOWN, 13-38 errors event processor handling, 13-8 event processor time interval, 13-8 essential job attributes, 4-4 event processor allowable time between errors, 13-8 authentication on remote agent, 13-31 automatic shutdown, 13-7, 13-8 checking for running, 13-8 error handling, 13-8 heartbeat interval, 13-13 log minimum disk space, 13-10 viewing, 12-4 monitoring, 12-2, 12-4 restoring, 12-8 running in test mode, 12-9 shutdown automatic, 13-7 starting, 12-1, A-10 stopping, 12-5, A-5 tail command, 12-2 third machine, 13-9
Index6
Installation Guide
troubleshooting, 14-10 Event Processor defined, 1-5 See also event_demon, 1-5 Event Report from Operator Console, 10-10 event server dual, 12-16, 12-18 recovery, 12-22 synchronizing process, 12-23 transferring events, 13-12 troubleshooting, 14-2 Event Server defined, 1-4 rollover, 12-22 See also database, 1-4 See also Dual Event Servers, 1-4 event_demon, 12-1 See also Event Processor, 1-5 eventor script, 12-2 events about, 1-11 cancelling, 10-14 copying between event servers, 13-12 EvtTransferWaitTime, 13-12 examples calendar creating, 8-13 calendar rescheduling rule, 8-20 individual queues, 9-20 JIL, 7-16 job dependencies, 3-20 load balancing, 9-11 monitor/report definition, 11-13 multiple machine queues, 9-21 queuing with priority, 9-19 real machine definition, 9-7 reports defining in JIL, 11-19 system architecture, 1-7
virtual machine definition, 9-8 xql scripts, 12-31 exclusive condition, 3-21 exec superuser, 2-16 Exec Superuser, 2-16 execute permissions, 2-12 exit codes batch files with, 3-24 FALSE.EXE, 3-25 job dependencies, 3-23 maximum for success, 3-17
exitcode, 3-23
exporting calendars, 8-27, 12-13
F
factor attribute, 9-4 FALSE.EXE, 3-25 file locking, 13-14 file maintenance, 13-15 File Watcher Jobs, 3-5 basic job definition, 3-6 creating, 6-9 FileSystemThreshold, 13-10 filter monitor/report, 11-7 force starting jobs from Job Activity console, 10-11 impact on load balancing, 9-12
G
gid, 2-12, 4-18 global variables
Index7
backing up definitions, 12-14 job dependencies, 3-25 restoring definitions, 12-15 setting in the Send Event dialog, 10-13 Graphical Calendar Facility, 8-1 Calendar Definition window, 8-5 screens, 8-4 See also calendars, 8-3 starting, 8-3 group ID, 2-12, 4-18 GUI Advanced Features dialog, 6-6 Control Panel, 6-1 Date/Time Options dialog, 6-5 defined, 1-2 defining monitor/reports, 11-11 HostScape, 6-2 Job Definition dialog, 6-3 JobScape, 6-2 Monitor/Browser dialog, 6-2 one-time job overrides, 6-22 Operator Console, 6-2 starting, 6-1 time dependencies setting, 6-19 TimeScape, 6-2 using to create a job definition, 4-2
H
heartbeat_interval, 4-24 heartbeats about, 13-13 checking, 13-13 code to include, 4-24 time interval, 13-13 high availability dual event servers, 1-5 shadow event processor, 1-6 HostScape, 6-2
I
Import/Export File Name (calendar) dialog, 8-25 importing calendars, 8-25, 12-15 inetd job starting interval, 13-24 resetting, 13-19 InetdSleepTime, 13-24 InfoReports configuring for printing, 10-42 inherit jobs starting conditions, 3-14 insert_machine - JIL, 7-9 instances of AutoSys, 1-10
J
JIL creating a job definition, 4-2 defined, 1-3 defining jobs, 7-1 defining monitors/reports, 11-17 defining report, 11-19 example, 7-16 running, 7-5 sub-commands, 7-3 syntax rules, 7-1 Job Activity Console, 10-5 Alarm button, 10-17 Alarm Manager dialog, 10-25 cancelling a sent event, 10-14 Control Area, 10-11 action buttons, 10-11 control buttons, 10-15 Dependent Jobs button, 10-15 Freeze Frame button, 10-16 Job Definition button, 10-15 Report buttons, 10-16 Currently Selected Job region, 10-8
Dependent Jobs dialog, 10-16 Job List, 10-6 Job Path (History) dialog, 10-17 Job Selection dialog, 10-19 menu bar, 10-6 reports, 10-10 resizing regions, 10-18 See also Operator Console, 10-5 Send Event dialog, 10-12 AUTOSERV instance, 10-13 Cancel button, 10-13 Change Status, 10-13 Comment field, 10-13 Execute button, 10-13 Global Name/Value, 10-13 Job Name, 10-12 Queue Priority, 10-13 Send Priority, 10-13 Signal, 10-13 time of event, 10-13 starting conditions, 10-9 Job Definition dialog, 6-3 customizing, 6-25 database connection time-out, 6-26 icon text, 6-26 title bar text, 6-26 Job Definition Reference List, 8-8 Job Information Language, 1-12 See also JIL, 1-12 Job List region, 10-6 job overrides setting, 7-14, 7-15 job resource usage, 9-16, 13-37 Job Selection dialog, 10-19 Box Levels field, 10-20 Job Name field, 10-20 setting job selection criteria, 10-23 sorting specified jobs, 10-23 specifying jobs by machine, 10-21 by name, 10-20 by status, 10-21
job starts reducing start interval, 13-24 job status as job dependency, 3-16 job_filter monitor/report attribute, 11-7 job_load, 4-22 job_name, 4-4 job_terminator, 4-15 jobs attributes basic, 3-6 backing up definitions, 12-13 basic job information, 3-3 Box Jobs, 5-1 creating - GUI, 6-15 creating - JIL, 7-8 defined, 3-4 changing - GUI, 6-17 changing - JIL, 7-10 Command Jobs creating - GUI, 6-7 creating - JIL, 7-5 defined, 3-3 creating Box Jobs - GUI, 6-15 Box Jobs - JIL, 7-8 Command Jobs - GUI, 6-7 Command Jobs - JIL, 7-5 File Watcher Jobs - GUI, 6-9 File Watcher Jobs - JIL, 7-6 creating a job definition, 4-2 custom calendars, 3-16 cycles of processing, 14-1 date/time dependencies setting in GUI, 6-19 setting in JIL, 7-11 days to run setting, 6-19 defined, 1-2 defining environment for, 4-6 defining in AutoSys, 3-27 definition
Box, 3-7 Command, 3-6 File Watcher, 3-6 deleting GUI, 6-21 JIL, 7-13 dependencies creating with GUI, 6-13 creating with JIL, 7-7 exit codes, 3-23 global variables, 3-25 job status, 3-16 time GUI, 6-19 JIL, 7-11 edit permissions, 2-12 execute permissions, 2-12
K
KILLJOB signals, 13-20 KillSignals, 13-20
L
license keys, A-10 load balancing, 9-10 example, 9-11 force starting jobs, 9-12 job attributes required for, 9-4 maximum load on machine, 9-3 method, 13-19 ServerVision, 13-34 user-defined, 9-22 locking remote agent log file, 13-14
M
machine attribute, 4-7, 4-8 backing up definitions, 12-14 client, 1-10 defining with JIL, 9-2 deleting real machines, 9-7 virtual machines, 9-9 factor attribute, 9-4 job runs on, 4-7 max_load attribute, 9-3 permissions edit and execute, 2-12 priority job attribute, 9-4
real, 9-1 restoring definitions from backup file, 12-15 saving definitions to backup file, 12-14 server, 1-10 virtual, 9-2 defining, 9-8 deleting, 9-9 MachineMethod, 13-19 maintenance backing up AutoSys definitions, 12-13 calendar definitions, 12-13 global variables, 12-14 job definitions, 12-13 machine definitions, 12-14 monitor and browser definitions, 12-14 chase command, 12-11 clean_files, 12-12 commands, 12-11 database, 13-11 database time, 12-19 files, 13-15 restoring AutoSys definitions, 12-15 managing job status, 3-22 max_exit_success, 4-23 max_load machine attribute, 9-3 max_run_alarm, 4-14 maximum exit code for success, 3-17 maximum system load, 9-3 MaxRestartTrys, 13-17 MaxRestartWait, 13-18 method of load balancing, 13-19 min_run_alarm, 4-13 mode monitor/report attribute, 11-5 Monitor/Browser dialog, 6-2, 11-11 customizing, 11-20 database connection time-out, 11-21 icon text, 11-21 title bar text, 11-21
monitors, 11-1 about, 11-2 alarm_verif, 11-10 backing up definitions, 12-14 defined, 11-1 defining, 11-3, 11-13 GUI, 11-11 JIL, 11-17 filtering, 11-6 filtering events, 11-2 job_filter, 11-7 name, 11-5 restoring definitions from backup file, 12-15 sound, 11-9 status events, 11-7 multiple machine queues, 9-21
N
n_retrys, 4-16 name monitor/report attribute, 11-5 netstat, B-1 NIS troubleshooting, 14-14 notification user-defined notification routines, 13-38 nslookup, B-2 ntrys, 3-26
O
ON_HOLD vs. ON_ICE, 3-9 one-time job overrides, 6-22 Open Client C Library, 12-16 Operator Console, 6-2 about, 10-1 customizing, 10-34
Alarm List column width, 10-39 alarm poll time interval, 10-35 atomic conditions fields, 10-38 background color of variable fields, 10-37 border colors, 10-37 changing fonts, 10-35 currently selected Job Name field, 10-37 Default Report type, 10-39 font selection, 10-36 freeze frame at start up, 10-36 icon text, 10-40 Job List column widths, 10-38 label font, 10-36 list font, 10-36 object color, 10-37 Operator Console size, 10-39 primary interface colors, 10-38 refresh time interval, 10-35, 10-38 title bar text, 10-40 Job Activity Console, 10-5 See also Job Activity Console, 10-1 starting, 10-4 optional job attributes, 4-9 Oracle improving database performance, 12-25 SQL*Net V2, 12-16 override_job, 4-23 overrides job GUI, 6-22 JIL, 7-14 owner default, 2-11
P
passwords autosys user, 2-10 database, 2-10 database system administrator, 12-28 permission attribute, 4-18
permissions, 2-2, 4-18 edit, 2-12, 4-18 execute, 2-12, 4-18 granting, 2-13 machine, 2-12 types, 2-12 user, 2-11 using umask, 2-11 Windows NT, 2-14 ping, B-2 port number for remote agent, 13-21 priority job attribute, 4-22, 9-4 profiles, 4-6
Q
queuing and simple load limiting, 9-16 as subset of virtual machine, 9-20 jobs, 9-16 multiple machine, 9-21 policies, 9-16 with priority, 9-18, 9-19
R
real machines, 9-1 defining, 9-2 deleting, 9-7 example, 9-7 reconnecting to database, 13-6
recovery database, 12-39 Sybase database, 12-39 remote agent auto_remote service, 13-29 database connectivity, 14-19 heartbeat interval, 13-13 log, 14-17
modifying settings for, 13-29 port number, 13-21 security DENY_ACCESS, 13-33 settings, 13-29 socket connection, 13-29 troubleshooting, 14-12 Remote Agent defined, 1-6 Event Processor authentication, 2-10 security, 2-18 user authentication, 2-9 remote agent event processor authentication, 13-31 remote agent log cleanup, 13-15 name assigned, 14-16 specifying directory for file, 13-14 remote agent settings in auto.profile, 13-26 remote authentication, 13-31 configuring, 13-31 Event Processor, 2-10 user, 2-9 RemoteProFiles, 13-16 reports, 11-1 about, 11-3 all_events, 11-6 all_status, 11-7 defined, 11-1 defining, 11-3, 11-13 GUI, 11-11 JIL, 11-17 defining - GUI, 11-15 defining - JIL, 11-19 filtering, 11-6 filtering events, 11-2 job_filter, 11-7 name, 11-5 Operator Console, 10-10 status events, 11-7 resource check file space, 4-25, 4-26
resource usage monitoring, 13-37 resources Calendar Facility date range, 8-30 Font Selection, 8-29 icon text, 8-30 Object Color, 8-29 Print Command, 8-30 title bar text, 8-30 Window Size, 8-30 Job Definition, 6-25 database connection time-out, 6-26 icon text, 6-26 title bar text, 6-26 Monitor/Browser database connection time-out, 11-21 icon text, 11-21 title bar text, 11-21 Operator Console, 10-34, 11-20 alarm list column width, 10-39 alarm poll time interval, 10-35 atomic condition fields, 10-38 border colors, 10-37 currently selected job name field color, 10-37 default report type, 10-39 font selection, 10-36 freeze frame at start up, 10-36 icon text, 10-40 job list column widths, 10-38 label font, 10-36 list font, 10-36 Operator Console size, 10-39 primary interface color, 10-38 refresh time interval, 10-35 title bar text, 10-40 toggle button color, 10-38 variable fields background color, 10-37 restart attempts maximum number, 13-17 restart wait time calculating, 13-18 RestartConstant, 13-18
RestartFactor, 13-18 restoring calendar definitions, 12-15 global variables (using autorep), 12-15 job definitions (using autorep), 12-15 machine definitions (using autorep), 12-15 monitor and browser definitions, 12-15 primary event processor, 12-8 rollover Event Server, 12-22 shadow event processor, 12-7 rstatd, 13-19 Rule Specification region, 8-16 run number, 3-26 RUN_AUTOSYSDB, 12-29 run_calendar, 4-10 run_num/ntry defined, 3-26 ruserok, 2-9, 13-31
S
scripts database maintenance, 13-11 DBMaint, 12-20 start_autodb, 12-29 security, 2-1, 2-2 DENY_ACCESS remote agent setting, 13-33 granting permissions, 2-13 permission types, 2-12 preventing unauthorized access, 2-8 Remote Agent, 2-18 remote authentication, 13-31 superusers AutoSys, 2-15 user permissions, 2-11 select getdate, 12-29 Send Event dialog, 10-12
AUTOSERV instance, 10-13 Cancel button, 10-13 cancelling an event, 10-14 Change Status, 10-13 Comment field, 10-13 Execute button, 10-13 Global Name/Value, 10-13 Job Name, 10-12 Queue Priority, 10-13 Send Priority, 10-13 Signal, 10-13 time of event, 10-13 sendevent command, 2-16 server instance, 1-4 server machine, 1-10 ServerVision charge back reporting, 9-16 integration, 13-34 job resource usage monitoring, 9-16, 13-37 load balancing, 9-13 services file Remote Agent port number, 13-29 shadow event processor defined, 1-6 rollover, 12-7 starting, 12-8 SIGHUP signal, 13-19 signals UNIX, 13-20 Single Server Mode, 12-22 SNMP connections, 13-17 socket connection for remote agent, 13-29 socket time-out, 13-17 sound monitor attribute, 11-9 SQL*Net V2, 12-16 standard output/error append parameter, 13-23 start interval between jobs, 13-24
start_autodb script, 12-29 starting conditions, 3-14, 3-16 in Job Activity Console, 10-9 STARTJOB, 7-5 states job, 3-8 See also status, 3-8 status job as job dependency, 3-16 monitoring and reporting, 11-7 tracking changes, 11-7 monitor/report failure, 11-7 restart, 11-7 running, 11-7 starting, 11-7 success, 11-7 terminated, 11-7 processed vs. unprocessed, 3-13 using in job dependencies, 3-16 std_err_file, 4-21 std_in_file, 4-20 STOP_DEMON, 12-5, 12-30 success maximum exit code, 3-17 Summary Report from Operator Console, 10-10 superusers, 2-15 Edit Superuser defined, 2-15 exec superuser defined, 2-16 svarchive utility, 13-37 svload command, 9-13 svload configuration requirements, 13-34 Sybase accessing interactively, 12-31
architecture, 12-26 bundled, 12-26 defining a dump device, 12-34 dumping the database, 12-37 loading the database, 12-38 communication with, 12-26 database displaying date and time, 12-33 identifying connected processes, 12-32 improving performance, 12-24 DB Library, 12-26 DSQUERY variable, 12-27 environment, 12-27 environment variable, 12-27 interfaces file, 12-27 server, 12-26 shutdown command, 12-30 SQL.INI file, 12-27 starting, 12-29 stopping, 12-30 System 10 backup server, 12-35 users, 12-27 SYBASE variable, 12-27 synchronizing event servers, 12-23 syntax rules JIL, 7-1 system administrator database, 12-28 system load maximum, 9-3
T
target machine, 4-7 Term Calendar Rule dialog, 8-15 Control region, 8-21 Rescheduling Rule region, 8-20 Term Calendar Viewer, 8-8, 8-22 Calendar Display, 8-22 Navigation Controls, 8-22
term_run_time, 4-14 test mode output file, 12-10 running in, 12-9 ThirdMachine, 13-9 time dependencies in job definition, 4-9 overview, 3-15 setting GUI, 6-19 JIL, 7-11 TimeScape, 6-2 timezone, 4-16 tmpfs locking problems, 13-14 traceroute, B-2 transferring events between event servers, 13-12 troubleshooting, 14-1 event processor, 14-10 event servers, 14-2 remote agent, 14-12 troubleshooting tools CCI, B-1 ccinet, B-3 netstat, B-1 nslookup, B-2 ping, B-2 traceroute, B-2
U
uid, 2-12, 4-18 Unicenter Event Management Integration, 13-25 UnicenterEvents, 13-25 user ID, 2-12, 4-18 user types
group, 2-12 owner, 2-12 world, 2-12 user-defined environment variables, 4-6 user-defined alarm callbacks, 13-38 load balancing, 9-22 users bundled Sybase, 12-27 utilities provided with AutoSys, 1-12
V
variables in a Command Job definition, 4-6 virtual machines, 9-2 defining, 9-2, 9-8 deleting, 9-9 example, 9-8 using delete_machine, 9-2 using insert_machine, 9-2 vmstat, 13-19
W
WaitTime, 13-18 watch_file, 4-8
X
X resources