Continual Service Improvement
Purpose: to continually align and
realign IT services to changing
business needs by identifying and
implementing improvements to IT
services that support business
Processes
5. Empower others to act on the
vision
6. Plan for and create quick wins
7. Consolidate improvements and
produce more Changes
8. Institutionalize the Change
Deming Cycle
CSI is about looking for ways to
improve Process effectiveness and
efficiency, as well as cost-effectiveness
Main Objectives:
to review, analyse and make
recommendations on
improvement opportunities
to review and analyse Service
Level Achievement results;
to identify and implement
individual activities
to improve cost-effectiveness of
delivering IT services without
sacrificing customer satisfaction;
to ensure applicable quality
management methods are used
CSI : Service Measurement
Measuring and reporting
performance against targets of
an end-to-end service
Combines component
measurements to provide a view
of customer experience
Data can be analyzed over time
to produce a trend
Data can be collected at multiple
levels (for example, CIs,
processes, services)
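As a minimal sketch of combining component measurements into an end-to-end view, the snippet below assumes (this is not stated in the notes above) that the service needs every component in series, so the end-to-end availability is the product of the component availabilities. The component names and figures are hypothetical.

```python
from math import prod


def end_to_end_availability(component_availability):
    """Combine per-component availability (0..1) into one service-level figure.

    Assumption: the service depends on every component, so availabilities multiply.
    """
    return prod(component_availability.values())


# Hypothetical component measurements for one reporting period.
components = {"network": 0.999, "server": 0.995, "application": 0.990}
service_availability = end_to_end_availability(components)
print(f"End-to-end availability: {service_availability:.4f}")
```

Under this assumption the end-to-end figure is always lower than the weakest component, which is why component metrics alone tend to overstate the customer experience.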
CSI and Organizational Change
Successful CSI supports
organizational change
Organizational change presents
challenges
Use formal approaches to
address people-related issues:
John Kotter's Eight steps to
transforming your organization
Project Management
John Kotter's Eight steps to
transforming your organization
1. Create a sense of urgency
2. Form a guiding coalition
3. Create a vision
4. Communicate the vision
W. Edwards Deming
four key stages of the cycle are
Plan, Do, Check and Act
critical at two points in CSI:
1. implementation of CSIs
2. application of CSI to services
and service management
processes
Key point: Take baseline(s) before
implementing improvements
Why do we measure?
To validate
To intervene
To justify
To direct
Types of Metrics
Technology - typically
component and application metrics
Example: Performance,
Availability
Process - Critical Success
Factors (CSFs), Key Performance
Indicators (KPIs), activity
metrics for ITSM processes
-help determine the overall
health of a process
Service - end-to-end service
metrics
Critical Success Factors (CSF)
Something that must happen if a
Process, Project, Plan, or IT Service is
to succeed.
Key Performance Indicator (KPI)- A
Metric that is used to help manage a
Process, IT Service or Activity.
--only the most important of these are
defined as KPIs and used to actively
manage and report on the Process, IT
Service or Activity
--selected to ensure that Efficiency,
Effectiveness, and Cost Effectiveness
are all managed.
--used to measure the achievements
of each CSF
Qualitative example:
CSF: Improving IT service quality
KPI: 10 percent increase in customer
satisfaction rating for handling
incidents over the next 6 months.
Metrics required:
Original customer satisfaction score
for handling incidents
Ending customer satisfaction score
for handling incidents.
Measurements:
Incident handling survey score
Number of survey scores.
Quantitative example:
CSF: Reducing IT costs
KPI: 10 percent reduction in the costs
of handling printer incidents.
Metrics required:
Original cost of handling a
printer incident
Final cost of handling a printer
incident
Cost of the improvement
effort.
Measurements:
Time spent on the incident by
first-level operative and their average
salary
Time spent on the incident by
second-level operative and their
average salary
Time spent on Problem
Management activities by second-level
operative and their average salary
Time spent on training the
first-level operative on the workaround
Cost of a service call to third-party vendor
Time and material from third-party
vendor.
ITIL V3 Service Improvement
The 7 Step Improvement Process
Purpose:
Measurement is essential for
Continual Service Improvement
The 7 Step Improvement Process is
designed to provide this
measurement
7 Steps
1. Should
2. Can
3. Gather
4. Process
5. Analyze
6. Present
7. Implement
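The quantitative example above (a 10 percent reduction in the cost of handling printer incidents) can be sketched as a small calculation. This is only an illustration: the function names, hourly rates and hours are hypothetical, and the cost of the improvement effort itself is left out for brevity.

```python
def incident_cost(hours_by_role, hourly_rate_by_role, vendor_cost=0.0):
    """Cost of one incident: time spent per role times that role's rate, plus vendor charges."""
    labour = sum(hours * hourly_rate_by_role[role] for role, hours in hours_by_role.items())
    return labour + vendor_cost


def kpi_met(original_cost, final_cost, target_reduction=0.10):
    """Return (met, achieved_reduction) for a cost-reduction KPI."""
    reduction = (original_cost - final_cost) / original_cost
    return reduction >= target_reduction, reduction


# Hypothetical average salaries expressed as hourly rates.
rates = {"first_level": 30.0, "second_level": 50.0}

# Measurements before and after first-level staff were trained on the workaround.
original = incident_cost({"first_level": 1.0, "second_level": 2.0}, rates, vendor_cost=100.0)
final = incident_cost({"first_level": 1.5, "second_level": 0.5}, rates, vendor_cost=100.0)

met, reduction = kpi_met(original, final)
print(f"original={original:.2f} final={final:.2f} reduction={reduction:.1%} KPI met: {met}")
```

The measurements feed the metrics (original and final cost), and the metrics feed the KPI, which mirrors the Measure-Metrics-KPI-CSF chain.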
This role is very similar to the
"Product Manager" role in Service
Strategy
From Vision to Measure
ITIL V3 REVIEW
Measure-Metrics-KPI-CSF-Objectives-Goals-Mission-Vision
Service-Delivers value to customer by
facilitating outcomes customers want
to achieve without ownership of the
specific costs and risks
Continual Service Improvement
Manager
Roles:
Accountable for the success of all
improvement activities
Communicates the CSI vision
across the IT organization
Defines and reports on CSI Critical
Success Factors, Key Performance
Indicators and CSI activity metrics
Co-ordinates CSI throughout the
service lifecycle
Builds effective relationships with
the business and IT managers
Ensures monitoring is in place to
gather data
Works with process and service
owners to identify improvements
and improve quality
Service Level-Measured and reported
achievement against one or more
service level targets
Example :
Red = 1 hour response 24/7
Amber = 4 hour response 8/5
Green = Next business day
Service Level Agreement-Written
and negotiated agreement between
Service Provider and Customer
documenting agreed service levels
and costs
Configuration Management
System (CMS)
Tools and databases to manage an IT
service provider's configuration
data
Contains Configuration
Management Database (CMDB)
Service Manager
Roles:
Manages the development,
implementation, evaluation and
on-going management of new and
existing products and services
Develops business case, product
line strategy and architecture
Develops new service deployment
and lifecycle management
schedules
Performs Service Cost
Management activities
Works to instill a market focus
Configuration Management
Database (CMDB)
Records hardware, software,
documentation and anything else
important to IT provision
Release - Collection of hardware,
software, documentation, processes or
other things required to implement one
or more approved changes to IT
Services
Data - Data and Information Layer
Information - Information Integration Layer
Knowledge - Knowledge Processing Layer
Wisdom - Presentation Layer
Incident-Unplanned interruption to an
IT service or an unplanned reduction in
its quality
Workaround -Reducing or eliminating
the impact of an incident without
resolving it
Problem-Unknown underlying cause
of one or more incidents
Known Error-A problem that has a
documented root cause and a
Workaround.
4Ps of Service Management
People - skills, training,
communication
Processes - actions, activities,
changes, goals
Products - tools, monitor, measure,
improve
Partners - specialist suppliers
Service Strategy
4Ps of Service Strategy
Service Delivery Strategies
Perspective - where service
strategy is articulated as
mission/vision
Position - where service
strategy is expressed in a
specific way that allows
comparison
Plan - where service strategy is asked as a
question
Pattern - service strategy is a consistent set
of activities to be performed
4 Activities of Service Strategy
Define the market
Develop offerings
Develop strategic assets
Prepare for execution
SS: Service Assets
RESOURCES
Things you buy or pay for
IT Infrastructure, people, money
Tangible Assets
Example: Hardware & Software,
People, Buildings, Tools, Cash
CAPABILITIES
Things you grow
Ability to carry out an activity
Intangible assets
Transform resources into
Services
Example: Processes, Functions,
Specialized workflow, Competency
Service Strategy: Service Portfolio
Management
-Prioritizes and manages investments
and resource allocation
4 Steps in Service Portfolio
Management
1. Define
2. Analyze
3. Approve
4. Charter
-used to help the IT Service Provider
understand and plan for different
levels of business activity
User Profiles
-pattern of user demand for IT services
-Each user profile will include one or
more patterns of business activity
-communicate information on the
roles, responsibilities, interactions,
schedules and work environments of
users
Respond to business demands by:
Analyzing patterns of business
activity and user profiles; and
Influence demand in line with the
strategic objectives
Service Design
-Holistic approach to determine the
impact of change introduction on the
existing services and management
processes
Five Major Aspects of Service
Design
Business and Service Requirements
Service Portfolio Design
Technology architecture
processes
Measurement methods and metrics
Service Strategy: Demand
Management
Goal: assist the IT Service Provider in
understanding and influencing
Customer demand for Services and
the provision of Capacity to meet
these demands
Patterns of Business Activity
(PBA)
-workload profile of one or more
business activities
SD Core Processes
Service Design: Service Level
Management
Service Level Agreement (SLA)
-Written agreement between a service
provider and customers that
documents agreed Service Levels for a
Service
Operational Level Agreement
(OLA)
-Internal agreement (within same
organization) which supports the IT
organization in their delivery of
services
Underpinning Contract (UC)
-Any contract with an external supplier
that supports the IT organization in
their delivery of services
Service Catalog
-Written statement of IT services (menu
of services), default levels and
options, costs and which business
processes they relate to.
Service Level Requirements (SLR)
-The detailed recording of the agreed
customer's needs
Service Improvement Plan (SIP)
-Formal plans to implement
improvements to a process or service
Types of SLAs:
1. Customer Based
A separate SLA for each and
every customer covering the
multiple services provided to
that customer
Different customers get
a different deal (and different
cost)
2. Service Based
An SLA for each service that is
provided to all customers of
this service
All customers get the same
deal for the same services
3. Multi-Level SLAs
These involve corporate,
customer and service levels and
avoid repetition
Service Design: Supplier
Management
Manage suppliers and the services
they supply, to provide seamless
quality of IT service to the business,
ensuring value for money is
obtained.
This is the process that provides the
development, maintenance and
renewal of underpinning contracts
It is important that suppliers are
involved in all aspects of the service
lifecycle not just the service design
stage
Maintain a supplier policy and
maintain a database of suppliers
Service Design: Service Catalog
Management
To ensure that a Service Catalog is
produced and maintained, containing
accurate information on all operational
services and those being prepared to
be run operationally
Two Aspects:
Business Service Catalog
A customer friendly
document (not technical jargon)
used to show the relationship
between IT services and the
business
Technical Service Catalog
A technical document used by IT to
define and communicate the
relationship between IT services
and any supporting services,
components or CIs.
Service Design: Capacity
Management
To ensure that cost-justifiable IT
capacity in all areas of IT always exists
and is matched to the current and
future needs of the business, in a
timely manner.
It manages:
The right capacity
The right location
The right moment
For the right customer
Against the right costs
**Balances Cost against Capacity
so minimizes costs while
maintaining quality of service
Demand Management
Identify and analyze Patterns of
Business Activity
Influence demand to reduce excess
capacity needs
Capacity Management
Determines the actual resource
requirements to meet business
needs
Plan future implementation of
capacity
Service Design: Availability
Management
Helps ensure the business can make
use of capacity and functionality when
they need it. Ensure that the level of
service availability delivered in all
services is matched to or exceeds the
current and future agreed needs of the
business in a cost effective manner.
Optimizing Availability
MTBF (Mean Time Between
Failures) - (Uptime)
Metric for measuring and
reporting reliability.
The average time that a configuration
item or IT service can perform its
agreed function without
interruption. This is measured from
when the CI or IT service starts
working, until it next fails
MTRS (Mean Time to Restore
Service) - (Downtime)
The average time taken to
restore a CI or IT service after a
failure.
Measured from when the CI or
IT service fails until it is fully
restored and delivering its
normal functionality
MTBSI (Mean Time Between
System Incidents) - (frequency of
failure)
Metric used for measuring and
reporting reliability.
The mean time from when a system
or IT service fails, until it next
fails. MTBSI is equal to MTBF +
MTRS
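The relationship between MTBF, MTRS and MTBSI can be checked with a short calculation. The sketch below uses hypothetical uptime, downtime and failure counts. The availability ratio MTBF / MTBSI is a commonly used formula added here for illustration; it is not stated in the notes above.

```python
def reliability_metrics(uptime_hours, downtime_hours, failures):
    """Derive MTBF, MTRS and MTBSI from totals observed over a reporting period."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute the means")
    mtbf = uptime_hours / failures      # average uptime between failures
    mtrs = downtime_hours / failures    # average time to restore service
    mtbsi = mtbf + mtrs                 # average time from one failure to the next
    availability = mtbf / mtbsi         # assumption: common ratio, not from the notes
    return {"MTBF": mtbf, "MTRS": mtrs, "MTBSI": mtbsi, "availability": availability}


# Hypothetical period: 700 hours up, 20 hours down, 4 failures.
metrics = reliability_metrics(uptime_hours=700.0, downtime_hours=20.0, failures=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Raising MTBF (fewer failures) or lowering MTRS (faster restoration) both improve the availability figure, which is the lever Availability Management works on.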
Service Design: IT Service
Continuity Management
- Deals with Disaster
- To support the overall Business
Continuity Management process by
ensuring that the required IT technical
and service facilities can be resumed
within required, and agreed, business
timescales
Disaster - not part of the daily
operational activities and requires a
separate system
BIA (Business Impact Analysis) -
quantifies the impact that loss of IT service
would have on the business
Risk Analysis - Evaluate Assets,
Threats and Vulnerabilities
Countermeasure - refers to any
type of control; most often used when
referring to measures that increase
resilience, fault tolerance or reliability
of an IT Service
Business Continuity Plan
- A plan defining the steps required to
restore business processes following a
disruption.
Service Design: Information
Security Management
CIA
- Confidentiality, Integrity, Availability
- align IT security with business
security and ensure that information
security is effectively managed in all
service and Service Management
activities
Confidentiality - Protecting
information against unauthorized
access and use
Integrity - Accuracy, completeness,
and timeliness of the information
Availability - information should be
accessible at any agreed time. This
depends on the continuity provided by the
information processing systems
Security Incident - Any incident that
may interfere with achieving the SLA
security requirements, materialization
of a threat
Service Transition
- The capabilities used to build, test
and deploy changes that could affect
IT service quality
To help set customer expectations
To enable business
change/project teams/customers
to integrate an IT release
To reduce variations in predicted
and actual performance
To reduce known errors and
minimize risks
To ensure Services are built and
used in accordance with the
agreed requirements and
constraints defined within Service
Design
Value to Business
Increased success rate of changes
and releases
More accurate predictions of
service levels and warranties
Less variation of costs against
estimated and approved resource
plans
Service Transition: Knowledge
Management
-enable organizations to improve the
quality of management decision
making by ensuring that reliable and
secure information and data is
available throughout the service
lifecycle
2 objectives:
1. To ensure that the right
information is delivered to the
appropriate place or competent
person at the right time to
enable informed decisions
2. Improve efficiency by reducing
the need to rediscover
knowledge.
DIKW (Data -> Information ->
Knowledge -> Wisdom)
Data - a set of facts collected
during ITSM activities or
measurements. An example of data is
the time and date when an incident
occurred.
Information - gives context to data
(i.e. data endowed with meaning and
purpose) so that data become useful
for further decision making. An
example of information is average
time to close priority 1 incidents
(which combines data for many
incidents such as incident start-time,
incident end-time and priority).
Knowledge - the organizing, processing
and structuring of information using
human experience, ideas and
judgments. An example of knowledge
is that the average time to close priority 1
incidents has decreased since ITSM
tool implementation.
Wisdom - uses judgment to
make use of knowledge and create
value. This is usually done by a human
brain. Wisdom takes into consideration
data, information and knowledge; e.g.
customer satisfaction rose by 10% due
to the ITSM tool implementation,
back-office personnel training, self-service
portal and new service introduced.
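The first three DIKW levels can be illustrated with the incident example used above. This is a minimal sketch: the incident records, the go-live date and the function names are all hypothetical. Wisdom is deliberately left out, since it relies on human judgment.

```python
from datetime import datetime
from statistics import mean

# Data: raw facts recorded per incident (hypothetical records).
incidents = [
    {"priority": 1, "opened": datetime(2024, 1, 3, 9, 0), "closed": datetime(2024, 1, 3, 13, 0)},
    {"priority": 1, "opened": datetime(2024, 1, 10, 10, 0), "closed": datetime(2024, 1, 10, 16, 0)},
    {"priority": 2, "opened": datetime(2024, 1, 12, 8, 0), "closed": datetime(2024, 1, 13, 8, 0)},
    {"priority": 1, "opened": datetime(2024, 2, 5, 9, 0), "closed": datetime(2024, 2, 5, 11, 0)},
    {"priority": 1, "opened": datetime(2024, 2, 20, 14, 0), "closed": datetime(2024, 2, 20, 17, 0)},
]
TOOL_GO_LIVE = datetime(2024, 2, 1)  # hypothetical ITSM tool implementation date


def hours_to_close(incident):
    return (incident["closed"] - incident["opened"]).total_seconds() / 3600


def average_p1_close_hours(records):
    """Information: data given context (average time to close priority 1 incidents)."""
    return mean(hours_to_close(r) for r in records if r["priority"] == 1)


before = [r for r in incidents if r["opened"] < TOOL_GO_LIVE]
after = [r for r in incidents if r["opened"] >= TOOL_GO_LIVE]
avg_before = average_p1_close_hours(before)
avg_after = average_p1_close_hours(after)

# Knowledge: information compared across a meaningful boundary.
improved = avg_after < avg_before
print(f"before={avg_before:.1f}h after={avg_after:.1f}h improved={improved}")
```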
SKMS (Service Knowledge
Management System)
-Tools and databases used to manage
service knowledge and information
including CMS, Service Portfolio, AMIS,
CMIS, SCD, Known Error
Database, Soft Knowledge
Related Elements within SKMS
Configuration management
database
Config. Management system
Incident, Problem, Change and
Release data
Human Resource or People data
Availability Management Info.
System
Known Error database
Service Portfolio
Supplier and Contracts
database
Capacity Management Info.
System
Service Transition: Change
Management
To ensure that standardized methods
and procedures are used for efficient
and prompt handling of all changes, in
order to minimize the impact of
change related Incidents upon service
quality, and consequently to improve
the day to day operations of the
organization.
Respond to changing business
requirements
Minimize impact/risk of
implementing changes
Ensure all changes are
approved at the appropriate
level with the business and IT
Implement changes successfully
Implement changes in times
that meet business needs
Use standard processes to
record all changes
*** Not every change is an
improvement, but every
improvement is a change!
Activities of Change Management
1. Change logging and filtering /
acceptance
Does the RFC (request for
change) have enough
quality information?
Unique identification
number
Filter out impractical
RFCs and provide
feedback to issuer
2. Managing changes and the
change process
Prioritize RFCs (based on
risk assessment)
Categorize RFCs (e.g.
minor, significant or
major impact)
3. Chairing the CAB and ECAB
Assess all RFCs
Impact and resource
assessment
Approval based on
financial, business and
technical criteria
The Forward Schedule of
Change (FSC)
4. Coordinating the change
Supported by release
management, change
management coordinates
the building, testing and
implementation of the
change
5. Reviewing and closing RFCs
6. Management reporting
Request for Change (RFC)
Every change to the IT
Infrastructure has to go through
change management. A Request
for Change (RFC) is formally
issued for every change request.
Normal Change
A change that follows all of the
steps of the change process.
Standard Change
A pre-approved change that is low
risk, relatively common and
follows a procedure or work
instruction
Emergency Change
A change that must be introduced
as soon as possible
Change Manager
Responsible for the change
management process.
Change Advisory Board (CAB)
A dynamic group of people
(depending on the change) that
approve changes with medium to
high priority, risk and impact.
Emergency Change Advisory
Board (ECAB)
Subgroup of CAB with
authority to evaluate emergency
changes and make urgent change
decisions
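The change types and approval bodies above can be sketched as a simple routing rule. This is an illustration only: the enum, the function name and the risk labels are hypothetical. Routing low-risk normal changes to the Change Manager is an assumption, since the notes only say that the CAB handles medium to high priority, risk and impact.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


def approval_route(change_type, risk="low"):
    """Return who authorizes a change, following the definitions above."""
    if change_type is ChangeType.STANDARD:
        return "pre-approved"          # low risk, follows a procedure or work instruction
    if change_type is ChangeType.EMERGENCY:
        return "ECAB"                  # must be introduced as soon as possible
    if risk in ("medium", "high"):
        return "CAB"                   # normal change with medium to high risk
    return "Change Manager"            # assumption: low-risk normal changes


print(approval_route(ChangeType.NORMAL, risk="high"))      # CAB
print(approval_route(ChangeType.EMERGENCY, risk="high"))   # ECAB
```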
Forward Schedule of Change (FSC)
Contains details of all
approved changes and their
proposed implementation date.
The 7Rs of Change Management
Raised
Reason
Return
Risks
Resources
Responsible
Relationship
Release and Deployment
Management
-To deploy releases into production and
enable effective use of the service in
order to deliver value to the
organization / customer / user.
- Includes the processes, systems and
functions to package, build, test and
deploy a release into production and
prepare for service operation.
Objective:
to build, test and deliver the capability
to provide the services specified by
service design
Release Package
A collection of authorized changes
to an IT service.
Release Unit
The portion of the IT infrastructure
that is released together
Major release: major roll out of
new hardware and/or software;
Minor release: a few minor
improvements and fixes to known
errors.
Emergency fix: a temporary or
permanent quick fix for a problem
or known error.
Rollout
Introduces a release into the live
environment
Full Release: Office 2007
Delta (Partial) Release:
Windows Updates
Package: Windows Service Pack
Deployment Options: Big Bang VS
Phased
Big Bang
The new or changed service is
deployed to all user areas in one
operation. This will often be used
when introducing an application change
and consistency of service across
the organization is considered
important.
Phased
The service is deployed
to a part of the user base initially,
and then this operation is
repeated for subsequent parts of
the user base via a scheduled
rollout plan.
Deployment Options: Push VS Pull
Push
Used where the service
component is deployed from the
center and pushed out to the
target locations.
Pull
Used for software releases,
where the software is made
available in a central location,
but users are free to pull the
software down to their own
location at their preferred time
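The difference between a Big Bang and a Phased deployment can be sketched as two ways of building a rollout plan. The function names and site names below are hypothetical, and real rollout plans would also carry dates and back-out steps.

```python
def big_bang(user_groups):
    """One operation covering every user area."""
    return [list(user_groups)]


def phased_rollout(user_groups, phase_size):
    """Split the user base into consecutive phases of at most phase_size groups."""
    if phase_size < 1:
        raise ValueError("phase_size must be at least 1")
    return [user_groups[i:i + phase_size] for i in range(0, len(user_groups), phase_size)]


sites = ["HQ", "Branch A", "Branch B", "Branch C", "Branch D"]  # hypothetical user areas
print(big_bang(sites))             # a single phase containing all sites
print(phased_rollout(sites, 2))    # pilot first, then the remaining sites in batches
```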
Deployment Options: Automation
VS Manual
Automation
Helps to ensure repeatability and
consistency.
The time required to provide a well
designed and efficient automated
mechanism may not always be
available or viable.
Manual
Important to monitor and
measure the impact of many
repeated manual activities, as they
are likely to be inefficient and error
prone
Service Asset and Configuration
Management
- This process ensures the integrity of
service assets and configurations in
order to support the effective and
efficient management of the IT
organization.
- To gather the information needed
about the IT components and how
they relate to each other to ensure
that the relevant information is
available for all the other processes
- Support efficient and effective
Service Management processes by
providing accurate configuration
information to enable people to make
decisions at the right time
Managing these properly is key
Provides Logical Model of
Infrastructure and Accurate
Configuration information
Controls assets
Minimizes costs
Enables proper change and
release management
Speeds incident and problem
resolution
CIs (Configuration Items)
Any component that needs to be
managed in order to deliver an IT
service, including hardware,
software, organization roles,
processes and controlled
documentation
CMDB (Configuration Management
Database)
A database used to store
configuration records throughout
their lifecycle. The Configuration
Management System will contain
one or more CMDBs.
CMS (Configuration Management
System)
A set of tools and databases that
are used to manage an IT service
provider's configuration data. The
CMS also includes information
about incidents, problems, known
errors, changes and releases. It
also provides the presentation
layer for Configuration
Management.
Configuration Model
Configuration Management
delivers a model of the services,
assets and the infrastructure by
recording the relationships
between configuration items
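A configuration model is essentially a graph of CIs and their relationships. The sketch below is a toy illustration, not a real CMDB API: the class, method and CI names are hypothetical. It records "depends on" relationships and uses them to find every CI affected when one fails.

```python
from collections import defaultdict


class ConfigurationModel:
    """Toy model: records which CI depends on which other CI."""

    def __init__(self):
        self._depends_on = defaultdict(set)

    def relate(self, ci, depends_on):
        self._depends_on[ci].add(depends_on)

    def impacted_by(self, failed_ci):
        """Return every CI that directly or indirectly depends on failed_ci."""
        impacted = set()
        to_visit = [failed_ci]
        while to_visit:
            current = to_visit.pop()
            for ci, dependencies in self._depends_on.items():
                if current in dependencies and ci not in impacted:
                    impacted.add(ci)
                    to_visit.append(ci)
        return impacted


model = ConfigurationModel()
model.relate("email-service", depends_on="mail-server")
model.relate("mail-server", depends_on="switch-01")
model.relate("mail-server", depends_on="db-01")
print(sorted(model.impacted_by("switch-01")))
```

This kind of lookup is what lets configuration data speed up incident resolution and support impact assessment for changes.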
Configuration Baseline
The configuration of a service,
product or infrastructure that has
been formally reviewed and
agreed on. It serves as the basis
for further activities and can only
be changed via formal change
procedures.
Service Operation
Management of IT Services that
ensures effectiveness and
efficiency in delivery and support
Coordinated and organized
activities and processes, required
to deliver and manage services at
agreed levels to business users
and customers
Provides guidance on supporting
operations
Ensures day to day operation of
processes
Enables successful service
improvements
** Service Operation is
responsible for ongoing
management of the technology
that is used to deliver and
support services
Service Operation Processes
Incident Management
Restore normal service
operation ASAP and minimize
adverse impacts on the business
Problem Management
Manage the lifecycle of
all problems
Event Management
Enable stability by
monitoring all events that occur
Request Fulfillment
Process of dealing with
Service Requests from the users
Access Management
Provide the rights for users to use
a service
Service Operation Functions
Service Desk
Support the agreed IT
service provision by ensuring
accessibility and availability of the
IT organization and by performing
various supporting activities
Technical Management
Helps plan, implement and
maintain a stable network
infrastructure to support the
organization's business processes
IT Operations Management
Perform daily operational activities
to manage IT Infrastructure
Application Management
Managing applications throughout
their lifecycle
Service Operation: Achieving
Balance
Internal IT viewpoint versus external
customer viewpoint
-Optimum and reliable services that
meet both the needs of the customer
as well as the feasibility based on the
capabilities of the IT organizations can
only be achieved by taking a balanced
account of both views.
Stability versus response capability in
the event of a need for change.
-technology requirements change over
the course of time. Ensuring stability
and simultaneously implementing
changes on time must be weighed up
appropriately, giving due
consideration to the risks.
Service quality versus service costs
-IT organizations are under pressure to
permanently reduce their costs. This
normally also entails a reduction in
quality. The task is to ensure a
considered balance between cost
savings and the simultaneous
adherence to quality standards. This
will avoid disproportionate risks for the
company.
Reactive versus proactive
-Purely reactive organizations only
respond to external influences.
-Examples: new business
requirements, disruptions or customer
complaints. Organizations like these
tend to focus exclusively on the
stability of the services. This is not the
way to encourage a proactive attitude
on the part of the employees. By
contrast, proactive organizations
continually look for internal and
external improvements. However,
where this approach is too marked
these organizations can take on
unnecessary risks and put too much
pressure on employees in the IT
operation. It is fundamentally better to
manage IT services on a proactive
basis. The task is to find an optimum
balance and not to jeopardize the
business with unnecessary pseudo
improvements.
Service Operation: Incident
Management
- Deals with unplanned interruptions
to IT Services or reductions in their
quality
-Restore normal service operation
ASAP and minimize adverse impacts
on the business
Incident - unplanned interruption to
or reduction in quality of IT service
Functional escalation - escalation
across IT to subject matter experts
Hierarchical escalation - involves
more senior levels of management
Workaround - a temporary fix for
the incident
Major incident - an Incident which
has high impact on the organization
and for which a separate process
exists
Remember
Incident Management - addresses
symptoms, not root cause
Problem Management - investigates root cause to find solution
Service Operation: Problem
Management
Aims to prevent problems
and resulting incidents
Minimizes impact of
unavoidable incidents
Eliminates recurring
incidents
Problem management works together
with incident management and
change management to ensure that IT
service availability and quality are
increased.
Reactive Problem Management
Generally executed as part of
service operation
Identifies underlying causes of
incidents
Identifies changes to prevent
recurrence
Proactive Problem Management
Driven as part of CSI
Identifies areas of potential
weakness
Identifies workarounds
Service Operation: Event
Management
-To provide the entry point for the
execution of many service operation
processes and activities.
Informational: an event that does
not require any action and does not
represent an exception.
Warning: an event that is generated
when a service or device is
approaching a threshold
Exception: an exception means that
a service or device is currently
operating abnormally
Trigger: an indication that some
action or response to an event may be
needed.
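The three event types can be sketched as a threshold check on a monitored value. This is a simplified illustration: the function names, the metric (disk usage in percent) and the two threshold values are hypothetical. A warning fires while the value is approaching the limit, and an exception fires once it has been reached.

```python
def classify_event(value, warning_threshold, exception_threshold):
    """Classify a monitored reading as informational, warning or exception."""
    if value >= exception_threshold:
        return "exception"       # operating abnormally
    if value >= warning_threshold:
        return "warning"         # approaching the threshold
    return "informational"       # no action required


def needs_response(event_type):
    """Only warnings and exceptions act as a trigger for a response in this sketch."""
    return event_type in ("warning", "exception")


# Hypothetical disk-usage readings in percent.
for reading in (70, 85, 97):
    event_type = classify_event(reading, warning_threshold=80, exception_threshold=95)
    print(reading, event_type, needs_response(event_type))
```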
Event Management Process
Event occurs - events occur
continuously, but not all of them are
detected or registered. It is therefore
important that everybody involved in
designing, developing, managing and
supporting IT services and the IT
infrastructure that they run on
understands what types of event need
to be detected.
Event notification - most
configuration items are designed to
communicate certain information
about themselves
Event detection - once an event
notification has been generated, it will
be detected by an agent running on
the same system, or transmitted
directly to a management tool
specifically designed to read and
interpret the meaning of the event.
Event filtering - the purpose of
filtering is to decide whether to
communicate the event to a
management tool or to ignore it.
Significance of events - every
organisation will have its own
categorisation of the significance of an
event, but it is suggested that at least
these three broad categories be
represented: informational, warning
and exception.
Event correlation - if an event is
significant, a decision has to be made
about exactly what the significance is
and what actions need to be taken to
deal with it. It is here that the meaning
of the event is determined.
Trigger - if the correlation activity
recognises an event, a response will
be required. The mechanism used to
initiate that response is also called a
trigger.
Response selection - at this point of
the process, there are a number of
response options available:
Identity Management (removing
access when people change roles or
jobs and regularly auditing access
permissions to ensure they are
correct)
Access - refers to the level and
the extent of a service's
functionality or data to which a
user is entitled.
Identity - refers to the
information about the user that
distinguishes them as an
individual and which verifies
their status within the
organization.
Rights (also called privileges) -
refer to the actual settings
whereby a user is provided
access to a service or group of
services.
Service or service groups -
instead of providing access to
each service for each user
separately, it is more efficient to
be able to grant each user
access to the whole set of
services that they are entitled
to use at the same time.
Directory services - refers to
a specific type of tool that is
used to manage access and
rights. (LDAP)
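The idea of granting access through service groups rather than service by service can be sketched with two lookups. The group names, user names and services below are hypothetical. A real deployment would typically hold this data in a directory service rather than in in-memory dictionaries.

```python
# Hypothetical service groups: each group bundles the services its members may use.
SERVICE_GROUPS = {
    "finance_staff": {"ledger", "payroll"},
    "all_staff": {"email", "intranet"},
}

# Hypothetical identities mapped to the groups they belong to.
USER_GROUPS = {
    "alice": {"finance_staff", "all_staff"},
    "bob": {"all_staff"},
}


def services_for(user):
    """Rights: the full set of services a user is entitled to via their groups."""
    groups = USER_GROUPS.get(user, set())
    return set().union(*(SERVICE_GROUPS[group] for group in groups))


def has_access(user, service):
    """Access check: unknown users and unlisted services are denied."""
    return service in services_for(user)


print(sorted(services_for("alice")))
print(has_access("bob", "payroll"))
```

Changing a role then means changing one group membership, which also makes the regular audit of access permissions easier.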
Service Operation: Request
Fulfillment
-The process for dealing with service
requests via the Service Desk, using a
process similar but separate to that of
incident management.
To provide a channel for users to
request and receive standard
services for which a predefined
approval and qualification process
exists
To provide information to users and
customers about the availability of
services and the procedure for
obtaining them
To source and deliver the
components of requested standard
services (e.g. licenses and software
media)
To assist with general information,
complaints or comments
Service Operation: Access
Management
--The process of granting authorized
users the right to use a service, while
preventing access to non-authorized
users.
--Protecting Confidentiality, Integrity
and Availability (CIA), sometimes
known as Rights Management or