An Effective Method to Manage Incident Response SLA

Effective incident management provides recurring value. Learn about managing incident response in your ITIL team's operations.

By Nageswara Rao · Sep. 02, 18 · Opinion

Incident Management

ITIL defines an incident as an unplanned interruption to, or reduction in the quality of, an IT service. A service level agreement (SLA) defines the agreed-upon service level between the provider and the customer.

An incident interrupts normal service, such as when a user’s computer breaks, when the VPN won’t connect, or when the performance of a service degrades. These are unplanned events that require help from the service provider to restore normal function. Incident management restores IT services to normal working levels.

Incident management focuses solely on handling and escalating incidents as they occur to restore defined service levels. The main goal is to take user incidents from a reported stage to a closed stage.

Once established, effective incident management provides recurring value for the business. It allows incidents to be resolved faster than was previously possible. Incident management also involves creating incident models, which give support staff a defined process for handling recurring issues so that they can resolve them quickly and efficiently. Because incident management is highly visible, its value is evident to users at all levels of the organization, which makes it an important process to implement and get buy-in for.

Operational incident management requires the following key pieces:

  1. A service level agreement between the provider and the customer that defines incident priorities, escalation paths, and response/resolution timeframes

  2. Incident models, or templates, that allow incidents to be resolved efficiently

  3. Categorization of incident types for better data gathering and management

  4. Agreement on incident statuses, categories, and priorities

  5. Agreement on incident management role assignment
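The first item in the list above, an SLA that ties priorities to response and resolution timeframes, can be sketched as a small lookup table. This is a minimal illustration only: the `SlaTarget` type, the function names, and every duration are hypothetical, and the sketch counts calendar time rather than business hours. Real values come from the agreement between provider and customer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaTarget:
    """Response and resolution timeframes agreed for one priority."""
    respond_within: timedelta
    resolve_within: timedelta


# Hypothetical targets, for illustration only.
SLA_TARGETS = {
    "Critical": SlaTarget(timedelta(minutes=15), timedelta(hours=4)),
    "High": SlaTarget(timedelta(hours=1), timedelta(hours=8)),
    "Medium": SlaTarget(timedelta(hours=4), timedelta(days=2)),
    "Low": SlaTarget(timedelta(hours=8), timedelta(days=5)),
}


def resolution_deadline(priority: str, reported_at: datetime) -> datetime:
    """Return the time by which the incident should be resolved."""
    return reported_at + SLA_TARGETS[priority].resolve_within


def is_breached(priority: str, reported_at: datetime, now: datetime) -> bool:
    """True when an unresolved incident has passed its resolution deadline."""
    return now > resolution_deadline(priority, reported_at)
```

A monitoring job could call `is_breached` for each open ticket and trigger the escalation path that the SLA defines for that priority.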

The Incident Management Process

In ITIL, incidents go through a structured workflow that encourages efficiency and the best results for both providers and customers. ITIL recommends that the incident management process follow these steps:

  1. Incident identification

  2. Incident logging

  3. Incident categorization

  4. Incident prioritization

  5. Incident response (diagnosis, escalation, investigation, resolution, recovery, and closure)
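The steps above can be modeled as an ordered set of stages that a ticket moves through. The sketch below is an assumption-laden simplification: the `Stage` enum and `next_stage` function are hypothetical names, the response step is collapsed into a single `IN_RESPONSE` stage, and closure is split out as its own final stage.

```python
from enum import Enum


class Stage(Enum):
    """Incident stages, in the order listed above (simplified)."""
    IDENTIFIED = 1
    LOGGED = 2
    CATEGORIZED = 3
    PRIORITIZED = 4
    IN_RESPONSE = 5  # diagnosis, escalation, investigation, resolution, recovery
    CLOSED = 6


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows the given one."""
    if stage is Stage.CLOSED:
        raise ValueError("a closed incident has no next stage")
    return Stage(stage.value + 1)
```

Keeping the order in one place makes it harder for a ticket to skip categorization or prioritization before response work begins.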

Incident Identification and Logging

The first step in the life of an incident is incident identification. In an IM/AM case, incidents come from automated notices such as monitoring software, emails, support chat, etc.

Once identified, the incident is logged as a ticket. The ticket should include information such as the user’s name, the incident description, and the date and time of the incident report. The logging process can also include categorization, prioritization, etc.
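As a rough sketch of what a logged ticket might hold, the record below carries the fields named above (user name, description, report time) plus optional category and priority fields that later steps can fill in. The `IncidentTicket` class, the `log_incident` function, the `source` field, and the in-memory ID counter are all hypothetical and not taken from any particular ITSM tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import count
from typing import Optional

# Simple in-memory ticket numbering, for illustration only.
_ticket_ids = count(1)


@dataclass
class IncidentTicket:
    user_name: str
    description: str
    reported_at: datetime
    source: str = "monitoring"      # e.g. monitoring, email, chat
    category: Optional[str] = None  # filled in during categorization
    priority: Optional[str] = None  # filled in during prioritization
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))


def log_incident(user_name: str, description: str,
                 reported_at: datetime, source: str = "monitoring") -> IncidentTicket:
    """Create a ticket, rejecting reports that lack the required details."""
    if not user_name.strip() or not description.strip():
        raise ValueError("user name and description are required")
    return IncidentTicket(user_name, description, reported_at, source)
```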

Incident Categorization

Incident categorization is a vital step in the incident management process.

Categorization structures in IT Service Management are divided into two distinct components: Operational Categorization and Product Categorization.

Operational categorization is a three-tier structure that helps you to define the work that is being done for a particular incident. This structure is also used to qualify reporting in the system, qualify how groups and support staff are assigned, and route approvals.

Product categorization is a three-tier structure that helps you to define a description of the object or service on which you are performing the work (for example, Hardware, Peripheral Device, Monitor).

Operational Categorization

The structure of the operational categorization template is Action -> Object -> Subject. Stated from the perspective of the user or customer reporting the outage, the classification should read: “I (the user) need you (support) to <Op Cat1> the <Op Cat2> on my <Op Cat3>.” These values are grouped into four sections, differentiated by the value in the incident type field. The incident-related sections are:

  • User Service Restoration: Related to existing products broken or service interrupted.

    • Fix/Repair -> Connectivity -> Network

    • Fix/Repair -> Hardware -> Laptop

  • Infrastructure Event: Created exclusively from Event Management application(s) feeding ITSM. These incidents can use any of the User Service Restoration OpCats in addition to the following exclusive ones, which are likely to be generated by an automated alarm but would not result in any action.

    • Check/Verify -> Service -> Server

    • Check/Verify -> Alarm -> Server

  • Infrastructure Restoration: These are exclusively for Events that are not only created by an automated rule but resolved by preset automated action without the need for human intervention. They would use the User Service Restoration OpCats.

The details of the exact software and hardware in play are defined in the Product Categorizations. Note that these values are for Incident Types that are Incident Related.
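The three-tier Action -> Object -> Subject structure and the per-incident-type grouping described in this section can be sketched as follows. The `OpCat` type and both function names are hypothetical; the category values are the examples given above, not a complete catalog.

```python
from typing import List, NamedTuple


class OpCat(NamedTuple):
    tier1: str  # Action
    tier2: str  # Object
    tier3: str  # Subject


USER_SERVICE_RESTORATION = [
    OpCat("Fix/Repair", "Connectivity", "Network"),
    OpCat("Fix/Repair", "Hardware", "Laptop"),
]

# OpCats exclusive to events raised by Event Management applications.
INFRASTRUCTURE_EVENT_ONLY = [
    OpCat("Check/Verify", "Service", "Server"),
    OpCat("Check/Verify", "Alarm", "Server"),
]


def allowed_op_cats(incident_type: str) -> List[OpCat]:
    """Return the OpCats available for an incident type."""
    if incident_type in ("User Service Restoration", "Infrastructure Restoration"):
        return list(USER_SERVICE_RESTORATION)
    if incident_type == "Infrastructure Event":
        return USER_SERVICE_RESTORATION + INFRASTRUCTURE_EVENT_ONLY
    raise ValueError(f"unknown incident type: {incident_type}")


def as_request(op_cat: OpCat) -> str:
    """Render an OpCat as the sentence a user would say to support."""
    return (f"I need you to {op_cat.tier1.lower()} "
            f"the {op_cat.tier2.lower()} on my {op_cat.tier3.lower()}")
```

Rendering each OpCat as a sentence is a quick way to check that a proposed category reads naturally from the user's point of view.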

Product Categorization

Operational categorizations tell what needs to happen but are deliberately vague, or generic, about exactly which objects will be affected. The categorization of a specific laptop or server belongs in the product categorizations (ProdCats). The ProdCat of a configuration item (CI) is closely related to the value in OpCat3, since that object (the server, for example) is what is being changed. The relationship between the incident and the affected CI(s) can then be drawn using the same ProdCats. The principle is that Cat3 is the object most affected and should be detailed in the ProdCats. Optionally (and probably optimally), both the Cat2 and Cat3 CIs should also be related to the incident.

Product categorizations are not application-specific. They should reflect a combination of sources and use the subset of the aggregate appropriate to the installation. One potential source of this data is the CMDB.
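One way to picture how shared ProdCats link an incident to the CIs it affects is a simple match against CMDB records. This is a sketch under stated assumptions: `ProdCat`, `ConfigurationItem`, and `related_cis` are hypothetical names, and the CMDB is reduced to a plain list.

```python
from typing import List, NamedTuple


class ProdCat(NamedTuple):
    tier1: str
    tier2: str
    tier3: str


class ConfigurationItem(NamedTuple):
    name: str
    prod_cat: ProdCat


def related_cis(incident_prod_cat: ProdCat,
                cmdb: List[ConfigurationItem]) -> List[ConfigurationItem]:
    """Return the CIs whose product categorization matches the incident's."""
    return [ci for ci in cmdb if ci.prod_cat == incident_prod_cat]


# Example data; the CI names are invented for illustration.
monitor = ProdCat("Hardware", "Peripheral Device", "Monitor")
cmdb = [
    ConfigurationItem("MON-0042", monitor),
    ConfigurationItem("LAP-0007", ProdCat("Hardware", "Computer", "Laptop")),
]
matches = related_cis(monitor, cmdb)
```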

Prioritization

Priority reflects the organizational response required for an incident. Establishing a priority coding system requires two major parts:

  1. A definition of the organizational response levels, for example, Critical, High, Medium, Low, or Platinum, Gold, Silver, Bronze

  2. A method for determining which response to apply to any given incident

ITIL presents an example of a 2-part priority coding system with five priority levels or tiers: 1-Critical, 2-High, 3-Medium, 4-Low, and 5-Planning.

It then offers a simple matrix, with impact along the top and urgency down the side, for selecting the priority. Establishing priority is therefore mostly a matter of two things: impact and urgency.
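An impact and urgency matrix of this kind can be sketched in a few lines. The five priority labels come from the text above; the three High/Medium/Low levels and the rule that combines them (adding the two ranks) are assumptions chosen for illustration, and an organization would substitute its own agreed matrix.

```python
PRIORITY_LABELS = {
    1: "1-Critical",
    2: "2-High",
    3: "3-Medium",
    4: "4-Low",
    5: "5-Planning",
}

# Assumed three-level scale for both impact and urgency (1 is most severe).
LEVELS = {"High": 1, "Medium": 2, "Low": 3}


def priority(impact: str, urgency: str) -> str:
    """Look up the priority for an impact/urgency pair.

    Assumed rule: rank = impact rank + urgency rank - 1, which maps the
    3x3 matrix onto the five priority levels.
    """
    return PRIORITY_LABELS[LEVELS[impact] + LEVELS[urgency] - 1]
```

For instance, under these assumptions `priority("High", "High")` gives `"1-Critical"` and `priority("Low", "Low")` gives `"5-Planning"`.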

Assignment Routing

You can configure assignment routing so that the system automatically assigns records, such as investigations or change requests, to the appropriate support group.

When an ITSM application uses the routing order, which is a feature of many of the main ticketing forms, it uses information from the form to find an assignment entry and select the support group for assignment.
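The lookup described above can be approximated as an ordered list of assignment entries in which the first entry matching the form's values wins. The entry fields, the support group names, and the first-match rule are all hypothetical; actual ITSM tools define their own routing order and matching fields.

```python
from typing import List, NamedTuple, Optional


class AssignmentEntry(NamedTuple):
    support_group: str
    form: Optional[str] = None       # None means "matches any value"
    op_cat1: Optional[str] = None
    prod_cat1: Optional[str] = None


# Most specific entries first; the last entry is a catch-all.
ENTRIES = [
    AssignmentEntry("Network Support", form="Incident",
                    op_cat1="Fix/Repair", prod_cat1="Network"),
    AssignmentEntry("Desktop Support", form="Incident", prod_cat1="Hardware"),
    AssignmentEntry("Service Desk"),
]


def route(record: dict, entries: List[AssignmentEntry] = ENTRIES) -> str:
    """Return the support group of the first entry that matches the record."""
    for entry in entries:
        criteria = entry._asdict()
        group = criteria.pop("support_group")
        if all(value is None or record.get(key) == value
               for key, value in criteria.items()):
            return group
    raise LookupError("no assignment entry matches this record")
```

Ordering entries from most to least specific, with a catch-all at the end, means every record is assigned to some group even when no detailed rule applies.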
