An Overview of Implementing an Incident Management Process

What is an Incident Management Process?

In the context of an IT help desk, incident management refers to identifying, analysing, resolving, and preventing IT issues ('incidents') that impact the availability and reliability of IT services.


An IT incident can be any event that causes disruption or degradation of the normal functioning of an IT system, application, or infrastructure.

The incident management process involves the IT help desk team logging and prioritising incidents based on their impact and urgency, diagnosing and resolving incidents according to predefined procedures, and ensuring they are fully documented and reported.

The aim is to minimise the impact of incidents on business operations and restore regular service as quickly as possible.

If you don't use the term "incident management," just think of "ticket management".

Why do we have Incident Management?

If you don't have a straightforward process, how can you implement a tool to automate, communicate, or evaluate it?

Here are some other textbook-style reasons:

To minimise the impact of incidents on the business and its customers.
To restore regular service operation as quickly as possible.
To prevent incidents from recurring.
To continuously improve the incident management process.
To communicate effectively with stakeholders during and after an incident.
To comply with relevant industry standards and regulations.

The Incident Management Process


Incident Management Process Steps

Incident Management Process Diagram

1) Record Incident

When an incident is identified, a comprehensive record is generated within the Incident Management system. This record acts as a dynamic logbook that will be updated throughout the life of the incident. It documents the initial issue, every action taken, who undertook it, and when it happened. This ensures traceability and accountability for each phase of resolving the incident.
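As a rough illustration of the "dynamic logbook" idea, the record could be modelled as a small data structure that appends a timestamped entry for every action. This is a minimal sketch only: the class names, field names, and the sample incident ID and user names are assumptions made for illustration, not part of any particular Incident Management tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LogEntry:
    """One action taken on an incident: what happened, who did it, and when."""
    action: str
    actor: str
    timestamp: datetime


@dataclass
class IncidentRecord:
    """A running log for a single incident (hypothetical structure)."""
    incident_id: str
    description: str
    reported_by: str
    log: list[LogEntry] = field(default_factory=list)

    def add_entry(self, action: str, actor: str) -> LogEntry:
        """Append a timestamped entry so every action stays traceable."""
        entry = LogEntry(action, actor, datetime.now(timezone.utc))
        self.log.append(entry)
        return entry


# Illustrative values only.
record = IncidentRecord("INC-0001", "Email service unavailable", "j.smith")
record.add_entry("Incident logged", "helpdesk.agent1")
record.add_entry("Restarted mail service", "helpdesk.agent1")

for entry in record.log:
    print(entry.timestamp.isoformat(), entry.actor, entry.action)
```

Because entries are only ever appended, the log keeps the order in which actions were taken, which is what gives the record its traceability.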

2) Classification & Assessment

After the initial recording, experts review the incident details to classify it. The classification could range from hardware issues to software bugs or user errors. A priority level is also assigned based on factors like impact and urgency. This step is crucial for allocating resources effectively and can also serve a secondary role in trend analysis. Detecting patterns in incidents can lead to proactive measures in the future.
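One common way to turn impact and urgency into a priority is a simple lookup matrix. The sketch below assumes three levels for each factor and five priority labels (P1 to P5); the exact mapping is an illustrative assumption and should be replaced with whatever your organisation agrees. It also shows how a basic count of categories can hint at trends.

```python
from collections import Counter
from enum import IntEnum


class Level(IntEnum):
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical impact/urgency matrix: (impact, urgency) -> priority label.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): "P1",
    (Level.HIGH, Level.MEDIUM): "P2",
    (Level.HIGH, Level.LOW): "P3",
    (Level.MEDIUM, Level.HIGH): "P2",
    (Level.MEDIUM, Level.MEDIUM): "P3",
    (Level.MEDIUM, Level.LOW): "P4",
    (Level.LOW, Level.HIGH): "P3",
    (Level.LOW, Level.MEDIUM): "P4",
    (Level.LOW, Level.LOW): "P5",
}


def assign_priority(impact: Level, urgency: Level) -> str:
    """Look up the priority for a given impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]


print(assign_priority(Level.HIGH, Level.MEDIUM))  # P2

# Trend analysis in its simplest form: count incidents per category.
categories = ["hardware", "software", "user error", "software"]
print(Counter(categories).most_common(1))  # [('software', 2)]
```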

3) Investigation & Recovery

The Help Desk is then tasked with a brief investigation into the issue. During this time, staff may refer to an existing knowledge base for potential solutions or fixes. Depending on the complexity, they may resolve or escalate the issue to a specialised support team. Time is of the essence here, as a speedy recovery minimises downtime and impact.
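The resolve-or-escalate decision described above can be sketched as a knowledge base lookup with a fallback. Everything here is hypothetical: the knowledge base is a plain dictionary of symptom phrases, and matching is a naive substring check, which a real tool would replace with proper search.

```python
# Hypothetical knowledge base: symptom phrase -> documented fix.
KNOWLEDGE_BASE = {
    "password expired": "Reset the password via the self-service portal.",
    "printer offline": "Power-cycle the printer and re-add it to the print queue.",
}


def investigate(description: str) -> tuple[str, str]:
    """Return ("resolve", fix) if a known fix matches, else ("escalate", reason)."""
    text = description.lower()
    for symptom, fix in KNOWLEDGE_BASE.items():
        if symptom in text:
            return ("resolve", fix)
    return ("escalate", "No known fix; pass to a specialised support team.")


print(investigate("User reports printer offline on floor 2"))
print(investigate("Database replication lag on the reporting server"))
```

The first call finds a documented fix and stays with the Help Desk; the second has no match and is escalated.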

4) Contact the Customer with a Resolution

Once the problem is resolved or a workaround is found, the user or customer is contacted to confirm the solution's effectiveness. Their acceptance is crucial; if they are satisfied, the incident record is updated to indicate a successful resolution.

5) Update Knowledge Base

This step is crucial for organisational learning. If the incident led to a new solution or workaround, this information is documented in the knowledge base. By doing this, the organisation equips itself better for future incidents, enabling quicker resolutions and reducing time spent on investigations.

6) Close Incident

Finally, the incident is formally closed once the user accepts the resolution. At this point, a final classification of the cause is added to the record, such as whether it was due to a user error, a recent change in systems, a software fault, etc. This closure process ensures that all actions are documented and provides valuable data for reviewing the effectiveness of the incident management process.
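The closure rule (user acceptance first, then a final cause classification) can be expressed as a small guard. This is a sketch under assumptions: the `Incident` fields, the `Cause` values, and the `close_incident` function are illustrative names, and the cause list simply mirrors the examples given above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Cause(Enum):
    USER_ERROR = "user error"
    RECENT_CHANGE = "recent change in systems"
    SOFTWARE_FAULT = "software fault"
    OTHER = "other"


@dataclass
class Incident:
    incident_id: str
    status: str = "open"
    user_accepted: bool = False
    final_cause: Optional[Cause] = None


def close_incident(incident: Incident, cause: Cause) -> Incident:
    """Close the incident only after the user has accepted the resolution."""
    if not incident.user_accepted:
        raise ValueError("Cannot close: the user has not accepted the resolution.")
    incident.final_cause = cause
    incident.status = "closed"
    return incident


# Illustrative values only.
incident = Incident("INC-0001")
incident.user_accepted = True
close_incident(incident, Cause.SOFTWARE_FAULT)
print(incident.status, incident.final_cause.value)  # closed software fault
```

Recording the cause as a fixed set of values, rather than free text, makes it easier to report on later.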

Incident Management Roles & Responsibilities

Help Desk Staff

Incident Identification and Logging: Recognising and documenting incidents as users report them.
Incident Categorisation: Classifying incidents based on their impact and urgency.
Data Capture for Analysis: Gathering necessary information that will aid in diagnosing the issue.
Customer Updates: Providing timely updates to customers upon request.
Incident Escalation: Escalating incidents to the appropriate technical teams or the Major Incident Manager.

Help Desk Manager / Team Leader

Process Management: Overseeing the entire incident process from start to finish.
Response Coordination: Coordinating the collective response to incidents among various teams.
Resource Allocation and Task Prioritisation: Assigning human and technical resources while setting task priorities.
Progress Monitoring: Keeping track of incident resolution progress and updating stakeholders accordingly.
Procedure Adherence: Ensuring incidents are logged, categorised, and resolved per established protocols.
Post-Incident Reviews: Conducting reviews after incident resolution to identify and implement improvements.
Metrics and Trend Analysis: Reporting on key performance indicators and analysing incident trends for future preventive measures.

Technical Support Staff

Collaboration: Working with other technical units or third-party suppliers to facilitate incident resolution.
Implementation of Fixes or Workarounds: Applying fixes or workarounds to restore affected services.
System Updates: Keeping the incident management system updated with the incident resolution status and progress.
Managerial Communication: Offering the Help Desk Manager updates regarding the incident's status, impact, and estimated resolution time.
Post-Incident Review Participation: Engaging in reviews after the incident has been resolved to identify areas for improvement and execute corrective actions.

Incident Management RACI Matrix

Task / Activity	Help Desk Staff	Help Desk Team Leader / Manager	Technical Support Staff
Incident identification & logging	R	A	I
Incident Categorisation	R	A	I
Data capture for analysis	R	A	I
Customer updates	R	A	I
Incident escalation	R	A	C
Process management	I	A	I
Response coordination	I	A	R
Resource allocation and task prioritisation	I	A	R
Progress monitoring	I	A	R
Procedure adherence	R	A	C
Post-incident reviews	C	A	R
Metrics & trend analysis	I	A	C
Implementation of fixes or workarounds	I	C	R
System updates	I	C	R
Managerial communication	I	A	R
Post-incident review participation	C	A	R

Key:

R (Responsible): The person who performs an activity or does the work.
A (Accountable): The person who is ultimately accountable and has the final authority on the task.
C (Consulted): The person who must be consulted before a decision or action is taken.
I (Informed): The person who must be informed after a decision or action is taken.
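If you keep the matrix as data, you can run simple sanity checks on it. The sketch below encodes the rows of the matrix above as three-letter strings (one letter per role, in column order) and applies one widely used RACI convention, namely that each task should have exactly one Accountable role. The function names and the string encoding are assumptions made for illustration.

```python
ROLES = (
    "Help Desk Staff",
    "Help Desk Team Leader / Manager",
    "Technical Support Staff",
)

# One letter per role, in the same order as ROLES.
RACI = {
    "Incident identification & logging": "RAI",
    "Incident Categorisation": "RAI",
    "Data capture for analysis": "RAI",
    "Customer updates": "RAI",
    "Incident escalation": "RAC",
    "Process management": "IAI",
    "Response coordination": "IAR",
    "Resource allocation and task prioritisation": "IAR",
    "Progress monitoring": "IAR",
    "Procedure adherence": "RAC",
    "Post-incident reviews": "CAR",
    "Metrics & trend analysis": "IAC",
    "Implementation of fixes or workarounds": "ICR",
    "System updates": "ICR",
    "Managerial communication": "IAR",
    "Post-incident review participation": "CAR",
}


def tasks_without_single_accountable(matrix: dict[str, str]) -> list[str]:
    """Return tasks that do not have exactly one 'A' assigned."""
    return [task for task, codes in matrix.items() if codes.count("A") != 1]


def responsible_roles(matrix: dict[str, str], task: str) -> list[str]:
    """Return the roles marked 'R' for a given task."""
    return [role for role, code in zip(ROLES, matrix[task]) if code == "R"]


print(tasks_without_single_accountable(RACI))
print(responsible_roles(RACI, "Incident escalation"))
```

With the rows as listed, the check flags "Implementation of fixes or workarounds" and "System updates", which have no Accountable role. Whether that is intentional is a decision for your organisation.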

Taking it up a level

If you want to drive the maturity of your incident process, there are two main steps you can take:

1) Implementing an Incident Management Policy.

This step is optional.

It may have value depending on the type of organisation you are in (for example, pharmaceutical, financial, or another regulated sector). If you feel it conveys important information to various parties and has value, go ahead. If you think it is bureaucratic and has little value, skip it.

The real benefit is that it consolidates everything under a single roof: all processes, including Major Incidents, roles & responsibilities, and any specific expectations or guidance.

Have a look and see if you think it adds value. If not, maybe it's something to consider as the maturity of the teams improves.

Download The Incident Management Policy

2) Developing a Major Incident Process.

When there is a major outage, you need a major incident process. Check out the following guidance on creating a major incident process.
