<?xml version='1.0' encoding='utf-8'?>

<!DOCTYPE rfc [

]>

<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-davis-nmop-incident-terminology-00" category="info" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="true" version="3">

  <front>

    <title abbrev="Incident Terminology">Some Key Terms for Incident Management</title>

    <seriesInfo name="Internet-Draft" value="draft-davis-nmop-incident-terminology-00"/>

    <author initials="N." surname="Davis" fullname="Nigel Davis">
      <organization>Ciena</organization>
      <address>
        <postal>
          <street/>
          <city/>
          <country>United Kingdom</country>
        </postal>
        <email>ndavis@ciena.com</email>
      </address>
    </author>

    <author initials="A." surname="Farrel" fullname="Adrian Farrel">
      <organization>Old Dog Consulting</organization>
      <address>
        <postal>
          <street/>
          <city/>
          <country>United Kingdom</country>
        </postal>
        <email>adrian@olddog.co.uk</email>
      </address>
    </author>

    <date year="2024"/>

    <keyword>Problem</keyword>
    <keyword>Event</keyword>

    <abstract>

      <t>This document sets out some key terms that are fundamental to a common understanding
         of Incident Management.</t>

      <t>The purpose of this document is to bring clarity to discussions and other work
         related to Incident Management, in particular YANG models and management protocols
         that report, make visible, or manage incidents.</t>

    </abstract>

  </front>

  <middle>

    <section anchor="introduction" numbered="true" toc="default">
      <name>Introduction</name>

      <t>Incident Management is an important aspect of network management and control solutions. It deals with the
         reporting, inspection, correlation, and management of events within the network where those events
         have a negative effect on the network&apos;s ability to forward traffic in an
         optimal way.  Incident Management also includes actions taken to work toward recovery of optimal network behavior.</t>

      <t>A number of work efforts within the IETF seek to provide components of an Incident
         Management system, such as YANG models or management protocols. It is important that
         a common terminology is used so that there is a clear understanding of how the
         elements of the management and control solutions fit together, and how the incidents will be handled.</t>

      <t>This document sets out some key terms that are fundamental to a common understanding
         of Incident Management.</t>

    </section>

    <section anchor="terms" numbered="true" toc="default">
      <name>Terminology</name>

      <t>The terms are presented below in an order intended to let the reader build
         understanding by reading from top to bottom.</t>

      <dl newline="false" spacing="normal">
        <dt>Resource:</dt>
          <dd>
            <t>A component or commodity that can be used in a valuable way in the performance of some activity.</t>
          </dd>
        <dt>State:</dt>
          <dd>
            <t>A particular condition that something is in (at a specific time).</t>
          </dd>
        <dt>Change:</dt>
          <dd>
            <t>A modification to the state of a resource in time.</t>
            <ul>
              <li>
                <t>Most changes are not noteworthy (and are not relevant).</t>
              </li>
              <li>
                <t>Perception of change depends upon the sampling rate, accuracy, and level of detail, and upon the perspective of the observer.</t>
              </li>
            </ul>
          </dd>
        <dt>Occurrence:</dt>
          <dd>
            <t>A particular relevant change.</t>
            <ul>
              <li>
                <t>The change may be unplanned or unintended.</t>
              </li>
              <li>
                <t>An occurrence may be an aggregation or abstraction of smaller occurrences.</t>
              </li>
              <li>
                <t>The term applies at all scales and scopes; that is, it is essentially fractal (can recurse indefinitely).</t>
              </li>
              <li>
                <t>Note that occurrence is used here with respect to the temporal dimension.</t>
              </li>
            </ul>
          </dd>
        <dt>Event:</dt>
          <dd>
            <t>The state modification in an occurrence.</t>
            <ul>
              <li>
                <t>Compared with a change, which takes place over a period of time, an event happens at a
                   measurable instant.</t>
              </li>
            </ul>
          </dd>
        <dt>Incident:</dt>
          <dd>
            <t>An event that has a negative effect, i.e., an effect that is not as required or desired.</t>
          </dd>
        <dt>Problem:</dt>
          <dd>
            <t>A state regarded as undesirable that needs to be dealt with and overcome.</t>
            <ul>
              <li>
                <t>There is a need to change to a desirable/appropriate state.</t>
              </li>
              <li>
                <t>Note that there is a historical aspect to this. The current state may be operational,
                   but if an earlier failure remains unexplained, the network is in a state
                   of unexplained recent failure. That state is a problem even though the network has recovered.</t>
              </li>
              <li>
                <t>Note that whilst a problem is unresolved it requires attention. A record of a
                   resolved problem may be maintained in a log of history.</t>
              </li>
              <li>
                <t>Note that the network may be in a state which is considered to be a problem
                   from several perspectives (e.g., there is loss of light causing services to fail).
                   A state change (so that the light recovers) may cause the problem to be resolved from one
                   perspective (the services are now operational) but may still leave the problem
                   as unresolved from another perspective (because the loss of light has not been explained).
                   There can be further developments (the reason for the temporary loss of light is
                   traced to a microbend in the fiber that is repaired) that cause another problem
                   to be resolved. But this leaves a final problem still unresolved (why did the
                   microbend occur in the first place?).</t>
              </li>
            </ul>
          </dd>
        <dt>Alert:</dt>
          <dd>
            <t>The indication of the potential existence of a problem.</t>
          </dd>
        <dt>Notification:</dt>
          <dd>
            <t>Communication of a state change.</t>
            <ul>
              <li>
                <t>May be an alert.</t>
              </li>
            </ul>
          </dd>
        <dt>Alarm:</dt>
          <dd>
            <t>An indication to a human operator highlighting the potential presence of a problem.</t>
            <ul>
              <li>
                <t>The alarm state change is an event.</t>
              </li>
            </ul>
          </dd>
        <dt>Transient:</dt>
          <dd>
            <t>A state, considered as a problem, that persists for a limited amount of time before becoming resolved
               without direct action by an operator or control system.</t>
          </dd>
        <dt>Intermittent:</dt>
          <dd>
            <t>A state that is not maintained, but that recurs within some meaningfully short time frame.</t>
          </dd>
        <dt>Cause:</dt>
          <dd>
            <t>The activity, event, etc. that gives rise to an (undesired) event, condition, or behavior.</t>
          </dd>
        <dt>Detect:</dt>
          <dd>
            <t>To notice the presence of something (state, activity, form, etc.).</t>
            <ul>
              <li>
                <t>Hence also to notice a change (from the perspective of the viewer).</t>
              </li>
            </ul>
          </dd>
        <dt>Condition:</dt>
          <dd>
            <t>The state of something with regard to its working order.</t>
            <ul>
              <li>
                <t>Here, this term is used where the state is an issue with operation. For
                   example, "signal degraded" is a condition that indicates an issue with the
                   operation.</t>
              </li>
            </ul>
          </dd>
      </dl>

    </section>

    <section anchor="security-considerations" numbered="true" toc="default">
      <name>Security Considerations</name>

      <t>This document specifies terminology and has no direct effect on the security of
         implementations or deployments. However, protocol solutions and management models
         need to be aware of several aspects:</t>

      <ul>
        <li>
          <t>The exposure of information pertaining to incidents may make available knowledge
             of the internal workings of a network (in particular its vulnerabilities) that
             may be of use to an attacker.</t>
        </li>
        <li>
          <t>Systems that generate management information (messages, notifications, etc.) when
             incidents occur may be attacked by causing them to generate so much information
             that the management system is swamped and unable to properly manage the network.</t>
        </li>
        <li>
          <t>Reporting false information about incidents (or masking reports of incidents) may
             cause the management system to function incorrectly.</t>
        </li>
      </ul>

    </section>

    <section anchor="privacy-considerations" numbered="true" toc="default">
      <name>Privacy Considerations</name>

      <t>In general, Incident Management will not expose information about end-user activities
         or user data. The main privacy concern is for a network operator to keep control of
         all information about incidents to protect their privacy and the details of how they
         operate their network.</t>

    </section>

    <section anchor="iana-considerations" numbered="true" toc="default">
      <name>IANA Considerations</name>

      <t>This document makes no requests for IANA action.</t>

    </section>

<!--
    <section anchor="acknowledgments" numbered="false" toc="default">
      <name>Acknowledgments</name>

    </section>
-->

<!--
    <section anchor="contributors" numbered="false" toc="default">
      <name>Contributors</name>

      <t>The following authors contributed significantly to this document:</t>
        <artwork name="" type="" align="left" alt="">
          <![CDATA[

          ]]>
       </artwork>

    </section>
 -->

  </middle>

  <back>

<!--
    <references>
      <name>Normative References</name>

    </references>

    <references>
      <name>Informative References</name>

    </references>

-->

  </back>

</rfc>
