Trigger Warning severity actions when the Critical severity threshold limit includes multiple polls

Hi,

As per the post in the troubleshooting sub-forum, "Warning action not occuring when Critical threshold breached on first poll", I propose that the determination of whether a monitored value has breached a threshold level should also consider the number of polls configured for that threshold.

With the following configuration:

Critical Severity : Monitored value is > than the Threshold Limit 40
      Polls to try 2
Warning Severity : Monitored value is > than the Threshold Limit 40
      Polls to try 1
Clear Severity : Monitored value is <= than the Threshold Limit 30
      Polls to try 1

I would like to see the Warning Severity be actioned if the monitored value goes above 40 on the first poll and the Critical Severity be actioned if the monitored value is still above 40 on the second poll.
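To illustrate the behaviour I'm after, here is a rough Python sketch. It is not how Applications Manager evaluates thresholds internally; the `Threshold` class, the `evaluate` function and the rule that a severity stays in place until the Clear condition is met are all my own assumptions for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    severity: str
    limit: float
    polls_to_try: int


def evaluate(values, thresholds, clear_limit):
    """Return the severity shown after each poll (hypothetical model).

    thresholds must be ordered from least to most severe. A threshold is
    breached once the value has been above its limit for polls_to_try
    consecutive polls. The most severe breached threshold wins.
    """
    streaks = {t.severity: 0 for t in thresholds}
    state = "Clear"
    history = []
    for value in values:
        for t in thresholds:
            streaks[t.severity] = streaks[t.severity] + 1 if value > t.limit else 0
        breached = [t.severity for t in thresholds
                    if streaks[t.severity] >= t.polls_to_try]
        if breached:
            state = breached[-1]
        elif value <= clear_limit:
            state = "Clear"  # assumed: severity persists until the Clear limit is met
        history.append(state)
    return history


thresholds = [Threshold("Warning", 40, 1), Threshold("Critical", 40, 2)]
print(evaluate([35, 45, 45, 35, 25], thresholds, clear_limit=30))
# ['Clear', 'Warning', 'Critical', 'Critical', 'Clear']
```

With this model the Warning actions fire on the first poll above 40, and Critical is only reached if the value is still above 40 on the next poll.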

In this way, if I have an automated action associated with the Warning Severity that can address the issue (e.g. restarting a service or deleting files older than a specific date), Applications Manager will only report a Critical situation if the automated action did not fix the problem. My monitors send out SMS for critical alerts, and this would stop them being sent when the automated action fixes the issue.

NOTE: I have attempted to set the Warning threshold limit to 30 and the Critical threshold limit to 40 and do my automated action at 30, but sometimes the value jumps from 29 to 41 between polls and skips the warning action that could have fixed the issue.
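A tiny sketch of why that workaround falls short, assuming the highest breached severity is the only one actioned on a given poll (the `severity` function is just an illustration, not product code):

```python
def severity(value, warning_limit=30, critical_limit=40):
    """Hypothetical single-poll evaluation: the most severe breach wins."""
    if value > critical_limit:
        return "Critical"
    if value > warning_limit:
        return "Warning"
    return "Clear"


print([severity(v) for v in [29, 41]])
# ['Clear', 'Critical']
```

The value never lands in the 30 to 40 band, so the Warning action is never given a chance to run.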

Another possible solution would be to allow different actions to be associated with individual polling events of a threshold severity, e.g. deleteFilesAction on the 1st poll where the Critical limit has been exceeded and sendEmailAction on the 2nd consecutive poll where the Critical limit has been exceeded.
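As a rough sketch of that idea, the actions could be keyed by the consecutive poll count. The `actions_for_breach` function and the dictionary layout are hypothetical; the action names are only the examples from the paragraph above.

```python
def actions_for_breach(values, limit, actions_by_poll):
    """Return (poll_index, action_name) pairs for a single threshold.

    actions_by_poll maps "nth consecutive poll above the limit" to the
    name of the action to run on that poll (hypothetical structure).
    """
    fired = []
    streak = 0
    for index, value in enumerate(values):
        streak = streak + 1 if value > limit else 0
        action = actions_by_poll.get(streak)
        if action is not None:
            fired.append((index, action))
    return fired


actions = {1: "deleteFilesAction", 2: "sendEmailAction"}
print(actions_for_breach([35, 45, 45, 45, 20, 50], 40, actions))
# [(1, 'deleteFilesAction'), (2, 'sendEmailAction'), (5, 'deleteFilesAction')]
```

The count resets whenever the value drops back under the limit, so a fresh breach starts again with the first action.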

Cheers,
Damien

Additional Comment: 06/07/2012
I've had a further thought on another potential implementation.  A concept of "Remedial Actions" could be introduced.  If a monitor's Severity Threshold were to have an associated "Remedial Action", AppManager could:
1. On the initial poll, detect that the monitored value is within a Warning or Critical Threshold.
2. Trigger "Remedial Actions" (if they exist).
   2a. If "Remedial Actions" time out, trigger standard actions and display the monitor as Warning or Critical Health (as appropriate).
   2b. If "Remedial Actions" complete, do a second poll immediately.
   2c. If the monitor remains in a Warning or Critical Threshold, trigger standard actions and display the monitor as Warning or Critical Health (as appropriate).
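The flow above could look something like the following Python sketch. Every name here (`handle_poll`, the callables, the returned status strings) is made up for illustration, and I've assumed that a remedial action reports a timeout by returning False and that a monitor with no remedial action goes straight to the standard actions.

```python
from typing import Callable, Optional


def handle_poll(
    poll: Callable[[], float],
    in_threshold: Callable[[float], bool],
    standard_action: Callable[[], None],
    remedial_action: Optional[Callable[[], bool]] = None,
) -> str:
    """Hypothetical "Remedial Actions" flow for one polling cycle."""
    # Step 1: initial poll
    if not in_threshold(poll()):
        return "healthy"
    # Step 2: run the remedial action if one is configured
    if remedial_action is not None:
        completed = remedial_action()  # assumed: False means it timed out
        # Step 2b: on completion, poll again immediately
        if completed and not in_threshold(poll()):
            return "remediated"
    # Steps 2a and 2c: timeout, or still in threshold after the second poll
    standard_action()
    return "alerted"


readings = iter([45, 25])  # breached, then fixed by the remedial action
alerts = []
status = handle_poll(
    poll=lambda: next(readings),
    in_threshold=lambda v: v > 40,
    standard_action=lambda: alerts.append("sms"),
    remedial_action=lambda: True,
)
print(status, alerts)
# remediated []
```

In the example the second poll comes back under the limit, so no SMS is sent and the monitor never shows as Critical.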


