Still receiving alerts when topological dependency has failed

When the host check for a topological host enters a failure state (WARN or CRIT), the service monitors on hosts that depend on that element have their alerts and actions suppressed and enter an UNKN state until the host check on the topological host recovers. However, there are a few scenarios in which you may still receive alerts even though the topological host appears to have failed:

  • The host check for the topological host has not yet failed all of its re-checks. While the host check is still in its re-check loop and has not started alerting, other monitors will still register outages and may send alerts.
  • A monitor other than the host check does not force a run of the host check before it sends an alert. For example, if your host check runs once every 15 minutes but you have a monitor set up to run once per minute, the once-per-minute monitor may fail and alert well before the host check has registered the outage.
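
The timing gap in the second scenario can be estimated with a little arithmetic. The sketch below is only an illustration under a simplified model, not a description of how up.time schedules checks internally: it assumes a check starts alerting only after one regular check and all of its re-checks have failed. The class name `CheckTiming`, its fields, and the re-check values (3 re-checks, 1 minute apart) are hypothetical; only the 15-minute and 1-minute check intervals come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckTiming:
    """Hypothetical timing settings for one check; all values are in minutes."""
    check_interval: float    # time between regular checks
    recheck_interval: float  # time between re-checks after a failure
    max_rechecks: int        # failed re-checks required before alerting

    def worst_case_alert_delay(self) -> float:
        # Assumed model: the outage begins just after a regular check, so a
        # full check interval passes before the first failure is seen, and
        # then every re-check must also fail before alerting starts.
        return self.check_interval + self.max_rechecks * self.recheck_interval


# 15-minute host check and 1-minute service monitor, as in the example above.
# The re-check settings are assumed values for illustration only.
host_check = CheckTiming(check_interval=15, recheck_interval=1, max_rechecks=3)
service_monitor = CheckTiming(check_interval=1, recheck_interval=1, max_rechecks=3)

# Difference between the two worst cases: a rough measure of how long the
# service monitor can be alerting before the host check suppresses it.
exposure = host_check.worst_case_alert_delay() - service_monitor.worst_case_alert_delay()

print("Host check worst case (min):", host_check.worst_case_alert_delay())
print("Service monitor worst case (min):", service_monitor.worst_case_alert_delay())
print("Possible unsuppressed window (min):", exposure)
```

Under these assumed settings the service monitor can reach its alerting state many minutes before the host check does, which is the window in which unsuppressed alerts can be sent.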

As a general rule, the host check on your topological parent should run at least as often as the most frequently checked service on any of its child elements, and its re-check interval and max re-checks settings should give it a shorter re-check period than that service has, so that the host check registers an outage first.
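
As a rough way to audit a configuration against this rule, the following sketch compares a parent host check with the service monitors on its child elements. It is a standalone illustration, not part of up.time: the names `Timing`, `recheck_window`, and `rule_violations` are hypothetical, the settings are typed in by hand rather than read from the product, and "shorter re-check period" is interpreted here as re-check interval multiplied by max re-checks.

```python
from typing import List, NamedTuple


class Timing(NamedTuple):
    """Hypothetical timing settings for one check; all values are in minutes."""
    check_interval: float
    recheck_interval: float
    max_rechecks: int


def recheck_window(t: Timing) -> float:
    # Assumed interpretation: total time spent re-checking before alerting.
    return t.recheck_interval * t.max_rechecks


def rule_violations(parent: Timing, services: List[Timing]) -> List[str]:
    """Return the ways the parent host check breaks the general rule."""
    if not services:
        return []  # nothing to compare against
    # The most frequently checked service is the one with the smallest interval.
    fastest = min(services, key=lambda t: t.check_interval)
    problems = []
    if parent.check_interval > fastest.check_interval:
        problems.append("host check runs less often than the most checked service")
    if recheck_window(parent) >= recheck_window(fastest):
        problems.append("host check re-check period is not shorter than the service's")
    return problems


# Example values are assumptions for illustration only.
services = [Timing(1, 1, 3), Timing(5, 1, 3)]
slow_parent = Timing(check_interval=15, recheck_interval=1, max_rechecks=3)
tuned_parent = Timing(check_interval=1, recheck_interval=0.5, max_rechecks=2)

print(rule_violations(slow_parent, services))
print(rule_violations(tuned_parent, services))
```

An empty result for a parent means that, under this interpretation, its host check should reach a failure state before the most frequently checked child service starts alerting.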

Copyright © 2021 IDERA, Inc.