Allow host down while under nagios watch

by Michal Bryxí   Last Updated August 13, 2019 21:00 PM

We're using Nagios to watch our servers' health. Now we have a task to add a server which will be up only for certain periods. While it is up, we have to ensure that all of its services are up and running. Unfortunately we don't know when the host will be down, so we need some automatic way to achieve this.

  1. Is there a way (a configuration directive) to not report when the host goes down? I mean even in Nagios clients like nagstamon; I don't like the idea of a black icon in the systray all day.
  2. Is there a way to not report any of the services running on a host while the host is down?
  3. While achieving points 1 and 2, is there a way to monitor all of the host's services when, and only when, the host is up?
Tags : nagios


Answers 5


You can define custom time periods using the timeperiod definition in your timeperiods.cfg. Here is an example:

# Here is a slightly friendlier period during work hours
define timeperiod{
    timeperiod_name workhours
    alias           Standard Work Hours
    monday          09:00-17:00
    tuesday         09:00-17:00
    wednesday       09:00-17:00
    thursday        09:00-17:00
    friday          09:00-17:00
    }

Then use this for the check_period value in your host and service definitions.
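
As a minimal sketch of that last step, the definitions below attach the workhours period to a host and to one of its services. The host name flaky_biscuit is borrowed from another answer in this thread; the address, the generic-host and generic-service templates, and the check_smtp command are assumptions based on a stock sample configuration, not details from the question.

```
define host{
    use             generic-host     ; assumed template supplying the other required directives
    host_name       flaky_biscuit
    alias           Part-time server
    address         192.0.2.10       ; placeholder address
    check_period    workhours        ; host checks run only inside this period
    }

define service{
    use                  generic-service
    host_name            flaky_biscuit
    service_description  SMTP
    check_command        check_smtp
    check_period         workhours   ; service checks follow the same period
    }
```

Note that this only fits a host whose uptime follows a known schedule. Outside the period Nagios should simply stop running active checks, so it does not by itself handle a host that goes down at unpredictable times.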

slillibri
December 18, 2010 19:28 PM

This answer will not cover third-party monitoring add-ins like nagstamon or the Firefox Nagios plugin, because their behaviour varies wildly.

A few ways that I can think of off the top of my head:

  • Disable host checks for the server. However, this will leave the host in a permanent state of "UP", and services will begin to alert.
  • Constantly set floating downtime. Floating downtime starts the next time the host is seen "DOWN" and runs for the duration you specify. You could perhaps do this with a cronjob issuing the "SCHEDULE_HOST_DOWNTIME" for 2 floating hours every two hours. ( See NAGIOS Developer - External Commands ).
  • Schedule downtime for the host and all services (see above link).

  • What you could also do is put service dependencies to use. If you check the host via PING, then add a PING service, make all other services on that host depend on PING through a service dependency, and then shut the PING notifications off. This will look something like:

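define servicedependency {
    # The services that depend on PING
    dependent_host_name             flaky_biscuit
    dependent_service_description   service1, service2, service3, service4, service5
    # The master service they depend on
    host_name                       flaky_biscuit
    service_description             PING
    # While PING is WARNING, UNKNOWN or CRITICAL: skip checks and suppress notifications
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
}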

What this means in essence is that when PING is in a warning, unknown or critical state, PING will notify but none of the dependent services will. (And again, shut notifications off for PING!) Also, while PING is in one of those states, the dependent services will not even execute.
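
The dependency needs a PING service to act as its master, with that service's own notifications switched off. A minimal sketch, assuming the generic-service template and the check_ping command from a stock sample configuration (the thresholds are illustrative only):

```
define service {
    use                     generic-service
    host_name               flaky_biscuit
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    notifications_enabled   0     ; PING itself never alerts
}
```

The service_description here has to match the one named in the servicedependency definition.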

I can't speak for NagStaMon, but the Firefox NAGIOS plugin has a preference that essentially says "ignore acknowledged services", meaning that if you acknowledge a service, schedule downtime for it, turn its notifications off, or modify it in any similar way, it will not render as "warning/critical" in the status bar even if it's in that state. I don't know what NagStaMon does or doesn't have in this manner.

VxJasonxV
December 18, 2010 20:07 PM

Let me take the points in the wrong order.

2) NAGIOS should already do this; if a host is down, service alerts will not be sent.

1) I was thinking you could do this with flexible downtime: this is downtime of a given window duration which doesn't start at a known time; instead, the window starts automatically when the host goes down.

But then it occurred to me: all you really need to do is send no alerts when the host is down. If you manage that, then

  • When the host is down, service alerts will not be sent. You don't care that the host is down, because as you say, you don't know when it'll come and go, so the absence of a host alert is immaterial. The HOST DOWN will still be logged, allowing you to retrospectively see what has gone on, but alerts will not be sent.

  • When the host is up, service alerts will be sent anyway.

That's what you want, isn't it? If so, you need to add to the host definition

notification_options   n

I think that also deals with problem 3, as that's what happens normally. I can't speak for non-core clients like nagstamon. In my experience, these are usually screen-scrapers, and their decisions about what to notify aren't based on NAGIOS' notification logic. If your client honours NAGIOS' built-in rules, it should be fine; otherwise, you'll have to work with that particular tool to add similar logic.
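
In context, the directive sits in the host definition roughly as follows. This is a sketch only: the host name, alias, address and the generic-host template are placeholders assumed for illustration.

```
define host{
    use                    generic-host   ; assumed template supplying the other required directives
    host_name              flaky_biscuit
    alias                  Part-time server
    address                192.0.2.10     ; placeholder address
    notification_options   n              ; n = none: no host notifications are sent
    }
```

The services on the host keep their own notification_options, so they still alert while the host is up.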

MadHatter
December 18, 2010 20:08 PM

Here is my idea for using passive checks for this, but I need to state an assumption first: that you don't want to monitor host uptime at all, just that the appropriate services are running when the host is up.

On your random uptime hosts, you can run something like the following shell script https://gist.github.com/746998. This example would monitor SMTP but it's fairly simple. You will need to have this run as a user that can ssh into your nagios host using a keypair, and securing that is left as an exercise to the reader (or post a new question). I haven't tested this but it should work. The Passive check documentation ( http://nagios.sourceforge.net/docs/3_0/passivechecks.html ) should be helpful.

This won't automate provisioning the hosts on your Nagios server, but you can use something like puppet for that.
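
On the Nagios server, the passive approach also needs a service definition that accepts submitted results instead of polling. The following is a rough sketch under assumptions: the host name, the generic-service template and the check_smtp command are placeholders, and the service_description must match whatever name the submitting script uses.

```
define service{
    use                      generic-service
    host_name                flaky_biscuit
    service_description      SMTP
    check_command            check_smtp   ; placeholder, not run while active checks are off
    active_checks_enabled    0            ; Nagios never polls this service itself
    passive_checks_enabled   1            ; results are submitted by the host's own script
    check_freshness          0            ; no alert when results stop because the host is off
    }
```

With freshness checking off, the service keeps its last reported state while the host is down, which fits the assumption that host uptime is not being monitored. The server must also be configured to accept external commands for submitted results to be processed.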

slillibri
December 19, 2010 00:17 AM

For alerting you should check out Nagios BPI (Business Process Intelligence). Nagios Enterprises just released this new addon. Go to Nagios dot com and check out their roadmap.

user60578
December 20, 2010 16:45 PM
