First of all: why use passive checks in Nagios?
They are useful for large systems: Nagios does not sit waiting for a connection timeout when there are network problems.
And they are easy to configure.
Our case (a large social network):
We need to check the number of unsubscribes. If no "unsubscribe" emails arrive for 1 hour, something has gone wrong: the FBL (feedback loop) list is not working and we need an alert. If we do not process FBL emails for several hours, email providers raise our spam rating.
How to fetch the emails (I use Ruby IMAP) is a topic for another article :).
1. Nagios Check code:
#!/bin/bash
# Count FBL unsubscribes recorded during the last hour (-A -t makes psql print a bare number).
NUM=$(/usr/bin/psql -A -t -h 1.1.1.1 -p 5450 -U cron_user base3 -c "select count(1) from email_stop_list where (esl_created BETWEEN current_timestamp - interval '1 hour' and current_timestamp) and esl_reason ~ '^fbl'")
# Empty output (for example, psql failed) is treated like zero, so the alert still fires.
if [ "${NUM:-0}" -eq 0 ]; then
  echo -e "nest\tunsubscribe_fbl\t3\tNo_Unsubscribe" | /home/scripts/send_nsca -H 2.2.2.2 -p 5667 -c /etc/send_nsca.conf
else
  echo -e "nest\tunsubscribe_fbl\t0\t$NUM unsubscribes last hour" | /home/scripts/send_nsca -H 2.2.2.2 -p 5667 -c /etc/send_nsca.conf
fi
2. Return codes for send_nsca
Plugin Return Code | Service State | Host State |
0 | OK | UP |
1 | WARNING | UP or DOWN/UNREACHABLE* |
2 | CRITICAL | DOWN/UNREACHABLE |
3 | UNKNOWN | DOWN/UNREACHABLE |
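The return code is the third tab-separated field of the line piped to send_nsca (host, service, return code, output), as in the check script in step 1. A minimal sketch of a helper that builds such a line and rejects codes outside 0-3; the function name nsca_line is hypothetical and not part of NSCA:

```shell
#!/bin/bash
# Hypothetical helper: build one passive service result line for send_nsca.
# Fields are tab-separated: host, service description, return code, plugin output.
nsca_line() {
    local host="$1" service="$2" code="$3" output="$4"
    # Only the four plugin return codes from the table are accepted.
    case "$code" in
        0|1|2|3) ;;
        *) echo "invalid return code: $code" >&2; return 1 ;;
    esac
    printf '%s\t%s\t%s\t%s\n' "$host" "$service" "$code" "$output"
}
```

It could then replace the echo -e lines, for example: nsca_line nest unsubscribe_fbl 0 "$NUM unsubscribes last hour" | /home/scripts/send_nsca -H 2.2.2.2 -p 5667 -c /etc/send_nsca.conf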
3. Nagios service config
define service{
use generic-service-template-passive
host_name nest
service_description unsubscribe_fbl
freshness_threshold 3600
check_command volatile_no_information
contact_groups nagios-wheel,nagios-wheel-smsmail
}
4. Service template
define service{
use generic-service-template
name generic-service-template-passive
active_checks_enabled 0
passive_checks_enabled 1
obsess_over_service 0
flap_detection_enabled 0
event_handler_enabled 1
failure_prediction_enabled 1
is_volatile 1
register 0
check_period 24x7
max_check_attempts 1
normal_check_interval 5
retry_check_interval 2
check_freshness 1
freshness_threshold 90000
contact_groups nagios-wheel
check_command volatile_no_information
notifications_enabled 1
notification_interval 15
notification_period 24x7
notification_options w,u,c,r
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
}
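With check_freshness 1, Nagios runs the service's check_command as an active check when no passive result has arrived within freshness_threshold seconds. The definition of volatile_no_information is not shown here; a minimal sketch of a plugin body it might point to, assuming it should simply report UNKNOWN (the function name and the message text are hypothetical):

```shell
#!/bin/bash
# Hypothetical plugin body for a "no information" fallback check.
# Nagios takes the plugin output as the status text and the exit status as the state.
no_information() {
    echo "UNKNOWN: no passive check result received within freshness_threshold"
    return 3   # 3 = UNKNOWN, see the return code table in step 2
}
# When saved as a standalone plugin script, it would end with:
#   no_information; exit $?
```

Returning 3 keeps a stale result in the same state that the check script sends when it finds no unsubscribes, so both cases notify through the "u" notification option.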